On the analysis of hyper-parameter space for a genetic programming system with iterated F-Race

Abstract

Evolutionary algorithms (EAs) have been studied for several decades and remain highly popular because they have proved competitive on problems with challenging features such as deceptiveness and multiple local optima. However, a working EA requires defining many hyper-parameter values, which is a drawback for many practitioners. In the case of genetic programming (GP), an EA for the evolution of models and programs, hyper-parameter optimization has been studied extensively only recently. This work builds on recent findings and explores the hyper-parameter space of a specific GP system, neat-GP, which controls model size. Two large sets of symbolic regression benchmark problems are used to evaluate system performance, while hyper-parameter optimization is carried out with three variants of the iterated F-Race algorithm, applied here to GP for the first time. Several findings are drawn from the automatic parametrizations produced by the optimization process. In many cases the automatic parametrizations do not outperform the manual configuration, and overall the differences in testing error are not substantial. Moreover, finding parametrizations that produce models that are both highly accurate and compact is not trivial, at least when the hyper-parameter optimization process (F-Race) is guided by predictive error alone. This work is intended to foster more research and scrutiny of hyper-parameters in EAs in general and GP in particular.
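The tuning procedure at the heart of the abstract, iterated F-Race, races candidate parametrizations over a growing set of problem instances and eliminates those that a Friedman rank test shows to be significantly worse. The following is a much-simplified illustrative sketch, not the irace implementation used in the paper: it uses a fixed significance threshold, drops one configuration per test, and omits irace's iterated sampling of new candidates.

```python
def friedman_stat(results):
    """Friedman test statistic for k configurations over n problem instances.
    results[i][j] is the error of configuration j on instance i (lower is better)."""
    n, k = len(results), len(results[0])
    rank_sums = [0.0] * k
    for row in results:
        # Rank configurations within each instance: rank 1 = lowest error.
        for rank, j in enumerate(sorted(range(k), key=lambda idx: row[idx]), start=1):
            rank_sums[j] += rank
    q = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    return q, rank_sums

def race(configs, instances, evaluate, crit=5.99):
    """Race configurations over instances, dropping the worst-ranked one
    whenever the Friedman statistic exceeds a fixed, illustrative threshold."""
    survivors, results = list(configs), []
    for inst in instances:
        if len(survivors) < 2:     # racing stops when one candidate remains
            break
        results.append([evaluate(c, inst) for c in survivors])
        if len(results) >= 3:      # wait for a few instances before testing
            q, rank_sums = friedman_stat(results)
            if q > crit:
                worst = rank_sums.index(max(rank_sums))
                survivors.pop(worst)
                results = [row[:worst] + row[worst + 1:] for row in results]
    return survivors

# Hypothetical example: tune a single rate-like hyper-parameter whose
# ideal value is near 0.3; the "error" is distance to each instance's optimum.
best = race([0.1, 0.3, 0.9],
            [0.28, 0.30, 0.32, 0.29, 0.31, 0.30, 0.27],
            lambda c, inst: abs(c - inst))
# best == [0.3]
```

The real F-Race compares the statistic against the chi-square critical value for the current number of surviving configurations and, on a significant result, performs pairwise post-tests against the best candidate; the sketch collapses both steps into a single fixed threshold for brevity.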


Figs. 1–10 (images not reproduced here)

Notes

  1. They are also referred to as parameters, but the distinction between parameters and hyper-parameters is important, particularly when the EA is performing a learning process, searching for models that might also include their own parameters.

  2. The division \(\div \) is protected, returning the numerator when the denominator is zero.

  3. https://github.com/saarahy/NGP-LS
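The protected division of note 2 can be written, for instance, as follows (a minimal sketch following the note's definition; the actual implementation in the cited code base may differ):

```python
def protected_div(a, b):
    """Protected division for GP function sets: returns the numerator when
    the denominator is zero, so evolved expressions never raise ZeroDivisionError."""
    return a if b == 0 else a / b

# protected_div(6, 3) == 2.0 and protected_div(5, 0) == 5
```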


Funding

This research was funded by CONACYT (Mexico) Fronteras de la Ciencia 2015-2 Project No. FC-2015-2:944 and TecNM project 6826.18-P; the second author was supported by a CONACYT graduate scholarship for his master's thesis.

Author information

Corresponding author

Correspondence to Leonardo Trujillo.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.


Communicated by V. Loia.


About this article


Cite this article

Trujillo, L., Álvarez González, E., Galván, E. et al. On the analysis of hyper-parameter space for a genetic programming system with iterated F-Race. Soft Comput 24, 14757–14770 (2020). https://doi.org/10.1007/s00500-020-04829-4


Keywords

  • Hyper-parameter optimization
  • Iterated F-Race
  • Genetic programming