Evolving Simple Symbolic Regression Models by Multi-Objective Genetic Programming

  • Michael Kommenda
  • Gabriel Kronberger
  • Michael Affenzeller
  • Stephan M. Winkler
  • Bogdan Burlacu
Part of the Genetic and Evolutionary Computation book series (GEVO)


In this chapter we examine how multi-objective genetic programming can be used to perform symbolic regression and compare its performance to single-objective genetic programming. Multi-objective optimization is implemented by using a slightly adapted version of NSGA-II, where the optimization objectives are the model’s prediction accuracy and its complexity. Because model complexity is explicitly defined as an objective, the evolved symbolic regression models are simpler and more parsimonious than models generated by a single-objective algorithm. Furthermore, we define a new complexity measure that includes syntactical and semantic information about the model while remaining efficient to compute, and demonstrate its performance on several benchmark problems. As a result of the multi-objective approach, the appropriate model length and the functions included in the models are determined automatically, without having to specify them a priori.
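
To illustrate the idea of treating accuracy and complexity as two objectives, the following minimal sketch filters a set of candidate models down to its non-dominated (Pareto-optimal) members. It is not the adapted NSGA-II used in the chapter, nor its complexity measure: the names `dominates` and `pareto_front` are hypothetical, the `(error, complexity)` values are made up for illustration, and both objectives are assumed to be minimized.

```python
from typing import List, Tuple

# Each candidate model is summarised as (prediction_error, complexity).
# Both objectives are minimised. The values used below are invented for
# illustration and do not come from the chapter.
Model = Tuple[float, float]


def dominates(a: Model, b: Model) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one."""
    no_worse = a[0] <= b[0] and a[1] <= b[1]
    strictly_better = a[0] < b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(models: List[Model]) -> List[Model]:
    """Return the models that no other model dominates, in input order."""
    return [m for m in models if not any(dominates(o, m) for o in models)]


candidates = [(0.10, 25.0), (0.12, 9.0), (0.30, 3.0), (0.12, 14.0), (0.35, 5.0)]
front = pareto_front(candidates)
print(front)
```

In this toy set, `(0.12, 14.0)` is dropped because `(0.12, 9.0)` is equally accurate but simpler, and `(0.35, 5.0)` is dropped because `(0.30, 3.0)` is better in both objectives. The remaining models form the accuracy/complexity trade-off from which a user could pick a model of suitable length.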


Symbolic regression, Complexity measures, Multi-objective optimization, Genetic programming, NSGA-II



The work described in this paper was done within the COMET Project Heuristic Optimization in Production and Logistics (HOPL), #843532, funded by the Austrian Research Promotion Agency (FFG).



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Michael Kommenda (1, 2)
  • Gabriel Kronberger (1)
  • Michael Affenzeller (1, 2)
  • Stephan M. Winkler (1)
  • Bogdan Burlacu (1, 2)
  1. Heuristic and Evolutionary Algorithms Laboratory, University of Applied Sciences Upper Austria, Hagenberg, Austria
  2. Institute for Formal Models and Verification, Johannes Kepler University, Linz, Austria
