Complexity Measures for Multi-objective Symbolic Regression

  • Michael Kommenda
  • Andreas Beham
  • Michael Affenzeller
  • Gabriel Kronberger
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9520)


Multi-objective symbolic regression has the advantage that, while the accuracy of the learned models is maximized, their complexity is adapted automatically and need not be specified a priori. The result of the optimization is no longer a single solution, but a whole Pareto front describing the trade-off between accuracy and complexity.
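The accuracy–complexity trade-off described above can be illustrated with a minimal sketch that extracts the Pareto front from a set of candidate models; the helper name, model tuples, and values below are illustrative assumptions, not taken from the paper:

```python
def pareto_front(models):
    """Return the non-dominated models, where each model is an
    (error, complexity) tuple and both objectives are minimized."""
    front = []
    for m in models:
        # m is dominated if another model is no worse in both
        # objectives and differs from m (i.e. is better somewhere).
        dominated = any(
            o[0] <= m[0] and o[1] <= m[1] and o != m
            for o in models
        )
        if not dominated:
            front.append(m)
    return front

# Four hypothetical candidates: (prediction error, model complexity)
candidates = [(0.10, 12), (0.05, 30), (0.10, 20), (0.02, 55)]
print(pareto_front(candidates))  # → [(0.1, 12), (0.05, 30), (0.02, 55)]
```

The model (0.10, 20) is discarded because (0.10, 12) achieves the same error with lower complexity; the remaining three models form the trade-off front the optimizer would report.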

In this contribution we study which complexity measures are most appropriate for symbolic regression when performing multi-objective optimization with NSGA-II. Furthermore, we present a novel complexity measure that incorporates semantic information based on the function symbols occurring in the models, and we test its effects on several benchmark datasets. Results comparing multiple complexity measures are presented in terms of the achieved accuracy and model length to illustrate how the search direction of the algorithm is affected.
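One way such a symbol-aware measure can be realized is to recurse over the expression tree and weight each function symbol by its presumed nonlinearity. The weights and tree encoding below are a minimal sketch under assumed values, not the paper's exact definition:

```python
# Hypothetical per-symbol complexity weights: nonlinear symbols such
# as sin or exp count more than addition or variable references.
SYMBOL_WEIGHTS = {
    "+": 1, "-": 1, "*": 2, "/": 3,
    "sin": 4, "exp": 5, "var": 1, "const": 1,
}

def complexity(node):
    """Recursively sum symbol weights over an expression tree given
    as nested tuples, e.g. ("+", ("var",), ("sin", ("var",)))."""
    symbol, *children = node
    return SYMBOL_WEIGHTS[symbol] + sum(complexity(c) for c in children)

# x + sin(c * x): weights 1 + 1 + 4 + 2 + 1 + 1
expr = ("+", ("var",), ("sin", ("*", ("const",), ("var",))))
print(complexity(expr))  # → 10
```

Under such a measure, two models of identical length can receive different complexity values depending on which function symbols they use, which is exactly the kind of semantic distinction a pure length-based measure cannot make.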


Symbolic regression · Complexity measures · Multi-objective optimization · NSGA-II · Genetic programming



The work described in this paper was done within the COMET Project Heuristic Optimization in Production and Logistics (HOPL), #843532 funded by the Austrian Research Promotion Agency (FFG).



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Michael Kommenda 1, 2
  • Andreas Beham 1, 2
  • Michael Affenzeller 1, 2
  • Gabriel Kronberger 1
  1. Heuristic and Evolutionary Algorithms Laboratory, School of Informatics, Communications and Media, University of Applied Sciences Upper Austria, Hagenberg, Austria
  2. Institute for Formal Models and Verification, Johannes Kepler University Linz, Linz, Austria
