Gaining Deeper Insights in Symbolic Regression

Genetic Programming Theory and Practice XI

Abstract

A distinguishing feature of symbolic regression using genetic programming is its ability to identify complex nonlinear white-box models. This is especially relevant in practice, where models are extensively scrutinized in order to gain knowledge about the underlying processes. This potential is often diluted by the ambiguity and complexity of the models produced by genetic programming. In this contribution we discuss several analysis methods with the common goal of enabling better insights into the symbolic regression process and producing models that are more understandable and generalize better. To gain more information about the process, we monitor and analyze the progress of population diversity, building block information, and, more generally, genealogy information. Regarding the analysis of results, several aspects such as model simplification, relevance of variables, node impacts, and variable network analysis are presented and discussed.
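
To make one of the result-analysis aspects named in the abstract more concrete, the following minimal sketch estimates the relevance of variables by counting how often each variable is referenced across a population of models. This is an illustration under stated assumptions, not necessarily the procedure used in the chapter: the nested-tuple tree encoding, the toy population, and the function names `variable_references` and `variable_frequencies` are all hypothetical.

```python
from collections import Counter

# Hypothetical toy population. Each model is a nested tuple
# (operator, child, child, ...), a variable name (str), or a numeric constant.
population = [
    ("+", ("*", "x1", "x2"), 3.0),
    ("-", "x1", ("*", "x1", "x3")),
    ("*", 2.0, "x2"),
]


def variable_references(tree):
    """Yield every variable name referenced in an expression tree."""
    if isinstance(tree, str):
        # A bare string in child position is a variable reference.
        yield tree
    elif isinstance(tree, tuple):
        # tree[0] is the operator symbol; only the children are visited.
        for child in tree[1:]:
            yield from variable_references(child)
    # Numeric constants contribute no variable references.


def variable_frequencies(models):
    """Return each variable's share of all variable references (assumed relevance proxy)."""
    counts = Counter(v for m in models for v in variable_references(m))
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()} if total else {}


frequencies = variable_frequencies(population)
for name, share in sorted(frequencies.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {share:.2f}")
```

Reference frequency is only a rough proxy: a variable can appear often in parts of a model that barely affect its output, which is one reason complementary measures such as node impacts are of interest.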

Notes

  1. http://dev.heuristiclab.com

Acknowledgements

The work described in this chapter was done within the Josef Ressel Center for Heuristic Optimization Heureka! sponsored by the Austrian Research Promotion Agency (FFG).

Author information

Correspondence to Michael Affenzeller.

Copyright information

© 2014 Springer Science+Business Media New York

About this chapter

Cite this chapter

Affenzeller, M., Winkler, S.M., Kronberger, G., Kommenda, M., Burlacu, B., Wagner, S. (2014). Gaining Deeper Insights in Symbolic Regression. In: Riolo, R., Moore, J., Kotanchek, M. (eds) Genetic Programming Theory and Practice XI. Genetic and Evolutionary Computation. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-0375-7_10

  • DOI: https://doi.org/10.1007/978-1-4939-0375-7_10

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4939-0374-0

  • Online ISBN: 978-1-4939-0375-7

  • eBook Packages: Computer Science, Computer Science (R0)
