Symbolic Regression of Implicit Equations

  • Michael Schmidt
  • Hod Lipson
Part of the Genetic and Evolutionary Computation book series (GEVO)


Traditional symbolic regression is a form of supervised learning: a label y is provided for every \(\vec{x}\), and an explicit symbolic relationship of the form \(y = f(\vec{x})\) is sought. This chapter explores the use of symbolic regression for unsupervised learning by searching for implicit relationships of the form \(f(\vec{x}, y) = 0\). Implicit relationships are more general and more expressive than explicit equations, in that they can also represent closed surfaces as well as continuous and discontinuous multi-dimensional manifolds. However, searching for equations of this type is particularly challenging because an error metric is difficult to define. We study several direct and indirect techniques and present a successful method based on implicit derivatives. In our experiments, this method identified implicit relationships in a variety of datasets, including equations of circles, elliptic curves, spheres, equations of motion, and energy manifolds.
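As an illustration of why derivatives help, the following minimal Python sketch scores candidate implicit functions against data. It is not the chapter's implementation: the function names, the synthetic circle data, and the exact form of the error are assumptions made for this example. The implicit derivative of \(f(x, y) = 0\) is \(dy/dx = -(\partial f/\partial x)/(\partial f/\partial y)\); the sketch uses the cross-multiplied form \(\partial f/\partial x \, dx + \partial f/\partial y \, dy = 0\), so that vertical tangents do not cause a division by zero.

```python
import math

def circle_samples(n=200, radius=2.0):
    """Ordered synthetic samples on the circle x^2 + y^2 = radius^2 (assumed data)."""
    return [(radius * math.cos(2.0 * math.pi * i / n),
             radius * math.sin(2.0 * math.pi * i / n)) for i in range(n)]

def partials(f, x, y, h=1e-5):
    """Central-difference estimates of df/dx and df/dy at (x, y)."""
    fx = (f(x + h, y) - f(x - h, y)) / (2.0 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2.0 * h)
    return fx, fy

def derivative_error(f, pts):
    """Mean misalignment between the data tangent and the candidate's level set.

    For each interior sample, the tangent (dx, dy) is estimated from its two
    neighbours. A candidate consistent with the data satisfies
    fx*dx + fy*dy = 0, so the normalized absolute value of that dot product
    is used as the error (0 = consistent, 1 = worst case).
    Candidates with a vanishing gradient, such as f = 0, get the worst score.
    """
    total, count = 0.0, 0
    for i in range(1, len(pts) - 1):
        x, y = pts[i]
        dx = pts[i + 1][0] - pts[i - 1][0]
        dy = pts[i + 1][1] - pts[i - 1][1]
        tangent_norm = math.hypot(dx, dy)
        if tangent_norm == 0.0:
            continue  # repeated sample: no tangent information
        fx, fy = partials(f, x, y)
        grad_norm = math.hypot(fx, fy)
        if grad_norm < 1e-12:
            total += 1.0  # degenerate candidate: penalize instead of rewarding
        else:
            total += abs(fx * dx + fy * dy) / (grad_norm * tangent_norm)
        count += 1
    return total / count

pts = circle_samples()
candidates = {
    "x^2 + y^2 - 4": lambda x, y: x * x + y * y - 4.0,
    "x + y": lambda x, y: x + y,
    "0 (trivial)": lambda x, y: 0.0,
}
for name, f in candidates.items():
    print(name, derivative_error(f, pts))
```

A direct error such as the mean of \(|f(\vec{x}, y)|\) would score the trivial candidate \(f = 0\) as perfect, which is one reason a direct metric is hard to define; in this sketch the trivial candidate is scored worst. Note that a derivative-based score cannot distinguish candidates that differ only by a constant offset, since they have the same partial derivatives.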


Keywords: Symbolic Regression, Implicit Equations, Unsupervised Learning





Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  • Michael Schmidt (1)
  • Hod Lipson (2, 3)
  1. Computational Biology, Cornell University, Ithaca, USA
  2. School of Mechanical and Aerospace Engineering, Cornell University, Ithaca, USA
  3. Computing and Information Science, Cornell University, Ithaca, USA
