FFX: Fast, Scalable, Deterministic Symbolic Regression Technology

Part of the book series: Genetic and Evolutionary Computation (GEVO)

Abstract

Symbolic regression is a common application for genetic programming (GP). This paper presents a new non-evolutionary technique for symbolic regression that, compared to competent GP approaches on real-world problems, is orders of magnitude faster (taking just seconds), returns simpler models, has comparable or better prediction on unseen data, and converges reliably and deterministically. I dub the approach FFX, for Fast Function Extraction. FFX uses a recently developed machine learning technique, pathwise regularized learning, to rapidly prune a huge set of candidate basis functions down to compact models. FFX is verified on a broad set of real-world problems having 13 to 1468 input variables, outperforming GP as well as several state-of-the-art regression techniques.
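
The idea of pruning a large set of candidate basis functions with pathwise regularized learning can be illustrated with a small sketch. The code below is not the chapter's implementation: it assumes a toy basis set (each variable, its square, and pairwise products) and swaps in a plain lasso fitted by coordinate descent over a decreasing sequence of regularization strengths. All function names (`build_bases`, `lasso_path`, `support`) and parameter values are hypothetical choices for illustration.

```python
import math

def build_bases(X):
    """Assumed toy basis set: each variable, its square, and pairwise products."""
    n_vars = len(X[0])
    names, cols = [], []
    for j in range(n_vars):
        col = [row[j] for row in X]
        names += [f"x{j}", f"x{j}^2"]
        cols += [col, [v * v for v in col]]
    for i in range(n_vars):
        for j in range(i + 1, n_vars):
            names.append(f"x{i}*x{j}")
            cols.append([row[i] * row[j] for row in X])
    return names, cols

def standardize(col):
    """Center a column and scale it to unit (population) standard deviation."""
    n = len(col)
    mean = sum(col) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n) or 1.0
    return [(v - mean) / sd for v in col]

def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso_path(cols, y, n_lambdas=20, ratio=1e-3, max_sweeps=100, tol=1e-10):
    """Fit lasso models from strong to weak regularization, warm-starting each."""
    n = len(y)
    Z = [standardize(c) for c in cols]
    y_mean = sum(y) / n
    r = [v - y_mean for v in y]          # residual while all weights are zero
    w = [0.0] * len(Z)
    # Smallest lambda for which every weight is zero.
    lam_max = max(abs(sum(zi * ri for zi, ri in zip(z, r))) / n for z in Z)
    path = []
    for k in range(n_lambdas):
        lam = lam_max * ratio ** (k / (n_lambdas - 1))
        for _ in range(max_sweeps):
            max_delta = 0.0
            for j, z in enumerate(Z):
                # Correlation of basis j with the residual that excludes basis j.
                rho = sum(zi * ri for zi, ri in zip(z, r)) / n + w[j]
                new_w = soft_threshold(rho, lam)
                delta = new_w - w[j]
                if delta != 0.0:
                    r = [ri - delta * zi for ri, zi in zip(r, z)]
                    w[j] = new_w
                    max_delta = max(max_delta, abs(delta))
            if max_delta < tol:
                break
        path.append((lam, list(w)))
    return path

def support(names, w):
    """Names of the basis functions that survive with a nonzero weight."""
    return [name for name, wj in zip(names, w) if wj != 0.0]

# Synthetic data on a grid, generated from y = 2*x0 + 3*x1^2.
grid = [-2.0, -1.0, 0.0, 1.0, 2.0]
X = [[a, b] for a in grid for b in grid]
y = [2.0 * a + 3.0 * b * b for a, b in X]

names, cols = build_bases(X)
path = lasso_path(cols, y)
for lam, w in path[::5]:
    print(f"lambda={lam:.4f}  bases={support(names, w)}")
```

Each point on the path is a candidate model of a different complexity, so a caller can pick the trade-off between model size and error afterwards. The procedure involves no random numbers, which is one way to read the abstract's claim of deterministic convergence.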

Copyright information

© 2011 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

McConaghy, T. (2011). FFX: Fast, Scalable, Deterministic Symbolic Regression Technology. In: Riolo, R., Vladislavleva, E., Moore, J. (eds) Genetic Programming Theory and Practice IX. Genetic and Evolutionary Computation. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-1770-5_13

  • DOI: https://doi.org/10.1007/978-1-4614-1770-5_13

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-1769-9

  • Online ISBN: 978-1-4614-1770-5

  • eBook Packages: Computer Science, Computer Science (R0)
