Hybrid Single Node Genetic Programming for Symbolic Regression

  • Chapter
Transactions on Computational Collective Intelligence XXIV

Part of the book series: Lecture Notes in Computer Science (TCCI, volume 9770)

Abstract

This paper presents a first step in our research on designing an effective and efficient GP-based method for symbolic regression. First, we propose three extensions of the standard Single Node GP (SNGP): (1) a selection strategy that chooses the nodes to be mutated based on their depth and performance, (2) operators that place a compact version of the best-performing graph at the beginning and at the end of the population, respectively, and (3) a local search strategy that applies multiple mutations in each iteration. All proposed modifications were experimentally evaluated on five symbolic regression benchmarks and compared with standard GP and SNGP. The results are promising and show that the proposed modifications can improve the performance of the SNGP algorithm. We then propose two variants of hybrid SNGP that use a linear regression technique, LASSO, to improve its performance. The proposed algorithms were compared, on four real-world benchmarks, with state-of-the-art symbolic regression methods that also make use of linear regression techniques. The results show that the hybrid SNGP algorithms are competitive with, or better than, the compared methods.
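The abstract does not say how LASSO is coupled with SNGP, so the following is only a minimal sketch of the general idea behind such hybrids, under the assumption that the outputs of evolved expressions serve as candidate features and that LASSO selects a sparse linear combination of them. The function names, the three hand-written features, and the value of `lam` are illustrative and not taken from the chapter. LASSO is solved here by plain cyclic coordinate descent with soft-thresholding.

```python
import math


def soft_threshold(rho, lam):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(features, target, lam, n_iters=200):
    """Minimise (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent.

    features: list of feature columns, each a list of n values (no intercept).
    target:   list of n target values.
    """
    n = len(target)
    weights = [0.0] * len(features)
    residual = list(target)  # equals y - Xw, and w starts at zero
    for _ in range(n_iters):
        for j, col in enumerate(features):
            norm = sum(c * c for c in col) / n
            if norm == 0.0:
                continue  # a constant-zero feature carries no information
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(c * (r + weights[j] * c) for c, r in zip(col, residual)) / n
            new_w = soft_threshold(rho, lam) / norm
            delta = new_w - weights[j]
            if delta != 0.0:
                residual = [r - delta * c for r, c in zip(residual, col)]
                weights[j] = new_w
    return weights


# Hypothetical "evolved" features: in a hybrid method these would be the
# outputs of expressions produced by the GP search, not hand-written ones.
xs = [i * 0.5 for i in range(-4, 5)]
features = [
    [x for x in xs],            # x
    [x * x for x in xs],        # x^2
    [math.sin(x) for x in xs],  # sin(x)
]
target = [2.0 * x * x for x in xs]

weights = lasso_coordinate_descent(features, target, lam=0.05)
print(weights)
```

In this toy setting the L1 penalty drives the weights of `x` and `sin(x)` to exactly zero and leaves a slightly shrunken weight on `x^2`, which illustrates why a sparse linear model is a natural way to combine many candidate sub-expressions.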

Notes

  1. https://en.wikipedia.org/wiki/Symbolic_regression.

  2. https://cs.gmu.edu/~eclab/projects/ecj/.

  3. Checked using the t-test calculated with the significance level \(\alpha =0.05\).

  4. The only exception is EFS: we set the round variable, which was originally hard-coded to true, to false, following the issue reported in the algorithm’s GitHub repository; see https://github.com/exgp/efs/issues/1.

Acknowledgment

This research was supported by the Grant Agency of the Czech Republic (GAČR) with the grant no. 15-22731S entitled “Symbolic Regression for Reinforcement Learning in Continuous Spaces”.

Author information

Corresponding author

Correspondence to Jiří Kubalík.

Copyright information

© 2016 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Kubalík, J., Alibekov, E., Žegklitz, J., Babuška, R. (2016). Hybrid Single Node Genetic Programming for Symbolic Regression. In: Nguyen, N., Kowalczyk, R., Filipe, J. (eds) Transactions on Computational Collective Intelligence XXIV. Lecture Notes in Computer Science, vol 9770. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-53525-7_4

  • DOI: https://doi.org/10.1007/978-3-662-53525-7_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-53524-0

  • Online ISBN: 978-3-662-53525-7

  • eBook Packages: Computer Science, Computer Science (R0)
