A semantic genetic programming framework based on dynamic targets

Abstract

Semantic GP is a promising branch of GP that introduces semantic awareness during genetic evolution to improve various aspects of GP. This paper presents a new Semantic GP approach based on Dynamic Targets (SGP-DT) that divides the search problem into multiple GP runs. The evolution in each run is guided by a new (dynamic) target based on the residual errors of previous runs. To obtain the final solution, SGP-DT combines the solutions of each run using linear scaling. SGP-DT also presents a new methodology for producing offspring that does not rely on the classic crossover. The synergy between this methodology and linear scaling yields final solutions with low approximation error and computational cost. We evaluate SGP-DT on eleven well-known data sets and compare it with \(\epsilon\)-lexicase, a state-of-the-art evolutionary technique, and seven Machine Learning techniques. SGP-DT achieves small RMSE values, on average 23.19% smaller than those of \(\epsilon\)-lexicase. Tuning SGP-DT's configuration greatly reduces the computational cost while still obtaining competitive results.
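The outer loop described in the abstract (several runs, each fitted to the residual of the previous ones, combined through linear scaling) can be illustrated with a minimal sketch. It is not the authors' implementation: the GP run itself is replaced by a hypothetical stand-in, `stand_in_run`, that simply picks the best function from a small fixed pool, and the names `linear_scaling`, `fit_dynamic_targets`, `predict`, and `rmse` are assumptions introduced for illustration. The scaling step uses the standard least-squares slope and intercept.

```python
import numpy as np

def linear_scaling(output, target):
    """Least-squares intercept a and slope b so that a + b * output fits target."""
    var = np.var(output)
    if var == 0.0:
        # A constant output can only contribute the mean of the target.
        return float(np.mean(target)), 0.0
    cov = np.mean((output - output.mean()) * (target - target.mean()))
    b = float(cov / var)
    a = float(target.mean() - b * output.mean())
    return a, b

# Hypothetical pool standing in for the programs a GP run would evolve.
CANDIDATES = [
    lambda x: x,
    lambda x: x ** 2,
    lambda x: np.sin(x),
    lambda x: np.cos(x),
]

def stand_in_run(x, target):
    """Stand-in for one GP run: return the candidate with the lowest scaled RMSE."""
    best = None
    for f in CANDIDATES:
        out = f(x)
        a, b = linear_scaling(out, target)
        err = np.sqrt(np.mean((target - (a + b * out)) ** 2))
        if best is None or err < best[0]:
            best = (err, f, a, b)
    return best[1], best[2], best[3]

def fit_dynamic_targets(x, y, n_runs=3):
    """Run the stand-in search n_runs times, each time on the current residual."""
    target = np.asarray(y, dtype=float).copy()
    model = []
    for _ in range(n_runs):
        f, a, b = stand_in_run(x, target)
        model.append((f, a, b))
        # The residual of this run becomes the (dynamic) target of the next run.
        target = target - (a + b * f(x))
    return model

def predict(model, x):
    """Final solution: the sum of the linearly scaled partial solutions."""
    return sum(a + b * f(x) for f, a, b in model)

def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))

if __name__ == "__main__":
    x = np.linspace(-3.0, 3.0, 200)
    y = 2.0 * x ** 2 + 3.0 * np.sin(x) + 1.0
    for runs in (1, 2, 3):
        print(runs, rmse(y, predict(fit_dynamic_targets(x, y, runs), x)))
```

Because each scaled fit includes an intercept and slope chosen by least squares, the training RMSE of the combined model cannot increase from one run to the next in this sketch; whether the same holds on unseen data depends on the search used inside each run.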

Notes

  1. \(f(x)=10/(5 + \sum _{i=1}^{5} (x_i -3)^2)\).

  2. https://github.com/EpistasisLab/ellyn.

  3. Calculated as \(((M_T- M_D)/M_T) \cdot 100\), where \(M_D\) is the median RMSE of SGP-DT and \(M_T\) is that of the competing technique.

  4. For readability, we omitted 4 outliers for lasso, 13 for \(\epsilon\)-lexicase, 30 for SGP-DT, 30 for DT-NM and 35 for DT-EM.
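The two formulas in Notes 1 and 3 can be written out as a short sketch. The function names `benchmark_f` and `relative_improvement` are assumptions made for illustration, as is the choice to pass the per-run RMSE values as plain lists and take their medians inside the function.

```python
from statistics import median

def benchmark_f(x):
    """Note 1: f(x) = 10 / (5 + sum_{i=1..5} (x_i - 3)^2) for a 5-dimensional input."""
    if len(x) != 5:
        raise ValueError("benchmark_f expects exactly five input values")
    return 10.0 / (5.0 + sum((xi - 3.0) ** 2 for xi in x))

def relative_improvement(rmse_sgp_dt, rmse_other):
    """Note 3: ((M_T - M_D) / M_T) * 100, with M_D and M_T the median RMSEs."""
    m_d = median(rmse_sgp_dt)   # median RMSE of SGP-DT
    m_t = median(rmse_other)    # median RMSE of the competing technique
    return (m_t - m_d) / m_t * 100.0
```

A positive value means the median RMSE of SGP-DT is lower than that of the competing technique; a negative value means it is higher.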

Acknowledgements

National Institutes of Health Grant NIH R01 LM010098.

Author information

Corresponding author

Correspondence to Stefano Ruberto.

About this article

Cite this article

Ruberto, S., Terragni, V. & Moore, J.H. A semantic genetic programming framework based on dynamic targets. Genet Program Evolvable Mach 22, 463–493 (2021). https://doi.org/10.1007/s10710-021-09419-3

Keywords

  • Semantic GP
  • Genetic Programming
  • Natural Selection
  • Symbolic Regression
  • Residuals
  • Linear Scaling
  • Crossover
  • Mutation