Untapped Potential of Genetic Programming: Transfer Learning and Outlier Removal

  • Leonardo Trujillo
  • Luis Muñoz
  • Uriel López
  • Daniel E. Hernández
Part of the Genetic and Evolutionary Computation book series (GEVO)


In the era of Deep Learning and Big Data, the place of Genetic Programming (GP) within the Machine Learning area seems difficult to define. Whether due to technical constraints or conceptual barriers, GP is currently not a paradigm of choice for developing state-of-the-art machine learning systems. Nonetheless, important features of the GP approach make it unique, and they should continue to be actively explored and studied. In this work we focus on two aspects of GP that have previously received little or no attention, particularly in tree-based GP for symbolic regression. The first is the potential of GP to perform transfer learning, where solutions evolved for one problem are transferred to another. The second is the potential of GP individuals to identify the true underlying structure of an input dataset and to detect anomalies in the input data, known as outliers. This work presents initial results on both issues, with the goal of fostering discussion and showing that there is still untapped potential in the GP paradigm.



This research was funded by CONACYT (Mexico) Fronteras de la Ciencia 2015-2 Project No. FC-2015-2/944 and TecNM Project No. 6826-18-P. The first and third authors were supported by CONACYT graduate scholarships No. 302526 and No. 573397, respectively.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Leonardo Trujillo (1)
  • Luis Muñoz (1)
  • Uriel López (1)
  • Daniel E. Hernández (1)
  1. Instituto Tecnológico de Tijuana, Tijuana, México
