On the Generalization Ability of Geometric Semantic Genetic Programming

Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9025)

Abstract

Geometric Semantic Genetic Programming (GSGP) is a recently proposed form of Genetic Programming (GP) that searches directly the space of the underlying semantics of the programs. The fitness landscape seen by the GSGP variation operators is unimodal with a linear slope by construction and, consequently, easy to search. Despite this advantage, the offspring produced by these operators grow very quickly. A new implementation of the same operators was proposed that computes the semantics of the offspring without having to explicitly build their syntax. This allowed GSGP to be used for the first time in real-life multidimensional datasets. GSGP presented a surprisingly good generalization ability, which was justified by some properties of the geometric semantic operators. In this paper, we show that the good generalization ability of GSGP was the result of a small implementation deviation from the original formulation of the mutation operator, and that without it the generalization results would be significantly worse. We explain the reason for this difference, and then we propose two variants of the geometric semantic mutation that deterministically and optimally adapt the mutation step. They reveal to be more efficient in learning the training data, and they also achieve a competitive generalization in only a single operator application. This provides a competitive alternative when performing semantic search, particularly since they produce small individuals and compute fast.

Keywords

Geometric semantic genetic programming Generalization Overfitting Pharmacokinetics Drug discovery 

References

  1. 1.
    Archetti, F., Lanzeni, S., Messina, E., Vanneschi, L.: Genetic programming for computational pharmacokinetics in drug discovery and development. Genet. Program. Evolvable Mach. 8(4), 413–432 (2007)CrossRefGoogle Scholar
  2. 2.
    Castelli, M., Silva, S., Vanneschi, L.: A C++ framework for geometric semantic genetic programming. Genet. Program. Evolvable Mach. 15, 73–81 (2014)Google Scholar
  3. 3.
    Dietterich, T.G.: Ensemble methods in machine learning. In: Kittler, J., Roli, F. (eds.) MCS 2000. LNCS, vol. 1857, pp. 1–15. Springer, Heidelberg (2000) CrossRefGoogle Scholar
  4. 4.
    Gonçalves, I., Silva, S.: Balancing learning and overfitting in genetic programming with interleaved sampling of training data. In: Krawiec, K., Moraglio, A., Hu, T., Etaner-Uyar, A.Ş., Hu, B. (eds.) EuroGP 2013. LNCS, vol. 7831, pp. 73–84. Springer, Heidelberg (2013) CrossRefGoogle Scholar
  5. 5.
    Hansen, L.K., Salamon, P.: Neural network ensembles. IEEE Trans. Pattern Anal. Mach. Intell. 12(10), 993–1001 (1990)CrossRefGoogle Scholar
  6. 6.
    Koza, J.R.: Genetic Programming: On the Programming of Computers by Means of Natural Selection (Complex Adaptive Systems), 1st edn. The MIT Press, Cambridge (1992)Google Scholar
  7. 7.
    Moraglio, A.: Towards a geometric unification of evolutionary algorithms. Ph.D. thesis, Department of Computer Science, University of Essex, UK, November 2007Google Scholar
  8. 8.
    Moraglio, A., Krawiec, K., Johnson, C.G.: Geometric semantic genetic programming. In: Coello, C.A.C., Cutello, V., Deb, K., Forrest, S., Nicosia, G., Pavone, M. (eds.) PPSN 2012, Part I. LNCS, vol. 7491, pp. 21–31. Springer, Heidelberg (2012) CrossRefGoogle Scholar
  9. 9.
    Poli, R., Langdon, W.B., McPhee, N.F., Koza, J.R.: A field guide to genetic programming (2008). Lulu.com
  10. 10.
    Vanneschi, L., Castelli, M., Manzoni, L., Silva, S.: A new implementation of geometric semantic GP and its application to problems in pharmacokinetics. In: Krawiec, K., Moraglio, A., Hu, T., Etaner-Uyar, A.Ş., Hu, B. (eds.) EuroGP 2013. LNCS, vol. 7831, pp. 205–216. Springer, Heidelberg (2013) CrossRefGoogle Scholar

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Ivo Gonçalves
    • 1
    • 2
  • Sara Silva
    • 1
    • 2
    • 3
  • Carlos M. Fonseca
    • 1
  1. 1.CISUC, Department of Informatics EngineeringUniversity of CoimbraCoimbraPortugal
  2. 2.BioISI - Biosystems and Integrative Sciences Institute, Faculty of SciencesUniversity of LisbonLisbonPortugal
  3. 3.NOVA IMS, Universidade Nova de LisboaLisbonPortugal

Personalised recommendations