Soft Computing, Volume 22, Issue 10, pp 3237–3260

An improved semantic schema modeling for genetic programming

Foundations

Abstract

Considerable research effort has recently gone into improving the power of genetic programming (GP) by incorporating semantic awareness. The semantics of a tree describes its behavior during execution, so a reliable theoretical model of GP should account for the behavior of individuals. Schema theory is a theoretical tool for modeling how the population is distributed over a set of similar points in the search space, referred to as a schema. Prior schema theories define schemata at the syntactic level, which raises several major issues, and incorporating semantic awareness into schema theory has scarcely been studied in the literature. In this paper, we present an improved approach for developing the semantic schema in GP. The semantics of a tree is interpreted as the normalized mutual information between its output vector and the target. A new model of the semantic search space is introduced according to this definition of semantics, and the semantic building block space is presented as an intermediate space between the semantic and genotype spaces. An improved approach is provided for representing trees in the building block space. The presented schema is characterized by the Poisson distribution of trees in this space. The corresponding schema theory is developed to predict the expected number of individuals belonging to the proposed schema in the next generation. The suggested schema theory provides new insight into the relation between the syntactic and semantic spaces. It is shown to be efficient in comparison with the existing semantic schema in terms of both generalization and diversity preservation. Experimental results also indicate that the proposed schema is much less computationally expensive than comparable prior work.
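To make the definition of semantics above concrete, the following is a minimal sketch of a normalized mutual information score between a tree's output vector and the target. It is not the authors' estimator: the equal-width histogram binning, the `bins` parameter, the normalization by the geometric mean of the two entropies, and all function names are assumptions chosen for illustration, since the abstract does not specify them.

```python
import math
from collections import Counter


def _bin(values, bins):
    """Map each real value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant vector: everything falls in one bin
        return [0] * len(values)
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def _entropy(labels):
    """Shannon entropy (in nats) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def normalized_mutual_information(output, target, bins=4):
    """Histogram estimate of I(X;Y) / sqrt(H(X) * H(Y)) (assumed normalization)."""
    x, y = _bin(output, bins), _bin(target, bins)
    hx, hy = _entropy(x), _entropy(y)
    if hx == 0.0 or hy == 0.0:  # a constant vector shares no information
        return 0.0
    hxy = _entropy(list(zip(x, y)))  # joint entropy of the paired bins
    mi = hx + hy - hxy               # I(X;Y) = H(X) + H(Y) - H(X,Y)
    return max(0.0, mi / math.sqrt(hx * hy))


if __name__ == "__main__":
    target = [float(t) for t in range(8)]
    rescaled = [2.0 * t + 1.0 for t in target]  # hypothetical tree output
    print(normalized_mutual_information(rescaled, target))
```

Under this particular estimator, an output that is a positive affine rescaling of the target falls into the same bins as the target and therefore scores close to 1, while a constant output scores 0. A different estimator or normalization would give different values.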

Keywords

Genetic programming · Schema theory · Semantic building blocks · Mutual information · Semantic genetic programming

Notes

Compliance with ethical standards

Conflict of interest

Authors Zahra Zojaji and Mohammad Mehdi Ebadzadeh declare that they have no conflict of interest regarding the publication of this paper.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Copyright information

© Springer-Verlag GmbH Germany 2017

Authors and Affiliations

  1. Department of Computer Engineering and Information Technology, Amirkabir University of Technology, Tehran, Iran
