Schema Analysis in Tree-Based Genetic Programming

  • Bogdan Burlacu
  • Michael Affenzeller
  • Michael Kommenda
  • Gabriel Kronberger
  • Stephan Winkler
Conference paper
Part of the Genetic and Evolutionary Computation book series (GEVO)


In this chapter we adopt the concept of schemata from schema theory and use it to analyze population dynamics in genetic programming for symbolic regression. We define schemata as tree-based wildcard patterns and empirically measure their frequencies in the population at each generation. Our methodology consists of two steps. First, we generate schemata from genealogical information about crossover parents and their offspring, according to several possible schema definitions inspired by existing literature. Second, we determine the individuals that match each schema using a tree pattern matching algorithm. We test our approach on different problem instances and algorithm variants, and we investigate the effects of different selection mechanisms on the identified schemata and their frequencies.
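To make the idea of a tree-based wildcard pattern concrete, the following is a minimal sketch and not the chapter's actual algorithm. It assumes expression trees stored as nested tuples, a hypothetical wildcard symbol `#` that matches any whole subtree, and matching anchored at the root with child order respected. The names `matches`, `schema_frequency`, and `WILDCARD` are illustrative only.

```python
from typing import List, Tuple

# Assumed representation: a tree is (symbol, child, child, ...); a leaf is (symbol,).
Tree = Tuple

WILDCARD = "#"  # hypothetical wildcard symbol: matches any whole subtree


def matches(schema: Tree, tree: Tree) -> bool:
    """Return True if `schema` matches `tree` with both aligned at the root."""
    if schema[0] == WILDCARD:
        return True  # the wildcard absorbs the entire subtree
    if schema[0] != tree[0] or len(schema) != len(tree):
        return False  # symbol or arity mismatch
    # Compare children position by position (order-sensitive).
    return all(matches(s, t) for s, t in zip(schema[1:], tree[1:]))


def schema_frequency(schema: Tree, population: List[Tree]) -> float:
    """Fraction of individuals in `population` matched by `schema`."""
    if not population:
        return 0.0
    return sum(matches(schema, ind) for ind in population) / len(population)


# Toy population of symbolic regression expressions.
population = [
    ("+", ("*", ("x",), ("y",)), ("c",)),           # x*y + c
    ("+", ("*", ("x",), ("x",)), ("sin", ("y",))),  # x*x + sin(y)
    ("-", ("x",), ("y",)),                          # x - y
    ("+", ("x",), ("c",)),                          # x + c
]

schema = ("+", ("*", ("x",), ("#",)), ("#",))       # (x * #) + #
print(schema_frequency(schema, population))         # prints 0.5
```

Here the schema matches the first two individuals only, so its frequency is 0.5. Recomputing this value for each generation would give a frequency curve of the kind the abstract describes. The sketch ignores commutativity and matching below the root, both of which a fuller pattern matcher might need to handle.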



The work described in this paper was done within the COMET Project Heuristic Optimization in Production and Logistics (HOPL), #843532, funded by the Austrian Research Promotion Agency (FFG).



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • Bogdan Burlacu (1, 2), corresponding author
  • Michael Affenzeller (1, 2)
  • Michael Kommenda (1, 2)
  • Gabriel Kronberger (3)
  • Stephan Winkler (1, 2)

  1. Heuristic and Evolutionary Algorithms Laboratory, University of Applied Sciences Upper Austria, Hagenberg, Austria
  2. Institute for Formal Models and Verification, Johannes Kepler University, Linz, Austria
  3. Heuristic and Evolutionary Algorithms Laboratory, University of Applied Sciences Upper Austria, Hagenberg, Austria
