Structural Versus Evaluation Based Solutions Similarity in Genetic Programming Based System Identification

  • Stephan M. Winkler
Part of the Studies in Computational Intelligence book series (SCI, volume 284)


Estimating the similarity of solution candidates represented as structure trees is important in many genetic programming (GP) applications. For example, observing population diversity dynamics requires comparing solutions to each other. In GP-based system identification, i.e., when mathematical expressions are evolved, solutions can be compared with respect to their structure as well as their evaluation. Structural similarity estimation of formula trees is not equivalent to evaluation-based similarity estimation; here we investigate whether the results of the two approaches are significantly correlated. To this end, we have analyzed a series of GP tests that include both similarity estimation strategies. In this paper we describe the similarity estimation methods and the test data sets used, and we document the results. In most cases there is a significant positive linear correlation between the results returned by the evaluation-based and structural methods. However, particularly in cases of very low structural similarity, the evaluation-based similarity methods can return significantly different results.
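The difference between the two kinds of comparison can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the nested-tuple tree encoding, the subtree-overlap measure in `structural_similarity`, and the range-scaled output difference in `evaluation_similarity` are simple stand-ins, not the similarity functions defined in the chapter. The `pearson` helper computes the usual linear correlation coefficient that could then be applied to paired similarity values.

```python
import math
from collections import Counter

# Hypothetical encoding of a formula tree as nested tuples:
# ("add", l, r), ("sub", l, r), ("mul", l, r), ("var", name), ("const", value).
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}

def evaluate(tree, sample):
    """Evaluate a formula tree on one sample (a dict of variable values)."""
    kind = tree[0]
    if kind == "var":
        return sample[tree[1]]
    if kind == "const":
        return tree[1]
    return OPS[kind](evaluate(tree[1], sample), evaluate(tree[2], sample))

def subtrees(tree):
    """Yield every subtree of a formula tree, including the tree itself."""
    yield tree
    if tree[0] in OPS:
        yield from subtrees(tree[1])
        yield from subtrees(tree[2])

def structural_similarity(t1, t2):
    """Assumed structural measure: multiset Jaccard overlap of subtrees."""
    c1, c2 = Counter(subtrees(t1)), Counter(subtrees(t2))
    shared = sum((c1 & c2).values())  # multiset intersection
    total = sum((c1 | c2).values())   # multiset union
    return shared / total

def evaluation_similarity(t1, t2, samples):
    """Assumed evaluation measure: 1 minus the mean absolute output
    difference, scaled by the joint output range."""
    y1 = [evaluate(t1, s) for s in samples]
    y2 = [evaluate(t2, s) for s in samples]
    span = max(y1 + y2) - min(y1 + y2)
    if span == 0:
        return 1.0  # both models are constant and identical on the samples
    mean_diff = sum(abs(a - b) for a, b in zip(y1, y2)) / len(samples)
    return 1.0 - mean_diff / span

def pearson(xs, ys):
    """Pearson linear correlation coefficient of two equally long sequences.
    Raises ZeroDivisionError if either sequence is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# x + 1 and 1 + x differ as trees but compute the same function.
x = ("var", "x")
f = ("add", x, ("const", 1))
g = ("add", ("const", 1), x)
samples = [{"x": v} for v in range(-3, 4)]

print(structural_similarity(f, g))           # 0.5
print(evaluation_similarity(f, g, samples))  # 1.0
```

In this toy case the two trees share only their leaves, so the structural score is moderate while the evaluation score is maximal. Collecting both scores for all pairs of models in a population and passing the two lists to `pearson` gives a linear correlation of the kind the abstract refers to.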





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Stephan M. Winkler (1)
  1. Department for Medical and Bioinformatics, Upper Austria University of Applied Sciences, Heuristic and Evolutionary Algorithms Laboratory
