Improving the Generalisation Ability of Genetic Programming with Semantic Similarity based Crossover

  • Nguyen Quang Uy
  • Nguyen Thi Hien
  • Nguyen Xuan Hoai
  • Michael O’Neill
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6021)


This paper examines the impact of semantic control on the ability of Genetic Programming (GP) to generalise, using a semantics-based crossover operator, Semantic Similarity based Crossover (SSC). The use of validation sets is also investigated for both standard crossover and SSC. All GP systems are tested on a number of real-valued symbolic regression problems. The experimental results show that while using validation sets barely improves the generalisation ability of GP, using semantics enhances the performance of GP on both training and testing data. Further recorded statistics show that the solutions evolved with SSC are often smaller than those obtained from GP systems that do not use semantics. This can be seen as one of the reasons for the success of SSC in improving the generalisation ability of GP.
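
The abstract does not spell out how SSC decides that two subtrees are semantically similar. As a rough illustration only, the sketch below assumes that a subtree's semantics is approximated by its outputs on a fixed set of sample points, and that two subtrees count as similar when the mean absolute difference of those outputs lies strictly between a lower and an upper bound. The names `sampling_semantics`, `semantic_distance`, `is_similar`, and `pick_similar_pair`, the bounds `lower` and `upper`, the retry limit `max_trials`, and the use of plain Python callables in place of GP trees are all assumptions made for this example, not details taken from the abstract.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical stand-in for a GP subtree: any function of one real variable.
Expr = Callable[[float], float]


def sampling_semantics(expr: Expr, points: Sequence[float]) -> List[float]:
    """Approximate a subtree's semantics by its outputs on sample points."""
    return [expr(x) for x in points]


def semantic_distance(a: Expr, b: Expr, points: Sequence[float]) -> float:
    """Mean absolute difference between two subtrees' sampled outputs."""
    sa = sampling_semantics(a, points)
    sb = sampling_semantics(b, points)
    return sum(abs(u - v) for u, v in zip(sa, sb)) / len(points)


def is_similar(a: Expr, b: Expr, points: Sequence[float],
               lower: float = 1e-4, upper: float = 0.4) -> bool:
    """Similar means: not (near-)identical, yet not too far apart.

    The default bounds are illustrative assumptions.
    """
    return lower < semantic_distance(a, b, points) < upper


def pick_similar_pair(subtrees_1: Sequence[Expr], subtrees_2: Sequence[Expr],
                      points: Sequence[float], max_trials: int = 12,
                      rng: Optional[random.Random] = None) -> Tuple[Expr, Expr]:
    """Try up to max_trials random pairs and return the first similar one.

    Falls back to an unconstrained random pair (assumed behaviour) if no
    similar pair is found within the trial budget.
    """
    rng = rng or random.Random()
    for _ in range(max_trials):
        a, b = rng.choice(subtrees_1), rng.choice(subtrees_2)
        if is_similar(a, b, points):
            return a, b
    return rng.choice(subtrees_1), rng.choice(subtrees_2)
```

In a full GP system the returned pair would be the subtrees swapped between the two parents. The lower bound is meant to discourage exchanges that change nothing semantically, and the upper bound to discourage large, disruptive jumps.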


Keywords: Genetic Programming, Semantics, Generalisation, Crossover





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Nguyen Quang Uy (1)
  • Nguyen Thi Hien (2)
  • Nguyen Xuan Hoai (2)
  • Michael O’Neill (1)

  1. Natural Computing Research & Applications Group, University College Dublin, Ireland
  2. School of Information Technology, Vietnamese Military Technical Academy, Vietnam
