Genetic Programming, Validation Sets, and Parsimony Pressure

  • Christian Gagné
  • Marc Schoenauer
  • Marc Parizeau
  • Marco Tomassini
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3905)

Abstract

Fitness functions based on test cases are very common in Genetic Programming (GP). This process can be seen as a learning task, in which models are inferred from a limited number of samples. This paper investigates two methods for improving generalization in GP-based learning: 1) selecting the best-of-run individual using a three data sets methodology (training, validation, and test sets), and 2) applying parsimony pressure to reduce the complexity of the solutions. Results using GP in a binary classification setup show that the tested methods preserve accuracy on the test sets, with lower variance than the baseline results, while significantly reducing the mean tree size of the solutions.
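The two methods summarized above can be illustrated with a minimal sketch (all names, the toy error values, and the tie-breaking rule are illustrative assumptions, not taken from the paper): the best-of-run individual is chosen on a held-out validation set rather than on the training set, with tree size used as a lexicographic tie-breaker, which is one simple form of parsimony pressure; the test set is reserved for reporting final accuracy.

```python
import random

# Hypothetical individuals represented as tuples:
# (training_error, validation_error, tree_size).
# In an actual GP run these values would come from evaluating
# each evolved tree on the training and validation sets.
random.seed(0)
population = [(random.random(), random.random(), random.randint(5, 200))
              for _ in range(50)]

def select_best_of_run(pop):
    """Pick the best-of-run individual on the VALIDATION error,
    breaking near-ties lexicographically by tree size
    (a simple parsimony-pressure rule, assumed for illustration)."""
    return min(pop, key=lambda ind: (round(ind[1], 3), ind[2]))

best = select_best_of_run(population)
# The test set would be used only once, on `best`, to report
# the final generalization error.
```

The point of the three-set split is that the validation set steers model selection without ever contaminating the test estimate, while the size tie-breaker biases selection toward smaller, and often better-generalizing, trees.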

Keywords

Pareto Front · Tree Size · Technical Trading Rule · Evolvable Machine · Code Growth

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Christian Gagné¹ ² ³
  • Marc Schoenauer¹
  • Marc Parizeau²
  • Marco Tomassini³
  1. Équipe TAO – INRIA Futurs, LRI Bat. 490, Université Paris Sud, Orsay, France
  2. Laboratoire de Vision et Systèmes Numériques (LVSN), Département de Génie Électrique et de Génie Informatique, Université Laval, Québec (QC), Canada
  3. Information Systems Institute, Université de Lausanne, Dorigny, Switzerland