Where Should We Stop? An Investigation on Early Stopping for GP Learning

  • Thi Hien Nguyen
  • Xuan Hoai Nguyen
  • Bob McKay
  • Quang Uy Nguyen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7673)

Abstract

We investigate the impact of early stopping on the speed and accuracy of Genetic Programming (GP) learning from noisy data. Early stopping, using a popular stopping criterion, maintains the generalisation capacity of GP while significantly reducing its training time.
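The abstract does not say which stopping criterion is meant. As a minimal sketch, assume a generalisation-loss rule of the kind Prechelt describes in "Early Stopping - But When?": stop once the current validation error exceeds the best validation error seen so far by more than a threshold percentage. The function names, the `alpha` default, and the idea of recording the best individual's validation error once per generation are all illustrative assumptions, not details taken from the paper.

```python
from typing import Sequence


def generalisation_loss(val_errors: Sequence[float]) -> float:
    """Percentage by which the latest validation error exceeds the best so far."""
    best = min(val_errors)
    if best <= 0.0:
        # Hypothetical guard: a perfect validation score leaves no
        # positive baseline to divide by.
        return 0.0 if val_errors[-1] <= 0.0 else float("inf")
    return 100.0 * (val_errors[-1] / best - 1.0)


def stopping_generation(val_errors: Sequence[float], alpha: float = 5.0) -> int:
    """Return the first generation at which the loss exceeds alpha.

    If the criterion never fires, return the index of the last generation.
    """
    for t in range(len(val_errors)):
        if generalisation_loss(val_errors[: t + 1]) > alpha:
            return t
    return len(val_errors) - 1


if __name__ == "__main__":
    # Made-up validation errors, one per generation (assumption).
    history = [1.0, 0.8, 0.7, 0.72, 0.9]
    print(stopping_generation(history, alpha=5.0))
```

In an actual GP run the error list would grow by one entry per generation and the check would run inside the evolutionary loop, so that evolution halts as soon as the threshold is crossed instead of running to the generation limit.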

Keywords

Genetic Programming; Evolutionary Computation; Generalisation Error; Symbolic Regression; Grammatical Evolution
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Thi Hien Nguyen (1)
  • Xuan Hoai Nguyen (2)
  • Bob McKay (3)
  • Quang Uy Nguyen (1)

  1. Le Quy Don University, Vietnam
  2. Hanoi University, Vietnam
  3. Seoul National University, Korea