Validation Sets for Evolutionary Curtailment with Improved Generalisation

  • Jeannie Fitzgerald
  • Conor Ryan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6935)

Abstract

This paper investigates the use of a validation data set with Genetic Programming (GP) to counteract over-fitting. It considers fitness on both training and validation data, combined with an early stopping mechanism, to improve generalisation while significantly reducing run times.

The method is tested on six benchmark binary classification data sets. Results of this preliminary investigation suggest that the strategy can deliver equivalent or improved results on test data.
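The abstract does not spell out the stopping rule, so the following is only a generic patience-based sketch of validation-driven early stopping, not the authors' mechanism. The function name `evolve_with_early_stopping`, its parameters, and the scripted scores are all hypothetical, chosen for illustration.

```python
from typing import Callable, Tuple


def evolve_with_early_stopping(
    step: Callable[[], object],
    validation_fitness: Callable[[object], float],
    max_generations: int = 100,
    patience: int = 5,
) -> Tuple[object, float, int]:
    """Generic sketch (an assumption, not the paper's algorithm).

    `step` advances the run by one generation and returns that generation's
    best-on-training individual. The run stops once validation fitness has
    failed to improve for `patience` consecutive generations.
    """
    best_model, best_val, stale = None, float("-inf"), 0
    generations_run = 0
    for gen in range(max_generations):
        candidate = step()
        val = validation_fitness(candidate)
        generations_run = gen + 1
        if val > best_val:
            # Validation fitness improved: remember this individual.
            best_model, best_val, stale = candidate, val, 0
        else:
            stale += 1
            if stale >= patience:
                break  # curtail the run early
    return best_model, best_val, generations_run


# Scripted validation scores stand in for a real GP run (hypothetical data).
scores = [0.5, 0.6, 0.7, 0.65, 0.66, 0.64, 0.9]
generation_ids = iter(range(len(scores)))
best, val, gens = evolve_with_early_stopping(
    step=lambda: next(generation_ids),
    validation_fitness=lambda i: scores[i],
    max_generations=len(scores),
    patience=3,
)
print(best, val, gens)  # 2 0.7 6
```

In this toy run the loop stops after six generations and returns the individual from generation 2, never reaching the later score of 0.9. That illustrates the trade-off any such rule makes between shorter run times and the risk of stopping too soon.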

Keywords

Genetic Programming, Evolutionary Computation, Trigger Point, Average Validation, Standard Genetic Programming
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  1. Jeannie Fitzgerald, BDS Group, CSIS Department, University of Limerick, Ireland
  2. Conor Ryan, BDS Group, CSIS Department, University of Limerick, Ireland
