Dynamic training subset selection for supervised learning in Genetic Programming

  • Chris Gathercole
  • Peter Ross
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 866)


When the Genetic Programming (GP) algorithm is applied to a difficult problem with a large set of training cases, it needs a large population and must carry out a very large number of function-tree evaluations. This paper describes how to reduce the number of such evaluations by running the GP algorithm on a small, selected subset of the training data set.

The paper describes three subset selection methods:
  • Dynamic Subset Selection (DSS), using the current GP run to select ‘difficult’ and/or disused cases,

  • Historical Subset Selection (HSS), using previous GP runs,

  • Random Subset Selection (RSS).
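The abstract does not give the selection rules themselves, so the following is only a minimal sketch of how DSS-style and RSS-style selection might look. The per-case `difficulty` and `age` counters, the exponents `d_exp` and `a_exp`, the additive weighting, and all function names are illustrative assumptions, not the authors' implementation.

```python
import random

def select_subset(difficulty, age, target_size, d_exp=1.0, a_exp=1.0, rng=random):
    """DSS-style pick (assumed scheme): favour hard and long-unused cases.

    difficulty[i]: how often case i was misclassified when last used (assumed).
    age[i]: generations since case i was last selected (assumed).
    Returns indices; the subset size is target_size only on average.
    """
    weights = [d ** d_exp + a ** a_exp for d, a in zip(difficulty, age)]
    total = sum(weights)
    if total == 0:
        # No information yet: fall back to a uniform random subset.
        return random_subset(len(weights), min(target_size, len(weights)), rng)
    # Include each case independently with probability proportional to its weight.
    return [i for i, w in enumerate(weights)
            if rng.random() < min(1.0, target_size * w / total)]

def update_bookkeeping(difficulty, age, chosen, misses):
    """After a generation: refresh selected cases, age the unused ones."""
    chosen_set = set(chosen)
    for i in range(len(age)):
        if i in chosen_set:
            difficulty[i] = misses.get(i, 0)  # misclassification count this generation
            age[i] = 0
        else:
            age[i] += 1

def random_subset(n_cases, target_size, rng=random):
    """RSS-style pick: a uniform random subset of fixed size."""
    return sorted(rng.sample(range(n_cases), target_size))
```

In this sketch a fresh subset would be drawn every generation, so a case that no program gets wrong still returns to the subset eventually as its age grows.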

Various runs have shown that GP+DSS can produce better results than standard GP in less than 20% of the time. GP+HSS can nearly match the results of GP and, perhaps surprisingly, GP+RSS can occasionally approach them. On a substantial problem known as the Thyroid problem, GP+DSS also produced better, more general results than those reported in a paper for a variety of neural networks.





Copyright information

© Springer-Verlag Berlin Heidelberg 1994

Authors and Affiliations

  • Chris Gathercole (1)
  • Peter Ross (1)
  1. Department of Artificial Intelligence, University of Edinburgh, Edinburgh, UK
