Dynamic training subset selection for supervised learning in Genetic Programming
When the Genetic Programming (GP) algorithm is applied to a difficult problem with a large set of training cases, a large population size is needed and a very large number of function-tree evaluations must be carried out. This paper describes how to reduce the number of such evaluations by selecting a small subset of the training data set on which to run the GP algorithm.
Three subset selection methods are described:
- Dynamic Subset Selection (DSS), which uses the current GP run to select ‘difficult’ and/or disused cases;
- Historical Subset Selection (HSS), which uses previous GP runs;
- Random Subset Selection (RSS).
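As an illustration only, the ideas behind DSS and RSS might be sketched as below. The abstract does not state the selection rule, so everything here is an assumption: the weighting (a case's difficulty plus its age, each raised to a hypothetical exponent), the function names `select_subset`, `random_subset` and `update_case_stats`, and the default exponent values. "Difficulty" is taken to mean how often a case was misclassified when it was last used, and "age" the number of generations since it was last selected, which is one way to read "disused".

```python
import random


def select_subset(difficulty, age, target_size, d_exp=1.0, a_exp=3.5, rng=random):
    """DSS-style selection (assumed rule): each case is picked independently
    with probability proportional to difficulty**d_exp + age**a_exp, scaled
    so that the expected subset size is roughly target_size."""
    weights = [d ** d_exp + a ** a_exp for d, a in zip(difficulty, age)]
    total = sum(weights)
    n = len(weights)
    if total == 0:
        # No information yet: every case is equally likely.
        probs = [min(1.0, target_size / n)] * n
    else:
        probs = [min(1.0, w * target_size / total) for w in weights]
    return [i for i, p in enumerate(probs) if rng.random() < p]


def random_subset(n_cases, target_size, rng=random):
    """RSS-style selection: a uniform random sample of case indices."""
    return sorted(rng.sample(range(n_cases), target_size))


def update_case_stats(difficulty, age, subset, errors):
    """After a generation, refresh the statistics in place (assumed scheme).

    errors maps a case index to the number of individuals that got that
    case wrong in this generation. Selected cases have their age reset,
    unselected cases grow older and keep their previous difficulty."""
    chosen = set(subset)
    for i in range(len(age)):
        if i in chosen:
            difficulty[i] = errors.get(i, 0)
            age[i] = 1
        else:
            age[i] += 1
```

In a GP loop under these assumptions, `select_subset` would be called once per generation, the population evaluated only on the returned indices, and `update_case_stats` called with the per-case error counts. Because cases are picked independently, the subset size varies from generation to generation around `target_size`.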
Various runs have shown that GP+DSS can produce better results in less than 20% of the time taken by GP. GP+HSS can nearly match the results of GP, and, perhaps surprisingly, GP+RSS can occasionally approach the results of GP. GP+DSS also produced better, more general results than those reported in a paper for a variety of Neural Networks when used on a substantial problem, known as the Thyroid problem.