Using Genetic Algorithms for Training Data Selection in RBF Networks
The problem of generalization in the application of neural networks (NNs) to classification and regression problems has been addressed from many different viewpoints. The basic problem is well-known: minimization of an error function on a training set may lead to poor performance on data not included in the training set—a phenomenon sometimes called over-fitting.
In this paper we report on an approach that is inspired by data editing concepts in k-nearest neighbour methods, and by outlier detection in traditional statistics. The assumption is made that not all the data are equally useful in fitting the underlying (but unknown) function—in fact, some points may be positively misleading. We use a genetic algorithm (GA) to identify a ‘good’ training set for fitting radial basis function (RBF) networks, and test the methodology on two artificial classification problems, and on a real regression problem. Empirical results show that improved generalization can indeed be obtained using this approach.
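The selection scheme described above can be sketched in code. This is an illustrative reconstruction under our own assumptions, not the authors' implementation: a GA evolves binary inclusion masks over the training points, an RBF model (Gaussian basis with centres at the selected points, ridge-regularised least squares) is fitted on each selected subset, and the GA fitness is the error on a held-out validation set. The population size, mutation rate, and RBF width are arbitrary choices for the sketch.

```python
# Sketch (not the paper's code): GA selects a training subset for an RBF model.
import numpy as np

rng = np.random.default_rng(0)

def rbf_fit_predict(X_tr, y_tr, X_te, width=0.2):
    """Fit a Gaussian RBF model with centres at the training points
    (ridge-regularised least squares) and predict at X_te."""
    def design(A, C):
        d2 = ((A[:, None, :] - C[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * width ** 2))
    Phi = design(X_tr, X_tr)
    w = np.linalg.solve(Phi.T @ Phi + 1e-6 * np.eye(len(X_tr)), Phi.T @ y_tr)
    return design(X_te, X_tr) @ w

def fitness(mask, X, y, X_val, y_val):
    """Validation MSE of an RBF model trained only on the selected points."""
    if mask.sum() < 3:                      # require a minimum of centres
        return np.inf
    pred = rbf_fit_predict(X[mask], y[mask], X_val)
    return np.mean((pred - y_val) ** 2)

def ga_select(X, y, X_val, y_val, pop=20, gens=30):
    """Simple generational GA over binary inclusion masks."""
    n = len(X)
    P = rng.random((pop, n)) < 0.7          # initial population of masks
    for _ in range(gens):
        f = np.array([fitness(m, X, y, X_val, y_val) for m in P])
        P = P[np.argsort(f)]                # sort by ascending validation MSE
        elite = P[: pop // 2]               # truncation selection
        children = []
        while len(children) < pop - len(elite):
            a, b = elite[rng.integers(len(elite), size=2)]
            cut = rng.integers(1, n)        # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            flip = rng.random(n) < 1.0 / n  # bit-flip mutation
            children.append(child ^ flip)
        P = np.vstack([elite, children])
    f = np.array([fitness(m, X, y, X_val, y_val) for m in P])
    return P[np.argmin(f)]

# Toy regression with two deliberately corrupted ('misleading') points.
X = np.linspace(0, 1, 30)[:, None]
y = np.sin(2 * np.pi * X[:, 0])
y[[5, 20]] += 3.0                           # outliers
X_val = np.linspace(0.02, 0.98, 25)[:, None]
y_val = np.sin(2 * np.pi * X_val[:, 0])

best = ga_select(X, y, X_val, y_val)
err_all = fitness(np.ones(len(X), bool), X, y, X_val, y_val)
err_ga = fitness(best, X, y, X_val, y_val)
print(err_ga < err_all)  # selected subset should generalize better
```

On this toy problem the GA tends to drop the corrupted points, since any mask that excludes them scores a much lower validation error; this mirrors the paper's premise that some points are positively misleading for fitting.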
Keywords: Genetic algorithms, radial basis functions, classification, regression, generalization, forecasting