Improving the generalization performance of multi-layer perceptrons with population-based incremental learning
Based on Population-Based Incremental Learning (PBIL), we present a new approach for evolving neural network architectures together with their weights. The main idea is to represent the population of networks in each generation by a probability vector rather than by a set of bit strings. We show that crucial issues of neural network training can be integrated effectively into the PBIL framework. First, a Quasi-Newton method for local weight optimization is integrated, and the moving-average update rule of PBIL is extended to continuous parameters in order to transmit the best network to the next generation. Second, and more importantly, we incorporate cross-validation to focus the evolution on networks with optimal generalization performance. A comparison with standard pruning and stopped-training algorithms shows that our approach effectively finds small networks with increased generalization ability.
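The basic PBIL loop that the abstract builds on can be illustrated with a minimal sketch. It assumes a plain binary encoding and a toy fitness function (the number of ones in the bit string) standing in for a validation score. The function name `pbil`, its parameters, and the fitness function are hypothetical and chosen for illustration only. The paper's extensions (continuous weight parameters, Quasi-Newton local optimization, cross-validation) are not reproduced here.

```python
import random


def pbil(fitness, n_bits, pop_size=20, generations=50, learning_rate=0.1, seed=0):
    """Basic PBIL sketch: evolve a probability vector instead of a stored population."""
    rng = random.Random(seed)
    # Start with no preference: every bit is 1 with probability 0.5.
    prob = [0.5] * n_bits
    best, best_fit = None, float("-inf")

    for _ in range(generations):
        # Sample this generation's candidates from the probability vector.
        population = [
            [1 if rng.random() < p else 0 for p in prob]
            for _ in range(pop_size)
        ]
        # Pick the best candidate of the generation.
        winner = max(population, key=fitness)
        winner_fit = fitness(winner)
        if winner_fit > best_fit:
            best, best_fit = winner, winner_fit
        # Moving-average update: shift each probability toward the winner's bit.
        prob = [
            (1.0 - learning_rate) * p + learning_rate * bit
            for p, bit in zip(prob, winner)
        ]

    return prob, best, best_fit


def count_ones(bits):
    # Toy fitness (assumption): in the paper's setting this would be a
    # generalization estimate of the network encoded by the bits.
    return sum(bits)


if __name__ == "__main__":
    prob, best, best_fit = pbil(count_ones, n_bits=12)
    print("final probability vector:", [round(p, 2) for p in prob])
    print("best bit string:", best, "fitness:", best_fit)
```

In a setting like the one described, each bit could mark the presence of a connection or hidden unit, so that the probability vector summarizes which architectural elements the search currently favours.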