Architecture Optimization in Feedforward Connectionist Models
Given a set of training examples, determining the number of free parameters is a fundamental problem in neural network modeling. The number of such parameters influences the quality of the solution obtained. This paper deals with the problem of adapting the effective network complexity to the information contained in the training data set and to the difficulty of the task. The method we propose consists of choosing an oversized network architecture, training it until it is assumed to be close to a minimum of the training error, then selecting the most important input variables and pruning irrelevant hidden neurones. This method extends our previous method for input variable selection; it is simple, cheap and effective. We show its effect experimentally on one classification problem and one regression problem.
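The select-and-prune step described above can be sketched as follows for a network with one hidden layer that has already been trained. This is only an illustrative sketch: the abstract does not state the relevance measure, so the one used here (each connection's share of the absolute incoming weight of its target unit, propagated from the outputs back to the inputs) is an assumption, as are the function names `relevance_scores` and `prune`, the omission of biases, and the example weights. It also assumes that no unit has all-zero incoming weights.

```python
import numpy as np


def relevance_scores(w_hidden, w_out):
    """Score inputs and hidden units from trained weights (assumed measure).

    w_hidden: array of shape (n_hidden, n_inputs)
    w_out:    array of shape (n_outputs, n_hidden)
    """
    # Share of each hidden unit in the total absolute weight entering each output.
    out_share = np.abs(w_out) / np.abs(w_out).sum(axis=1, keepdims=True)
    hidden_rel = out_share.sum(axis=0)  # shape (n_hidden,)

    # Share of each input in the total absolute weight entering each hidden unit,
    # weighted by the relevance of that hidden unit.
    hid_share = np.abs(w_hidden) / np.abs(w_hidden).sum(axis=1, keepdims=True)
    input_rel = hidden_rel @ hid_share  # shape (n_inputs,)
    return input_rel, hidden_rel


def prune(w_hidden, w_out, keep_inputs, keep_hidden):
    """Keep the highest-scoring inputs and hidden units; return reduced weights."""
    input_rel, hidden_rel = relevance_scores(w_hidden, w_out)
    in_idx = np.sort(np.argsort(input_rel)[::-1][:keep_inputs])
    hid_idx = np.sort(np.argsort(hidden_rel)[::-1][:keep_hidden])
    return w_hidden[np.ix_(hid_idx, in_idx)], w_out[:, hid_idx], in_idx, hid_idx


# Hypothetical trained weights: 3 inputs, 3 hidden units, 1 output.
w_hidden = np.array([[2.0, 0.1, 1.0],
                     [1.5, 0.05, 0.5],
                     [0.01, 0.02, 0.01]])
w_out = np.array([[1.0, 0.8, 0.01]])

small_hidden, small_out, in_idx, hid_idx = prune(w_hidden, w_out, 2, 2)
```

In practice the reduced network would be retrained and its validation error compared with that of the oversized network before pruning further; the number of units to keep is a choice left to the user in this sketch.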
- Yacoub M. and Bennani Y.: HVS: A Heuristic for Variable Selection in Multilayer Artificial Neural Network Classifier. In Intelligent Engineering Systems Through Artificial Neural Networks, Vol. 7: C. Dagli, M. Akay, O. Ersoy, B. Fernandez and A. Smith (Editors), pp. 527–532 (1997).
- Yacoub M. and Bennani Y.: A Neural Network Methodology for Machines’ Class Identification. Proc. IEEE International Joint Conference on Neural Networks (IJCNN’98), Vol. 1, pp. 322–325 (1998).
- Breiman L., Friedman J., Olshen R., Stone C.: Classification and regression trees. Wadsworth Int. Group (1984).
- De Bollivier M., Gallinari P., Thiria S.: Cooperation of neural nets and task decomposition. International Joint Conference on Neural Networks (IJCNN’91), Vol. 2, pp. 573–576 (1991).
- Goutte C.: On the use of pruning prior for neural networks. In Neural Network for Signal Processing VI, pp. 52–61 (1996).
- Svarer C., Hansen L.K., Larsen J.: On design and evaluation of tapped-delay neural networks architectures. In IEEE International Conference on Neural Networks, pp. 46–51 (1993).