Abstract
This paper presents simple and practical approaches for controlling the complexity of neural networks (NNs) in order to optimize their generalization ability. Several formal and heuristic methods have been proposed in the literature for improving the performance of NNs. It is of major importance for the user to understand which of these methods are of practical use and which are the most efficient. We try here to bridge the gap between specialists in these techniques and practitioners by presenting and analyzing a selection of methods chosen both for their simplicity and their efficiency. We consider only supervised learning.
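As a concrete illustration of the kind of complexity-control technique discussed in this chapter, here is a minimal sketch (not taken from the paper itself) of L2 weight decay, a Tikhonov-style regularizer, applied to a one-hidden-layer perceptron trained by gradient descent on a toy regression task. All names and the toy data are illustrative assumptions; the `weight_decay` hyperparameter is the knob that trades training error against model complexity.

```python
# Illustrative sketch: one-hidden-layer MLP with an L2 weight-decay penalty
# 0.5 * weight_decay * ||W||^2 added to the squared-error loss. Larger
# weight_decay shrinks the weights, smoothing the fitted function and
# (up to a point) improving generalization on noisy data.
import numpy as np

rng = np.random.default_rng(0)

# Toy supervised data: a noisy sine wave
X = np.linspace(-3, 3, 40).reshape(-1, 1)
y = np.sin(X) + 0.1 * rng.standard_normal(X.shape)

# One hidden layer of tanh units
n_hidden = 30
W1 = 0.5 * rng.standard_normal((1, n_hidden))
b1 = np.zeros(n_hidden)
W2 = 0.5 * rng.standard_normal((n_hidden, 1))
b2 = np.zeros(1)

lr = 0.01           # learning rate
weight_decay = 1e-3 # regularization strength (complexity-control hyperparameter)

for step in range(5000):
    # Forward pass
    h = np.tanh(X @ W1 + b1)
    y_hat = h @ W2 + b2
    err = y_hat - y

    # Backpropagation of the squared-error term
    dW2 = h.T @ err / len(X)
    db2 = err.mean(axis=0)
    dh = (err @ W2.T) * (1.0 - h**2)
    dW1 = X.T @ dh / len(X)
    db1 = dh.mean(axis=0)

    # Gradient step; the extra weight_decay * W term is the gradient
    # of the penalty 0.5 * weight_decay * ||W||^2
    W1 -= lr * (dW1 + weight_decay * W1)
    b1 -= lr * db1
    W2 -= lr * (dW2 + weight_decay * W2)
    b2 -= lr * db2

print("final training MSE:", float((err**2).mean()))
```

In practice `weight_decay` would be selected on a validation set, which is exactly the kind of model-selection question the chapter addresses.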
Copyright information
© 1998 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Gallinari, P., Cibas, T. (1998). Complexity Control and Generalization in Multilayer Perceptrons. In: Aurifeille, JM., Deissenberg, C. (eds) Bio-Mimetic Approaches in Management Science. Advances in Computational Management Science, vol 1. Springer, Boston, MA. https://doi.org/10.1007/978-1-4757-2821-7_2
DOI: https://doi.org/10.1007/978-1-4757-2821-7_2
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-4791-8
Online ISBN: 978-1-4757-2821-7