Cross-validation in MLP Training

Bourlard, Hervé A.; Morgan, Nelson

doi:10.1007/978-1-4615-3210-1_12

Part of the book series: The Springer International Series in Engineering and Computer Science (SECS, volume 247)

Abstract

It is well known that system models which have too many parameters (with respect to the number of measurements) do not generalize well to new measurements. For instance, an autoregressive (AR) model can be derived which will represent the training data with no error by using as many parameters as there are data points. This would generally be of no value, as it would only represent the training data. Criteria such as the Akaike Information Criterion (AIC) [Akaike, 1974, 1986] can be used to penalize both the complexity of AR models and their training error variance. In feedforward nets, we do not currently have such a measure. In fact, given the aim of building systems which are biologically plausible, there is a temptation to assume the usefulness of indefinitely large adaptive networks. In contrast to our best guess at Nature’s tricks, man-made systems for pattern recognition seem to require nasty amounts of data for training. In short, the design of massively parallel systems is limited by the number of parameters that can be learned with available training data. It is likely that the only way truly massive systems can be built is with the help of prior information, e.g., connection topology and weights that need not be learned [Feldman et al., 1988]. Learning theory [Valiant, 1984; Pearl, 1978] has begun to establish what is possible for trained systems. Order-of-magnitude lower bounds have been established for the number of measurements required to train a feedforward net of a desired size [Baum & Haussler, 1988]. Rules of thumb suggesting the number of samples required for specific distributions could be useful for practical problems. Widrow has suggested having a training sample size that is 10 times the number of weights in a network (“Uncle Bernie’s Rule”) [Widrow, 1987].
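As a rough illustration of the rule of thumb quoted above, the following Python sketch counts the connection weights of a fully connected feedforward net and multiplies by 10. The function names, the `factor` parameter, the option to count biases, and the layer sizes in the usage lines are all assumptions made for this example; they are not taken from the chapter.

```python
def mlp_weight_count(layer_sizes, include_biases=False):
    """Count trainable parameters of a fully connected feedforward net.

    layer_sizes lists the number of units per layer, input first.
    Whether biases count as "weights" for the rule of thumb is an
    assumption; they are excluded by default.
    """
    if len(layer_sizes) < 2:
        raise ValueError("need at least an input and an output layer")
    # One weight per connection between consecutive layers.
    weights = sum(n_in * n_out
                  for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))
    # One bias per non-input unit, only if requested.
    biases = sum(layer_sizes[1:]) if include_biases else 0
    return weights + biases


def rule_of_thumb_samples(layer_sizes, factor=10, include_biases=False):
    """Training set size suggested by the 'factor times weights' heuristic."""
    return factor * mlp_weight_count(layer_sizes, include_biases)


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen only for illustration.
    sizes = [234, 500, 61]
    print("weights:", mlp_weight_count(sizes))
    print("suggested training samples:", rule_of_thumb_samples(sizes))
```

Even a modest hidden layer pushes the suggested training set size into the millions of samples, which is the practical constraint the abstract describes.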

We should be careful to get out of an experience only the wisdom that is in it — and stop there; lest we be like the cat that sits down on a hot stove-lid. She will never sit down on a hot stove-lid again — and that is well; but also she will never sit down on a cold one anymore. - Mark Twain -

Author information

Authors and Affiliations

Lernout & Hauspie Speech Products, Belgium
Hervé A. Bourlard
International Computer Science Institute, Berkeley, CA, USA
Hervé A. Bourlard & Nelson Morgan
University of California, Berkeley, CA, USA
Nelson Morgan

About this chapter

Cite this chapter

Bourlard, H.A., Morgan, N. (1994). Cross-validation in MLP Training. In: Connectionist Speech Recognition. The Springer International Series in Engineering and Computer Science, vol 247. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-3210-1_12

DOI: https://doi.org/10.1007/978-1-4615-3210-1_12
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-6409-2
Online ISBN: 978-1-4615-3210-1
eBook Packages: Springer Book Archive
