Abstract
Choosing the architecture of a neural network is one of the most important steps in making it practically useful, yet accounts of applications usually sweep these details under the carpet. How many hidden units are needed? Should weight decay be used, and if so how much? What type of output units should be chosen? And so on.
We address these issues within the framework of statistical theory for model choice, which provides a number of workable approximate answers.
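The questions raised above can be framed as a model-choice problem: fit each candidate architecture and compare them on held-out data. The following is a minimal illustrative sketch, not the paper's method; the toy data, network sizes, and hyperparameters are all invented for this example. It trains one-hidden-layer tanh networks with varying hidden-unit counts and weight-decay penalties, then picks the combination with the lowest validation error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression problem: y = sin(x) + noise
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)

# Hold out part of the data for validation
X_tr, y_tr = X[:150], y[:150]
X_va, y_va = X[150:], y[150:]

def fit_mlp(X, y, hidden, decay, steps=2000, lr=0.05, seed=0):
    """Train a one-hidden-layer tanh network by full-batch gradient
    descent on squared error plus an L2 weight-decay penalty."""
    rs = np.random.default_rng(seed)
    W1 = rs.standard_normal((X.shape[1], hidden)) * 0.5
    b1 = np.zeros(hidden)
    W2 = rs.standard_normal(hidden) * 0.5
    b2 = 0.0
    n = len(y)
    for _ in range(steps):
        H = np.tanh(X @ W1 + b1)          # hidden activations, (n, hidden)
        err = H @ W2 + b2 - y             # residuals, (n,)
        # Gradients of (1/2n) * sum(err^2) + decay * ||W||^2
        gW2 = H.T @ err / n + 2 * decay * W2
        gb2 = err.mean()
        dH = np.outer(err, W2) * (1 - H ** 2)
        gW1 = X.T @ dH / n + 2 * decay * W1
        gb1 = dH.mean(axis=0)
        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2
    return W1, b1, W2, b2

def mse(params, X, y):
    W1, b1, W2, b2 = params
    pred = np.tanh(X @ W1 + b1) @ W2 + b2
    return float(np.mean((pred - y) ** 2))

# Score each candidate (hidden units, weight decay) on the held-out set
results = {}
for hidden in (1, 4, 16):
    for decay in (0.0, 1e-3):
        params = fit_mlp(X_tr, y_tr, hidden, decay)
        results[(hidden, decay)] = mse(params, X_va, y_va)

best = min(results, key=results.get)
print("best (hidden, decay):", best, "val MSE:", round(results[best], 4))
```

Validation-set comparison is only one of the approximate answers the statistical framework offers; information criteria and cross-validation apply the same idea without sacrificing training data to a single held-out split.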
© 1995 Springer-Verlag London Limited
Ripley, B.D. (1995). Statistical Ideas for Selecting Network Architectures. In: Kappen, B., Gielen, S. (eds) Neural Networks: Artificial Intelligence and Industrial Applications. Springer, London. https://doi.org/10.1007/978-1-4471-3087-1_36
Print ISBN: 978-3-540-19992-2
Online ISBN: 978-1-4471-3087-1