# Foundations of Statistical Learning and Model Selection

## Abstract

**What the reader should know to understand this chapter**

- Basic notions of machine learning.
- Notions of calculus.
- Chapter 5.

## Keywords

Akaike Information Criterion; Bayesian Information Criterion; Test Error; Generalization Error; Structural Risk Minimization

## Copyright information

© Springer-Verlag London 2015