Introduction
In applications there are usually several models for describing a population from a given sample of observations and one is thus confronted with the problem of model selection. For example, different distributions can be fitted to a given sample of univariate observations; in polynomial regression one has to decide which degree of the polynomial to use; in multivariate regression one has to select which covariates to include in the model; in fitting an autoregressive model to a stationary time series one must choose which order to use.
When the set of models under consideration is nested, as is the case in polynomial regression, the fit of the model to the sample improves as the complexity of the model (e.g., the number of parameters) increases but, at some stage, its fit to the population deteriorates. That is because the model increasingly moulds itself to the features of the sample rather than to the “true model,” namely the one that characterizes the population. The...
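The trade-off described above can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' method: the quadratic "true model", the noise level, the sample sizes, and the function name `fit_errors` are all hypothetical choices made for illustration. It fits nested polynomial models of increasing degree by least squares and compares the mean squared error on the sample with that on a large independent draw standing in for the population.

```python
import numpy as np
from numpy.polynomial import Polynomial


def true_mean(x):
    # Hypothetical "true model" (an assumption for this sketch): a quadratic.
    return 1.0 - 2.0 * x + 1.5 * x**2


def fit_errors(max_degree=8, n_train=20, n_test=2000, noise_sd=0.3, seed=0):
    """Return (sample MSE, population-proxy MSE) for degrees 0..max_degree."""
    rng = np.random.default_rng(seed)

    # Small sample used for fitting.
    x_train = rng.uniform(-1.0, 1.0, n_train)
    y_train = true_mean(x_train) + rng.normal(0.0, noise_sd, n_train)

    # Large independent draw standing in for the population.
    x_test = rng.uniform(-1.0, 1.0, n_test)
    y_test = true_mean(x_test) + rng.normal(0.0, noise_sd, n_test)

    train_mse, test_mse = [], []
    for degree in range(max_degree + 1):
        # Least-squares polynomial fit; the models are nested in the degree.
        poly = Polynomial.fit(x_train, y_train, degree)
        train_mse.append(np.mean((y_train - poly(x_train)) ** 2))
        test_mse.append(np.mean((y_test - poly(x_test)) ** 2))
    return np.array(train_mse), np.array(test_mse)


if __name__ == "__main__":
    train, test = fit_errors()
    print("degree  sample MSE  population-proxy MSE")
    for degree, (tr, te) in enumerate(zip(train, test)):
        print(f"{degree:6d}  {tr:10.4f}  {te:20.4f}")
```

Because the models are nested, the sample error cannot increase with the degree. The error on the independent draw typically stops improving, and often worsens, once the degree exceeds that of the assumed true model, although the exact pattern depends on the seed and the noise level.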
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this entry
Cite this entry
Zucchini, W., Claeskens, G., Nguefack-Tsague, G. (2011). Model Selection. In: Lovric, M. (eds) International Encyclopedia of Statistical Science. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04898-2_373
DOI: https://doi.org/10.1007/978-3-642-04898-2_373
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04897-5
Online ISBN: 978-3-642-04898-2
eBook Packages: Mathematics and Statistics; Reference Module Computer Science and Engineering