Model Selection

Introduction

In applications, several candidate models are usually available for describing a population from a given sample of observations, and one is thus confronted with the problem of model selection. For example, different distributions can be fitted to a given sample of univariate observations; in polynomial regression one has to decide which degree of polynomial to use; in multivariate regression one has to select which covariates to include in the model; in fitting an autoregressive model to a stationary time series one must choose which order to use.

When the set of models under consideration is nested, as is the case in polynomial regression, the fit of the model to the sample improves as the complexity of the model (e.g., the number of parameters) increases but, at some stage, its fit to the population deteriorates. That is because the model increasingly moulds itself to the features of the sample rather than to the “true model,” namely the one that characterizes the population. The...
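The behaviour of nested models described above can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not a method taken from this entry: the cubic "true model", the noise level, the sample size, and the names `true_mean` and `fit_errors` are all hypothetical choices made for illustration, and a large independent sample stands in for the population. For each polynomial degree it reports the mean squared error on the fitted sample and on the population proxy.

```python
import numpy as np


def true_mean(x):
    # Hypothetical "true model": a cubic chosen only for illustration.
    return 2.0 * x + 3.0 * x**3


def fit_errors(max_degree=8, n_train=30, n_pop=20000, noise_sd=0.5, seed=0):
    """Fit nested polynomial models of degree 0..max_degree by least squares.

    Returns two lists indexed by degree: the mean squared error on the
    fitted sample, and the mean squared error on a large independent
    sample used here as a stand-in for the population.
    """
    rng = np.random.default_rng(seed)

    # Small sample to which the models are fitted.
    x_train = rng.uniform(-1.0, 1.0, n_train)
    y_train = true_mean(x_train) + rng.normal(0.0, noise_sd, n_train)

    # Large independent sample approximating the population.
    x_pop = rng.uniform(-1.0, 1.0, n_pop)
    y_pop = true_mean(x_pop) + rng.normal(0.0, noise_sd, n_pop)

    sample_mse, pop_mse = [], []
    for degree in range(max_degree + 1):
        coeffs = np.polyfit(x_train, y_train, degree)
        sample_resid = np.polyval(coeffs, x_train) - y_train
        pop_resid = np.polyval(coeffs, x_pop) - y_pop
        sample_mse.append(float(np.mean(sample_resid**2)))
        pop_mse.append(float(np.mean(pop_resid**2)))
    return sample_mse, pop_mse


if __name__ == "__main__":
    sample_mse, pop_mse = fit_errors()
    print("degree  sample MSE  population MSE")
    for degree, (s, p) in enumerate(zip(sample_mse, pop_mse)):
        print(f"{degree:6d}  {s:10.4f}  {p:14.4f}")
```

Because the models are nested and fitted by least squares, the sample error cannot increase with the degree. The population error typically falls at first and then rises again once the polynomial starts to follow the noise in the sample, although the exact degree at which this happens depends on the seed and the assumed settings.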

© 2011 Springer-Verlag Berlin Heidelberg

About this entry

Cite this entry

Zucchini, W., Claeskens, G., Nguefack-Tsague, G. (2011). Model Selection. In: Lovric, M. (eds) International Encyclopedia of Statistical Science. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04898-2_373
