Variable Selection

Part of the Springer Texts in Statistics book series (STS)


This chapter addresses the question of which predictor variables should be included in a linear model. The easiest version of the problem is deciding, given a linear model, which variables should be excluded. To that end we examine the question of selecting the best subset of predictor variables from among the original variables. Doing so requires us to define a “best” model, and we examine several competing measures. We also examine the “greedy” algorithm for this problem known as backward elimination. The more difficult problem of deciding which variables to place into a linear model is addressed by the greedy algorithm of forward selection. (These algorithms are greedy in the sense of always taking the best step right now, rather than seeking a globally best model.) We examine traditional forward selection as well as the modern adaptations of forward selection known as boosting, bagging, and random forests.
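The two greedy procedures described above can be sketched in a few lines. The sketch below is illustrative only, not the chapter's own implementation: it scores candidate models with a Gaussian-likelihood AIC (one of several competing measures the chapter compares), and the function names `aic`, `backward_eliminate`, and `forward_select` are made up for this example.

```python
import numpy as np

def aic(y, X):
    """AIC for an OLS fit of y on the columns of X (Gaussian errors)."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    return n * np.log(rss / n) + 2 * p

def backward_eliminate(y, X, keep=(0,)):
    """Greedy backward elimination: repeatedly drop the column whose
    removal most lowers AIC; stop when no removal helps.
    `keep` indexes columns never dropped (e.g. the intercept)."""
    cols = list(range(X.shape[1]))
    current = aic(y, X[:, cols])
    improved = True
    while improved:
        improved = False
        for j in cols:
            if j in keep:
                continue
            trial = [c for c in cols if c != j]
            score = aic(y, X[:, trial])
            if score < current:
                current, cols, improved = score, trial, True
                break
    return cols

def forward_select(y, X, start=(0,)):
    """Greedy forward selection: repeatedly add the column that most
    lowers AIC; stop when no addition helps.
    `start` indexes the columns of the initial model."""
    cols = list(start)
    rest = [j for j in range(X.shape[1]) if j not in cols]
    current = aic(y, X[:, cols])
    while rest:
        scores = [(aic(y, X[:, cols + [j]]), j) for j in rest]
        best, j = min(scores)
        if best >= current:
            break
        current = best
        cols.append(j)
        rest.remove(j)
    return cols
```

Both routines are greedy in exactly the sense noted above: each iteration commits to the single best immediate move, so neither is guaranteed to find the subset that a best-subset search (e.g. leaps and bounds) would return.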



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

Department of Mathematics and Statistics, University of New Mexico, Albuquerque, USA
