Computational Statistics, Volume 33, Issue 2, pp 787–806

Extending AIC to best subset regression

  • J. G. Liao
  • Joseph E. Cavanaugh
  • Timothy L. McMurry
Original Paper


The Akaike information criterion (AIC) is routinely used for model selection in best subset regression. The standard AIC, however, generally under-penalizes model complexity in the best subset regression setting, potentially leading to grossly overfit models. Recently, Zhang and Cavanaugh (Comput Stat 31(2):643–669, 2015) made significant progress towards addressing this problem by introducing an effective multistage model selection procedure. In this paper, we present a rigorous and coherent conceptual framework for extending AIC to best subset regression. A new model selection algorithm derived from our framework possesses well-understood and desirable asymptotic properties and consistently outperforms the procedure of Zhang and Cavanaugh in simulation studies. It provides an effective tool for combating the pervasive overfitting that plagues best subset regression, so that the selected models contain fewer irrelevant predictors and predict future observations more accurately.
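To make the baseline problem concrete, the following Python sketch (not from the paper; the function names are illustrative) scores every predictor subset with the standard AIC for a Gaussian linear model. Because the minimum is taken over many candidate subsets, the selected AIC value is optimistically biased, which is the under-penalization the paper's extended criterion is designed to correct.

```python
import itertools
import numpy as np

def aic_gaussian(y, X):
    """Standard AIC for a Gaussian linear model fit by least squares.
    The parameter count k includes the coefficients and the error variance."""
    n = len(y)
    beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ beta) ** 2)
    k = X.shape[1] + 1  # regression coefficients + sigma^2
    return n * np.log(rss / n) + 2 * k

def best_subset_aic(y, X):
    """Exhaustively score every subset of the columns of X with standard AIC.
    Minimizing over all 2^p subsets makes the winning score optimistically
    biased, so irrelevant predictors are frequently selected."""
    n, p = X.shape
    intercept = np.ones((n, 1))
    best = (np.inf, ())
    for size in range(p + 1):
        for subset in itertools.combinations(range(p), size):
            design = np.hstack([intercept, X[:, list(subset)]])
            best = min(best, (aic_gaussian(y, design), subset))
    return best

# Pure-noise example: y is independent of all 8 predictors, yet the
# minimum-AIC subset may well be nonempty, i.e. overfit.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 8))
y = rng.standard_normal(50)
score, subset = best_subset_aic(y, X)
```

With p predictors the search visits 2^p models; the selection effect grows with the number of candidates, which is why a fixed per-model penalty of 2k is no longer adequate in this setting.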


Keywords: Akaike information criterion · Expected optimism · Model selection · Overfitting


References

  1. Akaike H (1973) Information theory and an extension of the maximum likelihood principle. In: Proceedings of the second international symposium on information theory, pp 267–281
  2. Akaike H (1974) A new look at the statistical model identification. IEEE Trans Autom Control AC-19:716–723
  3. Bengtsson T, Cavanaugh JE (2006) An improved Akaike information criterion for state-space model selection. Comput Stat Data Anal 50(10):2635–2654
  4. Bertsimas D, King A, Mazumder R (2016) Best subset selection via a modern optimization lens. Ann Stat 44(2):813–852
  5. Efron B (1983) Estimating the error rate of a prediction rule: improvement on cross-validation. J Am Stat Assoc 78(382):316–331
  6. Efron B (1986) How biased is the apparent error rate of a prediction rule? J Am Stat Assoc 81(394):461–470
  7. Fujikoshi Y (1983) A criterion for variable selection in multiple discriminant analysis. Hiroshima Math J 13:203–214
  8. Hurvich CM, Tsai C-L (1989) Regression and time series model selection in small samples. Biometrika 76(2):297–307
  9. Hurvich CM, Shumway R, Tsai C-L (1990) Improved estimators of Kullback–Leibler information for autoregressive model selection in small samples. Biometrika 77(4):709–719
  10. Kitagawa G, Konishi S (2008) Information criteria and statistical modeling. Springer, New York
  11. Liao J, McGee D (2003) Adjusted coefficients of determination for logistic regression. Am Stat 57:161–165
  12. Schwarz GE (1978) Estimating the dimension of a model. Ann Stat 6(2):461–464
  13. Shibata R (1976) Selection of the order of an autoregressive model by Akaike’s information criterion. Biometrika 63(1):117–126
  14. Sugiura N (1978) Further analysis of the data by Akaike’s information criterion and the finite corrections. Commun Stat 7(1):13–26
  15. Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc Ser B 58:267–288
  16. White H (1982) Maximum likelihood estimation of misspecified models. Econometrica 50(1):1–25
  17. Ye J (1998) On measuring and correcting the effects of data mining and model selection. J Am Stat Assoc 93(441):120–131
  18. Zhang T, Cavanaugh JE (2015) A multistage algorithm for best-subset model selection based on the Kullback–Leibler discrepancy. Comput Stat 31(2):643–669

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  • J. G. Liao (1)
  • Joseph E. Cavanaugh (2)
  • Timothy L. McMurry (3)

  1. Penn State University, Hershey, USA
  2. University of Iowa, Iowa City, USA
  3. University of Virginia, Charlottesville, USA
