# Counterexamples to parsimony and BIC

## Abstract

Suppose that the log-likelihood-ratio sequence of two models with different numbers of estimated parameters is bounded in probability, without necessarily having a chi-square limiting distribution. Then BIC and all other related “consistent” model selection criteria, meaning those which penalize the number of estimated parameters with a weight which becomes infinite with the sample size, will, with asymptotic probability 1, select the model having fewer parameters. This note presents examples of nested and non-nested regression model pairs for which the likelihood-ratio sequence is bounded in probability and which have the property that the model in each pair with *more* estimated parameters has better predictive properties, for an independent replicate of the observed data, than the model with fewer parameters. Our second example also shows how a one-dimensional regressor can overfit the data used for estimation in comparison to the fit of a two-dimensional regressor.
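The selection claim in the first two sentences follows from a short calculation. The sketch below uses notation that is assumed for illustration and is not taken from the paper: a generic penalized criterion with penalty weight c_n, maximized log-likelihoods for the two models, and parameter counts k_1 < k_2. BIC corresponds to c_n = log n.

```latex
% Assumed notation (illustrative, not the paper's):
%   \hat{L}_j^{(n)} : maximized log-likelihood of model j from n observations
%   k_j             : number of estimated parameters, with k_2 > k_1
%   c_n             : penalty weight with c_n \to \infty (BIC: c_n = \log n)
\begin{align*}
C_n(j) &= -2\,\hat{L}_j^{(n)} + c_n k_j, \qquad j = 1, 2, \\
C_n(2) - C_n(1) &= -2\bigl(\hat{L}_2^{(n)} - \hat{L}_1^{(n)}\bigr) + c_n\,(k_2 - k_1).
\end{align*}
% If \hat{L}_2^{(n)} - \hat{L}_1^{(n)} = O_p(1), the first term stays bounded
% in probability while the second tends to +\infty, so
\[
\Pr\bigl(C_n(1) < C_n(2)\bigr) \longrightarrow 1 \qquad (n \to \infty).
\]
```

The argument needs only boundedness in probability of the log-likelihood difference, not a chi-square limit. It does not apply to AIC, whose penalty weight is the constant c_n = 2.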

## Key words and phrases

Model selection, linear regression, misspecified models, AIC, BIC, MDL, Hannan-Quinn criterion, overfitting
