# Statistical Aspects of Model Selection

Ritei Shibata

Chapter

## Abstract

Various aspects of statistical model selection are discussed from the viewpoint of a statistician. Our concern here is with selection procedures based on the Kullback-Leibler information number. A derivation of AIC (Akaike's Information Criterion) is given, from which a natural extension of AIC, called TIC (Takeuchi's Information Criterion), follows. It is shown that TIC is asymptotically equivalent to cross validation in a general context, whereas AIC is asymptotically equivalent only in the case of independent identically distributed observations. Next, the maximum penalized likelihood estimate is considered in place of the maximum likelihood estimate for estimating the parameters once a model has been selected; the weight of the penalty then also has to be selected. We show that, starting from the same Kullback-Leibler information number, a useful criterion, RIC (Regularization Information Criterion), can be derived to select both the model and the weight of the penalty. This criterion is in fact an extension of TIC as well as of AIC. A comparison of the various criteria, including consistency and efficiency, is summarized in Section 5. Applications of these criteria to time series models are given in the last section.
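As a concrete illustration of the kind of procedure the abstract describes, the sketch below applies Gaussian AIC to choose between two nested regression models: a constant mean and a linear trend. The data, model names, and helper function are illustrative assumptions, not taken from the chapter; AIC is computed up to an additive constant as n log(RSS/n) + 2k.

```python
import math
import random

def aic(rss, n, k):
    # Gaussian AIC up to an additive constant:
    # -2 * max log-likelihood = n * log(RSS/n) + const; penalty term = 2k
    return n * math.log(rss / n) + 2 * k

# Illustrative data: a true linear trend plus Gaussian noise
random.seed(0)
n = 100
x = [i / n for i in range(n)]
y = [2.0 + 1.5 * xi + random.gauss(0, 0.3) for xi in x]

# Candidate 1: constant mean (parameters: mu and sigma^2, so k = 2)
mu = sum(y) / n
rss1 = sum((yi - mu) ** 2 for yi in y)

# Candidate 2: linear trend (parameters: a, b and sigma^2, so k = 3),
# fitted by ordinary least squares in closed form
xbar = sum(x) / n
ybar = sum(y) / n
b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) \
    / sum((xi - xbar) ** 2 for xi in x)
a = ybar - b * xbar
rss2 = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# Select the candidate with the smallest AIC
scores = {"constant": aic(rss1, n, 2), "linear": aic(rss2, n, 3)}
best = min(scores, key=scores.get)
```

With a clearly nonzero slope, the reduction in residual sum of squares outweighs the extra penalty of one parameter, so the linear model is selected; TIC and RIC replace the penalty term 2k with data-dependent quantities but are used in the same way.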

## Keywords

Statistical modelling, model selection, information criterion, cross validation

## References

1. Akaike, H., A fundamental relation between predictor identification and power spectrum estimation, Ann. Inst. Statist. Math., Vol. 22, pp. 219–223, 1970.
2. Akaike, H., Information theory and an extension of the maximum likelihood principle, pp. 267–281 in 2nd Int. Symposium on Information Theory, Eds. B. N. Petrov and F. Csáki, Akadémiai Kiadó, Budapest, 1973.
3. Ansley, C. F., An algorithm for the exact likelihood of a mixed autoregressive-moving average process, Biometrika, Vol. 66, pp. 59–65, 1979.
4. Atkinson, A. C., A note on the generalized information criterion for choice of a model, Biometrika, Vol. 67, pp. 413–418, 1980.
5. Bhansali, R. J. and D. Y. Downham, Some properties of the order of an autoregressive model selected by a generalization of Akaike's FPE criterion, Biometrika, Vol. 64, pp. 547–551, 1977.
6. Box, G. E. P. and G. M. Jenkins, Time Series Analysis: Forecasting and Control, Holden-Day, 1976.
7. Findley, D. F., On the unbiasedness property of AIC for exact or approximating linear stochastic time series models, J. Time Series Analysis, Vol. 6, pp. 229–252, 1985.
8. Gooijer, J. G. de, B. Abraham, A. Gould and L. Robinson, Methods for determining the order of an autoregressive-moving average process: A survey, Int. Statist. Rev., Vol. 53, pp. 301–329, 1985.
9. Hampel, F. R., E. M. Ronchetti, P. J. Rousseeuw and W. A. Stahel, Robust Statistics: The Approach Based on Influence Functions, John Wiley, 1986.
10. Hannan, E. J. and B. G. Quinn, The determination of the order of an autoregression, J. Roy. Statist. Soc., Vol. B41, pp. 190–195, 1979.
11. Hannan, E. J., The estimation of the order of an ARMA process, Ann. Statist., Vol. 8, pp. 1071–1081, 1980.
12. Hannan, E. J., Testing for autocorrelation and Akaike's criterion, pp. 403–412 in Essays in Statistical Science (Papers in Honour of P. A. P. Moran), Eds. J. M. Gani and E. J. Hannan, Applied Probability Trust, Sheffield, 1982.
13. Hannan, E. J. and J. Rissanen, Recursive estimation of mixed autoregressive-moving average order, Biometrika, Vol. 69, pp. 81–94, 1982.
14. Hannan, E. J., Fitting multivariate ARMA models, pp. 307–316 in Statistics and Probability (Essays in Honor of C. R. Rao), Eds. G. Kallianpur, P. R. Krishnaiah and J. K. Ghosh, North-Holland Publishing Company, Amsterdam, 1982.
15. Hosking, J. R. M., Lagrange-multiplier tests of time-series models, J. Roy. Statist. Soc., Vol. B42, pp. 170–181, 1980.
16. Hurvich, C. M., Data-driven choice of a spectrum estimate: Extending the applicability of cross-validation methods, J. Amer. Statist. Assoc., Vol. 80, pp. 933–940, 1985.
17. Kempthorne, P. J., Admissible variable-selection procedures when fitting regression models by least squares for prediction, Biometrika, Vol. 71, pp. 593–597, 1984.
18. Li, K. C., From Stein's unbiased risk estimates to the method of generalized cross validation, Ann. Statist., Vol. 13, pp. 1352–1377, 1985.
19. Mallows, C. L., Some comments on C_p, Technometrics, Vol. 15, pp. 661–675, 1973.
20. Parzen, E., Some recent advances in time series modeling, IEEE Trans. Automat. Control, Vol. AC-19, pp. 723–730, 1974.
21. Parzen, E., Multiple time series: determining the order of approximating autoregressive schemes, pp. 283–295 in Multivariate Analysis-IV, North-Holland, 1977.
22. Priestley, M. B., Spectral Analysis and Time Series, Academic Press, 1981.
23. Sakai, H., Asymptotic distribution of the order selected by AIC in multivariate autoregressive model fitting, Int. J. Control, Vol. 33, pp. 175–180, 1981.
24. Schwarz, G., Estimating the dimension of a model, Ann. Statist., Vol. 6, pp. 461–464, 1978.
25. Shibata, R., Selection of the order of an autoregressive model by Akaike's information criterion, Biometrika, Vol. 63, pp. 117–126, 1976.
26. Shibata, R., An optimal selection of regression variables, Biometrika, Vol. 68, pp. 45–54, 1981; Correction, Vol. 69, p. 492.
27. Shibata, R., An optimal autoregressive spectral estimate, Ann. Statist., Vol. 9, pp. 300–306, 1981.
28. Shibata, R., Various model selection techniques in time series analysis, pp. 179–187 in Handbook of Statistics, Eds. E. J. Hannan and P. R. Krishnaiah, Elsevier, 1985.
29. Shibata, R., Selection of regression variables, pp. 709–714 in Encyclopedia of Statistical Sciences, John Wiley & Sons, 1986.
30. Shibata, R., Consistency of model selection and parameter estimation, pp. 127–141 in Essays in Time Series and Allied Processes, Eds. J. M. Gani and M. B. Priestley, Applied Probability Trust, Sheffield, 1986.
31. Shibata, R., Selection of the number of regression variables; a minimax choice of generalized FPE, Ann. Inst. Statist. Math., Vol. 38A, pp. 459–474, 1986.
32. Stone, M., Cross-validatory choice and assessment of statistical predictions, J. Roy. Statist. Soc., Vol. B36, pp. 111–133, 1974.
33. Stone, M., An asymptotic equivalence of choice of model by cross-validation and Akaike's criterion, J. Roy. Statist. Soc., Vol. B39, pp. 44–47, 1977.
34. Stone, C. J., Local asymptotic admissibility of a generalization of Akaike's model selection rule, Ann. Inst. Statist. Math., Vol. 34, pp. 123–133, 1982.
35. Takada, Y., Admissibility of some variable selection rules in linear regression model, J. Japan Statist. Soc., Vol. 12, pp. 45–49, 1982.
36. Takeuchi, K., Distribution of information statistics and a criterion of model fitting, Suri-Kagaku (Mathematical Sciences), Vol. 153, pp. 12–18 (in Japanese), 1976.
37. Taniguchi, M., On selection of the order of the spectral density model for a stationary process, Ann. Inst. Statist. Math., Vol. 32A, pp. 401–419, 1980.
38. Titterington, D. M., Common structure of smoothing techniques in statistics, Int. Statist. Rev., Vol. 53, pp. 141–170, 1985.
39. Tjøstheim, D. and J. Paulsen, Least squares estimates and order determination procedures for autoregressive processes with a time dependent variance, J. Time Series Analysis, Vol. 6, pp. 117–133, 1985.
40. Wahba, G., A comparison of GCV and GML for choosing the smoothing parameter in the generalized spline smoothing problem, Ann. Statist., Vol. 13, pp. 1378–1402, 1985.
41. Walker, A. M., Asymptotic properties of least squares estimates of parameters of the spectrum of a stationary non-deterministic time series, J. Austral. Math. Soc., Vol. 4, pp. 363–384, 1964.
42. Woodroofe, M., On model selection and the arc sine laws, Ann. Statist., Vol. 10, pp. 1182–1194, 1982.
43. Yajima, Y., Estimation of the degree of differencing of an ARIMA process, Ann. Inst. Statist. Math., Vol. 37, pp. 389–408, 1985.