Understanding predictive information criteria for Bayesian models

Abstract

We review the Akaike, deviance, and Watanabe-Akaike information criteria from a Bayesian perspective, where the goal is to estimate expected out-of-sample prediction error using a bias-corrected adjustment of within-sample error. We focus on the choices involved in setting up these measures, and we compare them in three simple examples, one theoretical and two applied. The contribution of this paper is to put all these information criteria into a Bayesian predictive context and to better understand, through small examples, how these methods can apply in practice.
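
To make the bias-correction idea concrete: each criterion starts from a within-sample measure of fit (for WAIC, the log pointwise predictive density, lppd) and subtracts an estimate of the effective number of parameters. The sketch below computes WAIC from a matrix of pointwise log-likelihood draws, following the lppd and variance-based p_WAIC definitions the paper reviews; the function name waic, the (S, n) array layout, and the toy normal-mean example are our own illustration, not code from the paper.

```python
import numpy as np

def waic(log_lik):
    """WAIC from an (S, n) array of pointwise log-likelihoods
    (S posterior draws, n data points); this layout is assumed."""
    S = log_lik.shape[0]
    # lppd: sum over points of the log posterior-mean likelihood,
    # computed stably via a log-sum-exp reduction over draws
    lppd = np.sum(np.logaddexp.reduce(log_lik, axis=0) - np.log(S))
    # p_WAIC: sum over points of the posterior variance of the log-likelihood
    p_waic = np.sum(np.var(log_lik, axis=0, ddof=1))
    return {"lppd": lppd, "p_waic": p_waic, "waic": -2 * (lppd - p_waic)}

# Toy check: normal mean, known unit variance, flat prior, so the
# posterior is N(ybar, 1/n) and p_waic should come out close to 1.
rng = np.random.default_rng(0)
y = rng.normal(size=20)
theta = rng.normal(y.mean(), 1 / np.sqrt(len(y)), size=(1000, 1))
log_lik = -0.5 * np.log(2 * np.pi) - 0.5 * (y[None, :] - theta) ** 2
print(waic(log_lik))
```

AIC and DIC fit the same template with different plug-in fits and penalties (the number of parameters k for AIC, and p_D for DIC).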

References

  1. Aitkin, M.: Statistical Inference: an Integrated Bayesian/Likelihood Approach. Chapman & Hall, London (2010)

  2. Akaike, H.: Information theory and an extension of the maximum likelihood principle. In: Petrov, B.N., Csaki, F. (eds.) Proceedings of the Second International Symposium on Information Theory, pp. 267–281. Akademiai Kiado, Budapest (1973). Reprinted in: Kotz, S. (ed.) Breakthroughs in Statistics, pp. 610–624. Springer, New York (1992)

  3. Ando, T., Tsay, R.: Predictive likelihood for Bayesian model selection and averaging. Int. J. Forecast. 26, 744–763 (2010)

  4. Bernardo, J.M.: Expected information as expected utility. Ann. Stat. 7, 686–690 (1979)

  5. Bernardo, J.M., Smith, A.F.M.: Bayesian Theory. Wiley, New York (1994)

  6. Burman, P.: A comparative study of ordinary cross-validation, v-fold cross-validation and the repeated learning-testing methods. Biometrika 76, 503–514 (1989)

  7. Burman, P., Chow, E., Nolan, D.: A cross-validatory method for dependent data. Biometrika 81, 351–358 (1994)

  8. Burnham, K.P., Anderson, D.R.: Model Selection and Multimodel Inference: a Practical Information Theoretic Approach. Springer, New York (2002)

  9. Celeux, G., Forbes, F., Robert, C., Titterington, D.: Deviance information criteria for missing data models. Bayesian Anal. 1, 651–706 (2006)

  10. DeGroot, M.H.: Optimal Statistical Decisions. McGraw-Hill, New York (1970)

  11. Dempster, A.P.: The direct use of likelihood for significance testing. In: Proceedings of Conference on Foundational Questions in Statistical Inference, Department of Theoretical Statistics: University of Aarhus, pp. 335–352 (1974)

  12. Draper, D.: Model uncertainty yes, discrete model averaging maybe. Stat. Sci. 14, 405–409 (1999)

  13. Efron, B., Tibshirani, R.: An Introduction to the Bootstrap. Chapman & Hall, New York (1993)

  14. Geisser, S., Eddy, W.: A predictive approach to model selection. J. Am. Stat. Assoc. 74, 153–160 (1979)

  15. Gelfand, A., Dey, D.: Bayesian model choice: asymptotics and exact calculations. J. R. Stat. Soc. B 56, 501–514 (1994)

  16. Gelman, A., Carlin, J.B., Stern, H.S., Rubin, D.B.: Bayesian Data Analysis, 2nd edn. CRC Press, London (2003)

  17. Gelman, A., Meng, X.L., Stern, H.S.: Posterior predictive assessment of model fitness via realized discrepancies (with discussion). Stat. Sin. 6, 733–807 (1996)

  18. Gelman, A., Shalizi, C.: Philosophy and the practice of Bayesian statistics (with discussion). Br. J. Math. Stat. Psychol. 66, 8–80 (2013)

  19. Gneiting, T.: Making and evaluating point forecasts. J. Am. Stat. Assoc. 106, 746–762 (2011)

  20. Gneiting, T., Raftery, A.E.: Strictly proper scoring rules, prediction, and estimation. J. Am. Stat. Assoc. 102, 359–378 (2007)

  21. Hibbs, D.: Implications of the ‘bread and peace’ model for the 2008 U.S. presidential election. Public Choice 137, 1–10 (2008)

  22. Hoeting, J., Madigan, D., Raftery, A.E., Volinsky, C.: Bayesian model averaging (with discussion). Stat. Sci. 14, 382–417 (1999)

  23. Jones, H.E., Spiegelhalter, D.J.: Improved probabilistic prediction of healthcare performance indicators using bidirectional smoothing models. J. R. Stat. Soc. A 175, 729–747 (2012)

  24. McCulloch, R.E.: Local model influence. J. Am. Stat. Assoc. 84, 473–478 (1989)

  25. Plummer, M.: Penalized loss functions for Bayesian model comparison. Biostatistics 9, 523–539 (2008)

  26. Ripley, B.D.: Statistical Pattern Recognition and Neural Networks. Cambridge University Press, Cambridge (1996)

  27. Robert, C.P.: Intrinsic losses. Theory Decis. 40, 191–214 (1996)

  28. Rubin, D.B.: Estimation in parallel randomized experiments. J. Educ. Stat. 6, 377–401 (1981)

  29. Rubin, D.B.: Bayesianly justifiable and relevant frequency calculations for the applied statistician. Ann. Stat. 12, 1151–1172 (1984)

  30. Schwarz, G.E.: Estimating the dimension of a model. Ann. Stat. 6, 461–464 (1978)

  31. Shibata, R.: Statistical aspects of model selection. In: Willems, J.C. (ed.) From Data to Model, pp. 215–240. Springer, Berlin (1989)

  32. Spiegelhalter, D.J., Best, N.G., Carlin, B.P., van der Linde, A.: Bayesian measures of model complexity and fit (with discussion). J. R. Stat. Soc. B 64, 583–639 (2002)

  33. Spiegelhalter, D., Thomas, A., Best, N., Gilks, W., Lunn, D.: BUGS: Bayesian inference using Gibbs sampling. MRC Biostatistics Unit, Cambridge, England (1994, 2003). http://www.mrc-bsu.cam.ac.uk/bugs/

  34. Stone, M.: An asymptotic equivalence of choice of model by cross-validation and Akaike’s criterion. J. R. Stat. Soc. B 39, 44–47 (1977)

  35. van der Linde, A.: DIC in variable selection. Stat. Neerl. 59, 45–56 (2005)

  36. Vehtari, A., Lampinen, J.: Bayesian model assessment and comparison using cross-validation predictive densities. Neural Comput. 14, 2439–2468 (2002)

  37. Vehtari, A., Ojanen, J.: A survey of Bayesian predictive methods for model assessment, selection and comparison. Stat. Surv. 6, 142–228 (2012)

  38. Watanabe, S.: Algebraic Geometry and Statistical Learning Theory. Cambridge University Press, Cambridge (2009)

  39. Watanabe, S.: Asymptotic equivalence of Bayes cross validation and widely applicable information criterion in singular learning theory. J. Mach. Learn. Res. 11, 3571–3594 (2010)

  40. Watanabe, S.: A widely applicable Bayesian information criterion. J. Mach. Learn. Res. 14, 867–897 (2013)

Acknowledgements

We thank two reviewers for helpful comments and the National Science Foundation, Institute of Education Sciences, and Academy of Finland (grant 218248) for partial support of this research.

Author information

Correspondence to Andrew Gelman.

Cite this article

Gelman, A., Hwang, J. & Vehtari, A. Understanding predictive information criteria for Bayesian models. Stat Comput 24, 997–1016 (2014). https://doi.org/10.1007/s11222-013-9416-2

Keywords

  • AIC
  • DIC
  • WAIC
  • Cross-validation
  • Prediction
  • Bayes