Statistics and Computing, Volume 27, Issue 3, pp 833–844

Investigation of the widely applicable Bayesian information criterion

  • N. Friel
  • J. P. McKeone
  • C. J. Oates
  • A. N. Pettitt

Abstract

The widely applicable Bayesian information criterion (WBIC) is a simple and fast approximation to the model evidence that has received little practical consideration. WBIC uses the fact that the log evidence can be written as an expectation, with respect to a powered posterior proportional to the likelihood raised to a power \(t^*\in (0,1)\), of the log deviance. Finding this temperature value \(t^*\) is generally an intractable problem. We find, for a particular tractable statistical model, that the mean squared error of an optimally-tuned version of WBIC with correct temperature \(t^*\) is lower than that of an optimally-tuned version of thermodynamic integration (power posteriors). However, in practice WBIC uses a canonical choice of \(t=1/\log (n)\). Here we investigate the performance of WBIC in practice for a range of statistical models, both regular models and singular models, such as latent variable models or those with a hierarchical structure, for which BIC cannot provide an adequate solution. Our findings are that WBIC generally performs adequately when one uses informative priors, but it can systematically overestimate the evidence, particularly for small sample sizes.
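
As a rough illustration of the idea in the abstract, the sketch below uses a conjugate normal model, \(y_i \sim N(\theta, 1)\) with prior \(\theta \sim N(m_0, s_0^2)\), for which the power posterior, the exact log evidence and the expected log-likelihood at temperature \(t\) are all available in closed form. This model, the function names and the default prior are assumptions made for the example, not necessarily the tractable model studied in the paper. The sketch works on the log-evidence scale (the expected log-likelihood), which is a sign convention chosen here. Because the expected log-likelihood is non-decreasing in \(t\), the optimal temperature \(t^*\) can be located by bisection when the exact evidence is known, and compared with the canonical choice \(t=1/\log (n)\).

```python
import math
import random


def power_posterior(y, t, m0=0.0, s0=1.0):
    """Mean and variance of the power posterior of theta at temperature t.

    Assumed model (illustrative only): y_i ~ N(theta, 1), theta ~ N(m0, s0^2).
    The power posterior is proportional to prior * likelihood**t.
    """
    n = len(y)
    precision = 1.0 / s0**2 + t * n
    mean = (m0 / s0**2 + t * sum(y)) / precision
    return mean, 1.0 / precision


def expected_log_lik(y, t, m0=0.0, s0=1.0):
    """Closed-form E_t[log L(theta)] under the power posterior."""
    n = len(y)
    mean, var = power_posterior(y, t, m0, s0)
    # E[(y_i - theta)^2] = (y_i - mean)^2 + var for Gaussian theta.
    sq = sum((yi - mean) ** 2 for yi in y) + n * var
    return -0.5 * n * math.log(2.0 * math.pi) - 0.5 * sq


def exact_log_evidence(y, m0=0.0, s0=1.0):
    """Exact log marginal likelihood: y ~ N(m0 * 1, I + s0^2 * 1 1^T)."""
    n = len(y)
    r = [yi - m0 for yi in y]
    quad = sum(ri**2 for ri in r) - s0**2 / (1.0 + n * s0**2) * sum(r) ** 2
    return (-0.5 * n * math.log(2.0 * math.pi)
            - 0.5 * math.log(1.0 + n * s0**2)
            - 0.5 * quad)


def wbic_log_evidence(y, m0=0.0, s0=1.0):
    """WBIC-style estimate at the canonical temperature t = 1/log(n), n >= 2."""
    return expected_log_lik(y, 1.0 / math.log(len(y)), m0, s0)


def optimal_temperature(y, m0=0.0, s0=1.0, iters=100):
    """Bisection for t* in (0, 1) with E_{t*}[log L] = exact log evidence."""
    target = exact_log_evidence(y, m0, s0)
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if expected_log_lik(y, mid, m0, s0) < target:
            lo = mid  # expectation too small: raise the temperature
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    random.seed(0)
    y = [random.gauss(1.0, 1.0) for _ in range(50)]  # simulated data
    print("exact log evidence :", exact_log_evidence(y))
    print("WBIC at t=1/log(n) :", wbic_log_evidence(y))
    print("canonical t        :", 1.0 / math.log(len(y)))
    print("optimal t*         :", optimal_temperature(y))
```

In realistic models neither the exact evidence nor the power-posterior expectation is available in closed form, so \(t^*\) cannot be found this way and the expectation at \(t=1/\log (n)\) would typically be estimated from samples of the power posterior.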

Keywords

Marginal likelihood, Evidence, Power posteriors, Widely applicable Bayesian information criterion


Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • N. Friel (1)
  • J. P. McKeone (3, 4)
  • C. J. Oates (2, 4)
  • A. N. Pettitt (3, 4)

  1. School of Mathematics and Statistics and Insight Centre for Data Analytics, University College Dublin, Dublin, Ireland
  2. School of Mathematical and Physical Sciences, University of Technology Sydney, Sydney, Australia
  3. School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia
  4. Australian Research Council Centre for Excellence in Mathematical and Statistical Frontiers, Parkville, Australia
