Statistics and Computing, Volume 27, Issue 3, pp 833–844

Investigation of the widely applicable Bayesian information criterion

  • N. Friel
  • J. P. McKeone
  • C. J. Oates
  • A. N. Pettitt


The widely applicable Bayesian information criterion (WBIC) is a simple and fast approximation to the model evidence that has received little practical consideration. WBIC uses the fact that the log evidence can be written as an expectation of the log deviance, taken with respect to a powered posterior in which the likelihood is raised to a power \(t^*\in (0,1)\). Finding this temperature value \(t^*\) is generally an intractable problem. We find that, for a particular tractable statistical model, the mean squared error of an optimally-tuned version of WBIC with the correct temperature \(t^*\) is lower than that of an optimally-tuned version of thermodynamic integration (power posteriors). However, in practice WBIC uses a canonical choice of \(t=1/\log (n)\). Here we investigate the performance of WBIC in practice for a range of statistical models, both regular models and singular models, such as latent variable models or those with a hierarchical structure, for which BIC cannot provide an adequate solution. Our findings are that WBIC generally performs adequately when one uses informative priors, but it can systematically overestimate the evidence, particularly for small sample sizes.
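The relationship between the optimal temperature \(t^*\) and the canonical choice \(t=1/\log (n)\) can be illustrated on a toy conjugate model where every quantity has a closed form. The sketch below is an assumption made for illustration only and is not taken from the paper: it uses \(y_i \sim N(\theta, 1)\) with prior \(\theta \sim N(m, v)\), and all function names are hypothetical. It computes the expected log-likelihood under the power posterior at temperature \(t\), the exact log evidence, the WBIC-style estimate at \(t=1/\log (n)\), and a bisection estimate of \(t^*\), relying on the fact that the expected log-likelihood is non-decreasing in \(t\).

```python
import math
import random


def expected_log_lik(y, t, prior_mean=0.0, prior_var=1.0):
    """E[log p(y | theta)] under the power posterior at temperature t.

    Assumed toy model (not from the paper): y_i ~ N(theta, 1),
    theta ~ N(prior_mean, prior_var). The power posterior
    prior(theta) * likelihood(theta)^t is then Gaussian.
    """
    n = len(y)
    precision = 1.0 / prior_var + t * n
    mean = (prior_mean / prior_var + t * sum(y)) / precision
    var = 1.0 / precision
    # E[sum (y_i - theta)^2] = sum (y_i - mean)^2 + n * var
    expected_ss = sum((yi - mean) ** 2 for yi in y) + n * var
    return -0.5 * n * math.log(2.0 * math.pi) - 0.5 * expected_ss


def log_evidence(y, prior_mean=0.0, prior_var=1.0):
    """Exact log marginal likelihood: y ~ N(prior_mean * 1, I + prior_var * 11')."""
    n = len(y)
    resid = [yi - prior_mean for yi in y]
    ss = sum(r * r for r in resid)
    s = sum(resid)
    quad = ss - prior_var * s * s / (1.0 + n * prior_var)
    return (-0.5 * n * math.log(2.0 * math.pi)
            - 0.5 * math.log(1.0 + n * prior_var)
            - 0.5 * quad)


def wbic_log_evidence(y, prior_mean=0.0, prior_var=1.0):
    """WBIC-style estimate: expected log-likelihood at t = 1 / log(n)."""
    t = 1.0 / math.log(len(y))
    return expected_log_lik(y, t, prior_mean, prior_var)


def optimal_temperature(y, prior_mean=0.0, prior_var=1.0, tol=1e-12):
    """Bisection for t* in (0, 1) where the expectation equals the log evidence."""
    target = log_evidence(y, prior_mean, prior_var)
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_log_lik(y, mid, prior_mean, prior_var) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    random.seed(1)
    data = [random.gauss(0.5, 1.0) for _ in range(50)]
    print("exact log evidence :", log_evidence(data))
    print("WBIC estimate      :", wbic_log_evidence(data))
    print("t* (bisection)     :", optimal_temperature(data))
    print("1 / log(n)         :", 1.0 / math.log(len(data)))
```

Comparing the printed `t*` with `1 / log(n)` for different sample sizes and prior variances gives a rough feel for how far the canonical temperature can sit from the optimal one in this simple setting. In non-conjugate models the expectation would have to be estimated by sampling from the power posterior instead.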


Keywords: Marginal likelihood, Evidence, Power posteriors, Widely applicable Bayesian information criterion



The Insight Centre for Data Analytics is supported by Science Foundation Ireland under Grant Number SFI/12/RC/2289. Nial Friel’s research was also supported by a Science Foundation Ireland Grant: 12/IP/1424. James McKeone is grateful for the support of an Australian Postgraduate Award (APA). Tony Pettitt’s research was supported by the Australian Research Council Discovery Grant, DP1101000159.



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  • N. Friel (1)
  • J. P. McKeone (3, 4)
  • C. J. Oates (2, 4)
  • A. N. Pettitt (3, 4)

  1. School of Mathematics and Statistics and Insight Centre for Data Analytics, University College Dublin, Dublin, Ireland
  2. School of Mathematical and Physical Sciences, University of Technology Sydney, Sydney, Australia
  3. School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia
  4. Australian Research Council Centre for Excellence in Mathematical and Statistical Frontiers, Parkville, Australia
