
Statistics and Computing, Volume 28, Issue 4, pp 971–988

Variational Bayes with synthetic likelihood

  • Victor M. H. Ong
  • David J. Nott
  • Minh-Ngoc Tran
  • Scott A. Sisson
  • Christopher C. Drovandi

Abstract

Synthetic likelihood is an attractive approach to likelihood-free inference when an approximately Gaussian summary statistic for the data, informative for inference about the parameters, is available. The synthetic likelihood method derives an approximate likelihood function from a plug-in normal density estimate for the summary statistic, with plug-in mean and covariance matrix obtained by Monte Carlo simulation from the model. In this article, we develop alternatives to Markov chain Monte Carlo implementations of Bayesian synthetic likelihoods with reduced computational overheads. Our approach uses stochastic gradient variational inference methods for posterior approximation in the synthetic likelihood context, employing unbiased estimates of the log likelihood. We compare the new method with a related likelihood-free variational inference technique in the literature, while at the same time improving the implementation of that approach in a number of ways. These new algorithms are feasible to implement in situations which are challenging for conventional approximate Bayesian computation methods, in terms of the dimensionality of the parameter and summary statistic.
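The plug-in construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy simulator `simulate_summaries`, its choice of summaries (sample mean and log sample variance), and the function name `synthetic_loglik` are all hypothetical and exist only for this example. The sketch also uses the plain plug-in Gaussian log density, not the unbiased log-likelihood estimates that the article employs.

```python
import numpy as np


def simulate_summaries(theta, n_sims, rng):
    """Hypothetical toy simulator (an assumption for illustration only).

    Draws n_sims datasets of 50 observations from N(theta, 1) and reduces
    each dataset to two summaries: sample mean and log sample variance.
    """
    data = rng.normal(loc=theta, scale=1.0, size=(n_sims, 50))
    return np.column_stack([data.mean(axis=1), np.log(data.var(axis=1, ddof=1))])


def synthetic_loglik(theta, s_obs, n_sims, rng):
    """Plug-in synthetic log-likelihood estimate at theta.

    The mean and covariance of the summary statistic are estimated by
    Monte Carlo simulation from the model, then plugged into a
    multivariate normal log density evaluated at the observed summary.
    """
    s = simulate_summaries(theta, n_sims, rng)
    mu = s.mean(axis=0)               # plug-in mean
    sigma = np.cov(s, rowvar=False)   # plug-in covariance matrix
    diff = s_obs - mu
    _, logdet = np.linalg.slogdet(sigma)
    quad = diff @ np.linalg.solve(sigma, diff)
    d = s_obs.size
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + quad)


# Usage: compare two parameter values against one observed summary.
rng = np.random.default_rng(1)
s_obs = np.array([0.0, 0.0])
print(synthetic_loglik(0.0, s_obs, 200, rng))
print(synthetic_loglik(3.0, s_obs, 200, rng))
```

Because the estimate depends on fresh simulations, repeated calls at the same parameter value return different numbers; this noise is why stochastic gradient methods are a natural fit in this setting.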

Keywords

  • Approximate Bayesian computation
  • Stochastic gradient ascent
  • Synthetic likelihoods
  • Variational Bayes

Notes

Acknowledgements

Victor Ong and David Nott were supported by a Singapore Ministry of Education Academic Research Fund Tier 2 grant (R-155-000-143-112). Christopher Drovandi was supported by the Australian Research Council's Discovery Early Career Researcher Award funding scheme (DE160100741). Scott Sisson was supported by the Australian Research Council through the Discovery Scheme (DP160102544) and the ACEMS Centre of Excellence (CE140100049).

Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  1. Department of Statistics and Applied Probability, National University of Singapore, Singapore, Singapore
  2. Discipline of Business Analytics, The University of Sydney Business School, The University of Sydney, Sydney, Australia
  3. School of Mathematics and Statistics, University of New South Wales, Sydney, Australia
  4. School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia
