Langevin incremental mixture importance sampling

  • Matteo Fasiolo
  • Flávio Eler de Melo
  • Simon Maskell

Abstract

This work proposes a novel method through which local information about the target density can be used to construct an efficient importance sampler. The backbone of the proposed method is the incremental mixture importance sampling (IMIS) algorithm of Raftery and Bao (Biometrics 66(4):1162–1173, 2010), which builds a mixture importance distribution incrementally, by positioning new mixture components where the importance density lacks mass, relative to the target. The key innovation proposed here is to construct the mean vectors and covariance matrices of the mixture components by numerically solving certain differential equations, whose solution depends on the local shape of the target log-density. The new sampler has a number of advantages: (a) it provides an extremely parsimonious parametrization of the mixture importance density, whose configuration effectively depends only on the shape of the target and on a single free parameter representing pseudo-time; (b) it scales well with the dimensionality of the target; (c) it can deal with targets that are not log-concave. The performance of the proposed approach is demonstrated on two synthetic non-Gaussian densities, one being defined on up to eighty dimensions, and on a Bayesian logistic regression model, using the Sonar dataset. The Julia code implementing the importance sampler proposed here can be found at https://github.com/mfasiolo/LIMIS.
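
The key step described above, constructing each new mixture component's mean vector and covariance matrix by numerically solving differential equations driven by the local shape of the target log-density, can be sketched concretely. The Julia fragment below (Julia being the language of the linked reference implementation) Euler-integrates one plausible pair of moment ODEs, obtained by linearising the Langevin drift around the current mean; these ODEs, the toy Gaussian target, and the names grad_logp, hess_logp and langevin_moments are illustrative assumptions for exposition, not the authors' published scheme.

```julia
using LinearAlgebra

# Toy target: a standard Gaussian, standing in for a general log-density.
grad_logp(x) = -x                                          # ∇ log π(x)
hess_logp(x) = -Matrix{Float64}(I, length(x), length(x))   # ∇² log π(x)

# Euler-integrate the moment ODEs implied by linearising the Langevin
# drift (1/2) ∇ log π(x) around the current mean μ:
#     dμ/dt = (1/2) ∇ log π(μ)
#     dΣ/dt = (1/2) (H Σ + Σ Hᵀ) + I,   with H = ∇² log π(μ),
# up to pseudo-time T (a hedged sketch, not the published method).
function langevin_moments(mu0, Sigma0, T; nsteps = 100)
    dt = T / nsteps
    mu, Sigma = copy(mu0), copy(Sigma0)
    for _ in 1:nsteps
        H = hess_logp(mu)
        mu = mu + 0.5 * dt * grad_logp(mu)
        Sigma = Sigma + dt * (0.5 * (H * Sigma + Sigma * H') + I)
        Sigma = 0.5 * (Sigma + Sigma')   # enforce symmetry against round-off
    end
    return mu, Sigma
end

# Evolve a small initial component for one unit of pseudo-time; for this
# toy target the moments contract towards the target's mean 0 and covariance I.
mu, Sigma = langevin_moments(ones(2), 0.01 * Matrix{Float64}(I, 2, 2), 1.0)
```

In this reading, each evolved pair (μ, Σ) would parametrise one Gaussian component of the incrementally built mixture, with the integration horizon T playing the role of the single pseudo-time tuning parameter mentioned in the abstract.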

Keywords

Importance sampling · Langevin diffusion · Mixture density · Optimal importance distribution · Local approximation · Kalman-Bucy filter

Supplementary material

Supplementary material 1: 11222_2017_9747_MOESM1_ESM.pdf (PDF, 68 KB)

References

  1. Ascher, U.M., Petzold, L.R.: Computer Methods for Ordinary Differential Equations and Differential-Algebraic Equations, pp. 73–78. SIAM, Philadelphia (1998)
  2. Bates, S.: Bayesian inference for deterministic simulation models for environmental assessment. PhD thesis, University of Washington (2001)
  3. Bezanson, J., Karpinski, S., Shah, V.B., Edelman, A.: Julia: a fast dynamic language for technical computing. arXiv:1209.5145 (2012)
  4. Brent, R.P.: Algorithms for Minimization Without Derivatives. Courier Corporation, North Chelmsford (2013)
  5. Bucy, R.S., Joseph, P.D.: Filtering for Stochastic Processes with Applications to Guidance, pp. 43–55. American Mathematical Society (1987)
  6. Bunch, P., Godsill, S.: Approximations of the optimal importance density using Gaussian particle flow importance sampling. J. Am. Stat. Assoc. 111(514), 748–762 (2016)
  7. Cappé, O., Douc, R., Guillin, A., Marin, J.M., Robert, C.P.: Adaptive importance sampling in general mixture classes. Stat. Comput. 18(4), 447–459 (2008)
  8. Carpenter, B., Gelman, A., Hoffman, M., Lee, D., Goodrich, B., Betancourt, M., Brubaker, M., Guo, J., Li, P., Riddell, A.: Stan: a probabilistic programming language. J. Stat. Softw. 76(1), 1–32 (2017)
  9. Daum, F., Huang, J.: Particle flow for nonlinear filters with log-homotopy. In: SPIE Defense and Security Symposium, International Society for Optics and Photonics, p. 696918 (2008)
  10. Duane, S., Kennedy, A.D., Pendleton, B.J., Roweth, D.: Hybrid Monte Carlo. Phys. Lett. B 195(2), 216–222 (1987)
  11. Faes, C., Ormerod, J.T., Wand, M.P.: Variational Bayesian inference for parametric and nonparametric regression with missing data. J. Am. Stat. Assoc. 106(495), 959–971 (2011)
  12. Girolami, M., Calderhead, B.: Riemann manifold Langevin and Hamiltonian Monte Carlo methods. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 73(2), 123–214 (2011)
  13. Givens, G.H., Raftery, A.E.: Local adaptive importance sampling for multivariate densities with strong nonlinear relationships. J. Am. Stat. Assoc. 91(433), 132–141 (1996)
  14. Gorman, R.P., Sejnowski, T.J.: Analysis of hidden units in a layered network trained to classify sonar targets. Neural Netw. 1(1), 75–89 (1988)
  15. Haario, H., Saksman, E., Tamminen, J.: An adaptive Metropolis algorithm. Bernoulli 7(2), 223–242 (2001)
  16. Hoffman, M.D., Gelman, A.: The No-U-turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo. J. Mach. Learn. Res. 15(1), 1593–1623 (2014)
  17. Ionides, E.L.: Truncated importance sampling. J. Comput. Graph. Stat. 17(2), 295–311 (2008)
  18. Kong, A., Liu, J.S., Wong, W.H.: Sequential imputations and Bayesian missing data problems. J. Am. Stat. Assoc. 89(425), 278–288 (1994)
  19. Lichman, M.: UCI Machine Learning Repository. http://archive.ics.uci.edu/ml (2013)
  20. Raftery, A.E., Bao, L.: Estimating and projecting trends in HIV/AIDS generalized epidemics using incremental mixture importance sampling. Biometrics 66(4), 1162–1173 (2010)
  21. Roberts, G.O., Tweedie, R.L.: Exponential convergence of Langevin distributions and their discrete approximations. Bernoulli 2(4), 341–363 (1996)
  22. Roberts, G.O., Rosenthal, J.S.: Optimal scaling for various Metropolis-Hastings algorithms. Stat. Sci. 16(4), 351–367 (2001)
  23. Schuster, I.: Gradient importance sampling. arXiv:1507.05781 (2015)
  24. Sim, A., Filippi, S., Stumpf, M.P.: Information geometry and sequential Monte Carlo. arXiv:1212.0764 (2012)
  25. Süli, E., Mayers, D.F.: An Introduction to Numerical Analysis. Cambridge University Press, Cambridge (2003)
  26. West, M.: Modelling with mixtures. In: Berger, J., Bernardo, J., Dawid, A., Smith, A. (eds.) Bayesian Statistics, pp. 503–525. Oxford University Press, Oxford (1992)

Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  1. School of Mathematics, University of Bristol, Bristol, UK
  2. School of Electrical Engineering, Electronics and Computer Science, University of Liverpool, Liverpool, UK
