Langevin incremental mixture importance sampling
This work proposes a novel method through which local information about the target density can be used to construct an efficient importance sampler. The backbone of the proposed method is the incremental mixture importance sampling (IMIS) algorithm of Raftery and Bao (Biometrics 66(4):1162–1173, 2010), which builds a mixture importance distribution incrementally, by positioning new mixture components where the importance density lacks mass, relative to the target. The key innovation proposed here is to construct the mean vectors and covariance matrices of the mixture components by numerically solving certain differential equations, whose solution depends on the local shape of the target log-density. The new sampler has a number of advantages: (a) it provides an extremely parsimonious parametrization of the mixture importance density, whose configuration effectively depends only on the shape of the target and on a single free parameter representing pseudo-time; (b) it scales well with the dimensionality of the target; (c) it can deal with targets that are not log-concave. The performance of the proposed approach is demonstrated on two synthetic non-Gaussian densities, one being defined on up to eighty dimensions, and on a Bayesian logistic regression model, using the Sonar dataset. The Julia code implementing the importance sampler proposed here can be found at https://github.com/mfasiolo/LIMIS.
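The idea of obtaining a component's mean vector and covariance matrix by integrating differential equations over pseudo-time can be illustrated with a minimal sketch. It assumes the standard moment equations of a locally linearised Langevin diffusion, dμ/dt = ½∇log π(μ) and dΣ/dt = ½(HΣ + ΣH) + I, where H is the Hessian of log π at μ, and it integrates them with a plain Euler scheme. These equations, the function name `langevin_moments` and the Gaussian test target are illustrative assumptions, not necessarily the exact formulation or solver used in the paper or in the LIMIS Julia code.

```python
import numpy as np

def langevin_moments(grad_log_p, hess_log_p, mu0, t_end, dt=0.01):
    """Euler integration of the moment ODEs of a linearised Langevin diffusion.

    Hypothetical helper for illustration only:
      d(mu)/dt    = 0.5 * grad log p(mu)
      d(Sigma)/dt = 0.5 * (H Sigma + Sigma H^T) + I,  H = Hessian of log p at mu
    Starts from a point mass at mu0 (Sigma = 0).
    """
    mu = np.asarray(mu0, dtype=float).copy()
    d = mu.size
    sigma = np.zeros((d, d))
    n_steps = int(round(t_end / dt))
    for _ in range(n_steps):
        g = grad_log_p(mu)
        H = hess_log_p(mu)
        # Both updates use the gradient and Hessian at the current mean.
        sigma = sigma + dt * (0.5 * (H @ sigma + sigma @ H.T) + np.eye(d))
        mu = mu + 0.5 * dt * g
    return mu, sigma

# Assumed test target: a bivariate Gaussian, for which the limit is known.
target_mean = np.array([1.0, -1.0])
target_cov = np.array([[2.0, 0.5],
                       [0.5, 1.0]])
precision = np.linalg.inv(target_cov)

def grad_log_p(x):
    return -precision @ (x - target_mean)

def hess_log_p(x):
    return -precision

# Long pseudo-time: the component approaches the target's own moments.
mu, sigma = langevin_moments(grad_log_p, hess_log_p, mu0=[3.0, -2.0], t_end=60.0)

# Short pseudo-time: the component stays local to the starting point.
mu_short, sigma_short = langevin_moments(grad_log_p, hess_log_p,
                                         mu0=[3.0, -2.0], t_end=0.1)
```

For this Gaussian target, a large pseudo-time drives the mean and covariance towards those of the target, while a small pseudo-time gives a tight component near the starting point, which shows how a single pseudo-time parameter can control the spread of each mixture component. The sketch does not guard against targets that are not log-concave, where H can have positive eigenvalues and this naive recursion may diverge.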
Keywords: Importance sampling, Langevin diffusion, Mixture density, Optimal importance distribution, Local approximation, Kalman-Bucy filter
The authors would like to thank Samuel Livingstone and two anonymous referees for providing useful comments on an earlier version of this paper.