Statistics and Computing, Volume 27, Issue 3, pp 599–623

Layered adaptive importance sampling

Abstract

Monte Carlo methods are the de facto standard for approximating complicated integrals involving multidimensional target distributions. In order to generate random realizations from the target distribution, Monte Carlo techniques use simpler proposal probability densities to draw candidate samples. The performance of any such method is closely tied to the choice of the proposal distribution: an unfortunate choice can easily ruin the resulting estimators. In this work, we introduce a layered (i.e., hierarchical) procedure to generate the samples employed within a Monte Carlo scheme. This approach ensures that an appropriate equivalent proposal density is always obtained automatically (thus eliminating the risk of catastrophic performance), at the expense of a moderate increase in complexity. Furthermore, we provide a general unified importance sampling (IS) framework in which multiple proposal densities are employed, and we introduce several IS schemes by applying the so-called deterministic mixture approach. Finally, given these schemes, we propose a novel class of adaptive importance samplers using a population of proposals, where the adaptation is driven by independent parallel or interacting Markov chain Monte Carlo (MCMC) chains. The resulting algorithms efficiently combine the benefits of both IS and MCMC methods.
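The layered procedure described in the abstract can be sketched in a few lines for a toy one-dimensional problem: an upper MCMC layer moves a population of chain states that serve as proposal means, and a lower IS layer draws samples from Gaussian proposals centered at those states and weights them with the deterministic-mixture denominator. The bimodal target, the numbers of chains/samples/iterations, and the step sizes below are illustrative choices for this sketch, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy unnormalized target: equal-weight Gaussian mixture at +5 and -5.
# Its normalizing constant is sqrt(2*pi) ~ 2.507 and its mean is 0.
def target(x):
    return 0.5 * np.exp(-0.5 * (x - 5.0) ** 2) + 0.5 * np.exp(-0.5 * (x + 5.0) ** 2)

N = 10          # parallel chains (upper layer): one proposal mean per chain
M = 50          # samples drawn from each proposal per iteration (lower layer)
T = 200         # iterations
sigma_mh = 2.0  # random-walk step of the upper MCMC layer
sigma_is = 1.0  # std of each Gaussian proposal in the lower IS layer

mu = rng.uniform(-10, 10, size=N)   # chain states = proposal means

samples, weights = [], []
for _ in range(T):
    # Upper layer: one random-walk Metropolis-Hastings step per chain.
    prop = mu + sigma_mh * rng.standard_normal(N)
    accept = rng.random(N) < target(prop) / target(mu)
    mu = np.where(accept, prop, mu)

    # Lower layer: draw M samples from each Gaussian proposal N(mu_n, sigma_is^2).
    x = (mu[:, None] + sigma_is * rng.standard_normal((N, M))).ravel()

    # Deterministic-mixture weights: the denominator is the full mixture of
    # all N proposals, which keeps the weight variance under control even
    # when an individual proposal is poorly placed.
    mix = np.mean(
        np.exp(-0.5 * ((x[None, :] - mu[:, None]) / sigma_is) ** 2)
        / (sigma_is * np.sqrt(2.0 * np.pi)),
        axis=0,
    )
    samples.append(x)
    weights.append(target(x) / mix)

x = np.concatenate(samples)
w = np.concatenate(weights)
Z_hat = w.mean()                    # unbiased estimate of the normalizing constant
mean_hat = np.sum(w * x) / w.sum()  # self-normalized estimate of the target mean
```

Note that the chain states are never used as IS samples themselves; they only drive the adaptation of the proposal means, which is why an acceptable equivalent proposal emerges automatically as the chains settle around the target's modes.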

Keywords

Bayesian inference · Adaptive importance sampling · Population Monte Carlo · Parallel MCMC · Multiple importance sampling

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland
  2. Department of Signal Theory and Communications, Universidad Carlos III de Madrid, Leganés, Spain
  3. Department of Circuits and Systems Engineering, Universidad Politécnica de Madrid, Madrid, Spain