Statistics and Computing, Volume 27, Issue 3, pp 599–623

Layered adaptive importance sampling

  • L. Martino
  • V. Elvira
  • D. Luengo
  • J. Corander


Monte Carlo methods are the de facto standard for approximating complicated integrals involving multidimensional target distributions. To generate random realizations from the target distribution, Monte Carlo techniques use simpler proposal probability densities to draw candidate samples. The performance of any such method depends critically on the specification of the proposal distribution: an unfortunate choice can easily ruin the resulting estimators. In this work, we introduce a layered (i.e., hierarchical) procedure to generate the samples employed within a Monte Carlo scheme. This approach automatically yields a suitable equivalent proposal density, eliminating the risk of catastrophic performance at the expense of a moderate increase in complexity. Furthermore, we provide a general unified importance sampling (IS) framework in which multiple proposal densities are employed, and we introduce several IS schemes by applying the so-called deterministic mixture approach. Finally, building on these schemes, we propose a novel class of adaptive importance samplers using a population of proposals, where the adaptation is driven by independent parallel or interacting Markov chain Monte Carlo (MCMC) chains. The resulting algorithms efficiently combine the benefits of both IS and MCMC methods.
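To make the layered idea concrete, the following sketch is a minimal, hypothetical one-dimensional illustration, not the paper's algorithm: an upper layer of parallel random-walk Metropolis chains adapts the location parameters of a population of Gaussian proposals, and a lower layer draws samples from those proposals and weights them with the deterministic-mixture rule (the full mixture of all proposals appears in the weight denominator). The Gaussian target, step sizes, and sample counts are all placeholder choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical unnormalized 1-D target: a Gaussian with mean 3, unit variance.
def log_target(x):
    return -0.5 * (x - 3.0) ** 2

# --- Upper layer: N parallel random-walk Metropolis chains adapt the
# location parameters (means) of the proposal population. ---
N, T, sigma_prop = 4, 200, 2.0        # chains, MCMC steps, proposal std (assumed)
mu = rng.normal(0.0, 1.0, size=N)     # initial chain states / proposal means
for _ in range(T):
    cand = mu + rng.normal(0.0, 1.0, size=N)   # random-walk candidate moves
    accept = np.log(rng.uniform(size=N)) < log_target(cand) - log_target(mu)
    mu = np.where(accept, cand, mu)   # standard Metropolis accept/reject

# --- Lower layer: draw M samples from each adapted Gaussian proposal and
# weight them with the deterministic-mixture rule: the denominator is the
# full mixture (1/N) * sum_j q_j(x), not the individual proposal. ---
M = 1000
x = mu[None, :] + sigma_prop * rng.standard_normal((M, N))     # (M, N) samples
diff = x[..., None] - mu[None, None, :]                        # (M, N, N)
log_comp = -0.5 * (diff / sigma_prop) ** 2 \
           - np.log(sigma_prop * np.sqrt(2.0 * np.pi))         # each q_j(x)
log_mix = np.logaddexp.reduce(log_comp, axis=-1) - np.log(N)   # log mixture
w = np.exp(log_target(x) - log_mix)                            # IS weights

# Self-normalized estimate of E[x] under the target (true value: 3).
est_mean = np.sum(w * x) / np.sum(w)
print(est_mean)
```

The deterministic-mixture denominator is what keeps the weights stable: even if one chain wanders into a low-probability region, samples near it are still covered by the mixture of all proposals, so no single weight can explode.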


Keywords: Bayesian inference · Adaptive importance sampling · Population Monte Carlo · Parallel MCMC · Multiple importance sampling



This work has been supported by the projects COMONSENS (CSD2008-00010), ALCIT (TEC2012-38800-C03-01), DISSECT (TEC2012-38058-C03-01), OTOSiS (TEC2013-41718-R), and COMPREHENSION (TEC2012-38883-C02-01), by the BBVA Foundation through the “I Convocatoria de Ayudas Fundación BBVA a Investigadores, Innovadores y Creadores Culturales” (MG-FIAR project), by ERC Grant 239784 and AoF Grant 251170, and by the European Union 7th Framework Programme through the Marie Curie Initial Training Network “Machine Learning for Personalized Medicine” (MLPM2012, Grant No. 316861).


References

  1. Ali, A.M., Yao, K., Collier, T.C., Taylor, E., Blumstein, D., Girod, L.: An empirical study of collaborative acoustic source localization. In: Proceedings of the Information Processing in Sensor Networks (IPSN07), Boston (2007)
  2. Andrieu, C., de Freitas, N., Doucet, A., Jordan, M.: An introduction to MCMC for machine learning. Mach. Learn. 50, 5–43 (2003)
  3. Andrieu, C., Doucet, A., Holenstein, R.: Particle Markov chain Monte Carlo methods. J. R. Stat. Soc. B 72(3), 269–342 (2010)
  4. Andrieu, C., Thoms, J.: A tutorial on adaptive MCMC. Stat. Comput. 18(4), 343–373 (2008)
  5. Beaujean, F., Caldwell, A.: Initializing adaptive importance sampling with Markov chains. arXiv:1304.7808 (2013)
  6. Botev, Z.I., Kroese, D.P.: An efficient algorithm for rare-event probability estimation, combinatorial optimization, and counting. Methodol. Comput. Appl. Probab. 10(4), 471–505 (2008)
  7. Botev, Z.I., L'Ecuyer, P., Tuffin, B.: Markov chain importance sampling with applications to rare event probability estimation. Stat. Comput. 23, 271–285 (2013)
  8. Brockwell, A., Del Moral, P., Doucet, A.: Interacting Markov chain Monte Carlo methods. Ann. Stat. 38(6), 3387–3411 (2010)
  9. Bugallo, M.F., Martino, L., Corander, J.: Adaptive importance sampling in signal processing. Digit. Signal Process. 47, 36–49 (2015)
  10. Caldwell, A., Liu, C.: Target density normalization for Markov chain Monte Carlo algorithms. arXiv:1410.7149 (2014)
  11. Cappé, O., Douc, R., Guillin, A., Marin, J.M., Robert, C.P.: Adaptive importance sampling in general mixture classes. Stat. Comput. 18, 447–459 (2008)
  12. Cappé, O., Guillin, A., Marin, J.M., Robert, C.P.: Population Monte Carlo. J. Comput. Graph. Stat. 13(4), 907–929 (2004)
  13. Chib, S., Jeliazkov, I.: Marginal likelihood from the Metropolis–Hastings output. J. Am. Stat. Assoc. 96, 270–281 (2001)
  14. Chopin, N.: A sequential particle filter for static models. Biometrika 89, 539–552 (2002)
  15. Cornuet, J.M., Marin, J.M., Mira, A., Robert, C.P.: Adaptive multiple importance sampling. Scand. J. Stat. 39(4), 798–812 (2012)
  16. Craiu, R., Rosenthal, J., Yang, C.: Learn from thy neighbor: parallel-chain and regional adaptive MCMC. J. Am. Stat. Assoc. 104(488), 1454–1466 (2009)
  17. Del Moral, P., Doucet, A., Jasra, A.: Sequential Monte Carlo samplers. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 68(3), 411–436 (2006)
  18. Douc, R., Guillin, A., Marin, J.M., Robert, C.P.: Convergence of adaptive mixtures of importance sampling schemes. Ann. Stat. 35, 420–448 (2007a)
  19. Douc, R., Guillin, A., Marin, J.M., Robert, C.P.: Minimum variance importance sampling via population Monte Carlo. ESAIM Probab. Stat. 11, 427–447 (2007b)
  20. Doucet, A., Johansen, A.M.: A tutorial on particle filtering and smoothing: fifteen years later. Technical report (2008)
  21. Doucet, A., Wang, X.: Monte Carlo methods for signal processing. IEEE Signal Process. Mag. 22(6), 152–170 (2005)
  22. Elvira, V., Martino, L., Luengo, D., Bugallo, M.F.: Efficient multiple importance sampling estimators. IEEE Signal Process. Lett. 22(10), 1757–1761 (2015)
  23. Elvira, V., Martino, L., Luengo, D., Bugallo, M.F.: Generalized multiple importance sampling. arXiv:1511.03095 (2015)
  24. Fearnhead, P., Taylor, B.M.: An adaptive sequential Monte Carlo sampler. Bayesian Anal. 8(2), 411–438 (2013)
  25. Fitzgerald, W.J.: Markov chain Monte Carlo methods with applications to signal processing. Signal Process. 81(1), 3–18 (2001)
  26. Friel, N., Wyse, J.: Estimating the model evidence: a review. arXiv:1111.1957 (2011)
  27. Geyer, C.J.: Markov chain Monte Carlo maximum likelihood. In: Computing Science and Statistics: Proceedings of the 23rd Symposium on the Interface, pp. 156–163 (1991)
  28. Haario, H., Saksman, E., Tamminen, J.: An adaptive Metropolis algorithm. Bernoulli 7(2), 223–242 (2001)
  29. Ihler, A.T., Fisher, J.W., Moses, R.L., Willsky, A.S.: Nonparametric belief propagation for self-localization of sensor networks. IEEE J. Sel. Areas Commun. 23(4), 809–819 (2005)
  30. Jacob, P., Robert, C.P., Smith, M.H.: Using parallel computation to improve independent Metropolis–Hastings based estimation. J. Comput. Graph. Stat. 20(3), 616–635 (2011)
  31. Liang, F., Liu, C., Carroll, R.: Advanced Markov Chain Monte Carlo Methods: Learning from Past Samples. Wiley Series in Computational Statistics. Wiley, England (2010)
  32. Liesenfeld, R., Richard, J.F.: Improving MCMC, using efficient importance sampling. Comput. Stat. Data Anal. 53, 272–288 (2008)
  33. Liu, J.S.: Monte Carlo Strategies in Scientific Computing. Springer, Berlin (2004)
  34. Liu, J.S., Liang, F., Wong, W.H.: The multiple-try method and local optimization in Metropolis sampling. J. Am. Stat. Assoc. 95(449), 121–134 (2000)
  35. Luengo, D., Martino, L.: Fully adaptive Gaussian mixture Metropolis–Hastings algorithm. In: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (2013)
  36. Marin, J.M., Pudlo, P., Sedki, M.: Consistency of the adaptive multiple importance sampling. arXiv:1211.2548 (2012)
  37. Marinari, E., Parisi, G.: Simulated tempering: a new Monte Carlo scheme. Europhys. Lett. 19(6), 451–458 (1992)
  38. Martino, L., Elvira, V., Luengo, D., Artes, A., Corander, J.: Orthogonal MCMC algorithms. In: IEEE Workshop on Statistical Signal Processing (SSP), pp. 364–367 (2014)
  39. Martino, L., Elvira, V., Luengo, D., Artes, A., Corander, J.: Smelly parallel MCMC chains. In: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (2015)
  40. Martino, L., Elvira, V., Luengo, D., Corander, J.: An adaptive population importance sampler: learning from the uncertainty. IEEE Trans. Signal Process. 63(16), 4422–4437 (2015)
  41. Martino, L., Elvira, V., Luengo, D., Corander, J.: MCMC-driven adaptive multiple importance sampling. In: Interdisciplinary Bayesian Statistics. Springer Proceedings in Mathematics & Statistics, vol. 118, Chap. 8, pp. 97–109 (2015)
  42. Martino, L., Míguez, J.: A generalization of the adaptive rejection sampling algorithm. Stat. Comput. 21(4), 633–647 (2011)
  43. Mendes, E.F., Scharth, M., Kohn, R.: Markov interacting importance samplers. arXiv:1502.07039 (2015)
  44. Neal, R.: MCMC using ensembles of states for problems with fast and slow variables such as Gaussian process regression. arXiv:1101.0387 (2011)
  45. Neal, R.M.: Annealed importance sampling. Stat. Comput. 11(2), 125–139 (2001)
  46. Owen, A.: Monte Carlo Theory, Methods and Examples (2013)
  47. Owen, A., Zhou, Y.: Safe and effective importance sampling. J. Am. Stat. Assoc. 95(449), 135–143 (2000)
  48. Robert, C.P., Casella, G.: Monte Carlo Statistical Methods. Springer, Berlin (2004)
  49. Schäfer, C., Chopin, N.: Sequential Monte Carlo on large binary sampling spaces. Stat. Comput. 23(2), 163–184 (2013)
  50. Skilling, J.: Nested sampling for general Bayesian computation. Bayesian Anal. 1(4), 833–860 (2006)
  51. Veach, E., Guibas, L.: Optimally combining sampling techniques for Monte Carlo rendering. In: SIGGRAPH 1995 Proceedings, pp. 419–428 (1995)
  52. Wand, M.P., Jones, M.C.: Kernel Smoothing. Chapman and Hall, London (1994)
  53. Wang, X., Chen, R., Liu, J.S.: Monte Carlo Bayesian signal processing for wireless communications. J. VLSI Signal Process. 30, 89–105 (2002)
  54. Warnes, G.R.: The Normal Kernel Coupler: an adaptive Markov chain Monte Carlo method for efficiently sampling from multi-modal distributions. Technical Report (2001)
  55. Weinberg, M.D.: Computing the Bayes factor from a Markov chain Monte Carlo simulation of the posterior distribution. arXiv:0911.1777 (2010)
  56. Yuan, X., Lu, Z., Yue, C.Z.: A novel adaptive importance sampling algorithm based on Markov chain and low-discrepancy sequence. Aerosp. Sci. Technol. 29, 253–261 (2013)

Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland
  2. Department of Signal Theory and Communications, Universidad Carlos III de Madrid, Leganés, Spain
  3. Department of Circuits and Systems Engineering, Universidad Politécnica de Madrid, Madrid, Spain