Statistics and Computing, Volume 26, Issue 1–2, pp 361–381

Adaptive Metropolis–Hastings sampling using reversible dependent mixture proposals

  • Minh-Ngoc Tran
  • Michael K. Pitt
  • Robert Kohn


This article develops a general-purpose adaptive sampler for sampling from a high-dimensional and/or multimodal target. The adaptive sampler is based on reversible proposal densities, each of which has a mixture of multivariate \(t\) densities as its invariant density. The reversible proposals combine independent and correlated components, which lets the sampler traverse the sample space efficiently while still moving and exploring the sample space locally. We employ a two-chain approach, using a trial chain to adapt the proposal in the main chain. Convergence of the main chain and a strong law of large numbers are obtained under checkable conditions, and without imposing a diminishing adaptation condition. The mixtures of multivariate \(t\) densities are fitted by an efficient variational approximation algorithm in which the number of components is determined automatically. The performance of the sampler is evaluated using simulated and real examples. Our framework is quite general and can handle reversible proposal densities whose invariant densities are mixtures other than \(t\) mixtures.
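The role of a reversible proposal can be illustrated with a toy example. If a proposal kernel \(T\) is reversible with respect to a density \(q\), that is \(q(x)T(x,y) = q(y)T(y,x)\), then the Metropolis–Hastings acceptance ratio reduces to \(\min\{1, [\pi(y)/q(y)] / [\pi(x)/q(x)]\}\), the same form as for an independent proposal from \(q\). The sketch below is not the authors' algorithm: it assumes a one-dimensional target, replaces the fitted \(t\) mixture with a single standard normal \(q\), and uses an autoregressive move that leaves that normal invariant. The names `log_target`, `log_q`, `propose`, `sample`, `rho`, and `w_indep` are hypothetical, and no adaptation or trial chain is included.

```python
import math
import random


def log_target(x):
    # Hypothetical target (assumption): unnormalised N(1, 0.5^2) density.
    return -0.5 * ((x - 1.0) / 0.5) ** 2


def log_q(x):
    # Invariant density of the proposal. A standard normal is used here
    # as a simple stand-in for the fitted t mixture (assumption).
    return -0.5 * x * x


def propose(x, rng, rho=0.9, w_indep=0.3):
    # Mixture of two kernels, each reversible with respect to N(0, 1):
    #   r = 0   -> independent draw from q (global move)
    #   r = rho -> autoregressive move correlated with x (local move)
    r = 0.0 if rng.random() < w_indep else rho
    return r * x + math.sqrt(1.0 - r * r) * rng.gauss(0.0, 1.0)


def sample(n, seed=0):
    rng = random.Random(seed)
    x = 0.0
    draws = []
    for _ in range(n):
        y = propose(x, rng)
        # Reversibility of the proposal w.r.t. q gives
        # T(y, x) / T(x, y) = q(x) / q(y), so only pi/q ratios appear.
        log_alpha = (log_target(y) - log_q(y)) - (log_target(x) - log_q(x))
        if rng.random() < math.exp(min(0.0, log_alpha)):
            x = y
        draws.append(x)
    return draws


if __name__ == "__main__":
    draws = sample(20000, seed=1)
    print(sum(draws) / len(draws))  # sample mean; the toy target has mean 1
```

Setting `w_indep` to 1 recovers an independent Metropolis–Hastings sampler, while smaller values put more weight on the correlated local moves; the acceptance rule is unchanged in either case because both kernels share the same invariant density.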


Ergodic convergence; Markov chain Monte Carlo; Metropolis-within-Gibbs sampling; Multivariate \(t\) mixtures; Simulated annealing; Variational approximation



The research of Minh-Ngoc Tran and Robert Kohn was partially supported by Australian Research Council grant DP0667069. The authors thank the referee and the Associate Editor for their comments which improved the presentation and technical content of the paper.



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  1. Australian School of Business, University of New South Wales, Sydney, Australia
  2. Department of Economics, University of Warwick, Coventry, United Kingdom
