
Statistics and Computing, Volume 18, Issue 4, pp. 343–373

A tutorial on adaptive MCMC

  • Christophe Andrieu
  • Johannes Thoms

Abstract

We review adaptive Markov chain Monte Carlo (MCMC) algorithms as a means of optimising their performance. Using simple toy examples we review their theoretical underpinnings, and in particular show why adaptive MCMC algorithms may fail when some fundamental properties are not satisfied. This leads to guidelines for the design of correct algorithms. We then review optimisation criteria and the useful framework of stochastic approximation, which allows one both to optimise commonly used criteria in a systematic fashion and to analyse the properties of adaptive MCMC algorithms. We then propose a series of novel adaptive algorithms which prove to be robust and reliable in practice. These algorithms are applied to artificial and high-dimensional scenarios, as well as to the classic mine disaster dataset inference problem.
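
To illustrate the general principle, the following minimal sketch (not taken from the paper; the 0.234 target acceptance rate, the step-size exponent and all function names are illustrative assumptions) adapts the scale of a random-walk Metropolis proposal with a Robbins-Monro update, so that the empirical acceptance rate drifts towards a fixed target while the adaptation step size decreases over time.

import numpy as np

def adaptive_rwm(log_target, x0, n_iter=10000, target_accept=0.234, gamma_exp=0.6):
    """Random-walk Metropolis with Robbins-Monro adaptation of the proposal scale.

    After each iteration the log proposal scale is nudged so that the
    acceptance probability drifts towards `target_accept`.
    """
    x = np.asarray(x0, dtype=float)
    d = x.size
    log_sigma = 0.0                      # adapted log proposal scale
    lp_x = log_target(x)
    samples = np.empty((n_iter, d))
    for n in range(1, n_iter + 1):
        proposal = x + np.exp(log_sigma) * np.random.randn(d)
        lp_prop = log_target(proposal)
        accept_prob = float(np.exp(min(0.0, lp_prop - lp_x)))
        if np.random.rand() < accept_prob:
            x, lp_x = proposal, lp_prop
        # Robbins-Monro step with a decreasing gain, so that the
        # adaptation vanishes as the number of iterations grows.
        gamma_n = n ** (-gamma_exp)
        log_sigma += gamma_n * (accept_prob - target_accept)
        samples[n - 1] = x
    return samples, np.exp(log_sigma)

# Example: sample a standard 10-dimensional Gaussian target.
if __name__ == "__main__":
    dim = 10
    samples, final_scale = adaptive_rwm(lambda x: -0.5 * np.dot(x, x), np.zeros(dim))
    print("adapted proposal scale:", final_scale)

The diminishing gain sequence gamma_n is one simple way of making the adaptation vanish over time, which is central to the correctness guidelines discussed in the paper.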

Keywords

MCMC · Adaptive MCMC · Controlled Markov chain · Stochastic approximation

Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  1. School of Mathematics, University of Bristol, Bristol, UK
  2. Chairs of Statistics, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
