Statistics and Computing, Volume 11, Issue 2, pp 125–139

Annealed importance sampling

  • Radford M. Neal


Simulated annealing—moving from a tractable distribution to a distribution of interest via a sequence of intermediate distributions—has traditionally been used as an inexact method of handling isolated modes in Markov chain samplers. Here, it is shown how one can use the Markov chain transitions for such an annealing sequence to define an importance sampler. The Markov chain aspect allows this method to perform acceptably even for high-dimensional problems, where finding good importance sampling distributions would otherwise be very difficult, while the use of importance weights ensures that the estimates found converge to the correct values as the number of annealing runs increases. This annealed importance sampling procedure resembles the second half of the previously-studied tempered transitions, and can be seen as a generalization of a recently-proposed variant of sequential importance sampling. It is also related to thermodynamic integration methods for estimating ratios of normalizing constants. Annealed importance sampling is most attractive when isolated modes are present, or when estimates of normalizing constants are required, but it may also be more generally useful, since its independent sampling allows one to bypass some of the problems of assessing convergence and autocorrelation in Markov chain samplers.
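The procedure summarized above can be illustrated with a minimal sketch. The abstract does not spell out the algorithm, so everything concrete here is an assumption made for illustration: the one-dimensional bimodal target, the N(0, 3²) starting distribution, the geometric interpolation between the two, the linear spacing of the inverse temperatures, the single random-walk Metropolis update per temperature, and all function names. Each annealing run yields one point and one importance weight; since the starting density is normalized, the average weight estimates the normalizing constant of the target.

```python
import math
import random

SIGMA0 = 3.0  # assumed width of the tractable starting distribution


def log_p0(x):
    """Log density of the normalized starting distribution N(0, SIGMA0^2)."""
    return -0.5 * (x / SIGMA0) ** 2 - math.log(SIGMA0 * math.sqrt(2 * math.pi))


def log_f(x, sigma=0.5):
    """Log of an unnormalized toy target with isolated modes at -3 and +3."""
    a = -0.5 * ((x - 3.0) / sigma) ** 2
    b = -0.5 * ((x + 3.0) / sigma) ** 2
    m = max(a, b)  # log-sum-exp to avoid underflow far from both modes
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def log_f_beta(x, beta):
    """Assumed intermediate distributions: geometric path from p0 to f."""
    return (1.0 - beta) * log_p0(x) + beta * log_f(x)


def ais_run(rng, betas, step=1.0):
    """One annealing run: returns the final point and its log importance weight."""
    x = rng.gauss(0.0, SIGMA0)  # independent draw from the starting distribution
    log_w = 0.0
    for b_prev, b in zip(betas[:-1], betas[1:]):
        # Weight factor f_b(x) / f_b_prev(x), evaluated before moving x.
        log_w += (b - b_prev) * (log_f(x) - log_p0(x))
        # One Metropolis update that leaves f_b invariant.
        prop = x + rng.gauss(0.0, step)
        log_u = math.log(1.0 - rng.random())  # argument lies in (0, 1]
        if log_u < log_f_beta(prop, b) - log_f_beta(x, b):
            x = prop
    return x, log_w


def ais_estimate(n_runs=1000, n_temps=100, seed=0):
    """Estimate the target's normalizing constant and E|x| from independent runs."""
    rng = random.Random(seed)
    betas = [j / n_temps for j in range(n_temps + 1)]
    runs = [ais_run(rng, betas) for _ in range(n_runs)]
    weights = [math.exp(lw) for _, lw in runs]
    z_hat = sum(weights) / n_runs  # p0 is normalized, so its constant is 1
    mean_abs = sum(w * abs(x) for (x, _), w in zip(runs, weights)) / sum(weights)
    return z_hat, mean_abs


if __name__ == "__main__":
    z_hat, mean_abs = ais_estimate()
    print("estimated normalizing constant:", z_hat)
    print("exact value for this toy target:", math.sqrt(2 * math.pi))
    print("weighted estimate of E|x|:", mean_abs)
```

Because the runs are independent, the spread of the weights gives a direct indication of the estimator's reliability, with no autocorrelation analysis needed. In this toy setup, using more intermediate temperatures should reduce the variability of the weights at the cost of more Metropolis updates per run.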

Keywords: tempered transitions, sequential importance sampling, estimation of normalizing constants, free energy computation





Copyright information

© Kluwer Academic Publishers 2001

Authors and Affiliations

  • Radford M. Neal
    Department of Statistics and Department of Computer Science, University of Toronto, Toronto, Canada
