
Statistics and Computing, Volume 25, Issue 1, pp 81–92

Particle Metropolis–Hastings using gradient and Hessian information

  • Johan Dahlin
  • Fredrik Lindsten
  • Thomas B. Schön

Abstract

Particle Metropolis–Hastings (PMH) allows for Bayesian parameter inference in nonlinear state space models by combining Markov chain Monte Carlo (MCMC) and particle filtering, where the latter is used to estimate the intractable likelihood. In its original formulation, PMH makes use of a marginal MCMC proposal for the parameters, typically a Gaussian random walk. However, this can lead to poor exploration of the parameter space and inefficient use of the generated particles. We propose a number of alternative versions of PMH that incorporate gradient and Hessian information about the posterior into the proposal. This information is largely obtained as a byproduct of the likelihood estimation: we show how to estimate the required quantities using a fixed-lag particle smoother, at a computational cost that grows linearly in the number of particles. We conclude that the proposed methods can (i) decrease the length of the burn-in phase, (ii) improve the mixing of the Markov chain in the stationary phase, and (iii) make the proposal distribution scale invariant, which simplifies tuning.
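To make the idea concrete, the following is a minimal Python sketch of a gradient-informed PMH update of the kind described in the abstract. The function estimate_logpost_and_gradient is a hypothetical stand-in: in the paper the log-likelihood is estimated by a particle filter and the gradient by a fixed-lag particle smoother, whereas here a noisy Gaussian log-posterior is used so the sketch runs on its own. The proposal step size and all names are illustrative assumptions, not the authors' code.

```python
import numpy as np

# Stand-in for the particle filter / fixed-lag smoother. In practice this
# would run a particle filter to estimate the log-likelihood and a fixed-lag
# smoother to estimate the gradient of the log-posterior; here a toy noisy
# Gaussian log-posterior plays that role (an assumption for illustration).
def estimate_logpost_and_gradient(theta, rng, noise_sd=0.1):
    logpost = -0.5 * np.sum(theta ** 2) + noise_sd * rng.standard_normal()
    grad = -theta + noise_sd * rng.standard_normal(theta.shape)
    return logpost, grad

def log_proposal(theta_to, theta_from, grad_from, eps):
    # Log-density (up to a constant) of the Langevin-style proposal
    # N(theta_from + 0.5 * eps^2 * grad_from, eps^2 * I).
    mean = theta_from + 0.5 * eps ** 2 * grad_from
    diff = theta_to - mean
    return -0.5 * np.sum(diff ** 2) / eps ** 2

def pmh1(theta0, n_iter=5000, eps=0.5, seed=0):
    # First-order PMH: a gradient-informed (MALA-type) proposal combined with
    # the usual pseudo-marginal accept/reject step.
    rng = np.random.default_rng(seed)
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    logpost, grad = estimate_logpost_and_gradient(theta, rng)
    chain = np.empty((n_iter, theta.size))

    for k in range(n_iter):
        # Propose a move along the estimated gradient plus Gaussian noise.
        theta_prop = (theta + 0.5 * eps ** 2 * grad
                      + eps * rng.standard_normal(theta.shape))
        logpost_prop, grad_prop = estimate_logpost_and_gradient(theta_prop, rng)

        # Acceptance probability uses the *estimated* log-posteriors and the
        # asymmetric proposal densities in both directions; the estimate for
        # the current state is kept (not re-estimated), as in pseudo-marginal MH.
        log_alpha = (logpost_prop - logpost
                     + log_proposal(theta, theta_prop, grad_prop, eps)
                     - log_proposal(theta_prop, theta, grad, eps))
        if np.log(rng.uniform()) < log_alpha:
            theta, logpost, grad = theta_prop, logpost_prop, grad_prop
        chain[k] = theta
    return chain

chain = pmh1(theta0=[3.0, -2.0])
print("posterior mean estimate:", chain[1000:].mean(axis=0))
```

A second-order variant would additionally scale the drift and the proposal covariance by an estimate of the negative inverse Hessian, which is what makes the proposal scale invariant in the sense used above.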

Keywords

Sequential Monte Carlo · Particle Markov chain Monte Carlo · Manifold MALA · Fixed-lag particle smoothing · Parameter inference

Acknowledgments

This work was supported by the projects Learning of complex dynamical systems (contract number: 637-2014-466) and Probabilistic modeling of dynamical systems (contract number: 621-2013-5524), and by CADICS, a Linnaeus Center, all funded by the Swedish Research Council.

Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Johan Dahlin, Department of Electrical Engineering, Linköping University, Linköping, Sweden
  • Fredrik Lindsten, Department of Engineering, University of Cambridge, Cambridge, UK
  • Thomas B. Schön, Department of Information Technology, Uppsala University, Uppsala, Sweden
