Jump Markov chains and rejection-free Metropolis algorithms

  • Original paper
  • Published in: Computational Statistics

Abstract

We consider versions of the Metropolis algorithm which avoid the inefficiency of rejections. We first illustrate that a natural Uniform Selection algorithm might not converge to the correct distribution. We then analyse the use of Markov jump chains which avoid successive repetitions of the same state. After exploring the properties of jump chains, we show how they can exploit parallelism in computer hardware to produce more efficient samples. We apply our results to the Metropolis algorithm, to Parallel Tempering, to a Bayesian model, to a two-dimensional ferromagnetic 4\(\times \)4 Ising model, and to a pseudo-marginal MCMC algorithm.
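
To make the jump chain concrete, here is a minimal sketch in R of a rejection-free Metropolis sampler on a toy discrete target. This is our illustration under stated assumptions, not the paper's implementation (for that, see the programs in the Notes); the target, the ±1 proposal, and all variable names are illustrative choices. From each state, the sampler jumps directly to a new state chosen in proportion to the Metropolis move probabilities, and weights each visited state by its expected holding time.

    set.seed(1)
    N <- 10
    target <- (1:N)^2                  # unnormalised toy target (our choice)

    nsteps <- 10000
    x <- 1
    states  <- numeric(nsteps)         # distinct states visited by the jump chain
    weights <- numeric(nsteps)         # expected holding time at each state

    for (n in 1:nsteps) {
      nbrs <- c(x - 1, x + 1)                        # uniform +/-1 proposal
      nbrs <- nbrs[nbrs >= 1 & nbrs <= N]            # off-range proposals reject
      p <- 0.5 * pmin(1, target[nbrs] / target[x])   # Metropolis move probabilities
      escape <- sum(p)                               # chance of leaving x this step
      states[n]  <- x
      weights[n] <- 1 / escape                       # mean geometric holding time
      x <- nbrs[sample.int(length(nbrs), 1, prob = p)]  # jump to a new state
    }

    # Weighted averages estimate expectations under the target, e.g. E[X]:
    sum(weights * states) / sum(weights)   # compare sum((1:N)^3) / sum((1:N)^2)

Weighting by the expected holding time 1/escape, rather than by a sampled geometric holding time, is a weighting in the spirit of the Rao-Blackwellization of Douc and Robert (2011), cited below.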


Notes

  1. Performed using the C program available at http://probability.ca/rejfree.c

  2. Performed using the R program available at: http://probability.ca/rejectionfreesim.

  3. Performed using the R program available at: http://probability.ca/rejectionfreemod.

References

  • Andrieu C, Roberts GO (2009) The pseudo-marginal approach for efficient Monte Carlo computations. Ann Stat 37(2):697–725

  • Bortz AB, Kalos MH, Lebowitz JL (1975) A new algorithm for Monte Carlo simulation of Ising spin systems. J Comp Phys 17:10–18

  • Brooks S, Gelman A, Jones GL, Meng X-L (eds) (2011) Handbook of Markov chain Monte Carlo. Chapman and Hall/CRC Press, Boca Raton

  • Deligiannidis G, Lee A (2018) Which ergodic averages have finite asymptotic variance? Ann Appl Prob 28(4):2309–2334

  • Douc R, Robert CP (2011) A vanilla Rao-Blackwellization of Metropolis–Hastings algorithms. Ann Stat 39:261–277

  • Doucet A, Pitt MK, Deligiannidis G, Kohn R (2015) Efficient implementation of Markov chain Monte Carlo when using an unbiased likelihood estimator. Biometrika 102(2):295–313

  • Durrett R (1999) Essentials of stochastic processes. Springer, New York

  • National Center for Education Statistics (2002) Education Longitudinal Study of 2002. Available at: https://nces.ed.gov/surveys/els2002/

  • Geyer CJ (1991) Markov chain Monte Carlo maximum likelihood. In: Computing Science and Statistics, Proceedings of the 23rd Symposium on the Interface, American Statistical Association, New York, pp 156–163

  • Hastings WK (1970) Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57:97–109

  • Iliopoulos G, Malefaki S (2013) Variance reduction of estimators arising from Metropolis–Hastings algorithms. Stat Comput 23:577–587

  • Korniss G, Novotny MA, Rikvold PA (1999) Parallelization of a dynamic Monte Carlo algorithm: a partially rejection-free conservative approach. J Comp Phys 153(2):488–508

  • Lubachevsky BD (1988) Efficient parallel simulations of dynamic Ising spin systems. J Comp Phys 75(1):103–122

  • Malefaki S, Iliopoulos G (2008) On convergence of properly weighted samples to the target distribution. J Stat Plan Inference 138:1210–1225

  • Metropolis N, Rosenbluth A, Rosenbluth M, Teller A, Teller E (1953) Equation of state calculations by fast computing machines. J Chem Phys 21:1087–1091

  • Meyn SP, Tweedie RL (1993) Markov chains and stochastic stability. Springer, London. Available at: http://probability.ca/MT/

  • Roberts GO, Gelman A, Gilks WR (1997) Weak convergence and optimal scaling of random walk Metropolis algorithms. Ann Appl Prob 7:110–120

  • Roberts GO, Rosenthal JS (2001) Optimal scaling for various Metropolis–Hastings algorithms. Stat Sci 16:351–367

  • Roberts GO, Rosenthal JS (2014) Minimising MCMC variance via diffusion limits, with an application to simulated tempering. Ann Appl Prob 24:131–149

  • Rosenthal JS (2006) A first look at rigorous probability theory, 2nd edn. World Scientific Publishing, Singapore

  • Rosenthal JS (2019) A first look at stochastic processes. World Scientific Publishing, Singapore

  • Swendsen RH, Wang JS (1986) Replica Monte Carlo simulation of spin-glasses. Phys Rev Lett 57:2607–2609


Acknowledgements

This work was supported by research grants from Fujitsu Laboratories Ltd. We thank the editor and referees for very helpful comments which have greatly improved the manuscript.

Author information


Corresponding author

Correspondence to Jeffrey S. Rosenthal.


Appendix: Proof of Proposition 1

Lemma 15

For the Uniform Selection chain of Fig. 6, let \(s(x) = \mathbf{P}(\text{hit } 4 \text{ before } 0 \, | \, X_0 = x)\). Then \(s(0)=0\), \(s(1)=3/7\), \(s(2)=4/7\), \(s(3)=13/21\), and \(s(4)=1\).

Proof

Clearly \(s(0)=0\) and \(s(4)=1\). Also, by conditioning on the first step, for \(1 \le x \le 3\) we have \(s(x) = p_{x,x-1} \, s(x-1) + p_{x,x+1} \, s(x+1)\). In particular, \(s(1) = (1/4) s(0) + (3/4) s(2) = (3/4) s(2)\), \(s(2) = (1/4) s(1) + (3/4) s(3)\), and \(s(3) = (8/9) s(2) + (1/9) s(4) = (8/9) s(2) + (1/9)\). Substituting the first equation into the second, \(s(2) = (1/4)(3/4) s(2) + (3/4) s(3)\), so \((13/16) s(2) = (3/4) s(3)\), whence \(s(3) = (13/16)(4/3) s(2) = (13/12) s(2)\). The third equation then gives \((13/12) s(2) = (8/9) s(2) + (1/9)\), so \((7/36) s(2) = (1/9)\), so \(s(2) = (1/9)(36/7) = 4/7\). Then \(s(1) = (3/4) s(2) = (3/4)(4/7) = 3/7\), and \(s(3) = (8/9) s(2) + (1/9) = (8/9)(4/7) + (1/9) = 13/21\), as claimed. \(\square \)
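
As an informal numerical check of this algebra (ours, not part of the original proof), the three hitting-probability equations can be solved directly in R:

    # Hitting equations from the proof, with s(0) = 0 and s(4) = 1:
    #        s(1) - (3/4) s(2)              = 0
    # -(1/4) s(1) +       s(2) - (3/4) s(3) = 0
    #              -(8/9) s(2) +       s(3) = 1/9
    A <- matrix(c( 1,   -3/4,  0,
                  -1/4,  1,   -3/4,
                   0,   -8/9,  1), nrow = 3, byrow = TRUE)
    b <- c(0, 0, 1/9)
    solve(A, b)   # 0.4285714 0.5714286 0.6190476, i.e. 3/7, 4/7, 13/21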

Lemma 16

Suppose the Uniform Selection chain for Example 2 begins at state \(x=4a\) for some positive integer a. Let C be the event that the chain hits \(4(a+1)\) before hitting \(4(a-1)\). Then \(q := \mathbf{P}(C) = 9/17 > 1/2\).

Proof

By conditioning on the first step, we have that

$$\begin{aligned} q &= \mathbf{P}(C \, | \, X_0 = 4a) \\ &= \mathbf{P}(X_1 = 4a+1) \, \mathbf{P}(C \, | \, X_0 = 4a+1) + \mathbf{P}(X_1 = 4a-1) \, \mathbf{P}(C \, | \, X_0 = 4a-1) \\ &= (1/2) \, \mathbf{P}(C \, | \, X_0 = 4a+1) + (1/2) \, \mathbf{P}(C \, | \, X_0 = 4a-1) \, . \end{aligned}$$

But from \(4a+1\), by Lemma 15, we either reach \(4a+4\) before returning to 4a (and “win”) with probability 3/7, or we first return to 4a (and “start over”) with probability 4/7. Similarly, from \(4a-1\), we either return to 4a (and “start over”) with probability 13/21, or we reach \(4a-4\) before returning to 4a (and “lose”) with probability 8/21. Hence,

$$\begin{aligned} q = (1/2) \, [ (3/7) + (4/7) q ] + (1/2) \, [ (13/21) q + 0] \, . \end{aligned}$$

That is, \(q = (3/14) + (2/7) q + (13/42) q = (3/14) + (25/42) q\). Hence, \(q = (3/14) \bigm / (17/42) = 9/17 > 1/2\). \(\square \)
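
Again as an informal check (ours), the fixed-point equation for \(q\) can be evaluated directly:

    # q = 3/14 + (25/42) q  =>  q = (3/14) / (17/42)
    (3/14) / (1 - 25/42)   # 0.5294118 = 9/17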

We then have:

Corollary 17

Suppose the Uniform Selection chain for Example 2 begins at state \(4a \ge 8\) for some positive integer \(a \ge 2\). Then the probability it will ever reach the state 4 is \((8/9)^{a-1} < 1\).

Proof

Consider the sub-chain \(\{\tilde{X}_n\}\) of \(\{X_n\}\) which records each new multiple of 4 that the chain visits. That is, if the original chain is at the state \(4b\), then the new chain is at \(b\); we then wait until the original chain reaches either \(4(b-1)\) or \(4(b+1)\), at which point the next state of the new chain is \(b-1\) or \(b+1\), respectively. Lemma 16 says that this new chain performs simple random walk on the positive integers, with up-probability 9/17 and down-probability 8/17. It then follows from the Gambler's Ruin formula (e.g. Rosenthal 2006, equation 7.2.7) that, starting from state \(a\), the probability that the new chain will ever reach the state 1 is equal to \([(8/17)/(9/17)]^{a-1} = (8/9)^{a-1} < 1\), as claimed. \(\square \)

Since the chain starting at 4a for \(a \ge 2\) cannot reach state 3 without first reaching state 4, Proposition 1 follows immediately from Corollary 17.
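
For a concrete illustration of the Gambler's Ruin computation in Corollary 17 (an informal check, ours, using the standard hitting-probability formula for a biased walk, with illustrative values of \(a\) and the cutoff \(M\)): the probability that the embedded walk hits 1 before a large cutoff \(M\), started from \(a\), approaches \((8/9)^{a-1}\) as \(M\) grows.

    r <- (8/17) / (9/17)   # = 8/9, the down/up ratio of the embedded walk
    a <- 5; M <- 200       # illustrative start and cutoff (our choices)
    (r^(a - 1) - r^(M - 1)) / (1 - r^(M - 1))   # P(hit 1 before M) = 0.6243...
    r^(a - 1)                                   # limit as M -> infinity: (8/9)^4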

If we instead cut off the example at the state \(4L\), then the Gambler's Ruin formula (e.g. Rosenthal 2006, equation 7.2.2) says that from the state \(4(L-1)\), the probability of reaching the state 4 before returning to the state \(4L\) is \([(9/8)^1-1] \bigm / [(9/8)^{L-1}-1] < (8/9)^{L-2}\) (since \([A-1] \bigm / [B-1] < A/B\) whenever \(1<A<B\)), so the expected number of attempts needed to reach state 4 from state \(4L\) is more than \((9/8)^{L-2}\).
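
As an informal check of this closed form (ours, not part of the paper), one can solve the hitting-probability equations exactly for a small cutoff \(L\) and compare:

    hitprob <- function(L) {
      p <- 9/17; q <- 8/17          # up- and down-probabilities of the new chain
      n <- L - 2                    # unknowns h(2), ..., h(L-1); h(1)=1, h(L)=0
      A <- diag(n); b <- numeric(n)
      for (i in 1:n) {              # row for state i+1: h = q*h(down) + p*h(up)
        if (i > 1) A[i, i - 1] <- -q else b[i] <- q
        if (i < n) A[i, i + 1] <- -p
      }
      solve(A, b)[n]                # started from state L-1
    }
    L <- 6
    hitprob(L)                          # exact hitting probability: 0.1558...
    (9/8 - 1) / ((9/8)^(L - 1) - 1)     # closed form above: identical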


Cite this article

Rosenthal, J.S., Dote, A., Dabiri, K. et al. Jump Markov chains and rejection-free Metropolis algorithms. Comput Stat 36, 2789–2811 (2021). https://doi.org/10.1007/s00180-021-01095-2
