Jump Markov chains and rejection-free Metropolis algorithms

  • Original paper
  • Published in: Computational Statistics

Abstract

We consider versions of the Metropolis algorithm which avoid the inefficiency of rejections. We first illustrate that a natural Uniform Selection algorithm might not converge to the correct distribution. We then analyse the use of Markov jump chains which avoid successive repetitions of the same state. After exploring the properties of jump chains, we show how they can exploit parallelism in computer hardware to produce more efficient samples. We apply our results to the Metropolis algorithm, to Parallel Tempering, to a Bayesian model, to a two-dimensional ferromagnetic 4\(\times \)4 Ising model, and to a pseudo-marginal MCMC algorithm.
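
To make the jump chain concrete, here is a minimal sketch in R of a rejection-free Metropolis sampler on a toy discrete target. This is our illustration under stated assumptions, not the paper's implementation (for that, see the programs in the Notes); the target, the ±1 proposal, and all variable names are illustrative choices. From each state, the sampler jumps directly to a new state chosen in proportion to the Metropolis move probabilities, and weights each visited state by its expected holding time.

    set.seed(1)
    N <- 10
    target <- (1:N)^2                  # unnormalised toy target (our choice)

    nsteps <- 10000
    x <- 1
    states  <- numeric(nsteps)         # distinct states visited by the jump chain
    weights <- numeric(nsteps)         # expected holding time at each state

    for (n in 1:nsteps) {
      nbrs <- c(x - 1, x + 1)                        # uniform +/-1 proposal
      nbrs <- nbrs[nbrs >= 1 & nbrs <= N]            # off-range proposals reject
      p <- 0.5 * pmin(1, target[nbrs] / target[x])   # Metropolis move probabilities
      escape <- sum(p)                               # chance of leaving x this step
      states[n]  <- x
      weights[n] <- 1 / escape                       # mean geometric holding time
      x <- nbrs[sample.int(length(nbrs), 1, prob = p)]  # jump to a new state
    }

    # Weighted averages estimate expectations under the target, e.g. E[X]:
    sum(weights * states) / sum(weights)   # compare sum((1:N)^3) / sum((1:N)^2)

Weighting by the expected holding time 1/escape, rather than by a sampled geometric holding time, is a weighting in the spirit of the Rao-Blackwellization of Douc and Robert (2011), cited below.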


Notes

  1. Performed using the C program available at http://probability.ca/rejfree.c

  2. Performed using the R program available at: http://probability.ca/rejectionfreesim.

  3. Performed using the R program available at: http://probability.ca/rejectionfreemod.

References

  • Andrieu C, Roberts GO (2009) The pseudo-marginal approach for efficient Monte Carlo computations. Ann Stat 37(2):697–725

  • Bortz AB, Kalos MH, Lebowitz JL (1975) A new algorithm for Monte Carlo simulation of Ising spin systems. J Comp Phys 17:10–18

  • Brooks S, Gelman A, Jones GL, Meng X-L (eds) (2011) Handbook of Markov chain Monte Carlo. Chapman and Hall/CRC Press, Boca Raton

  • Deligiannidis G, Lee A (2018) Which ergodic averages have finite asymptotic variance? Ann Appl Prob 28(4):2309–2334

  • Douc R, Robert CP (2011) A vanilla Rao-Blackwellization of Metropolis–Hastings algorithms. Ann Stat 39:261–277

  • Doucet A, Pitt MK, Deligiannidis G, Kohn R (2015) Efficient implementation of Markov chain Monte Carlo when using an unbiased likelihood estimator. Biometrika 102(2):295–313

  • Durrett R (1999) Essentials of stochastic processes. Springer, New York

  • National Center for Education Statistics (2002) Education Longitudinal Study of 2002. Available at: https://nces.ed.gov/surveys/els2002/

  • Geyer CJ (1991) Markov chain Monte Carlo maximum likelihood. In: Computing Science and Statistics, Proceedings of the 23rd Symposium on the Interface, American Statistical Association, New York, pp 156–163

  • Hastings WK (1970) Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57:97–109

  • Iliopoulos G, Malefaki S (2013) Variance reduction of estimators arising from Metropolis–Hastings algorithms. Stat Comput 23:577–587

  • Korniss G, Novotny MA, Rikvold PA (1999) Parallelization of a dynamic Monte Carlo algorithm: a partially rejection-free conservative approach. J Comp Phys 153(2):488–508

  • Lubachevsky BD (1988) Efficient parallel simulations of dynamic Ising spin systems. J Comp Phys 75(1):103–122

  • Malefaki S, Iliopoulos G (2008) On convergence of properly weighted samples to the target distribution. J Stat Plan Inference 138:1210–1225

  • Metropolis N, Rosenbluth A, Rosenbluth M, Teller A, Teller E (1953) Equation of state calculations by fast computing machines. J Chem Phys 21:1087–1091

  • Meyn SP, Tweedie RL (1993) Markov chains and stochastic stability. Springer, London. Available at: http://probability.ca/MT/

  • Roberts GO, Gelman A, Gilks WR (1997) Weak convergence and optimal scaling of random walk Metropolis algorithms. Ann Appl Prob 7:110–120

  • Roberts GO, Rosenthal JS (2001) Optimal scaling for various Metropolis–Hastings algorithms. Stat Sci 16:351–367

  • Roberts GO, Rosenthal JS (2014) Minimising MCMC variance via diffusion limits, with an application to simulated tempering. Ann Appl Prob 24:131–149

  • Rosenthal JS (2006) A first look at rigorous probability theory, 2nd edn. World Scientific Publishing, Singapore

  • Rosenthal JS (2019) A first look at stochastic processes. World Scientific Publishing, Singapore

  • Swendsen RH, Wang JS (1986) Replica Monte Carlo simulation of spin-glasses. Phys Rev Lett 57:2607–2609


Acknowledgements

This work was supported by research grants from Fujitsu Laboratories Ltd. We thank the editor and referees for very helpful comments which have greatly improved the manuscript.

Author information


Corresponding author

Correspondence to Jeffrey S. Rosenthal.


Appendix: Proof of Proposition 1

Lemma 15

For the Uniform Selection chain of Fig. 6, let \(s(x) = \mathbf{P}(\text{hit } 4 \text{ before } 0 \, | \, X_0 = x)\). Then \(s(0)=0\), \(s(1)=3/7\), \(s(2)=4/7\), \(s(3)=13/21\), and \(s(4)=1\).

Proof

Clearly \(s(0)=0\) and \(s(4)=1\). Also, by conditioning on the first step, for \(1 \le x \le 3\) we have \(s(x) = p_{x,x-1} \, s(x-1) + p_{x,x+1} \, s(x+1)\). In particular, \(s(1) = (1/4) s(0) + (3/4) s(2) = (3/4) s(2)\), \(s(2) = (1/4) s(1) + (3/4) s(3)\), and \(s(3) = (8/9) s(2) + (1/9) s(4) = (8/9) s(2) + (1/9)\). Substituting the first equation into the second, \(s(2) = (1/4)(3/4) s(2) + (3/4) s(3)\), so \((13/16) s(2) = (3/4) s(3)\), whence \(s(3) = (13/16)(4/3) s(2) = (13/12) s(2)\). The third equation then gives \((13/12) s(2) = (8/9) s(2) + (1/9)\), so \((7/36) s(2) = (1/9)\), so \(s(2) = (1/9)(36/7) = 4/7\). Then \(s(1) = (3/4) s(2) = (3/4)(4/7) = 3/7\), and \(s(3) = (8/9) s(2) + (1/9) = (8/9)(4/7) + (1/9) = 13/21\), as claimed. \(\square \)
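
As an informal numerical check of this algebra (ours, not part of the original proof), the three hitting-probability equations can be solved directly in R:

    # Hitting equations from the proof, with s(0) = 0 and s(4) = 1:
    #        s(1) - (3/4) s(2)              = 0
    # -(1/4) s(1) +       s(2) - (3/4) s(3) = 0
    #              -(8/9) s(2) +       s(3) = 1/9
    A <- matrix(c( 1,   -3/4,  0,
                  -1/4,  1,   -3/4,
                   0,   -8/9,  1), nrow = 3, byrow = TRUE)
    b <- c(0, 0, 1/9)
    solve(A, b)   # 0.4285714 0.5714286 0.6190476, i.e. 3/7, 4/7, 13/21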

Lemma 16

Suppose the Uniform Selection chain for Example 2 begins at state \(x=4a\) for some positive integer a. Let C be the event that the chain hits \(4(a+1)\) before hitting \(4(a-1)\). Then \(q := \mathbf{P}(C) = 9/17 > 1/2\).

Proof

By conditioning on the first step, we have that

$$\begin{aligned} q &= \mathbf{P}(C \, | \, X_0 = 4a) \\ &= \mathbf{P}(X_1 = 4a+1) \, \mathbf{P}(C \, | \, X_0 = 4a+1) + \mathbf{P}(X_1 = 4a-1) \, \mathbf{P}(C \, | \, X_0 = 4a-1) \\ &= (1/2) \, \mathbf{P}(C \, | \, X_0 = 4a+1) + (1/2) \, \mathbf{P}(C \, | \, X_0 = 4a-1) \, . \end{aligned}$$

But from \(4a+1\), by Lemma 15, we either reach \(4a+4\) before returning to 4a (and “win”) with probability 3/7, or we first return to 4a (and “start over”) with probability 4/7. Similarly, from \(4a-1\), we either return to 4a (and “start over”) with probability 13/21, or we reach \(4a-4\) before returning to 4a (and “lose”) with probability 8/21. Hence,

$$\begin{aligned} q = (1/2) \, [ (3/7) + (4/7) q ] + (1/2) \, [ (13/21) q + 0] \, . \end{aligned}$$

That is, \(q = (3/14) + (2/7) q + (13/42) q = (3/14) + (25/42) q\). Hence, \(q = (3/14) \bigm / (17/42) = 9/17 > 1/2\). \(\square \)
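
Again as an informal check (ours), the fixed-point equation for \(q\) can be evaluated directly:

    # q = 3/14 + (25/42) q  =>  q = (3/14) / (17/42)
    (3/14) / (1 - 25/42)   # 0.5294118 = 9/17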

We then have:

Corollary 17

Suppose the Uniform Selection chain for Example 2 begins at state \(4a \ge 8\) for some positive integer \(a \ge 2\). Then the probability it will ever reach the state 4 is \((8/9)^{a-1} < 1\).

Proof

Consider the sub-chain \(\{\tilde{X}_n\}\) of \(\{X_n\}\) which records each new multiple of 4 that the chain visits. That is, if the original chain is at the state \(4b\), then the new chain is at \(b\); we then wait until the original chain reaches either \(4(b-1)\) or \(4(b+1)\), at which point the next state of the new chain is \(b-1\) or \(b+1\), respectively. Lemma 16 says that this new chain performs simple random walk on the positive integers, with up-probability 9/17 and down-probability 8/17. It then follows from the Gambler's Ruin formula (e.g. Rosenthal 2006, equation 7.2.7) that, starting from state \(a\), the probability that the new chain will ever reach the state 1 is equal to \([(8/17)/(9/17)]^{a-1} = (8/9)^{a-1} < 1\), as claimed. \(\square \)

Since the chain starting at 4a for \(a \ge 2\) cannot reach state 3 without first reaching state 4, Proposition 1 follows immediately from Corollary 17.
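
For a concrete illustration of the Gambler's Ruin computation in Corollary 17 (an informal check, ours, using the standard hitting-probability formula for a biased walk, with illustrative values of \(a\) and the cutoff \(M\)): the probability that the embedded walk hits 1 before a large cutoff \(M\), started from \(a\), approaches \((8/9)^{a-1}\) as \(M\) grows.

    r <- (8/17) / (9/17)   # = 8/9, the down/up ratio of the embedded walk
    a <- 5; M <- 200       # illustrative start and cutoff (our choices)
    (r^(a - 1) - r^(M - 1)) / (1 - r^(M - 1))   # P(hit 1 before M) = 0.6243...
    r^(a - 1)                                   # limit as M -> infinity: (8/9)^4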

If we instead cut off the example at the state \(4L\), then the Gambler's Ruin formula (e.g. Rosenthal 2006, equation 7.2.2) says that from the state \(4(L-1)\), the probability of reaching the state 4 before returning to the state \(4L\) is \([(9/8)^1-1] \bigm / [(9/8)^{L-1}-1] < (8/9)^{L-2}\) (since \([A-1] \bigm / [B-1] < A/B\) whenever \(1<A<B\)), so the expected number of attempts needed to reach state 4 from state \(4L\) is more than \((9/8)^{L-2}\).
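
As an informal check of this closed form (ours, not part of the paper), one can solve the hitting-probability equations exactly for a small cutoff \(L\) and compare:

    hitprob <- function(L) {
      p <- 9/17; q <- 8/17          # up- and down-probabilities of the new chain
      n <- L - 2                    # unknowns h(2), ..., h(L-1); h(1)=1, h(L)=0
      A <- diag(n); b <- numeric(n)
      for (i in 1:n) {              # row for state i+1: h = q*h(down) + p*h(up)
        if (i > 1) A[i, i - 1] <- -q else b[i] <- q
        if (i < n) A[i, i + 1] <- -p
      }
      solve(A, b)[n]                # started from state L-1
    }
    L <- 6
    hitprob(L)                          # exact hitting probability: 0.1558...
    (9/8 - 1) / ((9/8)^(L - 1) - 1)     # closed form above: identical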


Cite this article

Rosenthal, J.S., Dote, A., Dabiri, K. et al. Jump Markov chains and rejection-free Metropolis algorithms. Comput Stat 36, 2789–2811 (2021). https://doi.org/10.1007/s00180-021-01095-2
