
Optimal scaling of the MALA algorithm with irreversible proposals for Gaussian targets

Published in: Stochastics and Partial Differential Equations: Analysis and Computations

Abstract

It is well known in many settings that reversible Langevin diffusions in confining potentials converge to equilibrium exponentially fast. Adding irreversible perturbations to the drift of a Langevin diffusion that maintain the same invariant measure accelerates its convergence to stationarity. Many existing works therefore advocate the use of such non-reversible dynamics for sampling. When implementing Markov Chain Monte Carlo (MCMC) algorithms using time discretisations of such Stochastic Differential Equations (SDEs), one can append the usual Metropolis–Hastings accept–reject step to the discretisation; this is often done in practice because the accept–reject step eliminates bias. On the other hand, such a step makes the resulting chain reversible. It is not known whether adding the accept–reject step preserves the faster mixing properties of the non-reversible dynamics. In this paper, we address this gap between theory and practice by analyzing the optimal scaling of MCMC algorithms constructed from proposal moves that are time-step Euler discretisations of an irreversible SDE, for high-dimensional Gaussian target measures. We call the resulting algorithm ipMALA (here ip stands for irreversible proposal), by analogy with the classical MALA algorithm. In order to quantify how the cost of the algorithm scales with the dimension N, we prove invariance principles for the appropriately rescaled chain. In contrast to the usual MALA algorithm, we show that there can be two asymptotic regimes: (i) a diffusive regime, as for MALA, and (ii) a “fluid” regime in which the limit is an ordinary differential equation. We provide concrete examples where the limit is a diffusion, as in standard MALA, but with provably higher limiting acceptance probabilities. Numerical results corroborating the theory are also given.
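The ipMALA construction described above — one Euler step of an irreversible Langevin SDE used as the proposal of a Metropolis–Hastings chain — can be sketched in a few lines. The sketch below is illustrative only, not the paper's exact scheme: it targets a finite-dimensional Gaussian \(N(0,{\mathcal {C}})\), uses the standard irreversible perturbation \(J\nabla V\) with J antisymmetric (which preserves the invariant measure of the continuous dynamics), and the step-size choice \(h=\ell ^2/N^{1/3}\) is just one of the scalings discussed in the paper; all function names are hypothetical.

```python
import numpy as np

def ipmala_chain(C, J, ell, n_steps, rng=None):
    """Sketch of an ipMALA-type chain for a Gaussian target N(0, C) on R^N.

    Proposal: one Euler step of the irreversible Langevin SDE
        dX = -(I - J) grad V(X) dt + sqrt(2) dW,   V(x) = x^T C^{-1} x / 2,
    with J antisymmetric, followed by a Metropolis-Hastings accept-reject
    step.  Returns the final state and the empirical acceptance rate.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    N = C.shape[0]
    Cinv = np.linalg.inv(C)
    h = ell**2 / N**(1.0 / 3.0)            # illustrative step-size scaling
    drift = lambda x: -(np.eye(N) - J) @ (Cinv @ x)
    logpi = lambda x: -0.5 * x @ Cinv @ x
    # log-density of the Gaussian Euler proposal y | x
    logq = lambda x, y: -np.sum((y - x - h * drift(x))**2) / (4.0 * h)

    x = rng.standard_normal(N)
    accepts = 0
    for _ in range(n_steps):
        y = x + h * drift(x) + np.sqrt(2.0 * h) * rng.standard_normal(N)
        logalpha = logpi(y) + logq(y, x) - logpi(x) - logq(x, y)
        if np.log(rng.uniform()) < logalpha:
            x, accepts = y, accepts + 1
    return x, accepts / n_steps
```

With \(J=0\) this reduces to standard MALA; a nonzero antisymmetric J changes the proposal while leaving \(N(0,{\mathcal {C}})\) invariant for the continuous-time dynamics.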


Fig. 1, Fig. 2, Fig. 3 (figures not reproduced here)


Notes

  1. Indeed, in this case Assumptions 5.1 and 5.2 presented in Sect. 5 are satisfied with \(c_1\) as in (4.8) and \(c_2=c_3=0\). This makes Assumption 5.4 easy to verify. Moreover, Condition 5.3 is trivially satisfied for matrices in Jordan block form. Detailed comments on this can be found in Sect. 5, see comments after (5.29).

  2. This does not mean that the effect of the irreversible term is destroyed. It simply means that the acceptance probability will not feel it.

  3. These calculations are a bit long but straightforward and follow the lines of the calculation done in Lemma B.1, proof of point (vi).

  4. Having used Lemma 5.6 for the last equality.

References

  1. Berger, E.: Asymptotic behaviour of a class of stochastic approximation procedures. Probab. Theory Relat. Fields 71(4), 517–552 (1986)


  2. Bernard, E.P., Krauth, W., Wilson, D.B.: Event-chain algorithms for hard-sphere systems. Phys. Rev. E 80(5), 056704 (2009)


  3. Bierkens, J.: Non-reversible Metropolis–Hastings. Stat. Comput. 26(6), 1213–1228 (2016)


  4. Bouchard-Côté, A., Vollmer, S.J., Doucet, A.: The bouncy particle sampler: a non-reversible rejection-free Markov Chain Monte Carlo method. J. Am. Stat. Assoc. 113(522), 855–867 (2018). https://doi.org/10.1080/01621459.2017.1294075


  5. Christensen, O.F., Roberts, G.O., Rosenthal, J.S.: Scaling limits for the transient phase of local Metropolis–Hastings algorithms. J. R. Stat. Soc. Ser. B Stat. Methodol. 67(2), 253–268 (2005)


  6. Diaconis, P., Holmes, S., Neal, R.M.: Analysis of a nonreversible Markov chain sampler. Ann. Appl. Probab. 10(3), 726–752 (2000)


  7. Duncan, A.B., Lelievre, T., Pavliotis, G.A.: Variance reduction using nonreversible Langevin samplers. J. Stat. Phys. 163(3), 457–491 (2016)


  8. Duncan, A.B., Pavliotis, G.A., Zygalakis, K.: Nonreversible Langevin samplers: splitting schemes, analysis and implementation. Submitted (2017)

  9. Dvoretzky, A., et al.: Asymptotic normality for sums of dependent random variables. In: Proceedings of 6th Berkeley Symposium on Mathematical Statistics and Probability, vol. 2, pp. 513–535 (1972)

  10. Horowitz, A.M.: A generalized guided Monte Carlo algorithm. Phys. Lett. B 268(2), 247–252 (1991)


  11. Hwang, C.-R., Hwang-Ma, S.-Y., Sheu, S.-J.: Accelerating diffusions. Ann. Appl. Probab. 15(2), 1433–1444 (2005)


  12. Jourdain, B., Lelievre, T., Miasojedow, B.: Optimal scaling for the transient phase of Metropolis–Hastings algorithms: the longtime behavior. Bernoulli 20(4), 1930–1978 (2014)


  13. Jourdain, B., Lelievre, T., Miasojedow, B.: Optimal scaling for the transient phase of the random walk Metropolis algorithm: the mean field limit. Ann. Appl. Probab. 25(4), 2263–2300 (2015)


  14. Kuntz, J., Ottobre, M., Stuart, A.M.: Diffusion Limit for the Random Walk Metropolis Algorithm Out of Stationarity. arXiv:1405.4896 (2014)

  15. Kuntz, J., Ottobre, M., Stuart, A.M.: Non-stationary Phase of the MALA Algorithm. arXiv:1608.08379 (2016)

  16. Lu, J., Spiliopoulos, K.: Multiscale Integrators for Stochastic Differential Equations and Irreversible Langevin Samplers. arXiv:1606.09539 (2016)

  17. Ma, Y.-A., Chen, T., Fox, E.: A complete recipe for stochastic-gradient MCMC. Adv. Neural Inf. Process. Syst. 28, 2899–2907 (2015)


  18. Mattingly, J.C., Pillai, N.S., Stuart, A.M.: Diffusion limits of the random walk Metropolis algorithm in high dimensions. Ann. Appl. Probab. 22(3), 881–930 (2012)


  19. Monmarché, P.: Piecewise deterministic simulated annealing. ALEA Lat. Am. J. Probab. Math. Stat. 13(1), 357–398 (2016)


  20. Ottobre, M.: Markov chain Monte Carlo and irreversibility. Rep. Math. Phys. 77, 267–292 (2016)


  21. Ottobre, M., Pillai, N.S., Pinski, F.J., Stuart, A.M.: A function space HMC algorithm with second order Langevin diffusion limit. Bernoulli 22(1), 60–106 (2016)


  22. Pillai, N.S., Stuart, A.M., Thiéry, A.H.: Optimal scaling and diffusion limits for the Langevin algorithm in high dimensions. Ann. Appl. Probab. 22(6), 2320–2356 (2012)


  23. Poncet, R.: Generalized and Hybrid Metropolis–Hastings Overdamped Langevin Algorithms. arXiv:1701.05833 (2017)

  24. Rey-Bellet, L., Spiliopoulos, K.: Irreversible Langevin samplers and variance reduction: a large deviations approach. Nonlinearity 28(7), 2081–2103 (2015)


  25. Rey-Bellet, L., Spiliopoulos, K.: Variance reduction for irreversible Langevin samplers and diffusion on graphs. Electron. Commun. Probab. 20(15), 1–16 (2015)


  26. Roberts, G.O., Rosenthal, J.S.: Optimal scaling of discrete approximations to Langevin diffusions. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 60(1), 255–268 (1998)


  27. Roberts, G.O., Rosenthal, J.S.: Optimal scaling for various Metropolis–Hastings algorithms. Stat. Sci. 16(4), 351–367 (2001)


  28. Roberts, G.O., Tweedie, R.L.: Exponential convergence of Langevin distributions and their discrete approximations. Bernoulli 2, 341–363 (1996)


  29. Rossky, P.J., Doll, J.D., Friedman, H.L.: Brownian dynamics as smart Monte Carlo simulation. J. Chem. Phys. 69(10), 4628–4633 (1978)



Acknowledgements

K.S. was partially supported by NSF CAREER AWARD NSF-DMS 1550918. N.S.P. was partially supported by the ONR grant N00014-18-1-2730.

Author information

Correspondence to Konstantinos Spiliopoulos.


Appendices

Appendix A. Proof of Theorem 4.2

In this section we present the proof of our main results. The proof is based on diffusion-approximation techniques analogous to those used in [18], where the authors consider the MALA algorithm with reversible proposals. That is, if we set \(S=0\) in our paper and \(\Psi =0\) in theirs, the two algorithms coincide. For this reason we adopt notation as similar as possible to that of [18] and, for the sake of brevity, we detail only the parts of the proof that differ from [18] and only sketch the rest.

We start by recalling that, by Fernique’s theorem,

$$\begin{aligned} {\mathbb {E}}_{\pi ^N}\Vert x^N\Vert ^p ={\mathbb {E}}\Vert ({\mathcal {C}}^N)^{1/2} z^N\Vert ^p \lesssim 1, \quad \text{ for } \text{ all } p\ge 1. \end{aligned}$$
(A.1)

This fact will often be used in the remainder of the paper without explicit mention.

We also recall that the chain \(\{x_k^N\}_k\) that we consider is defined in (3.6); the drift-martingale decomposition of this chain is given in Eq. (4.1). Let us start by recalling the definition of the continuous interpolant of the chain, Eqs. (4.5)–(4.6), and by introducing the piecewise constant interpolant of the chain \(x_k^N\), that is,

$$\begin{aligned} {\bar{x}}^{(N)}(t)=x_{k}^N \qquad t_k\le t< t_{k+1}, \end{aligned}$$

where \( t_k=k/N^{\zeta \gamma }\). It is easy to see (see e.g., [21, Appendix A]) that

$$\begin{aligned} x^{(N)}(t)=x_0^N+\int _{0}^t d_{{\mathbf {p}}}^N({\bar{x}}^{(N)}(v)) dv+ w_{{\mathbf {p}}}^N(t), \end{aligned}$$
(A.2)

where

$$\begin{aligned} w^N_{{\mathbf {p}}}(t):= \frac{1}{N^{\zeta \gamma /2}} \sum _{j=1}^{k-1}M_j^N +N^{\zeta \gamma /2}(t-t_k)M_k^N. \end{aligned}$$

For any \(t \in [0,T]\), we set

$$\begin{aligned} {\hat{w}}_{{\mathbf {p}}}^N(t)&:= \int _0^t \left[ d_{{\mathbf {p}}}^N({\bar{x}}^{(N)}(v))-d_{{\mathbf {p}}}(x^{(N)}(v)) \right] dv + w_{{\mathbf {p}}}^N(t)\nonumber \\&= \int _0^t \left[ d_{{\mathbf {p}}}^N({\bar{x}}^{(N)}(v))- d_{{\mathbf {p}}}({\bar{x}}^{(N)}(v)) \right] dv \nonumber \\&\quad +\int _0^t \left[ d_{{\mathbf {p}}}({\bar{x}}^{(N)}(v)) - d_{{\mathbf {p}}}({x}^{(N)}(v)) \right] dv +w_{{\mathbf {p}}}^N(t). \end{aligned}$$
(A.3)

With the above notation, we can then rewrite (A.2) as

$$\begin{aligned} x^{(N)}(t)&=x_0^N+\int _{0}^t d_{{\mathbf {p}}}(x^{(N)}(v)) dv+{\hat{w}}_{{\mathbf {p}}}^N(t). \end{aligned}$$
(A.4)

Let now \(C([0,T];{\mathcal {H}})\) denote the space of \({\mathcal {H}}\)-valued continuous functions, endowed with the uniform topology and consider the map

$$\begin{aligned} {\mathcal {I}}: {\mathcal {H}} \times C([0,T];{\mathcal {H}})&\longrightarrow C([0,T];{\mathcal {H}})\\ (x_0 , \eta (t))&\longrightarrow x(t). \end{aligned}$$

That is, \({\mathcal {I}}\) is the map that to every \((x_0 , \eta (t)) \in {\mathcal {H}} \times C([0,T];{\mathcal {H}})\) associates the unique solution x(t) of the equation

$$\begin{aligned} x(t)=x_0+\int _0^t d_{{\mathbf {p}}}(x(s)) ds + \eta (t). \end{aligned}$$
(A.5)

From (A.4) it is clear that \(x^{(N)}= {\mathcal {I}}(x_0^N, {\hat{w}}_{{\mathbf {p}}}^N)\). Notice that, under our continuity assumption on \({\tilde{S}}\), \({\mathcal {I}}\) is a continuous map. Therefore, in order to prove that \(x^{(N)}(t)\) converges weakly to x(t) (where x(t) is the solution of Eq. (A.5) with \(\eta (t)=D_{{\mathbf {p}}}W^{{\mathcal {C}}}(t)\)), by the continuous mapping theorem we only need to prove that \({\hat{w}}_{{\mathbf {p}}}^N\) converges weakly to \(D_{{\mathbf {p}}}W^{{\mathcal {C}}}(t)\), where \(W^{{\mathcal {C}}}(t)\) is an \({\mathcal {H}}\)-valued \({\mathcal {C}}\)-Brownian motion. The weak convergence of \({\hat{w}}_{{\mathbf {p}}}^N\) to \(D_{{\mathbf {p}}}W^{{\mathcal {C}}}(t)\) is a consequence of (A.3) and of Lemmas A.1 and A.2; this gives the statement of Theorem 4.2. The proof of Lemmas A.1 and A.2 is contained in the remainder of this appendix.
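The map \({\mathcal {I}}\) used in this continuous-mapping argument can be approximated numerically by a forward Euler scheme. The sketch below is illustrative: a generic Lipschitz drift dp stands in for \(d_{{\mathbf {p}}}\), and the forcing path eta is assumed sampled on the same grid with \(\eta (0)=0\).

```python
import numpy as np

def solve_forced_ode(x0, eta, dp, T, n):
    """Euler approximation of the map I: given x0 and a forcing path eta
    (an array of n+1 grid values), approximate the solution of
        x(t) = x0 + int_0^t dp(x(s)) ds + eta(t),  t in [0, T].
    Returns the array of grid values x(t_0), ..., x(t_n).
    """
    dt = T / n
    x = np.empty((n + 1,) + np.shape(x0))
    x[0] = x0
    for k in range(n):
        # drift increment plus the increment of the forcing over [t_k, t_{k+1}]
        x[k + 1] = x[k] + dt * dp(x[k]) + (eta[k + 1] - eta[k])
    return x
```

With \(\eta \equiv 0\) and a linear drift \(dp(x)=-x\) this reproduces the exponential decay \(x(t)=x_0e^{-t}\), which gives a quick sanity check of the scheme.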

Lemma A.1

Under Assumptions 5.1, 5.2 and Condition 5.3, the following holds

$$\begin{aligned} {\mathbb {E}}_{\pi ^N}\int _0^T \Vert d_{{\mathbf {p}}}^N({\bar{x}}^{(N)}(t))- d_{{\mathbf {p}}}({\bar{x}}^{(N)}(t)) \Vert dt \longrightarrow 0, \quad \text{ as } N \rightarrow \infty \end{aligned}$$
(A.6)

and

$$\begin{aligned} {\mathbb {E}}_{\pi ^N}\int _0^T \Vert d_{{\mathbf {p}}}({\bar{x}}^N(t)) - d_{{\mathbf {p}}}({x}^{(N)}(t)) \Vert dt \longrightarrow 0, \quad \text{ as } N \rightarrow \infty , \end{aligned}$$
(A.7)

where the function \(d_{{\mathbf {p}}}(x)\) has been defined in the statement of Theorem 4.2.

Set now

$$\begin{aligned} \epsilon _{{\mathbf {p}}}^N(x):= {\mathbb {E}}_x \left[ \left( 1 \wedge e^{Q^N} \right) {\mathcal {C}}_N^{1/2}z^N \right] \end{aligned}$$
(A.8)

and

$$\begin{aligned} h^N_{{\mathbf {p}}}(x):= {\mathbb {E}}_x \left( 1 \wedge e^{Q^N} \right) . \end{aligned}$$

While \(h_{{\mathbf {p}}}\) [see (5.28)] is the limiting average acceptance probability, \(h_{{\mathbf {p}}}^N(x)\) is the local average acceptance probability. The above notation will be used in the proof of the next lemma.
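The local average acceptance probability \(h^N_{{\mathbf {p}}}(x)\) is an expectation over the proposal noise with the current state x frozen, so it can be estimated by plain Monte Carlo. In the sketch below, log_ratio is a hypothetical stand-in for the paper's \(Q^N(x,z)\), which depends on the specific proposal:

```python
import numpy as np

def local_acceptance(x, log_ratio, n_samples=5000, rng=None):
    """Monte Carlo estimate of the local average acceptance probability
    h^N_p(x) = E_x[1 ^ e^{Q^N}]: average the Metropolis ratio 1 ^ e^Q over
    independent proposal noises z, holding the current state x fixed.
    `log_ratio(x, z)` plays the role of Q^N(x, z)."""
    rng = np.random.default_rng(0) if rng is None else rng
    z = rng.standard_normal((n_samples,) + np.shape(x))
    q = np.array([log_ratio(x, zi) for zi in z])
    return np.mean(np.minimum(1.0, np.exp(q)))
```

For instance, a log-ratio that is identically zero gives acceptance probability one, while a very negative log-ratio gives zero, matching the two extreme regimes of the accept-reject step.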

Lemma A.2

If Assumptions 5.1, 5.2 and Condition 5.3 hold, then \(w_{{\mathbf {p}}}^N(t)\) converges weakly in \(C([0,T];{\mathcal {H}})\) to \(D_{{\mathbf {p}}}W^{{\mathcal {C}}}(t)\), where \(W^{{\mathcal {C}}}(t)\) is an \({\mathcal {H}}\)-valued \({\mathcal {C}}\)-Brownian motion and the constant \(D_{{\mathbf {p}}}\) has been defined in the statement of Theorem 4.2.

Proof of Lemma A.1

We start by proving (A.7), which is simpler. The drift coefficient \(d_{{\mathbf {p}}}\) is globally Lipschitz; therefore, using (3.6), (3.3) and (3.4), if \(t_k\le t <t_{k+1}\), we have

$$\begin{aligned} {\mathbb {E}}_{\pi ^N}\Vert d_{{\mathbf {p}}}({\bar{x}}^{(N)}(t)) - d_{{\mathbf {p}}}({x}^{(N)}(t)) \Vert&\lesssim {\mathbb {E}}_{\pi ^N}\Vert {\bar{x}}^{(N)}(t) - x^{(N)}(t)\Vert \\&\lesssim \left| (N^{\zeta \gamma } t - k) \right| {\mathbb {E}}_{\pi ^N}\Vert x_{k+1}^N-x_k^N\Vert \lesssim {\mathbb {E}}_{\pi ^N}\Vert y_{k+1}^N-x_k^N\Vert \\&\lesssim \left( \frac{1}{N^{2\gamma }}+\frac{1}{N^{\alpha \gamma }} \right) {\mathbb {E}}_{\pi ^N}\Vert x_k^N\Vert + \frac{1}{N^{\gamma }} {\mathbb {E}} \Vert ({\mathcal {C}}^N)^{1/2}z_{k+1}^N\Vert \rightarrow 0. \end{aligned}$$

Let us now come to the proof of (A.6). From (4.3)–(4.4), we have

$$\begin{aligned} d^N_{{\mathbf {p}}}(x)- d_{{\mathbf {p}}}(x) =A_1^N+A_2^N+A_3^N - d_{{\mathbf {p}}}(x), \end{aligned}$$

where

$$\begin{aligned} A_1^N&:= N^{\zeta \gamma }{\mathbb {E}}_x\left[ (1 \wedge e^{Q^N}) \left( - \frac{\ell ^2}{2N^{2\gamma }} x^N\right) \right] \end{aligned}$$
(A.9)
$$\begin{aligned} A_2^N&:= N^{(\zeta -\alpha )\gamma } \ell ^{\alpha }{\mathbb {E}}_x\left[ (1 \wedge e^{Q^N}) {\tilde{S}}^N x^N\right] \nonumber \\ A_3^N&:= N^{(\zeta -1)\gamma } \ell {\mathbb {E}}_x\left[ (1 \wedge e^{Q^N}) {\mathcal {C}}_N^{1/2} z^N \right] {\mathop {=}\limits ^{(B.17)}} N^{(\zeta -1)\gamma } \ell \epsilon _{{\mathbf {p}}}^N(x) . \end{aligned}$$
(A.10)

We split the function \(d_{{\mathbf {p}}}(x)\) in three (corresponding) parts:

$$\begin{aligned} d_{{\mathbf {p}}}(x) = A_1(x)+A_2(x)+A_3(x), \end{aligned}$$

with

$$\begin{aligned} A_1:= \left\{ \begin{array}{ll} - \frac{\ell ^2}{2} h_{{\mathbf {p}}}x &{} \text{ if } \gamma = 1/6 \text{ and } \alpha \ge 2\\ 0 &{} \text{ otherwise } \end{array} \right. \\ A_2:= \left\{ \begin{array}{ll} h_{{\mathbf {p}}}{\ell ^{\alpha }} {\tilde{S}} x &{} \text{ if } \gamma \ge 1/6 \text{ and } \alpha \le 2\\ 0 &{} \text{ otherwise } \end{array} \right. \\ A_3:= \left\{ \begin{array}{ll} -2 \nu _{{\mathbf {p}}}{\ell ^{\alpha }} {\tilde{S}} x &{} \text{ if } \gamma \ge 1/6 \text{ and } 1 \le \alpha \le 2\\ 0 &{} \text{ otherwise } \end{array} \right. \end{aligned}$$

We therefore need to consecutively estimate the above three terms.

\( \bullet \mathbf{\,\, A_1^N-A_1}:\) if \(\alpha \ge 2\) (and \(\gamma =1/6\)) we fix \(\zeta =2\) and we have

$$\begin{aligned} {\mathbb {E}}_{\pi ^N}\Vert A_1^N-A_1\Vert&\lesssim {\mathbb {E}}_{\pi ^N}\left[ \left| {\mathbb {E}}_x \left( 1 \wedge e^{Q^N} \right) -h_{{\mathbf {p}}}\right| \Vert x\Vert \right] + {\mathbb {E}}_{\pi ^N}\Vert x^N-x\Vert \nonumber \\&{\mathop {\lesssim }\limits ^{(A.1)}} \left( {\mathbb {E}}_{\pi ^N}\left| h^N_{{\mathbf {p}}}(x) -h_{{\mathbf {p}}}\right| ^2\right) ^{1/2}+ {\mathbb {E}}_{\pi ^N}\Vert x^N-x\Vert \longrightarrow 0, \end{aligned}$$
(A.11)

as the first addend tends to zero by Lemma B.2 of Appendix B and the second addend tends to zero by definition (see also [22, equation (4.3)]). If \(\alpha <2\) then \(\zeta =\alpha \) and we have

$$\begin{aligned} {\mathbb {E}}_{\pi ^N}\Vert A_1^N\Vert \lesssim N^{(\alpha -2)\gamma } {\mathbb {E}}_{\pi ^N}\Vert {\mathbb {E}}_x (1 \wedge e^{Q^N}) x^N \Vert \lesssim N^{(\alpha -2)\gamma } \rightarrow 0\,. \end{aligned}$$
(A.12)

\( \bullet \mathbf{\,\, A_2^N-A_2}:\) if \(\alpha \le 2\) then, recalling (4.5), a calculation analogous to the one in (A.11) gives the statement. If \(\alpha >2\) then we can act as in (A.12).

\( \bullet \mathbf{\,\, A_3^N-A_3}:\) by Lemma B.2 of Appendix B [and (4.5)], we have

$$\begin{aligned} {\mathbb {E}}_{\pi ^N}\Vert A_3^N - A_3\Vert \rightarrow 0 \quad \text{ as } N \rightarrow \infty . \end{aligned}$$

This concludes the proof. \(\square \)

Proof of Lemma A.2

The calculations here are standard so we only prove it for the case \(\gamma =1/6\) and \(\alpha >2\). Let us recall the martingale difference given by (5.35). In the case \(\gamma =1/6\) and \(\alpha >2\), we have \(\zeta =2\) and \(d_{{\mathbf {p}}}(x)=-\frac{\ell ^2}{2}h_{{\mathbf {p}}}x\). Hence, the expression (5.35) becomes

$$\begin{aligned} M_k^N&=N^{-1/6}\left( {\tilde{\beta }}^{N}\frac{1}{h_{{\mathbf {p}}}}d_{{\mathbf {p}}}(x_k^N)\right) + N^{(1-\alpha )/6}\left( \ell ^{\alpha }{\tilde{\beta }}^{N} {\tilde{S}} x_k^N\right) + \left( \ell {\tilde{\beta }}^{N} {\mathcal {C}}^{1/2}z_{k+1}^N\right) - N^{-1/6}d_{{\mathbf {p}}}^{N}(x_k^N)\\&=\left( \ell {\tilde{\beta }}^{N} {\mathcal {C}}^{1/2}z_{k+1}^N\right) + N^{-(\alpha -1)/6}\left( \ell ^{\alpha }{\tilde{\beta }}^{N} {\tilde{S}} x_k^N\right) - N^{-1/6}\frac{1}{h_{{\mathbf {p}}}}\left[ h_{{\mathbf {p}}}d_{{\mathbf {p}}}^{N}(x_k^N)-{\tilde{\beta }}^{N}d_{{\mathbf {p}}}(x_k^N)\right] \end{aligned}$$

By Lemma A.1, we have that \({\mathbb {E}}_{\pi ^N}\Vert d_{{\mathbf {p}}}^N(x)-d_{{\mathbf {p}}}(x) \Vert ^{2} \longrightarrow 0\) as \(N\rightarrow \infty \). This implies that

$$\begin{aligned} N^{-1/3}{\mathbb {E}}_{\pi ^N}\Vert h_{{\mathbf {p}}}d_{{\mathbf {p}}}^{N}(x)-{\tilde{\beta }}^{N}d_{{\mathbf {p}}}(x)\Vert ^{2}&\rightarrow 0. \end{aligned}$$

At the same time, we also notice that

$$\begin{aligned} N^{-(\alpha -1)/3}\ell ^{2\alpha }{\mathbb {E}}_{\pi ^N}\Vert {\tilde{\beta }}^{N} {\tilde{S}} x\Vert ^2&\lesssim N^{-(\alpha -1)/3} \ell ^{2\alpha } {\mathbb {E}}_{\pi ^N}\Vert x\Vert ^2 \longrightarrow 0 \quad \text{ if } \alpha >1. \end{aligned}$$

Hence, if we define \({\mathcal {M}}_{N}(x)={\mathbb {E}}_x\left[ M_k^N \otimes M_k^N|x_k^N=x\right] \), we obtain that

$$\begin{aligned} {\mathbb {E}}_{\pi ^N}\left| \text {Tr}({\mathcal {M}}_{N}(x))-{\mathbb {E}}_x\left[ \Vert \ell {\tilde{\beta }}^{N} {\mathcal {C}}^{1/2}z\Vert ^2\right] \right|&\lesssim N^{-1/3}. \end{aligned}$$

Then, as in Lemma 4.8 of [22] we obtain that

$$\begin{aligned} {\mathbb {E}}_{\pi ^N}\left| \ell ^{2}h_{{\mathbf {p}}}\text {Tr}({\mathcal {C}})-{\mathbb {E}}_x\left[ \Vert \ell {\tilde{\beta }}^{N} {\mathcal {C}}^{1/2}z\Vert ^2\right] \right|&\rightarrow 0, \text { as } N\rightarrow \infty \end{aligned}$$

which then immediately implies that

$$\begin{aligned} {\mathbb {E}}_{\pi ^N}\left| \ell ^{2}h_{{\mathbf {p}}}\text {Tr}({\mathcal {C}})-\text {Tr}({\mathcal {M}}_{N}(x))\right|&\rightarrow 0, \text { as } N\rightarrow \infty . \end{aligned}$$

The latter result implies that the invariance principle of Proposition 5.1 of [1] holds, which in turn implies the statement of the lemma. \(\square \)

Appendix B. Auxiliary estimates

We first decompose \(Q^N\) as follows: let

$$\begin{aligned} {Z}^N :=&- \frac{\ell ^6}{32}-a - \frac{\ell ^3}{4 {N^{3\gamma }}} \langle x, {\mathcal {C}}^{1/2} z^N\rangle _{{\mathcal {C}}}\nonumber \\&-2 \frac{\ell ^{\alpha -1}}{N^{(\alpha -1)\gamma }} \langle ({\mathcal {C}}^N)^{1/2}z^N, S^N x^N \rangle - \frac{\ell ^{(2\alpha -1)}}{N^{(2\alpha -1)\gamma }} \langle {\tilde{S}}^N ({\mathcal {C}}^N)^{1/2} z^N, {\tilde{S}}^N x^N\rangle _{{\mathcal {C}}^N} \nonumber \\&- \frac{\ell ^{(3\alpha -1)}}{N^{(3\alpha -1)\gamma }} \langle {\tilde{S}}^N ({\mathcal {C}}^N)^{1/2} z^N, ({\tilde{S}}^N)^2 x^N\rangle _{{\mathcal {C}}^N} \,. \end{aligned}$$
(B.1)

Then

$$\begin{aligned} Q^N = {Z}^N + e_{\star }^N, \end{aligned}$$
(B.2)

where

$$\begin{aligned} e_{\star }^N:= & {} \frac{\ell ^6}{32}- \frac{\ell ^6}{32 \,N^{6\gamma }} \Vert x^N\Vert _{{\mathcal {C}}^N}^2 \nonumber \\&+ a -2 \frac{\ell ^{2(\alpha -1)}}{N^{(\alpha -1) 2\gamma }}\Vert {\tilde{S}}^N x^N\Vert _{{\mathcal {C}}^N}^2 - \frac{\ell ^{2(2\alpha -1)}}{2 N^{2 \gamma (2\alpha -1)}}\Vert ({\tilde{S}}^N)^2 x^N\Vert _{{\mathcal {C}}^N}^2 \nonumber \\&+ i^N(x,z)+ e^N(x,z) , \end{aligned}$$
(B.3)

with

$$\begin{aligned} i^N(x,z):= i_1^N(x,z) + i_2^N(x,z), \end{aligned}$$

having defined

$$\begin{aligned}&i_1^N(x,z): =\frac{\ell ^4}{8 N^{4\gamma }} \left( \Vert x^N\Vert _{{\mathcal {C}}^N}^2- \Vert z^N\Vert ^2\right) \end{aligned}$$
(B.4)
$$\begin{aligned}&i_2^N(x,z): =\frac{\ell ^{2\alpha }}{2 N^{2 \alpha \gamma }} \left( \Vert {\tilde{S}}^N x^N\Vert _{{\mathcal {C}}^N}^2- \Vert {\tilde{S}}^N ({\mathcal {C}}^N)^{1/2}z^N\Vert _{{\mathcal {C}}^N}^2\right) , \end{aligned}$$
(B.5)

and

$$\begin{aligned} e^N&:= \frac{\ell ^5}{8 N^{5\gamma }} \langle x^N, ({\mathcal {C}}^N)^{1/2}z^N\rangle _{{\mathcal {C}}^N}- \frac{\ell ^{2(\alpha +1)}}{4 N^{2(1+\alpha )\gamma }} \Vert {\tilde{S}}^N x^N\Vert _{{\mathcal {C}}}^2\nonumber \\&\qquad - \frac{\ell ^{3+\alpha }}{4 N^{(3+\alpha )\gamma }} \langle ({\mathcal {C}}^N)^{1/2}z^N, S^N x^N \rangle + \frac{\ell ^{(2\alpha +1)}}{2 N^{(2\alpha +1)\gamma }} \langle {\tilde{S}}({\mathcal {C}}^N)^{1/2} z^N, {\tilde{S}}^N x^N\rangle _{{\mathcal {C}}^N}. \end{aligned}$$
(B.6)

Finally, we set

$$\begin{aligned} {\tilde{e}}^N(x,z):= e^N+\frac{\ell ^{2(\alpha +1)}}{4 N^{2(1+\alpha )\gamma }} \Vert {\tilde{S}}^N x^N\Vert _{{\mathcal {C}}^N}^2. \end{aligned}$$
(B.7)

That is, \({\tilde{e}}^N\) contains only the addends of \(e^N\) that depend on the noise z.

Furthermore, we split \(Q^N(x,z)\) into the terms that contain \(z^{j,N}\) and the terms that do not, denoted \(Q^N_j\) and \(Q^N_{j, \perp }\), respectively; that is,

$$\begin{aligned} Q^N= Q^{N}_j+ Q^N_{j, \perp }, \end{aligned}$$

where

$$\begin{aligned} Q_j^N:= {\tilde{e}}^N+ (i_1^N)_j+ (i_2^N)_j+ {Z}^N_j, \end{aligned}$$
(B.8)

having denoted by \((i_1^N)_j, \, (i_2^N)_j\) and \({Z}^N_j\), the part of \(i_1^N, i_2^N\) and \({Z}^N\), respectively, that depend on \(z^{j,N}\).

Lemma B.1

Let Assumptions 5.1, 5.2 and Condition 5.3 hold; then,

$$\begin{aligned} \mathrm{(i)}&\quad {\mathbb {E}}_{\pi ^N}\left| {\tilde{e}}^N \right| ^2 \lesssim \frac{1}{N^{2/3}} + \frac{1}{N^{4 \gamma }}, \quad \text{ for } \text{ all } \alpha \ge 1, \, \gamma \ge 1/6 \end{aligned}$$
(B.9)
$$\begin{aligned} \mathrm{(ii)}&\quad N^{1/3} {\mathbb {E}}_{\pi ^N}\sum _{j=1}^N \lambda _j^2 \left| (i_1^N)_j\right| ^2 \longrightarrow 0, \quad \text{ as } \, N\rightarrow \infty \nonumber \\ \mathrm{(iii)}&\quad {\mathbb {E}}_{\pi ^N}\left| i_2^N\right| ^2 \lesssim \frac{1}{N^{4 \gamma }}, \text{ for } \text{ all } \alpha \ge 1, \gamma \ge 1/6\nonumber \\ \text {(iv)}&\quad N^{1/3} {\mathbb {E}}_{\pi ^N}\sum _{j=1}^N \lambda _j^2 \left| {Z}^N_j\right| ^2 \longrightarrow 0, \quad \text{ as } \,N \rightarrow \infty , \text{ for } \text{ all } \alpha >2 , \gamma =1/6 \end{aligned}$$
(B.10)
$$\begin{aligned} \mathrm{(v)}&\quad {\mathbb {E}}_{\pi ^N}\left| e^N \right| ^2 \lesssim \frac{1}{N^{2/3}} + \frac{1}{N^{4 \gamma }}, \quad \text{ for } \text{ all } \alpha \ge 1, \, \gamma \ge 1/6 \end{aligned}$$
(B.11)
$$\begin{aligned} \mathrm{(vi)}&\quad {\mathbb {E}}_{\pi ^N}\left| (\mathrm {Var}_x {Z}^N)^{1/2} - (\mathrm {Var} Q)^{1/2} \right| ^2 \longrightarrow 0, \quad \text{ as } \, N \rightarrow \infty , \text{ for } \text{ all } \alpha \ge 1, \, \gamma \ge 1/6 \,. \end{aligned}$$
(B.12)

Proof of Lemma B.1

Recall that, under \(\pi ^N\), \(x^{i, N} \sim \lambda _i \rho ^i\), where \(\{\rho ^i\}_{i \in {\mathbb {N}}}\) are i.i.d. standard Gaussian random variables. We now prove the statements of the lemma in turn.

Proof of (i) Notice that

$$\begin{aligned} {\mathbb {E}}_{\pi ^N}\left| \langle x^N, ({\mathcal {C}}^N)^{1/2}z^N\rangle _{{\mathcal {C}}^N} \right| ^2 = {\mathbb {E}}\left| \sum _{i=1}^N \rho ^i z^{i,N}\right| ^2 =N \,. \end{aligned}$$

Therefore, since \(\gamma \ge 1/6\),

$$\begin{aligned} {\mathbb {E}}_{\pi ^N}\left| \frac{\ell ^5}{8 N^{5\gamma }} \langle x^N, ({\mathcal {C}}^N)^{1/2}z^N\rangle _{{\mathcal {C}}^N}\right| ^2 \lesssim N^{-2/3}. \end{aligned}$$
(B.13)

Furthermore, using (5.15) (which follows from point (i) of Assumption 5.2) and (5.11), we have

$$\begin{aligned} \frac{1}{N^{(3+\alpha )2 \gamma }} {\mathbb {E}}_{\pi ^N}\left| \langle ({\mathcal {C}}^N)^{1/2}z^N, S^Nx^N \rangle \right| ^2 \lesssim N^{- 8 \gamma }; \end{aligned}$$
(B.14)

similarly, using (5.16) (which follows from point (ii) of Assumption 5.2) and (5.12),

$$\begin{aligned} \frac{1}{ N^{(2\alpha +1)2 \gamma }} {\mathbb {E}}_{\pi ^N}\left| \langle {\tilde{S}}({\mathcal {C}}^N)^{1/2} z^N, {\tilde{S}}^N x^N\rangle _{{\mathcal {C}}}\right| ^2 \lesssim N^{-4 \gamma } \,. \end{aligned}$$
(B.15)

Now the first statement of the lemma is a consequence of (B.6)–(B.7) and the above (B.13), (B.14) and (B.15).

Proof of (ii) This is proved in [22], see calculations after [22, equation (4.18)], so we omit it.

Proof of (iii) This estimate follows again from Assumption 5.2, once we observe that if \(x \sim \pi ^N\) then \( \Vert {\tilde{S}}^N x^N\Vert _{{\mathcal {C}}^N}^2\) and \(\Vert {\tilde{S}}^N ({\mathcal {C}}^N)^{1/2} z^N\Vert _{{\mathcal {C}}^N}^2\) are two independent random variables with the same distribution. With this observation in place, we have

$$\begin{aligned} {\mathbb {E}}_{\pi ^N}\left| \frac{ \Vert {\tilde{S}}x^N\Vert _{{\mathcal {C}}}^2 - \Vert {\tilde{S}}{\mathcal {C}}_N^{1/2} z^N\Vert _{{\mathcal {C}}}^2}{N^{2 \alpha \gamma }} \right| ^2&= {\mathbb {E}}_{\pi ^N}\left| \frac{ \Vert {\tilde{S}}x^N\Vert _{{\mathcal {C}}}^2 }{N^{2\alpha \gamma }} - \frac{c_1}{N^{2 \gamma }} + \frac{c_1}{N^{2 \gamma }} - \frac{\Vert {\tilde{S}}{\mathcal {C}}_N^{1/2} z^N\Vert _{{\mathcal {C}}}^2}{N^{2\alpha \gamma }} \right| ^2 \\&\lesssim {\mathbb {E}}_{\pi ^N}\left| \frac{ \Vert {\tilde{S}}x^N\Vert _{{\mathcal {C}}}^2 }{N^{2\alpha \gamma }} - \frac{c_1}{N^{2 \gamma }} \right| ^2 = N^{-4\gamma } {\mathbb {E}}_{\pi ^N}\left| \frac{ \Vert {\tilde{S}}x^N\Vert _{{\mathcal {C}}}^2 }{N^{2(\alpha -1)\gamma }} - c_1 \right| ^2, \end{aligned}$$

which gives the claim.

Proof of (iv) We recall that \({Z}^N_j\) has been introduced in (B.8). Using the antisymmetry of \(S^N\) and the definition of \({\tilde{S}}^N\), we have

$$\begin{aligned} -\langle {\tilde{S}}^N ({\mathcal {C}}^N)^{1/2}z^N, {\tilde{S}}^N x^N\rangle _{{\mathcal {C}}}= \langle ({\mathcal {C}}^N)^{1/2}z^N, S^N {\tilde{S}}^N x^N\rangle \end{aligned}$$

and

$$\begin{aligned} -\langle {\tilde{S}}^N ({\mathcal {C}}^N)^{1/2}z^N, ({\tilde{S}}^N)^2 x\rangle _{{\mathcal {C}}^N}= \langle ({\mathcal {C}}^N)^{1/2}z^N, S^N ({\tilde{S}}^N)^2 x^N\rangle . \end{aligned}$$

We can therefore write an explicit expression for \({Z}^N_j\):

$$\begin{aligned} {Z}^N_j&= -\frac{\ell ^3}{4 {N^{3\gamma }}} \frac{x^{j,N} z^{j,N}}{\lambda _j } -2 \frac{\ell ^{\alpha -1}}{N^{(\alpha -1)\gamma }} \lambda _j z^{j,N} (S^Nx^N)^j \nonumber \\&\qquad + \frac{\ell ^{2\alpha -1}}{N^{(2\alpha -1)\gamma }} \lambda _j z^{j,N} (S^N{\tilde{S}}^N x^N)^j + \frac{\ell ^{3\alpha -1}}{N^{(3\alpha -1)\gamma }} \lambda _j z^{j,N} (S^N({\tilde{S}}^N)^2 x^N)^j. \end{aligned}$$
(B.16)

For the sake of clarity we stress again that in the above \((S^Nx^N)\) is an N-dimensional vector and \((S^Nx^N)^j\) is its j-th component. Therefore, recalling (2.3), (2.1), (A.1) and setting \(\gamma =1/6\), we have

$$\begin{aligned} \sum _{j=1}^N \lambda _j^2 {\mathbb {E}}_{\pi ^N}\left| {Z}^N_j\right| ^2&\lesssim \frac{1}{N} \sum _{j=1}^N \lambda _j^2 {\mathbb {E}}\left| \rho ^j z^{j,N}\right| ^2 + \frac{{\mathbb {E}}_{\pi ^N}}{N^{(\alpha -1)/3}} \sum _{j=1}^N \lambda _j^4 \left| (S^Nx^N)^j\right| ^2 \\&\quad + \frac{{\mathbb {E}}_{\pi ^N}}{N^{(2\alpha -1)/3}} \sum _{j=1}^N \lambda _j^4 \left| (S^N{\tilde{S}}^N x^N)^j\right| ^2\\&\quad + \frac{{\mathbb {E}}_{\pi ^N}}{N^{(3\alpha -1)/3}} \sum _{j=1}^N \lambda _j^4 \left| (S^N({\tilde{S}}^N)^2 x^N)^j\right| ^2 \\&\lesssim \frac{1}{N}+ \frac{1}{N^{(\alpha -1)/3}}{\mathbb {E}}_{\pi ^N}\Vert {\tilde{S}}^N x^N\Vert ^2 + \frac{1}{N^{(2\alpha -1)/3}}{\mathbb {E}}_{\pi ^N}\Vert ({\tilde{S}}^N)^2 x^N\Vert ^2 \\&\quad + \frac{1}{N^{(3\alpha -1)/3}}{\mathbb {E}}_{\pi ^N}\Vert ({\tilde{S}}^N)^3 x^N\Vert ^2 \\&\lesssim \frac{1}{N}+ \frac{1}{N^{(\alpha -1)/3}}{\mathbb {E}}_{\pi ^N}\Vert x^N\Vert ^2 \,. \end{aligned}$$

Therefore,

$$\begin{aligned} N^{1/3}\sum _{j=1}^N \lambda _j^2 {\mathbb {E}}_{\pi ^N}\left| {Z}^N_j\right| ^2 \lesssim \frac{1}{N^{2/3}}+ \frac{1}{N^{(\alpha -2)/3}} \longrightarrow 0, \quad \text{ when } \alpha >2. \end{aligned}$$

Proof of (v) Follows from Assumption 5.2, from statement (i) of this lemma and from (B.7).

Proof of (vi) From (B.1),

$$\begin{aligned} \mathrm {Var}_x({Z}^N) =&{\mathbb {E}}_x\left| - \frac{\ell ^3}{4 {N^{3\gamma }}} \langle x, {\mathcal {C}}^{1/2} z^N\rangle _{{\mathcal {C}}} -2 \frac{\ell ^{\alpha -1}}{N^{(\alpha -1)\gamma }} \langle {\mathcal {C}}^{1/2}z^N, Sx \rangle \right. \\&\left. - \frac{\ell ^{(2\alpha -1)}}{N^{(2\alpha -1)\gamma }} \langle {\tilde{S}}{\mathcal {C}}_N^{1/2} z^N, {\tilde{S}}x\rangle _{{\mathcal {C}}} - \frac{\ell ^{(3\alpha -1)}}{N^{(3\alpha -1)\gamma }} \langle {\tilde{S}}{\mathcal {C}}_N^{1/2} z^N, {\tilde{S}}^2 x\rangle _{{\mathcal {C}}}\right| ^2 \,. \end{aligned}$$

Therefore,

$$\begin{aligned} \mathrm {Var}_x({Z}^N)&= \frac{\ell ^6}{16 N^{6\gamma }}{\mathbb {E}}_x \Vert x\Vert _{{\mathcal {C}}}^2 + 4 \frac{\ell ^{2(\alpha -1)}}{N^{2(\alpha -1)\gamma }} {\mathbb {E}}_x \Vert {\mathcal {C}}_N^{1/2}S x\Vert ^2 \\&+ \frac{\ell ^{2(2\alpha -1)}}{N^{2(2\alpha -1)\gamma }} {\mathbb {E}}_x \Vert {\mathcal {C}}_N^{1/2}S{\tilde{S}}x\Vert ^2 + \frac{\ell ^{2(3\alpha -1)}}{N^{2(3\alpha -1)\gamma }} {\mathbb {E}}_x \Vert {\mathcal {C}}_N^{1/2}S{\tilde{S}}^2 x\Vert ^2 + {\mathbb {E}}_x r^N \end{aligned}$$

where \(r^N\) contains all the cross-products in the expansion of the variance. By direct calculation and using the antisymmetry of S, one finds that most of these cross-products vanish, and we have

$$\begin{aligned} {\mathbb {E}}_x r^N:= \frac{\ell ^{(2\alpha +2)}}{2 {N^{3\gamma }}N^{(2\alpha -1)\gamma }} \langle Sx, {\tilde{S}} x \rangle + 4 \frac{\ell ^{(4\alpha -2)}}{N^{2(2\alpha -1)\gamma }}{\mathbb {E}}_x \Vert {\tilde{S}}^2 x\Vert _{{\mathcal {C}}}^2. \end{aligned}$$

Observe that

$$\begin{aligned} \langle Sx, {\tilde{S}} x \rangle = \Vert {\tilde{S}}x\Vert _{{\mathcal {C}}}^2; \end{aligned}$$

using this fact, Assumption 5.2 implies that the first term in the above expression for \({\mathbb {E}}_x r^N\) vanishes as \(N \rightarrow \infty \), while the second term contributes to the limiting variance. Straightforward calculations now give the result.

\(\square \)
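The identity \(\langle Sx, {\tilde{S}} x \rangle = \Vert {\tilde{S}}x\Vert _{{\mathcal {C}}}^2\) used in the proof above can be sanity-checked numerically. The sketch below assumes, as the identity suggests, that \({\tilde{S}} = {\mathcal {C}} S\) with \(S\) antisymmetric, \({\mathcal {C}} = \mathrm {diag}(\lambda _1^2,\dots ,\lambda _N^2)\), and \(\langle u, v\rangle _{{\mathcal {C}}} = \langle u, {\mathcal {C}}^{-1} v\rangle \); all variable names are ad hoc and the check is illustrative only.

```python
# Illustrative check (not part of the proof) of <Sx, S~x> = ||S~x||_C^2,
# under the assumptions: S antisymmetric, C diagonal with entries lambda_j^2,
# S~ = C S, and <u, v>_C = <u, C^{-1} v>.
import random

random.seed(0)
N = 6
lam2 = [(j + 1.0) ** -2 for j in range(N)]                       # eigenvalues of C
A = [[random.gauss(0, 1) for _ in range(N)] for _ in range(N)]
S = [[A[i][j] - A[j][i] for j in range(N)] for i in range(N)]    # antisymmetric part
x = [random.gauss(0, 1) for _ in range(N)]

Sx = [sum(S[i][j] * x[j] for j in range(N)) for i in range(N)]
Stx = [lam2[i] * Sx[i] for i in range(N)]                        # S~x = C S x

lhs = sum(Sx[i] * Stx[i] for i in range(N))                      # <Sx, S~x>
rhs = sum(Stx[i] * Stx[i] / lam2[i] for i in range(N))           # ||S~x||_C^2
assert abs(lhs - rhs) < 1e-8
```

Componentwise, both sides reduce to \(\sum _j \lambda _j^2 \left| (Sx)^j\right| ^2\), which is why the check holds to machine precision.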

We recall the definitions

$$\begin{aligned} \epsilon _{{\mathbf {p}}}^N(x):= {\mathbb {E}}_x \left[ \left( 1 \wedge e^{Q^N} \right) {\mathcal {C}}_N^{1/2}z^N \right] \end{aligned}$$
(B.17)

and

$$\begin{aligned} h^N_{{\mathbf {p}}}(x):= {\mathbb {E}}_x \left( 1 \wedge e^{Q^N} \right) . \end{aligned}$$
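As a side remark (not used in the proofs below): when the log acceptance ratio is exactly Gaussian, say \(Q \sim {\mathcal {N}}(m, \sigma ^2)\), the average acceptance probability \({\mathbb {E}}(1 \wedge e^{Q})\) admits the standard closed form \(\Phi (m/\sigma ) + e^{m+\sigma ^2/2}\,\Phi (-\sigma - m/\sigma )\), and it is this kind of Gaussian limit that produces the constant \(h_{{\mathbf {p}}}\). A minimal Monte Carlo check, with arbitrary illustrative values of \(m, \sigma \):

```python
# Monte Carlo check of E(1 ∧ e^Q) for Gaussian Q ~ N(m, s^2);
# closed form: Phi(m/s) + exp(m + s^2/2) * Phi(-s - m/s).
# Illustration only; Q here is a toy Gaussian, not the paper's Q^N.
import math, random

def Phi(t):  # standard Gaussian CDF via the error function
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

random.seed(0)
m, s = -0.5, 1.0
n = 400_000
mc = sum(min(1.0, math.exp(m + s * random.gauss(0, 1))) for _ in range(n)) / n
exact = Phi(m / s) + math.exp(m + s * s / 2) * Phi(-s - m / s)
assert abs(mc - exact) < 0.01
```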

Lemma B.2

Suppose that Assumptions 5.1, 5.2 and Condition 5.3 hold. Then

  1. (i)

    If \(\alpha > 2\) and \(\gamma = 1/6\),

    $$\begin{aligned} N^{1/3}{\mathbb {E}}_{\pi ^N}\Vert \epsilon _{{\mathbf {p}}}^N(x)\Vert ^2 {\mathop {\longrightarrow }\limits ^{N\rightarrow \infty }} 0\, ; \end{aligned}$$
    (B.18)
  2. (ii)

    if \(1 \le \alpha \le 2\) and \(\gamma \ge 1/6\) then

    $$\begin{aligned} {\mathbb {E}}_{\pi ^N}\Vert N^{\gamma (\alpha -1)}\epsilon _{{\mathbf {p}}}^N(x) + 2 \ell ^{\alpha -1} \nu _{{\mathbf {p}}}\, {\tilde{S}} x\Vert ^2 {\mathop {\longrightarrow }\limits ^{N\rightarrow \infty }} 0 \, \end{aligned}$$
    (B.19)

    where the constant \(\nu _{{\mathbf {p}}}\) has been defined in (5.31).

  3. (iii)

    if \(\alpha \ge 1\), \(\gamma \ge 1/6\) and \(S^N\) is such that \((c_1, c_2, c_3) \ne (0,0,0)\), then

    $$\begin{aligned} {\mathbb {E}}_{\pi ^N}\left| h_{{\mathbf {p}}}^N(x)-h_{{\mathbf {p}}}\right| ^2 {\mathop {\longrightarrow }\limits ^{N\rightarrow \infty }} 0 \,; \end{aligned}$$
    (B.20)
  4. (iv)

    finally, if \( 1< \alpha < 2 \), \(\gamma >1/6\) and \(S^N\) is such that \(c_1=c_2=c_3=0\), then

    $$\begin{aligned} {\mathbb {E}}_{\pi ^N}\left| h_{{\mathbf {p}}}^N(x)- 1\right| ^2 {\mathop {\longrightarrow }\limits ^{N\rightarrow \infty }} 0 \,, \end{aligned}$$

    i.e., the constant \(h_{{\mathbf {p}}}\) in (B.20) is equal to one. This means that, as \(N \rightarrow \infty \), the acceptance probability tends to one.

Proof of Lemma B.2

\(\bullet \,\)Proof of (i) Proceeding as in [22, page 2349], we obtain

$$\begin{aligned} \left| \langle \epsilon ^N_{{\mathbf {p}}}(x), \varphi _j\rangle \right| ^2 \lesssim \lambda _j^2 {\mathbb {E}}_x \left| Q_j^N\right| ^2 \,. \end{aligned}$$

Taking the sum over j on both sides of the above then gives

$$\begin{aligned} \Vert \epsilon ^N_{{\mathbf {p}}}(x)\Vert ^2 \lesssim \sum _{j=1}^N\lambda _j^2 {\mathbb {E}}_x \left| Q_j^N\right| ^2 \,. \end{aligned}$$

Therefore, if we show

$$\begin{aligned} N^{1/3} \sum _{j=1}^N\lambda _j^2 {\mathbb {E}}_x \left| Q_j^N\right| ^2 {\mathop {\longrightarrow }\limits ^{N \rightarrow \infty }} 0, \end{aligned}$$

(B.18) follows. From (B.8), it is clear that the above is a consequence of Lemma B.1 [in particular, it follows from (B.9)–(B.10)].

\(\bullet \,\)Proof of (ii) Let us split \(Q^N\) as follows:

$$\begin{aligned} Q^N= R^N+ e^N+i_2^N, \end{aligned}$$

where \( e^N\) and \(i_2^N\) are defined in (B.6) and (B.5), respectively, while

$$\begin{aligned} R^N:= I^N+i_1^N+B^N+H^N, \end{aligned}$$
(B.21)

having set

$$\begin{aligned} I^N&:= - \frac{\ell ^6}{32 \,N^{6\gamma }} \Vert x^N\Vert _{{\mathcal {C}}}^2 -2 \frac{\ell ^{2 (\alpha -1)}}{N^{2\gamma (\alpha -1)}}\Vert {\tilde{S}}x^N\Vert _{{\mathcal {C}}}^2 - \frac{\ell ^{2 (2\alpha -1)}}{2 N^{2\gamma (2\alpha -1)}}\Vert {\tilde{S}}^2 x^N\Vert _{{\mathcal {C}}}^2 \end{aligned}$$
(B.22)
$$\begin{aligned} B^N&:= -2 \frac{\ell ^{\alpha -1}}{N^{\gamma (\alpha -1)}} \langle {\mathcal {C}}^{1/2}z^N, Sx^N \rangle \end{aligned}$$
(B.23)
$$\begin{aligned} H^N&:= - \frac{\ell ^3}{4 {N^{3\gamma }}} \langle x, {\mathcal {C}}^{1/2} z^N\rangle _{{\mathcal {C}}} - \frac{\ell ^{(2\alpha -1)}}{ N^{\gamma (2\alpha -1)}} \langle {\tilde{S}}{\mathcal {C}}_N^{1/2} z^N, {\tilde{S}}x^N\rangle _{{\mathcal {C}}} \nonumber \\&- \frac{\ell ^{(3\alpha -1)}}{N^{\gamma (3\alpha -1)}} \langle {\tilde{S}}{\mathcal {C}}_N^{1/2} z^N, {\tilde{S}}^2 x^N\rangle _{{\mathcal {C}}} \,, \end{aligned}$$
(B.24)

and \(i_1^N\) is defined in (B.4). The j-th component of \(N^{\gamma (\alpha -1)} \epsilon _{{\mathbf {p}}}^N\) can therefore be expressed as follows:

$$\begin{aligned} N^{\gamma (\alpha -1)} \epsilon _{{\mathbf {p}}}^{j,N}&=N^{\gamma (\alpha -1)} {\mathbb {E}}_x\left[ (1 \wedge e^{Q^N}) \lambda _j z^{j,N}\right] \nonumber \\&= N^{\gamma (\alpha -1)} {\mathbb {E}}_x\left[ (1 \wedge e^{R^N}) \lambda _j z^{j,N}\right] + T_0^j \end{aligned}$$
(B.25)

where \(T_0^j:=\langle T_0, \varphi _j \rangle \) and

$$\begin{aligned} T_0:= N^{\gamma (\alpha -1)} {\mathbb {E}}_x\left[ \left( (1 \wedge e^{Q^N}) - (1 \wedge e^{R^N})\right) {\mathcal {C}}_N^{1/2} z^N \right] . \end{aligned}$$
(B.26)

We now decompose \(R^N\) into a component \(R^N_j\) which depends on \(z^{j,N}\) and a component \(R^N_{j, \perp }\) which does not:

$$\begin{aligned} R^N=R^N_j+ R^N_{j, \perp }, \end{aligned}$$

with

$$\begin{aligned} R^N_j:=(i_1^N)_j+(B^N)_j+(H^N)_j, \end{aligned}$$

having denoted by \( (i_1^N)_j, (B^N)_j\) and \((H^N)_j\) the parts of \(i_1^N, B^N\) and \(H^N\), respectively, that depend on \(z^{j,N}\). That is,

$$\begin{aligned} (i_1^N)_j=- \frac{\ell ^4}{8N^{4 \gamma }} \left| z^{j,N}\right| ^2; \end{aligned}$$

as for \((H^N)_j\), it suffices to notice that

$$\begin{aligned} (B^N)_j+(H^N)_j = {Z}^N_j, \end{aligned}$$

and the expression for \({Z}^N_j\) is detailed in (B.16) (just set \(\alpha =2\) in (B.16)). With this notation, from (B.25), we further write

$$\begin{aligned} N^{\gamma (\alpha -1)} {\mathbb {E}}_x\left[ (1 \wedge e^{Q^N}) \lambda _j z^j\right] = N^{\gamma (\alpha -1)} {\mathbb {E}}_x\left[ (1 \wedge e^{R^N -(i_1^N)_j-H_j^N}) \lambda _j z^j\right] + T_0^j+ T_1^j\nonumber \\ \end{aligned}$$
(B.27)

where, like before, \(T_1^j:=\langle T_1, \varphi _j \rangle \) and

$$\begin{aligned} T_1:= N^{\gamma (\alpha -1)} \ell {\mathbb {E}}_x\left[ \left( (1 \wedge e^{R^N}) - \left( 1 \wedge e^{R^N -(i_1^N)_j-H_j^N}\right) \right) {\mathcal {C}}_N^{1/2} z^N \right] . \end{aligned}$$

We recall that the notation \({\mathbb {E}}_x\) denotes expected value given x, where the expectation is taken over all the sources of noise contained in the integrand. In order to further evaluate the RHS of (B.27) we calculate the expected value of the integrand with respect to the law of \(z^j\) (we denote this expectation by \({\mathbb {E}}^{z^j}\), and use \({\mathbb {E}}^{z^j_-}\) for the expectation with respect to \(z^N\setminus z^{j,N}\)); to this end, we use the following lemma.

Lemma B.3

If G is a normally distributed random variable with \(G \sim {\mathcal {N}}(0,1)\) then

$$\begin{aligned} {\mathbb {E}}\left[ G\left( 1 \wedge e^{\delta G + \mu }\right) \right] = \delta e^{\mu + \delta ^2/2} \Phi \left( -\frac{\mu }{\left| \delta \right| }- \left| \delta \right| \right) , \end{aligned}$$

where \(\Phi \) is the CDF of a standard Gaussian.
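Lemma B.3 is a closed-form Gaussian computation (change of measure plus completing the square); as a quick sanity check, illustration only and with arbitrary parameter values, one can compare it against a Monte Carlo estimate:

```python
# Monte Carlo sanity check of Lemma B.3:
# for G ~ N(0,1), E[G (1 ∧ e^{δG+μ})] = δ e^{μ+δ²/2} Φ(−μ/|δ| − |δ|).
# Illustrative values of delta, mu; not tied to the proof.
import math, random

def Phi(t):  # standard Gaussian CDF via the error function
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

random.seed(0)
delta, mu = 0.7, -0.3
n = 400_000
mc = sum(g * min(1.0, math.exp(delta * g + mu))
         for g in (random.gauss(0, 1) for _ in range(n))) / n
exact = delta * math.exp(mu + delta ** 2 / 2) * Phi(-mu / abs(delta) - abs(delta))
assert abs(mc - exact) < 0.01
```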

We apply the above lemma with \(\mu = R^N_{j,\perp }\) and \(\delta = \delta ^B_j\), where

$$\begin{aligned} \delta ^B_j&:= -2 \frac{\ell ^{\alpha -1}}{N^{\gamma (\alpha -1)}} \lambda _j (Sx^N)^j. \end{aligned}$$

We therefore obtain

$$\begin{aligned}&N^{\gamma (\alpha -1)} \lambda _j {\mathbb {E}}_x\left[ (1 \wedge e^{R^N -(i_1^N)_j-H_j^N}) z^j\right] \\&\quad = N^{\gamma (\alpha -1)} \lambda _j {\mathbb {E}}_{x}^{z^j_-}\delta _j^B e^{R_{j,\perp }^N +(\delta _j^B)^2/2} \Phi \left( - \frac{R_{j,\perp }^N}{\left| \delta _j^B\right| } - \left| \delta _j^B \right| \right) \\&\quad =-2 \ell ^{\alpha -1} \lambda _j^2 (S^Nx^N)^j {\mathbb {E}}_x^{z} e^{R_{j,\perp }^N +(\delta _j^B)^2/2} \Phi \left( - \frac{R_{j,\perp }^N}{\left| \delta _j^B\right| } - \left| \delta _j^B \right| \right) \\&\quad = -2 \ell ^{\alpha -1} \lambda _j^2 {\mathbb {E}}_{x}(Sx)^j e^{R_{j,\perp }^N +(\delta _j^B)^2/2}{\mathbf {1}}_{\{R_{j,\perp }^N<0\}}+ T_2^j+T_3^j\\&\quad =-2 \ell ^{\alpha -1} \lambda _j^2 {\mathbb {E}}_{x} (Sx)^j e^{{Q}^N }{\mathbf {1}}_{\{{Q}^N<0\}}+ T_2^j+T_3^j +T_4^j\\&\quad = -2 \ell ^{\alpha -1} \lambda _j^2 {\mathbb {E}}_{x}(Sx)^j e^{{Q} }{\mathbf {1}}_{\{{Q}<0\}}+ T_2^j+T_3^j +T_4^j + T_5^j,\\&T_2^j := -2 \ell ^{\alpha -1} \lambda _j^2 {\mathbb {E}}_{x}(Sx)^j e^{R_{j,\perp }^N +(\delta _j^B)^2/2} \left[ \Phi \left( - \frac{R_{j,\perp }^N}{\left| \delta _j^B\right| } - \left| \delta _j^B \right| \right) - \Phi \left( - \frac{R_{j,\perp }^N}{\left| \delta _j^B\right| } \right) \right] \end{aligned}$$

and

$$\begin{aligned} T_3^j&:= -2 \ell ^{\alpha -1} \lambda _j^2 {\mathbb {E}}_{x} (Sx)^j e^{R_{j,\perp }^N+(\delta _j^B)^2/2 } \left[ \Phi \left( - \frac{R_{j,\perp }^N}{\left| \delta _j^B\right| } \right) - {\mathbf {1}}_{\{R_{j,\perp }^N<0\}} \right] \nonumber \\ T_4^j&:= -2 \ell ^{\alpha -1} \lambda _j^2 {\mathbb {E}}_{x} (Sx)^j \left[ e^{R_{j,\perp }^N +(\delta _j^B)^2/2 } {\mathbf {1}}_{\{R_{j,\perp }^N<0\}} - e^{{Q}^N } {\mathbf {1}}_{\{{Q}^N<0\}}\right] \nonumber \\ T_5^j&:= -2 \ell ^{\alpha -1} \lambda _j^2 {\mathbb {E}}_{x} (Sx)^j \left[ e^{{Q}^N }{\mathbf {1}}_{\{{Q}^N<0\}} - e^{{Q} }{\mathbf {1}}_{\{{Q}<0\}}\right] . \end{aligned}$$
(B.28)

To prove the statement it suffices to show that

$$\begin{aligned} {\mathbb {E}}_{\pi ^N}\sum _{n=0}^5\Vert T_n\Vert ^2\rightarrow 0 \quad \text{ as } N \rightarrow \infty . \end{aligned}$$

These calculations are a bit lengthy, so we gather the proof of the above in Lemma B.4 below. Assuming for the moment that the above is true, the proof is concluded after recognising that

$$\begin{aligned} -2 \ell ^{\alpha -1} \lambda _j^2 (Sx)^j {\mathbb {E}}\,e^{{Q} }{\mathbf {1}}_{\{{Q}<0\}}= -2 \ell ^{\alpha -1} ({\tilde{S}}^Nx^N)^j \nu _{{\mathbf {p}}}. \end{aligned}$$

\(\bullet \,\)Proof of (iii) Proceeding as in the proof of [22, Lemma 4.5 and Corollary 4.6], we see that (B.20) is a consequence of (B.2), (B.12) and of the following limit:

$$\begin{aligned} {\mathbb {E}}_{\pi ^N}\left| e^N_{\star }\right| ^2 \longrightarrow 0 \quad \text{ as } N \rightarrow \infty . \end{aligned}$$

The above follows from the definition (B.3), Lemma B.1, Assumption 5.2 and [22, equation (4.7)].

\(\bullet \,\)Proof of (iv) One could show this with the same procedure as in (iii); in this case, however, things are simpler, since we can write

$$\begin{aligned} {\mathbb {E}}_{\pi ^N}\left| {\mathbb {E}}_x (1\wedge e^{Q^N}) - (1\wedge e^0)\right| ^2 \le {\mathbb {E}}_{\pi ^N}\left| Q^N\right| ^2 \rightarrow 0\,. \end{aligned}$$

The above limit follows simply from the assumption that \(\gamma >1/6\) and \(c_1=c_2=c_3=0\).    \(\square \)

Lemma B.4

If Assumptions 5.1, 5.2, 5.4 and Condition 5.3 hold, then

$$\begin{aligned} {\mathbb {E}}_{\pi ^N}\Vert T_n\Vert ^2= {\mathbb {E}}_{\pi ^N}\sum _{j=1}^N \left| T_n^j \right| ^2 {\mathop {\longrightarrow }\limits ^{N \rightarrow \infty }} 0, \quad \text{ for } \text{ all } n \in \{0,1 ,{\dots },5\}, \end{aligned}$$

where \(T_n=\sum _{i=1}^N \langle T_n, \varphi _i\rangle \varphi _i\) and the terms \(T_0^j ,{\dots },T_5^j\) have been introduced in (B.26)–(B.28).

Proof of Lemma B.4

We consecutively prove the above limit for the terms \(T_0^j ,{\dots },T_5^j\).

\(\bullet \) Using the Lipschitz continuity of the function \(u \mapsto 1 \wedge e^u\), the result for \(T_0\) follows from the definition of \(R^N\), Eq. (B.21), and Lemma B.1, statements (iii) and (v). The result for \(T_1\) can be obtained similarly.

\(\bullet \) Term \(T_2\): we use the Lipschitz continuity of the function \(\Phi \) and observe that the following holds:

$$\begin{aligned} {\mathbb {E}}_{\pi ^N}e^{c (R^N_{j,\perp }+ \delta ^2/2)} \lesssim {\mathbb {E}}_{\pi ^N}e^{c R^N}\lesssim 1 \quad \text{ for } \text{ all } c>0. \end{aligned}$$

The above can be obtained by a reasoning similar to the one detailed in [18, page 916 and (5.20)], using (i) of Assumption 5.4. Using the above observations and applying Hölder's inequality with the exponent r appearing in (ii) of Assumption 5.4, one then gets

$$\begin{aligned} {\mathbb {E}}_{\pi ^N}\Vert T_2\Vert ^2 \lesssim \left( {\mathbb {E}}_{\pi ^N}\left( \sum _{j=1}^N \frac{\lambda _j^6 \left| (Sx)^j \right| ^4}{N^{2\gamma (\alpha -1)}} \right) ^r \right) ^{1/r} \,. \end{aligned}$$

Therefore the term \(T_2\) goes to zero by Assumption 5.4.

\(\bullet \) The term \(T_3\) can be treated with calculations completely analogous to those in [18, Lemma 5.8]. As a result of such calculations, using the fact that the noise \(z^{j,N}\) is always independent of the current position x, and recalling Eq. (B.16), we obtain that for any \(r,q>1\) (to be appropriately chosen later), the following bound holds:

$$\begin{aligned} {\mathbb {E}}_{\pi ^N}\Vert T_3\Vert ^2&\lesssim \left\{ {\mathbb {E}}_{\pi ^N}\left[ \sum _{j=1}^N \lambda _j^4 \left| (Sx)^j\right| ^2 \left[ {\mathbb {E}}_x \frac{(\lambda _j \left| (Sx)^j\right| +1 )}{\left| R^N \right| N^{\gamma (\alpha -1)}+1} \cdot \left( 1+ \frac{\left| z^{j,N}\right| ^2}{N^{4\gamma }}+ Z_j^N\right) \right] ^{2/q} \right] ^r \right\} ^{1/r} \end{aligned}$$
(B.29)

Now set

$$\begin{aligned} D_N:= \left( {\mathbb {E}}_x \frac{1}{(\left| R^N \right| N^{\gamma (\alpha -1)}+1)^2} \right) ^{r/q}, \end{aligned}$$

so that

$$\begin{aligned} {\mathbb {E}}_{\pi ^N}\Vert T_3\Vert ^2&\lesssim \left\{ {\mathbb {E}}_{\pi ^N}D_N \left\{ \sum _{j=1}^N \lambda _j^4 \left| (Sx)^j\right| ^2\left[ {\mathbb {E}}_x \left( \lambda _j \left| (Sx)^j\right| +1 \right) ^{2} \right] ^{1/q} \right\} ^r\right\} ^{1/r} \quad \quad \quad \quad {(I)}\\&\quad + \left\{ {\mathbb {E}}_{\pi ^N}D_N \left\{ \sum _{j=1}^N \lambda _j^4 \left| (Sx)^j\right| ^2\left[ {\mathbb {E}}_x \left( \lambda _j \left| (Sx)^j\right| +1 \right) ^{2} \left| Z^N_j\right| ^2 \right] ^{1/q} \right\} ^r \right\} ^{1/r} \,. \quad {(II)} \end{aligned}$$

Notice that by the bounded convergence theorem, we have

$$\begin{aligned} {\mathbb {E}}_{\pi ^N}D_N \lesssim \left( \frac{1}{N^{\gamma (\alpha -1)}}\right) ^{r/q}. \end{aligned}$$
(B.30)

With this observation it is easy to show that the term (I) tends to zero. Showing that (II) tends to zero is less straightforward, so for this term we detail the calculations a bit more.

$$\begin{aligned} (II)&\lesssim ({\mathbb {E}}_{\pi ^N}D_N^2)^{1/(2r)}\\&\times \left( {\mathbb {E}}_{\pi ^N}\left\{ \sum _{j=1}^N \lambda _j^4 \left| (Sx)^j\right| ^2 \left[ {\mathbb {E}}_x \left[ (\lambda _j \left| (Sx)^j\right| +1 ) \frac{\left| x^{j,N} z^{j,N}\right| }{N^{3\gamma }\lambda _j} \right] ^2 \right] ^{1/q} \right\} ^{2r} \right) ^{1/(2r)}\\&\quad + ({\mathbb {E}}_{\pi ^N}D_N^2)^{1/(2r)}\\&\times \left( {\mathbb {E}}_{\pi ^N}\left\{ \sum _{j=1}^N \lambda _j^4 \left| (Sx)^j\right| ^2 \left[ {\mathbb {E}}_x \left[ (\lambda _j \left| (Sx)^j\right| +1 ) \frac{\lambda _j \left| (Sx)^j\right| \left| z^{j,N} \right| }{N^{\gamma (\alpha -1)}} \right] ^2 \right] ^{1/q} \right\} ^{2r} \right) ^{1/(2r)}\\&\quad + ({\mathbb {E}}_{\pi ^N}D_N^2)^{1/(2r)}\\&\times \left( {\mathbb {E}}_{\pi ^N}\left\{ \sum _{j=1}^N \lambda _j^4 \left| (Sx)^j\right| ^2 \left[ {\mathbb {E}}_x \left[ (\lambda _j \left| (Sx)^j\right| +1 ) \frac{\lambda _j \left| (S{\tilde{S}}x)^j\right| \left| z^{j,N} \right| }{N^{\gamma (2\alpha -1)}} \right] ^2 \right] ^{1/q} \right\} ^{2r} \right) ^{1/(2r)}\\&\quad + ({\mathbb {E}}_{\pi ^N}D_N^2)^{1/(2r)}\\&\times \left( {\mathbb {E}}_{\pi ^N}\left\{ \sum _{j=1}^N \lambda _j^4 \left| (Sx)^j\right| ^2 \left[ {\mathbb {E}}_x \left[ (\lambda _j \left| (Sx)^j\right| +1 ) \frac{\lambda _j \left| (S {\tilde{S}}^2 x)^j\right| \left| z^{j,N} \right| }{N^{\gamma (3\alpha -1)}} \right] ^2 \right] ^{1/q} \right\} ^{2r} \right) ^{1/(2r)}\,. \end{aligned}$$

We denote by (II)\(_1\) to (II)\(_4\) the terms in lines 1 to 4 of the above array of equations, and the second factor in line i is denoted by (II)\(_{ib}\); so, e.g.,

$$\begin{aligned} (II)_1= ({\mathbb {E}}_{\pi ^N}D_N^2)^{1/(2r)} ((II)_{1b})^{1/(2r)}, \end{aligned}$$

where

$$\begin{aligned} (II)_{1b}:= {\mathbb {E}}_{\pi ^N}\left\{ \sum _{j=1}^N \lambda _j^4 \left| (Sx)^j\right| ^2 \left[ {\mathbb {E}}_x \left[ (\lambda _j \left| (Sx)^j\right| +1 ) \frac{\left| x^{j,N} z^{j,N}\right| }{N^{3\gamma }\lambda _j} \right] ^2 \right] ^{1/q} \right\} ^{2r} . \end{aligned}$$

To streamline the presentation we have written the calculations leading to the above four terms in a way that suggests the choice of q should be the same for all of them. However, acting appropriately in the computations that give (B.29), one can see that q need not be the same for each of the above terms. We show how to study (II)\(_1\) and (II)\(_3\); the other terms can be handled with similar tricks. Starting from (II)\(_1\): because of (B.30), we just need to prove that (II)\(_{1b}\) is bounded. We will do slightly better in what follows. Recall that by assumption

$$\begin{aligned} {\mathbb {E}}_{\pi ^N}\sum _{j=1}^N \frac{\lambda _j^{2p} \left| (Sx)^j\right| ^{2p}}{N^{2p\gamma (\alpha -1)}}\le {\mathbb {E}}_{\pi ^N}\left( \sum _{j=1}^N \frac{\lambda _j^{2} \left| (Sx)^j\right| ^{2}}{N^{2\gamma (\alpha -1)}} \right) ^p < \infty . \end{aligned}$$
(B.31)

Choosing \(q=2\) in the definition of (II)\(_{1b}\) and recalling \(x^{j,N} \sim \lambda _j \rho _j\), we get

$$\begin{aligned} (II)_{1b} = {\mathbb {E}}_{\pi ^N}\left( \sum _{j=1}^N \frac{\lambda _j^{2} \lambda _j^3 \left| (Sx)^j\right| ^{3}}{N^{3\gamma }} \right) ^{2r} \lesssim {\mathbb {E}}_{\pi ^N}\sum _{j=1}^N \frac{\lambda _j^{2} \lambda _j^{6r} \left| (Sx)^j\right| ^{6r}}{N^{6r\gamma }} \longrightarrow 0, \end{aligned}$$

where in the last inequality we have used the weighted Jensen inequality (relying on the fact that \(\{\lambda _j^2\}_j\) is summable) and the convergence of the RHS to zero follows from (B.31). The term (II)\(_{2b}\) can be dealt with analogously, choosing \(q=4\) (this time, when applying the weighted Jensen inequality, one should rely on the fact that the sequence \(\{\lambda _j^4 \left| (Sx)^j\right| ^2 \}_j \) is summable for every \(x \in {\mathcal {H}}\)). Finally, to deal with (II)\(_{3b}\), we use the fact that the sequence \(\{({\tilde{S}}^2 x)^j\}_j\) is, by assumption, bounded for every \(x \in {\mathcal {H}}\). Therefore, choosing \(q=2\) we have:

$$\begin{aligned} (II)_{3b}&= {\mathbb {E}}_{\pi ^N}\left( \sum _{j=1}^N \lambda _j^4 \left| (Sx)^j\right| ^{2} \lambda _j \left| (Sx)^j\right| \frac{\lambda _j \left| (S {\tilde{S}}x)^j\right| }{N^{\gamma (2\alpha -1)}} \right) ^{2r}\\&= {\mathbb {E}}_{\pi ^N}\left( \sum _{j=1}^N \frac{\lambda _j^4 \left| (Sx)^j\right| ^3 }{N^{\gamma (2\alpha -1)}} \left| ({\tilde{S}}^2 x)^j\right| \right) ^{2r}\\&\le {\mathbb {E}}_{\pi ^N}\left( \sum _{j=1}^N \frac{\lambda _j^2 \left| (Sx)^j\right| ^2}{N^{\gamma (2\alpha -1)}} \lambda _j^2\left| (Sx)^j\right| \right) ^{2r} \,. \end{aligned}$$

Because \(2\alpha \gamma -\gamma > 2\gamma (\alpha -1)\), the RHS of the above tends to zero by using (B.31). The term (II)\(_{4b}\) can be dealt with in a completely analogous manner.
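The weighted Jensen inequality invoked for (II)\(_{1b}\), namely \(\left( \sum _j w_j a_j\right) ^p \le \left( \sum _j w_j\right) ^{p-1} \sum _j w_j a_j^p\) for weights \(w_j>0\), \(a_j \ge 0\) and \(p\ge 1\), can be checked numerically (illustrative values only; all names are ad hoc):

```python
# Illustrative check of the weighted Jensen inequality:
# (sum_j w_j a_j)^p <= (sum_j w_j)^{p-1} * sum_j w_j a_j^p, for w_j > 0, a_j >= 0, p >= 1.
# With summable weights w_j = lambda_j^2 this is the step bounding
# (sum_j lambda_j^2 a_j)^{2r} by a constant times sum_j lambda_j^2 a_j^{2r}.
import random

random.seed(1)
p = 3.0
w = [(j + 1.0) ** -2 for j in range(50)]        # summable weights, e.g. lambda_j^2
a = [abs(random.gauss(0, 1)) for _ in range(50)]

lhs = sum(wj * aj for wj, aj in zip(w, a)) ** p
rhs = sum(w) ** (p - 1) * sum(wj * aj ** p for wj, aj in zip(w, a))
assert lhs <= rhs + 1e-12
```

The inequality follows by applying Jensen's inequality to the probability weights \(w_j / \sum _k w_k\) and the convex map \(t \mapsto t^p\).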

\(\bullet \) The terms \(T_4\) and \(T_5\) can be studied similarly to what has been done in [14]; see the calculations from equation (8.31) onwards, in particular the terms \(e^{i,N}_{3,k},e^{i,N}_{5,k}\). \(\square \)

Cite this article

Ottobre, M., Pillai, N.S. & Spiliopoulos, K. Optimal scaling of the MALA algorithm with irreversible proposals for Gaussian targets. Stoch PDE: Anal Comp 8, 311–361 (2020). https://doi.org/10.1007/s40072-019-00147-5
