1 Introduction

1.1 Main results

The main goal of this article is to give quantitative estimates in the homogenization of discrete divergence-form operators with random coefficients. Writing \(\mathbb B \) for the set of edges of \(\mathbb Z ^d\), we let \(\omega = (\omega _e)_{e \in \mathbb B }\) be a family of i.i.d. random variables, assumed to be uniformly bounded away from \(0\) and infinity, and whose joint distribution will be written \(\mathbb P \) (with associated expectation \(\mathbb E \)). The operator whose homogenization properties we wish to investigate is

$$\begin{aligned} L^\omega f(x) = \sum _{y \sim x} \omega _{x,y} (f(y)-f(x)) \quad (x \in \mathbb Z ^d), \end{aligned}$$
(1.1)

where we write \(y \sim x\) when \(x,y \in \mathbb Z ^d\) are nearest neighbours. For a bounded continuous \(f : \mathbb R ^d \rightarrow \mathbb R \), we let \(u^{(\varepsilon )}\) denote the solution of
$$\begin{aligned} \left\{ \begin{array}{l@{\quad }l} \partial _t u^{(\varepsilon )} = L^\omega u^{(\varepsilon )} &{} \text {on } \mathbb R _+ \times \mathbb Z ^d, \\ u^{(\varepsilon )}(0,x) = f(\varepsilon x) &{} (x \in \mathbb Z ^d), \end{array} \right. \end{aligned}$$
(DPE\(^\omega _\varepsilon \))

and \(u_\varepsilon (t,x) = u^{(\varepsilon )}(\varepsilon ^{-2} t,\lfloor \varepsilon ^{-1} x \rfloor )\). There exists a symmetric positive-definite matrix \(\overline{A}\) (independent of \(f\)) such that the function \(u_\varepsilon \) converges, as \(\varepsilon \) tends to \(0\), to the function \(\overline{u}\) solution of
$$\begin{aligned} \left\{ \begin{array}{l@{\quad }l} \partial _t \overline{u} = \frac{1}{2} \nabla \cdot \left( \overline{A} \nabla \overline{u}\right) &{} \text {on } \mathbb R _+ \times \mathbb R ^d, \\ \overline{u}(0,\cdot ) = f. &{} \end{array} \right. \end{aligned}$$
(CPE)

The notions of being a solution to (DPE\(^\omega _\varepsilon \)) or (CPE), and of the convergence of \(u_\varepsilon \) to \(\overline{u}\), will be made precise later on. For every \(\alpha = (\alpha _1, \ldots , \alpha _d) \in \mathbb N ^d\), we call

$$\begin{aligned} \partial _{x_1^{\alpha _1} \ldots x_d^{\alpha _d}} f = \frac{\partial ^{|\alpha |_1} f}{\partial {x_1^{\alpha _1}} \cdots \partial {x_d^{\alpha _d}}} \qquad \textstyle {(|\alpha |_1 = \sum _j \alpha _j)} \end{aligned}$$

a weak derivative of \(f\) of order \(|\alpha |_1\), where the derivative is understood in the sense of distributions.

Here and below, we write \(\lfloor x \rfloor \) for the integer part of \(x\), \(a \wedge b = \min (a,b)\), \(a \vee b = \max (a,b)\), \(\log _+(x) = \log (x) \vee 1\), and \(|\xi |\) for the \(L^2\) norm of \(\xi \in \mathbb R ^d\). The main purpose of this paper is to prove the following theorems.
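For illustration (this numerical sketch is ours and not part of the paper), the operator (1.1) can be implemented on a two-dimensional periodic box, with uniform \([1,2]\) conductances standing in for a uniformly elliptic environment; the name `discrete_generator` is our own.

```python
import numpy as np

def discrete_generator(f, cond_h, cond_v):
    """L^omega f(x) = sum_{y ~ x} omega_{x,y} (f(y) - f(x)) on an n x n
    periodic grid: cond_h[i, j] is the conductance of the edge from
    (i, j) to (i+1, j), and cond_v[i, j] of the edge to (i, j+1)."""
    Lf = cond_h * (np.roll(f, -1, axis=0) - f)
    Lf += np.roll(cond_h, 1, axis=0) * (np.roll(f, 1, axis=0) - f)
    Lf += cond_v * (np.roll(f, -1, axis=1) - f)
    Lf += np.roll(cond_v, 1, axis=1) * (np.roll(f, 1, axis=1) - f)
    return Lf

rng = np.random.default_rng(0)
n = 16
cond_h = rng.uniform(1.0, 2.0, (n, n))  # i.i.d., bounded away from 0 and infinity
cond_v = rng.uniform(1.0, 2.0, (n, n))
f = rng.standard_normal((n, n))
Lf = discrete_generator(f, cond_h, cond_v)

# two consequences of the divergence form: constants are annihilated,
# and the total mass sum_x L^omega f(x) vanishes (omega_{x,y} = omega_{y,x})
assert np.allclose(discrete_generator(np.ones((n, n)), cond_h, cond_v), 0.0)
assert abs(Lf.sum()) < 1e-9
```

The two assertions check the defining features of a divergence-form operator with symmetric conductances.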

Theorem 1.1

Let \(m = \lfloor d/2 \rfloor + 3\) and \(\delta > 0\). There exist constants \(C_\delta \) (which may depend on the dimension) and \(q\) such that, if the weak derivatives of \(f\) up to order \(m\) are in \(L^2(\mathbb R ^d)\), then for any \(\varepsilon > 0, t > 0\) and \(x \in \mathbb R ^d\), one has

$$\begin{aligned}&\left| \mathbb E [u_\varepsilon (t,x)] - \overline{u}(t,x) \right| \nonumber \\&\quad \leqslant \sum _{j=1}^d \Vert \partial _{x_j} f \Vert _\infty \ \varepsilon + C_\delta \ (t+\sqrt{t}) \left( \Vert f\Vert _2 + \sum _{j=1}^d \Vert \partial _{x_j^m} f\Vert _2 \right) \ \Psi _{q,\delta }\left( \frac{\varepsilon ^2}{{t}}\right) ,\quad \qquad \end{aligned}$$
(1.2)

where

$$\begin{aligned} \Psi _{q,\delta }(u) = \left| \begin{array}{l@{\quad }l} u^{1/4} &{} \quad \text {if } d = 1, \\ \log ^q_+(u^{-1})\ u^{1/4} &{} \quad \text {if } d = 2, \\ u^{1/2-\delta } &{} \quad \text {if } d \geqslant 3. \end{array}\right. \end{aligned}$$
(1.3)
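For concreteness, the rate function (1.3) can be coded as a small helper (`psi` is our hypothetical name; the values of \(q\) and \(\delta \) below are arbitrary illustrative choices):

```python
import math

def psi(u, d, q=1.0, delta=0.05):
    """Psi_{q, delta}(u) as defined in (1.3)."""
    if d == 1:
        return u ** 0.25
    if d == 2:
        return max(math.log(1.0 / u), 1.0) ** q * u ** 0.25  # log_+^q(u^{-1}) u^{1/4}
    return u ** (0.5 - delta)  # d >= 3

# the rate improves with the dimension: for small u = eps^2 / t,
# u^{1/4} (d = 1) dominates u^{1/2 - delta} (d >= 3)
assert psi(1e-6, 1) > psi(1e-6, 3)
```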

Remark 1.2

The Sobolev embedding theorem ensures that under the assumptions of Theorem 1.1, the function \(f\) is continuously differentiable and the norms \(\Vert \partial _{x_j} f \Vert _\infty \) are finite (see for instance [1, Theorem 5.4]).

Theorem 1.3

Let \(p_t^\omega (x,y)\) be the heat kernel associated to \(L^\omega \), let

$$\begin{aligned} \overline{p}_t(x,y) = \frac{1}{(2 \pi t)^{d/2} \sqrt{\det \overline{A}}} \exp \left( - \frac{1}{2t} (y-x)^\mathsf T \ \overline{A}^{-1} (y-x) \right) \end{aligned}$$

be the heat kernel associated to \(\frac{1}{2} \nabla \cdot \overline{A} \nabla \), and let \(\delta > 0\). There exist constants \(c > 0\) (independent of \(\delta \)), \(q, C_\delta , \varepsilon _\delta > 0\) such that for any \(\varepsilon > 0, t > 0\) satisfying \(\varepsilon /\sqrt{t}\leqslant \varepsilon _\delta \) and any \(x \in \mathbb R ^d\), one has

$$\begin{aligned}&\left| \varepsilon ^{-d} \ \mathbb E \left[ p^\omega _{\varepsilon ^{-2} t}(0, \lfloor \varepsilon ^{-1} x \rfloor )\right] - \overline{p}_t(0,x) \right| \nonumber \\&\quad \leqslant \frac{C_\delta }{t^{d/2}} \ \left( \Psi _{q,\delta }\left( \frac{\varepsilon ^2}{t}\right) \right) ^{1/(d+3)} \exp \left[ -c \left( \frac{|x|^2}{t} \wedge |\varepsilon ^{-1} x| \right) \right] . \end{aligned}$$
(1.4)

In particular, for any \(s > 0\), there exists \(C_{\delta ,s}\) such that for \(\varepsilon \) small enough,

$$\begin{aligned} \sup _{x \in \mathbb R ^d} \sup _{t \geqslant s} \left| \varepsilon ^{-d} \ \mathbb E \left[ p^\omega _{\varepsilon ^{-2} t}(0, \lfloor \varepsilon ^{-1} x \rfloor )\right] - \overline{p}_t(0,x) \right| \leqslant C_{\delta ,s} \left( \Psi _{q,\delta }(\varepsilon ^2)\right) ^{1/(d+3)}. \end{aligned}$$

Remark 1.4

For a given smooth function \(f\) and a fixed \(t > 0\), the right-hand side of (1.2) is of the order of

$$\begin{aligned} \left| \begin{array}{ll} \sqrt{\varepsilon } &{} \quad \text {if } d = 1, \\ \log ^q(\varepsilon ^{-1}) \sqrt{\varepsilon } &{} \quad \text {if } d = 2, \\ {\varepsilon }^{1-\delta '} &{} \quad \text {if } d \geqslant 3, \\ \end{array}\right. \end{aligned}$$
(1.5)

where \(\delta ' = 2 \delta > 0\) is arbitrary. Similarly, for fixed \(t\) and \(x\), the right-hand side of (1.4) is of the order of

$$\begin{aligned} \left| \begin{array}{ll} \varepsilon ^{1/8} &{}\quad \text {if } d = 1, \\ \log ^{q/5}(\varepsilon ^{-1}) \ \varepsilon ^{1/10} &{}\quad \text {if } d = 2, \\ {\varepsilon }^{1/(d+3)-\delta ''} &{}\quad \text {if } d \geqslant 3, \\ \end{array}\right. \end{aligned}$$
(1.6)

where \(\delta '' = 2\delta /(d+3) > 0\) is arbitrary.
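To make this concrete, the orders in (1.6) follow by substituting \(u = \varepsilon ^2/t\) (with \(t\) fixed) into (1.3) and raising to the power \(1/(d+3)\) appearing in (1.4):

$$\begin{aligned} d = 1 :\ \left( \varepsilon ^{1/2}\right) ^{1/4} = \varepsilon ^{1/8}, \qquad d = 2 :\ \left( \log ^q_+(\varepsilon ^{-2})\, \varepsilon ^{1/2}\right) ^{1/5} = O\left( \log ^{q/5}(\varepsilon ^{-1})\, \varepsilon ^{1/10}\right) , \end{aligned}$$

while for \(d \geqslant 3\), \(\left( \varepsilon ^{1-2\delta }\right) ^{1/(d+3)} = \varepsilon ^{1/(d+3) - \delta ''}\) with \(\delta '' = 2\delta /(d+3)\).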

Remark 1.5

Similar results concerning elliptic equations are presented in Theorems 7.1 and 7.3 below.

1.2 Context

Homogenization problems have a very long history, going back at least to [34, 40]. Rigorous proofs of homogenization for periodic environments were obtained in the 1960s and 1970s (see [4] for references), and then for random environments in [31, 32, 37, 42]. Classical methods used to prove homogenization typically rely on a compactness argument or on the ergodic theorem, and both approaches leave the question of the rate of convergence untouched.

For continuous space and periodic coefficients, [28, Corollary 2.7] uses spectral methods to show that

$$\begin{aligned} \left| \varepsilon ^{-d} \ p^\omega _{\varepsilon ^{-2} t}(0,\varepsilon ^{-1} x) - \overline{p}_t(0,x) \right| \leqslant C \ \frac{\varepsilon }{t^{(d+1)/2}}. \end{aligned}$$

For random coefficients, available results are much less precise. For continuous space, [43] gives an algebraic speed of convergence of \(u_\varepsilon \) to \(\overline{u}\) for the elliptic problem and \(d \geqslant 3\), without providing an explicit exponent. In [8], the much more general case of fully nonlinear elliptic equations is considered, and a speed of convergence of a logarithmic type is proved.

Here, we focus on the convergence of the average of \(u_\varepsilon \) to \(\overline{u}\). This approach has been considered in [14] for the elliptic problem. There, it is shown that the suitably rescaled Green function, once averaged over the randomness of the coefficients, differs from the Green function of the homogenized equation by no more than a negative power of \(\varepsilon \). The exponent obtained is implicit, and depends on the ellipticity condition assumed on the random coefficients. Similar results for parabolic equations have been derived in [11].

In contrast, Theorems 1.1 and 1.3 provide explicit exponents that depend only on the dimension. I conjecture that the correct order of decay with \(\varepsilon \) in Theorem 1.1 should be

$$\begin{aligned} \left| \begin{array}{ll} \sqrt{\varepsilon } &{} \quad \text {if } d = 1, \\ \varepsilon \sqrt{|\log (\varepsilon )|} &{} \quad \text {if } d = 2, \\ {\varepsilon }&{} \quad \text {if } d \geqslant 3. \\ \end{array}\right. \end{aligned}$$

This differs notably from what is obtained in Theorem 1.1 only when \(d = 2\). On the other hand, it may well be that the assumption of high regularity on \(f\) is only an artefact of the methods employed.

The fact that

$$\begin{aligned} \sup _{x \in \mathbb R ^d} \sup _{t \geqslant s} \left| \varepsilon ^{-d} \ p^\omega _{\varepsilon ^{-2} t}(0, \lfloor \varepsilon ^{-1} x \rfloor ) - \overline{p}_t(0,x) \right| \xrightarrow [\varepsilon \rightarrow 0]{\text {a.s.}} 0 \end{aligned}$$
(1.7)

is known at least since [3], where the much more difficult case in which the random coefficients are Bernoulli random variables is considered (in this context, the heat kernel should be considered only within the unique infinite percolation cluster). Yet, for strictly positive random coefficients, this convergence does not hold if the distribution of the random coefficients is allowed to have a fat tail close to \(0\) and if \(d \geqslant 4\) [5, 6]. Under the same circumstances, and when \(p^\omega \) is replaced by its average in (1.7), the convergence fails to hold in any dimension [20] (see however [2, Proposition 7.2] for a nice way to get around this problem).

Under our present assumption of uniform ellipticity, regularity properties of the average of \(p^\omega \) were proved in [12, 17] (more on this will come below).

An evaluation of the gap between the average of \(u_\varepsilon \) and \(\overline{u}\) naturally calls for estimates on the size of the random fluctuations of \(u_\varepsilon \) around its average. In this direction and for the elliptic problem, [12] obtains algebraic decay of the variance of \(u_\varepsilon \) (integrated over space). The exponent obtained is implicit, and depends on the ellipticity conditions.

1.3 Our approach

In order to prove Theorem 1.1, we will use the representation of \(u_\varepsilon \) as an expected value over the paths of a random walk, which we write \((X_t)_{t \geqslant 0}\). This random walk has inhomogeneous jump rates given by the \((\omega _e)_{e \in \mathbb B }\), and \(L^\omega \) is its infinitesimal generator. For instance, one has

$$\begin{aligned} u_\varepsilon (t,0) = \mathbf E ^\omega _0\left[ f\left( \varepsilon X_{\varepsilon ^{-2} t} \right) \right] \!, \end{aligned}$$

where we write \(\mathbf P ^\omega _0\) for the distribution of the random walk starting from \(0\), and \(\mathbf E ^\omega _0\) for its associated expectation. The (pointwise) convergence of \(u_\varepsilon \) to \(\overline{u}\) is equivalent to the claim that the random walk, after diffusive scaling, satisfies a central limit theorem. Quantitative estimates should thus follow if one can provide rates of convergence in this central limit theorem.
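As an illustration (this code is ours, not part of the paper: the one-dimensional environment is made \(n\)-periodic for simplicity, and the function `walk_position`, the sample size and the test function are arbitrary choices), the representation above can be sketched by a small Monte Carlo computation:

```python
import numpy as np

rng = np.random.default_rng(1)

def walk_position(cond, T):
    """Continuous-time walk on a periodized copy of Z: from site x, jump
    left at rate cond[(x-1) % n] and right at rate cond[x % n], where
    cond[i] is the conductance of the edge {i, i+1}; the generator is
    the d = 1 case of (1.1).  Returns X_T for the walk started at 0."""
    n = len(cond)
    x, t = 0, 0.0
    while True:
        left, right = cond[(x - 1) % n], cond[x % n]
        t += rng.exponential(1.0 / (left + right))  # waiting time at x
        if t > T:
            return x
        x += 1 if rng.random() < right / (left + right) else -1

n, eps, t = 200, 0.2, 1.0
cond = rng.uniform(1.0, 2.0, n)  # one fixed environment omega
f = np.cos                       # a bounded continuous test function
# Monte Carlo estimate of u_eps(t, 0) = E_0^omega[ f(eps * X_{t/eps^2}) ]
u_eps = np.mean([f(eps * walk_position(cond, t / eps**2)) for _ in range(1000)])
```

Averaging further over independent draws of `cond` would approximate \(\mathbb E [u_\varepsilon (t,0)]\), the quantity compared with \(\overline{u}(t,0)\) in Theorem 1.1.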

In [35], it is shown that there exist constants \(C,q \geqslant 0\) such that for any \(\xi \) of unit norm,

$$\begin{aligned} \sup _{x \in \mathbb R } \ \left| \mathbb P \mathbf P ^\omega _0 \left[ \frac{\xi \cdot X_t}{\sigma (\xi ) \sqrt{t}} \leqslant x \right] - \Phi (x) \right| \leqslant C \left| \begin{array}{ll} t^{-1/10} &{} \quad \text {if } d = 1, \\ \log _+^q(t) \ t^{-1/10} &{} \quad \text {if } d = 2, \\ \log _+(t)\ t^{-1/5} &{} \quad \text {if } d = 3, \\ t^{-1/5} &{} \quad \text {if } d \geqslant 4, \end{array}\right. \end{aligned}$$
(1.8)

where \(\Phi \) is the cumulative distribution function of the standard Gaussian random variable, and \(\sigma ^2(\xi ) = \xi \cdot \overline{A}\xi \).

This result has two important weak points: (1) the rates are far from the usual \(t^{-1/2}\) one obtains for sums of i.i.d. random variables, and (2) the theorem only gives information about the projections of \(X_t\) onto a fixed vector. We shall find ways to overcome these two problems.

The classical approach to the proof of a central limit theorem for the random walk consists in decomposing it as the sum of a martingale and a remainder term, and then showing that the martingale converges (after scaling) to a Gaussian random variable, while the remainder term becomes negligible in the limit.
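In the present setting, this decomposition is usually realized through the corrector (we sketch the standard construction of this literature; the notation \(\chi _\xi \) is ours): one chooses \(\chi _\xi \) so that \(x \mapsto \xi \cdot x + \chi _\xi (\omega ,x)\) is \(L^\omega \)-harmonic, and writes

$$\begin{aligned} \xi \cdot X_t = \underbrace{\xi \cdot X_t + \chi _\xi (\omega , X_t)}_{\text {martingale under } \mathbf P ^\omega _0} \ - \ \chi _\xi (\omega , X_t). \end{aligned}$$

The sublinearity of \(\chi _\xi \) then makes the remainder term negligible after diffusive scaling.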

In view of this, what should be done is clear: we should first find a quantitative estimate on how small the remainder term is, and second, show that the martingale converges rapidly to a Gaussian. This is indeed the method used in [35]. The control of the remainder term given there is satisfactory, and the problem lies with the quantitative central limit theorem for the martingale part.

This quantitative central limit theorem relies on the fact that one can have a sharp control of the variance of the quadratic variation of the martingale. It is shown that, after scaling, this variance decays like \(t^{-1}\) when \(d \geqslant 4\), which is the best possible rate. However, given such a control, the quantitative CLT (due to [24, 26]) used there only yields a decay of \(t^{-1/5}\) in this case.

Surprisingly, this exponent \(1/5\) is the best possible in general [36]. To overcome this obstruction, we derive new quantitative CLT's for martingales, which do not yield a Berry-Esseen type of estimate, but rather measure

$$\begin{aligned} \sup _{f \in \mathfrak L } \left| \mathbb E \mathbf E ^\omega _0\left[ f\left( \frac{\xi \cdot X_t}{\sigma (\xi ) \sqrt{t}} \right) \right] - \int f \ \mathrm{d }\Phi \right| , \end{aligned}$$

where \(\mathfrak L \) is a class of functions (this is reminiscent of Stein’s method, see for instance [10]). When \(\mathfrak L \) is the class of bounded \(1\)-Lipschitz functions, the supremum is often called the Kantorovich(-Rubinstein) distance. We also consider \(\mathfrak L \) to be the class of bounded \(\mathcal C ^2\) functions that have first derivative bounded by \(1\) and second derivative bounded by \(k\), and call the resulting supremum the \(k\)-Kantorovich distance. The martingale CLT's obtained hold for general square-integrable martingales, and are of independent interest.

Once equipped with these quantitative martingale CLT’s, we apply them to the one-dimensional projections of the random walk \((X_t)\), and for \(d \geqslant 3\), we obtain rates approaching the i.i.d. rate of \(t^{-1/2}\). To do so, we use estimates derived in [35], most importantly on the variance of the quadratic variation of the martingale. These in turn are consequences of the \(L^p\) boundedness of the corrector (for \(d \geqslant 3\), and with logarithmic corrections for \(d = 2\)), and of a spatial decorrelation property of this corrector, proved in [22, Theorem 2.1 and Proposition 2.1].

In order to obtain Theorem 1.1, we need to carry the information obtained on the projections of \(X_t\) to \(X_t\) itself, in a kind of quantitative version of the Cramér-Wold theorem. This is achieved through Fourier analysis, at the price of requiring the existence of weak derivatives of higher order.

The key observation that enables us to go from Theorem 1.1 to Theorem 1.3 is the high regularity of the averaged heat kernel. In contrast to the true heat kernel, the averaged one has a gradient bounded by a constant times the gradient of \(\overline{p}\), as is proved in [12, 17].

The estimates due to [22] are the only place where the assumptions of independence and uniform ellipticity of the coefficients come into play. In particular, if it is shown that these estimates are valid for certain correlated environments, then the present results automatically extend to this context. The present results should also extend to continuous space with only minor change, as long as the estimates of [22] remain true in this setting.

1.4 Organization of the paper

We introduce the (\(k\)-)Kantorovich and Kolmogorov distances in Sect. 2. In Sect. 3, we consider general square-integrable martingales, and derive quantitative CLT’s with respect to the (\(k\)-)Kantorovich distances. We then apply these results to projections of the random walk \(X_t\) in Sect. 4. The homogenization setting is taken up in Sect. 5, and Theorem 1.1 is proved. Theorem 1.3 is then derived in Sect. 6. Finally, similar results for the homogenization of elliptic equations are presented in Sect. 7.

2 Distances between probability measures

A function \(f : \mathbb R ^m \rightarrow \mathbb R ^n\) is said to be \(k\)-Lipschitz if for any \(x,y \in \mathbb R ^m\), one has \(|f(y)-f(x)| \leqslant k |y-x|\). Let \(\nu , \nu '\) be probability measures on \(\mathbb R \), and let \(F_\nu , F_{\nu '}\) be their respective cumulative distribution functions. We define the Kantorovich distance between \(\nu \) and \(\nu '\) as

$$\begin{aligned} \mathsf d _1(\nu ,\nu ') = \sup \left\{ \left| \int f \mathrm{d }\nu - \int f \mathrm{d }\nu ' \right| , \ f \text { bounded and 1-Lip.} \right\} , \end{aligned}$$
(2.1)

and the Kolmogorov distance between \(\nu \) and \(\nu '\) as

$$\begin{aligned} \mathsf d _\infty (\nu ,\nu ') = \sup _{x \in \mathbb R } \left| F_{\nu '}(x) - F_{\nu }(x) \right| = \Vert F_{\nu '} - F_{\nu }\Vert _\infty . \end{aligned}$$
(2.2)

The notation for the Kantorovich distance becomes more transparent once we recall that (see for instance [41, Theorem 1.14 and (2.48)])

$$\begin{aligned} \mathsf d _1(\nu ,\nu ') = \int \left| F_{\nu '}(x) - F_{\nu }(x) \right| \mathrm{d }x = \Vert F_{\nu '} - F_{\nu }\Vert _1. \end{aligned}$$
(2.3)
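To fix ideas, the two distances can be computed for empirical measures (the helpers below are ours; for two empirical measures with the same number of atoms, (2.3) reduces to the mean absolute difference of the order statistics):

```python
import numpy as np

def kantorovich(xs, ys):
    """d_1 between two empirical measures with the same number of atoms:
    by (2.3) it is the L^1 norm of the difference of the empirical CDFs,
    i.e. the mean absolute difference of the sorted samples."""
    return np.mean(np.abs(np.sort(xs) - np.sort(ys)))

def kolmogorov(xs, ys):
    """d_infty of (2.2): maximal gap between the two empirical CDFs."""
    grid = np.union1d(xs, ys)
    F = lambda zs: np.searchsorted(np.sort(zs), grid, side="right") / len(zs)
    return np.max(np.abs(F(xs) - F(ys)))

xs, ys = np.array([0.0, 1.0]), np.array([0.5, 1.5])
print(kantorovich(xs, ys), kolmogorov(xs, ys))  # both equal 0.5 here
```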

As we will see below, bounds in the martingale CLT are improved when measured with the Kantorovich distance instead of the Kolmogorov distance. We now introduce weaker forms of the Kantorovich distance, for which the rates of convergence will be even better. For any \(k \in [0,+\infty ]\), we define the \(k\)-Kantorovich distance as

$$\begin{aligned} \mathsf d _{1,k}(\nu ,\nu ') = \sup \left\{ \left| \int f \mathrm{d }\nu - \int f \mathrm{d }\nu ' \right| , \ f \in \mathcal C _b^2(\mathbb R ,\mathbb R ), \Vert f'\Vert _\infty \leqslant 1, \Vert f''\Vert _\infty \leqslant k \right\} , \end{aligned}$$

where \(\mathcal C _b^2(\mathbb R ,\mathbb R )\) is the set of bounded twice continuously differentiable functions from \(\mathbb R \) to \(\mathbb R \). For \(k \leqslant k'\), one has \(\mathsf d _{1,k}\leqslant \mathsf d _{1,k'} \leqslant \mathsf d _{1,\infty } = \mathsf d _1\). Note that if \(f \in \mathcal C ^2_b(\mathbb R ,\mathbb R )\), then

$$\begin{aligned} \left| \int f \mathrm{d }\nu - \int f \mathrm{d }\nu '\right| \leqslant \Vert f'\Vert _\infty \ \mathsf d _{1,\Vert f''\Vert _\infty /\Vert f'\Vert _\infty }(\nu ,\nu '). \end{aligned}$$
(2.4)

In the sequel, if \(X\) follows the distribution \(\nu \) and \(Y\) the distribution \(\nu '\), we may write \(\mathsf d _1(X,Y)\) to denote \(\mathsf d _1(\nu ,\nu ')\), or also \(\mathsf d _1(X,F_{\nu '})\) if convenient. If \(X\) and \(Y\) are defined on the same probability space with probability measure \(P\) and associated expectation \(E\), then for any \(1\)-Lipschitz function \(f\), we have

$$\begin{aligned} \left| E[f(X)] - E[f(Y)]\right| \leqslant E|f(X) - f(Y)| \leqslant E |X-Y| , \end{aligned}$$

and hence

$$\begin{aligned} \mathsf d _1(X,Y) \leqslant E |X-Y|. \end{aligned}$$
(2.5)

Similarly, if \(X\) follows the distribution \(\nu \) and \(Y\) the distribution \(\nu '\), we write \(\mathsf d _{1,k}(X,Y), \mathsf d _{1,k}(X,F_{\nu '})\) or \(\mathsf d _{1,k}(\nu ,\nu ')\) as convenient.

3 Martingale CLT

For a square-integrable càdlàg martingale \((M_t)_{t \in [0,1]}\) defined with respect to the probability measure \(P\) and the right-continuous filtration \((\mathcal F _t)_{t \geqslant 0}\), we write \((\langle M \rangle _t)_{t \in [0,1]}\) for its predictable quadratic variation,

$$\begin{aligned} \Delta M(t) = M_t - \lim _{s \rightarrow t^-} M_s, \end{aligned}$$

and

$$\begin{aligned} L_{2p} = E\left[ \sum _{0 \leqslant t \leqslant 1} |\Delta M(t)|^{2p} \right] . \end{aligned}$$

Recall that we denote by \(\Phi \) the cumulative distribution function of the standard Gaussian random variable. In [24], the following is proved.

Theorem 3.1

([24]) For any \(p > 1\), there exists \(\overline{C}_p\) (independent of \(M\)) such that

$$\begin{aligned} \mathsf d _\infty (M_1,\Phi ) \leqslant \overline{C}_p \left( L_{2p}^{1/(2p+1)} + \Vert \langle M \rangle _1 - 1\Vert _p^{p/(2p+1)} \right) . \end{aligned}$$
(3.1)

Our first result consists in showing that one can get sharper bounds if one replaces the Kolmogorov distance by the (\(k\)-)Kantorovich distance in (3.1).

Theorem 3.2

For any \(p > 1\), there exists \(C_p\) (independent of \(M\)) such that

$$\begin{aligned} \mathsf d _1(M_1,\Phi ) \leqslant C_p L_{2p}^{1/(2p+1)} + 2 \Vert \langle M \rangle _1 - 1\Vert _1^{1/2}, \end{aligned}$$
(3.2)

and for any \(k \geqslant 0\),

$$\begin{aligned} \mathsf d _{1,k}(M_1,\Phi ) \leqslant C_p L_{2p}^{1/(2p+1)} + \frac{k}{2} L_{2p}^{1/p} + (k\vee 1) \Vert \langle M \rangle _1 - 1\Vert _1. \end{aligned}$$
(3.3)

Remark 3.3

Naturally, one has \(\Vert \langle M \rangle _1 - 1\Vert _1 \leqslant \Vert \langle M \rangle _1 - 1\Vert _p\), and the statements are only interesting when this quantity, as well as \(L_{2p}\), is small, so Theorem 3.2 indeed provides better rates of convergence than Theorem 3.1. It is shown in [36] that it is not possible to replace the exponent \(p/(2p+1)\) appearing on the term \(\Vert \langle M \rangle _1 - 1\Vert _p\) in the right-hand side of (3.1) by any larger exponent. It would be interesting to investigate how sharp (3.2) is in this respect. The term \(\Vert \langle M \rangle _1 - 1\Vert _1\) appearing on the right-hand side of (3.3) cannot be improved. Indeed, let \((B_s)_{s \geqslant 0}\) be a standard Brownian motion, and consider the martingale \(M_s = B_{(1+\varepsilon ) s}\). Since this martingale is continuous, \(L_{2p}\) vanishes, while \(\Vert \langle M \rangle _1 - 1\Vert _1 = \varepsilon \). On the other hand, the cosine function has first and second derivatives bounded by \(1\), and thus

$$\begin{aligned} \mathsf d _{1,1}(M_1,\Phi ) \geqslant E[\cos (B_1)] - E[\cos (M_1)] = e^{-1/2} - e^{-(1+\varepsilon )/2} \sim \frac{\varepsilon }{2\sqrt{e}} \qquad (\varepsilon \rightarrow 0), \end{aligned}$$

thus justifying the optimality of the exponent on \(\Vert \langle M \rangle _1 - 1\Vert _1\).
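The two expectations above are computed from the Gaussian characteristic function: \(B_s\) is centered Gaussian with variance \(s\), and \(M_1 = B_{1+\varepsilon }\) is centered Gaussian with variance \(1+\varepsilon \), so that

$$\begin{aligned} E[\cos (B_s)] = \mathrm{Re}\, E\left[ e^{i B_s}\right] = e^{-s/2}, \qquad \text {whence} \quad E[\cos (B_1)] - E[\cos (M_1)] = e^{-1/2}\left( 1 - e^{-\varepsilon /2}\right) , \end{aligned}$$

a quantity of order \(\varepsilon \) as \(\varepsilon \rightarrow 0\).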

Remark 3.4

A quantitative martingale CLT expressed in terms of the Kantorovich distance was already formulated in [38, Theorem 8.1.16]. The terms involved in the bound are however difficult to estimate in practical situations, in contrast to what is obtained in Theorem 3.2.

In order to prove Theorem 3.2, we will rely on the following non-uniform version of Theorem 3.1.

Theorem 3.5

([24, 25]) For any \(p > 1\), there exists \(\tilde{C}_p\) (independent of \(M\)) such that if \(L_{2p} + \Vert \langle M \rangle _1 - 1\Vert _p^{p} \leqslant 1\), then for any \(x \in \mathbb R \),

$$\begin{aligned} \left| P[M_1 \leqslant x] - \Phi (x) \right| \leqslant \frac{\tilde{C}_p}{1+|x|^{2p}} \left( L_{2p}^{1/(2p+1)} + \Vert \langle M \rangle _1 - 1\Vert _p^{p/(2p+1)} \right) . \end{aligned}$$

[25, Theorem 1] is the equivalent statement for discrete-time martingales. Theorem 3.5 can be derived from its discrete-time version by applying the approximation procedure explained in [24, Section 4]. (In [24], locally square-integrable martingales are considered, while we stick here with plainly square-integrable martingales. There is no loss of generality, however, since a locally square-integrable martingale is in fact square-integrable as soon as \(\Vert \langle M \rangle _1 - 1\Vert _1\) is finite; one can thus skip the localization procedure at the end of [24, Section 4].)

Proof of Theorem 3.2

We start by proving that there exists \(C_p\) (independent of \(M\)) such that (3.2) holds. We decompose the proof of this into three steps.

Step I.1. We first prove the claim assuming that

$$\begin{aligned} \langle M \rangle _1 = 1 \text { a.s.} \end{aligned}$$
(3.4)

and that \(L_{2p} \leqslant 1\). Under this condition, Theorem 3.5 ensures that

$$\begin{aligned} \left| P[M_1 \leqslant x] - \Phi (x) \right| \leqslant \frac{\tilde{C}_p}{1+|x|^{2p}} L_{2p}^{1/(2p+1)}. \end{aligned}$$

We thus have, after possibly enlarging \(\tilde{C}_p\),

$$\begin{aligned} \mathsf d _1(M_1,\Phi ) = \int \left| P[M_1 \leqslant x] - \Phi (x) \right| \mathrm{d }x \leqslant \tilde{C}_p L_{2p}^{1/(2p+1)}, \end{aligned}$$

which is the desired result.

Step I.2. We now no longer impose condition (3.4), but keep the assumption that \(L_{2p} \leqslant 1\). Following an idea probably due to [18], we introduce

$$\begin{aligned} \tau = \sup \{t \leqslant 1 : \langle M \rangle _t \leqslant 1\}. \end{aligned}$$

Note that \(\tau \) is a stopping time, since the filtration is right-continuous, and thus

$$\begin{aligned} \{ \tau \leqslant t \} = \bigcap _{\varepsilon > 0} \{ \langle M \rangle _{t+\varepsilon } > 1 \} \in \mathcal F _{t}. \end{aligned}$$

We define

$$\begin{aligned} \langle M \rangle _{\tau ^-} = \left| \begin{array}{ll} \langle M \rangle _1 &{} \quad \text {if } \langle M \rangle _1 \leqslant 1, \\ \lim _{t \rightarrow \tau ^-} \langle M \rangle _t &{}\quad \text {otherwise}, \end{array}\right. \end{aligned}$$

and

$$\begin{aligned} M_{\tau ^-} =\left| \begin{array}{ll} M_1 &{} \quad \text {if } \langle M \rangle _1 \leqslant 1, \\ \lim _{t \rightarrow \tau ^-} M_t &{}\quad \text {otherwise}. \end{array}\right. \end{aligned}$$

Note that \(\langle M \rangle _{\tau ^-} \leqslant 1\). Let \((B_s)_{s \geqslant 0}\) be a standard Brownian motion, independent of the martingale. We define

$$\begin{aligned} \tilde{M}_s = \left| \begin{array}{ll} M_s &{} \quad \text {if } 0 \leqslant s < \tau ,\\ M_{\tau ^-} &{} \quad \text {if } \tau \leqslant s \leqslant 1 ,\\ M_{\tau ^-} + B_{s-1} &{} \quad \text {if } 1 \leqslant s \leqslant 2-\langle M \rangle _{\tau ^-} ,\\ M_{\tau ^-} + B_{1-\langle M \rangle _{\tau ^-}} &{} \quad \text {if }2-\langle M \rangle _{\tau ^-} \leqslant s \leqslant 2.\\ \end{array}\right. \end{aligned}$$

By construction, \(\tilde{M}\) is a martingale, and

$$\begin{aligned} \langle \tilde{M} \rangle _2 - \langle \tilde{M} \rangle _1 = 1-\langle {M} \rangle _{\tau ^-}, \end{aligned}$$

hence \(\langle \tilde{M} \rangle _2 = 1\). Naturally, the fact that \(\tilde{M}\) is defined on \([0,2]\) instead of \([0,1]\) plays no role, and this martingale satisfies condition (3.4) (at time \(2\)). Writing

$$\begin{aligned} \tilde{L}_{2p} = E\left[ \sum _{0 \leqslant t \leqslant 2} |\Delta \tilde{M}(t)|^{2p} \right] , \end{aligned}$$

we clearly have \(\tilde{L}_{2p} \leqslant L_{2p} \leqslant 1\). We learn from the first step of the proof that

$$\begin{aligned} \mathsf d _1(\tilde{M}_2,\Phi ) \leqslant \tilde{C}_p \tilde{L}_{2p}^{1/(2p+1)} \leqslant \tilde{C}_p {L}_{2p}^{1/(2p+1)}. \end{aligned}$$
(3.5)

We now want to use the fact that

$$\begin{aligned} \mathsf d _1(M_1,\Phi ) \leqslant \mathsf d _1(M_1,\tilde{M}_2) + \mathsf d _1(\tilde{M}_2,\Phi ) \end{aligned}$$
(3.6)

to estimate \(\mathsf d _1(M_1,\Phi )\). In view of (2.5), we have

$$\begin{aligned} \mathsf d _1(M_1,\tilde{M}_2) \leqslant E\left[ \left| M_1 - \tilde{M}_2\right| \right] . \end{aligned}$$

Note that

$$\begin{aligned} M_1 - \tilde{M}_2 = M_1 - M_\tau + \Delta M(\tau ) \mathbf 1 _{\langle M \rangle _1 > 1} - B_{1-\langle M \rangle _{\tau ^-}}, \end{aligned}$$

and thus

$$\begin{aligned} E\left[ \left| M_1 - \tilde{M}_2\right| \right] \leqslant E\left[ \left| M_1 - M_\tau \right| \right] + E\left[ \left| \Delta M(\tau )\right| \right] + E\left[ \left| B_{1-\langle M \rangle _{\tau ^-}}\right| \right] . \end{aligned}$$

Let us write \(a_1 + a_2 + a_3\) for the latter sum, with obvious identifications. We bound the contribution of each of these terms successively.

$$\begin{aligned} a_1 \leqslant E\left[ (M_1 - M_\tau )^2 \right] ^{1/2} = E\left[ \langle M \rangle _1 - \langle M\rangle _\tau \right] ^{1/2}, \end{aligned}$$

since \(\tau \leqslant 1\) is a stopping time. Now, either \(\tau = 1\), in which case \(\langle M \rangle _1 - \langle M\rangle _\tau = 0\), or \(\tau < 1\), in which case \(\langle M \rangle _\tau \geqslant 1\). In both cases, we have

$$\begin{aligned} \langle M \rangle _1 - \langle M\rangle _\tau \leqslant \left| \langle M \rangle _1 - 1 \right| \!, \end{aligned}$$

and thus

$$\begin{aligned} a_1 \leqslant \Vert \langle M \rangle _1 - 1 \Vert _1^{1/2}. \end{aligned}$$

As for \(a_2\), we have

$$\begin{aligned} a_2 = E\left[ \left| \Delta M(\tau )\right| \right] \leqslant E\left[ \left| \Delta M(\tau )\right| ^{2p} \right] ^{1/(2p)} \leqslant L_{2p}^{1/(2p)}. \end{aligned}$$
(3.7)

For the third term, we have

$$\begin{aligned} a_3 = c \ E\left[ \left| 1-\langle M\rangle _{\tau ^-}\right| ^{1/2}\right] \leqslant c \ E\left[ \left| 1-\langle M\rangle _{\tau ^-}\right| \right] ^{1/2}, \end{aligned}$$

where \(c = E[|B_1|] \leqslant 1\). We decompose the last expectation as

$$\begin{aligned} E\left[ \left| 1-\langle M\rangle _{\tau ^-}\right| \ \mathbf 1 _{\langle M \rangle _1 \leqslant 1}\right] + E\left[ \left| 1-\langle M\rangle _{\tau ^-}\right| \ \mathbf 1 _{\langle M \rangle _1 > 1}\right] \!. \end{aligned}$$

The first term is bounded by \(\Vert 1-\langle M \rangle _1\Vert _1\), while the second term is smaller than

$$\begin{aligned} E\left[ \Delta \langle M \rangle (\tau ) \right] \leqslant E\left[ (\Delta M(\tau ))^2\right] \end{aligned}$$

(to see this, consult for instance the proof of [27, Theorem 4.2]). The latter is bounded by

$$\begin{aligned} E\left[ (\Delta M(\tau ))^{2p}\right] ^{1/p} \leqslant L_{2p}^{1/p}. \end{aligned}$$

To sum up, we have shown that

$$\begin{aligned} \mathsf d _1(M_1,\tilde{M}_2)&\leqslant \Vert \langle M \rangle _1 - 1 \Vert _1^{1/2} + L_{2p}^{1/(2p)} + \sqrt{\Vert 1-\langle M \rangle _1\Vert _1 + L_{2p}^{1/p}} \nonumber \\&\leqslant 2 \Vert \langle M \rangle _1 - 1 \Vert _1^{1/2} + 2 L_{2p}^{1/(2p)}. \end{aligned}$$
(3.8)

Since we assume that \(L_{2p} \leqslant 1\), we have \(L_{2p}^{1/(2p)} \leqslant L_{2p}^{1/(2p+1)}\), and equations (3.6), (3.5) and (3.8) give us that

$$\begin{aligned} \mathsf d _1(M_1, \Phi ) \leqslant (\tilde{C}_p+2) L_{2p}^{1/(2p+1)} + 2 \Vert \langle M \rangle _1 - 1 \Vert _1^{1/2} , \end{aligned}$$

which is what we wanted to prove.

Step I.3. It remains to consider the case when \(L_{2p} > 1\). It follows from (2.5) that

$$\begin{aligned} \mathsf d _1(M_1,\Phi ) \leqslant c + \Vert M_1\Vert _1, \end{aligned}$$

where \(c\) is the \(L_1\) norm of a standard Gaussian, \(c \leqslant 1\). Moreover,

$$\begin{aligned} \Vert M_1\Vert _1 \leqslant \Vert M_1\Vert _2 = \Vert \langle M \rangle _1\Vert _1^{1/2} \leqslant \left( 1+\Vert \langle M \rangle _1-1\Vert _1\right) ^{1/2}. \end{aligned}$$

As a consequence, it is always true that

$$\begin{aligned} \mathsf d _1(M_1,\Phi ) \leqslant 2+\Vert \langle M \rangle _1 - 1\Vert _1^{1/2}. \end{aligned}$$

The theorem is thus clearly true when \(L_{2p} > 1\) as soon as \(C_p \geqslant 2\), and this finishes the proof of (3.2).

We now proceed to show that there exists \(C_{p}\) (independent of \(M\) and \(k\)) such that (3.3) holds, and decompose the proof of this fact into two steps.

Step II.1. We assume first that \(L_{2p} \leqslant 1\), and consider again the martingale \(\tilde{M}\) as constructed in step I.2. Since \(\langle \tilde{M} \rangle _2 = 1\), we know from step I.1 that

$$\begin{aligned} \mathsf d _1(\tilde{M}_2,\Phi ) \leqslant \tilde{C}_p L_{2p}^{1/(2p+1)}. \end{aligned}$$
(3.9)

Let

$$\begin{aligned} \overline{M}_2 = M_\tau + B_{1-\langle M \rangle _{\tau ^-}}, \end{aligned}$$

and observe that

$$\begin{aligned} \tilde{M}_2 = \overline{M}_2 - \Delta M(\tau ) \mathbf 1 _{\langle M \rangle _1 > 1}. \end{aligned}$$

We have

$$\begin{aligned} \mathsf d _1(\overline{M}_2,\Phi ) \leqslant \mathsf d _1(\overline{M}_2,\tilde{M}_2) + \mathsf d _1(\tilde{M}_2,\Phi ). \end{aligned}$$

The first term on the right-hand side is smaller than \(E[|\Delta M (\tau )|]\) by (2.5), and we have seen in (3.7) that this is smaller than \(L_{2p}^{1/(2p)} \leqslant L_{2p}^{1/(2p+1)}\). Using also (3.9), we obtain

$$\begin{aligned} \mathsf d _{1,k}(\overline{M}_2,\Phi ) \leqslant \mathsf d _1(\overline{M}_2,\Phi ) \leqslant (\tilde{C}_p+1) L_{2p}^{1/(2p+1)}. \end{aligned}$$
(3.10)

Let \(f \in \mathcal C _b^2 (\mathbb R ,\mathbb R )\) be such that \(\Vert f'\Vert _\infty \leqslant 1\) and \(\Vert f''\Vert _\infty \leqslant k\). We will show that

$$\begin{aligned} \left| E[f(\overline{M}_2)] - E[f(M_\tau )] \right| \leqslant \frac{k}{2} \left( L_{2p}^{1/p} + \Vert \langle M \rangle _1 - 1 \Vert _1 \right) . \end{aligned}$$
(3.11)

Indeed, since \(f \in \mathcal C _b^2 (\mathbb R ,\mathbb R )\) and \(\Vert f''\Vert _\infty \leqslant k\), we have

$$\begin{aligned} \left| E\left[ f(\overline{M}_2) - f(M_\tau ) - (\overline{M}_2 - M_\tau ) f'(M_\tau ) \right] \right| \leqslant \frac{k}{2} E\left[ (\overline{M}_2 - M_\tau )^2\right] . \end{aligned}$$

But

$$\begin{aligned} E\left[ (\overline{M}_2 - M_\tau ) f'(M_\tau ) \right]&= E[B_{1-\langle M \rangle _{\tau ^-}}f'(M_\tau )] \\&= E\big [E[B_{1-\langle M \rangle _{\tau ^-}} \ | \ \mathcal F _{\tau }] \ f'(M_\tau )\big ], \end{aligned}$$

and \(E[B_{1-\langle M \rangle _{\tau ^-}} \ | \ \mathcal F _{\tau }] = 0\) since \(B\) and \(M\) are independent. On the other hand,

$$\begin{aligned} E\left[ (\overline{M}_2 - M_\tau )^2\right] = E[(B_{1-\langle M \rangle _{\tau ^-}})^2] = E[1-\langle M \rangle _{\tau ^-}], \end{aligned}$$

and we have seen in step I.2, while treating the term \(a_3\), that

$$\begin{aligned} E[1-\langle M \rangle _{\tau ^-}] \leqslant \Vert \langle M \rangle _1 - 1 \Vert _1 + L_{2p}^{1/p}. \end{aligned}$$

As a consequence, (3.11) is proved, and thus

$$\begin{aligned} \mathsf d _{1,k}(M_\tau ,\overline{M}_2) \leqslant \frac{k}{2} \left( L_{2p}^{1/p} + \Vert \langle M \rangle _1 - 1 \Vert _1 \right) . \end{aligned}$$
(3.12)

We now show that

$$\begin{aligned} \mathsf d _{1,k}(M_\tau ,M_1) \leqslant \frac{k}{2} \Vert \langle M \rangle _1 - 1 \Vert _1 \end{aligned}$$
(3.13)

using the same technique. We write

$$\begin{aligned} \left| E\left[ f(M_1) - f(M_\tau ) - (M_1 - M_\tau ) f'(M_\tau ) \right] \right| \leqslant \frac{k}{2} E\left[ (M_1 - M_\tau )^2\right] , \end{aligned}$$

and observe that

$$\begin{aligned} E\left[ ({M}_1 - M_\tau ) f'(M_\tau ) \right] = E\big [E[({M}_1 - M_\tau ) \ | \ \mathcal F _{\tau }] \ f'(M_\tau )\big ] = 0, \end{aligned}$$

since \(M\) is a martingale and \(\tau \) a stopping time. On the other hand, we have seen while treating the term \(a_1\) in step I.2 that

$$\begin{aligned} E\left[ (M_1 - M_\tau )^2\right] \leqslant \Vert \langle M \rangle _1 - 1 \Vert _1, \end{aligned}$$

and thus (3.13) is proved. Combining (3.10), (3.12) and (3.13), we thus obtain

$$\begin{aligned} \mathsf d _{1,k}(M_1,\Phi ) \leqslant \left( \tilde{C}_p + 1\right) L_{2p}^{1/(2p+1)} + \frac{k}{2} L_{2p}^{1/p} + k \Vert \langle M \rangle _1 - 1 \Vert _1, \end{aligned}$$

and this proves (3.3) for \(L_{2p} \leqslant 1\).

Step II.2. We now conclude by considering the case when \(L_{2p} > 1\). We learn from step I.3 that

$$\begin{aligned} \mathsf d _{1,k}(M_1,\Phi ) \leqslant 2+\Vert \langle M \rangle _1 - 1\Vert _1^{1/2}. \end{aligned}$$

Since \(\sqrt{x} \leqslant 1 + x/2\) for any \(x \geqslant 0\), we obtain

$$\begin{aligned} \mathsf d _{1,k}(M_1,\Phi ) \leqslant 3 + \frac{1}{2} \Vert \langle M \rangle _1 - 1\Vert _1, \end{aligned}$$

and thus relation (3.3) holds when \(L_{2p} > 1\), provided we choose \(C_{p} \geqslant 3\). \(\square \)
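Before turning to the random walk, let us note that the rate in Theorem 3.2 is easy to probe numerically in the simplest discrete example (presented as an illustration only, ignoring the continuous-time setting; all names in the sketch are ours): for \(M_1 = S_n/\sqrt{n}\) with \(S_n\) a sum of \(n\) independent \(\pm 1\) signs, \(\langle M \rangle _1 = 1\) and \(L_{2p} = n^{1-p}\), so (3.2) predicts a decay of \(\mathsf d _1(M_1,\Phi )\) close to \(n^{-1/2}\) when \(p\) is large. Using the representation of the \(\mathsf d _1\) distance on \(\mathbb R \) as the \(L^1\) distance between cumulative distribution functions, the distance can be computed exactly from the binomial distribution:

```python
from statistics import NormalDist
import bisect
import math

def d1_clt(n, R=6.0, m=4000):
    """d_1 distance between S_n / sqrt(n) (S_n a sum of n independent +-1
    signs) and a standard Gaussian, via d_1 = int |F_n(x) - Phi(x)| dx."""
    phi = NormalDist().cdf
    pmf = [math.comb(n, k) / 2 ** n for k in range(n + 1)]   # P(k plus signs)
    atoms = [(2 * k - n) / math.sqrt(n) for k in range(n + 1)]
    cum = [0.0]
    for p in pmf:                        # cum[j] = P(at most j - 1 plus signs)
        cum.append(cum[-1] + p)
    h = 2 * R / m
    tot = 0.0
    for i in range(m):                   # midpoint rule on [-R, R]
        x = -R + (i + 0.5) * h
        F = cum[bisect.bisect_right(atoms, x)]   # CDF of S_n / sqrt(n) at x
        tot += abs(F - phi(x))
    return tot * h
```

One indeed observes a decay of order \(n^{-1/2}\), consistent with the lattice-spacing heuristic.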

4 The random walk among random conductances

Let \(0 < \alpha \leqslant \beta < + \infty \), and \(\Omega = [\alpha ,\beta ]^\mathbb B \). For any family \(\omega = (\omega _e)_{e \in \mathbb B } \in \Omega \), we consider the Markov process \((X_t)_{t \geqslant 0}\) whose jump rate between \(x\) and a neighbour \(y\) is given by \(\omega _{x,y}\). We write \(\mathbf P ^\omega _x\) for the law of this process starting from \(x \in \mathbb Z ^d\), and \(\mathbf E ^\omega _x\) for its associated expectation. Its infinitesimal generator is \(L^\omega \) defined in (1.1). We assume that the \((\omega _e)_{e \in \mathbb B }\) are themselves i.i.d. random variables under the measure \(\mathbb P \) (with associated expectation \(\mathbb E \)). We write \(\overline{\mathbb{P }} = \mathbb P \mathbf P ^\omega _0\) for the annealed measure. It was shown in [30] that under \(\overline{\mathbb{P }}\) and as \(\varepsilon \) tends to \(0\), the process \(\sqrt{\varepsilon } X_{\varepsilon ^{-1} t}\) converges to a Brownian motion, whose covariance matrix we write \(\overline{A}\) (in [39], it is shown that under our present assumption of uniform ellipticity, the invariance principle holds under \(\mathbf P ^\omega _0\) for almost every \(\omega \)).
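As a side remark, the dynamics just described is straightforward to simulate: since the conductances are i.i.d., they can be drawn lazily as edges are first examined. The following sketch (an illustration only; the function name and parameter choices are ours, with \(d = 2\) and conductances uniform on \([\alpha ,\beta ]\)) samples \(X_{t}\), the environment being redrawn at each call, so that averages over calls approximate the annealed measure:

```python
import random

def sample_walk(t_max, alpha=0.5, beta=2.0, d=2, rng=random):
    """Position X_{t_max} of the walk among i.i.d. conductances,
    drawn lazily on edges as they are first examined (illustration only)."""
    cond = {}
    def w(x, y):
        e = (min(x, y), max(x, y))           # undirected edge key
        if e not in cond:
            cond[e] = rng.uniform(alpha, beta)
        return cond[e]
    x, t = (0,) * d, 0.0
    while True:
        nbrs = [tuple(x[i] + s * (i == j) for i in range(d))
                for j in range(d) for s in (1, -1)]
        rates = [w(x, y) for y in nbrs]
        total = sum(rates)
        t += rng.expovariate(total)          # exponential holding time
        if t > t_max:
            return x
        u, acc = rng.uniform(0.0, total), 0.0
        for y, r in zip(nbrs, rates):        # jump to y w.p. w(x,y)/total
            acc += r
            if u <= acc:
                x = y
                break
```

Averaging \(|X_t|^2\) over independent runs exhibits the diffusive scaling behind the invariance principle.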

Let \(\xi \in \mathbb R ^d\) be a vector of unit \(L^2\) norm. The purpose of this section is to give sharp estimates on the \(k\)-Kantorovich distance between \(\xi \cdot X_t/\sqrt{t}\) and \(\Phi _{\sigma (\xi )}\), where we write \(\Phi _\sigma \) to denote the cumulative distribution function of a Gaussian random variable with variance \(\sigma ^2\), and \(\sigma (\xi ) = (\xi \cdot \overline{A} \xi )^{1/2}\).

Theorem 4.1

For any \(\delta > 0\), there exists a constant \(C\) (which may depend on the dimension) such that for any \(k \geqslant 0\) and any \(\xi \) of unit norm, one has

$$\begin{aligned} \mathsf d _{1,k}\left( \frac{\xi \cdot X_t}{\sqrt{t}}, \Phi _{\sigma (\xi )}\right) \leqslant C \ (k \vee 1) \ \Psi _{q,\delta }(t^{-1}) \end{aligned}$$
(4.1)

for some \(q \geqslant 0\), where in the left-hand side, \(\xi \cdot X_t/\sqrt{t}\) stands for the distribution of this random variable under the measure \(\overline{\mathbb{P }}\), and where \(\Psi _{q,\delta }\) was defined in (1.3).

Remark 4.2

When \(d \geqslant 3\), the exponent of decay in (4.1) can thus be made arbitrarily close to \(1/2\), and this is the exponent one gets when considering sums of i.i.d. random variables with finite third moment.

Remark 4.3

By the same reasoning, one can also prove that there exist constants \(C\) (which may depend on the dimension) and \(q\) such that, for any \(\xi \) of unit norm, one has

$$\begin{aligned} \mathsf d _1\left( \frac{\xi \cdot X_t}{\sqrt{t}}, \Phi _{\sigma (\xi )}\right) \leqslant C \left| \begin{array}{ll} t^{-1/8} &{} \quad \text {if } d = 1, \\ \log _+^q(t) \ t^{-1/8} &{} \quad \text {if } d = 2, \\ \log _+^{1/4}(t) \ t^{-1/4} &{} \quad \text {if } d = 3, \\ t^{-1/4} &{} \quad \text {if } d \geqslant 4, \end{array}\right. \end{aligned}$$

where again \(\xi \cdot X_t/\sqrt{t}\) stands for the distribution of this random variable under the measure \(\overline{\mathbb{P }}\).

The proof of Theorem 4.1 follows a line of reasoning similar to that of [35, Theorem 2.1]. From now on, we fix \(\xi \in \mathbb R ^d\) of unit norm. The starting point is to approximate \(\xi \cdot X_t\) by a martingale, whose construction we now recall. To begin with, let us write \((\theta _x)_{x \in \mathbb Z ^d}\) to denote the action of translation of \(\mathbb Z ^d\) on the space of environments \(\Omega \), so that for \(\omega \in \Omega \) and \(x,y,z \in \mathbb Z ^d\) with \(y \sim z\),

$$\begin{aligned} (\theta _x \ \omega )_{y,z} = \omega _{x+y,x+z}. \end{aligned}$$
(4.2)

Let \(\mathcal L \) be the operator acting on \(L^2(\Omega , \mathbb P )\) by

$$\begin{aligned} \mathcal L f (\omega ) = \sum _{|z| = 1} \omega _{0,z} (f(\theta _z \ \omega ) - f(\omega )). \end{aligned}$$

This operator comes out naturally as the infinitesimal generator of the Markov process of the environment viewed by the particle (i.e. the process \(t \mapsto \theta _{X_t} \ \omega \)). One can check that \(-\mathcal L \) is a positive self-adjoint operator on \(L^2(\Omega ,\mathbb P )\). We let

$$\begin{aligned} \mathfrak d (\omega ) = \sum _{|z| = 1} \omega _{0,z} \ \xi \cdot z \ \in L^2(\Omega ,\mathbb P ) \end{aligned}$$

be the local drift in the direction \(\xi \), and for every \(\mu > 0\), we define \(\phi _\mu \in L^2(\Omega ,\mathbb P )\) to be such that

$$\begin{aligned} (\mu - \mathcal L ) \phi _\mu = \mathfrak d . \end{aligned}$$

The parameter \(\mu > 0\) should be thought of as small (ideally, one would like to take it to be zero, but this is not possible in dimension \(2\)). We decompose \(\xi \cdot X_t\) as the sum \(M_\mu (t) + R_\mu (t)\), where

$$\begin{aligned} M_\mu (t) = \xi \cdot X_t + \phi _\mu (\omega (t)) - \phi _\mu (\omega (0)) - \mu \int _0^t \phi _\mu (\omega (s)) \ \mathrm{d }s, \end{aligned}$$
(4.3)

and

$$\begin{aligned} R_\mu (t) = - \phi _\mu (\omega (t)) + \phi _\mu (\omega (0)) + \mu \int _0^t \phi _\mu (\omega (s)) \ \mathrm{d }s. \end{aligned}$$
(4.4)

The next proposition collects several results, mostly from [35], that will be useful for our purpose.

Theorem 4.4

The process \((M_\mu (t))_{t \geqslant 0}\) is a square-integrable martingale under \(\overline{\mathbb{P }}\) (with respect to the natural filtration associated to \((X_t)_{t \geqslant 0}\)). Let \(\sigma _\mu = \overline{\mathbb{E }}[(M_\mu (1))^2]^{1/2}\). There exist constants \(C\) and \(q\) such that for any \(\mu > 0\) and \(t > 0\), the following three estimates hold:

$$\begin{aligned} \overline{\mathbb{E }}\left[ \left( \frac{\langle M_\mu \rangle _t}{t} - \sigma _\mu ^2 \right) ^2 \right]&\leqslant C \left| \begin{array}{ll} \log _+^q(\mu ^{-1}) \left( 1 / \sqrt{t} + \mu ^2 \right) &{} \quad \text {if } d = 2 , \\ \log _+(t)/t + \mu ^2 &{} \quad \text {if } d = 3, \\ 1/t + \mu ^2 &{} \quad \text {if } d \geqslant 4, \end{array}\right. \end{aligned}$$
(4.5)
$$\begin{aligned} \overline{\mathbb{E }}[(R_{1/t}(t))^2]&\leqslant C \left| \begin{array}{ll} \log _+^q(t) &{} \quad \text {if } d = 2, \\ 1 &{} \quad \text {if } d \geqslant 3. \end{array} \right. \end{aligned}$$
(4.6)
$$\begin{aligned} \big | \sigma _\mu - \sigma (\xi ) \big |&\leqslant C \left| \begin{array}{ll} \mu \log _+^q(\mu ^{-1}) &{} \quad \text {if } d = 2,\\ \mu ^{3/2} &{} \quad \text {if } d = 3,\\ \mu ^2 \log _+(\mu ^{-1}) &{} \quad \text {if } d = 4,\\ \mu ^2 &{} \quad \text {if } d \geqslant 5. \end{array} \right. \end{aligned}$$
(4.7)

Moreover, for every integer \(p \geqslant 1\), there exist constants \(C\) and \(q\) such that for any \(\mu > 0\) and \(t > 0\), one has

$$\begin{aligned} \frac{1}{t^p} \overline{\mathbb{E }}\left[ \sum _{0 \leqslant s \leqslant t} (\Delta M_\mu (s))^{2p} \right] \leqslant C \left| \begin{array}{ll} \log _+^q(\mu ^{-1}) \ t^{-p+1} &{} \quad \text {if } d = 2, \\ t^{-p+1} &{} \quad \text {if } d \geqslant 3. \end{array}\right. \end{aligned}$$
(4.8)

In these four estimates, the constants do not depend on the vector \(\xi \in \mathbb R ^d\) of unit norm.

Inequality (4.7) was proved in [22, Theorem 1] (see also [21, Theorem 3 with \(k = 1\)] for a slightly different point of view). Inequalities (4.5) and (4.6) correspond respectively to [35, (3.10) and Proposition 3.4]. The last inequality with \(p = 2\) corresponds to [35, (3.11)]; the extension to arbitrary \(p\) is straightforward.

Proof of Theorem 4.1

We first treat the case \(d \geqslant 2\). We have

$$\begin{aligned}&\mathsf d _{1,k}\left( \frac{\xi \cdot X_t}{\sqrt{t}}, \Phi _{\sigma (\xi )}\right) \\&\quad \leqslant \mathsf d _{1,k}\left( \frac{\xi \cdot X_t}{\sqrt{t}}, \frac{M_\mu (t)}{\sqrt{t}}\right) + \mathsf d _{1,k}\left( \frac{M_\mu (t)}{\sqrt{t}}, \Phi _{\sigma _\mu }\right) + \mathsf d _{1,k}\left( \Phi _{\sigma _\mu }, \Phi _{\sigma (\xi )}\right) \!, \end{aligned}$$

with the understanding that random variables stand in place of their respective distributions under the measure \(\overline{\mathbb{P }}\). Let us write the three terms in the right-hand side above as \(b_1 + b_2 + b_3\), and proceed to evaluate each of these terms for the specific choice \(\mu = 1/t\). Considering (2.5), we can bound the term \(b_1\) by

$$\begin{aligned} \mathsf d _{1}\left( \frac{\xi \cdot X_t}{\sqrt{t}}, \frac{M_{1/t}(t)}{\sqrt{t}}\right) \leqslant \overline{\mathbb{E }}\left[ \frac{|R_{1/t}(t)|}{\sqrt{t}}\right] \leqslant \frac{\overline{\mathbb{E }}[(R_{1/t}(t))^2]^{1/2}}{\sqrt{t}}, \end{aligned}$$

and inequality (4.6) gives us adequate control of this upper bound.

To handle the term \(b_3\), consider a standard Gaussian random variable \(\mathcal N \). Then \(\sigma \mathcal N \) has \(\Phi _{\sigma }\) as its cumulative distribution function, hence

$$\begin{aligned} \mathsf d _{1,k}\left( \Phi _{\sigma }, \Phi _{\sigma '}\right) \leqslant \mathsf d _1\left( \Phi _{\sigma }, \Phi _{\sigma '}\right) \leqslant E[|\sigma \mathcal N - \sigma ' \mathcal N |] = E[|\mathcal N |] \ |\sigma - \sigma '|. \end{aligned}$$

Since \(E[|\mathcal N |] \leqslant 1\), the term \(b_3\) is bounded by \(|\sigma _{1/t} - \sigma (\xi )|\). We can thus use inequality (4.7) (with \(\mu = 1/t\)), which is much better than what we need for our purpose.
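Incidentally, the bound used for \(b_3\) is in fact an equality: the coupling \((\sigma \mathcal N , \sigma ' \mathcal N )\) is comonotone, so \(\mathsf d _1(\Phi _\sigma ,\Phi _{\sigma '}) = \sqrt{2/\pi } \ |\sigma - \sigma '|\). This can be checked numerically from the quantile representation of the \(\mathsf d _1\) distance on the real line (the sketch and its names are ours):

```python
from statistics import NormalDist
import math

def d1_gaussians(sigma, sigma_p, n=100_000):
    """d_1 distance between N(0, sigma^2) and N(0, sigma_p^2), from the
    quantile formula d_1 = int_0^1 |F^{-1}(u) - G^{-1}(u)| du (midpoint rule)."""
    inv = NormalDist().inv_cdf
    tot = 0.0
    for i in range(n):
        q = inv((i + 0.5) / n)       # common quantile of the two Gaussians
        tot += abs(sigma * q - sigma_p * q)
    return tot / n
```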

We now turn to the term \(b_2\). For any \(p > 1\), we introduce

$$\begin{aligned} L_{2p}(t) = \frac{1}{t^{p}} \ \overline{\mathbb{E }}\left[ \sum _{0 \leqslant s \leqslant t} |\Delta M_{1/t}(s)|^{2p} \right] . \end{aligned}$$

Theorem 3.2 tells us that if \(L_{2p}(t) \leqslant 1\), then \(b_2\) is smaller than

$$\begin{aligned} \left( C_p + \frac{k}{2}\right) (L_{2p}(t))^{1/(2p+1)} + (k \vee 1) \left\| \frac{\langle M_{1/t}\rangle _t}{t} - \sigma _{1/t}^2\right\| _1. \end{aligned}$$

Inequality (4.8) ensures that

$$\begin{aligned} L_{2p}(t) \leqslant C \left| \begin{array}{ll} \log _+^q(t) \ t^{-p+1} &{} \quad \text {if } d = 2, \\ t^{-p+1} &{} \quad \text {if } d \geqslant 3, \end{array} \right. \end{aligned}$$

for some constants \(C\) and \(q\) depending on \(p\). In particular, it is always true that \(L_{2p}(t)\) tends to \(0\) as \(t\) tends to infinity. We fix \(p\) large enough so that

$$\begin{aligned} \frac{p-1}{2p+1} > \frac{1}{2} - \delta . \end{aligned}$$
(4.9)

With such a choice for \(p\), we have \((L_{2p}(t))^{1/(2p+1)} = o(t^{\delta -1/2})\).

Finally, inequality (4.5) gives us that

$$\begin{aligned} \left\| \frac{\langle M_{1/t}\rangle _t}{t} - \sigma _{1/t}^2\right\| _2^2 \leqslant C \left| \begin{array}{ll} \log _+^q(t) \ t^{-1/2} &{} \quad \text {if } d = 2, \\ \log _+(t) \ t^{-1} &{} \quad \text {if } d = 3, \\ t^{-1} &{} \quad \text {if } d \geqslant 4. \end{array}\right. \end{aligned}$$

Since

$$\begin{aligned} \left\| \frac{\langle M_{1/t}\rangle _t}{t} - \sigma _{1/t}^2\right\| _1 \leqslant \left\| \frac{\langle M_{1/t}\rangle _t}{t} - \sigma _{1/t}^2\right\| _2, \end{aligned}$$

this finishes the proof of Theorem 4.1 for \(d\geqslant 2\) and \(t\) large enough, and it is easy to see that the left-hand side of (4.1) is bounded for smaller \(t\). The one-dimensional case is obtained in a similar way, following [35, Section 9]. \(\square \)

5 Homogenization

We consider the discrete parabolic equation with random coefficients

where \(f: \mathbb Z ^d \rightarrow \mathbb R \) is the initial condition, \(L^\omega \) is the operator defined in (1.1), and by \(L^\omega u(t,x)\), we understand \(L^\omega u(t,\cdot ) (x)\). Note that \(L^\omega \) is the discrete analogue of a divergence-form operator.

For a fixed \(\omega \in \Omega \), we say that \(u\) is a solution of (DPE\(^\omega \)) if it is continuous on \([0,+\infty ) \times \mathbb Z ^d\), has continuous time derivative there (in other words, \(u(\cdot ,x)\) is in \(\mathcal C ^1(\mathbb R _+,\mathbb R )\) for every \(x \in \mathbb Z ^d\)), and satisfies the identities displayed in (DPE\(^\omega \)).

Proposition 5.1

For any \(\omega \in \Omega \) and any bounded initial condition \(f\), there exists a unique bounded solution \(u\) of (DPE\(^\omega \)), and it is given by

$$\begin{aligned} u(t,x) = \mathbf E ^\omega _x[f(X_t)]. \end{aligned}$$
(5.1)

This is a very well known result. Checking that (5.1) is indeed a solution is a direct consequence of the definition of the Markov chain. To see uniqueness, take \(\tilde{u}\) a bounded solution of (DPE\(^\omega \)). Letting \(\tilde{M}_s = \tilde{u}(t-s,X_s)\), one can show that \((\tilde{M}_s)_{0 \leqslant s \leqslant t}\) is a martingale under \(\mathbf P ^\omega _x\) for any \(x \in \mathbb Z ^d\), and as a consequence,

$$\begin{aligned} \tilde{u}(t,x) = \mathbf E ^\omega _x[\tilde{M}_0] = \mathbf E ^\omega _x[\tilde{M}_t] = \mathbf E ^\omega _x[\tilde{u}(0,X_t)] = \mathbf E ^\omega _x[f(X_t)], \end{aligned}$$

which is the function defined in (5.1).
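The representation (5.1) also lends itself to a numerical cross-check: integrating (DPE\(^\omega \)) directly in time and averaging \(f(X_t)\) over simulated trajectories must give the same value. The sketch below (an illustration only, on a one-dimensional chain truncated by assigning zero conductance to outside edges, so that the walk reflects; all names are ours) implements both sides:

```python
import random

def make_omega(n, alpha=0.5, beta=2.0, seed=1):
    """Conductances omega[i] on the edges (i, i+1) of the chain {-n, ..., n}."""
    rng = random.Random(seed)
    return {i: rng.uniform(alpha, beta) for i in range(-n, n)}

def solve_dpe(omega, f, t, n, dt=2e-4):
    """Forward-Euler integration of du/dt = L^omega u on {-n, ..., n}
    (edges outside the chain carry zero conductance)."""
    u = {x: f(x) for x in range(-n, n + 1)}
    for _ in range(int(t / dt)):
        u = {x: u[x]
                + dt * (omega.get(x, 0.0) * (u.get(x + 1, u[x]) - u[x])
                        + omega.get(x - 1, 0.0) * (u.get(x - 1, u[x]) - u[x]))
             for x in range(-n, n + 1)}
    return u

def monte_carlo(omega, f, t, x0, n_samples, rng):
    """Estimate E^omega_{x0}[f(X_t)] by simulating the walk on the chain."""
    tot = 0.0
    for _ in range(n_samples):
        x, s = x0, 0.0
        while True:
            wr, wl = omega.get(x, 0.0), omega.get(x - 1, 0.0)
            s += rng.expovariate(wr + wl)    # holding time at x
            if s > t:
                break
            x = x + 1 if rng.uniform(0.0, wr + wl) < wr else x - 1
        tot += f(x)
    return tot / n_samples
```

Both computations approximate the same \(u(t,x)\), up to time-discretization and Monte Carlo errors.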

For a symmetric positive-definite matrix \(\overline{A}\), we consider the equation (CPE) given in the introduction. We say that \(\overline{u}\) is a solution of (CPE) if it is continuous on \(\mathbb R _+ \times \mathbb R ^d\), has a continuous first derivative in the time variable and continuous first and second derivatives in the space variable on \((0,+\infty ) \times \mathbb R ^d\), and satisfies the identities displayed in (CPE).

Proposition 5.2

For any bounded continuous initial condition \(f\), there exists a unique bounded solution \(\overline{u}\) of (CPE), and it is given by

$$\begin{aligned} \overline{u}(t,x) = \mathbf E _x[f(B_t)], \end{aligned}$$
(5.2)

where, under the measure \(\mathbf P _x, B_t\) is a Brownian motion with covariance matrix \(\overline{A}\) that starts at \(x\).

Again, this result is standard. It is proved in the same way as Proposition 5.1, with the help of Itô’s formula.

Remark 5.3

The boundedness assumption in Propositions 5.1 and 5.2 could be relaxed to subexponential growth. More precisely, let \(f : \mathbb Z ^d \rightarrow \mathbb R \) be such that for every \(\alpha > 0\), \(|f(x)| = O(e^{\alpha |x|})\). Then there exists a unique solution \(u\) of (DPE\(^\omega \)) such that, for every \(\alpha > 0\) and every \(t \geqslant 0\), \(\sup _{s \leqslant t} |u(s,x)| = O(e^{\alpha |x|})\). The boundedness condition was merely chosen for convenience.

We now define rescaled solutions of the parabolic equation with random coefficients. For a bounded continuous function \(f:\mathbb R ^d \rightarrow \mathbb R \), we let \(u^{(\varepsilon )}\) be the bounded solution of (DPE\(^\omega \)) with initial condition given by the function \(x \mapsto f(\varepsilon x)\), and for any \(t\geqslant 0\) and \(x \in \mathbb R ^d\), we let

$$\begin{aligned} u_\varepsilon (t,x) = u^{(\varepsilon )}(\varepsilon ^{-2} t, \lfloor \varepsilon ^{-1}x \rfloor ) = \mathbf E ^\omega _{\lfloor \varepsilon ^{-1} x \rfloor } [f(\varepsilon X_{\varepsilon ^{-2} t})]. \end{aligned}$$
(5.3)

It is well understood (see for instance [4, Chapter 3]) that the probabilistic approach yields pointwise convergence of \(u_\varepsilon \) to the solution of the homogenized problem. The following result is folklore (see also [33] where the homogenization of random operators in continuous space is obtained using the probabilistic approach).

Theorem 5.4

There exists a symmetric positive-definite matrix \(\overline{A}\) (independent of \(f\)) such that for every \(t \geqslant 0\) and \(x \in \mathbb R ^d\), we have

$$\begin{aligned} u_\varepsilon (t,x) \xrightarrow [\varepsilon \rightarrow 0]{\text {(prob.)}} \overline{u}(t,x), \end{aligned}$$
(5.4)

where \(\overline{u}\) is the bounded solution of (CPE) with initial condition \(f\).

Proof

Recall that we write \((\theta _x)\) to denote the translations on \(\Omega \), see (4.2). The distribution of \(X\) under \(\mathbf P ^\omega _x\) is the same as the one of \(X+x\) under \(\mathbf P ^{\theta _x \omega }_0\) (both are Markov processes with the same initial condition and the same transition rates). Using this observation in (5.3), we obtain that

$$\begin{aligned} u_\varepsilon (t,x) = \mathbf E ^{\theta _{\lfloor \varepsilon ^{-1} x \rfloor } \omega }_0 [f(\varepsilon X_{\varepsilon ^{-2} t} + x_\varepsilon )], \end{aligned}$$

where \(x_\varepsilon = \varepsilon \lfloor \varepsilon ^{-1} x \rfloor \).

Since the measure \(\mathbb P \) is invariant under translations, \(u_\varepsilon (t,x)\) has the same distribution as

$$\begin{aligned} \mathbf E ^\omega _0 [f(\varepsilon X_{\varepsilon ^{-2} t} + x_\varepsilon )]. \end{aligned}$$
(5.5)

It is proved in [15, 30] that for some symmetric positive-definite \(\overline{A}\) (independent of \(f\)), the quantity in (5.5) converges in probability to \(\mathbf E _0[f(B_t +x)]\) as \(\varepsilon \) tends to \(0\), where \(B\) is a Brownian motion with covariance matrix \(\overline{A}\). \(\square \)

Remark 5.5

It would be interesting to replace the convergence in probability in (5.4) by an almost sure convergence. Note that almost sure convergence for \(x = 0\) is equivalent to an almost sure central limit theorem for the random walk, and this is proved in [39]. Theorem 5.4 contrasts with, for instance, [28, Theorem 7.4], where weak convergence of an analogue of \(u_\varepsilon \) is proved, but for almost every environment.

We start the proof of Theorem 1.1 with two lemmas with a Fourier-analytic flavour.

Lemma 5.6

Let \(Z\) be a random variable with distribution \(\nu \), let \(\mathcal N \) be a standard \(d\)-dimensional Gaussian random variable independent of \(Z\), and let \(\sigma > 0\). If \(f\) is in \(L^2(\mathbb R ^d)\), then

$$\begin{aligned} E[f(Z + \sigma \mathcal N )] = (2\pi )^{-d} \int _\mathbb{R ^d} \exp \left( -\frac{\sigma ^2 |\xi |^2}{2}\right) \hat{f}(\xi ) \hat{\nu }(\xi ) \ \mathrm{d }\xi , \end{aligned}$$

where

$$\begin{aligned} \hat{f}(\xi ) = \int e^{i \xi \cdot x} f(x) \ \mathrm{d }x, \end{aligned}$$
(5.6)

and

$$\begin{aligned} \hat{\nu }(\xi ) = \int e^{- i \xi \cdot x} \ \mathrm{d }\nu (x). \end{aligned}$$

Remark 5.7

The definition of the Fourier transform given in (5.6) only makes sense for \(f \in L^1(\mathbb R ^d)\), but as is well known, the Fourier transform can then be extended to functions in \(L^2(\mathbb R ^d)\) by continuity.

Proof

Recall that we always assume \(f\) to be bounded. In order to prove the lemma, it suffices to prove it for functions \(f \in L^1(\mathbb R ^d)\), since we can then conclude by a density argument.

Let us write

$$\begin{aligned} g_\sigma (x) = \frac{1}{(2\pi \sigma ^2)^{d/2}} \exp \left( - \frac{|x|^2}{2 \sigma ^2} \right) . \end{aligned}$$

Note first that

$$\begin{aligned} \hat{g}_{1/\sigma }(x) = \exp \left( - \frac{|x|^2}{2 \sigma ^2} \right) = (2\pi \sigma ^2)^{d/2} g_\sigma (x). \end{aligned}$$
(5.7)

The distribution of \(Z + \sigma \mathcal N \) has a density (with respect to Lebesgue measure) at point \(z\) which is given by

$$\begin{aligned} \int g_\sigma (z-x) \ \mathrm{d }\nu (x)&\stackrel{\text {(5.7)}}{=} {(2\pi \sigma ^2)^{-d/2}} \int \hat{g}_{1/\sigma }(z-x) \ \mathrm{d }\nu (x) \\&= {(2\pi \sigma ^2)^{-d/2}} \int e^{i \xi \cdot (z-x)} g_{1/\sigma }(\xi ) \ \mathrm{d }\xi \ \mathrm{d }\nu (x) \\&= {(2\pi \sigma ^2)^{-d/2}} \int e^{i \xi \cdot z} g_{1/\sigma }(\xi ) \hat{\nu }(\xi ) \ \mathrm{d }\xi . \end{aligned}$$

As a consequence (and using the fact that \(\hat{\nu }\) is bounded), we have

$$\begin{aligned} E[f(Z + \sigma \mathcal N )]&= {(2\pi \sigma ^2)^{-d/2}} \int f(z) e^{i \xi \cdot z} g_{1/\sigma }(\xi ) \hat{\nu }(\xi ) \ \mathrm{d }\xi \ \mathrm{d }z\\&= {(2\pi \sigma ^2)^{-d/2}} \int g_{1/\sigma }(\xi ) \hat{f}(\xi ) \hat{\nu }(\xi ) \ \mathrm{d }\xi . \end{aligned}$$

Since

$$\begin{aligned} {(2\pi \sigma ^2)^{-d/2}} g_{1/\sigma }(\xi ) = (2\pi )^{-d} \exp \left( - \frac{\sigma ^2 |\xi |^2}{2} \right) , \end{aligned}$$

this proves the lemma. \(\square \)
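The Gaussian Fourier identity (5.7), which drives the computation above, can be checked numerically in dimension \(1\), with the convention (5.6) for the Fourier transform (the sketch and its names are ours):

```python
import math

def g(s, x):
    """Centred Gaussian density with standard deviation s."""
    return math.exp(-x * x / (2 * s * s)) / math.sqrt(2 * math.pi * s * s)

def fourier_g(s, x, R=40.0, n=200_000):
    """hat{g}_s(x) = int e^{i x xi} g_s(xi) d xi, computed by the midpoint
    rule on [-R, R]; the imaginary part vanishes by symmetry."""
    h = 2 * R / n
    tot = 0.0
    for i in range(n):
        xi = -R + (i + 0.5) * h
        tot += math.cos(x * xi) * g(s, xi)
    return tot * h
```

One checks that \(\hat{g}_{1/\sigma }(x)\) agrees with \((2\pi \sigma ^2)^{1/2} g_\sigma (x) = \exp (-x^2/(2\sigma ^2))\).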

Lemma 5.8

For any integer \(m \geqslant 0\), there exists a constant \(C_m\) such that if the weak derivatives of \(f\) up to order \(m\) are in \(L^2(\mathbb R ^d)\), then

$$\begin{aligned} \int \left( 1+|\xi |^{2m}\right) \ \left| \hat{f}(\xi )\right| ^2 \ \mathrm{d }\xi \leqslant C_m \left( \Vert f\Vert _2^2 + \sum _{j = 1}^d \Vert \partial _{x_j^m} f\Vert _2^2 \right) . \end{aligned}$$

Proof

See [19, Theorem 8]. \(\square \)
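For orientation, in dimension \(1\) the lemma follows from Plancherel's theorem: with the convention (5.6), \(\int |\hat{f}|^2 = 2\pi \Vert f\Vert _2^2\) and \(\int |\xi |^{2m} |\hat{f}(\xi )|^2 \ \mathrm{d }\xi = 2\pi \Vert \partial _{x^m} f\Vert _2^2\), so that one may take \(C_m = 2\pi \). The following sketch (ours; for the Gaussian chosen the two sides actually coincide) illustrates the case \(m = 2\):

```python
import math

def quad(fn, R=12.0, n=120_000):
    """Midpoint-rule integral of fn over [-R, R] (tails assumed negligible)."""
    h = 2 * R / n
    return h * sum(fn(-R + (i + 0.5) * h) for i in range(n))

# f(x) = exp(-x^2/2); with convention (5.6), fhat(xi) = sqrt(2*pi) exp(-xi^2/2).
lhs = quad(lambda xi: (1 + xi ** 4) * 2 * math.pi * math.exp(-xi * xi))
norm_f = quad(lambda x: math.exp(-x * x))                            # ||f||_2^2
norm_f2 = quad(lambda x: ((x * x - 1) * math.exp(-x * x / 2)) ** 2)  # ||f''||_2^2
```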

Proof of Theorem 1.1

Let \(t > 0\). We saw in the proof of Theorem 5.4 that

$$\begin{aligned} \mathbb E [u_\varepsilon (t,x)]&= \mathbb E \mathbf E ^{\theta _{\lfloor \varepsilon ^{-1} x \rfloor } \omega }_0 [f(\varepsilon X_{\varepsilon ^{-2} t} + x_\varepsilon )] \\&= \overline{\mathbb{E }}[f(\varepsilon X_{\varepsilon ^{-2} t} + x_\varepsilon )], \end{aligned}$$

where in the last line, we used the fact that the measure \(\mathbb P \) is translation invariant, and we recall that we write \(\overline{\mathbb{E }}\) for \(\mathbb E \mathbf E ^\omega _0\) and \(x_\varepsilon \) for \(\varepsilon \lfloor \varepsilon ^{-1} x \rfloor \). Note that

$$\begin{aligned} \left| \overline{\mathbb{E }}[f(\varepsilon X_{\varepsilon ^{-2} t} + x_\varepsilon )] - \overline{\mathbb{E }}[f(\varepsilon X_{\varepsilon ^{-2} t} + x)]\right| \leqslant \sum _{j=1}^d \Vert \partial _{x_j} f \Vert _\infty \ \varepsilon , \end{aligned}$$

which is the first term in the right-hand side of (1.2) (a “lattice effect”). We now focus on studying the difference

$$\begin{aligned} \left| \overline{\mathbb{E }}[f(\varepsilon X_{\varepsilon ^{-2} t} + x)] - \mathbf E _0[f(B_t + x)]\right| \!, \end{aligned}$$

where we recall that \(\mathbf E _0[f(B_t+x)] = \mathbf E _x[f(B_t)] = \overline{u}(t,x)\). Possibly replacing \(f\) by \(f(\ \cdot \ + x)\), we may as well suppose that \(x = 0\). Let \(\sigma > 0\) be a small parameter, \(\mathcal N \) be a standard \(d\)-dimensional Gaussian random variable, independent of everything else, and write \(f_t = f(\sqrt{t} \ \cdot )\). Since \(f_t\) is bounded and continuous, we have

$$\begin{aligned} \overline{\mathbb{E }}[f(\varepsilon X_{\varepsilon ^{-2} t})] = \overline{\mathbb{E }}\left[ f_t\left( \frac{\varepsilon }{\sqrt{t}} X_{\varepsilon ^{-2} t}\right) \right] = \lim _{\sigma \rightarrow 0} \overline{\mathbb{E }}\left[ f_t\left( \frac{\varepsilon }{\sqrt{t}} X_{\varepsilon ^{-2} t}+ \sigma \mathcal N \right) \right] .\quad \end{aligned}$$
(5.8)

Similarly,

$$\begin{aligned} \mathbf E _0[f(B_t)] = \mathbf E _0[f(\sqrt{t} B_1)] = \mathbf E _0[f_t(B_1)] = \lim _{\sigma \rightarrow 0} \mathbf E _0[f_t(B_1 + \sigma \mathcal N )], \end{aligned}$$
(5.9)

where we slightly abuse notation by using the same \(\mathcal N \) to denote a standard Gaussian (independent of everything else) under both the measures \(\mathbf E _0\) and \(\overline{\mathbb{E }}\). The random variable \(\sigma \mathcal N \) is introduced for regularization purposes, and in particular will enable us to use Lemma 5.6.

Let us write \(\nu _{\varepsilon }\) for the distribution of

$$\begin{aligned} \frac{\varepsilon }{\sqrt{t}} X_{\varepsilon ^{-2} t} \end{aligned}$$

under the measure \(\overline{\mathbb{P }}\), and \(\nu _0\) for the distribution of \(B_1\) under \(\mathbf E _0\). Note that

$$\begin{aligned} \hat{\nu }_\varepsilon (\xi ) = \overline{\mathbb{E }}\left[ \exp \left( i |\xi | \ \frac{\varepsilon \ \xi \cdot X_{\varepsilon ^{-2} t}}{ \sqrt{t} \ |\xi |} \right) \right] . \end{aligned}$$

The function \( x \mapsto e^{i |\xi | x}\) has first derivative bounded by \(|\xi |\) and second derivative bounded by \(|\xi |^2\). In view of (2.4), we obtain from Theorem 4.1 that

$$\begin{aligned} \left| \hat{\nu }_\varepsilon (\xi ) - \hat{\nu }_0(\xi ) \right| \leqslant C |\xi | \ (|\xi | \vee 1) \ \Psi _{q,\delta }\left( \frac{\varepsilon ^2}{{t}}\right) . \end{aligned}$$

Using Lemma 5.6, we thus obtain that

$$\begin{aligned}&\left| \overline{\mathbb{E }}\left[ f_t\left( \frac{\varepsilon }{\sqrt{t}} X_{\varepsilon ^{-2} t}+ \sigma \mathcal N \right) \right] - \mathbf E _0[f_t(B_1 + \sigma \mathcal N )] \right| \\&\qquad \leqslant (2\pi )^{-d} \int _\mathbb{R ^d} \exp \left( -\frac{\sigma ^2 |\xi |^2}{2}\right) \left| \hat{f}_t(\xi )\right| \ \left| \hat{\nu }_\varepsilon (\xi ) - \hat{\nu }_0(\xi ) \right| \ \mathrm{d }\xi \\&\qquad \leqslant C \ \Psi _{q,\delta }\left( \frac{\varepsilon ^2}{{t}}\right) {\int \left| \hat{f}_t(\xi )\right| \ |\xi | \ (|\xi | \vee 1) \ \mathrm{d }\xi }, \end{aligned}$$

where \(C\) does not depend on \(\sigma \). We can thus take the limit \(\sigma \rightarrow 0\) in this inequality and use (5.8) and (5.9) to obtain

$$\begin{aligned}&\left| \overline{\mathbb{E }}\left[ f_t\left( \frac{\varepsilon }{\sqrt{t}} X_{\varepsilon ^{-2} t}\right) \right] - \mathbf E _0[f_t(B_1)] \right| \nonumber \\&\quad \leqslant C \ \Psi _{q,\delta }\left( \frac{\varepsilon ^2}{{t}}\right) \underbrace{\int \left| \hat{f}_t(\xi )\right| \ |\xi | \ (|\xi | \vee 1) \ \mathrm{d }\xi }. \end{aligned}$$
(5.10)

Since \(\hat{f}_t(\xi ) = t^{-d/2} \hat{f}(\xi /\sqrt{t})\), we can perform a change of variables on the integral underbraced above:

$$\begin{aligned} \int \left| \hat{f}_t(\xi )\right| \ |\xi | \ (|\xi | \vee 1) \ \mathrm{d }\xi = \sqrt{t} \int \left| \hat{f}(\xi )\right| \ |\xi | \ (\sqrt{t}|\xi | \vee 1) \ \mathrm{d }\xi . \end{aligned}$$

Note that \(|\xi | (\sqrt{t}|\xi | \vee 1) \leqslant (\sqrt{t}+1)(|\xi |^2+1)\). Hence, the integral above is bounded by

$$\begin{aligned} (t +\sqrt{t}) \int \left| \hat{f}(\xi )\right| \ (|\xi |^2 + 1) \ \mathrm{d }\xi . \end{aligned}$$

Let \(m = \lfloor d/2 \rfloor +3\). By the Cauchy–Schwarz inequality, this integral is bounded by

$$\begin{aligned} \left( \int \frac{(|\xi |^2 + 1)^2}{{1+|\xi |^{2m}}} \ \mathrm{d }\xi \right) ^{1/2} \left( \int \left( 1+|\xi |^{2m}\right) \ \left| \hat{f}(\xi )\right| ^2 \ \mathrm{d }\xi \right) ^{1/2}. \end{aligned}$$

Since \(2m - 4 > d\), the first term of this product is finite, while Lemma 5.8 gives us that the second term is bounded by

$$\begin{aligned} \sqrt{C_m} \ \left( \Vert f\Vert _2^2 + \sum _{j = 1}^d \Vert \partial _{x_j^m} f\Vert _2^2\right) ^{1/2} \leqslant \sqrt{C_m} \ \left( \Vert f\Vert _2 + \sum _{j = 1}^d \Vert \partial _{x_j^m} f\Vert _2\right) . \end{aligned}$$

Recalling (5.10), we thus get that

$$\begin{aligned}&\left| \overline{\mathbb{E }}\left[ f_t\left( \frac{\varepsilon }{\sqrt{t}} X_{\varepsilon ^{-2} t}\right) \right] - \mathbf E _0[f_t(B_1)] \right| \\&\quad \leqslant C \ (t+\sqrt{t}) \left( \Vert f\Vert _2 + \sum _{j=1}^d \Vert \partial _{x_j^m} f\Vert _2 \right) \ \Psi _{q,\delta }\left( \frac{\varepsilon ^2}{{t}}\right) , \end{aligned}$$

and this finishes the proof. \(\square \)

6 Heat kernel estimates

The heat kernel \(p_t^\omega (x,y)\) is defined so that \((t,y) \mapsto p^\omega _t(x,y)\) is the unique bounded solution to (DPE\(^\omega \)) with initial condition \(f = \mathbf 1 _x\). The heat kernel is symmetric: \(p_t^\omega (x,y) = p_t^\omega (y,x)\), and by translation invariance of the random coefficients, \(\mathbb E [p_t^\omega (x,y)] = \mathbb E [p_t^\omega (0,y-x)]\).

The aim of this section is to prove Theorem 1.3. In order to do so, we will need a regularity result on the averaged heat kernel. For \(f : \mathbb Z ^d \rightarrow \mathbb R \) and \(1 \leqslant i \leqslant d\), we write

$$\begin{aligned} \nabla _i f(x) = f(x+\mathbf e _i) - f(x), \end{aligned}$$

where \((\mathbf e _i)_{1 \leqslant i \leqslant d}\) is the canonical basis of \(\mathbb R ^d\). The following result was proved in [12, Theorem 1.4], and then elegantly rederived in [17, (1.4)].

Theorem 6.1

([12, 17]) Let

$$\begin{aligned} q_t(x) = \mathbb E \left[ p^\omega _t(0,x) \right] . \end{aligned}$$
(6.1)

There exist \(C, c_1 > 0\) such that for any \(t > 0\) and any \(x \in \mathbb Z ^d\), one has

$$\begin{aligned} \left| \nabla _i q_t(x) \right| \leqslant \frac{C}{t^{(d+1)/2}} \exp \left( -c_1\left( \frac{|x|^2}{t} \wedge |x| \right) \right) . \end{aligned}$$

We also recall the following upper bound on the heat kernel, taken from [16, Proposition 3.4] (see also [9, Section 3] for earlier results in this context).

Theorem 6.2

([16])

  1. (1)

    There exist constants \(C, \overline{c}\) such that for any \(t \geqslant 0\) and any \(x \in \mathbb Z ^d\),

    $$\begin{aligned} p_t^\omega (0,x) \leqslant \frac{C}{1 \vee t^{d/2}} \exp \left( -D_{\overline{c}t}(x)\right) \!, \end{aligned}$$

    where

    $$\begin{aligned} D_{t}(x) = |x| \, \mathrm{arsinh} \left( \frac{|x|}{t} \right) + t \left( \sqrt{1+\frac{|x|^2}{t^2}} - 1\right) . \end{aligned}$$
  2. (2)

    In particular, there exists \(c_2 > 0\) such that for any \(t \geqslant 0\) and any \(x\in \mathbb Z ^d\),

    $$\begin{aligned} p_t^\omega (0,x) \leqslant \frac{C}{1 \vee t^{d/2}} \exp \left( -c_2\left( \frac{|x|^2}{t} \wedge |x| \right) \right) . \end{aligned}$$
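Part (2) follows from part (1) together with the elementary comparison \(D_t(x) \geqslant c_2 (|x|^2/t \wedge |x|)\): since \({{\mathrm{arsinh}}}\) is concave with \({{\mathrm{arsinh}}}(0) = 0\), one has \({{\mathrm{arsinh}}}(u) \geqslant u \, {{\mathrm{arsinh}}}(1)\) for \(u \leqslant 1\) and \({{\mathrm{arsinh}}}(u) \geqslant {{\mathrm{arsinh}}}(1)\) for \(u \geqslant 1\), so \(c_2 = {{\mathrm{arsinh}}}(1) \approx 0.88\) works. A quick numerical confirmation of this comparison (a sketch, not part of the proof; the grid and the test constant \(0.8\) are arbitrary):

```python
import numpy as np

def D(t, r):
    # D_t(x) of Theorem 6.2, with r = |x|
    return r * np.arcsinh(r / t) + t * (np.sqrt(1.0 + (r / t) ** 2) - 1.0)

# arsinh(u) >= u*arsinh(1) for u <= 1 and arsinh(u) >= arsinh(1) for u >= 1,
# hence D_t(x) >= arsinh(1) * (|x|^2/t  /\  |x|); we test c_2 = 0.8 < arsinh(1)
for t in np.logspace(-3, 3, 40):
    for r in np.logspace(-3, 3, 40):
        assert D(t, r) >= 0.8 * min(r * r / t, r)
```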

Proof of Theorem 1.3

We decompose the proof into three steps.

Step 1. Possibly lowering the value of \(c_2 > 0\), we have that for any \(x \in \mathbb R ^d\),

$$\begin{aligned} \overline{p}_1(0,x)&\leqslant C \exp \left( -c_2 |x|^2\right) , \end{aligned}$$
(6.2)
$$\begin{aligned} \left| \frac{\partial \overline{p}_1(0,\cdot )}{\partial x_i}(x)\right|&\leqslant C \exp \left( -c_2 |x|^2\right) \qquad (1 \leqslant i \leqslant d). \end{aligned}$$
(6.3)

Equation (6.2) and part (2) of Theorem 6.2 thus ensure that (possibly enlarging \(C\)),

$$\begin{aligned} \left| \varepsilon ^{-d} \ q_{\varepsilon ^{-2}}(\lfloor \varepsilon ^{-1} x \rfloor ) - \overline{p}_1(0,x) \right| \leqslant C \exp \left( -c_2 (|x|^2 \wedge |\varepsilon ^{-1} x|) \right) . \end{aligned}$$
(6.4)

Moreover, Theorem 6.1 remains true if we lower the value of the constant \(c_1 > 0\) in such a way that \(c_2 \geqslant c_1/(2\sqrt{d})\).

Step 2. We now show that there exist \(c > 0\) (independent of \(\delta \)), \(\varepsilon _\delta > 0\) and \(C_\delta \) such that, for any \(\varepsilon \leqslant \varepsilon _\delta \) and any \(x \in \mathbb R ^d\), one has

$$\begin{aligned} \left| \varepsilon ^{-d} \ q_{\varepsilon ^{-2}}(\lfloor \varepsilon ^{-1} x \rfloor ) - \overline{p}_1(0,x) \right| \leqslant C_\delta \left( \Psi _{q,\delta }(\varepsilon ^2)\right) ^{1/(d+3)}\exp \left( -c (|x|^2 \wedge |\varepsilon ^{-1} x|) \right) .\nonumber \\ \end{aligned}$$
(6.5)

Let \(f\) be a positive smooth function on \(\mathbb R ^d\) with support in \([-1,1]^d\) and such that \(\int f = 1\). We define, for any \(r > 0\), the function \(f_r : x \mapsto r^{-d} f(r^{-1} x)\).

Let \(u^{(\varepsilon )}\) be the bounded solution of (DPE\(^\omega \)) with initial condition \(f_r(\varepsilon \ \cdot )\) (we keep the dependence of \(u^{(\varepsilon )}\) on \(r\) implicit in the notation). By linearity, we have

$$\begin{aligned} u^{(\varepsilon )}(t,x) = \sum _{z \in \mathbb Z ^d} f_r(\varepsilon z) \ p_t^\omega (z,x). \end{aligned}$$

Letting \(u_\varepsilon (t,x) = u^{(\varepsilon )}(\varepsilon ^{-2} t, \lfloor \varepsilon ^{-1} x \rfloor )\), we obtain

$$\begin{aligned} u_\varepsilon (t,x) = \sum _{z \in \mathbb Z ^d} f_r(\varepsilon z) \ p_{\varepsilon ^{-2}t}^\omega (z,\lfloor \varepsilon ^{-1} x \rfloor ). \end{aligned}$$
(6.6)

Let \(\overline{u}\) be the bounded solution of (CPE) with initial condition \(f_r\). Inspecting the proof of Theorem 1.1, we see that for any \(\delta > 0\), there exists \(C\) such that

$$\begin{aligned}&\left| \mathbb E [u_\varepsilon (1,x)] - \overline{u}(1,x) \right| \nonumber \\&\quad \leqslant \sum _{j=1}^d \Vert \partial _{x_j} f_r \Vert _\infty \ \varepsilon + C \ \Psi _{q,\delta }({\varepsilon ^2}) \ \int \left| \hat{f_r}(\xi )\right| \ (|\xi |^2 + 1) \ \mathrm{d }\xi . \end{aligned}$$
(6.7)

Scaling relations ensure that \(\Vert \partial _{x_j} f_r \Vert _\infty \) is bounded, up to a constant, by \(r^{-(d+1)}\), while \(\hat{f_r}(\xi ) = \hat{f}(r \xi )\). As a consequence,

$$\begin{aligned} \int \left| \hat{f_r}\right|&= r^{-d} \int \left| \hat{f}\right| , \\ \int \left| \hat{f_r}(\xi )\right| \ |\xi |^2 \ \mathrm{d }\xi&= r^{-(d+2)} \int \left| \hat{f}(\xi )\right| \ |\xi |^2 \ \mathrm{d }\xi , \end{aligned}$$

and the integrals on the right-hand side are finite since \(f\) is smooth (see Lemma 5.8). To sum up, for some constant \(C\) and any \(r \leqslant 1\), we have

$$\begin{aligned} \left| \mathbb E [u_\varepsilon (1,x)] - \overline{u}(1,x) \right| \leqslant C \left( \varepsilon \ r^{-(d+1)} + \Psi _{q,\delta }\left( {\varepsilon ^2}\right) r^{-(d+2)} \right) . \end{aligned}$$
(6.8)
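The Fourier scaling used above follows from \(\hat{f_r}(\xi ) = \hat{f}(r \xi )\) by the change of variables \(\xi \mapsto \xi /r\). A numerical illustration in dimension \(d = 1\) (a sketch; we use a Gaussian profile, for which the transform is explicit, rather than the compactly supported \(f\) of the proof, since the scaling identities hold for any suitable \(f\)):

```python
import numpy as np
from scipy.integrate import quad

# |\hat{f_r}(xi)| = |\hat f(r xi)|; Gaussian profile, dimension d = 1
fhat = lambda xi: np.exp(-xi ** 2 / 2.0)

r = 0.3
int0 = quad(lambda xi: fhat(r * xi), -np.inf, np.inf)[0]
int2 = quad(lambda xi: fhat(r * xi) * xi ** 2, -np.inf, np.inf)[0]

# int |\hat{f_r}| = r^{-d} int |\hat f|
assert np.isclose(int0, r ** -1 * quad(fhat, -np.inf, np.inf)[0])
# int |\hat{f_r}(xi)| |xi|^2 dxi = r^{-(d+2)} int |\hat f(xi)| |xi|^2 dxi
assert np.isclose(int2, r ** -3 * quad(lambda s: fhat(s) * s ** 2,
                                       -np.inf, np.inf)[0])
```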

The solution \(\overline{u}\) can be represented in terms of the heat kernel as

$$\begin{aligned} \overline{u}(1,x) = \int f_r(z) \overline{p}_1(z,x) \ \mathrm{d }z = \overline{p}_1(0,x) + \int f_r(z) (\overline{p}_1(z,x) - \overline{p}_1(0,x)) \ \mathrm{d }z, \end{aligned}$$

where we used the fact that \(\int f_r = 1\). For \(z \in \mathbb R ^d\) such that \(\Vert z\Vert _\infty \leqslant r\leqslant 1\) and up to a constant, \(|\overline{p}_1(z,x) - \overline{p}_1(0,x)|\) is bounded by \(r e^{-c_2 |x|^2}\) by (6.3). Since \(f_r\) has support in \([-r,r]^d\), we arrive at

$$\begin{aligned} \left| \overline{u}(1,x) - \overline{p}_1(0,x) \right| \leqslant C \ {r} \exp \left( -c_2 |x|^2\right) . \end{aligned}$$
(6.9)

On the other hand, if \(z \in \mathbb Z ^d\) is such that \(\Vert z\Vert _\infty \leqslant \varepsilon ^{-1} r\), then

$$\begin{aligned} \left| \mathbb E [p_{\varepsilon ^{-2}}^\omega (z,\lfloor \varepsilon ^{-1} x \rfloor )] - q_{\varepsilon ^{-2}}(\lfloor \varepsilon ^{-1} x \rfloor ) \right| \leqslant d \varepsilon ^{-1} r \sup _{\mathop {\Vert z\Vert _\infty \leqslant \varepsilon ^{-1} r}\limits _{1 \leqslant i \leqslant d}} |\nabla _i q_{\varepsilon ^{-2}}(\lfloor \varepsilon ^{-1} x \rfloor - z)|. \end{aligned}$$

We now argue that there exists \(c_3 > 0\) (independent of \(\delta \)) such that, uniformly over \(r \leqslant 1\) and \(x \in \mathbb R ^d\), one has

$$\begin{aligned} \sup _{\mathop {\Vert z\Vert _\infty \leqslant \varepsilon ^{-1} r}\limits _{1 \leqslant i \leqslant d}} |\nabla _i q_{\varepsilon ^{-2}}(\lfloor \varepsilon ^{-1} x \rfloor - z)| \leqslant \frac{C}{\varepsilon ^{d+1}} \exp \left[ -c_3\left( |x|^2 \wedge |\varepsilon ^{-1} x| \right) \right] . \end{aligned}$$
(6.10)

Indeed, Theorem 6.1 tells us that the left-hand side of (6.10) is smaller than

$$\begin{aligned} \frac{C}{\varepsilon ^{d+1}} \exp \left[ -c_1\inf _{\mathop {\Vert z\Vert _\infty \leqslant \varepsilon ^{-1} r}\limits _{1 \leqslant i \leqslant d}} \left( \frac{|\lfloor \varepsilon ^{-1} x \rfloor - z|^2}{\varepsilon ^{-2}} \wedge |\lfloor \varepsilon ^{-1} x \rfloor - z | \right) \right] . \end{aligned}$$

For any \(r \leqslant 1\) and \(\Vert x\Vert _\infty \geqslant 2\), the infimum above is larger than

$$\begin{aligned} \frac{|x|^2 \wedge |\varepsilon ^{-1} x|}{2\sqrt{d}}, \end{aligned}$$

so (6.10) holds in this case, with \(c_3 = c_1/(2\sqrt{d})\). To control smaller values of \(\Vert x\Vert _\infty \), it suffices to enlarge the constant \(C\) in (6.10). To sum up, we have shown that

$$\begin{aligned} \left| \mathbb E [p_{\varepsilon ^{-2}}^\omega (z,\lfloor \varepsilon ^{-1} x \rfloor )] - q_{\varepsilon ^{-2}}(\lfloor \varepsilon ^{-1} x \rfloor ) \right| \leqslant C \ \varepsilon ^d \ {r} \exp \left[ -c_3\left( |x|^2 \wedge |\varepsilon ^{-1} x| \right) \right] . \end{aligned}$$

In the sum on the right-hand side of (6.6), only \(C (\varepsilon ^{-1} r)^d\) terms are non-zero, and \(\Vert f_r\Vert _\infty \leqslant C r^{-d}\), so

$$\begin{aligned} \left| \mathbb E [u_\varepsilon (1,x)] - \sum _{z \in \mathbb Z ^d} f_r(\varepsilon z) \ q_{\varepsilon ^{-2}}(\lfloor \varepsilon ^{-1} x \rfloor ) \right| \leqslant C \ r\exp \left[ -c_3\left( |x|^2 \wedge |\varepsilon ^{-1} x| \right) \right] . \end{aligned}$$

Observe also that

$$\begin{aligned} \varepsilon ^d \sum _{z \in \mathbb Z ^d} f_r(\varepsilon z) = \left( \frac{\varepsilon }{r}\right) ^d\sum _{z \in \mathbb Z ^d} f\left( \frac{\varepsilon }{r} z \right) . \end{aligned}$$

This is a Riemann sum approximation of \(\int f = 1\), hence

$$\begin{aligned} \left| \varepsilon ^d \sum _{z \in \mathbb Z ^d} f_r(\varepsilon z) - 1 \right| \leqslant C \ \frac{\varepsilon }{r}, \end{aligned}$$

and we are thus led to

$$\begin{aligned} \left| \mathbb E [u_\varepsilon (1,x)] \!-\! \varepsilon ^{-d} \ q_{\varepsilon ^{-2}}(\lfloor \varepsilon ^{-1} x \rfloor ) \right| \!\leqslant \! C \left( {r}\exp \left[ -c_3\left( |x|^2 \wedge |\varepsilon ^{-1} x| \right) \right] \!+\! \frac{\varepsilon }{r}\right) .\quad \qquad \end{aligned}$$
(6.11)
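The Riemann approximation step above is the standard bound \(|h^d \sum _z f(hz) - \int f| \leqslant C h\) for a smooth, compactly supported \(f\), applied with \(h = \varepsilon /r\). A numerical check in dimension one (a sketch; the bump function and step sizes are arbitrary choices):

```python
import numpy as np

def bump(x):
    # smooth function supported in [-1, 1]
    out = np.zeros_like(x, dtype=float)
    m = np.abs(x) < 1
    out[m] = np.exp(-1.0 / (1.0 - x[m] ** 2))
    return out

# normalize so that int f = 1 (fine grid; endpoint values vanish)
xs = np.linspace(-1, 1, 200001)
Z = bump(xs).sum() * (xs[1] - xs[0])
f = lambda x: bump(x) / Z

# |h * sum_z f(hz) - 1| <= C h, with h playing the role of eps / r
for h in [0.2, 0.1, 0.05, 0.01]:
    z = np.arange(-np.ceil(1 / h) - 1, np.ceil(1 / h) + 2)
    err = abs(h * f(h * z).sum() - 1.0)
    assert err <= h
```

For a smooth compactly supported \(f\) the Riemann sum error is in fact much smaller than \(h\) (by Poisson summation), but the \(O(\varepsilon /r)\) bound is all that is needed here.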

Combining (6.8), (6.9), (6.11) and the fact that \(c_2 \geqslant c_3 = c_1/(2\sqrt{d})\), we obtain that up to a constant,

$$\begin{aligned} \left| \varepsilon ^{-d}\ q_{\varepsilon ^{-2}}(\lfloor \varepsilon ^{-1} x \rfloor ) - \overline{p}_1(0,x) \right| \end{aligned}$$

is bounded by

$$\begin{aligned} \frac{\varepsilon }{r^{d+1}} + \frac{\Psi _{q,\delta }\left( {\varepsilon ^2}\right) }{r^{d+2}} + r\exp \left[ -c_3\left( |x|^2 \wedge |\varepsilon ^{-1} x| \right) \right] + \frac{\varepsilon }{r}, \end{aligned}$$

uniformly over \(r \leqslant 1\). Since for \(\varepsilon \) small enough, one has \(\varepsilon \leqslant \Psi _{q,\delta }\left( {\varepsilon ^2}\right) \), the above is bounded, up to a constant, by

$$\begin{aligned} \frac{\Psi _{q,\delta }\left( {\varepsilon ^2}\right) }{r^{d+2}} + r \exp \left[ -c_3\left( |x|^2 \wedge |\varepsilon ^{-1} x| \right) \right] , \end{aligned}$$
(6.12)

uniformly over \(r \leqslant 1\). Choosing

$$\begin{aligned} r^{d+3} = \Psi _{q,\delta }(\varepsilon ^2) \exp \left[ c_3\left( |x| \wedge |\varepsilon ^{-1} x| \wedge M_\varepsilon \right) \right] , \end{aligned}$$

where

$$\begin{aligned} M_\varepsilon = -\frac{\log (\Psi _{q,\delta }(\varepsilon ^2))}{c_3} \end{aligned}$$

is here to ensure that \(r \leqslant 1\), we obtain that the expression in (6.12) is smaller than

$$\begin{aligned} \left( \Psi _{q,\delta }(\varepsilon ^2)\right) ^{1/(d+3)} \exp \left[ -c_3\left( 1-\frac{1}{d+3}\right) \left( |x| \wedge |\varepsilon ^{-1} x| \wedge M_\varepsilon \right) \right] . \end{aligned}$$

This proves (6.5) when \(|x| \wedge |\varepsilon ^{-1} x| \leqslant M_\varepsilon \). Otherwise, we use the bound (6.4), together with the fact that \(c_2 \geqslant c_3\), to get

$$\begin{aligned}&\left| \varepsilon ^{-d} \ q_{\varepsilon ^{-2}}(\lfloor \varepsilon ^{-1} x \rfloor ) - \overline{p}_1(0,x) \right| \\&\qquad \leqslant C \exp \left( -c_3 (|x|^2 \wedge |\varepsilon ^{-1} x|) \right) \\&\qquad \leqslant C \exp \left( -c_3 \left( 1-\frac{1}{d+3}\right) (|x|^2 \wedge |\varepsilon ^{-1} x|) - \frac{c_3}{d+3} M_\varepsilon \right) \\&\qquad \leqslant C \left( \Psi _{q,\delta }(\varepsilon ^2)\right) ^{1/(d+3)} \exp \left( -c_3 \left( 1-\frac{1}{d+3}\right) (|x|^2 \wedge |\varepsilon ^{-1} x|)\right) . \end{aligned}$$

Hence, (6.5) holds also in this case, and we can always choose \(c = c_3(1-1/(d+3))\).
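The choice of \(r\) above is the usual optimization of a bound of the form \(A r^{-(d+2)} + Br\): equating the two terms, i.e. taking \(r^{d+3} = A/B\), is optimal up to a constant factor. A quick sanity check of this balancing (the numerical values of \(A\) and \(B\), standing for \(\Psi _{q,\delta }(\varepsilon ^2)\) and the exponential weight, are hypothetical):

```python
import numpy as np

# balance g(r) = A r^{-(d+2)} + B r by taking r^{d+3} = A / B
d = 3
A, B = 1e-4, 0.7  # hypothetical stand-ins
r_star = (A / B) ** (1.0 / (d + 3))
g = lambda r: A * r ** -(d + 2) + B * r

# both terms are equal at r_star ...
assert np.isclose(A * r_star ** -(d + 2), B * r_star)
# ... and g(r_star) = 2 A^{1/(d+3)} B^{(d+2)/(d+3)} is within a factor 2
# of the true minimum of g over r
rs = np.logspace(-6, 0, 10001)
assert g(r_star) <= 2 * g(rs).min()
```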

Step 3. We now extend the result to any time \(t > 0\). The heat kernel of the continuous operator satisfies the scaling relation

$$\begin{aligned} \overline{p}_t(0,x) = t^{-d/2} \ \overline{p}_1(0,x/\sqrt{t}), \end{aligned}$$

while we can write

$$\begin{aligned} \varepsilon ^{-d} \ q_{\varepsilon ^{-2}t}(\lfloor \varepsilon ^{-1} x \rfloor ) = t^{-d/2} \ (\varepsilon / \sqrt{t})^{-d} \ q_{(\varepsilon / \sqrt{t})^{-2}}(\lfloor (\varepsilon / \sqrt{t})^{-1} \ (x/\sqrt{t}) \rfloor ). \end{aligned}$$

For \(\varepsilon _\delta \) and \(C_\delta \) given by step 2, as soon as \(\varepsilon / \sqrt{t} \leqslant \varepsilon _\delta \), one thus has

$$\begin{aligned}&\left| \varepsilon ^{-d} \ q_{\varepsilon ^{-2}t}(\lfloor \varepsilon ^{-1} x \rfloor ) - \overline{p}_t(0,x) \right| \\&\quad \leqslant \frac{C_\delta }{t^{d/2}} \ \left( \Psi _{q,\delta }\left( \frac{\varepsilon ^2}{t}\right) \right) ^{1/(d+3)}\exp \left[ -c\left( \frac{|x|^2}{t} \wedge |\varepsilon ^{-1} x| \right) \right] , \end{aligned}$$

which is the claim of the theorem. \(\square \)

7 Homogenization of elliptic equations

In this last section, we state and prove the counterparts of Theorems 1.1 and 1.3 for the homogenization of elliptic equations. For \(f : \mathbb R ^d \rightarrow \mathbb R \) bounded and continuous, we consider \(v^{(\varepsilon )}\) the unique bounded solution of

$$\begin{aligned} \left( 1 - \varepsilon ^{-2} L^\omega \right) v^{(\varepsilon )} = f(\varepsilon \ \cdot ) \quad \text {on } \mathbb Z ^d. \end{aligned}$$

Using integration by parts, one can check that

$$\begin{aligned} v^{(\varepsilon )}(x) = \int _0^{+\infty } e^{-t} \ u^{(\varepsilon )}(\varepsilon ^{-2} t,x) \ \mathrm{d }t, \end{aligned}$$
(7.1)

where \(u^{(\varepsilon )}\) is the solution of (DPE\(^\omega _\varepsilon \)). For \(x \in \mathbb R ^d\), we let \(v_\varepsilon (x) = v^{(\varepsilon )}(\lfloor \varepsilon ^{-1} x \rfloor )\), so that

$$\begin{aligned} v_\varepsilon (x) = \int _0^{+\infty } e^{-t} \ u_\varepsilon (t,x) \ \mathrm{d }t. \end{aligned}$$
(7.2)
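In matrix form, the representation (7.2) is the spectral identity \((1-A)^{-1} = \int _0^{+\infty } e^{-t} \, e^{tA} \, \mathrm{d }t\), valid because \(A = \varepsilon ^{-2} L^\omega \) has non-positive spectrum. The following Python sketch checks it on a random-conductance cycle (the graph, conductance range and \(\varepsilon \) are arbitrary choices; the quadrature is a plain trapezoid rule):

```python
import numpy as np
from scipy.linalg import expm, solve

rng = np.random.default_rng(1)
n = 10  # a cycle Z/nZ with random conductances stands in for Z^d

w = rng.uniform(0.5, 2.0, n)  # w[i]: conductance of the edge {i, i+1}
L = np.zeros((n, n))
for i in range(n):
    j = (i + 1) % n
    L[i, j] += w[i]; L[j, i] += w[i]
    L[i, i] -= w[i]; L[j, j] -= w[i]

eps = 0.5
A = L / eps ** 2              # A = eps^{-2} L^omega, spectrum <= 0
f = rng.standard_normal(n)

# direct solve of (1 - eps^{-2} L^omega) v = f ...
v_direct = solve(np.eye(n) - A, f)

# ... against the Laplace-transform representation
# v = int_0^infty e^{-t} e^{tA} f dt  (trapezoid rule on [0, 40])
ts, dt = np.linspace(0.0, 40.0, 4001, retstep=True)
vals = np.array([np.exp(-t) * (expm(t * A) @ f) for t in ts])
v_quad = dt * (vals[1:-1].sum(axis=0) + 0.5 * (vals[0] + vals[-1]))

assert np.allclose(v_direct, v_quad, atol=1e-2)
```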

The function \(v_\varepsilon \) converges pointwise, as \(\varepsilon \) tends to \(0\), to the bounded solution \(\overline{v}\) of

$$\begin{aligned} \left( 1-\frac{1}{2} \nabla \cdot \overline{A} \nabla \right) \overline{v} = f \quad \text {on } \mathbb R ^d, \end{aligned}$$
(CEE)

and one has

$$\begin{aligned} \overline{v}(x) = \int _0^{+\infty } e^{-t} \ \overline{u}(t,x) \ \mathrm{d }t, \end{aligned}$$
(7.3)

where \(\overline{u}\) is the solution of (CPE). Equipped with the representations (7.2)–(7.3), it is straightforward to derive the following result from Theorem 1.1.

Theorem 7.1

Let \(m = \lfloor d/2 \rfloor + 3\) and \(\delta > 0\). There exist constants \(C_\delta \) (which may depend on the dimension) and \(q\) such that, if the weak derivatives of order \(m\) of \(f\) are in \(L^2(\mathbb R ^d)\), then for any \(\varepsilon > 0\) and \(x \in \mathbb R ^d\), one has

$$\begin{aligned} \left| \mathbb E [v_\varepsilon (x)] - \overline{v}(x) \right| \leqslant \sum _{j=1}^d \Vert \partial _{x_j} f \Vert _\infty \ \varepsilon + C_\delta \left( \Vert f\Vert _2 + \sum _{j=1}^d \Vert \partial _{x_j^m} f\Vert _2 \right) \ \Psi _{q,\delta }\left( {\varepsilon ^2}\right) \!. \end{aligned}$$

Remark 7.2

Note that on the other hand, it does not look so simple to deduce Theorem 1.1 from Theorem 7.1. A possibility for doing so may be to try to devise a quantitative version of [29, Theorem IX.2.16].

One can also consider the Green function \(G_\varepsilon ^\omega (x,y)\), the unique bounded function such that

$$\begin{aligned} (\varepsilon ^2-L^\omega ) G_\varepsilon ^\omega (x,\cdot ) = \mathbf 1 _x. \end{aligned}$$

Letting \(\overline{G}(x,y)\) be the Green function associated to equation (CEE), we can write the counterpart of Theorem 1.3.

Theorem 7.3

Let \(d \geqslant 2\) and \(\delta > 0\). There exist constants \(c > 0\) (independent of \(\delta \)), \(q, C_\delta \) such that for any \(\varepsilon > 0\) and any \(x \in \varepsilon \mathbb Z ^d {\setminus } \{0\}\), one has

$$\begin{aligned}&\left| \varepsilon ^{2-d} \ \mathbb E \left[ G_\varepsilon ^\omega (0,\varepsilon ^{-1} x)\right] - \overline{G}(0,x) \right| \nonumber \\&\quad \leqslant \frac{C_\delta }{|x|^{d-2}} \left[ \left( \Psi _{q,\delta }\left( \frac{\varepsilon ^2}{|x|^2}\right) \right) ^{1/(d+3)} e^{-c|x|} + e^{-c|\varepsilon ^{-1}x|} \right] . \end{aligned}$$
(7.4)

When \(d = 1\), there exist \(C, c> 0\) such that for any \(\varepsilon > 0\) and any \(x \in \varepsilon \mathbb Z \), one has

$$\begin{aligned} \left| \varepsilon \ \mathbb E \left[ G_\varepsilon ^\omega (0,\varepsilon ^{-1} x)\right] - \overline{G}(0,x) \right| \leqslant C \left[ \varepsilon ^{1/8} e^{-c|x|} + e^{-c|\varepsilon ^{-1}x|} \right] . \end{aligned}$$

Remark 7.4

The orders of magnitude, as \(\varepsilon \) tends to \(0\), of the right-hand sides of the estimates in Theorem 7.1 and in (7.4) are given respectively by (1.5) and (1.6).

Proof

Our starting point is the fact that

$$\begin{aligned} G_\varepsilon ^\omega (x,y) = \varepsilon ^{-2} \int _0^{+\infty } e^{-t} \ p^\omega _{\varepsilon ^{-2}t}(x,y) \ \mathrm{d }t, \end{aligned}$$

while

$$\begin{aligned} \overline{G}(x,y) = \int _0^{+\infty } e^{-t} \ \overline{p}_t(x,y) \ \mathrm{d }t. \end{aligned}$$

Recall first that Theorem 1.3 ensures that there exist \(c >0, C_\delta ,\varepsilon _\delta > 0\) such that whenever \(t\geqslant (\varepsilon /\varepsilon _\delta )^2\), one has

$$\begin{aligned}&\left| \varepsilon ^{-d} \mathbb E [p_{\varepsilon ^{-2} t}^\omega (0,\lfloor \varepsilon ^{-1} x \rfloor )] - \overline{p}_t(0,x)\right| \nonumber \\&\quad \leqslant \frac{C_\delta }{t^{d/2}} \Psi _{q,\delta }^{1/(d+3)}\left( \frac{\varepsilon ^2}{t}\right) \exp \left[ -c \left( \frac{|x|^2}{t} \wedge |\varepsilon ^{-1} x| \right) \right] . \end{aligned}$$
(7.5)

The difference of interest

$$\begin{aligned} \left| \varepsilon ^{2-d} \ \mathbb E \left[ G_\varepsilon ^\omega (0,\lfloor \varepsilon ^{-1} x \rfloor )\right] - \overline{G}(0,x) \right| \end{aligned}$$

is bounded by

$$\begin{aligned} \int _0^{+\infty } e^{-t} \left| \varepsilon ^{-d} \ \mathbb E [p_{\varepsilon ^{-2} t}^\omega (0,\lfloor \varepsilon ^{-1} x \rfloor )] - \overline{p}_t(0,x) \right| \ \mathrm{d }t. \end{aligned}$$
(7.6)

Let \(\eta = (\varepsilon /\varepsilon _\delta )^2 \vee (\varepsilon |x|)\). If \(t \geqslant \eta \), then the integrand above is bounded, up to a constant, by

$$\begin{aligned} \frac{e^{-t}}{t^{d/2}} \Psi _{q,\delta }^{1/(d+3)}\left( \frac{\varepsilon ^2}{t}\right) \exp \left[ -c \frac{|x|^2}{t}\right] . \end{aligned}$$

In order to control the integral in (7.6), it thus suffices to bound the following three quantities:

$$\begin{aligned}&\int _{0}^{+\infty } \frac{e^{-t}}{t^{d/2}} \Psi _{q,\delta }^{1/(d+3)}\left( \frac{\varepsilon ^2}{t}\right) \exp \left[ -c \frac{|x|^2}{t} \right] \ \mathrm{d }t, \end{aligned}$$
(7.7)
$$\begin{aligned}&\int _0^\eta \varepsilon ^{-d} \mathbb E [p_{\varepsilon ^{-2} t}^\omega (0,\lfloor \varepsilon ^{-1} x \rfloor )] \ \mathrm{d }t,\end{aligned}$$
(7.8)
$$\begin{aligned}&\int _0^\eta \overline{p}_t(0,x) \ \mathrm{d }t. \end{aligned}$$
(7.9)

We start with the integral in (7.7), which is the only non-negligible one. To begin with, note that for any \(\gamma \), a change of variables gives us the identity

$$\begin{aligned} \int _0^{+\infty }\frac{e^{-t}}{t^{\gamma }} e^{-c |x|^2/t} \ \mathrm{d }t = |x|^{2-2\gamma } \int _0^{+\infty } \frac{e^{-s|x|^2}}{s^{\gamma }} e^{-c/s} \ \mathrm{d }s, \end{aligned}$$
(7.10)

and moreover, provided \(\gamma > 1\),

$$\begin{aligned} \int _0^{+\infty } \frac{e^{-s|x|^2}}{s^{\gamma }} e^{-c/s} \ \mathrm{d }s&\leqslant e^{-c|x|/2} \int _0^{1/|x|} \frac{e^{-c/2s}}{s^\gamma } \ \mathrm{d }s + e^{-|x|} \int _{1/|x|}^{+\infty } \frac{e^{-c/s}}{s^\gamma } \ \mathrm{d }s \nonumber \\&\leqslant C e^{-c|x|/2}, \end{aligned}$$
(7.11)

for some large enough \(C\) (and \(c \leqslant 2\)). We have thus shown that, for \(\gamma > 1\),

$$\begin{aligned} \int _0^{+\infty }\frac{e^{-t}}{t^{\gamma }} e^{-c |x|^2/t} \ \mathrm{d }t \leqslant C |x|^{2 - 2\gamma } e^{-c|x|/2}. \end{aligned}$$
(7.12)
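The identity (7.10) is simply the substitution \(t = s|x|^2\), and can be checked numerically. A sketch (the values of \(c\), \(\gamma \) and \(|x|\) are sample choices; any \(c > 0\), \(\gamma \) and \(x \ne 0\) work for (7.10)):

```python
import numpy as np
from scipy.integrate import quad

c, gamma, x = 1.0, 2.0, 3.0  # sample values

# left-hand side of (7.10)
lhs = quad(lambda t: np.exp(-t) * t ** -gamma * np.exp(-c * x ** 2 / t),
           0, np.inf)[0]
# right-hand side after the substitution t = s |x|^2
rhs = x ** (2 - 2 * gamma) * quad(
    lambda s: np.exp(-s * x ** 2) * s ** -gamma * np.exp(-c / s),
    0, np.inf)[0]

assert np.isclose(lhs, rhs)
```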

When \(d \geqslant 3\), we have \(\Psi _{q,\delta }(u) = u^{1/2-\delta }\), so that the integral in (7.7) is bounded, up to a constant, by

$$\begin{aligned} |x|^{2-d} \ \Psi _{q,\delta }^{1/(d+3)}\left( \frac{\varepsilon ^2}{|x|^2} \right) e^{-c|x|/2}. \end{aligned}$$
(7.13)

When \(d = 2\), the argument requires some minor modifications, due to the presence of a logarithmic factor in \(\Psi _{q,\delta }\). One should consider instead integrals of the form

$$\begin{aligned} \int _0^{+\infty }\frac{e^{-t}}{t^{\gamma }} \log _+^{q'}\left( t/\varepsilon ^2\right) e^{-c |x|^2/t} \ \mathrm{d }t = |x|^{2-2\gamma } \int _0^{+\infty } \frac{e^{-s|x|^2}}{s^{\gamma }} \log _+^{q'}\left( s|x|^2/\varepsilon ^2\right) e^{-c/s} \ \mathrm{d }s, \end{aligned}$$

for some \(q' \geqslant 0\) and \(\gamma > 1\) (in fact, \(\gamma = 1+1/20\)). This last integral is bounded by

$$\begin{aligned} \int _0^{\varepsilon ^2/|x|^2}\frac{e^{-s|x|^2}}{s^{\gamma }} e^{-c/s} \ \mathrm{d }s + \int _{\varepsilon ^2/|x|^2}^{+\infty } \frac{e^{-s|x|^2}}{s^{\gamma }} \log ^{q'}\left( s|x|^2/\varepsilon ^2\right) e^{-c/s} \ \mathrm{d }s. \end{aligned}$$

For the first integral, (7.11) gives us an upper bound. Inequality (7.11) also enables us to bound the second integral, using the fact that

$$\begin{aligned} \log ^{q'}\left( s|x|^2/\varepsilon ^2\right) \leqslant 2^{q'} \left( \log ^{q'}\left( |x|^2/\varepsilon ^2\right) + \log ^{q'}\left( s\right) \right) . \end{aligned}$$

These observations thus guarantee that (7.7) is also bounded by (7.13) when \(d = 2\).

We now turn to the evaluation of the integral in (7.8). Since, for \(z \geqslant 0\), one has \({{\mathrm{arsinh}}}(z) = \log (z+\sqrt{1+z^2}) \geqslant \log (1+z)\), part (1) of Theorem 6.2 allows one to bound the integral in (7.8) (up to a constant) by

$$\begin{aligned} \int _0^\eta \varepsilon ^{-d} \exp \left( - |\varepsilon ^{-1} x| \log \left( 1+\frac{|\varepsilon ^{-1} x|}{\overline{c} \varepsilon ^{-2} t}\right) \right) \ \mathrm{d }t. \end{aligned}$$

A change of variables shows that this is equal to

$$\begin{aligned} \frac{\varepsilon |x|}{\overline{c}} \varepsilon ^{-d} \int _0^{\eta '} \exp \left( - |\varepsilon ^{-1} x| \log \left( 1+1/s\right) \right) \ \mathrm{d }s, \end{aligned}$$
(7.14)

where

$$\begin{aligned} \eta ' = \frac{\overline{c} \eta }{\varepsilon |x|} = \frac{\overline{c} \varepsilon _\delta ^{-2}}{|\varepsilon ^{-1} x|} \vee \overline{c}. \end{aligned}$$

Since we consider only \(x \in \varepsilon \mathbb Z ^d {\setminus } \{0\}\), the parameter \(\eta '\) is uniformly bounded, independently of the value of \(x\) and \(\varepsilon \). The expression in (7.14) is thus bounded (up to a constant) by

$$\begin{aligned} \varepsilon ^{1-d} |x| (1+\eta '^{-1})^{-|\varepsilon ^{-1} x|}&= |x|^{2-d} |\varepsilon ^{-1} x|^{d-1} (1+\eta '^{-1})^{-|\varepsilon ^{-1} x|} \\&\leqslant C |x|^{2-d} \exp \left( -c|\varepsilon ^{-1} x|\right) . \end{aligned}$$

This finishes the analysis of the integral in (7.8), and there remains only to consider the integral in (7.9). This integral is bounded by a constant times

$$\begin{aligned} \int _0^{\eta } t^{-d/2} e^{-c |x|^2/t} \ \mathrm{d }t \end{aligned}$$

for some small enough \(c> 0\). A change of variables enables one to rewrite this integral as

$$\begin{aligned} |x|^{2-d} \int _0^{\eta |x|^{-2} } u^{-d/2} e^{-c/u} \ \mathrm{d }u \leqslant |x|^{2-d}\exp \left( -\frac{c}{2\eta |x|^{-2}} \right) \int _0^{\eta |x|^{-2}} u^{-d/2} e^{-c/2u} \ \mathrm{d }u.\nonumber \\ \end{aligned}$$
(7.15)

Moreover,

$$\begin{aligned} \eta |x|^{-2} = \frac{\varepsilon _\delta ^{-2}}{|\varepsilon ^{-1} x|^2} \vee \frac{1}{|\varepsilon ^{-1} x|} \leqslant \frac{C'}{|\varepsilon ^{-1} x|} \leqslant C' \end{aligned}$$

for some large enough \(C'\), uniformly over \(\varepsilon > 0\) and \(x \in \varepsilon \mathbb Z ^d {\setminus } \{0\}\). The right-hand side of (7.15) is thus bounded by

$$\begin{aligned} |x|^{2-d} \exp \left( -\frac{|\varepsilon ^{-1} x|}{C'} \right) \int _0^{C'} u^{-d/2} e^{-c/2u} \ \mathrm{d }u. \end{aligned}$$

We thus obtained the required bound on (7.9), and this finishes the proof of Theorem 7.3 for \(d \geqslant 2\).

For the one-dimensional case, the analysis must be slightly adapted. We need to bound the integrals appearing in (7.7), (7.8) and (7.9). The analysis of the integrals in (7.8) and (7.9) carries over without change, except that only the case \(x \in \varepsilon \mathbb Z {\setminus } \{0\}\) was considered above, while here we also want to cover \(x = 0\). This case is straightforward, since the upper bound \(t^{-1/2}\) on the heat kernels is integrable near \(0\). As for the integral in (7.7), it is equal to

$$\begin{aligned} \varepsilon ^{1/8} \int _0^{+\infty } \frac{e^{-t}}{t^\gamma } \ e^{-c|x|^2/t} \ \mathrm{d }t, \end{aligned}$$

where \(\gamma = 1/2+1/16 < 1\). The integral above is uniformly bounded over \(x\) such that \(|x| \leqslant 1\). Otherwise, as noted in (7.10), we have

$$\begin{aligned} \int _0^{+\infty } \frac{e^{-t}}{t^\gamma } \ e^{-c|x|^2/t} \ \mathrm{d }t = |x|^{2-2\gamma } \int _0^{+\infty } \frac{e^{-s|x|^2}}{s^{\gamma }} e^{-c/s} \ \mathrm{d }s, \end{aligned}$$

and we can bound the last integral by

$$\begin{aligned} e^{-c|x|} \int _0^{1/|x|} \frac{e^{-s}}{s^\gamma } \ \mathrm{d }s + e^{-|x|/2} \int _{1/|x|}^{+\infty } \frac{e^{-s/2}}{s^\gamma } \ \mathrm{d }s, \end{aligned}$$

where in the second part, we used the fact that for \(|x| \geqslant 1\) and \(s \geqslant |x|^{-1}\), we have \(s|x|^2 \geqslant |x|/2 + s/2\). We have thus shown that the integral in (7.7) is bounded, up to a constant, by

$$\begin{aligned} \varepsilon ^{1/8} \left( |x|^{2-2\gamma } + 1\right) e^{-c|x|}, \end{aligned}$$

uniformly over \(x \in \mathbb R \), and this finishes the proof for \(d = 1\). \(\square \)