1 Introduction

In the present work, we are concerned with the stochastic homogenization of linear elliptic equations of the form

$$\begin{aligned} -\nabla \cdot a\nabla u = f. \end{aligned}$$
(1)

In stochastic homogenization of elliptic PDEs, a is typically a uniformly elliptic and bounded coefficient field, chosen at random according to some stationary and ergodic ensemble \(\langle \cdot \rangle \). On large scales (and for slowly varying f), one may then approximate the solution u to Eq. (1) by the solution \(u_{hom}\) to the so-called effective equation

$$\begin{aligned} -\nabla \cdot a_{hom}\nabla u_{hom} = f, \end{aligned}$$
(2)

which is a constant-coefficient equation with the so-called effective coefficient \(a_{hom}\). Mathematically, this homogenization effect is encoded in growth properties of the corrector (cf. below for a definition of the corrector).

The goal of the present paper is to provide a fairly simple proof of quantified sublinear growth of the corrector under very mild assumptions on the decorrelation of the coefficient field a under the ensemble \(\langle \cdot \rangle \). We do this in the context of coefficient fields that are essentially Gaussian. More precisely, we consider coefficient fields a which are obtained from a Gaussian random field by pointwise application of some (nonlinear) mapping, the role of the nonlinear map being basically to enforce uniform ellipticity and boundedness of our coefficient field.

The motivation for this result is the following: Gloria et al. [11] have shown that qualitatively sublinear growth of the (extended) corrector \((\phi ,\sigma )\) (cf. below for a definition) entails a large-scale intrinsic \(C^{1,\alpha }\) regularity theory for a-harmonic functions. In a subsequent work [10], the two authors of the present paper have shown that slightly quantified sublinear growth of the corrector even leads to a large-scale intrinsic \(C^{k,\alpha }\) regularity theory for any \(k\in {\mathbb {N}}\). Therefore, the results of the present work show that even in case of ensembles with very mild decorrelation, for almost every realization of the coefficient field, a-harmonic functions have arbitrary intrinsic smoothness properties on large scales. Furthermore, our results enable us to estimate the scale above which this happens—a random quantity—in a stochastically optimal way. Indeed, the motivation for the present work was to establish such a (necessarily intrinsic) higher order regularity theory under the weakest possible assumptions on the decay of correlations.

By an intrinsic regularity theory we mean that the regularity is measured in terms of objects intrinsic to the Riemannian geometry defined by the coefficient field a, like the dimension of the space of a-harmonic functions of a certain algebraic growth rate, or like estimates on the Hölder modulus of the derivative of a-harmonic functions as measured in terms of their distance to a-linear functions. An extrinsic large-scale regularity theory for a-harmonic functions in case of random coefficients was initiated on the level of \(C^{0,\alpha }\) in [7, 21] and pushed to \(C^{1,0}\) in [4], which significantly extended qualitative arguments from the periodic case [5] to quantitative arguments in the random case. However, an extrinsic regularity theory is limited to \(C^{1,0}\), as can be seen by considering the harmonic coordinates: Taking higher order polynomials into account does not increase the local approximation order.

After this motivation, we now return to the history of bounds on the corrector and their dependence on the assumptions on the stationary ensemble of coefficient fields. Almost-sure sublinearity (always meant in a spatially averaged sense) of the corrector \(\phi \) under the mere assumption of ergodicity was a key ingredient in the original work on stochastic homogenization by Kozlov [19] and by Papanicolaou and Varadhan [24]. Almost-sure sublinearity of the extended corrector \((\phi ,\sigma )\), as is needed for the large-scale intrinsic \(C^{1,\alpha }\)-regularity theory, was established in [11] under mere ergodicity.

Yurinskii [25] was the first to quantify sublinear growth under general mixing conditions, however only capturing suboptimal rates even in case of finite range of dependence. Very recently, a much improved quantification of sublinear growth of \(\phi \) under finite range assumptions was put forward by Armstrong et al. [1], relying on a variational approach to quantitative stochastic homogenization introduced by Armstrong and Smart [4], an approach which presumably can be extended to the case of non-symmetric coefficients and more general mixing conditions following [3]. Recently, optimal growth bounds on the corrector \(\phi \) with optimal—i.e. Gaussian—stochastic integrability have been established under the assumption of finite range of dependence [2, 17].

Optimal growth rates have been obtained under a quantification of ergodicity different from finite range or mixing conditions, namely under Spectral Gap assumptions on the ensemble. This functional analytic tool from statistical mechanics was introduced into the field of stochastic homogenization in an unpublished paper by Naddaf and Spencer [23], and further leveraged by Conlon et al. [8, 9], yielding optimal rates for some errors in stochastic homogenization in case of a small ellipticity contrast. The work of Gloria, Neukamm and the second author extended these results to the present case of arbitrary ellipticity contrast [12,13,14], in particular yielding at most logarithmic growth of the corrector (and its stationarity in \(d>2\)). Loosely speaking, the assumption of a Spectral Gap Inequality amounts to correlations with integrable tails; in the above-mentioned works it has been used for discrete media (i.e. random conductance models), but has subsequently been extended to the continuum case [15, 16].

A strengthening of the Spectral Gap Inequality is given by the Logarithmic Sobolev Inequality (LSI); it is a slight strengthening in terms of the assumption (still essentially encoding integrable tails of the correlations), but a substantial improvement in its effect, since it implies Gaussian concentration of measure for Lipschitz random variables. The assumption of LSI and implicitly concentration of measure, which will be explicitly used in this work, has been introduced into stochastic homogenization by Marahrens and Otto [21]. In [11], it has been shown that the concept of LSI can be adapted to also capture ensembles with slowly decaying correlations, i.e. thick non-integrable tails, by adapting the norm of the vertical or Malliavin derivative to the correlation structure. As a result, the stochastic integrability of the optimal rates could be improved from algebraic to (stretched) exponential, but missing the expected Gaussian integrability.

The main merit of the present contribution with respect to [11] is twofold: First, our approach directly provides optimal quantitative sublinearity of the corrector \((\phi ,\sigma )\) on all scales above a random minimal radius \(r_*\), i.e. in contrast to the estimates of [11] our estimates capture the decorrelation on scales larger than \(r_*\) in a single argument. Note that our definition of \(r_*\) differs from the one in [11]. Second, in case of weak decorrelation, our simpler arguments are nevertheless sufficient to establish optimal stochastic moments for the minimal radius \(r_*\) above which the corrector \((\phi ,\sigma )\) displays the quantified sublinear growth.

In the present work, we consider the following type of ensembles on \(\lambda \)-uniformly elliptic tensor fields \(a=a(x)\) on \({\mathbb {R}}^d\): Let \({\tilde{a}}={\tilde{a}}(x)\) be a tensor-valued Gaussian random field on \({\mathbb {R}}^d\) that is centered (i.e. of vanishing expectation) and stationary (i.e. invariant under translation) and thus characterized by the covariance \(\langle {\tilde{a}}(x)\otimes {\tilde{a}}(0)\rangle \). Our only additional assumption on \({\tilde{a}}\) is that there exists an exponent \(\beta \in (0,d)\) such that

$$\begin{aligned} \left| \langle {\tilde{a}}(x)\otimes {\tilde{a}}(0)\rangle \right| \le |x|^{-\beta }\quad \hbox {for all}\;x\in {\mathbb {R}}^d. \end{aligned}$$
(3)

In this work, we are concerned with the case of weak decay of correlation in the sense of \(\beta \ll 1\). Let \(\Phi \) be a 1-Lipschitz map from the space of tensors into the space of \(\lambda \)-elliptic symmetric tensors. Then our ensemble is the distribution of a where a is given by \(a(x):=\Phi ({\tilde{a}}(x))\). Note that the normalization in the constant in (3) and in the Lipschitz constant is not essential, since it can be achieved by a rescaling of x and the amplitude of \({\tilde{a}}\).

Concerning the mathematical tools of our approach, several ideas are inspired by the work [11]. In particular, key components of our approach are sensitivity estimates (Malliavin derivative bounds) for certain integral functionals, which basically average the gradient \(\nabla (\phi ,\sigma )\) over an appropriate cube. Furthermore, we rely on a mean-value property for a-harmonic functions, which has been derived in [11] under appropriate smallness assumptions on the corrector. In our present contribution, we however pursue a conceptually simpler route to estimate the Malliavin derivative: The sensitivity estimate is performed through appropriate \(L^q\)-norm bounds and Meyers’ estimate, rather than a more involved \(\ell ^2-L^1\)-norm bound like in [11].

Before stating our main results, let us recall the concept of correctors in homogenization and introduce some notation. The basic idea underlying the concept of correctors in homogenization is the observation that the oscillations in the gradient \(\nabla u_{hom}\) of solutions to the homogenized (constant-coefficient) problem (2) occur on a much larger scale than the oscillations in the gradient \(\nabla u\) of solutions to the original problem (1). Thus, it is important to understand how to add oscillations to an affine map (an affine map being always \(a_{hom}\)-harmonic) to obtain an a-harmonic map. In the context of stochastic homogenization, one is therefore interested in constructing random scalar fields \(\phi _i=\phi _i(a,x)\) subject to the equation

$$\begin{aligned} -\nabla \cdot a(e_i+\nabla \phi _i)=0 \end{aligned}$$
(4)

which almost surely display sublinear growth in x: The \(\phi _i\) then facilitate the transition from the \(a_{hom}\)-harmonic (Euclidean) coordinates \(x\mapsto x_i\) to the “a-harmonic coordinates” \(x\mapsto x_i+\phi _i(x)\). Since any affine map may be represented in the form \(b+\sum _i \xi _i x_i\) for \(b,\xi _i\in {\mathbb {R}}\), the \(\phi _i\) also facilitate the construction of associated a-harmonic “corrected affine maps” \(b+\sum _i \xi _i(x_i+\phi _i)\).
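
As a quick check of the last claim, the following short computation (using only the linearity of the equation and (4)) confirms that the corrected affine maps are indeed a-harmonic:

$$\begin{aligned} -\nabla \cdot a\nabla \Big (b+\sum _i \xi _i(x_i+\phi _i)\Big ) = \sum _i \xi _i \big (-\nabla \cdot a(e_i+\nabla \phi _i)\big ) = 0. \end{aligned}$$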

With the help of the corrector, one may characterize the effective coefficient \(a_{hom}\): In our setting of stochastic homogenization, the effective coefficient is given by the formula

$$\begin{aligned} a_{hom} e_i = \left\langle a(e_i+\nabla \phi _i) \right\rangle , \end{aligned}$$
(5)

where \(\langle \cdot \rangle \) refers to the expectation with respect to our ensemble (i.e. probability measure).

In the language of a conducting medium with conductivity tensor a (note that in this picture, one has \(f\equiv 0\) in (1)), the quantity \(E_i:=e_i+\nabla \phi _i\) corresponds to the (curl-free) “microscopic” electric field associated with a “macroscopic” electric field \(e_i\) (and, therefore, \(\phi _i\) corresponds to the “microscopic” correction to the “macroscopic” electric potential \(x_i\)). The corresponding (divergence-free) “microscopic” current density is given by

$$\begin{aligned} q_i :=a(e_i+\nabla \phi _i), \end{aligned}$$
(6)

while the “macroscopic” current density associated with the “macroscopic” electric field \(e_i\) is given by the “average” of this quantity, i.e. by the expression (5).

In periodic homogenization of linear elliptic PDEs, it turns out to be convenient to introduce a dual quantity to the corrector \(\phi _i\) (cf. e.g. [18, p.27]): One constructs a tensor field \(\sigma _{ijk}\), skew-symmetric in the last two indices, which is a potential for the flux correction \(q_i-a_{hom}e_i\) in the sense

$$\begin{aligned} \nabla \cdot \sigma _{i}=a(e_i+\nabla \phi _i)-a_{hom}e_i, \end{aligned}$$
(7)

where we have set \((\nabla \cdot \sigma _{i})_j:=\sum _{k=1}^d\partial _k\sigma _{ijk}\). With the help of this “extended corrector” \((\phi , \sigma )\), it is possible to give a bound on the homogenization error (in terms of appropriate norms of \(\phi \) and \(\sigma \)).
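
To illustrate how such a bound may arise, consider the following formal computation. It is a sketch only: we assume that \(u_{hom}\) is smooth, we sum over the repeated index i, and we use the shorthand \((\sigma _i v)_j:=\sum _k\sigma _{ijk}v_k\), which is introduced here purely for this illustration. By (6) and (7), the flux of the two-scale expansion \(u_{hom}+\phi _i\partial _iu_{hom}\) reads

$$\begin{aligned} a\nabla (u_{hom}+\phi _i\partial _iu_{hom})&=a(e_i+\nabla \phi _i)\partial _iu_{hom}+\phi _ia\nabla \partial _iu_{hom}\\&=a_{hom}\nabla u_{hom}+(\nabla \cdot \sigma _i)\partial _iu_{hom}+\phi _ia\nabla \partial _iu_{hom}. \end{aligned}$$

The skew-symmetry of \(\sigma _{ijk}\) in the last two indices yields \(\nabla \cdot \big ((\nabla \cdot \sigma _i)\partial _iu_{hom}\big )=-\nabla \cdot (\sigma _i\nabla \partial _iu_{hom})\). Taking the divergence and using (1) and (2), one therefore formally obtains

$$\begin{aligned} -\nabla \cdot a\nabla \big (u-(u_{hom}+\phi _i\partial _iu_{hom})\big )=\nabla \cdot \big ((\phi _ia-\sigma _i)\nabla \partial _iu_{hom}\big ). \end{aligned}$$

The right-hand side is in divergence form and involves only \((\phi ,\sigma )\) themselves (not their gradients), which suggests why growth bounds on \((\phi ,\sigma )\) translate into bounds on the homogenization error.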

One of the main merits of [11] is the discovery of the usefulness of this extended corrector \((\phi ,\sigma )\) in the context of stochastic homogenization. For stationary and ergodic ensembles \(\langle \cdot \rangle \) of \(\lambda \)-uniformly elliptic and symmetric coefficient fields \(a=a(x)\) on \({\mathbb {R}}^d\), in [11] correctors \(\phi _i\) and \(\sigma _{ijk}\) such that

$$\begin{aligned} \nabla \phi _i,\nabla \sigma _{ijk}\; \begin{array}{l} \hbox {are stationary,}\\ \hbox {of bounded second moment,}\\ \hbox {and of vanishing expectation,} \end{array} \end{aligned}$$
(8)

have been constructed. As a consequence of this and of ergodicity, the \(\phi _i\) and \(\sigma _{ijk}\) almost surely display sublinear growth. Note that in case of \(\sigma _i\), the choice of the appropriate gauge is important for the property (8) and for our work, as Eq. (7) determines \(\sigma _i\) (which by its skew-symmetry and its behavior under change of coordinates may be identified with a \((d-1)\)-form) only up to the exterior derivative of a \((d-2)\)-form. In fact, the choice of the gauge in [11] is such that

$$\begin{aligned} -\triangle \sigma _{ijk}=\partial _jq_{ik}-\partial _kq_{ij}, \end{aligned}$$
(9)

which in view of (4) and (6) is clearly compatible with (7).
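
The compatibility can be made explicit by the following short computation, given here as a sketch. Applying \(\partial _k\) to (9), summing over k, and using that \(\nabla \cdot q_i=0\) by (4) and (6), one finds

$$\begin{aligned} -\triangle (\nabla \cdot \sigma _i)_j=\sum _{k=1}^d\partial _k\partial _jq_{ik}-\triangle q_{ij}=\partial _j(\nabla \cdot q_i)-\triangle q_{ij}=-\triangle q_{ij}. \end{aligned}$$

Hence each component of \(\nabla \cdot \sigma _i-q_i\) is harmonic. Since this field is stationary with bounded second moment by (8), one expects it (by a Liouville-type argument under ergodicity, which we do not carry out here) to be constant and thus equal to its expectation. By (8) and (5), this expectation is \(-a_{hom}e_i\), which is precisely (7).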

Notation

To quantify the ellipticity and boundedness of our coefficient fields, throughout the paper we shall work with the assumptions

$$\begin{aligned} av\cdot v&\ge \lambda |v|^2\quad \text {for all }v\in {\mathbb {R}}^d, \end{aligned}$$
(10)
$$\begin{aligned} |av|&\le |v|\quad \text {for all }v\in {\mathbb {R}}^d, \end{aligned}$$
(11)

where \(\lambda \in (0,1)\). Note that in view of rescaling, the upper bound (11) on a does not induce a loss of generality of our results.

For our convenience, throughout the paper we shall assume that our coefficient field a is symmetric. The arguments however easily carry over to the case of non-symmetric coefficient fields by simultaneously considering the correctors for the dual equation (i.e. the PDE with coefficient field \(a^*\), \(a^*\) denoting the transpose of a).

The expression \(s\lesssim t\) is an abbreviation for \(s\le C t\) with C a generic constant only depending on the dimension d, the exponent \(\beta >0\), and the ellipticity ratio \(\lambda >0\).

The expression \(s\ll t\) stands for \(s\le \frac{1}{C} t\) with C a generic sufficiently large constant only depending on the dimension d, the exponent \(\beta >0\), and the ellipticity ratio \(\lambda >0\).

By I(E) we denote the characteristic function of an event E.

The notation \(-\int _A\) refers to the average integral over the set A, i.e. we have \(-\int _A f\,dx=\frac{1}{|A|}\int _A f\,dx\).

In the sequel, \((\phi ,\sigma )\) stands for any component \(\phi _i,\sigma _{ijk}\) for \(i,j,k=1,\ldots ,d\).

2 Main results and structure of proof

Let us now state our main theorem. To quantify the sublinear growth of the extended corrector \((\phi ,\sigma )\), we first quantify the decay of spatial averages of \(\nabla (\phi ,\sigma )\) over larger scales. In view of the decorrelation assumption (3) for our ensemble of coefficient fields, we expect that, up to logarithms, it is the exponent \(\frac{\beta }{2}\) that governs the decay of averages of \(\nabla (\phi ,\sigma )\) and the improvement over linear growth for \((\phi ,\sigma )\). Indeed, this exponent is reflected in the theorem.

Theorem 1

Let \({\tilde{a}}={\tilde{a}}(x)\) be a tensor-valued Gaussian random field on \({\mathbb {R}}^d\) that is centered (i.e. of vanishing expectation) and stationary (i.e. invariant under translation); assume that the covariance of \({\tilde{a}}\) satisfies the estimate

$$\begin{aligned} \left| \langle {\tilde{a}}(x)\otimes {\tilde{a}}(0)\rangle \right| \le |x|^{-\beta }\quad \hbox {for all}\;x\in {\mathbb {R}}^d \end{aligned}$$
(12)

for some \(\beta \in (0,d)\). Let \(\Phi :{\mathbb {R}}^{d\times d}\rightarrow {\mathbb {R}}^{d\times d}\) be a Lipschitz map with Lipschitz constant \(\le 1\); suppose that \(\Phi \) takes values in the set of symmetric matrices subject to the ellipticity and boundedness assumptions (10), (11). Define the ensemble \(\langle \cdot \rangle \) as the probability distribution of a, where a is the image of \({\tilde{a}}\) under pointwise application of the map \(\Phi \), i.e. \(a(x):=\Phi ({\tilde{a}}(x))\).

Assume in addition on the ensemble \(\langle \cdot \rangle \) that \(\beta \) in (12) is sufficiently small in the sense of

$$\begin{aligned} \beta \le \frac{1}{C}, \end{aligned}$$
(13)

where C denotes a generic constant only depending on d and \(\lambda \).

  1. (i)

    Consider a linear functional \(F=Fh\) on vector fields \(h=h(x)\) satisfying the boundedness property

    $$\begin{aligned} |Fh|\le \left( -\int _{|x|\le r}|h|^\frac{2d}{d+\beta }dx\right) ^\frac{d+\beta }{2d} \end{aligned}$$
    (14)

    for some radius \(r>0\). Then the random variable \(F\nabla (\phi ,\sigma )\) satisfies uniform Gaussian bounds in the sense of

    $$\begin{aligned} \langle I(|F\nabla (\phi ,\sigma )|\ge M)\rangle \le C\exp \left( -\frac{1}{C}r^\beta M^2\right) \quad \hbox {for all}\;M\le 1. \end{aligned}$$
    (15)
  2. (ii)

    There exists a (random) radius \(r_*\) for which the “iterated logarithmic” bound

    $$\begin{aligned} \frac{1}{r^2}-\int _{|x|\le r}|(\phi ,\sigma )--\int _{|x|\le r}(\phi ,\sigma )|^2dx\le \left( \frac{r_*}{r}\right) ^\beta \log \left( e+\log \left( \frac{r}{r_*}\right) \right) \quad \hbox {for}\;r\ge r_*\nonumber \\ \end{aligned}$$
    (16)

    holds and which satisfies the stretched exponential bound

    $$\begin{aligned} \left\langle \exp \left( \frac{1}{C}r_*^\beta \right) \right\rangle \le C. \end{aligned}$$
    (17)

Morally speaking, Theorem 1 converts statistical information on the coefficient field a (or rather \({\tilde{a}}\)) into statistical information on the coefficient field \(\nabla \phi :=\nabla (\phi _1,\ldots ,\phi _d)\) related by (4). Despite the nonlinearity of the map \(a\mapsto \nabla \phi \), which only in its linearization around \(a=\mathrm{id}\) turns into the Helmholtz projection, Theorem 1 states that \(\nabla \phi \) essentially inherits the statistics of a: (15) implies in particular that spatial averages \(F=-\int _{|x|\le r}\nabla \phi dx\) of \(\nabla \phi \) satisfy the same bounds as if \(\nabla \phi \) itself was Gaussian with correlation decay (3). On the level of these Gaussian bounds, the only price to pay for the nonlinearity is the restriction \(M\lesssim 1\) in (15) on the threshold.
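
To make the comparison with the Gaussian case concrete, suppose, purely for illustration, that \(\xi \) is a scalar, stationary, centered Gaussian field with \(|\langle \xi (x)\xi (0)\rangle |\le |x|^{-\beta }\) (the field \(\xi \) and the notation \(B_r:=\{|x|\le r\}\) are hypothetical and not used elsewhere). The spatial average of \(\xi \) over \(B_r\) is then a centered Gaussian random variable whose variance is estimated by

$$\begin{aligned} \left\langle \Big (\frac{1}{|B_r|}\int _{B_r}\xi \,dx\Big )^2\right\rangle&=\frac{1}{|B_r|^2}\int _{B_r}\int _{B_r}\langle \xi (x)\xi (y)\rangle \,dy\,dx\\&\le \frac{1}{|B_r|^2}\int _{B_r}\int _{|z|\le 2r}|z|^{-\beta }\,dz\,dx \le C(d,\beta )\,r^{-\beta }, \end{aligned}$$

where we have used \(\beta <d\) to integrate the singularity at the origin. The standard Gaussian tail bound then gives a probability of at most \(2\exp \big (-\frac{1}{C}r^\beta M^2\big )\) for the event that the average exceeds M in absolute value, which is the same scaling in r and M as in (15).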

Incidentally, the way we obtain (ii) from (i) bears similarities with an argument in [1] in the sense that a decomposition into Haar wavelets is implicitly used.

To see the optimality of the corrector bound (16) and the stochastic integrability (17), consider a perturbative setting: Let \({\tilde{a}}\) be a scalar, centered, stationary Gaussian field with covariance given by

$$\begin{aligned} \langle {\tilde{a}}(x){\tilde{a}}(0) \rangle = |x|^{-\beta } \end{aligned}$$
(18)

and consider the infinitesimally perturbed Laplace operator

$$\begin{aligned} a:=\left( 1+\frac{1}{\pi }{\text {tan}}^{-1}(\delta \pi \tilde{a})\right) {\text {Id}}=(1+\delta \,{\tilde{a}}) {\text {Id}}, \end{aligned}$$
(19)

where \(\delta >0\) is infinitesimal. In this setting, by (4) the gradient of the corrector \(\phi \) is given by

$$\begin{aligned} \nabla \phi _i (x) = -\delta (\nabla \partial _i \Phi *{\tilde{a}}) (x), \end{aligned}$$

where \(\Phi \) denotes the fundamental solution of the (negative) Laplacian (note that this identity is just formal, as we silently pass over integrability issues, but it may be given a rigorous meaning). In particular, \(\nabla \phi _i\) is a stationary centered Gaussian random field satisfying

$$\begin{aligned}&Cov(\partial _j \phi _i(x),\partial _k \phi _i(0))\nonumber \\&\quad =\delta ^2 \left\langle (\partial _j \partial _i \Phi *{\tilde{a}})(x) (\partial _k \partial _i \Phi *{\tilde{a}})(0) \right\rangle \nonumber \\&\quad =\delta ^2 \int \int \partial _j \partial _i \Phi (x-z) \partial _k \partial _i \Phi (0-w) Cov({\tilde{a}}(z),{\tilde{a}}(w)) \,dz\,dw\nonumber \\&\quad =\delta ^2 \int \int \partial _j \partial _i \Phi (x-z) \partial _k \partial _i \Phi (0-w) |z-w|^{-\beta } \,dz\,dw\nonumber \\&\quad = C(d,\beta ,i,j,k) \delta ^2 |x|^{-\beta } \end{aligned}$$
(20)

with an explicitly computable (and for general i, j, k nonzero) constant \(C(d,\beta ,i,j,k)\). The difference of the values of \(\phi _i\) at two points at distance r is therefore a centered Gaussian random variable with variance \(\sim r^{2-\beta }\), which entails that the moment bound (17) for the factor \(r_*^\beta \log (e+\log (r/r_*))\) in the estimate (16) is (almost) optimal. The scaling in r of the bound (16) is optimal in view of the law of the iterated logarithm; details are provided in the “Appendix”.

To obtain an estimate like (15), the starting point of our proof is the Gaussian concentration of measure applied to \(\tilde{a}\). Recall the notion of the covariance operator \({\text {Cov}}\), which in our setting of a stationary centered Gaussian random field \({\tilde{a}}\) is given as the convolution with the tensor field \(\langle {\tilde{a}}(x) \otimes {\tilde{a}}(0) \rangle \).

Proposition 1

(Concentration of Measure, cf. e.g. [20, Proposition 2.18]). Let \({\tilde{a}}={\tilde{a}}(x)\) be a tensor-valued Gaussian random field on \({\mathbb {R}}^d\) that is centered and stationary; denote its covariance operator by \({\text {Cov}}\). Consider a random variable F, that is, a function(al) \(F=F({\tilde{a}})\). Suppose that F is 1-Lipschitz in the sense that its functional derivative, or rather its Fréchet derivative with respect to \(L^2({\mathbb {R}}^d;{\mathbb {R}}^{d\times d})\), \(\frac{\partial F}{\partial {\tilde{a}}}=\frac{\partial F}{\partial {\tilde{a}}}({\tilde{a}},x)\), which can be considered a random tensor field and assimilated with a Malliavin derivative, satisfies

$$\begin{aligned} \int _{{\mathbb {R}}^d}\frac{\partial F}{\partial {\tilde{a}}}({\tilde{a}},x) ({\text {Cov}}\frac{\partial F}{\partial {\tilde{a}}}(\tilde{a},\cdot ))(x)dx \le 1\quad \hbox {for almost every}\;{\tilde{a}}. \end{aligned}$$
(21)

Then F has Gaussian moments in the sense of

$$\begin{aligned} \langle \exp (M (F-\langle F\rangle ))\rangle \le \exp \left( \frac{M^2}{2}\right) \quad \hbox {for all}\;M\ge 0. \end{aligned}$$
(22)

Furthermore, for any \(M\ge 0\) we have the estimate

$$\begin{aligned} \langle I(|F-\langle F\rangle |\ge M)\rangle \le 2\exp \left( -\frac{M^2}{2}\right) . \end{aligned}$$
(23)
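
For the reader's convenience, we recall how the tail bound (23) follows from the exponential moment bound (22) by the standard Chebyshev argument; the auxiliary parameter \(t\ge 0\) is introduced only for this computation. For any \(t\ge 0\) and \(M\ge 0\), (22) yields

$$\begin{aligned} \langle I(F-\langle F\rangle \ge M)\rangle \le \exp (-tM)\,\langle \exp (t(F-\langle F\rangle ))\rangle \le \exp \left( -tM+\frac{t^2}{2}\right) , \end{aligned}$$

and the choice \(t=M\) gives the bound \(\exp (-\frac{M^2}{2})\). Since the condition (21) is quadratic in \(\frac{\partial F}{\partial {\tilde{a}}}\), it is also satisfied by \(-F\), so that the same bound holds for the event \(F-\langle F\rangle \le -M\). Adding the two estimates gives (23).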

We now substitute our assumption (21) on the Fréchet derivative by a stronger but more tractable condition.

Lemma 1

Let \({\tilde{a}}={\tilde{a}}(x)\) be a tensor-valued Gaussian random field on \({\mathbb {R}}^d\) that is centered and stationary; denote its covariance operator as \({\text {Cov}}\) and suppose that for some \(\beta \in (0,d)\) we have the bound

$$\begin{aligned} \left| \langle {\tilde{a}}(x)\otimes {\tilde{a}}(0)\rangle \right| \le |x|^{-\beta }\quad \hbox {for all}\;x\in {\mathbb {R}}^d. \end{aligned}$$

Let \(\Phi :{\mathbb {R}}^{d\times d} \rightarrow {\mathbb {R}}^{d\times d}\) be a 1-Lipschitz map; denote the probability distribution of \(\Phi ({\tilde{a}})\) as \(\langle \cdot \rangle \). Consider a functional F on the space of tensor fields \({\tilde{a}}\) of the form \(F=F(a)\) with \(a(x):=\Phi ({\tilde{a}}(x))\); we shall use the abbreviation \(F(\tilde{a})\) for \(F(\Phi ({\tilde{a}}))\). Let \(q\in (1,2)\) be given by

$$\begin{aligned} \frac{1}{q}=1-\frac{\beta }{2d} \end{aligned}$$
(24)

and suppose that the Fréchet derivative of F with respect to \(L^2({\mathbb {R}}^d;{\mathbb {R}}^{d\times d})\) satisfies

$$\begin{aligned} \left( \int |\frac{\partial F}{\partial a}|^q dx\right) ^\frac{2}{q}\ll 1 \quad \hbox {for }\langle \cdot \rangle \text {-almost every }\;a. \end{aligned}$$
(25)

Then the estimate (21) is satisfied, i.e. we have

$$\begin{aligned} \int _{{\mathbb {R}}^d}\frac{\partial F}{\partial {\tilde{a}}}({\tilde{a}},x) \left( {\text {Cov}}\frac{\partial F}{\partial {\tilde{a}}}(\tilde{a},\cdot )\right) (x)dx \le 1\quad \hbox {for almost every}\;{\tilde{a}}. \end{aligned}$$

We observe that if q and \(\beta \) are related by (24), as \(\beta \uparrow d\) we have \(q\uparrow 2\) and for \(\beta \downarrow 0\) we have \(q\downarrow 1\).
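
The choice (24) of the exponent q can be motivated by the following sketch, which relies on the Hardy–Littlewood–Sobolev inequality; we stress that this is an illustration under the assumption that \(\Phi \) is differentiable at the relevant points, and not necessarily the argument used in the proof. Since \(\Phi \) is 1-Lipschitz, the chain rule gives the pointwise bound \(|\frac{\partial F}{\partial {\tilde{a}}}|\le |\frac{\partial F}{\partial a}|\). Together with the covariance bound, this yields

$$\begin{aligned} \int _{{\mathbb {R}}^d}\frac{\partial F}{\partial {\tilde{a}}}\left( {\text {Cov}}\frac{\partial F}{\partial {\tilde{a}}}\right) dx&\le \int \int \left| \frac{\partial F}{\partial a}(x)\right| |x-y|^{-\beta }\left| \frac{\partial F}{\partial a}(y)\right| \,dx\,dy\\&\le C(d,\beta )\left( \int \left| \frac{\partial F}{\partial a}\right| ^q dx\right) ^\frac{2}{q}, \end{aligned}$$

where the second inequality is the Hardy–Littlewood–Sobolev inequality, valid for \(\frac{2}{q}+\frac{\beta }{d}=2\), which is equivalent to (24). The constant \(C(d,\beta )\) is then absorbed by the smallness assumption in (25).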

For linear functionals of (the gradient of) the corrector (which are therefore nonlinear functionals of the coefficient field a), we now establish an explicit representation of the Fréchet derivative; this will aid us in verifying the Lipschitz condition (25) and thus ultimately the concentration of measure statements (22) and (23) for (an appropriate modification of) such functionals.

Lemma 2

Consider a linear functional on \(L^\frac{p}{p-1}({\mathbb {R}}^d;{\mathbb {R}}^d)\) of the form

$$\begin{aligned} Fh:=\int _{{\mathbb {R}}^d} g\cdot h\,dx, \end{aligned}$$
(26)

where \(g\in L^p({\mathbb {R}}^d;{\mathbb {R}}^d)\), \(p\ge 2\), and \({\text {supp}}g\subset \{|x|\le r\}\) for some \(r\ge 1\). Let a be some coefficient field subject to the ellipticity and boundedness conditions (10), (11). Then the following two assertions hold:

  1. (1)

    Consider the Fréchet derivative \(\frac{\partial F}{\partial a}\) of the functional \(F:=F\nabla \sigma _{ijk}\) (note that this functional is nonlinear in a, although it is linear in \(\sigma _{ijk}\)) at a (for some fixed ijk). Introduce the decaying solutions v, \({\tilde{v}}_{jk}\) to the equations

    $$\begin{aligned} -\triangle v=\nabla \cdot g \end{aligned}$$
    (27)

    and (where \(a^*\) denotes the transpose of a)

    $$\begin{aligned} -\nabla \cdot a^*\left( \nabla {\tilde{v}}_{jk}+(\partial _j ve_k-\partial _k ve_j)\right) =0. \end{aligned}$$
    (28)

    We then have the representation

    $$\begin{aligned} \frac{\partial F}{\partial a}(a)=\left( \partial _jve_k-\partial _kve_j+\nabla \tilde{v}_{jk}\right) \otimes (\nabla \phi _i+e_i). \end{aligned}$$
    (29)
  2. (2)

    Consider the Fréchet derivative \(\frac{\partial F}{\partial a}\) of the functional \(F:=F\nabla \phi _i\) at a. Introduce the decaying solution \({\overline{v}}\) to the equation (again, \(a^*\) denoting the transpose of a)

    $$\begin{aligned} -\nabla \cdot a^*\nabla {\overline{v}} = \nabla \cdot g. \end{aligned}$$
    (30)

    We then have the representation

    $$\begin{aligned} \frac{\partial F}{\partial a}(a) = \nabla {\overline{v}} \otimes (\nabla \phi _i + e_i). \end{aligned}$$
    (31)
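
The representation (31) can be understood through the following formal duality computation, a sketch in which we ignore integrability issues and boundary terms; the symbols \(\delta a\) (a compactly supported perturbation of a) and \(\delta \phi _i\) (the induced first-order change of \(\phi _i\)) are introduced only for this illustration. Differentiating (4) gives \(-\nabla \cdot a\nabla \delta \phi _i=\nabla \cdot \delta a(e_i+\nabla \phi _i)\). Using (26), (30), and integration by parts, we obtain

$$\begin{aligned} \delta F=\int g\cdot \nabla \delta \phi _i\,dx&=-\int (\nabla \cdot g)\,\delta \phi _i\,dx=\int (\nabla \cdot a^*\nabla {\overline{v}})\,\delta \phi _i\,dx\\&=-\int \nabla {\overline{v}}\cdot a\nabla \delta \phi _i\,dx=\int \nabla {\overline{v}}\cdot \delta a(e_i+\nabla \phi _i)\,dx, \end{aligned}$$

where the last step follows from testing the equation for \(\delta \phi _i\) with \({\overline{v}}\). Reading off the coefficient of \(\delta a\) gives \(\nabla {\overline{v}}\otimes (\nabla \phi _i+e_i)\), in agreement with (31).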

The previous explicit representation of the Fréchet derivative for certain linear functionals of (the gradient of) the corrector \((\phi ,\sigma )\) enables us to verify the bound (25) for the Malliavin derivative, provided that a certain mean value property is satisfied for a-harmonic functions. Note that the latter requirement is a condition on the coefficient field a; in Lemma 4 below we shall provide a sufficient condition for this property to hold.

As the functionals which the next lemma shall be applied to are basically averages of \(\nabla \phi \) or \(\nabla \sigma \) over cubes of a certain scale r, we state the lemma in a form which makes it directly applicable in such a setting. In particular, the boundedness assumption (32) for the linear functional is motivated by these considerations.

Lemma 3

Let \(r\ge 1\) and consider a linear functional \(h\mapsto Fh\) on \(L^\frac{p}{p-1}({\mathbb {R}}^d;{\mathbb {R}}^d)\) (with \(p\in (2,\infty )\)) satisfying the support and boundedness condition

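$$\begin{aligned} |Fh|\le \left( -\int _{|x|\le r}|h|^\frac{p}{p-1}dx\right) ^\frac{p-1}{p}. \end{aligned}$$
(32)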

Suppose that the constraint

$$\begin{aligned} 2<p<2+c(d,\lambda ) \end{aligned}$$
(33)

holds (with \(c(d,\lambda )>0\) to be fixed in the proof below). Let \(q\in (1,2)\) be related to p through

$$\begin{aligned} \frac{1}{p}=\frac{1}{q}-\frac{1}{2}. \end{aligned}$$
(34)

Consider the Fréchet derivative \(\frac{\partial F}{\partial a}\) of the functional \(F:=F\nabla \sigma _{ijk}\) (or the functional \(F:=F\nabla \phi _i\); note that these functionals are nonlinear functionals of a) at some symmetric coefficient field a subject to the conditions (10), (11).

Provided that the coefficient field a is such that the mean value property

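$$\begin{aligned} -\int _{|x|\le \rho }|\nabla u|^2dx\lesssim -\int _{|x|\le R}|\nabla u|^2dx\quad \hbox {for all}\;R\ge \rho \ge r \end{aligned}$$
(35)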

holds for any a-harmonic function u and provided that furthermore a is such that

(36)

is satisfied, we have the estimate

$$\begin{aligned} \left( \int |\frac{\partial F}{\partial a}|^qdx\right) ^\frac{2}{q} \lesssim r^{-\frac{(p-2)d}{p}}. \end{aligned}$$
(37)

Note that for q related to \(\beta \) through (24) and p related to q through (34), we have \(r^{-\frac{(p-2)d}{p}}=r^{-\beta }\), i.e. by (37) the \(L^q\)-norm of the Malliavin derivative decays like \(r^{-\frac{\beta }{2}}\). This demonstrates that for functionals like our averages of \(\nabla (\phi ,\sigma )\) (which have vanishing expectation due to the vanishing expectation of \(\nabla (\phi ,\sigma )\)), the concentration of measure indeed improves on large scales with the desired exponent: The “typical value” of the average of \(\nabla (\phi ,\sigma )\) on some scale r decays like \(r^{-\frac{\beta }{2}}\).
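
The exponent arithmetic behind this remark may be spelled out as follows. Combining (24) and (34) gives

$$\begin{aligned} \frac{1}{p}=\frac{1}{q}-\frac{1}{2}=\frac{1}{2}-\frac{\beta }{2d}=\frac{d-\beta }{2d},\qquad \text {i.e.}\quad p=\frac{2d}{d-\beta }, \end{aligned}$$

and therefore

$$\begin{aligned} \frac{(p-2)d}{p}=d\left( 1-\frac{2}{p}\right) =\beta ,\qquad \frac{p}{p-1}=\frac{2d}{d+\beta },\qquad p-2=\frac{2\beta }{d-\beta }. \end{aligned}$$

In particular, the dual exponent \(\frac{p}{p-1}\) coincides with the exponent in (14), and the constraint (33) on p amounts to a smallness condition on \(\beta \) of the type (13). As a numerical illustration (with values chosen arbitrarily), \(d=3\) and \(\beta =\frac{3}{10}\) give \(q=\frac{20}{19}\), \(p=\frac{20}{9}\), and \(\frac{p}{p-1}=\frac{20}{11}\).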

We now have to provide a sufficient condition for the mean value property for a-harmonic functions (35). To do so, we make use of the following result from [11], which provides the mean-value property assuming just an appropriate sublinearity condition on the corrector \((\phi ,\sigma )\).

Proposition 2

(see [11, Lemma 2]) There exists a constant \(C_0\) only depending on dimension d and ellipticity ratio \(\lambda >0\) with the following property: Suppose that for an elliptic coefficient field a subject to the ellipticity and boundedness conditions (10) and (11) the scalar and vector potentials \((\phi ,\sigma )\), cf. (4) and (7), satisfy

$$\begin{aligned} \left( -\int _{|x|\le R}|(\phi ,\sigma )--\int _{|x|\le R}(\phi ,\sigma )|^2dx\right) ^\frac{1}{2}\le \frac{1}{C_0}R \quad \hbox {for all}\;R\ge r. \end{aligned}$$
(38)

Then for any two radii \(R\ge r\) and \(\rho \in [r,R]\) and any a-harmonic function u in \(\{|x|\le R\}\) we have

$$\begin{aligned} -\int _{|x|\le \rho }|\nabla u|^2dx\lesssim -\int _{|x|\le R}|\nabla u|^2dx. \end{aligned}$$

We shall show in the proof of the next lemma that the quantitative sublinearity condition on the corrector (38) may be reduced to a smallness assumption on a certain family of linear functionals of the gradient of the corrector. This reduction relies on the compactness of the left-hand side of (38) with respect to the \(L^2\)-norm of \(\nabla (\phi ,\sigma )\), which in turn may be estimated via Caccioppoli’s estimate by the left-hand side. It appeals to a quantitative version of inequalities in functional analysis where an intermediate norm is estimated by a bit of a stronger norm and a lot of a weaker (semi-)norm, the role of which is played by the expression in (39). A slight subtlety follows from the fact that the use of Caccioppoli’s inequality increases the radius (by a factor of two, say), so that one has to buckle on the level of all dyadic radii R larger than the given radius r, cf. the expression in (39). This requires the qualitative a priori information (36). One has a lot of flexibility in the choice of the functionals \(F_n\); for pure convenience we choose the same functionals, of Haar wavelet-type, that play a prominent role in the proof of Assertion (ii) of Theorem 1. Other natural choices would be the first N eigenfunctions of the Neumann-Laplacian, like in Step 7 of the proof of Theorem 2 in [6] or the proof of Lemma 2.6 in [15].

Lemma 4

There exist \(N\lesssim 1\) linear functionals \(F_1,\ldots ,F_N\) satisfying the support and boundedness condition

$$\begin{aligned} |F_n h|\lesssim \left( \int _{|x|\le 1} |h|^\frac{p}{p-1} ~dx \right) ^\frac{p-1}{p} \end{aligned}$$

such that the following holds: Denote by \(F_{n,R}\) the rescaling of \(F_n\) given by \(F_{n,R}h=F_n\big (h(\frac{\cdot }{R})\big )\). Let \(r\ge 0\). Provided that the condition

$$\begin{aligned} \sup _{R\ge r\;\text {dyadic};n=1,\ldots ,N}F_{n,R}\nabla (\phi ,\sigma )\ll 1 \end{aligned}$$
(39)

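and the condition (36) are satisfied, we have the smallness estimate (38) for the corrector; in particular, by Proposition 2 the mean value property (35) holds for any a-harmonic function u on scales \(\ge r\), i.e. we have

$$\begin{aligned} -\int _{|x|\le \rho }|\nabla u|^2dx\lesssim -\int _{|x|\le R}|\nabla u|^2dx \quad \hbox {for all}\;R\ge \rho \ge r. \end{aligned}$$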

With these preparations, we are able to establish our main theorem. The main technical difficulty in the proof below is that our estimate

$$\begin{aligned} \left( \int \left| \frac{\partial F}{\partial a}\right| ^q dx \right) ^\frac{2}{q} \lesssim r^{-\beta } \end{aligned}$$

for the Malliavin derivative of linear functionals of the gradient of the corrector (cf. (37)) is a conditional bound: It relies on the assumption that the mean-value property (35) holds for a-harmonic functions on scales larger than r. For the concentration of measure estimate (22), however, an unconditional estimate of the form (21) or (25) (the latter being a proxy for (21)) is needed. By Lemma 4 we know that the mean-value property holds, provided that for a certain family of linear functionals of the corrector the smallness estimate

$$\begin{aligned} \sup _{R\ge r~\text {dyadic}; n=1,\ldots ,N} F_{n,R} \nabla (\phi ,\sigma ) \le \frac{1}{C_0} \end{aligned}$$

is satisfied (\(C_0\) being a universal constant). To circumvent this problem, in the proof below we therefore introduce the family of functionals

$$\begin{aligned} {\bar{F}}_r := \min \left\{ \sup _{R\ge r~\text {dyadic}; n=1,\ldots ,N} F_{n,R} \nabla (\phi ,\sigma ),\frac{1}{C_0}\right\} , \end{aligned}$$

for which by design the unconditional bound for the Malliavin derivative

$$\begin{aligned} \left( \int \left| \frac{\partial {\bar{F}}_r}{\partial a}\right| ^q dx \right) ^\frac{2}{q} \lesssim r^{-\beta } \end{aligned}$$

holds. Therefore, concentration of measure is applicable to \(\bar{F}_r\). The remainder of the proof of the first part of our theorem below is dedicated to handling the (a priori unknown) expectation \(\langle {\bar{F}}_r\rangle \).

The proof of the second assertion of our main theorem will mainly rely on the first assertion of the theorem as well as the quantitative improvement of the Malliavin derivative of averages of \((\nabla \phi ,\nabla \sigma )\) on larger scales, as captured by the estimate (37).

3 Concentration of measure and estimates of the Malliavin derivative

3.1 Concentration of measure

Proof of Proposition 1

For the proof of the concentration of measure estimate (22), we refer the reader to [20, Proposition 2.18]. We now establish (23). By Chebychev’s inequality, (22) implies \(\langle I(F-\langle F\rangle \ge M)\rangle \le \exp (-\frac{M^2}{2})\). In combination with the same estimate with F replaced by \(-F\), we obtain (23). \(\square \)
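
The Chebyshev step may be spelled out as follows. This is only a sketch under the assumption (not restated in this section) that (22) is an exponential moment bound of the form \(\langle \exp (\lambda (F-\langle F\rangle ))\rangle \le \exp (\frac{\lambda ^2}{2})\) for all \(\lambda \ge 0\); the parameter \(\lambda \) is notation introduced here. For \(M\ge 0\) one then has

$$\begin{aligned} \langle I(F-\langle F\rangle \ge M)\rangle \le e^{-\lambda M}\langle \exp (\lambda (F-\langle F\rangle ))\rangle \le \exp \left( -\lambda M+\frac{\lambda ^2}{2}\right) , \end{aligned}$$

and the choice \(\lambda =M\) gives the bound \(\exp (-\frac{M^2}{2})\).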

Proof of Lemma 1

We need to verify that the condition (21) is implied by the assumption (25).

To do so, we first note that by Hölder’s inequality we have for any exponent \(1<q<\infty \)

$$\begin{aligned} \int \frac{\partial F}{\partial {\tilde{a}}} \hbox {Cov}\frac{\partial F}{\partial {\tilde{a}}}dx\le & {} \left( \int |\frac{\partial F}{\partial {\tilde{a}}}|^qdx\right) ^\frac{1}{q} \left( \int |\hbox {Cov}\frac{\partial F}{\partial {\tilde{a}}}|^\frac{q}{q-1}dx\right) ^\frac{q-1}{q}. \end{aligned}$$

Since \(\hbox {Cov}\) is the convolution with \(\langle \tilde{a}(x)\otimes {\tilde{a}}(0)\rangle \) and since we have the bound \(|\langle {\tilde{a}}(x)\otimes {\tilde{a}}(0)\rangle |\le |x|^{-\beta }\), we have for the second factor

$$\begin{aligned} \left( \int |\hbox {Cov}\frac{\partial F}{\partial \tilde{a}}|^\frac{q}{q-1}dx\right) ^\frac{q-1}{q}\le & {} \left( \int \left| \int \frac{1}{|x-y|^\beta } |\frac{\partial F}{\partial {\tilde{a}}}(y)|dy\right| ^\frac{q}{q-1}dx\right) ^\frac{q-1}{q}, \end{aligned}$$

which allows us to use the Hardy–Littlewood–Sobolev inequality

$$\begin{aligned} \left( \int \left| \int \frac{1}{|x-y|^\beta } |\frac{\partial F}{\partial {\tilde{a}}}(y)|dy\right| ^\frac{q}{q-1}dx\right) ^\frac{q-1}{q}\lesssim & {} \left( \int |\frac{\partial F}{\partial \tilde{a}}|^qdx\right) ^\frac{1}{q}, \end{aligned}$$

provided the exponents q and \(\beta \) are related by (24). From this string of inequalities we learn that (21) also holds provided

$$\begin{aligned} \left( \int |\frac{\partial F}{\partial \tilde{a}}|^qdx\right) ^\frac{2}{q}\ll 1 \quad \hbox {for almost every}\;\tilde{a}. \end{aligned}$$
(40)

We now change variables according to \(a(x)=\Phi ({\tilde{a}}(x))\); by the chain rule for \(F({\tilde{a}})=F(\Phi ({\tilde{a}}))\) we have \(\frac{\partial F}{\partial {\tilde{a}}}({\tilde{a}},x)=\Phi '(\tilde{a}(x))\frac{\partial F}{\partial a}(a,x)\), so that by the 1-Lipschitz continuity of \(\Phi \), our assumption (25) implies (40) and thus (21). \(\square \)
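
For orientation, the exponent relation required by the Hardy–Littlewood–Sobolev step can be made explicit. The following is a sketch assuming \(0<\beta <d\) and assuming that this relation is what (24) encodes; the target exponent s is notation introduced here. The classical inequality for the kernel \(|x|^{-\beta }\) maps \(L^q\) into \(L^s\) provided \(1<q<s<\infty \) and \(\frac{1}{s}=\frac{1}{q}+\frac{\beta }{d}-1\). With \(s=\frac{q}{q-1}\) this reads

$$\begin{aligned} 1-\frac{1}{q}=\frac{1}{q}+\frac{\beta }{d}-1 \quad \Longleftrightarrow \quad \frac{1}{q}=1-\frac{\beta }{2d} \quad \Longleftrightarrow \quad q=\frac{2d}{2d-\beta }. \end{aligned}$$

In particular \(q\in (1,2)\) in this range of \(\beta \).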

3.2 Representation of the Malliavin derivative

Proof of Lemma 2

We first give the argument for the “vector potential” \(\sigma \), fixing a component \(\sigma _{ijk}\). Consider a functional of the form \(F:=F\nabla \sigma _{ijk}\) with Fh as in (26). We claim that the Fréchet derivative of F with respect to a is given by (29) where the functions \(v=v(x)\) and \(\tilde{v}_{jk}={\tilde{v}}_{jk}(a,x)\) are determined as the decaying solutions of the elliptic Eqs. (27) and (28).

Computing the functional derivative of F as a function of a amounts to a linearization. We thus consider an arbitrary tensor field \(\delta a=\delta a(x)\), which we think of as an infinitesimal perturbation of a, and which thus generates infinitesimal perturbations \(\delta \phi \) and \(\delta \sigma \) of \(\phi \) and \(\sigma \) according to (4), (6), and (9), that is,

$$\begin{aligned} -\nabla \cdot (a\nabla \delta \phi _i+\delta a(\nabla \phi _i+e_i))=0 \end{aligned}$$
(41)

and

$$\begin{aligned} -\triangle \delta \sigma _{ijk}=\partial _j\left( \delta a(\nabla \phi _i+e_i)+a\nabla \delta \phi _i\right) _k -\partial _k\left( \delta a(\nabla \phi _i+e_i)+a\nabla \delta \phi _i\right) _j. \end{aligned}$$
(42)

In terms of the infinitesimal perturbation \(\delta F\) of F, this implies by integration by parts (or rather by directly appealing to the weak Lax–Milgram formulations of the elliptic equations)

$$\begin{aligned} \delta F&=\int g\cdot \nabla \delta \sigma _{ijk} dx\\&\mathop {=}\limits ^{(27)}-\int \nabla v \cdot \nabla \delta \sigma _{ijk} dx\\&\mathop {=}\limits ^{(42)}\int (\partial _jve_k-\partial _kve_j)\cdot \left( \delta a(\nabla \phi _i+e_i)+a\nabla \delta \phi _i\right) dx \\&\mathop {=}\limits ^{(28)}\int (\partial _jve_k-\partial _kve_j)\cdot \delta a(\nabla \phi _i+e_i)dx -\int \nabla {\tilde{v}}_{jk}\cdot a\nabla \delta \phi _i dx\\&\mathop {=}\limits ^{(41)}\int \left( \partial _jve_k-\partial _kve_j+\nabla \tilde{v}_{jk}\right) \cdot \delta a(\nabla \phi _i+e_i)dx, \end{aligned}$$

which is nothing else than (29).

Let us now establish the second part of our lemma. Consider a functional of the scalar potential of the form \(F:=F\nabla \phi _i\). To represent its Fréchet derivative, introduce the decaying solution \({\overline{v}}\) to the Eq. (30). We observe that the variation of F with respect to a is given by

$$\begin{aligned} \delta F&= \int g \cdot \nabla \delta \phi _i dx\\&\mathop {=}\limits ^{(30)}-\int a^*\nabla {\overline{v}} \cdot \nabla \delta \phi _i dx\\&\mathop {=}\limits ^{(41)}\int \nabla {\overline{v}} \cdot \delta a(\nabla \phi _i+e_i) dx, \end{aligned}$$

which leads to the conclusion (31). \(\square \)

3.3 Sensitivity estimate

Proof of Lemma 3

We now argue that under certain boundedness assumptions on \(F=Fh\) as a linear functional in vector fields \(h=h(x)\), we control the size (25) of its Fréchet derivative \(\frac{\partial F}{\partial a}=\frac{\partial F}{\partial a}(a,x)\) as a nonlinear functional \(F\nabla \sigma _{ijk}=F(a)\) in coefficient fields \(a=a(x)\) (and similarly in the case \(F(a)=F\nabla \phi _i\); for this case, the (simpler) proof is sketched afterwards).

To this aim, let us first note that we have a Calderon–Zygmund estimate for \(-\nabla \cdot a\nabla \) with the exponent p and its dual exponent \(\frac{p}{p-1}\): For any decaying function w and vector field h on \({\mathbb {R}}^d\) related by

$$\begin{aligned} -\nabla \cdot a\nabla w=\nabla \cdot h \end{aligned}$$
(43)

we have

$$\begin{aligned} \int |\nabla w|^\frac{p}{p-1}dx\lesssim \int |h|^\frac{p}{p-1}dx\quad \hbox {and}\quad \int |\nabla w|^pdx\lesssim \int |h|^pdx. \end{aligned}$$
(44)

This assertion holds by Meyers’ estimate (see e.g. [22]), which only requires the ellipticity and boundedness assumptions (10), (11) on a as well as the estimate \(|p-2|\ll 1\), which is ensured by our condition (33). Note that an analogous estimate would hold for the dual equation \(-\nabla \cdot a^*\nabla w=\nabla \cdot h\) if our coefficient field were nonsymmetric.

In the following, we will use the abbreviation \(\Vert \cdot \Vert _{p,B}\) for the spatial \(L^p\)-norm on the set B; we write \(\Vert \cdot \Vert _{p}\) when \(B={\mathbb {R}}^d\). We start by arguing that because \(\frac{p}{p-1}\in (1,2)\), (35) also entails

$$\begin{aligned} -\int _{|x|\le \rho }|\nabla u|^\frac{p}{p-1}dx\lesssim -\int _{|x|\le R}|\nabla u|^\frac{p}{p-1}dx. \end{aligned}$$
(45)

It is obviously enough to establish (45) only for \(R\ge 2\rho \); hence by Jensen’s inequality, (45) follows from (35) once we establish the reverse Hölder inequality

$$\begin{aligned} \left( -\int _{|x|\le \frac{R}{2}}|\nabla u|^2dx\right) ^\frac{1}{2}\lesssim \ -\int _{|x|\le R}|\nabla u|dx. \end{aligned}$$
(46)

To this purpose, we test \(-\nabla \cdot a\nabla u=0\) with \(\eta ^{2\gamma }(u-m)\), where \(\eta \) is a smooth cut-off of \(\chi _{\{|x|\le \frac{R}{2}\}}\) in \(\{|x|\le R\}\) (with the property \(|\nabla \eta |\lesssim \frac{1}{R}\)) and where the exponent \(\gamma \ge 1\) and the constant \(m\in {\mathbb {R}}\) will be chosen later. By the ellipticity and boundedness assumptions (10), (11) and Young’s inequality we obtain

$$\begin{aligned} \int (\eta ^{\gamma }|\nabla u|)^2dx\lesssim \int ((u-m)|\nabla \eta ^{\gamma }|)^2dx, \end{aligned}$$

and thus

$$\begin{aligned} \int |\nabla (\eta ^\gamma (u-m))|^2dx\lesssim \int ((u-m)|\nabla \eta ^\gamma |)^2dx, \end{aligned}$$

which by the estimate on \(\nabla \eta \) gives

$$\begin{aligned} \Vert \nabla (\eta ^\gamma (u-m))\Vert _2\lesssim \frac{1}{R} \Vert \eta ^{\gamma -1}(u-m)\Vert _2. \end{aligned}$$
(47)

On the r.h.s. of (47) we use first Hölder’s inequality, then the isoperimetric inequality on \(\{|x|\le R\}\) and finally Sobolev’s inequality on the whole space (for simplicity, we assume \(d>2\) here)

$$\begin{aligned} \Vert \eta ^{\gamma -1}(u-m)\Vert _2\le & {} \Vert \eta ^{\gamma }(u-m)\Vert _{\frac{2d}{d-2}}^\frac{\gamma -1}{\gamma } \Vert u-m\Vert _{\frac{d}{d-1},|x|\le R}^\frac{1}{\gamma }\\\lesssim & {} \Vert \nabla (\eta ^{\gamma }(u-m))\Vert _{2}^\frac{\gamma -1}{\gamma }\Vert \nabla u\Vert _{1,|x|\le R}^\frac{1}{\gamma }, \end{aligned}$$

provided the exponent \(\gamma \in (1,\infty )\) is chosen such that \(\frac{1}{2}=(1-\frac{1}{\gamma })\frac{d-2}{2d}+\frac{1}{\gamma }\frac{d-1}{d}\) (which—as a simple computation shows—is satisfied precisely for \(\gamma =\frac{d}{2}\)) and the constant m is the spatial average of u on \(\{|x|\le R\}\). The combination of the last two estimates yields

$$\begin{aligned} \Vert \nabla (\eta ^\gamma (u-m))\Vert _{2}\lesssim \frac{1}{R} \Vert \nabla (\eta ^{\gamma }(u-m))\Vert _{2}^\frac{\gamma -1}{\gamma } \Vert \nabla u\Vert _{1,|x|\le R}^\frac{1}{\gamma }, \end{aligned}$$

which (by \(\gamma =\frac{d}{2}\)) entails \(\Vert \nabla u\Vert _{2,|x|\le \frac{R}{2}}\le \Vert \nabla (\eta ^\gamma (u-m))\Vert _{2}\lesssim R^{-\frac{d}{2}} \Vert \nabla u\Vert _{1,|x|\le R}\) and thus (46).
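
Both computations in this step are elementary; the following lines are a verification sketch rather than part of the argument. Multiplying the condition on \(\gamma \) by 2d gives

$$\begin{aligned} d=\left( 1-\frac{1}{\gamma }\right) (d-2)+\frac{2}{\gamma }(d-1)=d-2+\frac{d}{\gamma }, \quad \hbox {that is,}\quad \gamma =\frac{d}{2}. \end{aligned}$$

Dividing the combined estimate by \(\Vert \nabla (\eta ^\gamma (u-m))\Vert _{2}^\frac{\gamma -1}{\gamma }\) (which we may assume to be nonzero) and raising the result to the power \(\gamma \) yields

$$\begin{aligned} \Vert \nabla (\eta ^\gamma (u-m))\Vert _{2}\lesssim R^{-\gamma }\Vert \nabla u\Vert _{1,|x|\le R}=R^{-\frac{d}{2}}\Vert \nabla u\Vert _{1,|x|\le R}. \end{aligned}$$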

We now give the argument for (37) in case of a functional of the form \(F\nabla \sigma _{ijk}\) (the case \(F\nabla \phi _i\) will be treated below). Clearly (32) implies that there exists a (deterministic) vector field \(g=g(x)\) with

$$\begin{aligned} {\text {supp}}g\subset \{|x|\le r\}\quad \hbox {and}\quad \Vert g\Vert _p\le r^{-\frac{p-1}{p}d} \end{aligned}$$
(48)

such that we have the representation for \(F=Fh\) as a linear functional on vector fields \(h=h(x)\)

$$\begin{aligned} Fh=\int g\cdot h\,dx. \end{aligned}$$
(49)

This gives us access to the representation (29) of its Fréchet derivative \(\frac{\partial F}{\partial a}\) considered as a nonlinear functional \(F\nabla \sigma _{ijk}=F(a)\) of a. Using this representation, a partition into dyadic annuli, and Hölder’s estimate (recall (34)) we obtain

$$\begin{aligned} \left\| \frac{\partial F}{\partial a}\right\| _{q}&\lesssim \Vert (|\nabla v|+|\nabla \tilde{v}_{jk}|)|\nabla \phi _i+e_i|\Vert _{q,|x|\le 2r} \nonumber \\&~~~ +\sum _{n=1}^\infty \Vert (|\nabla v|+|\nabla \tilde{v}_{jk}|)|\nabla \phi _i+e_i|\Vert _{q,2^n r\le |x|\le 2^{n+1} r} \nonumber \\&\lesssim (\Vert \nabla v\Vert _{p,|x|\le 2r}+\Vert \nabla \tilde{v}_{jk}\Vert _{p,|x|\le 2r})\Vert \nabla \phi _i+e_i\Vert _{2,|x|\le 2r} \nonumber \\&~~~ +\sum _{n=1}^\infty (\Vert \nabla v\Vert _{p,2^n r\le |x|\le 2^{n+1} r}+\Vert \nabla {\tilde{v}}_{jk}\Vert _{p,2^n r\le |x|\le 2^{n+1} r}) \nonumber \\&~\qquad \qquad \times \Vert \nabla \phi _i+e_i\Vert _{2,2^n r\le |x|\le 2^{n+1} r}. \end{aligned}$$
(50)

In view of (35) applied to the a-harmonic function \(u(x)=x_i+\phi _i(x)\), cf. (4), we obtain for all radii \(\rho \ge r\) using Caccioppoli’s inequality and (36)

$$\begin{aligned} \int _{|x|\le \rho }|\nabla \phi _i+e_i|^2dx\lesssim \rho ^d. \end{aligned}$$

Hence (50) turns into

$$\begin{aligned} \left\| \frac{\partial F}{\partial a}\right\| _{q}\lesssim & {} r^\frac{d}{2}(\Vert \nabla v\Vert _{p}+\Vert \nabla {\tilde{v}}_{jk}\Vert _{p}) \end{aligned}$$
(51)
$$\begin{aligned}&+\,\sum _{n=1}^\infty (2^{n} r)^\frac{d}{2}(\Vert \nabla v\Vert _{p,|x|\ge 2^{n} r} +\Vert \nabla {\tilde{v}}_{jk}\Vert _{p,|x|\ge 2^{n} r}). \end{aligned}$$
(52)

It thus remains to estimate the auxiliary functions v and \(\tilde{v}_{jk}\). The estimate of the terms in line (51) is easy: By (48) and Calderon–Zygmund for (27) we obtain \(\Vert \nabla v\Vert _{p}\lesssim \Vert g\Vert _{p}\le r^{-\frac{p-1}{p}d}\). By (44) we have Calderon–Zygmund with exponent p for the equation (28), so that \(\Vert \nabla {\tilde{v}}_{jk}\Vert _{p}\lesssim \Vert \nabla v\Vert _{p}\lesssim r^{-\frac{p-1}{p}d}\). In order to control the terms in line (52), we shall establish the following estimates for \(n\in {\mathbb {N}}\)

$$\begin{aligned} \Vert \nabla v\Vert _{p,|x|\ge 2^n r}\lesssim & {} (2^n)^{-d+\frac{d}{p}} r^{-\frac{p-1}{p}d} , \end{aligned}$$
(53)
$$\begin{aligned} \Vert \nabla {\tilde{v}}_{jk}\Vert _{p,|x|\ge 2^n r}\lesssim & {} n(2^n)^{-d+\frac{d}{p}} r^{-\frac{p-1}{p}d}. \end{aligned}$$
(54)

We note that since \(p>2\), these estimates imply that the sum over n in (52) converges and gives (37).
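
The convergence claim can be verified term by term; the shorthand \(\varepsilon :=\frac{p-2}{2p}d\) is introduced here for convenience. Inserting (53) and (54) into the n-th summand of (52) gives a bound by a multiple of

$$\begin{aligned} (2^n r)^\frac{d}{2}\, n(2^n)^{-d+\frac{d}{p}} r^{-\frac{p-1}{p}d} =n\,2^{-\varepsilon n}\,r^{-\varepsilon }, \quad \hbox {since}\quad \frac{d}{2}-d+\frac{d}{p}=-\frac{p-2}{2p}d=-\varepsilon . \end{aligned}$$

Because \(p>2\) we have \(\varepsilon >0\), so that \(\sum _{n=1}^\infty n2^{-\varepsilon n}\lesssim 1\); the term in line (51) is of the same order, namely \(r^\frac{d}{2}r^{-\frac{p-1}{p}d}=r^{-\varepsilon }\). Altogether this gives \(\Vert \frac{\partial F}{\partial a}\Vert _q\lesssim r^{-\frac{p-2}{2p}d}\).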

The estimate (53) for the solution v of the constant coefficient Eq. (27) is classical: We already argued that \(\Vert \nabla v\Vert _{p}\lesssim r^{-\frac{p-1}{p}d}\); by the estimate on the support of g in (48) we have that v is harmonic in \(\{|x|\ge r\}\) and that it has vanishing flux \(\int _{|x|=r}x\cdot \nabla v=0\). It thus decays as \(|\nabla v(x)|\lesssim |x|^{-d} r^{d-\frac{d}{p}} \Vert \nabla v\Vert _{p}\) for \(|x|\ge 2r\), which in particular yields (53). We now turn to (54) and to this purpose rewrite the Eq. (28) for \({\tilde{v}}_{jk}\) as

$$\begin{aligned} -\nabla \cdot a^*\nabla {\tilde{v}}_{jk}=\nabla \cdot {\tilde{g}} \end{aligned}$$

with the r.h.s. \({\tilde{g}}:=-a^*(\partial _jve_k-\partial _kve_j)\). We already argued that \(\Vert \tilde{g}\Vert _{p}\lesssim r^{-\frac{p-1}{p}d}\) and (53) translates into

$$\begin{aligned} \Vert {\tilde{g}}\Vert _{p,|x|\ge 2^n r}\lesssim (2^n)^{-d+\frac{d}{p}} r^{-\frac{p-1}{p}d}. \end{aligned}$$
(55)

In order to proceed, we split \({\tilde{g}}\) into \(\{\tilde{g}_m\}_{m=0,1,\ldots }\) where \({\tilde{g}}_0\) is supported in \(\{|x|\le 2r\}\) and for \(m\ge 1\) \({\tilde{g}}_m\) is supported in \(\{2^m r\le |x|\le 2^{m+1} r\}\), so that (55) translates into

$$\begin{aligned} \Vert {\tilde{g}}_m\Vert _{p}\lesssim (2^m)^{-d+\frac{d}{p}}r^{-\frac{p-1}{p}d}. \end{aligned}$$
(56)

This entails a splitting of \({\tilde{v}}_{jk}\) into \(\{\tilde{v}_m\}_{m=0,1,\ldots }\), where \({\tilde{v}}_m\) is the Lax–Milgram solution of

$$\begin{aligned} -\nabla \cdot a^*\nabla {\tilde{v}}_m=\nabla \cdot {\tilde{g}}_m. \end{aligned}$$
(57)

We will now argue that

$$\begin{aligned} \Vert \nabla {\tilde{v}}_m\Vert _{p,|x|\ge 2^{n}r}\lesssim \min \left\{ (2^n)^{-d+\frac{d}{p}},(2^m)^{-d+\frac{d}{p}}\right\} r^{-\frac{p-1}{p}d}, \end{aligned}$$
(58)

which implies the estimate (54) by the triangle inequality \(\Vert \nabla {\tilde{v}}_{jk}\Vert _{p,|x|\ge 2^{n} r}\le \sum _{m=0}^\infty \Vert \nabla {\tilde{v}}_m\Vert _{p,|x|\ge 2^{n} r}\). We note that (56) together with our Calderon–Zygmund estimate (44) applied to (57) yields \(\Vert \nabla \tilde{v}_m\Vert _{p}\lesssim (2^m)^{-d+\frac{d}{p}}r^{-\frac{p-1}{p}d}\). In order to establish (58), it thus remains to show

$$\begin{aligned} \Vert \nabla {\tilde{v}}_m\Vert _{p,|x|\ge 2^{n}r}\lesssim (2^n)^{-d+\frac{d}{p}}r^{-\frac{p-1}{p}d}\quad \hbox {for}\;m<n. \end{aligned}$$
(59)

We argue in favor of (59) by duality and thus consider an arbitrary \(h\in L^\frac{p}{p-1}\) supported in \(\{|x|\ge 2^n r\}\) and denote by w the corresponding Lax–Milgram solution of (43). By integration by parts, we deduce from (43) and (57) that \(\int h\cdot \nabla {\tilde{v}}_m dx=\int {\tilde{g}}_m\cdot \nabla w\,dx\). By the support condition on \({\tilde{g}}_m\) this yields

$$\begin{aligned} \left| \int h\cdot \nabla {\tilde{v}}_m dx\right| \le \Vert \tilde{g}_m\Vert _{p}\Vert \nabla w\Vert _{\frac{p}{p-1},|x|\le 2^{m+1}r}. \end{aligned}$$

By the support assumption on h we have that w is a-harmonic in \(\{|x|\le 2^n r\}\). Since \(m<n\), we may use (45) applied to w in form of

$$\begin{aligned} \Vert \nabla w\Vert _{\frac{p}{p-1},|x|\le 2^{m+1}r}\lesssim (2^{n-m})^{-d+\frac{d}{p}}\Vert \nabla w\Vert _{\frac{p}{p-1},|x|\le 2^{n}r}. \end{aligned}$$

We combine this with (44) in form of \(\Vert \nabla w\Vert _{\frac{p}{p-1}}\lesssim \Vert h\Vert _{\frac{p}{p-1}}\), and with (56), to obtain

$$\begin{aligned} \left| \int h\cdot \nabla {\tilde{v}}_m dx\right| \lesssim (2^{n})^{-d+\frac{d}{p}} r^{-\frac{p-1}{p}d}\Vert h\Vert _{\frac{p}{p-1}}, \end{aligned}$$

which gives (59).
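
For completeness, here is a sketch of the summation that leads from (58) to (54); the shorthand \(\alpha :=d-\frac{d}{p}>0\) is introduced here. For \(m<n\) the minimum in (58) equals \((2^n)^{-\alpha }\), and for \(m\ge n\) it equals \((2^m)^{-\alpha }\), so that for \(n\ge 1\)

$$\begin{aligned} \sum _{m=0}^\infty \min \left\{ (2^n)^{-\alpha },(2^m)^{-\alpha }\right\} =n\,2^{-\alpha n}+\sum _{m=n}^\infty 2^{-\alpha m} =\left( n+\frac{1}{1-2^{-\alpha }}\right) 2^{-\alpha n}\lesssim n\,2^{-\alpha n}. \end{aligned}$$

The first n terms are responsible for the additional factor n in (54) compared to (53).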

In the case of a functional of the scalar potential of the form \(F(a)=F\nabla \phi _i\), we claim that the Fréchet derivative of F is again controlled in the sense of (37). The proof is mostly analogous to the previous one; we again rewrite F as in (49) with some g satisfying (48). Starting from the representation (31), one derives an analogue of estimate (50) reading

$$\begin{aligned} \left\| \frac{\partial F}{\partial a}\right\| _{q}\lesssim & {} \Vert \nabla {\overline{v}}\Vert _{p,|x|\le 2r}\Vert \nabla \phi _i+e_i\Vert _{2,|x|\le 2r}\\&+\sum _{n=1}^\infty \Vert \nabla {\overline{v}}\Vert _{p,2^n r\le |x|\le 2^{n+1} r} \Vert \nabla \phi _i+e_i\Vert _{2,2^n r\le |x|\le 2^{n+1} r}. \end{aligned}$$

The second factors on the right in this estimate coincide with the ones in the case \(F(a)=F\nabla \sigma _{ijk}\); therefore, we get the following analogue to estimate (52):

$$\begin{aligned} \left\| \frac{\partial F}{\partial a}\right\| _{q}\lesssim & {} r^\frac{d}{2}\Vert \nabla {\overline{v}}\Vert _{p} +\sum _{n=1}^\infty (2^{n}r)^\frac{d}{2}\Vert \nabla {\overline{v}}\Vert _{p,|x|\ge 2^{n} r}. \end{aligned}$$

The equation (30) for \({\overline{v}}\) has the structure of the equation (57) with \(m=0\) (including the estimate \(\Vert g\Vert _{p} \lesssim r^{-\frac{p-1}{p}d}\) and the inclusion \({\text {supp}} g\subset \{|x|\le 2r\}\), cf. (48)); therefore, the decay property (59) carries over to our \({\overline{v}}\). This establishes the estimate

$$\begin{aligned} \left\| \frac{\partial F}{\partial a}\right\| _{q} \lesssim r^{-\frac{p-2}{2p}d} \end{aligned}$$

also in the case \(F(a)=F\nabla \phi _i\). \(\square \)
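
The exponent in the last estimate can be matched with the exponent \(\beta \) of the correlation decay. The following bookkeeping is a sketch assuming that (24) is the Hardy–Littlewood–Sobolev relation \(\frac{1}{q}=1-\frac{\beta }{2d}\) and that (34) is the Hölder relation \(\frac{1}{q}=\frac{1}{p}+\frac{1}{2}\) used in (50); neither relation is restated in this section. Under these assumptions

$$\begin{aligned} \frac{p-2}{p}=1-\frac{2}{p}=1-2\left( \frac{1}{q}-\frac{1}{2}\right) =2-\frac{2}{q}=\frac{\beta }{d}, \quad \hbox {so that}\quad \left\| \frac{\partial F}{\partial a}\right\| _{q}^2\lesssim r^{-\frac{p-2}{p}d}=r^{-\beta }. \end{aligned}$$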

3.4 Sufficient conditions for the mean value property in terms of linear functionals of the corrector

Proof of Lemma 4

In order to show that (39) and (36) imply (35), we only need to show the existence of functionals \(F_1,\ldots ,F_N\) such that (39) and (36) imply

$$\begin{aligned} \frac{1}{R}\left( -\int _{|x|\le R}|(\phi ,\sigma )--\int _{|x|\le R}(\phi ,\sigma )|^2 dx\right) ^\frac{1}{2}\ll 1 \quad \hbox {for all}\;R\ge r. \end{aligned}$$
(60)

By Proposition 2, the estimate (35) follows from (60).

Let us now give the argument for (60). First, it is clearly enough to show that for any \(0<\delta \ll 1\), there exist \(N\lesssim \delta ^{-d}\) functionals \(F_1,\ldots ,F_N\) on vector fields which are bounded in the sense of

$$\begin{aligned} |F_nh|\lesssim \delta ^{-d}\left( \int _{|x|\le 1}|h|^\frac{p}{p-1}dx\right) ^\frac{p-1}{p} \end{aligned}$$
(61)

and such that for any dyadic \(\rho \ge 1\)

$$\begin{aligned} \frac{1}{\rho }\left( -\int _{|x|\le \rho }\left| (\phi ,\sigma )--\int _{|x|\le \rho }(\phi ,\sigma )\right| ^2 dx\right) ^\frac{1}{2} \lesssim \delta +\sup _{R\ge \rho \;\text {dyadic}}\max _{n=1,\ldots ,N}|F_{n,R}\nabla (\phi ,\sigma )|. \end{aligned}$$
(62)

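By dyadic iteration, it is enough to show for any dyadic \(\rho \ge 1\)

$$\begin{aligned} \frac{1}{\rho }\left( -\int _{|x|\le \rho }\left| (\phi ,\sigma )--\int _{|x|\le \rho }(\phi ,\sigma )\right| ^2 dx\right) ^\frac{1}{2} \lesssim \max _{n=1,\ldots ,N}|F_{n,\rho }\nabla (\phi ,\sigma )| +\delta \left( \frac{1}{2\rho }\left( -\int _{|x|\le 2\rho }\left| (\phi ,\sigma )--\int _{|x|\le 2\rho }(\phi ,\sigma )\right| ^2 dx\right) ^\frac{1}{2}+1\right) . \end{aligned}$$
(63)

Indeed, abbreviating \(D_m:=\frac{1}{2^m}\left( -\int _{|x|\le 2^m}\left| (\phi ,\sigma )--\int _{|x|\le 2^m}(\phi ,\sigma )\right| ^2 dx\right) ^\frac{1}{2}\), the estimate (63) may be rewritten as (using a slight readjustment of \(\delta \))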

$$\begin{aligned} D_m\le & {} C_0\max _{n=1,\ldots ,N}|F_{n,2^m}\nabla (\phi ,\sigma )| +\delta \Big (D_{m+1}+1\Big ), \end{aligned}$$

which may be iterated to

$$\begin{aligned} D_m \le&\frac{1}{1-\delta }C_0\max _{M=m,m+1,\ldots ,m+m_0}\max _{n=1,\ldots ,N}|F_{n,2^{M}}\nabla (\phi ,\sigma )|\\&+\frac{\delta }{1-\delta }+\delta ^{m_0+1} D_{m+m_0+1}. \end{aligned}$$

By our sublinearity assumption on the corrector (36) (which may be rewritten as \(\lim _{m_0\uparrow \infty } D_{m_0}=0\)), this yields (62).
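
The iteration is a geometric-series argument. As a sketch, with the shorthand \(A_m:=C_0\max _{n=1,\ldots ,N}|F_{n,2^m}\nabla (\phi ,\sigma )|\) (introduced here only for readability), applying the one-step inequality \(D_m\le A_m+\delta +\delta D_{m+1}\) repeatedly gives

$$\begin{aligned} D_m\le \sum _{j=0}^{m_0}\delta ^j(A_{m+j}+\delta )+\delta ^{m_0+1}D_{m+m_0+1} \le \frac{1}{1-\delta }\max _{M=m,\ldots ,m+m_0}A_M+\frac{\delta }{1-\delta }+\delta ^{m_0+1}D_{m+m_0+1}, \end{aligned}$$

where \(\sum _{j\ge 0}\delta ^j=\frac{1}{1-\delta }\) was used. Letting \(m_0\uparrow \infty \), the last term vanishes, and for \(\delta \le \frac{1}{2}\) both prefactors are bounded by 2.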

We now turn to the argument for (63). By Caccioppoli’s estimate on (4) we have

$$\begin{aligned} \left( \int _{|x|\le \frac{3}{2}\rho }|\nabla \phi _i|^2dx\right) ^\frac{1}{2}\lesssim \frac{1}{\rho } \left( \int _{|x|\le 2\rho }\left( \phi _i--\int _{|x|\le 2\rho }\phi _i\right) ^2 dx\right) ^\frac{1}{2}+\rho ^{d/2}, \end{aligned}$$

and thus in particular for the flux \(q_i=a(\nabla \phi _i+e_i)\)

$$\begin{aligned} \left( \int _{|x|\le \frac{3}{2}\rho }|q_i|^2dx\right) ^\frac{1}{2}\lesssim \frac{1}{\rho } \left( \int _{|x|\le 2\rho }\left( \phi _i--\int _{|x|\le 2\rho }\phi _i\right) ^2dx\right) ^\frac{1}{2}+\rho ^{d/2}. \end{aligned}$$

Caccioppoli’s estimate on (9) gives

$$\begin{aligned} \left( \int _{|x|\le \rho }|\nabla \sigma _{ijk}|^2dx\right) ^\frac{1}{2}\lesssim & {} \frac{1}{\rho } \left( \int _{|x|\le \frac{3}{2}\rho }\left( \sigma _{ijk}--\int _{|x|\le \frac{3}{2}\rho }\sigma _{ijk}\right) ^2 dx\right) ^\frac{1}{2}+ \left( \int _{|x|\le \frac{3}{2}\rho }|q_i|^2dx\right) ^\frac{1}{2}\\\le & {} \frac{1}{\rho } \left( \int _{|x|\le 2\rho }\left( \sigma _{ijk}--\int _{|x|\le 2\rho }\sigma _{ijk}\right) ^2dx\right) ^\frac{1}{2}+ \left( \int _{|x|\le \frac{3}{2}\rho }|q_i|^2dx\right) ^\frac{1}{2} \end{aligned}$$

The last three estimates combine to

$$\begin{aligned} \left( \int _{|x|\le \rho }|\nabla (\phi ,\sigma )|^2 dx\right) ^\frac{1}{2} \lesssim \frac{1}{\rho }\left( \int _{|x|\le 2\rho }\left| (\phi ,\sigma )--\int _{|x|\le 2\rho }(\phi ,\sigma )\right| ^2 dx\right) ^\frac{1}{2}+\rho ^{d/2}. \end{aligned}$$
(64)

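Hence, to establish (63) it is enough to show

$$\begin{aligned} \frac{1}{\rho }\left( -\int _{|x|\le \rho }\left| (\phi ,\sigma )--\int _{|x|\le \rho }(\phi ,\sigma )\right| ^2 dx\right) ^\frac{1}{2} \lesssim \max _{n=1,\ldots ,N}|F_{n,\rho }\nabla (\phi ,\sigma )| +\delta \left( -\int _{|x|\le \rho }|\nabla (\phi ,\sigma )|^2 dx\right) ^\frac{1}{2}. \end{aligned}$$

This statement holds not only for \((\phi ,\sigma )--\int _{|x|\le \rho }(\phi ,\sigma )\), but for any function \(\zeta \) of vanishing spatial average on \(\{|x|\le \rho \}\): By rescaling, it is sufficient to show the estimate on the unit ball \(\{|x|\le 1\}\). It is more convenient to see it when the unit ball \(\{|x|\le 1\}\) is replaced by the unit cube \((0,1)^d\):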

$$\begin{aligned} \int _{(0,1)^d}\zeta ^2dx\le & {} \max _{n=1,\ldots ,N}|F_{n}\nabla \zeta |^2+\delta ^2\int _{(0,1)^d}|\nabla \zeta |^2dx. \end{aligned}$$
(65)

Indeed, dividing \((0,1)^d\) into \(N=\delta ^{-d}\) (suppose that \(\delta ^{-1}\) is an integer) sub-cubes \(\{Q_n\}_{n=1,\ldots ,N}\) of side length \(\delta \) and setting \(F_n\nabla \zeta :=-\int _{Q_n}\zeta dx\) (recall that \(\int _{(0,1)^d}\zeta dx=0\) so that \(F_n\) is indeed a function of \(\nabla \zeta \)), (65) follows from using Poincaré’s estimate on each \(Q_n\) in form of \(\int _{Q_n}\zeta ^2 dx-|Q_n|(-\int _{Q_n}\zeta dx)^2\lesssim \delta ^2\int _{Q_n}|\nabla \zeta |^2dx\) and then summing up. We note that by Poincaré’s estimate on \((0,1)^d\), the \(F_n\) have the desired boundedness property (61), at first on gradient fields \(\nabla \zeta \)

$$\begin{aligned} |F_n\nabla \zeta |\le \delta ^{-d}\left( \int _{(0,1)^d}|\zeta |^\frac{p}{p-1} dx\right) ^\frac{p-1}{p} \lesssim \delta ^{-d}\left( \int _{(0,1)^d}|\nabla \zeta |^\frac{p}{p-1} dx\right) ^\frac{p-1}{p}, \end{aligned}$$

and then on any vector field h by extension à la Hahn-Banach. \(\square \)
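
The summation step behind (65) may be written out as follows. This is a sketch in which C denotes the constant in Poincaré’s estimate on a cube (a name introduced here); absorbing C into \(\delta \) is the same kind of readjustment of \(\delta \) as used above. Since the sub-cubes \(Q_n\) partition \((0,1)^d\) and \(\sum _{n}|Q_n|=1\), we have

$$\begin{aligned} \int _{(0,1)^d}\zeta ^2dx=\sum _{n=1}^N\int _{Q_n}\zeta ^2dx&\le \sum _{n=1}^N|Q_n|\left( -\int _{Q_n}\zeta dx\right) ^2+C\delta ^2\sum _{n=1}^N\int _{Q_n}|\nabla \zeta |^2dx\\&\le \max _{n=1,\ldots ,N}|F_n\nabla \zeta |^2+C\delta ^2\int _{(0,1)^d}|\nabla \zeta |^2dx. \end{aligned}$$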

4 Proof of main result

Proof of Theorem 1

Proof of Assertion (i).

Consider the functionals \(\{F_{n}\}_{n=1,\ldots ,N}\) and their rescalings \(\{F_{n,R}\}_{n=1,\ldots ,N;R\;\text {dyadic}}\) constructed in Lemma 4. Let us abbreviate \(F_{n,R}\nabla (\phi ,\sigma )\) as \(F_{n,R}\). We would like to apply concentration of measure to these functionals.

The main difficulty that we need to overcome is that our sensitivity estimate (37) in Lemma 3 for the quantity \(F_{n,r}\) is based on the assumption that the mean-value property (35) holds for a-harmonic functions down to scale r. By Lemma 4 this assumption may be reduced to the smallness assumption (39) for our functionals \(F_{m,R}\) on scales \(R\ge r\), so that Lemma 3 becomes applicable under the assumption (39): Let q be related to \(\beta \) through (24) and let p be related to q through (34). By the smallness assumption on \(\beta \) in our theorem (cf. (13)), we deduce that (33) holds. By scaling, our functionals \(F_{n,r}\) satisfy the estimate (32) up to a universal constant factor. Furthermore, by ergodicity the property (36) holds for \(\langle \cdot \rangle \)-almost every coefficient field a (regarding \(\sigma \), this result has been shown in [11, Lemma 1]; for \(\phi \), it is classical but may also be found in [11]). Thus, the estimate (37) holds for \(F_{n,r}\) under the assumption (39), i.e. there exists a constant \(C_0\) only depending on d, \(\lambda \), and \(\beta \), such that for any \(n=1,\ldots ,N\) and any radius r the implication

$$\begin{aligned} \sup _{m,R\ge r\text { dyadic}}F_{m,R}\le \frac{1}{C_0}\quad \Longrightarrow \quad \left\| \frac{\partial F_{n,r}}{\partial a}\right\| _q^2\lesssim \frac{1}{r^\beta } \end{aligned}$$
(66)

holds for \(\langle \cdot \rangle \)-a.e. coefficient field a.

To apply concentration of measure in the form of Proposition 1 to some functional F, we however need an unconditional bound on the Malliavin derivative (cf. (21) respectively (25)).

Therefore we first introduce a new random variable whose derivative vanishes whenever the smallness condition in (66) is violated: Consider the auxiliary random variable

$$\begin{aligned} {\bar{F}}_r:=\min \left\{ \sup _{m,R\ge r}|F_{m,R}|,\frac{1}{C_0}\right\} , \end{aligned}$$
(67)

where the \(\sup \) runs over all dyadic radii \(R=2^k r\), \(k\in {\mathbb {N}}_0\). By the usual differentiation rules applied to the Fréchet derivative \(\frac{\partial }{\partial a}\) in the norm \(\Vert \cdot \Vert _q\), we obtain

$$\begin{aligned} \left\| \frac{\partial {\bar{F}}_r}{\partial a}\right\| _q\le I\left( \sup _{m,R\ge r}F_{m,R}\le \frac{1}{C_0}\right) \sup _{m,R\ge r} \left\| \frac{\partial F_{m,R}}{\partial a}\right\| _q \end{aligned}$$

and thus by (66)

$$\begin{aligned} \left\| \frac{\partial {\bar{F}}_r}{\partial a}\right\| _q^2\lesssim \frac{1}{r^\beta }. \end{aligned}$$

By Lemma 1, we may apply concentration of measure in form of (23) to the random variable \(c r^{\beta /2} {\bar{F}}_r\) (where c is some small universal constant). This yields

$$\begin{aligned} \langle I(|{\bar{F}}_r-\langle {\bar{F}}_r\rangle |\ge M)\rangle \lesssim \exp \left( -\frac{1}{C}r^\beta M^2\right) \quad \hbox {for all}\; M\ge 0, \end{aligned}$$
(68)

so that it remains to control the expectation \(\langle \bar{F}_r\rangle \).
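
The rescaling behind (68) can be made explicit. As a sketch, assume that (23) takes the form \(\langle I(|G-\langle G\rangle |\ge \mu )\rangle \le 2\exp (-\frac{\mu ^2}{2})\) for every \(\mu \ge 0\) and every random variable G to which Lemma 1 applies (the symbols G and \(\mu \) are introduced here). With \(G=cr^{\beta /2}{\bar{F}}_r\) and \(\mu =cr^{\beta /2}M\) this reads

$$\begin{aligned} \langle I(|{\bar{F}}_r-\langle {\bar{F}}_r\rangle |\ge M)\rangle =\langle I(|G-\langle G\rangle |\ge cr^{\beta /2}M)\rangle \le 2\exp \left( -\frac{c^2}{2}r^\beta M^2\right) , \end{aligned}$$

which is (68) with \(C=\frac{2}{c^2}\).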

Because of (8) and the definition of \(F_{m,R}\), it follows from qualitative ergodicity of \(\langle \cdot \rangle \) and Birkhoff’s ergodic theorem that \(\lim _{R\uparrow \infty }F_{m,R}=0\) almost surely, so that by dominated convergence \(\lim _{r\uparrow \infty }\langle {\bar{F}}_r\rangle =0\). Hence there exists a finite radius \(r_0\) which is minimal with the property

$$\begin{aligned} \langle {\bar{F}}_r\rangle \le \frac{1}{4C_0}\quad \hbox {for all}\;r\ge r_0. \end{aligned}$$
(69)

Hence using (68) for \(M=\frac{1}{4C_0}\) we get in view of the definition (67) of \({\bar{F}}_r\)

$$\begin{aligned} \left\langle I\left( \sup _{m,R\ge r}|F_{m,R}|\ge \frac{1}{2C_0}\right) \right\rangle \lesssim \exp \left( -\frac{1}{C}r^\beta \right) \quad \hbox {for all}\;r\ge r_0. \end{aligned}$$
(70)
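
The passage from (68) and (69) to (70) rests on an inclusion of events, which we sketch here. If \(\sup _{m,R\ge r}|F_{m,R}|\ge \frac{1}{2C_0}\), then by the definition (67) also \({\bar{F}}_r\ge \frac{1}{2C_0}\), and hence by (69), for \(r\ge r_0\),

$$\begin{aligned} {\bar{F}}_r-\langle {\bar{F}}_r\rangle \ge \frac{1}{2C_0}-\frac{1}{4C_0}=\frac{1}{4C_0}. \end{aligned}$$

Therefore the event in (70) is contained in the event \(\{|{\bar{F}}_r-\langle {\bar{F}}_r\rangle |\ge \frac{1}{4C_0}\}\), whose probability is controlled by (68) with \(M=\frac{1}{4C_0}\); the factor \(\frac{1}{16C_0^2}\) in the exponent is absorbed into the generic constant C.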

On the basis of (70), we now get a quantitative estimate on \(r_0\). To this purpose we now consider the auxiliary variable

$$\begin{aligned} {\bar{F}}_{n,r}:=\eta (\sup _{m,R\ge r}F_{m,R})F_{n,r}, \end{aligned}$$
(71)

where again the \(\sup \) runs over all dyadic radii \(R=2^k r\), \(k\in {\mathbb {N}}_0\), and where the cut-off function \(\eta =\eta (F)\) is given by

$$\begin{aligned} \eta (F)=\max \left\{ \min \left\{ 2C_0\left( \frac{1}{C_0}-F\right) ,1\right\} ,0\right\} . \end{aligned}$$
(72)

The advantage of the auxiliary variable (71) over (67) is that we control its expectation: Since the stationary \(\nabla (\phi ,\sigma )\) has vanishing expectation, cf. (8), and by the linearity of \(F_{n,r}\) in \(\nabla (\phi ,\sigma )\) we have \(\langle F_{n,r}\rangle =0\) and thus \(\langle \bar{F}_{n,r}\rangle =\langle (\eta -1) F_{n,r}\rangle \) so that by construction of \(\eta \)

$$\begin{aligned} |\langle {\bar{F}}_{n,r}\rangle |\le \left\langle I\left( \sup _{m,R\ge r}|F_{m,R}|\ge \frac{1}{2C_0}\right) |F_{n,r}|\right\rangle . \end{aligned}$$

Since the stationary \(\nabla (\phi ,\sigma )\) has bounded second moments, cf. (8), and by the boundedness property of \(F_{n,r}\) in \(\nabla (\phi ,\sigma )\) we obtain from the Cauchy-Schwarz inequality

$$\begin{aligned} \langle {\bar{F}}_{n,r}\rangle ^2\lesssim \left\langle I\left( \sup _{m,R\ge r}|F_{m,R}|\ge \frac{1}{2C_0}\right) \right\rangle , \end{aligned}$$

which in view of (70) improves to

$$\begin{aligned} |\langle \bar{F}_{n,r}\rangle |\lesssim \exp \left( -\frac{1}{C}r^\beta \right) \quad \hbox {for any}\;r\ge r_0. \end{aligned}$$
(73)

By differentiation rules for the Fréchet derivative \(\frac{\partial }{\partial a}\) in the norm \(\Vert \cdot \Vert _q\) we obtain for the auxiliary random variable \({\bar{F}}_{n,r}\)

$$\begin{aligned} \left\| \frac{\partial {\bar{F}}_{n,r}}{\partial a}\right\| _q\le & {} I\left( \sup _{m,R\ge r}|F_{m,R}|\le \frac{1}{C_0}\right) \left( 2C_0|F_{n,r}| \sup _{m,R\ge r}\left\| \frac{\partial F_{m,R}}{\partial a}\right\| _q+ \left\| \frac{\partial F_{n,r}}{\partial a}\right\| _q\right) \\\le & {} 3I\left( \sup _{m,R\ge r}|F_{m,R}|\le \frac{1}{C_0}\right) \sup _{m,R\ge r}\left\| \frac{\partial F_{m,R}}{\partial a}\right\| _q, \end{aligned}$$

and thus by (66)

$$\begin{aligned} \left\| \frac{\partial {\bar{F}}_{n,r}}{\partial a}\right\| _q^2\lesssim \frac{1}{r^\beta }, \end{aligned}$$

and hence by concentration of measure in the form of (23) (applied to \(cr^{\beta /2} {\bar{F}}_{n,r}\) by means of Lemma 1, c being a small universal constant)

$$\begin{aligned} \langle I(|{\bar{F}}_{n,r}-\langle {\bar{F}}_{n,r}\rangle |\ge M)\rangle \lesssim \exp \left( -\frac{1}{C}r^\beta M^2\right) \quad \hbox {for all}\;M\ge 0. \end{aligned}$$
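
The factor 3 in the derivative estimate above can be traced as follows. This is a sketch under the assumption that (71) has the product form \({\bar{F}}_{n,r}=\eta (S_r)F_{n,r}\) with \(S_r:=\sup _{m,R\ge r}|F_{m,R}|\); the shorthand \(S_r\) is introduced here only for illustration. The product and chain rules, together with \(|\eta '|\le 2C_0\) and \(0\le \eta \le 1\), both supported in \(\{S_r\le \frac{1}{C_0}\}\), give

$$\begin{aligned} \left\| \frac{\partial {\bar{F}}_{n,r}}{\partial a}\right\| _q \le I\left( S_r\le \frac{1}{C_0}\right) \left( 2C_0|F_{n,r}|\left\| \frac{\partial S_r}{\partial a}\right\| _q+\left\| \frac{\partial F_{n,r}}{\partial a}\right\| _q\right) . \end{aligned}$$

On the event \(\{S_r\le \frac{1}{C_0}\}\) one has \(|F_{n,r}|\le S_r\le \frac{1}{C_0}\), hence \(2C_0|F_{n,r}|\le 2\), and both derivative norms are bounded by \(\sup _{m,R\ge r}\Vert \frac{\partial F_{m,R}}{\partial a}\Vert _q\), which yields the constant \(2+1=3\).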

Together with (73) this yields

$$\begin{aligned} \langle I(|{\bar{F}}_{n,r}|\ge M)\rangle \lesssim \exp \left( -\frac{1}{C}r^\beta M^2\right) \quad \hbox {for}\;M\gg \exp \left( -\frac{1}{C}r^\beta \right) \;\hbox {and}\;r\ge r_0. \end{aligned}$$
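
The role of the lower restriction on M can be made explicit by the triangle inequality; the following is a sketch in which the factor \(\frac{1}{2}\) is one convenient choice:

$$\begin{aligned} |\langle {\bar{F}}_{n,r}\rangle |\le \frac{M}{2}\;\hbox {and}\;|{\bar{F}}_{n,r}|\ge M \quad \Longrightarrow \quad |{\bar{F}}_{n,r}-\langle {\bar{F}}_{n,r}\rangle |\ge \frac{M}{2}. \end{aligned}$$

By (73), the first condition holds as soon as \(M\gg \exp (-\frac{1}{C}r^\beta )\), and the concentration estimate applied with \(\frac{M}{2}\) in place of M only changes the constant C by a factor of 4.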

By definition (71) we have \(\langle I(|F_{n,r}|\ge M)\rangle \le \langle I(\sup _{m,R\ge r}|F_{m,R}|\ge \frac{1}{2C_0})\rangle +\langle I(|{\bar{F}}_{n,r}|\ge M)\rangle \) so that by (70) the above upgrades to

$$\begin{aligned} \langle I(|F_{n,r}|\ge M)\rangle \lesssim \exp \left( -\frac{1}{C}r^\beta M^2\right) \quad \hbox {for}\;1\ge \;M\gg \exp \left( -\frac{1}{C}r^\beta \right) \;\hbox {and}\;r\ge r_0. \end{aligned}$$

Since \(r^\beta \exp (-\frac{1}{C}r^\beta )\lesssim 1\) for all r, the above holds without the lower restriction on M:

$$\begin{aligned} \langle I(|F_{n,r}|\ge M)\rangle \lesssim \exp \left( -\frac{1}{C}r^\beta M^2\right) \quad \hbox {for}\;M\le 1\;\hbox {and}\;r\ge r_0. \end{aligned}$$
(74)
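
To see why the lower restriction on M can be dropped, consider the complementary regime \(M\le C_1\exp (-\frac{1}{C}r^\beta )\), where \(C_1\) denotes the (hypothetical, illustrative) constant hidden in the relation \(\gg \). There,

$$\begin{aligned} r^\beta M^2\le C_1^2\,r^\beta \exp \left( -\frac{2}{C}r^\beta \right) \lesssim 1 \quad \hbox {so that}\quad \exp \left( -\frac{1}{C}r^\beta M^2\right) \gtrsim 1\ge \langle I(|F_{n,r}|\ge M)\rangle , \end{aligned}$$

that is, the claimed bound holds trivially because its right-hand side is bounded below while the left-hand side is a probability.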

Using this estimate with r replaced by R and summing over the finite index set \(n=1,\ldots ,N\) and all dyadic \(R\ge r\) we obtain

$$\begin{aligned} \langle I(\sup _{m,R\ge r}|F_{m,R}|\ge M)\rangle \lesssim \exp \left( -\frac{1}{C}r^\beta M^2\right) \quad \hbox {for}\;M\le 1\;\hbox {and}\;r\ge r_0 \end{aligned}$$
(75)

and thus in particular for the auxiliary random variable (67)

$$\begin{aligned} \langle I({\bar{F}}_r\ge M)\rangle \lesssim \exp \left( -\frac{1}{C}r^\beta M^2\right) \quad \hbox {for all}\;M\ge 0\;\hbox {and}\;r\ge r_0, \end{aligned}$$

where the upper bound on M is immaterial since \(\bar{F}_r\le \frac{1}{C_0}\le 1\). Using \(\langle \bar{F}_r\rangle =\int _0^\infty \langle I({\bar{F}}_r\ge M)\rangle dM\), this yields the following quantification of \(\lim _{r\uparrow \infty }\langle {\bar{F}}_r\rangle =0\):

$$\begin{aligned} \langle {\bar{F}}_r\rangle \lesssim \int _0^\infty \exp (-r^\beta M^2)dM \lesssim r^{-\frac{\beta }{2}} \quad \text { for all }r\ge r_0. \end{aligned}$$
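
For completeness, the Gaussian integral can be evaluated by the substitution \(s=r^{\beta /2}M\); keeping the constant C from the tail bound, one finds

$$\begin{aligned} \int _0^\infty \exp \left( -\frac{1}{C}r^\beta M^2\right) dM =r^{-\frac{\beta }{2}}\int _0^\infty \exp \left( -\frac{s^2}{C}\right) ds =\frac{\sqrt{\pi C}}{2}\,r^{-\frac{\beta }{2}}. \end{aligned}$$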

Since \(r_0\) was minimal in (69) and since \(\langle \bar{F}_r\rangle \) depends continuously on r, this yields the desired

$$\begin{aligned} r_0\lesssim 1. \end{aligned}$$
(76)

It remains to argue why (74), which together with (76) may be rephrased as

$$\begin{aligned} \langle I(|F_{n,r}|\ge M)\rangle \lesssim \exp \left( -\frac{1}{C}r^\beta M^2\right) \quad \hbox {for}\;M\le 1\;\hbox {and}\;r\gg 1, \end{aligned}$$

yields (15). It suffices to include the given functional F from (14) in the list of finitely many functionals \(F_1,\ldots ,F_N\), say, as the last functional \(F_N=F\), and then to specialize the above to \(n=N\). We note that for q related to \(\beta \) through (24) and p related to q through (34) one has \(\frac{p}{p-1}=\frac{2d}{d+\beta }\), i.e. (14) entails (32). Note that by adjusting the constants, (15) is trivial for \(r\lesssim 1\), so that we obtain (15) over the whole range \(r\ge 0\).

Proof of Assertion (ii).

The arguments in this section require \(\beta <2\), which in view of our assumption \(\beta \ll 1\) is no restriction. Let \(r_*\) denote the minimal dyadic radius with the property (16); note that the proof to follow does not assume \(r_*<\infty \). In order to establish (17), it is enough to show for a given dyadic \(r_0\ge 1\) that

$$\begin{aligned} \langle I(r_*> r_0)\rangle \lesssim \exp \left( -\frac{1}{C}r_0^\beta \right) . \end{aligned}$$
(77)

It will be convenient to replace balls by cubes. Moreover, all radii, or rather side lengths, are dyadic. By definition of \(r_*\) as the smallest radius with (16), the event \(r_*> r_0\) means that there exists a radius \(R\ge r_0\) with

$$\begin{aligned}&\frac{1}{R^2}-\int _{(-R,R)^d}|(\phi ,\sigma )--\int _{(-R,R)^d}(\phi ,\sigma )|^2dx>\left( \frac{r_0}{R}\right) ^\beta f\left( \frac{R}{r_0}\right) , \end{aligned}$$
(78)

where

$$\begin{aligned} f(z):=\log (e+\log z). \end{aligned}$$

In the sequel, the intermediate (dyadic) radius \(r_1\in [r_0,R]\) with

$$\begin{aligned} r_1\sim r_0^\frac{\beta }{2}R^{1-\frac{\beta }{2}}f^\frac{1}{2}\left( \frac{R}{r_0}\right) \quad \hbox {so that}\quad \left( \frac{r_1}{R}\right) ^2\sim \left( \frac{r_0}{R}\right) ^\beta f\left( \frac{R}{r_0}\right) \end{aligned}$$
(79)

will play a role. Note that we use here \(\beta >0\) and that f(z) grows sub-algebraically. For the l. h. s. of (78) we note

$$\begin{aligned}&-\int _{(-R,R)^d}|(\phi ,\sigma )--\int _{(-R,R)^d}(\phi ,\sigma )|^2dx \nonumber \\&\quad =-\int _{(-R,R)^d}|(\phi ,\sigma )-(\phi ,\sigma )_{r_1}|^2dx +\sum _{\genfrac{}{}{0.0pt}1{r\in [2r_1,R]}{r\;\text {dyadic}}}-\int _{(-R,R)^d}|(\phi ,\sigma )_{\frac{r}{2}}-(\phi ,\sigma )_{r}|^2dx, \end{aligned}$$
(80)

where \((\phi ,\sigma )_r\) denotes the \(L^2((-R,R)^d)\)-orthogonal projection of \((\phi ,\sigma )\) onto the space of functions that are piecewise constant on the \((\frac{R}{r})^d\) dyadic sub-cubes Q of “level r” (that is, of side length 2r) of the cube \((-R,R)^d\). In other words, on such a sub-cube Q, \((\phi ,\sigma )_r=-\int _{Q}(\phi ,\sigma )dx\). With this language, we may rewrite the first r. h. s. term of (80) as

$$\begin{aligned} -\int _{(-R,R)^d}|(\phi ,\sigma )-(\phi ,\sigma )_{r_1}|^2dx =(\frac{r_1}{R})^d\sum _{Q\;\text {level}\;r_1}-\int _{Q}|(\phi ,\sigma )--\int _{Q}(\phi ,\sigma )|^2dx, \end{aligned}$$

so that by Poincaré’s estimate on each of the cubes Q we obtain

$$\begin{aligned} -\int _{(-R,R)^d}|(\phi ,\sigma )-(\phi ,\sigma )_{r_1}|^2dx \lesssim r_1^2-\int _{(-R,R)^d}|\nabla (\phi ,\sigma )|^2dx, \end{aligned}$$

and then by Caccioppoli’s estimate based on (4) and (9), cf. (64),

$$\begin{aligned}&-\int _{(-R,R)^d}|(\phi ,\sigma )-(\phi ,\sigma )_{r_1}|^2dx\\&\quad \lesssim (\frac{r_1}{R})^2-\int _{(-2R,2R)^d}|(\phi ,\sigma )--\int _{(-2R,2R)^d}(\phi ,\sigma )|^2dx+r_1^2. \end{aligned}$$
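
The splitting (80) used above can be read as a Pythagorean identity. The following sketch relies only on the definition of \((\phi ,\sigma )_r\) as an orthogonal projection and on the fact that \((\phi ,\sigma )_R=-\int _{(-R,R)^d}(\phi ,\sigma )\). Telescoping over the dyadic levels gives

$$\begin{aligned} (\phi ,\sigma )-(\phi ,\sigma )_R =\big ((\phi ,\sigma )-(\phi ,\sigma )_{r_1}\big ) +\sum _{\genfrac{}{}{0.0pt}1{r\in [2r_1,R]}{r\;\text {dyadic}}}\big ((\phi ,\sigma )_{\frac{r}{2}}-(\phi ,\sigma )_{r}\big ). \end{aligned}$$

The first summand is orthogonal in \(L^2((-R,R)^d)\) to all functions that are piecewise constant on level \(r_1\), and each increment \((\phi ,\sigma )_{\frac{r}{2}}-(\phi ,\sigma )_{r}\) is piecewise constant on level \(\frac{r}{2}\) with vanishing average on every cube of level r. Hence the summands are mutually orthogonal, and the squared norms add up without cross terms.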

As a consequence of Lemma 4, there exist \(N\sim 1\) linear functionals \(\{F_{n}\}_{n=1,\ldots ,N}\) whose rescaled versions \(F_{n,r}\) satisfy the boundedness property (14) such that for any \(r\ge 2R\) we have the implication

$$\begin{aligned}&\sup _{r\ge 2R~\text {dyadic}}\max _{n=1,\ldots ,N}(F_{n,r}\nabla (\phi ,\sigma ))^2\ll 1\\&\quad \Longrightarrow \frac{1}{R^2}-\int _{(-2R,2R)^d}|(\phi ,\sigma )--\int _{(-2R,2R)^d}(\phi ,\sigma )|^2dx\lesssim 1. \end{aligned}$$

From the last two statements we gather

$$\begin{aligned}&\sup _{r\ge 2R~\text {dyadic}}\max _{n=1,\ldots ,N}(F_{n,r}\nabla (\phi ,\sigma ))^2\ll 1\\&\quad \Longrightarrow \frac{1}{R^2}-\int _{(-R,R)^d}|(\phi ,\sigma )-(\phi ,\sigma )_{r_1}|^2dx\lesssim \left( \frac{r_1}{R}\right) ^2. \end{aligned}$$

In view of (79) this can be rewritten as

$$\begin{aligned}&\forall \;r\ge 2R~\text {dyadic},\;n=1,\ldots ,N\quad (F_{n,r}\nabla (\phi ,\sigma ))^2\ll 1 \nonumber \\&\quad \Longrightarrow \frac{1}{R^2}-\int _{(-R,R)^d}|(\phi ,\sigma )-(\phi ,\sigma )_{r_1}|^2dx\le \frac{1}{2}\left( \frac{r_0}{R}\right) ^\beta f\left( \frac{R}{r_0}\right) , \end{aligned}$$
(81)

provided that we adjust the definition (79) of \(r_1\) appropriately (to obtain the estimate \(\le \frac{1}{2}\cdot \) in (81) in place of just \(\lesssim \)). We now turn to the second r. h. s. term in (80), which in view of the definition of \((\phi ,\sigma )_r\) we may estimate as follows

$$\begin{aligned}&\sum _{\genfrac{}{}{0.0pt}1{r\in [2r_1,R]}{r\;\text {dyadic}}}-\int _{(-R,R)^d}|(\phi ,\sigma )_{\frac{r}{2}}-(\phi ,\sigma )_{r}|^2dx \\&\quad \le \sum _{\genfrac{}{}{0.0pt}1{r\in [2r_1,R]}{r\;\text {dyadic}}}\max _{Q\;\text {level}\;r}\max _{Q'\subset Q\;\text {level}\;\frac{r}{2}} \left| -\int _{Q'}(\phi ,\sigma )--\int _{Q}(\phi ,\sigma )\right| ^2. \end{aligned}$$

Hence if for any of the \((\frac{R}{r})^d\) dyadic sub-cubes Q of \((-R,R)^d\) of level r we introduce the \(N=2^d\) linear functionals \(F_{Q,n}\) as an extension of

$$\begin{aligned} F_{Q,n}\nabla \zeta :=\frac{1}{r}\left( -\int _{Q'_n}\zeta dx--\int _{Q}\zeta dx\right) , \end{aligned}$$

where \(\{Q'_n\}_{n=1,\ldots ,2^d}\) is an enumeration of the sub-cubes of level \(\frac{r}{2}\) of Q, and which satisfy the desired boundedness property (14) restricted to gradient fields (which is no issue because of Hahn–Banach extension) and translated (which will be no issue because of stationarity), that is,

$$\begin{aligned} |F_{Q,n}\nabla \zeta |\lesssim \left( -\int _{Q}|\nabla \zeta |^\frac{2d}{d+\beta }dx\right) ^\frac{d+\beta }{2d}, \end{aligned}$$
(82)

we have

$$\begin{aligned}&\frac{1}{R^2}\sum _{\genfrac{}{}{0.0pt}1{r\in [2r_1,R]}{r\;\text {dyadic}}}-\int _{(-R,R)^d}|(\phi ,\sigma )_{\frac{r}{2}}-(\phi ,\sigma )_{r}|^2dx\\&\quad \le \sum _{\genfrac{}{}{0.0pt}1{r\in [2r_1,R]}{r\;\text {dyadic}}}\left( \frac{r}{R}\right) ^2\max _{Q\;\text {level}\;r}\max _{n=1,\ldots ,2^d}(F_{Q,n}\nabla (\phi ,\sigma ))^2. \end{aligned}$$
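
The bound (82) is stated above without derivation. One possible way to verify it, sketched here under the assumption that the standard \(L^1\)-Poincaré inequality on the cube Q of side length 2r is used, together with Jensen's inequality (legitimate since \(\frac{2d}{d+\beta }\ge 1\) for \(\beta \le d\)), is the chain

$$\begin{aligned} |F_{Q,n}\nabla \zeta | =\frac{1}{r}\left| -\int _{Q'_n}\left( \zeta --\int _{Q}\zeta \right) dx\right| \le \frac{2^d}{r}-\int _{Q}\left| \zeta --\int _{Q}\zeta \right| dx \lesssim -\int _{Q}|\nabla \zeta |dx \le \left( -\int _{Q}|\nabla \zeta |^\frac{2d}{d+\beta }dx\right) ^\frac{d+\beta }{2d}. \end{aligned}$$

The factor \(2^d\) is the volume ratio \(|Q|/|Q'_n|\). The passage from the second display of averages to functionals is just the definition of \(F_{Q,n}\), which gives \(|-\int _{Q'_n}\zeta --\int _{Q}\zeta |^2=r^2(F_{Q,n}\nabla \zeta )^2\) and explains the weight \((\frac{r}{R})^2\).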

From this we learn, since for the auxiliary function \(g(z):=\log ^{-2}(z+e)\), the dyadic sum \(\sum _{r\in [2r_1,R]}g(\frac{R}{r})\) is universally bounded,

$$\begin{aligned}&\forall \;r\in [2r_1,R]\text { dyadic},\;Q\;\hbox {level}\;r,\;n=1,\ldots ,2^d\\&\quad (F_{Q,n}\nabla (\phi ,\sigma ))^2\ll \left( \frac{R}{r}\right) ^{2}g\left( \frac{R}{r}\right) \left( \frac{r_0}{R}\right) ^\beta f\left( \frac{R}{r_0}\right) \\&\quad \Longrightarrow \frac{1}{R^2}\sum _{\genfrac{}{}{0.0pt}1{r\in [2r_1,R]}{r\;\text {dyadic}}}-\int _{(-R,R)^d}|(\phi ,\sigma )_{\frac{r}{2}}-(\phi ,\sigma )_{r}|^2dx \le \frac{1}{2}\left( \frac{r_0}{R}\right) ^\beta f\left( \frac{R}{r_0}\right) . \end{aligned}$$
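
The universal bound on the dyadic sum of g can be checked by hand. Writing \(\frac{R}{r}=2^k\) with \(k\in {\mathbb {N}}_0\) and using \(\log (1+e)\ge 1\) as well as \(\log (2^k+e)\ge k\log 2\), one obtains

$$\begin{aligned} \sum _{\genfrac{}{}{0.0pt}1{r\in [2r_1,R]}{r\;\text {dyadic}}}g\left( \frac{R}{r}\right) \le \sum _{k=0}^\infty \frac{1}{\log ^2(2^k+e)} \le 1+\frac{1}{\log ^2 2}\sum _{k=1}^\infty \frac{1}{k^2} =1+\frac{\pi ^2}{6\log ^2 2}. \end{aligned}$$

Under the hypothesis of the implication, the weights \((\frac{r}{R})^2\) and \((\frac{R}{r})^2\) cancel, so that the left-hand side of the conclusion is bounded by a small multiple of \((\frac{r_0}{R})^\beta f(\frac{R}{r_0})\sum _r g(\frac{R}{r})\), which yields the factor \(\frac{1}{2}\) once the implicit constant in \(\ll \) is chosen small enough.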

In view of (80), the combination of this with (81) yields

$$\begin{aligned}&\forall \;r\in [2r_1,R]\text { dyadic},\;Q\;\hbox {level}\;r,\;n=1,\ldots ,2^d\nonumber \\&\quad (F_{Q,n}\nabla (\phi ,\sigma ))^2\ll \left( \frac{R}{r}\right) ^{2}g\left( \frac{R}{r}\right) \left( \frac{r_0}{R}\right) ^\beta f\left( \frac{R}{r_0}\right) \nonumber \\&\quad \hbox {and} \ \forall \;r\ge 2R~\text {dyadic},\;n=1,\ldots ,N\quad (F_{n,r}\nabla (\phi ,\sigma ))^2\ll 1\nonumber \\&\quad \Longrightarrow \frac{1}{R^2}-\int _{(-R,R)^d}\left| (\phi ,\sigma )--\int _{(-R,R)^d}(\phi ,\sigma )\right| ^2dx \le \left( \frac{r_0}{R}\right) ^\beta f\left( \frac{R}{r_0}\right) . \end{aligned}$$
(83)

Equipped with this deterministic argument, we now may proceed to the stochastic part: In the event of \(r_*>r_0\), there exists a dyadic \(R\ge r_0\) such that (78) holds, so that we learn from (83) that there exists

  • a (dyadic) \(r\in [2r_1,R]\), a sub-cube Q of \((-R,R)^d\) of level r, and an index \(n=1,\ldots ,2^d\) such that \((F_{Q,n}\nabla (\phi ,\sigma ))^2\gtrsim (\frac{R}{r})^2g(\frac{R}{r})(\frac{r_0}{R})^\beta f(\frac{R}{r_0})\). In view of the boundedness condition (82) and stationarity, we may apply (15) with F replaced by \(F_{Q,n}\) and \(M^2\) replaced by \((\frac{R}{r})^2g(\frac{R}{r})(\frac{r_0}{R})^\beta f(\frac{R}{r_0})\). This M is admissible in the sense of \(M\lesssim 1\) because by (79) we have \((\frac{R}{r})^2g(\frac{R}{r})(\frac{r_0}{R})^\beta f(\frac{R}{r_0}) \le (\frac{R}{r_1})^2(\frac{r_0}{R})^\beta f(\frac{R}{r_0})\sim 1\). Hence the probability of each single one of these events is estimated as follows

    $$\begin{aligned}&\left\langle I\left( (F_{Q,n}\nabla (\phi ,\sigma ))^2 \ge \frac{1}{C}\left( \frac{R}{r}\right) ^2g\left( \frac{R}{r}\right) \left( \frac{r_0}{R}\right) ^\beta f\left( \frac{R}{r_0}\right) \right) \right\rangle \\&\quad \lesssim \exp \left( -\frac{1}{C}\left( \frac{R}{r}\right) ^{2-\beta }g\left( \frac{R}{r}\right) f\left( \frac{R}{r_0}\right) r_0^\beta \right) . \end{aligned}$$

    Since g(z) decays sub-algebraically in z and since \(\beta <2\), this yields the simpler form

    $$\begin{aligned}&\left\langle I\left( (F_{Q,n}\nabla (\phi ,\sigma ))^2 \ge \frac{1}{C}\left( \frac{R}{r}\right) ^2g\left( \frac{R}{r}\right) \left( \frac{r_0}{R}\right) ^\beta f\left( \frac{R}{r_0}\right) \right) \right\rangle \\&\quad \lesssim \exp \left( -\frac{1}{C}\left( \frac{R}{r}\right) ^{1-\frac{\beta }{2}} f\left( \frac{R}{r_0}\right) r_0^\beta \right) . \end{aligned}$$
  • or a (dyadic) \(r\ge 2R\) and an index \(n=1,\ldots ,N\) for which the estimate \((F_{n,r}\nabla (\phi ,\sigma ))^2\gtrsim 1\) holds. By the boundedness property of \(F_{n,r}\), each single one of these events is estimated as

    $$\begin{aligned} \left\langle I\left( (F_{n,r}\nabla (\phi ,\sigma ))^2 \ge \frac{1}{C}\right) \right\rangle \lesssim \exp \left( -\frac{1}{C}r^\beta \right) . \end{aligned}$$
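
The exponents appearing in the first case above follow from elementary algebra. Inserting the chosen \(M^2\) into the exponent \(r^\beta M^2\) of the tail bound gives

$$\begin{aligned} r^\beta \left( \frac{R}{r}\right) ^2g\left( \frac{R}{r}\right) \left( \frac{r_0}{R}\right) ^\beta f\left( \frac{R}{r_0}\right) =\left( \frac{R}{r}\right) ^{2-\beta }g\left( \frac{R}{r}\right) f\left( \frac{R}{r_0}\right) r_0^\beta , \end{aligned}$$

since \(r^\beta (\frac{r_0}{R})^\beta =r_0^\beta (\frac{R}{r})^{-\beta }\). For the simplification, a sketch of the argument is that for \(z\ge 1\) and \(\beta <2\),

$$\begin{aligned} z^{2-\beta }g(z)=z^{1-\frac{\beta }{2}}\cdot \frac{z^{1-\frac{\beta }{2}}}{\log ^2(z+e)}\gtrsim z^{1-\frac{\beta }{2}}, \end{aligned}$$

because the second factor is continuous and positive on \([1,\infty )\) and tends to infinity as \(z\uparrow \infty \); the implicit constant depends on \(\beta \).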

Taking the number \((\frac{R}{r})^d\) of sub-cubes Q into account and recalling \(N\lesssim 1\), this implies

$$\begin{aligned} \langle I(r_*>r_0)\rangle&\lesssim \sum _{\genfrac{}{}{0.0pt}1{R\ge r_0}{R\;\text {dyadic}}} \left( \sum _{\genfrac{}{}{0.0pt}1{r\in [2r_0,R]}{r\;\text {dyadic}}} \left( \frac{R}{r}\right) ^d\exp \left( -\frac{1}{C}\left( \frac{R}{r}\right) ^{1-\frac{\beta }{2}}f\left( \frac{R}{r_0}\right) r_0^\beta \right) \right. \nonumber \\&~~~~~~~~~~~~~~~~~\quad \left. +\sum _{\genfrac{}{}{0.0pt}1{r\ge 2R}{r\;\text {dyadic}}}\exp \left( -\frac{1}{C}r^\beta \right) \right) . \end{aligned}$$
(84)

Again, since \(1-\frac{\beta }{2}>0\), we have the calculus estimate

$$\begin{aligned}&\sum _{\genfrac{}{}{0.0pt}1{r\in [2r_0,R]}{r\;\text {dyadic}}}\left( \frac{R}{r}\right) ^d\exp \left( -A\left( \frac{R}{r}\right) ^{1-\frac{\beta }{2}}\right) \\&\quad \lesssim \exp (-A)\sum _{\genfrac{}{}{0.0pt}1{r\in [2r_0,R]}{r\;\text {dyadic}}}\left( \frac{R}{r}\right) ^d\exp \left( -A\log \left( \frac{R}{r}\right) \right) \\&\quad \lesssim \exp (-A)\quad \hbox {for}\;A\gg 1. \end{aligned}$$
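
One way to verify this calculus estimate, given here as a sketch that keeps track of the factor \(1-\frac{\beta }{2}\) (which the display above absorbs into the constants), starts from \(e^x\ge 1+x\) applied to \(x=(1-\frac{\beta }{2})\log z\): for \(z\ge 1\),

$$\begin{aligned} z^{1-\frac{\beta }{2}}\ge 1+\left( 1-\frac{\beta }{2}\right) \log z \quad \hbox {so that}\quad \exp \left( -Az^{1-\frac{\beta }{2}}\right) \le \exp (-A)\,z^{-(1-\frac{\beta }{2})A}. \end{aligned}$$

With \(z=\frac{R}{r}=2^k\), \(k\in {\mathbb {N}}_0\), the sum is therefore bounded by a geometric series:

$$\begin{aligned} \sum _{k=0}^\infty 2^{kd}\exp \left( -A2^{k(1-\frac{\beta }{2})}\right) \le \exp (-A)\sum _{k=0}^\infty 2^{k(d-(1-\frac{\beta }{2})A)} \le 2\exp (-A) \quad \hbox {for}\;\left( 1-\frac{\beta }{2}\right) A\ge d+1. \end{aligned}$$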

Applying this to the first sum over r in (84) and \(A=\frac{1}{C}f(\frac{R}{r_0})r_0^\beta \), which satisfies \(A\gg 1\) for \(r_0\gg 1\), and using the estimate \(\sum _{r\ge 2R; r\text { dyadic}} \exp (-\frac{1}{C}r^\beta ) \lesssim \exp (-\frac{1}{C} R^\beta )\) (which holds provided that \(R\ge r_0\ge 1\)) for the second sum over r, we obtain

$$\begin{aligned} \langle I(r_*>r_0)\rangle\lesssim & {} \sum _{\genfrac{}{}{0.0pt}1{R\ge r_0}{R\;\text {dyadic}}}\left( \exp \left( -\frac{1}{C}f\left( \frac{R}{r_0}\right) r_0^\beta \right) + \exp \left( -\frac{1}{C}R^\beta \right) \right) \quad \text {for }r_0\gg 1. \end{aligned}$$
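
The tail estimate for the sum over \(r\ge 2R\) used above can be checked as follows; this is a sketch in which the implicit constant depends on \(\beta \) and C. Writing \(r=2^kR\) with \(k\ge 1\) and using \(R\ge 1\), one has \((2^kR)^\beta =R^\beta +(2^{k\beta }-1)R^\beta \ge R^\beta +(2^{k\beta }-1)\), so that

$$\begin{aligned} \sum _{k=1}^\infty \exp \left( -\frac{1}{C}(2^kR)^\beta \right) \le \exp \left( -\frac{1}{C}R^\beta \right) \sum _{k=1}^\infty \exp \left( -\frac{1}{C}(2^{k\beta }-1)\right) \lesssim \exp \left( -\frac{1}{C}R^\beta \right) , \end{aligned}$$

where the last series converges because \(\beta >0\).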

Thanks to \(\beta >0\), we have \(\exp (-\frac{1}{C}R^\beta ) \lesssim \exp (-\frac{1}{C}f(\frac{R}{r_0})r_0^\beta )\), so that the second summand is dominated by the first one:

$$\begin{aligned} \langle I(r_*>r_0)\rangle\lesssim & {} \sum _{\genfrac{}{}{0.0pt}1{R\ge r_0}{R\;\text {dyadic}}}\exp \left( -\frac{1}{C}f\left( \frac{R}{r_0}\right) r_0^\beta \right) =\sum _{m=0}^\infty \exp \left( -\frac{1}{C}f(2^m)r_0^\beta \right) . \end{aligned}$$
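
The domination of the second summand rests on the fact that any positive power grows faster than the iterated logarithm f. As a sketch: since \(\frac{f(z)}{z^\beta }\) is continuous on \([1,\infty )\) and tends to zero as \(z\uparrow \infty \), there is a constant \(c>0\) depending on \(\beta \) (introduced here for illustration) with \(z^\beta \ge cf(z)\) for all \(z\ge 1\), and thus

$$\begin{aligned} R^\beta =r_0^\beta \left( \frac{R}{r_0}\right) ^\beta \ge c\,f\left( \frac{R}{r_0}\right) r_0^\beta \quad \hbox {for}\;R\ge r_0, \end{aligned}$$

which gives the claimed inequality after enlarging the generic constant C.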

Now we see the reason for the choice of \(f(z)=\log (e+\log z)\) for which \(f(2^m)\ge \frac{1}{C}(1+\log (m+1))\) and thus

$$\begin{aligned} \sum _{m=0}^\infty \exp (-A f(2^m))\le \sum _{m=0}^\infty \exp \left( -\frac{A}{C}(1+\log (m+1))\right) \lesssim \exp \left( -\frac{A}{C}\right) \quad \hbox {for}\;A\gg 1. \end{aligned}$$
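
The last inequality can be made explicit. Since \(f(2^m)=\log (e+m\log 2)\), the summands decay like a power of \(m+1\), and for \(\frac{A}{C}\ge 2\) one obtains

$$\begin{aligned} \sum _{m=0}^\infty \exp \left( -\frac{A}{C}(1+\log (m+1))\right) =\exp \left( -\frac{A}{C}\right) \sum _{m=0}^\infty (m+1)^{-\frac{A}{C}} \le \frac{\pi ^2}{6}\exp \left( -\frac{A}{C}\right) . \end{aligned}$$

This also illustrates why some growth of f is needed: for a constant f all summands would be equal and the sum over m would diverge.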

With \(\frac{1}{C}r_0^\beta \) playing the role of A this yields (77). Note that the condition \(r_0\gg 1\) is immaterial after adjusting the constants, as the l. h. s. of (77) is bounded by 1. \(\square \)