Sublinear growth of the corrector in stochastic homogenization: Optimal stochastic estimates for slowly decaying correlations

We establish sublinear growth of correctors in the context of stochastic homogenization of linear elliptic PDEs. In case of weak decorrelation and "essentially Gaussian" coefficient fields, we obtain optimal (stretched exponential) stochastic moments for the minimal radius above which the corrector is sublinear. Our estimates also capture the quantitative sublinearity of the corrector (caused by the quantitative decorrelation on larger scales) correctly. The result is based on estimates on the Malliavin derivative for certain functionals which are basically averages of the gradient of the corrector, on concentration of measure, and on a mean value property for $a$-harmonic functions.


Introduction
In the present work, we are concerned with the stochastic homogenization of linear elliptic equations of the form
$$-\nabla \cdot a \nabla u = f. \tag{1}$$
In stochastic homogenization of elliptic PDEs, $a$ is typically a uniformly elliptic and bounded coefficient field, chosen at random according to some stationary and ergodic ensemble $\langle\cdot\rangle$. On large scales (and for slowly varying $f$), one may then approximate the solution $u$ to the equation (1) by the solution $u_{\mathrm{hom}}$ to the so-called effective equation
$$-\nabla \cdot a_{\mathrm{hom}} \nabla u_{\mathrm{hom}} = f, \tag{2}$$
which is a constant-coefficient equation with the so-called effective coefficient $a_{\mathrm{hom}}$. Mathematically, this homogenization effect is encoded in growth properties of the corrector (cf. below for a definition of the corrector).
The goal of the present paper is to provide a fairly simple proof of quantified sublinear growth of the corrector under very mild assumptions on the decorrelation of the coefficient field $a$ under the ensemble $\langle\cdot\rangle$. We do this in the context of coefficient fields that are essentially Gaussian. More precisely, we consider coefficient fields $a$ which are obtained from a Gaussian random field by pointwise application of some (nonlinear) mapping, the role of the nonlinear map being basically to enforce uniform ellipticity and boundedness of our coefficient field.
The motivation for this result is the following: Gloria, Neukamm, and the second author [10] have shown that qualitatively sublinear growth of the (extended) corrector $(\phi, \sigma)$ (cf. below for a definition) entails a large-scale intrinsic $C^{1,\alpha}$ regularity theory for $a$-harmonic functions. In a subsequent work [8], the two authors of the present paper have shown that slightly quantified sublinear growth of the corrector even leads to a large-scale intrinsic $C^{k,\alpha}$ regularity theory for any $k \in \mathbb{N}$. Therefore, the results of the present work show that even in case of ensembles with very mild decorrelation, for almost every realization of the coefficient field, $a$-harmonic functions have arbitrary intrinsic smoothness properties on large scales. Furthermore, our results enable us to estimate the scale above which this happens (a random quantity) in a stochastically optimal way. Indeed, the motivation for the present work was to establish such a (necessarily intrinsic) higher order regularity theory under the weakest possible assumptions on the decay of correlations.
By an intrinsic regularity theory we mean that the regularity is measured in terms of objects intrinsic to the Riemannian geometry defined by the coefficient field $a$, like the dimension of the space of $a$-harmonic functions of a certain algebraic growth rate, or like estimates on the Hölder modulus of the derivative of $a$-harmonic functions as measured in terms of their distance to $a$-linear functions. An extrinsic large-scale regularity theory for $a$-harmonic functions in case of random coefficients was initiated on the level of $C^{0,\alpha}$ in [5,18] and pushed to $C^{1,0}$ in [3], which significantly extended qualitative arguments from the periodic case [4] to quantitative arguments in the random case. However, an extrinsic regularity theory is limited to $C^{1,0}$, as can be seen considering the harmonic coordinates: Taking higher order polynomials into account does not increase the local approximation order.
After this motivation, we now return to the discussion of the history of bounds on the corrector, as they depend on assumptions on the stationary ensemble of coefficient fields. Almost-sure sublinearity (always meant in a spatially averaged sense) of the corrector $\phi$ under the mere assumption of ergodicity was a key ingredient in the original work on stochastic homogenization by Kozlov [16] and by Papanicolaou & Varadhan [21]. Almost-sure sublinearity of the extended corrector $(\phi, \sigma)$, as is needed for the large-scale intrinsic $C^{1,\alpha}$-regularity theory, was established in [10] under mere ergodicity.
Yurinskii [22] was the first to quantify sublinear growth under general mixing conditions, however only capturing suboptimal rates even in case of finite range of dependence. Very recently, a much improved quantification of sublinear growth of φ under finite range assumptions was put forward by Armstrong, Kuusi, and Mourrat [1], relying on a variational approach to quantitative stochastic homogenization introduced by Armstrong and Smart [3], an approach which presumably can be extended to the case of non-symmetric coefficients and more general mixing conditions following [2]. However, while this approach gives the optimal, i. e. Gaussian, stochastic integrability, it presently fails to give the optimal growth rates.
Optimal growth rates have been obtained under a quantification of ergodicity different from finite range or mixing conditions, namely under Spectral Gap assumptions on the ensemble. This functional analytic tool from statistical mechanics was introduced into the field of stochastic homogenization in an unpublished paper by Naddaf and Spencer [20], and further leveraged by Conlon et al. [6,7], yielding optimal rates for some errors in stochastic homogenization in case of a small ellipticity contrast. The work of Gloria, Neukamm and the second author extended these results to the present case of arbitrary ellipticity contrast [9,11,12], in particular yielding at most logarithmic growth of the corrector (and its stationarity in $d > 2$). Loosely speaking, the assumption of a Spectral Gap Inequality amounts to correlations with integrable tails; in the above-mentioned works it has been used for discrete media (i. e. random conductance models), but has subsequently been extended to the continuum case [13,14].
A strengthening of the Spectral Gap Inequality is given by the Logarithmic Sobolev Inequality (LSI); it is a slight strengthening in terms of the assumption (still essentially encoding integrable tails of the correlations), but a substantial improvement in its effect, since it implies Gaussian concentration of measure for Lipschitz random variables. The assumption of LSI and implicitly concentration of measure, which will be explicitly used in this work, has been introduced into stochastic homogenization by Marahrens and the second author [18]. In [10], it has been shown that the concept of LSI can be adapted to also capture ensembles with slowly decaying correlations, i. e. thick non-integrable tails, by adapting the norm of the vertical or Malliavin derivative to the correlation structure. As a result, the stochastic integrability of the optimal rates could be improved from algebraic to (stretched) exponential, but missing the expected Gaussian integrability.
The main merit of the present contribution with respect to [10] is twofold: First, our approach directly provides optimal quantitative sublinearity of the corrector $(\phi, \sigma)$ on all scales above a random minimal radius $r_*$, i.e. in contrast to the estimates of [10] our estimates capture the decorrelation on scales larger than $r_*$ in a single argument. Note that our definition of $r_*$ differs from the one in [10]. Second, in case of weak decorrelation, our simpler arguments are nevertheless sufficient to establish optimal stochastic moments for the minimal radius $r_*$ above which the corrector $(\phi, \sigma)$ displays the quantified sublinear growth.
In the present work, we consider the following type of ensembles on $\lambda$-uniformly elliptic tensor fields $a = a(x)$ on $\mathbb{R}^d$: Let $\tilde a = \tilde a(x)$ be a tensor-valued Gaussian random field on $\mathbb{R}^d$ that is centered (i. e. of vanishing expectation) and stationary (i. e. invariant under translation) and thus characterized by the covariance $\langle \tilde a(x) \otimes \tilde a(0) \rangle$. Our only additional assumption on $\tilde a$ is that there exists an exponent $\beta \in (0, d)$ such that
$$|\langle \tilde a(x) \otimes \tilde a(0) \rangle| \le |x|^{-\beta}. \tag{3}$$
In this work, we are concerned with the case of weak decay of correlation in the sense of $\beta \ll 1$. Let $\Phi$ be a 1-Lipschitz map from the space of tensors into the space of $\lambda$-elliptic symmetric tensors. Then our ensemble is the distribution of $a$, where $a$ is given by $a(x) := \Phi(\tilde a(x))$. Note that the normalization in the constant in (3) and in the Lipschitz constant is not essential, since it can be achieved by a rescaling of $x$ and the amplitude of $\tilde a$.
Concerning the mathematical tools of our approach, several ideas are inspired by the work [10]. In particular, a key component of our approach is a family of sensitivity estimates (Malliavin derivative bounds) for certain integral functionals, which basically average the gradient $\nabla(\phi, \sigma)$ over an appropriate cube. Furthermore, we rely on a mean-value property for $a$-harmonic functions, which has been derived in [10] under appropriate smallness assumptions on the corrector. In our present contribution, we however pursue a conceptually simpler route to estimate the Malliavin derivative: The sensitivity estimate is performed through appropriate $L^q$-norm bounds and Meyers' estimate, rather than a more involved $\ell^2$-$L^1$-norm bound like in [10].
Before stating our main results, let us recall the concept of correctors in homogenization and introduce some notation. The basic idea underlying the concept of correctors in homogenization is the observation that the oscillations in the gradient $\nabla u_{\mathrm{hom}}$ of solutions to the homogenized (constant-coefficient) problem (2) occur on a much larger scale than the oscillations in the gradient $\nabla u$ of solutions to the original problem (1). Thus, it is important to understand how to add oscillations to an affine map (an affine map being always $a_{\mathrm{hom}}$-harmonic) to obtain an $a$-harmonic map. In the context of stochastic homogenization, one is therefore interested in constructing random scalar fields $\phi_i = \phi_i(a, x)$ subject to the equation
$$-\nabla \cdot a (e_i + \nabla \phi_i) = 0, \tag{4}$$
which almost surely display sublinear growth in $x$: The $\phi_i$ then facilitate the transition from the $a_{\mathrm{hom}}$-harmonic (Euclidean) coordinates $x \mapsto x_i$ to the "$a$-harmonic coordinates" $x \mapsto x_i + \phi_i(x)$. Since any affine map may be represented in the form $b + \sum_i \xi_i x_i$ for $b, \xi_i \in \mathbb{R}$, the $\phi_i$ also facilitate the construction of associated $a$-harmonic "corrected affine maps" $b + \sum_i \xi_i (x_i + \phi_i)$. With the help of the corrector, one may characterize the effective coefficient $a_{\mathrm{hom}}$: In our setting of stochastic homogenization, the effective coefficient is given by the formula
$$a_{\mathrm{hom}} e_i = \langle a (e_i + \nabla \phi_i) \rangle, \tag{5}$$
where $\langle\cdot\rangle$ refers to the expectation with respect to our ensemble (i.e. probability measure).
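The claim that corrected affine maps are $a$-harmonic follows from linearity alone. The short computation below is a sketch that assumes the corrector equation (4) has the standard form $-\nabla \cdot a(e_i + \nabla\phi_i) = 0$, which matches the flux $q_i = a(\nabla\phi_i + e_i)$ used later in the text.

```latex
% Assumption: (4) reads  -\nabla \cdot a (e_i + \nabla \phi_i) = 0  for each i.
\begin{align*}
u(x) &:= b + \sum_{i=1}^d \xi_i \bigl( x_i + \phi_i(x) \bigr), \\
\nabla u &= \sum_{i=1}^d \xi_i \, (e_i + \nabla \phi_i), \\
-\nabla \cdot a \nabla u
  &= \sum_{i=1}^d \xi_i \, \bigl( -\nabla \cdot a (e_i + \nabla \phi_i) \bigr) = 0 .
\end{align*}
```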
In the language of a conducting medium with conductivity tensor $a$ (note that in this picture, one has $f \equiv 0$ in (1)), the quantity $E_i := e_i + \nabla \phi_i$ corresponds to the (curl-free) "microscopic" electric field associated with a "macroscopic" electric field $e_i$ (and, therefore, $\phi_i$ corresponds to the "microscopic" correction to the "macroscopic" electric potential $x_i$). The corresponding (divergence-free) "microscopic" current density is given by
$$q_i := a (e_i + \nabla \phi_i), \tag{6}$$
while the "macroscopic" current density associated with the "macroscopic" electric field $e_i$ is given by the "average" of this quantity, i.e. by the expression (5).
In periodic homogenization of linear elliptic PDEs, it turns out to be convenient to introduce a dual quantity to the corrector $\phi_i$ (cf. e.g. [15, p.27]): One constructs a tensor field $\sigma_{ijk}$, skew-symmetric in the last two indices, which is a potential for the flux correction $q_i - a_{\mathrm{hom}} e_i$ in the sense
$$\nabla \cdot \sigma_i = q_i - a_{\mathrm{hom}} e_i, \tag{7}$$
where we have set $(\nabla \cdot \sigma_i)_j := \sum_{k=1}^d \partial_k \sigma_{ijk}$. With the help of this "extended corrector" $(\phi, \sigma)$, it is possible to give a bound on the homogenization error (in terms of appropriate norms of $\phi$ and $\sigma$).
One of the main merits of [10] is the discovery of the usefulness of this extended corrector $(\phi, \sigma)$ in the context of stochastic homogenization. For stationary and ergodic ensembles $\langle\cdot\rangle$ of $\lambda$-uniformly elliptic and symmetric coefficient fields $a = a(x)$ on $\mathbb{R}^d$, in [10] correctors $\phi_i$ and $\sigma_{ijk}$ have been constructed such that
$$\nabla \phi_i, \ \nabla \sigma_{ijk} \ \text{are stationary, of bounded second moment, and of vanishing expectation.} \tag{8}$$
As a consequence of this and of ergodicity, the $\phi_i$ and $\sigma_{ijk}$ almost surely display sublinear growth. Note that in case of $\sigma_i$, the choice of the appropriate gauge is important for the property (8) and for our work, as the equation (7) determines $\sigma_i$ (which by its skew-symmetry and its behavior under change of coordinates may be identified with a $(d-1)$-form) only up to the exterior derivative of a $(d-2)$-form. In fact, the choice of the gauge in [10] is such that
$$-\Delta \sigma_{ijk} = \partial_j q_{ik} - \partial_k q_{ij}, \tag{9}$$
which in view of (4) and (6) is clearly compatible with (7).
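The compatibility of the gauge with (7) can be checked in two lines. The sketch below assumes the standard forms of (7) and (9), namely $\nabla \cdot \sigma_i = q_i - a_{\mathrm{hom}} e_i$ and $-\Delta \sigma_{ijk} = \partial_j q_{ik} - \partial_k q_{ij}$; here $q_{ij}$ denotes the $j$-th component of $q_i$, and $\nabla \cdot q_i = 0$ is the corrector equation (4) written in terms of the flux (6).

```latex
% Assumptions (standard forms, not quoted from the extracted text):
%   (7)  (\nabla \cdot \sigma_i)_j = \sum_k \partial_k \sigma_{ijk} = (q_i - a_hom e_i)_j
%   (9)  -\Delta \sigma_{ijk} = \partial_j q_{ik} - \partial_k q_{ij}
%   and  \nabla \cdot q_i = 0  by (4) and (6).
\begin{align*}
-\Delta (\nabla \cdot \sigma_i)_j
  &= \sum_{k=1}^d \partial_k \bigl( \partial_j q_{ik} - \partial_k q_{ij} \bigr)
   = \partial_j (\nabla \cdot q_i) - \Delta q_{ij}
   = -\Delta \bigl( q_i - a_{\mathrm{hom}} e_i \bigr)_j , \\
\sum_{j=1}^d \partial_j (\nabla \cdot \sigma_i)_j
  &= \sum_{j,k=1}^d \partial_j \partial_k \sigma_{ijk} = 0
  \qquad \text{(skew-symmetry in } j, k \text{)} .
\end{align*}
```

The first line shows that both sides of (7) have the same Laplacian (the constant $a_{\mathrm{hom}} e_i$ drops out); the second shows that the left-hand side of (7) is automatically divergence-free, as the right-hand side must be.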
Notation. To quantify the ellipticity and boundedness of our coefficient fields, throughout the paper we shall work with the assumptions
$$a(x) v \cdot v \ge \lambda |v|^2 \quad \text{for all } v \in \mathbb{R}^d, \tag{10}$$
$$|a(x) v| \le |v| \quad \text{for all } v \in \mathbb{R}^d, \tag{11}$$
where $\lambda \in (0, 1)$. Note that in view of rescaling, the upper bound (11) on $a$ does not induce a loss of generality of our results. For our convenience, throughout the paper we shall assume that our coefficient field $a$ is symmetric. The arguments however easily carry over to the case of non-symmetric coefficient fields by simultaneously considering the correctors for the dual equation (i.e. the PDE with coefficient field $a^*$, $a^*$ denoting the transpose of $a$). The expression $s \lesssim t$ is an abbreviation for $s \le Ct$ with $C$ a generic constant only depending on the dimension $d$, the exponent $\beta > 0$, and the ellipticity ratio $\lambda > 0$. The expression $s \ll t$ stands for $s \le \frac{1}{C} t$ with $C$ a generic sufficiently large constant only depending on the dimension $d$, the exponent $\beta > 0$, and the ellipticity ratio $\lambda > 0$.
By I(E) we denote the characteristic function of an event E.
The notation $\fint_A f$ refers to the average integral over the set $A$, i.e. we have
$$\fint_A f \, dx := \frac{1}{|A|} \int_A f \, dx.$$
In the sequel, $(\phi, \sigma)$ stands for any component $\phi_i$, $\sigma_{ijk}$ for $i, j, k = 1, \ldots, d$.

Main Results and Structure of Proof
Let us now state our main theorem. To quantify the sublinear growth of the extended corrector $(\phi, \sigma)$, we first quantify the decay of spatial averages of $\nabla(\phi, \sigma)$ over larger scales. In view of the decorrelation assumption (3) for our ensemble of coefficient fields, we expect that, up to logarithms, it is the exponent $\frac{\beta}{2}$ that governs the decay of averages of $\nabla(\phi, \sigma)$ and the improvement over linear growth for $(\phi, \sigma)$. Indeed, this exponent is reflected in the theorem.
Theorem 1. Let $\tilde a = \tilde a(x)$ be a tensor-valued Gaussian random field on $\mathbb{R}^d$ that is centered (i. e. of vanishing expectation) and stationary (i. e. invariant under translation); assume that the covariance of $\tilde a$ satisfies the estimate
$$|\langle \tilde a(x) \otimes \tilde a(0) \rangle| \le |x|^{-\beta} \tag{12}$$
for some $\beta \in (0, d)$. Let $\Phi : \mathbb{R}^{d \times d} \to \mathbb{R}^{d \times d}$ be a Lipschitz map with Lipschitz constant $\le 1$; suppose that $\Phi$ takes values in the set of symmetric matrices subject to the ellipticity and boundedness assumptions (10), (11). Define the ensemble $\langle\cdot\rangle$ as the probability distribution of $a$, where $a$ is the image of $\tilde a$ under pointwise application of the map $\Phi$, i.e. $a(x) := \Phi(\tilde a(x))$.
Assume in addition on the ensemble $\langle\cdot\rangle$ that $\beta$ in (12) is sufficiently small in the sense of where $C$ denotes a generic constant only depending on $d$ and $\lambda$.
i) Consider a linear functional $F = Fh$ on vector fields $h = h(x)$ satisfying the boundedness property for some radius $r > 0$. Then the random variable $F\nabla(\phi, \sigma)$ satisfies uniform Gaussian bounds in the sense of
ii) There exists a (random) radius $r_*$ for which the "iterated logarithmic" bound (16) holds and which satisfies the stretched exponential bound
Morally speaking, Theorem 1 converts statistical information on the coefficient field $a$ (or rather $\tilde a$) into statistical information on the coefficient field $\nabla\phi := \nabla(\phi_1, \ldots, \phi_d)$ related by (4). Despite the nonlinearity of the map $a \mapsto \nabla\phi$, which only in its linearization around $a = \mathrm{id}$ turns into the Helmholtz projection, Theorem 1 states that $\nabla\phi$ essentially inherits the statistics of $a$: (15) implies in particular that spatial averages $F = \fint_{|x| \le r} \nabla\phi \, dx$ of $\nabla\phi$ satisfy the same bounds as if $\nabla\phi$ itself was Gaussian with correlation decay (3). On the level of these Gaussian bounds, the only price to pay for nonlinearity is the restriction $M \lesssim 1$ in (15) on the threshold.
Incidentally, the way we obtain ii) from i) bears similarities with an argument in [1] in the sense that a decomposition into Haar wavelets is implicitly used.
To obtain an estimate like (15), the starting point of our proof is the Gaussian concentration of measure applied to $\tilde a$.
Proposition 1. Let $\tilde a = \tilde a(x)$ be a tensor-valued Gaussian random field on $\mathbb{R}^d$ that is centered and stationary; denote its covariance operator by $\mathrm{Cov}$. Consider a random variable $F$, that is, a function(al) $F = F(\tilde a)$. Suppose that $F$ is 1-Lipschitz in the sense that its functional derivative, or rather its Fréchet derivative with respect to $L^2(\mathbb{R}^d; \mathbb{R}^{d \times d})$, $\frac{\partial F}{\partial \tilde a} = \frac{\partial F}{\partial \tilde a}(\tilde a, x)$, which can be considered a random tensor field and assimilated with a Malliavin derivative, satisfies Then $F$ has Gaussian moments in the sense of Furthermore, for any $M \ge 0$ we have the estimate
We now substitute our assumption (18) on the Fréchet derivative by a stronger but more tractable condition.
Lemma 1. Let $\tilde a = \tilde a(x)$ be a tensor-valued Gaussian random field on $\mathbb{R}^d$ that is centered and stationary; denote its covariance operator as $\mathrm{Cov}$ and suppose that for some $\beta \in (0, d)$ we have the bound $|\langle \tilde a(x) \otimes \tilde a(0) \rangle| \le |x|^{-\beta}$. Let $\Phi : \mathbb{R}^{d \times d} \to \mathbb{R}^{d \times d}$ be a 1-Lipschitz map; denote the probability distribution of $\Phi(\tilde a)$ as $\langle\cdot\rangle$. Consider a functional $F$ on the space of tensor fields $\tilde a$ of the form $F = F(a)$ with $a(x) := \Phi(\tilde a(x))$; we shall use the abbreviation $F(\tilde a)$ for $F(\Phi(\tilde a))$. Let $q \in (1, 2)$ be given by and suppose that the Fréchet derivative of $F$ with respect to Then the estimate (18) is satisfied, i.e. we have
We observe that if $q$ and $\beta$ are related by (21), as $\beta \uparrow d$ we have $q \uparrow 2$ and for $\beta \downarrow 0$ we have $q \downarrow 1$.
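The limiting behaviour of $q$ can be made concrete. The explicit relation below is an assumption: it is the exponent balance forced by combining Hölder's inequality with the Hardy-Littlewood-Sobolev inequality for the kernel $|x|^{-\beta}$ (the two tools named in the proof of Lemma 1), and it reproduces both stated limits; it is not quoted from the source's display (21).

```latex
% Assumption: (21) is the Hoelder/HLS exponent balance for the kernel |x|^{-\beta}.
\begin{align*}
\frac{1}{q} = 1 - \frac{\beta}{2d}
\quad\Longleftrightarrow\quad
q = \frac{2d}{2d - \beta},
\qquad
\beta \uparrow d \;\Rightarrow\; q \uparrow 2,
\qquad
\beta \downarrow 0 \;\Rightarrow\; q \downarrow 1 .
\end{align*}
% Example: d = 3, \beta = 1 gives q = 6/5.
```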
For linear functionals of (the gradient of) the corrector (which are therefore nonlinear functionals of the coefficient field a), we now establish an explicit representation of the Fréchet derivative; this will aid us in verifying the Lipschitz condition (22) and thus ultimately the concentration of measure statements (19) and (20) for (an appropriate modification of) such functionals.
Lemma 2. Let $F$ be a linear functional on vector fields $h = h(x)$ of the form
$$Fh := \int g \cdot h \, dx, \tag{23}$$
where $g \in L^p(\mathbb{R}^d; \mathbb{R}^d)$, $p \ge 2$, and $\operatorname{supp} g \subset \{|x| \le r\}$ for some $r \ge 1$. Let $a$ be some coefficient field subject to the ellipticity and boundedness conditions (10), (11). Then the following two assertions hold:
1) Consider the Fréchet derivative $\frac{\partial F}{\partial a}$ of the functional $F := F\nabla\sigma_{ijk}$ (note that this functional is nonlinear in $a$, although it is linear in $\sigma_{ijk}$) at $a$ (for some fixed $i, j, k$). Introduce the decaying solutions $v$, $\tilde v_{jk}$ to the equations
$$-\Delta v = \nabla \cdot g \tag{24}$$
and (where $a^*$ denotes the transpose of $a$) We then have the representation
2) Consider the Fréchet derivative $\frac{\partial F}{\partial a}$ of the functional $F := F\nabla\phi_i$ at $a$. Introduce the decaying solution $v$ to the equation (again, $a^*$ denoting the transpose of $a$)
$$-\nabla \cdot a^* \nabla v = \nabla \cdot g. \tag{27}$$
We then have the representation The previous explicit representation of the Fréchet derivative for certain linear functionals of (the gradient of) the corrector (φ, σ) enables us to verify the bound (22) for the Malliavin derivative, provided that a certain mean value property is satisfied for a-harmonic functions. Note that the latter requirement is a condition on the coefficient field a; in Lemma 4 below we shall provide a sufficient condition for this property to hold.
As the functionals which the next lemma shall be applied to are basically averages of $\nabla\phi$ or $\nabla\sigma$ over cubes of a certain scale $r$, we state the lemma in a form which makes it directly applicable in such a setting. In particular, the boundedness assumption (29) for the linear functional is motivated by these considerations.
∞)) satisfying the support and boundedness condition Suppose that the constraint holds (with $c(d, \lambda) > 0$ to be fixed in the proof below). Let $q \in (1, 2)$ be related to $p$ through Consider the Fréchet derivative $\frac{\partial F}{\partial a}$ of the functional $F := F\nabla\sigma_{ijk}$ (or the functional $F := F\nabla\phi_i$; note that these functionals are nonlinear functionals of $a$) at some symmetric coefficient field $a$ subject to the conditions (10), (11). Provided that the coefficient field $a$ is such that the mean value property
$$\fint_{|x| \le \rho} |\nabla u|^2 \, dx \lesssim \fint_{|x| \le R} |\nabla u|^2 \, dx \quad \text{for any } R \ge r \text{ and any } \rho \in [r, R] \tag{32}$$
holds for any $a$-harmonic function $u$ and provided that furthermore $a$ is such that is satisfied, we have the estimate
Note that for $q$ related to $\beta$ through (21) and $p$ related to $q$ through (31), the estimate (34) shows that the $L^q$-norm of the Malliavin derivative decays like $r^{-\frac{\beta}{2}}$. This demonstrates that for functionals like our averages of $\nabla(\phi, \sigma)$ (note that these functionals have vanishing expectation due to the vanishing expectation of $\nabla(\phi, \sigma)$), the concentration of measure indeed improves on large scales with the desired exponent: The "typical value" of the average of $\nabla(\phi, \sigma)$ on some scale $r$ decays like $r^{-\frac{\beta}{2}}$. We now have to provide a sufficient condition for the mean value property for $a$-harmonic functions (32). To do so, we make use of the following result from [10], which provides the mean-value property assuming just an appropriate sublinearity condition on the corrector $(\phi, \sigma)$.
Proposition 2 (see [10, Lemma 2]). There exists a constant $C_0$ only depending on dimension $d$ and ellipticity ratio $\lambda > 0$ with the following property: Suppose that for an elliptic coefficient field $a$ subject to the ellipticity and boundedness conditions (10) and (11) the scalar and vector potentials $(\phi, \sigma)$, cf. (4) and (7), satisfy Then for any two radii $R \ge r$ and $\rho \in [r, R]$ and any $a$-harmonic function $u$ in $\{|x| \le R\}$ we have
We shall show in the proof of the next lemma that the quantitative sublinearity condition on the corrector (35) may be reduced to a smallness assumption on a certain family of linear functionals of the gradient of the corrector. Basically, these functionals will be obtained by averaging the gradient of the corrector on appropriate cubes, cf. the proof below. In combination with the previous proposition, we get the following lemma. Note that this result will allow us to buckle, since by Lemma 3 and Lemma 1 the mean-value property (32) and thus ultimately (36) implies
With these preparations, we are able to establish our main theorem. The main technical difficulty in the proof below is that our estimate for the Malliavin derivative of linear functionals of the gradient of the corrector (cf. (34)) is a conditional bound: It relies on the assumption that the mean-value property (32) holds for $a$-harmonic functions on scales larger than $r$. For the concentration of measure estimate (19), however, an unconditional estimate of the form (18) or (22) (the latter being a proxy for (18)) is needed. By Lemma 4 we know that the mean-value property holds, provided that for a certain family of linear functionals of the corrector the smallness estimate $\sup_{R \ge r \text{ dyadic};\, n = 1, \ldots, N}$ is satisfied ($C_0$ being a universal constant). To circumvent this problem, in the proof below we therefore introduce the family of functionals for which by design the unconditional bound for the Malliavin derivative holds. Therefore, concentration of measure is applicable to $\tilde F_r$.
The remainder of the proof of the first part of our theorem below is dedicated to handling the (a priori unknown) expectation $\langle \tilde F_r \rangle$.
The proof of the second assertion of our main theorem will mainly rely on the first assertion of the theorem as well as the quantitative improvement of the Malliavin derivative of averages of $(\nabla\phi, \nabla\sigma)$ on larger scales, as captured by the estimate (34).
Proof of (20). By Chebyshev's inequality, (19) implies the corresponding one-sided tail bound for $F$. In combination with the same estimate with $F$ replaced by $-F$, we obtain (20).
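The Chebyshev step can be spelled out as follows. This is a sketch under the assumption that (19) has the usual form of Gaussian concentration for 1-Lipschitz functionals, $\langle \exp(\lambda (F - \langle F \rangle)) \rangle \le \exp(\lambda^2 / 2)$ for all $\lambda \ge 0$; the constants in the source's (19) and (20) may differ.

```latex
% Assumption: (19) reads  < exp(\lambda (F - <F>)) >  <=  exp(\lambda^2 / 2)  for \lambda >= 0.
% I(E) denotes the characteristic function of the event E.
\begin{align*}
\bigl\langle I( F - \langle F \rangle \ge M ) \bigr\rangle
  &\le e^{-\lambda M} \bigl\langle e^{\lambda (F - \langle F \rangle)} \bigr\rangle
   \le \exp\Bigl( -\lambda M + \tfrac{\lambda^2}{2} \Bigr)
   \qquad \text{for all } \lambda \ge 0, \\
\lambda = M : \qquad
\bigl\langle I( F - \langle F \rangle \ge M ) \bigr\rangle
  &\le \exp\Bigl( -\tfrac{M^2}{2} \Bigr), \\
\bigl\langle I( |F - \langle F \rangle| \ge M ) \bigr\rangle
  &\le 2 \exp\Bigl( -\tfrac{M^2}{2} \Bigr)
   \qquad \text{(apply the same bound to } -F \text{)} .
\end{align*}
```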
Proof of Lemma 1. We need to verify that the condition (18) is implied by the assumption (22). To do so, we first note that by Hölder's inequality we have for any exponent Since $\mathrm{Cov}$ is the convolution with $\langle \tilde a(x) \otimes \tilde a(0) \rangle$ and since we have the bound $|\langle \tilde a(x) \otimes \tilde a(0) \rangle| \le |x|^{-\beta}$, we have for the second factor which allows us to use the Hardy-Littlewood-Sobolev inequality provided the exponents $q$ and $\beta$ are related by (21). From this string of inequalities we learn that (18) also holds provided We now change variables according to $a(x) = \Phi(\tilde a(x))$; by the chain rule for $F(\tilde a) = F(\Phi(\tilde a))$ we have $\frac{\partial F}{\partial \tilde a}(\tilde a, x) = \Phi'(\tilde a(x)) \frac{\partial F}{\partial a}(a, x)$, so that by the 1-Lipschitz continuity of $\Phi$, our assumption (22) implies (37) and thus (18).
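The string of inequalities in this proof can be sketched as follows. The notation $g := \frac{\partial F}{\partial \tilde a}$ and $q' := \frac{q}{q-1}$ is introduced here for illustration, and the quadratic form on the left is an assumed reading of the quantity controlled in (18); only the exponent count is claimed.

```latex
% Assumed notation: g := \partial F / \partial \tilde a,  q' := q / (q - 1).
% Assumed reading of (18): a bound on the quadratic form  \int g \cdot Cov g \, dx.
\begin{align*}
\int g \cdot \mathrm{Cov}\, g \, dx
  &\le \|g\|_q \, \|\mathrm{Cov}\, g\|_{q'}
  && \text{(H\"older)}, \\
\|\mathrm{Cov}\, g\|_{q'}
  &\le \bigl\| \, |\cdot|^{-\beta} * |g| \, \bigr\|_{q'}
   \lesssim \|g\|_q
  && \text{(HLS, valid if } \tfrac{1}{q'} = \tfrac{1}{q} - \tfrac{d - \beta}{d} \text{)}, \\
\tfrac{1}{q'} = 1 - \tfrac{1}{q}
  &\;\Longrightarrow\; \tfrac{1}{q} = 1 - \tfrac{\beta}{2d}
  && \text{(so } 1 < q < 2 \text{ for } 0 < \beta < d \text{)} .
\end{align*}
```

Under this reading, a bound on $\|g\|_q$ controls the quadratic form, which is the mechanism by which an $L^q$ condition on the derivative can stand in for (18).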

Representation of the Malliavin derivative
Proof of Lemma 2. We first give the argument for the "vector potential" $\sigma$, fixing a component $\sigma_{ijk}$. Consider a functional of the form $F := F\nabla\sigma_{ijk}$ with $Fh$ as in (23). We claim that the Fréchet derivative of $F$ with respect to $a$ is given by (26) where the functions $v = v(x)$ and $\tilde v_{jk} = \tilde v_{jk}(a, x)$ are determined as the decaying solutions of the elliptic equations (24) and (25). Computing the functional derivative of $F$ as a function of $a$ amounts to a linearization. We thus consider an arbitrary tensor field $\delta a = \delta a(x)$, which we think of as an infinitesimal perturbation of $a$, and which thus generates infinitesimal perturbations $\delta\phi$ and $\delta\sigma$ of $\phi$ and $\sigma$ according to (4), (6), and (9), that is, $-\nabla \cdot (a \nabla \delta\phi_i + \delta a (\nabla\phi_i + e_i)) = 0$ and In terms of the infinitesimal perturbation $\delta F$ of $F$, this implies by integration by parts (or rather by directly appealing to the weak Lax-Milgram formulations of the elliptic equations) which is nothing else than (26).
Let us now establish the second part of our lemma. Consider a functional of the scalar potential of the form F := F ∇φ i . To represent its Fréchet derivative, introduce the decaying solution v to the equation (27). We observe that the variation of F with respect to a is given by which leads to the conclusion (28).
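The variation in the scalar case can be sketched in a few lines. The computation assumes that the functional has the form $Fh = \int g \cdot h \, dx$ and uses the linearized corrector equation together with the weak form of (27); the resulting tensor-product representation is what this sketch suggests for (28), not a quotation of it.

```latex
% Assumptions: F h = \int g \cdot h \, dx;
%   linearized corrector equation  -\nabla \cdot (a \nabla \delta\phi_i + \delta a (\nabla\phi_i + e_i)) = 0;
%   (27)  -\nabla \cdot a^* \nabla v = \nabla \cdot g.
\begin{align*}
\delta F
  &= \int g \cdot \nabla \delta\phi_i \, dx
   = -\int \nabla \delta\phi_i \cdot a^* \nabla v \, dx
  && \text{(test (27) with } \delta\phi_i \text{)} \\
  &= -\int \nabla v \cdot a \nabla \delta\phi_i \, dx
   = \int \nabla v \cdot \delta a \, (\nabla\phi_i + e_i) \, dx
  && \text{(test the linearized equation with } v \text{)} .
\end{align*}
% This suggests  \partial F / \partial a = \nabla v \otimes (\nabla\phi_i + e_i).
```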

Sensitivity estimate
Proof of Lemma 3. We now argue that under certain boundedness assumptions on $F = Fh$ as a linear functional in vector fields $h = h(x)$, we control the size (22) of its Fréchet derivative $\frac{\partial F}{\partial a} = \frac{\partial F}{\partial a}(a, x)$ as a nonlinear functional $F\nabla\sigma_{ijk} = F(a)$ in coefficient fields $a = a(x)$ (and similarly in the case $F(a) = F\nabla\phi_i$; for this case, the (simpler) proof is sketched afterwards). To this aim, let us first note that we have a Calderón-Zygmund estimate for $-\nabla \cdot a \nabla$ with the exponent $p$ and its dual exponent $\frac{p}{p-1}$: For any decaying function $w$ and vector field $h$ on $\mathbb{R}^d$ related by
$$-\nabla \cdot a \nabla w = \nabla \cdot h \tag{40}$$
we have
$$\|\nabla w\|_p \lesssim \|h\|_p \quad \text{and} \quad \|\nabla w\|_{\frac{p}{p-1}} \lesssim \|h\|_{\frac{p}{p-1}}. \tag{41}$$
This assertion holds by Meyers' estimate (see e.g. [19]), which only requires the ellipticity and boundedness assumptions (10), (11) on $a$ as well as the estimate $|p - 2| \ll 1$, which is ensured by our condition (30). Note that an analogous estimate would hold for the dual equation $-\nabla \cdot a^* \nabla w = \nabla \cdot h$ if our coefficient field were nonsymmetric.
In the following, we will use the abbreviation $\|\cdot\|_{p,B}$ for the spatial $L^p$-norm on the set $B$; we write $\|\cdot\|_p$ when $B = \mathbb{R}^d$. We start by arguing that because It is obviously enough to establish (42) only for $R \ge 2\rho$; hence by Jensen's inequality, (42) follows from (32) once we establish the reverse Hölder inequality To this purpose, we test $-\nabla \cdot a \nabla u = 0$ with $\eta^{2\gamma}(u - m)$, where $\eta$ is a smooth cut-off of $\chi_{\{|x| \le \frac{R}{2}\}}$ in $\{|x| \le R\}$ (with the property $|\nabla\eta| \lesssim \frac{1}{R}$) and where the exponent $\gamma \ge 1$ and the constant $m \in \mathbb{R}$ will be chosen later. By the ellipticity and boundedness assumptions (10), (11) and Young's inequality we obtain $\int (\eta^\gamma |\nabla u|)^2 \, dx \lesssim \int ((u - m) |\nabla \eta^\gamma|)^2 \, dx$, and thus which by the estimate on $\nabla\eta$ gives On the r. h. s. of (44) we use first Hölder's inequality, then the isoperimetric inequality on $\{|x| \le R\}$ and finally Sobolev's inequality on the whole space (for simplicity, we assume $d > 2$ here) (which, as a simple computation shows, is satisfied precisely for $\gamma = \frac{d}{2}$) and the constant $m$ is the spatial average of $u$ on $\{|x| \le R\}$. The combination of the last two estimates yields which (by $\gamma = \frac{d}{2}$) entails $\|\nabla u\|_{2,|x| \le \frac{R}{2}} \le \|\nabla(\eta^\gamma (u - m))\|_2 \lesssim R^{-\frac{d}{2}} \|\nabla u\|_{1,|x| \le R}$ and thus (43).
It thus remains to estimate the auxiliary functions $v$ and $\tilde v_{jk}$. The estimate of the terms in line (48) is easy: By (45) and Calderón-Zygmund for (24) we obtain $\|\nabla v\|_p \lesssim \|g\|_p \le r^{-\frac{p-1}{p} d}$. By (41) we have Calderón-Zygmund with exponent $p$ for the equation (25), so that $\|\nabla \tilde v_{jk}\|_p \lesssim \|\nabla v\|_p \lesssim r^{-\frac{p-1}{p} d}$. In order to control the terms in line (49), we shall establish the following estimates for $n \in \mathbb{N}$ $\|\nabla v\|_{p,|x| \ge 2^n r}$ We note that since $p > 2$, these estimates imply that the sum over $n$ in (49) converges and gives (34).
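To see where the exponent $\frac{\beta}{2}$ comes from, one can track the powers of $r$ in the leading term. The sketch below rests on three assumptions that are not quoted from the source: the Hölder relation $\frac{1}{q} = \frac{1}{p} + \frac{1}{2}$ between $q$ and $p$ (consistent with bounding an $L^q$-norm by a product of an $L^p$-norm and an $L^2$-norm), the energy bound $\|\nabla\phi_i + e_i\|_{2,|x| \le 2r} \lesssim r^{\frac{d}{2}}$, and the relation $\frac{1}{q} = 1 - \frac{\beta}{2d}$ between $q$ and $\beta$.

```latex
% Assumptions: 1/q = 1/p + 1/2,   1/q = 1 - \beta/(2d),
%              \|\nabla\phi_i + e_i\|_{2,|x| \le 2r} \lesssim r^{d/2}.
\begin{align*}
\|\nabla v\|_p \, \|\nabla\phi_i + e_i\|_{2,|x| \le 2r}
  \lesssim r^{-\frac{p-1}{p} d} \, r^{\frac{d}{2}}
  = r^{-d \left( \frac{1}{2} - \frac{1}{p} \right)}
  = r^{-d \left( 1 - \frac{1}{q} \right)}
  = r^{-\frac{\beta}{2}} .
\end{align*}
```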
The estimate (50) for the solution $v$ of the constant coefficient equation (24) is classical: We already argued that $\|\nabla v\|_p \lesssim r^{-\frac{p-1}{p} d}$; by the estimate on the support of $g$ in (45) we have that $v$ is harmonic in $\{|x| \ge r\}$ and that it has vanishing flux $\int_{|x| = r} x \cdot \nabla v = 0$. It thus decays as $|\nabla v(x)| \lesssim |x|^{-d} r^{d - \frac{d}{p}} \|\nabla v\|_p$ for $|x| \ge 2r$, which in particular yields (50). We now turn to (51) and to this purpose rewrite the equation (25) for $\tilde v_{jk}$ as $-\nabla \cdot a^* \nabla \tilde v_{jk} = \nabla \cdot \tilde g$ with the r. h. s. $\tilde g := -a^* (\partial_j v \, e_k - \partial_k v \, e_j)$. We already argued that $\|\tilde g\|_p \lesssim r^{-\frac{p-1}{p} d}$ and (50) translates into In order to proceed, we split $\tilde g$ into $\{\tilde g_m\}_{m = 0, 1, \ldots}$ where $\tilde g_0$ is supported in This entails a splitting of $\tilde v_{jk}$ into $\{\tilde v_m\}_{m = 0, 1, \ldots}$, where $\tilde v_m$ is the Lax-Milgram solution of
$$-\nabla \cdot a^* \nabla \tilde v_m = \nabla \cdot \tilde g_m. \tag{54}$$
We will now argue that which implies the estimate (51) by the triangle inequality $\|\nabla \tilde v_{jk}\|_{p,|x| \ge 2^n r} \le \sum_{m=0}^\infty \|\nabla \tilde v_m\|_{p,|x| \ge 2^n r}$. We note that (53) together with our Calderón-Zygmund estimate (41) applied to (54) yields $\|\nabla \tilde v_m\|_p \lesssim (2^m)^{-d + \frac{d}{p}} r^{-\frac{p-1}{p} d}$. In order to establish (55), it thus remains to show We argue in favor of (56) by duality and thus consider an arbitrary $h \in L^{\frac{p}{p-1}}$ supported in $\{|x| \ge 2^n r\}$ and denote by $w$ the corresponding Lax-Milgram solution of (40). By integration by parts, we deduce from (40) and (54) that $\int h \cdot \nabla \tilde v_m \, dx = \int \tilde g_m \cdot \nabla w \, dx$. By the support condition on $\tilde g_m$ this yields $\int h \cdot \nabla \tilde v_m \, dx \le \|\tilde g_m\|_p \|\nabla w\|_{\frac{p}{p-1},|x| \le 2^{m+1} r}$.
By the support assumption on $h$ we have that $w$ is $a$-harmonic in $\{|x| \le 2^n r\}$.
Since $m < n$, we may use (42) applied to $w$ in form of We combine this with (41) in form of $\|\nabla w\|_{\frac{p}{p-1}} \lesssim \|h\|_{\frac{p}{p-1}}$, and with (53), to obtain $\int h \cdot \nabla \tilde v_m \, dx$
In the case of a functional of the scalar potential of the form $F(a) = F\nabla\phi_i$, we claim that the Fréchet derivative of $F$ is again controlled in the sense of (34). The proof is mostly analogous to the previous one; we again rewrite $F$ as in (46) with some $g$ satisfying (45). Starting from the representation (28), one derives an analogue of estimate (47) reading
$$\Bigl\| \frac{\partial F}{\partial a} \Bigr\|_q \lesssim \|\nabla v\|_{p,|x| \le 2r} \|\nabla\phi_i + e_i\|_{2,|x| \le 2r} + \sum_{n=1}^\infty \|\nabla v\|_{p, 2^n r \le |x| \le 2^{n+1} r} \|\nabla\phi_i + e_i\|_{2, 2^n r \le |x| \le 2^{n+1} r}.$$
The second factors on the right in this estimate coincide with the ones in the case F (a) = F ∇σ ijk ; therefore, we get the following analogue to estimate (49):

Sufficient conditions for the mean value property in terms of linear functionals of the corrector
Proof of Lemma 4. In order to show that (36) and (33) imply (32), we only need to show the existence of functionals $F_1, \ldots, F_N$ such that (36) and (33) imply By Proposition 2, the estimate (32) follows from (57).
Let us now give the argument for (57). First, it is clearly enough to show that for any $0 < \delta \ll 1$, there exist $N \lesssim \delta^{-d}$ functionals $F_1, \ldots, F_N$ on vector fields which are bounded in the sense of and such that for any dyadic $\rho \ge 1$ (59) By dyadic iteration, it is enough to show for any dyadic $\rho \ge 1$ Indeed, abbreviating by $D_m$ the normalized quantity at the dyadic scale $\rho = 2^m$, the estimate (60) may be rewritten as (using a slight readjustment of $\delta$) which may be iterated to By our sublinearity assumption on the corrector (33) (which may be rewritten as $\lim_{m_0 \uparrow \infty} D_{m_0} = 0$), this yields (59).
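The dyadic iteration is an elementary recursion. The sketch below assumes that the rewritten form of (60) is a one-step inequality $D_m \le \delta D_{m+1} + E_m$ with $0 < \delta < 1$, where $E_m \ge 0$ is hypothetical notation for the contribution of the functionals at scale $2^m$; the exact terms in the source may differ.

```latex
% Hypothetical notation: E_m >= 0 collects the functional terms at scale 2^m.
% Assumed shape of the rewritten (60):  D_m <= \delta D_{m+1} + E_m,  0 < \delta < 1.
\begin{align*}
D_m &\le \delta D_{m+1} + E_m
     \le \delta^2 D_{m+2} + \delta E_{m+1} + E_m
     \le \dots \\
    &\le \delta^{k} D_{m+k} + \sum_{j=0}^{k-1} \delta^{j} E_{m+j}
     \le \delta^{k} D_{m+k} + \frac{1}{1 - \delta} \sup_{j \ge 0} E_{m+j}, \\
D_m &\le \frac{1}{1 - \delta} \sup_{j \ge 0} E_{m+j}
     \qquad \text{after } k \uparrow \infty, \text{ using } \lim_{m_0 \uparrow \infty} D_{m_0} = 0 .
\end{align*}
```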
We now turn to the argument for (60). By Caccioppoli's estimate on (4) we have and thus in particular for the flux $q_i = a(\nabla\phi_i + e_i)$ Caccioppoli's estimate on (9) gives The last three estimates combine to $\int_{|x| \le \rho} |\nabla(\phi, \sigma)|^2 \, dx$ Hence for (60) it is enough to show This statement is not just true for $(\phi, \sigma) - \fint_{|x| \le \rho} (\phi, \sigma)$, but for any function $\zeta$ of vanishing spatial average on $\{|x| \le \rho\}$: By rescaling, it is sufficient to show the estimate on the unit ball $\{|x| \le 1\}$. It is more convenient to see it when the unit ball $\{|x| \le 1\}$ is replaced by the unit cube $(0, 1)^d$: Indeed, dividing $(0, 1)^d$ into $N = \delta^{-d}$ (suppose that $\delta^{-1}$ is an integer) subcubes $\{Q_n\}_{n = 1, \ldots, N}$ of side length $\delta$ and setting $F_n \nabla\zeta := \fint_{Q_n} \zeta \, dx$ (recall that $\int_{(0,1)^d} \zeta \, dx = 0$ so that $F_n$ is indeed a function of $\nabla\zeta$), (62) follows from using Poincaré's estimate on each $Q_n$ in form of $\int_{Q_n} \zeta^2 \, dx - |Q_n| (\fint_{Q_n} \zeta \, dx)^2 \lesssim \delta^2 \int_{Q_n} |\nabla\zeta|^2 \, dx$ and then summing up. We note that by Poincaré's estimate on $(0, 1)^d$, the $F_n$ have the desired boundedness property (58), at first on gradient fields $\nabla\zeta$ and then on any vector field $h$ by extension à la Hahn-Banach.
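The summation step can be written out explicitly. The following sketch uses only the Poincaré estimate on each subcube as stated above, the definition $F_n \nabla\zeta = \fint_{Q_n} \zeta \, dx$, and $\sum_n |Q_n| = 1$; that the resulting inequality is the shape of (62) is an assumption.

```latex
% Uses: Poincare on each Q_n (side length \delta),  F_n \nabla\zeta = \fint_{Q_n} \zeta dx,
%       |Q_n| = \delta^d,  N = \delta^{-d},  so  \sum_n |Q_n| = 1.
\begin{align*}
\int_{(0,1)^d} \zeta^2 \, dx
  &= \sum_{n=1}^N \Bigl( \int_{Q_n} \zeta^2 \, dx - |Q_n| \bigl( F_n \nabla\zeta \bigr)^2 \Bigr)
   + \sum_{n=1}^N |Q_n| \bigl( F_n \nabla\zeta \bigr)^2 \\
  &\lesssim \delta^2 \sum_{n=1}^N \int_{Q_n} |\nabla\zeta|^2 \, dx
   + \max_{n = 1, \ldots, N} \bigl| F_n \nabla\zeta \bigr|^2 \\
  &= \delta^2 \int_{(0,1)^d} |\nabla\zeta|^2 \, dx
   + \max_{n = 1, \ldots, N} \bigl| F_n \nabla\zeta \bigr|^2 .
\end{align*}
```

The first term carries the small prefactor $\delta^2$, which is what allows it to be absorbed in the dyadic iteration, while the second term involves only the finitely many functionals $F_n$.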

Proof of Main Result
Proof of Theorem 1.

Proof of Assertion i).
Consider the functionals $\{F_n\}_{n=1,\dots,N}$ and their rescalings $\{F_{n,R}\}_{n=1,\dots,N;\,R\ \mathrm{dyadic}}$ constructed in Lemma 4. Let us abbreviate $F_{n,R}\nabla(\phi,\sigma)$ as $F_{n,R}$. We would like to apply concentration of measure to these functionals.
The main difficulty that we need to overcome is that our sensitivity estimate (34) in Lemma 3 for the quantity $F_{n,r}$ is based on the assumption that the mean-value property (32) holds for $a$-harmonic functions down to scale $r$. By Lemma 4 this assumption may be reduced to the smallness assumption (36) for our functionals $F_{m,R}$ on scales $R \ge r$, so that Lemma 3 becomes applicable under the assumption (36): Let $q$ be related to $\beta$ through (21) and let $p$ be related to $q$ through (31). By the smallness assumption on $\beta$ in our theorem (cf. (13)), we deduce that (30) holds. By scaling, our functionals $F_{n,r}$ satisfy the estimate (29) up to a universal constant factor. Furthermore, by ergodicity the property (33) holds for $\langle\cdot\rangle$-almost every coefficient field $a$ (regarding $\sigma$, this result has been shown in [10, Lemma 1]; for $\phi$, it is classical but may also be found in [10]). Thus, the estimate (34) holds for $F_{n,r}$ under the assumption (36), i.e. there exists a constant $C_0$ depending only on $d$, $\lambda$, and $\beta$ such that for any $n = 1,\dots,N$ and any radius $r$ the implication (63), whose hypothesis is a smallness condition on the $F_{m,R}$ for dyadic $R \ge r$, holds for $\langle\cdot\rangle$-a.e. coefficient field $a$.
To apply concentration of measure in the form of Proposition 1 to some functional $F$, however, we need an unconditional bound on the Malliavin derivative (cf. (18) and (22), respectively).
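Therefore we first introduce a new random variable whose derivative vanishes whenever the smallness condition in (63) is violated: Consider the auxiliary random variable $\tilde F_r$ defined in (64), where the sup runs over all dyadic radii $R = 2^k r$, $k \in \mathbb{N}_0$. By the usual differentiation rules applied to the Fréchet derivative $\frac{\partial}{\partial a}$ in the norm $\|\cdot\|_q$, we may bound $\big\|\frac{\partial\tilde F_r}{\partial a}\big\|_q$ in terms of the $\big\|\frac{\partial F_{m,R}}{\partial a}\big\|_q$ and thus obtain by (63) $\big\|\frac{\partial\tilde F_r}{\partial a}\big\|_q^2 \lesssim \frac{1}{r^\beta}$. By Lemma 1, we may apply concentration of measure in the form of (20) to the random variable $c\,r^{\beta/2}\tilde F_r$ (where $c$ is some small universal constant). This yields a concentration estimate for $\tilde F_r$ around its expectation, so that it remains to control the expectation $\langle\tilde F_r\rangle$.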
Because of (8) and the definition of $F_{m,R}$, it follows from qualitative ergodicity of $\langle\cdot\rangle$ and Birkhoff's ergodic theorem that $\lim_{R\uparrow\infty} F_{m,R} = 0$ almost surely, so that by dominated convergence $\lim_{r\uparrow\infty}\langle\tilde F_r\rangle = 0$. Hence there exists a finite radius $r_0$ which is minimal with the property (66). On the basis of (67), we now get a quantitative estimate on $r_0$. To this purpose we now consider the auxiliary variable $\bar F_{n,r}$ defined in (68), where again the sup runs over all dyadic radii $R = 2^k r$, $k \in \mathbb{N}_0$, and where $\eta = \eta(F)$ is a cut-off function.

The advantage of the auxiliary variable (68) over (64) is that we control its expectation: Since the stationary $\nabla(\phi,\sigma)$ has vanishing expectation, cf. (8), and by the linearity of $F_{n,r}$ in $\nabla(\phi,\sigma)$, we have $\langle F_{n,r}\rangle = 0$ and thus $\langle\bar F_{n,r}\rangle = \langle(\eta-1)F_{n,r}\rangle$. Since the stationary $\nabla(\phi,\sigma)$ has bounded second moments, cf. (8), and by the boundedness property of $F_{n,r}$ in $\nabla(\phi,\sigma)$, the construction of $\eta$ and the Cauchy-Schwarz inequality yield a bound on $\langle\bar F_{n,r}\rangle$, which in view of (67) improves to (70).

By the differentiation rules for the Fréchet derivative $\frac{\partial}{\partial a}$ in the norm $\|\cdot\|_q$, we obtain for the auxiliary random variable $\bar F_{n,r}$ by (63) the bound $\big\|\frac{\partial\bar F_{n,r}}{\partial a}\big\|_q^2 \lesssim \frac{1}{r^\beta}$, and hence by concentration of measure in the form of (20) (applied to $c\,r^{\beta/2}\bar F_{n,r}$ by means of Lemma 1, $c$ being a small universal constant) a concentration estimate for $\bar F_{n,r}$ around its expectation. Together with (70) this yields a tail estimate for $\bar F_{n,r}$. By definition (68) we have $I(|F_{n,r}| \ge M) \le I\big(\sup_{m,R\ge r}|F_{m,R}| \ge \frac{1}{2C_0}\big) + I(|\bar F_{n,r}| \ge M)$, so that by (67) the above upgrades to a tail estimate for $F_{n,r}$. Since $r^\beta\exp(-\frac{1}{C}r^\beta) \lesssim 1$ for all $r$, this estimate holds without the lower restriction on $M$, which is (71). Using this estimate with $r$ replaced by $R$ and summing over the finite index set $n = 1,\dots,N$ and all dyadic $R \ge r$, we obtain (72) for $M \le 1$ and $r \ge r_0$, and thus in particular a tail estimate for the auxiliary random variable (64), where the upper bound on $M$ is immaterial since $\tilde F_r \le \frac{1}{C_0} \le 1$. Using $\langle\tilde F_r\rangle = \int_0^\infty\langle I(\tilde F_r \ge M)\rangle\,dM$, this yields a quantification of $\lim_{r\uparrow\infty}\langle\tilde F_r\rangle = 0$ for all $r \ge r_0$.
Since $r_0$ was minimal in (66) and since $\langle\tilde F_r\rangle$ depends continuously on $r$, this yields the desired $r_0 \lesssim 1$.
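The integration step behind this quantification can be sketched as follows. The computation assumes, as a simplification that is not taken verbatim from the estimates above, that $0 \le \tilde F_r \le 1$ and that the tail estimate for the auxiliary variable (64) has the Gaussian form $\langle I(\tilde F_r \ge M)\rangle \le C\exp(-\frac{1}{C}r^\beta M^2)$ for $0 \le M \le 1$; the actual estimate may carry additional factors.

```latex
% Layer-cake formula, restricted to 0 <= M <= 1 because \tilde F_r <= 1,
% followed by the Gaussian integral \int_0^\infty e^{-a M^2} dM = (1/2) \sqrt{\pi / a}
% with a = r^\beta / C:
\[
\langle\tilde F_r\rangle
 = \int_0^1 \langle I(\tilde F_r \ge M)\rangle\,dM
 \le C\int_0^\infty \exp\Big(-\frac{1}{C}\,r^\beta M^2\Big)\,dM
 = \frac{C}{2}\sqrt{\pi C}\; r^{-\beta/2} .
\]
```

Under these assumptions the expectation decays like $r^{-\beta/2}$, which makes the qualitative statement $\lim_{r\uparrow\infty}\langle\tilde F_r\rangle = 0$ quantitative.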
It remains to argue why (71), which together with (73) may be rephrased as $\langle I(|F_{n,r}| \ge M)\rangle \lesssim \exp(-\frac{1}{C}r^\beta M^2)$ for $M \le 1$ and $r \gg 1$, yields (15). It suffices to include the given functional $F$ from (14) in the list of finitely many functionals $F_1,\dots,F_N$, say, as the last functional $F_N = F$, and then to specialize the above to $n = N$. We note that for $q$ related to $\beta$ through (21) and $p$ related to $q$ through (31) one has $\frac{p}{p-1} = \frac{2d}{d+\beta}$, i.e. (14) entails (29). Note that by adjusting the constants, (15) is trivial for $r \lesssim 1$, so that we obtain (15) over the whole range $r \ge 0$.

Proof of Assertion ii).
The arguments in this section require $\beta < 2$, which in view of our assumption $\beta \ll 1$ is no restriction. Let $r_*$ denote the minimal dyadic radius with the property (16); we know, but do not have to use, that it is finite by quantitative ergodicity. In order to establish (17), it is enough to show (74) for a given dyadic $r_0 \ge 1$. It will be convenient to replace balls by cubes. Moreover, all radii, or rather side lengths, are dyadic. By definition of $r_*$ as the smallest radius with (16), the event $r_* > r_0$ means that there exists a radius $R \ge r_0$ with (75), where $f(z) := \log(e + \log z)$.
In the sequel, the intermediate (dyadic) radius $r_1 \in [r_0, R]$ defined in (76) will play a role. Note that we use here $\beta > 0$ and that $f(z)$ grows subalgebraically. For the l. h. s. of (75) we note the decomposition (77), where $(\phi,\sigma)_r$ denotes the $L^2((-R,R)^d)$-orthogonal projection of $(\phi,\sigma)$ onto the space of functions that are piecewise constant on the $(\frac{R}{r})^d$ dyadic sub-cubes $Q$ of "level $r$" (that is, of side length $2r$) of the cube $(-R,R)^d$. In other words, on such a sub-cube $Q$, $(\phi,\sigma)_r = \fint_Q(\phi,\sigma)\,dx$. With this language, we may rewrite the first r. h. s. term of (77) as a sum over the sub-cubes $Q$, to which we apply Poincaré's estimate on each of the cubes $Q$ and then Caccioppoli's estimate based on (4) & (9), cf. (61). As a consequence of Lemma 4, there exist $N \sim 1$ linear functionals $\{F_n\}_{n=1,\dots,N}$ whose rescaled versions $F_{n,r}$ satisfy the boundedness property (14) and for which an implication in terms of the $F_{n,r}$ holds for any $r \ge 2R$. From the two last statements we gather (78). In view of (76) this can be rewritten as an implication whose hypothesis reads $(F_{n,r}\nabla(\phi,\sigma))^2 \ll 1$ for all dyadic $r \ge 2R$ and $n = 1,\dots,N$, provided that we adjust the definition (76) of $r_1$ appropriately (to obtain the estimate $\le \frac{1}{2}\,\cdot$ in (78) in place of just $\lesssim$).

We now turn to the second r. h. s. term in (77), which we may estimate in view of the definition of $(\phi,\sigma)_r$. To this end, for any of the $(\frac{R}{r})^d$ dyadic sub-cubes $Q$ of $(-R,R)^d$ of level $r$ we introduce the $N = 2^d$ linear functionals $F_{Q,n}$ as extensions of functionals $F_{Q,n}\nabla\zeta$ defined in terms of $\{Q'_n\}_{n=1,\dots,2^d}$, an enumeration of the sub-cubes of level $\frac{r}{2}$ of $Q$. These functionals satisfy the desired boundedness property (14) restricted to gradient fields (which is no issue because of Hahn-Banach extension) and translated (which will be no issue because of stationarity), that is, (79). From this we learn, since for the auxiliary function $g(z) := \log^{-2}(z+e)$ the dyadic sum $\sum_{r\in[2r_1,R]} g(\frac{R}{r})$ is universally bounded, that suitable smallness of $(F_{Q,n}\nabla(\phi,\sigma))^2$ for all dyadic $r \in [2r_1,R]$, all $Q$ of level $r$, and $n = 1,\dots,2^d$ entails smallness of the second r. h. s. term in (77). In view of (77), the combination of this with (78) yields (80): if $(F_{Q,n}\nabla(\phi,\sigma))^2 \ll (\frac{R}{r})^2 g(\frac{R}{r})(\frac{r_0}{R})^\beta f(\frac{R}{r_0})$ for all dyadic $r \in [2r_1,R]$, all $Q$ of level $r$, and $n = 1,\dots,2^d$, and if $(F_{n,r}\nabla(\phi,\sigma))^2 \ll 1$ for all dyadic $r \ge 2R$ and $n = 1,\dots,N$, then (75) fails.

Equipped with this deterministic argument, we may now proceed to the stochastic part: In the event of $r_* > r_0$, there exists a dyadic $R \ge r_0$ such that (75) holds, so that we learn from (80) that there exists
• a (dyadic) $r \in [2r_1,R]$, a sub-cube $Q$ of $(-R,R)^d$ of level $r$, and an index $n = 1,\dots,2^d$ such that $(F_{Q,n}\nabla(\phi,\sigma))^2 \gtrsim (\frac{R}{r})^2 g(\frac{R}{r})(\frac{r_0}{R})^\beta f(\frac{R}{r_0})$. In view of the boundedness condition (79) and stationarity, we may apply (15) with $F$ replaced by $F_{Q,n}$ and $M^2$ replaced by $(\frac{R}{r})^2 g(\frac{R}{r})(\frac{r_0}{R})^\beta f(\frac{R}{r_0})$. This $M$ is admissible in the sense of $M \lesssim 1$ because of (76). Hence the probability of each single one of these events is estimated by (15); since $g(z)$ decays sub-algebraically in $z$ and since $\beta < 2$, this estimate takes a simpler form;
• or a (dyadic) $r \ge 2R$ and an index $n = 1,\dots,N$ for which the estimate $(F_{n,r}\nabla(\phi,\sigma))^2 \gtrsim 1$ holds. By the boundedness property of $F_{n,r}$, each single one of these events is estimated as $\big\langle I\big((F_{n,r}\nabla(\phi,\sigma))^2 \ge \frac{1}{C}\big)\big\rangle \lesssim \exp(-\frac{1}{C}r^\beta)$.
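Taking the number $(\frac{R}{r})^d$ of sub-cubes $Q$ into account and recalling $N \lesssim 1$, this implies the estimate (81) on $\langle I(r_* > r_0)\rangle$. Again, since $1 - \frac{\beta}{2} > 0$, we have a calculus estimate that bounds the resulting dyadic sum by a multiple of $\exp(-A)$ for $A \gg 1$.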
Applying this to the first sum over $r$ in (81) with $A = \frac{1}{C}f(\frac{R}{r_0})r_0^\beta$, which satisfies $A \gg 1$ for $r_0 \gg 1$, and using the estimate $\sum_{r\ge 2R;\ r\ \mathrm{dyadic}}\exp(-\frac{1}{C}r^\beta) \lesssim \exp(-\frac{1}{C}R^\beta)$ (which holds provided that $R \ge r_0 \ge 1$) for the second sum over $r$, we obtain a bound on $\langle I(r_* > r_0)\rangle$ by a sum of two contributions over dyadic $R \ge r_0$. Thanks to $\beta > 0$, we have $\exp(-\frac{1}{C}R^\beta) \lesssim \exp(-\frac{1}{C}f(\frac{R}{r_0})r_0^\beta)$, so that the second summand is dominated by the first one. Now we see the reason for the choice of $f(z) = \log(e + \log z)$, for which $f(2^m) \ge \frac{1}{C}(1 + \log(m+1))$: with $\frac{1}{C}r_0^\beta$ playing the role of $A$, the resulting sum over dyadic $R = 2^m r_0$ yields (74). Note that the condition $r_0 \gg 1$ is immaterial after adjusting the constants, as the l. h. s. of (74) is bounded by 1.
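The role of the lower bound on $f(2^m)$ in this last summation can be made explicit by the following sketch. It assumes that the sum runs over dyadic $R = 2^m r_0$, $m \in \mathbb{N}_0$, and writes $c := \frac{1}{C}$ for the constant in $f(2^m) \ge \frac{1}{C}(1 + \log(m+1))$; the restriction $cA \ge 2$ is a convenience introduced here and corresponds to $A \gg 1$.

```latex
% Insert the lower bound f(2^m) >= c (1 + log(m+1)) into each summand:
\[
\sum_{m=0}^{\infty}\exp\big(-A\,f(2^m)\big)
 \le \sum_{m=0}^{\infty}\exp\big(-cA\,(1+\log(m+1))\big)
 = e^{-cA}\sum_{m=0}^{\infty}(m+1)^{-cA} .
\]
% For cA >= 2 the remaining series is bounded by \sum_{k \ge 1} k^{-2} = \pi^2/6:
\[
\sum_{m=0}^{\infty}\exp\big(-A\,f(2^m)\big) \le \frac{\pi^2}{6}\,e^{-cA} .
\]
```

The double logarithm in $f$ thus provides just enough growth in $m$ to make the dyadic sum converge while keeping the exponential rate in $A$.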