Sublinear growth of the corrector in stochastic homogenization: optimal stochastic estimates for slowly decaying correlations

We establish sublinear growth of correctors in the context of stochastic homogenization of linear elliptic PDEs. In case of weak decorrelation and “essentially Gaussian” coefficient fields, we obtain optimal (stretched exponential) stochastic moments for the minimal radius above which the corrector is sublinear. Our estimates also capture the quantitative sublinearity of the corrector (caused by the quantitative decorrelation on larger scales) correctly. The result is based on estimates on the Malliavin derivative for certain functionals which are basically averages of the gradient of the corrector, on concentration of measure, and on a mean value property for a-harmonic functions.


Introduction
Julian Fischer (julian.fischer@mis.mpg.de), Max-Planck-Institut für Mathematik in den Naturwissenschaften, Inselstrasse 22, 04103 Leipzig, Germany
In the present work, we are concerned with the stochastic homogenization of linear elliptic equations of the form
$-\nabla \cdot a \nabla u = f$. (1)
In stochastic homogenization of elliptic PDEs, $a$ is typically a uniformly elliptic and bounded coefficient field, chosen at random according to some stationary and ergodic ensemble $\langle \cdot \rangle$. On large scales (and for slowly varying $f$), one may then approximate the solution $u$ to Eq. (1) by the solution $u_{\mathrm{hom}}$ to the so-called effective equation
$-\nabla \cdot a_{\mathrm{hom}} \nabla u_{\mathrm{hom}} = f$, (2)
which is a constant-coefficient equation with the so-called effective coefficient $a_{\mathrm{hom}}$. Mathematically, this homogenization effect is encoded in growth properties of the corrector (cf. below for a definition of the corrector). The goal of the present paper is to provide a fairly simple proof of quantified sublinear growth of the corrector under very mild assumptions on the decorrelation of the coefficient field $a$ under the ensemble $\langle \cdot \rangle$. We do this in the context of coefficient fields that are essentially Gaussian. More precisely, we consider coefficient fields $a$ which are obtained from a Gaussian random field by pointwise application of some (nonlinear) mapping, the role of the nonlinear map being basically to enforce uniform ellipticity and boundedness of our coefficient field.
The motivation for this result is the following: Gloria et al. [11] have shown that qualitatively sublinear growth of the (extended) corrector $(\phi, \sigma)$ (cf. below for a definition) entails a large-scale intrinsic $C^{1,\alpha}$ regularity theory for $a$-harmonic functions. In a subsequent work [10], the two authors of the present paper have shown that slightly quantified sublinear growth of the corrector even leads to a large-scale intrinsic $C^{k,\alpha}$ regularity theory for any $k \in \mathbb{N}$. Therefore, the results of the present work show that even in case of ensembles with very mild decorrelation, for almost every realization of the coefficient field, $a$-harmonic functions have arbitrary intrinsic smoothness properties on large scales. Furthermore, our results enable us to estimate the scale above which this happens (a random quantity) in a stochastically optimal way. Indeed, the motivation for the present work was to establish such a (necessarily intrinsic) higher order regularity theory under the weakest possible assumptions on the decay of correlations.
By an intrinsic regularity theory we mean that the regularity is measured in terms of objects intrinsic to the Riemannian geometry defined by the coefficient field $a$, like the dimension of the space of $a$-harmonic functions of a certain algebraic growth rate, or like estimates on the Hölder modulus of the derivative of $a$-harmonic functions as measured in terms of their distance to $a$-linear functions. An extrinsic large-scale regularity theory for $a$-harmonic functions in case of random coefficients was initiated on the level of $C^{0,\alpha}$ in [7,21] and pushed to $C^{1,0}$ in [4], which significantly extended qualitative arguments from the periodic case [5] to quantitative arguments in the random case. However, an extrinsic regularity theory is limited to $C^{1,0}$, as can be seen considering the harmonic coordinates: Taking higher order polynomials into account does not increase the local approximation order.
After this motivation, we now return to the discussion of the history of bounds on the corrector, as they depend on assumptions on the stationary ensemble of coefficient fields. Almost-sure sublinearity (always meant in a spatially averaged sense) of the corrector $\phi$ under the mere assumption of ergodicity was a key ingredient in the original work on stochastic homogenization by Kozlov [19] and by Papanicolaou and Varadhan [24]. Almost-sure sublinearity of the extended corrector $(\phi, \sigma)$, as is needed for the large-scale intrinsic $C^{1,\alpha}$ regularity theory, was established in [11] under mere ergodicity.
Yurinskii [25] was the first to quantify sublinear growth under general mixing conditions, however only capturing suboptimal rates even in case of finite range of dependence. Very recently, a much improved quantification of sublinear growth of $\phi$ under finite range assumptions was put forward by Armstrong et al. [1], relying on a variational approach to quantitative stochastic homogenization introduced by Armstrong and Smart [4], an approach which presumably can be extended to the case of non-symmetric coefficients and more general mixing conditions following [3]. Recently, optimal growth bounds on the corrector $\phi$ with optimal (i.e. Gaussian) stochastic integrability have been established under the assumption of finite range of dependence [2,17].
Optimal growth rates have been obtained under a quantification of ergodicity different from finite range or mixing conditions, namely under Spectral Gap assumptions on the ensemble. This functional analytic tool from statistical mechanics was introduced into the field of stochastic homogenization in an unpublished paper by Naddaf and Spencer [23], and further leveraged by Conlon et al. [8,9], yielding optimal rates for some errors in stochastic homogenization in case of a small ellipticity contrast. The work of Gloria, Neukamm and the second author extended these results to the present case of arbitrary ellipticity contrast [12][13][14], in particular yielding at most logarithmic growth of the corrector (and its stationarity in d > 2). Loosely speaking, the assumption of a Spectral Gap Inequality amounts to correlations with integrable tails; in the above-mentioned works it has been used for discrete media (i.e. random conductance models), but has subsequently been extended to the continuum case [15,16].
A strengthening of the Spectral Gap Inequality is given by the Logarithmic Sobolev Inequality (LSI); it is a slight strengthening in terms of the assumption (still essentially encoding integrable tails of the correlations), but a substantial improvement in its effect, since it implies Gaussian concentration of measure for Lipschitz random variables. The assumption of LSI and implicitly concentration of measure, which will be explicitly used in this work, has been introduced into stochastic homogenization by Marahrens and Otto [21]. In [11], it has been shown that the concept of LSI can be adapted to also capture ensembles with slowly decaying correlations, i.e. thick non-integrable tails, by adapting the norm of the vertical or Malliavin derivative to the correlation structure. As a result, the stochastic integrability of the optimal rates could be improved from algebraic to (stretched) exponential, but missing the expected Gaussian integrability.
The main merit of the present contribution with respect to [11] is twofold: First, our approach directly provides optimal quantitative sublinearity of the corrector $(\phi, \sigma)$ on all scales above a random minimal radius $r_*$, i.e. in contrast to the estimates of [11] our estimates capture the decorrelation on scales larger than $r_*$ in a single argument. Note that our definition of $r_*$ differs from the one in [11]. Second, in case of weak decorrelation, our simpler arguments are nevertheless sufficient to establish optimal stochastic moments for the minimal radius $r_*$ above which the corrector $(\phi, \sigma)$ displays the quantified sublinear growth.
In the present work, we consider the following type of ensembles on $\lambda$-uniformly elliptic tensor fields $a = a(x)$ on $\mathbb{R}^d$: Let $\tilde a = \tilde a(x)$ be a tensor-valued Gaussian random field on $\mathbb{R}^d$ that is centered (i.e. of vanishing expectation) and stationary (i.e. invariant under translation) and thus characterized by the covariance $\langle \tilde a(x) \otimes \tilde a(0) \rangle$. Our only additional assumption on $\tilde a$ is that there exists an exponent $\beta \in (0, d)$ such that
$|\langle \tilde a(x) \otimes \tilde a(0) \rangle| \le |x|^{-\beta}$ for all $x \in \mathbb{R}^d$. (3)
In this work, we are concerned with the case of weak decay of correlations in the sense of $\beta \ll 1$. Let $\Phi$ be a 1-Lipschitz map from the space of tensors into the space of $\lambda$-elliptic symmetric tensors. Then our ensemble is the distribution of $a$, where $a$ is given by $a(x) := \Phi(\tilde a(x))$. Note that the normalization in the constant in (3) and in the Lipschitz constant is not essential, since it can be achieved by a rescaling of $x$ and the amplitude of $\tilde a$.
Concerning the mathematical tools of our approach, several ideas are inspired by the work [11]. In particular, a key component of our approach is a set of sensitivity estimates (Malliavin derivative bounds) for certain integral functionals, which basically average the gradient $\nabla(\phi, \sigma)$ over an appropriate cube. Furthermore, we rely on a mean-value property for $a$-harmonic functions, which has been derived in [11] under appropriate smallness assumptions on the corrector. In our present contribution, we however pursue a conceptually simpler route to estimate the Malliavin derivative: The sensitivity estimate is performed through appropriate $L^q$-norm bounds and Meyers' estimate, rather than a more involved $\ell^2$-$L^1$-norm bound like in [11].
Before stating our main results, let us recall the concept of correctors in homogenization and introduce some notation. The basic idea underlying the concept of correctors in homogenization is the observation that the oscillations in the gradient $\nabla u_{\mathrm{hom}}$ of solutions to the homogenized (constant-coefficient) problem (2) occur on a much larger scale than the oscillations in the gradient $\nabla u$ of solutions to the original problem (1). Thus, it is important to understand how to add oscillations to an affine map (an affine map being always $a_{\mathrm{hom}}$-harmonic) to obtain an $a$-harmonic map. In the context of stochastic homogenization, one is therefore interested in constructing random scalar fields $\phi_i$, $i = 1, \ldots, d$, subject to the equation
$-\nabla \cdot a (e_i + \nabla \phi_i) = 0$, (4)
which almost surely display sublinear growth in $x$. The $\phi_i$ then facilitate the transition from the $a_{\mathrm{hom}}$-harmonic (Euclidean) coordinates $x \mapsto x_i$ to the "$a$-harmonic coordinates" $x \mapsto x_i + \phi_i(x)$. Since any affine map may be represented in the form $b + \sum_i \xi_i x_i$ for $b, \xi_i \in \mathbb{R}$, the $\phi_i$ also facilitate the construction of associated $a$-harmonic "corrected affine maps" $b + \sum_i \xi_i (x_i + \phi_i)$. With the help of the corrector, one may characterize the effective coefficient $a_{\mathrm{hom}}$: In our setting of stochastic homogenization, the effective coefficient is given by the formula
$a_{\mathrm{hom}} e_i = \langle a (e_i + \nabla \phi_i) \rangle$, (5)
where $\langle \cdot \rangle$ refers to the expectation with respect to our ensemble (i.e. probability measure).
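That the corrected affine maps are $a$-harmonic follows from the linearity of the equation. A short verification, assuming the corrector equation (4) reads $-\nabla \cdot a(e_i + \nabla\phi_i) = 0$ and with $b$, $\xi_i$ as in the text (the name $u$ for the corrected affine map is introduced here):

```latex
% Sketch: corrected affine maps are a-harmonic, assuming (4) reads
%   -\nabla \cdot a (e_i + \nabla \phi_i) = 0.
\begin{align*}
u(x) &:= b + \sum_{i=1}^d \xi_i \bigl(x_i + \phi_i(x)\bigr),
\qquad
\nabla u = \sum_{i=1}^d \xi_i \,(e_i + \nabla \phi_i), \\
-\nabla \cdot a \nabla u
&= \sum_{i=1}^d \xi_i \,\bigl(-\nabla \cdot a (e_i + \nabla \phi_i)\bigr)
 = 0 .
\end{align*}
```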
In the language of a conducting medium with conductivity tensor $a$ (note that in this picture, one has $f \equiv 0$ in (1)), the quantity $E_i := e_i + \nabla \phi_i$ corresponds to the (curl-free) "microscopic" electric field associated with a "macroscopic" electric field $e_i$ (and, therefore, $\phi_i$ corresponds to the "microscopic" correction to the "macroscopic" electric potential $x_i$). The corresponding (divergence-free) "microscopic" current density is given by
$q_i := a (e_i + \nabla \phi_i)$, (6)
while the "macroscopic" current density associated with the "macroscopic" electric field $e_i$ is given by the "average" of this quantity, i.e. by the expression (5).
In periodic homogenization of linear elliptic PDEs, it turns out to be convenient to introduce a dual quantity to the corrector $\phi_i$ (cf. e.g. [18, p. 27]): One constructs a tensor field $\sigma_{ijk}$, skew-symmetric in the last two indices, which is a potential for the flux correction $q_i - a_{\mathrm{hom}} e_i$ in the sense
$\nabla \cdot \sigma_i = q_i - a_{\mathrm{hom}} e_i$, (7)
where we have set $(\nabla \cdot \sigma_i)_j := \sum_{k=1}^d \partial_k \sigma_{ijk}$. With the help of this "extended corrector" $(\phi, \sigma)$, it is possible to give a bound on the homogenization error (in terms of appropriate norms of $\phi$ and $\sigma$).
One of the main merits of [11] is the discovery of the usefulness of this extended corrector $(\phi, \sigma)$ in the context of stochastic homogenization. For stationary and ergodic ensembles $\langle \cdot \rangle$ of $\lambda$-uniformly elliptic and symmetric coefficient fields $a = a(x)$ on $\mathbb{R}^d$, in [11] correctors $\phi_i$ and $\sigma_{ijk}$ such that
$\nabla \phi_i$, $\nabla \sigma_{ijk}$ are stationary, of bounded second moment, and of vanishing expectation, (8)
have been constructed. As a consequence of this and of ergodicity, the $\phi_i$ and $\sigma_{ijk}$ almost surely display sublinear growth. Note that in case of $\sigma_i$, the choice of the appropriate gauge is important for the property (8) and for our work, as Eq. (7) determines $\sigma_i$ (which by its skew-symmetry and its behavior under change of coordinates may be identified with a $(d-1)$-form) only up to the exterior derivative of a $(d-2)$-form. In fact, the choice of the gauge in [11] is such that
$-\Delta \sigma_{ijk} = \partial_j q_{ik} - \partial_k q_{ij}$, (9)
which in view of (4) and (6) is clearly compatible with (7).
Notation
To quantify the ellipticity and boundedness of our coefficient fields, throughout the paper we shall work with the assumptions
$a(x) v \cdot v \ge \lambda |v|^2$ for all $v \in \mathbb{R}^d$ and all $x \in \mathbb{R}^d$, (10)
$|a(x) v| \le |v|$ for all $v \in \mathbb{R}^d$ and all $x \in \mathbb{R}^d$, (11)
where $\lambda \in (0, 1)$. Note that in view of rescaling, the upper bound (11) on $a$ does not induce a loss of generality of our results. For our convenience, throughout the paper we shall assume that our coefficient field $a$ is symmetric. The arguments however easily carry over to the case of non-symmetric coefficient fields by simultaneously considering the correctors for the dual equation (i.e. the PDE with coefficient field $a^*$, $a^*$ denoting the transpose of $a$).
The expression $s \lesssim t$ is an abbreviation for $s \le C t$ with $C$ a generic constant only depending on the dimension $d$, the exponent $\beta > 0$, and the ellipticity ratio $\lambda > 0$.
The expression $s \ll t$ stands for $s \le \frac{1}{C} t$ with $C$ a generic sufficiently large constant only depending on the dimension $d$, the exponent $\beta > 0$, and the ellipticity ratio $\lambda > 0$.
By $I(E)$ we denote the characteristic function of an event $E$.
The notation $\fint_A f$ refers to the average integral over the set $A$, i.e. we have $\fint_A f \, dx := \frac{1}{|A|} \int_A f \, dx$.

Main results and structure of proof
Let us now state our main theorem. To quantify the sublinear growth of the extended corrector $(\phi, \sigma)$, we first quantify the decay of spatial averages of $\nabla(\phi, \sigma)$ over larger scales. In view of the decorrelation assumption (3) for our ensemble of coefficient fields, we expect that, up to logarithms, it is the exponent $\frac{\beta}{2}$ that governs the decay of averages of $\nabla(\phi, \sigma)$ and the improvement over linear growth for $(\phi, \sigma)$. Indeed, this exponent is reflected in the theorem.

Theorem 1 Let $\tilde a = \tilde a(x)$ be a tensor-valued Gaussian random field on $\mathbb{R}^d$ that is centered (i.e. of vanishing expectation) and stationary (i.e. invariant under translation); assume that the covariance of $\tilde a$ satisfies the estimate
$|\langle \tilde a(x) \otimes \tilde a(0) \rangle| \le |x|^{-\beta}$ for all $x \in \mathbb{R}^d$ (12)
for some $\beta \in (0, d)$. Let $\Phi : \mathbb{R}^{d \times d} \to \mathbb{R}^{d \times d}$ be a Lipschitz map with Lipschitz constant $\le 1$; suppose that $\Phi$ takes values in the set of symmetric matrices subject to the ellipticity and boundedness assumptions (10), (11). Define the ensemble $\langle \cdot \rangle$ as the probability distribution of $a$, where $a$ is the image of $\tilde a$ under pointwise application of the map $\Phi$, i.e. $a(x) := \Phi(\tilde a(x))$.
Assume in addition on the ensemble $\langle \cdot \rangle$ that $\beta$ in (12) is sufficiently small in the sense of
$\beta \le \frac{1}{C}$, (13)
where $C$ denotes a generic constant only depending on $d$ and $\lambda$.

(i) Consider a linear functional $F = Fh$ on vector fields $h = h(x)$ satisfying the boundedness property (14) for some radius $r > 0$. Then the random variable $F\nabla(\phi, \sigma)$ satisfies uniform Gaussian bounds in the sense of (15).
(ii) There exists a (random) radius $r_*$ for which the "iterated logarithmic" bound (16) holds and which satisfies the stretched exponential bound (17).
Morally speaking, Theorem 1 converts statistical information on the coefficient field $a$ (or rather $\tilde a$) into statistical information on the coefficient field $\nabla \phi := \nabla(\phi_1, \ldots, \phi_d)$ related by (4). Despite the nonlinearity of the map $a \mapsto \nabla \phi$, which only in its linearization around $a = \mathrm{id}$ turns into the Helmholtz projection, Theorem 1 states that $\nabla \phi$ essentially inherits the statistics of $a$: (15) implies in particular that spatial averages $F = \fint_{|x| \le r} \nabla \phi \, dx$ of $\nabla \phi$ satisfy the same bounds as if $\nabla \phi$ itself were Gaussian with correlation decay (3). On the level of these Gaussian bounds, the only price to pay for the nonlinearity is the restriction in (15) to thresholds $M$ of at most order one.
Incidentally, the way we obtain (ii) from (i) bears similarities with an argument in [1] in the sense that a decomposition into Haar wavelets is implicitly used.
To see the optimality of the corrector bound (16) and the stochastic integrability (17), consider a perturbative setting: Letã be a scalar, centered, stationary Gaussian field with covariance given by and consider the infinitesimally perturbed Laplace operator where δ > 0 is infinitesimal. In this setting, by (4) the gradient of the corrector φ is given by where denotes the fundamental solution of the (negative) Laplacian (note that this identity is just formal, as we silently pass over integrability issues, but it may be given a rigorous meaning). In particular, ∇φ i is a stationary centered Gaussian random field satisfying with an explicitly computable (and for general i, The difference − |x|≤r φ − − |x|≤r/2 φ is therefore a centered Gaussian random variable with variance ∼ r 2−β , which entails that the moment bound (17) for the factor r β * log(e + log(r/r * )) in the estimate (16) is (almost) optimal. The scaling in r of the bound (16) is optimal in view of the law of the iterated logarithm; details are provided in the "Appendix".
To obtain an estimate like (15), the starting point of our proof is the Gaussian concentration of measure applied to $\tilde a$. Recall the notion of the covariance operator Cov, which in our setting of a stationary centered Gaussian random field $\tilde a$ is given as the convolution with the tensor field $\langle \tilde a(x) \otimes \tilde a(0) \rangle$.
Proposition 1 (Concentration of Measure, cf. e.g. [20, Proposition 2.18]) Let $\tilde a = \tilde a(x)$ be a tensor-valued Gaussian random field on $\mathbb{R}^d$ that is centered and stationary; denote its covariance operator by Cov. Consider a random variable $F$, that is, a function(al) $F = F(\tilde a)$. Suppose that $F$ is 1-Lipschitz in the sense that its functional derivative, or rather its Fréchet derivative $\frac{\partial F}{\partial \tilde a}$ with respect to $\tilde a$, which can be considered a random tensor field and assimilated with a Malliavin derivative, satisfies (21). Then $F$ has Gaussian moments in the sense of (22). Furthermore, for any $M \ge 0$ we have the estimate (23).
We now replace our assumption (21) on the Fréchet derivative by a stronger but more tractable condition.

Lemma 1
Let $\tilde a = \tilde a(x)$ be a tensor-valued Gaussian random field on $\mathbb{R}^d$ that is centered and stationary; denote its covariance operator as Cov and suppose that for some $\beta \in (0, d)$ we have the bound
$|\langle \tilde a(x) \otimes \tilde a(0) \rangle| \le |x|^{-\beta}$ for all $x \in \mathbb{R}^d$.
Let $\Phi : \mathbb{R}^{d \times d} \to \mathbb{R}^{d \times d}$ be a 1-Lipschitz map; denote the probability distribution of $\Phi(\tilde a)$ as $\langle \cdot \rangle$. Consider a functional $F$ on the space of tensor fields $\tilde a$ of the form $F = F(a)$ with $a(x) := \Phi(\tilde a(x))$; we shall use the abbreviation $F(\tilde a)$ for $F(\Phi(\tilde a))$. Let $q \in (1, 2)$ be given by
$\frac{1}{q} = 1 - \frac{\beta}{2d}$ (24)
and suppose that the Fréchet derivative $\frac{\partial F}{\partial a}$ of $F$ with respect to $a$ satisfies (25). Then the estimate (21) is satisfied.
We observe that if $q$ and $\beta$ are related by (24), as $\beta \uparrow d$ we have $q \uparrow 2$ and for $\beta \downarrow 0$ we have $q \downarrow 1$. For linear functionals of (the gradient of) the corrector (which are therefore nonlinear functionals of the coefficient field $a$), we now establish an explicit representation of the Fréchet derivative; this will aid us in verifying the Lipschitz condition (25) and thus ultimately the concentration of measure statements (22) and (23) for (an appropriate modification of) such functionals.

Lemma 2 Consider a linear functional $F = Fh$ on vector fields $h = h(x)$ of the form
$Fh := \int g \cdot h \, dx$, (26)
where $g \in L^p(\mathbb{R}^d; \mathbb{R}^d)$, $p \ge 2$, and $\operatorname{supp} g \subset \{|x| \le r\}$ for some $r \ge 1$. Let $a$ be some coefficient field subject to the ellipticity and boundedness conditions (10), (11). Then the following two assertions hold:
(1) Consider the Fréchet derivative $\frac{\partial F}{\partial a}$ of the functional $F := F\nabla \sigma_{ijk}$ (note that this functional is nonlinear in $a$, although it is linear in $\sigma_{ijk}$) at $a$ (for some fixed $i, j, k$). Introduce the decaying solutions $v$, $\tilde v_{jk}$ to the equations
$-\Delta v = \nabla \cdot g$ (27)
and (28) (where $a^*$ denotes the transpose of $a$). We then have the representation (29).
(2) Consider the Fréchet derivative $\frac{\partial F}{\partial a}$ of the functional $F := F\nabla \phi_i$ at $a$. Introduce the decaying solution $v$ to the equation (again, $a^*$ denoting the transpose of $a$)
$-\nabla \cdot a^* \nabla v = \nabla \cdot g$. (30)
We then have the representation (31).
The previous explicit representation of the Fréchet derivative for certain linear functionals of (the gradient of) the corrector $(\phi, \sigma)$ enables us to verify the bound (25) for the Malliavin derivative, provided that a certain mean value property is satisfied for $a$-harmonic functions. Note that the latter requirement is a condition on the coefficient field $a$; in Lemma 4 below we shall provide a sufficient condition for this property to hold.
As the functionals which the next lemma shall be applied to are basically averages of ∇φ or ∇σ over cubes of a certain scale r , we state the lemma in a form which makes it directly applicable in such a setting. In particular, the boundedness assumption (32) for the linear functional is motivated by these considerations.
Lemma 3 Let $F = Fh$ be a linear functional on vector fields $h = h(x)$ subject to the boundedness assumption (32). Suppose that the constraint (33) holds (with $c(d, \lambda) > 0$ to be fixed in the proof below). Let $q \in (1, 2)$ be related to $p$ through
$\frac{1}{q} = \frac{1}{2} + \frac{1}{p}$. (34)
Consider the Fréchet derivative $\frac{\partial F}{\partial a}$ of the functional $F := F\nabla \sigma_{ijk}$ (or the functional $F := F\nabla \phi_i$; note that these functionals are nonlinear functionals of $a$) at some symmetric coefficient field $a$ subject to the conditions (10), (11).
Provided that the coefficient field $a$ is such that the mean value property (35) holds for any $a$-harmonic function $u$ and provided that furthermore $a$ is such that (36) is satisfied, we have the estimate (37).
Note that for $q$ related to $\beta$ through (24) and $p$ related to $q$ through (34), we have $r^{-\frac{(p-2)d}{p}} = r^{-\beta}$, i.e. by (37) the $L^q$-norm of the Malliavin derivative decays like $r^{-\frac{\beta}{2}}$. This demonstrates that for functionals like our averages of $\nabla(\phi, \sigma)$ (note that these functionals have vanishing expectation due to the vanishing expectation of $\nabla(\phi, \sigma)$), the concentration of measure indeed improves on large scales with the desired exponent: The "typical value" of the average of $\nabla(\phi, \sigma)$ on some scale $r$ decays like $r^{-\frac{\beta}{2}}$.
We now have to provide a sufficient condition for the mean value property for $a$-harmonic functions (35). To do so, we make use of the following result from [11], which provides the mean-value property assuming just an appropriate sublinearity condition on the corrector $(\phi, \sigma)$.
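The exponent identity $r^{-\frac{(p-2)d}{p}} = r^{-\beta}$ can be checked by a short computation. The sketch below assumes that (24) is the Hardy-Littlewood-Sobolev relation $\frac{1}{q} = 1 - \frac{\beta}{2d}$ and that (34) is the Hölder relation $\frac{1}{q} = \frac{1}{2} + \frac{1}{p}$; both forms are inferred from how the relations are used in the proofs rather than quoted.

```latex
% Assumed relations: (24) 1/q = 1 - \beta/(2d),  (34) 1/q = 1/2 + 1/p.
\begin{align*}
\frac{1}{p} &= \frac{1}{q} - \frac{1}{2}
 = \Bigl(1 - \frac{\beta}{2d}\Bigr) - \frac{1}{2}
 = \frac{d - \beta}{2d},
 \qquad\text{i.e.}\qquad p = \frac{2d}{d-\beta}, \\
\frac{(p-2)d}{p} &= d - \frac{2d}{p} = d - (d - \beta) = \beta .
\end{align*}
```

Under these assumptions $p \downarrow 2$ as $\beta \downarrow 0$, which is consistent with the requirement that $|p - 2|$ be small for Meyers' estimate to apply.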
Proposition 2 (see [11, Lemma 2]) There exists a constant $C_0$ only depending on dimension $d$ and ellipticity ratio $\lambda > 0$ with the following property: Suppose that for an elliptic coefficient field $a$ subject to the ellipticity and boundedness conditions (10) and (11) the scalar and vector potentials $(\phi, \sigma)$, cf. (4) and (7), satisfy (38). Then for any two radii $R \ge r$ and $\rho \in [r, R]$ and any $a$-harmonic function $u$ in the ball of radius $R$, a mean value property of the form (35) holds.
We shall show in the proof of the next lemma that the quantitative sublinearity condition on the corrector (38) may be reduced to a smallness assumption on a certain family of linear functionals of the gradient of the corrector. This reduction relies on the compactness of the left-hand side of (38) with respect to the $L^2$-norm of $\nabla(\phi, \sigma)$, which in turn may be estimated via Caccioppoli's estimate by the left-hand side. It appeals to a quantitative version of inequalities in functional analysis where an intermediate norm is estimated by a bit of a stronger norm and a lot of a weaker (semi-)norm, the role of which is played by the expression in (39). A slight subtlety follows from the fact that the use of Caccioppoli's inequality increases the radius (by a factor of two, say), so that one has to buckle on the level of all dyadic radii $R$ larger than the given radius $r$, cf. the expression in (39). This requires the qualitative a priori information (36). One has a lot of flexibility in the choice of the functionals $F_n$; for pure convenience we choose the same functionals, of Haar wavelet-type, that play a prominent role in the proof of Assertion (ii) of Theorem 1. Other natural choices would be the first $N$ eigenfunctions of the Neumann-Laplacian, like in Step 7 of the proof of Theorem 2 in [6] or the proof of Lemma 2.6 in [15].
With these preparations, we are able to establish our main theorem. The main technical difficulty in the proof below is that our estimate on the $L^q$-norm of the Malliavin derivative $\frac{\partial F}{\partial a}$ of linear functionals of the gradient of the corrector (cf. (37)) is a conditional bound: It relies on the assumption that the mean-value property (35) holds for $a$-harmonic functions on scales larger than $r$. For the concentration of measure estimate (22), however, an unconditional estimate of the form (21) or (25) (the latter being a proxy for (21)) is needed. By Lemma 4 we know that the mean-value property holds, provided that for a certain family of linear functionals of the corrector the smallness estimate (39), which involves a supremum over all dyadic radii $R \ge r$ and all $n = 1, \ldots, N$, is satisfied ($C_0$ being a universal constant). To circumvent this problem, in the proof below we therefore introduce the family of functionals $\tilde F_r$, for which an unconditional bound on the Malliavin derivative holds. Therefore, concentration of measure is applicable to $\tilde F_r$. The remainder of the proof of the first part of our theorem below is dedicated to handling the (a priori unknown) expectation $\langle \tilde F_r \rangle$. The proof of the second assertion of our main theorem will mainly rely on the first assertion of the theorem as well as the quantitative improvement of the Malliavin derivative of averages of $(\nabla \phi, \nabla \sigma)$ on larger scales, as captured by the estimate (37).

Concentration of measure
Proof of Proposition 1 For the proof of the concentration of measure estimate (22), we refer the reader to [20, Proposition 2.18]. We now establish (23). By Chebyshev's inequality, (22) implies the one-sided tail bound $\langle I(F - \langle F \rangle \ge M) \rangle \le \exp(-\frac{M^2}{2})$. In combination with the same estimate with $F$ replaced by $-F$, we obtain (23).
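The Chebyshev step can be spelled out as follows. This sketch assumes that (22) is the exponential moment bound $\langle \exp(t(F - \langle F \rangle)) \rangle \le \exp(\frac{t^2}{2})$ for all $t \ge 0$, the usual form of Gaussian concentration for 1-Lipschitz functionals; the parameter $t$ is introduced here for illustration.

```latex
% Assumed form of (22): < exp(t (F - <F>)) >  <=  exp(t^2 / 2)  for all t >= 0.
\begin{align*}
\bigl\langle I\bigl(F - \langle F\rangle \ge M\bigr)\bigr\rangle
&\le e^{-tM}\,
   \bigl\langle \exp\bigl(t\,(F - \langle F\rangle)\bigr)\bigr\rangle
 \le \exp\Bigl(\frac{t^2}{2} - tM\Bigr)
 \quad\text{for all } t \ge 0, \\
\bigl\langle I\bigl(F - \langle F\rangle \ge M\bigr)\bigr\rangle
&\le \exp\Bigl(-\frac{M^2}{2}\Bigr)
 \quad\text{(choosing } t = M\text{)}, \\
\bigl\langle I\bigl(|F - \langle F\rangle| \ge M\bigr)\bigr\rangle
&\le 2\exp\Bigl(-\frac{M^2}{2}\Bigr)
 \quad\text{(adding the same bound for } -F\text{)}.
\end{align*}
```

The choice $t = M$ minimizes the exponent $\frac{t^2}{2} - tM$ over $t \ge 0$.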

Proof of Lemma 1
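We need to verify that the condition (21) is implied by the assumption (25).
To do so, we first note that by Hölder's inequality we have for any exponent $1 < q < \infty$
$\int \frac{\partial F}{\partial \tilde a} \cdot \mathrm{Cov} \frac{\partial F}{\partial \tilde a} \, dx \le \bigl\| \frac{\partial F}{\partial \tilde a} \bigr\|_q \bigl\| \mathrm{Cov} \frac{\partial F}{\partial \tilde a} \bigr\|_{\frac{q}{q-1}}$.
Since Cov is the convolution with $\langle \tilde a(x) \otimes \tilde a(0) \rangle$ and since we have the bound $|\langle \tilde a(x) \otimes \tilde a(0) \rangle| \le |x|^{-\beta}$, we have for the second factor
$\bigl| \mathrm{Cov} \frac{\partial F}{\partial \tilde a} \bigr|(x) \le \int |x - y|^{-\beta} \bigl| \frac{\partial F}{\partial \tilde a} \bigr|(y) \, dy$,
which allows us to use the Hardy-Littlewood-Sobolev inequality provided the exponents $q$ and $\beta$ are related by (24). From this string of inequalities we learn that (21) also holds provided the corresponding $L^q$-bound (40) on $\frac{\partial F}{\partial \tilde a}$ is satisfied. We now change variables according to $a(x) = \Phi(\tilde a(x))$; by the chain rule, $\frac{\partial F}{\partial \tilde a}(\tilde a, x)$ is obtained from $\frac{\partial F}{\partial a}(\Phi(\tilde a), x)$ by applying the derivative of $\Phi$ at $\tilde a(x)$, so that by the 1-Lipschitz continuity of $\Phi$, our assumption (25) implies (40) and thus (21).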
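The exponent bookkeeping behind this argument can be sketched as follows, writing $f := |\frac{\partial F}{\partial \tilde a}|$ and $q' := \frac{q}{q-1}$ (both abbreviations are introduced here). The Hardy-Littlewood-Sobolev inequality is used in its standard form for the kernel $|x|^{-\beta}$ with $0 < \beta < d$; that (24) coincides with the resulting relation is an inference from the limits $q \uparrow 2$ and $q \downarrow 1$ stated after Lemma 1.

```latex
% f := |dF/d\tilde a|,  q' := q/(q-1)  (notation assumed for this sketch)
\begin{align*}
\int \frac{\partial F}{\partial\tilde a}\cdot
     \mathrm{Cov}\,\frac{\partial F}{\partial\tilde a}\,dx
&\le \|f\|_{q}\,\bigl\| |\cdot|^{-\beta} * f \bigr\|_{q'}
 && \text{(H\"older and the covariance bound)} \\
&\lesssim \|f\|_{q}^2
 && \text{(Hardy-Littlewood-Sobolev)}.
\end{align*}
% Convolution with |x|^{-\beta} maps L^q to L^{q'} precisely when
\[
\frac{1}{q'} = \frac{1}{q} - \frac{d-\beta}{d}
\quad\Longleftrightarrow\quad
\frac{1}{q} = 1 - \frac{\beta}{2d}
\quad\Longleftrightarrow\quad
q = \frac{2d}{2d-\beta}.
\]
```

With this relation, $q = 2$ at $\beta = d$ and $q = 1$ at $\beta = 0$, matching the limiting behavior noted after Lemma 1.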

Representation of the Malliavin derivative
Proof of Lemma 2 We first give the argument for the "vector potential" $\sigma$, fixing a component $\sigma_{ijk}$. Consider a functional of the form $F := F\nabla \sigma_{ijk}$ with $Fh$ as in (26). We claim that the Fréchet derivative of $F$ with respect to $a$ is given by (29), where the functions $v = v(x)$ and $\tilde v_{jk} = \tilde v_{jk}(a, x)$ are determined as the decaying solutions of the elliptic Eqs. (27) and (28).
Computing the functional derivative of F as a function of a amounts to a linearization. We thus consider an arbitrary tensor field δa = δa(x), which we think of as an infinitesimal perturbation of a, and which thus generates infinitesimal perturbations δφ and δσ of φ and σ according to (4), (6), and (9), that is, and In terms of the infinitesimal perturbation δ F of F, this implies by integration by parts (or rather by directly appealing to the weak Lax-Milgram formulations of the elliptic equations) δ F = g · ∇δσ i jk dx (27) = − ∇v · ∇δσ i jk dx which is nothing else than (29).
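For the scalar corrector and the flux, the linearization used in this computation can be made explicit. The following formal sketch assumes that (4) reads $-\nabla \cdot a(e_i + \nabla\phi_i) = 0$ and that (6) reads $q_i = a(e_i + \nabla\phi_i)$; the expansion parameter $\varepsilon$ is introduced here for illustration.

```latex
% Perturb a -> a + \varepsilon\,\delta a and
% \phi_i -> \phi_i + \varepsilon\,\delta\phi_i + O(\varepsilon^2).
\begin{align*}
0 &= -\nabla\cdot (a + \varepsilon\,\delta a)
      \bigl(e_i + \nabla\phi_i + \varepsilon\,\nabla\delta\phi_i\bigr)
      + O(\varepsilon^2) \\
  &= \underbrace{-\nabla\cdot a\,(e_i + \nabla\phi_i)}_{=0}
     \;-\; \varepsilon\,\nabla\cdot\bigl(a\nabla\delta\phi_i
        + \delta a\,(e_i + \nabla\phi_i)\bigr)
     + O(\varepsilon^2).
\end{align*}
% Collecting the terms of first order in \varepsilon:
\[
-\nabla\cdot a\nabla\delta\phi_i = \nabla\cdot \delta a\,(e_i + \nabla\phi_i),
\qquad
\delta q_i = a\nabla\delta\phi_i + \delta a\,(e_i + \nabla\phi_i).
\]
```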
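Let us now establish the second part of our lemma. Consider a functional of the scalar potential of the form $F := F\nabla \phi_i$. To represent its Fréchet derivative, introduce the decaying solution $v$ to Eq. (30). Using the weak formulation of (30) and the linearized corrector equation $-\nabla \cdot a \nabla \delta\phi_i = \nabla \cdot \delta a (e_i + \nabla \phi_i)$, we observe that the variation of $F$ with respect to $a$ is given by
$\delta F = \int g \cdot \nabla \delta\phi_i \, dx = -\int \nabla v \cdot a \nabla \delta\phi_i \, dx = \int \nabla v \cdot \delta a (e_i + \nabla \phi_i) \, dx$,
which leads to the conclusion (31).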

Sensitivity estimate
Proof of Lemma 3 We now argue that under certain boundedness assumptions on $F = Fh$ as a linear functional in vector fields $h = h(x)$, we control the size (25) of its Fréchet derivative $\frac{\partial F}{\partial a} = \frac{\partial F}{\partial a}(a, x)$ as a nonlinear functional $F\nabla \sigma_{ijk} = F(a)$ in coefficient fields $a = a(x)$ (and similarly in the case $F(a) = F\nabla \phi_i$; for this case, the (simpler) proof is sketched afterwards).
To this aim, let us first note that we have a Calderón-Zygmund estimate for $-\nabla \cdot a \nabla$ with the exponent $p$ and its dual exponent $\frac{p}{p-1}$: For any decaying function $w$ and vector field $h$ on $\mathbb{R}^d$ related by
$-\nabla \cdot a \nabla w = \nabla \cdot h$, (43)
the estimate (44) holds. This assertion holds by Meyers' estimate (see e.g. [22]), which only requires the ellipticity and boundedness assumptions (10), (11) on $a$ as well as the estimate $|p - 2| \ll 1$, which is ensured by our condition (33). Note that an analogous estimate would hold for the dual equation $-\nabla \cdot a^* \nabla w = \nabla \cdot h$ if our coefficient field were nonsymmetric.
In the following, we will use the abbreviation · p,B for the spatial L p -norm on the set B; we write · p when B = R d . We start by arguing that because p p−1 ∈ (1, 2), It is obviously enough to establish (45) only for R ≥ 2ρ; hence by Jensen's inequality, (45) follows from (35) once we establish the reverse Hölder inequality To this purpose, we test −∇ · a∇u = 0 with η 2γ (u − m), where η is a smooth cutoff of χ {|x|≤ R 2 } in {|x| ≤ R} (with the property |∇η| 1 R ) and where the exponent γ ≥ 1 and the constant m ∈ R will be chosen later. By the ellipticity and boundedness assumptions (10), (11) and Young's inequality we obtain and thus which by the estimate on ∇η gives On the r.h.s. of (47) we use first Hölder's inequality, then the isoperimetric inequality on {|x| ≤ R} and finally Sobolev's inequality on the whole space (for simplicity, we assume d > 2 here) (which-as a simple computation shows-is satisfied precisely for γ = d 2 ) and the constant m is the spatial average of u on {|x| ≤ R}. The combination of the last two estimates yields This gives us access to the representation (29) of its Fréchet derivative ∂ F ∂a considered as a nonlinear functional F∇σ i jk = F(a) of a. Using this representation, a partition into dyadic annuli, and Hölder's estimate (recall (34)) we obtain ∂ F ∂a q (|∇v| + |∇ṽ jk |)|∇φ i + e i | q,|x|≤2r (|∇v| + |∇ṽ jk |)|∇φ i + e i | q,2 n r ≤|x|≤2 n+1 r ( ∇v p,|x|≤2r + ∇ṽ jk p,|x|≤2r ) ∇φ i + e i 2,|x|≤2r ( ∇v p,2 n r ≤|x|≤2 n+1 r + ∇ṽ jk p,2 n r ≤|x|≤2 n+1 r ) In view of (35) applied to the a-harmonic function u(x) = x i + φ i (x), cf. (4), we obtain for all radii ρ ≥ r using Caccioppoli's inequality and (36) ∇ṽ jk p,|x|≥2 n r n(2 n ) We note that since p > 2, these estimates imply that the sum over n in (52) converges and gives (37). The estimate (53) for the solution v of the constant coefficient Eq. 
(27) is classical: We already argued that ∇v p r − p−1 p d ; by the estimate on the support of g in (48) we have that v is harmonic in {|x| ≥ r } and that it has vanishing flux |x|=r x · ∇v = 0.
We will now argue that ∇ṽ m p,|x|≥2 n r min (2 n ) which implies the estimate (54) by the triangle inequality ∇ṽ jk p,|x|≥2 n r ≤ ∞ m=0 ∇ṽ m p,|x|≥2 n r . We note that (56) together with our Calderon-Zygmund estimate (44) applied to (57) yields ∇ṽ m p In order to establish (58), it thus remains to show We argue in favor of (59) by duality and thus consider an arbitrary h ∈ L p p−1 supported in {|x| ≥ 2 n r } and denote by w the corresponding Lax-Milgram solution of (43). By integration by parts, we deduce from (43) and (57) that h · ∇ṽ m dx = g m · ∇w dx. By the support condition ong m this yields h · ∇ṽ m dx ≤ g m p ∇w p p−1 ,|x|≤2 m+1 r .
By the support assumption on h we have that w is a-harmonic in {|x| ≤ 2 n r }. Since m < n, we may use (45) applied to w in form of In the case of a functional of the scalar potential of the form F(a) = F∇φ i , we claim that the Fréchet derivative of F is again controlled in the sense of (37). The proof is mostly analogous to the previous one; we again rewrite F as in (49) with some g satisfying (48). Starting from the representation (31), one derives an analogue of estimate (50) reading ∂ F ∂a q ∇v p,|x|≤2r ∇φ i + e i 2,|x|≤2r ∇v p,2 n r ≤|x|≤2 n+1 r ∇φ i + e i 2,2 n r ≤|x|≤2 n+1 r .
The second factors on the right in this estimate coincide with the ones in the case $F(a) = F\nabla\sigma_{ijk}$; therefore, we get the following analogue to estimate (52):

Sufficient conditions for the mean value property in terms of linear functionals of the corrector
Proof of Lemma 4 In order to show that (39) and (36) By dyadic iteration, it is enough to show for any dyadic ρ ≥ 1 Indeed, abbreviating D m := 1 , the estimate (63) may be rewritten as (using a slight readjustment of δ) which may be iterated to By our sublinearity assumption on the corrector (36) (which may be rewritten as lim m 0 ↑∞ D m 0 = 0), this yields (62). We now turn to the argument for (63). By Caccioppoli's estimate on (4) we have and thus in particular for the flux q i = a(∇φ i + e i ) Caccioppoli's estimate on (9) gives be reduced to the smallness assumption (39) for our functionals F m,R on scales R ≥ r , so that Lemma 3 becomes applicable under the assumption (39): Let q be related to β through (24) and let p be related to q through (34). By the smallness assumption on β in our theorem (cf. (13)), we deduce that (33) holds. By scaling, our functionals F n,r satisfy the estimate (32) up to a universal constant factor. Furthermore, by ergodicity the property (36) holds for · -almost every coefficient field a (regarding σ , this result has been shown in [11,Lemma 1]; for φ, it is classical but may also be found in [11]). Thus, the estimate (37) holds for F n,r under the assumption (39), i.e. there exists a constant C 0 only depending on d, λ, and β, such that for any n = 1, . . . , N and any radius r the implication sup m,R≥r dyadic holds for · -a.e. coefficient field a.
To apply concentration of measure in the form of Proposition 1 to some functional F, we however need an unconditional bound on the Malliavin derivative (cf. (21) respectively (25)).
Therefore we first introduce a new random variable whose derivative vanishes whenever the smallness condition in (66) is violated: Consider the auxiliary random variable $F_r$ where the sup runs over all dyadic radii $R = 2^k r$, $k \in \mathbb{N}_0$. By the usual differentiation rules applied to the Fréchet derivative $\frac{\partial}{\partial a}$ in the norm $\|\cdot\|_q$, we obtain By Lemma 1, we may apply concentration of measure in form of (23) to the random variable $c\, r^{\beta/2} F_r$ (where $c$ is some small universal constant). This yields so that it remains to control the expectation $\langle F_r \rangle$.
Because of (8) and the definition of $F_{m,R}$, it follows from qualitative ergodicity of $\langle\cdot\rangle$ and Birkhoff's ergodic theorem that $\lim_{R\uparrow\infty} F_{m,R} = 0$ almost surely, so that by dominated convergence $\lim_{r\uparrow\infty} \langle F_r \rangle = 0$. Hence there exists a finite radius $r_0$ which is minimal with the property On the basis of (70), we now get a quantitative estimate on $r_0$. To this purpose we now consider the auxiliary variable $\bar F_{n,r}$ where again the sup runs over all dyadic radii $R = 2^k r$, $k \in \mathbb{N}_0$, and where the cut-off function $\eta = \eta(F)$ is given by The advantage of the auxiliary variable (71) over (67) is that we control its expectation: Since the stationary $\nabla(\phi, \sigma)$ has vanishing expectation, cf. (8), and by the linearity of $F_{n,r}$ in $\nabla(\phi, \sigma)$ we have $\langle F_{n,r} \rangle = 0$ and thus $\langle \bar F_{n,r} \rangle = \langle (\eta - 1) F_{n,r} \rangle$ so that by construction of $\eta$ Since the stationary $\nabla(\phi, \sigma)$ has bounded second moments, cf. (8), and by the boundedness property of $F_{n,r}$ in $\nabla(\phi, \sigma)$ we obtain from the Cauchy-Schwarz inequality which in view of (70) improves to $|\langle \bar F_{n,r} \rangle| \lesssim \exp(-\frac{1}{C} r^\beta)$ for any $r \ge r_0$.
Together with (73) this yields By definition (71) we have $I(|F_{n,r}| \ge M) \le I(\sup_{m,R\ge r} |F_{m,R}| \ge \frac{1}{2C_0}) + I(|\bar F_{n,r}| \ge M)$ so that by (70) the above upgrades to Since $r^\beta \exp(-\frac{1}{C} r^\beta) \lesssim 1$ for all $r$, the above holds without the lower restriction on $M$: Using this estimate with $r$ replaced by $R$ and summing over the finite index set $n = 1, \dots, N$ and all dyadic $R \ge r$ we obtain and thus in particular for the auxiliary random variable (67) where the upper bound on $M$ is immaterial since $F_r \le \frac{1}{C_0} \le 1$. Using $\langle F_r \rangle = \int_0^\infty \langle I(F_r \ge M) \rangle \, dM$, this yields the following quantification of $\lim_{r\uparrow\infty} \langle F_r \rangle = 0$: for all $r \ge r_0$.
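The passage from tail bounds to a bound on the expectation rests on the identity that the expectation of a nonnegative random variable equals the integral of its tail probabilities over $M \in [0, \infty)$. The following Python sketch checks this identity exactly for an empirical distribution; the sample values and the function name `mean_via_tail_integral` are hypothetical and serve only as an illustration, they are not taken from the paper.

```python
from fractions import Fraction


def mean_via_tail_integral(samples):
    """Integrate the empirical tail M -> P(X >= M) over [0, infinity) exactly.

    Assumes all samples are nonnegative. The empirical tail is piecewise
    constant between consecutive sorted sample values, so the integral is
    a finite sum.
    """
    xs = sorted(samples)
    n = len(xs)
    total = Fraction(0)
    prev = Fraction(0)
    for k, x in enumerate(xs):
        # For M in (prev, x], exactly n - k samples satisfy X >= M.
        total += (Fraction(x) - prev) * Fraction(n - k, n)
        prev = Fraction(x)
    return total


# Hypothetical bounded, nonnegative samples (values at most 1).
samples = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 2), Fraction(3, 4), Fraction(1)]
direct_mean = sum(samples, Fraction(0)) / len(samples)
tail_mean = mean_via_tail_integral(samples)
```

Because the variable is bounded by 1, the integral effectively runs only over $M \in [0, 1]$, which is consistent with the remark that the upper bound on $M$ is immaterial.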
Since $r_0$ was minimal in (69) and since $\langle F_r \rangle$ depends continuously on $r$, this yields the desired It remains to argue why (74), which together with (76) may be rephrased as yields (15). It suffices to include the given functional $F$ from (14) in the list of finitely many functionals $F_1, \dots, F_N$, say, as the last functional $F_N = F$, and then to specialize the above to $n = N$. We note that for $q$ related to $\beta$ through (24) and $p$ related to $q$ through (34) one has $\frac{p}{p-1} = \frac{2d}{d+\beta}$, i.e. (14) entails (32). Note that by adjusting the constants, (15) is trivial for $r \lesssim 1$, so that we obtain (15) over the whole range $r \ge 0$.

Proof of Assertion (ii).
The arguments in this section require $\beta < 2$, which in view of our assumption $\beta \ll 1$ is no restriction. Let $r_*$ denote the minimal dyadic radius with the property (16); note that the proof to follow does not assume $r_* < \infty$. In order to establish (17), it is enough to show for a given dyadic $r_0 \ge 1$ that It will be convenient to replace balls by cubes. Moreover, all radii, or rather side lengths, are dyadic. By definition of $r_*$ as the smallest radius with (16), the event $r_* > r_0$ means that there exists a radius $R \ge r_0$ with where $f(z) := \log(e + \log z)$.
In the sequel, the intermediate (dyadic) radius r 1 ∈ [r 0 , R] with will play a role. Note that we use here β > 0 and that f (z) grows sub-algebraically. For the l. h. s. of (78) we note sup we have Since g(z) decays sub-algebraically in z and since β < 2, this yields the simpler holds provided that R ≥ r 0 ≥ 1) for the second sum over r , we obtain I (r * > r 0 ) Thanks to β > 0, we have exp(− 1 C R β ) exp(− 1 C f ( R r 0 )r β 0 ), so that the second summand is dominated by the first one: Now we see the reason for the choice of f (z) = log(e + log z) for which f (2 m ) ≥ With 1 C r β 0 playing the role of A this yields (77). Note that the condition r 0 1 is immaterial after adjusting the constants, as the l. h. s. of (77) is bounded by 1.
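To illustrate why a doubly logarithmic $f$ is a natural choice, note that $\exp(-A f(2^m)) = (e + m \log 2)^{-A}$, so that for $A \ge 2$ the sum over all dyadic scales $m \ge 0$ is bounded by a constant multiple of $e^{-A}$. The Python sketch below checks this numerically. It is only an illustration of this elementary summation under the stated assumptions; the constant $1 + e/\log 2$ comes from a simple integral comparison and is not taken from the paper.

```python
import math


def f(z):
    """The doubly logarithmic weight f(z) = log(e + log z), for z >= 1."""
    return math.log(math.e + math.log(z))


def dyadic_term(A, m):
    """exp(-A f(2^m)), which equals (e + m log 2)^(-A)."""
    return math.exp(-A * f(2.0 ** m))


def partial_sum(A, M):
    """Sum of exp(-A f(2^m)) over the dyadic scales m = 0, ..., M."""
    return sum(dyadic_term(A, m) for m in range(M + 1))


# Integral comparison for A >= 2 (an assumption of this sketch):
# sum_m (1 + m log2 / e)^(-A) <= 1 + e / log 2.
BOUND_CONSTANT = 1.0 + math.e / math.log(2.0)

for A in (2.0, 5.0, 10.0):
    s = partial_sum(A, 1000)
    print(A, s, BOUND_CONSTANT * math.exp(-A))
```

At the same time $f$ grows extremely slowly (sub-algebraically): even at the scale $2^{1000}$ its value stays below 7.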
Note that these random variables may be written in the form $X_i = 2^{2mi(\beta/2-d)} \int \theta(2^{-2mi}|x|) \frac{x}{|x|} \cdot \nabla\phi_k \, dx$ with $\theta = \theta(|x|)$ being independent of $i$ and $m$. By (20) and the discussion preceding (20), the $X_i$ are therefore identically distributed centered Gaussian random variables with variance $\sim \delta^2$. One observes that their covariance satisfies By a purely linear algebra argument (namely, the Gram-Schmidt algorithm, see below), one may construct a sequence of independent centered Gaussian random variables from the $X_i$:

Lemma 5
If $m$ is chosen large enough, then there exists a sequence of independent centered Gaussian random variables $Y_i$ with the properties $\langle Y_i^2 \rangle \sim \delta^2$ and where $|a_{ij}| \lesssim 2^{-\beta m|i-j|/2}$.
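The Gram-Schmidt construction behind this lemma can be sketched in a few lines of Python operating directly on a covariance matrix. The model covariance $\delta^2 \rho^{|i-j|}$ with $\rho = 2^{-\beta m/2}$ used below is a hypothetical example chosen only for illustration (the lemma merely requires a decay bound on the covariances), and the function names are likewise assumptions of this sketch. For jointly Gaussian variables, vanishing covariance is equivalent to independence, so it suffices to check that the resulting $Y_i$ are uncorrelated.

```python
def gram_schmidt_from_covariance(cov):
    """Gram-Schmidt on random variables X_0, ..., X_{n-1} given by cov[i][j] = <X_i X_j>.

    Returns (coeffs, variances), where Y_i = sum_k coeffs[i][k] * X_k are
    mutually uncorrelated and variances[i] = <Y_i^2>.
    """
    n = len(cov)
    coeffs = []
    variances = []
    for i in range(n):
        row = [0.0] * n
        row[i] = 1.0  # start from Y_i = X_i
        for j in range(i):
            # <X_i Y_j> = sum_k coeffs[j][k] * <X_i X_k>
            proj = sum(coeffs[j][k] * cov[i][k] for k in range(n)) / variances[j]
            for k in range(n):
                row[k] -= proj * coeffs[j][k]
        var = sum(row[k] * row[l] * cov[k][l] for k in range(n) for l in range(n))
        coeffs.append(row)
        variances.append(var)
    return coeffs, variances


def covariance_of(u, v, cov):
    """Covariance of sum_k u[k] X_k and sum_l v[l] X_l."""
    n = len(cov)
    return sum(u[k] * v[l] * cov[k][l] for k in range(n) for l in range(n))


# Hypothetical model: <X_i X_j> = delta^2 * rho^|i-j| with rho = 2^(-beta*m/2).
delta, beta, m, n = 1.0, 0.5, 4, 6
rho = 2.0 ** (-beta * m / 2.0)
cov = [[delta ** 2 * rho ** abs(i - j) for j in range(n)] for i in range(n)]

coeffs, variances = gram_schmidt_from_covariance(cov)
```

In this model the orthogonalized variables keep a variance comparable to $\delta^2$ as soon as $\rho$ is bounded away from 1, which mirrors the role of choosing $m$ large in the lemma.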
Were the law of the iterated logarithm non-optimal for the $X_i$, the same would hold for the As the sum of the right-hand sides gives $\infty$ for $c$ small enough, the second part of the Borel-Cantelli lemma provides the conclusion.
which gives using $|j-i| + |j-k| = |i-k| + 2|j-i|$ (note that $k > i > j$) and the bound (85) Let us denote The previous estimate then entails If we can derive by induction a lower bound for $\langle Y_j^2 \rangle$ of the form $\langle Y_j^2 \rangle \gtrsim \delta^2$, for $m$ large enough the previous bound entails by induction $\sup_i B_i \lesssim 1$.
Note that we only require the lower bound on $\langle Y_j^2 \rangle$ for all $j \le i-1$ to obtain such an estimate for $B_i$.
To obtain the desired lower bound, let us compute As we have $\langle X_i^2 \rangle \sim \delta^2$, an estimate of the form would imply the desired lower bound for $\langle Y_i^2 \rangle$. Suppose that we have shown the lower bound $\langle Y_j^2 \rangle \gtrsim \delta^2$ for all $j \le i-1$. To show the desired lower bound for $\langle Y_i^2 \rangle$, it is then sufficient to have