Optimal homogenization rates in stochastic homogenization of nonlinear uniformly elliptic equations and systems

We derive optimal-order homogenization rates for random nonlinear elliptic PDEs with monotone nonlinearity in the uniformly elliptic case. More precisely, for a random monotone operator on $\mathbb{R}^d$ with stationary law (i.e. spatially homogeneous statistics) and fast decay of correlations on scales larger than the microscale $\varepsilon>0$, we establish homogenization error estimates of the order $\varepsilon$ in case $d\geq 3$, respectively of the order $\varepsilon |\log \varepsilon|^{1/2}$ in case $d=2$. Previous results in nonlinear stochastic homogenization have been limited to a small algebraic rate of convergence $\varepsilon^\delta$. We also establish error estimates for the approximation of the homogenized operator by the method of representative volumes of the order $(L/\varepsilon)^{-d/2}$ for a representative volume of size $L$. Our results also hold in the case of systems for which a (small-scale) $C^{1,\alpha}$ regularity theory is available.


Introduction
In the present work, we establish quantitative homogenization results with optimal rates for nonlinear elliptic PDEs of the form where A ε is a random monotone operator whose correlations decay quickly on scales larger than a microscopic scale ε. For scalar problems and also certain systems, we obtain the optimal convergence rate O(ε) of the solutions u ε towards the solution u hom of the homogenized problem −∇ · (A hom (∇u hom )) = f in R d (2) in three or more spatial dimensions d ≥ 3. In two spatial dimensions d = 2, we obtain the optimal convergence rate O(ε| log ε| 1/2 ) upon including a lower-order term in the PDEs (1) and (2).
Our results may be seen as the optimal quantitative counterpart in the case of 2-growth to the qualitative stochastic homogenization theory for monotone systems developed by Dal Maso and Modica [17,18], respectively as the nonlinear counterpart of the optimal-order stochastic homogenization theory for linear elliptic equations developed by Gloria and Otto [32,33] and Gloria, Otto, and the second author [29,31]. Just like for [17,18], a key motivation for our work is the homogenization of nonlinear materials.
In the context of random materials, the first -and to date also the onlyhomogenization rates for elliptic PDEs with monotone nonlinearity were obtained by Armstrong and Smart [9], Armstrong and Mourrat [7], and Armstrong, Ferguson, and Kuusi [2] in the form of a small algebraic convergence rate ε δ for some δ > 0. The optimal convergence rates derived in the present work improve substantially upon their rate. However, in contrast to the works of Armstrong et al. we make no attempt to reach optimal stochastic integrability: While we derive an error estimate of the form with a random constant C f whose stretched exponential moments are estimated by E[exp(|C f /C(f )| ν )] ≤ 2 for some constant ν > 0 and some constant C(f ) > 0, the homogenization error estimates for linear elliptic PDEs with optimal rate in [4,34] establish (essentially) Gaussian stochastic moments E[exp(|C f /C(f, µ)| 2−µ )] ≤ 2 for any µ > 0. Likewise, the homogenization error estimates for monotone operators with non-optimal rate ε δ of [2,7,9] include optimal stochastic moment bounds.
Before providing a more detailed summary of our results, let us give a brief overview of the previous quantitative results in nonlinear stochastic homogenization. The first -logarithmic -rates of convergence in the stochastic homogenization of a nonlinear second-order elliptic PDE were obtained by Caffarelli and Souganidis [15] in the setting of non-divergence form equations. Subsequently, a rate of convergence ε δ has been derived both for equations in divergence form and nondivergence form by Armstrong and Smart [9,8] and Armstrong and Mourrat [7]. In the homogenization of Hamilton-Jacobi equations, a rate of convergence of the order ε 1/8 has been obtained by Armstrong, Cardaliaguet, and Souganidis [6]. For forced mean curvature flow, Armstrong and Cardaliaguet [1] have derived a convergence rate of order ε 1/90 . These rates of convergence are all expected to be non-optimal (compare, for instance, the result for Hamilton-Jacobi equations to the rate of convergence ε known in the periodic homogenization setting [37]).
To the best of our knowledge, the present work constitutes the first optimal-order convergence results for any nonlinear stochastic homogenization problem. However, we are aware of an independent work in preparation by Armstrong, Ferguson, and Kuusi [3], which aims to address the same question. In contrast to our work -which is inspired by the approach to quantitative stochastic homogenization via spectral gap inequalities of [32,33,29,31] -the upcoming work [3] relies on the approach of sub-and superadditive quantities of [9,4,2]. Nevertheless, both our present work and the approach of [3,2] use the concept of correctors for the linearized PDE, see Section 3 for details.
Before turning to a more detailed description of our results, let us briefly comment on the theory of periodic homogenization of nonlinear elliptic equations. A quantitative theory for the periodic homogenization of nonlinear monotone operators has recently been derived by Wang, Xu, and Zhao [42]. A corresponding result for degenerate elliptic equations of p-Laplacian type may be found in [43]. In the periodic homogenization of polyconvex integral functionals, the single-cell formula for the effective material law (which determines the effective material law by a variational problem on a single periodicity cell) may fail [12,26], a phenomenon associated with possible "buckling" of the microstructure. A related phenomenon of loss of ellipticity may occur in the periodic homogenization of linear elasticity [14,24,35,36]. Note that polyconvex integral functionals occur naturally in the framework of nonlinear elasticity [11]; however, their Euler-Lagrange equations in general lack a monotone structure. However, in periodic homogenization of nonlinear elasticity the single-cell formula is valid for small deformations [39,40], and rates of convergence may be derived.
1.1. Summary of results. To summarize our results in a continuum mechanical language, we consider the effective macroscopic behavior of a nonlinear and microscopically heterogeneous material. We assume that the behavior of the nonlinear material is described by the solution u ε : R d → R m of a second-order nonlinear elliptic system of the form −∇ · A ε (x, ∇u ε ) = f for some random monotone operator A ε : R d × R m×d → R m×d with correlation length ε and some right-hand side f ∈ L 2d d+2 (R d ; R m ). We further assume that the random monotone operator A ε is of the form A ε (x, ξ) := A(ω ε (x), ξ), where ω ε is a random field representing the random heterogeneities in the material; for each realization of the random material (i. e. each realization of the probability distribution), ω ε selects at each point x ∈ R d a local material law A(ω ε (x), ·) : R m×d → R m×d from a family A of potential material laws. Under some suitable additional conditions, the theory of stochastic homogenization shows that for small correlation lengths ε 1 the above nonlinear elliptic system is well-approximated by a homogenized effective equation. The effective equation again takes the form of a nonlinear elliptic system, however now with a spatially homogeneous effective material law A hom : R m×d → R m×d . It is our goal to provide an optimal-order estimate for the difference of the solution u ε to the solution u hom of the effective equation −∇ · A hom (∇u hom ) = f, as well as to give an optimal-order error bound for the approximation of the effective material law A hom by the method of representative volumes.
To be mathematically more precise, we consider a random field taking values in the unit ball of a Hilbert space ω ε : Ω × R d → H ∩ B 1 and a family of monotone operators A : (H ∩ B 1 ) × R m×d → R m×d indexed by the unit ball of this Hilbert space. We then define a random monotone operator A ε (x, ·) := A(ω ε (x), ·) by selecting a monotone operator from the family A at each point x ∈ R d according to the value of the random field ω ε (x). Note that the property of ω ε being Hilbertspace valued is not an essential point and just included for generality: Even the homogenization for a scalar-valued random field (and correspondingly a singleparameter family of monotone operators A : (R ∩ B 1 ) × R m×d → R m×d ) would be highly relevant and just as difficult, as it could describe e. g. composite materials.
The conditions on the random field ω ε and the family of monotone operators A are as follows: • We assume spatial statistical homogeneity of the material : The statistics of the random material should not depend on the position in space. In terms of a mathematical formulation, this assumption corresponds to stationarity of the probability distribution of ω ε under spatial translations. • We assume sufficiently fast decorrelation of the material properties on scales larger than a correlation length ε. In terms of a mathematical formulation, we make this notion rigorous by assuming that a spectral gap inequality holds. More precisely, we shall assume that ω ε itself is a Hilbert-space valued random field on R d which satisfies a spectral gap inequality and on which the random monotone operator A ε depends in a pointwise way as A ε (x, ξ) := A(ω ε (x), ξ), where the map A : (H ∩ B 1 ) × R m×d → R m×d is Lipschitz in its first variable.
-We are in the two-dimensional case d = 2.
-Our system has Uhlenbeck structure, i. e. the nonlinearity has the structure A(ω, ξ) = ρ(ω, |ξ| 2 )ξ for some scalar function ρ, and the same is true for the homogenized operator.
Under these assumptions, we establish the following quantitative stochastic homogenization results with optimal rates for the nonlinear elliptic PDE (1).
• The solution u ε to the nonlinear PDE with fluctuating random material law (1) can be approximated by the solution u hom to a homogenized effective PDE of the form (2). In case d = 2 or d = 1, we include a lower-order term in the PDEs, see (6) and (7). The homogenized effective material law is given by a monotone operator A hom : R m×d → R m×d which is independent of the spatial variable x ∈ R d and satisfies analogous uniform ellipticity and boundedness properties. The error u − u hom is estimated by where p = 2d d−2 in case d ≥ 2 and p = 2 in case d ∈ {1, 2}. Here, C(f ) denotes a random constant with bounded stretched exponential moments in the sense E exp for some universal constantν > 0 (which is in particular independent of the right-hand side f ) and for some constant C(f ) = C(||f || L 1 , ||f || L d+1 ).
Without the additional small-scale regularity assumption, we still achieve half of the rate of convergence ε 1/2 for d ≥ 3 respectively ε 1/2 | log ε| 1/4 for d = 2, a result that we also establish for the Dirichlet problem in bounded domains.
• The homogenized effective operator A hom may be approximated by the method of representative volumes, and this approximation is subject to the following a priori error estimate: If a box of size L ≥ ε is chosen as the representative volume, the error estimate holds true for every ξ ∈ R m×d , where A RVE hom denotes the approximation of A hom by the method of representative volumes and where again C(L, ξ) denotes a random constant with bounded stretched exponential moments (independently of L and ξ). The systematic error is of higher order at least in case d ≤ 4 (which includes the physically relevant cases d = 2 and d = 3). Without the additional small-scale regularity assumption, we achieve almost the same overall estimate |A RVE hom (ξ) − A hom (ξ)| ≤ C(L, ξ)|ξ|(L/ε) −d/2 (log L ε ) C , but not the improved bound for the systematic error.
Note that the rates of convergence ||u ε − u hom || L 2d/(d−2) ≤ Cε in case d ≥ 3 respectively ||u ε − u hom || L 2 ≤ Cε| log ε| 1/2 in case d = 2 coincide with the optimal rate of convergence in the homogenization of linear elliptic PDEs, see e. g. [4,29,31,34]. Similarly, the rate of convergence for the representative volume approximation |A RVE hom (ξ) − A hom (ξ)| ≤ C(L/ε) −d/2 coincides with the corresponding optimal rate for linear elliptic PDEs, as does (essentially) the higher-order convergence rate for the systematic error. As linear elliptic PDEs may be regarded as a particular case of our nonlinear PDE (1), our rates of convergence are optimal.
1.2. Examples. To illustrate our results, let us mention two examples of random nonlinear elliptic PDEs and systems to which our theorems apply, as well as an important class of random fields ω ε which satisfy our assumptions.
We first give an example for the random field ω ε . Let θ : R k → R k ∩ B 1 be any Lipschitz map taking values only in the unit ball. Let Y ε : R d → R k be any stationary Gaussian random field whose correlations decay sufficiently quickly in the sense Cov Y ε (x), Y ε (y) 1 1 + |x−y| ε d+δ for some δ > 0. Set H := R k . Then the random field ω ε : R d → H defined by satisfies a spectral gap inequality with correlation length ε in the sense of Definition 11; for a proof see e. g. [20]. As stationarity is immediate, any such ω ε satisfies our key assumptions on the random field (P1) and (P2) stated in Section 2.1 below. Note in particular that the spectral gap assumption allows for the presence of (sufficiently quickly decaying, namely integrable) long-range correlations. Typical realizations for two such random fields are depicted in Figure 1. To state the first example of a random monotone operator satisfying our assumptions, consider any two deterministic spatially homogeneous monotone operators A 1 : R m×d → R m×d and A 2 : R m×d → R m×d subject to the ellipticity and Lipschitz continuity assumptions (A1) and (A2). Furthermore, consider any random field ω ε : R d → [0, 1]. Then the operator A ε (x, ξ) := ω ε (x)A 1 (ξ) + (1 − ω ε (x))A 1 (ξ) Figure 1. Typical realizations of random fields obtained by applying a nonlinear map pointwise to a stationary Gaussian random field with only short-range correlations (left), respectively to a stationary Gaussian random field with barely integrable correlations (right). satisfies our assumptions (A1)-(A3). Note that this operator corresponds to the PDE The additional small-scale regularity assumption (R) is satisfied whenever the operators A 1 and A 2 have uniformly bounded second derivatives, the random field ω ε is regular enough, and one of the three following conditions holds: The equation is scalar (m = 1), the spatial dimension is at most two (d ≤ 2), or both A 1 and A 2 as well as the homogenized operator A hom are of Uhlenbeck structure.
As a second simple example of a monotone operator, for any random field ω ε : R d → [0, 1] the operator A ε (x, ξ) := 1 + |ξ| 2 1 + (1 + ω ε (x))|ξ| 2 ξ satisfies our assumptions, possibly with the exception of the additional regularity condition (R). Note that this operator corresponds to the PDE The additional small-scale regularity assumption -stated in (R) below -is satisfied in the scalar case m = 1 as well as in the low-dimensional case d ≤ 2, provided that the random field ω ε is sufficiently smooth on the microscopic scale ε.

1.3.
Notation. The number of spatial dimensions will be denoted by d ∈ N. For a measurable function u, we denote by ∇u its (weak) spatial derivative. For a function of two variables A(ω, ξ), we denote its partial derivatives by ∂ ω A and ∂ ξ A.
For a function f : R d → R, we denote by ∂ i f its partial derivative with respect to the coordinate i. For a matrix-valued function M : R d → R m×d , we denote by ∇ · M its divergence with respect to the second index, i. e.
Throughout the paper, we use standard notation for Sobolev spaces. In particular, we denote by H 1 (R d ) the space of all measurable functions u : R d → R whose weak spatial derivative ∇u exists and which satisfy ||u|| H 1 := (´R d |u| 2 + |∇u| 2 dx) 1/2 < ∞. Similarly, we denote by H 1 (R d ; R m ) the space of R m -valued vector fields with the analogous properties and the analogous norm. For d ≥ 3, we denote byḢ 1 (R d ; R m ) the space of all measurable functions u with ||u||Ḣ 1 := (´R d |∇u| 2 dx) 1/2 + ||u|| L 2d/(d−2) < ∞. By H 1 loc (R d ) we denote the space of all measurable functions u : R d → R for which all restrictions u| Br to finite balls (0 < r < ∞) belong to H 1 (B r ). For a box [0, L] d , we denote by H 1 per (R d ) the closure in the H 1 ([0, L] d ) norm of the smooth L-periodic functions. By H 1 uloc (R d ), we denote the space of measurable functions u whose weak derivative ∇u exists and which satisfy the bound ||u|| H 1 uloc := sup x∈R d (´B 1(x) |∇u| 2 + |u| 2 dx) 1/2 < ∞. In order not to overburden notation, we shall frequently suppress the dependence on the spatial variable x in many expressions, for instance we will write A(ω ε , ξ) or A(ω ε , ∇u) instead of A(ω ε (x), ξ) respectively A(ω ε (x), ∇u(x)). By an expression like ∂ ξ A(ω, ξ)Ξ, we denote the derivative of A with respect to the second variable at the point (ω, ξ) evaluated in direction Ξ. Similarly, by ∂ ω ∂ ξ A(ω, ξ)δωΞ we denote the second mixed derivative of A with respect to its two variables at the point (ω, ξ) evaluated in directions δω and Ξ. We use notation like δF or δω ε to indicate infinitesimal changes (i. e. differentials) of various quantities and functions.
For two numbers a, b ∈ R, we denote by a ∧ b the minimum of a and b. We write a ∼ b to indicate that two constants a, b ∈ (0, ∞) are of a similar order of magnitude. For a matrix M ∈ R m×d , we denote by |M | := i,j |M ij | 2 its Frobenius norm. By R d×d skew we denote the set of skew-symmetric matrices of dimension d × d. By B r (x) we denote the ball of radius r centered at x. By B r we will denote the ball of radius r around the origin. For two sets A, B ⊂ R d and a point x ∈ R d , we denote by A + B := {a + b : A ∈ A, b ∈ B} their Minkowski sum, respectively by x + A the translation of A by x. By H we will denote a Hilbert space; we will denote its unit ball by H ∩ B 1 .
By C and c we will denote -typically large respectively typically small -nonnegative constants, whose precise value may change from occurrence to occurrence but which only depend on a certain set of parameters. For a set M , we denote by M the number of its elements.
We write ω ε ∼ P to indicate that a random field ω ε is distributed according to the probability distribution P. For two random variables or random fields X and Y , we write X ∼ Y to indicate that their laws coincide.
Whenever we use the terms "coefficient field" or "monotone operator", we shall implicitly assume measurability.

Main results
Before stating our main results and the precise setting, let us introduce the key objects in the homogenization of nonlinear elliptic PDEs and systems with monotone nonlinearity. To fix a physical setting, we will here give an outline of the meaning of the objects in the context of electric fields and associated currents. However, a major motivation for the present work -and in particular for the choice to include the case of nonlinear elliptic systems -stems from the homogenization of nonlinear elastic materials. While in this context the monotone structure is lost [11], it may be retained for small deformations [16,25,39,40,44]. A corresponding result in the context of stochastic homogenization will be established in [21]. The homogenization of monotone operators may also be viewed as a simple but necessary first step towards a possible quantitative homogenization theory for nonlinear elastic materials for larger deformations.
In the context of electric fields and currents, we are concerned with a scalar equation (i. e. m = 1); the functions u and u hom in (1) and (2) correspond (up to a sign) to the electric potential in the heterogeneous material respectively in the homogenized picture. Their gradients ∇u respectively ∇u hom are the associated electric fields. The monotone functions A(ω ε (x), ξ) respectively A hom (ξ) are the material law and describe the electric current created by a given electric field ξ. Note that in contrast to existing optimal results in stochastic homogenization, the material law may be nonlinear in ξ. Finally, the PDEs (1) and (2) correspond to prescribing the sources and sinks of the electric current.
The central object in the quantitative homogenization of elliptic PDEs is the homogenization corrector φ ξ , which in our context is an R m -valued random field on R d . It provides a bridge between the microscopic (heterogeneous) and the macroscopic (homogenized) picture: For a given constant macroscopic field gradient ξ ∈ R m×d , the corrector φ ξ provides the correction yielding the associated microscopic field gradient ξ + ∇φ ξ . The corrector φ ξ is defined as the unique (up to a constant) sublinearly growing distributional solution to the PDE see Definition 1 for the precise definition. Note that a priori, similarly to the linear elliptic case, the existence and uniqueness of such a solution is unclear.
The effective (homogenized) material law A hom (see Definition 1 for the precise definition) may be determined in terms of the homogenization corrector: In principle, at each point x ∈ R d the microscopic material law A(ω ε (x), ·) : R m×d → R m×d associates a current A(ω ε (x), ξ ε ) to a given electric field ξ ε ; likewise, the effective macroscopic material law A hom (·) : R m×d → R m×d associates a current A hom (ξ) to a given macroscopic electric field ξ. As the macroscopic current corresponds to an "averaged" microscopic current, the macroscopic material law should be obtained by averaging the microscopic flux. More precisely, the homogenization corrector φ ξ associates a microscopic electric field ξ + ∇φ ξ to a given macroscopic electric field ξ; therefore the macroscopic current A hom (ξ) should be given by the "average" of the microscopic current A(ω ε , ξ + ∇φ ξ ). In our setting, due to stationarity and ergodicity "averaging" corresponds to taking the expected value at an arbitrary point x ∈ R d (and we will suppress the point x ∈ R d in the notation). In other words, we have Our main results on the quantitative approximation of the solution u ε to the nonlinear elliptic PDE with randomly fluctuating material law −∇ · (A(ω ε , ∇u ε )) = f by the solution u hom to the homogenized equation −∇ · (A hom (∇u hom )) = f are stated in Theorem 2 and Theorem 3 in the case of the full space R d . The case of a bounded domain -however with a lower rate of convergence -is considered in Theorem 4.
Our second main result -the error estimates for the approximation of the effective material law A hom by the method of representative volumes -is stated in Theorem 9 and Corollary 10.
2.1. Assumptions and setting. We denote by d ∈ N the spatial dimension and by m ∈ N the system size; in particular, the case m = 1 corresponds to a scalar PDE. Let λ and Λ, 0 < λ ≤ Λ < ∞, denote ellipticity and boundedness constants. Let H and H ∩ B 1 denote a Hilbert space and the open unit ball in H, respectively. We denote by A : H × R m×d → R m×d a family of operators, indexed by a parameter in H. We require A to satisfy the following conditions: (A1) Each operator A(ω, ·) in the family is monotone in the second variable in the sense for every parameter ω ∈ H and all ξ 1 , ξ 2 ∈ R m×d . (A2) Each operator A(ω, ·) is continuously differentiable in the second variable and Lipschitz in the sense for every parameter ω ∈ H and all ξ 1 , ξ 2 ∈ R m×d . Furthermore, we have A(ω, 0) = 0 for every parameter ω ∈ H. (A3) The operator A(ω, ξ) and its derivative ∂ ξ A(ω, ξ) are differentiable in the parameter ω with bounded derivative in the sense for every ω ∈ H and all ξ ∈ R m×d . Here, ∂ ω and ∂ ξ denote the Fréchet derivative with respect to the first variable and the partial derivative with respect to the second variable, respectively. Furthermore, | · | denotes the operator norm on H → R m×d and H × R m×d → R m×d , respectively. Throughout our paper, we will reserve the term parameter field for a measurable function ω : R d → H ∩ B 1 . With help of the operator family A, we may associate to each parameter field ω a space-dependent monotone material law R d x → A( ω(x), ·). We denote the space of all parameter fields by Ω and equip Ω with the L 1 loc (R d ; H) topology. We then equip the space of parameter fields Ω with a probability measure P and write ω ε : R d → H ∩ B 1 to denote a random parameter field sampled with P.
It will be our second key assumption that the probability measure P describes a stationary random field with correlation length ε (which is also the reason why we include the index "ε" in our notation). To be precise, we impose the following conditions on P: (P1) P is stationary in the sense that the probability distribution of ω ε (· + y) coincides with the probability distribution of ω ε (·) for all y ∈ R d . From a physical viewpoint this corresponds to the assumption of statistical spatial homogeneity of the random material: While each sample of the random material is spatially heterogeneous, the underlying probability distribution is spatially homogeneous. (P2) P features fast decorrelation on scales ≥ ε in the sense of the spectral gap assumption of Definition 11 below. Here, and throughout the paper 0 < ε ≤ 1 is fixed and denotes the correlation length of the material.
Note that this corresponds to a quantitative assumption of ergodicity by assuming a decorrelation in the coefficient field ω ε on scales ≥ ε.
Under the previous conditions homogenization occurs (in fact (P2) can be weakend to qualitative assumption of ergodicity). In particular, we may introduce the corrector of stochastic homogenization and define a homogenized monotone operator (i. e. a homogenized material law) A hom as follows.
Definition 1 (Corrector and homogenized operator). Let the assumptions (A1)-(A3) and (P1)-(P2) be in place. Then for all ξ ∈ R m×d there exists a unique random field φ ξ : Ω × R d → R m , called the corrector associated with ξ, with the following properties: (a) For P-almost every realization of the random field ω ε the corrector φ ξ (ω ε , ·) has the regularity φ ξ (ω ε , ·) ∈ H 1 loc (R d ; R m ), satisfies ffl B1 φ ξ (ω ε , ·) dx = 0, and solves the corrector equation (4) in the sense of distributions. (b) The gradient of the corrector ∇φ ξ is stationary in the sense that ∇φ ξ (ω ε , · + y) = ∇φ ξ (ω ε (· + y), ·) a.e. in R d holds for P-a.e. ω ε and all y ∈ R d . (c) The gradient of the corrector ∇φ ξ has finite second moments and vanishing expectation, that is (d) The corrector P-almost surely grows sublinearly at infinity in the sense Moreover, for each ξ ∈ R m×d we may define where the right-hand side in this definition is independent of the spatial coordinate x. The map A hom : R m×d → R m×d is called the the effective operator or the effective material law.
We shall see that the homogenized material law A hom : R m×d → R m×d defined by (5) inherits the monotone structure from the heterogeneous material law A(ω ε , ·), see Theorem 6 below.
For some of our results we will assume that the following microscopic regularity condition is satisfied. Note that the condition essentially implies a small-scale C 1,α regularity theory (i. e. a C 1,α theory on the ε scale) for the heterogeneous equation and a global C 1,α regularity theory for the homogenized (effective) equation.
(R) Suppose that at least one of the following three conditions is satisfied: -The equation is scalar (i. e. m = 1).
-The number of spatial dimension is at most two (i. e. d ≤ 2).
-The system is of Uhlenbeck structure in the sense that there exists a function ρ : H ∩ B 1 × R + 0 → R + 0 with A(ω, ξ) = ρ(ω, |ξ| 2 )ξ for all ω ∈ H∩B 1 and all ξ ∈ R m×d ; furthermore, the effective operator given by (5) is also of Uhlenbeck structure. Suppose in addition that the second derivative of A with respect to the second variable exists and satisfies the bound |∂ ξ ∂ ξ A(ω, ξ)| ≤ Λ for all ω ∈ H ∩ B 1 and all ξ ∈ R m×d .
Suppose furthermore that the random field ω ε is Lipschitz regular on small scales in the following sense: There exists a random field C with uniformly bounded stretched exponential moments E[exp(|C(x)| ν )] ≤ 2 for all x ∈ R d for some ν > 0 such that

2.2.
Optimal-order homogenization error estimates. Our first main result is an optimal-order estimate on the homogenization error in the stochastic homogenization of nonlinear uniformly elliptic PDEs (and systems) with monotone nonlinearity. Note that our rate of convergence coincides with the optimal rate of convergence for linear elliptic PDEs and systems [4,29,31,32,34], which form a subclass of the class of elliptic PDEs with monotone nonlinearity.
Theorem 2 (Optimal-order estimates for the homogenization error for d ≥ 3). Let d ≥ 3. Let the assumptions (A1)-(A3) and (P1)-(P2) be in place. Suppose furthermore that the small-scale regularity condition (R) holds. Let the effective (homogenized) monotone operator A hom : R m×d → R m×d be given by the defining formula (5) and let p > d. Then for any Here, C(ω ε , f ) is a random constant whose values may depend on ω ε and f , but whose (stretched exponential) stochastic moments are uniformly estimated by for someν > 0 depending only on d, m, λ, Λ, ρ, p, and on ν from assumption (R), as well as some C > 0 depending only on upper bounds for ||f || L 1 and ||f || L p as well as on p.
Since by Theorem 6 the effective material law A hom inherits the monotone structure from the heterogeneous material law A(ω, ·), the solvability of the homogenized equation is guaranteed for any In the case of low dimension d ≤ 2 the rate of convergence becomes limited by the central limit theorem scaling. In particular, as for linear elliptic equations, the case d = 2 is critical, leading to a logarithmic correction. Furthermore, even for the Poisson equation -which may be regarded as a very particular case of our PDEs (1) or (2) -the gradient of solutions of the whole-space problem may fail to be square-integrable. For this reason we include a lower order term in our PDEs.
Theorem 3 (Optimal-order estimates for the homogenization error for d = 2 and d = 1). Let d = 2 or d = 1 and let otherwise the assumptions of Theorem 2 be in place. Suppose furthermore ε ≤ 1 and (1 + |x|) κ f ∈ L 2 (R d ) for some κ > 0. Then the unique weak solution u ε ∈ H 1 (R d ; R m ) to the nonlinear elliptic PDE with heterogeneous coefficients may be approximated by the solution u hom ∈ H 1 (R d ; R m ) to the homogenized effective equation up to an error of the order Here, C is a random constant which satisfies stretched exponential stochastic moment bounds in the sense E[exp(Cν/C)] ≤ 2 for someν > 0 as in Theorem 2 and some C > 0 depending only on upper bounds for ||f || L 1 , ||(1+|x|) κ f || L 2 , and ||f || L p as well as on p and κ.
In the absence of the small-scale regularity condition (R), we still obtain half of the optimal rate of convergence. However, we also directly obtain this result in bounded domains. Note that in order to recover the optimal rates of convergence from Theorem 2 and Theorem 3 also for the Dirichlet problem on bounded domains, one would need to construct boundary correctors, as done for linear elliptic PDEs for instance in [10] in the setting of periodic homogenization or in [5,23] in the setting of stochastic homogenization.
Here, C(ω ε , f ) is a random constant whose values may depend on ω ε , O, u Dir , and f , but whose (stretched exponential) stochastic moments are uniformly estimated by for someν > 0 depending only on d, m, λ, Λ, and ρ, and some C(O) > 0 depending only on O.
Remark 5. An error estimate analogous to Theorem 4 also holds on the full space R d , provided that in case d ≥ 3 we measure the error u ε − u hom in the L 2d/(d−2) norm as in Theorem 2 and assume f ∈ L 1 ∩ L 2d d−2 , and provided that in case d = 2 we include a massive term in the equation as in Theorem 3.
We next establish several structural properties of the homogenized operator A hom , including in particular the statement that the homogenized operator A hom inherits the monotone structure from A ε . Note that these properties are actually true even in the context of qualitative stochastic homogenization, and their proof is fairly elementary. The monotonicity of the effective operator A hom has (essentially) already been established by Dal Maso and Modica [18,17], at least in the setting of convex integral functionals. We also give sufficient criteria for frame-indifference and isotropy of the homogenized operator A hom .
Theorem 6 (Structure properties of the homogenized equation). Let the assumptions (A1)-(A3) and (P1)-(P2) be in place. Define the effective (homogenized) material law A hom : R m×d → R m×d by (5). Then A hom has the following properties: a) The map A hom is monotone and Lipschitz continuous in the sense that hold for all ξ 1 , ξ 2 ∈ R m×d . Furthermore, A hom is continuously differentiable and satisfies A hom (0) = 0. b) If the operator A is frame-indifferent in the sense A(ω, Oξ) = O −1 A(ω, ξ) for all O ∈ SO(m), all ξ ∈ R m×d , and all ω ∈ H, the operator A hom inherits the frame-indifference in the sense

2.3.
Optimal-order error estimates for the approximation of the homogenized operator by periodic RVEs. To perform numerical simulations based on the homogenized PDE (2), the effective material law A hom must be determined. The theoretical expression (5) for the effective material law A hom is essentially an average over all possible realizations of the random material; it is therefore a quantity that is not directly computable. To numerically determine the effective (homogenized) material law A hom in practice, the method of representative volumes is typically employed: A finite sample of the random medium is chosen, say, a cube [0, L] d of side length L ε, and the cell formula from homogenization theory is evaluated on this sample.
In most cases, a representative volume of mesoscopic size ε L 1 is sufficient to approximate the effective material law A hom well. For this reason, at least in the case of a clear separation of scales ε 1 numerical simulations based on the homogenized equation (2) and the RVE method are several to many orders of magnitude faster than numerical simulations based on the original PDE (1).
In our next theorem, we establish a priori error estimates for the approximation of the effective material law A hom by the method of representative volumes. Our a priori error estimates are of optimal order in the physically relevant cases d ≤ 4, at least if the issue of boundary layers on the RVE is addressed appropriately. More precisely, we show that the homogenized coefficients A hom may be approximated by the representative volume element method on an RVE of size L up to an error of the order of the natural fluctuations ( L ε ) −d/2 . This rate of convergence coincides with the optimal rate in the case of linear elliptic PDEs and systems, see [32,33].
There exist various options to define the RVE approximation, see e. g. [22, Section 1.4] for a discussion in the linear elliptic setting. In the present work, we shall consider the case of periodic RVEs: One considers an L-periodic approximation P L of the probability distribution P of the random field ω ε , that is an L-periodic variant ω ε,L of the random field ω ε (see below for a precise definition), and imposes periodic boundary conditions on the RVE boundary ∂[0, L] d .
Before rigorously defining the notion of periodization, we first note that to any periodic parameter field ω L one can unambiguously (and deterministically) associate a homogenized coefficient via the classical periodic homogenization formula.
Definition 7 (Periodic RVE approximation). Let A : H × R m×d → R m×d be a family of monotone operators satisfying (A1)-(A3). To any L-periodic parameter field ω L and any ξ ∈ R m×d we associate the RVE approximation A RVE,L ( ω L , ξ) for the effective coefficient A hom (ξ) given by where the (periodic) corrector φ ξ = φ ξ ( ω L , ·) is defined as the unique solution in We next define our notion of L-periodic approximation of the coefficient field ω ε . The main condition will be that the statistics of ω ε and ω ε,L must coincide on balls of the form B L 4 (x 0 ) around any x 0 ∈ R d .
Definition 8 (L-periodic approximation of P). Let P satisfy the assumptions (P1) and (P2). Let L ≥ ε. We say that P L is an L-periodic approximation to P, if P L is stationary in the sense of (P1), concentrates on L-periodic parameter fields, satisfies the periodic spectral gap of Definition 11b, and the following property holds: For P-a.e. random field ω ε and P L -a.e. random field ω ε,L we have i.e., if ω ε is distributed with P, and ω ε,L is distributed with P L , then the restricted random fields ω ε | B L For such an L-periodic approximation P L , we abbreviate the representative volume approximation as introduced in Definition 7 by A RVE,L (ξ) := A RVE,L (ω ε,L , ξ).
Note, however, that the existence of such an L-periodic approximation P L of P has to be proven on a case-by-case basis, depending on the probability distribution. To give one example, for a random field ω ε of the form ω ε (x) := ξ(Y ε (x)), where Y ε is a stationary Gaussian random field arising as the convolution Y ε (x) := (β ε * W )(x) of white noise W with a kernel β ε with supp β ε ⊂ B ε , one may construct an Lperiodic approximation simply by replacing the white noise W by L-periodic white noise W L , i. e. by defining ω ε,L (x) := ξ((β ε * W L )(x)).
For the approximation of the effective material law A hom by such an L-periodic representative volume, we establish the following a priori error estimate. Again, our rates of convergence coincide with the optimal rates of convergence for linear elliptic PDEs and systems [29,31,32].
Theorem 9 (Error estimate for periodic RVEs). Let A : H × R m×d → R m×d satisfy the assumptions (A1) -(A3) and let P satisfy the assumptions (P1)-(P2). Let L ≥ 2ε, let P L be an L-periodic approximation of P in the sense of Definition 8, and denote by A RVE,L (ξ) the corresponding representative volume approximation for the homogenized material law A hom (ξ).
(a) (Estimate on random fluctuations). For all ξ ∈ R m×d we have the estimate on random fluctuations where C denotes a random variable that satisfies a stretched exponential moment bound uniformly in L, i. e. there exists C = C(d, m, λ, Λ, ρ) such that (b) (Estimate for the systematic error). Suppose that P and P L additionally satisfy the small-scale regularity condition (R). Then for any ξ ∈ R m×d the systematic error of the representative volume method is of higher order in the sense Here, α d is given by for some C = C(d, m, λ, Λ, ν, ρ). In dimension d ≤ 4 the estimate is optimal (up to the logarithmic factor). Without the small-scale regularity condition (R), the suboptimal estimate for d ∈ {2, 4}, 1 for d = 3, d = 1, and d ≥ 5.
In particular, we derive the following overall a priori error estimate for the method of representative volumes.
Corollary 10 (Total L 2 -error for periodic RVEs). Let assumptions (A1) -(A3) and (P1) -(P2) be in place. Let L ≥ ε. Let P L be a L-periodic approximation to P in the sense of Definition 8. (a) If the small-scale regularity condition (R) is satisfied, then for 2 ≤ d ≤ 7 we obtain the optimal estimate Moreover, for d > 7 we obtain the suboptimal estimate .
(b) If (R) is not satisfied, then for any d ≥ 2 we obtain the subotimal estimate .
For d ≤ 4 this estimate is optimal except for the logarithmic factor. In the above estimates, we have C = C(d, m, λ, Λ, ρ).

2.4.
The decorrelation assumption: spectral gap inequality. We finally state and discuss the quantitative assumption on the decorrelation of the random field ω ε . The majority of results in the present work require the correlations of the random parameter field ω ε to decay on scales larger than ε in a quantified way. This quantified decay is enforced by assuming that the probability distribution P satisfies the following spectral gap inequality. As already mentioned previously, this spectral gap inequality holds for example for random fields of the form ω ε (x) = β( ω(x)), where β : R → (−1, 1) is a 1-Lipschitz function and where ω ε is a stationary Gaussian random field subject to a covariance estimate of the form Cov ω ε (x), ω ε (y) ≤ C ε ε + |x − y| d+κ for all x, y ∈ R d for some κ > 0 and some C < ∞. The derivation is standard and may be found e. g. in [20].
(a) We say that the probability distribution P of random fields ω ε satisfies a spectral gap inequality with correlation length ε and constant ρ > 0 if any random variable F = F (ω ε ) satisfies the estimate Here, ffl where the sup runs over all random fields δω ε : (b) Let L ≥ ε and let P L be a probability distribution of L-periodic random fields ω L,ε . We say that P L satisfies a periodic spectral gap inequality with correlation length ε and constant ρ > 0 if any random variable F = F (ω ε,L ) satisfies the estimate Here, ffl where the sup runs over all L-periodic random fields δω ε,L : For a reader not familiar with the concept of spectral gap inequalities, it may be instructive to inserting the spatial average F (ω ε ) := ffl Br ω ε dx into the definition (8). For this particular choice, the Fréchet derivative is given by ∂F ∂ωε = 1 |Br| χ Br and the expression on the right-hand side in (8) is of the order ( ε r ) d . Thus, the fluctuations of the spatial average F (ω ε ) = ffl Br ω ε (x) dx are (at most) of the order ( ε r ) d/2 , just as expected when averaging ( r ε ) d independent random variables of similar variance. However, the importance of spectral gap inequalities in stochastic homogenization [30,38,29] stems from the fact that they also entail fluctuation bounds for appropriate nonlinear functionals of the random field ω ε , as the homogenization corrector φ ξ is a nonlinear function of the random field ω ε (even in the setting of linear elliptic PDEs). To give a simple example for fluctuation bounds for a nonlinear functional of ω ε , the reader will easily verify that the spectral gap inequality (8) also implies a fluctuation bound of the order ( ε r ) d/2 for a quantity like f ffl Br g(ω ε (x)) dx with two 1-Lipschitz functions f, g : R → R.

3.1.
Key objects: Localized correctors, the two-scale expansion, and linearized correctors. Before turning to the description of our strategy, we introduce the central objects for our approach. Besides the homogenization corrector φ ξ defined in Definition 1, these key objects include the flux corrector σ ξ , localized versions φ T ξ and σ T ξ of the corrector and the flux corrector, as well as the homogenization correctors φ T ξ,Ξ and flux correctors σ T ξ,Ξ for an associated linearized PDE. At the end of the section we also state optimal stochastic moment bounds for these correctors, which will play a central role in the derivation of our main results.
The flux corrector σ ξ is an important quantity in the quantitative homogenization theory of elliptic PDEs, see [29] for a first reference in the context of quantitative stochastic homogenization. Recall that the corrector φ ξ provides the connection between a given macroscopic constant field gradient ξ and the corresponding microscopic field gradient ξ + ∇φ ξ . Similarly, the flux corrector σ ξ : R d → R m × R d×d skew provides a "vector potential" for the difference between the "microscopic" and the "macroscopic" flux in the sense The central importance of the flux corrector σ ξ is that it allows for an elementary representation of the error of the two-scale expansion, see below.
The flux corrector σ ξ is a d−1-form, i. e. it satisfies the skew-symmetry condition As σ ξ is a "vector potential" (a d − 1-form), the equation (10) only defines σ ξ up to gauge invariance. It is standard to fix the gauge by requiring σ ξ to satisfy Localized correctors. It is cumbersome to work directly with the corrector φ ξ and flux corrector σ ξ in a rigorous manner; in particular, since we need to consider derivatives of (φ ξ , σ ξ ) with respect to compactly supported perturbations of the parameter field ω ε . For this reason, we shall frequently work with the localized correctors φ T ξ and σ T ξ , which for T > 0 and any parameter field ω : R d → H ∩ B 1 can be defined unambiguously on a purely deterministic level.

Lemma 12 (Existence of localized correctors). Let assumptions
uloc . Note that the additional term 1 T φ T ξ introduces an exponential localization effect. As a consequence, existence and uniqueness of φ T ξ follow by standard arguments. For example, existence follows by considering the sequence of solutions to the PDEs −∇ · (A( ω, χ Br(x0) ξ + ∇w r )) + 1 T w r = 0 (which admit a unique solution w r ∈ H 1 by standard monotone operator theory) and passing to the limit r → ∞; the exponential localization of Lemma 36 below then yields the convergence and the independence of the limit from the choice of x 0 . The argument for σ T ξ is similar. For a detailed proof see [21].
Note also that for a stationary random field ω ε , by uniqueness the associated localized homogenization corrector φ T ξ and the localized flux corrector σ T ξ inherit the property of stationarity.
The localized correctors approximate the original correctors (φ ξ , σ ξ ) in the sense that in the limit T → ∞; for d ≥ 3, we even have the convergence of φ T ξ and σ T ξ itself to stationary limits φ ξ and σ ξ (which however may differ from the φ ξ and σ ξ from Definition 1 by an additive constant). A proof of these facts is provided in Lemma 26.
Two-scale expansion. The correctors (φ ξ , σ ξ ) provide a link between the gradient of the homogenized solution ∇u hom and the gradient of the original solution ∇u ε via the two-scale expansion: Indeed, the two-scale expansion is a (formal) approximate solution to the equation with microscopically varying material law in the sense with the residual R given by If ε is much smaller than the scale on which ∇u hom changes, the residual R will be small (as we will typically have ∂ ξ φ ξ ∼ ε and ∂ ξ σ ∼ ε, see Proposition 14 for a more precise statement). In this case, taking the difference of (1) and (15) one obtains which by virtue of the monotonicity of the material law A (see (A1)-(A2)) gives rise to an estimate on ∇u ε − ∇û. From this estimate, one may then derive a bound on u ε − u hom . For a derivation of the error expression in the two-scale expansion in the form (15), which leads to some subleties regarding measurability, we refer to the forthcoming paper [21]. In our result, we will replace the term ∂ ξ φ ξ | ξ=∇u hom in the two-scale expansion by a piecewise constant approximation for u hom (with interpolation); for this variant of the two-scale expansion, essentially finite differences of the form φ ξ1 − φ ξ2 appear in the error expression. Details may be found in Proposition 27 respectively in its proof. Linearized correctors. In the error expression (16) for the residual of the two-scale expansion the derivatives ∂ ξ φ ξ and ∂ ξ σ ξ of the corrector and the flux corrector with respect to the field ξ appear. For this reason, we need optimal-order estimates on these derivatives in order to derive an optimal-order error bound for u ε − u hom . More precisely, in our variant of the two-scale expansion we need improved estimates for finite differences φ ξ1 − φ ξ2 which reflect both the magnitude of the difference |ξ 1 −ξ 2 | and the decorrelation in order to derive optimal-order error estimates. Note, however, that this is basically an equivalent problem to estimating the derivatives ∂ ξ φ ξ . It is interesting to observe that the derivatives ∂ ξ φ ξ and ∂ ξ σ ξ are at the same time the homogenization corrector and the flux corrector for the PDE linearized around the field ξ + ∇φ ξ : With the coefficient field we deduce by differentiating (4) and (11) that the equalities ∂ ξ φ ξ Ξ = φ ξ,Ξ and ∂ ξ σ ξ Ξ = σ ξ,Ξ hold, where φ ξ,Ξ and σ ξ,Ξ are defined as the solution to the PDEs i. e. φ ξ,Ξ and σ ξ,Ξ are the homogenization correctors for the linear elliptic PDE with random coefficient field a ξ .
In our analysis we will first estimate the differences φ T ξ1 − φ T ξ2 of the localized correctors and obtain the estimate on the differences φ ξ1 − φ ξ2 by passing to the limit T → ∞. For this reason, we introduce the linearized localized correctors φ T ξ,Ξ and σ T ξ,Ξ as the unique solutions in H 1 uloc to the PDEs where we have abbreviated Lemma 20. Note that by our assumption of monotonicity and Lipschitz continuity of A (A1)-(A2), the linearized coefficient fields a ξ and a T ξ are uniformly elliptic and bounded random coefficient fields with a stationary and ergodic distribution. Therefore, the existence (and the notion) of solution to (17a) -(17c) and (18a) -(18c) would follow by the linear theory of stochastic homogenization (see e. g. [29, Lemma 1]), a fact that we however do not need to use: Again, the existence of solutions to the localized corrector problems (18a) and (18c) is evident for any parameter fieldω; see Lemma 20 for a proof.
The linearized corrector problem (17a) has also been central for the homogenization result for the linearized equation by Armstrong, Ferguson, and Kuusi [2], and some of our lemmas are analogues of results from [2]: For instance, our differentiability result for the corrector φ T ξ in Lemma 20 is essentially analogous to [2, Lemma 2.4]. Furthermore, in the proofs of [2] small-scale regularity estimates similar to our estimate (42) have been employed. However, in contrast to [2] we establish optimal-order estimates on the linearized correctors φ T ξ,Ξ (see Proposition 14 below), a result that will be of key importance for our main theorems. Corrector estimates. In view of the form of the formula for the residual (16) of the two-scale expansion it is clear that a key ingredient in the quantification of the homogenization error are estimates on the correctors. In this section we state estimates on the localized correctors and its linearizations.
As the typical size of the correctors φ T ξ and φ T ξ,Ξ will be at least of order ε due to small-scale fluctuations, the corrector bounds alone cannot capture the decay of fluctuations in d ≥ 3 optimally. For d ≥ 3, it is therefore useful to state estimates also for associated vector potentials for the correctors φ T ξ and φ T ξ,Ξ . In case d ≥ 3, we introduce a (vector) potential θ T ξ : R d → R m×d for the corrector φ T ξ as the (up to an additive constant unique) sublinearly growing solution to the equation as well as the corresponding quantity θ T ξ,Ξ defined by which yields Note that as the equations (19a) and (20a) feature a non-decaying right-hand side but lack a massive term, their solvability is not guaranteed for arbitrary parameter fieldsω. For random fields ω ε subject to (P1)-(P2), in d ≥ 3 the existence of solutions to (19a) and (20a) is a consequence of standard methods in qualitative stochastic homogenization, see e. g. [29].
Proposition 13 (Estimates on the homogenization corrector for the nonlinear monotone PDE). Let the assumptions (A1)-(A3) and (P1)-(P2) be in place. Then the localized homogenization correctors φ T ξ and the localized flux correctors σ T ξdefined as the (unique) solution in H 1 uloc to the PDEs (12a) respectively (12c) -are subject to the estimate for any r ≥ 2ε, any x 0 ∈ R d , any T ≥ 2ε 2 , and any ξ ∈ R m×d . Furthermore, for d ≥ 3 we even have for any r ≥ 2ε, any x 0 ∈ R d , any T ≥ 2ε 2 , and any ξ ∈ R m×d , while for d = 1 and d = 2 we have Here, C is a random constant whose values may depend on ω ε , ξ, the localization parameter T , x 0 , and r, but whose (stretched exponential) stochastic moments are uniformly estimated by E exp Cν ≤ 2 for someν > 0 depending only on d, m, λ, Λ, and ρ (in particular, independently of T ).
In the case of three and more spatial dimensions d ≥ 3, the potential field θ T ξ exists and satisfies the estimate for any r ≥ 2ε, any x 0 ∈ R d , any T ≥ 2ε 2 , and any ξ ∈ R m×d . Furthermore, the estimates (21), (22a), and (23) also hold for the L p norm with 2 ≤ p ≤ 2d (d−2)+ and p < ∞ in place of the L 2 norm, up to the following modifications: The constantν may now also depend on p, and the factor | log(r/ε)| 1/2 in (21) and (23) is replaced by | log(r/ε)| in the cases d = 2 respectively d = 4.
Under the additional small-scale regularity assumption (R), we establish the following estimates on the homogenization corrector φ T ξ,Ξ associated with the linearized operator −∇ · ∂ ξ A(ω ε , ξ + ∇φ T ξ )∇ · . Recall that this homogenization corrector φ T ξ,Ξ is equivalently given as the derivative of the homogenization corrector φ T ξ with respect to ξ in direction Ξ, see Lemma 20.
Proposition 14 (Estimates on the homogenization corrector for the linearized PDE). Let the assumptions (A1)-(A3) and (P1)-(P2) be in place. Suppose furthermore that the small-scale regularity condition (R) holds. Then the homogenization corrector for the linearized equation φ T ξ,Ξ := ∂ ξ φ T ξ Ξ and the corresponding flux corrector σ T ξ,Ξ := ∂ ξ σ T ξ Ξ given as the solutions to the PDEs (18a) and (18c) are subject to the estimates for any r ≥ 2ε, any x 0 ∈ R d , any T ≥ 2ε 2 , and any ξ, Ξ ∈ R m×d . Furthermore, for d ≥ 3 we even have , any T ≥ 2ε 2 , and any ξ, Ξ ∈ R m×d , while for d = 1 and d = 2 we have Here, C is a random constant whose values may depend on ω ε , ξ, Ξ, r, x 0 , and the localization parameter T , but whose (stretched exponential) stochastic moments are uniformly estimated by E exp Cν ≤ 2 for someν > 0. The constantsν and C depend only on d, m, λ, Λ, ν, and ρ (in particular, they are independent of T ).
In the case of three and more spatial dimensions d ≥ 3, the potential field θ T ξ,Ξ exists and satisfies the estimate for any r ≥ 2ε, any x 0 ∈ R d , any T ≥ 2ε 2 , and any ξ, Ξ ∈ R m×d .
Furthermore, all of these estimates also hold for the correctors associated with the adjoint coefficient field a T, * ξ . Finally, the estimates (24), (25), and (26) also hold for the L p norm with 2 ≤ p ≤ 2d (d−2)+ and p < ∞ in place of the L 2 norm, up to the following modifications: The constantν may now also depend on p, and the factor | log(r/ε)| 1/2 in (24) and (26) is replaced by | log(r/ε)| in the cases d = 2 respectively d = 4.
A key consequence of our estimates on the linearized correctors φ T ξ,Ξ and σ T ξ,Ξ is the following estimate on differences of the correctors φ ξ and σ ξ for different values of ξ.
converge to limits φ ξ and σ ξ . The limits φ ξ and σ ξ satisfy the corrector equations (4) and (10); furthermore, in case d ≥ 3 they are subject to the estimate , and any ξ ∈ R m×d , while in case d = 1, 2 they satisfy for any r ≥ 2ε, any x 0 ∈ R d , and any ξ ∈ R m×d . Here, C is a random constant with stretched exponential moments in the sense E exp Cν ≤ 2 for someν > 0 depending only on d, m, λ, Λ, and ρ.
Suppose furthermore that the small-scale regularity condition (R) holds. Then the difference of homogenization correctors φ ξ1 − φ ξ2 and σ ξ1 − σ ξ2 is estimated by for any ξ 1 , ξ 2 ∈ R m×d , any r ≥ ε, and any x 0 ∈ R d , where the random constant C satisfies for someν > 0. Here,ν and C depend only on d, m, λ, Λ, ρ, and ν.

The strategy of proof for corrector estimates.
A key difficulty in quantitative stochastic homogenization is the derivation of appropriate estimates on the (localized) homogenization correctors φ T ξ and the (localized) flux corrector σ T ξ : As previously mentioned, corrector bounds form the basis for homogenization error estimates and the analysis of the representative volume approximation. For our purposes, we for example need to show that the homogenization corrector φ T ξ and the flux corrector σ T ξ are at most of order |ξ|ε (at least in case d ≥ 3). In periodic homogenization, that is for ε-periodic operators −∇ · A ε (x, ∇·), such corrector bounds are an easy consequence of the periodicity: By the defining equation (12a), the corrector φ T ξ is ε-periodic, has vanishing average on each periodicity cell [0, ε] d , and is subject to an energy estimate of the form By the Poincaré inequality on the periodicity cell, this implies a bound of order |ξ|ε on the homogenization corrector φ T ξ . The derivation of the corresponding estimate for the flux corrector σ T ξ is analogous. In contrast, in quantitative stochastic homogenization the derivation of such estimates on the correctors is much more involved and presents one of the key challenges. Our strategy for the derivation of bounds on φ T ξ , which is strongly inspired by [32,33,29,31] but streamlined by minimizing the use of elliptic regularity, proceeds as follows: • As outlined in Proposition 13, it is our goal to prove a corrector estimate of essentially the form for any R ≥ ε, where C is a random constant with stretched exponential moments in the sense E[exp(C 1/C )] ≤ 2.
• In principle, the technical Lemma 25 (see below) reduces the derivation of such estimates on the corrector φ T ξ to (mostly) the derivation of bounds on stochastic moments of integral functionals of the form where g basically takes a weighted average of ∇φ T ξ on a certain scale r with ε ≤ r ≤ R.
• The random variables F (ω ε ) as defined in (30) have vanishing expectation E[F (ω ε )] = 0 due to the vanishing expectation of the corrector gradient E[∇φ T ξ ] = 0. The latter property is a consequence of stationarity of the corrector φ T ξ : Since the probability distribution of the random monotone operator A(ω ε (x), ·) is invariant under spatial translations, the expectation of the corrector As a consequence of the vanishing expectation, it suffices to bound the centered moments of F (ω ε ), that is, the • Concentration inequalities -like the spectral gap inequality -are one of the most widespread probabilistic tools for establishing bounds on centered moments of random variables. In our context, that is for random fields ω ε with correlations restricted to the length scale ε, they will read for example for all random variables F = F (ω ε ). Here, ∂F ∂ωε denotes (essentially) the Frechét derivative of F with respect to the field ω ε and ρ > 0 denotes a constant.
Note that in stochastic homogenization it is an assumption that a spectral gap inequality like (31) holds for the random field ω ε . This assumption on the probability distribution of the random field ω ε encodes the decorrelation properties of ω ε . For instance, the reader may convince oneself that (31) implies an estimate of the order (ε/r) d/2 on the fluctuations of the average F (ω ε ) := ffl Br ω ε dx on a scale r ≥ ε, as one would expect for e. g. the average of (r/ε) d i. i. d. random variables; see also the discussion below Definition 11.
A spectral gap inequality of the form (31) is valid for many classes of random fields, see Figure 1 and the accompanying text. It also implies estimates on higher centered moments, see Lemma 16 below.
• In order to employ the spectral gap inequality (31) to estimate the stochastic moments of random variables F (ω ε ) =´g · ∇φ T ξ dx as defined in (31), we need to estimate the right-hand side of (31), that is we need to estimate the sensitivity of the functionals F (ω ε ) with respect to changes in the coefficient field ω ε . By standard computations (see the proof of Lemma 17a for details), one may show that the Frechét derivative of F with respect to ω ε is given by where h is the solution to the auxiliary linear elliptic PDE (the uniformly elliptic and bounded coefficient field a T ξ being given by ). By the (standard) growth assumptions for A(ω ε , ·) in (A3), the representation (32) implies that the sensitivity ∂F ∂ωε may be estimated by C|ξ + ∇φ T ξ ||∇h|. In conclusion, by the spectral gap inequality (31) respectively its version for higher stochastic moments in Lemma 16 we see that we have • Let us next discuss how to estimate the right-hand side of (34). Recall that g basically takes a weighted average on a scale r (see (30)), i. e. g is supported on a ball B r and satisfies an estimate like |g| r −d . As a consequence, we obtain the deterministic estimate´R d |∇h| 2 dx r −d by a simple energy estimate for the defining equation (33). If we knew that the small-scale averages of the corrector gradient were bounded in the sense ffl Bε(x) |ξ + ∇φ T ξ | 2 dx |ξ| 2 , by Hölder's inequality we would directly obtain the deterministic bound By (34) this would imply an estimate of fluctuation order on the functionals (30) of the form E[|F | q ] 1/q q|ξ|(ε/r) d/2 . This would be precisely the bound we seek to obtain.
However, in the context of stochastic homogenization we cannot expect a uniform bound on the locally averaged corrector gradient ffl Bε(x) |ξ + ∇φ T ξ | 2 dx, as in general the random field ω ε may contain geometric configurations which could cause the corrector gradient to be arbitrarily large. As we are dealing with the case of systems, we also cannot expect more than L p integrability of the gradient ∇h of the auxiliary function for a Meyers exponent p slightly larger than 2. We therefore need (almost) an L ∞ bound on the local averages ffl It is here useful to introduce an auxiliary quantity, namely the minimal radius r * ,T,ξ (x) above which the corrector φ T ξ satisfies a bound of the form (plus one additional technical condition). This allows us to bound using a trivial estimate and the Caccioppoli inequality This estimate is however again not sufficient for our purposes. It is here that elliptic regularity theory in form of the hole-filling estimate enters and provides a slight but crucial improvement for some δ > 0 (see Lemma 88 for details).
Together with a version of the spectral gap estimate -see Lemma 16 below -and the vanishing expectation E[∇φ T ξ ] = 0, this estimation strategy yields a bound of the form for any 0 < τ < 1 and any q large enough (the latter of which is not a problem).
• One then observes that one may control the moments of r * ,T,ξ on the righthand side of (35) in terms of moments of functionals F =´g · ∇φ T ξ dx (see the proof of Proposition 13). It is here that the slight gain δ in the exponent (from hole-filling) is crucial, as it causes the estimate to buckle, yielding a bound of the form The resulting moment bounds on r * ,T,ξ then allow to deduce the corrector estimate in Proposition 13.
• The derivation of the estimates for the flux corrector σ T ξ and the linearized correctors φ T ξ,Ξ , σ T ξ,Ξ follows a similar strategy. However, in the case of the linearized correctors φ T ξ,Ξ , σ T ξ,Ξ the derivation of the sensitivity estimates involves an additional integrability issue on small scales, making it necessary to use a C 1,α regularity theory on the microscopic scale. It is here that our additional regularity assumption (R) enters. For details, see the derivation of Lemma 21.
Let us now state the lemmas used in the course of the proof of our main results. An immediate consequence of the spectral gap inequality of Definition 11 is the following version for the q-th centered moment; for a proof, we refer to e. g. [ A central step towards the proof of the corrector estimates of Proposition 13 are the following estimates on stochastic moments of linear functionals of the localized corrector and the localized flux corrector.
Lemma 17 (Estimates for linear functionals of the corrector and the flux corrector for the monotone system). Let the assumptions (A1)-(A3) and (P1)-(P2) be satisfied. Let ξ ∈ R m×d , T ≥ 2ε 2 , K mass ≥ C(d, m, λ, Λ), and let φ T ξ be the unique nongrowing solution to the corrector equation (12a). Define for any x 0 ∈ R d r * ,T,ξ (x 0 ) := inf r = 2 k ε : k ∈ N 0 and for all R = 2 ε ≥ r, ∈ N 0 , we have both Then the random field r * ,T,ξ is stationary and allows for the following estimates.
a) Let 2 < p < 2 + c (where c = c(d, m, λ, Λ) > 0 will be defined in the proof below), let x 0 ∈ R d and r ≥ ε, and let F be a functional of the form Then there exists an exponent δ = δ(d, m, λ, Λ) > 0 (from hole-filling) such that the stochastic moments of the functional F σ may be estimated by for any 0 < τ < 1 and any q ≥ C. Here, the constants C may depend on d, m, λ, Λ, ρ, K mass , p, and τ . b) Let 2 < p < 2 + c, let x 0 ∈ R d and r ≥ ε, and let F σ be a functional of the form Then there exists δ = δ(d, m, λ, Λ) > 0 such that the stochastic moments of the functional F σ may be estimated by for any 0 < τ < 1 and any q ≥ C(d, m, λ, Λ, p, τ ). c) Suppose that d ≥ 3. Let 2 < p < 2 + c, let x 0 ∈ R d and r ≥ ε, and let F θ be a functional of the form Then there exists δ > 0 such that the stochastic moments of the functional F θ may be estimated by for any 0 < τ < 1 and any q ≥ C(d, m, λ, Λ, p, τ ).
In particular, all of our estimates are independent of T ≥ 2ε 2 .
We also obtain the following estimates on averages of the correctors.
Lemma 18. Given the assumptions of Lemma 17, for any 0 < τ < 1, any q ≥ C, any x 0 ∈ R d , any r ≥ ε, and any R ≥ r we have the estimate . We then establish the following moment bounds on the minimal radius r * ,T,ξ , which will enable us to deduce the corrector bounds in Proposition 13.
It will be central for our optimal-order homogenization error estimates to obtain optimal-order estimates on differences of correctors for different macroscopic field gradients ξ, like φ ξ1 − φ ξ2 for ξ 1 , ξ 2 ∈ R m×d . To estimate such differences, we will rely on estimates for the derivative φ T ξ,Ξ of the corrector φ T ξ with respect to ξ in direction Ξ. Formally differentiating the corrector equation (12a) with respect to ξ, we obtain for φ T ξ,Ξ : (see also (18a) above). It is interesting that this PDE again takes the form of a corrector equation, namely the corrector equation for the linear elliptic equation with random coefficient field ∂ ξ A(ω ε , ξ + ∇φ T ξ ). We next show that this differentiation can be justified rigorously.
Lemma 20 (Differentiability of the corrector with respect to ξ). Let T > 0 and assume (A1)-(A2). For any ξ ∈ R m×d and any parameter fieldω, let φ T ξ denote the unique solution in H 1 uloc (R d ; R m ) to the localized corrector equation (12a). Denote by φ T ξ,Ξ the unique solution in H 1 uloc (R d ; R m ) to the localized linearized corrector equation (18a). Then the map ξ → φ T ξ -as a map R m×d → H 1 uloc (R d ; R m ) -is differentiable with respect to ξ in the Frechét sense with Similarly, denoting by σ T ξ and σ T ξ,Ξ the unique solutions in H 1 uloc to the PDEs (12c) and (18c), the map ξ → σ T ξ -as a map In order to establish appropriate estimates on the linearized corrector φ T ξ,Ξ , we again start by deriving estimates for linear functionals of the corrector gradient.
Then the random field r * ,T,ξ,Ξ is stationary and allows for the following estimates.
a) Let 2 < p < 2 + c (where c = c(d, m, λ, Λ) > 0 will be defined in the proof below), let x 0 ∈ R d and r ≥ ε, and let F be a functional of the form Then there exists an exponent δ = δ(d, m, λ, Λ) > 0 (from hole-filling) such that the stochastic moments of the functional F may be estimated by for any 0 < τ ≤ c = c(d, m, λ, Λ) and any q ≥ C. Here, the constant C may depend on d, m, λ, Λ, ρ, ν, K mass , p, and τ . b) Let 2 < p < 2 + c, let x 0 ∈ R d and r ≥ ε, and let F σ be a functional of the form Then the stochastic moments of the functional F σ may be estimated by for any 0 < τ ≤ c(d, m, λ, Λ) and any q ≥ C. c) Suppose that d ≥ 3. Let 2 < p < 2 + c, let x 0 ∈ R d and r ≥ ε, and let F θ be a functional of the form Then the stochastic moments of the functional F θ may be estimated by for any 0 < τ ≤ c(d, m, λ, Λ) and any q ≥ C.
We also derive estimates on the averages of the linearized correctors that are analogous to the ones of Lemma 18.

Lemma 22.
Given the assumptions of Lemma 21, for any 0 < τ ≤ c(d, m, λ, Λ), any x 0 ∈ R d , any r ≥ ε, and any R ≥ r we have the estimate .
We then establish moment bounds on the minimal radius r * ,T,ξ,Ξ .

Lemma 23 (Moment bound on the minimimal radius of the linearized equation).
Given the assumptions of Lemma 21, the minimal radius r * ,T,ξ,Ξ has stretched exponential moments in the sense for some constant C = C(d, m, λ, Λ, ρ, ν).
Note that under a different decorrelation assumption -namely, finite range of dependence as opposed to a spectral gap inequality -stochastic moment bounds for (basically) the quantity r * ,∞,ξ,Ξ have already been established in [2]. The estimates in [2] even achieve optimal stochastic integrability.
A necessary ingredient for the sensitivity estimates for functionals of the linearized corrector φ T ξ,Ξ -as derived in Lemma 21 -is the following regularity estimate, which is a consequence of our corrector estimates for the nonlinear problem and the small-scale regularity condition (R).
holds for any x ∈ R d .
Proof. We use the assumption of small-scale regularity (R) to yield by Proposition 41 Using (89) to estimate the right-hand side in this equation, we obtain the desired bound.
The following result converts estimates on linear functionals of the gradient of a random field and estimates on the gradient of the random field into L p -estimates for the random field.
Lemma 25 (Estimate on the L p norm by a multiscale decomposition). Let γ > 0, ε > 0, m ≥ 2, and K ≥ 0. Let u = u(ω ε , x) be a random function subject to the estimates and for all r ≥ 2ε, all x 0 ∈ R d , and all vector fields g : Then estimates of the form hold for all r ≥ ε and all 2 ≤ p ≤ 2d (d−2)+ (as long as p < ∞), the constant C 3 depending on γ and p, but being independent of m.
Proof. The L 2 -version of the statement is shown e. g. in [13,Lemma 12]. The L p version is proven analogously.
Finally, note that our main results rely on estimates for the correctors φ ξ and σ ξ (and not the localized approximations φ T ξ and σ T ξ ). While all of our bounds on the localized correctors φ T ξ and σ T ξ are uniform with respect to the parameter T ≥ 2ε 2 , it remains to justify the passage to the limit T → ∞.

3.3.
Quantitative two-scale approximation. The estimate on the homogenization error invokes a quantitative two-scale expansion of the homogenized equation. As indicated earlier, for technical reasons, we do not work with the usual expansion u hom + ∇φ ∇u hom , but consider the (easier) case of a piecewise affine approximation 1 of the form where {η k } denotes a partition of unity subordinate to a uniform cover of R d on a scale δ ε, and {ξ k } denotes the associated piecewise-constant approximation of ∇u hom . Depending on our application, we will choose either δ := ε or δ := ε 1/2 .
For any givenū ∈ H 2 (O) and any k ∈ K, introduce the weighted average ξ k of ∇ū near k as Define the two-scale expansion forū with a piecewise constant approximation for the gradient ∇ū asû Then the action of the operator −∇ · (A ε (x, ∇·)) onû is related to the action of the operator −∇ · (A hom (∇·)) onū via . Furthermore, the ξ k satisfy the estimates k,k∈K:|k−k|≤dδ for any p ≥ 2.
3.4. The approximation of the homogenized operator by periodic representative volumes. In this section we outline the general strategy for the proof of the a priori estimates for the RVE approximation of the effective material law A hom stated in Theorem 9. It is inspired by [30], where the first optimal result for periodic RVEs in a linear discrete setting has been established. In the following we focus on the situation where the small-scale regularity assumption (R) is satisfied, since in that case we can obtain better rates. The argument features additional subtleties (even compared to the linear case of [30]). We start with the observation that the total approximation error decomposes into a random and a systematic part The random error (the first term on the RHS) is a random variable with vanishing expectation and corresponds to the fluctuations of the periodic RVE approximation around its expected value. Theorem 9a asserts that this error decays with the same rate as the fluctuations of a linear average of the random parameter field on scale L, i. e. like ( L ε ) − d 2 . The second term on the RHS is the systematic error. It captures the error coming from approximating the whole-space law P by the L-periodic law P L . As in [30] we decompose the systematic error into different contributions. In particular, we introduce the following notion of localized RVE approximation.
Definition 28 (Localized RVE approximation). Let A satisfy (A1)-(A3). Let T ≥ 2ε 2 , let η be a non-negative weight η : R d → R with´R d η dx = 1, and let ω : R d → H ∩ B 1 be a parameter field. We then define the localized RVE approximation of size L with localization parameter T ∈ [2ε 2 , L] for the effective operator A hom by the expression for any ξ, Ξ ∈ R m×d , where φ T ξ = φ T ξ ( ω) and φ * ,T ξ,Ξ = φ * ,T ξ,Ξ ( ω) denote the localized corrector and the localized linearized adjoint corrector, i. e. φ T ξ is the unique solution in H 1 uloc (R d ; R m ) to the equation (12a) and φ T ξ,Ξ is the unique solution in H 1 uloc (R d ; R m ) to the equation (18a) but with a T ξ replaced by its transpose a T, * ξ . If we choose in the previous definition a random field ω ε with some probability distribution P subject to (P1)-(P2), the localized RVE approximation A RVE,η,T (ξ) converges almost surely for T → ∞ to the effective material law A hom (ξ). To see this, one may e. g. use ergodicity and stationarity as well as the sublinear growth of correctors. However, in contrast to the periodic RVE approximation A RVE,L to the effective material law or the homogenized material law A hom itself, the localized RVE can be defined for all parameter fields ω, since it only invokes the localized correctors. This allows to couple parameter fields sampled with P and P L , respectively. More precisely, with the restriction map we note that if P L is an L-periodic approximation of P in the sense of Definition 8 and if ω ε,L and ω ε denote random fields distributed according to P L respectively P, then the equality of laws π L ω ε ∼ π L ω ε,L holds. Hence, if the weight η of the localized RVE is supported in B L 4 and P L is a L-periodic approximation of P, then This identity couples localized approximations for the homogenized material law associated with P and P L . Motivated by this we decompose the systematic error as (50) The differences in the second and third row capture the error coming from replacing ω ε with π L ω ε . As our next lemma shows, this error becomes small upon choosing the localization parameter T suitably. At the heart of the proof of the lemma is the exponential locality of the localized corrector equations (12a) and (18c), see Lemma 36.
Lemma 29 (Estimate for the coupling error). Let A : R d × R m×d → R m×d be a monotone operator subject to conditions (A1)-(A2) and subject to the Lipschitz estimate for ∂ ξ A as in (R). Let L ≥ √ T ≥ 2ε and let η L denote a non-negative weight supported in B L 8 with |η L | ≤ C(d)L −d and |∇η| ≤ C(d)L −d−1 . Then there exist q = q(d, m, λ, Λ), γ = γ(d, m, λ, Λ), and C = C(d, m, λ, Λ) such that for all parameter fields ω and all ξ, Ξ ∈ R m×d we have and where X L,T ⊂ B L 8 denotes an arbitrary finite set with cardinality #X L, The differences in the first and last row of the right-hand side in (50) are the systematic localization errors, which originate from the localization with parameter T . The systematic localization error can be estimated as follows.
Proposition 30 (Systematic error of localized RVE). Let A : H × R m×d → R m×d satisfy (A1)-(A3). Let P be stationary in the sense of assumption (P1). Then the following holds for all ξ ∈ R m×d : (a) E A RVE,η,T (ξ) as defined in (47) is independent of the weight η.
(b) Suppose that P satisfies a spectral gap estimate in the sense of assumption (P2).
Assume furthermore that the small-scale regularity condition (R) holds. Then for all T ≥ 2ε 2 and all ξ, Ξ ∈ R m×d we have for d = 2 and d = 4, 1 for d = 3, d = 1, and d ≥ 5 with a constant C = C(d, m, λ, Λ, ρ, ν). (c) If P concentrates on L-periodic parameter fields and satisfies a periodic spectral gap estimate in the sense of Definition 8b as well as the regularity condition (R), then we have for all T ∈ [2ε 2 , L 2 ] and all ξ, for d = 2 and d = 4, 1 for d = 3, d = 1, and d ≥ 5.
In order to prove this result, we need to quantify the systematic error on the level of the correctors. Note that this estimate is slightly pessimistic (by the logarithmic factor) for d = 2 and d = 4. Moreover, note that for d ≥ 5 the estimate saturates, an effect that is also observed in the stochastic homogenization of linear elliptic PDEs, see [30].
Lemma 31 (Localization error in the corrector). Let A satisfy the assumptions (A1)-(A3) and let P satisfy the assumptions (P1)-P(2). Then for all T ≥ 2ε 2 and all x 0 ∈ R d we have for d = 3, d = 1, and d ≥ 5.
Here C denotes a random constant as in Proposition 14.
With help of Meyers' estimate we may upgrade the previous estimate to an L p bound.
We also require control of the localization error for the linearized corrector.

4.1.
Ingredients from regularity theory. Our estimates crucially rely on three basic regularity estimates for elliptic PDEs, the first two being the Caccioppoli inequality and the hole-filling estimate for nonlinear elliptic equations (and systems) with monotone nonlinearity and the last one being a weighted Meyers' estimate for linear elliptic equations (and systems). We first state the Caccioppoli inequality and the hole-filling estimates in the nonlinear setting. The (standard) proofs are provided in Appendix A.
Lemma 34 (Caccioppoli inequality and hole-filling estimate for monotone systems). Let A(x, ξ) be a monotone operator subject to the assumptions (A1)-(A2). Let 0 < T ≤ ∞ and let u be a solution to the system of PDEs Then there exist constants C > 0 and δ > 0 depending only on d, m, λ, and Λ with the following property: For any R, r > 0 with R ≥ r we have the Caccioppoli inequality for any b ∈ R m and the hole-filling estimatê Br(x0) We next state a weighted Meyers-type estimate for linear uniformly elliptic equations and systems. Its proof (which is provided in Appendix C) relies on the usual Meyers estimate, along with a duality argument and a hole-filling estimate for the adjoint operator. The details are provided in [13] for the case T = ∞; however, the proof applies verbatim to the case T > 0, as the only ingredients are the Meyers' estimate for the PDE and the hole-filling estimate for the adjoint PDE.

Lemma 35 (A weighted Meyers estimate for linear elliptic systems).
Let a : R d → (R m×d ) 2 be a uniformly elliptic and bounded coefficient field with ellipticity and boundedness constants λ and Λ. Let r > 0 be arbitrary.
The localization ansatz for the correctors relies crucially on the following elementary deterministic energy estimate with exponential localization. As the proof is short and elementary, we directly provide it here.

Lemma 36 (Exponential localization). Suppose that
Suppose that u, f , and F have at most polynomial growth in the sense that Then for 0 < γ ≤ c(d, m, λ, Λ) we havê Proof. Set η(x) := exp(−γ|x|/L). We test the equation with uη (which can be justified by approximation thanks to the polynomial growth assumption). By an integration by parts, the ellipticity and Lipschitz continuity of A, and using |∇η| ≤ The claim now follows for γ ≤ c by absorbing the terms with u and ∇u on the right-hand side into the left-hand side with help of Young's inequality.
Remark 37. We frequently apply the exponential localization in the following form: Suppose that A : R d × R m×d → R m×d is a monotone operator satisfying (A1) and (A2). Let T > 0 and L ≥ √ T . Consider u 1 , u 2 ∈ H 1 loc (R d ; R m ) and f ∈ L 2 loc (R d ; R m ), F ∈ L 2 loc (R d ; R m×d ), all with polynomial growth and related by A is a monotone operator satisfying (A1) and (A2), the derivative ∂ ξ A(x, ξ) is a uniformly elliptic matrix field; hence, a(x) is a uniformly elliptic coefficient field and the claimed estimate follows from Lemma 36.

4.2.
The convergence rate of the solutions. We first provide the proof of the error estimate for ||u ε − u hom || L 2 . It is based on a two-scale expansion with a piecewise constant approximation for the slope of the limiting solution u hom , whose approximation properties are stated in Proposition 27.
Proof of Proposition 27. Note that a simple application of the Poincaré inequality implies an estimate on the local approximation error ∇ū − ξ k and on the differences ξ k − ξk between nearby lattice sites k ∈ K andk ∈ K ∩ (k + [−δ, δ] d ) of the form for each k ∈ K. From this estimate and a straightforward estimate, the bounds (46) are immediate. Furthermore, this enables us to relate the action ∇ · (A ε (x, ∇û)) of the monotone operator A ε on the two-scale expansionû to the action ∇ · (A hom (ū)) of the homogenized operator A hom on the (sufficiently smooth) functionū. To see this, we first compute by adding zero Adding zero several times and using the fact that the η k form a decomposition of unity (i. e. k η k = 1), we get This yields using the equation for the flux corrector ∇ · σ ξ k = A ε x, ξ k + ∇φ ξ k − A hom ξ k (see (10)) and the skew-symmetry of σ ξ k (which implies ∇ · (η∇ · σ ξ k ) = −∇ · (σ ξ k ∇η)) ∇ · (A ε (x, ∇û)) = ∇ · A hom ∇ū + ∇ · I + ∇ · II + ∇ · III We next show that this relation may be simplified to ∇ · (A(x, ∇û)) = ∇ · A hom ∇ū + ∇ · R with an error term R estimated bŷ To see this, first observe that´O |I| 2 dx may be bounded by the first term on the right-hand side of (55): Indeed, using (54), the assumption of Lipschitz continuity of A ε in ξ (see (A2)), the Lipschitz continuity of A hom (see Theorem 6a), the fact that δ ≤ c(O) in case O = R d , and the support property supp η k ⊂ k + (−δ, δ) d , one verifies that´O |I| 2 dx is of the desired order. To see that the term´O |II| 2 dx may be bounded by the second term on the righthand side of (55), one uses the fact that k ∇η k = 0 (as the η k are a partition of unity) along with the estimate |∇η| ≤ Cδ −1 as well as the Lipschitz continuity of . This directly implies the estimate for the contribution of II in (55).
Finally, to estimate the third term III, we first need to bound the difference of two corrector gradients ∇φ ξ k and ∇φ ξ l for nearby grid points k and l with l ∈ k + [−δ, δ] d . To this aim, we compute using the corrector equation (4) −∇ · (A ε (x, ξ k + ∇φ ξ k ) − A ε (x, ξ l + ∇φ ξ l )) = 0 which yields with an energy estimate using the assumptions (A1) and (A2) In conjunction with the Lipschitz continuity of A ε in ξ (see (A2)) and (54), this entails that´O |III| 2 dx may be bounded by the right-hand side of (55). We then observe that (55) directly implies the desired bound (45).
We now establish our main theorems on the estimate of the homogenization error ||u ε − u hom || L p . As the proofs are highly similar, we combine the proofs of Theorem 2, Theorem 3, and Theorem 4.

Proof of Theorem 2, Theorem 3, and Theorem 4.
Step 1: Estimates on u hom . We intend to derive the estimates on the homogenization error by making use of the error bound for the two-scale expansion from Proposition 27. For this reason, we need an estimate on the second derivative of u hom . We first treat the case d ≥ 3. Differentiating the effective equation −∇ · (A hom (∇u hom )) = f with respect to the spatial coordinate x i , we deduce by adding zero (using the convention u Dir = 0 in case of the full space O in the full space R d respectively in the interior of the domain O. Note that the coefficient field a(x) := ∂ ξ A hom (∇u hom (x)) is uniformly elliptic and bounded by the structure properties of A hom stated in Theorem 6a. An energy estimate for the previous equation therefore yields for any τ > 0 Note that while this argument requires local H 2 regularity of u hom , it can easily be justified for u hom ∈ H 1 by the difference quotient method (yielding u hom ∈ H 2 locally as a result). By the usual argument -a local change of coordinates, differentiation at first only in tangential directions, and using the equation itself to estimate the second derivative in normal direction -the estimate can also be easily extended up to the boundary of O, with a domain-dependent constant. In total, we obtainˆO Furthermore, testing the homogenized PDE (2) with u hom − u Dir and using the Sobolev embedding as well as Young's inequality yields (at least for d ≥ 3) the boundˆO In the cases d = 2 and d = 1, the estimate (58) is also valid, the proof being entirely analogous. Due to the massive term, for d = 2 and d = 1 a weighted energy estimate yields the bound (59b) Step 2: Proof of Theorem 4. In order to establish Theorem 4, we apply Proposition 27 with δ := ε 1/2 in case d ≥ 3 and δ := ε 1/2 | log ε| 1/4 in case d = 2 as well asū := u hom to deduce with R subject to the bound in (45). From Corollary 15 we deduce the estimate for any r ≥ ε, any x 0 ∈ O, and any ξ ∈ R m×d . Plugging in this bound as well as (58) and (59a) into (45), we see that we have the boundŝ with a random constant C with uniformly bounded stretched exponential moments in the sense E[exp(C 1/C(d,m,λ,Λ,ρ /C(O))] ≤ 2.

This implies by the definition ofû
We next observe that by the last estimate in (46), a trace-type inequality, and the estimates (59a) respectively (59b) we have k∈K∩∂O+B 2dδ ). Plugging in this estimate, the estimate on φ ξ from (61), the Caccioppoli inequality (51) for φ ξ , as well as the bound (62) into the previous estimate, we deduce upon choosing µ := ε 1/2 in case d ≥ 3 and µ := ε 1/2 | log ε| 1/4 Using the Poincaré inequality, we finally deduce Using again the estimate on φ ξ from (61), the L 2 estimate for the ξ k in (46), and the energy estimate (59a), we obtain the statement of Theorem 4.
Step 3: Proof of Theorem 2 and Theorem 3. We will crucially use the improved estimate on the difference of correctors φ ξ k − φ ξk and σ ξ k − σ ξk provided by Corollary 15; however, as these estimates grow with a factor of (1+|ξ 1 | C +|ξ 2 | C ), we first need an L ∞ estimate for the gradient ∇u hom . In the scalar case, this may be obtained by using Moser iteration for the differentiated equation (56), while in the vector-valued case we have to appeal to the Uhlenbeck structure of the limiting equation (and hence, in essence, use the fact that |∇u hom | 2 is a subsolution to a suitable elliptic PDE, which allows one to apply Moser iteration again). In the scalar setting m = 1, by applying [28,Theorem 8.15] to (56) we obtain the estimate for p > d and all i. In the vector-valued case m ≥ 2, we may use the assumption (see (R)) that the effective operator A hom is of Uhlenbeck structure; hence, |∇u hom | 2 is the subsolution of an elliptic PDE (see [41]) and we obtain an analogous result. Finally, in the case d ≤ 2 we may apply Meyers' estimate (Lemma 35) to obtain an L p estimate for ∇ 2 u hom for some p > 2 ≥ d and therefore a uniform bound on ∇u hom of the form (63) by Morrey's embedding.
An energy estimate for the PDEs (65) and (66) followed by the Sobolev embedding now yields in case d ≥ 3 while for d = 2 we directly obtain from the energy estimate Using again the estimate on φ ξ from Corollary 15, the bound for the ξ k from (46), and the energy estimate (59a), we obtain with p := 2d d−2 in case d ≥ 3 and p := 2 Combining the previous three estimates, we obtain the statements of Theorem 2 and Theorem 3.
Proof of Lemma 26. By (12a), the difference φ 2T ξ − φ T ξ satisfies the equation with the -by (A1) and (A2) uniformly elliptic and bounded -coefficient fieldâ defined byâ By Lemma 36, we deducê Taking the expected value and plugging in the estimates (21) and (22a), we deduce by stationarity Thus, ∇φ T ξ converges in L 2 (Ω × B r ) to a stationary gradient field Θ with finite second moments and zero expectation. It may thus be represented in the form Θ = ∇ϕ, where ϕ is potentially non-stationary, but satisfies´B ε(0) ϕ dx = 0. By passing to the limit in the localized corrector equation (12a), we see that ϕ is a solution to (4) in the sense of Definition 1. By uniqueness, we conclude that ϕ = φ ξ .
We next treat σ ξ . Taking the difference of the PDE (12c) for T and 2T , we obtain where by slight abuse of notation we have written ∇ × G to abbreviate ∂ j (G · e k ) − ∂ k (G · e j ). Testing this equation with σ ξ,jk exp(−γ|x|/ √ T ), taking the expectation, and using stationarity as well as (A2) and (68), we obtain The conclusion now follows by an argument analogous to the argument for φ T ξ . We finally note that for d ≥ 3 the argument extends to φ T ξ and σ T ξ itself.

4.3.
Estimates on the random fluctuations of the RVE approximation for the effective material law. We next establish the estimates on fluctuations of the representative volume approximation for the effective material law stated in Theorem 9a.
Introducing the unique L-periodic solution h with vanishing mean to the PDE we deduce by testing (71) with φ ξ and testing (70) with h This establishes by (A3) which yields by the q-th moment version of the spectral gap inequality in Lemma 16 . By Hölder's inequality, we infer for any p > 2 . Bounding the last integral by C|Ξ| by Meyers' estimate for (71) (see Lemma 35) and using the estimate (88), we obtain Using stationarity of r * ,L,ξ and the moment bound of Lemma 19 (which we prove explicitly for the probability distribution P, but which may be established for P L analogously; furthermore, while the estimates of Lemma 19 are stated for finite T < ∞, they are uniform in T ≥ ε 2 and therefore also hold in the limit T → ∞), we deduce for q large enough This is the assertion of Theorem 9a.

4.4.
Estimates for the error introduced by localization. We next establish the estimates from Lemma 31, Corollary 32, and Lemma 33 for the error introduced in the correctors by the exponential localization on scale √ T via the massive term. We then prove Proposition 30, which estimates the systematic error in the approximation for the effective coefficient E[A RVE,η,T ] introduced by the finite localization parameter T < ∞.
Proof of Lemma 31. We will use the exponential weight η(x) := exp(−γ|x|/ √ T ) with 0 < γ 1. Note that By the localized corrector equation (12a) we have Testing with (φ 2T ξ − φ T ξ )η and using the monotonicity and Lipschitz continuity of A (see (A1)-(A2)) as well as (72), we get Choosing γ ≤ c(d, m, λ, Λ), we may absorb the second term on the RHS into the LHS to obtain In the following, we treat dimensions d ≤ 2 and d ≥ 3 separately. In the case d ≤ 2, by Young's inequality and absorption the previous inequality yieldŝ Note that Proposition 13 (in connection with a dyadic decomposition of R 2 into the ball B √ T (0) and the annuli Since η ≥ exp(−1) on B √ T , the claimed estimate follows for d ≤ 2. In the case d ≥ 3 we can use in (73) the representation φ T ξ = ∇ · (θ T ξ − b) for any b ∈ R m×d (see (19b)). We obtain by an integration by parts and by the Cauchy-Schwarz inequality the estimate 1 With Young's inequality we may absorb the second factor into the LHS of (73).
We thus obtain By appealing to Proposition 13 (in connection with a dyadic decomposition of R d ), and the fact that η ≥ exp(−1) on B √ T , the claimed estimate follows for d ≥ 3.
Proof of Corollary 32. Set u := φ 2T ξ −φ T ξ and note that withâ(ξ) : Applying Lemma 44 to this PDE -upon rewriting the right-hand side using (19b) in case d ≥ 3 -we obtain This implies by Proposition 13 and Lemma 31 Taking the p-th stochastic moment and using stationarity, we conclude.
Step 1: Proof of (a). This is a direct consequence of stationarity of P (see assumption (P1)) and stationarity of the random field A(ω ε , ξ + ∇φ T ξ ) · Ξ − 1 T φ T ξ φ * ,T ξ,Ξ , the latter of which is a consequence of the former.
Step 2: Proof of (b). First note that it suffices to prove for any T ≥ ε 2 the estimate (75) Indeed, the claimed estimate then follows by rewriting the systematic localization error as a telescopic sum, which holds since lim T →∞ E A RVE,η,T (ξ) · Ξ = A hom · Ξ P-almost surely.
We present the argument for (75). In view of (a), we may assume without loss of generality that the weight η satisfies Let φ * ,T ξ,Ξ denote the localized, linearized, adjoint corrector (i. e. the T -localized homogenization corrector associated with the linear elliptic PDE with coefficient field (a T ξ ) * ). The localized corrector equation (12a) yields (76) Combined with the definition of the localized RVE approximation in Definition 28, we get We subtract the linearized corrector equation (18a) in its form for the adjoint coefficient field and the corresponding corrector where a T ξ := ∂ ξ A(ξ + ∇φ T ξ ). We get (77) We take the expectation of this identity and note that the expectation of the second integral on the right-hand side vanishes: Indeed, since it is of the form E ´R d B∇η where B is a stationary random field and η is compactly supported, Moreover, for the first term on the RHS of (77) we appeal to the uniform bound on ∂ 2 ξ A from assumption (R) in form of (recall that a T ξ = ∂ ξ A(ω ε , ξ + ∇φ T ξ )) We thus get To estimate I 1 we first apply Hölder's inequality with exponents p and p p−1 with 0 < p − 1 1, and then appeal to Corollary 32 and the moment bound on the linearized corrector from (42) as well as (23) to deduce Regarding I 2 we distinguish the cases d ≤ 2 and d ≥ 3. In the case d ≤ 2, we apply Lemma 33, Lemma 31, as well as Proposition 13 and Proposition 14 to obtain and in case d = 1 we proceed similarly.
In the case d ≥ 3 we appeal to the representation of φ 2T ξ and φ * ,T ξ,Ξ as ∇ · θ 2T ξ and ∇ · θ * ,2T ξ,Ξ by (19b) and (20b), respectively. To shorten the notation, in the following we assume without loss of generality that ffl An integration by parts thus yields With the properties of η (in particular, |∇η| √ T −d−1 and supp η ⊂ B √ T ), by the Cauchy-Schwarz inequality, and by stationarity of the localized correctors, we get By appealing to Proposition 13 and Lemma 33 for the first term and to Proposition 14 and Lemma 31 for the second term, we obtain Plugging in the estimates on I 1 and I 2 into (78), this establishes the estimate on the localization error for the representative volume element method for P.
Step 3: Proof of (c). For the periodized probability distribution P L , one may proceed analogously to (b), deriving an error estimate on

Coupling error for RVEs.
Proof of Lemma 29. To shorten the presentation we set ω := π L ω and similarly mark quantities that are associated with ω, i.e., . Moreover, we shall use the following notation for the differenceŝ (we use the symbolδ to distinguish the quantity from the sensitivities considered in Sections 5 and 6). In the proof we make use of the exponential test-functions where 0 < γ 1 is chosen such that the exponential localization estimate Lemma 36 applies.
Step 1. Estimate forδφ T ξ . We claim that (79) sup Indeed, by subtracting the equations for φ T ξ and φ T ξ , we find that . By the exponentially localized energy estimate of Lemma 36 we havê By construction F vanishes on B L/4 . Hence, since and sinceω =ω for |x| ≤ L 4 , we obtain where for the last estimate we appealed to the localized energy for φ T ξ , see Lemma 36. We conclude that (79) holds. For further reference, we note that we may similarly derive , the estimate (79), and the bound Note that in the last step of of the last inequality we have again used the Meyers estimate of Lemma 44 together with the localized energy estimate of Lemma 36 and a dyadic decomposition.
Step 2. Estimate forδφ T ξ,Ξ . We claim that there exists q = q(d, m, λ, Λ) such that for all x 0 ∈ B L 8 we have Indeed, by subtracting the equations (18a) for φ T ξ,Ξ and φ T ξ,Ξ , we get Note that By exponential localization in form of Lemma 36, the Lipschitz continuity of ∂ ξ A (see (R)), the fact that F 1 vanishes on B L 4 byω = ω on B L 4 , and the uniform bound on ∂ ξ A from (A2), we get (84) We estimate the first term on the RHS by Hölder's inequality aŝ q,T,x0 . Next, we estimate the second term on the RHS in (84). By Lipschitz continuity of ∂ ξ A in the second variable (see (R)), Hölder's inequality with exponents p/2 and q = p p−2 with 0 < p − 2 1, Meyers' estimate in form of Lemma 44 for the PDE (80), and (82) from Step 1, we get q,T,x0 . This completes the argument for (83).
Step 3. Conclusion. Set Since η L is supported in B L 8 and η L ≤ L −d , we have We cover B L By the definition of ζ, the estimates in (79) and (83) from Step 1 and Step 2, and the deterministic exponentially localized bounds on 1 √ T φ T ξ and 1 √ T φ * ,T ξ,Ξ (which are a consequence of Lemma 36), and the Lipschitz continuity of A with respect to the second variable (see (A2)), we conclude for In total, we have shown the desired deterministic estimate

4.6.
Estimate on the systematic error of the RVE method. We now estimate the systematic error of the RVE approximation for the effective material law. We begin with the case in the presence of the regularity condition (R).
Proof of Theorem 9b -the case with (R). By rescaling we may assume without loss of generality that ε = 1. In the following η : R d → R denotes a non-negative weight supported in B L 8 with |η| ≤ C(d)L −d and´R d η dx = 1. Moreover, we consider a localization parameter T according to for the γ = γ(d, m, λ, Λ) from Lemma 29. Our starting point is the error decomposition (50), which we recall in the following form Note that in the above decomposition we already used the equality which is valid since P L is assumed to be a L-periodic approximation of P in the sense of Definition 8 (recall also (48)). The terms I 2 and I 3 are coupling errors that can be estimated deterministically with help of Lemma 29. Combined with the choice of T in (85) and the bound on high moments of ∇φ T ξ,Ξ obtained by combining (42) and (23), we arrive at .
The terms I 1 and I 4 are systematic localization errors that can be estimated with help of Proposition 30b. We obtain using again (85) Having estimated all terms in (86), this establishes the first estimate in Theorem 9b upon taking the supremum with respect to Ξ, |Ξ| ≤ 1.
We next establish the suboptimal estimate for the systematic error of the RVE method in the case without the small-scale regularity condition (R).
Proof of Theorem 9b -the case without (R). As in the case with small scale regularity (R), we denote by η : R d → R a non-negative weight supported in B L 8 with |η| ≤ C(d)L −d and´R d η = 1. Moreover, we consider a localization parameter √ T ≤ L, whose relative scaling with respect to L will be specified below in Step 3. For any parameter field ω we consider the localized RVE-approximation A RVE,η,T ( ω, ξ) :=ˆηA( ω, ξ + ∇φ T ξ ) dx.
Note that it has a simpler form compared to the quantity introduced in Definition 28. In particular, the above expression does not invoke the linearized corrector (for which we cannot derive suitable estimates without the small scale regularity condition (R)). As in the case with small scale regularity, the starting point is estimate (86), i. e. the decomposition of the systematic error into the two coupling errors and the two systematic localization errors Step 1. Estimate of the coupling errors. We claim that Indeed, this can be seen by an argument similar to the proof of Lemma 29. In fact the argument is significantly simpler thanks to the absence of the linearized corrector in the definition of the localized RVE-approximation. We only discuss the argument for I 3 , since the one for I 2 is analogous. We first note that thanks to the assumptions on η and the Lipschitz continuity of A, see (A2), we have whereδφ T ξ is defined by (80). As in Section 4.6 we consider a minimal cover of B L 8 by balls of radius √ T ; more precisely, let X L,T ⊂ B L 8 denote a finite set of points Then where 0 < γ 1 is chosen such that the exponential localization estimate of Lemma 36 applies. Combining the estimate with (79) thus yields Step 2. Estimate of the systematic localization errors. We claim that for d ∈ {2, 4}, 1 for d = 3 and d ≥ 5.
We only discuss I 4 , since the argument for I 1 is similar. The Lipschitz continuity of A (see (A2)) and the localization error estimate for the corrector in form of Lemma 31 yield for d ∈ {2, 4}, 1 for d = 3 and d ≥ 5.
The claimed estimate now follows by a telescopic sum argument similar to Step 2 of the proof of Proposition 30.
Step 3. Conclusion. The combination of the previous steps yields This establishes the result.

5.
Corrector estimates for the nonlinear PDE

5.1.
Estimates on linear functionals of the corrector and the flux corrector. We now turn to the first key step, the estimate on the localized homogenization corrector φ T ξ and the localized flux corrector σ T ξ . For the proof of the estimate on linear functionals of the corrector and the flux corrector in Lemma 17, we need the following auxiliary lemma, which follows by a combination of the Caccioppoli inequality with hole-filling.
Lemma 38. Let the assumptions (A1)-(A2) be satisfied, let ε > 0 be arbitrary, letω be an arbitrary parameter field, let the correctors φ T ξ and φ T ξ,Ξ be defined as the unique solutions to the corrector equations (12a) and (18a) in H 1 uloc , and let r * ,T,ξ and r * ,T,ξ,Ξ be as in (36) and (39). Let the parameter K mass in (36), (39) be chosen as K mass ≥ C(d, m, λ, Λ) for some C determined in the proof below. Then for any x 0 ∈ R d the estimates and hold true. Furthermore, we have Proof. The estimates (90) and (91) are a simple consequence of the definitions (36) and (39), the assumed lower bound on K mass , and the bounds valid for all R ≥ √ T which are a consequence of Lemma 36 applied to the corrector equations (12a) and (18a).
We now prove the estimate in Lemma 17 for functionals of the localized corrector φ T ξ for the nonlinear elliptic PDE and the corresponding localized flux corrector σ T ξ . As the proof of Lemma 18 is very similar, we combine their proofs.
Proof of Lemma 17 and Lemma 18. Part a: Estimates for linear functionals of the homogenization corrector φ T ξ . Taking the derivative of the corrector equation (12a) with respect to a perturbation δω ε of the random field ω ε , we see that the change δφ T ξ of the corrector under such an infinitesimal perturbation satisfies the linear elliptic PDE Define the coefficient field a T ξ (x) := ∂ ξ A(ω ε (x), ξ + ∇φ T ξ (x)). By our assumptions (A1) and (A2), the coefficient field a T ξ : R d → (R m×d ) 2 exists and is uniformly elliptic and bounded in the sense a T ξ v · v ≥ λ|v| 2 and |a T ξ v| ≤ Λ|v| for any v ∈ R m×d . Now, consider a functional of the form F :=´R d g · ∇φ T ξ dx for a deterministic compactly supported function g. Denoting by h ∈ H 1 (R d ; R m ) the (unique) solution to the dual equation i. e. we have ∂F ∂ω ε = ∂ ω A ω ε , ξ + ∇φ T ξ · ∇h.
Applying the weighted Meyers estimate of Lemma 35 to the equation (93) and using our assumption ( ffl Br(x0) |g| p dx) 1/p ≤ r −d , the integral in the last term may be estimated to yield for any 2 < p < 2 + c, any 0 < τ < 1, and any q ≥ C(d, m, λ, Λ, p, τ ). This is the desired estimate in Lemma 17a. Next, we establish (38). To this end, we consider the random variable F := ffl B R (x0) φ T ξ dx. By E[φ T ξ ] = 0 -which follows by testing the PDE (12a) with a test function, taking the expectation, and using stationarity -we see that we have E[F ] = 0. We may therefore repeat the previous computation to estimate the stochastic moments of F , up to the following changes: The equation (93) is replaced by the equation with non-divergence form right-hand side −∇ · (a T, * ξ ∇h) , and the estimate for ∇h deduced from Lemma 35 now reads In total, we deduce . This is precisely the desired estimate on the average of φ T ξ from (38). Part b: Estimates for linear functionals of the flux corrector σ T ξ . Taking the derivative of the flux corrector equation (12c) with respect to a perturbation δω ε of the random field ω ε , we see that the change δσ T ξ of the corrector under such an infinitesimal perturbation satisfies the linear elliptic PDE . For the sensitivity of the functional we obtain by definingh ∈ H 1 (R d ; R m×d ) as the unique decaying solution to the (system of) Poisson equation(s) with massive term and definingĥ ∈ H 1 (R d ; R m×d ) as the unique decaying solution to the uniformly elliptic PDE (with a T ξ (x) : i. e. we have ∂F ∂ω ε = (∂ jh e k − ∂ kh e j ) · ∂ ω A(ω ε , ξ + ∇φ T ξ ) + ∇ĥ · ∂ ω A(ω ε , ξ + ∇φ T ξ ).
Using again the version of the spectral gap inequality for the q-th moment from Lemma 16, the fact that E[F σ ] =´R d g · ∇E[σ T ξ,jk ] dx = 0 by stationarity, and the bound |∂ ω A(ω, ξ)| ≤ Λ|ξ| (recall (A1)-(A2)), we obtain . By (88), we have By a combination of Hölder's and Jensen's inequality in form of Lemma 39 (see below) we deduce that for any α 0 ∈ (0, c), any p ∈ (2, 2 + c(d, m, λ, Λ, α 0 )), any τ ∈ (0, 1), and any q ≥ q(d, m, λ, Λ, α 0 , p) . By the weighted Meyers estimate of Lemma 35 and the defining equation (98) for h as well as the uniform bound on ∂ ξ A inferred from (A2), we deduce for any Applying Lemma 35 to the equation (97) and using ( ffl Br(x0) |g| p dx) 1/p ≤ r −d , the integral in the last term may be estimated to yield . This is the desired estimate in Lemma 17b. As in Part a, the estimate on the average of σ T ξ from (38) follows upon replacing the equation forh by the PDE −∆h Part c: Estimates on linear functionals of the potential θ T ξ,i . Taking the derivative of the equation (19a) with respect to a perturbation δω ε of the random field ω ε , we get Definingḡ ∈ H 1 (R d ; R m ) as the unique decaying solution to the PDE we obtain In other words, we are back in the situation of Part a, the only difference being that the function g has been replaced by the functionḡe i . The functionḡ now no longer has compact support; rather, being the solution to the Poisson equation (100) in d ≥ 3 dimensions, for any 0 < α 1 < 1 and any p ≥ 2 it satisfies the weighted Calderon-Zygmund estimate . Now, we may follow the estimates from Part a leading to (95) line by line. In the next step, we again employ Lemma 35, which by the previous estimate onḡ gives . This is our desired estimate. Part d: Proof of the estimate (37). Let w ∈ H 1 (R d ) be arbitrary. Assuming for the moment R = 2 N r for some N ∈ N, we may write where we have used the orthogonality of the functions f n . Solving the PDEs ∆v = f n on B 2 n r (x 0 ) with Neumann boundary conditions, we see that g n := ∇vχ B 2 n r (x0) solves ∇ · g n = f n on R d and satisfies for any p ≥ 2 the estimate ( ffl B 2 n r (x0) |g n | p dx) 1/p ≤ C(2 n r) 1−d . This implies Combining this estimate with the bounds on g n and the estimates from Lemma 17, we deduce the estimate (37) if R is of the form R = 2 N r. If R is not of this form, we need one additional step to estimate the difference of the averages on the radius 2 log 2 R r r and R.
The following estimate is a consequence of a straightforward application of Hölder's inequality and Jensen's inequality. Nevertheless, we state it to avoid repetition of the same computations, as it is repeatedly used in the proofs of Lemma 17 and Lemma 21.
Lemma 39. Let ε > 0 and let b and B be stationary random fields with Let h be a random field satisfying ∇h ∈ L p (R d ) almost surely. Let 0 < α 0 ≤ 1 and 0 < τ < 1. Let p > 2 be close enough to 2 depending only on d and on (a lower bound on) α 0 , but otherwise arbitrary. Then for any q ≥ C(p, α 0 , τ ) and for any holds true.
Proof of Lemma 39. Several applications of Hölder's inequality yield together with the bound on b . This implies using Hölder's inequality with exponents 1 1−τ and 1 τ and Jensen's inequality .
Assuming that q ≥ p/(p − 2) and α 0 /(p − 2) ≥ 2d, we obtain using Jensen's inequality in the first term (note that the integral r −d´R d 1 + |x| r α0/(p−2) dx is bounded by C(d, α 0 , p)) as well as the fact that the supremum and the infimum of the function 1 + |x| r on an ε-ball differ by at most a factor of 2 in the second term . By the stationarity of B, this yields the assertion of the lemma.

5.2.
Estimates on the corrector for the monotone system. We will need the following technical lemma. holds.
Proof. The proof is an elementary consequence of the Fourier series representation of v.
Combining the estimates on linear functionals of the corrector from Lemma 17 with the estimate on the corrector gradient (88) and the technical Lemma 40, we now derive stochastic moment bounds on the minimal radius r * ,T,ξ .
Proof of Lemma 19. In order to obtain a bound for the minimal radius r * ,T,ξ , we derive an estimate on the probability of the event r * ,T,ξ (x 0 ) = R = 2 ε for a fixed x 0 ∈ R d and any R = 2 ε > ε. In the case of this event, we have by the Caccioppoli inequality (51) applied to the function ξ · (x − x 0 ) + φ T ξ , which solves the PDE the definition of r * ,T,ξ (x 0 ) in (36), and the fact that r * ,T,ξ ≤ C √ T (see (90)) Furthermore, in the event r * ,T,ξ (x 0 ) = R > ε we also know by the definition (36) that at least one of the inequalities holds. We now distinguish these two cases.
Case 1: The estimate (102) holds. By Lemma 40, we have for any δ > 0 for a sufficiently large K = K(d, δ) Now let g n,R be the family of all functions (2R) −d d i=1 cos( πki(xi+R)
We now derive the estimates on the corrector φ T ξ , the flux corrector σ T ξ , and the potential field θ T ξ . Note that the only required ingredients for the proof are the estimate for functionals of the corrector which we derived in Lemma 17, the estimate on the corrector gradient given by (88), as well as the general technical results of Lemma 40 and Lemma 25.
Proof of Proposition 13. We insert the estimate on r * ,T,ξ from Lemma 19 into Lemma 17. For any x 0 ∈ R d , any r ≥ ε, and any g with supp g ⊂ B r (x 0 ) and for any q ≥ C(d, m, λ, Λ). Plugging in the estimates on r * ,T,ξ from Lemma 19 into (88), we deduce for any for any q ≥ C(d, m, λ, Λ). Plugging in this bound and (108) into the (spatially rescaled) multiscale estimate for the L 2 norm from Lemma 25, we obtain the bounds on φ T ξ stated in (21) as well as the corresponding result for the L p norm. The estimates on φ T ξ in (22a) and (22b) are shown by combining (21) with the estimate from Lemma 18 and (111) as well as using in case d ≥ 3 the relation In view of this use of Lemma 25, in order to obtain the estimates on σ T ξ,jk and θ T ξ stated in (21) and (23), we only need to establish estimates on ffl Bε(x0) |∇σ T ξij | 2 dx and ffl Bε(x0) |∇θ T ξ | 2 dx. The Caccioppoli inequality (51) for the defining equation of the flux corrector (12c) yields in conjunction with (A2) Estimating the first term on the right-hand side by Lemma 40 and taking stochastic moments, we deduce (where we use the family of functions g n of the form Choosing K large enough and using stationarity, we may absorb the first term on the right-hand side in the left-hand side. Estimating the linear functionals of ∇σ T ξ,jk by (109), bounding the third term on the right-hand side by (111), and estimating the last term by (37) and (38) The estimate for the gradient of the potential field ∇θ T ξ is analogous but even simpler (due to the lack of the massive regularization in (19a)). We obtain the Using this estimate and (109) in Lemma 25, we obtain (23).
6. Corrector estimates for the linearized system 6.1. Estimates on linear functionals of the corrector and the flux corrector for the linearized PDE.
Proof of Lemma 21 and Lemma 22. Part a: Estimates for linear functionals of the homogenization corrector φ T ξ,Ξ . Without loss of generality we may assume in the following argument that x 0 = 0, i. e. g is supported in B r (0), and that ( ffl Br |g| p 2 /2 dx) 2/p 2 ≤ 1 -otherwise replace in the following argument p by √ 2p.
The argument is similar to the case of the corrector φ T ξ . We first observe that the expectation E[F ] vanishes. Indeed, φ T ξ,Ξ is easily seen to be a stationary random field, which entails E[F ] =´R d g·∇E[φ T ξ ] dx = 0. By Lemma 16, to obtain stochastic moment bounds for F it suffices to estimate the sensitivity of F with respect to changes in the random field ω ε . Taking the derivative with respect to ω ε in (18a), we obtain Denoting by h the unique solution in H 1 (R d ; R m ) to the auxiliary PDE (where we again used the abbreviation a T, * ξ := ∂ ξ A(ω ε (x), ξ + ∇φ T ξ )) and denoting byĥ ∈ H 1 (R d ; R m ) the unique solution to the auxiliary PDE In other words, we have the representation By (A3), this implies the sensitivity estimatê Plugging this bound into the version of the spectral gap inequality for the q-th moment (see Lemma 16) and using E[F ] = 0, we deduce for any q ≥ 1 . By Lemma 39 and (89) as well as (88), this entails for any τ,τ ∈ (0, 1) with C = C(d, m, λ, Λ, ρ, α 0 , p, τ,τ ). By the weighted Meyers' estimate in Lemma 35 -applied to (114) -and the uniform bound |∂ 2 ξ A| ≤ Λ from (R), we infer for any with C = C(d, m, λ, Λ, ρ, α 0 , α 1/2 , p, τ,τ ). Inserting the bound (42) into (116), we get where C now additionally depends on ν. To proceed further, we write the second factor on the right-hand side in the form By first smuggling in the weight , via Hölder's inequality with exponents p p−2 and p 2 in space, and next by Hölder's inequality with exponentsτ τ −τ /2 and 2τ τ in probability, we get .
By Jensen's inequality for the integral´R d f ϕ dx -using that ϕ has mass of order unity and assuming also that q ≥ C(p, τ,τ ) -and by exploiting the fact that v 1 is a stationary random field, we deduce that the right-hand side is bounded from above by .
By combining the previous estimates and plugging in the definitions of v 1 , v 2 and ϕ, we thus arrive at Applying once more Hölder's inequality to the first expected value on the righthand side, choosingτ ∈ (0, 1) close enough to 1, choosing τ > 0 small enough, and using the stretched exponential moment bounds for C reg,ξ , we deduce . Plugging in the resulting estimate into (115) and estimating r * ,T,ξ in (115) via Lemma 19 yields Choosing p close enough to 2 and choosing α 0 and α 1 small enough (all depending only on d, m, λ, and Λ), we may estimate the last factor by applying the weighted Meyers' estimate from Lemma 35 to the PDE (113). This yields by our assumed bound on g for any q ≥ C with C = C(d, m, λ, Λ, ρ, ν, τ ) and any τ ≤ c(d, m, λ, Λ).
We then obtain an estimate for F θ of the form (115), but with h solving the PDE −∇ · (a T, * ξ ∇h) + 1 T h = ∇ · (ḡe i ) andĥ solving the same PDE but with the new h. Inserting the Calderon-Zygmund bounds onḡ and the modified bound on ∇ĥ in the steps leading to (116) and (118), we deduce the desired estimate  (40) and (41) is analogous to the proof of (37) and (38).
6.2. Estimates on the corrector. We first establish the moment bounds on r * ,T,ξ,Ξ .
Proof of Lemma 23. The proof is entirely analogous to the proof of Lemma 19: Recall that the only required ingredients for the proof of moment bounds for r * ,T,ξ in Lemma 19 were the estimate for functionals of the corrector φ T ξ from Lemma 17, the estimate on corrector averages from Lemma 18, the estimate on the corrector gradient given by (88), as well as the general technical results of Lemma 40 and Lemma 25. Lemma 21 provides bounds on the stochastic moments of linear functionals of the corrector φ T ξ,Ξ for the linearized PDE that are essentially analogous (up to the prefactor q C (1 + |ξ|) C ) to the bounds for functionals of φ T ξ in Lemma 17. Similarly, Lemma 22 provides estimates on averages of the linearized corrector φ T ξ,Ξ that are (again up to the prefactor q C (1 + |ξ|) C ) analogous to the bounds on averages of φ T ξ from Lemma 18. Furthermore, the estimate (89) for the gradient of φ T ξ,Ξ is completely analogous to the estimate (88) for the gradient of φ T ξ . In conclusion, by the same arguments as the proof of Lemma 19 (up to replacing φ T ξ by φ T ξ,Ξ , ξ by Ξ, r * ,T,ξ by r * ,T,ξ,Ξ , and including an additional prefactor q C (1 + |ξ|) C in the bounds), we obtain E r * ,T,ξ (x 0 ) ε (d−δ/2)q/2 1/q ≤ C(d, m, λ, Λ, ρ, ν)q C (1 + |ξ|) C .
Proof of Proposition 14. By the same arguments as in the proof of Proposition 13, we obtain the desired bounds on φ T ξ,Ξ , σ T ξ,Ξ , and θ T ξ,Ξ . Note that we simply need to replace the use of Lemma 17 by Lemma 21, the use of Lemma 19 by Lemma 23, and the use of (88) by (89).
In order to derive (29), we first write We may then pass to the limit T → ∞ in these estimates to deduce (29). The corresponding bound for σ ξ2 − σ ξ2 is derived analogously.

Structural properties of the effective equation
We finally establish the structural properties of the effective (homogenized) equation, as stated in Theorem 6.
Proof of Lemma 34. Let R > 0 and let η be a standard cutoff with η ≡ 0 outside of B R , η ≡ 1 in B R/2 , and |∇η| ≤ CR −1 . Testing the equation with η 2 (u − b) for some b ∈ R m to be chosen, we obtain by (A1)-(A2) ≤ CΛ RˆB R (x0)\B R/2 (x0) η(|∇u| + |g|)|u − b| dx + 1 TˆRd Young's inequality and an absorption argument yields the Caccioppoli-type inequalitŷ which directly implies (51). An application of the Poincaré inequality on the annulus B R (x 0 ) \ B R/2 (x 0 ) in the previous estimate gives upon choosing b := ffl B R (x0)\B R/2 (x0) u dx, using also Jensen's inequality and the fact that |B R (x 0 )| ∼ |B R (x 0 ) \ B R/2 (x 0 )| to estimate the second term on the right-hand side, This in turn yields by the hole-filling argument for θ = C C+1 . Iterating this estimate with R replaced by 2 −k R, we deduce our desired estimate (52) if r is of the form r = 2 −K R. For other values of r, we may simply use the already-established inequality for the next bigger radius of the form r = 2 −K R and increase the constant C if necessary.
Case a: Two-dimensional systems with smooth coefficients. Meyers' estimate for the PDE (124) in the form of Lemma 44 with T := ∞ and the condition (A3) imply for some p > 2 and any b ∈ R m B ε/2 (x0) Using the Caccioppoli inequality (51) with T = ∞ for the PDE (124), choosing p − 2 > 0 small enough, and using the Poincaré inequality, we get by T ≥ ε 2 B ε/2 (x0) |∇ω ε | p dx 1/p . By our estimate (88), Lemma 19, and the bound on ∇ω ε in (R), the right-hand side may be bounded by Cε −1 (|ξ| + 1) for some random constant C with stretched exponential moments. By Morrey's embedding, we obtain the desired estimate. Case b: Scalar equations and systems with Uhlenbeck structure with smooth coefficients.
In the systems' case, one replaces the De Giorgi-Nash-Moser theory by Uhlenbeck's regularity result [41].