The choice of representative volumes in the approximation of effective properties of random materials

The effective large-scale properties of materials with random heterogeneities on a small scale are typically determined by the method of representative volumes: A sample of the random material is chosen - the representative volume - and its effective properties are computed by the cell formula. Intuitively, for a fixed sample size it should be possible to increase the accuracy of the method by choosing a material sample which captures the statistical properties of the material particularly well: For example, for a composite material consisting of two constituents, one would select a representative volume in which the volume fraction of the constituents matches closely with their volume fraction in the overall material. Inspired by similar attempts in material science, Le Bris, Legoll, and Minvielle have designed a selection approach for representative volumes which performs remarkably well in numerical examples of linear materials with moderate contrast. In the present work, we provide a rigorous analysis of this selection approach for representative volumes in the context of stochastic homogenization of linear elliptic equations. In particular, we prove that the method essentially never performs worse than a random selection of the material sample and may perform much better if the selection criterion for the material samples is chosen suitably.


Introduction
The most widely employed method for determining the effective large-scale properties of a material with random heterogeneities on a small scale is the method of representative volumes. It basically proceeds by taking a small sample of the material, a "representative volume element" (RVE), and determining the properties of the sample by the cell formula. The criteria for the choice of the representative volume have been the subject of an ongoing debate; while in principle increasing the size of the material sample increases the accuracy of the approximation of the material properties, this comes at a correspondingly larger computational cost. It has been conjectured that for a fixed size of the material sample, selecting a material sample which captures certain statistical properties of the material in a particularly good way may be beneficial: For example, for a composite material consisting of two constituent materials, one would try to select a material sample for which the volume fraction of each constituent material within the sample matches the overall volume fraction of this constituent in the composite as closely as possible (see Figure 1). Alternatively, for linear materials one might try to match the averaged material coefficient in the sample with the average taken over the full material. There have been efforts in material science and mechanics towards replicating further statistical properties of the material in a representative volume, an approach called "special quasirandom structures" [80,81,84] or "statistically similar representative volume elements" [15,16,17,18,27,79]. A particularly successful approach in this direction has been developed for linear materials by Le Bris, Legoll, and Minvielle [62]; their method proceeds by considering a large number of material samples, evaluating one or more cheaply computable statistical quantities of the samples (like, for example, the spatial average of the coefficient), and then choosing as the representative volume the sample that is most representative of the material as measured by these quantities. In the present work, in the context of stochastic homogenization of linear elliptic PDEs we provide the first rigorous justification of these approaches.¹

Figure 1. Among the six depicted material samples, the method of Le Bris, Legoll, and Minvielle in its simplest realization would choose either the first sample or the fifth sample as the representative volume element and discard the others, as the volume fraction of the inclusions in the first and the fifth sample is closest to the overall material average. Note that in the depicted material samples the volume fraction of the inclusions is proportional to the number of inclusions, as all inclusions are of equal size. For a better illustration of the method, both the size and the number of the depicted samples have been chosen much smaller than in actual computations.
For materials with random heterogeneities on small scales, the approximation of the effective material coefficient by the method of representative volumes is a random quantity itself, as the outcome depends on the sample of the material. In the setting of linear elliptic PDEs with random coefficient fields (which corresponds to the setting of heat conduction, electrical currents, or electrostatics in a material with random microstructure), Gloria and Otto [51,52,46] have investigated the structure of the error of the approximation of the effective material coefficient by the method of representative volumes: The leading-order contribution to the error (with respect to the size of the RVE) consists of random fluctuations; in expectation the approximation of effective coefficients by the method of representative volumes is accurate to higher order, i.e. the systematic error of the RVE method is of higher order.² For a given size of the RVE, which corresponds to a fixed computational effort, the accuracy of the RVE method may therefore be increased significantly by reducing the variance of the approximations of the effective coefficient. It is precisely such a reduction of the variance by which the selection approach for representative volumes of Le Bris, Legoll, and Minvielle [62] achieves its gain in accuracy.
For linear elliptic PDEs with random coefficients and moderate ellipticity contrast, the reduction of the variance by the ansatz of Le Bris, Legoll, and Minvielle [62] is particularly remarkable: By selecting the representative volume according to the criterion that the averaged coefficient in the RVE should be particularly close to the averaged coefficient in the overall material, in numerical examples with ellipticity contrast ∼ 5 they observed a variance reduction by a factor of ∼ 10. Going beyond this simple selection criterion, they devised a criterion based on an expansion of the effective coefficient in the regime of small ellipticity contrast, which numerically achieves a remarkable variance reduction factor of ∼ 60 even for a moderate ellipticity contrast ∼ 5. Note that this basically corresponds to the gain of about one order of magnitude in accuracy for a negligible additional computational cost and implementation effort.
However, the analysis of the selection approach for representative volumes has been restricted to the one-dimensional setting [62], in which the homogenization of linear elliptic PDEs is linear in the inverse coefficient and therefore independent of the geometry of the material. Besides the highly nonlinear dependence of the effective coefficient on the heterogeneous coefficient field in dimensions d ≥ 2, one of the main challenges in the analysis of the selection method for representative volumes is the fact that it is only expected to increase the accuracy by a (though often very large) constant factor, at least for a fixed set of statistical quantities by which the selection is performed. At the same time, the available error estimates for the representative volume element method in stochastic homogenization are only optimal up to constant factors. For this reason, the analysis of the selection approach for representative volumes necessitates a fine-grained analysis of the structure of fluctuations in stochastic homogenization.

1.1. Stochastic homogenization of linear elliptic PDEs: A brief outline.
The subject of the present contribution is the rigorous justification of the selection method for representative volumes by Le Bris, Legoll, and Minvielle [62] in the context of linear elliptic equations
$$-\nabla \cdot (a \nabla u) = f \tag{1}$$
with random coefficient fields $a$ on $\mathbb{R}^d$ for arbitrary spatial dimension $d$. Note that this setting describes e.g. heat conduction or electrostatics in a random material. Our assumptions on the probability distribution of the coefficient field $a$ are standard in the theory of stochastic homogenization: We assume just uniform ellipticity and boundedness, stationarity, and finite range of dependence (see conditions (A1)-(A3) below). In particular, our analysis includes the case of a two-material composite with random non-overlapping inclusions as depicted in Figure 1.

² At least if a suitable periodization of the probability distribution of the coefficient field is available, see below for an explanation of this concept.
The theory of stochastic homogenization of linear elliptic PDEs predicts that for coefficient fields with only short-range correlations on a scale $\varepsilon \ll 1$ the solution $u$ to the equation with random coefficient field (1) may be approximated by the solution $u_{\mathrm{hom}}$ of an effective equation of the form
$$-\nabla \cdot (a_{\mathrm{hom}} \nabla u_{\mathrm{hom}}) = f, \tag{2}$$
where $a_{\mathrm{hom}} \in \mathbb{R}^{d\times d}$ is a constant effective coefficient which describes the effective behavior of the material. In this context of linear materials, the method of representative volumes is employed to compute the effective coefficient $a_{\mathrm{hom}}$.
Let us describe the method of representative volumes for the approximation of the effective material coefficient $a_{\mathrm{hom}}$ in more detail. It proceeds by choosing a sample of the material, say, a cube with side length $L\varepsilon$ for some $L \gg 1$, uniformly at random. Roughly speaking (for the moment passing silently over the question of boundary conditions), by solving the equation for the homogenization corrector $\phi_i$ associated with the $i$-th coordinate direction on the representative volume
$$-\nabla \cdot (a(e_i + \nabla \phi_i)) = 0 \tag{3}$$
($e_i \in \mathbb{R}^d$ denoting the $i$-th vector of the standard basis) one may obtain an approximation $a^{\mathrm{RVE}}$ for the effective coefficient $a_{\mathrm{hom}}$ in terms of the averaged fluxes
$$a^{\mathrm{RVE}} e_i := \fint_{[0,L\varepsilon]^d} a(e_i + \nabla \phi_i) \,dx. \tag{4}$$
This expression is also known in homogenization as the cell formula. As already mentioned before, the approximation $a^{\mathrm{RVE}}$ for the effective material coefficient $a_{\mathrm{hom}}$ is a random variable itself, as it depends on the realization of the random coefficient field $a$ on the sample volume $[0, L\varepsilon]^d$. It has been proven by Gloria and Otto [52,53] and also observed in numerical computations that the main contribution to the error of the RVE method is caused by the random fluctuations of the approximation $a^{\mathrm{RVE}}$, while the systematic error is of higher order: For spatial dimensions $d \geq 1$ one has
$$\sqrt{\operatorname{Var} a^{\mathrm{RVE}}} \lesssim L^{-d/2}.$$
As a consequence, a reduction of the fluctuations of the approximations $a^{\mathrm{RVE}}$ would lead to an increase in accuracy of the approximation for the effective coefficient $a_{\mathrm{hom}}$. It has been observed numerically by Le Bris, Legoll, and Minvielle [62] and shall be proven below rigorously that the selection approach for representative volumes achieves its gain in accuracy precisely by reducing the fluctuations of the approximations for the effective coefficients.
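To make the corrector equation and the cell formula concrete, the following is a minimal sketch for a discrete analogue: a periodic conductance network on an n-by-n grid, used here as a stand-in for the continuum problem on the representative volume. The function name `a_rve_network`, the edge-conductance arrays `cx`, `cy`, and the dense least-squares solve are assumptions made for illustration; they are not taken from [62] or from the analysis in this paper.

```python
import numpy as np

def a_rve_network(cx, cy):
    """Return (a^RVE e_1) . e_1 for a periodic n-by-n conductance network.

    cx[i, j]: conductance of the edge (i, j) -> (i+1, j), indices mod n.
    cy[i, j]: conductance of the edge (i, j) -> (i, j+1), indices mod n.
    (Hypothetical discrete stand-in for the continuum cell problem.)
    """
    n = cx.shape[0]
    idx = np.arange(n * n).reshape(n, n)
    heads_x = np.roll(idx, -1, axis=0)   # node (i+1, j)
    heads_y = np.roll(idx, -1, axis=1)   # node (i, j+1)
    A = np.zeros((n * n, n * n))
    b = np.zeros(n * n)
    # xi is the component of e_1 along the edge: 1 for x-edges, 0 for y-edges.
    edges = ((heads_x, cx, 1.0), (heads_y, cy, 0.0))
    for heads, c, xi in edges:
        t, h, w = idx.ravel(), heads.ravel(), c.ravel()
        # Weighted graph Laplacian: discrete version of -div(a grad phi).
        np.add.at(A, (t, t), w)
        np.add.at(A, (h, h), w)
        np.add.at(A, (t, h), -w)
        np.add.at(A, (h, t), -w)
        # Right-hand side: discrete version of div(a e_1).
        np.add.at(b, t, w * xi)
        np.add.at(b, h, -w * xi)
    # The periodic corrector is unique only up to constants; lstsq picks one.
    phi = np.linalg.lstsq(A, b, rcond=None)[0]
    energy = 0.0
    for heads, c, xi in edges:
        grad = xi + phi[heads.ravel()] - phi[idx.ravel()]
        energy += np.sum(c.ravel() * grad**2)
    # For the minimizer, the averaged energy equals the averaged flux along e_1.
    return energy / (n * n)
```

For a network whose x-conductances depend only on the first index (a laminate), the returned value coincides with the harmonic mean of those conductances, which gives a simple sanity check of the sketch.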
1.2. Informal summary of our main results. In the present work, we prove that in the setting of stochastic homogenization of linear elliptic equations the selection approach for representative volumes by Le Bris, Legoll, and Minvielle [62]
• essentially never performs worse than a completely random selection of the representative volume element, but may perform much better for suitable selection criteria,
• basically maintains the order of the systematic error of the approximation for the effective coefficient, and
• reduces also the error in the approximation for the effective coefficient that may occur with a given low probability, i.e. reduces also the "outliers" of the approximation for the effective coefficient.
As mentioned before, in the setting of linear elliptic PDEs the method of representative volumes is employed to obtain an approximation $a^{\mathrm{RVE}}$ for the effective (homogenized) coefficient $a_{\mathrm{hom}}$. The role of "material samples" is assumed by realizations of the random coefficient field $a : [0, L\varepsilon]^d \to \mathbb{R}^{d\times d}$, on which the computation of the approximations $a^{\mathrm{RVE}}$ is based. The selection approach for representative volumes proposed in [62] then proceeds as follows: At first, one or more statistical quantities $F$ are chosen which assign a real number $F(a) \in \mathbb{R}$ to any realization $a : [0, L\varepsilon]^d \to \mathbb{R}^{d\times d}$. Note that the simplest statistical quantity proposed in [62] is the spatial average $F(a) := \fint_{[0,L\varepsilon]^d} a \,dx$. Next, one considers a sequence of independent samples of the random coefficient field until a sample meets the selection criterion
$$|F(a) - \mathbb{E}[F(a)]| \leq \delta \sqrt{\operatorname{Var} F(a)} \tag{7}$$
for some chosen parameter $\delta$ with $C L^{-d/2} |\log L|^C \leq \delta \leq 1$. Finally, the approximation for the effective coefficient is computed by solving the equation for the homogenization corrector (3) and using the cell formula (4) for this sample of the random coefficient field.
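The three steps above (choose $F$, draw samples until the criterion is met, evaluate the cell formula on the accepted sample) can be sketched in a one-dimensional toy model, where the cell formula reduces to the harmonic mean and no PDE solve is needed. Everything specific here is an assumption for illustration: the iid Uniform(1, 5) coefficient values, the function names, and the default parameters. The sketch also assumes that $\mathbb{E}[F(a)]$ and $\operatorname{Var} F(a)$ are known exactly, which holds for this toy distribution.

```python
import numpy as np

def a_rve_1d(a):
    # In d = 1 the periodic cell formula reduces to the harmonic mean.
    return 1.0 / np.mean(1.0 / a)

def draw_selected_sample(rng, L, delta, mean_F, std_F, max_tries=100_000):
    """Draw iid fields until |F(a) - E[F(a)]| <= delta * sqrt(Var F(a))."""
    for _ in range(max_tries):
        a = rng.uniform(1.0, 5.0, size=L)   # toy coefficient field (assumption)
        if abs(np.mean(a) - mean_F) <= delta * std_F:
            return a
    raise RuntimeError("no sample met the selection criterion")

def compare_variances(L=200, delta=0.2, n_rep=400, seed=0):
    """Empirical variance of a^RVE without and with selection."""
    rng = np.random.default_rng(seed)
    mean_F = 3.0                         # E[F(a)] for Uniform(1, 5) entries
    std_F = np.sqrt((16.0 / 12.0) / L)   # sqrt(Var F(a)) for L iid entries
    plain = [a_rve_1d(rng.uniform(1.0, 5.0, size=L)) for _ in range(n_rep)]
    selected = [a_rve_1d(draw_selected_sample(rng, L, delta, mean_F, std_F))
                for _ in range(n_rep)]
    return np.var(plain), np.var(selected)
```

Calling `compare_variances()` returns the two empirical variances; in this toy model the arithmetic and harmonic means are strongly correlated, so the variance of the selected approximations should come out markedly smaller.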
To give a flavor of our main result, let us formulate it informally in the case of a single statistical quantity $F(a)$. We denote the approximation for the effective coefficient by the standard representative volume element method (without selection of material samples) by $a^{\mathrm{RVE}}$ and the approximation for the effective coefficient by the selection approach for representative volumes by $a^{\text{sel-RVE}}$. In this case, our main theorems Theorem 2 and Theorem 3 may be summarized as follows:
• The systematic error of the approximation $a^{\text{sel-RVE}}$ is essentially (up to powers of $\log L$ and some prefactors) of the same order as the systematic error of the standard representative volume element method $a^{\mathrm{RVE}}$: We have a bound on $|\mathbb{E}[a^{\text{sel-RVE}}] - a_{\mathrm{hom}}|$ involving a quantity $\kappa$ which will be discussed below.
• The estimate on the variance $\operatorname{Var} a^{\text{sel-RVE}}$ is formulated in terms of the correlation coefficient $\rho_{F(a),a^{\mathrm{RVE}}} := \operatorname{Cov}[a^{\mathrm{RVE}}, F(a)] / \sqrt{\operatorname{Var} F(a) \operatorname{Var} a^{\mathrm{RVE}}}$ and the quantity $r_{\operatorname{Var}} := L^{-d} / \operatorname{Var} a^{\mathrm{RVE}}$, which denotes the ratio between the expected order of fluctuations of $a^{\mathrm{RVE}}$ and the actual magnitude of fluctuations. Note that the last term in the estimate on $\operatorname{Var} a^{\text{sel-RVE}}$ converges to zero as the size $L$ of the representative volume increases.
• The probability of "outliers" is reduced by the selection method just as suggested by the variance reduction, at least in an "intermediate" region between the "bulk" and the "outer tail" of the probability distribution: One has a moderate-deviations-type estimate formulated in terms of $\mathcal{N}_1$, the centered normal distribution with unit variance.
• In the above bounds, $\kappa := (1 - |\rho_{F(a),a^{\mathrm{RVE}}}|^2)^{-1}$ denotes (essentially) the condition number of the covariance matrix $\operatorname{Var}(a^{\mathrm{RVE}}, F(a))$. For the case that the correlation $|\rho_{F(a),a^{\mathrm{RVE}}}|$ is close to one, we derive bounds which are independent of $\kappa$ but come at the cost of a lower rate of convergence in $L$.
Our estimate on the variance reduction achieved by the selection approach for representative volumes is implicit in the sense that it is determined by the correlation coefficient $\rho_{F(a),a^{\mathrm{RVE}}} = \operatorname{Cov}[a^{\mathrm{RVE}}, F(a)] / \sqrt{\operatorname{Var} F(a) \operatorname{Var} a^{\mathrm{RVE}}}$.
In fact, the failure of the correlation coefficient $\rho_{F(a),a^{\mathrm{RVE}}}$ to be nonzero also implies the failure of gaining accuracy by the selection approach for the representative volumes (see Theorem 4): In such a case of vanishing correlation, the method of Le Bris, Legoll, and Minvielle [62] is not superior (but essentially also not inferior) to the standard method of choosing a representative volume randomly. This raises the question whether such a degeneracy of the correlation coefficient can occur for "natural" choices of the statistical quantity $F(a)$. In Theorem 4, we shall prove that even for a "natural" choice like $F(a) := \fint_{[0,\varepsilon L]^d} a \,dx$ there is a priori no guarantee that there is a nonzero correlation between $a^{\mathrm{RVE}}$ and $F(a)$: We construct an example of a probability distribution of $a$ for which the covariance of $a^{\mathrm{RVE}}$ and the average of the coefficient field $\fint a$ in fact vanishes, while the variances $\operatorname{Var} \fint_{[0,\varepsilon L]^d} a \,dx$ and $\operatorname{Var} a^{\mathrm{RVE}}$ are nondegenerate.

Figure 2. For a multivariate Gaussian probability distribution, conditioning on the event of one variable being close to its expectation reduces the variance of the other variable, provided that the two random variables are nontrivially correlated. In our setting, conditioning on the event "spatial average of coefficient field is close to its expectation" reduces the variance of the random variable "approximation for the effective conductivity" $a^{\mathrm{RVE}}$, as their joint probability distribution is close to a multivariate Gaussian.
However, the failure of the variance reduction approaches to effectively reduce the variance is presumably limited to rather artificial examples: We prove that the covariance of $a^{\mathrm{RVE}}$ and the average of the coefficient field $\fint a$ is positive for coefficient fields which are obtained from iid random variables by applying a "monotone" function, see Proposition 5.
1.3. Outline of our strategy. The basic idea underlying our analysis of the selection approach for representative volumes is the observation that the joint probability distribution of the approximation for the effective coefficient $a^{\mathrm{RVE}}$ and one or more statistical quantities $F(a)$ like the average of the coefficient field $F(a) := \fint_{[0,L\varepsilon]^d} a \,dx$ is close to a multivariate Gaussian, up to an error of the order $L^{-d} |\log L|^C$ in a suitable notion of distance between probability measures. The selection of representative volumes by the criterion (7), which amounts to conditioning on the event $|F(a) - \mathbb{E}[F(a)]| \leq \delta \sqrt{\operatorname{Var} F(a)}$, then reduces the variance of the probability distribution of $a^{\mathrm{RVE}}$ by the variance explained by the statistical quantity $F(a)$, up to error terms due to the deviation of the probability distribution from a multivariate Gaussian and the non-perfectness of the conditioning $\delta > 0$, see Figure 2. Note that for an ideal multivariate Gaussian distribution, the expected value of the approximation $a^{\mathrm{RVE}}$ would be left unchanged under conditioning since the criterion (7) is symmetric around $\mathbb{E}[F(a)]$, i.e. the conditioning would not introduce a bias. As a consequence, for our approximate multivariate Gaussian $(a^{\mathrm{RVE}}, F(a))$ the expectation of $a^{\mathrm{RVE}}$ is changed under conditioning only by the distance of our probability distribution to a multivariate Gaussian, which is a higher-order term. Note that both the reduction of the variance by conditioning and the estimate on the bias introduced by the conditioning rely crucially on the fact that our probability distribution is close to a multivariate Gaussian (and not another probability distribution): It is obvious from the picture in Figure 2 that a probability distribution other than a multivariate Gaussian could introduce a large bias under conditioning and even an increase in variance.
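The Gaussian heuristic of this paragraph can be made explicit in the idealized case of an exactly bivariate Gaussian pair. The following worked computation is a sketch under that assumption only: the scalar $X$ stands in for $a^{\mathrm{RVE}}$, the scalar $Y$ for $F(a)$, and the symbols $Z$, $W$, $\sigma_X$, $\sigma_Y$, $\rho$, $A_\delta$ are notation introduced here. It ignores the non-Gaussian error terms that the actual analysis has to control.

```latex
% Assumption: (X, Y) is exactly bivariate Gaussian with correlation \rho.
\begin{align*}
X &= \mu_X + \sigma_X\big(\rho Z + \sqrt{1-\rho^2}\,W\big), \qquad
Y = \mu_Y + \sigma_Y Z, \qquad Z, W \sim \mathcal{N}(0,1) \text{ independent},\\
A_\delta &:= \big\{|Y-\mu_Y| \le \delta\sigma_Y\big\} = \big\{|Z|\le\delta\big\},\\
\mathbb{E}[X \mid A_\delta] &= \mu_X + \sigma_X\,\rho\,\mathbb{E}\big[Z \,\big|\, |Z|\le\delta\big] = \mu_X
\quad\text{(by the symmetry of } Z\text{)},\\
\operatorname{Var}[X \mid A_\delta]
&= \sigma_X^2\Big(\rho^2\,\mathbb{E}\big[Z^2 \,\big|\, |Z|\le\delta\big] + 1-\rho^2\Big)
\le \big(1-(1-\delta^2)\rho^2\big)\,\sigma_X^2 .
\end{align*}
```

In this idealized setting the conditioning introduces no bias, and for small $\delta$ the variance is reduced by nearly the factor $1-\rho^2$.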
Our analysis of the selection approach for representative volumes by Le Bris, Legoll, and Minvielle [62] is a first practical application of the beautiful theory of fluctuations in stochastic homogenization, which has been developed in recent years and which our work both draws ideas from and contributes to.
The underlying reason for the convergence of the joint probability distribution of $a^{\mathrm{RVE}}$ and one or more functionals $F(a)$ towards a multivariate Gaussian is a central limit theorem for suitable collections of vector-valued random variables: We show that the approximation $a^{\mathrm{RVE}}$ for the effective coefficient $a_{\mathrm{hom}}$ (and also the functionals $F(a)$ that are used in the work of Le Bris, Legoll, and Minvielle [62]) may be written as a sum of random variables with a local dependence structure with multiple levels, see Definition 6 and Proposition 7. For such sums of vector-valued random variables with multilevel local dependence, a proof of quantitative normal approximation is provided in the companion article [41] (see also Theorem 9 below). To the best of our knowledge such quantitative normal approximation results were previously known only for sums of random variables with local dependence structure [32,33,78] (corresponding more or less to just the lowest level of random variables in Figure 4 below), a framework into which the approximation for the effective coefficient $a^{\mathrm{RVE}}$ does not fit. Note that the sharp boundaries of the region defined by the selection criterion (7) (see also the sharp boundaries in Figure 2) necessitate the use of a rather strong (though standard) distance between probability measures for our quantitative normal approximation result (see Definition 8); in particular, a stronger notion of distance between probability measures than the 1-Wasserstein distance must be used.
As a by-product, our work also provides a proof of quantitative normal approximation for a RVE in a different setting than available in the literature so far: To the best of our knowledge, the results on quantitative normal approximation for a RVE in the literature always rely on an assumption that the coefficient field a is obtained as a function of iid random variables [37,50,75] or that the probability distribution of a is subject to a second-order Poincaré inequality like in [36]. In contrast, our result holds under the assumption of finite range of dependence, in which to the best of our knowledge only a qualitative normal approximation result had been known [6].
The companion article [41] also provides a result on moderate deviations in the sense of Cramér for sums of random variables with multilevel local dependence structure, see Theorem 10. Our result on the reduction of the error by the selection approach for representative volumes in the case of unlikely events (Theorem 3) is based on this moderate deviations theorem.
Our counterexample for the variance reduction, which shows that even "natural" statistical quantities like the spatial average $F(a) := \fint_{[0,L\varepsilon]^d} a \,dx$ do not necessarily explain a positive fraction of the variance of $a^{\mathrm{RVE}}$, is based on the nonlinear dependence of the effective coefficient in periodic homogenization on the underlying coefficient field: More precisely, our counterexample consists of an interpolation between a standard random checkerboard and a random checkerboard with two types of tiles, one tile type being a constant coefficient field and one tile type being a second-order laminate microstructure. See Section 6 for details of the construction.
1.4. Computation of effective properties of random materials: A more detailed look. In the homogenization of periodic linear materials, i.e. in the homogenization of the linear elliptic PDE (1) with periodic coefficient field $a$ in the sense $a(x) = a(x + \varepsilon k)$ for all $k \in \mathbb{Z}^d$, it is possible to compute the effective coefficient $a_{\mathrm{hom}}$ by exploiting the periodicity of the coefficient field, basically reducing the problem to solving a PDE (the PDE for the homogenization corrector) on a single periodicity cell: For a period of length $\varepsilon$, the effective coefficient is given by the cell formula
$$a_{\mathrm{hom}} e_i = \fint_{[0,\varepsilon]^d} a(e_i + \nabla \phi_i) \,dx$$
with the homogenization corrector $\phi_i$ defined as the unique $\varepsilon$-periodic solution with zero average to the PDE $-\nabla \cdot (a(e_i + \nabla \phi_i)) = 0$.
As a consequence, in periodic homogenization the numerical computation of the effective coefficient a hom typically requires only modest effort.
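As a worked instance of the periodic cell formula, the one-dimensional case can be solved by hand. The derivation below is a standard computation added for illustration; the constant $q$ is notation introduced here. It shows why, in $d = 1$, the inverse of the effective coefficient depends linearly on the inverse coefficient field.

```latex
% d = 1, period \varepsilon, single corrector \phi (the basis vector e_1 is the number 1).
\begin{align*}
-\big(a(1+\phi')\big)' = 0
&\;\Longrightarrow\; a(1+\phi') \equiv q \ \text{ (a constant)}
\;\Longrightarrow\; 1+\phi' = q\,a^{-1},\\
\fint_0^\varepsilon \phi' \,dx = 0 \ \text{ (periodicity)}
&\;\Longrightarrow\; 1 = q \fint_0^\varepsilon a^{-1}\,dx,\\
a_{\mathrm{hom}} = \fint_0^\varepsilon a(1+\phi')\,dx = q
&= \Big(\fint_0^\varepsilon a^{-1}\,dx\Big)^{-1}.
\end{align*}
```

The effective coefficient is thus the harmonic mean of $a$, independent of the spatial arrangement of the values of $a$ within the period.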
In contrast, in stochastic homogenization this simplification is no longer possible due to the absence of a periodic structure in the random coefficient field $a_{\mathbb{R}^d} : \mathbb{R}^d \to \mathbb{R}^{d\times d}$ and the computation of the effective coefficient becomes a computationally costly problem: The effective coefficient in stochastic homogenization is given by the infinite volume limit cell formula³
$$a_{\mathrm{hom}} e_i \cdot e_j := \lim_{L\to\infty} \fint_{[0,L\varepsilon]^d} a_{\mathbb{R}^d}(e_i + \nabla \phi_i^{L,\mathrm{Dir}}) \cdot e_j \,dx$$
with $\phi_i^{L,\mathrm{Dir}}$ denoting the solution to the corrector problem with Dirichlet boundary conditions on $[0, L\varepsilon]^d$. In practice, in order to approximate the effective coefficient $a_{\mathrm{hom}}$ a representative volume $[0, L\varepsilon]^d$ of finite size must be chosen. However, the approximation of the effective coefficient by the standard cell formula with Dirichlet boundary conditions for the corrector is only of first-order accuracy $\mathbb{E}[|a^{\mathrm{RVE}}_{\mathrm{Dir}} - a_{\mathrm{hom}}|^2]^{1/2} \sim L^{-1}$ due to the presence of a boundary layer: The artificial Dirichlet boundary condition leads to the creation of a boundary layer in an $O(\varepsilon)$-neighborhood of the boundary $\partial[0, L\varepsilon]^d$. The limitation to first-order accuracy is present even in the systematic error $\mathbb{E}[a^{\mathrm{RVE}}] - a_{\mathrm{hom}}$. Note that while replacing the volume average in the cell formula by an average taken strictly in the interior of the representative volume typically increases the accuracy [82], for general probability distributions it does not increase the order of convergence due to global effects of the boundary layer. To achieve the convergence (6) and (5), the boundary layer phenomenon must necessarily be addressed by the use of a more careful approximation technique than the method of correctors with Dirichlet boundary data.
One possibility of avoiding the creation of boundary layers is the use of a so-called "periodization" of the probability distribution: Given a probability distribution of coefficient fields $a_{\mathbb{R}^d}$, one first fixes the size $L\varepsilon$ of the desired representative volume and then attempts to construct a probability distribution of $L\varepsilon$-periodic coefficient fields $a$ such that the law of $a|_{x+[0,\frac{1}{2}L\varepsilon]^d}$ (i.e. the law of $a$ restricted to some box of half the size of the representative volume) coincides with the law of $a_{\mathbb{R}^d}|_{x+[0,\frac{1}{2}L\varepsilon]^d}$ for any $x \in \mathbb{R}^d$. For one realization of the periodized probability distribution of coefficient fields $a$ one may then solve the corrector equation $-\nabla \cdot (a(e_i + \nabla \phi_i)) = 0$ with periodic boundary conditions on $\partial[0, L\varepsilon]^d$ and define the approximation $a^{\mathrm{RVE}}$ for the effective coefficient $a_{\mathrm{hom}}$ as
$$a^{\mathrm{RVE}} e_i := \fint_{[0,L\varepsilon]^d} a(e_i + \nabla \phi_i) \,dx.$$
This approximation $a^{\mathrm{RVE}}$ then has the desired approximation properties (5) and (6). Note that this construction requires the knowledge of the probability distribution of $a_{\mathbb{R}^d}$ and must be done on a case-by-case basis; it is therefore not feasible in all practical situations.
To give an example, random non-overlapping inclusions like in Figure 1 may be constructed by considering a Poisson point process on $\mathbb{R}^d \times [0, 1]$, ordering the points $(x_k, y_k) \in \mathbb{R}^d \times [0, 1]$ with respect to their last coordinate $y_k$, and then successively placing inclusions in $\mathbb{R}^d$ centered at the $x_k$ and with diameter $\varepsilon$ if the "previous" points $x_l$, $l < k$, have a distance of at least $\varepsilon$ from $x_k$ (i.e. $|x_l - x_k| \geq \varepsilon$). The result of such a construction is shown in Figure 3a. For this probability distribution, one may define a periodization in a natural way by considering a Poisson point process on $[0, L\varepsilon)^d \times [0, 1]$ and defining an $L\varepsilon$-periodic coefficient field with non-overlapping inclusions in the obvious way, replacing the Euclidean distance $|x_l - x_k|$ by the periodicity-adjusted distance $|x_l - x_k|_{\mathrm{per}} := \inf_{z \in \mathbb{Z}^d} |x_l - x_k + L\varepsilon z|$. A sample from the periodized probability distribution is shown in Figure 3b.
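The periodized construction can be sketched in a few lines. The sketch below reads the rule literally: a point $x_k$ receives an inclusion only if all points $x_l$ with smaller mark $y_l$ (whether or not they received an inclusion themselves) are at periodic distance at least $\varepsilon$. This reading, the function names, and the `intensity` parameter of the Poisson point process are assumptions made for illustration.

```python
import numpy as np

def periodic_distance(x, y, period):
    """Distance |x - y|_per on the torus of side length `period`."""
    diff = x - y
    diff = diff - period * np.round(diff / period)  # nearest periodic image
    return np.linalg.norm(diff, axis=-1)

def sample_periodized_inclusions(rng, L, eps, intensity, d=2):
    """Centers of non-overlapping inclusions of diameter eps in [0, L*eps)^d.

    Poisson point process on [0, L*eps)^d x [0, 1]; points are ordered by
    their mark y_k, and x_k is kept only if every earlier point x_l, l < k,
    satisfies |x_l - x_k|_per >= eps (literal reading of the rule; assumption).
    """
    period = L * eps
    n = rng.poisson(intensity * period**d)       # number of Poisson points
    x = rng.uniform(0.0, period, size=(n, d))    # spatial positions
    y = rng.uniform(0.0, 1.0, size=n)            # marks used for the ordering
    x = x[np.argsort(y)]
    keep = np.ones(n, dtype=bool)
    for k in range(1, n):
        keep[k] = np.all(periodic_distance(x[:k], x[k], period) >= eps)
    return x[keep]
```

A coefficient field is then obtained by assigning one material value inside the balls of diameter `eps` around the returned centers (measured in the periodic distance) and another value outside.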
If no periodization of the probability distribution is available (for example if only samples from the probability distribution are available and the underlying probability distribution is not known, like in applications where one has access to samples of the materials), one has to resort to an alternative means of increasing the rate of convergence of the method of representative volumes. One feasible option is to "screen" the effect of the boundary by introducing a "massive" term in the PDE for the homogenization corrector [24,45,52]: Fixing a scale $\sqrt{T} \sim \frac{L}{\log L}$, one replaces the equation for the homogenization corrector by a PDE with an additional massive term and approximates the effective coefficient $a_{\mathrm{hom}}$ by a weighted average $a_{\mathrm{hom}} e_i \approx a^{\mathrm{RVE}} e_i$, where the weight $\eta$ is a smooth nonnegative weight supported in a slightly smaller box. Due to the already substantial length of the present paper, we shall limit ourselves to the analysis of the selection approach for representative volumes in the context of periodizations of the probability distribution and defer the analysis of the screening approach to a future work.
Generally speaking, in the method of representative volumes the equation for the homogenization corrector may be solved by any numerical algorithm that is feasible for the given size of the representative volume: For example, standard finite element methods may be employed for representative volumes of moderate size, while for very large representative volumes one may use appropriate instances of modern computational homogenization methods like the multiscale finite element method, heterogeneous multiscale methods, and related approaches (see e. g. [1,14,28,38,59,58,69]) or the local orthogonal decomposition method by Målqvist and Peterseim [68].
Note that besides the modern numerical homogenization methods, which are in principle applicable to any elliptic PDE involving a heterogeneous coefficient field, there have been numerous numerical works on the more specific problem of the approximation of effective coefficients in stochastic homogenization, see for example [13,31,39,40,60,70,77].
1.5. The selection approach for representative volumes by Le Bris, Legoll, and Minvielle. Let us describe the selection approach for representative volumes by Le Bris, Legoll, and Minvielle [62] in more detail. The selection approach for representative volumes achieves its gain in accuracy of approximations $a^{\mathrm{RVE}}$ for the effective coefficient $a_{\mathrm{hom}}$ (as compared to the standard representative volume element method with completely random choice of the material sample) by selecting only those realizations of the random coefficient field $a|_{[0,L\varepsilon]^d}$ which capture some important statistical properties of the coefficient field $a$ in an exceptionally good way: For example, in the simplest setting Le Bris, Legoll, and Minvielle [62] propose to restrict one's attention to realizations of the coefficient field $a$ for which the average on $[0, L\varepsilon]^d$ is exceptionally close to its expected value in the sense of the selection criterion (9) for some $\delta \ll 1$. Note that for generic realizations of $a$, the central limit theorem for the averages $\fint_{[0,L\varepsilon]^d} a \,dx$ and the finite range of dependence $\varepsilon$ only yield a deviation of the order $L^{-d/2}$. On a numerical level, such a selection approach typically provides an increase in computational efficiency if the accuracy is indeed increased by conditioning on the event (9): Usually, the most expensive step in the computation of the approximations $a^{\mathrm{RVE}}$ is the computation of the homogenization corrector as the solution to the PDE (3). In contrast, the generation of random coefficient fields $a$ and the evaluation of the average of $a$ is typically cheap. Therefore it is often worth generating about $\frac{1}{\delta}$ independent realizations of $a$ to obtain on average one realization of $a$ which satisfies (9); for this single realization, the corrector equation (3) is solved numerically and the approximation $a^{\mathrm{RVE}}$ for the effective coefficient is computed.
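The estimate of about $1/\delta$ realizations per accepted sample can be checked in the idealized case where the normalized statistical quantity is exactly standard Gaussian. The sketch below rests on that assumption (the true distribution is only approximately Gaussian), and the function name is hypothetical.

```python
import math

def expected_number_of_samples(delta):
    """Expected number of iid draws until |Z| <= delta, Z standard Gaussian.

    Assumption: (F(a) - E[F(a)]) / sqrt(Var F(a)) is treated as exactly
    standard Gaussian, which is only an approximation.
    """
    acceptance_probability = math.erf(delta / math.sqrt(2.0))  # P(|Z| <= delta)
    return 1.0 / acceptance_probability  # mean of a geometric distribution
```

For small `delta` the acceptance probability behaves like $\sqrt{2/\pi}\,\delta$, so the expected number of draws is roughly $1.25/\delta$, consistent with the order $1/\delta$ stated above.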
This strategy is also applicable to situations in which the probability distribution of the coefficient field is not known, but one has only access to a large number of samples of the coefficient field, like in applications in which one has access to data from actual material samples.
The selection criterion (9) based on the average of the coefficient field in the material sample is the first out of two selection criteria proposed by Le Bris, Legoll, and Minvielle [62]. In order to reduce the variance of a RVE further, they propose to consider several such statistical quantities at the same time, for example in addition to the spatial average for some (approximation of the) solution v i to the constant-coefficient equation and require that all of these statistical quantities be close to their expectation at the same time. The quantities (10) arise as a second-order correction to the effective conductivity a RVE in the expansion in the regime of small ellipticity contrast: Expanding the homogenization corrector φ i and the approximate effective conductivity a RVE as a power series in ν for the family of coefficient fields a = Id +νâ, we deduce and φ 2 i defined as the solution to another PDE. As a consequence, for the approximation of the effective conductivity we obtain where in the last step we have used the periodicity of φ 2 i . To see that the contribution of v i is actually of second order in ν, one uses again a = Id +νâ and the periodicity of v i .
By selecting the representative volumes according to the two criteria (9) and (10) at the same time, Le Bris, Legoll, and Minvielle were able, in the model problem of the random checkerboard with an ellipticity ratio of 5, to reduce the variance of the approximations $a^{\mathrm{sel\text{-}RVE}}$ for the effective conductivity by a factor of 50 compared to the approximations $a^{\mathrm{RVE}}$ obtained by the standard representative volume element method.
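The qualitative effect can be explored in a one-dimensional toy problem, where no PDE solver is needed: for a scalar periodic coefficient in one dimension the cell formula reduces to the harmonic mean of the coefficient. The sketch below is not the two-dimensional checkerboard experiment of [62]; it assumes i.i.d. tile values uniformly distributed on [1, 5], selects on the spatial average only, and uses hypothetical function names.

```python
import random
import statistics

def harmonic_mean_rve(tiles):
    """1-D cell formula: the effective coefficient of a periodic sample is its harmonic mean."""
    return len(tiles) / sum(1.0 / a for a in tiles)

def draw_tiles(L, rng):
    """L i.i.d. tile values, uniform on [1, 5] (an assumption made for this toy example)."""
    return [rng.uniform(1.0, 5.0) for _ in range(L)]

def compare_variances(L=32, delta=0.3, n_samples=1000, seed=0):
    """Sample variance of a_RVE for random samples and for samples selected by their average."""
    rng = random.Random(seed)
    expected_avg = 3.0                 # mean of the uniform distribution on [1, 5]
    threshold = delta * L ** -0.5      # delta * L^(-d/2) with d = 1
    plain = [harmonic_mean_rve(draw_tiles(L, rng)) for _ in range(n_samples)]
    selected = []
    while len(selected) < n_samples:
        tiles = draw_tiles(L, rng)
        if abs(sum(tiles) / L - expected_avg) <= threshold:
            selected.append(harmonic_mean_rve(tiles))
    return statistics.variance(plain), statistics.variance(selected)
```

Calling `compare_variances()` returns the two sample variances. In this toy setting the arithmetic and harmonic means are strongly correlated, so the selected samples should show a clearly smaller variance; the size of the reduction depends on the assumed tile distribution and says nothing quantitative about the checkerboard experiment.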
Another remarkable feature of the selection approach for representative volumes by Le Bris, Legoll, and Minvielle is its compatibility with the vast majority of numerical homogenization methods: As the selection approach for representative volumes operates at the level of the choice of the coefficient field a, it may be combined with essentially any numerical discretization method for the corrector problem (59). Note that there exist many numerical homogenization methods that are particularly well-adapted to certain geometries of the microstructure; the selection approach for representative volumes may be employed in most of these methods to achieve a further speedup.
The selection approach for representative volumes is only one out of several variance reduction concepts in the context of stochastic homogenization: Blanc, Costaouec, Le Bris, and Legoll [22,23,25] have succeeded in reducing the variance by the method of antithetic variables; note that however for this approach the achievable variance reduction factor is much more limited. The method of control variates has also been demonstrated to be successful in the context of the computation of effective coefficients in stochastic homogenization [25,63].
1.6. A brief overview of quantitative stochastic homogenization. For the sake of completeness, let us give a short overview of the tremendous progress that has been achieved in the quantitative theory of stochastic homogenization in recent years. The earliest (non-optimal) quantitative homogenization results for linear elliptic equations are due to Yurinskiȋ [83]. A decade later, Naddaf and Spencer [74] introduced the use of spectral gap inequalities in stochastic homogenization and derived optimal fluctuation estimates in the regime of small ellipticity contrast $\|a - \mathrm{Id}\|_{L^\infty} \ll 1$, i. e. in a perturbative setting. Another decade later, Caffarelli and Souganidis derived the first (though only logarithmic) rates of convergence for nonlinear stochastic homogenization problems [30]. Gloria and Otto [51,52] and Gloria, Neukamm, and Otto [47] succeeded in the derivation of optimal homogenization rates for discrete linear elliptic equations with i. i. d. random conductances. Subsequently, these results were generalized to elliptic equations on $\mathbb{R}^d$ and correlated probability distributions by Gloria, Neukamm, and Otto [48,49]. For coefficient fields $a$ whose correlations decay quickly on scales larger than $\varepsilon > 0$, these quantitative estimates for the homogenization error (that is, for the difference between the solutions to the PDE with the random coefficient field (1) and its homogenized approximation (2)) read with $C(a)$ satisfying stretched exponential moment bounds and for suitable $p = p(d)$. Armstrong and Smart [9] were the first to obtain power-law rates of convergence for nonlinear equations, deriving and employing an Avellaneda-Lin type regularity estimate [12]; see also Armstrong and Mourrat [8]. Their estimates also come with optimal (almost Gaussian) stochastic moment bounds.
Recently, the progress in stochastic homogenization culminated in the derivation of the optimal homogenization rates with optimal stochastic moment bounds by Armstrong, Kuusi, and Mourrat [5] and Gloria and Otto [53]: For finite range of dependence $\varepsilon$, a quantitative error bound for the homogenization error of the form (12) holds true with a random constant $C(a)$ with almost Gaussian moments $\mathbb{E}[\exp(C(a)^{2-\delta}/C(\delta))] \le 2$ for any $\delta > 0$.
Higher-order approximation results in terms of homogenized problems have been derived in [19,20,21,54,67], relying on the concept of higher-order correctors which was first used in the stochastic homogenization context in [42] to establish Liouville principles of arbitrary order in the spirit of Avellaneda and Lin's result in periodic homogenization [11]. Further works in quantitative stochastic homogenization include the analysis of nondivergence form equations [7], a regularity theory up to the boundary [43], degenerate elliptic equations [2,44], and the homogenization of parabolic equations [3,64]. Recently, Armstrong and Dario [4] and Dario [35] succeeded in establishing quantitative homogenization for supercritical Bernoulli bond percolation on the standard lattice.
The fluctuations of the mathematical objects arising in the stochastic homogenization of linear elliptic PDEs have been the subject of a beautiful series of works, starting with the work of Nolen [75] and a subsequent work of Gloria and Nolen [50] on quantitative normal approximation for (a single component of) the approximation of the effective conductivity $a^{\mathrm{RVE}}$ and a work of Mourrat and Otto [72] on the correlation structure of fluctuations in the homogenization corrector $\phi_i$. Mourrat and Nolen [71] have shown a quantitative normal approximation result for the fluctuations of the corrector. Gu and Mourrat [55] have derived a description of fluctuations in the solutions to the equation with random coefficient field (1). Recently, a pathwise description of fluctuations of the solutions to the equation with random coefficient field (1) was developed by Duerinckx, Gloria, and Otto [37], namely in terms of deterministic linear functionals of the so-called homogenization commutator $\Xi := (a - a_{\mathrm{hom}})(e_i + \nabla\phi_i)$, a random field converging (for $\varepsilon \to 0$) towards white noise. As far as quantitative normal approximation results are concerned, all of these works operate under the assumption of i.i.d. coefficients (in the discrete setting) or second-order Poincaré inequalities. To the best of our knowledge, the present work provides the first quantitative description of fluctuations (though so far limited to the approximation of the effective conductivity $a^{\mathrm{RVE}}$) when the decorrelation in the coefficient field is quantified by the assumption of finite range of dependence instead of functional inequalities.
Note that despite its long history [34,61,65,76], the qualitative theory of stochastic homogenization has also been a very active area of research in the past years, see e. g. [10,26,56,57]; however, due to the substantial length of the present manuscript we shall not provide a more detailed discussion and refer the reader to these references instead.
Notation. Throughout the paper, we shall use standard notation for Sobolev spaces and weak derivatives; for a space-time function $v(x, s)$, we denote by $\nabla v$ its spatial gradient (in the weak sense) and by $\partial_s v$ its (weak) time derivative. The notation $\fint_B f \, dx := \frac{\int_B f \, dx}{\int_B 1 \, dx}$ is used for the average integral over a set $B$ of positive but finite Lebesgue measure. The space of measurable functions $f$ with $\|f\|_{L^p} := \big(\int_{\mathbb{R}^d} |f|^p \, dx\big)^{1/p} < \infty$ will be denoted by $L^p$. By $L^p_{\mathrm{loc}}$ we denote the space of functions $f$ with $f \chi_{\{|x| \le R\}} \in L^p$ for all $R < \infty$. We shall also use the weighted As usual, we shall denote by $C$ and $c$ constants whose value may change from occurrence to occurrence. We are going to use the notation $C(a)$ and similar expressions to denote a random constant subject to suitable moment bounds; again, the precise value of $C(a)$ may change from occurrence to occurrence.
For a vector v ∈ R m we denote by |v| its Euclidean norm. We denote the identity matrix in R N ×N by Id or Id N . For a matrix A ∈ R m×m we shall denote by |A| its natural norm |A| := max v,w∈R m ,|v|=|w|=1 |v · Aw| and by A * its transpose (as all our matrices are real). For x ∈ R d we denote by |x| ∞ = max i |x i | its supremum norm. By |x−y| per respectively (for sets) dist per (U, V ), we denote the periodicity-adjusted distance (in the context of the torus [0, Lε] d ). By |x − y| per ∞ and dist per ∞ (x, y), we denote the corresponding distances associated with the maximum norm. For a positive definite matrix A, we denote by κ(A) its condition number.
Given a positive definite symmetric matrix Λ ∈ R N ×N , we denote the Gaussian with covariance matrix Λ by For γ > 0, we equip the space of random variables X with stretched exponential moment E[exp(|X| γ /a)] < ∞ for some a = a(X) > 0 with the norm ||X|| exp γ := For a map f : R N → V into a normed vector space V , we denote for any r > 0 by osc The conditional expectation of a random variable X given Y is denoted by E[X|Y ].

Main Results
In the present work, we establish a rigorous justification of the selection approach for representative volumes by Le Bris, Legoll, and Minvielle [62] in the context of stochastic homogenization of linear elliptic PDEs for quite general probability distributions of the coefficient field a R d : Our only assumptions on the probability distribution of the coefficient field a R d : R d → R d×d are uniform ellipticity and boundedness, stationarity, and finite range of dependence, which is a standard set of assumptions in stochastic homogenization [9,53] (note that we equip the space of uniformly elliptic and bounded coefficient fields with the topology of Murat and Tartar's H-convergence [73]). Let us remark that all of our results and proofs are also valid in the case of strongly elliptic systems, upon adapting the notation in the obvious way.
(A1) Uniform ellipticity of a coefficient field $a$ as usual means that there exists a positive real number $\lambda > 0$ such that almost surely we have $a(x)v \cdot v \ge \lambda |v|^2$ for a. e. $x \in \mathbb{R}^d$ and every $v \in \mathbb{R}^d$. Furthermore we assume uniform boundedness in the sense that almost surely $|a(x)v| \le \frac{1}{\lambda} |v|$ holds for a. e. $x \in \mathbb{R}^d$ and every $v \in \mathbb{R}^d$.
(A2) Stationarity means that the law of the shifted coefficient field $a(\cdot + x)$ must coincide with the law of $a(\cdot)$ for every $x \in \mathbb{R}^d$. On a heuristic level, this means that "the probability distribution of $a$ is everywhere the same" or, in other words, that the material is spatially statistically homogeneous.
(A3) Finite range of dependence $\varepsilon$ means that for any two Borel sets $A, B \subset \mathbb{R}^d$ with $\operatorname{dist}(A, B) \ge \varepsilon$ the restrictions $a|_A$ and $a|_B$ must be stochastically independent. In particular, this assumption restricts the correlations in the coefficient field to the scale $\varepsilon \ll 1$.
Note that these assumptions include e. g. the case of a two-material composite with random (either overlapping or non-overlapping) inclusions of diameter $\varepsilon$, the centers distributed according to a Poisson point process (up to removal in case of overlap); see Figure 3a. Further examples include coefficient fields $a_{\mathbb{R}^d}(x) := \xi(\tilde a(x))$ that arise by pointwise application of a nonlinear function $\xi : \mathbb{R}^{d \times d} \to \mathbb{R}^{d \times d}$ to a (tensor-valued) stationary Gaussian random field $\tilde a$ with finite range of dependence $\varepsilon$ and integrable correlations, provided that the function $\xi$ is Lipschitz and takes values in the set of uniformly elliptic and bounded matrices.
For the approximation of the effective coefficient $a_{\mathrm{hom}}$, it is of advantage to work with a so-called periodization of the stationary ensemble of random coefficient fields $a_{\mathbb{R}^d}$ (employing terminology from statistical mechanics, a probability measure on the space of coefficient fields shall also be called an ensemble of coefficient fields). By a periodization of an ensemble of coefficient fields $a_{\mathbb{R}^d}$ we understand an ensemble of coefficient fields $a$ which are almost surely $L\varepsilon\mathbb{Z}^d$-periodic for some $L \gg 1$ and for which the probability distribution of $a$ on each cube whose side length is half the period, $\frac{L\varepsilon}{2}$, coincides with the probability distribution of the original coefficient field $a_{\mathbb{R}^d}$, i. e. for which the probability distribution of $a|$ Furthermore, to include examples like the random checkerboard in our analysis, we need the following notion of discrete stationarity.
(A2') We say that our probability distribution of coefficient fields a satisfies discrete stationarity if the law of the shifted coefficient field a(· + x) coincides with the law of a(·) for every shift x ∈ εZ d .
Our main assumptions stated in Assumption 1 below consist of two parts: First, we assume that the probability distribution of coefficient fields a R d satisfies the standard assumptions from stochastic homogenization and that there exists a suitable periodization a of the probability distribution. Second, we require the statistical quantities F(a) to admit a "multilevel local dependence structure decomposition" as introduced in Definition 6 below. Let us remark that both the spatial average and the higher-order quantity F 2−point (a) considered by Le Bris, Legoll, and Minvielle [62] as defined in (10) satisfy the conditions in Definition 6; a proof of this fact is provided in Proposition 7 below. As a consequence, both the spatial average F avg (a) and the higher-order quantity F 2−point (a) may be chosen as the statistical quantities by which the selection of representative volumes is performed in our main theorems Theorem 2 and Theorem 3.

Assumption 1 (Assumptions and Notation).
Consider a probability distribution of random coefficient fields a R d on R d , d ≥ 1, which satisfies the conditions of ellipticity, stationarity, and finite range of dependence (A1)-(A3). Let L ≥ 2 and suppose that there exists an Lε-periodization a of the probability distribution of a R d subject to (A1), (A2), (A3 a ) -(A3 c ). Denote by a RVE the approximation for the effective coefficient a hom by the standard representative volume element method with a material sample of size with φ i being the unique Lε-periodic solution with vanishing average to the corrector equation Let F(a) = (F 1 (a), . . . , F N (a)) be a collection of statistical quantities of the coefficient field a which are subject to the conditions of Definition 6 with K ≤ C 0 , B ≤ C 0 | log L| C0 , and γ ≥ c 0 for some 0 < c 0 , C 0 < ∞. Suppose that the covariance matrix of F(a) is nondegenerate and bounded in the natural scaling in the sense For any 1 ≤ i, j ≤ d introduce the condition number κ ij of the covariance matrix of (a RVE ij , F(a)) and the ratio r Var,ij between the expected order of fluctuations and the actual fluctuations of the approximation a RVE ij r Var,ij := Denote by C a constant depending on d, λ, γ, N , and C 0 .
Under the above assumptions, the selection approach for representative volumes to capture certain statistical properties of the material in the representative volume particularly well -as proposed by Le Bris, Legoll, and Minvielle [62] -leads to the following increase in accuracy of the computed material coefficients.

Theorem 2 (Justification of the Selection Approach for Representative Volumes).
Let the assumptions and notations of Assumption 1 be in place. Denote by a sel-RVE the approximation for the effective coefficient a hom by the selection approach for representative volumes introduced by Le Bris, Legoll, and Minvielle [62] in the case of a representative volume of size Lε. Suppose that the representative volumes a| [0,Lε] d are selected from the periodized probability distribution according to the criterion for some δ ∈ (0, 1]. Let the selection criterion be chosen not too strict in the sense that δ N ≥ CL −d/2 | log L| C(d,γ,C0) . Then the selection approach for representative volumes is subject to the following error analysis: a) The systematic error of the approximation a sel-RVE satisfies the estimate b) The variance of the approximation a sel-RVE is estimated from above by where |ρ| 2 is the fraction of the variance of a RVE ij explained by the F(a), that is, |ρ| 2 is the maximum of the squared correlation coefficient between a RVE ij and any linear combination of the F n (a). The explained fraction of the variance is given by the formula c) The probability that a randomly chosen coefficient field a satisfies the selection criterion (14) is at least d) The systematic error and the variance of a sel-RVE may be estimated independently of κ ij at the price of lower rate of convergence in L and The previous theorem states that the approximation of effective coefficients by the selection approach for representative volumes is essentially at least as accurate as a random selection of samples (except for a possible additional relative error of the order CL −d/2 | log L| C , which however converges to zero quickly as L increases), at least when measuring the mean-square error. If the selection is based on a statistical quantity F(a) which is capable of explaining a large part of the variance of a RVE ij , the selection approach achieves a much better accuracy than a random selection of samples (namely, by a factor of about 1 − |ρ| 2 ).
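The factor $1 - |\rho|^2$ can be checked numerically in the idealized situation that underlies the proof, namely when the pair (approximation, statistical quantity) is exactly jointly Gaussian. The sketch below assumes a standard bivariate Gaussian $(X, Y)$ with correlation $\rho$, where $X$ plays the role of $a^{\mathrm{RVE}}_{ij}$ and $Y$ that of a single rescaled quantity $F(a)$; the function name is hypothetical and the error terms of Theorem 2 are not modelled.

```python
import math
import random
import statistics

def conditional_variance(rho, delta, n_draws=200_000, seed=0):
    """Sample variance of X given |Y| <= delta, for a standard bivariate Gaussian (X, Y)
    with correlation rho, together with the observed acceptance rate."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_draws):
        y = rng.gauss(0.0, 1.0)
        z = rng.gauss(0.0, 1.0)
        x = rho * y + math.sqrt(1.0 - rho * rho) * z  # Var x = 1, Corr(x, y) = rho
        if abs(y) <= delta:                           # idealized selection criterion
            kept.append(x)
    return statistics.variance(kept), len(kept) / n_draws
```

For example, `conditional_variance(0.9, 0.1)` should return a variance close to $1 - 0.9^2 = 0.19$ (up to sampling error and a small contribution of order $\delta^2$), while the acceptance rate is of order $\delta$.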
However, the previous theorem only provides a statement about the reduction of the mean-square error by the selection approach for representative volumes. A natural question is whether this reduction of the error also applies to rare events: More precisely, if we fix a small probability $p > 0$, is the bound on the error $|a^{\mathrm{sel\text{-}RVE}}_{ij} - a_{\mathrm{hom},ij}|$ which holds with probability $1 - p$ also improved as suggested by the variance reduction estimate (16)? The following theorem shows that this is in fact true for "moderate deviations", i. e. basically for probabilities $p \gtrsim \exp(-L^\beta)$ for some $\beta > 0$. More precisely, the theorem is to be read as follows: Up to error terms that converge to zero as $L \to \infty$ and $s \to \infty$, the probability of $a^{\mathrm{sel\text{-}RVE}}_{ij}$ deviating from $a_{\mathrm{hom},ij}$ by more than $s$ times the ideally reduced standard deviation $\sqrt{(1 - |\rho|^2) \operatorname{Var} a^{\mathrm{RVE}}_{ij}}$ behaves like the probability of a normal distribution deviating from its mean by more than $s$ standard deviations, at least in some regime $s \le L^{\beta/3}$.
Theorem 3. Let the assumptions and notations of Theorem 2 be in place. Suppose in addition $L \ge C$. Then the selection approach for representative volumes leads to a reduction of the "outliers" of the probability distribution of $a^{\mathrm{sel\text{-}RVE}}$ in the sense of the moderate-deviations-type bound
We have shown in the preceding two theorems that the selection approach for representative volumes by Le Bris et al. essentially does not increase the error; it succeeds in reducing the fluctuations of the approximations as soon as the functionals $F(a)$ and the approximation $a^{\mathrm{RVE}}$ have a nonzero covariance.
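The Gaussian heuristic behind this reading can be made concrete with the exact tail of a normal distribution. The sketch below only evaluates the idealized Gaussian reference; it ignores the error terms and the restriction to a moderate-deviations regime, and the function names are hypothetical.

```python
import math

def two_sided_gaussian_tail(s):
    """P(|Z| > s) for a standard normal random variable Z."""
    return math.erfc(s / math.sqrt(2.0))

def deviation_threshold(p, var_rve, rho_sq):
    """Error level t such that a centered Gaussian with the ideally reduced variance
    (1 - rho_sq) * var_rve exceeds t in absolute value with probability p."""
    sigma = math.sqrt((1.0 - rho_sq) * var_rve)
    lo, hi = 0.0, 40.0
    for _ in range(200):  # bisection on the monotone tail function
        mid = 0.5 * (lo + hi)
        if two_sided_gaussian_tail(mid) > p:
            lo = mid
        else:
            hi = mid
    return hi * sigma
```

In this idealized picture, the error level that holds with probability $1 - p$ shrinks by the factor $\sqrt{1 - |\rho|^2}$; for instance, `deviation_threshold(0.05, 1.0, 0.75)` is half of `deviation_threshold(0.05, 1.0, 0.0)`.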
However, as we shall show in the next theorem there exist cases in which the selection approach for representative volumes in fact fails to reduce the variance significantly, even for a "natural" statistical quantity like the average of the coefficient field Theorem 4 (Possible Failure of the Reduction of the Variance). Suppose that the assumptions of Theorem 2 hold. Then the estimate (16) on the reduction of the variance is sharp in the sense Furthermore, for d ≥ 2 there exist Lε-periodic probability distributions of coefficient fields a which satisfy the conditions of ellipticity, discrete stationarity, and finite range of dependence (A1), (A2'), (A3 a ) -(A3 c ) with the following property: The covariance of a RVE and the spatial average − a vanishes while the fluctuations of a RVE and − [0,Lε] d a are nondegenerate in the sense for some universal constant c. These coefficient fields may be chosen to be of the form a(x) =ã(x) Id for some scalar random fieldã.
As a consequence, for these probability distributions of coefficient fields the selection approach for representative volumes based on the spatial average − a fails to efficiently reduce the variance in the sense Let us note that it is presumably not too difficult to replace the random checkerboard in our construction of the counterexample featuring (23) by random spherical inclusions distributed according to a Poisson point process (with overlaps of the inclusions). This would yield a counterexample subject to the continuous stationarity (A2).
The next theorem suggests that the failure of effective variance reduction is atypical and may be limited to rather artificial examples: For a large class of random coefficient fields -namely for coefficient fields that are obtained from a collection of iid random variables ξ k , k ∈ εZ d , by applying a stationary monotone map with finite range of dependence -the correlation coefficient between a RVE and the average F(a) := − a is bounded from below by a positive number. Therefore, for such (ensembles of) coefficient fields both the method of special quasirandom structures and the method of control variates in fact reduce the variance by some factor τ < 1 when applied with the choice F(a) := − a.

Proposition 5 (Reduction of the Variance for a Large Class of Coefficient Fields).
Let $\varepsilon > 0$, let $L \ge 2$ be an integer, and let $V$ denote some measure space. Let $(\Gamma_k)$, $k \in \varepsilon\mathbb{Z}^d \cap [0, L\varepsilon)^d$, be a collection of independent identically distributed $V$-valued random variables, and denote by $(\tilde\Gamma_k)$ an independent copy. Extend $\Gamma_k$ to $k \in \varepsilon\mathbb{Z}^d$ by $L\varepsilon$-periodicity. For $k \in \varepsilon\mathbb{Z}^d$ and $z \in V$, denote by $\Delta_{k,z}\Gamma$ the collection $(\hat\Gamma_j)$ obtained by setting $\hat\Gamma_k := z$ and $\hat\Gamma_j := \Gamma_j$ for all $j \ne k$.
Let a = a(x, Γ) be a measurable map into the uniformly elliptic Lε-periodic symmetric coefficient fields with the property that a(x, Γ) depends only on the Γ k with |x − k| per ≤ Kε for some K ≥ 1 (in a measurable way). Suppose that the map is stationary in the sense that a(x + y, Γ) = a(x, Γ ·+y ) for any y ∈ εZ d .
Suppose that the dependence of a on Γ is monotone in the sense that for every k ∈ εZ d and every pair z 1 , z 2 ∈ V either for all x the inequality holds or for all x the reverse inequality holds. Suppose furthermore that there exists ν > 0 such that we have the quantified monotonicity 1/2 + denotes the matrix square root and whereΓ denotes an independent copy of Γ.
Then the probability distribution of a = a(x, Γ) satisfies the conditions of ellipticity, periodicity, and finite range of dependence (A1), (A3 a ), and (A3 b ) (with ε replaced by 4Kε), as well as the discrete stationarity (A2'). Furthermore, for such coefficient fields a the correlation between ξ · a RVE ξ (where ξ ∈ R d is any nonzero vector) and the average In the statements of our main theorems, we have made use of the following notion of "multilevel local dependence decomposition"; this structure will also be at the heart of the proof of our main results. An illustration of this decomposition is provided in Figure 4.
Definition 6 (Sums of Random Variables with Multilevel Local Dependence Structure). Let $d \ge 1$, $N \in \mathbb{N}$, $\varepsilon > 0$, and $L \ge 2$. Consider a probability distribution of coefficient fields $a$ on $\mathbb{R}^d$ subject to the assumptions of ellipticity and boundedness,

Figure 4. An illustration of the "multilevel local dependence structure" introduced in Definition 6 (in a one-dimensional setting). At the bottom, a sample of the random coefficient field $a$ is depicted; the $X^k_y$ may depend not only on the values of the coefficient field directly below their box, but on the coefficient field in a region that is wider by a factor of $K \log L$.

We then say that X is a sum of random variables with multilevel local dependence if there exist random variables X m y = X m y (a), 0 ≤ m ≤ 1 + log 2 L and y ∈ 2 m εZ d ∩ [0, Lε) d , and constants K ≥ 1, γ ∈ (0, 2], and B ≥ 1 with the following properties: • The random variable X m y (a) only depends on • The random variables X m y satisfy the bound The following proposition shows that the approximation a RVE of the effective coefficient by the method of representative volumes may indeed be rewritten as a sum of random variables with a multilevel local dependence structure. We establish the same result for the spatial average of the coefficient field F avg (a) := − [0,Lε] d a dx and the second-order term F 2−point (a) in the low ellipticity contrast expansion of a RVE given by (10).
Furthermore, the last result of the next proposition shows that the fraction of the variance of a RVE that is explained by the statistical quantities F avg (a) and F 2−point (a) -that is, the gain in accuracy achieved by the selection approach for representative volumes when employing these statistical quantities -stabilizes as the size L of the representative volume increases; more precisely, it converges to some limit with rate L −d/2 | log L| C . Proposition 7. Let the assumptions (A1), (A2), (A3 a ) -(A3 c ) be satisfied, that is consider the periodization of a stationary ensemble of random coefficient fields. For any coefficient field a, denote by φ i the unique (up to additions of constants) periodic solution to the corrector equation Then the approximation a RVE of the effective coefficient a hom by the representative volume element method, given by is a sum of a family of random variables with multilevel local dependence. More precisely, a RVE satisfies the criteria of Definition 6 for any γ < 1 with K := C(d, λ) and Furthermore, the spatial average is also a sum of a family of random variables with multilevel local dependence. The criteria of Definition 6 are satisfied by F avg (a) for any γ < ∞ with K := C(d) and Additionally, the second-order correction to the effective conductivity in the setting of small ellipticity contrast F 2−point , given by (27) with v i denoting the solution to is a sum of random variables with multilevel local dependence structure: The random variable F 2−point (a) satisfies the criteria of Definition 6 for any γ < 1 with K := C(d, λ) and B := C(d, γ, λ)| log L| C(d,γ) .
Finally, the rescaled variances and covariances of a RVE and the statistical quantities F avg (a) and F 2−point (a) converge as L → ∞: There exist positive semidefinite

Strategy of the proof and intermediate results
Our main result relies on a quantitative normal approximation result for the joint probability distribution of the approximation of the effective conductivity a RVE and auxiliary random variables F(a) like the spatial average − [0,Lε] d a dx. The distance of the probability distribution to a multivariate Gaussian will be quantified through the following notion of distance between probability measures. Note that this distance is a standard choice in the theory of multivariate normal approximation, see e. g. [32] and the references therein.
Definition 8. Given a symmetric positive definite matrix Λ ∈ R N ×N and somē L < ∞, we consider the classes ΦL Λ of functions φ : R N → R subject to the following properties: • φ is smooth and its first derivative is bounded in the sense |∇φ(x)| ≤L for all x ∈ R N . • For any r > 0 and any where osc r φ(x) is the oscillation of φ defined as The class Φ Λ is defined as Furthermore, we introduce the distance D between the law of an R N -valued random variable X and the N -variate Gaussian N Λ as Note that defining the distance D with the class of functions Φ 1 Λ instead of Φ Λ would lead to the 1-Wasserstein distance. The distance D is a stronger distance than the 1-Wasserstein distance: The 1-Wasserstein distance is defined by taking the supremum in (30) only over all functions φ which are 1-Lipschitz. In contrast, the condition (29) corresponds more or less to a slightly stronger condition than an L 1 loc -type bound for ∇φ: It in particular implies by letting r → 0 It is well-known that Stein's method of normal approximation allows to establish a quantitative result on normal approximation for sums of random variables with local dependence structure, see e. g. [32,33,78] and the references therein. However, the approximation of the effective coefficient a RVE -that is, the random variable a RVE as defined by (4) -features global dependencies. It is shown in Proposition 7 that a RVE may nevertheless be approximated by a sum of random variables with a multilevel local dependence structure. We then employ the following quantitative central limit theorem for sums of vector-valued random variables with a multilevel local dependence structure, which is not covered by the normal approximation results for sums of random variables with a given dependency graph in the literature and which is established in the companion article [41].
Theorem 9 ([41, Theorem 4]). Consider a probability distribution of uniformly elliptic and bounded coefficient fields a on R d or a periodization of such a probability distribution, and suppose that assumptions (A1)-(A3) respectively (A1), (A2), (A3 a )-(A3 c ) are satisfied. Let X = X(a) be a random variable that is a sum of random variables with multilevel local dependence in the sense of Definition 6. Then the law of the random variable X is close to a multivariate Gaussian in the sense where Λ := Var X and where the constant C(d, γ, N, K) depends in a polynomial way on d, N , and K.
Furthermore, we have for any symmetric positive definite providing a better bound in the case of degenerate covariance matrices Var X.
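To get a feeling for what a quantitative normal approximation measures, one can look at the simplest possible case. The sketch below is far removed from the setting of Theorem 9: it assumes a one-dimensional sum of i.i.d. random signs, and it estimates the 1-Wasserstein distance (not the stronger distance D) to the standard normal distribution by matching sorted samples with Gaussian quantiles. The function names are hypothetical.

```python
import random
from statistics import NormalDist

def wasserstein1_to_standard_normal(samples):
    """Approximate 1-Wasserstein distance between the empirical law of `samples`
    and N(0, 1), using the quantile coupling on a midpoint grid."""
    xs = sorted(samples)
    n = len(xs)
    normal = NormalDist()
    return sum(abs(x - normal.inv_cdf((i + 0.5) / n)) for i, x in enumerate(xs)) / n

def normalized_sum(n_terms, rng):
    """Sum of n_terms independent random signs, rescaled to unit variance."""
    return sum(rng.choice((-1.0, 1.0)) for _ in range(n_terms)) / n_terms ** 0.5

def clt_distances(term_counts=(1, 4, 16, 64), n_samples=2000, seed=0):
    """Estimated distance to N(0, 1) for sums with increasing numbers of terms."""
    rng = random.Random(seed)
    return {n: wasserstein1_to_standard_normal(
                [normalized_sum(n, rng) for _ in range(n_samples)])
            for n in term_counts}
```

The distances returned by `clt_distances()` decrease as the number of terms grows, until they reach the level of the sampling noise of the estimator.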
Our result on moderate deviations of the probability distribution of a sel-RVE is based on the following simple general moderate deviations result for sums of random variables with multilevel local dependence structure.
Theorem 10 ([41, Theorem 5]). Consider an ensemble of coefficient fields a on R d , d ≥ 1, or its periodization for some L ≥ 1, subject to the conditions (A1)-(A3) respectively (A1), (A2), and (A3 a )-(A3 c ). Let X = X(a) be a random variable that may be written as a sum of random variables with multilevel local dependence structure X = i in the sense of Definition 6. Then there exists β = β(d, γ) > 0 and a positive definite symmetric matrix Λ ∈ R N ×N with |Λ − Var X| ≤ C(d, γ, N, K)B 2 L −2β L −d such that for any measurable A ⊂ R N we have the estimate

Justification of the selection approach for representative volumes
We now provide the proof of our main result -the error estimates for the selection approach for representative volumes by Le Bris, Legoll, and Minvielle [62] -which is stated in Theorem 2 and Theorem 3.
The idea for the proof of all statements of Theorem 2 is the following: Theorem 9 enables us in conjunction with Proposition 7 to approximate the joint probability distribution of a RVE and F(a) by a multivariate Gaussian with the same covariance matrix. The probability distribution of a sel-RVE arises as the probability distribution of a RVE conditioned on the event (14). As a consequence, the probability distribution of a sel-RVE may be approximated by the marginal of the conditional probability distribution of an ideal multivariate Gaussian. The results of Theorem 2 on the probability distribution of a sel-RVE are then a consequence of corresponding properties of multivariate normal distributions.
Proof of Theorem 2. For the proof of the theorem we may assume without loss of generality that E[F(a)] = 0. Throughout the proof, the constants c and C may depend on d, λ, N , γ, c 0 , and C 0 , if not otherwise stated.
Recall that the probability distribution of a sel-RVE is given by the probability distribution of a RVE conditioned on the event (14). Theorem 9 and Proposition 7 entail that the joint probability distribution of any component a RVE where the renormalization factor p is given by The assertions (15) and (16) on the systematic error and the variance reduction in Theorem 2 will be a consequence of the lower bound (18) on the probability of a random coefficient field satisfying the selection criterion, the related lower bound the stretched exponential moment bounds for any γ < 1/2 and the approximation result of the distribution of a sel-RVE for any continuousφ : R → R satisfying for all r > 0 and all x 0 ∈ R. To obtain the κ-independent estimates (19) and (20), the bound (37) is replaced by We defer the proof of (18) and (37) (as well as (39)) to the last step and first demonstrate that these estimates entail the assertions (15) and (16) of our theorem.
Step 1: Estimate on the systematic error. In order to derive the estimate on the systematic error (15), we first use the formula (34) and Fubini's theorem to see thatˆx where in the second step we have used the symmetry of the Gaussian N Var F (a) . In other words, if the probability distribution of (a RVE , F(a)) were an ideal multivariate Gaussian, we would have the perfect equality We would now like to transfer the property (40) (up to an error) from M δ to our actual probability distribution a sel-RVE by choosingφ(x) := x in the estimate (37). However, this choice is not possible due to the upper bound onφ in (38a). Instead, for some cutoff factor B c ≥ 1 we consider the functionφ( Note that for this choice ofφ we have |∇φ| ≤ 1 and |φ| ≤ B c L −d/2 . As a consequence, 1 Bcφ satisfies (38) and hence is an admissible choice in (37), which gives by (40) Using first the lower bounds (18) and (35) and the representation (44) and then in the next step Hölder's inequality, the previous estimate implies , y) dy dx This yields by Lemma 19b and the bounds (36a) and (36b) .
Plugging in the bound for the systematic error of the standard representative volume element method |E[a RVE ] − a hom | ≤ CL −d | log L| C from [53] (note that this estimate for the systematic error of the standard representative volume element method may also be derived by slightly modifying the proof of our Proposition 7), we obtain (15). Repeating the previous proof but replacing the use of the estimate (37) by (39), we obtain (19).
Step 2: Proof of the variance reduction estimate. To prove the variance estimate (16), we proceed similarly and define for a cutoff factor B c ≥ 1 the function Note that this function satisfies the global bounds |∇φ| ≤ 2B c L −d/2 and |φ| ≤ B 2 c L −d . Thus, (38) and is therefore an admissible choice in (37), yielding The tails (subject to truncation in our choice of φ) can be estimated by where in the last step we have used (18), (35), and (44). Applying Hölder's inequality, we obtain where in the last step we have used Lemma 19b and the bounds (36a) and (36b).
Combining this estimate with (42) and choosing B c := C| log L| C(d,γ) , we infer In other words, the variance of a sel-RVE ij is determined up to an error by the variance of the probability distribution M δ . To estimate the latter, a straightforward computation yieldŝ By the symmetry of the set {|y| ≤ δL −d/2 } and the probability density N Var F (a) (y) we have´R N yχ {|y|≤δL −d/2 } N Var F (a) (y) dy = 0. As a consequence, we get Together with (43), this entails (16). To prove (20), we repeat the proof of (43) and just replace the use of (37) in the proof of (43) by (39). Note that the lower bound (22) on the variance given in Theorem 4 follows also from the estimates (43) and (15) and the lower bound´( , the latter of which is derived analogously to the upper bound Step 3: The probability density of the reference distribution. Var F(a) .
The probability density M δ of the first-variable marginal of the corresponding multivariate Gaussian conditioned on |F(a)| ≤ δL −d/2 , which is the probability distribution by which we approximate the distribution of a sel-RVE ij , is given by Our goal is to show that this probability density M δ may be rewritten in the form (34). To this aim, we recall some basic linear algebra: The Schur complement of the symmetric block matrix is given by T := A − BD −1 B T and the inverse of the matrix may be written as The determinant may be expressed as det M = det T ·det D. The Schur complement allows us to rewrite the quadratic form defined by M −1 as As a consequence, we get for M := Λ that Now, (34) and (44) are seen to be equivalent.
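For reference, the standard Schur complement identities invoked in this step can be written out as follows. This is a sketch of textbook linear algebra in the notation of the paragraph above, assuming the block form $M = \begin{pmatrix} A & B \\ B^T & D \end{pmatrix}$ with $D$ and $T$ invertible; it is not a transcription of the displayed formulas of the original text.

```latex
\begin{align*}
M &= \begin{pmatrix} A & B \\ B^T & D \end{pmatrix},
\qquad T := A - B D^{-1} B^T, \\
M^{-1} &= \begin{pmatrix}
T^{-1} & -T^{-1} B D^{-1} \\
-D^{-1} B^T T^{-1} & D^{-1} + D^{-1} B^T T^{-1} B D^{-1}
\end{pmatrix}, \\
\det M &= \det T \cdot \det D, \\
\begin{pmatrix} x \\ y \end{pmatrix}^{T} M^{-1} \begin{pmatrix} x \\ y \end{pmatrix}
&= (x - B D^{-1} y)^T \, T^{-1} \, (x - B D^{-1} y) + y^T D^{-1} y .
\end{align*}
```

The last identity is what lets a centered Gaussian density with covariance $M$ factorize into a Gaussian in $x$ with mean $B D^{-1} y$ and covariance $T$, times a Gaussian in $y$ with covariance $D$.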
Step 4: Proof of the normal approximation estimate and the lower bound on the probability of the event |F(a)| ≤ δL −d/2 . First, let us show the lower bound (35). We havê establishing (35).
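To make the Gaussian reference picture concrete, the following minimal sketch evaluates, for an exactly Gaussian scalar pair, the probability of the selection event and the variance after selection. It is an idealization: the theorem only has approximate Gaussianity at its disposal. The function names are hypothetical, `delta` is measured in units of the standard deviation of the selection statistic (an assumption made for this illustration, not the scaling $\delta L^{-d/2}$ of the text), and `rho` denotes the correlation coefficient.

```python
import math


def selection_probability(delta):
    """P(|Z| <= delta) for a standard normal Z (delta > 0)."""
    if delta <= 0.0:
        raise ValueError("delta must be positive")
    return math.erf(delta / math.sqrt(2.0))


def truncated_std_normal_variance(delta):
    """Variance of a standard normal Z conditioned on |Z| <= delta."""
    density = math.exp(-0.5 * delta * delta) / math.sqrt(2.0 * math.pi)
    return 1.0 - 2.0 * delta * density / selection_probability(delta)


def selected_variance_ratio(rho, delta):
    """Var(X | |Y| <= delta * sd(Y)) / Var(X) for a Gaussian pair (X, Y).

    Conditionally on Y, the variable X is Gaussian with variance
    (1 - rho^2) Var(X); the remaining part rho^2 Var(X) is multiplied by
    the variance of the truncated standard normal.
    """
    return (1.0 - rho * rho) + rho * rho * truncated_std_normal_variance(delta)


if __name__ == "__main__":
    for delta in (0.1, 0.5, 1.0, 3.0):
        print(delta, selection_probability(delta), selected_variance_ratio(0.9, delta))
```

For small `delta` the ratio approaches `1 - rho**2`, while the selection probability decays only linearly in `delta`; this is the trade-off between variance reduction and the number of rejected samples.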
The estimate (36b) is a consequence of the estimate on Var (a RVE , F(a)) which follows from (36a), (13), and the exponential moment bounds for Gaussians. The bound (36a) is a consequence of Lemma 12 (note that by Proposition 7, Lemma 12 is indeed applicable).
Our next goal is to show (37) and (39). Letφ : R → R satisfy (38) and suppose that we would like to estimate the error As the distribution of a sel-RVE ij is obtained from the distribution of a RVE ij by conditioning on the event |F(a)| ≤ δL −d/2 , by (34) and (44) this error expression is equal to Up to the normalizing factor 1/P |F(a)| ≤ δL −d/2 , the first term on the right-hand side is given by We would now like to show that (a suitable multiple of) the function φ is admissible in the error bound (33). By the estimate det Var F(a) 1/N r.
By our assumption (13), this yields for any Looking at Definition 8, we would have 1 C φ ∈ Φ Λ if it were not for the qualitative Lipschitz continuity condition for functions in Φ Λ . However, for a standard family of mollifiers ρ ε supported in {|x| 2 + |y| 2 ≤ ε} the approximations φ ε (x, y) := (ρ ε * φ)(x, (1−2δ −1 L d/2 ε)y) satisfy 1 C φ ε ∈ Φ Λ for any ε ∈ (0, 1 4 δL −d/2 ] (see Definition 8) for some constant C. Furthermore, the φ ε converge pointwise to φ for ε → 0 (by (47) and the continuity assumption onφ; it is here that we need the dilation factor (1 − 2δ −1 L d/2 ε) in the second variable due to the discontinuity in the definition (47)) and satisfy a uniform bound of the form |φ ε (x, y)| ≤ L −d/2 (by (47) and (38a)). Choosing the functions 1 C φ ε in the definition of the distance D and passing to the limit ε → 0, we infer ], F(a)), N Λ ). Theorem 9 is applicable to the random variable X := (a RVE ij , F(a)) by our assumptions on F(a) (see Assumption 1) and by the multilevel decomposition of a RVE ij provided by Proposition 7. In total, with the notation Λ := Var (a RVE ij , F(a)) the application of Theorem 9 to (a RVE ij , F(a)) yields where in the last step we have used (13) (which entails L −d ≤ |Λ 1/2 | 2 ) and the definition of κ ij .
We now turn to the proof of the moderate-deviations-type result for the selection approach for representative volumes stated in Theorem 3.
Proof of Theorem 3. FixS ≥ CL −d/2−β/2 . Our goal is to estimate the probability The main task is the derivation of a suitable estimate for the numerator. To this aim, we apply the moderate deviations estimate from Theorem 10 to the random variable (a RVE By Proposition 7 and our assumptions, the application of Theorem 10 is possible, resulting in the estimate for some positive definite matrixΛ with We intend to apply the factorization property (45) to the matrixΛ with the notatioñ Λ = ÃB By (52) and the bounds L −d Id ≤ Var F(a) ≤ CL −d Id (see (13)) and Var a RVE and As a consequence of these estimates and (52), the formula (17) for |ρ| 2 implies for Using the bounds Var a RVE ij ≤ CL −d | log L| d and |ρ| ≤ 1 as well as (54), (17), and (13), we obtain for any |y| ≤ (δ + CL −β )L −d/2 that Applying the factorization property (45) to the first term on the right-hand side of (51), we obtain Using (53) to estimate the last factor in this estimate and assuming for the moment S ≥ Cδ|ρ| Var a RVE ij as well as L ≥ C(β) to estimate the quotient in the first factor, we get Using the bound L −d Id ≤ Var F(a) from (13) and assuming L −2β ≤ c, we get NΛ(x, y) dx dy NΛ(x, y) dx dy ByT ≤ (1−|ρ| 2 )Var a RVE ij +CL −d−β Id (which follows from (55)) and Var a RVE ij ≤ CL −d Id, we deduce from (57) under the assumptionsS ≥ Cδ|ρ| Var a RVE ij and L ≥ C(β) As a consequence, we obtain Plugging in this bound into (51), we obtain Inserting the previous estimate into (50) and using (49), (35), and (18) as well as the assumption δ N ≥ CL −d/2 to estimate the denominator, we get

Note that we have the estimate $|\mathbb{E}[a^{\mathrm{RVE}}_{ij}] - a_{\mathrm{hom},ij}| \le C L^{-d} |\log L|^{C}$. By redefining $S$ (and possibly increasing the constant in (58); recall that $\tilde S \ge L^{-d/2-\beta/2}$), we obtain

The multilevel local dependence structure of the approximation for the effective conductivity
We now prove that the approximation $a^{\mathrm{RVE}}$ for the effective conductivity obtained by the representative volume element method may indeed be written as a sum of a family of random variables with multilevel local dependence structure in the sense of Definition 6. Furthermore, we show that the same is true for the spatial average of the coefficient field $F_{\mathrm{avg}}(a) := \fint_{[0,L\varepsilon]^d} a \,dx$ and also for the second-order correction $F_{2\text{-point}}(a)$ to $a^{\mathrm{RVE}}$ in the setting of small ellipticity contrast.
Proof of Proposition 7. Part 1: The spatial average of the coefficient. First, let us show that the average F avg (a) := − [0,Lε] d a dx is approximately the sum of a family of random variables with multilevel local dependence structure. Decomposing defining the X 0 y as indicated in this formula, and setting X m y := 0 for m ≥ 1, we immediately observe that the average F avg (a) is the sum of a family of random variables with multilevel local dependence structure with K := 1. The bound (26) follows immediately from the uniform bound on a (with B := ||a|| L ∞ and arbitrary γ > 0). Part 2: The approximation a RVE for the effective coefficient. Next, let us show that a RVE is approximately the sum of a family of random variables with multilevel local dependence structure. For simplicity of notation, let us assume that ε = 1.
Recall that the corrector φ i associated with the periodized ensemble is the unique L-periodic solution to the equation ∇ · (a(e i + ∇φ i )) = 0 (59) with vanishing average − [0,L] d φ i dx = 0. We shall use the decomposition of the (L-periodic) corrector φ i according to Observe that the parabolic PDE directly entails Thus, decay of u i for t → ∞ implies that φ i may indeed be decomposed aś Recall the key result from [53] which states that under the assumptions of ellipticity, stationarity, and finite range of dependence (A1)-(A3) the full-space variant u R d i (·, s) -that is, the solution to the equation with a R d denoting a coefficient field from the original (non-periodic) ensemble of coefficient fields -actually decays like s −(1+d/2)/2 in suitable norms:

Theorem 11 ([53], Corollary 4).
Consider an ensemble of random coefficient fields a R d subject to the assumptions (A1)-(A3) with range of dependence ε := 1. Then for any T > 0 we have the estimate where the random constant C(a R d , T ) satisfies for any δ > 0 a bound of the form Note that the second inequality (62b) is actually not contained in [53,Corollary 4]. However, it is an easy consequence of (62a) (the proof is provided below).
By φ * j and u * j we shall denote the corresponding quantities for the adjoint coefficient field a * , i. e. φ * j (·) :=´∞ 0 u * j (·, s) ds with u * j being the L-periodic solution to d ds The full space variants u * ,R d j satisfy also estimates of the form (62a)-(62b), as the conditions (A1)-(A3) are invariant under passing to the adjoint coefficient fields.
We introduce a "cutoff scale" L K as the largest integer power of 2 not larger than L 16K log L for some constant K ≥ 1 that remains to be chosen. Defining T L := (L K ) 2 , we now compute using the properties (59), (60), and (61) We now decompose the integrals into integrals over cubes with side length ∼ 2 k , resulting in We now intend to replace u i and u * j in each of these expressions by a proxy with localized dependence. To this aim, for any k ∈ N 0 and any x 0 ∈ 2 k Z d , define the coefficient field a k,x0 on the full space R d as Define a corresponding u i,k,x0 as the solution to the equation and introduce analogously the function u * i,k,x0 as the solution to the equation with a k,x0 replaced by a * k,x0 . Note that while u i and a are defined on [0, L] d and extended to R d by periodicity, both a k,x0 and u i,k,x0 are defined on R d and lack any periodicity.
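The choice of the cutoff scale and the dyadic levels can be sketched as follows. This is only an illustration under stated assumptions: `log` is taken to be the natural logarithm, only nonnegative integer powers of 2 are considered, and the prefactor 16 is exposed as the hypothetical parameter `factor`. The function names are likewise hypothetical.

```python
import math


def cutoff_scale(L, K, factor=16.0):
    """Largest power 2**k (k >= 0) not larger than L / (factor * K * log L)."""
    if L <= 1.0 or K < 1.0:
        raise ValueError("need L > 1 and K >= 1")
    bound = L / (factor * K * math.log(L))
    if bound < 1.0:
        raise ValueError("sample too small: no power 2**k with k >= 0 fits")
    scale = 1
    while 2 * scale <= bound:
        scale *= 2
    return scale


def dyadic_levels(L, K, factor=16.0):
    """Return (L_K, T_L, levels) with T_L = L_K**2 and levels k = 0..log2(L_K)."""
    L_K = cutoff_scale(L, K, factor)
    T_L = L_K ** 2
    levels = list(range(L_K.bit_length()))  # bit_length of 2**m is m + 1
    return L_K, T_L, levels


if __name__ == "__main__":
    L_K, T_L, levels = dyadic_levels(L=1024.0, K=1.0)
    # Level k corresponds to the time window [4**k, 4**(k+1)] and cubes of side ~ 2**k.
    for k in levels:
        print(k, 2 ** k, (4 ** k, 4 ** (k + 1)))
```

The logarithmic factor in the denominator leaves room for the enlarged dependence regions $x_0 + K \log L\,[-2^k, 2^k]^d$ inside the periodic cell.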
By Lemma 15 -applied with M := 1 2 K| log L| and r := 2 k -we have and analogous estimates for the difference u * j − u * j,k,x0 . As our probability distribution of coefficient fields a on [0, L] d is the periodization of a probability distribution of coefficient fields a R d on R d , by definition of a periodization (see (A3 c )) for each x 0 ∈ [0, L) d and any k ≤ log 2 L K the law of a| x0+K log L[−2 k ,2 k ] d coincides with the law of a R d | x0+K log L[−2 k ,2 k ] d . As a consequence, the law of u i,k,x0 coincides with the law of u R d i,k,x0 , where u R d i,k,x0 is defined analogously to u i,k,x0 (replacing a in the definition by a R d ). Therefore, any moment bound on u R d i,k,x0 carries over to u i,k,x0 . Applying Lemma 15 to u R d i,k,x0 , we obtain estimates analogous to (66) and (67). The estimates from Theorem 11 therefore carry over to u R d i,k,x0 , provided that we choose K ≥ C: We have for t ∈ [4 k , 4 k+1 ] and for any δ > 0. By coincidence of laws, we get for t ∈ [4 k , 4 k+1 ] and for random constants C satisfying for any δ > 0. Furthermore, the bound (102) yields an estimate of the form By (61), its analogue for u i,0,x0 , and the definition of a 0,x0 , we have in As a consequence of our definition of u i,k,x0 , for the choice for 0 ≤ k ≤ log 2 L K , we see by (64) and (65) and √ K log L ≥ 1 that X k x0 is a random variable which depends only on a| x0+K log L[−2 k ,2 k ] d , i. e. the first condition of Definition 6 is satisfied. Furthermore, by (68) and (69) we obtain for any 0 < γ < 1 an estimate of the form We now intend to replace the first five terms on the right-hand side of (63) by the X k x0 with 0 ≤ k ≤ log 2 L K + 1, using the estimates (66), (67), (70), and Hölder's inequality to bound the arising error: For example, we may estimate where in the last step we have used 4 k ≤ CL 2 and (2 k ) d/2 ≤ CL d/2 , absorbing these factors in the factor L −cK (possible for cK ≥ 4 + 2d).
Proceeding analogously for the other terms in (63), we deduce Inserting the estimates (68), (69), we get for some C(a) with ||C(a)|| exp γ ≤ C(d, λ, K, γ) for any γ ∈ (0, 1) The bound (66) and its equivalent for u R d i and u R d i,k,x0 enable us to transfer the bounds in Theorem 11 from u R d i to u i : Recalling that T L = (L K ) 2 , we obtain The latter estimate entails in view of Theorem 11 (choosing K ≥ C and recalling that √ where ||C(a, T L )|| exp γ ≤ C(d, λ, K, γ) for any γ < 1. An analogous bound holds for u * j . Finally, the energy estimate for As the average of u i over [0, L] d vanishes, the Poincaré inequality implies for and as a consequence Note that this estimate yields in particular where in the last step we have used that √ T L = L K is the largest power of 2 with L K ≤ L 4K log L . Plugging in these bounds and (75) into (73), we get for K ≥ C with ||C(a, T L )|| exp γ ≤ C(d, λ, K, γ) for any γ < 1. Choosing γ ∈ (0, 1) and B := C(d, λ, K, γ)(4K log L) 2+d in Definition 6, defining the variable X log 2 L+1 0 (which may depend on a on the full volume [0, L] d ) to account for the remaining difference , and setting the remaining X k i := 0 for log 2 L K + 1 < k < log 2 L + 1, this establishes that a RVE may be rewritten as a sum of a family of random variables with multilevel local dependence.
Part 3: The higher-order statistical quantity. Next, we derive the multilevel decomposition of the higher-order quantity in the small ellipticity contrast setting F 2−point . To do so, we decompose the solution v i to (28) as where w i is defined as the solution to the parabolic PDE d dt w i = ∆w i , w i (·, 0) = ∇ · (ae i ).
As before, the representation (77) follows from the exponential decay of w i , as we have −∆´T 0 w i (·, t) dt = ∇ · (ae i ) − w i (·, T ). We introduce analogous definitions for v * j . Again, we may assume without loss of generality that ε = 1. We then observe following an argument of Mourrat [70] that by formula (78) below Next, we deduce We may now proceed to argue just like in the case of a RVE . The required decay estimates for the semigroup of the form λ)) are now a consequence of the explicit heat kernel representation of the solution w i (as we are now dealing with a constantcoefficient parabolic equation), the finite range of dependence ε = 1 of the initial data w i (·, 0) = ∇ · (ae i ), and standard Gaussian concentration estimates (or, alternatively, -though then with a less strong stretched exponential bound -the concentration estimates of Lemma 20). In the computation above we have used the simple fact that Part 4: Convergence of the variance. Finally, we prove that the rescaled variances L d Var a RVE , L d Var F avg (a), and L d Var converge for L → ∞. We limit ourselves to proving convergence of the rescaled variance L d Var a RVE ; the proofs for the convergence of the other variances and the covariances are analogous. Furthermore, to simplify notation we limit ourselves to proving convergence of the variance for L = 2 n for some n ∈ N; the proof in the general case is similar.
By Lemma 12, we obtain Var a RVE ≤ C(d, λ, K)L −d | log L| C(d) . Using (76) and this estimate, we deduce Expanding the sum and using stochastic independence of many of these terms, we may write Denote by X k,R d y the quantities defined as in (71) but with u i,k,x0 and u * j,k,x0 replaced by u R d i and u * ,R d j , i. e. for example for k ≥ 0 and y ∈ 2 k Z d . By the full-space variants of the estimates (66), (67), and (70) (i. e. the estimates for the differences u R d i − u R d i,k,x0 etc., which are derived in exactly the same way) and (72) as well as the equality of laws of (products of the) u i,k,x0 etc. and (products of the) u R d i,k,x0 etc. , we get for k,k ≤ 1 + log By the definition of the X k y (see (71)), the definition of the u i,k,x0 , and the stationarity of the probability distribution of a R d , the covariance Cov[X k,R d y , Xk ,R d y ] depends only on k,k, y −ỹ, L, and the law of a R d (but not on y for fixed y −ỹ).
Furthermore, by (72) we have | Cov[Xk y , X k y ]| ≤ CL −2d . This implies by (79) Var a RVE − for K chosen large enough. The fact that (by stochastic independence) we have Cov[L d Xk y , L d X k y ] = 0 for |y−ỹ| per ≥ C(d)2 k K log L and k ≥k implies together with (79) and the definition of X k,∞ y that (by selecting K large enough and by choosing L to be just small enough for |y −ỹ| ≥ C(d)2 k K log L to hold in case |y −ỹ| ≥ C(d)K2 k and otherwise -i. e. for |y −ỹ| ≤ C(d)K2 k -appealing to the upper bound (72)) As a consequence, we obtain

This implies
.
We now distinguish the casesỹ ∈ [−R k 2 k , R k 2 k ] d andỹ / ∈ [−R k 2 k , R k 2 k ] d for some R k to be chosen. Using (80) in the latter case, we get Fork ≤ k and R2 k ≤ L K we have by Lemma 12 and (72) Cov X k y , which entails by (79) As a consequence, choosing R k = Sk for S ≥ 1 large enough we get In total, we have shown convergence of the rescaled variance L d Var a RVE towards a limit independent of L with the desired rate.
The proof of the other cases is analogous.
Proof of Theorem 11. The estimate (62a) is contained in [53,Corollary 4]. In view of the Poincaré inequality the bound (62b) is a consequence of (62a) and an estimate on a (weighted) average of u R d i . Hence, we only need to derive a bound on for a suitably chosen smooth function ψ supported in {|x| ≤ 1}. To this aim, we computeˆu which yields upon applying the Poincaré inequality to the second term (note that the second factor in the integral has vanishing average) and using the bound (62a) ˆu Summing over a dyadic sequence of times 2 k T and using the fact that almost surely we infer (62b) (upon redefining the constant C(a, T )).
In the previous proofs, we have made use of the following elementary concentration estimate for sums of random variables with multilevel local dependence.

Lemma 12 ([41], Lemma 9).
Consider a probability distribution of uniformly elliptic and bounded coefficient fields $a$ on $\mathbb{R}^d$ or a periodization of such a probability distribution, and suppose that assumptions (A1)-(A3) respectively (A1), (A2), (A3$_a$)-(A3$_c$) are satisfied. Let $X = X(a)$ be a random variable that is approximately a sum of random variables with multilevel local dependence in the sense of Definition 6. Then for $\bar\gamma := \gamma/(\gamma+1)$ the concentration estimate holds true.

Failure and Success of the Variance Reduction Approaches
We now establish our theorems on the failure and the success of the variance reduction approaches in stochastic homogenization. We start with the counterexample that shows that in general there is no guarantee that the variance reduction techniques provide an effective reduction of the variance, even for "natural" choices of the statistical quantity $F(a)$ like the spatial average $F_{\mathrm{avg}}(a) := \fint_{[0,L\varepsilon]^d} a \,dx$. Note that the derivation of (24) from (23) requires the estimate (22) under the assumption (A2') instead of (A2). However, the only place where the assumption (A2) entered our analysis is Proposition 7, where it was used to apply the result of [53] on the decay of the semigroup, and the arguments of [53] may be modified to yield the corresponding estimate under the assumption of discrete stationarity (A2').
Let us now turn to the construction of our counterexample featuring the degenerate covariance (23). The construction is based on the following ideas: • The approximation a RVE for the effective coefficient depends in a uniformly continuous way on a as a map L ∞ ([0, Lε] d ; R d×d ) → R d×d , as long as a is uniformly elliptic and bounded.
(which follows from the definition of a τ,κ and the independence of a and a τ ) and the fact that the latter two variances satisfy such a lower bound (note that the spatial average of the coefficient field on a tile with microstructure A τ does not equal σ Id). The non-degeneracy of Var a RVE τ,κ is shown as follows: First, a new coefficient field a τ,κ,eff is introduced by letting a τ,κ,eff = a τ,κ on each tile without microstructure but replacing the values of a τ,κ by the effective coefficient from periodic homogenization on each tile with microstructure. Note that a τ,κ,eff corresponds to a standard random checkerboard. Denote by a RVE τ,κ,eff the approximation for the effective coefficient associated with the coefficient field a τ,κ,eff (i. e. the result of formula (8) for the coefficient field a τ,κ,eff ). The nondegeneracy of Var a RVE τ,κ now follows from the nondegeneracy Var a RVE τ,κ,eff,ii L −d and the convergence |a RVE τ,κ − a RVE τ,κ,eff | → 0 for τ → 0 (uniformly in κ, see below). Note that a RVE τ,κ,eff corresponds to a random checkerboard with tiles (κσ + (1 − κ)) Id, κσ + (1 − κ) · 1 2 Id, κA τ + (1 − κ) Id, and κA τ + (1 − κ) · 1 2 Id, each tile chosen with probability 1 4 (and the microscopic tiles rotated and reflected at random). Thus the nondegeneracy of Var a RVE τ,κ,eff,ii for 1 ≤ i ≤ d follows from the covariance estimate of Proposition 5 and the quantitative upper bound Var − [0,Lε] d a τ,κ,eff dx ≤ CL −d .
To complete the proof, it only remains to establish the negativity of the covariance for $\tau \ll 1$ small enough and suitable $\sigma$, $\mu$, $\lambda$, as well as the convergence $a^{\mathrm{RVE}}_{\tau,\kappa} \to a^{\mathrm{RVE}}_{\tau,\kappa,\mathrm{eff}}$ for $\tau \to 0$, uniformly in $\kappa$. The underlying idea for our choice of the tiles in Figure 5 is that we intend to exploit the nonlinear dependence of the effective coefficients in periodic homogenization on the coefficient field, equipping such a tile with an effective coefficient that is unrelated to the spatial average of the coefficient field. Heuristically, by classical results in periodic homogenization we expect the following to happen:
• Consider our (sub)pattern of periodic horizontal stripes of equal height (i. e. the red-and-blue subpattern in Figure 5), in which the coefficient field $a$ alternately takes the values $\mathrm{Id}$ and $\lambda\,\mathrm{Id}$. Then the (large-scale) effective coefficient for this pattern is given by $\begin{pmatrix} \frac{1+\lambda}{2} & 0 \\ 0 & \frac{2\lambda}{1+\lambda} \end{pmatrix}$, that is by the arithmetic mean in the horizontal direction and by the harmonic mean in the vertical direction.
• Consider now the pattern of periodic vertical stripes of equal width, in which the coefficient alternatingly takes the value µ Id respectively is given by the pattern of horizontal stripes from the previous step. The effective coefficient for this (second-order laminate) pattern is (at least in the limit of an infinitesimally fine horizontal pattern) given by the arithmetic mean of the effective coefficients in the vertical direction and the harmonic mean of the effective coefficients in the horizontal direction, that is by Choosing µ : -which is positive for any λ ∈ (0, 1] -, the effective coefficient becomes a multiple of the identity matrix.
Note that the spatial average of the coefficient field on a tile is given by $\frac{2\mu+\lambda+1}{4}\,\mathrm{Id}$.
• Consider the coefficient field a τ,eff that is obtained from our random checkerboard with microstructure a τ by replacing a τ on the tiles with microstructure with the effective coefficient ( λ 1+λ + µ 2 ) Id. The coefficient field a τ,eff is now just a usual random checkerboard; by Lemma 13 and Proposition 5, the covariance is a positive multiple of Id ⊗ Id, and we have a lower bound of the form ≥ cL −d Id ⊗ Id for the choice of λ, µ, and τ to be made below. Note that a τ,eff -and hence also the preceding covariance -is actually independent of τ (we just keep the τ to emphasize that a τ,eff is the coefficient field obtained from a τ in the homogenization limit τ → 0). We shall prove below that a RVE τ is (quantitatively) close to a RVE τ,eff for τ 1 small enough, which implies that is close to a positive multiple of Id ⊗ Id (again with a lower bound of the form ≥ cL −d Id ⊗ Id).
• The average $\fint_{[0,L\varepsilon]^d} a_\tau \,dx$ is an affine function of $\fint_{[0,L\varepsilon]^d} a_{\tau,\mathrm{eff}} \,dx$: The coefficient field $a_{\tau,\mathrm{eff}}$ is constant on each tile and may only take the values $\sigma\,\mathrm{Id}$ or $\big(\frac{\lambda}{1+\lambda}+\frac{\mu}{2}\big)\,\mathrm{Id}$. On the tiles on which the value of $a_{\tau,\mathrm{eff}}$ is $\sigma\,\mathrm{Id}$, $a_\tau$ also takes the constant value $\sigma\,\mathrm{Id}$. However, on the tiles on which $a_{\tau,\mathrm{eff}}$ is given by $\big(\frac{\lambda}{1+\lambda}+\frac{\mu}{2}\big)\,\mathrm{Id}$ (i. e. on the tiles on which $a_\tau$ features a microstructure), the average of $a_\tau$ is $\frac{2\mu+\lambda+1}{4}\,\mathrm{Id}$. We thus have Choosing $\sigma$ such that $\sigma > \frac{\lambda}{1+\lambda}+\frac{\mu}{2}$ but $\sigma < \frac{2\mu+\lambda+1}{4}$ (which is possible for $\lambda > 0$ small enough), we obtain a relation of the form

It now only remains to prove two things: We need to show that $a^{\mathrm{RVE}}_\tau$ is quantitatively close to $a^{\mathrm{RVE}}_{\tau,\mathrm{eff}}$ if we choose the width $\tau$ of the vertical stripes and the height $\tau^2$ of the horizontal stripes in the pattern in Figure 5 small enough, and we need to establish the corresponding assertion for the interpolated coefficient field $a_{\tau,\kappa,\mathrm{eff}}$. As the latter result is shown similarly (though with two different microscopic tiles $\kappa A_\tau + (1-\kappa)\,\mathrm{Id}$ and $\kappa A_\tau + (1-\kappa)\tfrac{1}{2}\,\mathrm{Id}$, depending on whether the random checkerboard $a$ equals $\mathrm{Id}$ or $\tfrac{1}{2}\,\mathrm{Id}$ on the tile, and correspondingly with two sets of homogenization correctors and two characteristic functions $\chi_{\mathrm{microtile}1}$ and $\chi_{\mathrm{microtile}2}$, see below for this notation), we only provide the proof of the former result.
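The heuristic laminate formulas from the bullet points above can be checked with a few lines of arithmetic. The sketch below is an illustration, not the construction of the text: the function names are hypothetical, and the isotropy condition is obtained here simply by equating the two diagonal entries of the second-order laminate, which leads to a quadratic equation in $\mu$ with two positive roots. Which value of $\mu$ the original formula selects is not recoverable from the text, so both candidates are returned.

```python
import math


def arithmetic_mean(a, b):
    return 0.5 * (a + b)


def harmonic_mean(a, b):
    return 2.0 * a * b / (a + b)


def horizontal_stripes(lam):
    """Diagonal (a_xx, a_yy) for equal-height horizontal stripes with values 1 and lam."""
    return arithmetic_mean(1.0, lam), harmonic_mean(1.0, lam)


def second_order_laminate(lam, mu):
    """Diagonal (a_xx, a_yy) for vertical stripes alternating mu and the horizontal laminate."""
    h_xx, h_yy = horizontal_stripes(lam)
    # Across vertical stripes: harmonic mean horizontally, arithmetic mean vertically.
    return harmonic_mean(mu, h_xx), arithmetic_mean(mu, h_yy)


def isotropic_mu_candidates(lam):
    """Roots of 2 s mu^2 + (4 lam - 3 s^2) mu + 2 lam s = 0 with s = 1 + lam.

    This quadratic is the condition a_xx = a_yy for the second-order laminate.
    """
    s = 1.0 + lam
    b = 4.0 * lam - 3.0 * s * s
    disc = max(b * b - 16.0 * lam * s * s, 0.0)
    root = math.sqrt(disc)
    return sorted(((-b - root) / (4.0 * s), (-b + root) / (4.0 * s)))


def tile_average(lam, mu):
    """Spatial average of the coefficient on a tile with microstructure."""
    return (2.0 * mu + lam + 1.0) / 4.0


if __name__ == "__main__":
    lam = 0.1
    for mu in isotropic_mu_candidates(lam):
        a_xx, a_yy = second_order_laminate(lam, mu)
        # Any sigma strictly between the effective value and the tile average is admissible.
        print(mu, a_xx, a_yy, tile_average(lam, mu))
```

In both cases the effective value $\frac{\lambda}{1+\lambda}+\frac{\mu}{2}$ lies strictly below the tile average $\frac{2\mu+\lambda+1}{4}$ whenever $\lambda \neq 1$, since the harmonic mean of $1$ and $\lambda$ is smaller than their arithmetic mean; this leaves a nonempty window for $\sigma$.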
For the remainder of the proof, we shall fix without loss of generality ε := 1 to avoid even more cumbersome notation. Again to avoid even more cumbersome notation, we only give the proof in the case that all tiles with microstructure have the same orientation as in Figure 5.
To see this quantitative closeness, we construct an approximate homogenization corrector φ i,appr for a RVE τ . To this aim, let φ i,eff be the homogenization corrector associated with the coefficient field a τ,eff , that is let φ i,eff solve −∇ · (a τ,eff (e i + ∇φ i,eff )) = 0 on [0, L] 2 with periodic boundary conditions. We now intend to build the approximate homogenization corrector φ i,appr for a RVE τ by a nested two-scale expansion, using the homogenization correctors for the periodic laminate microstructures.
By Meyers' estimate, there exists $p > 2$ with Furthermore, $a_{\tau,\mathrm{eff}}$ is constant on each tile $k + [0,1)^2$, which implies on each tile $T = k + [0,1)^2$ (with $k \in \mathbb{Z}^2$) for each $x \in T$ by regularity theory for constant coefficient equations Let $\rho_\delta$ denote a standard mollifier. The $L^p$ estimate and the estimate on $\nabla^2 \phi_{i,\mathrm{eff}}$ imply (for notational convenience we extend $\phi_{i,\mathrm{eff}}$ by periodicity) for some $\alpha > 0$ (for a proof of this estimate, split the domain into a neighborhood of size $\delta^{1/5}$ of the tile boundaries $\partial T$, on which one uses the Hölder inequality and the $L^p$ bound on $\nabla\phi_{i,\mathrm{eff}}$ in (81), and the interior $\{x \in T : \operatorname{dist}(x, \partial T) \ge \delta^{1/5}\}$, where one applies the regularity estimate (82)).
Let φ i,h denote the 2-periodic homogenization corrector for the coefficient field a h (x, y) associated with the pattern of horizontal stripes in Figure 5 (i. e. let a h (x, y) = a h (y) take alternatingly on intervals of length 1 the values Id and λ Id). Note that φ 1,h ≡ 0 and that φ 2,h is explicitly given by We shall frequently use the uniform bound on the gradient |∇φ i,h | ≤ C derived easily from this formula. Let φ i,v denote the 2-periodic homogenization corrector associated with the pattern of vertical stripes of width 1, in which the coefficient field a v (x, y) = a v (x) alternatingly takes the values µ Id and Note that we have φ 2,v ≡ 0 and that φ 1,v is given explicitly by We shall again frequently use the uniform bound on the gradient |∇φ i,v | ≤ C. We define the vector potential for the flux correction σ h,ijk , skew-symmetric in its last two indices, as σ h,212 := 0 and σ h,112 :=ˆy Note that with this definition σ h,ijk satisfies ∇ · σ h,i = a h (e i + ∇φ i,h ) − a h,eff e i , as one checks by a case-by-case analysis.
Similarly, we define σ v,ijk , skew-symmetric in its last two indices, as σ v,121 := 0 and Let us denote the indicator function of the tiles with microstructure by χ microtile (i. e. χ microtile is 1 on all tiles k + [0, 1) d ⊂ [0, L) d with microstructure and 0 on the other tiles). Similarly, we denote by χ vmicrostripe the indicator functions of all vertical stripes that according to Figure 5 contain a micropattern of horizontal stripes. We then build our approximate correctors as We observe that φ i,appr,1 satisfies the estimate We also have the bound |∇φ i,appr,2 | ≤ C min{1, δ 2 } (|∇φ i,appr,1 | + 1) (87) Furthermore, if we are at least τ δ 1 away from the tile boundaries and the boundaries of the vertical stripes (note that ρ δ1τ * ∇φ j,v (·/τ ) is then equal to ∇φ j,v (·/τ ) as the latter quantity is constant in each stripe; note also that then ρ τ δ1 * χ microtile is locally constant = 0 or = 1 and that we have a uniform bound on ∇φ j,v ), we have by (82) on each tile If we are at least τ δ 1 away from the tile boundaries and the boundaries of the vertical stripes and at least τ 2 δ 2 away from the boundary of the horizontal stripes, we get (note that ρ δ2τ 2 * ∇φ k,h (·/τ 2 ) is then equal to ∇φ k,h (·/τ 2 ) as the latter quantity is constant in each small horizontal stripe; note also that then ρ τ 2 δ2 * χ hmicrostripe is locally constant = 0 or = 1 and that we have a uniform bound on ∇φ k,h ) ≤ C e i + ∇φ i,appr,1 − j (e j + χ microtile ∇φ j,v (·/τ ))(δ ij + ∂ j φ i,eff ) Using the fact that by Meyers inequality we have for some p = p(λ) > 2 we obtain by choosing δ 0 , δ 1 , and δ 2 as appropriate powers of τ and using (87) for some η > 0.
Having bounded the error in the gradient, we next estimate the error in the flux. In an analogous fashion to the definition of $a_{\tau,\mathrm{eff}}$ as the effective coefficient from periodic homogenization on each tile, we define $a_{\tau,\mathrm{veff}}$ as equal to $a_{\tau,\mathrm{eff}} = a_\tau$ on the tiles without microstructure and equal to the effective coefficient from periodic homogenization on each vertical stripe of width $\tau$ on each tile with microstructure. Recalling the definitions (84) and (85), we may rewrite the error in the flux in a pointwise way as Thus, having chosen $\delta_0$, $\delta_1$, and $\delta_2$ as suitable powers of $\tau$, we obtain by (89), (82), and (81) It now only remains to show that $\nabla\phi_{i,\mathrm{appr},2}$ is a good approximation for $\nabla\phi_i$. To do so, we consider the difference $\phi_i - \phi_{i,\mathrm{appr},2}$ and observe that it satisfies the PDE $-\nabla\cdot\big(a_\tau(\nabla\phi_i - \nabla\phi_{i,\mathrm{appr},2})\big) = \nabla\cdot\big(a_\tau(e_i + \nabla\phi_{i,\mathrm{appr},2})\big) = \nabla\cdot\big(a_\tau(e_i + \nabla\phi_{i,\mathrm{appr},2}) - a_{\tau,\mathrm{eff}}(e_i + \nabla\phi_{i,\mathrm{eff}})\big)$.
We now replace the divergence-form right-hand side using (89) for some $g$ with $\fint_{[0,L]^2} |g|^2 \,dx \le C\tau^{\eta}$ (recall that $\delta_1$ and $\delta_2$ have been chosen as suitable small powers of $\tau$ and recall also the uniform $L^p$ bound for $\nabla\phi_{i,\mathrm{eff}}$ in (81)).
Using the skew-symmetry of σ v,i and σ h,i , we obtain − ∇ · (a τ (∇φ i − ∇φ i,appr,2 )) Using again the skew-symmetry of σ v,i and σ h,i , we get − ∇ · (a τ (∇φ i − ∇φ i,appr,2 )) Choosing β > 0 small enough, we finally end up with −∇ · (a τ (∇φ i − ∇φ i,appr,2 )) = ∇ ·ĝ Proof. For such a probability distribution of coefficient fields a, the spatial average − [0,Lε] d a dx is almost surely a multiple of the identity matrix, which entails that The matrix B must also be a multiple of the identity matrix: Under reflection of the i-th coordinate, by the corrector equation (3) and the fact that a is pointwise a multiple of the identity matrix we have that the i-th corrector for the reflected coefficient fieldâ( Thus, the off-diagonal entries of a RVE which are given by (for i = j, using also that a(x) = a scalar (x) Id) switch sign under such reflections, while the average − [0,Lε] d a dx remains invariant. As our probability distribution is invariant under reflections, the off-diagonal entries of B must be zero. Similarly, as our probability distribution is invariant under exchange of coordinates, all diagonal entries of B must coincide; therefore the covariance must be a multiple of Id ⊗ Id.
We now turn to the proof of our theorem on successful variance reduction for random coefficient fields that are obtained by applying a "monotone" function to a collection of iid random variables.
Proof of Proposition 5. Without loss of generality (by rescaling), we may consider the case $\varepsilon = 1$.
As both $f$ and $g$ are increasing functions in each of their arguments, the integrands in this formula are either nonnegative (for $X_n \ge Y_n$) or nonpositive (for $X_n \le Y_n$).
Taking the sum of these formulas for $n = 1, \ldots, N$, we infer

which establishes the desired lower bound (94) for the covariance.
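The sign property behind this argument can be checked on a small example. The following sketch does not reproduce the quantitative bound (94); it only verifies, by exact enumeration, that two coordinatewise increasing functions of iid random variables have nonnegative covariance. The uniform distribution on {0, 1, 2}, the choice N = 3, and the functions `f_sum` and `f_max` are assumptions made purely for illustration and do not appear in the text.

```python
from fractions import Fraction
from itertools import product


def exact_covariance(f, g, values=(0, 1, 2), n=3):
    """Exact Cov[f(X), g(X)] for X = (X_1, ..., X_n) iid uniform on `values`.

    All outcomes of the product space are enumerated, so no sampling
    error enters; Fractions keep the arithmetic exact.
    """
    outcomes = list(product(values, repeat=n))
    weight = Fraction(1, len(outcomes))
    mean_f = sum(weight * f(x) for x in outcomes)
    mean_g = sum(weight * g(x) for x in outcomes)
    mean_fg = sum(weight * f(x) * g(x) for x in outcomes)
    return mean_fg - mean_f * mean_g


def f_sum(x):
    """Increasing in each argument: the sum of the coordinates."""
    return sum(x)


def f_max(x):
    """Increasing in each argument: the largest coordinate."""
    return max(x)


if __name__ == "__main__":
    # Two increasing functions: the covariance is nonnegative.
    print("Cov[sum, max]  =", exact_covariance(f_sum, f_max))
    # Flipping the monotonicity of one function flips the sign.
    print("Cov[sum, -max] =", exact_covariance(f_sum, lambda x: -f_max(x)))
```

Exact enumeration is used instead of Monte Carlo sampling so that the sign of the covariance is not obscured by sampling noise; this is feasible only for very small N.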
Taking the sum with respect to $n$ entails $\mathbb{E}[f(X)g(X)] \le$

Taking the sum over all $t = 2^k$ with $2^k \le T$, we deduce our desired estimate (99).
Lemma 16. Let $a \in L^\infty(\mathbb{R}^d; \mathbb{R}^{d\times d})$ be a uniformly elliptic and bounded coefficient field in the sense of (A1). Let $b \in L^\infty(\mathbb{R}^d; \mathbb{R}^d)$ be a bounded vector field. Then the unique nongrowing weak solution $w$ to the equation

Inserting this estimate in the previous inequality and passing to the supremum over all $g \in L^2$ supported in $\{|x| \le \sqrt{T}\}$ with $\int_{\{|x| \le \sqrt{T}\}} |g|^2 \, dx \le 1$, we get

This establishes the estimate (101).
Appendix B. Calculus for random variables with stretched exponential moments

On the space of random variables $X$ with stretched exponential moments in the sense

for some $\gamma > 0$ and some $C > 0$, it is convenient to work with the norm
\[
\|X\|_{\exp^\gamma} := \sup_{p \ge 1} \frac{1}{p^{1/\gamma}} \, \mathbb{E}\big[|X|^p\big]^{1/p}.
\]
For $\gamma \ge 1$, this norm is equivalent to the Luxemburg norm associated with the convex function $\exp(x^\gamma) - 1$. However, it has two advantages: First, it simplifies calculus when considering the integrability of products of random variables or the concentration properties of independent random variables. Secondly and more importantly, it is also a well-defined norm for $\gamma \in (0, 1)$, a parameter range which we shall employ heavily.
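To make the definition of the norm concrete, the following sketch evaluates the supremum over p on a finite grid for a standard Gaussian random variable, whose absolute moments are known in closed form. The truncation of the supremum at `p_max`, the grid resolution, and the function names are assumptions of this illustration; a finite grid can only approximate the supremum and cannot by itself certify that the norm is finite.

```python
import math


def log_gaussian_abs_moment(p):
    """log E|X|^p for a standard Gaussian X.

    Uses the closed form E|X|^p = 2^(p/2) * Gamma((p+1)/2) / sqrt(pi),
    evaluated in the log domain to avoid overflow for large p.
    """
    return (0.5 * p * math.log(2.0)
            + math.lgamma(0.5 * (p + 1.0))
            - 0.5 * math.log(math.pi))


def stretched_exp_norm(log_abs_moment, gamma, p_max=200.0, steps=2000):
    """Grid approximation of sup_{p >= 1} p^(-1/gamma) * (E|X|^p)^(1/p).

    The supremum is only taken over an equispaced grid of p in [1, p_max],
    so the result is a lower bound for the true norm.
    """
    best = 0.0
    for k in range(steps + 1):
        p = 1.0 + (p_max - 1.0) * k / steps
        value = math.exp(log_abs_moment(p) / p - math.log(p) / gamma)
        best = max(best, value)
    return best


if __name__ == "__main__":
    for gamma in (0.5, 1.0, 2.0, 4.0):
        print(gamma, stretched_exp_norm(log_gaussian_abs_moment, gamma))
```

For the Gaussian, the grid value stabilizes as `p_max` grows when gamma is at most 2, while for gamma = 4 it keeps increasing with `p_max`, reflecting that a Gaussian has stretched exponential moments only up to the exponent 2. Values of gamma in (0, 1) are handled by the same formula, in line with the remark that the norm remains well-defined in that range.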
For independent random variables with stretched exponential moments, a standard argument via an inequality by Burkholder [29] provides a simple concentration estimate.
Lemma 20. Let $X_1, \ldots, X_M$ be independent random variables with vanishing expectation and uniformly bounded stretched exponential moments