On the existence of an invariant measure for isotropic diffusions in random environment

The results of this paper build upon those first obtained by Sznitman and Zeitouni (Invent Math 164(3), 455–567, 2006). We establish, for spatial dimensions $d \ge 3$, the existence of a unique invariant measure for isotropic diffusions in random environment on $\mathbb{R}^d$ which are small perturbations of Brownian motion. Furthermore, we establish a general homogenization result for initial data which are locally measurable with respect to the coefficients.


Introduction
The results of this paper should be seen as an extension of those first obtained in Sznitman and Zeitouni [12] for stationary diffusion processes in random environment on $\mathbb{R}^d$, for $d \ge 3$, which are a small perturbation of Brownian motion and which satisfy a restricted isotropy condition and finite range dependence. The framework depends upon an underlying probability space $(\Omega, \mathcal{F}, \mathbb{P})$, which can be viewed as indexing the collection of all equations or environments described, for each $x \in \mathbb{R}^d$ and $\omega \in \Omega$, by the coefficients

$$\left(A(x,\omega),\ b(x,\omega)\right). \quad (1.1)$$

More precisely, the stationarity is described by a group of transformations $\{\tau_x\}_{x \in \mathbb{R}^d}$ preserving the measure of $\Omega$ and satisfying, for each $x, y \in \mathbb{R}^d$ and $\omega \in \Omega$,

$$A(x+y, \omega) = A(x, \tau_y\omega) \quad \text{and} \quad b(x+y, \omega) = b(x, \tau_y\omega). \quad (1.2)$$

There exists $R > 0$ quantifying the finite-range dependence such that, whenever subsets $A, B \subset \mathbb{R}^d$ satisfy $d(A, B) \ge R$, the sigma-algebras

$$\sigma\left(A(x,\cdot), b(x,\cdot) \mid x \in A\right) \quad \text{and} \quad \sigma\left(A(x,\cdot), b(x,\cdot) \mid x \in B\right) \text{ are independent.} \quad (1.3)$$

The coefficients are isotropic in the sense that, for every orthogonal transformation $r : \mathbb{R}^d \to \mathbb{R}^d$ preserving the coordinate axes, for every $x \in \mathbb{R}^d$, the random variables

$$\left(rb(x,\omega), rA(x,\omega)r^t\right) \quad \text{and} \quad \left(b(rx,\omega), A(rx,\omega)\right) \text{ have the same law.} \quad (1.4)$$

Finally, the perturbation is described, for a parameter $\eta > 0$ to be chosen small, by the condition

$$|b(x,\omega)| < \eta \quad \text{and} \quad |A(x,\omega) - I| < \eta \quad \text{on } \mathbb{R}^d \times \Omega. \quad (1.5)$$

We remark that these assumptions are identical to the model considered in [12] and are the continuous counterpart of the model first studied in the discrete setting by Bricmont and Kupiainen [2]. The coefficients will be sufficiently regular to guarantee, for each $x \in \mathbb{R}^d$ and $\omega \in \Omega$, the well-posedness of the martingale problem corresponding to the generator

$$\mathcal{L} = \frac{1}{2}\sum_{i,j=1}^d a_{ij}(y,\omega)\frac{\partial^2}{\partial y_i \partial y_j} + \sum_{i=1}^d b_i(y,\omega)\frac{\partial}{\partial y_i},$$

where we have written $A = (a_{ij})_{i,j=1}^d$ for the diffusion matrix. See Stroock and Varadhan [11, Chapters 6 and 7].
We denote by $P_{x,\omega}$ the corresponding probability measure on the space of continuous paths $C([0,\infty);\mathbb{R}^d)$ and recall that, almost surely with respect to $P_{x,\omega}$, paths $X_t \in C([0,\infty);\mathbb{R}^d)$ satisfy the stochastic differential equation

$$dX_t = b(X_t,\omega)\,dt + \sigma(X_t,\omega)\,dB_t, \qquad X_0 = x,$$

for $A(y,\omega) = \sigma(y,\omega)\sigma(y,\omega)^t$, and for $B_t$ a standard Brownian motion under $P_{x,\omega}$ with respect to the canonical right-continuous filtration on $C([0,\infty);\mathbb{R}^d)$.
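For intuition only, the quenched diffusion can be sampled numerically. The sketch below applies the Euler–Maruyama discretization to an SDE of this form in a toy environment: a small sinusoidal drift with $A = I$, standing in for one fixed realization $\omega$. The drift, the parameter `eta`, and all numerical choices are illustrative assumptions, not the paper's.

```python
import numpy as np

def euler_maruyama(b, sigma, x0, dt, n_steps, rng):
    """Approximate dX_t = b(X_t) dt + sigma(X_t) dB_t started from x0."""
    x = np.array(x0, dtype=float)
    path = [x.copy()]
    for _ in range(n_steps):
        dB = rng.normal(0.0, np.sqrt(dt), size=x.shape)  # Brownian increment
        x = x + b(x) * dt + sigma(x) @ dB
        path.append(x.copy())
    return np.array(path)

# Toy environment: |b| <= eta with A = sigma sigma^t = I, mimicking the
# small-perturbation condition (1.5); the choices here are illustrative only.
d, eta = 3, 0.05
drift = lambda x: eta * np.sin(x)
diffusion = lambda x: np.eye(d)

rng = np.random.default_rng(0)
path = euler_maruyama(drift, diffusion, np.zeros(d), dt=1e-3, n_steps=1000, rng=rng)
```

Any coefficients satisfying the smallness condition could be substituted for `drift` and `diffusion`; the scheme simply replaces the Brownian increment by a Gaussian step of variance `dt`.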
We now present our main result, where in the statement we write, for every measurable subset $E \in \mathcal{F}$, using the transformation group appearing in (1.2),

$$P_t(\omega, E) = P_{0,\omega}\left[\tau_{X_t}\omega \in E\right]. \quad (1.6)$$

Theorem 1.1 There exists a unique probability measure $\pi$ on $(\Omega, \mathcal{F})$ which is absolutely continuous with respect to $\mathbb{P}$ and satisfies, for every $t \ge 0$ and $E \in \mathcal{F}$,

$$\pi(E) = \int_\Omega P_t(\omega, E)\,d\pi(\omega).$$

Furthermore, $\pi$ is mutually absolutely continuous with respect to $\mathbb{P}$ and defines an ergodic probability measure with respect to the canonical Markov process on $\Omega$ defining (1.6).
Theorem 1.1 is obtained by analyzing the long term behavior of solutions $u : \mathbb{R}^d \times [0,\infty) \times \Omega \to \mathbb{R}$ of the parabolic equation

$$u_t = \frac{1}{2}\operatorname{tr}\left(A(x,\omega)D^2u\right) + b(x,\omega)\cdot Du \quad \text{on } \mathbb{R}^d \times (0,\infty) \quad \text{with} \quad u(x,0,\omega) = f(x,\omega), \quad (1.7)$$

since, for $1_E : \Omega \to \mathbb{R}$ the indicator function of $E \in \mathcal{F}$, for $f_E(x,\omega) = 1_E(\tau_x\omega)$, if $u_E(x,t,\omega)$ satisfies (1.7) with initial data $f_E(x,\omega)$, then, for each $\omega \in \Omega$ and $t \ge 0$,

$$u_E(0,t,\omega) = P_t(\omega,E).$$

Indeed, along an exponentially increasing sequence of time scales $L_n^2$, see (2.18), the invariant measure $\pi$ is first identified, for every $E \in \mathcal{F}$, as the limit

$$\pi(E) = \lim_{n\to\infty}\mathbb{E}\left[u_E(0, L_n^2, \omega)\right].$$
We prove that the limit exists in Propositions 3.10 and 3.11 and, in Proposition 3.12, we prove that $\pi$ defines a probability measure on $(\Omega, \mathcal{F})$ which is absolutely continuous with respect to $\mathbb{P}$. An almost sure characterization of $\pi$ is then established along the full limit, as $t \to \infty$, for a class of subsets $E \in \mathcal{F}$ whose indicator functions satisfy a version of (1.3), see Proposition 4.3. Precisely, on a subset of full probability depending on $E$,

$$\lim_{t\to\infty} u_E(0,t,\omega) = \pi(E). \quad (1.8)$$

Here, we use crucially the results of [12], where it is shown that, with high probability, there exists a coupling at large length and time scales between the diffusion process generated in environment $\omega$ by coefficients $A(y,\omega)$ and $b(y,\omega)$ and a Brownian motion with deterministic variance, see Control 2.2. Notice, however, that this coupling cannot in general provide an effective comparison between solutions of (1.7) and solutions $u : \mathbb{R}^d \times [0,\infty) \times \Omega \to \mathbb{R}$ satisfying, for $\alpha > 0$ defined in Theorem 2.1, the deterministic equation

$$u_t = \frac{\alpha}{2}\Delta u \quad \text{on } \mathbb{R}^d \times (0,\infty) \quad \text{with} \quad u(x,0,\omega) = f(x,\omega). \quad (1.9)$$
However, in Proposition 3.9, this coupling does provide a means by which the solution of (1.7) can be effectively compared, with high probability, on large length and time scales, to a quantity which, for suitable initial data, is nearly constant. That is, with high probability, we obtain an effective comparison between the solution $u(x,t,\omega)$ of (1.7) at time $L_{n+1}^2$ and the solution of (1.9) at time $L_{n+1}^2 - 6L_n^2$ corresponding to initial data $u(x, 6L_n^2, \omega)$. This is essentially to say that $u(x, L_{n+1}^2, \omega)$ is an averaged version of $u(x, 6L_n^2, \omega)$, where we provide a quantitative version of the averaging in Proposition 4.4 for subsets whose indicator function satisfies a version of (1.3), see Propositions 4.2 and 4.3. In combination, the comparison and the averaging complete the proof of (1.8).
Finally, in [12], localization estimates for the diffusion in environment $\omega$ are obtained with high probability, see Control 2.3. We use this localization in Proposition 4.6 to upgrade the convergence along the discrete sequence $L_n^2$ to the full limit, as $t \to \infty$, at the cost of obtaining the convergence on a marginally smaller portion of space. The proofs of invariance and uniqueness then follow by standard arguments, see Proposition 4.7 and Theorem 4.8.
As an application of Proposition 4.6, we establish a homogenization result for oscillating initial data which are locally measurable with respect to the coefficients. Precisely, we define, for each $R > 0$, the sigma algebra

$$\sigma_{B_R} = \sigma\left(A(y,\cdot), b(y,\cdot) \mid y \in B_R\right),$$

and consider functions $f \in L^\infty(\mathbb{R}^d \times \Omega)$ which are stationary with respect to the translation group $\{\tau_x\}_{x\in\mathbb{R}^d}$ and whose value $f(0,\cdot)$ is measurable with respect to $\sigma_{B_R}$ for some $R > 0$. There exists a subset of full probability such that, as $\epsilon \to 0$, the solutions of the rescaled problem (1.10) with oscillating initial data $f(x/\epsilon, \omega)$ converge to the solution of the corresponding deterministic problem. These methods also apply to equations like (1.10) involving an oscillating right-hand side and to the analogous time-independent problems. See Theorems 5.2 and 5.3.
We remark that, in the case b(y, ω) = 0, the existence of an invariant measure and applications to homogenization were established by Papanicolaou and Varadhan [9] and Yurinsky [13]. Furthermore, when equation (1.7) may be rewritten in divergence form, results have been obtained by De Masi et al. [3], Kozlov [5], Olla [6], Osada [7] and Papanicolaou and Varadhan [8]. We point the interested reader to the introduction of [12] for a more complete list of references regarding related problems in the discrete setting.
The paper is organized as follows. In Sect. 2, we present our notation and assumptions as well as provide a summary of the aspects of [12] most relevant to our arguments. We identify the invariant measure in Sect. 3 and, in Sect. 4, we prove that the invariant measure is indeed invariant and unique. Finally, in Sect. 5, we prove the general homogenization result for functions which are locally measurable with respect to the coefficients.

Notation
Elements of $\mathbb{R}^d$ and $[0,\infty)$ are denoted by $x$, $y$ and $t$ respectively, and $(x,y)$ denotes the standard inner product on $\mathbb{R}^d$. We write $Dv$ and $v_t$ for the derivative of the scalar function $v$ with respect to $x \in \mathbb{R}^d$ and $t \in [0,\infty)$, while $D^2v$ stands for the Hessian of $v$. The spaces of $k \times l$ matrices and $k \times k$ symmetric matrices with real entries are respectively written $\mathcal{M}^{k\times l}$ and $\mathcal{S}(k)$. If $M \in \mathcal{M}^{k\times l}$, then $M^t$ is its transpose and $|M|$ is its norm $|M| = \operatorname{tr}(MM^t)^{1/2}$. If $M$ is a square matrix, we write $\operatorname{tr}(M)$ for the trace of $M$. The Euclidean distance between subsets $A, B \subset \mathbb{R}^d$ is

$$d(A,B) = \inf\left\{\,|x-y| \mid x \in A,\ y \in B\,\right\},$$

and, for an index set $\mathcal{A}$ and a family of measurable functions $f_\alpha : \mathbb{R}^d \times \Omega \to \mathbb{R}$, we write

$$\sigma\left(f_\alpha(x,\omega) \mid x \in A,\ \alpha \in \mathcal{A}\right)$$

for the sigma algebra generated by the random variables $f_\alpha(x,\omega)$ for $x \in A$ and $\alpha \in \mathcal{A}$. For a function $f$, we write $\operatorname{supp}(f)$ for the support of $f$. Furthermore, $B_R$ and $B_R(x)$ are respectively the open balls of radius $R$ centered at zero and at $x \in \mathbb{R}^d$. For a real number $r \in \mathbb{R}$ we write $[r]$ for the largest integer less than or equal to $r$. Finally, throughout the paper we write $C$ for constants that may change from line to line but are independent of $\omega \in \Omega$ unless otherwise indicated.
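The matrix norm above coincides with the Frobenius norm; a quick numerical check (the matrix `M` is an arbitrary example):

```python
import numpy as np

M = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])   # an element of M^{3x2}
norm = np.sqrt(np.trace(M @ M.T))                    # |M| = tr(M M^t)^{1/2}
assert np.isclose(norm, np.linalg.norm(M))           # equals the Frobenius norm
```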

The random environment
There exists an underlying probability space $(\Omega, \mathcal{F}, \mathbb{P})$ indexing the individual realizations of the random environment. Since the environment is described, for each $x \in \mathbb{R}^d$ and $\omega \in \Omega$, by the diffusion matrix $A(x,\omega)$ and drift $b(x,\omega)$, we may take $\Omega$ to be the canonical space of such coefficient fields. We remark that the ergodicity is not an assumption, and can be deduced from (2.1) and (2.7).

We assume that the diffusion matrix and drift are bounded and Lipschitz uniformly for $\omega \in \Omega$: there exists $C > 0$ such that, for all $y \in \mathbb{R}^d$ and $\omega \in \Omega$,

$$|b(y,\omega)| \le C \quad \text{and} \quad |A(y,\omega)| \le C, \quad (2.4)$$

and, for all $x, y \in \mathbb{R}^d$ and $\omega \in \Omega$,

$$|b(x,\omega) - b(y,\omega)| + |A(x,\omega) - A(y,\omega)| \le C|x-y|. \quad (2.5)$$

In addition, we assume that the diffusion matrix is uniformly elliptic, uniformly in $\Omega$: there exists $\nu > 1$ such that, for all $y \in \mathbb{R}^d$ and $\omega \in \Omega$,

$$\frac{1}{\nu}|\xi|^2 \le \left(\xi, A(y,\omega)\xi\right) \le \nu|\xi|^2 \quad \text{for all } \xi \in \mathbb{R}^d. \quad (2.6)$$

The coefficients satisfy a finite range dependence: there exists $R > 0$ such that, whenever $A, B \subset \mathbb{R}^d$ satisfy $d(A,B) \ge R$, the sigma algebras

$$\sigma\left(A(x,\cdot), b(x,\cdot) \mid x \in A\right) \quad \text{and} \quad \sigma\left(A(x,\cdot), b(x,\cdot) \mid x \in B\right) \text{ are independent.} \quad (2.7)$$

The diffusion matrix and drift satisfy a restricted isotropy condition: for every orthogonal transformation $r : \mathbb{R}^d \to \mathbb{R}^d$ which preserves the coordinate axes, for every $x \in \mathbb{R}^d$,

$$\left(b(rx,\omega), A(rx,\omega)\right) \quad \text{and} \quad \left(rb(x,\omega), rA(x,\omega)r^t\right) \text{ have the same law.} \quad (2.8)$$

And, finally, the diffusion matrix and drift are a small perturbation of the Laplacian: there exists $\eta_0 > 0$, to later be chosen small, such that, for all $y \in \mathbb{R}^d$ and $\omega \in \Omega$,

$$|b(y,\omega)| < \eta_0 \quad \text{and} \quad |A(y,\omega) - I| < \eta_0. \quad (2.9)$$

To avoid cumbersome statements in what follows, we introduce a steady assumption:

$$\text{Assume } (2.4), (2.5), (2.6), (2.7), (2.8) \text{ and } (2.9). \quad (2.10)$$

The collection of assumptions (2.4), (2.5) and (2.6) guarantees the well-posedness of the martingale problem set on $\mathbb{R}^d$, for each $\omega \in \Omega$ and $x \in \mathbb{R}^d$, associated to the generator

$$\mathcal{L} = \frac{1}{2}\operatorname{tr}\left(A(y,\omega)D^2\right) + b(y,\omega)\cdot D, \quad (2.11)$$

see [11, Chapters 6 and 7].
We write $P_{x,\omega}$ and $E_{x,\omega}$ for the corresponding probability measure and expectation on the space of continuous paths $C([0,\infty);\mathbb{R}^d)$ and remark that, almost surely with respect to $P_{x,\omega}$, paths $X_t \in C([0,\infty);\mathbb{R}^d)$ satisfy the stochastic differential equation

$$dX_t = b(X_t,\omega)\,dt + \sigma(X_t,\omega)\,dB_t, \qquad X_0 = x,$$

for $A(y,\omega) = \sigma(y,\omega)\sigma(y,\omega)^t$, and for $B_t$ a standard Brownian motion under $P_{x,\omega}$ with respect to the canonical right-continuous filtration on $C([0,\infty);\mathbb{R}^d)$.
We write P x = P × P x,ω and E x = E × E x,ω for the corresponding semi-direct product measure and expectation on Ω ×C([0, ∞); R d ). The annealed law P x inherits the translation invariance and restricted rotational invariance implied by (2.3) and (2.8).
In particular, for all $x, y \in \mathbb{R}^d$, the law of $(X_t - y)_{t\ge0}$ under $\mathbb{P}_{x+y}$ coincides with the law of $(X_t)_{t\ge0}$ under $\mathbb{P}_x$ (2.12) and, for all orthogonal transformations $r$ preserving the coordinate axes and $x \in \mathbb{R}^d$, the law of $(rX_t)_{t\ge0}$ under $\mathbb{P}_x$ coincides with the law of $(X_t)_{t\ge0}$ under $\mathbb{P}_{rx}$ (2.13). This stands in contrast to the quenched laws $P_{x,\omega}$, for which no invariance properties can be expected to hold, in general.
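As a hedged sketch of how the annealed invariances follow (the precise displays (2.12) and (2.13) are those of the paper), for bounded measurable $F$ on path space one may write

```latex
\mathbb{E}\,E_{x+y,\omega}\left[F(X_\cdot - y)\right]
  = \mathbb{E}\,E_{x,\tau_y\omega}\left[F(X_\cdot)\right]
  = \mathbb{E}\,E_{x,\omega}\left[F(X_\cdot)\right],
```

where the first equality uses that, by (2.3), the shifted coefficients generate the shifted diffusion, and the second uses that $\tau_y$ preserves $\mathbb{P}$. The rotational statement follows in the same way from (2.8).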

A review of [12]
In this section, we review the aspects of [12] most relevant to our arguments. Observe that this summary is by no means complete, as considerably more was achieved in their paper than we mention here. We are interested in the long term behavior of the solution, for a fixed, Hölder continuous function $f : \mathbb{R}^d \to \mathbb{R}$, of

$$u_t = \frac{1}{2}\operatorname{tr}\left(A(x,\omega)D^2u\right) + b(x,\omega)\cdot Du \quad \text{on } \mathbb{R}^d \times (0,\infty) \quad \text{with} \quad u(x,0) = f(x). \quad (2.14)$$

This is essentially achieved by comparing the solutions of (2.14) to the solution of the deterministic problem, for $\alpha > 0$ identified in Theorem 2.1,

$$u_t = \frac{\alpha}{2}\Delta u \quad \text{on } \mathbb{R}^d \times (0,\infty) \quad \text{with} \quad u(x,0) = f(x). \quad (2.15)$$

Let $L_0$ be a large integer multiple of five. For each $n \ge 0$, inductively define

$$\ell_n = 5\left[\frac{L_n^a}{5}\right] \quad \text{and} \quad L_{n+1} = \ell_n L_n, \quad (2.18)$$

so that, for $L_0$ sufficiently large, we have $\frac{1}{2}L_n^{1+a} \le L_{n+1} \le 2L_n^{1+a}$. For each $n \ge 0$, for $c_0 > 0$, let $\kappa_n$ and $\tilde\kappa_n$ be given by (2.19), where we remark that, as $n$ tends to infinity, $\kappa_n$ is eventually dominated by every positive power of $L_n$. Furthermore, define, for each $n \ge 0$,

$$D_n = L_n\kappa_n \quad \text{and} \quad \tilde D_n = L_n\tilde\kappa_n. \quad (2.20)$$

We choose $L_0$ sufficiently large so that, for each $n \ge 0$,

$$L_n < D_n < \tilde D_n < L_{n+1}, \quad 4\kappa_n < \tilde\kappa_{n+1} \quad \text{and} \quad 3\tilde D_{n+1} < L_{n+1}^2. \quad (2.21)$$

The following constants enter into the probabilistic statements below. Fix $m_0 \ge 2$ satisfying (2.22). In the arguments to follow, we will use the fact that $\delta$ and $M_0$ are sufficiently larger than $a$.

We now describe the identification of $\alpha$. Recall, for each $x \in \mathbb{R}^d$ and $\omega \in \Omega$, the quenched law $P_{x,\omega}$ on $C([0,\infty);\mathbb{R}^d)$ and, for each $x \in \mathbb{R}^d$, the annealed law $\mathbb{P}_x$ on $\Omega \times C([0,\infty);\mathbb{R}^d)$. The constant $\alpha$ is seen effectively as the limit of the effective diffusivities, in average, of the ensemble of equations (2.14) along the sequence of time steps $L_n^2$. However, so as to apply the finite range dependence, see (2.7), a stopping time is introduced, for each $n \ge 0$, and the approximate effective diffusivity $\alpha_n$ of the ensemble (2.14) is defined in terms of the stopped process. The following theorem describes the control and convergence of the $\alpha_n$ to $\alpha$, see [12, Proposition 5.7]. We discuss next the coupling between solutions of (2.14) and (2.15).
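As a quick illustration of the hierarchy of scales (2.18), the following sketch generates the first few scales and checks the two-sided bound $\tfrac12 L_n^{1+a} \le L_{n+1} \le 2L_n^{1+a}$. The choices $a = 0.2$ and $L_0 = 5000$ are illustrative assumptions, not values fixed by the paper.

```python
import math

def build_scales(L0, a, n_max):
    """Generate L_0, ..., L_{n_max} via ell_n = 5*floor(L_n^a / 5), L_{n+1} = ell_n * L_n."""
    scales = [L0]
    for _ in range(n_max):
        L = scales[-1]
        ell = 5 * math.floor(L ** a / 5)   # an integer multiple of five
        scales.append(ell * L)
    return scales

# Illustrative parameters (not the paper's choices).
a = 0.2
scales = build_scales(5000, a, 4)
for L, L_next in zip(scales, scales[1:]):
    # the two-sided bound holds once L0 is sufficiently large
    assert 0.5 * L ** (1 + a) <= L_next <= 2 * L ** (1 + a)
```

Note that `L0` must be large enough that `floor(L0**a / 5)` is nonzero; this mirrors the requirement in the text that $L_0$ be chosen sufficiently large.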
The first step involves comparing solutions of (2.14), for each $n \ge 0$, at time $L_n^2$, with respect to a Hölder norm at scale $L_n$, to solutions of the deterministic problem

$$u_t = \frac{\alpha_n}{2}\Delta u \quad \text{on } \mathbb{R}^d \times (0,\infty) \quad \text{with} \quad u(x,0) = f(x). \quad (2.26)$$

To do so, we introduce, for each $n \ge 0$, the rescaled Hölder norm (2.27). A localized control of the difference between solutions of (2.14) and (2.26) at time $L_n^2$ is obtained via a cutoff function: for each $v > 0$, a smooth cutoff is fixed in (2.28) and used to define, for each $x \in \mathbb{R}^d$ and $n \ge 0$, the cutoff $\chi_{n,x}$ in (2.29). The following result then describes the desired comparison between solutions of (2.14) and (2.26), at time $L_n^2$, for Hölder continuous initial data. We emphasize here that this control depends upon $x \in \mathbb{R}^d$, $\omega \in \Omega$ and $n \ge 0$. It is not true, in general, that this contraction is available for all such triples $(x,\omega,n)$. However, as described below, it is shown in [12, Proposition 5.1] that such controls are available for large $n$, with high probability, on a large portion of space.

Control 2.2 Fix $x \in \mathbb{R}^d$, $\omega \in \Omega$ and $n \ge 0$. Let $u$ and $u_n$ respectively denote the solutions of (2.14) and (2.26) corresponding to initial data $f \in C^{0,\beta}(\mathbb{R}^d)$. We have a contraction estimate comparing $u$ and $u_n$ at time $L_n^2$ in the scaled Hölder norm (2.27).

The final control we will use concerns tail estimates for the diffusion process. We wish to control, under $P_{x,\omega}$, for $X_t \in C([0,\infty);\mathbb{R}^d)$, the probability that $\sup_{s \le t}|X_s - x|$ is large with respect to the time elapsed. The desired result is similar to the standard exponential estimates for Brownian motion at large length scales. As with Control 2.2, this control depends upon $x \in \mathbb{R}^d$, $\omega \in \Omega$ and $n \ge 0$. It is not true, in general, that this type of localization control is available for all such triples $(x,\omega,n)$, but it is shown in [12, Proposition 2.2] that such controls are available for large $n$, with high probability, on a large portion of space.
We now introduce the primary probabilistic statement concerning Controls 2.2 and 2.3. Notice that the event defined below does not include the control of traps described in [12, Proposition 3.3], which plays an important role in propagating Control 2.2 in their arguments. Since we simply use the Hölder control obtained there, we do not require a further use of their control of traps.
Consider, for each $x \in \mathbb{R}^d$, the event

$$B_n(x) = \left\{\omega \in \Omega \mid \text{Controls 2.2 and 2.3 hold for the triple } (x,\omega,n)\right\}. \quad (2.31)$$

Notice that, in view of (2.3), for all $x \in \mathbb{R}^d$ and $n \ge 0$,

$$\mathbb{P}\left(B_n(x)\right) = \mathbb{P}\left(B_n(0)\right). \quad (2.32)$$

It is therefore shown that the probability of the complement of $B_n(0)$ approaches zero as $n$ tends to infinity, see [12, Theorem 1.1].
Theorem 2.4 Assume (2.10). There exist $L_0$ and $c_0$ sufficiently large and $\eta_0 > 0$ sufficiently small such that, for each $n \ge 0$, the probability of the complement of $B_n(0)$ satisfies a quantitative decay estimate in $n$. We henceforth fix the constants $L_0$, $c_0$ and $\eta_0$ appearing above.
Fix constants $L_0$, $c_0$ and $\eta_0$ satisfying (2.21) and the hypotheses of Theorems 2.1 and 2.4. (2.33) We conclude this section with a few basic observations concerning Control 2.2, Control 2.3 and the Hölder norms introduced in (2.27). Since Control 2.2 cannot be expected to hold globally in space, it will frequently be necessary to introduce cutoff functions of the type appearing in (2.28). The primary purpose of Control 2.3 is to bound the error we thereby introduce, as seen in the following proposition. Proposition 2.5 Assume (2.10) and (2.33). Fix $x \in \mathbb{R}^d$, $\omega \in \Omega$ and $n \ge 0$ and suppose that Control 2.3 is satisfied for the triple $(x,\omega,n)$.
Proof The proof is immediate from the representation formula for the solution: we have, for each $y \in \mathbb{R}^d$, a representation in terms of the diffusion started from $y$. Since Control 2.3 is satisfied for the triple $(x,\omega,n)$, this implies the desired estimate for all $|y - x| \le 30\sqrt{d}L_n$, which completes the argument.
The following two elementary propositions will be used to extend Control 2.2 to a larger portion of space. The first is an elementary and well-known fact concerning the product of Hölder continuous functions.
The second will play the most important role in extending Control 2.2. The only observation is that the Hölder norms introduced in (2.27) occur at the length scale L n . Therefore, a function agreeing locally with Hölder continuous functions on scale L n must itself be globally Hölder continuous. The proof is elementary and can be found in [12, Lemma A.1].

The identification of the invariant measure
In order to identify the invariant measure, we will analyze the long term behavior of the solution $u : \mathbb{R}^d \times [0,\infty) \times \Omega \to \mathbb{R}$ of

$$u_t = \frac{1}{2}\operatorname{tr}\left(A(x,\omega)D^2u\right) + b(x,\omega)\cdot Du \quad \text{on } \mathbb{R}^d \times (0,\infty) \quad \text{with} \quad u(x,0,\omega) = f(x,\omega). \quad (3.1)$$

To simplify the notation in what follows, we write, for each $s \ge 0$ and $\omega \in \Omega$, $R_s f(x,\omega)$ for $u(x,s,\omega)$ satisfying (3.1) with initial data $f(y,\omega)$.
We will be particularly interested in translations of functions $\tilde f \in L^\infty(\Omega)$ with respect to the translation group $\{\tau_x\}_{x\in\mathbb{R}^d}$, and therefore assume in many of the propositions to follow that a function $f : \mathbb{R}^d \times \Omega \to \mathbb{R}$ is stationary in the sense that, for each $x, y \in \mathbb{R}^d$ and $\omega \in \Omega$,

$$f(x+y,\omega) = f(x,\tau_y\omega). \quad (3.2)$$

For such $f$, we identify a deterministic constant $\pi(f) \in \mathbb{R}$ which is effectively identified as the limit of the sequence $\{\pi_n(f)\}_{n\ge0}$ defined below. We will prove that $\pi$ is a probability measure on $(\Omega, \mathcal{F})$ which is absolutely continuous with respect to $\mathbb{P}$. The following two propositions describe the basic existence and regularity results concerning equation (3.1) for bounded and stationary initial data.
Before proceeding, it is convenient to introduce some useful notation. We write, for each $n \ge 0$ and $f \in C^{0,\beta}(\mathbb{R}^d)$, the quenched solution operator associating to $f$ the solution $u(x,t)$ of (2.14) at time $L_n^2$. Similarly, for each $n \ge 0$ and $f \in C^{0,\beta}(\mathbb{R}^d)$, we write $R_n f$ for the solution of the deterministic problem (2.26) at time $L_n^2$. And, finally, for each $n \ge 0$ and $f \in C^{0,\beta}(\mathbb{R}^d)$, we write $S_n f$ for the difference of the two. This allows us to restate Control 2.2 in the following equivalent way, where we recall from (2.29), for each $x \in \mathbb{R}^d$ and $n \ge 0$, the cutoff function $\chi_{n,x}$.
We now make two elementary observations concerning the interaction of the heat kernels $R_n$ introduced in (3.12) and the scaled Hölder norms introduced in (2.27), and an observation concerning the localization properties of the kernels $R_n$. Notice that, in the following proposition, we make use of Theorem 2.1, which in particular provides a lower bound for the $\alpha_n$. This lower bound ensures that the kernels $R_n$ provide a sufficient regularization, uniformly in $n \ge 0$, for our arguments to follow. Proposition 3.4 Assume (2.10) and (2.33). There exists $C > 0$ satisfying, for each $n \ge 0$ and $f \in L^\infty(\mathbb{R}^d)$, the stated bounds. Proof We first obtain (3.14). It remains to bound the Hölder semi-norm.
For each x ∈ R d , Therefore, in view of Theorem 2.1, for each x ∈ R d , for C > 0 independent of n ≥ 0 and f ∈ L ∞ (R d ), And, in view of (3.14), if |x − y| ≥ L n , The claim follows from (3.14), (3.15) and (3.16).
The following observation is elementary and well-known. The kernels R n preserve Hölder continuous initial data.

Proposition 3.5 For each $n \ge 0$, the kernels $R_n$ preserve the Hölder continuity of the initial data.
Finally, the following proposition describes the localization properties of the kernels $R_n$. Here, notice again the role of Theorem 2.1 and recall the cutoff function introduced in (2.28). Proposition 3.6 Assume (2.10) and (2.33). There exist $C = C(d) > 0$ and $c > 0$ independent of $n$ such that, for each $f \in L^\infty(\mathbb{R}^d)$, the stated localization bound holds. Proof Using Theorem 2.1, there exists $c > 0$ independent of $n$ for which the desired Gaussian tail estimate follows, which completes the argument.
We are now prepared to begin our identification of the measure. In order to exploit the finite range dependence in what follows, see (2.7), we introduce localized versions of the kernels $R_n$, defined, for each $n \ge 0$ and $\omega \in \Omega$, in (3.17). The following proposition describes the basic properties of the solutions to (3.17).
Proof Fix $n \ge 0$ and $k \ge 0$. The existence and uniqueness of a solution to (3.17) satisfying the above estimates, for each $\omega \in \Omega$, is an elementary consequence of (2.4), (2.5) and $f \in L^\infty(\mathbb{R}^d \times \Omega)$. See, for instance, [4, Chapter 3, Theorem 9]. The stationarity is a consequence of (2.3) and the uniqueness since, for each $\omega \in \Omega$ and $x, y \in \mathbb{R}^d$, if $\tilde u(\cdot,\cdot,\omega)$ satisfies (3.17) corresponding to $\omega$, then its translation satisfies (3.17) in the translated environment. We now obtain Controls 2.3 and 3.3 on a large portion of space, with high probability. Define, for each $n \ge 0$, the events appearing in (3.18) and, for each $n \ge 0$, the event $A_n$. The following proposition provides, for each $n \ge 0$, a lower bound for the probability of $A_n$. We remark that, in view of (2.17) and (2.23), the relevant exponent is negative. Proposition 3.8 Assume (2.10) and (2.33). For each $n \ge 0$, for $C > 0$ independent of $n$, the probability of the complement of $A_n$ decays in $n$. Proof In view of (2.32), for each $n \ge 0$, for $C > 0$ independent of $n$, a union bound controls the probability of the complement. Therefore, using Theorem 2.4, for each $n \ge 0$, for $C > 0$ independent of $n$, the desired lower bound follows, which completes the argument.
The following proposition is the essential step toward constructing the invariant measure and provides the first comparison of $R_{n+1}f(x,\omega)$ with $R_n f(x,\omega)$ for environments in the subset $A_n$ defined in (3.18). Notice that the estimates contained below depend upon the unscaled $\beta$-Hölder norm of the initial data. Since the identification of the measure requires us to consider initial data $f \in L^\infty(\mathbb{R}^d \times \Omega)$, we will later use Proposition 3.2 and apply the following result to $R_1 f(x,\omega)$. Observe that, in view of (2.17) and (2.23), the exponent $\beta - 7(\delta - 5a)$ appearing below is negative. Proposition 3.9 Assume (2.10) and (2.33). For each $n \ge 0$, $\omega \in A_n$, $1 \le k < \ell_{n+1}^2$ and $f \in C^{0,\beta}(\mathbb{R}^d)$, for $C > 0$ independent of $n$, the stated comparison holds. Proof Recall the radius $\sqrt{k}D_{n+1}$ and define the cutoff function $\tilde\chi_{n,x} : \mathbb{R}^d \to \mathbb{R}$, recalling (2.28). Proceeding inductively, we obtain (3.23). We now expand the resulting composition and, for nonnegative integers $k_i \ge 0$, consider terms of the form $\tilde\chi_{n,x}S_n\chi_{n,x}R_n^{k_1}\cdots\tilde\chi_{n,x}S_n\chi_{n,x}R_n^{k_m}f(x)$.
Since, for each $n \ge 0$, Control 2.2 applies, and since $x \in B_{4\sqrt{k}D_{n+1}}$, each term $\tilde\chi_{n,x}S_n\chi_{n,x}R_n^{k_1}\cdots\tilde\chi_{n,x}S_n\chi_{n,x}R_n^{k_m}f(x)$ can be estimated. Therefore, for $C > 0$ independent of $n$, using (2.17) to write $4a + 2a^2 < 5a$, since $1 \le k < \ell_{n+1}^2$, the left-hand side of the above string of inequalities is bounded in terms of the exponent $\beta - 7(\delta - 5a)$, where we remark that $\beta - 7(\delta - 5a) < 0$ in view of (2.17) and (2.23). It remains to consider the sum

$$\sum_{m=0}^{6}\ \sum_{k_0+\cdots+k_m+m=k}\chi_{n,x}R_n^{k_0}\tilde\chi_{n,x}S_n\chi_{n,x}R_n^{k_1}\cdots\tilde\chi_{n,x}S_n\chi_{n,x}R_n^{k_m}f(x).$$
(3.25) We will prove that, up to an error which vanishes as $n$ approaches infinity, the above sum reduces to its leading term. To do so, we consider each summand in $m$ individually.
We are now prepared to provide the initial characterization of the invariant measure $\pi : \mathcal{F} \to \mathbb{R}$. In view of Proposition 3.9, for each $n \ge 0$ and $f \in L^\infty(\mathbb{R}^d \times \Omega)$, define the approximation $\pi_n(f)$. The following two propositions prove that, for each $f \in L^\infty(\mathbb{R}^d \times \Omega)$ satisfying (3.2), the sequence $\{\pi_n(f)\}_{n=0}^\infty$ is Cauchy. Notice in particular that the rate of convergence depends only upon the $L^\infty$-norm of the initial condition.
we have, for each $\omega \in A_n$, using Control 2.3 and (3.17), the corresponding estimate, and, using Proposition 3.2 and Proposition 3.9 for $k = 6$, for $C > 0$ independent of $n$ and $f$, we have, for each $\omega \in A_n$, in view of (3.37) and (3.38), for $C > 0$ independent of $n$, the bound (3.39). Therefore, since Proposition 3.1, the stationarity guaranteed by (3.2) and Proposition 3.7 imply that, for each $x \in \mathbb{R}^d$, the expectation is independent of $x$, and since (2.17) and (2.23) imply that, for each $n \ge 0$, the error terms are controlled, by Proposition 3.8 and (3.39), for $C > 0$ independent of $n$, the desired estimate follows, which, since $n \ge 0$ and $f \in L^\infty(\mathbb{R}^d \times \Omega)$ were arbitrary, completes the argument.

Proposition 3.11 Assume (2.10) and (2.33). For each $f \in L^\infty(\mathbb{R}^d \times \Omega)$ satisfying (3.2), the limit $\pi(f) = \lim_{n\to\infty}\pi_n(f)$ exists.
Furthermore, for each $n \ge 0$, for $C > 0$ independent of $n$ and $f$, the stated rate of convergence holds. Proof In view of (2.17), (2.18) and (2.23), since $\beta - 7(\delta - 5a) < 0$, the ratio test implies that the relevant series converges (3.40). Since, for each $f \in L^\infty(\mathbb{R}^d \times \Omega)$ stationary in the sense of (3.2), Proposition 3.10 implies that the sequence $\{\pi_n(f)\}_{n=0}^\infty$ is Cauchy, the triangle inequality, Proposition 3.10 and (3.40) imply that, for each $n \ge 0$, for $C > 0$ independent of $n$ and $f$, the desired estimate holds, which completes the argument.
We now identify what is shown in the next section to be the unique invariant measure. For every E ∈ F, write 1 E : Ω → R for the indicator function of E ⊂ Ω, and define (3.41) We define π : F → R, for each E ∈ F, by the rule π(E) = π( f E ), (3.42) and prove now that π defines a probability measure on (Ω, F) which is absolutely continuous with respect to P.
Proof For each $E \in \mathcal{F}$, since $0 \le f_E \le 1$ on $\mathbb{R}^d \times \Omega$, the comparison principle implies that, for each $n \ge 0$, $0 \le \pi_n(f_E) \le 1$ and, therefore, for each $E \in \mathcal{F}$, $0 \le \pi(E) \le 1$. Furthermore, since $f_\Omega$ is identically one and $f_\emptyset$ is identically zero, we have, for each $n \ge 0$, $\pi_n(f_\Omega) = 1$ and $\pi_n(f_\emptyset) = 0$.
Let $\{A_i\}_{i=1}^\infty \subset \mathcal{F}$ be a countable collection of disjoint subsets. Since, for each $n \ge 0$ and $1 \le m \le \infty$, the relevant solutions may be compared using the stopping time, the dominated convergence theorem implies that, for each $n \ge 0$, there exists $k_n \ge n$ such that the tail of the series is small. Therefore, in view of Proposition 3.11, since each initial condition has unit $L^\infty$-norm, the triangle inequality implies, for $C > 0$ independent of $n$, the corresponding estimate. Therefore, in view of (3.45) and (3.46), since we choose $k_n \ge n$, the claim follows, which, since the family $\{A_i\}_{i=1}^\infty$ was arbitrary, completes the proof of countable additivity.
We now prove the absolute continuity. We first show that whenever E ∈ F satisfies P(E) = 0 we have R 1 f E (x, ω) = 0 on R d for almost every ω ∈ Ω. To do so, recall that there exists a density p(x, 1, y, ω) satisfying for each x ∈ R d , ω ∈ Ω and E ∈ F, Furthermore, for each x ∈ R d and ω ∈ Ω, the probability measure defined by p(x, 1, y, ω) dy on R d is equivalent to Lebesgue measure. See, for instance [4, Chapter 1, Theorem 11].
Fix E ∈ F satisfying P(E) = 0. Then, for each x ∈ R d , using (2.2) and P(E) = 0, by Fubini's theorem since 1 E (τ y ω) = 0 almost everywhere in Ω for every y ∈ R d . Therefore, Fubini's theorem implies that, for every x ∈ R d , there exists a subset A x ⊂ Ω of full probability such that, for every ω ∈ A x , Define the subset of full probability and observe that, for each ω ∈ A and x ∈ Q d , Since Proposition 3.2 implies that, for every ω ∈ Ω, we have R 1 f E (x, ω) ∈ C 0,β (R d ), we conclude that, for every x ∈ R d and ω ∈ A, and, therefore, for every ω ∈ A and n ≥ 0, Since P(A) = 1, this implies that, for each n ≥ 0, π n ( f E ) = 0 and, therefore, that π(E) = 0. Since E ∈ F satisfying P(E) = 0 was arbitrary, this completes the argument.
In the final proposition of this section, we prove that for each f ∈ L ∞ (R d × Ω) satisfying (3.2), the constant π( f ) characterizes the integral of f (0, ω) with respect to π . This is essentially an immediate consequence of the definition of π and the fact that the kernels R t preserve the L ∞ -norm of initial data. Proposition 3.13 Assume (2.10) and (2.33).
Proof We recall, for every subset $E \in \mathcal{F}$ and $\omega \in \Omega$, the definition $f_E(x,\omega) = 1_E(\tau_x\omega)$, from which it follows immediately by the definitions of $\pi(E)$ and $\pi(f_E)$ that (3.47) holds. And, since the kernels preserve the $L^\infty$-norm for every $t \ge 0$, $n \ge 0$, $\omega \in \Omega$ and $f \in L^\infty(\mathbb{R}^d)$, we have, for each $n \ge 0$ and $f, g \in L^\infty(\mathbb{R}^d \times \Omega)$ satisfying (3.2), the comparison (3.48). Therefore, for every pair $f, g \in L^\infty(\mathbb{R}^d \times \Omega)$ satisfying (3.2), the claim follows from (3.47), (3.48) and the definition of the Lebesgue integral.

The proof of invariance and uniqueness
In this section, we prove that the measure $\pi$ defined in (3.42) is the unique invariant measure which is absolutely continuous with respect to $\mathbb{P}$. Furthermore, $\pi$ is mutually absolutely continuous with respect to $\mathbb{P}$ and defines an ergodic probability measure for the canonical Markov process on $\Omega$ defining (1.6). We observe that, for each $t \ge 0$, $\omega \in \Omega$ and $E \in \mathcal{F}$, for $P_t(\omega, E)$ defined below and as in (1.6),

$$P_t(\omega, E) = u_E(0, t, \omega).$$

In order to prove invariance, therefore, it suffices to prove that, for each $t \ge 0$ and $E \in \mathcal{F}$,

$$\int_\Omega P_t(\omega, E)\,d\pi(\omega) = \pi(E).$$

See Proposition 4.7.
To exploit the finite range dependence we define, for each $R > 0$, $t \ge 1$ and $\omega \in \Omega$, the localized kernels $\tilde R_{t,R}$ in (4.2). The following proposition controls the error we make due to this localization. In contrast to Control 2.3, we obtain this control globally for $x \in \mathbb{R}^d$ and $\omega \in \Omega$, at the cost of an effective length scale which is significantly larger than that appearing in Control 2.3. That is, this control is effective at length scale approximately $t$, whereas Control 2.3 is effective at length scale approximately $\sqrt{t}$.

Proposition 4.1 Assume (2.10).
For each $x \in \mathbb{R}^d$, $t \ge 1$, $\omega \in \Omega$ and $R > 0$, for $C > 0$ independent of $x$, $t$, $\omega$ and $R$, the stated bound holds for every $f \in L^\infty(\mathbb{R}^d)$. We recall that, almost surely with respect to $P_{x,\omega}$, for $B_s$ a Brownian motion on $\mathbb{R}^d$ under $P_{x,\omega}$ with respect to the canonical right-continuous filtration on $C([0,\infty);\mathbb{R}^d)$, paths $X_s \in C([0,\infty);\mathbb{R}^d)$ satisfy the stochastic differential equation

$$dX_s = b(X_s,\omega)\,ds + \sigma(X_s,\omega)\,dB_s, \qquad X_0 = x.$$

Therefore, using the exponential inequality for martingales, see Revuz and Yor [10, Chapter 2, Proposition 1.8], together with (2.4) and (2.5), for every $\tilde R \ge 0$, for $C > 0$ independent of $\tilde R$, $t$, $x$ and $\omega$, the martingale part of the path admits a Gaussian tail estimate (4.4). Therefore, by choosing $\tilde R = (R - Ct)_+$ in (4.4), we conclude in view of (4.3) that, for $C > 0$ independent of $x$, $t$, $\omega$ and $R$, the desired bound holds, which, since $x$, $t$, $\omega$ and $R$ were arbitrary, completes the argument.
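In outline, the localization mechanism of this proof can be sketched as follows (a sketch consistent with the argument just given, not a restatement of its exact constants): the path splits into a drift part, bounded deterministically, and a martingale part with Gaussian tails,

```latex
\begin{aligned}
X_s - x &= \int_0^s b(X_r,\omega)\,dr + \int_0^s \sigma(X_r,\omega)\,dB_r,
\qquad \sup_{s \le t}\Big|\int_0^s b(X_r,\omega)\,dr\Big| \le Ct, \\[4pt]
P_{x,\omega}\Big[\sup_{s\le t}|X_s - x| \ge R\Big]
 &\le P_{x,\omega}\Big[\sup_{s\le t}\Big|\int_0^s \sigma(X_r,\omega)\,dB_r\Big| \ge (R - Ct)_+\Big]
 \le C\exp\Big(-\frac{(R - Ct)_+^2}{Ct}\Big),
\end{aligned}
```

which is the shape of the bound used to control the error introduced by the localized kernels.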
We define, for each subset $A \subset \mathbb{R}^d$, the sub-sigma-algebra $\sigma_A$ of $\mathcal{F}$,

$$\sigma_A = \sigma\left(A(y,\cdot), b(y,\cdot) \mid y \in A\right). \quad (4.5)$$

The following proposition uses stationarity, see (2.3), to describe the interaction between the transformation group $\{\tau_x\}_{x\in\mathbb{R}^d}$ and the sigma algebras $\sigma_A$.

Proposition 4.2 Assume (2.10). For every subset A ⊂ R d and y ∈ R d ,
the identities (4.6) and (4.7) hold. Furthermore, since the group $\{\tau_y\}_{y\in\mathbb{R}^d}$ is composed of invertible, measure-preserving transformations, for every fixed $y \in \mathbb{R}^d$, the translated sigma algebra is generated by sets of the form (4.8), for fixed $x \in A$ and Borel sets $B_d \in \mathcal{B}^d$. And, in view of the stationarity (2.3), for each $y \in \mathbb{R}^d$, $x \in A$ and $B_d \in \mathcal{B}^d$, the identities (4.9) and (4.10) hold. We therefore conclude, using (4.6), (4.7), (4.8), (4.9) and (4.10), for each $y \in \mathbb{R}^d$, the identity (4.11). Since $A \subset \mathbb{R}^d$ was arbitrary, this completes the argument.
We will later use the fact that $\mathcal{F} = \sigma\left(\bigcup_{R>0}\sigma_{B_R}\right)$. This will allow us to obtain our general statement after considering measurable subsets $E \subset \Omega$ in the algebra of subsets $\bigcup_{R>0}\sigma_{B_R}$, where it is shown in the next proposition that for these subsets we can effectively apply the finite range dependence, see (2.7).
Recall that in Proposition 3.9, with high probability, we obtained an effective comparison between the kernels $R_{n+1}$ and $R_n$, where, in view of Theorem 2.1, the expectation is that the presence of the heat kernel will result in significant averaging for appropriate initial data. The following proposition quantifies the effect of this averaging.

Proposition 4.4
Assume (2.10) and (2.33). Suppose that, for $R_1 > 0$, $E \in \mathcal{F}$ satisfies $E \in \sigma_{B_{R_1}}$. For each $n \ge 0$, $1 \le k < \ell_n^2$ and $t \ge 0$ there exists $C = C(t, R_1) > 0$ independent of $E$, $n$ and $k$, and there exists $\zeta > 0$ independent of $R_1$, $E$, $n$, $k$ and $t$, such that the stated averaging estimate holds. Proof Fix $E \in \mathcal{F}$ and $R_1 > 0$ satisfying $E \in \sigma_{B_{R_1}}$, $t \ge 0$, $n \ge 0$ and $1 \le k < \ell_n^2$. We define

$$R_2 = 6D_n, \quad (4.14)$$

and observe that, in view of Proposition 4.1, for every $x \in \mathbb{R}^d$ and $\omega \in \Omega$, for $C_1 > 0$ independent of $n$, $k$, $x$, $t$ and $\omega$, the localization estimate (4.15) holds. In order to obtain better localization properties, we consider the quantity defined in (4.16). Here, observe that Theorem 2.1 implies, for $C > 0$ independent of $n$, the bound (4.17). We now define, for each $n \ge 0$, the spatial average $\tilde\pi_n$ of $\tilde R_{1+t,R_2}f_E(0,\omega)$, see (4.18), and see, in view of (4.15), for each $n \ge 0$, the comparison (4.19). Furthermore, using the stationarity (2.3), and since $f_E$ is stationary in the sense of (3.2), we have, for each $x \in \mathbb{R}^d$, the identity (4.20). The definition of $\tilde R_n$ in (3.17) and the choice of $R_2 = 6D_n$ in (4.14) imply that, for each $x \in \mathbb{R}^d$ and $\omega \in \Omega$, the solution is locally measurable and, using Proposition 3.6, localized in space. Therefore, for $R > 0$ as in (2.7), whenever $x, y \in \mathbb{R}^d$ satisfy $|x-y| \ge 12D_n + 2R_1 + R$, the random variables $\tilde R_{t+1,R_2}f_E(x,\omega)$ and $\tilde R_{t+1,R_2}f_E(y,\omega)$ are independent. (4.22) We now write $\tilde\pi_n = \tilde\pi_n(R_t f)$ and compute the variance. Since there exists $C = C(R_1) > 0$ such that, for all $n \ge 0$, $CD_n \ge 12D_n + 2R_1 + R$, and since, for each $x \in \mathbb{R}^d$ and $\omega \in \Omega$, the summands are bounded, we have, in view of (4.22), for $C = C(R_1) > 0$ independent of $n$, $k$ and $t$, a variance bound. Therefore, using (4.17), for $C = C(R_1) > 0$ independent of $n$, $k$ and $t$, we obtain (4.24) and, together with (4.24), Chebyshev's inequality implies that, for $C = C(R_1) > 0$ independent of $n$, $k$ and $t$, the estimate (4.25) holds. We will now extend a version of this estimate to the whole of $B_{4\sqrt{k}D_{n+1}}$. Fix $0 < \gamma < 1$ satisfying, in view of (2.16) and (2.17), the condition (4.26). Since $f_E$ is stationary in the sense of (3.2), the stationarity (2.3) and (4.20) imply that, for $C = C(R_1) > 0$ independent of $n$, $k$ and $t$, a union bound yields the extended estimate, where we observe that (4.26) implies that $-\gamma d/2 < -1$ and $(1 - (1+a)\gamma)d < 0$.
Therefore, in view of (2.18) and (2.19), there exist ζ > 0 and C = C(R 1 ) > 0 independent of n, k and t such that Using Theorem 2.1, Proposition 3.1 and (4.16), for each ω ∈ Ω and x ∈ R d , for C > 0 independent of n, k, t and R 1 , and, for C > 0 independent of n, we conclude, in view of (4.28), (4.29) Because (2.18) and (2.19) imply that there exists C > 0 independent of n such that, for all n ≥ 0 and k ≥ 1, for C = C(R 1 ) > 0 independent of n, k and t, using (4.27), (4.29) and (4.30), (4.31) Finally, since (2.19) and (2.20) imply that there exists C = C(t) > 0 such that, for C 1 > 0 as in (4.15), for all n ≥ 0, we conclude, in view of (4.15), (4.19) and (4.31), that there exists C = C(R 1 , t) > 0 independent of n and k such that which, since E, R 1 , n, k and t were arbitrary, completes the argument.
The following proposition is essentially a restatement of Proposition 3.9 better suited to our current circumstances, where we recall the definition of the subsets {A n } ∞ n=0 in (3.18).

Proposition 4.5
Assume (2.10) and (2.33). For every E ∈ F, n ≥ 0, 1 ≤ k < 2 n+1 , t ≥ 0 and ω ∈ A n , for C > 0 independent of E, n, k, t and ω, Proof Fix E ∈ F, n ≥ 0, 1 ≤ k < 2 n+1 and ω ∈ A n . In view of Proposition 3.2, there exists C > 0 independent of E, n, k, t and ω such that Therefore, since ω ∈ A n , Proposition 3.9 implies that, for C > 0 independent of E, n, k and ω, which, since E, n, k, t and ω were arbitrary, completes the argument.
Observe that the convergence obtained in Proposition 4.5 occurs along the discrete sequence of time steps k L 2 n on the balls B 4 √ kD n . We now upgrade this to convergence along the full limit, as t → ∞, using Control 2.3. The cost is that the convergence now occurs on a marginally smaller portion of space.
We are now prepared to present our main result. Define, for each n ≥ 0, 1 ≤ k < 2 n+1 , t ≥ 0 and E ∈ F satisfying, for some R 1 > 0, E ∈ σ B R 1 , for C = C(R 1 , t) > 0 as in Proposition 4.6, (4.35) and, for each n ≥ 0, (4.36) We have, using Proposition 4.6, for each n ≥ 0, t ≥ 0 and E ∈ F satisfying, for some R 1 > 0, E ∈ σ B R 1 , And, in view of (2.18), for each t ≥ 0 and E ∈ F satisfying E ∈ σ B R 1 , for some R 1 > 0, We therefore define, for each t ≥ 0 and E ∈ F satisfying E ∈ σ B R 1 , for some R 1 > 0, the subset Ω t (E) = { ω ∈ Ω | there exists n(ω) ≥ 0 such that, for all n ≥ n(ω), ω ∈ B n,t }, (4.37) where the Borel-Cantelli lemma implies that, for every t ≥ 0 and E ∈ F satisfying E ∈ σ B R 1 , for some R 1 > 0, We now present the invariance property of the measure π. In the proof, we use the fact that the sigma-algebra F is generated by subsets satisfying the hypothesis of Proposition 4.3.
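The application of the Borel-Cantelli lemma defining Ω t (E) is the standard one; schematically, with B n,t the events introduced in (4.35) and (4.36):

```latex
% Borel--Cantelli: if the probabilities of the complements are summable,
\[
  \sum_{n=0}^{\infty} \mathbb{P}\big(\Omega \setminus B_{n,t}\big) < \infty
  \quad\Longrightarrow\quad
  \mathbb{P}\Big(\liminf_{n\to\infty} B_{n,t}\Big) = 1,
\]
% so that, almost surely, \omega \in B_{n,t} for all n sufficiently large,
% which is precisely the statement \mathbb{P}(\Omega_t(E)) = 1 for the
% subset \Omega_t(E) defined in (4.37).
```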
Proposition 4.7
Assume (2.10) and (2.33). For every E ∈ F and t ≥ 0, Proof Observe that F is an algebra of subsets of Ω. That is, F is closed under relative complements and finite unions. Furthermore, for every E ∈ F, there exists R 1 > 0 such that E ∈ σ B R 1 . Fix t ≥ 0 and E ∈ F. For every ω ∈ Ω t (E) ∩ Ω 0 (E), (4.35) and (4.36) imply that And, in view of Proposition 3.13, since R t f E satisfies (3.2), Therefore, for every t ≥ 0 and E ∈ F, To conclude, the absolute continuity of π with respect to P and the dominated convergence theorem imply that, using a repetition of the argument appearing in Proposition 3.12, for each t ≥ 0, the rule defines a probability measure on (Ω, F). Therefore, since the Caratheodory Extension Theorem implies that, for every E ∈ F and t ≥ 0, which completes the argument.
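The final step of the proof rests on a standard uniqueness fact for measures, which we record schematically; here the algebra of the proof, denoted below by a bar, generates the full sigma-algebra F.

```latex
% Carath\'eodory/Dynkin uniqueness: two probability measures that agree on
% an algebra agree on the generated sigma-algebra.
\[
  \mu_1 = \mu_2 \ \text{on the algebra } \bar{\mathcal{F}}
  \quad\Longrightarrow\quad
  \mu_1 = \mu_2 \ \text{on } \sigma\big(\bar{\mathcal{F}}\big) = \mathcal{F}.
\]
```

In the proposition, this is applied with the two measures given by π and its image under the time-t evolution.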
In the final proposition of this section, we prove that the invariant measure π is the unique invariant measure which is absolutely continuous with respect to P. Furthermore, π is mutually absolutely continuous with respect to P and defines an ergodic probability measure for the canonical Markov process on Ω defining (1.6). The proof of ergodicity is presented for the convenience of the reader, since it is virtually identical to that presented in [9, Theorem 2.1].
Recall that the ergodicity of (2.2) implies that, whenever E ∈ F satisfies, with equality up to sets of measure zero, τ x (E) = E for every x ∈ R d , then P(E) = 1 or P(E) = 0. (4.39)

Theorem 4.8
Assume (2.10) and (2.33). There exists a unique invariant measure which is absolutely continuous with respect to P. Furthermore, the invariant measure is mutually absolutely continuous with respect to P and defines an ergodic probability measure for the canonical Markov process on Ω defining (1.6).
Proof The measure π constructed according to (3.42) was shown in Proposition 3.12 to be absolutely continuous with respect to P and was shown to be invariant in Proposition 4.7. It therefore suffices to prove uniqueness. Suppose that μ is a probability measure on (Ω, F) which is absolutely continuous with respect to P and satisfies, for each t ≥ 0 and E ∈ F, (4.40) Fix E ∈ F. In view of (4.35) and (4.36), for every ω ∈ Ω 0 (E), as t → ∞, Furthermore, since μ is absolutely continuous with respect to P, μ(Ω 0 (E)) = P(Ω 0 (E)) = 1. (4.41) Therefore, the dominated convergence theorem, (4.40) and (4.41) imply that Since E ∈ F was arbitrary, and since F is an algebra of subsets, the Caratheodory Extension Theorem implies, using the fact that F = σ (F), that, for every E ∈ F, which completes the argument. The argument for mutual absolute continuity proceeds by contradiction. If not, the Lebesgue decomposition theorem implies that there exists a subset E ⊂ Ω satisfying 0 < P(Ω\E) < 1 and π(Ω\E) = 0, and with P absolutely continuous with respect to π on E. Since and since π is absolutely continuous with respect to P, this implies that, for almost every ω ∈ E with respect to P, and for almost every y ∈ R d , we have τ y ω ∈ E. Fubini's theorem therefore implies that, for almost every y ∈ R d , up to a set of measure zero, And, since the map y → 1 E (τ y ω) is continuous from R d to L 1 (Ω), we conclude that, for every y ∈ R d , up to a set of measure zero, Therefore, using (4.39), we have P(E) = 0 or P(E) = 1, a contradiction, which completes the proof of mutual absolute continuity.
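The contradiction at the end of the argument rests on the following schematic chain, in which E is the set produced by the Lebesgue decomposition:

```latex
% Schematic form of the mutual-absolute-continuity argument:
% invariance of E under the translation group, up to null sets,
\[
  \tau_y(E) = E \ \text{up to a } \mathbb{P}\text{-null set, for every } y \in \mathbb{R}^d,
\]
% combined with the ergodicity statement (4.39), forces
\[
  \mathbb{P}(E) \in \{0, 1\},
\]
% contradicting 0 < \mathbb{P}(\Omega \setminus E) < 1 obtained from the
% Lebesgue decomposition theorem.
```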
We now prove the ergodicity. Suppose that, for E ∈ F and t ≥ 0, we have R t f E (0, ω) = P t (ω, E) = 1 for almost every ω ∈ E with respect to π.
Since P is absolutely continuous with respect to π, this implies R t f E (0, ω) = P t (ω, E) = 1 for almost every ω ∈ E with respect to P, which, by repeating the above argument, implies that E ∈ F is an invariant set under the transformation group {τ x } x∈R d . Therefore, because (4.39) implies P(E) = 0 or P(E) = 1, we conclude, since π is absolutely continuous with respect to P, that either π(E) = 0 or π(E) = 1, which completes the argument.

A Proof of Homogenization for Locally Measurable Functions
In this section, we characterize the limiting behavior, as ε → 0, on a subset of full probability, of solutions u ε : R d × [0, ∞) × Ω → R satisfying u ε t = (1/2) tr(A(x/ε, ω)D 2 u ε ) for initial data f ∈ L ∞ (R d × Ω) which are stationary in the sense of (3.2) and locally measurable in the sense that, for some R > 0, f (0, ω) ∈ L ∞ (Ω, σ B R ).
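For orientation, we record the standard parabolic scaling relating the rescaled problem to the original one; this is a routine computation, not a statement from the paper's proofs, and u denotes a generic solution of the unscaled equation.

```latex
% Parabolic rescaling: if u(\cdot,\cdot,\omega) solves
% u_t = \tfrac{1}{2}\mathrm{tr}(A(x,\omega)D^2u) with u(x,0,\omega)=f(x,\omega),
% define
\[
  u^{\varepsilon}(x, t, \omega) := u\big(x/\varepsilon,\, t/\varepsilon^2,\, \omega\big).
\]
% Then, by the chain rule,
\[
  u^{\varepsilon}_t
    = \frac{1}{2}\,\mathrm{tr}\big(A(x/\varepsilon, \omega)\,D^2 u^{\varepsilon}\big),
  \qquad
  u^{\varepsilon}(x, 0, \omega) = f(x/\varepsilon, \omega),
\]
% so the small-\varepsilon limit of u^\varepsilon corresponds to the
% long-time behavior of the original diffusion.
```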
The local measurability ensures that the triplet (A(x, ω), b(x, ω), f (x, ω)) satisfies finite range dependence in the sense of (2.7), and the following three statements can be extended to this situation by a repetition of the arguments appearing in Sect. 4. We remark, however, that after a rescaling, the case of local measurability and Theorem 5.3 below are enough to prove the convergence, on a subset of full probability, for each p ∈ R d , of the approximate first-order correctors. In what follows, for each E ∈ ∪ R>0 σ B R , recall the subset of full probability Ω 0 (E) defined in (4.37), and the related subsets B n,0 (E) defined in (4.36).
Furthermore, for each f, g ∈ L ∞ (R d × Ω) and each n ≥ 0, It therefore suffices, using the definition of the Lebesgue integral, to prove the theorem for translates under {τ x } x∈R d of indicator functions corresponding to locally measurable sets in F.