Quenched normal approximation for random sequences of transformations

We study random compositions of transformations having certain uniform fiberwise properties and prove bounds which in combination with other results yield a quenched central limit theorem equipped with a convergence rate, also in the multivariate case, assuming fiberwise centering. For the most part we work with non-stationary randomness and non-invariant, non-product measures. Independently, we believe our work sheds light on the mechanisms that make quenched central limit theorems work, by dissecting the problem into three separate parts.


Remark 1.1
There is no fundamental reason for working with one-sided time other than that the randomness in our paper is mostly non-stationary, a context in which the concept of an infinite past is perhaps unnatural. For stationary randomness there is no obstacle to two-sided time. The other reason is plain philosophy: our concern will be the future, and whether the observed system has been running before time 0 we choose to ignore, without damage as long as our assumptions (specified later) hold from time 0 onward.
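Consider an observable $f : X \to \mathbb{R}$. Introducing notations, we write $f_i = f \circ T_{\omega_i} \circ \cdots \circ T_{\omega_1}$ as well as $W_n = n^{-1/2} \sum_{i=0}^{n-1} f_i$. Given an initial probability measure $\mu$, we write $\bar f_i$ and $\bar W_n$ for the corresponding fiberwise-centered random variables: $\bar f_i = f_i - \mu(f_i)$ and $\bar W_n = W_n - \mu(W_n)$. Note that all of these depend on $\omega$. Next, we define $\sigma_n^2 = \mu(\bar W_n^2) = \mathrm{Var}_\mu \bar W_n$. Note that $\sigma_n^2$ depends on $\omega$. It is said that a quenched CLT equipped with a rate of convergence holds if there exists $\sigma > 0$ such that $d(\bar W_n, \sigma Z)$ tends to zero with some (in our case, uniform) rate for almost every $\omega$.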
Here $Z \sim N(0,1)$ and the limit variance $\sigma^2$ is independent of $\omega$. Moreover, $d$ is a distance of probability distributions which we assume to satisfy $d(\bar W_n, \sigma Z) \le d(\bar W_n, \sigma_n Z) + d(\sigma_n Z, \sigma Z)$ and $d(\sigma_n Z, \sigma Z) \le C|\sigma_n - \sigma|$, at least when $\sigma > 0$ and $\sigma_n$ is close to $\sigma$; and that $d(\bar W_n, \sigma Z) \to 0$ implies weak convergence of $\bar W_n$ to $N(0, \sigma^2)$. One can find results in the recent literature that allow one to bound $d(\bar W_n, \sigma_n Z)$; see Nicol-Török-Vaienti [19] and Hella [13]. In this paper we supplement those by providing conditions which allow one to identify a non-random $\sigma$ and to obtain a bound on $|\sigma_n(\omega) - \sigma|$ which tends to zero at a certain rate for almost every $\omega$, which is a key feature of quenched CLTs.
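The two properties assumed of $d$ can be combined into a single estimate in terms of the variances. The display below is a minimal sketch, assuming only $\sigma > 0$, $\sigma_n \ge 0$ and the two inequalities stated above, with $C$ the constant appearing there.

```latex
\begin{align*}
d(\bar W_n, \sigma Z)
  &\le d(\bar W_n, \sigma_n Z) + d(\sigma_n Z, \sigma Z) \\
  &\le d(\bar W_n, \sigma_n Z) + C\,|\sigma_n - \sigma| \\
  &=   d(\bar W_n, \sigma_n Z) + C\,\frac{|\sigma_n^2 - \sigma^2|}{\sigma_n + \sigma} \\
  &\le d(\bar W_n, \sigma_n Z) + \frac{C}{\sigma}\,|\sigma_n^2 - \sigma^2| .
\end{align*}
```

In this form, a rate for $|\sigma_n^2 - \sigma^2|$ transfers directly to a rate for the distance to $\sigma Z$.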
Our strategy is to find conditions such that $\sigma_n^2(\omega)$ converges almost surely to $\sigma^2 = \lim_{n\to\infty} E\sigma_n^2$. This is motivated by two observations: (i) if $\lim_{n\to\infty} \sigma_n^2 = \sigma^2$ almost surely, dominated convergence should yield the equation above, and (ii) $E\sigma_n^2$ is the variance of $\bar W_n$ with respect to the product measure $P \otimes \mu$, since $\mu(\bar W_n) = 0$: $E\sigma_n^2 = E\mu(\bar W_n^2) = \mathrm{Var}_{P\otimes\mu} \bar W_n$.

Remark 1.2
One has to be careful and note that $\bar W_n$ has been centered fiberwise, with respect to $\mu$ instead of the product measure. Therefore, $\mathrm{Var}_{P\otimes\mu} \bar W_n$ and $\mathrm{Var}_{P\otimes\mu} W_n$ differ by $\mathrm{Var}_P\, \mu(W_n)$: $E\sigma_n^2 = \mathrm{Var}_{P\otimes\mu} \bar W_n = E\mu(\bar W_n^2) = E\,\mathrm{Var}_\mu \bar W_n = E\,\mathrm{Var}_\mu W_n = \mathrm{Var}_{P\otimes\mu} W_n - \mathrm{Var}_P\, \mu(W_n)$. In special cases it may happen that $\mathrm{Var}_P\, \mu(W_n) \to 0$, or even $\mathrm{Var}_P\, \mu(W_n) = 0$ if all the maps $T_{\omega_i}$ preserve the measure $\mu$, whereby the distinction vanishes and the use of a non-random centering becomes feasible. We will briefly return to this point in Remark C.2 motivated by a result in [1]. A related observation is made in Remark A.3 which answers a question raised in [2] concerning the trick of "doubling the dimension".
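The identity in Remark 1.2 is an instance of the law of total variance. The following short derivation is a sketch, assuming that $W_n$ is square integrable with respect to $P \otimes \mu$ (automatic when $f$ is bounded) and that $E$ denotes expectation with respect to $P$.

```latex
\begin{align*}
\mathrm{Var}_{P\otimes\mu} W_n
  &= E\,\mu(W_n^2) - \bigl(E\,\mu(W_n)\bigr)^2 \\
  &= E\bigl[\mu(W_n^2) - \mu(W_n)^2\bigr]
     + E\bigl[\mu(W_n)^2\bigr] - \bigl(E\,\mu(W_n)\bigr)^2 \\
  &= E\,\mathrm{Var}_\mu W_n + \mathrm{Var}_P\,\mu(W_n).
\end{align*}
% Subtracting the omega-dependent constant mu(W_n) does not change Var_mu,
% so Var_mu W_n = Var_mu \bar W_n = mu(\bar W_n^2) = sigma_n^2.
```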
To implement the strategy, we handle the terms on the right side of $|\sigma_n^2(\omega) - \sigma^2| \le |\sigma_n^2(\omega) - E\sigma_n^2| + |E\sigma_n^2 - \sigma^2|$ separately, obtaining convergence rates for both. Note that these are of fundamentally different type: the first one concerns almost sure deviations of $\sigma_n^2$ about the mean, while the second one concerns convergence of said mean together with the identification of the limit.

Remark 1.3
That the required bounds can be obtained illuminates the following pathway to a quenched central limit theorem: (1) $d(\bar W_n, \sigma_n Z) \to 0$ almost surely, (2) $\sigma_n^2 - E\sigma_n^2 \to 0$ almost surely, (3) $E\sigma_n^2 \to \sigma^2$ for some $\sigma^2 > 0$, where the last step involves the identification of $\sigma^2$.

Remark 1.4
Let us emphasize that in general we do not assume $P$ to be stationary or of product form; $\mu$ to be invariant for any of the maps $T_{\omega_i}$; or $P \otimes \mu$ (or any other measure of similar product form) to be invariant for the random dynamical system (RDS) associated to the cocycle $\varphi$.
Quenched limit theorems for RDSs are abundant in the literature, going back at least to Kifer [14]. Nevertheless they remain a lively topic of research to date: recent central limit theorems and invariance principles in such a setting include Ayyer-Liverani-Stenlund [4], Nandori-Szasz-Varju [18], Aimino-Nicol-Vaienti [2], Abdelkader-Aimino [1], Nicol-Török-Vaienti [19], Dragičević et al. [9,10], and Chen-Yang-Zhang [8]. Moreover, Bahsoun et al. [5-7] establish important optimal quenched correlation bounds with applications to limit results, and Freitas-Freitas-Vaienti [11] establish interesting extreme value laws which have attracted plenty of attention during the past years.

Structure of the paper. The main result of our paper is Theorem 4.1 in Sect. 4. It is an immediate corollary of Theorem 2.14 of Sect. 2, which concerns $|\sigma_n^2(\omega) - E\sigma_n^2|$, and of Theorem 3.9 of Sect. 3, which concerns $|E\sigma_n^2 - \sigma^2|$. In Sect. 4 we also explain how the results of this paper extend to the vector-valued case $f : X \to \mathbb{R}^d$. As the conditions of our results may appear a bit abstract, Remark 4.5 in Sect. 4 contains examples of systems where these conditions have been verified.
At the end of the paper the reader will find several appendices, which are integral parts of the paper: in Appendix A we interpret the limit variance $\sigma^2$ in the language of RDSs and skew products. In Appendix B we present conditions for $\sigma^2 > 0$. In Appendix C, we discuss how the fiberwise centering in the definition of $\bar W_n$ affects the limit variance. For completeness, in Appendix D we elaborate on the structure of an invariant measure intimately related to the problem.

The Term $|\sigma_n^2 - E\sigma_n^2|$
In this section we identify conditions which guarantee that, almost surely, $|\sigma_n^2(\omega) - E\sigma_n^2|$ tends to zero at a specific rate.

Standing Assumption (SA1) Throughout this paper we will assume that $f$ is a bounded measurable function and $\mu$ is a probability measure. We also assume that a uniform decay of correlations holds in that Note already that and $\delta_{ij} = 0$ otherwise) and their centered counterparts Note that these are uniformly bounded. We also denote $\tilde\sigma_n^2 = \sigma_n^2 - E\sigma_n^2$. Thus, our objective is to show $\tilde\sigma_n^2 \to 0$ at some rate. The following lemma is readily obtained by a well-known computation: Lemma 2.1 Assuming (2), there exists a constant $C > 0$ such that for all $\omega$.
Proof First, we compute The last sums tend to zero by assumption.
We skip the elementary proof based on Lemma 2.1.

Remark 2.3
Of course, the upper bounds in the preceding results apply equally well tõ The following result, which has been used in dynamical systems papers including Melbourne-Nicol [17], will be used to obtain an almost sure convergence rate of 1 n n−1 i=0ṽ i to zero: ; see also ) Let (X n ) be a sequence of centered, square-integrable, random variables. Suppose there exist C > 0 and q > 0 such that for all m ≥ 0 and n ≥ 1. Let δ > 0 be arbitrary. Then, almost surely,

Remark 2.5
In this paper the theorem is applied in the range $1 \le q < 2$. In particular, $n^q + m^q \le (n + m)^q$ then holds, so it suffices to establish an upper bound of the form $Cn^q$.
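For completeness, the elementary inequality used in Remark 2.5 can be checked as follows. This is a sketch for $q \ge 1$ and integers $n, m \ge 0$ with $n + m > 0$; the auxiliary variable $t$ is introduced here only for the computation.

```latex
\[
t = \frac{n}{n+m} \in [0,1]
\quad\Longrightarrow\quad
\frac{n^q + m^q}{(n+m)^q} = t^q + (1-t)^q \le t + (1-t) = 1 ,
\]
% because s^q <= s whenever s is in [0,1] and q >= 1.
% Equivalently: n^q <= (n+m)^q - m^q.
```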
Our application of Theorem 2.4 will be based on the following standard lemma: Bounding the last sum in each case yields the result.

Dependent Random Selection Process
It is most interesting to study the case where the sequence $\omega = (\omega_i)_{i\ge 1}$ is generated by a nontrivial stochastic process such that the measure $P$ is not the product of its one-dimensional marginals. Essentially without loss of generality, we pass directly to the so-called canonical version of the process, which corresponds to the point of view that the sequence $\omega$ is the seed of the random process. In the following we briefly review some standard details. Let $\pi_i : \Omega \to \Omega_0$ be the projection $\pi_i(\omega) = \omega_i$. The product sigma-algebra $\mathcal{F}$ is the smallest sigma-algebra with respect to which all the latter projections are measurable. For any $I = (i_1, \dots, i_p) \subset \mathbb{Z}_+$, $p \in \mathbb{Z}_+ \cup \{\infty\}$, we may define the sub-sigma-algebra $\mathcal{F}_I = \sigma(\pi_i : i \in I)$ of $\mathcal{F}$. (In particular, $\mathcal{F} = \mathcal{F}_{\mathbb{Z}_+}$.) We also recall that a function $u : \Omega \to \mathbb{R}$ is $\mathcal{F}_I$-measurable if and only if there exists an $\mathcal{E}^p$-measurable function $\tilde u : \Omega_0^p \to \mathbb{R}$ such that $u = \tilde u \circ (\pi_{i_1}, \dots, \pi_{i_p})$, i.e., $u(\omega) = \tilde u(\omega_{i_1}, \dots, \omega_{i_p})$. With slight abuse of language, we will say below that the sigma-algebra $\mathcal{F}_I$ is generated by the random variables $\omega_i$, $i \in I$, instead of the projections $\pi_i$. In particular, we denote Denote In the following $(\alpha(n))_{n\ge 1}$ will denote a sequence such that for each $n \ge 1$.
Standing Assumption (SA2) Throughout the rest of the paper we assume that the random selection process is strong mixing: $\alpha(n)$ can be chosen so that $\lim_{n\to\infty} \alpha(n) = 0$ and $\alpha$ is non-increasing.
as is well known. Ultimately, we will impose a rate of decay on α(n).
We denote by $T_*$ the pushforward of a map $T$, acting on a probability measure $m$, i.e., $(T_* m)(A) = m(T^{-1}A)$ for measurable sets $A$. We write We also write for $l \ge k$. Note that all of these objects depend on $\omega$ through the maps $T_{\omega_i}$. We use the conventions $\mu_0 = \mu$, $\mu_{r,r+1} = \mu$ and $f_{k,k+1} = f$ here.
Standing Assumption (SA3) Throughout the rest of the paper we assume the following uniform memory-loss condition: there exists a constant $C \ge 0$ such that for all whenever $k \ge r$. The bound holds uniformly for (almost) all $\omega$.
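As a reminder of how the pushforward defined above interacts with integration and composition, the definition $(T_* m)(A) = m(T^{-1}A)$ extends from indicator functions to bounded measurable $g$ by the usual approximation argument. The display is a sketch of these standard facts; the maps $T_1$ and $T_2$ are placeholder names introduced here for illustration.

```latex
\[
\int g \, d(T_* m) = \int g \circ T \, dm ,
\qquad
(T_2 \circ T_1)_* m = (T_2)_* \bigl( (T_1)_* m \bigr) ,
\]
% the second identity because
% m((T_2 \circ T_1)^{-1} A) = m(T_1^{-1} T_2^{-1} A) = ((T_1)_* m)(T_2^{-1} A).
```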
In the cocycle notation, (4) reads Note that, settingc Lemma 2.7 There exists a constant C ≥ 0 such that Proof The first bound holds, because while the choices g = f and g = f l,k+1 together yield Hence Note that here the expression in the curly braces only depends on the random variables by (6). On the other hand, the strong-mixing bound (3) implies Moreover, Collecting the bounds leads to the estimate Note that (6) immediately yields the estimate which by the boundedness of α results in Taking the minimum with respect to r proves the lemma.
The upper bound $|E[\tilde c_{ij}\tilde c_{kl}]| \le C\eta(j - i)\eta(l - k)$ of Lemma 2.7 yields the following intermediate result: In the third line we used the upper bound Next we investigate the remaining term by choosing $r = j$. Suppose furthermore that $k - j \ge l - k$ and recall $\eta$ is non-increasing. Then the right side of the above display is bounded above by $C\eta(l - k)$. In other words, if $i \le j \le k \le l \le 2k - j$, then $C\eta(j - i)\min_{r : j \le r \le k}\{\eta(k - r) + \alpha(r - j)\eta(l - k)\}$ is the tightest bound on $|E[\tilde c_{ij}\tilde c_{kl}]|$ that Lemma 2.7 can provide. This observation motivates the following lemma.

Lemma 2.9 Define
(ii) There exist constants C 1 ≥ 0 and C 2 ≥ 0 such that Proof Part (i) is an immediate corollary of Lemma 2.7. As for part (ii), let us first prove the lower bound. Since all the terms in S(i, k) are nonnegative and α is non-increasing, we have for i < k that It remains to prove the upper bound in part (ii). We choose r = (k + j)/2 . Since η is summable, we have Next we split the last sum above into two parts, keeping in mind that α and η are non-increasing and η is also summable: This completes the proof.
The next two lemmas concern the case when η and α are polynomial.

Lemma 2.10
Let $\eta(n) = Cn^{-\psi}$, $\psi > 1$ and $\alpha(n) = Cn^{-\gamma}$, $\gamma > 0$. Then Proof The lower bound follows immediately from Lemma 2.9(ii). Let first $m \ge 8$. Then $\lfloor m/4 \rfloor \ge m/8$. Thus Lemma 2.9(ii) yields $Cm^2$ by counting terms, we can choose a large enough $C_2$ such that the claimed upper bound holds also for $1 \le m < 8$. Secondly, Regarding the last sum appearing above, observe that In other words, also Now, by Lemmas 2.9(i) and 2.10 we have Thus, Lemma 2.8 and bounds (7) and (8) yield The proof is complete.
Notice that for any $\varepsilon > 0$ we have $n \log n = O(n^{1+\varepsilon})$. Applying Theorem 2.4 with yields the claim.
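The bound $n \log n = O(n^{1+\varepsilon})$ used above follows from $\log x \le x - 1$ for $x > 0$. A short sketch of the computation, with an explicit (not optimized) constant $1/\varepsilon$:

```latex
\[
\log n = \frac{1}{\varepsilon}\,\log\bigl(n^{\varepsilon}\bigr)
       \le \frac{1}{\varepsilon}\,\bigl(n^{\varepsilon} - 1\bigr)
       < \frac{n^{\varepsilon}}{\varepsilon}
\quad (n \ge 1),
\qquad\text{hence}\qquad
n \log n \le \frac{1}{\varepsilon}\, n^{1+\varepsilon} .
\]
```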
almost surely.
We are now in position to prove the main result of this section: Then, for arbitrary δ > 0, almost surely.
Proof By Corollary 2.2, Combining this with Proposition 2.13 yields the following upper bounds on |σ 2 n − Eσ 2 n |: In each case the first term is the largest, so the proof is complete.

The Term $|E\sigma_n^2 - \sigma^2|$
In this section we formulate general conditions that allow us to identify the limit $\sigma^2 = \lim_{n\to\infty} E\sigma_n^2$ and obtain a rate of convergence. Write for brevity. Then we arrive at Recall that

Asymptotics of Related Double Sums of Real Numbers
In this subsection we consider double sequences of uniformly bounded numbers a ik , (i, k) ∈ N 2 , with the objective of controlling the sequence for large values of n. In this subsection, we make the following assumption tailored to our later needs: We also denote the tail sums of η by We begin with a handy observation: Proof For all choices of 0 < K ≤ n we have The error is uniform because of the uniform condition |a ik | ≤ η(k).
uniformly, which concludes the proof.
The following lemma helps identify the limit of $B_n$ and the rate of convergence under certain circumstances: The series on the right side converges absolutely. Furthermore, denoting also $|b_k| \le \eta(k)$, so the series $\sum_{k=0}^{\infty} b_k$ converges absolutely. Lemma 3.1 with $L = K$ yields uniformly for all $0 < K \le n$. Thus, the definition of $r_k(n)$ gives (11). To prove the convergence of $B_n$, consider (11) and fix an arbitrary $\varepsilon > 0$. Fix $K$ so large that $R(K) < \varepsilon/(2C)$. Since $\sum_{k=0}^{K} r_k(n) + Kn^{-1}$ tends to zero with increasing $n$, it is bounded by $\varepsilon/(2C)$ for all large $n$. Then $|B_n - \sum_{k=0}^{\infty} b_k| < \varepsilon$.

Convergence of $E\sigma_n^2$: A General Result
In this subsection we apply the results of the preceding subsection to the sequence Recall from (9) and (2) of (SA1) that the standing assumption in (10) is satisfied: $|Ev_{ik}| \le 2\eta(k)$ and $\sum_{k=0}^{\infty} \eta(k) < \infty$. The next theorem is nothing but a rephrasing of Lemma 3.2 in the case $a_{ik} = Ev_{ik}$ at hand.

Theorem 3.3 Suppose the limit
Ev ik exists for all k ≥ 0. The series In particular, σ 2 ≥ 0. Furthermore, there exists a constant C > 0 such that

Convergence of $E\sigma_n^2$: Asymptotically Mean Stationary $P$
For the rest of the section we assume $P$ is asymptotically mean stationary, with mean $\bar P$. In other words, there exists a measure $\bar P$ such that, given a bounded measurable $g : \Omega \to \mathbb{R}$, $\lim_{n\to\infty} n^{-1}\sum_{i=0}^{n-1} E\, g\circ\tau^i = \int g \, d\bar P$ (12). The measure $\bar P$ is then $\tau$-invariant. We denote $\bar E g = \int g \, d\bar P$. We will shortly impose additional rate conditions; see (15).
Recall the cocycle property of the random compositions. In what follows, it will be convenient to use the notations and For the results of this section we need the following preliminary lemma, which crucially relies on the memory-loss property (SA3), assumed to hold throughout this text.
Proof Note that we may rewrite the memory-loss property in (5) as On the other hand, which completes the proof.
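The following lemma guarantees that both limits $\lim_{n\to\infty} n^{-1}\sum_{i=0}^{n-1} E\mu(f_i f_{i+k})$ and $\lim_{n\to\infty} n^{-1}\sum_{i=0}^{n-1} E\mu(f_i)\mu(f_{i+k})$ exist and can be expressed in terms of $\bar P$. $\lim_{n\to\infty} n^{-1}\sum_{i=0}^{n-1} E g^a_{ik} = \lim_{j\to\infty} \bar E g^a_{jk}$.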
In particular, the limits exist.
Proof First we make the observation that since $\bar P$ is stationary, (13) implies whenever $i \ge r$. From assumption (2) it follows that $\lim_{r\to\infty} \eta(r) = 0$. The sequence $(\bar E g^a_{ik})_{i=0}^{\infty}$ is therefore Cauchy, so $\lim_{i\to\infty} \bar E g^a_{ik}$ exists and respects the same bound, i.e., We are now ready to show that $\lim_{n\to\infty} n^{-1}\sum_{i=0}^{n-1} E g^a_{ik}$ exists and in the process we see that it is equal to $\lim_{j\to\infty} \bar E g^a_{jk}$. Let $\varepsilon > 0$. Choose $r \in \mathbb{N}$ such that $C\eta(r) < \varepsilon/5$, where $C$ is the same constant as above. Then choose $n_0 \in \mathbb{N}$ that satisfies the following two conditions. First, $\|f\|_\infty^2 \, r/n_0 < \varepsilon/5$.
Second, by (12), $|n^{-1}\sum_{i=0}^{n-1} E g^a_{rk}\circ\tau^i - \bar E g^a_{rk}| < \varepsilon/5$ for all $n \ge n_0$. Next we show that $|n^{-1}\sum_{i=0}^{n-1} E g^a_{ik} - \lim_{j\to\infty} \bar E g^a_{jk}| < \varepsilon$ for all $n \ge n_0$. The following five estimates yield the desired result: In this first estimate, note that $\|g^a_{ik}\|_\infty \le \|f\|_\infty^2$ for all $i, k \in \mathbb{N}$ and $a \in \{1, 2\}$: In the second estimate, we apply (13): The third estimate follows the same reasoning as the first: The fourth estimate follows by the definition of $n_0$: The last estimate holds by (14): These estimates combined yield $|n^{-1}\sum_{i=0}^{n-1} E g^a_{ik} - \lim_{j\to\infty} \bar E g^a_{jk}| < \varepsilon$ for all $n \ge n_0$. Since $\lim_{j\to\infty} \bar E g^a_{jk}$ exists, also $\lim_{n\to\infty} n^{-1}\sum_{i=0}^{n-1} E g^a_{ik}$ exists and is equal to it. Theorem 3.3 yields the next result as a corollary.

Theorem 3.6 The series
is absolutely convergent, and Now the rest of the claim follows from Theorem 3.3.
Standing Assumption (SA4) For the rest of the paper we assume that $P$ is asymptotically mean stationary, and there exist $C_0 > 0$ and $\zeta > 0$ such that for all $n \ge 1$. Here the sup is taken over all $r, k \ge 0$ and $a \in \{1, 2\}$.

Lemma 3.7
For all integers 0 < n 1 < n 2 , where C is uniform.
Next we use Lemma 3.7 to provide an upper bound on $|n^{-1}\sum_{i=0}^{n-1} E g^a_{ik} - \lim_{r\to\infty} \bar E g^a_{rk}|$. Note that just making the substitutions $n_1 = 0$ and $n_2 = n$ in Lemma 3.7 does not yield a good result. Instead we divide the sum $\sum_{i=0}^{n-1} E g^a_{ik}$ into an increasing number of partial sums and then apply Lemma 3.7 separately to those parts.
Before proceeding to the next lemma, we define a function $h_\zeta : \mathbb{N} \to \mathbb{R}$ which depends on the parameter $\zeta$ in the following way Lemma 3.7 yields Lemma 3.7 also gives 1 n In the last line we used the fact that $n/2 \le 2^{n_*} \le n$, implying $n - 2^{n_*} \le n/2$. Collecting the estimates (20), (21) and (22), we deduce 1
We are finally ready to state and prove the main result of this section:
Theorem 3.9 Assume (SA1) and (SA3) with $\eta(n) = Cn^{-\psi}$, $\psi > 1$. Assume (SA4) with $\zeta > 0$. Then Here $\sigma^2$ is the quantity appearing in Theorem 3.6.
Proof Let k ≥ 0. The previous lemma applied to case a = 1 yields Similarly in the case a = 2 Equations (23), (24) and Theorem 3.6 imply that We apply Theorem 3.3, which yields for all 0 < K ≤ n. The estimate on the right side of (25) is minimized, when h ζ (n) = K −ψ . Therefore choosing

Main Result and Consequences
Theorems 2.14 and 3.9 immediately yield the main result of the paper, given next. The bounds shown are elementary combinations of these theorems, so we leave the details to the reader. Let us remind the reader of the Standing Assumptions (SA1)-(SA4) in Sects. 2 and 3. At the end of the section we also comment on the case of vector-valued observables. Here $\delta_{k0} = 1$ if $k = 0$, and $\delta_{k0} = 0$ if $k \ne 0$.
is well defined, nonnegative, the series is absolutely convergent, and $\lim_{n\to\infty} \sigma_n^2(\omega) = \sigma^2$ for every $\omega \in \Omega_*$. Moreover, the absolute difference $|\sigma_n^2(\omega) - \sigma^2|$ has the following upper bounds, for any $\omega \in \Omega_*$:
Let us reiterate that Theorem 4.1 facilitates proving quenched central limit theorems with convergence rates for the fiberwise centered $\bar W_n$. Recalling the discussion from the beginning of the paper, we have the following trivial lemma (thus presented without proof): In other words, once a bound on the first term on the right side has been established (e.g., using methods cited earlier), one can use Theorem 4.1 to bound the second term almost surely. Typical metrics satisfying (26) are the 1-Lipschitz (Wasserstein) and Kolmogorov distances.
The results presented above allow one to formulate some sufficient conditions for $\sigma^2 > 0$. For simplicity, we proceed in the ideal parameter regime Generalizations of the next result involving any of the other parameter regimes of Theorem 4.1 are straightforward, and left to the reader. Then $\sigma^2 > 0$.
Proof Suppose σ 2 = 0. We will derive a contradiction in each case.
We will return to the question of whether σ 2 = 0 or σ 2 > 0 in Lemma B.1.

Vector-Valued Observables
Let us conclude by explaining, as promised, how the results extend with ease to the case of a vector-valued observable f : X → R d . This time σ 2 n is a d × d covariance matrix and, if the limit exists, so is σ 2 = lim n→∞ σ 2 n . Define the functions n : R d → R by for all matrix norms.
Dropping the subindex n yields the limit matrix elements σ 2 αβ . Since α and β can take only finitely many values, simultaneous almost sure convergence for the matrix elements with the claimed rate follows.
According to the lemma, the rate of convergence of the covariance matrix $\sigma_n^2$ to $\sigma^2$ can be established by applying the earlier results to the finite family of scalar-valued observables $(e_\alpha + e_\beta)^T f$. Further, one may apply Corollary 4.3 (or Lemma B.1) to the observables $v^T f$ for all unit vectors $v$ to obtain conditions for $\sigma^2$ being positive definite. Assuming now it is, for certain metrics (e.g. 1-Lipschitz) one has where $Z \sim N(0, I_{d\times d})$ and $C = C(\sigma)$, which again yields an estimate of the type We refer the reader to Hella [13] for details, including the hard part of establishing an almost sure, asymptotically decaying bound on $d(\bar W_n, \sigma_n Z)$ in the vector-valued case.
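One way to see why finitely many scalar observables suffice is the polarization identity for covariances. The display below is a sketch, assuming the covariance matrix is given entrywise by $\mu$-expectations of products of the fiberwise-centered components; the shorthand $X_\alpha = e_\alpha^T \bar W_n$ is introduced here only for illustration, and the lemma in the text may be phrased differently.

```latex
\[
(\sigma_n^2)_{\alpha\beta}
  = \mu\bigl(X_\alpha X_\beta\bigr)
  = \tfrac12 \Bigl[ \mu\bigl((X_\alpha + X_\beta)^2\bigr)
      - \mu\bigl(X_\alpha^2\bigr) - \mu\bigl(X_\beta^2\bigr) \Bigr],
\qquad X_\alpha = e_\alpha^{T} \bar W_n .
\]
% X_alpha + X_beta is the normalized, fiberwise-centered sum associated with
% the scalar observable (e_alpha + e_beta)^T f, so each term on the right is
% a scalar variance of the kind treated earlier.
```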

Remark 4.5
As an application, Hella [13] establishes the convergence rate $n^{-1/2} \log^{3/2+\delta} n$ for random compositions of uniformly expanding circle maps in the regime (27). Furthermore, Leppänen and Stenlund [16] establish the same result for random compositions of nonuniformly expanding Pomeau-Manneville maps.
Acknowledgements Open access funding provided by University of Helsinki including Helsinki University Central Hospital.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Appendix A. Random Dynamical Systems
In this section we interpret the limit variance of Theorems 3.6 and 3.9 from the point of view of RDSs. Like elsewhere in the paper, we will assume the system possesses the good, uniform, fiberwise properties of the Standing Assumptions.
Recall that $\tau$ preserves the probability measure $\bar P$ in (12), i.e., $\tau^{-1}F \in \mathcal{F}$ and $\bar P(\tau^{-1}F) = \bar P(F)$ for all $F \in \mathcal{F}$. One says that $\varphi(\cdot, \cdot, \cdot)$ in (1) is a measurable RDS on the measurable space $(X, \mathcal{B})$ over the measure-preserving transformation $(\Omega, \mathcal{F}, \bar P, \tau)$. The map $\Phi(\omega, x) = (\tau\omega, \varphi(1, \omega)x)$ is called the skew product of the measure-preserving transformation $(\Omega, \mathcal{F}, \bar P, \tau)$ and the cocycle $\varphi(n, \omega)$ on $X$. It is a measurable self-map on $(\Omega \times X, \mathcal{F} \otimes \mathcal{B})$. In general, RDSs and skew products have one-to-one correspondence; in particular, the measurability of one implies the measurability of the other.
Thus, our task is to study the statistics of the projection of $\Phi^n(\omega, x)$ to $X$. It now becomes interesting to study the invariant measures of $\Phi$. However, the class of all invariant measures of $\Phi$ is unnatural, for we must incorporate the fact that $\tau$ preserves the measure $\bar P$. For this reason, it is said that a probability measure $\mathbf{P}$ on $\mathcal{F} \otimes \mathcal{B}$ is an invariant measure for the RDS $\varphi$ if it is invariant for $\Phi$ and the marginal of $\mathbf{P}$ on $\Omega$ coincides with $\bar P$. In other words, $\Phi_* \mathbf{P} = \mathbf{P}$ and $(\Pi_1)_* \mathbf{P} = \bar P$, where $\Pi_1(\omega, x) = \omega$. We will also need to consider the cocycle $\varphi^{(2)}(n, \omega)(x, y) = (\varphi(n, \omega)x, \varphi(n, \omega)y)$ on the product space $X \times X$. The corresponding skew product is $\Phi^{(2)}(\omega, x, y) = (\tau\omega, \varphi^{(2)}(1, \omega)(x, y))$.
Of particular interest will be the sequence of functions $Z_n : \Omega \times X \times X \to \mathbb{R}$ defined by $Z_n(\omega, x, y) = S_n(\omega, x) - S_n(\omega, y)$.

For then
Notice already that writing yields the identity Standing Assumption (SA5) assume there exists an invariant measure P (2) for the RDS ϕ (2) that is symmetric in the sense that for all bounded measurable h : × X × X → R. The common marginal is then trivially an invariant measure for the RDS ϕ. Moreover, assume and are satisfied. While Standing Assumption (SA5) may, from the point of view of the initial setup of our problem, seem mysterious at a first glance, it is quite natural. We will later provide an example of a more concrete condition which implies (SA5), and stick to the abstract setting for now.

Lemma A.1 The function F satisfies
The latter has the upper bound Proof That F is centered is due to the symmetry property (30) of P (2) in (SA5). Since ϕ(i, ω, y))}, the same symmetry property also yields (35). The upper bound in (36) then follows from (33) and (34) in (SA5) together with (SA1).
Recall that in Theorems 3.6, 3.9 and 4.1 we have The next lemma connects this expression to the RDS notions when also (SA5) is assumed.

Lemma A.2
The limit variance σ 2 in Theorems 3.6, 3.9 and 4.1 satisfies Proof The first line is just the expression of σ 2 rewritten using (33) and (34). The second line then follows by (35). The last line holds by (29) together with (36) and (SA1).

Remark A.3
Note that the expression of $\sigma^2$ in (38) is exactly one half of the Green-Kubo formula in terms of the skew-product $\Phi^{(2)}$, its invariant measure $\mathbf{P}^{(2)}$, and the observable $F$. This trick of "doubling the dimension" is not new. To our knowledge, however, (38) is a new observation at this level of generality. It answers a question raised in [2, Sect. 7] by Aimino, Nicol and Vaienti (who studied the special case where $P$, $\mathbf{P}$ and $\mathbf{P}^{(2)}$ are product measures, allowing for a non-random centering of $S_n$): the key that makes (38) an algebraic fact is the symmetry property (30) of the measure $\mathbf{P}^{(2)}$. It deserves a separate remark that even though $\sigma^2$ does not in general (see Remark C.2) admit a classical Green-Kubo formula in terms of $\Phi$, $\mathbf{P}$, and $f$, "doubling the dimension" still yields (38).

Appendix B. Positivity of $\sigma^2$
In this section we return to the question of positivity of the limit variance $\sigma^2$. We shall assume (SA1) and (SA3)-(SA5), the strong-mixing assumption (SA2) being unnecessary here. Again we assume nice parameters (e.g. $\psi > 2$) for simplicity of the statements.
(ii) σ 2 > 0 is equivalent to each of the following conditions: There exist c > 0 and N > 0 such that Z 2 n dP (2) ≥ cn for all n ≥ N . (iii) If ζ > 1, then σ 2 > 0 is equivalent to each of the following conditions: (iv) If P is stationary, then σ 2 = 0 is equivalent to each of the following conditions: (v) If P is stationary, then σ 2 > 0 is equivalent to each of the following conditions: From the point of view of applications, parts (iii)(b and d), (iv)(b) and (v)(b and d) may be the most relevant ones as they involve the measures P and μ, and the process (S n ) n≥1 , which are immediately apparent from the definition of the system. Note that (iii)(b) is the same condition as in Corollary 4.3(ii).
Proof of Lemma B.1 By (36) we can appeal to a well-known result due to Leonov [15], which guarantees that the limit $b = \lim_{n\to\infty} \int Z_n^2 \, d\mathbf{P}^{(2)}$ exists in $[0, \infty]$, and $b < \infty$ if and only if $\sup_{n\ge 0} \int Z_n^2 \, d\mathbf{P}^{(2)} < \infty$. Moreover, the last condition is equivalent to the existence of $G \in L^2(\mathbf{P}^{(2)})$ such that $F = G - G \circ \Phi^{(2)}$. On the other hand, standard computations and the formula for $\sigma^2$ in (38) yield Here $\psi > 2$ was used. Thus, $\sigma^2 > 0$ is equivalent to linear growth of $\int Z_n^2 \, d\mathbf{P}^{(2)}$ to infinity, while $\sigma^2 = 0$ is equivalent to $\sup_{n\ge 0} \int Z_n^2 \, d\mathbf{P}^{(2)} < \infty$. Parts (i) and (ii) are proved. As for part (iii), (28) and Theorem 3.9 with $\zeta > 1$ yield If $\sigma^2 > 0$, the right side grows asymptotically linearly in $n$, and (a)-(d) are all satisfied. Finally, parts (iv) and (v) follow from (i) and (ii), respectively, because in the stationary case it holds that $\int Z_n^2 \, d(P \otimes \mu \otimes \mu) = \int Z_n^2 \, d\mathbf{P}^{(2)} + O(1)$; see Lemma B.2 below.
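The easy direction of the coboundary characterization used in the preceding proof can be illustrated by a telescoping computation. The sketch below assumes that $Z_n$ is the Birkhoff sum $\sum_{i=0}^{n-1} F \circ (\Phi^{(2)})^i$ of $F$ along the doubled skew product and that $\mathbf{P}^{(2)}$ is invariant for it; the symbols $\Phi^{(2)}$ and $\mathbf{P}^{(2)}$ are our rendering of the skew product and its invariant measure.

```latex
\begin{align*}
F = G - G\circ\Phi^{(2)}
\;\Longrightarrow\;
Z_n &= \sum_{i=0}^{n-1} \Bigl( G\circ(\Phi^{(2)})^{i} - G\circ(\Phi^{(2)})^{i+1} \Bigr)
     = G - G\circ(\Phi^{(2)})^{n}, \\
\int Z_n^2 \, d\mathbf{P}^{(2)}
  &\le 2\int G^2 \, d\mathbf{P}^{(2)}
     + 2\int G^2\circ(\Phi^{(2)})^{n} \, d\mathbf{P}^{(2)}
   = 4\,\|G\|_{L^2(\mathbf{P}^{(2)})}^2 .
\end{align*}
% The inequality is (a - b)^2 <= 2a^2 + 2b^2; the final equality uses invariance.
```

Thus a square-integrable coboundary forces $\sup_n \int Z_n^2 \, d\mathbf{P}^{(2)} < \infty$; the converse is the substantive part of the cited result.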
We close the section with the lemma below, which was needed in the last part of the preceding proof.

Appendix C. Effect of the Fiberwise Centering of W n
In this section we discuss Remark 1.2 concerning the variance of W n , as opposed to the fiberwise-centeredW n = W n − μ(W n ). Note that and The difference of (43) and (44) equals . Under the assumptions of our paper (1). Therefore Var P⊗μ W n and Var P μ(W n ) either converge or diverge simultaneously. We now derive their asymptotic expressions in terms of series, restricting to the case where the law P of the selection process is stationary.
Remark C.2 Note that in the latter case This is the classical Green-Kubo formula in terms of the skew-product $\Phi$, its invariant measure $\mathbf{P}$, and the observable $f$. Let us stress that it is not the expression of $\sigma^2$, save for exactly the special case $\lim_{n\to\infty} \mathrm{Var}_P\, \mu(W_n) = 0$. The latter special case is the very same in which Abdelkader and Aimino [1] establish a quenched central limit theorem with nonrandom centering, assuming i.i.d. randomness ($P = P_0^{\mathbb{N}}$) in particular; see also Remark A.3.
Proof of Lemma C. 1 We prove the statements concerning Var P μ(W n ) first. We have We will apply Lemma 3.2 to show convergence as n → ∞. To that end, we need control of a ik in the limits i → ∞ and k → ∞. We begin with the first limit. By (5) below (SA3), we have a uniform bound whenever r ≤ i. Since P is stationary, this yields Thus, (Eμ( f i )) ∞ i=0 is Cauchy, so its limit exists and SinceP = P by stationarity, (14) gives Thus, the limit exists and as i → ∞. Since η is summable, as n → ∞. Both of the preceding bounds are uniform in k.
In order to bound $a_{ik}$ as $k \to \infty$, first note that (45) allows us to estimate the last estimate being true by strong mixing. Picking $r \asymp k/2$ yields uniformly in $i$. Since $\gamma > 1$ and $\psi > 1$, this bound is summable, so Lemma 3.2 can now be applied; recall (10). The bound in (11) becomes Now, choosing $K \asymp n^{1/\min\{\gamma,\psi\}}$ yields the upper bound $Cn^{1/\min\{\gamma,\psi\}-1}$ claimed.
The expressions of the limits $b_k$ in terms of the RDS notations are obtained with the help of (32)-(34), recalling again $P = \bar P$ due to stationarity.
Finally, the claims regarding $\mathrm{Var}_{P\otimes\mu} W_n = \mathrm{Var}_{P\otimes\mu} \bar W_n + \mathrm{Var}_P\, \mu(W_n)$ follow since we already have control of both terms on the right side: in the stationary case at hand, Theorem 3.9 applies with any $\zeta > 1$, yielding $\mathrm{Var}_{P\otimes\mu} \bar W_n = \sigma^2 + O(n^{1/\psi - 1})$.

Appendix D. (SA5'): A Less Abstract Substitute for (SA5)
Standing Assumption (SA5) is abstract in that it involves the invariant measure $\mathbf{P}^{(2)}$ of the RDS $\varphi^{(2)}$, and a number of properties of the measure, which are not obvious from the setup of the system at the beginning of the paper. For that reason we give in this section, as an example, another assumption which (i) is more concrete in that it involves only the initial measure $\mu$ and the basic cocycle $\varphi$, and (ii) is stronger than (SA5).
Standing Assumption (SA5') Throughout this section we assume the following: the measures $\varphi(n, \omega)_* \mu$ have uniformly square integrable densities with respect to $\mu$, i.e., there exists $K > 0$ such that $\| d\varphi(n, \omega)_* \mu / d\mu \|_{L^2(\mu)} \le K$ (46) for all $n$ and $\omega$. Moreover, for every bounded measurable $g : X \to \mathbb{R}$ and $\varepsilon > 0$ there exists $N \ge 0$ such that the memory-loss property holds for $n \ge N$, $m \ge 0$ and all $\omega$.
The rest of the section is devoted to investigating some consequences of (SA5'). Note that (47) asks that the integrals of $x \mapsto g(\varphi(n, \tau^m \omega)x)$ with respect to the two measures $\varphi(m, \omega)_* \mu$ and $\mu$ are essentially the same for large $n$, uniformly in $m$ and $\omega$. The role of (46) is to allow for uniform approximations of the compositions $h \circ (\Phi^{(2)})^n$, $n \ge 0$, by compositions $\hat h \circ (\Phi^{(2)})^n$, where $h$ is measurable and $\hat h$ is "simple": observe that $(h - \hat h) \circ (\Phi^{(2)})^n$ is not guaranteed to be uniformly (in $n$) small in $L^1(\bar P \otimes \mu \otimes \mu)$, even if $h - \hat h$ is small, without some assumption. To that end, let us already prove a little lemma:
Lemma D.1 Let $h : \Omega \times X \times X \to \mathbb{R}$ belong to $L^2(\bar P \otimes \mu \otimes \mu)$. Then $\| h \circ (\Phi^{(2)})^n \|_{L^1(\bar P\otimes\mu\otimes\mu)} \le K^2 \| h \|_{L^2(\bar P\otimes\mu\otimes\mu)}$ holds for all $n \ge 0$ with $K$ as in (46).
Proof Write $\lambda = \bar P \otimes \mu \otimes \mu$ for brevity. Observe that |h| • ( (2) by Hölder's inequality. Here since $\bar P$ is stationary. On the other hand, by (46). Combining the estimates and taking square roots yields the result.
Let $\mathcal{H}$ denote the set of all measurable functions $h : \Omega \times X \times X \to \mathbb{R}$ such that $\lim_{n\to\infty} \int h \circ (\Phi^{(2)})^n \, d(\bar P \otimes \mu \otimes \mu)$ exists. Let $\mathcal{A}$ denote the set of all measurable cubes in $\Omega \times X \times X$. Clearly $\mathcal{A}$ is nonempty and closed under finite intersections, and it contains the product space $\Omega \times X \times X$. Clearly $\mathcal{H}$ is closed under linear combinations. Furthermore, the argument above shows $1_A \in \mathcal{H}$ for all $A \in \mathcal{A}$. Suppose now that $h_k \in \mathcal{H}$ are nonnegative functions increasing to a bounded function $h$. Showing $h \in \mathcal{H}$ proves that $\mathcal{H}$ contains all bounded functions that are measurable with respect to the sigma-algebra $\sigma(\mathcal{A}) = \mathcal{F} \otimes \mathcal{B} \otimes \mathcal{B}$. We will show $h \in \mathcal{H}$ next.
Let $\varepsilon > 0$ be fixed. Since $0 \le h_k \uparrow h$ where $h$ is bounded, by the bounded convergence theorem there exists $k_0 = k_0(\varepsilon)$ such that $\| h - h_{k_0} \|_{L^2(\bar P\otimes\mu\otimes\mu)} < \varepsilon$. Thus, by Lemma D.1, for all $n \ge n_0$. Hence $h \in \mathcal{H}$. Therefore, by the monotone class theorem $\mathcal{H}$ contains all bounded measurable functions.
This yields the claims concerning P.
We are in position to prove the promised fact:
The proof is complete.

D.2 Disintegration of the Invariant Measure P (2)
In this subsection we shed some light on the invariant measure $\mathbf{P}^{(2)}$ of the RDS $\varphi^{(2)}$ with the aid of disintegrations. The mathematical constructions here are well known, and we include this part for completeness. The results call for nice structure of the measurable spaces: we assume that both $(X, \mathcal{B})$ and $(\Omega_0, \mathcal{E})$ are standard measurable spaces.
We are ready to state another basic fact:
Lemma D.5 (1) There exists a unique probability measure $\bar Q$ on $(\bar\Omega, \bar{\mathcal{F}})$ which is invariant for $\bar\tau$ and satisfies $(\Pi_+)_* \bar Q = \bar P$.
(2) There exists an essentially unique family of set functions $q_\omega : \mathcal{F}_- \to [0, 1]$, $\omega \in \Omega$, such that (i) the map $\omega \mapsto q_\omega(E)$ is measurable for all $E \in \mathcal{F}_-$; (ii) $q_\omega$ is a probability measure for $\bar P$-a.e. $\omega \in \Omega$; (iii) for all $h \in L^1(\bar Q)$,
Proof (1) Since $(\Omega_0, \mathcal{E})$ is a standard measurable space, the shift-invariant measure $\bar Q$ having $\bar P$ as its marginal is uniquely constructed with the aid of Kolmogorov's extension theorem by requiring that the finite dimensional distributions are translation invariant and coincide with those of $\bar P$. See, e.g., Arnold [3, Appendix A.3] for details.