Functional Correlation Bounds and Optimal Iterated Moment Bounds for Slowly-Mixing Nonuniformly Hyperbolic Maps

Consider a nonuniformly hyperbolic map T : M → M modelled by a Young tower with tails of the form O(n^{−β}), β > 2. We prove optimal moment bounds for Birkhoff sums ∑_{i=0}^{n−1} v∘T^i and iterated sums ∑_{0≤i<j<n} v∘T^i w∘T^j, where v, w : M → R are (dynamically) Hölder observables.
Previously iterated moment bounds were only known for β > 5. Our method of proof is as follows: (i) prove that T satisfies an abstract functional correlation bound, (ii) use a weak dependence argument to show that the functional correlation bound implies moment estimates. Such iterated moment bounds arise when using rough path theory to prove deterministic homogenisation results. Indeed, by a recent result of Chevyrev, Friz, Korepanov, Melbourne & Zhang we have convergence to an Itô diffusion for fast-slow systems of the form x^{(n)}_{k+1} = x^{(n)}_k + n^{−1} a(x^{(n)}_k, y_k) + n^{−1/2} b(x^{(n)}_k, y_k), y_{k+1} = T y_k, in the optimal range β > 2.


Introduction
Let T : M → M be an ergodic, measure-preserving transformation defined on a bounded metric space (M, d) with Borel probability measure μ. Consider a fast-slow system on R^d × M of the form
x^{(n)}_{k+1} = x^{(n)}_k + n^{−1} a(x^{(n)}_k, y_k) + n^{−1/2} b(x^{(n)}_k, y_k), y_{k+1} = T y_k,
where the initial condition x^{(n)}_0 ≡ ξ is fixed and y_0 is picked randomly from (M, μ). When the fast dynamics T : M → M is chaotic enough, it is expected that the stochastic process X_n defined by X_n(t) = x^{(n)}_{[nt]} will weakly converge to the solution of a stochastic differential equation driven by Brownian motion. This is referred to as deterministic homogenisation and has been of great interest recently [8–10,12,13,18,21,27]. See [7] for a survey of the topic.
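To make the fast-slow recursion concrete, the following minimal Python sketch iterates it for d = 1. The function names `fast_map` and `simulate_fast_slow`, the coefficients `drift` and `noise`, and the choice of the logistic map as fast dynamics are all illustrative assumptions; the logistic map is only a convenient chaotic stand-in for T and is not one of the maps studied in this paper.

```python
import math
import random

def fast_map(y):
    """Hypothetical fast dynamics: the logistic map, used as a stand-in for T."""
    return 4.0 * y * (1.0 - y)

def simulate_fast_slow(n, xi, y0, a, b, T=fast_map):
    """Iterate x_{k+1} = x_k + a(x_k, y_k)/n + b(x_k, y_k)/sqrt(n), y_{k+1} = T(y_k).

    Returns the list [x_0, ..., x_n], so that X_n(t) = path[floor(n * t)] for t in [0, 1].
    """
    x, y = xi, y0
    path = [x]
    for _ in range(n):
        # Both coefficients are evaluated at the current pair (x_k, y_k).
        x = x + a(x, y) / n + b(x, y) / math.sqrt(n)
        y = T(y)
        path.append(x)
    return path

# Illustrative coefficients (assumptions): linear drift, centred fast fluctuation.
drift = lambda x, y: -x
noise = lambda x, y: y - 0.5

random.seed(0)
path = simulate_fast_slow(n=10_000, xi=1.0, y0=random.random(), a=drift, b=noise)
print(path[-1])
```

Only the initial condition y_0 is random; the recursion itself is deterministic, which is what the term deterministic homogenisation refers to.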
In [18], Kelly and Melbourne considered the special case where a(x, y) ≡ a(x) and b(x, y) = h(x)v(y). By using rough path theory, they showed that deterministic homogenisation reduces to proving two statistical properties for T : M → M. In [8] this result was extended to general a, b satisfying mild regularity assumptions.
One of the assumed statistical properties is an "iterated weak invariance principle". In [18,29] it was shown that this property is satisfied by nonuniformly expanding/hyperbolic maps modelled by Young towers, provided that the tails of the return time decay at rate O(n^{−β}) for some β > 2 (which is the optimal range for such results).
The second assumed statistical property is control of "iterated moments", which gives tightness in the rough path topology used for proving convergence. This condition has proved much more problematic. Advances in rough path theory [7,8] significantly weakened the moment requirements from [18] and these weakened moment requirements were eventually proved for nonuniformly expanding maps in the optimal range (i.e. β > 2) in [21].
However, for nonuniformly hyperbolic maps modelled by Young towers, it was previously only possible to show iterated moment bounds for β > 5 [11]. In this article, we extend iterated moment bounds to the optimal range β > 2.

Illustrative examples.
Many examples of invertible dynamical systems are modelled by Young towers [33,34]. For example, Axiom A (uniformly hyperbolic) diffeomorphisms, Hénon attractors and the finite-horizon Sinai billiard are modelled by Young towers with exponential tails, so for such systems deterministic homogenisation results follow from [18,19]. We now give some examples of slowly-mixing nonuniformly hyperbolic dynamical systems for which it was not previously possible to show deterministic homogenisation, due to a lack of control of iterated moments. We start with an example which is easy to write down: x > 1/2 is a prototypical example of a slowly-mixing nonuniformly expanding map [25]. As in [29,Exa 1 2 ], x 2 ∈ [0, 1], (T x 1 , (x 2 + 1)/2), There is a unique absolutely continuous invariant probability measure μ. The map T is nonuniformly hyperbolic and has a neutral fixed point at (0, 0) whose influence increases with α. In particular, T is modelled by a two-sided Young tower with tails of the form ∼ n^{−β} where β = 1/α. For β > 2 the central limit theorem (CLT) holds for all Hölder observables. For β ≤ 2 the CLT fails for typical Hölder observables [14], so it is natural to restrict to β > 2 when considering deterministic homogenisation. By [11] it is possible to show iterated moment bounds for β > 5. Our results yield iterated moment bounds and hence deterministic homogenisation in the full range β > 2.
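For readers who want to experiment with intermittency numerically, here is a minimal Python sketch of a one-dimensional intermittent map with a neutral fixed point at 0. The explicit formula used, x ↦ x(1 + 2^α x^α) on [0, 1/2] and x ↦ 2x − 1 on (1/2, 1], is the standard Liverani-Saussol-Vaienti form and is an assumption here: it is chosen to be consistent with the branch "x > 1/2" and the relation β = 1/α mentioned above, but it is not confirmed by the surviving text. The function names are hypothetical.

```python
def intermittent_map(x, alpha):
    """Assumed intermittent map on [0, 1] with a neutral fixed point at 0."""
    if x <= 0.5:
        # x * (1 + 2^alpha * x^alpha); the derivative at 0 equals 1 (neutral).
        return x * (1.0 + (2.0 * x) ** alpha)
    return 2.0 * x - 1.0

def orbit(x0, alpha, n):
    """Return [x0, T x0, ..., T^(n-1) x0]."""
    xs = []
    x = x0
    for _ in range(n):
        xs.append(x)
        x = intermittent_map(x, alpha)
    return xs

def time_near_fixed_point(x0, alpha, n, eps=0.05):
    """Fraction of the first n iterates lying within eps of the neutral fixed point."""
    return sum(1 for x in orbit(x0, alpha, n) if x < eps) / n

# beta = 1/alpha, so for instance alpha = 0.25 corresponds to beta = 4 > 2.
for alpha in (0.1, 0.25, 0.45):
    print(alpha, time_near_fixed_point(0.3141592, alpha, 100_000))
```

Floating-point orbits are only a rough guide to the true dynamics, so the printed fractions should be read qualitatively.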
Dispersing billiards provide many examples of slowly-mixing nonuniformly hyperbolic maps. Markarian [26], Chernov and Zhang [5] showed how to model many examples of dispersing billiards by Young towers with polynomial tails.
We give two classes of dispersing billiards for which it is now possible to show deterministic homogenisation:
• Bunimovich flowers [2]. By [5] the billiard map is modelled by a Young tower with tails of the form O(n^{−3}(log n)^3).
• Dispersing billiards with vanishing curvature. In [6] Chernov and Zhang introduced a class of billiards modelled by Young towers with tails of the form O((log n)^β n^{−β}) for any prescribed value of β ∈ (2, ∞).
The rest of this article is structured as follows. In Sect. 2 we state our main results. Our first main result, Theorem 2.3, is that mixing nonuniformly hyperbolic maps modelled by Young towers with polynomial tails satisfy a functional correlation bound. Our second main result, Theorem 2.4, is that this functional correlation bound implies control of iterated moments.
In Sect. 3 we recall background material on Young towers and prove Theorem 2.3. In Sect. 4 we prove that our functional correlation bound implies an elementary weak dependence condition. Finally in Sect. 5 we use this condition to prove Theorem 2.4.

Main Results
Let T : M → M be a nonuniformly hyperbolic map modelled by a Young tower. We state our results for the class of dynamically Hölder observables, noting that this includes Hölder observables. We delay the definitions of Young tower and dynamically Hölder until Sect. 3.1. Let H(M) denote the class of dynamically Hölder observables on M and let [·]_H denote the dynamically Hölder seminorm.
Definition 2.1. Fix an integer q ≥ 1. Given a function G : M^q → R and 0 ≤ i < q we denote We call G separately dynamically Hölder, and write
Fix γ > 0. We consider dynamical systems which satisfy the following property: Suppose that there exists a constant C > 0 such that for all integers 0 ≤ p < q, 0 ≤ n_0 ≤ ⋯ ≤ n_{q−1}, for all G ∈ SH_q(M). Then we say that T satisfies the Functional Correlation Bound with rate n^{−γ}.
A similar condition was introduced by Leppänen in [22] and further studied by Leppänen and Stenlund in [23,24]. In particular, [22] showed that functional correlation decay implies a multi-dimensional CLT with bounds on the rate of decay. We are now ready to state the main results which we prove in this paper.
The rate of decay of correlations of a dynamical system modelled by a Young tower is determined by the tails of the return time to the base of the tower. Indeed, let T be a mixing transformation modelled by a two-sided Young tower with tails of the form O(n^{−β}) for some β > 1. In [28], by using a method due to S. Gouëzel (private communication, based on ideas from [3]), it was shown that there exists C > 0 such that
Our first main result is that the Functional Correlation Bound holds with the same rate:
Theorem 2.3. Let β > 1. Let T be a mixing transformation modelled by a two-sided Young tower whose return time has tails of the form O(n^{−β}). Then T satisfies the Functional Correlation Bound with rate n^{−(β−1)}.
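One heuristic for the exponent β − 1 in Theorem 2.3 (offered for orientation only, not as part of the proof) is that summing tails of order n^{−β} loses exactly one power of n. Tail sums of this kind appear in the arguments of Sect. 3. The comparison below is the standard integral test and holds for every β > 1:

```latex
% Each term b^{-\beta} is bounded by the integral of x^{-\beta} over [b-1, b].
\[
  \sum_{b=n+1}^{\infty} b^{-\beta}
  \;\le\; \int_{n}^{\infty} x^{-\beta}\,dx
  \;=\; \frac{n^{-(\beta-1)}}{\beta-1},
  \qquad n \ge 1,\ \beta > 1.
\]
```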
Our second main result is that the Functional Correlation Bound implies moment estimates for S_v(n) and S_{v,w}(n). Let ‖·‖_H = |·|_∞ + [·]_H denote the dynamically Hölder norm.
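The two quantities being estimated are the Birkhoff sum S_v(n) = ∑_{0≤i<n} v∘T^i and the iterated sum S_{v,w}(n) = ∑_{0≤i<j<n} v∘T^i w∘T^j. As a minimal sketch, assuming the values of v and w along an orbit have already been tabulated in lists, both can be evaluated in a single pass. The function names are hypothetical.

```python
def birkhoff_sum(v_vals):
    """S_v(n) = sum_{0 <= i < n} v(T^i x), given v_vals[i] = v(T^i x)."""
    return sum(v_vals)

def iterated_sum(v_vals, w_vals):
    """S_{v,w}(n) = sum_{0 <= i < j < n} v(T^i x) w(T^j x), computed in O(n)."""
    total = 0.0
    running_v = 0.0  # equals sum_{i < j} v(T^i x) when index j is processed
    for vj, wj in zip(v_vals, w_vals):
        total += running_v * wj
        running_v += vj
    return total

def iterated_sum_naive(v_vals, w_vals):
    """Direct O(n^2) evaluation of the same double sum, for cross-checking."""
    n = len(v_vals)
    return sum(v_vals[i] * w_vals[j] for i in range(n) for j in range(i + 1, n))
```

The single-pass version relies on the identity S_{v,w}(n) = ∑_{j<n} S_v(j) · w∘T^j, which is also the discrete analogue of the iterated integral used in rough path theory.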
Theorem 2.4. Let γ > 1. Suppose that T satisfies the Functional Correlation Bound with rate n^{−γ}. Then there exists a constant C > 0 such that for all n ≥ 1, for any mean zero v, w ∈ H(M),
Remark 2.5. As mentioned above, by [8,Theorem 2.10] to obtain deterministic homogenisation results it suffices to prove the iterated WIP and iterated moment bounds. Let T be a mixing transformation modelled by a two-sided Young tower with tails of the form O(n^{−β}) for some β > 2. By [29], the iterated WIP holds for all Hölder observables. Together Theorem 2.3 and Theorem 2.4 imply that for all η ∈ (0, 1] there exists C > 0 such that for all mean zero v, w ∈ C^η(M), giving the required control of iterated moments.
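As a worked instance of how Theorems 2.3 and 2.4 combine, consider the tail bound quoted for Bunimovich flowers in the introduction. The auxiliary exponent β′ below is introduced only for this illustration.

```latex
% Logarithmic factors can be absorbed by lowering the exponent slightly:
% for any 0 < \beta' < \beta and any c \ge 0,
\[
  n^{-\beta} (\log n)^{c} = O\bigl(n^{-\beta'}\bigr) \qquad (n \to \infty).
\]
% Bunimovich flowers: tails O(n^{-3} (\log n)^{3}), so any \beta' \in (2, 3) is admissible.
% Theorem 2.3 then gives the Functional Correlation Bound with rate n^{-\gamma}, where
\[
  \gamma = \beta' - 1 \in (1, 2),
\]
% and since \gamma > 1 the hypothesis of Theorem 2.4 is satisfied.
```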

Prerequisites.
Young towers were first introduced by Young in [33,34], as a broad framework to prove decay of correlations for nonuniformly hyperbolic maps. Our presentation follows [1]. In particular, this framework does not assume uniform contraction along stable manifolds and hence covers examples such as billiards.
- For all distinct y, y′ ∈ Ȳ the separation time s(y, y′) = inf{n ≥ 0 : F̄^n y, F̄^n y′ lie in distinct elements of α} < ∞.
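The separation time can be computed directly from its definition. The sketch below is a toy illustration under stated assumptions: the doubling map on [0, 1) with the partition {[0, 1/2), [1/2, 1)} stands in for the full-branch map and the partition α, and exact rational arithmetic is used so that the iterates are not corrupted by rounding. The metric θ^{s(y, y′)} is the usual symbolic metric; all function names are hypothetical.

```python
from fractions import Fraction

def doubling_map(y):
    """Stand-in full-branch map on [0, 1): y -> 2y mod 1 (exact with Fractions)."""
    return (2 * y) % 1

def partition_element(y):
    """Index of the element of the partition {[0, 1/2), [1/2, 1)} containing y."""
    return 0 if y < Fraction(1, 2) else 1

def separation_time(y, y_prime, F=doubling_map, element=partition_element, max_iter=1000):
    """Smallest n >= 0 such that F^n y and F^n y' lie in distinct partition elements.

    Returns None if no separation is found within max_iter iterates.
    """
    for n in range(max_iter):
        if element(y) != element(y_prime):
            return n
        y, y_prime = F(y), F(y_prime)
    return None

def d_theta(y, y_prime, theta=0.5):
    """Symbolic metric theta^{s(y, y')}, with distance 0 if the points never separate."""
    s = separation_time(y, y_prime)
    return 0.0 if s is None else theta ** s
```

For the doubling map the separation time is simply the index of the first binary digit at which the two points differ.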
Two-sided Gibbs-Markov maps. Let (Y, d) be a bounded metric space with Borel probability measure μ_Y and let F : Y → Y be ergodic and measure-preserving. Let F̄ : Ȳ → Ȳ be a full-branch Gibbs-Markov map with associated measure μ̄_Y.
Two-sided Young towers. Let F : Y → Y be a two-sided Gibbs-Markov map and let φ : Y → Z^+ be an integrable function that is constant on π̄^{−1}a for each a ∈ α. In particular, φ projects to a function φ̄ : Ȳ → Z^+ that is constant on partition elements of α.
We are now finally ready to say what it means for a map to be modelled by a Young tower: Let T : M → M be a measure-preserving transformation on a probability space Then we say that T : M → M is modelled by a (two-sided) Young tower.
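For orientation, the dynamics on a tower can be sketched in a few lines of Python. The sketch assumes the standard construction: the tower consists of pairs (y, ℓ) with 0 ≤ ℓ < φ(y), the tower map moves a point up one level until it reaches the top and then returns it to the base via F. The functions `F` and `phi` passed in are placeholders, and the names `tower_map` and `count_returns` are hypothetical. The second function counts returns to the base up to time R, a quantity used later in Sect. 3.

```python
def tower_map(point, F, phi):
    """One step of the tower map on {(y, level) : 0 <= level < phi(y)}.

    Moves up one level, or returns to the base via F from the top level.
    """
    y, level = point
    if level < phi(y) - 1:
        return (y, level + 1)
    return (F(y), 0)

def count_returns(point, R, F, phi):
    """Number of times j in {1, ..., R} at which the orbit of point is in the base."""
    returns = 0
    for _ in range(R):
        point = tower_map(point, F, phi)
        if point[1] == 0:
            returns += 1
    return returns

# Toy example (assumption): base Y = {0, 1, 2}, cyclic base map, heights 1, 2, 3.
toy_F = lambda y: (y + 1) % 3
toy_phi = lambda y: y + 1
print(count_returns((0, 0), 6, toy_F, toy_phi))
```

Long excursions up the tower correspond to large values of φ, which is why the tails of φ control the statistical properties of the map.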
From now on we fix β > 1 and suppose that T : M → M is a mixing transformation modelled by a Young tower Δ with tails of the form The following bound is standard, see for example [20,Lemma 5.5].
The transfer operator L corresponding to f̄ : Δ̄ → Δ̄ and μ̄_Δ is given pointwise by It follows that for n ≥ 1, the operator L^n is of the form
We say that z, z′ ∈ Δ̄ are in the same cylinder set of length n if f̄^k z and f̄^k z′ lie in the same partition element of Δ̄ for 0 ≤ k ≤ n − 1. We use the following distortion bound (see e.g. [20, Proposition 5.2]):
Proposition 3.3. There exists a constant K_1 > 0 such that for all n ≥ 1, for all points z, z′ ∈ Δ̄ which belong to the same cylinder set of length n, is mixing then by [34], The same bound holds pointwise on Δ̄_0: This is a straightforward application of operator renewal theory developed by Sarig [31] and Gouëzel [15,16]. However, we could not find a reference to this result in the literature so we provide a proof.
Proof. Define partial transfer operators T n and B n as in [17,Section 4]. Then It follows that The conclusion of the lemma follows by noting that the expressions ∞ b=n+1 b −β and Finally we recall the class of observables on M that are of interest to us: We say that v is dynamically Hölder if v H < ∞ and denote by H (M) the space of all such observables. It is standard (see e.g. [1,Proposition 7.3]) that Hölder observables are also dynamically Hölder for the classes of dynamical systems that we are interested in:

Reduction to the case of a mixing Young tower.
In proofs involving Young towers it is often useful to assume that the Young tower is mixing, i.e. gcd{φ(y) : y ∈ Y } = 1. Hence in subsequent subsections we focus on proving the Functional Correlation Bound under this assumption: Lemma 3.6. Suppose that T is modelled by a mixing two-sided Young tower whose return time has tails of the form O(n −β ). Then T satisfies the Functional Correlation Bound with rate n −(β−1) .
Construct a mixing two-sided Young tower Δ = Y φ , with tower measure μ Δ . Define π M : Δ → M by π M (y, ) = (T ) y. Then T is modelled by Δ with ergodic, Tinvariant measure (π M ) * μ Δ . Now by assumption the measure μ is mixing so by the same argument as in [1,Section 4 Let y, y ∈ Y and 0 ≤ φ (y) < . Then

Approximation by one-sided functions.
Let 0 ≤ p < q and 0 ≤ n 0 ≤ · · · ≤ n q−1 be integers and consider a function G ∈ S H q (M). We wish to bound where H : Δ 2 → R is given by We approximate H ( f R ·, f R ·) by a function H R that projects down ontō Δ. Our approach is based on ideas from Appendix B of [28].
Recall that ψ_R(x) = #{j = 1, …, R : f^j x ∈ Δ_0} denotes the number of returns to Δ_0 = {(y, ℓ) ∈ Δ : ℓ = 0} by time R. Let Q_R denote the at most countable, measurable partition of Δ with elements of the form {x′ ∈ Δ : s(x, x′) > 2ψ_R(x)}, x ∈ Δ. Choose a reference point in each partition element of Q_R. For x ∈ Δ let x̂ denote the reference point of the element that x belongs to. Define H_R : Δ^2 → R by (iii) For all ȳ ∈ Δ̄, Here we recall that ‖·‖_θ denotes the d_θ-Lipschitz norm, which is given by
Proof. We follow the proof of Proposition 7.9 in [1].
By definition H R is piecewise constant on a measurable partition of Δ 2 . Moreover, this partition projects down to a measurable partition onΔ, since it is defined in terms of s and ψ R which both project down toΔ. It follows thatH R is well-defined and measurable. Part (i) is immediate. Let Let a i = f n i x and b i = f R f k i y. By successively substituting a i byâ i , i and (3.1), Thus By a similar argument, completing the proof of (ii).
Letx,x ,ȳ ∈Δ. Recall that , Here, as usual we have paired preimagesz,z that lie in the same cylinder set of length n p−1 + R. By bounded distortion (Proposition 3.3), It remains to prove the claim. Choose points z, z , y ∈ Δ that project toz,z ,ȳ. Let a i = f n i z, a i = f n i z , b i = f R+n i y. As in part (ii), for otherwiseâ i andâ i are reference points of the same partition element soâ i =â i and E i = 0. Now as in part (ii), Sincez,z lie in the same cylinder set of length R + n p−1 , we have ψ R (a i ) = ψ R (a i ) and Now a i andâ i are contained in the same partition element so s(â i , , completing the proof of the claim.
for all n ≥ 1.

Remark 3.9. Let V (x, y) = v(x)w(y)
where v is d θ -Lipschitz and w ∈ L ∞ (Δ). Then we obtain that so Lemma 3.8 can be seen as a generalisation of the usual upper bound on decay of correlations for observables on the one-sided towerΔ.
Proof of Lemma 3.6. Recall that we wish to bound

An Abstract Weak Dependence Condition
The Functional Correlation Bound can be seen as a weak dependence condition. Let k ≥ 1 and consider k disjoint blocks of integers Consider random variables X_i on (M, μ) of the form When the gaps ℓ_{i+1} − u_i between blocks are large, the random variables X_0, …, X_{k−1} are weakly dependent. Let X̂_0, …, X̂_{k−1} be independent random variables with X̂_i =_d X_i.

Lemma 4.1. Suppose that T satisfies the Functional Correlation Bound with rate n
Proof. We proceed by induction on k. For k = 1 the inequality is trivial. Assume that this lemma holds for k ≥ 1.
Consider an enriched probability space which contains independent copies of Since X k = d X k and X k is independent of X 0 , . . . , X k−1 and X 0 , . . . , X k−1 , Let y ∈ M. The function F y = F(·, . . . , ·, X k (y)) : M k → R satisfies Lip(F y ) ≤ Lip(F). Hence by the inductive hypothesis, and . By a straightforward calculation, G ∈ S H s (M) and Hence by the Functional Correlation Bound, This completes the proof.

Moment Bounds
In this section we prove Theorem 2.4. Throughout this section we fix γ > 1 and assume that T : M → M satisfies the Functional Correlation Bound with rate n −γ .
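The arguments of this section rest on two classical moment inequalities for sums of independent, mean zero random variables. For orientation, they are usually stated as follows, with a constant C depending only on p. We assume, without confirmation from the text, that this is the form in which they are applied here.

```latex
% von Bahr--Esseen inequality: for 1 \le p \le 2,
\[
  \mathbb{E}\Bigl|\sum_{i=0}^{k-1} X_i\Bigr|^{p}
  \;\le\; C \sum_{i=0}^{k-1} \mathbb{E}|X_i|^{p}.
\]
% Rosenthal inequality: for p > 2,
\[
  \mathbb{E}\Bigl|\sum_{i=0}^{k-1} X_i\Bigr|^{p}
  \;\le\; C \Biggl[ \Bigl(\sum_{i=0}^{k-1} \mathbb{E}|X_i|^{2}\Bigr)^{p/2}
  + \sum_{i=0}^{k-1} \mathbb{E}|X_i|^{p} \Biggr].
\]
```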
In both parts of Theorem 2.4 we use the following moment bounds for independent, mean zero random variables, which are due to von Bahr and Esseen [32] and Rosenthal [30], respectively:
Lemma 5.1. Fix p ≥ 1. There exists a constant C > 0 such that for all k ≥ 1, for all independent, mean zero random variables X_0, …, X_{k−1} ∈ L^p: w (0, n). Some straightforward algebra yields the following proposition.

Lemma 5.4. There exists a constant C > 0 such that
for all n ≥ 2k, k ≥ 1, for any v ∈ H (M).
We are now ready to prove the moment bound for S v (n) (Theorem 2.4(a)).

Proof of Theorem 2.4(a).
We prove by induction that there exists D > 0 such that for all m ≥ 1, for any mean zero v ∈ H (M).
Claim There exists C > 0 such that for all mean zero v ∈ H (M), for any D > 0, for any k ≥ 1 and any n ≥ 2k such that (5.6) holds for all m < n, we have It remains to prove the claim. Note that in the following the constant C > 0 may vary from line to line.
Fix 0 ≤ i < k. By stationarity, a 2i ). Thus by the inductive hypothesis (5.6) Hence by (5.7), overall Exactly the same argument applies to |I 2 | 2γ 2γ . The conclusion of the claim follows by noting that We now prove Theorem 2.4(b). Our proof follows the same lines as that of part (a). Let n, k ≥ 1. Recall that a i = in 2k . For 0 ≤ i < k define mean zero random variables X i on (M, μ) by w (a 2i , a 2i+1 ) .
Let X̂_0, …, X̂_{k−1} be independent random variables with X̂_i =_d X_i.
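The block structure behind these random variables can be made concrete as follows. This is a minimal sketch under the assumption that the endpoints are a_i = ⌊in/(2k)⌋ (the integer part is our reading of the formula for a_i) and that the time interval {0, …, n − 1} is split into 2k consecutive blocks, alternately labelled even and odd, so that blocks of the same parity are separated by gaps. The function names are hypothetical.

```python
def block_endpoints(n, k):
    """Endpoints a_0, ..., a_{2k} with a_i = floor(i * n / (2k)) (integer part assumed)."""
    return [(i * n) // (2 * k) for i in range(2 * k + 1)]

def split_birkhoff_sum(v_vals, k):
    """Split sum(v_vals) into contributions from even-indexed and odd-indexed blocks.

    Block i covers the time indices a_i <= j < a_{i+1}.
    """
    n = len(v_vals)
    a = block_endpoints(n, k)
    block_sums = [sum(v_vals[a[i]:a[i + 1]]) for i in range(2 * k)]
    even_part = sum(block_sums[0::2])  # blocks [a_{2i}, a_{2i+1})
    odd_part = sum(block_sums[1::2])   # blocks [a_{2i+1}, a_{2i+2})
    return even_part, odd_part
```

Within each parity class the blocks are separated by gaps of length roughly n/(2k), which is the situation in which the weak dependence lemma of Sect. 4 lets one compare with independent copies.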
The following lemma plays the same role that Lemma 5.4 played in the proof of Theorem 2.4(a). Lemma 5.5. There exists a constant C > 0 such that for any v, w ∈ H (M), Proof. Note that w (a 2i , a 2i+1 ) . by F(y 0 , . . . , y k−1 ) = |y 0 + · · · + y k−1 | γ . Hence by Lemma 4.1, It remains to bound A. The first step is to bound the expressions Now, Next note that Combining these bounds with (5.5), (5.8) and (5.9) yields that as required.
We are now ready to prove Theorem 2.4(b).
Proof of Theorem 2.4(b). We prove by induction that there exists D > 0 such that for all m ≥ 1, for any v, w ∈ H (M) mean zero.
Claim. There exists C > 0 such that for all v, w ∈ H(M) mean zero, for any D > 0, any k ≥ 1 and any n ≥ 2k such that (5.10) holds for all m < n, we have
Fix D > 0 such that C k^γ ≤ ½ D^γ and (5.10) holds for all m < 2k and any mean zero v, w ∈ H(M). Then the claim shows that if n ≥ 2k and (5.10) holds for all m < n, then ‖S_{v,w}(n)‖_γ^γ ≤ D^γ (n ‖v‖_H ‖w‖_H)^γ. Hence by induction, (5.10) holds for all m ≥ 1.
It remains to prove the claim. Note that in the following the constant C > 0 may vary from line to line.
Acknowledgements. The author would like to thank his supervisor Ian Melbourne for suggesting the problem considered in this paper, providing constant feedback and participating in many helpful discussions. He is also grateful to the anonymous referee for their comments, which improved the presentation of this paper.
Funding The author is funded by a departmental award.