Quantitative global-local mixing for accessible skew products

We study global-local mixing for a family of accessible skew products with an exponentially mixing base and non-compact fibers, preserving an infinite measure. For a dense set of almost periodic global observables, we prove rapid mixing; and for a dense set of global observables vanishing at infinity, we prove polynomial mixing. More generally, we relate the speed of mixing to the "low frequency behaviour" of the spectral measure associated to our global observables. Our strategy relies on a careful choice of the spaces of observables and on the study of a family of twisted transfer operators.


INTRODUCTION
Dynamical systems within the category of skew products have a long history. They receive attention for many different reasons: in earlier ergodic theory, they were studied as mild generalizations of suspensions which cannot be factored [18,Chapter 10], providing examples of simple partially hyperbolic systems; in recent days, they are often used to model real life situations where fast-slow dynamics can be observed, such as in the study of climate [26,5]. As their name suggests, they are products of a base dynamics and a fiber dynamics. Here, we are concerned with the statistical properties of these systems.
One of the simplest examples of a partially hyperbolic skew product is given by circle extensions of Anosov diffeomorphisms on the 2-dimensional torus $\mathbb{T}^2$, defined as follows. Given an Anosov diffeomorphism $A : \mathbb{T}^2 \to \mathbb{T}^2$ and a smooth map $f : \mathbb{T}^2 \to \mathbb{T}$, a skew product $F : \mathbb{T}^3 \to \mathbb{T}^3$ over $A$ induced by $f$ is defined by $F(x, r) = (Ax, r + f(x))$. The torus $\mathbb{T}^3$ is equipped with a product measure $\mu_u \times \mathrm{Leb}$, where $\mu_u$ is any Gibbs measure with Hölder potential $u$ and Leb is the Lebesgue measure on the fibers¹. In this case, Dolgopyat [19] proved that generic functions $f$ induce skew products $F$ with rapid decay of correlations, or rapid mixing, i.e. decorrelation of $C^\infty$-observables is faster than any given polynomial. The speed of mixing may, in fact, be exponential, but this is still an open problem ([19, Problem 2]). Dolgopyat's result holds in general for compact group extensions. The interested reader can also check the introductions of [42,12,25] for an overview of old and new results on skew products.
In this paper, we are interested in skew products with non-compact fibers. In particular we will consider R-extensions of topologically mixing Anosov diffeomorphisms via their symbolic counterparts. For an introduction to infinite ergodic theory, we refer the reader to Aaronson's book [1]. In our setting, Guivarc'h showed that any Hölder function f with zero integral which is not cohomologous to a constant induces an ergodic skew product [29] (see also [17]).
Concerning stronger statistical properties, a historical perspective on the various possible definitions of mixing in the infinite ergodic setting can be found in [35]. We will be interested in global-local mixing, a notion introduced by Lenci in [35]; namely, we will study the correlations between global and local observables (see Definition 2.1). Local observables are akin to compactly supported observables, while global observables are supported over most of the phase space. One possible concern about the notion of global-local mixing may be the seemingly arbitrary choice of the averaging involved in the definition of global observables. In our setting, the infinite volume average is analogous to the statistical infinite volume limits introduced and refined over the years by Van Hove, Fisher [38, Section 3.3] and Ruelle [47, Section 3.9], which built on the inspirational work of Bogoliubov [7]. Global-local mixing has been studied in different situations, for example random walks [35,36], mechanical systems [21,20] and one dimensional parabolic systems [13].
The main ingredient needed to study our class of skew product is accessibility (see, among others, [15,14,48]). The skew-product F is accessible if, roughly speaking, it is possible to reach any point in the space by moving along segments of stable and unstable manifolds.
From the measure-theoretic point of view, using Markov partitions, we can translate the problem into the study of a skew product over a subshift of finite type, keeping the same R-fibers. The question of whether accessibility is preserved when passing to symbolic dynamics appears to be delicate, and will be discussed in Appendix A. Our main result, Theorem 3.9, provides quantitative estimates for the decay of correlations of global and local observables for an accessible R-extension of a symbolic shift. To the best of our knowledge, this is the first quantitative result in the context of global-local mixing.

¹ Recall that if one chooses the potential as the det(DA|_{E^s}), one recovers the usual SRB measure.
Contrary to the case of compact group extensions, we cannot expect exponential mixing in general, since, taking Fourier transforms, we have to deal with arbitrary low frequencies. Indeed, we will show in Theorem 3.9 that the speed of convergence of correlations depends on the behavior near zero of the spectral measure associated to the global observable (namely, its inverse Fourier-Stieltjes transform): if the support of this measure intersects a neighbourhood of 0 only at 0, then mixing is rapid (Theorem 2.2) as we expect from Dolgopyat's result; in other cases we obtain polynomial estimates (Theorem 2.3, Theorem 2.4), which correspond to the expected behaviour (see Remark 2.5). Note that since the infinite volume average equals the value of the associated spectral measure at zero, our choice of infinite volume average is natural.
The main novelty of this work lies in a careful choice of the function spaces involved. The accessibility hypothesis, coupled with a standard central limit theorem for the underlying symbolic dynamics on the base, yields transfer operator bounds on a suitable splitting into low and high frequency modes, and lets us exploit the cancellation effects due to accessibility.
1.1. Krickeberg mixing. Another independent notion of mixing for infinite measure preserving transformations is known as Krickeberg (or local) mixing. An infinite measure preserving system (X, µ, T) is said to be Krickeberg mixing if there exists a sequence of positive numbers ρ_n → ∞ such that, for any pair of "nice" finite measure sets A, B (precisely, bounded sets whose boundary has zero measure), the rescaled correlations converge, that is
\[ \lim_{n \to \infty} \rho_n \, \mu(T^{-n} A \cap B) = \mu(A)\,\mu(B). \tag{1} \]
The first example of a system satisfying this property dates back to Hopf in 1937 [32], but it was overlooked for several years, until the 60s when Krickeberg proposed (1) as a definition of infinite measure mixing [34]. Since then, it has received considerable attention and Krickeberg mixing has been proved in several situations, see, e.g., [22,28,40,41,39,43] to name a few.
In the language of this paper, Krickeberg mixing can be seen as a remarkably strong form of "local-local mixing": the correlations of local observables converge to zero and the first order term is the same up to a constant for any two sufficiently smooth compactly supported functions. On the other hand, global-local mixing is a softer property, which however gives information on the correlations for a much wider class of observables, in particular for those which are not supported on a compact subset but "see" the whole phase space.
Exploiting some variations on the methods developed in this paper, we are able to establish a strong form of Krickeberg mixing for the accessible skew-products we consider here, namely we can prove a full asymptotic expansion of the correlations of Schwartz observables, in the spirit of the main theorem of [23]. This result will appear in a separate paper.

1.2. Outline of the paper. The rest of the paper is organized as follows. In Section 2 we rigorously introduce our framework and state our main results. In Section 3, we describe in detail the classes of global and local observables we consider and we state our core result, Theorem 3.9. We then deduce Theorems 2.2, 2.3 and 2.4 from Theorem 3.9.
In Section 4, we present a preliminary result in the non-invertible case of skew products over one-sided subshifts, Theorem 4.5. We also describe a "collapsed accessibility" property, which constitutes the main working assumption on the skew product in this setting.
The main tool to prove Theorem 4.5 is a family of twisted transfer operators. In Section 5, we show how the collapsed accessibility property can be exploited to obtain some cancellations in the expression for the twisted transfer operators, as in the work of Dolgopyat [19]. In Section 6, we prove some estimates on the norm of the twisted transfer operators. For large twisting parameters, the estimates are obtained by exploiting the results in Section 5; for small parameters, we apply some standard results in the theory of analytic perturbations of bounded linear operators. Section 7 contains some technical results that will be applied to prove the main theorems. Section 8 is devoted to the proof of Theorem 4.5. In order to deduce Theorem 3.9 from Theorem 4.5, in Section 9 we deduce the collapsed accessibility property for a one-sided skew-product from the accessibility property of the corresponding two-sided skew-product. In Section 10, we prove Theorem 3.9.
In Appendix A, we discuss the problem whether the accessibility property for a skew product over an Anosov diffeomorphism is equivalent to the accessibility of the associated symbolic system. Appendices B and C contain the proofs of several technical results.
Let us denote by F_θ the space of Lipschitz continuous functions w : Σ → C equipped with the norm We consider the skew-product where f : Σ → R is a Lipschitz continuous function with zero average, $\int_\Sigma f \, d\mu_u = 0$. As mentioned in the introduction, we will assume that F is accessible: roughly speaking, this means that any two points can be connected by a path consisting of pieces of stable and unstable manifolds; see Section 3 for the precise definition.
We are interested in the mixing properties of the map F with respect to the infinite measure ν = µ × Leb, where Leb is the Lebesgue measure on R.
Definition 2.1. A local observable is any function ψ ∈ L^1(ν). A global observable is any function Φ ∈ L^∞(ν) such that the following limit exists (4) ν_av(Φ) := lim If ψ is a local observable, we will write $\nu(\psi) = \int_{\Sigma \times \mathbb{R}} \psi \, d\nu$. We will show in Lemma 3.2 below that if Φ is a global observable, then so is Φ ∘ F, and the average ν_av defined in (4) is invariant under F.
For any pair of global and local observables (Φ, ψ), let us denote by cov(Φ, ψ) the covariance cov(Φ, ψ) We are interested in showing "global-local mixing", namely in proving that the correlations cov(Φ • F n , ψ) satisfy and the rate of convergence to such limit (also known as the rate of decay of correlations).
The main result of this paper, Theorem 3.9 below, establishes quantitative global-local mixing estimates. Since some preliminary work is needed, the statement of the main theorem is postponed to Section 3. We state here some corollaries which should give the reader a rather complete picture of the possible scenarios. Theorem 2.2 states that, for a dense class of almost periodic² global observables, we have rapid mixing, namely the decay of correlations is faster than any given polynomial, in analogy to Dolgopyat's result [19] in the case of circle extensions. On the other hand, for a dense class of global observables which vanish at infinity, Theorems 2.3 and 2.4 state that the decay is polynomial. The bound in Theorem 2.4 is generically optimal, as we show in §3.6.
Let us fix some notation. Let C^k(R) be the space of k-times differentiable functions on R. We will denote by P the subspace of C^2(R) consisting of 2π-periodic functions; by C_0(R) the space of continuous functions on R which vanish at infinity, and by C^∞_c(R) the subspace of infinitely differentiable functions with compact support. The space C^∞_c(R) has the structure of a Fréchet space, induced by the family of seminorms ‖·‖_{C^k}, for k ∈ N. We will say that a map ψ : In the rest of the paper, we will implicitly identify maps a from Σ to some space of complex-valued measurable functions over R with complex-valued measurable functions on Σ × R by setting a(x, r) = [a(x)](r).
Theorem 2.2 (Rapid global-local mixing). Assume that F, defined as in (3), is accessible. For any Lipschitz map ψ : Σ → C^∞_c(R), for any Lipschitz map Φ : Σ → P, and for every ℓ ∈ N, there exists a constant C = C(ℓ, ψ, Φ) ≥ 0 such that for all n ∈ N,
Rapid mixing holds in many more situations besides the one in Theorem 2.2.
In a precise sense that will become clear later, the key property of the global observable that ensures rapid mixing is the absence of "low frequency components" in almost every fiber, a property which is clearly satisfied by smooth functions which are periodic on the fibers.
² We recall that the space of almost periodic functions is the closure of the space of trigonometric polynomials with respect to the uniform norm.

The situation is different when the global observable vanishes at infinity. Note that, in this case, the average ν_av defined in (4) is zero.
Theorem 2.3 (Polynomial global-local mixing I). Assume that F, defined as in (3), is accessible. There exists a space of bounded continuous functions D ⊂ C_0(R), which is dense in C_0(R) with respect to ‖·‖_∞, and there exists α > 0 such that the following holds. For any Lipschitz map ψ : Σ → C^∞_c(R), and for any map Φ : Σ → D which satisfies some explicit Lipschitz condition, there exists a constant C = C(ψ, Φ) ≥ 0 such that for all n ∈ N,
The Lipschitz conditions for the global observable in Theorem 2.3 will be stated explicitly in Section 3 below, as well as a bound on α (see the second paragraph of the proofs at §3.5). If we further assume that the global observable takes values in W^1(R), where W^1(R) is the Sobolev space of L^2 functions with weak derivative in L^2, then the statement reads as follows.
Theorem 2.4 (Polynomial global-local mixing II). For any Lipschitz map ψ : Σ → C^∞_c(R), for any Lipschitz map Φ : Σ → W^1(R), and for any ε > 0, there exists a constant C = C(ψ, Φ, ε) ≥ 0 such that for all n ∈ N,
Remark 2.5. The bound in Theorem 2.4 is optimal: we will provide an example in §3.6 of a pair of global and local observables Φ, ψ for which the correlations are bounded below by $|\mathrm{cov}(\Phi \circ F^n, \psi)| \ge B n^{-1/2}$ for some constant B > 0, and, on the other hand, Theorem 2.4 implies that for any ε > 0 there exists a constant C > 0 such that $|\mathrm{cov}(\Phi \circ F^n, \psi)| \le C n^{-1/2+\varepsilon}$.
We stated our results on the symbolic systems, as it makes the statements easier to read. Thanks to the semiconjugacy between the diffeomorphism and the symbolic dynamics, analogous classes of observables give analogous results on the original dynamics, as their regularity is preserved (see Lemma 3.2 for the details).

THE MAIN RESULT
In this section, we first recall the definition of accessibility for F as in (3); then, we describe the classes of global and local observables we consider, and we state our main result. We deduce Theorems 2.2, 2.3 and 2.4 from Theorem 3.9. To help the reader in following the flow of the proofs, some of the lemmas stated in this section are proved in Appendix B.
3.1. Accessibility. For each point x ∈ Σ, we define the stable and unstable set at x by, respectively,
W^s(x) = {y ∈ Σ : there exists n ∈ Z such that y_i = x_i for all i ≥ n},
W^u(x) = {y ∈ Σ : there exists n ∈ Z such that y_i = x_i for all i ≤ n}.
The sets W s (x, r) and W u (x, r) are called the (strong) stable and (strong) unstable manifold at (x, r) ∈ Σ × R. Vertical lines {x} × R constitute the center manifolds, namely they form an invariant fibration and the action of F on each line is isometric.
We now define the accessibility property. A su-path from (x, r) to (y, s) is a finite We say that F is accessible if for any two points (x, r), (y, s) ∈ Σ × R there is a su-path from (x, r) to (y, s).
A consequence of the accessibility property is the following fact, which will be proved in Appendix B. This lemma is used in Section 6 when analyzing the analytic perturbation of the transfer operator. However, we still need to directly use accessibility of F in other parts of the proof.
3.2. The classes of global and local observables. We now describe the classes of global and local observables we consider, which we will call good global and local observables. Let us start by observing that the average defined in (4) is invariant under F. The proof of Lemma 3.2, which can be found in Appendix B, is a consequence of the invariance of the Gibbs measure with respect to the dynamics.
We will denote by S the Fréchet space of Schwartz functions on R, with the family of seminorms $\|g\|_{a,\ell} := \sup_{r \in \mathbb{R}} |r|^a \left| \frac{d^\ell}{(dr)^\ell} g(r) \right|$.
We will say that a function ψ : Σ → S is Hölder if it is Hölder with respect to $\|\cdot\|_{a,\ell}$ for all a, ℓ ∈ N. Starting from Definition 2.1, we restrict ourselves from now on to smaller classes of observables. Let η be a complex measure over R. We will denote by |η| the variation of η and by $\|\eta\|_{TV} = |\eta|(\mathbb{R})$ its total variation. We recall that the Fourier-Stieltjes transform $\hat{\eta}$ of a complex measure η of finite total variation is the L^∞ function defined by $\hat{\eta}(r) := \int_{\mathbb{R}} e^{-ir\xi} \, d\eta(\xi)$.
Let A be the set of such $\hat{\eta}$'s: the space A of all Fourier-Stieltjes transforms is an algebra of functions, called the Fourier-Stieltjes algebra. We equip A with the total variation norm, namely, for $\hat{\eta}_1, \hat{\eta}_2 \in A$, we set $\|\hat{\eta}_1 - \hat{\eta}_2\| := \|\eta_1 - \eta_2\|_{TV}$.
Definition 3.4 (Good global observables). We denote by G ⊂ L^∞(ν) the space of Hölder functions Φ : Σ → A which satisfy the following tightness condition: there exist a, A > 0 such that for all r ≥ 1 and x ∈ Σ we have
The tightness condition (TC) above ensures that one can control the large frequency behaviour of Φ(x, ·) uniformly in the point x ∈ Σ. In particular, it will be exploited in the proofs of Lemma 10.2 and Proposition 10.3 in Appendix C to have some compactness property.
Remark 3.5. Any Hölder function from Σ into a metric space can be made Lipschitz by choosing a larger θ and hence changing the metric d θ on Σ. Since we are not imposing any condition on θ , here and henceforth we will assume that good local observables are Lipschitz maps ψ : Σ → S and good global observables are Lipschitz maps Φ : Σ → A satisfying (TC).
The following lemma shows that elements of G are indeed global observables; more precisely, the average ν av of Φ is the average of the values of the associated measures η x at 0. See Appendix B for the short complex integration computation which leads to the result. where, as before, η x = Φ(x).
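The relation between the infinite volume average and the mass of the spectral measure at 0 can be checked by hand for discrete measures. The sketch below is only an illustration under stated assumptions: it takes a single fiber, a measure with finitely many atoms (the values in `ATOMS` are hypothetical), and it assumes that the average is taken over symmetric windows [-R, R], since the exact form of (4) is not reproduced here.

```python
import cmath
import math

# A discrete complex measure eta = sum_k a_k * delta_{xi_k}, stored as (xi_k, a_k).
# These atoms are hypothetical values chosen only for illustration.
ATOMS = [(0.0, 0.3), (1.0, 0.5), (-2.5, 0.2j)]


def fs_transform(atoms, r):
    """Fourier-Stieltjes transform: eta_hat(r) = sum_k a_k * exp(-i r xi_k)."""
    return sum(a * cmath.exp(-1j * r * xi) for xi, a in atoms)


def total_variation(atoms):
    """Total variation |eta|(R) = sum_k |a_k| (atoms sit at distinct points)."""
    return sum(abs(a) for _, a in atoms)


def window_average(atoms, R):
    """Exact value of (1 / 2R) * integral over [-R, R] of eta_hat(r) dr.

    Each atom at xi != 0 contributes a * sin(R xi) / (R xi); an atom at 0
    contributes its full mass a. The symmetric window is an assumption.
    """
    total = 0j
    for xi, a in atoms:
        total += a if xi == 0 else a * math.sin(R * xi) / (R * xi)
    return total


if __name__ == "__main__":
    print("total variation:", total_variation(ATOMS))
    for R in (1e1, 1e3, 1e6):
        print(R, window_average(ATOMS, R))
```

As R grows, the window average approaches the mass of the atom at 0, while the uniform norm of the transform stays below the total variation; both facts are elementary for discrete measures and are only meant to make the statement above concrete.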
We conclude this section by providing a useful criterion to determine whether a given function is the Fourier-Stieltjes transform of a finite complex measure which satisfies (TC). A positive definite function is any function g : R → C such that, for all n ≥ 1, x_i, x_j ∈ R and z_i, z_j ∈ C, we have $\sum_{i,j=1}^{n} g(x_i - x_j) \, z_i \bar{z}_j \ge 0$. By Bochner's theorem, a function g is continuous and positive definite if and only if it is the Fourier-Stieltjes transform $\hat{\eta}$ of a finite positive measure η on R (see, e.g., [46, Theorem IX.9]). For example, it is easy to check that g(x) = e^{ix} or g(x) = cos(x) are positive definite functions. A less trivial example is the function g(x) = 1/(|x|+1); the fact that g is positive definite follows from Pólya's Criterion: any positive, continuous, even function which, for positive x, is non-increasing, convex and tends to 0 for x → ∞ is the Fourier-Stieltjes transform of an L^1 function, thus positive definite. We refer the reader to [37] and [50, Chapter 6] for more results concerning the Fourier-Stieltjes transform of measures. Since any complex measure of finite total variation is a linear combination of positive finite measures, the proof of the lemma above follows immediately from the following tail estimate, whose proof can be found in Appendix B.
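The positive definiteness of the three examples named above can be probed numerically by evaluating the quadratic form at random points and random complex weights. This is a sanity check rather than a proof: the sample sizes, the sampling window and the tolerance are arbitrary choices, and the indicator function used as a counterexample is not taken from the text.

```python
import cmath
import math
import random


def quadratic_form(g, xs, zs):
    """Evaluate sum_{i,j} g(x_i - x_j) * z_i * conj(z_j)."""
    return sum(g(xi - xj) * zi * zj.conjugate()
               for xi, zi in zip(xs, zs)
               for xj, zj in zip(xs, zs))


def looks_positive_definite(g, trials=200, n=6, seed=0, tol=1e-9):
    """Return False as soon as one random quadratic form is not real and >= 0."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.uniform(-10.0, 10.0) for _ in range(n)]
        zs = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
        q = complex(quadratic_form(g, xs, zs))
        if q.real < -tol or abs(q.imag) > tol:
            return False
    return True


def indicator(x):
    """Indicator of [-1, 1]: an illustrative function that is NOT positive definite."""
    return 1.0 if abs(x) <= 1.0 else 0.0


if __name__ == "__main__":
    print(looks_positive_definite(lambda x: cmath.exp(1j * x)))
    print(looks_positive_definite(math.cos))
    print(looks_positive_definite(lambda x: 1.0 / (abs(x) + 1.0)))
    # Points 0, 1, 2 with weights 1, -1, 1 give a negative quadratic form.
    print(quadratic_form(indicator, [0.0, 1.0, 2.0], [1 + 0j, -1 + 0j, 1 + 0j]))
```

A random search can only fail to find a violation; it cannot certify positive definiteness. The explicit three-point configuration for the indicator shows what a violation looks like.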
3.3. Statement of the main result. For any good global observable Φ ∈ G and for any r > 0, let us define the "low frequency variation" as Notice that LF(Φ, ·) is monotone and LF(Φ, r) → 0 for r → 0. We are now ready to state our main result.
Theorem 3.9 (Quantitative global-local mixing). Assume that F, defined as in (3), is accessible. Then, for every ψ ∈ L , for every Φ ∈ G , for any k ∈ N, and for every ε > 0, there exists a constant C = C(Φ, ψ, k, ε) > 0 such that for every n ∈ N, The bound in Theorem 3.9 is the sum of two terms, namely a superpolynomial term and the contribution given by the measures |η x | close to 0. In particular, if the support of the measures η x does not intersect some neighbourhood of 0, then the decay of correlations is superpolynomial. On the other hand, for example under the assumptions of Theorem 2.4, the measures |η x | are absolutely continuous and the decay is polynomial.
In the rest of the section we prove Theorems 2.2, 2.3 and 2.4 from the result above.
3.4. Proof of Theorem 2.2. By the theory of Fourier series, any p ∈ P ⊂ C^2(R), by periodicity, is the Fourier-Stieltjes transform of a discrete measure η of the form $\eta = \sum_{n \in \mathbb{Z}} a_n \delta_n$, where a_n ∈ C and δ_n is the Dirac measure at n. We claim that any Lipschitz map Φ : Σ → P is contained in G. Theorem 3.9 then immediately implies the result.
We first check the Lipschitz condition. For x ∈ Σ, let us write $\eta_x = \sum_{n \in \mathbb{Z}} a_n(x) \delta_n$. Since Φ(x) ∈ C^2(R), it follows that $\lim_{n \to \infty} |n^2 a_n(x)| = 0$. In particular, the sequence |a_n(x)| · (1 + i|n|) is square-summable (notice that $i n \, a_n(x)$ are the Fourier coefficients of the derivative Φ(x)′). Thus, for any x, y ∈ Σ, by Cauchy-Schwarz, we have Hence, by the Plancherel formula, there exists a constant C > 0 such that
We now show that Φ satisfies the tightness condition (TC). Since Φ(x) ∈ C^2(R), we can bound $|a_n(x)| \le \|\Phi(x)''\|_\infty \, n^{-2}$. Thus, for any r ≥ 2 we have which concludes the proof.
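The last step of the proof rests on the fact that coefficients of size n^{-2} leave little mass at high frequencies. A minimal numerical sketch of this tail bound follows. It assumes that (TC) asks for a polynomially decaying bound on the mass carried outside [-r, r] (the exact form of the condition is not reproduced here), and it checks the elementary inequality sum over |n| ≥ r of n^{-2} ≤ 2/(r - 1) for r ≥ 2; the truncation level `N` is an arbitrary choice.

```python
import math


def two_sided_tail_upper(r, N=10**5):
    """Upper bound for the sum of n**-2 over integers |n| >= r.

    The partial sum up to N is computed directly; the remainder beyond N
    is bounded by the integral of x**-2 over [N, infinity), which is 1/N.
    """
    start = math.ceil(r)
    partial = sum(1.0 / (n * n) for n in range(start, N + 1))
    return 2.0 * (partial + 1.0 / N)


if __name__ == "__main__":
    for r in (2, 2.5, 10, 100, 1000):
        # Compare with the integral comparison bound 2 / (r - 1).
        print(r, two_sided_tail_upper(r), 2.0 / (r - 1))
```

Multiplying by the uniform bound on the second derivative would then give a tail estimate of order 1/r, uniformly in x, under the assumption on (TC) stated above.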
3.5. Proof of Theorems 2.3 and 2.4. Let us first prove Theorem 2.3. To this end, fix any p > 1 and consider as D the space of Fourier transforms of functions f ∈ L^1 ∩ L^p with power decay, namely, for which there exist constants A, a > 0 such that Then, using the Hölder inequality, with 1 Let us now prove Theorem 2.4. It follows from [6, Theorem 4.2] that any function f ∈ W^1 is the Fourier transform of a function g ∈ L^1 ∩ L^2, which satisfies $\|g\|_2 = \|f\|_2$ and $\|g\|_1 \le \|f\|_{W^1}$. This implies that any Lipschitz map Φ : Σ → W^1 is Lipschitz also with respect to the total variation norm.
Let us check that Φ satisfies the tightness condition (TC). Denote dη For any r ≥ 2, by Cauchy-Schwarz and by the Plancherel formula, we have The estimate $|\eta_x|\bigl([-n^{-1/2+\varepsilon}, n^{-1/2+\varepsilon}]\bigr) = O\bigl(n^{-1/4+\varepsilon}\bigr)$ follows from the Cauchy-Schwarz inequality exactly as above. If in addition Φ has range in W^1 ∩ L^p, then the functions f_x belong to L^q, where 1/p + 1/q = 1, and one can conclude using the Hölder inequality again. This finishes the proof.
3.6. Example. We discuss a simple example, which shows that the bound in Theorem 2.4 cannot, in general, be improved. As good local observable, let us consider any non-negative ψ(x, r) = ψ(r) ∈ C^∞_c(R) which equals 1 in the interval [−1/2, 1/2] and, as good global observable, let Φ(x, r) = Φ(r) = 1/(1+|r|). Then, Φ ∈ W^1(R) ∩ L^p(R) for any p > 1, so that Theorem 2.4 implies that for any ε > 0 there exists a constant C ≥ 0 such that Let us show that there is a lower bound of order exactly O(n^{−1/2}). Lemma 3.1 implies that f is not cohomologous to zero. Moreover, by the Central Limit Theorem, there exists a constant C′ > 0 such that for any n ∈ N sufficiently large, on a subset Y_n ⊂ Σ of measure at least 1/2, the Birkhoff sums f_n( In particular, for any x ∈ Y_n and r ∈ [−1/2, 1/2], we have Thus, for any n ∈ N sufficiently large, it follows that We have shown that there exists a constant C = (4C′)^{−1} and, for any ε > 0, there exists a constant C_ε > 0 such that we can bound the correlations by $C n^{-1/2} \le |\mathrm{cov}(\Phi \circ F^n, \psi)| \le C_\varepsilon n^{-1/2+\varepsilon}$, hence the bound of Theorem 2.4 is, in this case, optimal.
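The order n^{-1/2} in the example can be observed in a toy model. The sketch below is a heuristic illustration only: it replaces the Birkhoff sums of f by a simple symmetric random walk S_n (an assumption, not the dynamics of the paper), concentrates the local observable at r = 0, and computes the expectation of Φ(S_n) with Φ(r) = 1/(1+|r|) exactly from the binomial distribution.

```python
import math


def phi(r):
    """Global observable of the example: Phi(r) = 1 / (1 + |r|)."""
    return 1.0 / (1.0 + abs(r))


def expected_phi(n):
    """E[Phi(S_n)] for the simple symmetric random walk S_n = 2K - n,
    where K is Binomial(n, 1/2). The random walk is a stand-in for the
    Birkhoff sums and is an assumption made for illustration."""
    denom = 2 ** n
    total = 0.0
    for k in range(n + 1):
        # Integer true division stays accurate even when 2**n is huge.
        total += (math.comb(n, k) / denom) * phi(2 * k - n)
    return total


if __name__ == "__main__":
    for n in (4, 16, 64, 256, 1024):
        e = expected_phi(n)
        # Rescaling by sqrt(n) should stay bounded below and grow slowly.
        print(n, e, math.sqrt(n) * e)
```

In this toy model the rescaled quantity sqrt(n) · E[Φ(S_n)] stays bounded away from zero and grows only slowly with n, which is consistent with a lower bound of order n^{-1/2} and an upper bound of order n^{-1/2+ε}.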

SKEW-PRODUCTS OVER ONE-SIDED SUBSHIFTS
To prove Theorem 3.9, we have to first prove analogous statements for one-sided subshifts. In this section, we discuss the case of skew-products over topologically mixing one-sided subshifts of finite type.
Let σ : X → X be a topologically mixing one-sided subshift of finite type, equipped with a Gibbs measure µ = µ u with respect to the potential u. For 0 < θ < 1, the distance d + θ and the space of Lipschitz functions F + θ are defined analogously to the case of the two-sided shift. Let f + ∈ F + θ be a real-valued Lipschitz function with zero average, and consider the skew-shift Denote by ν the infinite measure µ × Leb on X × R. For any pair of global and local observables Φ, ψ over X × R, define the analogous correlation function

4.1. Good global and local observables for skew-shifts over one-sided subshifts. The classes of global and local observables we consider in this case are described below. In this setting, we require less regularity of the observables than in the case of two-sided shifts.
Definition 4.1 (Good local observables - one-sided case). Let L^+ ⊂ L^1(ν) be the space of functions ψ : X → S such that, for every ℓ ∈ N, the function x ↦ ∂^ℓ ψ(x) from X to L^1(R) is Lipschitz. For every ψ ∈ L^+, denote by Max_ℓ(ψ) and Lip_ℓ(ψ) the minimum constants such that Let us remark that, if ψ ∈ L^+, then, for every fixed x ∈ X, the Fourier transform $\hat{\psi}(x)$ of ψ(x) ∈ S is a Schwartz function as well. For any fixed ξ ∈ R, we denote by $\hat{\psi}_\xi$ : X → C the function $\hat{\psi}_\xi(x) = \hat{\psi}(x)(\xi)$.
Lemma 4.2. Let ψ ∈ L^+. For every ξ ∈ R, we have $\hat{\psi}_\xi$ ∈ F^+_θ. Moreover, for every ℓ ≥ 0, and for all ξ ≠ 0 we have
Proof. For any ξ ≠ 0, x ∈ X, and ℓ ≥ 0 we have, by assumption in equation (8),
Let us recall that A ⊂ L^∞ denotes the space of Fourier-Stieltjes transforms of complex measures with finite total variation.
In the next sections, we will deal only with the non-invertible case of the skew-product F^+ and we will often suppress the + in the notations introduced above, as it should not generate confusion. In Section 10 we will return to the invertible setting.

4.2. Collapsed accessibility. The property we need in the case of one-sided shifts, which will replace the accessibility assumption, is the following notion of collapsed accessibility.

Definition 4.4. A Lipschitz function f : X → R has the collapsed accessibility property if there are constants C and N such that the following holds: for any x ∈ X, t ∈ [0, 1], and n ≥ 2N, there is a sequence of points The adjective "collapsed" refers to the fact that local stable manifolds are collapsed to points when going from Σ × R to X × R.
In order to prove Theorem 3.9, we will see in Section 10 that we can reduce an accessible skew-product F to a skew-product F + over a one-sided shift such that f + enjoys the collapsed accessibility property.

4.3. The one-sided version of the main theorem. We state our main theorem in the case of skew-products over one-sided subshifts which have the collapsed accessibility property. In Section 10, we will deduce Theorem 3.9 from Theorem 4.5 below.
Theorem 4.5 (Quantitative global-local mixing for one-sided subshifts). Assume that f + , defined as in (7), has the collapsed accessibility property. Then, for every ψ ∈ L + , for every Φ ∈ G + , for any k ∈ N, and for every ε > 0, there exists a constant C = C(Φ, ψ, k, ε) > 0 such that for every n ∈ N, The "low frequency" term LF(Φ, ·) in Theorem 4.5 is defined exactly as in (6), except that the integral is on X instead of Σ.

4.4. An expression for the correlation function. The main tool to study the correlations is the transfer operator. We recall the relevant definitions.
For any z ∈ C, let us further define the twisted transfer operator L_z : where u is the potential defining the Gibbs measure and u_n its cocycle. Notice that all the operators described above restrict to operators acting on F^+_θ.
Proposition 4.6. Let ψ ∈ L + and Φ ∈ G + . Then, for every n ∈ N we have Proof. By definition of the transfer operator L F + , we can write where the applicability of the Fubini-Tonelli Theorem follows immediately from the definition of G + and L + . Since Φ(x) is the Fourier-Stieltjes transform of a measure η x we get thus we can again apply the Fubini-Tonelli Theorem to get The conclusion follows by construction due to the equality

CANCELLATIONS FOR TWISTED TRANSFER OPERATORS
From Proposition 4.6, it is clear that, in order to estimate the correlations, we need to study the twisted transfer operators L ξ , for real frequencies ξ ∈ R. The aim of this section is to show that the collapsed accessibility property can be exploited to obtain some cancellations in the expression of L ξ .
Let us fix a complex-valued Lipschitz function g̃ : X → C, and let |g̃| = g. In this section, we use a tilde to denote a complex-valued or "twisted" function, and the same letter without a tilde to denote its absolute value. We denote by L : and by L : Up to conjugating L with a suitable multiplication operator, we can assume that Notice that, for the operator L_ξ defined in the previous section, we have g̃ = exp(u + iξ f), where u is the potential for the Gibbs measure µ.

By induction,
where g_n and g̃_n are the cocycles It follows that for all n ≥ 1 we have
5.1. Collapsed accessibility and cancellation pairs. Let us fix a positive constant ε > 0, and an integer n ≥ 1. We assume ε < 1/2 and ε < 1 − θ. Define We say L has (ε, n)-cancellation if for any observable ṽ with |ṽ|_θ ≤ H and A pair of points (x, y) in X is a stable pair if σ^n x = σ^n y. We say a stable pair (x, y) is a cancellation pair for a nice observable ṽ if
Proof. By definition of cancellation pair, we have where we used that L^n 1 = 1.
For a stable pair (x, y) define the phase of (x, y) as $\arg\bigl(\tilde{g}_n(y) / \tilde{g}_n(x)\bigr)$.
Here, arg is the complex argument and so the phase is the angle between g̃_n(x) and g̃_n(y) in the complex plane. For the most part, we can just think of this value as an angle. However, if we include it in an inequality, we will assume it is a real number between −π and π.
Note that the right hand side must be less than two for this to be well defined. In practice, we will always choose ε small enough so that this is the case.
Proposition 5.3. Let (x, y) be a stable pair, and ṽ a nice observable. If (x, y) is not a cancellation pair for ṽ, then where δ is the stable tolerance of (x, y).
In other words, if s is the phase of (x, y), then ignoring issues of the angle only being defined up to a multiple of 2π.
To prove the proposition, we first establish the following lemma.
This may be rewritten as
Proof of Proposition 5.3. We show the contrapositive. Define z_1 = g̃_n(x)ṽ(x) and z_2 = g̃_n(y)ṽ(y). Then and a similar estimate holds for |z_2|. Using ε < 1/2 and the definition of the stable tolerance, one sees that Let α be the angle between z_1 and z_2. If δ < α, then 1 − cos(δ) < 1 − cos(α) and Lemma 5.4 shows that (x, y) is a cancellation pair for ṽ.
For an arbitrary pair (x, y) of points in X, define the unstable tolerance as the number 0 ≤ δ < π/2 such that sin(δ) = 2H d(x, y). Note that x and y must be reasonably close for this to be well defined.
Proposition 5.5. If (x, y) is a pair with unstable tolerance δ and ṽ is a nice observable, then Again, we rely on a trigonometric lemma.
Proof. Assume |z_1| > |z_2| and consider the acute triangle defined by the points 0, z_1 and z_2 in the complex plane. Split this triangle into two right triangles by adding a line segment from z_2 to the opposite side of the triangle. This new segment has length |z_2| sin(α) ≥ (1 − ε) sin(α) and so the line segment from z_1 to z_2 has length at least (1 − ε) sin(α).
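The inequality obtained in this proof is easy to test numerically. The sketch below assumes that the hypothesis of the lemma is that both |z_1| and |z_2| are at least 1 − ε (here sampled in [1 − ε, 1 + ε], which is an assumption on what "nice" entails) and that the angle α between them lies in [0, π/2]; it reports the smallest observed value of |z_1 − z_2| − (1 − ε) sin(α).

```python
import cmath
import math
import random


def min_slack(eps, trials=10000, seed=0):
    """Smallest observed value of |z1 - z2| - (1 - eps) * sin(alpha).

    z1 and z2 have moduli in [1 - eps, 1 + eps] and make an angle alpha
    in [0, pi/2]; these sampling ranges are assumptions for illustration.
    """
    rng = random.Random(seed)
    worst = math.inf
    for _ in range(trials):
        r1 = rng.uniform(1 - eps, 1 + eps)
        r2 = rng.uniform(1 - eps, 1 + eps)
        alpha = rng.uniform(0.0, math.pi / 2)
        base = rng.uniform(0.0, 2 * math.pi)
        z1 = cmath.rect(r1, base)
        z2 = cmath.rect(r2, base + alpha)
        worst = min(worst, abs(z1 - z2) - (1 - eps) * math.sin(alpha))
    return worst


if __name__ == "__main__":
    for eps in (0.1, 0.25, 0.49):
        print(eps, min_slack(eps))
```

A non-negative minimum (up to rounding) is what the geometric argument predicts, since |z_1 − z_2| is at least the distance from z_2 to the line through 0 and z_1.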
Proof of Proposition 5.5. Let z_1 = ṽ(x) and z_2 = ṽ(y) and let α be the angle between them. The above lemma and the definition of "nice" together imply that Since ε < 1/2 by assumption, the result follows.
A us-cycle is a (finite) sequence of points in X : where x_1 = x_{m+1} and each pair (x_k, y_k) is a stable pair. The tolerance of the cycle is the sum of the stable tolerances of the pairs We only consider us-cycles for which this tolerance is well defined. The phase of the cycle is
That is, the phase of the cycle is the sum of the phases of the individual stable pairs (up to a multiple of 2π). As defined, the phase is a number in (−π, π]. We will only consider cycles where the phase is positive. Proposition 5.7. If there is a us-cycle where the phase is greater than the tolerance, then L has strong (ε, n)-cancellation.
Proof. Letṽ be a nice observable. Our goal is to show that one of the stable pairs in the cycle is a cancelling pair forṽ. We assume none of them is a cancelling pair and derive a contradiction. Let S be the phase of the cycle and δ = δ s + δ u be the tolerance, where δ s is the sum of the stable tolerances and δ u is the sum of the unstable tolerances. We are assuming 0 < δ < S. Proposition 5.3 implies that < S + δ s and Proposition 5.5 implies that Since x 1 = x m+1 , the complicated product in the middle of each inequality is actually the same complex number and so we get S − δ s < δ u , a contradiction.

5.2. Cancellation by frequency. We now apply the results above to the specific case of the operators $L_\xi$ defined in the previous section, namely to the case To simplify the presentation we only consider positive $\xi$, but analogous results hold for negative frequencies.
The notion of a "nice observable" will also depend on the frequency. In particular, the value $H$ from the previous section depends on $\xi$, and so $H = H_\xi = \frac{2}{1-\theta} R_\xi$, which also grows linearly in $\xi$. Define a constant $G = \inf\{ 1/g(x) : x \in X \}$ and an exponent $\alpha > 0$ determined by $\theta^\alpha G = 1$. Note that $G$ and $\alpha$ are independent of the frequency.
We now show that accessibility of the skew product leads to cancellation of these twisted operators and we give quantitative estimates of the amount of cancellation.
Proposition 5.8. Suppose $f^+$ has the collapsed accessibility property and $\xi_0 > 0$ is given. Then there are positive constants $A$ and $B$ such that if $\xi \ge \xi_0$ and $\varepsilon_\xi = \frac{1}{A G^{n_\xi}}$, where $n_\xi$ is the smallest integer which satisfies $\theta^{n_\xi} < \frac{1}{B\xi}$, then $L_\xi$ has strong $(\varepsilon_\xi, n_\xi)$-cancellation.
Remark 5.9. One can see from the definitions of ε ξ and n ξ that ε ξ = 1 Proof. Assume without loss of generality that 0 < ξ 0 < π. The overall strategy of the proof is to use collapsed accessibility to show that, for any frequency ξ ≥ ξ 0 , there is a us-cycle with phase equal to ξ 0 and tolerance less than ξ 0 . Proposition 5.7 then gives cancellation.
Let C and N be as given in Definition 4.4 of collapsed accessibility. Then there is a constant 0 < a < 1 such that any angle 0 < δ < π which satisfies either 1 − cos(δ ) ≤ a or sin(δ ) ≤ a also satisfies δ < ξ 0 /(2N). Since H ξ grows linearly in ξ , there is B > 0 such that 2CH ξ ≤ aBξ , for all ξ ≥ ξ 0 . Up to increasing the value of B, we can also ensure that n > 2N for any integer n which satisfies θ n < 1/(Bξ 0 ). Define A = 2/a. Now consider a specific frequency ξ ≥ ξ 0 and use n = n ξ and ε = ε ξ defined as in the statement of the proposition. Using this n and t = ξ ξ 0 , there is a sequence of points x 1 , y 1 , x 2 , y 2 , . . . , y m , x m+1 satisfying Definition 4.4. This sequence is a us-cycle for L ξ and has phase equal to ξ 0 . If δ is the stable tolerance of a pair (x k , y k ), then If instead δ is the unstable tolerance of a pair (y k , x k+1 ), then sin(δ ) = 2H ξ d(y k , x k+1 ) ≤ 2CH ξ θ n ≤ aBξ θ n ≤ a.
Together, these estimates show that the total tolerance of the us-cycle is less than ξ 0 and so Proposition 5.7 gives cancellation.
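To make the dependence of the cancellation parameters on the frequency concrete, the following Python sketch computes n_ξ as the smallest non-negative integer with θ^n < 1/(Bξ), and then ε_ξ = 1/(A G^n). The numerical values of θ, G, A and B are hypothetical, and restricting the search to n ≥ 0 is an assumption of this sketch.

```python
def cancellation_scale(xi: float, theta: float, G: float, A: float, B: float):
    """Return (n_xi, eps_xi) for the frequency xi.

    n_xi   : smallest integer n >= 0 with theta**n < 1 / (B * xi)
    eps_xi : 1 / (A * G**n_xi)
    All constants are illustrative inputs (assumption).
    """
    if not (0.0 < theta < 1.0 and G > 1.0 and xi > 0.0 and A > 0.0 and B > 0.0):
        raise ValueError("need 0 < theta < 1, G > 1 and xi, A, B > 0")
    n = 0
    while theta ** n >= 1.0 / (B * xi):
        n += 1
    eps = 1.0 / (A * G ** n)
    return n, eps

# Hypothetical constants: theta = 1/2 and G = 2, so that theta**alpha * G = 1 with alpha = 1.
n_xi, eps_xi = cancellation_scale(xi=2.0, theta=0.5, G=2.0, A=2.0, B=4.0)
```

Since θ^α G = 1, the quantity G^n equals θ^{−αn}, so in this sketch ε_ξ shrinks polynomially in ξ while n_ξ grows logarithmically.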

CONTRACTION
In this section, we show how to obtain some estimates on the norm of the operator L ξ . For high frequencies, we exploit the cancellations obtained in the previous section, while, for low frequencies, we apply some standard results from the perturbation theory of bounded linear operators.
6.1. High frequencies. Recall that we defined $H := \max\{1, \frac{2R}{1-\theta}\}$. It will be convenient to define the following norm on $F^+_\theta$: let Notice that the norms $\|\cdot\|_H$ and $\|\cdot\|_\theta$ are equivalent, namely In this section, we will prove the following result.
Proposition 6.1. Suppose that $f^+$ has the collapsed accessibility property, and let $\xi_0 > 0$ be given. Then, there exist positive constants $A, B > 0$ and an exponent $\beta > 0$ such that for all $\xi \ge \xi_0$ we have We start by proving some simple preliminary results.

Lemma 6.2. For any given $\xi > 0$, if $\tilde v \in F^+_\theta$, then $\|L_\xi \tilde v\|_H \le \|\tilde v\|_H$.

Proof. Clearly, $\|L_\xi \tilde v\|_\infty \le \|\tilde v\|_\infty \le \|\tilde v\|_H$. From the Basic Inequality in Lemma 5.1 we also get . This completes the proof.

Let us recall that, from the definition of Gibbs measure, it follows that there exist constants $C_u, d$ such that for any ball $B(x, r)$ centered at $x \in X$ with radius $r \ge 0$ we can bound
(11) $\mu(B(x, r)) \ge C_u r^d$.
We will also use the fact that the untwisted transfer operator $L$ on $F^+_\theta$ has a spectral gap, namely the following well-known result, see, e.g., [42, Theorem 2.2].

Lemma 6.3. There exist a bounded operator $N : F^+_\theta \to F^+_\theta$, a real number $0 < \delta < 1$, and a constant $C > 0$ such that for all $n \in \mathbb{N}$ we have $\|N^n\|_\theta \le C\delta^n$, and for all $\tilde v \in F^+_\theta$, $L^n(\tilde v) = \int_X \tilde v \, d\mu + N^n(\tilde v)$.
We have the following result.
Lemma 6.5. There exist constants $\bar A, \bar B > 0$ such that the following holds. Assume that $L = L_\xi$ has $(\varepsilon, n)$-cancellation. Then, for every $\tilde v \in F^+_\theta$ with $\|\tilde v\|_H \le 1$, and for any $N \ge N_0 := \lfloor -\bar B \log(\varepsilon/H) \rfloor$, we have

Proof. Let $N_1$ and $N_2$ be the minimum integers which satisfy For every $x \in X$, where, by Lemma 6.3 and (10), By the definition of $N_2$ and (12), we conclude

We are in position to complete the proof of Proposition 6.1.
Proof of Proposition 6.1. By Proposition 5.8, $L_\xi$ has $(\varepsilon_\xi, n_\xi)$-cancellation, with $\varepsilon_\xi \ge A_0 \xi^{-\alpha}$ and $n_\xi \le B_0 |\log \xi|$, for some positive constants $A_0, B_0$. Therefore, by Lemma 6.5, for every $N \ge N_0 + n_\xi$ we have , for some constant $A > 0$. By the definitions of $N_0$ and $n_\xi$, there exists a constant $B > 0$ such that $N_0 + n_\xi \le B|\log \xi|$.

6.2. Low frequencies. We now want to estimate the norm of $L_\xi$ for small $\xi \in \mathbb{R}$. Let us notice that there exists $\xi_0 > 0$ such that for all $0 \le \xi \le \xi_0$ we have $H = \max\{1, \frac{2R}{1-\theta}\} = 1$, so that $\|\cdot\|_H \le \|\cdot\|_\theta \le 2\|\cdot\|_H$. We will prove the following bound.
Proposition 6.6. There exist $\kappa > 0$ and a constant $A_\kappa > 0$ such that, for all $0 < \xi < \kappa$ and for all $n \ge 0$, we have $\|L^n_\xi\|_H \le 4(1 - A_\kappa \xi^2)^n$.

Let us recall that the family of operators $z \mapsto L_z$ is analytic for $z \in \mathbb{C}$. This ensures that we can apply classical results from analytic perturbation theory to study the spectrum of bounded linear operators, see in particular [

Theorem 6.7. There exists a $\kappa > 0$ such that the twisted transfer operator $L_z$ on $F^+_\theta$ has a spectral gap for all $|z| < \kappa$. Moreover, there exist $\lambda_z \in \mathbb{C}$ and linear operators $P_z$ and $N_z$ such that $L_z = \lambda_z P_z + N_z$ and which satisfy the following properties:
(1) $\lambda_z$, $P_z$ and $N_z$ are analytic on the disk $\{|z| < \kappa\}$,
(2) $P_z$ is a projection and its range has dimension 1,
(3) the spectral radius $\rho(N_z)$ of $N_z$ satisfies $\rho(N_z) < |\lambda_z| - \delta$, for some $\delta$ independent of $z$.
In our case, we restrict to real frequencies 0 < ξ < κ. For the proof of the following lemma, see [42,Chapter 4] and [49, Section 4]. Lemma 6.8. With the notation of Theorem 6.7, there exist constants A κ , B κ > 0 such that for all 0 < ξ < κ we have The fact that A κ is strictly positive follows from the fact that f + is not cohomologous to zero, see Lemma 3.1.

RAPID DECAY
In Section 8, we will use the contraction results established for the twisted transfer operator L ξ in the previous section in order to prove rapid mixing. In this section, we give several technical propositions in an abstract setting which encapsulate most of the difficult inequalities involved in the proof. Definition 7.1. Consider a function w : A ⊆ (0, ∞) → R. We say w(ξ ) decays rapidly in ξ if for each ℓ ≥ 1, there is a constant C such that |w(ξ )| ≤ Cξ −ℓ for all ξ . We say a sequence {s n } decays rapidly in n if for each ℓ ≥ 1, there is a constant C such that |s n | ≤ Cn −ℓ for all n.
then the sequence {s n } defined by s n = sup ξ w n (ξ ) decays rapidly in n.
In order to prove this, we first give a lemma which establishes, for each fixed $\xi$, an exponential rate of decay of the sequence $\{w_n(\xi)\}$.

Lemma 7.3. In the setting of Proposition 7.2, there are constants $D$ and $\gamma$ such that $w_{n+K}(\xi) < \frac{1}{e} w_n(\xi)$ for all $K > D\xi^\gamma$.

Proof. Consider a specific $\xi$ and let $k$ and $N$ be the smallest integers such that If we choose an exponent $\gamma > \beta$, then there is a constant $D$ such that Moreover, this constant $D$ may be chosen uniformly for all $\xi$. If $K > D\xi^\gamma$, then $K > kN$ and $w_{n+K}(\xi) \le w_{n+kN}(\xi)$.
Propositions 7.2 and 7.4 are enough to establish rapid mixing in the setting of skew products over one-sided shifts. However, to handle two-sided shifts, we will need the following more technical results.
Proof. As w(ξ ) is bounded, we may without loss of generality assume that w(ξ ) ≤ 1 for all ξ . Define w n (ξ ) = sup m θ m v n,m (ξ ). One can verify that {w n } satisfies the hypotheses of Proposition 7.2. Hence sup ξ w n (ξ ) decays rapidly in n, meaning that for a given ℓ, there is C such that v n,m (ξ ) < Cθ −m n −ℓ for all m and n. If m < c log(n), then From this, one can see that {t n } decays rapidly in n. Proof. This follows from Proposition 7.4 using the proof of Proposition 7.5.

PROOF OF THEOREM 4.5
This section is devoted to the proof of Theorem 4.5. Let ψ ∈ L + and Φ ∈ G + be given, and fix k ∈ N and 0 < α < 1/2. Recall that the good global observable Φ defines a complex measure η x for each x ∈ X and that there is a uniform constant M = Φ G + such that η x TV ≤ M for all x ∈ X . The Fourier transform of the good local observable ψ is a function of the form ψ : X × R → C where, for each frequency ξ , the function ψ ξ : X → C defined by ψ ξ (x) = ψ(x)(ξ ) is Hölder and lies in F + θ . By Proposition 4.6, we have We will estimate the correlations by splitting the frequencies ξ ∈ R into the cases ξ = 0, 0 < |ξ | < n −α , and |ξ | > n −α . In fact, we only consider ξ ≥ 0 as the estimates for ξ < 0 are analogous. The proof of Theorem 4.5 follows from the next three lemmas.
Lemma 8.1. There exist constants C > 0 and 0 < δ < 1 (depending on L, as given in Lemma 6.3) such that for all n.
Proof. Recalling that L 0 = L is the transfer operator associated to σ , we have By Lemma 6.3, there exist constants C > 0 and 0 < δ < 1 such that where we used that ψ 0 (x) = ∫ R ψ(x, r) dr. Hence, by Lemma 3.6 and Lemma 4.2, we conclude

Lemma 8.2. We have that for all n.
decays rapidly in n.
Proof. For each n, define a function w n : [0, ∞) → [0, ∞) by w n (ξ ) = L n ξ ψ ξ H . By Lemma 6.2, w n is a decreasing sequence of functions, and by Lemma 4.2, w 0 (ξ ) is a bounded function which (in the notation of Section 7) decays rapidly in ξ . Up to rescaling ψ, we may freely assume that w 0 takes values in [0, 1]. Proposition 6.6 implies that w n restricted to (0, κ] satisfies the hypotheses of Proposition 7.4. We then fix ξ 0 = κ, so that Proposition 6.1 implies that w n restricted to [ξ 0 , ∞) satisfies the hypotheses of Proposition 7.2. Hence, the sequence defined by s n = sup {w n (ξ ) : n −α ≤ ξ < ∞} decays rapidly in n. Note that L n ξ ψ ξ ∞ ≤ L n ξ ψ ξ H and so, for each n, As µ is a probability measure, it follows that where M is the uniform bound on η x TV .
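To illustrate why a frequency threshold of the form n^{−α} with α < 1/2 is compatible with the low-frequency bound of Proposition 6.6, the following Python sketch evaluates 4(1 − A_κ ξ²)^n at ξ = n^{−α} and compares it with 4 exp(−A_κ n^{1−2α}), which decays faster than any polynomial in n. The value of A_κ is hypothetical, and the sketch ignores the requirement ξ < κ, which only holds for n large enough.

```python
import math

def low_freq_bound(n: int, xi: float, A_kappa: float) -> float:
    """Right-hand side 4 * (1 - A_kappa * xi**2)**n of the low-frequency bound."""
    return 4.0 * (1.0 - A_kappa * xi ** 2) ** n

def bound_at_threshold(n: int, alpha: float, A_kappa: float):
    """Evaluate the bound at xi = n**(-alpha) and its exponential majorant.

    Uses the elementary inequality (1 - x)**n <= exp(-n * x) for 0 <= x <= 1.
    """
    xi = n ** (-alpha)
    majorant = 4.0 * math.exp(-A_kappa * n ** (1.0 - 2.0 * alpha))
    return low_freq_bound(n, xi, A_kappa), majorant

# Hypothetical choices: alpha = 1/4 and A_kappa = 1/2.
values = {n: bound_at_threshold(n, alpha=0.25, A_kappa=0.5) for n in (10, 100, 1000, 10000)}
```

For α ≥ 1/2 the exponent n^{1−2α} no longer grows, which is one way to see the role of the restriction 0 < α < 1/2.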

FROM ACCESSIBILITY TO COLLAPSED ACCESSIBILITY
In this section, we relate the notion of accessibility for a skew-product F as in (3) to the property of collapsed accessibility defined in Section 4.
For a two-sided shift σ : Σ → Σ, let X be the corresponding one-sided shift and let π : Σ → X be the projection. Note that π is a continuous, surjective, open map. We also write x + for π(x).
For x ∈ Σ, define W s 0 (x) = π −1 π(x). In other words, y ∈ W s 0 (x) if and only if x + = y + . For n ∈ Z, define W s n (x) = σ −n W s 0 (σ n x) and note that  For points x and y in Σ and n ∈ Z, the following are equivalent: (1) y ∈ W s n (x), (2) dist(σ n+k x, σ n+k y) ≤ θ k for all k ≥ 0.
Proof. One can show that each of these conditions is equivalent to the sequences of symbols for x and y satisfying x m = y m for all m ≥ n.
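The symbolic criterion used in this proof is easy to experiment with. The following Python sketch models a two-sided sequence by a dictionary on a finite window of indices (an assumption, since genuine elements of Σ are bi-infinite) and tests membership in W^s_n through the condition x_m = y_m for all m ≥ n. The function names are hypothetical.

```python
def shift(x: dict, j: int = 1) -> dict:
    """Apply sigma**j, i.e. (sigma**j x)_m = x_{m+j}, on a finite window."""
    return {m - j: s for m, s in x.items()}

def in_local_stable(x: dict, y: dict, n: int) -> bool:
    """True if x_m == y_m for every m >= n in the common window of x and y."""
    common = set(x) & set(y)
    return all(x[m] == y[m] for m in common if m >= n)

# Two sequences on the window -5..5 that differ only at index -2.
x = {m: 0 for m in range(-5, 6)}
y = dict(x)
y[-2] = 1

same_future = in_local_stable(x, y, 0)     # agree on all indices >= 0
too_early = in_local_stable(x, y, -3)      # index -2 >= -3 is a disagreement
```

On such windows one can also check that membership in W^s_n(x) agrees with membership of σ^n y in W^s_0(σ^n x), in line with the definition W^s_n(x) = σ^{−n} W^s_0(σ^n x).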
Instead of projecting onto the future x → x + , we can analogously project onto the past x → x − . Define local unstable manifolds by y ∈ W u 0 (x) if and only if x − = y − , and for n ∈ Z define W u n (x) = σ n W u 0 (σ −n x). Analogous versions of the above lemmas hold for these manifolds.
Let us now consider the skew-product (3). Writing p = (x, s) and q = (y,t), we define local stable manifolds by p ∈ W s n (q) if and only if p ∈ W s (q) and x ∈ W s n (y). Define local unstable manifolds analogously. For points p and q and an integer N > 0, a us-N-path from p to q is a sequence p = p 0 , p 1 , . . . , p n = q such that n ≤ N and for each 0 ≤ k < n either p k+1 ∈ W s N (p k ) or p k+1 ∈ W u N (p k ). For a point p, define AC N (p) by q ∈ AC N (p) if and only if there is a us-N-path from p to q. Note that the sets AC N (p) form an increasing sequence whose union is AC(p).
For a subset U ⊂ Σ × R, define is open for all n ≥ 0. If K ⊂ Σ × R is compact, then AC N (K) is compact for all n ≥ 0.
Proof. This follows directly from Lemma 9.1.
Proposition 9.4. Let K be a compact subset of Σ × R which equals the closure of its interior. If p ∈ Σ × R is such that K ⊂ AC(p), then there is N such that K ⊂ AC N (p).
Proof. Since AC N (p) is an increasing sequence of compact sets and K is a Baire space, there is N 1 such that AC N 1 (p) contains a non-empty open subset U ⊂ K.
Since AC N (U ) is an increasing sequence of open sets whose union contains the compact set K, there is N 2 such that K ⊂ AC N 2 (U ). Then K ⊂ AC N 1 +N 2 (p).
We have the following result.
Proposition 9.5. Let f : X → R be a Lipschitz function. If the skew product is accessible, then f has the collapsed accessibility property.
Proof. Let K = Σ × [0, 1]. By Proposition 9.4, there is a uniform constant N such that AC N (p) contains K for any p ∈ K. With N fixed, let x ∈ X ,t ∈ [0, 1], and n ≥ 1 be given. Then there is a sequence of points p 1 , q 1 , p 2 , q 2 , . . . , q m , p m+1 such that (1) m ≤ N, p 1 = F N−n (x, 0), and p m+1 = F N−n (x,t); (2) p k ∈ W s N (q k ); and (3) p k+1 ∈ W u N (q k ). Applying F N−n to this sequence, we define (a k , s k ) = F N−n (p k ) and (b k ,t k ) = F N−n (q k ) which satisfy b k ∈ W s n (a k ) and a k+1 ∈ W u 2N−n (b k ). As (a k , s k ) and (b k ,t k ) are on the same stable manifold in Σ × R, it follows that t k − s k = f n (b + k ) − f n (a + k ). By the unstable analogue of Lemma 9.2, one can show that d(b k , a k+1 ) ≤ θ n−2N+1 . That is, d(b k , a k+1 ) ≤ Cθ n where C = θ 1−2N . One can then check that x k = a + k and y k = b + k satisfy all of the conditions in the definition of collapsed accessibility.
10. PROOF OF THEOREM 3.9

We now prove Theorem 3.9. The strategy of the proof is to reduce the problem to the setting of Theorem 4.5.

10.1. Step 1: f only depends on future coordinates. Let us start with a preliminary step: we show that we can assume that the function f in (3) only depends on the future coordinates. From [42, Proposition 1.2], we inherit the following result.
When reducing to a one-sided shift, we will encounter some loss in regularity as in the previous lemma: the functions h and f + are Hölder with exponent 1/2. We can however replace θ with √ θ in the definition of the distance d θ to make them Lipschitz. This is not an issue, and we will freely replace θ with a suitable choice that makes the functions Lipschitz.
For any Φ ∈ G and ψ ∈ L , using Lemma 10.1, we can write Let us define Φ h (x, r) = Φ(x, r − h(x)) and ψ h (x, r) = ψ(x, r − h(x)). We change variable s = r + h(x) and we get where the skew-product F 1 is defined by F 1 (x, r) = (σ x, r + f + (x)). The map H(x, r) = (x, r + h(x)) used in the change of variable above is a conjugacy between F and F 1 , namely H • F = F 1 • H. Moreover, H is uniformly continuous (more precisely, it is Lipschitz with respect to the distance d √ θ , exactly as h), hence it preserves stable and unstable manifolds. In particular, F 1 is accessible.
The initial claim follows from the following lemma, whose proof is contained in Appendix C.

10.2. Step 2: observables only depend on future coordinates. In the previous subsection, we have seen that we can assume that f = f + ∈ F + θ (up to replacing θ with √ θ ). We now show that we can replace the observables Φ = Φ h ∈ G and ψ = ψ h ∈ L with observables in G + and in L + respectively: this is the content of Proposition 10.3 below. The proof follows the same lines as in [19]; however, in our case there are some additional difficulties in showing that the functions defined belong to G + and L + . In particular, we will need to use the assumption (TC) to ensure some compactness property in A . We postpone the proof to Appendix C.

Proposition 10.3. Let Φ ∈ G and ψ ∈ L . There exist constants K, M(Φ) ≥ 0, sequences {Φ m } m∈N ⊂ G + , {ψ m } m∈N ⊂ L + , and, for every ℓ ∈ N, there exist constants M(ψ, ℓ) and L(ψ, ℓ) such that the following properties hold for all ℓ, m, n ∈ N and x ∈ X : (i) ν av (Φ m ) = ν av (Φ) and ν(ψ m ) = ν(ψ),

From Proposition 9.5, it follows that the function f + in the definition of the one-sided skew-product F + has the collapsed accessibility property.

10.3. Step 3: end of the proof. We are now ready to prove Theorem 3.9. Let Φ ∈ G and ψ ∈ L , and fix k ∈ N and 0 < α < 1/2. Consider the sequence of functions {ψ m } m∈N ⊂ L + given by Proposition 10.3. By Lemma 4.2, their Fourier transforms ( ψ m ) ξ satisfy If we define a function w : (0, ∞) → [0, ∞) by w(ξ ) = sup m θ m ( ψ m ) ξ H , then these estimates imply that w(ξ ) decays rapidly in ξ in the sense of Section 7. We further define functions v n,m : (0, ∞) → [0, ∞) by v n,m (ξ ) = L n ξ ( ψ m ) ξ H , and we notice that v n,m and w satisfy the hypotheses of Proposition 7.5 and Proposition 7.6. Consequently, for any c > 0, the sequence {t n } n∈N defined by decays rapidly in n.
Fix n ∈ N and let m be the largest integer such that m < c log(n), where c = k/(− log(θ )); in particular By Proposition 10.3-(iv), we get hence it suffices to bound the first summand in the right-hand side above. By Proposition 4.6, we have The last summand in the right-hand side above is bounded by Mt n , hence decays rapidly. The first term, by Lemma 8.1, is bounded by which decays rapidly as well. Finally, for the second term, Lemma 8.2 implies In order to conclude the proof of Theorem 3.9, it suffices to establish the following lemma.

APPENDIX A. ACCESSIBILITY AND SYMBOLIC DYNAMICS
Let A : M → M be a diffeomorphism and let Ω ⊂ M be a transitive uniformly hyperbolic subset. One can construct a Markov partition on Ω and use it to define symbolic dynamics. That is, there is a subshift of finite type σ : Σ → Σ and a continuous surjective map π : Σ → Ω such that A • π = π • σ .
Let f : M → R be a continuous function which defines a skew product on M × R. This function then also defines a "symbolic skew product" Let u : Ω → R be a Hölder continuous function, and let µ = µ u be the unique equilibrium state for u. We want to obtain a quantitative mixing result for F with respect to ν = µ × Leb by applying Theorem 3.9 to the symbolic skew-product F sym . In this appendix, we discuss how the classes of good local and global observables and the accessibility property translate from the original system to the symbolic counterpart.
A.1. The observables. The classes of good local and global observables on M × R we consider are defined, respectively, as Hölder functions ψ : M → S from M to the space of Schwartz functions S and Hölder functions Φ : M → A from M to the Fourier-Stieltjes algebra A such that the tightness condition (TC) is satisfied (with x ∈ Σ replaced by p ∈ M). If we equip the symbolic system with the invariant measure ν sym = µ u•π × Leb, where µ u•π is the Gibbs measure with potential u • π, we obtain the following lemma.
Lemma A.1. If ψ is a good local observable on M × R, the function ψ sym := ψ • π is a good local observable for the symbolic system. Similarly, if Φ is a good global observable on M × R, then Φ sym := Φ • π is a good global observable for the symbolic system. Moreover, where the reference measure is ν on the left and ν sym on the right hand-side.
Proof. For any choice of θ ∈ (0, 1), the semiconjugacy π : Σ → Ω is Hölder continuous with respect to the distance d θ on Σ (see, e.g., [11,Lemma 4.2]). Hence, given any ψ and Φ as in the statement, the functions Φ sym and ψ sym are Hölder continuous and, up to taking a larger θ , they are actually Lipschitz. This shows that they are good local and global observables for the symbolic system.
By [11,Theorem 4.1], the equilibrium state µ = µ u can be expressed as the pushforward µ u = π * µ u•π . The final claim follows immediately from the semiconjugacy between F sym and A.
A.2. Accessibility. In order to apply Theorem 3.9 to F sym , one needs to address the following question.
Question A.2. If F| Ω×R is accessible, does it follow that F sym is accessible?
At first glance, the question might seem easy to answer, but there are some subtle issues here. As π is uniformly continuous, if points x and y lie on the same stable manifold in Σ, then they project down to points π(x) and π(y) lying on the same stable manifold in M. However, one can construct examples where π(x) and π(y) lie on the same stable manifold, but σ n (x) and σ n (y) stay far apart for all n ∈ Z. Hence, not all us-paths in M lift to us-paths in Σ. Despite this, we can establish accessibility of F sym in certain settings. We will also discuss the difficulties involved in the general case later in this appendix. We will adopt the notation used in Bowen's book on the subject [11]. In particular recall that if p and q are in the same rectangle, then [p, q] is the intersection of the local stable manifold of p with the local unstable manifold of q. If x = {x n } and y = {y n } are elements of the symbolic dynamics with the same "zeroth" symbol x 0 = y 0 , then [x, y] = z = {z n } is defined by z n = x n for n ≥ 0 and z n = y n for n ≤ 0 and one can show that π([x, y]) = [π(x), π(y)].
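The symbolic bracket described above can be sketched in a few lines of Python. As an assumption made only for illustration, a two-sided sequence is modelled by a dictionary on a finite window of indices, and the function name is hypothetical.

```python
def bracket(x: dict, y: dict) -> dict:
    """Return z = [x, y] with z_n = x_n for n >= 0 and z_n = y_n for n <= 0.

    Requires x_0 == y_0 so that the two prescriptions agree at index 0.
    """
    if x[0] != y[0]:
        raise ValueError("bracket needs the same zeroth symbol")
    common = set(x) & set(y)
    return {n: (x[n] if n >= 0 else y[n]) for n in common}

# Hypothetical sequences on the window -3..3 over the alphabet {0, 1, 2}.
x = {n: (0 if n == 0 else 1) for n in range(-3, 4)}
y = {n: (0 if n == 0 else 2) for n in range(-3, 4)}
z = bracket(x, y)   # future of x, past of y
```

The sequence z shares its non-negative coordinates with x and its non-positive coordinates with y, which is the symbolic counterpart of intersecting a local stable manifold with a local unstable manifold.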
For p and q in Ω, if q ∈ W s A (p), define $\Delta^s(p, q) = \sum_{n=0}^{\infty} \big( f(A^n q) - f(A^n p) \big)$ and note that (p, s) ∈ W s F (q,t) if and only if t − s = ∆ s (p, q). If q ∈ W u A (p), define ∆ u (p, q) analogously. For p and q in the same rectangle, define That is, h(p, q) measures the height of the "Brin quadrilateral" that has p and q as two of its four vertices. Note that h is continuous and if q is on the local stable or unstable manifold of p, then h(p, q) = 0.
Define ∆ s sym , ∆ u sym , and h sym using the same formulas, but with f sym in the place of f . Then on the cylinder C i ⊂ Σ consisting of the sequences whose zeroth symbol corresponds to R i , the function h sym : C i × C i → R is continuous and h sym (x, y) = h(π(x), π(y)).
Proof. Since h and γ are continuous, I = h(p, γ([0, 1])) is a positive length interval containing zero. For any t ∈ [0, 1], using the properties of symbolic dynamics we may find elements x, y ∈ C i such that π(x) = p and π(y) = γ(t). This implies that h sym (C i ×C i ) contains I. From this, one can show that F sym has an open accessibility class and then use this to conclude that F sym is accessible.

Proof. Choose a rectangle R i in a partition. Since A is Anosov and R i has interior, there is a periodic point p in the interior of R i . Adapting the proof of the "Unweaving Lemma", that is [48, Lemma A.4.3], we can make a small perturbation to any starting f in a small neighbourhood in order to find a point q with h(p, q) ≠ 0. We can then define a path γ : [0, 1] → R i from p to q and apply Proposition A.3.
Corollary A.5. If A : T 2 → T 2 is an Anosov diffeomorphism, then F is accessible if and only if F sym is accessible.
Proof. Here, A is topologically conjugate to a linear map [24], and so we assume A itself is linear. In this setting, we can find a Markov partition where the interior of each rectangle is homeomorphic to a disc and its boundary consists of two stable curves and two unstable curves. For such a construction, see for instance [45,Section 8]. If F sym is not accessible, then for each rectangle R i the stable and unstable directions of F are jointly integrable inside the region R i × R ⊂ M × R. This region is therefore foliated by C 1 surfaces (with boundary) tangent to E s ⊕ E u . As the rectangles meet along stable and unstable curves, we can "glue together" the leaves of neighbouring rectangles to produce a foliation on all of M × R tangent to E s ⊕ E u .
Remark A.6. The proof above is specific to the 2-torus. Bowen showed in higher dimensions that the boundaries of the rectangles are not smooth [10]. There are several slightly different constructions of Markov partitions [8,9,11,44]. In none of these constructions is connectedness of the rectangles discussed and it is not clear from the constructions if a rectangle is connected when the diffeomorphism is Anosov. Even if the construction begins with connected rectangles, complicated topological manipulations are necessary to turn this into a family of rectangles satisfying all of the axioms of a Markov partition. It is not clear how these steps affect connectedness.
It may conceivably be the case that the interior of a rectangle R i consists of infinitely many connected components. The union U = ⋃ i int R i of the interiors of all of the rectangles would then be an open and dense set with infinitely many connected components. Our reasoning above shows that, if F sym is not accessible, then the height function h(x, y) is constant when x and y are in the same connected component of U. Suppose a small neighbourhood V of a point p ∈ M intersects infinitely many connected components of U. Then the function V → R, q → h(p, q) may be a Cantor function or a "Devil's staircase" which is constant on an open and dense subset of V despite not being constant on all of V.
Here, the regularity of the stable and unstable foliations is important, as Cantor functions are possible with Hölder regularity, but not with Lipschitz regularity. Keeping p as above fixed, consider a point q near p. We analyze what happens as we move q along its unstable manifold until it intersects the stable manifold of p. In other words, we take q tending towards [p, q] while holding both p and [p, q] constant. In the formula defining h(p, q), the term ∆ u (p, [p, q]) will then be constant. The term ∆ u ([p, q], q) will depend in a C 1 manner on the point q since we are moving along a single unstable manifold and this single manifold is C 1 regular. The term ∆ u ([q, p], p) also depends in a C 1 manner on the point q for the same reason. The problematic term is ∆ s (q, [q, p]). As q moves along its path toward [p, q], it passes through different stable leaves. As a consequence, ∆ s (q, [q, p]), which measures the change in "height" of stable segments on M × R, is only as regular as the stable foliation on M × R.
In general, the stable foliation is only Hölder continuous [31] and so we cannot make further conclusions. In certain settings, however, we know that the stable foliation is C 1 and we can conclude that ∆ s (q, [q, p]) is C 1 and that its derivative is uniformly bounded. From this one can show that the function h is Lipschitz in the sense that h(x, y) is bounded by a constant times d(x, y). The pathology of Cantor functions may therefore be ruled out. Further, the C 1 nature of the stable foliations allows one to establish a property known as "uniform non-integrability" or UNI which is a quantitative form of accessibility and can be used to establish exponential mixing. This is, more or less, the argument applied in [3,4,16], where exponential mixing is established in a setting where the stable foliation is C 1 .
We now consider accessibility in the setting of hyperbolic attractors.

Proof. We will assume Ω is an attractor so that for each p ∈ Ω, the unstable manifold W u A (p) is a connected immersed submanifold lying entirely within Ω. As shown in [30], there is a neighbourhood of Ω on which one can define invariant stable and unstable foliations. We may find a periodic point p and a small neighbourhood U of p, such that every unstable manifold intersects U in a path-connected set.
As in the proof of Corollary A.4 above, one can adapt the "Unweaving Lemma" to perturb f and find p and q with h(p, q) ≠ 0. We define γ : [0, 1] → R i to be a path from [p, q] to q and then apply Proposition A.3.

APPENDIX B. PROOFS OF LEMMAS 3.1, 3.2, 3.6, AND 3.8

B.1. Proof of Lemma 3.1. Assume that f is cohomologous to zero, namely there exists a measurable function w such that f = w • σ − w. By the Livšic theorem, we can assume that w is continuous. We claim that for any x ∈ Σ, all the points that can be reached by an su-path from (x, w(x)) ∈ Σ × R are contained in the graph of w, i.e. in the set G(w) := {(y, w(y)) : y ∈ Σ}, which will give a contradiction with the accessibility assumption.
Let (x, w(x)) ∈ G(w); we now show that the whole stable set W s (x, w(x)) is fully contained in G(w). Let (y, s) ∈ W s (x, w(x)); then, y ∈ W s (x) and that is s = w(y). This implies W s (x, w(x)) ⊂ G(w). An analogous argument shows that W u (x, w(x)) ⊂ G(w).

B.2. Proof of Lemma 3.2. For any fixed R > 0, by the invariance of the Gibbs measure with respect to the dynamics, we have The last term above converges to zero as R → ∞, hence the limit exists and equals ν av (Φ).
B.3. Proof of Lemma 3.6. By definition, we can write We have to show that the limit (4) exists; in order to do this, we prove that for any x ∈ Σ we have from which the claim follows. For any R > 0, We have that | sin(Rξ )/(Rξ )| ≤ 1 (and 1 ∈ L 1 (|η x |), since the total variation of η x is finite) and sin(Rξ )/(Rξ ) converges to 0 as R → ∞ for all ξ ≠ 0. Lebesgue's dominated convergence theorem yields (13), which completes the proof.
B.4. Proof of Lemma 3.8. Let us show that for all ε > 0 we have Then, simply by choosing ε = 2/K, we conclude y). In both cases, the first term satisfies a Lipschitz bound, independent of r. For the second term, if |r| ≤ 4 h ∞ , then |r| a |∂ ℓ ψ(x, r − h(y)) − ∂ ℓ ψ(y, r − h(y))| ≤ (4 h ∞ ) a ψ(x) − ψ(y) 0,ℓ , and the Lipschitz bound follows from the assumption on ψ; otherwise, if |r| > 4 h ∞ , then and again the conclusion follows from the assumption on ψ. This concludes the proof of the claims on ψ.
Let now Φ ∈ G . Then, for any x ∈ Σ, the function Φ h (x)(r) = Φ(x, r − h(x)) is the Fourier-Stieltjes transform of the measure d(η h ) x = e iξ h(x) dη x . In particular, the total variations coincide, |(η h ) x | = |η x |, and, from Lemma 3.6, it also follows that ν av (Φ h ) = ν av (Φ). Again, the only claim left to be shown is the Lipschitz assumption. We will exploit here the tightness condition (TC).
Fix x, y ∈ Σ. Then, The second summand in the right-hand side above satisfies a Lipschitz bound by assumption. We need to verify the same for the first term.

C.2. Proof of Proposition 10.3. This section is devoted to the proof of Proposition 10.3. The strategy follows the same lines as in [19] and [42, Proposition 1.2], although there are additional difficulties in showing that the functions Φ and ψ belong to G + and L + respectively. Let α : Σ × R → R be any Lipschitz function, with Lipschitz constant L(α). Fix m ∈ N. The first step is to prove that there exists a function β : Σ × R → R such that the function α + = (α • F m ) + β • F − β depends only on the future coordinates. In other words, we want to show that α • F m is cohomologous to a function defined on X × R. We recall the construction of β for the reader's convenience. For any cylinder C n, j := C −n,o (x j ), choose an element ω n, j ∈ C n, j . Define the element ω n (x) ∈ Σ in the following way: Notice that by definition d θ (ω n (x), x) ≤ θ n for all x ∈ Σ. Define α (n) (x, r) := α(ω n (x), r + δ n (x)), where δ n (x) := f n (ω n (x)) − f n (x).
Proof. In order to prove the result, it is sufficient to show that for every x ∈ Σ, the function β (x, ·) is Schwartz and the functions x → ∂ ℓ β (x, ·) are Lipschitz with respect to the L 1 -norm. Since, for every x ∈ Σ and ℓ ∈ N, the series ∞ ∑ n=m ∂ ℓ α(σ n x, r + f n (x)) − ∂ ℓ α(ω n (σ n x), r + f n (x) + δ n (σ n x)) converges uniformly, the derivative ∂ ℓ β (x, ·) exists and equals the series above. We now show that β (x, ·) is a Schwartz function for every x ∈ Σ.
Let $a, \ell \in \mathbb{N}$. We need to show that $r^a \partial^\ell \beta(x, r)$ is uniformly bounded in $r$. To lighten notation, let us write $x_n = \sigma^n x$, $y_n = \omega_n(\sigma^n x)$ and $\delta_n = \delta_n(\sigma^n x)$. We have
\[
|r^a \partial^\ell \beta(x, r)| \le |r|^a \sum_{n=m}^{\infty} \bigl| \partial^\ell \alpha(x_n, r + f_n(x)) - \partial^\ell \alpha(y_n, r + f_n(x)) \bigr| + |r|^a \sum_{n=m}^{\infty} \bigl| \partial^\ell \alpha(y_n, r + f_n(x)) - \partial^\ell \alpha(y_n, r + f_n(x) + \delta_n) \bigr| .
\]
Each term in the sums above satisfies a Lipschitz bound exactly as in the proof of Lemma 10.2. Since the terms $d(x_n, y_n)$ and $\delta_n$ can be bounded by $O(\theta^n)$, the series above converge. Therefore, $|r^a \partial^\ell \beta(x, r)|$ is uniformly bounded for all $a, \ell \in \mathbb{N}$, hence $\beta(x, \cdot)$ is a Schwartz function.
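As a rough sketch of why the two series converge, suppose (an assumption, not a statement taken from the text) that $f$ is bounded, so that $|f_n(x)| \le n \|f\|_\infty$, and that the $n$-th term of each sum, multiplied by $|r|^a$, is bounded by $C (1+n)^a \theta^n$. Here $C$ is a hypothetical constant depending on $\alpha$, $a$, $\ell$ and $f$ but not on $r$ or $x$; the polynomial factor accounts for the shift $r \mapsto r + f_n(x)$ in the weight $|r|^a$. Under these assumptions the bound is uniform because

```latex
\begin{align*}
|r^a \partial^\ell \beta(x, r)|
  &\le 2C \sum_{n=m}^{\infty} (1+n)^a\, \theta^n < \infty ,
  \qquad \text{since} \quad
  \lim_{n \to \infty} \frac{(2+n)^a\, \theta^{n+1}}{(1+n)^a\, \theta^{n}} = \theta < 1 .
\end{align*}
```

For $a = 0$ the sum reduces to the geometric series $2C\, \theta^m / (1 - \theta)$.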
The Lipschitz bounds on the functions $x \mapsto \partial^\ell \beta(x, \cdot)$ with respect to the $L^1$-norm can be proved in a similar way and are left as an exercise to the reader. We remark that if $\alpha$ is a Lipschitz function with constant $L(\alpha)$, then $\alpha \circ F^m$ is Lipschitz with constant $L(\alpha \circ F^m) \le \theta^{-m} L(\alpha)$.
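For orientation, the telescoping behind the cohomology relation $\alpha^+ = (\alpha \circ F^m) + \beta \circ F - \beta$ can be written out explicitly. The sketch below assumes that $\beta$ is the $\ell = 0$ case of the series used in the proof, that is $\beta = \sum_{n \ge m} (\alpha - \alpha^{(n)}) \circ F^n$, and that $F^n(x, r) = (\sigma^n x, r + f_n(x))$. Both are read off from the formulas in the proof rather than quoted from a definition, and the rearrangement relies on the absolute convergence of the series.

```latex
\begin{align*}
\beta \circ F - \beta
  &= \sum_{n \ge m+1} \bigl(\alpha - \alpha^{(n-1)}\bigr) \circ F^{n}
   - \sum_{n \ge m} \bigl(\alpha - \alpha^{(n)}\bigr) \circ F^{n} \\
  &= -\bigl(\alpha - \alpha^{(m)}\bigr) \circ F^{m}
   + \sum_{n \ge m+1} \bigl(\alpha^{(n)} - \alpha^{(n-1)}\bigr) \circ F^{n}, \\
\alpha^+ = \alpha \circ F^m + \beta \circ F - \beta
  &= \alpha^{(m)} \circ F^{m}
   + \sum_{n \ge m+1} \bigl(\alpha^{(n)} - \alpha^{(n-1)}\bigr) \circ F^{n} .
\end{align*}
```

In this form the original $\alpha$ has dropped out, and only the approximations $\alpha^{(n)}$ and $\alpha^{(n-1)}$ composed with $F^n$ remain.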
We now show the analogous result if we assume that α is a global observable.
Lemma C.2. If $\alpha \in \mathcal{G}$, then $\alpha^+ \in \mathcal{G}^+$, and $\|\alpha^+\|_{\mathcal{G}^+} \le \|\alpha\|_{\mathcal{G}}$.

Proof. It suffices to show that for any $x \in \Sigma$, $\beta(x, \cdot)$ is the Fourier-Stieltjes transform of a complex measure $\eta_x$ and moreover the total variation $\|\eta_x\|_{TV}$ is uniformly bounded.
From the expression for $\beta$ above, by definition, the Fourier-Stieltjes transforms $\widehat{\zeta}_N$ of $\zeta_N$ converge uniformly to $\beta(x, \cdot)$. In order to conclude that $\beta(x, \cdot)$ is the Fourier-Stieltjes transform of a complex measure, it is enough to show that the family of measures $\zeta_N$ is contained in a weakly compact set. We will proceed as in the proof of Lemma 10.2 and use the tightness condition (TC). We will show this separately for ζ The total variation norm is stronger than the weak-convergence topology, hence the sequence of measures $\zeta^{(1)}_N$ converges weakly (since the tails are exponentially small). For the second term, we apply Prokhorov's theorem: it suffices to show that the sequence $\zeta^{(2)}_N$ is uniformly bounded in total variation norm and is tight. Notice that the variation of ζ This proves tightness and hence concludes the proof.