Doubly Intermittent Full Branch Maps with Critical Points and Singularities

We study a class of one-dimensional full branch maps admitting two indifferent fixed points as well as critical points and/or unbounded derivative. Under some mild assumptions we prove the existence of a unique invariant mixing absolutely continuous probability measure, study its rates of decay of correlations and prove a number of limit theorems.


Introduction
The purpose of this paper is to study the ergodic properties of a large class of full branch interval maps with two branches, including maps with two indifferent fixed points (which, as we shall see below, affect both the results and the construction of the induced map which we require). We also allow the derivative to go to zero as well as to infinity at the boundary between the two branches, and we do not assume any symmetry: even the domains of the two branches may have arbitrary lengths. Such maps are known to exhibit a wide range of behaviour from an ergodic point of view and many of them have been extensively studied; we give a detailed literature review below.
In Section 1.1 we give the precise definition of the class of maps we consider, which includes many cases already studied in the literature as well as many cases which have not yet been studied; in Section 1.2 we give the precise statements of our results; in Section 1.3 we give a literature review of related results and include specific examples of maps in our family; in Section 2 we give a detailed outline of our proof, emphasising several novel aspects of our construction and arguments. Then in Section 3 we give the construction and estimates related to our "double-induced" map, and in Section 4 we apply these estimates to complete the proofs of our results.

Full Branch Maps
We start by defining the class of maps which we consider in this paper. Let I, I−, I+ be compact intervals, let I̊, I̊−, I̊+ denote their interiors, and suppose that I = I− ∪ I+ and I̊− ∩ I̊+ = ∅.

(A0) g : I → I is full branch: the restrictions g− : I̊− → I̊ and g+ : I̊+ → I̊ are orientation preserving C^2 diffeomorphisms, and the only fixed points of g are the endpoints of I.
To simplify the notation we will assume that I = [−1, 1], I− = [−1, 0] and I+ = [0, 1], but our results and proofs are easily seen to hold in the general setting.
If k1 = 1 and/or k2 = 1, then we replace the corresponding lines in (1) with the assumption that g′(0−) = a1 > 1 and/or g′(0+) = a2 > 1 respectively, and that g is monotone in the corresponding neighbourhood.
Remark 1.1. It is easy to see that the definition in (1) yields maps with dramatically different derivative behaviour depending on the values of ℓ1, ℓ2, k1, k2, including having neutral or expanding fixed points and points with zero or infinite derivative; see Remark 1.3 for a detailed discussion.
For the moment we just remark that the assumptions described in part ii) of condition (A1) are consistent with (1) but significantly relax the definition given there as in these cases (1) would imply that the map is affine in the corresponding neighbourhood, whereas we only need expansivity.
In particular this allows us to include uniformly expanding maps in our class of maps. In the calculations below we will explicitly consider the cases ℓ1 = 0 and/or ℓ2 = 0, which correspond to assuming that one or both of the fixed points are expanding instead of neutral, since they yield different estimates (several quantities decay exponentially rather than polynomially in these cases) and different results, and still include some maps which, as far as we know, have not been studied in the literature. For simplicity, on the other hand, we will not consider explicitly the cases k1 = 1 and/or k2 = 1, which just correspond to assuming that the derivative on one or both sides of the discontinuity is finite instead of being zero or infinite. These correspond to much simpler special cases and the required estimates follow by arguments which are very similar to those we give here, and which are essentially already considered in the literature, but treating them explicitly would require a significant amount of additional notation and calculations.
Our final assumption can be intuitively thought of as saying that g is uniformly expanding outside the neighbourhoods U0± and U±1. This is however much stronger than what is needed, and we therefore formulate a weaker and more general assumption, for which we need to describe some aspects of the topological structure of maps satisfying condition (A0). First of all we define ∆−0 := g−1(0, 1) ∩ I− and ∆+0 := g−1(−1, 0) ∩ I+. (4) Then we define iteratively, for every n ≥ 1, the sets ∆−n, ∆+n as the n'th preimages of ∆−0, ∆+0 inside the intervals I−, I+. It follows from (A0) that {∆−n}n≥0 and {∆+n}n≥0 are mod 0 partitions of I− and I+ respectively, and that the partition elements depend monotonically on the index, in the sense that n > m implies that ∆±n is closer to ±1 than ∆±m; in particular the only accumulation points of these partitions are −1 and 1 respectively. Then, for every n ≥ 1, we define the sets δ−n and δ+n. Notice that {δ−n}n≥1 and {δ+n}n≥1 are mod 0 partitions of ∆−0 and ∆+0 respectively, and also in these cases the partition elements depend monotonically on the index, in the sense that n > m implies that δ±n is closer to 0 than δ±m (in particular the only accumulation point of these partitions is 0). We now define two non-negative integers n± which depend on the positions of the partition elements δ±n and on the sizes of the neighbourhoods U0± on which the map g is explicitly defined. If ∆−0 ⊆ U0− and/or ∆+0 ⊆ U0+, we define n− = 0 and/or n+ = 0 respectively; otherwise we let n− and/or n+ be the largest indices n for which δ−n and/or δ+n are not contained in U0− and U0+ respectively. We can now formulate our final assumption as follows.
Notice that (A2) is an expansivity condition for points outside the neighbourhoods U0± and U±1, but it is much weaker than assuming that the derivative of g is greater than 1 outside these neighbourhoods, which would be unnatural and unnecessarily restrictive in the presence of critical points. This completes the set of conditions which we require, and for convenience we let F denote the class of maps satisfying conditions (A0)-(A2). The class F contains many maps which have been studied in the literature, including uniformly expanding maps and various well known intermittency maps with a single neutral fixed point. We will give a more in-depth literature review in Section 1.3. Here we make a few technical remarks concerning these assumptions before proceeding to state our results in the next subsection.
Remark 1.2 (Remark on notation). To simplify many statements made throughout the paper, it will be useful to recall some relatively standard notation. Given sequences (sn) and (tn) of non-negative terms, we write sn ≲ tn if there exists a constant C > 0 such that sn ≤ C tn for all n, and sn ≈ tn if both sn ≲ tn and tn ≲ sn.

Remark 1.3. Changing the parameter values ℓ1, ℓ2, k1, k2 gives rise to maps with quite different characteristics. For example, if ℓ1 > 0 then g′(−1) = 1 and the fixed point −1 is a neutral fixed point. Similarly, when ℓ2 > 0 the fixed point 1 is a neutral fixed point. On the other hand, when ℓ1 = 0, from (3) we have that the fixed point −1 is hyperbolic repelling with g′(−1) = 1 + b. When k1 ≠ 1 we have two possibilities: k1 ∈ (0, 1) implies that |(g|U0−)′(x)| → ∞ as x → 0, in which case we say that g|U0− has a (one-sided) singularity at 0, whereas k1 > 1 implies that |(g|U0−)′(x)| → 0 as x → 0, and therefore we say that g|U0− has a (one-sided) critical point at 0. Analogous observations hold for the various values of ℓ2 and k2, and Figure 1 shows the graph of g for various combinations of these exponents.
For future reference we mention also some additional properties which follow from (A1). First of all, notice that if ℓ1 ∈ (0, 1) we have g′′(x) → ∞, but if ℓ1 > 1 we have g′′(x) → 0, as x → −1, and, as we shall see, this qualitative difference in the higher order derivative plays a crucial role in the ergodic properties of g. Analogous observations apply to g|U1 when ℓ2 > 0. Secondly, a derivative bound holds for every x ∈ U−1, and an analogous bound holds for x ∈ U1. Similarly, a bound holds in U0, and notice that in this case the bound does not actually depend on the value of k1 or k2, and in particular does not depend on whether we have a critical point or a singularity. Finally, we note that when ℓ1 = 0, it follows from (9) and from the assumption that ξ′′(x) > 0 that ξ(x)/(1 + x) ≤ ξ′(x) for every x ∈ U−1. (13) Indeed, notice that 1 + x is just the distance between x and −1, and thus ξ(x)/(1 + x) is the slope of the straight line joining the point (−1, 0) to (x, ξ(x)) in the graph of ξ, which is exactly the average derivative of ξ in the interval [−1, x]. Since ξ′′ > 0, the derivative ξ′ is monotone increasing and thus maximal at the endpoint x, which implies (13).
The same statement of course holds for ℓ 2 = 0 and for all x ∈ U +1 .

Statement of Results
Our first result is completely general and applies to all maps in F.
Theorem A. Every g ∈ F admits a unique (up to scaling by a constant) invariant measure which is absolutely continuous with respect to Lebesgue; this measure is σ-finite and equivalent to Lebesgue.

This is perhaps not completely unexpected, but it is also certainly not obvious in the full generality of the maps in F, especially for maps which admit critical points (which can, moreover, be of arbitrarily high order). Our construction gives some additional information about the measure given in Theorem A, in particular the fact that its density with respect to Lebesgue is locally Lipschitz and unbounded only at the endpoints ±1. We will show that, depending on the exponents k1, k2, ℓ1, ℓ2, the density may or may not be integrable, and so the measure may or may not be finite. More specifically, we define below two exponents β1 and β2 in terms of k1, k2, ℓ1, ℓ2, and show that the density is Lebesgue integrable at −1 or 1 if and only if β1 < 1 or β2 < 1 respectively. In particular, letting β := max{β1, β2}, we have the following result.
Theorem B. A map g ∈ F admits a unique ergodic invariant probability measure µg absolutely continuous with respect to (indeed equivalent to) Lebesgue if and only if β < 1.
Notice that the condition β < 1 is a restriction only on the relative values of k 1 with respect to ℓ 2 and of k 2 with respect to ℓ 1 .It still allows k 1 and/or k 2 to be arbitrarily large, thus allowing arbitrarily "degenerate" critical points, as long as the corresponding exponents ℓ 2 and/or ℓ 1 are sufficiently small, i.e. as long as the corresponding neutral fixed points are not too degenerate.
We now give several non-trivial results about the statistical properties of maps g ∈ F with respect to the probability measure µg. To state our first result, recall that the measure-theoretic entropy of g with respect to the measure µ is defined as h_µ(g) := sup_P lim_{n→∞} (1/n) H_µ(P^n), where the supremum is taken over all finite measurable partitions P of the underlying measure space and P^n := P ∨ g−1P ∨ · · · ∨ g−nP is the dynamical refinement of P by g.

Theorem C. Let g ∈ F. Then µg satisfies the Pesin entropy formula: h_µg(g) = ∫ log |g′| dµg.
For Hölder continuous functions φ, ψ : [−1, 1] → R and n ≥ 1, we define the correlation function C_n(φ, ψ) := |∫ (φ ∘ g^n) ψ dµg − ∫ φ dµg ∫ ψ dµg|. It is well known that µg is mixing if and only if C_n(φ, ψ) → 0 as n → ∞. We say that µg is exponentially mixing, or satisfies exponential decay of correlations, if there exists λ > 0 such that for all Hölder continuous functions φ, ψ there exists a constant C_φ,ψ such that C_n(φ, ψ) ≤ C_φ,ψ e^{−λn}. We say that µg is polynomially mixing, or satisfies polynomial decay of correlations, with rate α > 0 if for all Hölder continuous functions φ, ψ there exists a constant C_φ,ψ such that C_n(φ, ψ) ≤ C_φ,ψ n^{−α}. Notice that the polynomial rate of decay of correlations (1 − β)/β itself decays to 0 as β approaches 1, which is the transition parameter at which the invariant measure ceases to be finite. Intuitively, as β → 1, the measure, while still equivalent to Lebesgue, is increasingly concentrated in neighbourhoods of the neutral fixed points, which slow down the decay of correlations.
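As a toy illustration of these definitions (for the doubling map, not the maps studied in this paper), the correlation function can be computed directly: Lebesgue measure is invariant for g(x) = 2x mod 1 and the correlations of φ(x) = x − 1/2 decay exponentially. A minimal Python sketch approximating C_n by a midpoint Riemann sum:

```python
def corr(n, N=1 << 16):
    """Approximate C_n = |∫ φ(g^n x) φ(x) dx| for the doubling map
    g(x) = 2x mod 1 and φ(x) = x - 1/2 (mean zero for Lebesgue),
    using a midpoint Riemann sum with N points."""
    total = 0.0
    for i in range(N):
        x = (i + 0.5) / N
        y = (2 ** n * x) % 1.0  # g^n(x) computed directly
        total += (y - 0.5) * (x - 0.5)
    return abs(total / N)
```

For this example one can check by hand that C_n = 2^{-n}/12, consistent with exponential mixing with λ = log 2.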
Our final result concerns a number of limit theorems for maps g ∈ F, which depend on the parameters of the map and, in some cases, also on some additional regularity conditions. These are arguably some of the most interesting results of the paper, and those in which the existence of two indifferent fixed points, instead of just one, really comes into play, giving rise to quite a complex scenario of possibilities. We start by recalling the relevant definitions. For integrable functions φ with ∫ φ dµ = 0 we define the following limit theorems.
CLT: φ satisfies a central limit theorem with respect to µ if there exist σ² ≥ 0 and an N(0, σ²) random variable V such that µ((φ + φ ∘ g + · · · + φ ∘ g^{n−1})/√n ≤ x) → µ(V ≤ x) as n → ∞, for every x ∈ R at which the function x ↦ µ(V ≤ x) is continuous.
CLT_ns: φ satisfies a non-standard central limit theorem with respect to µ if there exist σ² ≥ 0 and an N(0, σ²) random variable V such that µ((φ + φ ∘ g + · · · + φ ∘ g^{n−1})/√(n log n) ≤ x) → µ(V ≤ x) as n → ∞, for every x ∈ R at which the function x ↦ µ(V ≤ x) is continuous.
SL_α: φ satisfies a stable law of index α ∈ (1, 2) with respect to a measure µ if there exists a stable random variable W_α of index α such that µ((φ + φ ∘ g + · · · + φ ∘ g^{n−1})/n^{1/α} ≤ x) → µ(W_α ≤ x) as n → ∞, for every x ∈ R at which the function x ↦ µ(W_α ≤ x) is continuous.
Finally, we say that an observable φ : [−1, 1] → R is a coboundary if φ = χ ∘ g − χ for some measurable function χ. We are now ready to state our result on the various limit theorems which hold under some conditions on the parameters and on the observable φ. In order to state these conditions it is convenient to introduce an auxiliary exponent β_φ, determined by the exponents β1, β2 and by the values of φ at the two fixed points. We can then state our results in all cases in a clear and compact way as follows.
Theorem E. Let g ∈ F and let φ : [−1, 1] → R be Hölder continuous with ∫ φ dµ = 0 and satisfying condition (H), where ν1, ν2 are the Hölder exponents of φ|[−1,0] and φ|(0,1] respectively. Then the limit theorems (31)-(33) hold according to the values of β and β_φ. In case 3 we can replace the Hölder continuity condition (H) by the weaker (in this case) condition (H′). Moreover, in all cases where the CLT holds, we have σ² = 0 if and only if φ is a coboundary.
Remark 1.4. Our results highlight the fundamental significance of the values of the observable φ at the two fixed points, and show how the fixed point at which φ is non-zero, in some sense, dominates and determines the kind of limit law which the observable satisfies. If φ is non-zero at both fixed points, then it is the larger exponent which dominates.
Remark 1.5. Note that (H) and (H′) are automatically satisfied for various ranges of β1, β2: for example, if β ≤ 1/2 then (H) always holds, and if β = β_φ then (H′) always holds. These Hölder continuity conditions arise as technical conditions in the proof, and it is not clear to us whether they are really necessary or what could be proved without them. It may be the case, for example, that some limit theorems still hold under weaker regularity conditions on φ.
Remark 1.6. We remark also that the compact statement of Theorem E somewhat "conceals" quite a large number of cases, which express an intricate relationship between the map parameters and the values and regularity of the observable. For example, the case β_φ = 0 allows all possible values β1, β2 ∈ [0, 1), and the case β_φ = β1 allows all possible values of β2 ∈ [0, 1). We therefore have a huge number of possible combinations which do not occur for maps with just a single intermittent fixed point.

Examples and Literature Review
There is an extensive literature on the dynamics and statistical properties of full branch maps, which have been studied systematically since the 1950s. Their importance stems partly from the fact that they occur very naturally, for example any smooth non-invertible local diffeomorphism of S¹ is a full branch map, but also, and perhaps most importantly, from the fact that many arguments in Smooth Ergodic Theory apply in this setting in a particularly clear and conceptually straightforward way. Indeed, arguably, most existing techniques used to study hyperbolic (including non-uniformly hyperbolic) dynamical systems are essentially (albeit often highly non-trivial) extensions and generalisations of methods first introduced and developed in the setting of one-dimensional full branch maps. Our class of maps F is quite general and includes many one-dimensional full branch maps which have been studied in the literature, as well as many maps which have not been previously studied. We give below a brief survey of some of these examples and indicate for which choices of parameters they correspond to maps in our family.
Arguably one of the very first and simplest general classes of maps for which the existence of an ergodic invariant absolutely continuous probability measure was proved is that of uniformly expanding full branch maps with derivatives uniformly bounded away from 0 and infinity, a result often referred to as the Folklore Theorem and generally attributed to Rényi. Some particularly simple examples of uniformly expanding maps are the piecewise affine maps (15) with parameter a > 1, see Figure 2a; these are easily seen to be contained in the class F. In the late 1970s, the physicists Manneville and Pomeau [Pom] introduced a simple but extremely interesting generalisation consisting of a class of full branch one-dimensional maps g : [0, 1] → [0, 1], which they called intermittency maps, defined by (16) for α > 0, see Figure 2b (notice that for α = 0 this just gives the map g(x) = 2x mod 1, which is (15) with a = 2). These maps can be seen to be contained in our class F by a suitable choice of the parameters (ℓ1, ℓ2, k1, k2), where a = g′(x0) and x0 ∈ (0, 1) is the boundary of the intervals on which the two branches of the map are defined. The Manneville-Pomeau maps are interesting because the uniform expansivity condition fails at a single fixed point on the boundary of the interval, where we have g′(0) = 1. Their motivation was to model fluid flow in which long periods of stable flow are followed by intermittent phases of turbulence, and they showed that this simple model does indeed seem to exhibit such dynamical behaviour. It was then shown in [Pia] that for α > 2 the intermittency maps fail to have an ergodic invariant absolutely continuous probability measure and satisfy the remarkable property that the time averages of Lebesgue almost every point converge to the Dirac-delta measure δ0 at the neutral fixed point, even though these orbits are dense in [0, 1] and the fixed point is topologically repelling. Various variations of intermittency maps have been studied extensively
from various points of view and with different techniques, yielding quite deep results, see e.g. [Liv; You; Sar; Mel; Pol; Fis; Goua; Goub; Nic; Coa; Fre; Kor; Bah; Ter; She; Zwe]. One well known version is the so-called Liverani-Saussol-Vaienti (LSV) map g : [0, 1] → [0, 1] introduced in [Liv] and defined by (17) with parameter α > 0, see Figure 3a.

Figure 2: (a) graph of (15) with a = 5; (b) graph of (16) with α = 9/10.

This maintains the essential features of the Manneville-Pomeau maps (16), i.e. it is uniformly expanding except at the neutral fixed point at the origin, but in slightly simplified form: the two branches are always defined on the fixed domains [0, 1/2] and (1/2, 1], and the second branch is affine, both of which make the family easier to study, including the effect of varying the parameter. The family of LSV maps (17) can be seen to be contained in our class F by a suitable choice of the parameters. In an earlier paper [Pik], Pikovsky had introduced the maps g : S¹ → S¹ defined (in a somewhat unwieldy way) by an implicit equation for x ∈ [0, 1), and then by the symmetry g(x) = g(−x) for x ∈ (−1, 0], see Figure 3b. These maps have a neutral fixed point at the left endpoint, as in (16) and (17), but with the added complication of having unbounded derivative at the boundary between the domains of the two branches. On the other hand, the definition is specifically designed in such a way that the order of intermittency is the inverse of the order of the singularity and, together with the symmetry of the two branches, this implies that Lebesgue measure is invariant for all values of the parameter α > 0.
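For concreteness, the LSV family (17) is usually written g(x) = x(1 + 2^α x^α) for x ∈ [0, 1/2] and g(x) = 2x − 1 for x ∈ (1/2, 1]. The sketch below (illustrative only) exhibits the intermittent behaviour: the first return time to (1/2, 1] grows as the orbit re-enters [0, 1/2] closer and closer to the neutral fixed point at 0.

```python
def lsv(x, alpha):
    # Liverani-Saussol-Vaienti map (standard form):
    # neutral fixed point at 0, affine right branch
    if x <= 0.5:
        return x * (1.0 + (2.0 * x) ** alpha)
    return 2.0 * x - 1.0

def return_time(y, alpha):
    """First return time of y in (1/2, 1] back to (1/2, 1]."""
    x, n = lsv(y, alpha), 1
    while x <= 0.5:
        x = lsv(x, alpha)
        n += 1
    return n
```

A point entering the left interval at distance 2^{-k} from 0 takes on the order of 2^{αk} iterates to escape, which is the mechanism behind the polynomial tails of the return time.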
Ergodic and statistical properties of these maps were studied in [Alvb; Cri; Bos], and they can be seen to be contained in our class F by a suitable choice of the parameters. Finally, [Ino; Cui] consider a class of maps, see Figure 3c for an example, with a single intermittent fixed point and multiple critical points, with each critical point mapping to the fixed point. These include some maps which are more general than those we consider here, as they are defined near the fixed and critical points through some bounds rather than explicitly as we do here, but they are also more restrictive in that they only allow a single neutral fixed point. Under a condition on the product of the orders of the neutral fixed point and of the (most degenerate) critical point, which is exactly analogous to our condition β < 1, the existence of an ergodic invariant probability measure is proved, and this measure exhibits decay of correlations, but no bounds are given for the rate of decay and no limit theorems are obtained.

Overview of the proof
We discuss here our overall strategy and prove our Theorems modulo some key technical Propositions which we then prove in the rest of the paper.Our argument can be naturally divided into three main steps which we describe in some detail in the following three subsections.

The induced map
The first step of our argument is the construction of an induced full branch Gibbs-Markov map, also known as a Young Tower. This is relatively standard for many systems, including intermittent maps; however, the inducing domain which we are obliged to use here, due to the presence of two indifferent fixed points, is different from the usual inducing domains and requires a more sophisticated double inducing procedure, which we outline here and describe and carry out in detail in Section 3. Recall the definition of ∆−0 in (4) and, for x ∈ ∆−0, let τ(x) := min{n ≥ 1 : g^n(x) ∈ ∆−0} be the first return time to ∆−0. Then we define the first return induced map G := g^τ : ∆−0 → ∆−0. (19) We say that a first return map (or, more generally, any induced map) saturates the interval I if the images of its level sets {τ = n} under the iterates g^i, 0 ≤ i < n, cover I mod 0. (20) Intuitively, saturation means that the return map "reaches" every part of the original domain of the map g, and thus the properties and characteristics of the return map reflect, to some extent, all the relevant characteristics of g.
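The first return construction can be illustrated on a toy uniformly expanding full branch map on [−1, 1] with fixed points ±1 (a piecewise affine sketch, not the general map g of (1)):

```python
def g(x):
    # piecewise affine full branch map on [-1, 1]: two orientation preserving
    # branches on [-1, 0) and [0, 1] whose only fixed points are -1 and 1
    return 2.0 * x + 1.0 if x < 0.0 else 2.0 * x - 1.0

# Delta_0^- = g^{-1}((0, 1)) intersected with (-1, 0) = (-1/2, 0) here
DELTA_LO, DELTA_HI = -0.5, 0.0

def first_return_time(x):
    """tau(x) = min{n >= 1 : g^n(x) in Delta_0^-} for x in Delta_0^-."""
    y, n = g(x), 1
    while not (DELTA_LO < y < DELTA_HI):
        y, n = g(y), n + 1
    return n

def induced_map(x):
    """The first return induced map G = g^tau on Delta_0^-."""
    y = x
    for _ in range(first_return_time(x)):
        y = g(y)
    return y
```

For instance, the orbit of −0.3 visits 0.4 and then lands at −0.2 ∈ ∆−0, so τ(−0.3) = 2; points closer to the boundary of ∆−0 can take longer excursions through both halves of the interval before returning.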
Remark 2.1. If G is a first return induced map, as in our case, then the sets of the form g^i({τ = n}), 0 ≤ i < n, are pairwise disjoint and therefore form a partition of I mod 0.
The first main result of the paper is the following.
We give the precise definition of Gibbs-Markov map, and prove Proposition 2.2, in Section 3. In Section 3.1 we describe the topological structure of G and show that it is a full branch map with countably many branches which saturates I (we will define G as a composition of two full branch maps, see (37) and (40), which is why we call the construction a double inducing procedure); in Section 3.2 we obtain key estimates concerning the sizes of the elements of the corresponding partition; in Section 3.3 we show that G is uniformly expanding; in Section 3.4 we show that G has bounded distortion. From these results we obtain Proposition 2.2, from which we can then deduce our first main theorem.
Proof of Theorem A. By standard results, G admits a unique ergodic invariant probability measure μ̂−, supported on ∆−0, which is equivalent to Lebesgue measure m and has Lipschitz continuous density ĥ− = dμ̂−/dm bounded above and below. We then "spread" the measure over the original interval I by defining the measure μ as in (21). Again by standard arguments, μ is a σ-finite measure which is ergodic and invariant for g and, using the non-singularity of g, absolutely continuous with respect to Lebesgue. The fact that G saturates I implies moreover that μ is equivalent to Lebesgue, which completes the proof. □

Remark 2.3. We emphasise that we are not assuming any symmetry in the two branches of the map g. It is not important that the branches are defined on intervals of the same length and, depending on the choice of constants, we might even have a critical point in one branch and a singularity with unbounded derivative in the other. Interestingly, however, there is some symmetry in the construction, in the sense that for x ∈ ∆+0 we can define the first return map G+ : ∆+0 → ∆+0 in a completely analogous way to the definition of G above (see the discussion in Section 3.1). Moreover, the conclusions of Proposition 2.2 hold for G+, and thus G+ admits a unique ergodic invariant probability measure μ̂+ which is equivalent to Lebesgue measure m and such that the density ĥ+ := dμ̂+/dm is Lipschitz continuous and bounded above and below. The two maps G and G+ are clearly distinct, as are the measures μ̂− and μ̂+, but they exhibit a subtle kind of symmetry in the sense that the measure μ obtained by substituting μ̂+ for μ̂− in (21) is, up to a constant scaling factor, exactly the same measure.

Corollary 2.4. The density h := dμ/dm is Lipschitz continuous and bounded away from 0 and infinity on ∆−0 ∪ ∆+0.

Proof. Since G is a first return induced map, the measure μ defined in (21) satisfies μ|∆−0 = μ̂−, and so the density h of μ is Lipschitz continuous and bounded away from both 0 and infinity on ∆−0. Moreover, as mentioned in Remark 2.3, μ|∆+0 is
equal, up to a constant, to the measure μ̂+, and so the density of μ|∆+0 is also Lipschitz continuous and bounded away from 0 and infinity. □

Remark 2.5. We have used above the notation G rather than G− for simplicity, as this is the map which plays the more central role in our construction, see Remark 3.3 below. Similarly, from now on we will simply use the notation μ̂ to denote the measure μ̂−.
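The "spreading" of the induced measure used in the proof of Theorem A takes the standard form for first return maps (a sketch of the usual formula; the normalisation in the paper's equation (21) may differ):

```latex
\mu \;=\; \sum_{n\geq 1}\sum_{i=0}^{n-1} g^{i}_{*}\big(\hat\mu_{-}|_{\{\tau=n\}}\big),
\qquad\text{so that}\qquad
\mu(I) \;=\; \sum_{n\geq 1} n\,\hat\mu_{-}(\tau=n)\;=\;\sum_{n\geq 0}\hat\mu_{-}(\tau>n).
```

Invariance of μ under g follows by telescoping: applying g_* shifts each summand by one iterate, and the boundary terms are exactly the G-invariance of μ̂−.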

Orbit distribution estimates
The second step of the argument is aimed at establishing conditions under which the measure μ is finite, and can therefore be renormalised to a probability measure µ := μ/μ(I), and at studying the ergodic and statistical properties of µ. Our approach here differs even more significantly from existing approaches in the literature, although it does have some similarities with the argument of [Cri]: rather than starting with estimates of the tail of the inducing time (which would in any case be significantly more involved than in the usual examples of intermittency maps with a single neutral fixed point, due to our double inducing procedure), we carry out more general estimates on the distribution of iterates of points in I− and I+ before they return to ∆−0. More precisely, we define the functions τ± : ∆−0 → N by τ−(x) := #{0 ≤ i < τ(x) : g^i(x) ∈ I−} and τ+(x) := #{0 ≤ i < τ(x) : g^i(x) ∈ I+}. These functions count the number of iterates of x in I− and I+ respectively before returning to ∆−0. As we shall see as part of our construction of the induced map, both of these functions are unbounded, their level sets have a non-trivial structure in ∆−0 and, moreover, the inducing time function satisfies τ = τ− + τ+. The key results of this part of the proof consist of explicit and sharp asymptotic bounds for the distribution of τa,b := aτ− + bτ+ for different values of a, b ≥ 0, from which we obtain as an immediate corollary the rate of decay of the tail of the inducing time function τ, and which will also provide the core estimates for the various distributional limit theorems. To state our results we introduce two constants B1, B2 (the expressions defining them will appear in the proof of Proposition 3.5 below).
Recall from Corollary 2.4 that the density h of μ is bounded on ∆−0 ∪ ∆+0, and let h(0−) and h(0+) denote the one-sided limits of this density at 0. Then, for any a, b ≥ 0, we define corresponding constants in terms of h(0−), h(0+) and B1, B2, and we have the following distributional estimates.
Proposition 2.6. Let g ∈ F. Then for every a, b ≥ 0 we have the following distribution estimates.
Recall from Corollary 2.4 that μ = μ̂ on the inducing domain ∆−0, and therefore all the above estimates hold for μ̂ with exactly the same constants. In particular, by Proposition 2.6 and (24), we immediately get the corresponding estimates for the tail μ̂(τ > t) = μ(τ > t).
Corollary 2.8. If β = 0 then μ(τ > t) decays exponentially as t → +∞. If β > 0 then there exists a positive constant Cτ (which can be computed explicitly) such that μ(τ > t) ≈ Cτ t^{−1/β}. Proposition 2.6 will be proved in Section 4.1; here we show how it implies Theorems B, C, and D.
Proof of Theorems B, C, and D. From the definition of μ in (21), and since g−n(I) = I, we have μ(I) = Σ_{n≥0} μ̂−(τ > n). By Corollary 2.8, if β = 0 the quantities μ̂−(τ > n) decay exponentially, and if β > 0 we have μ̂−(τ > n) ≤ C n^{−1/β} for some C > 0. This implies that μ(I) < ∞ if and only if β ∈ [0, 1). Thus, for β < 1 we can define the measure µg := μ/μ(I), which is an ergodic invariant probability measure for g, and is unique because it is equivalent to Lebesgue, thus proving Theorem B. Theorem C follows from Theorem A in [Alva] by noticing that P = {(−1, 0), (0, 1)} is a Lebesgue mod 0 generating partition such that H_µg(P) < ∞ and h_µg(g, P) < ∞, and therefore h_µg(g) < ∞. Finally, Theorem D follows from well known results [You] which show that the decay rate of the tail of the inducing times provides upper bounds for the rates of decay of correlations as stated. □
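The finiteness criterion in the proof above is simply the convergence of the tail series under the polynomial bound of Corollary 2.8 (a standard comparison sketch):

```latex
\mu(I)\;=\;\sum_{n\geq 0}\hat\mu_{-}(\tau>n)\;\lesssim\;\sum_{n\geq 1} n^{-1/\beta}\;<\;\infty
\quad\Longleftrightarrow\quad \tfrac{1}{\beta}>1
\quad\Longleftrightarrow\quad \beta<1 .
```

This also makes transparent why β = 1 is the transition parameter: at β = 1 the tail series diverges logarithmically and the spread measure ceases to be finite.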

Distribution of induced observables
The last part of our argument is focused on obtaining the limit theorems stated in Theorem E. When β = 0 the decay of correlations is exponential and the result follows from [You]. Similarly, after having established Proposition 2.2 and Corollary 2.8, the case in which only one of ℓ1, ℓ2 is positive implies that there is only one intermittent fixed point, and thus essentially reduces to the argument given in [Goua, Theorem 1.3] for the LSV map. We therefore only need to consider the case in which both ℓ1, ℓ2 > 0, which implies in particular that β ∈ (0, 1). Given an observable φ : [−1, 1] → R, we define the induced observable Φ : ∆−0 → R by Φ(x) := Σ_{i=0}^{τ(x)−1} φ(g^i(x)).

Definition 2.9. We write Φ ∈ Dα if there exist c1, c2 ≥ 0, with at least one of c1, c2 non-zero, such that the tails μ̂(Φ > t) and μ̂(Φ < −t) are of order c1 t^{−α} and c2 t^{−α} respectively as t → ∞.

In certain settings, limit theorems can be deduced from properties of the induced observable Φ. In particular, it is proved in Theorems 1.1 and 1.2 of [Goua] that, precisely in our setting, each of the limit theorems above for φ follows from a corresponding distributional property of Φ; we refer to these implications as (35). We will argue that in each case of Theorem E the induced observable Φ satisfies one of the above.
To prove Theorem E we obtain regularity and distribution results for the induced observables τa,b and Φ and substitute them into (35) to get the various cases (31)-(33). The motivation for the decomposition (34) is given by the observation that φ(−1) = φ(1) = 0, which allows us to prove the following estimate for the corresponding induced observable Φ.

The Induced Map
In this section we prove Proposition 2.2. We begin by recalling one of several essentially equivalent definitions of a Gibbs-Markov map.
Definition 3.1. An interval map F : I → I is called a (full branch) Gibbs-Markov map if there exists a partition P of I (mod 0) into open subintervals such that:
1. F is full branch: for all ω ∈ P the restriction F|ω : ω → I is a bijection onto I (mod 0);
2. F is uniformly expanding: there exists λ > 1 such that |F′(x)| ≥ λ for all x ∈ ω and all ω ∈ P;
3. F has bounded distortion: there exist C > 0 and θ ∈ (0, 1) such that for all ω ∈ P and all x, y ∈ ω we have |log |F′(x)| − log |F′(y)|| ≤ C θ^{s(x,y)}, where s(x, y) := inf{n ≥ 0 : F^n x and F^n y lie in different elements of the partition P}.
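For intuition, the separation time s(x, y) and the three conditions are easy to check on the doubling map F(x) = 2x mod 1 with partition P = {(0, 1/2), (1/2, 1)} (a toy Gibbs-Markov map, not the induced map G; sketch):

```python
def F(x):
    # doubling map: full branch and affine on each element of P, with
    # |F'| = 2, so conditions 1 and 2 hold with lambda = 2, and condition 3
    # holds trivially since log|F'(x)| - log|F'(y)| = 0 on each element
    return (2.0 * x) % 1.0

def element(x):
    # index of the element of P = {(0, 1/2), (1/2, 1)} containing x
    return 0 if x < 0.5 else 1

def separation_time(x, y, max_iter=64):
    """s(x, y): first n such that F^n(x), F^n(y) lie in different elements."""
    for n in range(max_iter):
        if element(x) != element(y):
            return n
        x, y = F(x), F(y)
    return max_iter
```

For the induced map G the branches are countably many and non-affine, so condition 3 is the substantial one: the distortion bound is what makes the θ^{s(x,y)} contraction of log-derivatives non-trivial.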
We will show that the first return map G defined in (19) satisfies all the conditions above, as well as the saturation condition (20). In Section 3.1 we describe the topological structure of G and show that it is a full branch map with countably many branches which saturates I; this will require only the very basic topological structure of g provided by condition (A0). In Section 3.2 we obtain estimates concerning the sizes of the elements of the corresponding partition; this will require the explicit form of the map g as given in (A1). In Section 3.3 we show that G is uniformly expanding; this will require the final condition (A2). Finally, in Section 3.4 we use the estimates and results obtained to show that G has bounded distortion.

Topological Construction
In this section we give an explicit and purely topological construction of the first return maps G− : ∆−0 → ∆−0 and G+ : ∆+0 → ∆+0, which depends essentially only on condition (A0), i.e. on the fact that g is a full branch map with two orientation preserving branches. Recall first of all the definitions of the sets ∆±n and δ±n in (5) and (6). It follows immediately from the definitions, and from the fact that each branch of g is a C^2 diffeomorphism, that for every n ≥ 1 the maps g : δ−n → ∆+n−1 and g : δ+n → ∆−n−1 are C^2 diffeomorphisms and, for n ≥ 2, the same is true of the maps g^{n−1} : ∆−n−1 → ∆−0 and g^{n−1} : ∆+n−1 → ∆+0, which implies that for every n ≥ 1 the maps g^n : δ−n → ∆+0 and g^n : δ+n → ∆−0 are C^2 diffeomorphisms. We can therefore define two maps as in (37). Notice that these are full branch maps, although they have different domains and ranges: indeed, the domain of one is the range of the other and vice versa. The fact that they are full branch allows us to pull back the partition elements δ±n into each other: for every m, n ≥ 1 we define the sets δ−m,n and δ+m,n as in (38). Then, for m ≥ 1, the collections {δ−m,n}n≥1 and {δ+m,n}n≥1 are partitions of δ−m and δ+m respectively, and so are partitions of ∆−0 and ∆+0 respectively, with the property that for every m, n ≥ 1 the restrictions of the return maps to δ−m,n and δ+m,n are C^2 diffeomorphisms onto ∆−0 and ∆+0. Notice that m + n is the first return time of points in δ−m,n and δ+m,n to ∆−0 and ∆+0 respectively, and we have thus constructed two full branch first return induced maps.

Lemma 3.2. The maps G− and G+ are full branch maps which saturate I.

Proof. The full branch property follows immediately from (39). It also follows from the construction that the families of images of the partition elements (38) are each formed by a collection of pairwise disjoint intervals which cover I mod 0, and therefore satisfy (20), giving the saturation. □

Remark 3.3. Notice that the map G− is exactly the first return map G defined in (19), and therefore Lemma 3.2 implies the first part of Proposition 2.2.

Partition Estimates
The construction of the full branch induced maps G ± : ∆ ± 0 → ∆ ± 0 in the previous section is purely topological and works for any map g satisfying condition (A0). In this section we proceed to estimate the sizes and positions of the various intervals defined above; this requires more information about the map, in particular the forms of the branches given in (A1). Before stating the estimates we introduce some notation. First of all, we let (x − n ) n≥0 and (x + n ) n≥0 be the boundary points of the intervals ∆ − n , ∆ + n , so that ∆ − 0 = (x − 0 , 0), ∆ + 0 = (0, x + 0 ) and, for every n ≥ 1, we have. The following proposition gives the speed at which the sequences (x + n ), (x − n ) converge to the fixed points 1, −1 respectively, and gives estimates for the size of the partition elements ∆ ± n for large n in terms of the values of ℓ 1 and ℓ 2 . To state the result we let.

Proposition 3.5. If ℓ 2 = 0, then for every ε > 0. If ℓ 2 > 0, then.

Proof. We will prove (46) and (47), as (48) and (49) follow by analogous arguments. Suppose first that ℓ 1 > 0. As x + n → 1, and as g − (y − n ) = x + n−1 , we know that for all n sufficiently large we have. Solving for y − n this gives, which is the first statement in (47). Now we turn our attention to the size of the intervals δ − n . First let us note that for any γ > 0 we have, which completes the proof of (47). Now for ℓ 2 = 0 we proceed as before, and by (44) we get. For the size of the interval δ − n we may use the mean value theorem to conclude that. As g ′ is monotone on U − 0 we know, from the above and (44), that, which concludes the proof. □
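These convergence speeds can be checked numerically for a concrete branch. The sketch below uses the illustrative right-hand branch g(x) = x − (1−x)² (an assumption: a neutral fixed point at 1 of Manneville–Pomeau type with exponent 1, not the paper's general form). The points x + n defined by g(x + n ) = x + n−1 then satisfy the recursion y n−1 = y n + y n ² for y n = 1 − x + n , and the classical polynomial estimate y n ~ 1/n can be verified:

```python
import math

# y_n = 1 - x^+_n for the illustrative branch g(x) = x - (1 - x)^2:
# g(x^+_n) = x^+_{n-1} translates into y_{n-1} = y_n + y_n^2, and
# inverting the quadratic gives y_n = (-1 + sqrt(1 + 4*y_{n-1})) / 2.
def y_sequence(n_max):
    y = (math.sqrt(5) - 1) / 2   # y_0 = 1 - x^+_0, where g(x^+_0) = 0
    ys = [y]
    for _ in range(n_max):
        y = (-1 + math.sqrt(1 + 4 * y)) / 2
        ys.append(y)
    return ys

ys = y_sequence(5000)
# Polynomial speed of convergence x^+_n -> 1: expect n * y_n -> 1.
ratio = 5000 * ys[5000]
```

The normalised quantity n·y n approaches 1, with the usual logarithmic correction of order log n / n.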
It is enough to prove uniform expansivity for the two maps G − , G + , recall (37), since this implies the same property for their composition G, recall (40). To simplify the notation we will only prove the statement for G + , i.e. we will prove that |(G + ) ′ | > λ; the corresponding bound for G − follows by an identical argument. For points outside the neighbourhood U 0+ on which the map g has a precise form, more precisely for 1 ≤ n ≤ n + and for x ∈ δ + n , the expansivity is automatically guaranteed by condition (A2), but for points close to 0, where the derivative can be arbitrarily small, the statement is non-trivial. It ultimately depends on writing G + (x) := g n (x) for x ∈ δ + n , and then showing that the potentially small derivative g ′ (x) near 0 is compensated by a sufficiently large number of iterates where the derivative is > 1. This clearly relies very much on the partition estimates in Section 3.2, which provide a relation between the position of points, and therefore their derivatives, and the corresponding values of n.
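This compensation mechanism can be watched numerically on a toy example. The sketch below uses an illustrative map (an assumption, not the paper's general normal form): left branch g(x) = x + (1+x)² with a neutral fixed point at −1, right branch g(x) = −1 + 2x² with a critical point at 0 where g ′ vanishes. For x ∈ ∆ + 0 close to 0 the first factor g ′ (x) = 4x is tiny, but the orbit then spends a long time near −1, accumulating many derivative factors slightly larger than 1, and the chain-rule product along the first return stays above 1:

```python
import math

# Illustrative asymmetric toy map (assumption, not the paper's normal form):
# neutral fixed point at -1, critical point at 0 from the right.
def g(x):
    return x + (1 + x) ** 2 if x < 0 else -1 + 2 * x * x

def dg(x):
    return 3 + 2 * x if x < 0 else 4 * x

X0P = math.sqrt(0.5)  # Delta^+_0 = (0, x^+_0), where g(x^+_0) = 0

def return_derivative(x, cap=10**6):
    """Derivative of the first return map to Delta^+_0 at x (chain rule)."""
    d, y, t = dg(x), g(x), 1
    while not (0 < y < X0P):
        d, y, t = d * dg(y), g(y), t + 1
        if t > cap:
            raise RuntimeError("no return within cap")
    return d

# The small factor 4x near 0 is compensated by the long passage near -1.
derivs = [return_derivative(0.02 * k) for k in range(1, 36)]
```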
A relatively straightforward computation using those estimates shows that we get expansion for sufficiently large n ≥ 1, which is quite remarkable but not enough for our purposes, as it does not give a complete proof of expansivity for G + at every point of ∆ + 0 . We therefore need a somewhat more sophisticated approach, which shows that the derivative of G + has a kind of "monotonicity" property in the following sense. Define the function ϕ : ∆ + 0 \ δ + 1 → ∆ + 0 given implicitly by g 2 = g ◦ ϕ and explicitly by. Notice that ϕ is the bijection which makes the diagram in Figure 4 commute.
The key step in the proof of Proposition 3.6 is the following lemma.
On the other hand, if k 2 > 1, then the ratio g ′ (x)/g ′ (ϕ(x)) is < 1 and measures how much derivative is "lost" when choosing the initial condition x instead of the initial condition ϕ(x) (since ϕ(x) > x and the derivative is monotone increasing), whereas g ′ (g(x)) > 1 measures how much derivative is "gained" by performing an extra iteration of g. The lemma says that the gain is more than the loss.
Proof. In light of the remark above we will assume that k 2 > 1. To simplify the notation let us set a = a 2 , b = b 1 , k = k 2 , and ℓ = ℓ 1 . Notice first of all that by the form of g in U 0+ given in (A1) we have. Recall that k > 1 and x < ϕ(x), and so the ratio above is < 1. To estimate g ′ (g(x)) we consider two cases depending on ℓ. If ℓ > 0, using the form of g given in (A1) and plugging into (50) we get and, therefore, using the form of g in U −1 , this gives. From (52) and (53) and the fact that x < ϕ(x) we immediately get, which establishes (51) and completes the case ℓ > 0. For ℓ = 0, proceeding as above we obtain the analogous bound, and so, together with (52), as above, we get the statement in this case also. □ As an almost immediate consequence of Lemma 3.7 we get the following.
Corollary 3.9. For all n ≥ n + and x ∈ δ + n+1 we have. Proof. By Lemma 3.7 and (51), for any 1 ≤ m ≤ n we have. Proceeding inductively we obtain the result. □
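The gain/loss inequality of Lemma 3.7 can be checked in closed form on a toy example. With the illustrative branches g(x) = −1 + 2x² on [0, 1] (so k = 2) and g(x) = x + (1+x)² on [−1, 0) (assumptions for illustration, not the paper's general form), the map ϕ defined by g² = g ◦ ϕ is explicitly ϕ(x) = ((1 + g²(x))/2)^{1/2} = x(1 + 2x²)^{1/2}, and the inequality g ′ (x) g ′ (g(x)) ≥ g ′ (ϕ(x)) reduces to (1 + 4x²)² ≥ 1 + 2x², which holds for all x. A numerical sanity check:

```python
import math

# Illustrative toy branches (assumption, not the paper's general form):
# right branch with critical point of order k = 2, left branch neutral at -1.
def g(x):
    return x + (1 + x) ** 2 if x < 0 else -1 + 2 * x * x

def dg(x):
    return 3 + 2 * x if x < 0 else 4 * x

def phi(x):
    """The bijection defined implicitly by g^2 = g o phi: phi(x) is the
    preimage under the right branch of g(g(x))."""
    g2 = g(g(x))          # g(x) < 0 on this grid, so the left branch applies
    return math.sqrt((1 + g2) / 2)

# "gain" g'(g(x)) versus "loss" g'(phi(x)) / g'(x), on a grid inside the
# domain of phi (i.e. away from delta^+_1, where g(x) lands left of Delta^-_0).
xs = [0.01 * k for k in range(1, 56)]
checks = [dg(x) * dg(g(x)) - dg(phi(x)) for x in xs]
```

The differences are strictly positive on the whole grid: the derivative gained by the extra iteration exceeds the derivative lost by starting at x instead of ϕ(x).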

Distortion Estimates
Proposition 3.10. For all g ∈ F there exists a constant D > 0 such that for all 0 ≤ m < n and all x, y ∈ δ ± n we have. As a consequence we get that G is a Gibbs-Markov map with constants C = Dλ and θ = λ −1 .
Corollary 3.11. For all x, y ∈ δ i,j ∈ P with x ̸ = y we have. Proof. Let n := s(x, y). Since G is uniformly expanding, we can bound |x − y| in terms of λ −n , and by Proposition 3.10 this gives the statement. □

Proof of Proposition 3.10. We begin with a couple of simple formal steps. First of all, by the chain rule, we can write. Then, since g i (x), g i (y) are both in the same smoothness component of g, by the Mean Value Theorem, there exists. Substituting this into the expression above, and writing. We will bound the sum above in two steps. First of all we will show that it admits a uniform bound D independent of m, n. We will then use this bound to improve our estimates and show that, by paying a small price (increasing the uniform bound to a larger bound D := D 2 /|∆ − 0 |), we can include the term |g n (x) − g n (y)| as required. Ultimately this gives a stronger result since it takes into account the closeness of the points x, y.
Let us suppose first, for simplicity, that x, y ∈ δ + n ; the estimates for δ − n are identical. Then for 1 ≤ i < n we have that g i (x), g i (y), u i ∈ ∆ − n−i and therefore we can bound (56) by. From (12), and using the relationship between the y + n and the x − n , we may bound the first term by, where we have used the fact that for some sequence ξ n → −1 the corresponding product converges to 1 if ℓ > 0 (and therefore c = 0) or to 1 + b 1 otherwise (and therefore c = b 1 ). If ℓ 1 = 0 then D i is uniformly bounded for i > 0; if ℓ 1 > 0, then from (11) and (43) we know that. Then by (58) and (59) we find that. Substituting this back into (57) and then into (56) we get, which completes the first step in the proof, as discussed above. We now take advantage of this bound to improve our estimates as follows. By a standard and straightforward application of the Mean Value Theorem, (61) implies that the diffeomorphisms all have uniformly bounded distortion, in the sense that for every x, y ∈ δ + n and 1 ≤ m < n we have and. Substituting these bounds back into (56) (with i = m), and letting. Notice that the last inequality follows from (60). This completes the proof. □ We state here also a simple corollary of Propositions 3.5 and 3.10 which we will use in Section 4.
Lemma 3.12. For all i, j ≥ 1 we have. Proof. As μ is equivalent to Lebesgue on ∆ − 0 ∪ ∆ + 0 , we obtain the lemma immediately from Proposition 3.5. □
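The uniform distortion bound behind Proposition 3.10 can be observed numerically for the "climbing" branches g n : ∆ − n → ∆ − 0 . The sketch below uses the illustrative left branch g(x) = x + (1+x)² (an assumption: a neutral fixed point at −1, not the paper's general form); the endpoints x − n satisfy u n + u n ² = u n−1 for u n = 1 + x − n , and the distortion |log(g n ) ′ (x) − log(g n ) ′ (y)| stays bounded uniformly in n, because the intermediate images run through the pairwise disjoint intervals ∆ − n−i whose lengths are summable:

```python
import math

# Illustrative left branch (assumption): neutral fixed point at -1.
def g(x):
    return x + (1 + x) ** 2

def dg(x):
    return 3 + 2 * x

# u_n = 1 + x^-_n: invert the recursion u_n + u_n^2 = u_{n-1}.
us = [(math.sqrt(5) - 1) / 2]   # u_0, from g(x^-_0) = 0
for _ in range(300):
    us.append((-1 + math.sqrt(1 + 4 * us[-1])) / 2)

def log_deriv_gn(x, n):
    """log of the derivative of g^n at x, accumulated via the chain rule."""
    s = 0.0
    for _ in range(n):
        s += math.log(dg(x))
        x = g(x)
    return s

# Distortion of g^n on Delta^-_n = (x^-_n, x^-_{n-1}) for two interior points.
distortions = []
for n in range(1, 301):
    lo, hi = -1 + us[n], -1 + us[n - 1]
    x, y = lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)
    distortions.append(abs(log_deriv_gn(x, n) - log_deriv_gn(y, n)))
```

The distortion does not grow with n, which is the numerical counterpart of the uniform bound D obtained in the first step of the proof.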

Statistical Properties
In Section 4.1 we prove Proposition 2.6 and in Section 4.2 we prove Proposition 2.10 and Corollary 2.11. As discussed in Section 2.3, this completes the proof of Theorem E.
To prove (28), if ℓ 2 > 0 the result follows by substituting the corresponding line of (69) into (67) and substituting (72) into (68). Again, if ℓ 2 = 0 we only need to establish an upper bound for μ(aτ + − bτ − > t) rather than an asymptotic equality, and therefore, instead of the decomposition in (67) and (68), we can use the fact that. The result then follows by inserting the corresponding line of (69) into (74). □

Leading order asymptotics
We prove Proposition 4.1 via two lemmas which show, in particular, how the values h(0 − ), h(0 + ) of the density of the measure μ turn up in the constants C a , C b defined in (26). Our first lemma shows that the tails of the distributions μ(τ + > t) and μ(τ − > t) have a very natural geometric interpretation.
Lemma 4.3. For every t > 0 we have μ(τ + > t) = μ(y − ⌈t⌉ , 0) and μ(τ − > t) = μ(0, y + ⌈t⌉ ).

Remark 4.4. While the first statement in (75) is relatively straightforward, the second statement is not at all obvious, since τ − is defined on ∆ − 0 and there is no immediate connection with the interval (0, y + ⌈t⌉ ) in ∆ + 0 . As we shall see, the proof of Lemma 4.3 requires a subtle and interesting argument.

Remark 4.5. Since μ is equivalent to Lebesgue measure on ∆ − 0 and ∆ + 0 , we immediately have that μ(y − ⌈t⌉ , 0) ≈ |y − ⌈t⌉ | and μ(0, y + ⌈t⌉ ) ≈ y + ⌈t⌉ , and we can then use (47) and (49), and Lemma 4.3, to get upper bounds for the distributions μ(τ + > t) and μ(τ − > t). This is however not enough for our purposes, as we require sharper estimates for the distributions, and we therefore need a more sophisticated argument which yields the statement in the following lemma.
Lemma 4.6. For every t > 0 we have. Before proving these two lemmas we show how they imply Proposition 4.1.
Proof of Proposition 4.1. Let us first show (69). Recall from the definition of C a in (26) that a = 0 ⇒ C a = 0, so if a = 0 there is nothing to prove. Let us suppose then that a > 0. By Lemmas 4.3 and 4.6 we have. Then, using the asymptotic estimates (46) and (47) for y − n in Proposition 3.5, and since O(t −2/β 1 ) = o(t −γ−1/β 1 ) for every γ ∈ (0, 1), and by the definition of C a in (26), we obtain, yielding (69). To show (70) we can proceed similarly to the above. As before, if b = 0 there is nothing to prove, so we assume b > 0, in which case, by Lemmas 4.3 and 4.6, we have the corresponding expression for μ(bτ − > t). Now using (48) and (49), and arguing as above, we find that (29) holds for every γ ∈ (0, 1). □ We complete this section with the proofs of Lemmas 4.3 and 4.6.
Proof of Lemma 4.3. By definition, recall (22), τ + (x) = i and τ − (x) = j for all x ∈ δ i,j , and therefore. We claim that for every i, j ≥ 1 we have. Then, substituting (77) into (76), we get exactly the statement (75) in the Lemma. Thus it only remains to prove (77). As already mentioned in Remark 4.4, despite the apparent symmetry between the two statements, the situation in the two expressions is actually quite different. Indeed, from the topological construction of the induced map, for each i ≥ 1 we have, which, since the intervals δ i,j are pairwise disjoint, clearly implies the first equality in (77). The second equality is not immediate since, for each fixed j ≥ 1, the intervals δ i,j are spread out in ∆ − 0 , with each δ i,j lying inside the corresponding interval δ − i ; indeed the δ i,j do not even belong to δ + j and therefore we cannot just substitute i and j to get a corresponding version of (79). We use instead a simple but clever argument, inspired by a similar argument in [Cri, Lemma 8], which takes advantage of the invariance of the measure μ. Recall first of all, from the construction of the induced map, that g −1 (δ + j ) consists of exactly two connected components: one is exactly the interval δ 1,j and the other is a subinterval of ∆ + 1 . So for any j ≥ 1 we have. By the invariance of the measure μ, and since these two components are disjoint, this implies. The preimage of the set {x ∈ ∆ + 1 : g(x) ∈ δ + j } itself also has two disjoint connected components and therefore, again by the invariance of μ, we get, and, substituting this into (80), we get. Repeating this procedure n times and proceeding inductively, we obtain (77), thus completing the proof. □

Proof of Lemma 4.6. From Lemma 4.3 we can give precise estimates for μ(τ ± > t) in terms of the y ± ⌈t⌉ by making use of the fact that h is Lipschitz on ∆ ± 0 (see Corollary 2.4). Indeed, using the fact that the density is Lipschitz we have, and so μ(τ − > t) = y + ⌈t⌉ h(0 + ) + O((y + ⌈t⌉ ) 2 ). The statement for μ(τ + > t) follows in the same way. □

Higher order asymptotics
In this subsection we prove Proposition 4.2. For clarity we prove (71) and (72) in two separate lemmas. We will make repeated use of some upper bounds for the measure μ(δ i,j ) of the partition elements which are given in Lemma 3.12.

Lemma 4.7. If at least one of ℓ 1 , ℓ 2 is not zero, then for every a, b ≥ 0 and any γ ∈ (0, 1) we have.
For the second term in (81) we obtain from Lemma 3.12 that. Making the change of variables k = ⌈ai − 1⌉, and using that the first term in the sum is 0, we obtain. Let us set a k (t) := k −1−1/β 1 + k t 1/β − 1 and use the binomial theorem to get.
Under our assumptions we know from Proposition 2.6 that the tail of τ a,b is determined by (27), and we recall from (26) that C b = 0. If β φ = β 2 = 0 then we know from the second line of (27) that μ(±τ a,b > t) ≲ t −γ−1/β 1 .

Figure 1: Graph of g for various possible values of parameters.