1 Introduction

The Collet–Eckmann condition stems from Eckmann and Collet in the 1980s [10, 11], and was used to show abundance of chaotic behaviour for certain maps on an interval. Chaotic behaviour of a system is usually associated with the property of sensitive dependence on initial conditions, meaning that two points x, y sufficiently close to each other repel each other under iteration up to some large scale. Hence it is natural that such maps possess some kind of expanding property. A map satisfying the Collet–Eckmann condition is expansive along the forward critical orbit(s), and this turned out to be sufficient for chaotic behaviour in many situations, not only the pioneering case studied by Eckmann and Collet. Shortly after their works, Jakobson proved in [22] that the set of parameters \(a \in (0,2)\) for which \(f_a(x) = 1-ax^2\) admits an absolutely continuous invariant measure (acim) has positive Lebesgue measure. A corresponding celebrated result for complex rational maps was obtained by Rees [28]. These maps also exhibit chaotic behaviour. The existence of an acim describes the typical orbits of a map in a probabilistic way. It does not immediately imply chaotic behaviour, but it is often very closely related to it and with some additional properties (such as expansion, ergodicity, positive entropy etc.) this is usually the case.

It was realised quite early that the Collet–Eckmann condition, or even weaker conditions, are sufficient for the existence of an (ergodic) acim, see e.g. [4, 5, 6, 7, 12, 13, 18, 29, 32, 33]. In fact, in [6, 7], it is proven that an acim exists even if the derivative along the critical orbit is bounded from below by some positive constant (unimodal case) or tends to \(\infty \) (multimodal case). See also [23] for a closely related result. The weaker topological CE-condition has been characterized in terms of invariant measures in [15] by Przytycki and Rivera-Letelier.

In the fundamental papers [4, 5], Benedicks and Carleson showed that the Collet–Eckmann condition is satisfied for a set of positive Lebesgue measure in the quadratic family. Despite the fact that the Collet–Eckmann condition in general is stronger than the existence of an acim, the two conditions are metrically the same in the real quadratic family. This was a deep result by Lyubich and by Avila and Moreira, see [3, 31]. Conjecturally it holds more generally. In contrast to the chaotic, non-regular (sometimes called stochastic) parameters stand the regular parameters, for which the map has an attracting orbit. These maps were proven to be open and dense in the real case (the famous real Fatou conjecture), [19, 30, 34]. The complex Fatou conjecture is still open.

In the complex rational setting, not as much is known. A similar result to the papers [4, 5] was obtained by the author [1], where it was proven that post-critically finite maps, for which the Julia set is the whole sphere, are density points of Collet–Eckmann maps, improving an earlier famous result by Rees [28]. Apart from implying the existence of an ergodic acim, the Collet–Eckmann condition induces more nice properties, see e.g. [18, 35]. It has geometric implications, and there are several papers studying perturbations of Collet–Eckmann maps (or similar expanding maps); see [4, 5, 14, 16, 17, 36] for real maps on an interval and families of Hénon maps, and [28, 1, 2, 20, 21] in the complex setting.

The result in this paper is related to [36] (see also [16]) in the complex setting. We study perturbations of complex rational Collet–Eckmann maps which have their Julia set equal to the whole sphere, and where the starting map is allowed to be critically slowly recurrent (see [27, 36]).

Let Crit be the set of critical points for f and let J(f) and F(f) be the Julia set and Fatou set of f respectively. Let \(Crit'\) be the set of critical points c such that there are no other critical points in the forward orbit of c. Derivatives are always in the spherical metric unless otherwise stated. We write \(Df(z) = f'(z)\) as the space derivative throughout the paper.

Definition 1.1

Let f be a non-hyperbolic rational map without parabolic periodic points. Then f satisfies the Collet–Eckmann condition (CE), if there exist constants \(C > 0\) and \(\gamma > 0\) such that, for each critical point \(c \in Crit' \cap J(f)\), we have

$$\begin{aligned} |Df^n(fc)| \ge Ce^{\gamma n}, \text { for all }n \ge 0. \end{aligned}$$

Let us define the lower and upper Lyapunov exponents for the critical point c respectively as

$$\begin{aligned} \underline{\gamma }(c) = \liminf \limits _{n \rightarrow \infty } \frac{\log |Df^n(fc)|}{ n}, \quad \text {and} \quad \overline{\gamma }(c) = \limsup \limits _{n \rightarrow \infty } \frac{\log |Df^n(fc)|}{ n}. \end{aligned}$$

Then the CE-condition can be reformulated as the condition that the lower Lyapunov exponent is strictly positive for all critical points \(c \in Crit' \cap J(f)\). We write \(\underline{\gamma } = \min \underline{\gamma }(c)\) where the minimum is taken over all critical points \(c \in Crit' \cap J(f)\). In this paper we are going to study perturbations of rational CE maps for which the Julia set is the whole sphere, but we expect that the techniques can be used in other situations as well. Let \(Jrit = Crit \cap J(f)\). A critical point \(c \in J(f)\) is slowly recurrent, cf. [27], if for each \(\alpha > 0\) there is some \(C > 0\) such that

$$\begin{aligned} \text {dist}(f^n(Jrit),Jrit) \ge C e^{-\alpha n}, \text { for all }n \ge 0. \end{aligned}$$
(1.1)
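As a purely illustrative sketch of these two definitions, one can compute the finite-time critical exponent and the closest approach of the critical orbit to the critical point for the real quadratic family \(f_a(x) = 1-ax^2\) mentioned above. The function names are hypothetical, and the sketch uses the Euclidean derivative on the interval rather than the spherical derivative used in this paper. For \(a = 2\) the critical orbit is \(0 \mapsto 1 \mapsto -1 \mapsto -1\), so the exponent equals \(\log 4\) and the orbit stays at distance 1 from the critical point.

```python
import math

def quadratic(a, x):
    """The real quadratic map f_a(x) = 1 - a*x^2 from the introduction."""
    return 1.0 - a * x * x

def critical_orbit_stats(a, n):
    """Return (finite-time exponent, closest approach to the critical point).

    The exponent is (1/n) * log|Df_a^n(f_a(c))| for the critical point c = 0,
    computed with the Euclidean derivative f_a'(x) = -2*a*x (an assumption of
    this sketch). The closest approach is min over 1 <= k <= n of |f_a^k(c)|.
    The orbit must not hit c exactly, otherwise log(0) is undefined.
    """
    x = quadratic(a, 0.0)              # critical value f_a(c)
    log_derivative = 0.0
    min_dist = float("inf")
    for _ in range(n):
        min_dist = min(min_dist, abs(x))
        log_derivative += math.log(abs(2.0 * a * x))
        x = quadratic(a, x)
    return log_derivative / n, min_dist

gamma, closest = critical_orbit_stats(2.0, 50)
```

A positive limit of the first quantity corresponds to the CE-condition for this toy map, and a closest approach decaying more slowly than every exponential corresponds to slow recurrence.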

We say that f is critically slowly recurrent if all critical points in the Julia set are slowly recurrent. Collet–Eckmann maps possess a (unique) conformal measure \(\nu \) supported on the Julia set and a unique ergodic invariant measure \(\mu \), which is absolutely continuous with respect to \(\nu \) (e.g. [13, 18, 35]). If the map f satisfies \(J(f) = \hat{{\mathbb C}}\), then \(\nu \) is the standard Lebesgue measure and hence for such maps there exists an invariant absolutely continuous measure with respect to Lebesgue measure. We say that the critical points are typical with respect to this measure if the Birkhoff means converge for all critical points \(c \in Jrit\), i.e.,

$$\begin{aligned} \frac{1}{n} \sum _{k=0}^{n-1} \varphi \left( f^k(c)\right) \rightarrow \int \varphi \,\textrm{d}\mu , \quad \text { as }\quad n \rightarrow \infty , \end{aligned}$$

for \(\varphi \in L^1(\mu )\). Setting \(\varphi (z) = \log |Df(z)|\), which belongs to \(L^1 (\mu )\) by [35], we see that if the critical points are typical, then \(\overline{\gamma } = \underline{\gamma }\). The condition \(\overline{\gamma } = \underline{\gamma }\) in turn implies that f is slowly recurrent, but it is not clear whether the converse holds. Conjecturally almost all CE-maps have the slow recurrence property. At least this is true in the real quadratic family (see [3, 9]).

The space of rational maps of degree d is denoted by \({\mathcal R}_d\). We explain in Sect. 4 what is meant by a non-degenerate real analytic family.

Theorem A

Let f be a critically slowly recurrent rational Collet–Eckmann map in \({\mathcal R}_d\), of degree \(d \ge 2\), such that the Julia set is the whole sphere and let \(f_a\), \(a \in (-\varepsilon ,\varepsilon )\) be a non-degenerate real analytic family of maps around \(f=f_0\) for some \(\varepsilon > 0\). Then \(f_0\) is a Lebesgue density point of Collet–Eckmann maps in \((-\varepsilon ,\varepsilon )\).

We use a normalisation of the space of rational maps following G. Levin [24, 25]. We say that two maps f and g are equivalent if they are conjugate by a Möbius transformation. Then we can consider the space \(\Lambda _{d,\overline{p'}} \subset {\mathcal R}_d\), (see [25]) up to equivalence, as the set of rational maps f of degree \(d \ge 2\) with precisely \(p'\) critical points, i.e. \(Crit = \{ c_1, \ldots , c_{p'} \}\), with corresponding multiplicities \(\overline{p'} = \{m_1, \ldots , m_{p'} \}\) (in the same order). Inside such a set, we may state the following theorem as a direct consequence of Theorem A, by Fubini’s Theorem, since the set of degenerate directions has measure zero (this also follows from the results of Levin, see Sect. 4).

Theorem B

Let f be a critically slowly recurrent rational Collet–Eckmann map in \(\Lambda _{d,\overline{p'}} \subset {\mathcal R}_d\) of degree \(d \ge 2\) such that the Julia set is the whole sphere. Then f is a Lebesgue density point of Collet–Eckmann maps in \(\Lambda _{d,\overline{p'}}\).

Generically, all critical points are simple and then f is a Lebesgue density point of CE-maps in \({\mathcal R}_d\) in Theorem B. It is likely that this is true even if the critical points are not simple, using a suitable re-parameterisation of the parameter space so that all critical points move analytically if higher order critical points split. However, we will not consider that case in this paper.

The proof of the main theorem is mainly based on a combination of strong transversality results by Levin and a developed version of the classical Benedicks–Carleson parameter exclusion techniques. Strong transversality means roughly the following. If the phase derivative grows exponentially, then the phase and parameter derivatives of the iterates of \(f_0\) are comparable up to some constant, and this comparison persists, with an arbitrarily small error, for nearby parameters. The other main part of the proof is basically a big induction argument, based on classical techniques by Benedicks and Carleson. The idea is to control expansion of the derivative along orbits of the critical points for nearby parameters, by using the expansion of the original map and induction. If critical points for some parameters return too close to the set of critical points, the derivative drops too much, and those parameters are deleted. One of the key problems is to control the measure of this set of deleted parameters. The expansion properties of the parameters kept enable us to transfer information from the phase space to the parameter space (transversality), so that the parameters that are left inside \((-\varepsilon ,\varepsilon )\) form a Cantor set of positive measure with arbitrarily high density.
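The comparison between phase and parameter derivatives can be seen numerically in a toy case. The following sketch (hypothetical function name, real quadratic family \(f_a(x) = 1-ax^2\), Euclidean derivatives) is only an illustration under these assumptions and is not the construction of Sect. 4. It computes the parameter derivative of \(\xi _n(a) = f_a^n(0)\) and the phase derivative \(Df_a^{n-1}\) at the critical value. At \(a = 2\) both grow like \(4^n\) and their ratio approaches 1/3.

```python
def parameter_and_phase_derivatives(a, n):
    """Return (d/da xi_n(a), Df_a^{n-1}(v(a))) for f_a(x) = 1 - a*x^2.

    xi_k(a) = f_a^k(0) is the critical orbit and v(a) = xi_1(a) is the
    critical value. Requires n >= 1.
    """
    xi, dxi = 0.0, 0.0       # xi_0 and its derivative with respect to a
    phase = 1.0              # Df_a^0 = 1
    for k in range(n):
        if k >= 1:
            # chain rule along v, f_a(v), ...: multiply by f_a'(xi_k)
            phase *= -2.0 * a * xi
        # differentiate xi_{k+1} = 1 - a*xi_k^2 with respect to a
        xi, dxi = 1.0 - a * xi * xi, -xi * xi - 2.0 * a * xi * dxi
    return dxi, phase

d_param, d_phase = parameter_and_phase_derivatives(2.0, 20)
ratio = d_param / d_phase
```

The bounded, non-zero ratio is what allows growth of a curve in phase space to be read off as growth of the corresponding parameter interval in this example.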

In particular, the paper is a generalisation of [1], which was the (revised) thesis of the author. Apart from proving Theorem A, the aim of this paper is partially to make the arguments in the Benedicks–Carleson parameter exclusion techniques more transparent. The paper is organized as follows. Section 2 is devoted to some basic definitions, bound and free periods etc. In Sect. 3 we prove some results on the expansion away from critical points (the free period) and Sect. 4 is devoted to parameter-phase comparison (transversality). In Sect. 5 we introduce weak distortion which leads to parameter independence (i.e. that on a small scale, one can almost forget about the parameter dependence). Together with some distortion lemmas on the bound period (Sect. 6), this leads to strong distortion results in Sect. 7. All these results are needed to deal with the large deviations in Sect. 8, where we control the measure of parameters that return too often too deep. In Sect. 9 the whole induction proof is finalised.

Remark 1.2

It will be clear from the proof that the slow recurrence condition in Theorems A and B is a little superfluous; one only needs to have slow recurrence (1.1) for some sufficiently small \(\alpha > 0\), depending on \(f=f_0\). The CE-maps constructed in [1] have this property close to the starting (Misiurewicz-Thurston) map. It follows that the set of maps satisfying this weaker assumption has positive Lebesgue measure.

2 Some definitions

Let \(f=f_0\) be a slowly recurrent Collet–Eckmann map with \(J(f) = \hat{{\mathbb C}}\), and \(f_a\), \(a \in (-\varepsilon ,\varepsilon )\) a real analytic family around \(f_0\). We assume that the family is non-degenerate, which in particular means that every critical point \(c_l(a)\) of \(f_a\) moves analytically with the parameter a. Another condition on the family is given in Sect. 4. We put \(f_a(z) = f(z,a)\) and \(Df_a(z) = f_a'(z)\). Let \(v_l(a) = f_a(c_l(a))\) be the critical value, and suppose that \(v_l = v_l(0)\) does not contain any critical points in its forward orbit under \(f_0\), for all l. Put

$$\begin{aligned} \xi _{n,l}(a) = f_a^n(c_l(a)). \end{aligned}$$

We will study the evolution of \(\xi _{n,l}(\omega )\) for a small interval \(\omega = (-\varepsilon ,\varepsilon )\) around the starting map \(f_0\). In the beginning this curve will grow rapidly from the expansive properties of the starting map, but later on we have to delete parameters that come too close to the set of critical points, denoted by \(Crit_a\), of \(f_a\). Now, \(Crit_a\) moves analytically, but it turns out that \(\xi _{n,l}(\omega )\) and \(Crit_{\omega }\) are very different in diameter, due to the expansion of \(\xi _{n,l}(\omega )\); it will be much bigger than \({\text {diam}}(Crit_{\omega })\). Let U be a neighbourhood of the critical points for the unperturbed map. Choose \(\varepsilon > 0\) so that U is a neighbourhood around \(Crit_a\), for all \(a \in (-\varepsilon ,\varepsilon )\). Moreover, if we let \(U_l\) be a component of U which contains the critical point \(c_l\) then we impose the condition \({\text {dist}}(c_l(\omega ),\partial U_l) \gg {\text {diam}}( c_l(\omega ) )\) for all l. To make U more precise, we choose \(\delta = e^{-\Delta } > 0\) so that \(U = \cup _l B(c_l,\delta )\). Hence \(\varepsilon \) depends on \(\delta \). We will also consider larger neighbourhoods \(U_l' \supset U_l\) of the critical points, defined in Lemma 3.1, where \(U' = \cup _l U_l'\) and \(U_l' = B(c_l,\delta ')\), for some \(\delta ' \ge \delta > 0\).

The rate at which the distance \({\text {dist}}(\xi _{n,l}(a),Crit_a)\) may go to zero is controlled by the so-called basic approach rate assumption, which is inherited from the slow recurrence condition.

Definition 2.1

Let \(\alpha > 0\). We say that the critical point \(c_l(a)\), (or parameter a with critical point l) satisfies the basic assumption up to time n with exponent \(\alpha \), if

$$\begin{aligned} {\text {dist}}(\xi _{k,l}(a),Crit_a) \ge K_b e^{-2\alpha k}, \quad \text { for all }\quad k \le n, \end{aligned}$$

where \(K_b > 0\) is the same constant which appears in the slow recurrence condition.
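Read as a finite check, Definition 2.1 can be sketched as follows. The function name and the input convention are assumptions made for illustration: the list holds the distances \({\text {dist}}(\xi _{k,l}(a),Crit_a)\) for \(k = 1, \ldots , n\), and how they are computed is left to the caller.

```python
import math

def satisfies_basic_assumption(distances, K_b, alpha):
    """Check dist(xi_k, Crit) >= K_b * exp(-2*alpha*k) for k = 1, ..., n.

    `distances[k-1]` is assumed to hold dist(xi_{k,l}(a), Crit_a); starting
    at k = 1 is a convention of this sketch.
    """
    return all(d >= K_b * math.exp(-2.0 * alpha * k)
               for k, d in enumerate(distances, start=1))

ok = satisfies_basic_assumption([1.0, 0.5, 0.2], K_b=1.0, alpha=0.5)
too_close = satisfies_basic_assumption([1.0, 0.1], K_b=1.0, alpha=0.5)
```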

Obviously the starting map \(f_0\) satisfies the basic assumption for all times for any \(\alpha > 0\). From now on, fix \(\alpha > 0\) to be at most \(\min (\gamma _0, \gamma _H) (1-\tau ) /(400K^2 \Gamma )\), where \(\gamma _0 = \underline{\gamma } > 0\) is the (lower) Lyapunov exponent appearing in the Collet–Eckmann condition for \(f_0\), \(\Gamma = \sup _{a \in (-\varepsilon ,\varepsilon ), z \in \hat{{\mathbb C}}} \log |Df_a(z)|\), K is the maximal degree of the critical points, \(\gamma _H\) is the exponent from Lemma 3.1, and where \(0< \tau < 1\) is fixed (this is used in Sect. 8). We assume for the starting map \(f_0\), that there is some constant \(C_0 > 0\) such that,

$$\begin{aligned} |Df^n(v_l(0))| \ge C_0 e^{\gamma _0 n}, \quad \text { for all }n \ge 0. \end{aligned}$$

We will construct a set of parameters around \(a=0\) which also satisfies both this basic assumption for this specific \(\alpha \) and the Collet–Eckmann condition for possibly slightly smaller Lyapunov exponents \(\gamma \). Since we fix \(\alpha > 0\) we only speak of the basic assumption, without mentioning the exponent in the future.

We will make an induction argument based on the fact that we have some “basic” Lyapunov exponent \(\gamma _B > 0\). This is typically smaller than the original Lyapunov exponent \(\gamma _0\) for \(f_0\). During the induction arguments we also have to allow the Lyapunov exponent to decrease down to some certain value, a fraction of \(\gamma _B\), a so called “intermediate exponent” \(\gamma _I < \gamma _B\), which is required for most lemmas to work. We will also define the number \(\gamma _B\) later, but it is slightly smaller than the minimum of \(\gamma _H\) in Lemma 3.1 and \(\gamma _0\) from the starting function \(f_0\).

We write \(A \sim _{\kappa } B \), where \(\kappa \ge 1\) if

$$\begin{aligned} \frac{1}{\kappa } A \le B \le \kappa A. \end{aligned}$$

We write \(A \sim B\) to say \(A \sim _{\kappa } B\) for some constant \(\kappa \ge 1\). In several inequalities we use C several times for possibly different constants, when it is clear that these constants do not depend on the dynamics, i.e. the number of iterations.

2.1 Bound and free periods

In this section we define some fundamental concepts which will be used throughout the paper. Many of them are direct analogues of corresponding definitions in [4, 5], see also [1]. We speak of a return of the sequence \(\xi _{n,l}(\omega )\) into U or \(U'\), when we mean that \(\xi _{n,l}(\omega ) \cap U \ne \emptyset \) or \(\xi _{n,l}(\omega ) \cap U' \ne \emptyset \) respectively. We also speak of returns into U or \(U'\) of the sequence \(\xi _{n,l}(a)\) for a single parameter a, and this means simply that \(\xi _{n,l}(a) \in U\) or \(\xi _{n,l}(a) \in U'\) respectively. Returns into the annular neighbourhoods \(U' {\setminus }U\), i.e. when \(\xi _{n,l}(\omega ) \cap U' \ne \emptyset \) but \(\xi _{n,l}(\omega ) \cap U = \emptyset \), are called pseudo-returns. Sometimes we drop the index l and write only \(\xi _n(a) = \xi _{n,l}(a)\) for some critical point \(c_l(a)\). We will also consider so-called deep returns, which are returns into a smaller neighbourhood \(U^2 = \cup _l B(c_l,\delta ^2) \subset U\) of the critical points. These deep returns will be used only at the end of the paper, in Sect. 8.

The point is that when a return occurs, so that for example \(\xi _n(a) \in U\), then the orbit follows the original orbit, i.e. \(\xi _{n+j}(a)\) stays close to \(\xi _j(a)\) for the first j. This is the so called bound period, which can be defined both for points \(\xi _n(a)\) and curves \(\xi _n(\omega )\) (precise definitions below). After the bound period ends, the free period starts until the next return, and so on. During the bound period, due to the expansion of the derivative of the original early orbit, we can show expansion of the derivative also during the bound period (with a certain loss due to the actual return, which is close to the critical set). Because of this, we will not consider returns during the bound period (bound returns), but only consider returns after the free period (free returns). When we speak of a return, we mean a free return unless otherwise stated. This is very similar to earlier constructions in [1, 4, 5]. During the free period we will show a uniform expansion of the derivative. The result is the same as in the old traditions but the techniques stem from quite different sources in this new situation of a more general CE-map. The number \(\beta > 0\) below is related to \(\alpha \) in the basic approach rate condition. There is quite a lot of freedom to choose \(\beta \), but let us set \(\beta = \alpha \), so that we can use the same exponent.

Definition 2.2

(Pointwise bound period) Let \(\beta > 0\). Let \(\xi _{n,l}(a) \in U_k' \subset U'\) be a return. Then we define the bound period for this return as the indices \(j > 0\) for which the inequality

$$\begin{aligned} |\xi _{n+j,l}(a) - \xi _{j,k}(a)| \le e^{-\beta j} {\text {dist}}\left( \xi _{j,k}(a), Crit_a\right) , \end{aligned}$$

holds. The largest number \(p > 0\) for which the inequality holds is called the length of the bound period.
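The pointwise bound period can be sketched as a simple loop. The code below is a hedged illustration with hypothetical names: it reads the length p as the largest index such that the inequality holds for all \(1 \le j \le p\), and it assumes that the two orbits and the distances to the critical set have already been computed and stored in lists indexed by j (index 0 is ignored).

```python
import math

def bound_period_length(returning_orbit, critical_orbit, crit_dist, beta):
    """Largest p with |xi_{n+j} - xi_j| <= exp(-beta*j) * dist(xi_j, Crit)
    for all 1 <= j <= p (0 if the inequality already fails at j = 1).

    returning_orbit[j] stands for xi_{n+j,l}(a), critical_orbit[j] for
    xi_{j,k}(a) and crit_dist[j] for dist(xi_{j,k}(a), Crit_a).
    Entries may be real or complex numbers.
    """
    p = 0
    last = min(len(returning_orbit), len(critical_orbit), len(crit_dist))
    for j in range(1, last):
        gap = abs(returning_orbit[j] - critical_orbit[j])
        if gap > math.exp(-beta * j) * crit_dist[j]:
            break
        p = j
    return p

p = bound_period_length([0.0, 1.001, 2.01, 3.9],
                        [0.0, 1.0, 2.0, 3.0],
                        [0.0, 1.0, 1.0, 1.0],
                        beta=0.1)
```

In this made-up data the orbits separate at \(j = 3\), so the sketch reports a bound period of length 2.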

To define the bound period for an interval, we consider a return \(\xi _{n,l}(\omega )\) into U. If

$$\begin{aligned} {\text {diam}}(\xi _{n,l}(\omega )) \ge \frac{1}{2} {\text {dist}}(\xi _{n,l}(\omega ),Crit_{\omega })/ \left( \log \left( {\text {dist}}(\xi _{n,l}(\omega ),Crit_{\omega })\right) \right) ^2, \end{aligned}$$
(2.1)

then we say that the return is essential. Otherwise it is inessential. With \(\tilde{r} = -\log ({\text {dist}}(\xi _{n,l}(\omega ),Crit_{\omega }))\), the condition for an essential return takes the slightly more convenient form \({\text {diam}}(\xi _{n,l}(\omega )) \ge (1/2) e^{-\tilde{r}}/\tilde{r}^2\). Actually we will partition the parameter intervals (explained later) so that they become so-called partition elements, defined as follows.

Definition 2.3

For a given \(S > 0\), we call parameter intervals \(\omega \) satisfying the inequality

$$\begin{aligned} {\text {diam}}(\xi _{k,l}(\omega )) \le \left\{ \begin{array}{cc} \frac{{\text {dist}}(\xi _{k,l}(\omega ),Crit_{\omega })}{(\log ({\text {dist}}(\xi _{k,l}(\omega ),Crit_{\omega })))^2}, &{} \text {if } \xi _{k,l}(\omega ) \cap U \ne \emptyset , \\ S, &{} \text {if } \xi _{k,l}(\omega ) \cap U = \emptyset , \end{array} \right. \end{aligned}$$

for all \(k \le n\), partition elements at time n.

We do not speak of essential or inessential returns for pseudo-returns.
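The two size conditions above, (2.1) and the bound in Definition 2.3, can be written as small predicates. This is a sketch with hypothetical function names, assuming that the diameter of \(\xi _{k,l}(\omega )\) and its distance to \(Crit_{\omega }\) are given as numbers with \(0< {\text {dist}} < 1\).

```python
import math

def is_essential(diam, dist):
    """Condition (2.1) with r = -log(dist): diam >= (1/2) * exp(-r) / r**2."""
    r_tilde = -math.log(dist)
    return diam >= 0.5 * math.exp(-r_tilde) / r_tilde ** 2

def partition_bound(dist, meets_U, S):
    """Upper bound on diam(xi_k(omega)) required of a partition element."""
    if meets_U:
        return dist / math.log(dist) ** 2
    return S

dist = math.exp(-10.0)
essential = is_essential(1e-6, dist)       # threshold is about 2.27e-7
inessential = is_essential(1e-8, dist)
```

Note that the essential threshold is half of the partition bound, so a return of a partition element is essential when its diameter lies in the upper half of the allowed range.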

Definition 2.4

(Bound period for an interval, essential returns or pseudo-returns) Let \(\xi _{n,l}(\omega ) \cap U_k' \ne \emptyset \), (\(U_k' \subset U'\)) be an essential return or a pseudo-return. Then we define the bound period for this return as the indices \(j > 0\) for which the inequality (recall \(f_a(z) = f(z,a)\)),

$$\begin{aligned} {\text {dist}}\left( f^j(z,a), \xi _{j,k}(b)\right) \le e^{-\beta j} {\text {dist}}\left( \xi _{j,k}(b), Crit_b\right) , \end{aligned}$$

holds for all \(a, b \in \omega \), and all \(z \in \xi _{n,l}(\omega ) \).

If the return \(\xi _{n,l}(\omega )\) into \(U_k\) is inessential we will consider a host curve as follows. Draw a straight line segment \(L'\) through the end points of \(\xi _{n,l}(\omega )\) with length equal to \(e^{-r}/r^2\) where

$$\begin{aligned} r = \lceil -\log \left( {\text {dist}}(\xi _{n,l}(\omega ),Crit_{\omega })\right) - 1/2 \rceil . \end{aligned}$$

To make it well defined, let us say that the line segment \(L'\) shall be symmetric with respect to the end points of \(\omega \). Let L be the part of \(L'\) with the central part between the end points deleted. The host curve for this return is then \(L \cup \xi _{n,l}(\omega )\).
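For orientation, the integer r and the length of \(L'\) can be computed as follows. This is a minimal sketch with a hypothetical function name, assuming the distance to the critical set is given and lies in (0, 1). The formula for r amounts to rounding \(-\log ({\text {dist}})\) to the nearest integer.

```python
import math

def host_segment(dist):
    """Return (r, length of L') with r = ceil(-log(dist) - 1/2)
    and length exp(-r) / r**2, as in the construction above."""
    r = math.ceil(-math.log(dist) - 0.5)
    return r, math.exp(-r) / r ** 2

r_low, length_low = host_segment(math.exp(-10.2))
r_high, _ = host_segment(math.exp(-10.7))
```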

Definition 2.5

(Bound period for an interval, inessential returns) Let \(\xi _{n,l}(\omega ) \cap U_k \ne \emptyset \) be an inessential return. Then we define the bound period for this return as the indices \(j > 0\) for which the inequality

$$\begin{aligned} {\text {dist}}\left( f^j(z,a), \xi _{j,k}(b)\right) \le e^{-\beta j} {\text {dist}}\left( \xi _{j,k}(b), Crit_b\right) , \end{aligned}$$

holds for all \(a, b \in \omega \), and all \(z \in L \cup \xi _{n,l}(\omega )\).

It will be clear later that the dependence on the parameter in these definitions is inessential.

3 Expansion during the free period

During the free period we want to show that the derivative of \(f^n(z)\) grows exponentially as long as \(f^j(z)\) stays outside U for \(j=0,\ldots ,n-1\). In earlier papers, this was settled via the orbifold metric for postcritically finite (rational) maps, given that the postcritical set consists of at least 3 points. Here we have to use different techniques to build a uniform expansion using the second Collet–Eckmann condition discussed in [17]. In Proposition 1 of that paper, it is stated that the second Collet–Eckmann condition is satisfied for all critical points of maximal multiplicity. However, with the slow recurrence condition, this statement holds for every critical point in the Julia set [8].

Without going through the whole construction, we refer to [17, 18] for the details. The main idea is based on three types of iterated preimages of shrinking neighbourhoods of a given point z, which in our case is a critical point c in the Julia set (actually we assume that \(J(f) = \hat{{\mathbb C}}\)). This critical point is assumed not to have any critical points in its backward orbit. The type 2 and type 3 orbits have uniform expansion automatically by construction, see Lemmas 3 and 4 in [17]. The type 1 preimages connect two critical points in the backward orbit in the following way: one has a ball B(c, r) and considers preimages \(U_k\) which are sequences of components of \(f^{-k}(B(c,r_k))\) of shrinking neighbourhoods \(B(c,r_k)\), where \(r_k \le r\) is decreasing and \(\lim \limits _{k \rightarrow \infty } r_k \ge r/2\). For a type 1 orbit one has a critical point \(c_1 \in \partial U_n\) for some n, and no critical points in \(\overline{U_k}\) for \(0< k < n\). The length of this type 1 orbit is n. Due to the difference in multiplicity of the critical points, type 1 orbits do not ensure immediate uniform expansion. This is resolved by looking at preimages of the type \(\ldots 111113\), i.e., a sequence of 1s followed by a type 3 orbit. Such iterated preimages have uniform expansion (see p. 83 in [17]).

What can happen is that the induction starts (from the right) with a sequence of 1s only. Then it may happen that we do not have the desired expansion. Looking at such a block of 1s in the beginning of the sequence, we see from the calculations on p. 83 [17] that at a preimage \(y = f^{-k}(c)\) we can estimate the growth of the derivative as follows. Let \(\mu _{max}\) be the maximal multiplicity of the critical points and \(\mu \) the multiplicity of c. The number d below equals 0, because it is the distance from the centre of the ball B(c, r) to the critical point c. For some \(Q > 1\) we then have, verbatim,

$$\begin{aligned} |Df^k(y)|^{\mu _{max}} \ge Q^k \frac{r^{\mu _{max}-1}}{(r+d)^{\mu -1}} = Q^k r^{\mu _{max}-\mu }. \end{aligned}$$
(3.1)

So if \(\mu < \mu _{max}\) then this expansion is not uniform. By assumption there is a critical point \(c_1\) on the boundary of the shrinking neighbourhood of \(f^{-k}(B(c,r))\), for some k, i.e. \(c_1 \in \partial U\), where \(U = \text {comp}( f^{-k}(B(c,r_k)))\) where \(r/2 \le r_k \le r\). However, by the slow recurrence condition, we have \({\text {dist}}(f^k(c_1),c) \ge e^{-\alpha k}\) for some small \(\alpha > 0\). This means that \(e^{-k \alpha } \le r\). Since \(\alpha > 0\) can be chosen as small as we like, (3.1) becomes, for some \(Q_1 > 1\) possibly slightly smaller than Q,

$$\begin{aligned} |Df^k(y)|^{\mu _{max}} \ge Q^k e^{-(\mu _{max} -\mu ) \alpha k} = Q_1^k. \end{aligned}$$
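To spell out this step, using only the quantities above: since \(\mu \le \mu _{max}\) and \(r \ge e^{-\alpha k}\), the right-hand side of (3.1) satisfies

$$\begin{aligned} Q^k r^{\mu _{max}-\mu } \ge Q^k e^{-(\mu _{max}-\mu )\alpha k} = \left( Q e^{-(\mu _{max}-\mu )\alpha }\right) ^k, \end{aligned}$$

so one may take \(Q_1 = Q e^{-(\mu _{max}-\mu )\alpha }\). For \(\mu < \mu _{max}\) this gives \(Q_1 > 1\) as soon as \(\alpha < \log Q/(\mu _{max}-\mu )\).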

Lemma 3.1

If \(\varepsilon > 0\) is sufficiently small, then there exists a neighbourhood \(U'\) of the critical points such that the following holds for \(a \in (-\varepsilon ,\varepsilon )\). Let \(U \subset U'\). There exist \(\lambda > 1\), \(C' > 0\) and \(C > 0\), where \(C'\) depends on \(\delta '\) but not on \(\delta \) such that: If \(f_a^k(z) \notin U\) for \(k=0,\ldots , n-1\), then

$$\begin{aligned} |Df_a^n (z)| \ge C \lambda ^n. \end{aligned}$$

For each \(0 < q \le 1\) there exists a neighbourhood of the critical points \(\hat{U} \subset U'\) such that for any neighbourhood of the critical points \(U_1 \subset U \subset \hat{U}\) satisfying \({\text {diam}}(U_{1,j}) \ge q {\text {diam}}(U_j)\), where \(U_{1,j} \subset U_j\) are components of \(U_1\) and U respectively, we have the following. If \(z \notin U_1\), \(f_a^k(z) \notin U\) for \(k=1,\ldots , n-1\), and \(f_a^n(z) \in U\) then

$$\begin{aligned} |Df_a^n (z)| \ge C' \lambda ^n, \end{aligned}$$

(where \(C'\) only depends on \(U'\)). If \(q=1\) we can set \(U_1=U\) and \(\hat{U} = U'\).

Proof

Let us first consider the unperturbed map \(f_0\). By the argument before the lemma, it follows from [17] that the Collet–Eckmann condition implies the second Collet–Eckmann condition, for all critical points. Looking at any iterated preimage \(z=f^{-n}(c)\) to a critical point c, the second Collet–Eckmann condition implies

$$\begin{aligned} |Df^n(z)| \ge C_2 \lambda _2^n, \end{aligned}$$

for some \(\lambda _2 > 1\) and a constant \(C_2 > 0\). Let \(0< \kappa < 1\) and \(N > 0\) (we give conditions on these constants below). We follow partially the idea of [35] (p. 40–41). Let \(U'\) be a union of disks \(U_j'\) around the critical points with radius \(\delta '\), so that for any iterated preimage \(f^{-k}(U_j')\) of a component of \(U'\), we have

$$\begin{aligned} {\text {diam}}\left( f^{-k}(U_j')\right) \le \kappa \cdot {\text {dist}}\left( f^{-k}(U_j'),Crit_0\right) , \quad \text { for all }k \le N. \end{aligned}$$
(3.2)

This implies that we have distortion inside \(f^{-k}(U_j')\), that is, for any choice of z, w in the same component of \(f^{-k}(U_j')\) we have

$$\begin{aligned} \frac{|Df(z)|}{|Df(w)|} \le C_3, \end{aligned}$$
(3.3)

where \(C_3= C_3(\kappa ) \rightarrow 1\), as \(\kappa \rightarrow 0\).

If (3.2) is not valid, then we can use another estimate as follows. For any disk D of radius at most \(\delta ' > 0\) there is a constant \(C_4\) such that

$$\begin{aligned} |Df(z)| {\text {diam}}(D') \le C_4 {\text {diam}}(D), \text { for all }z \in D', \end{aligned}$$

where \(D'\) is a component of \(f^{-1}(D)\). Here \(C_4\) only depends on \(\delta '\).

Suppose now that \(N > 0\) is the largest time where (3.2) is valid. If we put \(W_k' = f^{-k}(U_j')\) and \(z_k \in W_k'\) the corresponding preimage of \(c_j \in U_j'\), and if N is large enough, then

$$\begin{aligned} {\text {diam}}(U_j')&\ge C_3^{-(N-1)}|Df^{N}(z_{N-1})| {\text {diam}}\left( W_{N-1}'\right) \\&\ge C_4^{-1} C_3^{-(N-1)} |Df^{N}(z_{N})| {\text {diam}}(W_{N}') \ge \lambda _1^{N} {\text {diam}}(W_{N}'), \end{aligned}$$
(3.4)

for some \(\lambda _1 > 1\). Now let N be so large that \(\lambda _1^N \ge 10/\kappa \). So from now on \(U'\) and N are fixed.

Now suppose that \(U_1 \subset U \subset \hat{U} \subset U'\), and let \(U_1 = \cup _j B(c_j,\delta _1)\), \(U = \cup _j B(c_j,\delta )\), \(\hat{U} = \cup _j B(c_j,\hat{\delta })\) i.e., \(\delta _1 \le \delta \le \hat{\delta } \le \delta '\). Suppose that \(z \notin U_1\), \(f^k(z) \notin U\) for all \(k=1, \ldots , n-1\) and \(f^n(z) \in U\). Let now \(n_0 > 0\) be the first time for which (3.2) is not valid with \(U_j'\) replaced by \(U_j\), the components of U. Let \(W_k = f^{-k}(U_j)\) be the corresponding preimages of \(c_j \in U_j\). By the definition of \(n_0\),

$$\begin{aligned} {\text {dist}}(W_{n_0},Crit) \le (1/\kappa ) {\text {diam}}(W_{n_0}) \le (1/\kappa ) \lambda _1^{-n_0} {\text {diam}}(U_j). \end{aligned}$$
(3.5)

Let us now consider the condition

$$\begin{aligned} (1/\kappa ) \lambda _1^{-n_0} \le \frac{q}{10} \le \frac{{\text {diam}}\left( U_{1,j}\right) }{10 {\text {diam}}\left( U_j\right) }, \end{aligned}$$
(3.6)

where \(U_{1,j} \subset U_j\) is the corresponding component of \(U_1\) inside \(U_j\) and the second inequality is valid by assumption. We discuss this condition soon. It implies that

$$\begin{aligned} {\text {dist}}(W_{n_0},Crit)&\le \frac{{\text {diam}}(U_{1,j})}{10 {\text {diam}}(U_j)} {\text {diam}}(U_j) = \frac{1}{10}{\text {diam}}(U_{1,j}), \quad \text { and } \end{aligned}$$
(3.7)
$$\begin{aligned} {\text {diam}}(W_{n_0})&\le \lambda _1^{-n_0} {\text {diam}}(U_j) \le \frac{\kappa }{10}{\text {diam}}(U_{1,j}). \end{aligned}$$
(3.8)

Clearly, this implies that \(W_{n_0} \subset U_{1,j} \subset U_1\). If \(n_0 \le n\), this was not allowed, since \(z_k \notin U_1\), \(1 \le k \le n\). Hence (3.2) is valid all the time up until n. Therefore, if \(w \in f_0^{-n}(U)\), we have, by the distortion estimate (3.3),

$$\begin{aligned} |Df_0^n(w)| \ge |Df_a^n(z)| C_3^{-n} \ge C_2 \lambda _1^n, \end{aligned}$$
(3.9)

where z is the preimage of the corresponding critical point and \(C_2\) is the constant from the second Collet–Eckmann condition, and hence does not depend on U.

Let us now discuss the condition (3.6). It implies that

$$\begin{aligned} n_0 \log \lambda _1 \ge \log (1/q) - \log \kappa + \log 10 \ge \Delta _1 - \Delta - \log \kappa + \log 10. \end{aligned}$$
(3.10)

Hence this basically forces \(n_0-1\), the time when (3.2) is valid, to be bounded below by the difference \(\Delta _1 - \Delta \). Let now \(\hat{N}\) be the largest integer such that (3.2) is valid with \(U_j'\) replaced by \(\hat{U}_j\), the components of \(\hat{U}\). Then we can say that \(\hat{U}\) depends on q in the following sense. For a fixed q we choose \(\hat{U}\) so that the corresponding \(\hat{N}\) satisfies (3.10), with \(n_0\) replaced by \(\hat{N}\). Clearly, if \(q=1\) we can put \(\hat{U} = U'\).
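Taking logarithms in the first inequality of (3.6) gives the first inequality of (3.10), which can be turned into an explicit lower bound for \(n_0\). The following sketch (hypothetical function name, sample values chosen only for illustration) computes the smallest admissible integer, assuming \(0 < q \le 1\), \(0< \kappa < 1\) and \(\lambda _1 > 1\).

```python
import math

def minimal_n0(q, kappa, lambda_1):
    """Smallest integer n0 with
    n0 * log(lambda_1) >= log(1/q) - log(kappa) + log(10),
    i.e. with (1/kappa) * lambda_1**(-n0) <= q/10.
    """
    bound = math.log(1.0 / q) - math.log(kappa) + math.log(10.0)
    return math.ceil(bound / math.log(lambda_1))

n0 = minimal_n0(q=0.5, kappa=0.25, lambda_1=2.0)
```

With these sample values the bound is \(\log 80/\log 2 \approx 6.32\), so the smallest admissible integer is 7. A smaller q, meaning a larger gap between \(U_1\) and U, forces a larger \(n_0\).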

Now we turn to the situation when \(f_0^k(z) \notin U\) for \(k=0,\ldots ,n\). We can use the same estimates as before. Choose \(r_1 \le \delta \) and put \(V_0 = B(f_0^n(z),r_1)\) and \(V_k = f^{-k}(V_0)\), the corresponding component containing \(f_0^{n-k}(z)\). Now let \(m_0\) be maximal such that (3.2) is valid for \(k \le m_0 -1\), with \(U_j'\) replaced by \(V_0\). Suppose for the moment that \(m_0 \le n\). By the Main Theorem in [35] the ExpShrink condition is satisfied. Hence, there is some \(\lambda _0 > 1\) so that

$$\begin{aligned} {\text {diam}}(V_k) \le \lambda _0^{-k} \end{aligned}$$

if we choose \(r_1\) small enough (smaller than some fixed r from the theorem). We may also choose \(r_1\) so small that \(m_0\) satisfies \((1/\kappa ) \lambda _0^{-m_0} < \delta /10\). Since (3.2) is not valid for \(k=m_0\) we have

$$\begin{aligned} {\text {dist}}(V_{m_0},Crit)&\le \frac{1}{\kappa } {\text {diam}}(V_{m_0}) \le \frac{1}{\kappa } \lambda _0^{-m_0}< \frac{\delta }{10}, \\ {\text {diam}}(V_{m_0})&\le \lambda _0^{-m_0} < \frac{\delta }{10}. \end{aligned}$$

This clearly implies that \(V_{m_0} \subset U\) which is impossible. Hence (3.2) is valid all the time up to n. By possibly diminishing \(\kappa \), we get bounded distortion inside \(V_{n}\), and hence there is some \(\lambda _3 > 1\) (which depends on \(\kappa \)), such that

$$\begin{aligned} |Df_0^n(z)| \ge C \lambda _3^n, \end{aligned}$$
(3.11)

for some constant \(C >0\) that depends on U. We may assume that \(\lambda _3 > \lambda _1\), otherwise diminish \(\lambda _1\) so that this holds. This proves the first statement of the lemma.

Choose \(N_1 >0\) so that outside \(U_1\) the orbits \(f_a^k(z)\) and \(f^k(z)\) follow each other up to \(N_1\), i.e. for \(k \le N_1\), and so that \(|Df_a^{N_1}(z)| \ge C \tilde{\lambda }_3^{N_1} \ge \tilde{\lambda }_1^{N_1}\), for all \(a \in (-\varepsilon ,\varepsilon )\). Here \(\tilde{\lambda }_3 > 1\) comes from a perturbed version of (3.11). Since also (3.9) is valid for small perturbations if we bound the number of iterations by \(N_1\), we let \(\tilde{\lambda }_1 > 1\) be the corresponding perturbed version of \(\lambda _1 > 1\). Let us write n as \(n = qN_1 + r\), where \(r < N_1\). Then, if we assume that \(z \notin U_1\), \(f_a^k(z) \notin U\) for \(k=1,\ldots , n-1\) and \(f_a^n(z) \in U\) we get

$$\begin{aligned} |Df_a^n(z)| = |Df_a^r\left( f^{qN_1}(z)\right) | |Df_a^{N_1}\left( f^{(q-1)N_1}(z)\right) | \ldots |Df_a^{N_1}(z)| \ge C_2 \tilde{\lambda }_1^n, \end{aligned}$$
(3.12)

where we used (3.9) for \(|Df_0^r(f^{q N_1}(z))| \ge C_2 \lambda _1^r\), so that \(|Df_a^r(f^{q N_1}(z))| \ge C_2 \tilde{\lambda }_1^r\). The second statement of the lemma follows with \(\lambda = \tilde{\lambda }_1\).\(\square \)

The classical outside expansion lemma is obtained by setting \(U_1 = U\) in the above lemma, i.e. \(q=1\). From [17], it can be seen that the Lyapunov exponent from the second Collet–Eckmann condition is inherited from the exponent from the ordinary Collet–Eckmann condition. Hence the uniform “outside exponent” \(\log \tilde{\lambda }_1\) is close to the Lyapunov exponent for the starting map \(f_0\) (but likely lower than it), depending on the neighbourhood \(U'\). Let us set \(\gamma _H = \log \tilde{\lambda }_1\).

4 Parameter-phase distortion

One fundamental result we need is the comparison between space and parameter-derivatives. This has been proved in [4, 5] and many other papers. But for our purposes we need a stronger form of this result due to Levin. We use a normalised space, described in [25], of maps in \({\mathcal R}_d\) as follows. We consider the set \(\Lambda _{d,\overline{p'}} \subset {\mathcal R}_d\) of all rational maps of degree d with exactly \(p'\) distinct critical points \(c_j\) with corresponding multiplicities \(m_j\), \(1 \le j \le p'\), where \(\overline{p'} = \{m_1, \ldots , m_{p'} \}\), normalised so that every map \(f \in \Lambda _{d,\overline{p'}}\) has the form

$$\begin{aligned} f(z) = \sigma z + b + \frac{P(z)}{Q(z)}, \end{aligned}$$

where \(\sigma \ne 0\), and \(\deg (P) \le d-2\), \(\deg (Q) \le d-1\) and where P and Q have no common zeros. By Proposition 8 in [25], every \(f \in {\mathcal R}_d\) is conjugate by a Möbius transformation to some \(\tilde{f} \in \Lambda _{d,\overline{p'}}\). So we can view \({\mathcal R}_d\) as a union of sets of the type \(\Lambda _{d,\overline{p'}}\) up to equivalence by Möbius transformations. Note that in every such set, critical points do not split.

We assume that the real analytic family \(f_a \in \Lambda _{d,\overline{p'}}\), \(a \in (-\varepsilon ,\varepsilon )\) around \(f_0\) has a tangent vector \(\overline{u} \ne \overline{0}\). Hence \(f_a(z) = f_0(z) + a u(z) + {\mathcal O}(a^2)\) for some \(u \ne 0\). Let us write \(\xi _{n,l}(\overline{a}) = f_{\overline{a}}^n(c_l(\overline{a}))\), where \(\overline{a} = (a_1,a_2,\ldots ,a_{p'})\) is a parameterisation of the parameter space \(\Lambda _{d,\overline{p'}}\) around \(f=f_0\), where \(f_0\) corresponds to \((a_1,a_2,\ldots ,a_{p'}) = (0,0,\ldots ,0)\) and where \(c_j=c_j(a_1,a_2,\ldots ,a_{p'})\). In [25] and [26], it is proven that the matrix L formed by the numbers

$$\begin{aligned} L(c_l,a_k) = \lim \limits _{n\rightarrow \infty } \frac{\dfrac{\partial \xi _{n,l}}{\partial a_k}(\overline{0})}{Df_0^{n-1}(f_0(c_l))} \end{aligned}$$

is non-degenerate. Let \(\overline{u} = (u_1, u_2, \ldots , u_{p'})\) be a tangent vector of unit length, i.e., a vector in \({\mathbb P}( \Lambda _{d,\overline{p'}} )\), and suppose that this is tangent to the family \(f_a\) at \(a=0\). Then for almost all directions, i.e. tangent vectors, we have that all entries of \(L \cdot \overline{u}\) are non-zero, since the set of directions for which this fails is a finite union of sets of co-dimension 1 in \({\mathbb P}(\Lambda _{d,\overline{p'}})\). This means precisely that, for almost all directions, the limit

$$\begin{aligned} \lim \limits _{n \rightarrow \infty } \frac{\xi _{n,l}'(0)}{Df_0^{n-1}(f_0(c_l))} = \sum _k u_k L(c_l,a_k) \end{aligned}$$

is non-zero for every l, where we mean \(\xi _{n,l}'(0) = \frac{\,\textrm{d}}{\,\textrm{d}a} \xi _{n,l}(a \overline{u}) \bigr |_{a=0}\). We thus say that the real analytic family \(f_a\) around \(f_0\) is non-degenerate if its tangent vector satisfies this condition (in addition to the fact that the critical points move analytically).

We summarise this result as a proposition below, which is a direct consequence of Theorem 1 combined with Corollary 2.1, part (8), in [26]. It is a generalisation of a corresponding result in [25] Theorem 1.1.

Proposition 4.1

(Levin) Suppose that f is a rational map with summable critical points without parabolic cycles such that \(J(f) = \hat{{\mathbb C}}\), and suppose that \(f_a\) is a non-degenerate real analytic family around \(f_0\), \(a \in (-\varepsilon ,\varepsilon )\). Then for each critical point \(c_l(a)\), the limit

$$\begin{aligned} \lim \limits _{n \rightarrow \infty } \frac{\xi _{n,l}'(0)}{Df^{n-1}(v_l)} = L_l \end{aligned}$$
(4.1)

exists and is different from 0 and \(\infty \).

Indeed, a CE-map has all its critical values summable so the above proposition can be used. We also note that by [18] any Collet–Eckmann map different from a flexible Lattès map carries no invariant line field on its Julia set. We now use this result to make small perturbations.

Lemma 4.2

Assume that \(f_0\) satisfies the CE-condition with exponent \(\gamma \). For any \(0< \gamma _1 < \gamma \) and \(0< q < 1\), there exist \(N > 0\) and \(\varepsilon > 0\) such that if \(f_a\), \(a \in (-\varepsilon ,\varepsilon )\) satisfies the CE-condition up to time \(m \ge N\) with exponent \(\gamma _1\), we have

$$\begin{aligned} \biggl | \frac{\xi _{m,l}'(a)}{Df_a^{m-1}(v_l(a))} - L_l \biggr | \le q |L_l|, \end{aligned}$$

for every l.

Proof

According to Theorem 1 in [26], we have for \(a=0\),

$$\begin{aligned} \lim \limits _{n \rightarrow \infty } \frac{\xi _{n,l}'(0)}{Df^{n-1}(v_l(0))} = \sum _{n=0}^{\infty } \frac{\partial _a f_0(\xi _{n,l}(0))}{Df_0^n(v_l(0))} = L_l. \end{aligned}$$

Let us put \(\xi _{m,l}(a) = \xi _m(a)\) and \(L_l=L\). The reader may verify that for small perturbations a close to 0,

$$\begin{aligned} \frac{\xi _m'(a)}{Df_a^{m-1}(v_l(a))} = \sum _{n=0}^{m-1} \frac{\partial _a f_a(\xi _n(a))}{Df_a^n(v_l(a))}. \end{aligned}$$
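A sketch of this verification, using only the chain rule and \(Df_a(c_l(a)) = 0\). Here we write \(\xi _0(a) = c_l(a)\), so that \(\xi _1(a) = v_l(a)\), and we assume \(Df_a^n(v_l(a)) \ne 0\) for \(n \le m-1\), which holds under the CE-condition up to time m. Differentiating \(\xi _{n+1}(a) = f_a(\xi _n(a))\) and dividing by \(Df_a^n(v_l(a)) = Df_a(\xi _n(a)) Df_a^{n-1}(v_l(a))\) gives

$$\begin{aligned} \xi _{n+1}'(a)&= \partial _a f_a(\xi _n(a)) + Df_a(\xi _n(a))\, \xi _n'(a), \qquad \xi _1'(a) = \partial _a f_a(\xi _0(a)), \\ \frac{\xi _{n+1}'(a)}{Df_a^{n}(v_l(a))}&= \frac{\partial _a f_a(\xi _n(a))}{Df_a^{n}(v_l(a))} + \frac{\xi _n'(a)}{Df_a^{n-1}(v_l(a))}, \qquad n \ge 1. \end{aligned}$$

Iterating the second identity from \(n = m-1\) down to \(n=1\), and using \(Df_a^0 = 1\) for the initial term, yields the stated sum.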

We have that \(|\partial _a f_a| = |\partial _a f(z,a)|\) is bounded by some constant \(B > 0\). We choose \(N>0\) so large that

$$\begin{aligned} \sum _{n=N+1}^{\infty } \frac{B}{Ce^{\gamma _1 n}} \le \min _{l} (q |L_l|/4). \end{aligned}$$
(4.2)
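Since the left-hand side of (4.2) is a geometric series, an admissible N can be read off explicitly. As a sketch, with C the constant appearing in (4.2):

$$\begin{aligned} \sum _{n=N+1}^{\infty } \frac{B}{Ce^{\gamma _1 n}} = \frac{B}{C} \cdot \frac{e^{-\gamma _1 (N+1)}}{1 - e^{-\gamma _1}}, \end{aligned}$$

so (4.2) holds as soon as \(e^{-\gamma _1 (N+1)} \le \frac{C(1 - e^{-\gamma _1})}{4B}\, q \min _l |L_l|\).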

By continuity, there exists some \(\varepsilon > 0\) such that if \(a \in (-\varepsilon ,\varepsilon )\) then

$$\begin{aligned} \biggl | \sum _{n=0}^{N} \frac{\partial _a f_a(\xi _{n,l}(a))}{Df_a^n(v_l(a))} - L_l \biggr | \le q |L_l|/2. \end{aligned}$$

Since \(f_a\) is assumed to satisfy the CE-condition with exponent \(\gamma _1\), by (4.2) we get that the tail satisfies

$$\begin{aligned} \biggl | \sum _{n=N+1}^{m} \frac{\partial _a f_a(\xi _{n,l}(a))}{Df_a^n(v_l(a))} \biggr | \le q |L_l|/2, \end{aligned}$$

for all \(a \in (-\varepsilon ,\varepsilon )\) and all \(m \ge N\). This finishes the proof of the lemma. \(\square \)

When we use this lemma, we choose N and \(\varepsilon \) so that it is valid with \(\gamma _1 = \gamma _L = (1/(6K)) \min (\gamma _0,\gamma _H)(1-\tau )\), where \(0< q < 1\) is small and \(\tau \) is a constant with \(0< \tau < 1\).

5 Weak parameter dependence and weak distortion

We will later see that the expansion of the space derivative induces a great deal of parameter independence. This follows a posteriori from the Main Distortion Lemma 7.3 and the Starting Lemma 7.4, but to start we now prove a weaker statement.

Lemma 5.1

Let N be as in Lemma 4.2 and let \(\gamma _1 \ge (1/(4K)) \min (\gamma _H, \gamma _0) (1-\tau ) > 0\). Suppose that \(a,b \in (-\varepsilon ,\varepsilon )\), where \((-\varepsilon ,\varepsilon )\) is a parameter interval around \(f_0\) for which \(f_a\) is the real analytic family we are considering. If \(\varepsilon > 0\) is sufficiently small we have the following. Suppose that we have:

  (i) For all \(n \le N\), that \(|Df_a^n(v_l(a))| \ge C_1 e^{\gamma _1 n}\), and for \(k \le k_1\) for some \(k_1 \ge 0\), that \(|Df_a^k(\xi _{N,l}(a))| \ge C_2 e^{\gamma _1 k}\),

  (ii) For all \(n \le N + k_1\), if \(\xi _{n,l}(a), \xi _{n,l}(b) \notin U\), then \(|\xi _{n,l}(a)-\xi _{n,l}(b)| \le S\) and, if \(\xi _{n,l}(a) \in U\) or \(\xi _{n,l}(b) \in U\) (or both), then

    $$\begin{aligned} |\xi _{n,l}(a)-\xi _{n,l}(b)| \le {\text {dist}}(\xi _{n,l}(c),Crit_c)/ ( \log ({\text {dist}}(\xi _{n,l}(c),Crit_c)))^2, \end{aligned}$$

    where \(c \in \{ a,b \}\) is such that \({\text {dist}}(\xi _{n,l}(c),Crit_c)\) is minimal.

Then there exist \(Q > 1\) (arbitrarily close to 1, if N is large enough and S small enough), and \(\gamma _2 > 0\) (arbitrarily close to but slightly smaller than \(\gamma _1\)), such that

$$\begin{aligned} |\xi _{N+k,l}(a) -\xi _{N+k,l}(b) |&\sim _{Q^k} |Df_a^k(\xi _{N,l}(a))| |\xi _{N,l}(a) - \xi _{N,l}(b)| \text { and } \nonumber \\ |\xi _{N+k,l}(a) -\xi _{N+k,l}(b) |&\ge |\xi _{N,l}(a) - \xi _{N,l}(b)| C_2 e^{\gamma _2 k}, \end{aligned}$$
(5.1)

for any \(k \le k_1\).

Proof

Since we assume that the critical points \(c_l(a)\) move analytically in a we have

$$\begin{aligned} c_l(a) = K_l a^{k_l} + {\mathcal O}\left( a^{k_l+1}\right) . \end{aligned}$$

Let us fix l and consider \(\xi _{n,l}(a) =\xi _n(a)\). If we consider a sufficiently small parameter interval \((-\varepsilon ,\varepsilon )\) centred at \(a=0\) corresponding to \(f_0\), then, by bounded distortion, we can make \(\varepsilon \) so small that we have, for any two points \(a,b \in (-\varepsilon , \varepsilon )\),

$$\begin{aligned} |\xi _{N}(a) - \xi _{N}(b)| \sim _2 |\xi _{N}'(c)||a-b|, \end{aligned}$$

for any \(c \in (-\varepsilon ,\varepsilon )\). From the assumption \(|Df_a^N(v_l(a))| \ge C_1 e^{\gamma _1 N}\) we see that we may choose \(\varepsilon > 0\) small enough to get \(|(f_b^N)'(v_l(b))| \ge C_1 e^{\gamma _2 N}\) for \(b \in \omega = (-\varepsilon ,\varepsilon )\) for some \(\gamma _2 > 0\) slightly smaller than \(\gamma _1\). From Lemma 4.2 we now get, with \(q \le 1/2\) and \(L=|L_l|\), for any \(c \in [a,b]\),

$$\begin{aligned}{} & {} |\xi _{N}(a) - \xi _{N}(b) | \sim |\xi _{N}'(c)| |a-b| \ge q L |a-b| |Df_{c}^{N-1}(v_l(c))| \nonumber \\{} & {} \quad \ge C_1' |a-b| |Df_{c}^{N-1}(v_l(c))| \ge C_1' C_1 e^{\gamma _2 (N-1)} |a-b|, \end{aligned}$$
(5.2)

where \(C_1'=qL/2\).

During the first N iterates, (5.2) implies that for a and b close to 0 we have

$$\begin{aligned} |c_l(a) - c_l(b)| \le 2K_l k_l a^{k_l-1} |a-b| \le C a^{k_l-1} |\xi _{N,l}(a) - \xi _{N,l}(b)| e^{-\gamma _2 N}, \end{aligned}$$

for some constant C. It follows that

$$\begin{aligned} |\xi _{N,l}(a) - \xi _{N,l}(b)| \gg |c_l(a) - c_l(b)| \end{aligned}$$
(5.3)

for all critical points.

Suppose that, for all \(0 \le j \le k \le k_1 - 1\), we have

$$\begin{aligned} |\xi _{N+j}(a) - \xi _{N+j}(b) |&\sim _{Q^j} |Df_a^j(\xi _N(a))| |\xi _N(a) - \xi _N(b)| \nonumber \\&\ge C_2 e^{\gamma _2 j} |\xi _N(a) - \xi _N(b)|, \end{aligned}$$
(5.4)

for some \(Q > 1\) close to 1. We may assume that \(\gamma _2< \gamma _1 < \gamma _0\). Combining (5.4) and (5.2) we conclude that (5.3) holds for N replaced by \(N+j\).

For the proof, put \(\xi _n(a) = \xi _{n,l}(a)\). Since we assume that the orbit of \(w=\xi _n(a)\) stays close to \(z=\xi _n(b)\), and also using (5.3), we have a distortion estimate

$$\begin{aligned} \frac{1}{C} \le \frac{|Df_c(w)|}{|Df_{c'}(z)|} \le C, \end{aligned}$$

for some constant \(C \ge 1\) for \(c,c' \in [a,b]\). This constant can be arbitrarily close to 1 if \(|z-w| \le S\) and S is small enough (for \(z,w \notin U\)) and \(|z-w| \le e^{-r}/r^2\) (if \({\text {dist}}(z,Crit_b) \sim e^{-r}\)).

With \(B = \sup |\partial _a f_a|\), using (5.2) and (5.4), there is some \(Q_0 > 1\), such that

$$\begin{aligned}{} & {} |\xi _{N+k+1}(a) - \xi _{N+k+1}(b) | \nonumber \\{} & {} \quad \ge \bigl | |f_a(\xi _{N+k}(a)) - f_a(\xi _{N+k}(b))| - |f_a(\xi _{N+k}(b)) - f_b(\xi _{N+k}(b))| \bigr | \nonumber \\{} & {} \quad \sim _{Q_0} |Df_a(\xi _{N+k}(a))| |\xi _{N+k}(a) - \xi _{N+k}(b)| - |\partial _a f_a (\xi _{N+k}(a))| |a-b| \nonumber \\{} & {} \quad \ge |Df_a(\xi _{N+k}(a))| |\xi _{N+k}(a) - \xi _{N+k}(b)| - \frac{B Q^k}{C_1' |Df_a^{N+k}(v_l(a))|} |\xi _{N+k}(a) - \xi _{N+k}(b)| \nonumber \\{} & {} \quad = \biggl ( |Df_a(\xi _{N+k}(a))| - \frac{B Q^k}{C_1' |Df_a^{N+k}(v_l(a))|} \biggr ) |\xi _{N+k}(a) - \xi _{N+k}(b)|. \end{aligned}$$
(5.5)

It is easy to check that a reverse inequality also holds. Note that \(Q_0\) can be chosen arbitrarily close to 1 if N is large enough and \(S = \delta \varepsilon _1\) is small enough (i.e. \(\varepsilon _1\) small enough). Repeating this k more times we get

$$\begin{aligned}{} & {} |\xi _{N+k+1}(a) - \xi _{N+k+1}(b) | \nonumber \\{} & {} \sim _{Q_0^{k+1}} |Df_a^{k+1}(\xi _N(a))| \prod _{j=0}^{k} \biggl ( 1 - \frac{B Q^j}{C_1' |Df_a^{N+j+1}(v_l(a))|} \biggr ) |\xi _N(a) - \xi _N(b)|. \end{aligned}$$
(5.6)

Now we use assumption (i) again, and conclude that

$$\begin{aligned} \sum _{j=0}^k \frac{B Q^j}{C_1' |Df_a^{N+j+1}(v_l(a))|} \le \sum _{j=0}^{\infty } \frac{B}{C_1'}e^{-\gamma _2(N+j+1)} < \infty , \end{aligned}$$

where \(\gamma _2 \le \gamma _1\) is slightly smaller than \(\gamma _1\). In fact the sum can be made as small as we like. Hence, the product in (5.6) can be arbitrarily close to 1, say at least \(1/Q_1\) for some small \(Q_1 > 1\). Therefore,

$$\begin{aligned} |\xi _{N+k+1}(a) - \xi _{N+k+1}(b) | \sim _{Q_0^{k+1} Q_1} |Df_a^{k+1}(\xi _N(a))| |\xi _N(a) - \xi _N(b)| \end{aligned}$$
(5.7)
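The passage from (5.6) to (5.7) rests on an elementary product estimate. As a sketch, write \(x_j\) for the j-th fraction in (5.6) (our shorthand) and assume \(0 \le x_j \le 1\), which holds once N is large; then, by induction on k,

$$\begin{aligned} 1 \ge \prod _{j=0}^{k} (1 - x_j) \ge 1 - \sum _{j=0}^{k} x_j, \qquad x_j = \frac{B Q^j}{C_1' |Df_a^{N+j+1}(v_l(a))|}. \end{aligned}$$

Since \(\sum _j x_j\) is dominated by a geometric series whose first term is of size \(e^{-\gamma _2 (N+1)}\), it tends to 0 as \(N \rightarrow \infty \), which is why \(Q_1\) can be taken close to 1.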

Since \(|Df_a^{k+1}(\xi _N(a))| \ge C_2 e^{\gamma _1 (k+1)}\) we have \(Q_0^{k+1}Q_1 |Df_a^{k+1}(\xi _N(a))| \ge C_2 e^{\gamma _2 (k+1)}\), for some \(\gamma _2 > 0\) slightly smaller than \(\gamma _1\), given that \(Q_0\) and \(Q_1\) are sufficiently close to 1. Hence we have (5.4) satisfied with k replaced by \(k+1\) and we can continue the same argument and obtain (5.4) up until \(k_1\). This settles both claims. \(\square \)

We can easily get a slightly more general statement. If \(f_a\) satisfies the CE-condition up until time \(N + k_1\), we can use the same arguments as above to obtain

$$\begin{aligned} |\xi _n(a) - \xi _n(b)| \sim _{Q^j} |Df_a^j(\xi _{n-j}(a))||\xi _{n-j}(a) - \xi _{n-j}(b)|, \end{aligned}$$
(5.8)

if \(n - j \ge N\) and \(n \le N + k_1\). The details are left to the reader.

Remark 5.2

We have seen that the parameter dependence is inessential as long as the derivative \(|Df_a^n(v_l(a))|\) grows with a certain Lyapunov exponent \(\gamma _1\). We call this the weak parameter dependence property.

We will also require that the Lyapunov exponent never goes below a certain “critical” level, which is related to the “intermediate level” \(\gamma _I = (1/4) \min (\gamma _H, \gamma _0)(1-\tau )\). Since \(\gamma _L = (1/(6K)) \min (\gamma _H, \gamma _0) (1-\tau ) < (1/(4K)) \min (\gamma _H, \gamma _0) (1-\tau )\), the choice \(\gamma _C = (1/(4K)) \min (\gamma _H, \gamma _0) (1-\tau )\) (the critical exponent) as a lower bound for \(\gamma _1\) will do. We also let \(\gamma _B = (3/4) \min (\gamma _H, \gamma _0) (1-\tau )\). This \(\gamma _B\) is the Lyapunov exponent that we want to keep at the end.
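To keep the various exponents apart, it may help to write \(\mu = \min (\gamma _H, \gamma _0)(1-\tau )\), a shorthand used only here. Assuming \(K \ge 1\), the definitions above give the ordering

$$\begin{aligned} \gamma _L = \frac{\mu }{6K}< \gamma _C = \frac{\mu }{4K} \le \gamma _I = \frac{\mu }{4} < \gamma _B = \frac{3\mu }{4}. \end{aligned}$$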

6 Distortion and expansion during the bound period

We use the following notations. Below \(\omega \) is assumed to be an interval and a partition element according to Definition 2.3.

Definition 6.1

We say that \(a \in {\mathcal E}_{n,l}(\gamma )\) if

$$\begin{aligned} |Df_a^k(v_l(a))|&\ge C_0 e^{\gamma k}, \text { for all } k \le n-1,\text { and} \end{aligned}$$
(6.1)
$$\begin{aligned} |Df_a^k(v_j(a))|&\ge C_0 e^{\gamma k}, \text { for all } k \le (6 K \alpha /\gamma _I) n,\text { and all }j \ne l. \end{aligned}$$
(6.2)

We say that \(a \in {\mathcal B}_{n,l}\) if

$$\begin{aligned} {\text {dist}}(\xi _{k,l}(a),Crit_a)&\ge K_b e^{-2\alpha k}, \text { for all }k \le n\text { and} \end{aligned}$$
(6.3)
$$\begin{aligned} {\text {dist}}(\xi _{k,j}(a),Crit_a)&\ge K_b e^{-2\alpha k}, \text { for all }k \le (6 K \alpha /\gamma _I) n\text { and all }j \ne l. \end{aligned}$$
(6.4)

We say that \(\omega \subset {\mathcal E}_{n,l,\star }(\gamma )\) if (6.1) holds and (6.2) holds with \(6K \alpha /\gamma _I\) replaced by \(12 K \alpha /\gamma _I\). We say that \(\omega \subset {\mathcal B}_{n,l,\star }\) if (6.3) holds and (6.4) holds with \(6K \alpha /\gamma _I\) replaced by \(12 K \alpha /\gamma _I\).

Note that \(\omega _0 \subset {\mathcal E}_{N,l}(\gamma ) \cap {\mathcal B}_{N,l}\) for all l for some \(\gamma \) close to \(\gamma _0\). The definitions above are tailored so that if an interval belongs to \({\mathcal E}_{n,l}(\gamma )\) or \( {\mathcal B}_{n,l}\) then we can use the binding information for the other critical points up until some fraction \(6 K\alpha /\gamma _I\) of the time n (there is some extra space in the estimate to be used in the proofs). The star is added to be able to use the binding information longer and continue the parameter-exclusion construction up until 2n.

To prove bounded distortion, we will frequently make use of the following lemma, which is standard.

Lemma 6.2

Given complex numbers \(z_1, \ldots , z_n\) we have

$$\begin{aligned} \biggl | \prod _{j=1}^n z_j - 1 \biggr | \le -1 + \exp \biggl ( \sum _{j=1}^n |z_j - 1| \biggr ). \end{aligned}$$
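Since the lemma is stated without proof, here is one standard argument, sketched with the shorthand \(w_j = z_j - 1\) (ours). Expanding the product over all non-empty index sets S and applying the triangle inequality gives

$$\begin{aligned} \biggl | \prod _{j=1}^n (1 + w_j) - 1 \biggr | = \biggl | \sum _{\emptyset \ne S \subset \{1,\ldots ,n\}} \prod _{j \in S} w_j \biggr | \le \prod _{j=1}^n (1 + |w_j|) - 1 \le \exp \biggl ( \sum _{j=1}^n |w_j| \biggr ) - 1, \end{aligned}$$

where the last step uses \(1 + x \le e^x\) for \(x \ge 0\).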

Expanding f in Taylor series near a critical point c gives

$$\begin{aligned}{} & {} f_a(z) = f_a(c) + A(z-c)^k + {\mathcal O}\left( (z-c)^{k+1}\right) , \\ {}{} & {} Df_a(z) = Ak(z-c)^{k-1} + {\mathcal O}\left( (z-c)^k\right) , \end{aligned}$$

where A is analytic in the parameter a. If z and w are close to c and \(|z-c| \sim |w-c|\), we get,

$$\begin{aligned}{} & {} Df_a(z) - Df_a(w) = Ak(z-w)\left( (z-c)^{k-2} + (z-c)^{k-3}(w-c) + \cdots \right. \nonumber \\{} & {} \quad \left. + (w-c)^{k-2} + {\mathcal O}((z-c)^{k-1})\right) . \end{aligned}$$
(6.5)

Hence,

$$\begin{aligned} \sum _{j=1}^{n} \frac{|Df_a(\xi _j(a)) - Df_a(\xi _j(b))|}{|Df_a(\xi _j(b))|} \sim _{2k} \sum _{j=1}^{n} \frac{|\xi _j(a) - \xi _j(b)|}{{\text {dist}}(\xi _j(b),Crit_a)} \end{aligned}$$

if z and w are sufficiently close to \(Crit_a\). In Sect. 7 we will allow the parameter to vary as well, and have to go a bit further.

Lemma 6.3

(Distortion during the bound period) Let \(\varepsilon ' > 0\). Then if \(\delta '=e^{-\Delta '}\) is sufficiently small and N sufficiently large, the following holds. Let \(z=\xi _{\nu ,l}(a)\) be a free return into \(U_i'\), \(\nu \ge N\), where \(a \in {\mathcal E}_{\nu ,l}(\gamma ) \cap {\mathcal B}_{\nu ,l}\) for some \(\gamma \ge \gamma _I\). Then we have, for all w on the line segment between \(f_a(z)\) and \(\xi _{1,i}(a) = v_i(a)\),

$$\begin{aligned} \biggl | \frac{Df_a^j(w)}{Df_a^j(v_i(a))} - 1 \biggr | \le \varepsilon ', \end{aligned}$$

for \(j \le p\), where p is the length of the bound period for z.

Proof

We first prove the lemma for \(w=f_a(z)\). Let \({\text {dist}}(\xi _{\nu ,l}(a), Crit_{a}) \sim _{\sqrt{e}} e^{-r}\) where \(\xi _{\nu ,l}(a) = z\) and put \(z_j = f_a^j(z)\) and \(\xi _{j,i}(a) = \xi _j(a)\). Following the discussion preceding the lemma, we estimate, for \(\nu \ge N\), the sum

$$\begin{aligned} \sum _{j=1}^{p} \frac{|Df_a(z_j) - Df_a(\xi _{j}(a))|}{|Df_a(\xi _{j}(a))|} \le C \sum _{j=1}^{p} \frac{|z_j - \xi _{j}(a)|}{{\text {dist}}(\xi _{j}(a),Crit_a) }. \end{aligned}$$

The last sum can be divided into two subsums \([1,J] \cup [J+1,p]\) where \(J = \lceil dr/(10(2 \alpha + \Gamma )) \rceil \), where d is the degree of \(f_0\) at \(c_k\), and \(\Gamma = \sup _{a \in (-\varepsilon ,\varepsilon ), z \in \hat{{\mathbb C}}} \log |f_a'(z)|\). Assuming that the basic approach rate assumption holds, the first sum can be estimated as

$$\begin{aligned} \sum _{j=1}^J \frac{|z_1 - \xi _1(a)|e^{\Gamma (j-1)}}{K_b e^{-2 \alpha j}} \le \sum _{j=1}^J CK_b^{-1}e^{-dr} e^{(\Gamma + 2 \alpha )j } \le \sum _{j=1}^J Ce^{-(9/10)dr} \le Ce^{-9\Delta '/10}. \end{aligned}$$

The second sum can be estimated using the definition of the bound period (remember \(\beta =\alpha \)),

$$\begin{aligned} \sum _{j=J+1}^p \frac{|z_j - \xi _j(a)|}{{\text {dist}}(\xi _j(a),Crit_a) } \le C \sum _{j=J+1}^p e^{-\alpha j} \le C e^{-\alpha \frac{dr}{10(2 \alpha + \Gamma )}}. \end{aligned}$$

We see that both sums can be made arbitrarily small if \(\Delta '\) is large enough. This finishes the case \(w=f_a(z)\).
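In slightly more detail, both estimates are geometric sums. A sketch, with the shorthand \(s = \Gamma + 2\alpha \) and \(x = dr/(10 s)\) (ours), assuming \(s > 0\) and using \(x \le J \le x + 1\):

$$\begin{aligned} \sum _{j=1}^{J} e^{-dr} e^{s j}&\le \frac{e^{-dr} e^{sJ}}{1 - e^{-s}} \le \frac{e^{s}}{1 - e^{-s}} e^{-dr + dr/10} = \frac{e^{s}}{1 - e^{-s}} e^{-(9/10) dr}, \\ \sum _{j=J+1}^{p} e^{-\alpha j}&\le \frac{e^{-\alpha (J+1)}}{1 - e^{-\alpha }} \le \frac{1}{1 - e^{-\alpha }} e^{-\alpha \frac{dr}{10(2\alpha + \Gamma )}}. \end{aligned}$$

For a return into \(U'\) we have \(e^{-r} \le \sqrt{e}\, e^{-\Delta '}\), i.e. \(r \ge \Delta ' - 1/2\), so for fixed \(\alpha \) and \(\Gamma \) both right-hand sides tend to 0 as \(\Delta ' \rightarrow \infty \).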

It is easy to see that the same must hold on the line segment between \(f_a(z)\) and \(v_i(a)\). Let \(p' \le p\) be the least bound period for all such w on this line. That means that up until \(j=p'\) the distortion estimate holds for all w on the line. But since \(\varepsilon '\) may be chosen very small this means that the image of the line under \(f_a^{p'}\) is an almost straight line too. It follows that the corresponding w for \(p'\) has to be z in fact, so \(p=p'\). \(\square \)

Note that the condition on \(U'\) in the above lemma depends only on the starting family of functions \(f_a\), \(a \in (-\varepsilon ,\varepsilon )\). There is also a condition on \(U'\) in Lemma 3.1.

Lemma 6.4

Suppose that \(\xi _{\nu ,l}(a)\) is a return into \(U_i'\) and that \(a \in {\mathcal E}_{\nu ,l}(\gamma ) \cap {\mathcal B}_{\nu ,l}\) for some \(\gamma \ge \gamma _I\). Then if N is large enough and p is the length of the following bound period we have,

$$\begin{aligned} |Df_a^{p}(\xi _{\nu ,l}(a))| \ge e^{\frac{\gamma }{2d_i} p}, \end{aligned}$$

where \(d_i\) is the degree of f at \(c_i\).

Moreover, if \({\text {dist}}(\xi _{\nu ,l}(a),Crit_{a}) \sim _{\sqrt{e}} e^{-r}\), then

$$\begin{aligned} \frac{d_i r }{2 \Gamma } \le p \le \frac{2 d_i r}{\gamma }. \end{aligned}$$

In particular, \(p \le 4 \alpha d_i \nu /\gamma \), where \(\alpha \) is the exponent in the basic assumption and \(\Gamma = \sup \limits _{a \in (-\varepsilon ,\varepsilon ), z \in \hat{{\mathbb C}}} \log |Df_a(z)|\).

Proof

Put \(D_j = |Df_a^j(\xi _{\nu ,l}(a))|\) and \(E_{j} = |Df_a^{j}(\xi _{\nu +1,l}(a))|\) for some \(a \in \omega \). Since \(a \in {\mathcal B}_{\nu ,l}\), we have \(D_1 \ge CK_b e^{-2\alpha K \nu }\) for some constant C. Moreover, for \(1 \le j \le p-1\), we can use Lemma 6.3 to prove that \(E_j \ge (C_0/2) e^{\gamma j}\) since \(a \in {\mathcal E}_{\nu ,l}(\gamma )\). Hence the derivative

$$\begin{aligned} |Df_a^{\nu +j}(v_l(a))| \ge (C_0/2) C K_b C_0 e^{(\gamma -2\alpha K) (\nu +j)} \ge C_0e^{\gamma ' (\nu +j)}, \text { for }\quad j \le p, \end{aligned}$$

where \(\gamma ' \ge \gamma - 4 \alpha K \ge \gamma _C\), provided N is large enough (recall \(\nu \ge N\)). We can also use Lemma 6.3 to get the following distortion estimate, for some \(C > 1\) (close to 1),

$$\begin{aligned} |\xi _{\nu +j,l}(a) - \xi _{j,i}(a)| \sim _{C} |Df_a^j(\xi _{\nu ,l}(a))||\xi _{\nu ,l}(a) - \xi _{0,i}(a)|, \end{aligned}$$

for \(j \le p+1\). Suppose that \(|\xi _{\nu ,l}(a) - \xi _{0,i}(a)| \sim _2 e^{-r}\). We know from the definition of the bound period and the basic assumption, that

$$\begin{aligned} D_{p+1} e^{-r} \ge \frac{1}{4C} {\text {dist}}(\xi _{p+1,i}(a),Crit_a) e^{-\alpha (p+1)} \ge \frac{1}{4C} K_be^{-2\alpha (p+1) - \alpha (p+1)}. \end{aligned}$$
(6.6)

Also we have, for some \(\kappa _1 \ge 1\),

$$\begin{aligned} D_{p+1} e^{-r} \sim _{\kappa _1} E_{p} e^{-rd_i}, \end{aligned}$$

and so

$$\begin{aligned} e^{-r(d_i-1)}&\sim _{\kappa _1} \biggl ( D_{p+1} e^{-r} \biggr )^{\frac{d_i-1}{d_i}} E_p^{-\frac{d_i -1}{d_i}} \nonumber \\&\ge \biggl ( \frac{K_b}{4C} \biggr )^{\frac{d_i-1}{d_i}} e^{-(2 \alpha + \alpha ) (p+1) \frac{d_i -1}{d_i} } E_{p}^{-\frac{d_i-1}{d_i}} . \end{aligned}$$
(6.7)

Now we can use that \(2\alpha + \alpha = 3 \alpha \) is very small compared to \(\gamma \ge \gamma _I\). We get,

$$\begin{aligned} D_{p+1}&\sim _{\kappa _1} e^{-r(d_i-1)} E_{p} \nonumber \\&\ge \biggl ( \frac{K_b}{4C} \biggr )^{\frac{d_i-1}{d_i}} E_{p}^{\frac{1}{d_i}} e^{-3 \alpha (p+1) \frac{d_i-1}{d_i} } \nonumber \\&\ge \biggl ( \frac{C_0}{2}\biggr )^{\frac{1}{d_i}} \biggl (\frac{K_b}{4C} \biggr )^{\frac{d_i-1}{d_i}} e^{\frac{\gamma }{d_i} p - 3 \alpha (p+1)} \ge e^{\frac{p}{2d_i}\gamma }, \end{aligned}$$
(6.8)

if \(\nu \) is sufficiently large. Since \(D_p = D_{p+1}/|Df_a(\xi _{\nu +p}(a))|\), with minor modifications it is easy to see that the same estimate holds for \(D_p\).

To prove the second claim, we note that from (6.6), the slow recurrence condition and the fact that \(|Df^{\nu }(v_l(a))| \le e^{\nu \Gamma }\) we get that, for some very small \(\alpha > 0\) in comparison to \(\gamma \),

$$\begin{aligned} e^{\Gamma (p+1)}e^{-d_i r} \ge E_p e^{-d_i r} \ge \frac{K_b}{4C} \kappa _1^{-1} e^{-3 \alpha (p+1)}, \end{aligned}$$

which gives the left inequality if \(\nu \ge N\) is large enough. To prove the right inequality, we note that the spherical distance \({\text {dist}}(\xi _{\nu ,l}(a),Crit_a)\) is bounded from above. By the definition of the bound period (now we are considering the time p iterates from the return into U), and the fact that we also have \(E_{p-1} e^{-d_i r} \sim _{\kappa _1} D_p e^{-r}\),

$$\begin{aligned} (C_0/2)e^{\gamma (p-1)} e^{-d_i r} \le E_{p-1} e^{-d_i r} \le 4C \kappa _1 e^{-\alpha p} {\text {dist}}(\xi _{p,i}(a),Crit_a), \end{aligned}$$

and the right inequality follows.\(\square \)
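Spelled out, the two bounds on p come from taking logarithms in the two displayed inequalities of the second part of the proof. A sketch, where \(C'\) and \(C''\) denote constants independent of r and p (ours), and where we assume \(3\alpha < \Gamma \):

$$\begin{aligned} (\Gamma + 3\alpha )(p+1) \ge d_i r - C'&\implies p \ge \frac{d_i r - C'}{\Gamma + 3\alpha } - 1, \\ \gamma (p-1) \le d_i r + C''&\implies p \le \frac{d_i r + C''}{\gamma } + 1. \end{aligned}$$

Because \(1/(\Gamma + 3\alpha ) > 1/(2\Gamma )\) and \(1/\gamma < 2/\gamma \), the additive constants are absorbed once r is large, which gives the factors \(1/2\) and 2 in the statement.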

The above lemma gives a quite substantial amount of increase of the derivative during the bound period, even if there is a loss in the first iterate. We can also see that under all circumstances,

$$\begin{aligned} |\xi _{\nu +p}(a) - \xi _{\nu +p}(b)| \ge |\xi _{\nu }(a) - \xi _{\nu }(b)|. \end{aligned}$$

7 Strong distortion

Our aim now is to use weak distortion and prove that we actually have something stronger, namely, for some small \(\varepsilon ' > 0\),

$$\begin{aligned} \biggl | \frac{Df_a^{n}(v_l(a))}{Df_b^{n}(v_l(b))} - 1 \biggr | \le \varepsilon ', \end{aligned}$$
(7.1)

for all \(a,b \in \omega \) where \(\omega \) is a partition element according to Definition 2.3. We will also make use of the preliminary discussion in Sect. 6. Let us first see a geometrical consequence of (7.1). By Lemma 4.2 we have, for some small \(0< q < 1\),

$$\begin{aligned} \biggl | \frac{\xi _{n,l}'(a)}{Df_a^{n-1}(v_l(a))} - L_l \biggr | \le q|L_l| \end{aligned}$$
(7.2)

for \(n \ge N\) as long as \(f_a\) satisfies the CE-condition with some exponent at least \(\gamma _L\) and where \(N > 0\) is as in Lemma 4.2. So combining (7.1) and (7.2) we get

$$\begin{aligned} \biggl | \frac{\xi _{n,l}'(a)}{\xi _{n,l}'(b)} - 1 \biggr | \le \tilde{\varepsilon }, \quad \text { for all }a,b \in \omega , \end{aligned}$$
(7.3)

where \(\tilde{\varepsilon } > 0\) is arbitrarily small given that \(\varepsilon '\) and q are small enough. This means that the curve \(\xi _{n,l}(\omega )\) is almost straight, which will be important when we make partitions at returns.

From now on, let us fix l and write \(\xi _{n,l}(a) = \xi _n(a)\). In the beginning we are going to follow orbits close to the original orbit \(\xi _n(0)\), and then it is rather easy to see that nearby orbits also satisfy the CE-condition, but when considering nearby parameters a close to 0, after a long time we have to keep track of the derivative \(Df_a^n(v_l(a))\), since the orbits of \(\xi _n(a)\) and \(\xi _n(0)\) become more or less independent.

Choose some small \(\varepsilon > 0\) and suppose that \(\omega \subset (-\varepsilon ,\varepsilon )\). For \(a,b \in \omega \), consider (7.1). The distortion during the first N iterates can be made arbitrarily small if the perturbation \(\varepsilon \) is small enough. So we only need to consider iterates after N, and hence focus on proving:

$$\begin{aligned} \biggl | \frac{Df_a^{n-N}(\xi _N(a))}{Df_b^{n-N}(\xi _N(b))} - 1 \biggr | \le \varepsilon '. \end{aligned}$$
(7.4)

The main task is to prove this stronger form of the space distortion. By Lemma 6.2, the distortion estimate (7.4) follows if we prove that

$$\begin{aligned} \sum _{j=0}^{n-N-1} \biggl | \frac{Df_a(\xi _{N+j}(a)) - Df_b(\xi _{N+j}(b))}{Df_b(\xi _{N+j}(b))} \biggr | \le \varepsilon '', \end{aligned}$$
(7.5)

where \(\varepsilon ' \rightarrow 0\) as \(\varepsilon '' \rightarrow 0\).
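To make the reduction explicit, here is a sketch with \(\zeta _j\) denoting the one-step ratios (our shorthand). By the chain rule the quotient in (7.4) is the product of the \(\zeta _j\), so Lemma 6.2 together with (7.5) gives

$$\begin{aligned} \zeta _j = \frac{Df_a(\xi _{N+j}(a))}{Df_b(\xi _{N+j}(b))}, \qquad \biggl | \frac{Df_a^{n-N}(\xi _N(a))}{Df_b^{n-N}(\xi _N(b))} - 1 \biggr | = \biggl | \prod _{j=0}^{n-N-1} \zeta _j - 1 \biggr | \le e^{\varepsilon ''} - 1. \end{aligned}$$

Thus (7.4) holds with \(\varepsilon ' = e^{\varepsilon ''} - 1\).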

With \(z=\xi _j(a)\) and \(w=\xi _j(b)\), we have

$$\begin{aligned} |Df_a(\xi _j(a)) - Df_b(\xi _j(b))| \le |Df_a(z) - Df_a(w)| + |Df_a(w) - Df_b(w)|. \end{aligned}$$

We also see that, for some \(a^* \in [a,b]\),

$$\begin{aligned} |Df_a(w) - Df_b(w)| \le |a-b| |\partial _a Df_{a^*}(w)| \le C |\xi _j(a) - \xi _j(b)| e^{-\gamma _2 j} , \end{aligned}$$
(7.6)

for some constant \(C > 0\) since \(\partial _a Df(z)\) is bounded.

If c is a critical point, using that \(|z-c| \sim |w-c|\), for \(z=\xi _j(a)\), \(w=\xi _j(b)\), we get, using the Taylor expansion of f near c, see (6.5), that

$$\begin{aligned} \sum _{j=1}^{n} \frac{|Df_a(\xi _j(a)) - Df_b(\xi _j(b))|}{|Df_b(\xi _j(b))|} \sim _{2k} \sum _{j=1}^{n} \frac{|\xi _j(a) - \xi _j(b)|}{{\text {dist}}(\xi _j(b),Crit_b)} \end{aligned}$$

if z and w are sufficiently close to \(Crit_a\). We will therefore estimate the sum

$$\begin{aligned} \hat{S} = \sum _{j=N}^{n} \frac{|\xi _{j}(a) - \xi _{j}(b)|}{\text {dist} (\xi _{j}(b), Crit_b) }. \end{aligned}$$
(7.7)

Lemma 7.1

If N is large enough we have the following. Suppose that \(\nu _k \ge N\) is a return time and that \(\xi _{\nu _k,l}(a)\) is a free return into \(U'\) (essential or inessential or a pseudo return), \(a \in (-\varepsilon ,\varepsilon )\). Moreover, we suppose that \(a \in {\mathcal E}_{\nu _k,l}(\gamma ) \cap {\mathcal B}_{\nu _k,l}\), where \(\gamma \ge \gamma _I\). Then until the next free return, we have,

$$\begin{aligned} |Df_a^{\nu _{k+1}}(v_l(a))| \ge e^{\gamma _1 \nu _{k+1}}, \end{aligned}$$

where \(\gamma _1 \ge (9/10) \min (\gamma , \gamma _H)\).

Proof

During the bound period \(p_k\) starting directly after the return \(\nu _k\), we see from Lemma 6.4 that

$$\begin{aligned} |Df^{\nu _k + p_k}(v_l(a))| \ge C_0e^{\gamma \nu _{k}} e^{\frac{\gamma }{2K} p_k}, \end{aligned}$$

for each \(a \in \omega \). Moreover, note that \(p_k \le (2 K \alpha /\gamma ) \nu _k\) from Lemma 6.4. After that the free period starts, and by the outside expansion Lemma 3.1 we get

$$\begin{aligned} |Df^{\nu _{k+1}}(v_l(a))| \ge C_0 C' e^{\gamma \nu _{k}} e^{\frac{\gamma }{2K} p_k} e^{\gamma _H (\nu _{k+1} - (\nu _k + p_k))} \ge e^{\gamma _1 \nu _{k+1}}, \end{aligned}$$

for some \(\gamma _1 \ge (9/10) \min (\gamma ,\gamma _H) \) if N is large enough.\(\square \)
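The exponent bookkeeping in this proof can be checked numerically. The following is a minimal sketch, not part of the original argument: the function name is hypothetical, the constants \(C_0\) and \(C'\) are ignored (they are absorbed when N is large), and the bound period is taken to have its maximal length \(p_k = (2K\alpha /\gamma )\nu _k\). It also assumes the normalisation \(\alpha K/\gamma \le 1/100\) that is used in the proof of Lemma 8.1.

```python
def exponent_after_free_period(gamma, gamma_H, K, alpha, nu_k, nu_next):
    """Average exponent (1/nu_next) * log of the lower bound in the proof.

    Illustrative assumptions: the constants C_0 and C' are dropped and the
    bound period has its maximal length p_k = (2*K*alpha/gamma) * nu_k.
    """
    p_k = (2 * K * alpha / gamma) * nu_k   # longest bound period allowed by Lemma 6.4
    free = nu_next - (nu_k + p_k)          # length of the free period
    if free < 0:
        raise ValueError("the next free return must come after the bound period")
    # log of: e^{gamma*nu_k} * e^{(gamma/2K)*p_k} * e^{gamma_H*free}
    log_bound = gamma * nu_k + (gamma / (2 * K)) * p_k + gamma_H * free
    return log_bound / nu_next
```

Under these assumptions the bound period occupies at most 2% of the time up to \(\nu _{k+1}\), so the average exponent stays above \((98/100)\min (\gamma ,\gamma _H)\), which leaves room for the constants.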

If we consider a return of \(\xi _n(\omega )\), where \(\omega \) is a partition element, we have seen in Lemma 5.1 that we may disregard the parameter dependence inside \(\omega \) as long as the space derivative grows exponentially. The above lemma ensures that \(\gamma _1 \ge \gamma _C\). So, by the weak parameter dependence property, we have

$$\begin{aligned} |\xi _{\nu '}(a) -\xi _{\nu '}(b)| \sim _{Q^{\nu '-\nu }} |Df_a^{\nu '-\nu }(\xi _{\nu }(a))||\xi _{\nu }(a) -\xi _{\nu }(b)| \end{aligned}$$

for all \(a,b \in \omega \). Moreover, there is some \(\gamma _2 \ge \min (\gamma , \gamma _H) /(3K)\), such that

$$\begin{aligned} |Df_a^{\nu '-\nu }(\xi _{\nu }(a))| Q^{-(\nu '-\nu )} \ge e^{\gamma _2(\nu ' - \nu )}, \end{aligned}$$

if \(\log Q < \alpha \ll \gamma _1\). It follows that two orbits \(\xi _{n,l}(a)\) and \(\xi _{n,l}(b)\) repel each other up to some large scale or until the next return takes place. We get the following lemma.

Lemma 7.2

If N is large enough we have the following. Suppose that \(\xi _{\nu ,l}(\omega )\) is a return with \(\nu \ge N\) (inessential or essential or a pseudo return) and that \(a,b \in \omega \), where \(\omega \subset {\mathcal E}_{\nu ,l}(\gamma ) \cap {\mathcal B}_{\nu ,l}\) is a partition element and \(\gamma \ge \gamma _I\). Then, if \(\nu '\) is the next free return time, we have

$$\begin{aligned} |\xi _{\nu '}(a) -\xi _{\nu '}(b)| \ge e^{\gamma _2 (\nu ' - \nu )} |\xi _{\nu }(a) -\xi _{\nu }(b)| \ge 2 |\xi _{\nu }(a) -\xi _{\nu }(b)|, \end{aligned}$$

where \(\gamma _2 \ge \min (\gamma , \gamma _H)/(3K)\).

Next, we prove the Main Distortion Lemma, which is our main object in this section.

Lemma 7.3

(Main Distortion Lemma) Let \(\varepsilon ' > 0\). Then if N is sufficiently large we have the following. Let \(\omega \subset {\mathcal E}_{\nu }(\gamma ) \cap {\mathcal B}_{\nu ,l}\) be a partition element for some \(\gamma \ge \gamma _I\) and suppose that \(\nu \ge N\) is a return time or does not belong to a bound period. Then we have, until the next free return \(\xi _{\nu ',l}(\omega )\), a bound on the distortion if \(\omega \) is still a partition element at time n, namely,

$$\begin{aligned} \biggl | \frac{Df_a^{n}(v_l(a))}{Df_b^{n}(v_l(b))} - 1 \biggr | \le \varepsilon ', \quad \text {for all }a,b \in \omega \end{aligned}$$

if \(\nu \le n \le \nu '\).

Proof

By Lemma 7.1 the CE-condition is fulfilled with exponent \(\gamma _1 \ge (9/10) \min (\gamma , \gamma _H)\) up until the next free return. Now, \(\gamma _1 \ge \gamma _C\) so we can repeatedly use the weak parameter dependence property. Let us assume that \(\nu \) is a return time. If not, replace \(\nu \) with the latest return time before \(\nu \).

Put \(\xi _{n,l}(a) = \xi _n(a)\). We want to estimate the sum

$$\begin{aligned} \sum _{j=1}^{n} \frac{|\xi _j(a) - \xi _j(b)|}{{\text {dist}}(\xi _j(b),Crit_b)}. \end{aligned}$$
(7.8)

First we look at the contribution from the bound periods. We want to estimate the sum

$$\begin{aligned} \sum _{j=0}^{p} \frac{|\xi _{\nu + j}(a) - \xi _{\nu +j}(b)|}{{\text {dist}}(\xi _{\nu +j}(b), Crit_b)}. \end{aligned}$$

Since \(|\xi _{\nu }(a) - \xi _{\nu }(b)| \sim _2 e^{-r}/r^2\) and \({\text {dist}}(\xi _{\nu }(b), Crit_b) \sim _{\sqrt{e}} e^{-r}\), the first term (\(j=0\)) contributes \(\sim 1/r^2\).

To estimate the other terms (\(j > 0\)), we use the weak parameter dependence property to get

$$\begin{aligned} |\xi _{\nu +j}(a) - \xi _{\nu +j}(b)| \sim _{Q^j} |Df^j(\xi _{\nu }(a))| |\xi _{\nu }(a) - \xi _{\nu }(b)| \sim _2 |Df^j(\xi _{\nu }(a))| e^{-r}/r^2. \end{aligned}$$

By the definition of the bound period we have, for \(j > 0\), using Lemma 6.3, if \(\xi _{\nu }(a) \in U_i'\),

$$\begin{aligned} |Df^j(\xi _{\nu }(a))| e^{-r} \sim |\xi _{j,i}(a) - \xi _{\nu +j}(a)| \le e^{-\alpha j} {\text {dist}}(\xi _{j,i}(a),Crit_a). \end{aligned}$$

So we get

$$\begin{aligned} |\xi _{\nu +j}(a) - \xi _{\nu +j}(b)| \le CQ^j \frac{e^{-\alpha j} {\text {dist}}(\xi _{j,i}(a),Crit_a)}{r^2}, \end{aligned}$$

and therefore, since \({\text {dist}}(\xi _{j,i}(a),Crit_a)\) is virtually the same for all \(a\in \omega \),

$$\begin{aligned} \sum _{j=0}^{p} \frac{|\xi _{\nu + j}(a) - \xi _{\nu +j}(b)|}{{\text {dist}}(\xi _{\nu +j}(b),Crit_b)} \le \frac{C}{r^2} + C \sum _{j=1}^p \frac{Q^j e^{-\alpha j}}{r^2} \le \frac{2C}{r^2}, \end{aligned}$$

where the term \(C/r^2\) corresponds to \(j=0\), and \(\log Q < \alpha \).
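The last inequality is a geometric series estimate. As a minimal sketch (the function names and the normalisation \(C = 1\) are assumptions made for illustration), one can compare the finite sum over the bound period with its closed-form cap \(C/(r^2(1-Qe^{-\alpha }))\), which does not depend on p:

```python
import math

def bound_period_sum(r, p, Q, alpha, C=1.0):
    """C/r^2 (the j = 0 term) plus C * sum_{j=1}^{p} (Q e^{-alpha})^j / r^2."""
    ratio = Q * math.exp(-alpha)
    if not ratio < 1:
        raise ValueError("need log Q < alpha")
    first = C / r**2
    rest = C * sum(ratio**j for j in range(1, p + 1)) / r**2
    return first + rest

def geometric_cap(r, Q, alpha, C=1.0):
    """Cap obtained by extending the geometric series to infinity."""
    ratio = Q * math.exp(-alpha)
    return C / (r**2 * (1 - ratio))
```

The factor \(1/(1-Qe^{-\alpha })\) is at most 2 only when \(Qe^{-\alpha } \le 1/2\); in general it is a fixed constant independent of p and r, so it can be absorbed into C.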

Between each adjacent pair of free returns there is a growth of the interval \(\xi _{n,j}(\omega )\) as follows. Lemma 7.2 implies

$$\begin{aligned} 2 {\text {diam}}(\xi _{\nu _k}(\omega )) \le {\text {diam}}(\xi _{\nu _{k+1}}(\omega )), \quad \text {for all }a,b \in \omega . \end{aligned}$$
(7.9)

Let (r) be those indices k for which \({\text {dist}}(\xi _{\nu _k}(\omega ), Crit_{\omega }) \sim _{\sqrt{e}} e^{-r}\), and let \(\hat{k}(r)\) be the largest integer in (r). Hence going backwards in time, inside each (r), the contribution from the bound periods is a constant times the last contribution, i.e.

$$\begin{aligned} \sum _{k \in (r)} \frac{|\xi _{\nu _k}(a) - \xi _{\nu _k}(b)|}{{\text {dist}}(\xi _{\nu _k}(b),Crit_b)} \le C \frac{|\xi _{\nu _{\hat{k}(r)}}(a) - \xi _{\nu _{\hat{k}(r)}}(b)|}{{\text {dist}}\left( \xi _{\nu _{\hat{k}(r)}}(b),Crit_b\right) } \le \frac{C}{r^2}. \end{aligned}$$

Summing over all such possible returns we get

$$\begin{aligned} \sum _{r =\Delta }^{\infty } \frac{C}{r^2} \le \frac{2C}{\Delta }. \end{aligned}$$
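The tail estimate follows from \(\sum _{r \ge \Delta } r^{-2} \le 1/(\Delta -1) \le 2/\Delta \) for \(\Delta \ge 2\). A small numerical sanity check, given as a sketch with a hypothetical function name and with C normalised to 1:

```python
def tail_inverse_squares(delta, terms=20000):
    """Upper estimate of sum_{r >= delta} 1/r^2.

    A partial sum over `terms` indices plus the integral bound
    sum_{r >= last} 1/r^2 <= 1/(last - 1) for the remainder.
    """
    last = delta + terms
    partial = sum(1.0 / (r * r) for r in range(delta, last))
    return partial + 1.0 / (last - 1)
```

Since the returned value is an upper estimate of the infinite sum, comparing it with \(2/\Delta \) confirms the inequality for the tested values of \(\Delta \).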

Let us now look for the contribution from the free periods. Let us first assume that \(\nu _k\) are the returns up until \(\nu _s\) and that \(\nu ' = \nu _s\) (hence \(\nu _{s-1}=\nu \)) and \(p_k\) their bound periods. By Lemma 3.1 we get that, for every \(a,b \in \omega \), now assuming that \(\xi _j(\omega ) \cap U = \emptyset \) for all \(\nu _k+p_k +1 \le j \le \nu _{k+1}-1\), and using the weak parameter dependence property,

$$\begin{aligned} |\xi _{\nu _{k+1}-1}(a) - \xi _{\nu _{k+1}-1}(b)| \ge C' Q^{-(\nu _{k+1}-1-j)} \lambda ^{\nu _{k+1}-1-j} | \xi _{j}(a) - \xi _{j}(b)|. \end{aligned}$$

We can choose \(Q > 1\) such that \(\log Q < (\log \lambda )/10\). Hence, possibly diminishing \(\lambda > 1\),

$$\begin{aligned} \sum _{j=\nu _{k-1}+p_{k-1}+1}^{\nu _{k}-1} \frac{|\xi _j(a) - \xi _j(b)|}{{\text {dist}}(\xi _j(b),Crit_b)}&\le C \sum _{j} \lambda ^{-(\nu _{k}-1-j)} \frac{|\xi _{\nu _{k}-1}(a) - \xi _{\nu _{k}-1}(b)|}{\delta } \nonumber \\&\le C \frac{|\xi _{\nu _{k}-1}(a) - \xi _{\nu _{k}-1}(b)|}{\delta }. \end{aligned}$$
(7.10)

We have, for some \(\kappa _2 \ge 1\),

$$\begin{aligned} |\xi _{\nu _{k}-1}(a) - \xi _{\nu _{k}-1}(b)| \sim _{\kappa _2} |\xi _{\nu _{k}}(a) - \xi _{\nu _{k}}(b)| \sim _2 e^{-r_{k}}/r_{k}^2, \end{aligned}$$

if \(k < s\), where we have put \({\text {dist}}(\xi _{\nu _k}(\omega ),Crit_{\omega }) \sim _{\sqrt{e}} e^{-r_k}\). So for those returns the contribution to the sum (7.8) is going to be very small. Recalling that \(|\xi _j(a) - \xi _j(b)| \le S\), where \(S =\varepsilon _1 \delta \) is the large scale, \(\delta =e^{-\Delta }\), we get, for the last return,

$$\begin{aligned} \sum _{j=\nu _{s-1}+p_{s-1}+1}^{\nu _{s}-1} \frac{|\xi _j(a) - \xi _j(b)|}{{\text {dist}}(\xi _j(b),Crit_b)} \le C \frac{S}{\delta } \le C\varepsilon _1 , \end{aligned}$$
(7.11)

where C depends only on \(C'\) and \(\lambda \) (hence not on \(\delta \)). So \(C \varepsilon _1\) can be made arbitrarily small if \(\varepsilon _1\) is small enough. We let (r) be those indices k such that \({\text {dist}}(\xi _{\nu _k}(\omega ), Crit_{\omega }) \sim _{\sqrt{e}} e^{-r}\), and \(\hat{k}(r)\) the maximum index k for which this happens. Then using Lemma 7.2, we have (7.9), and therefore we conclude that

$$\begin{aligned} \sum _{k \in (r)} |\xi _{\nu _{k}}(a) - \xi _{\nu _{k}}(b)| \le C|\xi _{\nu _{\hat{k}(r)}}(a) - \xi _{\nu _{\hat{k}(r)}}(b)|. \end{aligned}$$
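The reason the sum is controlled by its last term is the doubling property (7.9): a sequence in which every term is at least twice the previous one sums to at most twice its last term. The sketch below (hypothetical helper names, with C = 2 as the illustrative constant) checks both properties on a list of diameters:

```python
def doubles(diameters):
    """True if each term is at least twice the previous one, as in (7.9)."""
    return all(2 * a <= b for a, b in zip(diameters, diameters[1:]))

def sum_dominated_by_last(diameters):
    """True if the sum of the sequence is at most twice its last term."""
    return sum(diameters) <= 2 * diameters[-1]
```

For a doubling sequence the second check always succeeds, since the terms are dominated by \(d_{\hat{k}} (1 + 1/2 + 1/4 + \ldots )\), where \(d_{\hat{k}}\) denotes the last term.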

Summing up, we get, excluding the last return,

$$\begin{aligned} \sum _{k=1}^{s-1} \sum _{j=\nu _{k-1}+p_{k-1} +1}^{\nu _{k}-1}\frac{|\xi _j(a) - \xi _j(b)|}{{\text {dist}}(\xi _j(b),Crit_b)}&= \sum _{r \ge \Delta } \sum _{k \in (r)} \sum _{j=\nu _{k-1}+p_{k-1}+1}^{\nu _{k}-1} \frac{|\xi _j(a) - \xi _j(b)|}{{\text {dist}}(\xi _j(b),Crit_b)} \nonumber \\&\le C \sum _{r \ge \Delta } \sum _{k \in (r)} \frac{|\xi _{\nu _{k}-1}(a) - \xi _{\nu _{k}-1}(b)|}{\delta } \nonumber \\&\le C \sum _{r \ge \Delta } \frac{|\xi _{\nu _{\hat{k}(r)}-1}(a) - \xi _{\nu _{\hat{k}(r)}-1}(b)|}{\delta } \nonumber \\&\le C \sum _{r \ge \Delta } \frac{e^{\Delta -r}}{r^2} \le \frac{C}{\Delta }. \end{aligned}$$
(7.12)
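The final sum in (7.12) is dominated by its first term: \(\sum _{r \ge \Delta } e^{\Delta -r}/r^2 \le \Delta ^{-2} \sum _{m \ge 0} e^{-m} = \Delta ^{-2} e/(e-1)\), which is in particular at most \(C/\Delta \). A minimal numerical sketch, with a hypothetical function name and the series truncated after 200 terms (the neglected remainder is below \(e^{-200}\)):

```python
import math

def free_period_sum(delta, terms=200):
    """Truncation of sum_{r >= delta} e^{delta - r} / r^2."""
    return sum(math.exp(delta - r) / r**2 for r in range(delta, delta + terms))
```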

Including the last return, we get

$$\begin{aligned} \sum _{k=1}^{s} \sum _{j=\nu _{k-1}+p_{k-1} + 1}^{\nu _{k}-1}\frac{|\xi _j(a) - \xi _j(b)|}{{\text {dist}}(\xi _j(b),Crit_b)} \le \frac{C}{\Delta } + C\varepsilon _1. \end{aligned}$$
(7.13)

If we now pick some n such that \(\nu +p \le n < \nu '\), then letting \(q_1< \ldots < q_t \) be consecutive, so called pseudo-returns into some fixed \(U' {\setminus }U\) so that \(\nu +p \le q_1\), \(q_t \le n\), we proceed as follows. The only difference to returns into U is that we can only say that \({\text {diam}}(\xi _{q_j}(\omega )) \le S\) for pseudo-returns. We do not count bound returns as pseudo-returns but consider only the free pseudo-returns.

The contribution to the sum (7.8) between each pair of pseudo returns is again a constant times the last term for each pseudo-return. Let (r) be the indices l for which \(\xi _{q_l}(\omega )\) is a pseudo return for which \({\text {dist}}(\xi _{q_l}(\omega ),Crit_{\omega }) \sim _{\sqrt{e}} e^{-r}\), and let \(\hat{l}(r)\) be the largest index l for which \({\text {dist}}(\xi _{q_l}(\omega ), Crit_{\omega }) \sim _{\sqrt{e}} e^{-r}\).

Then

$$\begin{aligned} \sum _{j=\nu +p+1}^{q_t} \frac{|\xi _j(a) - \xi _j(b)|}{{\text {dist}}(\xi _j(b),Crit_b)}&= \sum _{j=\nu +p+1}^{q_1} \frac{|\xi _j(a) - \xi _j(b)|}{{\text {dist}}(\xi _j(b),Crit_b)} \nonumber \\&+ \sum _{r=\Delta '}^{\Delta } \sum _{l \in (r), l > 1} \sum _{j=q_{l-1}+1}^{q_l} \frac{|\xi _j(a) - \xi _j(b)|}{{\text {dist}}(\xi _j(b),Crit_b)} \nonumber \\&\le C \sum _{r=\Delta '}^{\Delta } \sum _{l \in (r)} \frac{|\xi _{q_l}(a) - \xi _{q_l}(b)|}{{\text {dist}}(\xi _{q_l}(b),Crit_b)} \nonumber \\&\le C \sum _{r=\Delta '}^{\Delta } \frac{|\xi _{q_{\hat{l}(r)}}(a) - \xi _{q_{\hat{l}(r)}}(b)|}{{\text {dist}}\left( \xi _{q_{\hat{l}(r)}}(b),Crit_b\right) }. \end{aligned}$$
(7.14)

Moreover, we have the assumption that \({\text {diam}}(\xi _k(\omega )) \le S = \varepsilon _1 \delta \), for all \(k \le n\). If \(\xi _{q_l}(\omega )\) is a pseudo return with \({\text {dist}}(\xi _{q_l}(\omega ),Crit_{\omega }) \sim _{\sqrt{e}} e^{-r_l}\), for \(\Delta ' \le r_l \le \Delta \), the contribution will be simply bounded by \(\varepsilon _1 e^{-\Delta }/e^{-r_l}\). We get

$$\begin{aligned} C \sum _{r=\Delta '}^{\Delta } \frac{|\xi _{q_{\hat{l}(r)}}(a) - \xi _{q_{\hat{l}(r)}}(b)|}{{\text {dist}}\left( \xi _{q_{\hat{l}(r)}}(b),Crit_b\right) } \le C \sum _{r=\Delta '}^{\Delta } \varepsilon _1 e^{r-\Delta } \le C \varepsilon _1. \end{aligned}$$
(7.15)

The contribution from the very last iterates from \(q_t < j \le n\) is a constant (depending on the large scale) by the uniform expansion along the early orbit (the bound period) and then outside \(U'\). Summing up,

$$\begin{aligned} \sum _{j=1}^{n} \frac{|\xi _j(a) - \xi _j(b)|}{{\text {dist}}(\xi _j(b),Crit_b)} \le \frac{2C}{\Delta } + \frac{C}{\Delta } + 2C\varepsilon _1, \end{aligned}$$

which can be made small if \(\varepsilon _1\) and \(\delta \) are small enough. This finishes the proof of the lemma.\(\square \)

We now get a posteriori that \(\xi _n\) is almost affine on each partition element \(\omega \). Hence also \(|\xi _n(a) - \xi _n(b)|\) expands according to the space derivative for any parameter \(c \in [a,b]\) i.e.

$$\begin{aligned} |\xi _n(a) - \xi _n(b)| \sim _C |Df_c^{n-j}(\xi _j(c))| |\xi _j(a) - \xi _j(b)|, \end{aligned}$$

for \(C > 1\) close to 1. This is called strong distortion.

Moreover, we see that as long as \(\gamma \ge \gamma _I\) for returns (in general \(\gamma \ge \gamma _I - 4 \alpha K\)), we have good geometry control, i.e. for a partition element \(\omega \), we have (7.3), for all \(a,b \in \omega \).

7.1 Initial distortion

As a direct consequence of the Main distortion lemma, we here state that for any sufficiently small \(\varepsilon \) we can find an interval \(\omega \subset (-\varepsilon ,\varepsilon )\) such that \(\xi _{n,l}(\omega )\) grows to some “large scale” (denoted by S) or returns into U as an essential first return.

Lemma 7.4

(Start-lemma) Let \(f=f_0\) be as in Theorem A and let \(\varepsilon ' > 0\) and \(N >0 \) from Lemma 4.2. There is a neighbourhood U of \(Crit_0\) and a number \(S > 0\) (called the “large scale”), which depends on U such that the following holds. For every sufficiently small \(\varepsilon > 0\) and each critical point \(c_l\) there is some \(N_l \ge N > 0\) such that for every \(a \in \omega =(-\varepsilon ,\varepsilon )\) we have:

  1. (i)

    For some \(\gamma _l \ge \gamma _0 (1-\varepsilon ')\), it holds that

    $$\begin{aligned} |Df_a^k(f_a(c_l(a)))| \ge Ce^{\gamma _l k}, \text { for all }\quad k \le N_l, \end{aligned}$$
  2. (ii)

    for all \(k \le N_l - 1\), it holds that

    $$\begin{aligned} {\text {diam}}(\xi _{k,l}(\omega )) \le \left\{ \begin{array}{cc} \frac{{\text {dist}}(\xi _{k,l}(\omega ),Crit_{\omega })}{(\log ({\text {dist}}(\xi _{k,l}(\omega ),Crit_{\omega })))^2}, &{} \text {if } \xi _{k,l}(\omega ) \cap U \ne \emptyset , \\ S, &{} \text {if } \xi _{k,l}(\omega ) \cap U = \emptyset , \end{array} \right. \end{aligned}$$
  3. (iii)

    for \(k = N_l\), it holds that

    $$\begin{aligned} {\text {diam}}(\xi _{N_l,l}(\omega )) \ge \left\{ \begin{array}{cc} \frac{{\text {dist}}(\xi _{N_l,l}(\omega ),Crit_{\omega })}{(\log ({\text {dist}}(\xi _{N_l,l}(\omega ),Crit_{\omega })))^2}, &{} \text {if } \xi _{N_l,l}(\omega ) \cap U \ne \emptyset , \\ S, &{} \text {if } \xi _{N_l,l}(\omega ) \cap U = \emptyset , \end{array} \right. \end{aligned}$$
  4. (iv)

    and finally, for all \(a,b \in \omega \) it holds that

    $$\begin{aligned} \biggl | \frac{Df_a^{n-N}(\xi _{N}(a))}{Df_b^{n-N}(\xi _{N}(b))} - 1 \biggr | \le \varepsilon ', \text { for all }n \le N_l. \end{aligned}$$

Remark 7.5

The Whitney type of condition on the diameter of \(\xi _n(\omega )\) and its distance to the critical points has the following meaning. With \({\text {dist}}(\xi _{n,l}(\omega ),Crit_{\omega }) \sim e^{-r}\) the diameter becomes \(\sim e^{-r}/r^2\) and this is sufficient for having control of the distortion of the derivative. The condition is also used in the main distortion lemma later.

Proof

As in the proof of Lemma 5.1, we can easily conclude that condition ii) implies that we have very small distortion for \(Df_c\), \(c \in (-\varepsilon ,\varepsilon )\), both in the space variable and in the parameter variable. We see that if \(\xi _n(a)\) and \(\xi _n(b)\) are close in this sense, then for all \(a,b \in (-\varepsilon ,\varepsilon )\), in particular for \(b=0\), the bounded distortion on \(Df_c\) implies

$$\begin{aligned} |Df_a^n(v_l(a))| \ge C^{-n} |Df_0^n(v_l(0))| \ge C_0 e^{\gamma _1 n}, \end{aligned}$$

for some \(\gamma _1\) slightly smaller than \(\gamma _0\) (we may assume that \(\gamma _1 \ge (1-\varepsilon ') \gamma _0\)), if C is close enough to 1 (by choosing S small enough). So \(f_a\) also satisfies the CE-condition with exponent slightly smaller than \(\gamma _0\). We now let \(N_l\) be the maximal integer such that ii) holds. We have shown that also i) holds for this \(N_l\). We can hence use the Main distortion lemma for all following returns until time \(N_l\). Hence iv) holds.\(\square \)

7.2 The partition

If \(J(f) = \hat{{\mathbb C}}\) then we have \(2d-2\) critical points, counting multiplicity. Inside the subspace \(\Lambda _{d,\overline{p}'}\) each critical point moves analytically. So Lemma 7.4 gives at most \(2d-2\) numbers \(N_l\), given an interval \(\omega _0=(-\varepsilon ,\varepsilon )\), such that \(\xi _{N_l,l}(\omega _0)\) has grown to some large scale S (same for all l), or has reached size \(e^{-r}/r^2\) inside U, where \(e^{-r}\) is, more or less, the distance to the critical points, i.e. \({\text {dist}}(\xi _{N_l,l}(\omega _0),Crit_{\omega _0}) \sim e^{-r}\). We now assume that, without loss of generality, \(N_1 = \min (N_l)\). Thus we have the CE-condition satisfied for all critical points up until time \(N_1\), on \(\omega _0\).

If \(N_1\) is not a return time, we have \({\text {diam}}(\xi _{N_1,1}(\omega _0)) \ge S\) by Lemma 7.4. As soon as this happens, we partition the interval \(\omega _0\) into the least number of smaller sub-intervals \(\omega _0^i \subset \omega _0\) of equal length such that \({\text {diam}}(\xi _{N_1,1}(\omega _0^i)) \le S\). We call the sets \(\omega _0^i\) of this type partition elements. We do this partitioning for every critical point at all times outside U until some parameter returns into U. In this way we always have \({\text {diam}}(\xi _{n,l}(\omega )) \le S\) for any partition element \(\omega \), and we study the evolution of each such \(\omega \) separately. We will use \(\omega \subset \omega _0 = (-\varepsilon ,\varepsilon )\) as standard notation for partition elements in the future.
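The partitioning rule outside U can be illustrated as follows. This is only a sketch under a simplifying assumption that is not made in the text: the map \(\xi _{N_1,1}\) is treated as exactly affine on the parameter interval, so that cutting the interval into n equal pieces cuts the image diameter by the factor n. The function name and its arguments are hypothetical.

```python
import math

def equal_partition(left, right, image_diam, S):
    """Split (left, right) into the least number of equal sub-intervals
    whose images have diameter at most S.

    Illustrative assumption: the parameter-to-image map is affine, so a
    piece of relative length 1/n has image diameter image_diam / n.
    """
    n = max(1, math.ceil(image_diam / S))
    step = (right - left) / n
    return [(left + i * step, left + (i + 1) * step) for i in range(n)]
```

For instance, an image of diameter 2.5S needs three pieces, each with image diameter about 0.83S. In the actual construction the map is only almost affine, so the number of pieces may differ slightly from this count.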

Let us go back to the critical point \(c_1\) (\(l=1\)) and assume that \(\omega \subset \omega _0\) is such a partition element and that \(m_1\) is the smallest integer \(m_1 \ge N_1\) such that \(\xi _{m_{1},1}(\omega ) \cap U \ne \emptyset \), i.e. \(\xi _{m_1,1}(\omega )\) is a return into U. If

$$\begin{aligned} \frac{1}{2} \frac{{\text {dist}}(\xi _{m_1,1}(\omega ),Crit_{\omega })}{(\log ({\text {dist}}(\xi _{m_1,1}(\omega ),Crit_{\omega })))^2} \le {\text {diam}}(\xi _{m_1,1}(\omega )), \end{aligned}$$

we speak of an essential return. Otherwise the return is inessential. For essential returns we then partition the interval \(\omega \) into smaller intervals \(\omega _{m_1}^i \subset \omega \) such that

$$\begin{aligned} \frac{1}{2} \frac{{\text {dist}}\left( \xi _{m_1,1}(\omega _{m_1}^i),Crit_{\omega _{m_1}^i}\right) }{\left( \log ({\text {dist}}(\xi _{m_1,1}(\omega _{m_1}^i),Crit_{\omega _{m_1}^i}))\right) ^2}\le & {} {\text {diam}}\left( \xi _{m_1,1}(\omega _{m_1}^i)\right) \nonumber \\\le & {} \frac{{\text {dist}}\left( \xi _{m_1,1}(\omega _{m_1}^i),Crit_{\omega _{m_1}^i}\right) }{\left( \log ({\text {dist}}(\xi _{m_1,1}(\omega _{m_1}^i),Crit_{\omega _{m_1}^i}))\right) ^2}.\qquad \qquad \end{aligned}$$
(7.16)

These smaller intervals \(\omega _{m_1}^i\) are also called partition elements (at time \(m_1\)). The condition (7.16) implies that we have control of the distortion:

$$\begin{aligned} \frac{|Df_a(\xi _{m_1,1}(a))|}{|Df_b(\xi _{m_1,1}(b))|} \le C(\tilde{r}), \quad \text { for all }a,b \in \omega _{m_1}^i. \end{aligned}$$

where \(C(\tilde{r}) > 1\) and tends to 1 as \(\tilde{r} = -\log ({\text {dist}}(\xi _{m_1,1}(\omega _{m_1}^i),Crit_{\omega _{m_1}^i}))\) tends to infinity. We let \(r = \lceil \tilde{r} - 1/2 \rceil \). For \(\omega _{m_1}^i\) above, we associate \(r = r(\tilde{r})\) to \(\tilde{r}\), and it follows that

$$\begin{aligned} {\text {dist}}\left( \xi _{m_1,1}(\omega _{m_1}^i),Crit_{\omega _{m_1}^i}\right) \sim _{\sqrt{e}} e^{-r}, \end{aligned}$$

(cf. with the annular neighbourhoods in [1]). Moreover, we see that

$$\begin{aligned} {\text {diam}}\left( \xi _{m_1,1}(\omega _{m_1}^i)\right) \sim _2 e^{-r}/r^2, \end{aligned}$$

if \(r \ge \Delta \) and \(\Delta \) is sufficiently large. When we write \({\text {dist}}(A,B) \sim _{\sqrt{e}} e^{-r}\), typically with \(A=\xi _{n,l}(\omega )\) and \(B=Crit_{\omega }\), we mean that r is the unique integer such that \(r = \lceil -\log {\text {dist}}(A,B) -1/2 \rceil \), i.e. \({\text {dist}}(A,B) \in [e^{-r-1/2},e^{-r+1/2})\).
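The rounding rule and the essential/inessential dichotomy can be made concrete in a few lines. The sketch below uses hypothetical function names and takes the distance and the diameter as plain floating-point numbers with \(0< {\text {dist}} < 1\); it is an illustration of the definitions, not part of the construction.

```python
import math

def annulus_index(dist):
    """The integer r with dist in [e^{-r-1/2}, e^{-r+1/2})."""
    if not 0 < dist < 1:
        raise ValueError("expected a distance in (0, 1)")
    return math.ceil(-math.log(dist) - 0.5)

def is_essential(diam, dist):
    """Essential return: diam >= (1/2) * dist / (log dist)^2."""
    return diam >= 0.5 * dist / math.log(dist) ** 2
```

For example, a distance of 0.05 gives \(-\log (0.05) \approx 2.996\), hence \(r = 3\), and indeed \(e^{-3.5} \le 0.05 < e^{-2.5}\).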

For each return, and in particular this first return, we partition parameter intervals according to the above rule. Moreover, we delete parameters not satisfying the basic assumption and show later that the Lebesgue measure of the set deleted is a small portion of the total interval returning into U. It is quite easy to see that this is the case for the first return. Because of the slow recurrence condition, we see that

$$\begin{aligned} e^{-r} \ge e^{-\alpha m_1} \gg e^{-2\alpha m_1}. \end{aligned}$$

Hence, the basic assumption possibly forces us to delete a small fraction of parameters at time \(m_1\).

After this return the first bound period starts, and the whole idea is that binding the old orbit to the early orbit of possibly another critical point will, via distortion control, transfer the derivative gain from the early orbit to the old orbit. To do this we need to be able to use the binding time for all critical points in the induction. We continue like this as long as we can use the binding information for all critical points, up until time \(N_1\). This procedure creates a Cantor set (denoted by \(\Omega _l(m)\)) of “good” parameters, for each critical point \(c_l\), that do satisfy the basic assumption up until some time m, which turns out to be much larger than \(N_1\), because the bound periods for a return \(\xi _{m,l}(\omega )\) into U are much smaller than m itself (Lemma 6.4).

At this point, we have to delete more parameters so that the binding period can be used longer. A potential problem here is that different critical points \(c_l\) may produce different Cantor sets up until time m, and if we take intersections of these sets, we may destroy the partition elements. But the idea is that the partition elements formed at times between, say, \(N_1\) and \(2N_1\) are much larger than those partition elements formed around time \(m \gg N_1\). We develop this idea, which is due to Benedicks, later.

In the construction, the growth of the derivative along critical orbits is never allowed to go below a certain level, in order to have the whole machine working. Recall that \(\gamma _B = (3/4) \min (\gamma _0,\gamma _H) (1-\tau )\), where \(0< \tau < 1\). This exponent \(\gamma _B\) should be thought of as the desired Lyapunov exponent, which we will get at the end. It will also be used as an induction assumption. The number \(\tau \) can be chosen freely but \(\delta \) depends on it (see Sect. 8, Lemma 8.7). The intermediate Lyapunov exponent \(\gamma _I = (1/4) \min (\gamma _0,\gamma _H) (1-\tau ) < \gamma _B/2\) will be an assumption in most lemmas.
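The relation between the two exponents is simple arithmetic: \(\gamma _I = \gamma _B/3\), which is why \(\gamma _I < \gamma _B/2\). A short sketch with a hypothetical helper and illustrative numerical values only:

```python
def lyapunov_exponents(gamma_0, gamma_H, tau):
    """Return (gamma_B, gamma_I) as defined in the text, for 0 < tau < 1."""
    if not 0 < tau < 1:
        raise ValueError("tau must lie in (0, 1)")
    base = min(gamma_0, gamma_H) * (1 - tau)
    return 0.75 * base, 0.25 * base
```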

8 Large deviations

We will make an induction over time intervals of the type [n, 2n] and assume from now on that we make partitions as described above. Given a good situation at time n with growth of the derivative, we first delete the parameters not satisfying the basic assumption up until time 2n. But according to Lemma 7.1, this means that we may lose some part of the Lyapunov exponent. Therefore we make use of the famous large deviation argument, developed by Benedicks and Carleson, to restore the Lyapunov exponent up until time 2n.

This section closely follows earlier papers such as [1, 5].

Lemma 8.1

Suppose that \(\xi _{\nu ,l}(\omega )\) is an essential return into \(U_i\), and that the Lyapunov exponent \(\gamma \ge \gamma _I\) for all critical points, \(\omega \subset {\mathcal E}_{\nu ,l}(\gamma ) \cap {\mathcal B}_{\nu ,l}\). Then if \(\nu '\) is the next return time, we have that the set \(\hat{\omega }\) of parameters in \(\omega \) that satisfy the basic assumption has Lebesgue measure

$$\begin{aligned} m(\hat{\omega }) \ge (1-e^{-\alpha \nu }) m(\omega ). \end{aligned}$$

Proof

This follows quite easily, since the interval \(\xi _{\nu +p}(\omega )\) grows rapidly during the bound period p. By Lemma 6.4, Lemma 6.3 and Lemma 7.3, we get, for any \(a\in \omega \),

$$\begin{aligned} {\text {diam}}(\xi _{\nu +p+1,l}(\omega ))&\sim \frac{e^{-rd_i}}{r^{2}} |Df^p(\xi _{\nu +1,l}(a))| \nonumber \\&\sim \frac{|\xi _{\nu +p+1,l}(a) - \xi _{p+1,i}(a) |}{r^{2}} \nonumber \\&\ge C e^{-\alpha (p+1) - 2 \log r} {\text {dist}}(\xi _{p+1,i}(a),Crit_{a}) \nonumber \\&\ge C K_b e^{-2\alpha (p + 1) - \alpha (p+1) - 2 \log r} \ge e^{-(7/2) \alpha p - 2 \log r}, \end{aligned}$$
(8.1)

if p is large. So, by Lemma 3.1,

$$\begin{aligned}{} & {} {\text {diam}}(\xi _{\nu ',l}(\omega )) \ge {\text {diam}}(\xi _{\nu +p+1,l}(\omega )) C' e^{\gamma _H(\nu '-(\nu +p+1))} \ge C' e^{-(7/2) \alpha p - 2 \log r} \nonumber \\{} & {} \quad \ge e^{-7 \alpha d r/\gamma - 2 \log r } \ge e^{ - \frac{8 \alpha d}{\gamma } r}. \end{aligned}$$
(8.2)

Recalling the distortion control from the Main Distortion Lemma 7.3 together with Lemma 4.2, we see that the measure of parameters deleted at \(\nu '\) is

$$\begin{aligned} \frac{ |\omega | - |\hat{\omega }|}{|\omega |} \le 2 \frac{e^{-2 \alpha \nu '}}{{\text {diam}}(\xi _{\nu '}(\omega ))} \le 2e^{-\alpha ( 2 - \frac{8 \alpha d}{\gamma } ) \nu } \le e^{-\alpha \nu }, \end{aligned}$$

since \(\alpha K/\gamma \le 1/100\), \(K = \max (d)\) (maximal degree of the critical points).\(\square \)
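The last inequality in the proof can be checked numerically. Since \(d \le K\) and \(\alpha K/\gamma \le 1/100\), the exponent factor \(2 - 8\alpha d/\gamma \) is at least 1.92, and the remaining factor 2 is absorbed as soon as \(\alpha \nu \ge \log 2/0.92 \approx 0.75\), which holds for large \(\nu \). The sketch below uses a hypothetical function name and illustrative parameter values:

```python
import math

def deleted_fraction_bound(alpha, d, gamma, nu):
    """The quantity 2 * exp(-alpha * (2 - 8*alpha*d/gamma) * nu) from the proof."""
    return 2 * math.exp(-alpha * (2 - 8 * alpha * d / gamma) * nu)
```

With \(\alpha = 0.001\), \(d = 2\) and \(\gamma = 0.2\), the bound is below \(e^{-\alpha \nu }\) for \(\nu = 1000\) but not yet for \(\nu = 500\), which illustrates why the return time has to be large.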

We now define escape time and escape situation. Let \(U^2\) be a neighbourhood of Crit such that \(U^2 = \cup _j B(c_j, \delta ^2) \subset U\). We say that a deep return is characterised by \(\xi _{n,l}(\omega ) \cap U^2 \ne \emptyset \) and a shallow return means that \(\xi _{n,l}(\omega ) \cap U^2 = \emptyset \) but \(\xi _{n,l}(\omega ) \cap U \ne \emptyset \). We then speak of deep returns into \(U^2\) and shallow returns into \(U {\setminus }U^2\) even if the actual curve \(\xi _{n,l}(\omega )\) does not entirely lie inside \(U^2\) or \(U {\setminus }U^2\) respectively. We also let \(\omega _n(a)\) be the corresponding partition element following the parameter a, i.e. the unique \(\omega \) such that \(\xi _n(\omega )\) has diameter bounded by S if \( \xi _n(\omega ) \cap U = \emptyset \) and bounded by \({\text {dist}}(\xi _n(\omega ),Crit_{\omega }) / (\log {\text {dist}}(\xi _n(\omega ),Crit_{\omega }) )^2\) if \( \xi _n(\omega ) \cap U \ne \emptyset \).

Definition 8.2

We say that \(\xi _{n}(\omega )\), or \(\omega \) itself, has escaped, or is in escape position, if \({\text {diam}}(\xi _{n}(\omega )) \ge S\) just before partitioning, and the bound period has passed. In other words, \({\text {diam}}(\xi _n(\omega _{n-1}(a))) \ge S\), where \(\omega = \omega _{n-1}(a)\) for some \(a \in \omega \).

The escape time for a parameter \(a \in \omega \) for a deep return \(\xi _{\nu ,l}(\omega )\) into \(U^2\) is defined as the least number \(n - \nu \ge 0\) such that \(\xi _{n,l}(\omega _{n-1}(a))\) has reached escape position. We write \(E_l(a,\nu ) = n - \nu \) for this escape time. We also define the escape time for shallow returns, i.e. if \(\xi _{\nu ,l}(\omega ) \subset U {\setminus }U^2\), to be equal to zero.

If some parameter \(a \in \omega \) has that \(\xi _{\nu ',l}(\omega _{\nu '}(a))\) does not satisfy the basic approach rate condition, i.e. returns too deep for some \(\nu ' > \nu \) before it escapes, then those parameters get deleted and we put \(E_l(a,\nu ) = -\infty \).

Lemma 8.3

Suppose that \(\xi _{\nu ,l}(\omega )\) is an essential return into \(U_i\), \(\omega \subset {\mathcal E}_{\nu ,l}(\gamma ) \cap {\mathcal B}_{\nu ,l}\), \(\gamma \ge \gamma _I\) and that \({\text {dist}}(\xi _{\nu ,l}(\omega ),Crit_{\omega }) \sim _{\sqrt{e}} e^{-r}\). Put \(h = 8K^2/\gamma _I\). Then if \(q = n - \nu \) where n is the next essential return or the time when \(\xi _{n,l}(\omega )\) is in escape position, whichever comes first, we have the estimate

$$\begin{aligned} q \le h r. \end{aligned}$$

Proof

Let us put \(D_j = |Df^j(\xi _{\nu ,l}(a))|\), for \(a \in \omega \). By the definition of the bound period, the basic assumption, and Lemma 6.4, for all \(a \in \omega \),

$$\begin{aligned}{} & {} D_{p+1} \ge C e^{-\alpha (p+1)} {\text {dist}}\left( \xi _{p+1,i}(a), Crit_{a}\right) e^r \nonumber \\{} & {} \quad \ge e^{-2 \alpha (p+1) - \alpha (p+1) + r} \ge e^{r\left( 1-\frac{7 \alpha K}{\gamma }\right) } \ge e^{r\left( 1-\frac{7 \alpha K}{\gamma _I}\right) }. \end{aligned}$$
(8.3)

Let \(m_j\) be the inessential returns after \(\nu \), i.e. \(\nu< m_1< m_2< \ldots< m_s < n\). Let \(p_j\) and \(q_j\) be the bound and free periods respectively following \(m_j\). Let \(p_0\) and \(q_0\) be the bound and free periods following the return \(\nu \). It can happen that escape takes place before a return takes place, and then \(q_s\) is not a complete free period. It can also happen that n is a time during the bound period for \(m_s\). But then we have \(m_s+p_s - \nu \) as an upper bound for q and we can assume that \(q > m_s+p_s\).

Suppose that \({\text {dist}}(\xi _{m_j}(\omega ),Crit_{\omega }) \sim _{\sqrt{e}} e^{-r_j}\), and let \(r=r_0\). Suppose that \(n=\nu '\) is a return. Then, as long as the bound period is bounded by \((6 K \alpha /\gamma _I)\nu \), we can use the same estimate as (8.3), and Lemma 3.1, to obtain

$$\begin{aligned}&{\text {diam}}(\xi _{n}(\omega )) \sim |Df_a^{n - \nu }(\xi _{\nu }(a))| {\text {diam}}(\xi _{\nu }(\omega )) \nonumber \\&\quad = \prod _{j=0}^{s} |Df_a^{p_j}(\xi _{m_j}(a))| C' e^{\gamma _H q_j} {\text {diam}}(\xi _{\nu }(\omega )) \nonumber \\&\quad \ge e^{r\left( 1-\frac{7 \alpha K}{\gamma _I}\right) } C' e^{\gamma _H q_0} {\text {diam}}(\xi _{\nu }(\omega )) \prod _{j=1}^s e^{r_j\left( 1-\frac{7 \alpha K}{\gamma _I}\right) } \prod _{j=1}^s C' e^{q_j \gamma _H} \nonumber \\&\quad \ge e^{-r \frac{8 \alpha K}{\gamma _I} + q_0 \gamma _H} \prod _{j=1}^s e^{r_j\left( 1-\frac{8 \alpha K}{\gamma _I}\right) + q_j \gamma _H}. \end{aligned}$$
(8.4)

If n was not a return, then let \(q_1< \ldots < q_t\) be the pseudo-returns after \(m_s+p_s\). Between each pair of pseudo-returns we have uniform expansion of the derivative according to Lemma 7.2. Between \(m_s+p_s\) and \(q_1\) we also have uniform expansion according to Lemma 3.1. So we only need to consider the last time period, from \(q_t\) to n. Since \(\xi _{q_t}(\omega )\) may belong to \(U' {\setminus }U\) we have \(|Df_a(\xi _{q_t}(a))| \ge e^{-K \Delta }\) for all \(a \in \omega \). After time \(q_t\) we can use the binding information, Lemma 6.3 and the first statement of Lemma 3.1 with \(U = U'\), depending on whether n belongs to the bound period or not. In any case we get uniform expansion; \(|Df_a^{n-q_t-1}(\xi _{q_t+1}(a))| \ge C e^{\min (\gamma ,\gamma _H) (n-q_t-1)}\). In other words, with \(z = \xi _{m_s+p_s}(a)\), for \(a \in \omega \),

$$\begin{aligned} |Df_a^{n-(m_s+p_s)}(z)|&= |Df_a^{q_1-(m_s+p_s)}(z)| |Df_a^{q_2-q_1}\left( f_a^{q_1-(m_s+p_s)}(z)\right) | \nonumber \\&\cdot \ldots \cdot |Df_a^{q_t-q_{t-1}}\left( f_a^{q_{t-1}-(m_s+p_s)}(z)\right) | |Df_a \left( f_a^{q_t-(m_s+p_s)}(z)\right) | \nonumber \\&\cdot |Df_a^{n-q_t-1}\left( f_a^{q_t-(m_s+p_s) + 1}(z)\right) | \nonumber \\&\ge C' e^{\gamma _H (q_1 -(m_s+p_s))} e^{\gamma _2 (q_t-q_1)} e^{-K \Delta } C e^{\min (\gamma ,\gamma _H) (n-q_t-1)} \nonumber \\&\ge e^{\gamma _C (n-(m_s+p_s))} e^{-K\Delta }, \end{aligned}$$
(8.5)

since \(\gamma _2 \ge \min (\gamma ,\gamma _H)/(3K) \ge \gamma _C\). So we may have to replace \(q_s \gamma _H\) with \(\gamma _C q_s - K \Delta \) in (8.4), where \(q_s = (n-(m_s+p_s))\) in this case.

Since \(\gamma _C < \gamma _H\), and \({\text {diam}}(\xi _{n}(\omega ))\) is assumed to be at most \(S = \varepsilon _1 \delta \le 1\), we therefore get

$$\begin{aligned} \sum _{j=1}^s r_j \left( 1-\frac{8 \alpha K}{\gamma _I}\right) + \sum _{j=0}^s q_j \gamma _C \le r \frac{8 \alpha K}{\gamma _I} + K\Delta . \end{aligned}$$

Hence, if \(q = \sum _{j=0}^s p_j + \sum _{j=0}^s q_j\), we get

$$\begin{aligned} q - p_0&= \sum _{j=1}^s p_j + \sum _{j=0}^s q_j \le \sum _{j=1}^s \frac{2K}{\gamma _I} r_j + \sum _{j=0}^s q_j \nonumber \\&\le \sum _{j=1}^s \frac{4 K}{\gamma _I} \left( 1 - \frac{8 \alpha K}{\gamma _I}\right) r_j + \frac{1}{\gamma _C} \sum _{j=0}^s q_j \gamma _C \nonumber \\&\le \max \biggl ( \frac{4 K}{\gamma _I}, \frac{1}{\gamma _C} \biggr ) \biggl ( \sum _{j=1}^s r_j \left( 1-\frac{8 \alpha K}{\gamma _I}\right) + \sum _{j=0}^s q_j\gamma _C \biggr ) \nonumber \\&\le \max \biggl ( \frac{4 K}{\gamma _I}, \frac{1}{\gamma _C} \biggr ) \biggl ( \frac{8 \alpha K}{\gamma _I} r +K\Delta \biggr ) \le \frac{6K^2}{\gamma _I}r, \end{aligned}$$
(8.6)

since \(4K/\gamma _I > 1/\gamma _C\), and \(\alpha \le 4 \gamma _I /(400K^2 \Gamma )\). Now, \(p_0 \le 2Kr/\gamma _I\), so

$$\begin{aligned} q \le \frac{8K^2}{\gamma _I}r. \end{aligned}$$
(8.7)

Since the total time is bounded from above by \(\nu + 8K^2 r/\gamma _I \le (3/2) \nu \), we can use the binding information the whole time.\(\square \)

We will now estimate the measure of the set of parameters having a specific history for the returns in a time window of the form [n, 2n]. For simplicity, suppose that \(\xi _{\nu }(\omega _0)\) is an essential return with \({\text {dist}}(\xi _{\nu }(\omega _0),Crit_{\omega _0}) \sim _{\sqrt{e}} e^{-r_0}\) and \(\nu \ge n\) (\(\nu \) should be thought of as the smallest return time after n). Let us study the evolution of \(\xi _m(\omega _m(a))\) as m goes through a sequence of essential returns \(\nu _1, \nu _2, \ldots , \nu _s \le 2n\). Let us also assume that \(\omega _{\nu _j}(a) \subset {\mathcal E}_{\nu _j,l}(\gamma _I) \cap {\mathcal B}_{\nu _j,l}\), for these returns so that we can use the binding information of all other critical points up to time 2n. This is not a strong assumption, as we now explain. Suppose \(a \in {\mathcal E}_{n,l}(\gamma _B) \cap {\mathcal B}_{2n,l}\), i.e. we assume that the basic approach rate condition is fulfilled up until time 2n. The Lyapunov exponent will not drop too much at each return in the interval [n, 2n], because we can use Lemma 6.4 at each return and get a trivial lower bound for the expansion, namely 1 during the bound period. But this means that the actual Lyapunov exponent is bounded from below, and we get a trivial bound,

$$\begin{aligned} |Df^{2n}(v_l(a))| \ge e^{\gamma _B n} \ge e^{2n \gamma _I}, \end{aligned}$$

since \(\gamma _I \le \gamma _B/2\). In other words, \(a \in {\mathcal E}_{2n,l}(\gamma _I) \cap {\mathcal B}_{2n,l}\).

By the Main Distortion Lemma 7.3, which then gives good geometry control, the diameter of \(\xi _{\nu _j+p_j}(\omega _{\nu _j+p_j}(a))\) is more or less equal to the length of the curve (which is then more or less straight), i.e. \(\sim e^{-(7 K \alpha /\gamma ) r_j}\), see inequality (8.3). After the free period it may expand further, and to get rid of the constant \(C'\) in Lemma 3.1, we may say that the curve \(\xi _{\nu _{j+1}}(\omega _{\nu _{j+1}}(a))\) has a diameter at least \(e^{-(8 K \alpha / \gamma ) r_j}\). We therefore get, with \(\gamma \ge \gamma _I\), that the measure of those parameters \(b \in \omega _{\nu _j}(a)\) entering into U with \({\text {dist}}(\xi _{\nu _{j+1}}(b), Crit_{b} ) \sim _{\sqrt{e}} e^{-r_{j+1}}\) is

$$\begin{aligned} m(\omega _{\nu _{j+1}}(a)) = m\left( \left\{ b \in \omega _{\nu _{j}}(a) : \xi _{\nu _{j+1}}(b) \sim _{\sqrt{e}} e^{-r_{j+1}} \right\} \right) \le C \frac{e^{-r_{j+1}}}{e^{-(8 K \alpha /\gamma ) r_j}} m(\omega _{\nu _j}(a)) , \end{aligned}$$
(8.8)

(recall that we do not partition \(\omega _{\nu _j}(a)\) until the next return, so \(\omega _{\nu _j}(a)= \omega _{\nu _{j+1}-1}(a)\)). So suppose now that we have a sequence of s essential returns \(\nu _1, \nu _2, \ldots , \nu _s \le 2n\). Let us also assume that we always have a lower bound, \(\gamma _I\), for the Lyapunov exponent, i.e. \(a \in {\mathcal E}_{2n,l}(\gamma _I) \cap {\mathcal B}_{2n,l}\) for the parameters we are considering. Then the portion of the starting interval, call it \(\omega _0 = \omega _{\nu }(a)\) for some \(a \in \omega _0\), that has this specific history is, with \(\omega _j = \omega _{\nu _j}(a)\),

$$\begin{aligned} \frac{m(\omega _s)}{m(\omega _0)} = \prod _{j=0}^{s-1} \frac{m(\omega _{j+1})}{m(\omega _j)} \le C^s \prod _{j=0}^{s-1} \frac{e^{-r_{j+1}}}{e^{-(8 K \alpha /\gamma ) r_j}}. \end{aligned}$$
(8.9)

We continue to follow [1, 5] more or less verbatim. Let \(R=r_1+r_2 + \ldots + r_s\). We now compute the number of combinations of choosing such \(r_j\) given that \(r_j \ge \Delta \ge 0\). Let us not yet take into account that we are partitioning the intervals into smaller intervals such that

$$\begin{aligned} {\text {diam}}(\xi _{\nu _{j}}(\omega )) \sim _{\sqrt{e}} e^{-r_{j}} /r_{j}^2 , \text { for each }j = 1, \ldots , s, \end{aligned}$$
(8.10)

where \(\omega = \omega _{\nu _j}(a)\). Hence for each such set we have another \(r_{j}^2\) possibilities.

By the pigeonhole principle, an upper bound for this number of combinations, disregarding these extra \(r_j^2\) possibilities, is

$$\begin{aligned} \left( {\begin{array}{c}R+s-1\\ s-1\end{array}}\right) . \end{aligned}$$

By Stirling’s formula this can be estimated as follows, using that \(R \ge s \Delta \),

$$\begin{aligned} \left( {\begin{array}{c}R+s-1\\ s-1\end{array}}\right)&\le C \frac{1}{\sqrt{2\pi }} \frac{(R+s-1)^{R+s-1} e^{-R-s+1}}{R^R e^{-R} (s-1)^{s-1} e^{-s}} \sqrt{\frac{R+s-1}{R(s-1)}} \nonumber \\&\le \frac{R^{R+\frac{R}{\Delta }} \left( 1+\frac{1}{\Delta }\right) ^{\left( 1+\frac{1}{\Delta }\right) R}}{R^R \left( \frac{R}{\Delta }\right) ^{R/\Delta }} \nonumber \\&\le \biggl ( \Delta ^{1/\Delta } \bigg (1 + \frac{1}{\Delta }\bigg )^{1+\frac{1}{\Delta }} \biggr )^R \le 2(1 + \eta (\Delta ))^R \end{aligned}$$
(8.11)

if \(\Delta \) is large enough, where \(\eta (\Delta ) = {\mathcal O}(\log \Delta /\Delta )\), so that \(\eta (\Delta ) \rightarrow 0\) as \(\Delta \rightarrow \infty \).
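As a sanity check on the counting estimate (8.11), the following minimal sketch compares the exact binomial coefficient with the bound \(\bigl (\Delta ^{1/\Delta }(1+1/\Delta )^{1+1/\Delta }\bigr )^R\) for all \(1 \le s \le R/\Delta \). The function names and the tested ranges are our own illustrative choices and do not appear in the argument; logarithms are used only to avoid overflow.

```python
import math


def log_binomial(R: int, s: int) -> float:
    """Logarithm of the count C(R+s-1, s-1) appearing in (8.11)."""
    return math.log(math.comb(R + s - 1, s - 1))


def log_base(Delta: float) -> float:
    """Logarithm of Delta^(1/Delta) * (1 + 1/Delta)^(1 + 1/Delta)."""
    return math.log(Delta) / Delta + (1 + 1 / Delta) * math.log1p(1 / Delta)


def bound_holds(Delta: int, R_max: int) -> bool:
    """Check C(R+s-1, s-1) <= base(Delta)^R for Delta <= R <= R_max, 1 <= s <= R/Delta."""
    for R in range(Delta, R_max + 1):
        for s in range(1, R // Delta + 1):
            # small tolerance for floating point rounding only
            if log_binomial(R, s) > R * log_base(Delta) + 1e-9:
                return False
    return True
```

In this sketch the base tends to 1 as \(\Delta \) grows, which is the only property of \(\eta (\Delta )\) used later.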

Taking into account now (8.10), we get that the number of combinations is

$$\begin{aligned} 2(1+\eta (\Delta ))^R \prod _{j=1}^s r_j^2 \le e^{R/32} (1 + \eta (\Delta ))^R. \end{aligned}$$

We can rewrite (8.9) to get, (recall the condition on \(\alpha \)),

$$\begin{aligned} \frac{m(\omega _s)}{m(\omega _0)} \le C^s e^{r_0 (8 \alpha K/\gamma ) - \sum _{j=1}^{s-1} r_j (1- 8 \alpha K/\gamma ) - r_s } \le C^se^{r_0 (8 \alpha K/\gamma ) - (15/16) R }. \end{aligned}$$

Given an essential return \(\xi _{\nu ,l}(\omega )\), let \(A_{s,R} \subset \omega \) be the set of those parameters having exactly s essential returns as above before escaping at the \((s+1)\)-st return, for a fixed R. Each pair of sequences \(\{\nu _j \}_{j=1}^s, \{ r_j\}_{j=1}^s\) defines a unique history for a parameter \(a \in A_{s,R}\). Letting s and R vary, \(\omega \) gets partitioned into a (likely huge) number of smaller intervals having this specific history. But let us fix s and let \(\hat{\omega }_s\) be the largest of these partition intervals for this fixed s. Then

$$\begin{aligned} |A_{s,R}| \le |\hat{\omega }_s| e^{R/32}(1 + \eta (\Delta ))^R. \end{aligned}$$

Now we show that the set of those parameters for which \(\xi _n(a)\) returns too frequently and too deep into U has very small Lebesgue measure. This is handled via the so-called large deviation argument, originally developed in [5], which is probabilistic in nature, although the system we are considering is deterministic.

For an essential return \(\xi _{\nu ,l}(\omega )\) into \(U^2\) where \({\text {dist}}(\xi _{\nu ,l}(\omega ), Crit_{\omega }) \sim _{\sqrt{e}} e^{-r}\), suppose that \(a \in \omega \) has s essential returns before it has escaped. Then according to Lemma 8.3, we have,

$$\begin{aligned} E_l(a,\nu ) \le \sum _{j=0}^s h r_j \le h r + h R, \end{aligned}$$

where \(R = r_1 + \ldots + r_s\). So the escape time \(t \le hr + hR\), i.e. it is bounded in terms of how deep the returns are. Let us estimate the measure of those parameters that escape at a certain (long) time t.

Put \(r=r_0\). We get, given that \(\Delta \) is large enough,

$$\begin{aligned} m(\{ a \in \omega : E_l(a,\nu ) = t \})&\le \sum _{R \ge t/h - r_0, s \le R/\Delta } |A_{s,R}| \nonumber \\&\le \sum _{R \ge t/h - r_0, s \le R/\Delta } |\hat{\omega }_s| e^{R/32}(1 + \eta (\Delta ))^R \nonumber \\&\le |\omega | \sum _{R=t/h - r_0}^{\infty } \sum _{s=1}^{R/\Delta } e^{R/32}(1 + \eta (\Delta ))^R C^s e^{r_0 (8 \alpha K /\gamma ) - (15/16) R } \nonumber \\&\le C' |\omega | \sum _{R=t/h - r_0}^{\infty } C^{R/\Delta } e^{-R\left( \frac{29}{32} - \eta (\Delta )\right) + (8 K \alpha /\gamma )r_0} \nonumber \\&\le C' |\omega | e^{-\left( \frac{t}{h} - r_0\right) \frac{28}{32} + (8 K \alpha /\gamma )r_0} \nonumber \\&\le C' |\omega | e^{-\frac{t}{h}\frac{28}{32} + \left( \frac{28}{32} + \frac{8 K\alpha }{\gamma }\right) r_0}. \end{aligned}$$
(8.12)

for some constant \(C' > 0\).

By the condition on \(\alpha \), if \(\gamma \ge \gamma _I\), we get an estimate of the measure of parameters for large escape times. Let us suppose that \(t > 2hr_0\). Then

$$\begin{aligned} m(\{ a \in \omega : E_l(a,\nu ) = t \}) \le Ce^{-\frac{t}{3h}} |\omega |. \end{aligned}$$
(8.13)
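The passage from (8.12) to (8.13) is a small piece of arithmetic on the exponent. A minimal sketch, writing \(\kappa = 8K\alpha /\gamma \) (our shorthand, not used elsewhere) and using \(r_0 < t/(2h)\): the exponent in (8.12) is then at most \(-(t/h)(14/32 - \kappa /2)\), and the rate \(1/3\) in (8.13) is reached as soon as \(\kappa \le 5/24\). Exact rational arithmetic is used so that the threshold can be read off without rounding.

```python
from fractions import Fraction


def decay_rate(kappa: Fraction) -> Fraction:
    """Coefficient c with exponent <= -c * t / h in (8.12), assuming r_0 <= t / (2h)."""
    return Fraction(28, 32) - (Fraction(28, 32) + kappa) / 2


def kappa_threshold() -> Fraction:
    """Largest kappa = 8 K alpha / gamma for which decay_rate(kappa) >= 1/3."""
    return 2 * (Fraction(14, 32) - Fraction(1, 3))
```

Whether the standing condition on \(\alpha \) gives \(\kappa \le 5/24\) depends on the constants fixed earlier in the paper; the sketch only isolates the inequality that is needed.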

We now follow a parameter \(a \in \omega \) in a time window [n, 2n], and estimate its total time spent on escaping from essential returns. Recall that given an essential return \(\xi _{\nu }(\omega _{\nu }(a))\), the parameter a has to escape first before we can start counting the next escape time. Let

$$\begin{aligned} T_n(a) = T_{n,l}(a) = \sum _{j=1}^{s(a)} E_l(a,\nu _j(a)), \end{aligned}$$

where \(\nu _j(a)\) are essential returns after escape situations, and \(s=s(a)\) the total number of such returns in [n, 2n]. We include shallow returns above also but then, by definition, the escape time is zero, so one needs only consider deep returns in the sum.

Remark 8.4

A note on the last return \(\nu _s\) in the expression of \(T_{n,l}(a)\). The escape period of the last return \(\nu _s=\nu _s(a)\), by definition, has to transcend into the next time window [2n, 4n]. If it is too long it may deteriorate the Lyapunov exponent for that parameter too much. Here we make the following convention, namely that if \(E(a,\nu _s) \ge 6 h \alpha n\) (where \(6 h \alpha n \ll n\)), then we delete those parameters. They constitute an exponentially small portion of the parameters in \(\omega \) [put \(t=6h \alpha n\) in equation (8.13)], i.e. they have measure \(\le |\omega | Ce^{- q n}\), where \(q = 2 \alpha \). We simply disregard those parameters in the above expression for \(T_n(a)\). They can easily be taken care of in the final proof in the next section.

In order to reach the main conclusion that the set of parameters having too many too deep returns in the time window [n, 2n] has small measure, we want to estimate, for suitable \(\theta > 0\), the integral

$$\begin{aligned} \frac{1}{|\omega |} \int _{\omega } e^{\theta T_n(a)} \,\textrm{d}a. \end{aligned}$$

There is some freedom in how to choose \(\theta \), but let us set \(\theta = 1/(6\,h)\).

Lemma 8.5

Let \(\xi _{\nu ,l}(\omega )\) be a deep essential return with \({\text {dist}}(\xi _{\nu ,l}(\omega ), Crit_{\omega }) \sim _{\sqrt{e}} e^{-r}\), \(n \le \nu \le 2n\), and \(\omega \subset {\mathcal E}_{\nu ,l,\star } (\gamma ) \cap {\mathcal B}_{\nu ,l,\star }\) for some \(\gamma \ge \gamma _I\). Suppose also that every parameter \(a \in \omega \cap {\mathcal B}_{2n,l}\) satisfies \(a \in {\mathcal E}_{2n,l}(\gamma _I)\). Then

$$\begin{aligned} \int _{ \{ a \in \omega : 2hr \le E_l(a,\nu ) \le \nu -n \} } e^{\theta E_l(a,\nu )} \,\textrm{d}a&\le C e^{-r/3} |\omega |, \end{aligned}$$
(8.14)
$$\begin{aligned} \int _{ \{ a \in \omega : E_l(a,\nu ) \le 2hr \} } e^{\theta E_l(a,\nu )} \,\textrm{d}a&\le C e^{r/3} |\omega |. \end{aligned}$$
(8.15)

Proof

By (8.13) we have,

$$\begin{aligned} \int _{ \{ a \in \omega : E_l(a,\nu ) \ge 2hr \} } e^{\theta E_l(a,\nu )} \,\textrm{d}a&\le C \sum _{t \ge 2hr} e^{-\frac{t}{3h}} e^{\theta t} |\omega | \nonumber \\&= C \sum _{t \ge 2hr} e^{-\frac{t}{6h}} |\omega | \le C e^{-r/3}|\omega |. \end{aligned}$$
(8.16)

The second inequality follows directly.\(\square \)
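The two estimates in this proof reduce to a geometric series and a pointwise bound. With \(\theta = 1/(6h)\), the tail \(\sum _{t \ge 2hr} e^{-t/(3h)} e^{\theta t}\) equals \(e^{-r/3}/(1-e^{-1/(6h)})\), while on the set \(\{E_l \le 2hr\}\) the integrand is at most \(e^{2hr\theta } = e^{r/3}\). The sketch below checks the first identity numerically for integer h and r; the function names and sample values are hypothetical, chosen only for illustration.

```python
import math


def tail_sum(h: int, r: int, terms: int = 20000) -> float:
    """Truncated sum over t >= 2hr of exp(-t/(3h)) * exp(theta * t), theta = 1/(6h)."""
    theta = 1.0 / (6 * h)
    t0 = 2 * h * r
    return sum(math.exp(-t / (3 * h) + theta * t) for t in range(t0, t0 + terms))


def closed_form(h: int, r: int) -> float:
    """Value of the full geometric series: exp(-r/3) / (1 - exp(-1/(6h)))."""
    return math.exp(-r / 3) / (1 - math.exp(-1 / (6 * h)))
```

The constant \(1/(1-e^{-1/(6h)})\) depends only on h, so it can be absorbed into C as in (8.14).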

Lemma 8.6

Let \(\xi _{\nu ,l}(\omega )\) be an essential return with \({\text {dist}}(\xi _{\nu ,l}(\omega ), Crit_{\omega }) \sim _{\sqrt{e}} e^{-r}\), \(n \le \nu \le 2n\), and \(\omega \subset {\mathcal E}_{\nu ,l,\star } (\gamma ) \cap {\mathcal B}_{\nu ,l,\star }\) for some \(\gamma \ge \gamma _I\). Suppose also that every parameter \(a \in \omega \cap {\mathcal B}_{2n,l}\) satisfies \(a \in {\mathcal E}_{2n,l}(\gamma _I)\). Then for any \(\varepsilon _2 > 0\) there is a \(\Delta _2\) such that if \(\Delta \ge \Delta _2\) (recall \(\delta =e^{-\Delta }\)), we have

$$\begin{aligned} \int _{\omega } e^{\theta T_{n,l}(a)} \,\textrm{d}a \le e^{\varepsilon _2 n}|\omega |. \end{aligned}$$

Proof

Let \(\hat{\omega } \subset \omega \) be a subset of \(\omega \) such that every parameter \(a \in \hat{\omega }\) has s free returns into U after escape situations. So \(T_{n,l}(a)\) consists of s terms of the form \(E_l(a,\nu _j(a))\), \(j=1,\ldots , s\), where \(\nu _1=\nu \). Recall that \(E_l(a,\nu _j(a)) = 0\) if the return is shallow. Set \(\xi _{n,l}(\omega ) = \xi _n(\omega )\). Every parameter \(a \in \hat{\omega }\) has a nested sequence of corresponding intervals so that \(a \in \omega ^s \subset \omega ^{s-1} \subset \ldots \subset \omega ^1 \subset \hat{\omega }\), such that \(\xi _{\nu _{j+1}(a)}(\omega ^j)\) is in escape position and \(\xi _{\nu _j(a)}(\tilde{\omega }^{j})\) is an essential return, \(\omega ^j \subset \tilde{\omega }^{j}\). We have \(\tilde{\omega }^{1}=\omega \), by assumption. We also see that \(E_l(a,\nu _j(a))\) is constant on \(\omega ^j=\omega ^j(a)\) but not on \(\omega ^{j-1}\). We think of \(\omega ^1 = \omega ^1(a) \subset \hat{\omega }\) as an interval around a which has escaped at time \(\nu _2 = \nu _2(a)\) (possibly earlier). Then \(\omega ^2\) is another smaller interval around a which has escaped at time \(\nu _3\) (possibly earlier) and so on. In the construction one should think of \(\omega \) as contained in some larger interval \(\omega ^0\), \(\omega \subset \omega ^0\) where \(\xi _{\nu _1}(\omega ^0)\) is in escape position, and where \(\xi _{\nu _1}(\omega )\) is an essential return.

Since \(T_{n,l}(a) = \sum _{j=1}^{s} E_l(a,\nu _j)\), and \(E_l(a,\nu _j(a))\) is constant on \(\omega ^{j}\) but not on \(\omega ^{j-1}\), we get,

$$\begin{aligned} \int _{\omega ^{s-1}} e^{\theta T_{n,l}(a)} \,\textrm{d}a = \prod _{j=1}^{s-1} e^{\theta E_l(a,\nu _j)} \int _{\omega ^{s-1}} e^{\theta E_l(a,\nu _{s})} \,\textrm{d}a. \end{aligned}$$

Now, \(\xi _{\nu _{s}}(\omega ^{s-1})\) is in escape position and therefore each interval \(\omega ^{s}\) is contained in some \(\omega ^{s-1,r}\), where \({\text {diam}}(\xi _{\nu _{s}}(\omega ^{s-1,r})) \sim e^{-r}\). Also \(\omega ^{s-1}\) is a union of disjoint intervals \(\omega ^{s-1,r}\), i.e.

$$\begin{aligned} \omega ^{s-1} = \bigcup _{r=\Delta }^{\infty } \omega ^{s-1,r}. \end{aligned}$$

Recall that the escape time \(E_l(a,\nu _{s}(a)) = 0\) for \(a \in \omega ^{s-1,r}\) if \(r \le 2\Delta \). By Lemma 8.5 we have

$$\begin{aligned} \int _{\omega ^{s-1}} e^{\theta E_l(a,\nu _{s})} \,\textrm{d}a&\le |\omega ^{s-1}| + \sum _{r \ge 2\Delta } \int _{\omega ^{s-1,r}} e^{\theta E_l(a,\nu _{s})} \,\textrm{d}a \nonumber \\&\le |\omega ^{s-1}| + \sum _{r=2\Delta }^{\infty } \biggl ( \int _{ \{a \in \omega ^{s-1,r} : E_l(a,\nu _{s}(a)) \ge 2hr \} } e^{\theta E_l(a,\nu _{s})} \,\textrm{d}a \nonumber \\&+ \int _{ \{ a \in \omega ^{s-1,r} : E_l(a,\nu _{s}(a)) \le 2hr\} } e^{\theta E_l(a,\nu _{s})} \,\textrm{d}a \biggr ) \nonumber \\&\le |\omega ^{s-1}| + C \sum _{r=2\Delta }^{\infty } \bigg (e^{r/3} + e^{-r/3}\bigg )|\omega ^{s-1,r}|. \end{aligned}$$
(8.17)

Since \(\xi _{\nu _{s}}(\omega ^{s-1})\) is in escape position, by the Main Distortion Lemma the parameters a that enter into the set where \({\text {diam}}(\xi _{\nu _{s}}(\omega ^{s-1,r})) \sim e^{-r}\) have measure \(\sim \frac{e^{-r}}{\delta } |\omega ^{s-1}|\). Therefore,

$$\begin{aligned} \int _{\omega ^{s-1}} e^{\theta E_l(a,\nu _{s})} \,\textrm{d}a&\le |\omega ^{s-1}| + C \sum _{r=2\Delta }^{\infty } \left( e^{r/3} + e^{-r/3}\right) \frac{e^{-r}}{\delta } |\omega ^{s-1}| \nonumber \\&= |\omega ^{s-1}| \left( 1 + Ce^{-\Delta /3}\right) = |\omega ^{s-1}|(1+ \eta (\Delta )), \end{aligned}$$
(8.18)

where \(\eta (\Delta ) \rightarrow 0\) as \(\Delta \rightarrow \infty \).
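The size of the correction term in (8.18) can be checked directly: with \(\delta = e^{-\Delta }\), the sum \(\sum _{r \ge 2\Delta } (e^{r/3} + e^{-r/3}) e^{-r}/\delta \) is a pair of geometric series whose leading term is a constant multiple of \(e^{-\Delta /3}\). A minimal numerical sketch follows; the function name, the truncation length and the constant 4 in the comparison are our own choices for illustration.

```python
import math


def escape_sum(Delta: int, terms: int = 2000) -> float:
    """Sum over r >= 2*Delta of (e^{r/3} + e^{-r/3}) * e^{-r} / delta, delta = e^{-Delta}."""
    return sum(
        # the two summands are e^{Delta - 2r/3} and e^{Delta - 4r/3}
        math.exp(Delta - 2 * r / 3) + math.exp(Delta - 4 * r / 3)
        for r in range(2 * Delta, 2 * Delta + terms)
    )
```

For the sampled values of \(\Delta \) the sum stays below \(4e^{-\Delta /3}\), in line with the term \(Ce^{-\Delta /3}\) in (8.18).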

Next, we want to compute the integral over \(\omega ^{s-2}\). Again \(\xi _{\nu _{s-1}}(\omega ^{s-2})\) is in escape position and therefore \(\omega ^{s-2}\) is subdivided into disjoint intervals of the type \(\omega ^{s-2,r} \subset \omega ^{s-2}\), in the same way as \(\omega ^{s-1}\):

$$\begin{aligned} \omega ^{s-2} = \bigcup _{r=\Delta }^{\infty } \omega ^{s-2,r}. \end{aligned}$$

Since \(E_l(a,\nu _j(a))\) is constant on \(\omega ^j\), we now compute

$$\begin{aligned} \int _{\omega ^{s-2,r}} e^{\theta (E_l(a,\nu _s) + E_l(a,\nu _{s-1}))} \,\textrm{d}a&= \sum _{\omega ^{s-1} \subset \omega ^{s-2,r}} e^{\theta E_l(a,\nu _{s-1})} \int _{\omega ^{s-2,r} \cap \omega ^{s-1}} e^{\theta E_l(a,\nu _{s})} \,\textrm{d}a \nonumber \\&\le \sum _{\omega ^{s-1} \subset \omega ^{s-2,r}} e^{\theta E_l(a,\nu _{s-1})} (1 + \eta (\Delta ))|\omega ^{s-1}| \nonumber \\&= (1+ \eta (\Delta )) \int _{\omega ^{s-2,r}} e^{\theta E_l(a,\nu _{s-1})} \,\textrm{d}a. \end{aligned}$$
(8.19)

Thus,

$$\begin{aligned} \int _{\omega ^{s-2}} e^{\theta (E_l(a,\nu _s) + E_l(a,\nu _{s-1}))} \,\textrm{d}a&= \sum _{r \ge 2 \Delta } \int _{\omega ^{s-2,r}} e^{\theta (E_l(a,\nu _{s-1}) + E_l(a,\nu _s))} \,\textrm{d}a \nonumber \\&\le (1 + \eta (\Delta )) \sum _{r \ge 2 \Delta } \int _{\omega ^{s-2,r}} e^{\theta E_l(a,\nu _{s-1})} \,\textrm{d}a \nonumber \\&\le (1 + \eta (\Delta )) \int _{\omega ^{s-2}} e^{\theta E_l(a,\nu _{s-1})} \,\textrm{d}a \le (1 + \eta (\Delta ))^2 |\omega ^{s-2}|. \end{aligned}$$
(8.20)

Repeating this s times and noting that \(s \le n\) trivially and that \(\eta (\Delta ) \rightarrow 0\) as \(\Delta \rightarrow \infty \), we get

$$\begin{aligned} \int _{\omega ^0} e^{\theta T_{n,l}(a)} \,\textrm{d}a \le (1 + \eta (\Delta ))^s |\omega ^0| \le e^{\varepsilon _2 n} |\omega ^0|. \end{aligned}$$

Since this holds for every set of the type \(\hat{\omega }\) (and letting s vary) the lemma follows.\(\square \)

Finally we can prove the main result of this section.

Lemma 8.7

Let \(\tau > 0\) be such that \(\tau \theta > \varepsilon _2\) and suppose that \(\xi _{\nu ,l}(\omega )\) is a deep essential return with \({\text {dist}}(\xi _{\nu ,l}(\omega ), Crit_{\omega }) \sim _{\sqrt{e}} e^{-r}\), \(n \le \nu \le 2n\), and \(\omega \subset {\mathcal E}_{\nu ,l,\star } (\gamma ) \cap {\mathcal B}_{\nu ,l,\star }\) for some \(\gamma \ge \gamma _I\). Suppose also that every parameter \(a \in \omega \cap {\mathcal B}_{2n,l}\) satisfies \(a \in {\mathcal E}_{2n,l}(\gamma _I)\). Then

$$\begin{aligned} m(\{a \in \omega : T_n(a) \ge \tau n \} ) \le e^{n(\varepsilon _2 - \theta \tau )} |\omega |. \end{aligned}$$

Proof

We have by Lemma 8.6,

$$\begin{aligned} e^{\theta \tau n} m(\{a \in \omega : T_n(a) \ge \tau n \} ) \le \int _{ \{ T_n(a) \ge \tau n \} } e^{\theta T_n(a)} \,\textrm{d}a \le \int _{\omega } e^{\theta T_n(a)} \,\textrm{d}a \le e^{\varepsilon _2 n} |\omega |, \end{aligned}$$

from which we conclude that

$$\begin{aligned} m(\{a \in \omega : T_n(a) \ge \tau n \} ) \le e^{n(\varepsilon _2 - \theta \tau )} |\omega |. \end{aligned}$$

\(\square \)
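The proof of Lemma 8.7 is an instance of the exponential Chebyshev (Markov) inequality: for any non-negative function T and any \(\theta > 0\), the measure of \(\{T \ge c\}\) is at most \(e^{-\theta c}\) times the integral of \(e^{\theta T}\). A minimal sketch on a finite sample, where the sample values stand in for \(T_n(a)\) and are purely hypothetical:

```python
import math
from typing import Sequence, Tuple


def exponential_chebyshev(
    values: Sequence[float], theta: float, threshold: float
) -> Tuple[float, float]:
    """Return (fraction of values >= threshold, exp(-theta*threshold) * mean(exp(theta*v)))."""
    n = len(values)
    fraction = sum(1 for v in values if v >= threshold) / n
    bound = math.exp(-theta * threshold) * sum(math.exp(theta * v) for v in values) / n
    return fraction, bound
```

The first returned number never exceeds the second, whatever the sample; in the lemma the role of the mean is played by the estimate of Lemma 8.6.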

9 Conclusion and proof of the main theorem

We argue by induction over time intervals of the type [n, 2n]. By Lemma 7.4, for a sufficiently small starting interval \(\omega _0 = (-\varepsilon ,\varepsilon )\) around the starting map \(f_0\), there are numbers \(N_l\) such that \(\xi _{N_l,l}(\omega _0)\) has grown to the large scale or returned into U with \({\text {dist}}(\xi _{N_l,l}(\omega _0),Crit_{\omega _0}) / (\log ( {\text {dist}}(\xi _{N_l,l}(\omega _0),Crit_{\omega _0})))^2 \le {\text {diam}}(\xi _{N_l,l}(\omega _0)) \), (i.e. in the case of a return, it has to be essential). Suppose, without loss of generality, that the first critical point (\(l=1\)) satisfies \(N_1 = \min (N_l)\). Let \(\nu _0 \ge N_1\) be the first return into U. It follows that \(\xi _{\nu _0,1}(\omega _0)\) is an essential return.

If \(\nu _0 > 2N_1\) then it means we have no more returns in \([N_1,2N_1]\) for \(l=1\) so we go on to the next critical point. To start, put \(n=N_1\). For each critical point, we consider the returns \(\nu _j \in [n,2n]\) and delete parameters according to the basic approach rate condition. If \(\hat{\omega }_0 \subset \omega _0\) is the set that is left from \(\omega _0\) when we have deleted parameters not satisfying this condition up until time 2n, then by Lemma 8.1,

$$\begin{aligned} |\hat{\omega }_0| \ge (1-e^{-\alpha n})|\omega _0|. \end{aligned}$$

We make this construction for each critical point, and thereby get a set \(\Omega _l(2N_1)\), which corresponds to \(\hat{\omega }_0\) for each l, and which contains parameters in \(\omega _0\) that satisfy the basic assumption for the critical point \(c_l\). Up until time \(n=N_1\) we see that \(\omega _0 \subset {\mathcal E}_{N_1,l}(\gamma _B)\), by making \(\varepsilon \) sufficiently small. Actually we have a stronger statement at this early stage according to the Starting Lemma (the Lyapunov exponents are close to \(\gamma _0\)), but we do not need that. Moreover, by definition we have \(\omega _0 \subset {\mathcal B}_{N_1-1,l}\) for all l. Obviously, \(\Omega _l(2N_1) \subset {\mathcal B}_{2N_1,l}\).

If we do not do anything more than keeping the parameters satisfying the basic approach rate condition, the Lyapunov exponent may drop in the time window [n, 2n], and over time we may lose too much. Every return in this time window has a bound period \(p_j \le \nu _j (4 K\alpha /\gamma _I) = \hat{\alpha } \nu _j \le 2n\), for the returns \(\nu _j \in [n,2n]\), where we have set \(\hat{\alpha } = 4 K \alpha /\gamma _I\). Hence we can use the expansion of the early orbits up until time 2n for all such bound periods. We also note that by Lemma 6.4, the bound periods are bounded from below by \(Kr_j/(2\Gamma ) \ge K\Delta /(2\Gamma )\). Let \(L_j\) be the corresponding free periods. For every parameter which satisfies the basic approach rate condition, by Lemmas 3.1 and 6.4, using that \(a \in {\mathcal E}_{n,l}(\gamma _B)\), we have, if \(\delta =e^{-\Delta }\) is sufficiently small,

$$\begin{aligned}&|Df^{2n}(v_l(a))| \ge C_0 e^{\gamma _B n} \prod _j \bigl ( e^{p_j (\gamma /(2K))} C' e^{L_j \gamma _H} \bigr ) \nonumber \\&\quad \ge e^{\gamma _B n} e^{\sum _j p_j (\gamma /(4K)) + \gamma _H L_j} \ge C_0 e^{\gamma _B (1/2) 2n}. \end{aligned}$$
(9.1)

Hence up until time \(2n=2N_1\), we may have lost some part of the starting Lyapunov exponent (\(\gamma _B\)), but at each return it does not go below \(\gamma _B/2 > \gamma _I\), where \(\gamma _I\) is a lower bound for most lemmas in the induction process. However, precisely after a return the exponent may drop, but not more than \(4 K \alpha \) because of the basic approach rate assumption (the \(2 \alpha \) is replaced by \(4 \alpha \) to eat up constants), and in general each parameter a we are considering belongs to \({\mathcal E}_{n,l}(\gamma )\) for some \(\gamma \ge \gamma _B/2 - 4 K \alpha \ge \gamma _I\).

Therefore we may have to delete more parameters, that return too often and too deep, in order to restore the Lyapunov exponent for the remaining parameters. This is handled in the section about large deviations. The large deviation argument estimates the set of those parameters that spend too large a portion of the time in [n, 2n] reaching escape positions. Since the escape period is set to zero for shallow returns, i.e. for returns into \(U {\setminus }U^2\), the orbits \(\xi _{n,l}(a)\) outside \(U^2\) can be considered as free periods. Using Lemma 3.1 for this neighbourhood \(U^2\) also gives uniform expansion until the next return (let us use the same exponent \(\gamma _H > 0\) for those free periods). Since each bound period for a deep return into \(U^2\) is contained in an escape period, we now consider those bound periods \(\tilde{p}_j\) in [n, 2n] and the corresponding free periods \(\tilde{L}_j\) outside \(U^2\). If the parameter a is such that \(T_{n}(a) \le \tau n\) where \(0< \tau < 1\) then

$$\begin{aligned} |Df^{2n}(v_l(a))| \ge C_0 e^{\gamma _B n} e^{\sum _j \tilde{p}_j (\gamma /(4K))} e^{\sum _j \gamma _H \tilde{L}_j} \ge C_0 e^{\gamma _B n} e^{(1-\tau ) n \gamma _H }. \end{aligned}$$
(9.2)

According to the definition, \(\gamma _B = (3/4) (1-\tau ) \min (\gamma _H,\gamma _0)\), and hence the Lyapunov exponent is restored:

$$\begin{aligned} |Df^{2n}(v_l(a))| \ge C_0 e^{\gamma _B 2n}. \end{aligned}$$
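The restoration of the exponent rests on the elementary inequality \(\gamma _B + (1-\tau )\gamma _H \ge 2\gamma _B\), which follows from \(\gamma _B = (3/4)(1-\tau )\min (\gamma _H,\gamma _0) \le (1-\tau )\gamma _H\). A minimal sketch that checks this over a grid of sample values; the numerical values of \(\tau \), \(\gamma _H\) and \(\gamma _0\) are hypothetical and serve only as an illustration.

```python
def gamma_B(tau: float, gamma_H: float, gamma_0: float) -> float:
    """gamma_B = (3/4) * (1 - tau) * min(gamma_H, gamma_0), as in the definition quoted above."""
    return 0.75 * (1 - tau) * min(gamma_H, gamma_0)


def exponent_restored(tau: float, gamma_H: float, gamma_0: float) -> bool:
    """Check gamma_B * n + (1 - tau) * n * gamma_H >= 2 * n * gamma_B (per unit n)."""
    g = gamma_B(tau, gamma_H, gamma_0)
    return g + (1 - tau) * gamma_H >= 2 * g
```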

Let us now turn to the general case where we use induction. Assume that we have constructed \(\Omega _l(n)\) for every l and that the sets \(\Omega _l(n)\) are “good” in the following sense. We assume that each partition element \(\omega \subset \Omega _l(n)\) belongs to \({\mathcal E}_{n,l}(\gamma _B) \cap {\mathcal B}_{n,l}\), i.e. \(\Omega _l(n) \subset {\mathcal E}_{n,l}(\gamma _B) \cap {\mathcal B}_{n,l}\). The sets \(\Omega _l(n)\) have their own structure and should not be mixed until the very end, because the partition elements in each such set may differ, and their intersection therefore can destroy these elements.

For simplicity assume that \(\nu =n\) is a return time for l. By definition of \({\mathcal E}_{\nu ,l}(\gamma )\) and \({\mathcal B}_{\nu ,l}\), we can use the binding information for all critical points up until time \((4K \alpha /\gamma _I) \nu = \hat{\alpha } \nu \). First let us from \(\Omega _l(n)\) delete parameters so that we can use the binding information of all other critical points \(j \ne l\) for a longer time, in the next time window [2n, 4n], i.e. we want to consider \({\mathcal E}_{\nu ,l,\star }(\gamma ) \cap {\mathcal B}_{\nu ,l,\star }\). The point is now that the partition elements we are deleting, i.e. parameters belonging to \(({\mathcal E}_{\nu ,l,\star }(\gamma ) \cap {\mathcal B}_{\nu ,l,\star }) {\setminus }({\mathcal E}_{\nu ,l}(\gamma ) \cap {\mathcal B}_{\nu ,l})\), by this procedure are much larger than the partition elements in \(\Omega _l(n)\) (this was originally observed by M. Benedicks). Indeed, if we look at the length of \(\xi _{m,j}(\omega _1)\) where \(\omega _1\) is a partition element that got deleted at some time (return) \(m \le 2 \hat{\alpha } n\) then, by Lemma 4.2,

$$\begin{aligned} {\text {diam}}(\xi _{m,j}(\omega _1))\sim |\omega _1||Df^m(v_j(a))| \le |\omega _1|e^{\Gamma m}. \end{aligned}$$

By the basic assumption, and since \(\omega _1\) got deleted at time m, we have

$$\begin{aligned} {\text {diam}}(\xi _{m,j}(\omega _1)) \sim {\text {dist}}(\xi _{m,j}(\omega _1),Crit_{\omega _1}) / (\log ({\text {dist}}(\xi _{m,j}(\omega _1),Crit_{\omega _1})))^2 \ge e^{-3 \alpha m}, \end{aligned}$$

so

$$\begin{aligned} e^{-3\alpha m} \le C {\text {diam}}(\xi _{m,j}(\omega _1)) \le C |\omega _1|e^{\Gamma m}. \end{aligned}$$

On the other hand, the partition elements at time n or higher, are much smaller. This can be seen as follows. Let \(\omega _2\) be a partition element at time n. Since \({\text {diam}}(\xi _{n,j}(\omega _2)) \le S\), we have

$$\begin{aligned} S \ge {\text {diam}}(\xi _{n,j}(\omega _2)) \sim |\omega _2| |Df^n(v_j(a))| \ge |\omega _2| C_0 e^{\gamma n}. \end{aligned}$$

Therefore, since \(m \le 2 \hat{\alpha } n\),

$$\begin{aligned} \frac{|\omega _1|}{|\omega _2|} \ge C \frac{e^{-(3\alpha + \Gamma )m}}{e^{-\gamma n}} \ge C e^{ (\gamma - 2 \hat{\alpha } (3 \alpha + \Gamma ) )n} \gg 1. \end{aligned}$$

Hence \(\omega _2\) is much smaller than \(\omega _1\). This means that when deleting partition elements in \(\Omega _l(n)\) that do not satisfy the basic approach rate condition until time \(2\hat{\alpha }n\) for other critical points \(j \ne l\), in the time window \([\hat{\alpha }n, 2\hat{\alpha }n]\), we do not destroy the partition elements; we only delete whole partition elements of the type \(\omega _2 \in \Omega _l(n)\) that intersect partition elements of the type \(\omega _1\) that were deleted at the time scale \(\sim \hat{\alpha }n\).
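The comparison of \(|\omega _1|\) and \(|\omega _2|\) only requires that the exponent \(\gamma - 2\hat{\alpha }(3\alpha + \Gamma )\), with \(\hat{\alpha } = 4K\alpha /\gamma _I\), be positive, which holds once \(\alpha \) is small enough relative to the other constants. A minimal sketch evaluating this exponent; all numerical values in the usage are hypothetical and are not the constants of the paper.

```python
def size_ratio_exponent(
    gamma: float, alpha: float, K: float, Gamma: float, gamma_I: float
) -> float:
    """Exponent gamma - 2 * alpha_hat * (3*alpha + Gamma), with alpha_hat = 4*K*alpha/gamma_I."""
    alpha_hat = 4 * K * alpha / gamma_I
    return gamma - 2 * alpha_hat * (3 * alpha + Gamma)
```

With the illustrative values \(\gamma = \gamma _I = 0.1\), \(K = 2\), \(\Gamma = 3\), the exponent is positive for \(\alpha = 10^{-4}\) and negative for \(\alpha = 10^{-2}\), which shows why a smallness condition on \(\alpha \) is needed.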

Starting from the partition elements in \(\Omega _l(n) \subset {\mathcal E}_{n,l}(\gamma ) \cap {\mathcal B}_{n,l} \) and passing to \( \Omega _l(n,\star ) \subset {\mathcal E}_{\nu ,l,\star }(\gamma ) \cap {\mathcal B}_{\nu ,l,\star }\) is therefore harmless, and the remaining measure satisfies

$$\begin{aligned} |\Omega _l(n,\star )| \ge \bigg (1-C e^{-\alpha \hat{\alpha }n}\bigg ) |\Omega _l(n)|. \end{aligned}$$
(9.3)

We have now constructed \(\Omega _l(n,\star )\) and want to pass to \(\Omega _l(2n) \subset {\mathcal E}_{2n,l}(\gamma ) \cap {\mathcal B}_{2n,l}\). Passing from \(\Omega _l(n,\star )\) to \(\Omega _l(2n)\), we have to delete parameters that do not satisfy the basic approach rate condition for critical point \(c_l\) and also delete those parameters that have too many too deep returns in [n, 2n]. We have seen by Eq. (9.1), that the Lyapunov exponent can decrease to \((1/2) \gamma _B > \gamma _I\) during the period from n to 2n. We also have to take into account the blind escapes, see Remark 8.4, which constitute a small portion, \(\le Ce^{-qn}\), of the original set of parameters. For those parameters whose escape periods transcend into [2n, 4n] (these are the escape periods for the last return \(\nu _s\) discussed in Remark 8.4), the Lyapunov exponent may drop slightly below \(\gamma _B/2\), but never below \(\gamma _I\) (if \(\alpha \) is sufficiently small, see the condition on p. 5). By Lemmas 8.1 and 8.6 we get the estimate

$$\begin{aligned} |\Omega _l(2n)| \ge (1-e^{-\alpha n}) \left( 1- Ce^{-(\theta \tau - \varepsilon _2)n} \right) \left( 1-Ce^{-qn}\right) |\Omega _l(n,\star )|. \end{aligned}$$

Together with (9.3), we get, for some \(\beta > 0\),

$$\begin{aligned} |\Omega _l(2n)|&\ge (1-e^{-\alpha n}) \left( 1- Ce^{-(\theta \tau - \varepsilon _2)n}\right) (1-Ce^{-qn}) \left( 1-C e^{-\alpha \hat{\alpha }n}\right) |\Omega _l(n)| \\&\ge (1-e^{-\beta n}) |\Omega _l(n)|. \end{aligned}$$

It follows that \(\Omega _l(2n) \subset {\mathcal E}_{2n,l}(\gamma ) \cap {\mathcal B}_{2n,l}\), where \(\gamma \ge \gamma _B\) by the choice of \(\gamma _B\) (possibly, if n is just after a return time, \(\gamma \ge \gamma _B - 4 K \alpha \)). We are then back to the same situation at time 2n as we were for time n and the induction argument goes on forever.

Let \(M \ge 2\). Choosing the constants correctly, in this way we construct, for each critical point, a set \(\Omega _l(n) \subset {\mathcal E}_{n,l}(\gamma _B- 4 K \alpha ) \cap {\mathcal B}_{n,l}\) with measure at least \((1-1/(2Md)) |\omega _0|\) for every \(n > 0\), where d is the degree of f. Passing to the limit, as \(n \rightarrow \infty \), we get that the measure of parameters that satisfy the CE-condition for all \(n > 0\) is estimated by

$$\begin{aligned} \lim \limits _{n \rightarrow \infty } m \left( \bigcap _l \Omega _l(n) \right) \ge \left( 1-\frac{1}{M}\right) |\omega _0|. \end{aligned}$$

Since M can be chosen arbitrarily large, it follows that \(f_0\) is a Lebesgue density point of CE-maps.
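The induction loses a factor \((1-e^{-\beta n})\) at each doubling step \(n \mapsto 2n\), so the surviving proportion after starting at time N is the product \(\prod _{k \ge 0} (1 - e^{-\beta 2^k N})\), which is bounded below by \(1 - \sum _{k \ge 0} e^{-\beta 2^k N}\) and tends to 1 as N grows. A minimal sketch of this product; the values of \(\beta \), N, M and d used below are hypothetical and chosen only to illustrate the bound \(1 - 1/(2Md)\).

```python
import math


def surviving_fraction(beta: float, N: int, steps: int) -> float:
    """Product of (1 - exp(-beta * n)) over n = N, 2N, 4N, ... for the given number of steps."""
    fraction = 1.0
    n = N
    for _ in range(steps):
        fraction *= 1.0 - math.exp(-beta * n)
        n *= 2  # pass from the window [n, 2n] to [2n, 4n]
    return fraction
```

Since the factors converge to 1 super-exponentially fast, a few dozen steps already give the limiting value to machine precision, and increasing N (that is, shrinking \(\varepsilon \)) pushes the product as close to 1 as desired.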