1 Introduction

Intermittency refers to the behaviour of a dynamical system that alternates between long periods during which it exhibits one of several distinct types of dynamics. In their seminal paper [36], Manneville and Pomeau investigated intermittency in the context of transitions to turbulence in convective fluids, see also [12, 29], and distinguished several different types of intermittency. An illustrative example of a one-dimensional map with intermittent behaviour is the Manneville–Pomeau map

$$\begin{aligned} T: [0,1]\rightarrow [0,1], \, x \mapsto x+x^{1+\alpha } \pmod 1 \end{aligned}$$

for some \(\alpha >0\). The source of intermittency for this map is the presence of a neutral fixed point at the origin, which causes orbits to spend long periods of time close to zero, while behaving chaotically once they escape.
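This behaviour is easy to observe numerically. The following minimal Python sketch (not part of the original analysis; the initial point and the threshold 0.05 delimiting the laminar region are arbitrary illustrative choices) iterates the Manneville–Pomeau map and records the lengths of the laminar phases, i.e. the maximal runs that the orbit spends near the neutral fixed point at 0.

```python
def manneville_pomeau(x, alpha=1.5):
    # T(x) = x + x**(1 + alpha) mod 1; the origin is a neutral fixed point
    return (x + x ** (1 + alpha)) % 1.0

x = 0.37                      # arbitrary initial point
laminar_lengths, run = [], 0
for _ in range(100_000):
    x = manneville_pomeau(x)
    if x < 0.05:              # 'laminar' region near the neutral fixed point
        run += 1
    elif run > 0:             # a laminar phase just ended: record its length
        laminar_lengths.append(run)
        run = 0

print(max(laminar_lengths))   # the longest laminar phase observed
```

The larger \(\alpha \) is, the weaker the repulsion from the origin and the longer the laminar phases become.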

The dynamics of the Manneville–Pomeau map and similar maps with a single neutral fixed point have been extensively studied over the past decades. It is known for example that such maps admit an absolutely continuous invariant measure (acim) and that their statistical properties are determined by the characteristics of the fixed point. See [10, 11, 18, 20, 21, 28, 34, 39] for results on Manneville–Pomeau type maps, and [16, 23, 35, 37, 40] for other related results on one-dimensional systems with neutral fixed points.

Intermittency caused by neutral fixed points has also been studied in random dynamical systems, see e.g. [6,7,8,9, 24, 25]. These results show that a random dynamical system, built as a mixture of ‘good’ maps with finite acims and ‘bad’ maps with slower mixing rates or without finite acims, inherits ergodic properties typical of the ‘good’ maps: e.g., the random system still admits a unique finite acim. On the other hand, it is clear that in a random mixture of good and bad maps the presence of the bad maps should be visible in the properties of the acim. In [25] it was shown that in the random system built from the ‘good’ Gauss and ‘bad’ Rényi continued fraction maps, the density of the acim is provably less smooth than the invariant density of the Gauss map. This loss of smoothness is an interesting new phenomenon, which deserves further study.

The topic of the present paper is another type of intermittency observed in random dynamical systems: the so-called critical intermittency introduced recently in [2, 22]. To illustrate the concept, consider the Markov process generated by random applications of one of the two logistic maps \(T_2(x) = 2x(1-x)\) and \(T_4 (x) = 4x(1-x)\): for each n, independently

$$\begin{aligned} x_{n+1}={\left\{ \begin{array}{ll} T_2(x_n),&{}\ \text { with prob. }p_2,\\ T_4(x_n),&{}\ \text { with prob. }p_4=1-p_2. \end{array}\right. }\end{aligned}$$

The individual dynamics of these two maps are quite different: \(T_4\) exhibits chaotic behaviour and admits an ergodic absolutely continuous invariant probability measure, while \(T_2\) has \(\frac{1}{2}\) as a superattracting fixed point with (0, 1) as its basin of attraction. Under random compositions of \(T_2\) and \(T_4\) the typical behaviour is the following: orbits are quickly attracted to \(\frac{1}{2}\) by applications of \(T_2\) and are then repelled, first to a neighbourhood of 1 by one application of \(T_4\) and then to a neighbourhood of 0 by an application of either \(T_2\) or \(T_4\). Since 0 is a repelling fixed point for both maps, orbits then leave a neighbourhood of 0 after a number of time steps, see Fig. 1. This pattern occurs infinitely often in typical random orbits and is the result of the interplay between the exponential divergence from 0 under \(T_2\) and \(T_4\) and the superexponential convergence to \(\frac{1}{2}\) under \(T_2\). Figure 2 shows in (a) an orbit of a point under a Manneville–Pomeau map, in (b) a random orbit under compositions of the Gauss and Rényi maps, and in (c) an orbit under random compositions of \(T_2\) and \(T_4\).
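The following short Python sketch (an illustration of ours; the probability, thresholds and initial point are arbitrary choices) simulates this Markov process and prints the regime of each iterate, making the described cycle of superexponential attraction to \(\frac{1}{2}\), ejection towards 1 and 0, and exponential escape from 0 visible.

```python
import random

T2 = lambda x: 2 * x * (1 - x)
T4 = lambda x: 4 * x * (1 - x)

random.seed(1)
p2 = 0.6                                  # probability of applying T2
x = 0.3
for n in range(80):
    x = T2(x) if random.random() < p2 else T4(x)
    if abs(x - 0.5) < 1e-6:
        print(n, "near 1/2")
    elif x < 1e-4 or x > 1 - 1e-4:
        print(n, "near 0 or 1")
```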

Fig. 1: Critical intermittency in the random system of logistic maps \(T_2\), \(T_4\). The dashed line indicates part of a random orbit of x.

The dynamical behaviour of random compositions of the two logistic maps \(T_2\) and \(T_4\) was studied in [2, 4, 5, 15, 22] among others. In [2, 22] the authors investigated the existence and finiteness of absolutely continuous invariant measures for this random system and for iterated function systems consisting of rational maps on the Riemann sphere. One particular result from [2] states that the random dynamical system generated by i.i.d. compositions of \(T_2\) and \(T_4\) chosen with probabilities \(p_2\) and \(p_4 = 1-p_2\) admits an absolutely continuous invariant measure that is \(\sigma \)-finite on the interval [0, 1] and that is infinite in case \(p_2>\frac{1}{2}\). An interesting question that was left open in [2] is whether for \(p_2 \le \frac{1}{2}\) this measure is infinite or finite.

In this article we answer this question. We consider a large family of random interval maps with critical intermittency that includes the random combination of \(T_2\) and \(T_4\). The systems we consider consist of i.i.d. compositions of a finite number of maps of two types: bad maps, which share a superattracting fixed point, and good maps, which map this superattracting fixed point onto a common repelling fixed point. To be precise, the families of maps we consider are defined as follows.

Throughout the text we fix a point \(c \in (0,1)\) that will represent the single critical point of our maps, both good and bad.

A map \(T_g:[0,1] \rightarrow [0,1]\) is in the class of good maps, denoted by \(\mathfrak {G}\), if

(G1) \(T_g|_{(0,c)}\) and \(T_g|_{(c,1)}\) are \(C^3\) diffeomorphisms onto (0, 1) and \(T_g(\{0,c, 1\}) \subseteq \{0,1\}\);

(G2) \(T_g\) has non-positive Schwarzian derivative on [0, c) and (c, 1];

(G3) to \(T_g\) we can associate three constants \(r_g \ge 1\), \(0< K_g<1\) and \(M_g > r_g\) such that for all \(x \in [0,1]\),

$$\begin{aligned} K_g |x-c|^{r_g-1} \le |DT_g(x)| \le M_g |x-c|^{r_g-1}; \end{aligned}$$
(1.1)

(G4) we have \(|DT_g(0)|,|DT_g(1)| > 1\).

These conditions imply in particular that at least one of the maps \(T_g|_{[0,c]}\) or \(T_g|_{[c,1]}\) is continuous, and that both branches of \(T_g\) are strictly monotone. Note also that the bounds \(K_g <1\) and \(M_g > r_g\) are not actual restrictions, since we can always choose a smaller constant K and a larger constant M to satisfy (1.1); we need these specific bounds in our estimates later. The critical point c is mapped to either 0 or 1 by each of the good maps, and by (G1) both 0 and 1 are (eventually) fixed points or periodic points with period 2, which are repelling by (G4). Examples include the doubling map and any surjective unimodal map, see Fig. 3a and b.

Fig. 2: Intermittent behaviour of orbits of (a) a single Manneville–Pomeau map with \(\alpha =1.5\), (b) a random mixture of the Gauss and Rényi continued fraction maps where the Gauss map is chosen with probability \(p=0.1\) and (c) a random mixture of the logistic maps \(T_2\) and \(T_4\) where the map \(T_4\) is chosen with probability \(p=0.6\).

Fig. 3: Four maps with critical point \(c = \frac{1}{2}\). Panels (a) and (b) show two good maps, while (c) and (d) show the graphs of two bad maps.

The choice of conditions (G1)–(G4) is based on two factors: firstly, these conditions capture the most important properties of the ‘good’ logistic map \(T_4(x)=4x(1-x)\), which is the primary motivating example for this work, and secondly, the techniques used in this paper are motivated by the work of Nowicki and Van Strien [32], where the following result was proved. Throughout the text we let \(\lambda \) denote the one-dimensional Lebesgue measure.

Theorem 1.1

Suppose that \(T:[0,1] \rightarrow [0,1]\) is unimodal, \(C^3\), has negative Schwarzian derivative and that the critical point of T is of order \(r \ge 1\). Moreover, assume that the growth rate of \(\left| D T^{n}\left( c_{1}\right) \right| \), where \(c_1=T(c)\), is so fast that

$$\begin{aligned} \sum _{n=0}^{\infty }\left| D T^{n}\left( c_{1}\right) \right| ^{-1 / r}<\infty . \end{aligned}$$
(1.2)

Then T has a unique absolutely continuous invariant probability measure \(\mu \) which is ergodic and of positive entropy. Furthermore, there exists a positive constant K such that

$$\begin{aligned} \mu (A) \le K\lambda (A)^{1 / r}, \end{aligned}$$
(1.3)

for any measurable set \(A \subset (0,1)\). Finally, the density \(\rho =\frac{d\mu }{d\lambda }\) of the measure \(\mu \) with respect to \(\lambda \) is an \(L^{\tau -}\)-function, where \(\tau =r /(r-1)\), \(L^{\tau -}=\bigcap _{1 \le t<\tau } L^{t}\) and \(L^{t}=\big \{\rho \in L^{1}: \int _0^1|\rho |^{t} d \lambda <\infty \big \}\).

Formally this result is not immediately applicable to the good maps we introduced. The difference, however, is not essential and the conclusion remains exactly the same; the main reason is that the conditions (G1) and (G4) imply the growth rate (1.2), and hence any good map admits a unique probability acim.

A map \(T_b:[0,1]\rightarrow [0,1]\) is in the class of bad maps, denoted by \(\mathfrak {B}\), if

(B1) \(T_b|_{(0,c)}\) and \(T_b|_{(c,1)}\) are \(C^3\) diffeomorphisms onto (0, c) or (c, 1), \(T_b(\{0,1\}) \subseteq \{0,1\}\) and \(T_b(c) = c\);

(B2) \(T_b\) has non-positive Schwarzian derivative on [0, c) and (c, 1];

(B3) to \(T_b\) we can associate three constants \(\ell _b > 1\), \(0< K_b< 1\) and \(M_b > \ell _b\) such that for all \(x \in [0,1]\),

$$\begin{aligned} K_b |x-c|^{\ell _b-1} \le |DT_b(x)| \le M_b |x-c|^{\ell _b-1}; \end{aligned}$$
(1.4)

(B4) we have \(|DT_b(0)|,|DT_b(1)| > 1\).

In particular (B1) implies that \(T_b\) is continuous and that \(T_b\) is strictly monotone on the intervals [0, c] and [c, 1]. In contrast to (G3), note that in (B3) we have assumed that \(\ell _b\) is strictly bigger than one. This means that \(DT_b(c) = 0\), so c is a superattracting fixed point for each bad map. An immediate consequence of the presence of a globally attracting fixed point at c is that the only finite invariant measures of \(T_b\) are linear combinations of Dirac measures at 0, c, and 1. For examples, see Fig. 3c and d.
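As a sanity check of these conditions on the motivating examples, note that \(DT_4(x) = 4-8x = -8(x-\frac{1}{2})\) and \(DT_2(x) = 2-4x = -4(x-\frac{1}{2})\). Hence, with \(c = \frac{1}{2}\), the map \(T_4\) satisfies (G3) with \(r_g = 2\) and \(T_2\) satisfies (B3) with \(\ell _b = 2\); in these two cases the bounds (1.1) and (1.4) even hold with equality for suitable constants. A short symbolic verification (our own illustration, using SymPy):

```python
import sympy as sp

x = sp.symbols('x')
c = sp.Rational(1, 2)
for name, T in [("T4 (good, r_g = 2)", 4*x*(1 - x)),
                ("T2 (bad,  l_b = 2)", 2*x*(1 - x))]:
    D = sp.diff(T, x)
    const = sp.simplify(D / (x - c))  # DT(x) = const*(x - c), so |DT(x)| = |const|*|x - c|
    print(name, "|DT(x)| =", sp.Abs(const), "* |x - c|")
```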

The random systems we consider in this article are the following. Let \(T_1,\ldots ,T_N \in {\mathfrak {G}} \cup {\mathfrak {B}}\) be a finite collection of good and bad maps. Write \(\Sigma _G = \{1 \le j \le N\, :\, T_j \in {\mathfrak {G}}\}\) and \(\Sigma _B = \{1 \le j \le N\, :\, T_j \in {\mathfrak {B}}\}\) for the index sets of the good and bad maps respectively and assume that \(\Sigma _G,\Sigma _B \ne \emptyset \). Write \(\Sigma = \{ 1, \ldots , N \} = \Sigma _G \cup \Sigma _B\). The skew product transformation or random map F is defined by

$$\begin{aligned} F:\Sigma ^{{\mathbb {N}}} \times [0,1] \rightarrow \Sigma ^{{\mathbb {N}}} \times [0,1], \, (\omega ,x) \mapsto (\sigma \omega , T_{\omega _1}(x)), \end{aligned}$$
(1.5)

where \(\sigma \) denotes the left shift on sequences in \(\Sigma ^{{\mathbb {N}}}\). Let \({\mathbf {p}} = (p_j)_{j \in \Sigma }\) be a probability vector representing the probabilities with which we choose the maps \(T_j\), \(j \in \Sigma \). We will consider measures of the form \({\mathbb {P}} \times \mu _{{\mathbf {p}}}\), where \({\mathbb {P}}\) is the \({\mathbf {p}}\)-Bernoulli measure on \(\Sigma ^{\mathbb {N}}\) and \(\mu _{{\mathbf {p}}}\) is a Borel measure on [0, 1] absolutely continuous with respect to \(\lambda \) and satisfying

$$\begin{aligned} \sum _{j \in \Sigma } p_j \mu _{{\mathbf {p}}}(T_j^{-1}A) = \mu _{\mathbf{p}}(A), \qquad \hbox { for all Borel sets}\ A \subseteq [0,1]. \end{aligned}$$
(1.6)

In this case \({\mathbb {P}} \times \mu _{{\mathbf {p}}}\) is an invariant measure for F and we say that \(\mu _{{\mathbf {p}}}\) is a stationary measure for F. We also say that a stationary measure \(\mu _{{\mathbf {p}}}\) is ergodic for F if \({\mathbb {P}} \times \mu _{{\mathbf {p}}}\) is ergodic for F. Our main results are the following.

Theorem 1.2

Let \(\{T_j: j \in \Sigma \}\) be as above and \({\mathbf {p}} = (p_j)_{j \in \Sigma }\) a positive probability vector.

(1) There exists a unique (up to scalar multiplication) stationary \(\sigma \)-finite measure \(\mu _{{\mathbf {p}}}\) for F that is absolutely continuous with respect to the one-dimensional Lebesgue measure \(\lambda \). Moreover, this measure is ergodic.

(2) The density \(\frac{d\mu _{{\mathbf {p}}}}{d\lambda }\) is bounded away from zero, is locally Lipschitz on (0, c) and (c, 1) and is not in \(L^q\) for any \(q > 1\).

We call the measure \(\mu _{{\mathbf {p}}}\) from Theorem 1.2 an acs measure (short for absolutely continuous stationary measure).

Theorem 1.3

Let \(\{T_j: j \in \Sigma \}\) be as above and \({\mathbf {p}} = (p_j)_{j \in \Sigma }\) a positive probability vector. Let \(\mu _{{\mathbf {p}}}\) be the unique acs measure from Theorem 1.2. Set \(\theta = \sum _{b \in \Sigma _B} p_b \ell _b\). Then \(\mu _{{\mathbf {p}}}\) is finite if and only if \(\theta <1\). In this case, there exists a constant \(C > 0\) such that

$$\begin{aligned} \mu _{{\mathbf {p}}}(A) \le C \cdot \sum _{k=0}^{\infty } \theta ^k \lambda (A)^{\ell _{\max }^{-k} r_{\max }^{-1}} \end{aligned}$$
(1.7)

for any Borel set \(A \subseteq [0,1]\), where \(r_{\max } = \max \{r_g: g \in \Sigma _G\}\) and \(\ell _{\max } = \max \{\ell _b: b \in \Sigma _B\}\).

As we shall see in (4.12) the bound in (1.7) can be improved by not bounding mixtures \(\ell _{\mathbf {b}} r_g = \prod _{i=1}^k \ell _{b_i} r_g\) by their maximal value \(\ell _{\max }^k r_{\max }\), but this improvement does not change the qualitative behaviour of the bound.

It will become clear that the density \(\frac{d\mu _{\mathbf{p}}}{d\lambda }\) in Theorem 1.2 blows up to infinity at the points zero and one and also (at least on one side) at c. Theorem 1.3 says that \(\frac{d\mu _{{\mathbf {p}}}}{d\lambda }\) is integrable if and only if \(\theta \) is small enough, namely \(\theta < 1\). This intuitively makes sense since for a smaller value of \(\theta \) the attraction of orbits to c is weaker on average and consequently orbits typically spend less time near zero and one once a good map is applied.
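For the motivating pair \(\{T_2,T_4\}\) the quantity \(\theta \) is easily computed: \(\Sigma _B = \{2\}\) and \(\ell _2 = 2\), so \(\theta = 2p_2\). Theorem 1.3 thus shows that the acs measure is finite precisely when \(p_2 < \frac{1}{2}\), which answers the question left open in [2]. A trivial computational illustration (ours):

```python
def theta(p, ell):
    # theta = sum over the bad maps of p_b * ell_b, cf. Theorem 1.3
    return sum(p[b] * ell[b] for b in ell)

ell = {2: 2}                             # the single bad map T2 has ell_2 = 2
for p2 in (0.3, 0.5, 0.7):
    finite = theta({2: p2}, ell) < 1
    print(p2, "finite acs measure" if finite else "infinite acs measure")
```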

The inequality (1.7) is the counterpart of the Nowicki-Van Strien inequality (1.3), and naturally gives a substantially worse bound due to the presence of bad maps. It is not immediately clear how much worse (1.7) is in comparison to (1.3). However, the following holds.

Corollary 1.1

Let \( \{T_j: j \in \Sigma \}\) be as above and \({\mathbf {p}} = (p_j)_{j \in \Sigma }\) a positive probability vector. Suppose \(\theta = \sum _{b \in \Sigma _B} p_b \ell _b < 1\). Then there exist \(K > 0\) and \(\varkappa > 0\) such that for any Borel set \(A \subseteq [0,1]\) with \(\lambda (A) \in (0,1)\) one has

$$\begin{aligned} \mu _{{\mathbf {p}}}(A)\le K \frac{1}{ \log ^\varkappa (1/\lambda (A))}. \end{aligned}$$

Moreover, the acs measure \(\mu _{{\mathbf {p}}}\) from Theorem 1.2 depends continuously on the probability vector \({\mathbf {p}} \in {\mathbb {R}}^N\), in the sense of the following corollary.

Corollary 1.2

Let \( \{T_j: j \in \Sigma \}\) be as above. For each \(n \ge 0\), let \({\mathbf {p}}_n = (p_{n,j})_{j \in \Sigma }\) be a positive probability vector such that \(\sup _{n}\sum _{b \in \Sigma _B} p_{n, b} \ell _b<1\) and assume that \(\lim _{n\rightarrow \infty }{\mathbf {p}}_n=\mathbf{p}\) in \({\mathbb {R}}_+^N\). Then the sequence \(\mu _{{\mathbf {p}}_n}\) converges weakly to \(\mu _{{\mathbf {p}}}\).

The problem of finding acs probability measures for random interval maps that are expanding on average is well studied, and results often rely on bounded variation techniques going back to Lasota and Yorke [27]. There are several articles that extend these techniques to random interval maps that are not expanding on average; see in particular [33, Section 4] and [14, 30, 31]. These methods require the infimum of the absolute value of the derivative of the random map to be positive, so they do not apply to our class of maps with a critical point. For this reason it is, for example, not clear whether one can conjugate the random map in the current paper to a random map that is expanding on average and is composed of Lasota–Yorke type maps, as is the case in [33, Section 4]. Thus, we resort to techniques similar to those used by Nowicki and Van Strien in their proof of Theorem 1.1.

To be more precise, for the existence result of Theorem 1.2 we use an inducing scheme. This approach is inspired by [2], but the choice of the inducing domain requires some care. With the help of Kac’s Lemma we then obtain that the acs measure is infinite in case \(\theta \ge 1\). To prove that this measure is finite for \(\theta <1\) we use an approach similar to the one employed in [32]. The main difficulty here is that it may take an arbitrarily long time before the superattracting fixed point is mapped onto the repelling orbit by one of the good maps, which decreases the regularity of the density of the acs measure.

In (B3) we have assumed that for any bad map \(T_b\) the corresponding value \(\ell _b\) is not equal to one. Note that a bad map \(T_b\) for which we allow \(\ell _b = 1\) satisfies \(|DT_b(c)| > 0\), so in this case c is an attracting fixed point for \(T_b\) but not superattracting. It should not come as a surprise that results similar to Theorems 1.2 and 1.3 also hold in case some or all of the bad maps \(T_b\) have \(\ell _b=1\). The proofs presented for these theorems, however, do not immediately carry over. In the last section we explain how the results are affected in case some or all maps \(T_b\) satisfy \(\ell _b=1\) and what the necessary changes in the proofs are.

The paper is organised as follows. In Sect. 2 we list some preliminaries and first consequences of the conditions (G1)–(G4) and (B1)–(B4). Section 3 is devoted to the proof of Theorem 1.2 and in Sect. 4 we prove Theorem 1.3. In Sect. 5 we prove Corollaries 1.1 and 1.2, explain what the analogues of Theorems 1.2 and 1.3 are in case \(\ell _b=1\) for one or more \(b \in \Sigma _B\), and indicate how the proofs of Theorems 1.2 and 1.3 need to be modified to obtain these results. We end with some final remarks.

2 Preliminaries

We start by introducing some notation and collecting some general preliminaries.

2.1 Words, Sequences and Invariant Measures

For any finite subset \(\Sigma \subseteq {\mathbb {N}}\) and any \(n \ge 1\) we use \({\mathbf {u}} \in \Sigma ^n\) to denote a word \({\mathbf {u}} = u_1 \cdots u_n\). The set \(\Sigma ^0\) contains only the empty word, which we denote by \(\epsilon \). On the space of infinite sequences \(\Omega = \Sigma ^{\mathbb {N}}\) we use

$$\begin{aligned} {[}{\mathbf {u}}] = [u_1 \cdots u_n] = \{\omega \in \Omega : \omega _1 = u_1, \ldots , \omega _n = u_n\} \end{aligned}$$

to denote the cylinder set corresponding to \({\mathbf {u}}\). The notation \(|{\mathbf {u}}|\) indicates the length of \({\mathbf {u}}\), so \(|{\mathbf {u}}|=n\) for \({\mathbf {u}} \in \Sigma ^n\). For two words \({\mathbf {u}} \in \Sigma ^n\) and \({\mathbf {v}} \in \Sigma ^m\) the concatenation of \({\mathbf {u}}\) and \({\mathbf {v}}\) is denoted by \(\mathbf {uv} \in \Sigma ^{n+m}\). For a probability vector \({\mathbf {p}} = (p_j)_{j \in \Sigma }\) and \({\mathbf {u}} \in \Sigma ^n\) we write \(p_{{\mathbf {u}}} = \prod _{i=1}^n p_{u_i}\), with \(p_{{\mathbf {u}}}=1\) if \(n=0\). We use \(\sigma \) to denote the left shift on \(\Omega \): for \(\omega \in \Omega \) and all \(n\in {\mathbb {N}}\), \((\sigma \omega )_n = \omega _{n+1}\).

Given a finite family of Borel measurable maps \(\{ T_j: [0,1] \rightarrow [0,1]\}_{j \in \Sigma }\), the skew product or the random map F is defined by

$$\begin{aligned} F: \Omega \times [0,1] \rightarrow \Omega \times [0,1], \, (\omega ,x) \mapsto \big ( \sigma \omega , T_{\omega _1}(x)\big ).\end{aligned}$$

We use the following notation for the iterates of the maps \(T_j\). For each \(\omega \in \Omega \) and each \(n \in \mathbb {N}_0\) define

$$\begin{aligned} T_{\omega _1 \cdots \omega _n }(x) = T_\omega ^n(x) = {\left\{ \begin{array}{ll} x, &{} \text {if}\quad n=0,\\ T_{\omega _n}\circ T_{\omega _{n-1}}\circ \cdots \circ T_{\omega _1}(x), &{} \text {for } n\ge 1. \end{array}\right. } \end{aligned}$$
(2.1)

With this notation, we can write the iterates of the random system F as

$$\begin{aligned} F^n(\omega ,x) = (\sigma ^n \omega , T_{\omega }^n(x)). \end{aligned}$$
(2.2)
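In code form, the composition convention of (2.1), in which the first symbol of \(\omega \) acts first, reads as follows (a small sketch of ours; the indexing of the maps is an arbitrary choice):

```python
def T_word(maps, omega, x, n):
    # T_omega^n from (2.1): apply T_{omega_1} first, then T_{omega_2}, and so on
    for k in range(n):
        x = maps[omega[k]](x)
    return x

maps = {1: lambda x: 2 * x * (1 - x), 2: lambda x: 4 * x * (1 - x)}
print(T_word(maps, [1, 2, 1], 0.3, 3))   # T_1, then T_2, then T_1
```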

The following lemma on invariant measures for F holds.

Lemma 2.1

([30], see also Lemma 3.2 of [19]) If all maps \(T_j\) are non-singular with respect to \(\lambda \) (that is, \(\lambda (A) =0\) if and only if \(\lambda (T_j^{-1}A) = 0\) for all \(A \subseteq [0,1]\) measurable) and \({\mathbb {P}}\) is the \(\mathbf{p}\)-Bernoulli measure on \(\Omega \) for some positive probability vector \({\mathbf {p}}\), then the \(\mathbb {P} \times \lambda \)-absolutely continuous F-invariant measures are precisely the measures of the form \(\mathbb {P} \times \mu \) where \(\mu \) is absolutely continuous w.r.t. \(\lambda \) and satisfies

$$\begin{aligned} \sum _{j \in \Sigma } p_j \mu (T_j^{-1}A) = \mu (A) \qquad \text {for all Borel sets }A. \end{aligned}$$
(2.3)

Now let \((X, {\mathcal {F}}, m)\) be a measure space and \(T: X \rightarrow X\) measurable and non-singular with respect to m. For a set \(Y \in {\mathcal {F}}\) such that \(0< m(Y) < \infty \) and \(m \big ( X \setminus \bigcup _{n \ge 1} T^{-n} Y \big )=0\), the first return time map \(\varphi _Y: Y \rightarrow {\mathbb {N}} \cup \{ \infty \}\) given by

$$\begin{aligned} \varphi _Y(y) = \inf \{ n \ge 1 \, : \, T^n(y) \in Y \} \end{aligned}$$
(2.4)

is finite m-a.e. on Y, and moreover m-a.e. \(y\in Y\) returns to Y infinitely often. If we remove from Y the m-null set of points that return to Y only finitely many times, and for convenience call this set Y again, then we can define the induced transformation \(T_Y:Y \rightarrow Y\) by

$$\begin{aligned} T_Y (y) = T^{\varphi _Y(y)}(y).\end{aligned}$$

The following result can be found in e.g. [1, Proposition 1.5.7]. Note that this statement asks for T to be conservative. This assumption is, however, not used in the proof, and the condition \(m \big ( X \setminus \bigcup _{n \ge 1} T^{-n} Y \big )=0\) is enough to guarantee that the induced transformation is well defined.

Lemma 2.2

(see e.g. Proposition 1.5.7. in [1]) Let T be a measurable and non-singular transformation on a measure space \((X, {\mathcal {F}}, m)\) and let \(Y \in {\mathcal {F}}\) be such that \(0< m(Y) < \infty \) and \(m \big (X \setminus \bigcup _{n \ge 1} T^{-n}Y\big )=0\). If \(\nu \ll m|_Y\) is a finite invariant measure for the induced transformation \(T_Y\), then the measure \(\mu \) on \((X, {\mathcal {F}}, m)\) defined by

$$\begin{aligned} \mu (B) = \sum _{k \ge 0} \nu \Big (Y \cap T^{-k}B \setminus \bigcup _{j=1}^k T^{-j}Y \Big )\end{aligned}$$

for \(B \in {\mathcal {F}}\) is T-invariant, absolutely continuous with respect to m and \(\mu |_Y = \nu \).

We will also use the following result on the first return time.

Lemma 2.3

(Kac’s Formula, see e.g. 1.5.5. in [1]) Let T be a conservative, ergodic, measure preserving transformation on a measure space \((X, {\mathcal {F}}, m)\). Let \(Y \in {\mathcal {F}}\) be such that \(0< m(Y) < \infty \) and let \(\varphi _Y\) be the first return map to Y. Then \(\int _Y \varphi _Y \, d m = m(X)\).

One can also obtain invariant measures via a functional analytic approach. Here we give a specific result for interval maps. Let I be an interval. If \(T: I \rightarrow I\) is piecewise strictly monotone and \(C^1\), then the Perron–Frobenius operator \({\mathcal {P}}_T\) is defined on the space of non-negative measurable functions h on I by

$$\begin{aligned} {\mathcal {P}}_T h (x) = \sum _{y \in T^{-1}\{x\}} \frac{h(y)}{|DT(y)|}. \end{aligned}$$
(2.5)

A non-negative measurable function \(\varphi \) on I is a fixed point of \({\mathcal {P}}_T\) if and only if the measure \(\mu \) defined by \(\mu (A) = \int _A \varphi \, d\lambda \) for each Borel set A is an invariant measure for T that is absolutely continuous with respect to \(\lambda \).

For a random map F using a finite family of transformations \(\{ T_j: I \rightarrow I\}_{j \in \Sigma }\), such that each map \(T_j\) is piecewise strictly monotone and \(C^1\), and a positive probability vector \({\mathbf {p}} = (p_j)_{j \in \Sigma }\), the Perron–Frobenius operator \(\mathcal {P}_F\) is given on the space of non-negative measurable functions h on I by

$$\begin{aligned} \mathcal {P}_Fh(x) = \sum _{j \in \Sigma } p_j {\mathcal {P}}_{T_j} h (x), \end{aligned}$$
(2.6)

where each \(\mathcal {P}_{T_j}\) is as given in (2.5). Let \({\mathbb {P}}\) denote the \({\mathbf {p}}\)-Bernoulli measure on \(\Omega \). Then a non-negative measurable function \(\varphi \) on I is a fixed point of \({\mathcal {P}}_F \) if and only if the measure \({\mathbb {P}} \times \mu \), where \(\mu \) is the absolutely continuous measure with \(\frac{d\mu }{d\lambda } = \varphi \), is F-invariant.
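A fixed point of \(\mathcal {P}_F\) can be approximated numerically, for instance with Ulam's method. The sketch below (our own illustration, not part of the proofs; bin and sample counts are arbitrary) discretizes (2.6) for the logistic pair \(T_2, T_4\) and runs a power iteration; since \(p_2 < \frac{1}{2}\) here, a normalizable fixed density exists by Theorem 1.3.

```python
import numpy as np

def ulam_matrix(maps, probs, n_bins, n_samples=100):
    # Ulam discretization of (2.6): column i records where n_samples test
    # points of bin i land under each map T_j, weighted with probability p_j.
    P = np.zeros((n_bins, n_bins))
    for i in range(n_bins):
        xs = (i + np.arange(n_samples) / n_samples) / n_bins
        for T, p in zip(maps, probs):
            js = np.minimum((T(xs) * n_bins).astype(int), n_bins - 1)
            np.add.at(P[:, i], js, p / n_samples)
    return P

T2 = lambda x: 2 * x * (1 - x)
T4 = lambda x: 4 * x * (1 - x)
p2, n_bins = 0.3, 200
P = ulam_matrix([T2, T4], [p2, 1 - p2], n_bins)

h = np.ones(n_bins)                  # power iteration towards a fixed density
for _ in range(1000):
    h = P @ h
    h *= n_bins / h.sum()            # keep h normalized as a probability density
print(h[0], h[n_bins // 2], h[-1])   # large values near 0, c = 1/2 and 1
```

For \(p_2 \ge \frac{1}{2}\) no acs probability measure exists, and one expects the discretized approximation to degenerate as the grid is refined.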

In Sect. 3.3 it will be shown that the density \(\frac{d\mu _{{\mathbf {p}}}}{d\lambda }\) from Theorem 1.2, which is a fixed point of the Perron–Frobenius operator for the random system F given by (1.5), is bounded away from zero. From this it is easy to see that (2.6) implies that \(\frac{d\mu _{{\mathbf {p}}}}{d\lambda }\) blows up to infinity at the points zero and one and also at least on one side of c.

2.2 Estimates on Good and Bad Maps

Now let \(T:I \rightarrow I\) be a \(C^3\) map of an interval I into itself. The Schwarzian derivative of T at \(x \in I\) with \(DT(x) \ne 0\) is defined by

$$\begin{aligned} {\mathbf {S}}{\text {T}}(x) = \frac{D^3T(x)}{DT(x)} - \frac{3}{2} \Big (\frac{D^2 T(x)}{DT(x)}\Big )^2. \end{aligned}$$
(2.7)

We say that T has non-positive Schwarzian derivative on I if \(DT(x) \ne 0\) and \({\mathbf {S}}{{T}}(x) \le 0\) for all \(x \in I\). A direct computation shows that the Schwarzian derivative of the composition of two transformations \(T_1,T_2: I \rightarrow I\) satisfies

$$\begin{aligned} {\mathbf {S}}(T_2 \circ T_1)(x) = {\mathbf {S}} T_2\big (T_1(x)\big ) \cdot |DT_1(x)|^2 + {\mathbf {S}}{\text {T}}_1(x). \end{aligned}$$
(2.8)

Hence, \({\mathbf {S}}(T_2 \circ T_1)\le 0\) provided \({\mathbf {S}} T_1\le 0\) and \({\mathbf {S}} T_2\le 0\).
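The identity (2.8) can be checked symbolically; the following SymPy sketch (ours) verifies it for the two logistic maps from the Introduction:

```python
import sympy as sp

x = sp.symbols('x')

def S(f):
    # Schwarzian derivative (2.7): f'''/f' - (3/2)*(f''/f')**2
    return sp.diff(f, x, 3) / sp.diff(f, x) \
        - sp.Rational(3, 2) * (sp.diff(f, x, 2) / sp.diff(f, x)) ** 2

T1 = 4 * x * (1 - x)
T2_ = 2 * x * (1 - x)
lhs = S(T2_.subs(x, T1))                              # S(T2 o T1)
rhs = S(T2_).subs(x, T1) * sp.diff(T1, x) ** 2 + S(T1)
assert sp.simplify(lhs - rhs) == 0                    # (2.8) holds
```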

From (2.8) it follows that for a finite collection \(\{ T_j: I \rightarrow I\}_{j \in \Sigma }\) of \(C^3\) interval maps with non-positive Schwarzian derivative, we can write the Schwarzian derivative of \(T_\omega ^n\), \(n \in \mathbb {N}\) and \(\omega \in \Omega \), as

$$\begin{aligned} {\mathbf {S}}{\text {T}}_{\omega }^n(x) = \sum _{i=0}^{n-1} \mathbf{ST}_{\omega _{i+1}}\big (T_{\omega }^i(x)\big ) \cdot \Big |\prod _{j=1}^i DT_{\omega _j}(T_{\omega }^{j-1}(x))\Big |^2. \end{aligned}$$
(2.9)

By (G2) and (B2) this implies that for a collection of good and bad maps \(\{ T_j\}_{j \in \Sigma }\), \(T_{\omega }^n\) has non-positive Schwarzian derivative on [0, 1] outside of the critical points of \(T_\omega ^n\) for all \(\omega \in \Omega \) and \(n \in \mathbb {N}\).

We will use the following two well-known properties of maps with non-positive Schwarzian derivative (see e.g. [17, Section 4.1]).

Minimum Principle: Let \(I = [a,b]\) be a closed interval and suppose that \(T: I \rightarrow I\) has non-positive Schwarzian derivative. Then

$$\begin{aligned} |DT(x)| \ge \min \{|DT(a)|,|DT(b)|\}, \quad \forall x \in [a,b]. \end{aligned}$$
(2.10)

A consequence of the Minimum Principle is that for any \(T \in {\mathfrak {G}} \cup {\mathfrak {B}}\) the derivative |DT| has no strict local minima in the intervals (0, c) and (c, 1). In particular, there cannot be any attracting fixed points of T in (0, c) and (c, 1). Therefore, if \(T \in {\mathfrak {B}}\), then \(T^n(x) \rightarrow c\) as \(n \rightarrow \infty \) for all \(x \in (0,1)\).

Koebe Principle: For each \(\rho > 0\) there exist \(K^{(\rho )} >1\) and \(M^{(\rho )} > 0\) with the following property. Let \(J \subseteq I\) be intervals and suppose that \(T: I \rightarrow I\) has non-positive Schwarzian derivative. If both components of \(T(I)\backslash T(J)\) have length at least \(\rho \cdot \lambda (T(J))\), then

$$\begin{aligned} \frac{1}{K^{(\rho )}} \le \frac{DT(x)}{DT(y)} \le K^{(\rho )}, \qquad \forall x,y \in J \end{aligned}$$
(2.11)

and

$$\begin{aligned} \Big |\frac{DT(x)}{DT(y)}-1\Big | \le M^{(\rho )} \cdot \frac{|T(x)-T(y)|}{\lambda (T(J))}, \qquad \forall x,y \in J. \end{aligned}$$
(2.12)

Note that the constants \(K^{(\rho )},M^{(\rho )}\) only depend on \(\rho \) and not on the map T.

From (2.11) one can obtain a bound on the size of the images of intervals: Let \(J' \subseteq J\) be another interval. By the Mean Value Theorem there exists an \(x \in J'\) with \(|DT(x)| = \frac{\lambda (T(J'))}{\lambda (J')}\) and a \(y \in J\) with \(|DT(y)| = \frac{\lambda (T(J))}{\lambda (J)}\). Hence,

$$\begin{aligned} \frac{1}{K^{(\rho )}} \frac{\lambda (J')}{\lambda (J)} \le \frac{DT(x)}{DT(y)} \frac{\lambda (J')}{\lambda (J)} = \frac{\lambda (T(J'))}{\lambda (T(J))} \le K^{(\rho )} \frac{\lambda (J')}{\lambda (J)}. \end{aligned}$$
(2.13)

Recall the constants \(\ell _b\), \(K_b\) and \(M_b\) from condition (B3) and set \(\ell _{\min } = \min \{ \ell _b \, : \, b \in \Sigma _B\}\) and \(\ell _{\max } = \max \{ \ell _b \, : \, b \in \Sigma _B\}\). (B3) gives us control over the distance between \(T_\omega ^n (x)\) and c.

Lemma 2.4

For all \(n \in \mathbb {N}\), \(\omega \in \Sigma _B^{\mathbb {N}}\) and \(x \in [0,1]\),

$$\begin{aligned} \left( {\tilde{K}} |x-c| \right) ^{\ell _{\omega _1} \cdots \ell _{\omega _n}} \le |T_{\omega }^n(x)-c| \le \left( {\tilde{M}} |x-c|\right) ^{\ell _{\omega _1} \cdots \ell _{\omega _n}}, \end{aligned}$$
(2.14)

with constants \({\tilde{K}} =\big ( \frac{\min \{ K_b \, : \, b \in \Sigma _B \}}{\ell _{\max }}\big )^ \frac{1}{\ell _{\min }-1} \in (0,1)\) and \({\tilde{M}} = \big (\frac{\max \{ M_b \, : \, b \in \Sigma _B \}}{\ell _{\min }}\big )^{\frac{1}{\ell _{\min }-1}} > 1\).

Proof

It follows from (B3) that for any \(j \in \Sigma _B\) and \(x \in [0,1]\),

$$\begin{aligned} |T_j(x)-c| = |T_j(x)-T_j(c)| = \Big | \int _c^x DT_j(y) dy \Big | \ge \frac{\min \{ K_b \, : \, b \in \Sigma _B \}}{\ell _{\max }} |x-c|^{\ell _j}.\end{aligned}$$

By induction we get that for each \(n \in \mathbb {N}\) and \(\omega \in \Sigma _B^{\mathbb {N}}\),

$$\begin{aligned} |T_{\omega }^n(x)-c| \ge \left( \frac{\min \{ K_b \, : \, b \in \Sigma _B \}}{\ell _{\max }} \right) ^{1+\sum _{i=0}^{n-2} \ell _{\omega _n} \cdots \ell _{\omega _{n-i}}} \cdot |x-c|^{\ell _{\omega _1} \cdots \ell _{\omega _n}}. \end{aligned}$$
(2.15)

From (B3) we see that \(\frac{\min \{ K_b \, : \, b \in \Sigma _B \}}{\ell _{\max }}<1\). The lower bound now follows by observing that

$$\begin{aligned} \Big (1+\sum _{i=0}^{n-2} \ell _{\omega _n} \cdots \ell _{\omega _{n-i}}\Big )/(\ell _{\omega _1}\cdots \ell _{\omega _n}) \le \sum _{i=1}^n \frac{1}{\ell _{\min }^i} < \frac{1}{\ell _{\min }-1}. \end{aligned}$$

The result for the upper bound follows similarly, by noticing that in this case from (B3) it follows that \( \frac{\max \{ M_b \, : \, b \in \Sigma _B\}}{\ell _{\min }}>1\). \(\square \)

It follows that under iterations of bad maps the distance \(|T^n_\omega (x)-c|\) eventually decreases to zero superexponentially fast in n.
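For the single bad map \(T_2\) this contraction can be made fully explicit: \(T_2(x) - \frac{1}{2} = -2(x-\frac{1}{2})^2\), so by induction \(|T_2^n(x)-\frac{1}{2}| = \frac{1}{2}\big (2|x-\frac{1}{2}|\big )^{2^n}\), in agreement with the bounds (2.14) for \(\ell = 2\). A quick numerical check (ours):

```python
T2 = lambda x: 2 * x * (1 - x)

x0 = 0.3
x, d0 = x0, abs(x0 - 0.5)
for n in range(1, 6):
    x = T2(x)
    closed_form = 0.5 * (2 * d0) ** (2 ** n)
    print(n, abs(x - 0.5), closed_form)   # the two columns agree up to rounding
```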

Furthermore, note that there exists a \(\delta > 0\) such that \(|DT_b(x)| < 1\) for all \(x \in [c-\delta ,c+\delta ]\) and \(b \in \Sigma _B\). This implies

$$\begin{aligned} |T_b(x)-c| < |x-c| \end{aligned}$$
(2.16)

for all \(x \in [c-\delta ,c+\delta ]\) and \(b \in \Sigma _B\).

The upper bound on \(|T^n_\omega (x)-c|\) that we obtained in Lemma 2.4 will be used in Sect. 4 to prove that \(\mu _{{\mathbf {p}}}\) in Theorem 1.3 is infinite if \(\theta \ge 1\). The lower bound from Lemma 2.4 will be used for the proof that \(\mu _{{\mathbf {p}}}\) is finite if \(\theta < 1\).

3 Existence of a \(\sigma \)-finite acs Measure

From now on we fix an integer \(N \ge 2\) and consider a finite collection \(T_1,\ldots ,T_N \in {\mathfrak {G}} \cup {\mathfrak {B}}\) of good and bad maps in the classes \({\mathfrak {G}}\) and \(\mathfrak B\). As in the Introduction write \(\Sigma _G = \{1 \le j \le N: T_j \in {\mathfrak {G}}\}\) and \(\Sigma _B = \{1 \le j \le N: T_j \in {\mathfrak {B}}\}\) for the corresponding index sets and assume that \(\Sigma _G,\Sigma _B \ne \emptyset \). Write \(\Sigma = \{ 1, 2, \ldots , N \}\) and set \(\Omega = \Sigma ^{\mathbb {N}}\) for the set of infinite sequences of elements in \(\Sigma \). In this section we prove Theorem 1.2, i.e., we establish the existence of an ergodic acs measure and several of its properties using an inducing scheme for the random system F. We fix the index \(g \in \Sigma _G\) of one good map \(T_g\) and start by constructing an inducing domain that depends on this g.

3.1 The Induced System and Return Time Partition

The first lemma is needed to specify the set on which we induce. For each \(k \in \mathbb {N}\) let \(x_k\) and \(x_k'\) in (0, c) denote the critical points of \(T_g^k\) closest to 0 and c, respectively. Furthermore, let \(y_k\) and \(y_k'\) in (c, 1) denote the critical points of \(T_g^k\) closest to 1 and c, respectively.

Lemma 3.1

We have \(x_k \downarrow 0\), \(x_k' \uparrow c\), \(y_k' \downarrow c\), \(y_k \uparrow 1\) as \(k \rightarrow \infty \).

Proof

Let a and b denote the critical points of \(T^2_g\) in (0, c) and (c, 1), respectively. Then at least one of the branches \(T_g^2|_{(0,a)}\) and \(T_g^2|_{(b,1)}\) is increasing. Suppose that \(T_g^2|_{(0,a)}\) is increasing. It then follows from the Minimum Principle that \(T_g^2(x) \ge \min \{\frac{x}{a},DT_g^2(0)\cdot x\}\) for each \(x \in [0,a]\). To see this, suppose there is an \(x \in (0,a)\) with \(T_g^2(x) < \min \{\frac{x}{a},DT_g^2(0)\cdot x\}\). Then there must be a \(y \in (0, x)\) with \(DT_g^2(y) < \min \{ D T_g^2(0), \frac{1}{a} \}\) and a \(z \in [x,a)\) with \(DT_g^2(z) > \frac{1}{a}\). On the other hand, by the Minimum Principle, \(DT_g^2(y) \ge \min \{ DT_g^2(0), DT_g^2(z) \}\), a contradiction. Combining this with \(DT_g^2(0) > 1\) and defining \(L: (0,1) \rightarrow (0,a)\) by \(L = (T_g^2|_{(0,a)})^{-1}\), we see that \(L^k(a) \downarrow 0\) as \(k \rightarrow \infty \). Furthermore, define \(R: (0,1) \rightarrow (b,1)\) by \(R = (T_g^2|_{(b,1)})^{-1}\). If \(T_g^2|_{(b,1)}\) is increasing, we see that similarly \(R^k(b) \uparrow 1\) as \(k \rightarrow \infty \). On the other hand, if \(T_g^2|_{(b,1)}\) is decreasing, we have \(RL^k(a) \uparrow 1\) as \(k \rightarrow \infty \). Finally, if \(T_g^2|_{(0,a)}\) is decreasing, then \(T_g^2|_{(b,1)}\) must be increasing, which yields \(LR^k(b) \downarrow 0\) as \(k \rightarrow \infty \). We conclude that \(x_k \downarrow 0\) and \(y_k \uparrow 1\) as \(k \rightarrow \infty \). It follows from (G1) that c is a limit point of both of the sets \(\bigcup _{k \in \mathbb {N}} (T_g|_{(0,c)})^{-1}(\{x_k,y_k\})\) and \(\bigcup _{k \in \mathbb {N}} (T_g|_{(c,1)})^{-1}(\{x_k,y_k\})\). So \(x_k' \uparrow c\), \(y_k' \downarrow c\) as \(k \rightarrow \infty \). \(\square \)

By the previous lemma and (G1), for \(k \in \mathbb {N}\) large enough it holds that

$$\begin{aligned}&T_g(x_k') \le \ x_k' \text { or } T_g(x_k') \ge y_k', \text { and}\nonumber \\&T_g(y_k') \le \ x_k' \text { or } T_g(y_k') \ge y_k', \end{aligned}$$
(3.1)

and, using also (G4), (B1) and (B4), for every \(j \in \Sigma \),

$$\begin{aligned}&T_j \big ([0,x_k] \cup [y_k,1]\big ) \subseteq [0,x_k') \cup (y_k',1] \text { and }\nonumber \\&|D T_j(x)|> d > 1 \, \, \text { for all } x \in [0,x_k) \cup (y_k,1] \, \text { and some constant }d. \end{aligned}$$
(3.2)

Fix a \(\kappa \in {\mathbb {N}}\) for which (3.1) and (3.2) hold. We introduce some notation. Let \(t \in \Sigma \) be such that \(t \ne g\), and define

$$\begin{aligned}&C = [\underbrace{g\cdots g}_{\kappa \ \text {times}} t] = [g^\kappa t],\end{aligned}$$
(3.3)
$$\begin{aligned}&J_0 = (x_\kappa ,x_\kappa '), \quad J_1 = (y_\kappa ',y_\kappa ), \quad J = J_0 \cup J_1,\end{aligned}$$
(3.4)
$$\begin{aligned}&Y = C \times J. \end{aligned}$$
(3.5)

The next lemma shows that \({\mathbb {P}} \times \lambda \)-almost all \((\omega ,x)\) eventually enter Y under iterations of F, and hence that \({\mathbb {P}} \times \lambda \)-almost all \((\omega ,x) \in Y\) will return to Y infinitely many times.

Lemma 3.2

$$\begin{aligned} {\mathbb {P}} \times \lambda \Big ( \Omega \times [0,1] \setminus \bigcup _{n = 1}^{\infty } F^{-n} Y \Big )=0. \end{aligned}$$
(3.6)

Proof

For \(\mathbb {P}\)-almost all \(\omega \in \Omega \) we have \(\sigma ^n \omega \in [g]\) for infinitely many \(n \in \mathbb {N}\). Fix such an n and let \(x \in (0,c) \cup (c,1)\) be such that \(T_\omega ^n(x) \not \in J\). If \(T_\omega ^n(x) \in (0, x_\kappa ] \cup [y_\kappa ,1)\), then it follows from (3.2) that there is an \(m \ge 1\) such that \(T_\omega ^{n+m}(x) \in J\). If \(T_\omega ^n (x) \in [x_\kappa ',c) \cup (c, y_\kappa ']\), it follows from (3.1) that \(T^{n+1}_\omega (x)=T_g \circ T_\omega ^n (x) \in (0,x_\kappa '] \cup [y_\kappa ',1)\), which means that we are in the first case if \(T_{\omega }^{n+1}(x) \notin J\). Hence, there exists a measurable set \(A \subseteq \Omega \times [0,1]\) with \(\mathbb {P} \times \lambda (A) = 1\) such that for each \((\omega ,x) \in A\) we have \(T_{\omega }^n(x) \in J\) for infinitely many \(n \in \mathbb {N}\).

We define

$$\begin{aligned} \mathcal {E} = A \setminus \bigcup _{n = 1}^{\infty } F^{-n} Y \end{aligned}$$
(3.7)

and for each \(x \in [0,1]\) we define

$$\begin{aligned} \mathcal {E}_x = \Big \{ \omega \in \Omega : (\omega ,x) \in A \setminus \bigcup _{n = 1}^{\infty } F^{-n} Y \Big \}, \end{aligned}$$
(3.8)

which is the x-section of \(\mathcal {E}\). It follows from Fubini’s Theorem that \(\mathcal {E}_x\) is measurable for \(\lambda \)-almost all \(x \in [0,1]\) and that

$$\begin{aligned} \mathbb {P} \times \lambda (\mathcal {E}) = \int _{[0,1]} \mathbb {P}(\mathcal {E}_x) d\lambda (x). \end{aligned}$$
(3.9)

Combining this with \(\mathbb {P} \times \lambda (A) = 1\), it remains to show that \(\mathbb {P}(\mathcal {E}_x) = 0\) holds for \(\lambda \)-almost all \(x \in [0,1]\) for which \(\mathcal {E}_x\) is measurable.

Let \(x \in [0,1]\) be such that \(\mathcal {E}_x\) is measurable. According to the Lebesgue Differentiation Theorem (see e.g. [38]), \(\mathbb {P}\)-almost every \(\omega \in \Omega \) is a Lebesgue point of the function \(1_{\mathcal {E}_x}\). Consider such an \(\omega \) and suppose that \(\omega \in \mathcal {E}_x\). Then \((\omega ,x) \in A\), so there exists an increasing sequence \((n_j)_{j \in \mathbb {N}}\) in \(\mathbb {N}\) that satisfies \(T_{\omega }^{n_j}(x) \in J\) for each \(j \in \mathbb {N}\). Recall that \(\sigma \) denotes the left shift on sequences. If \(\omega ' \in \sigma ^{-n_j} C \cap [\omega _1 \cdots \omega _{n_j}]\), then \(T_{\omega '}^{n_j}(x) = T_{\omega }^{n_j}(x) \in J\) and so \(F^{n_j}(\omega ',x) \in Y\), which gives \(\omega ' \notin \mathcal {E}_x\). So \(\mathcal {E}_x\) and \(\sigma ^{-n_j} C \cap [\omega _1 \cdots \omega _{n_j}]\) are disjoint for each \(j \in \mathbb {N}\), which together with \(\omega \) being a Lebesgue point of \(1_{\mathcal {E}_x}\) yields that

$$\begin{aligned} 1 \ge \frac{\mathbb {P}\big ((\mathcal {E}_x \cup \sigma ^{-n_j}C) \cap [\omega _1 \cdots \omega _{n_j}]\big )}{\mathbb {P}\big ([\omega _1 \cdots \omega _{n_j}]\big )}&= \frac{\mathbb {P}\big (\mathcal {E}_x \cap [\omega _1 \cdots \omega _{n_j}]\big )}{\mathbb {P}\big ([\omega _1 \cdots \omega _{n_j}]\big )} + \frac{\mathbb {P}\big (\sigma ^{-n_j}C \cap [\omega _1 \cdots \omega _{n_j}]\big )}{\mathbb {P}\big ([\omega _1 \cdots \omega _{n_j}]\big )}\\&\rightarrow 1_{\mathcal {E}_x}(\omega ) + \mathbb {P}(C), \qquad \hbox { as}\ j \rightarrow \infty . \end{aligned}$$

Since \(\mathbb {P}(C) > 0\), we find that \(\omega \in \mathcal {E}_x\) gives a contradiction. We conclude that \(\mathbb {P}(\mathcal {E}_x) = 0\).

\(\square \)

By Lemma 3.2 the first return time map \(\varphi _Y\), see (2.4), and the induced transformation \(F_Y\) are well defined on the full measure subset of points in Y that return to Y infinitely often under iterations of F, which we call Y again. The set of points in Y that return to Y after n iterations of F can be described as

$$\begin{aligned} Y \cap F^{-n}(Y) = \bigcup _{\omega \in C \cap \sigma ^{-n}C} [\omega _1 \cdots \omega _n] \times (T_{\omega }^n|_J)^{-1}(J) \quad \bmod \mathbb {P} \times \lambda , \end{aligned}$$
(3.10)

which is empty for \(n \le \kappa \). Note that in (3.10) in fact \([\omega _1 \cdots \omega _n] = [g^\kappa t \omega _{\kappa +2}\cdots \omega _n g^\kappa t]\) and that by construction each map \(T_{\omega }^n|_J\) in (3.10) consists of branches that all have range (0, c), (c, 1) or (0, 1), since any branch of \(T_{\omega }^\kappa |_J\) maps onto (0, 1). Therefore, \(Y \cap F^{-n}(Y)\) can be written as a finite union of products \(A = [{\mathbf {u}} g^{\kappa } t] \times I\) of cylinders \([{\mathbf {u}} g^{\kappa } t] \subseteq C\) with \(|{\mathbf {u}}|=n\) and open intervals \(I \subseteq J\), each of which is mapped under \(F^n\) onto \(C \times J_0\) or \(C \times J_1\). Call the collection of these sets \(P_n\) and let \(\alpha = \bigcup _{n > \kappa } P_n\). Let \(\mathbb {P}_C\) and \(\lambda _J\) denote the normalized restrictions of \(\mathbb {P}\) to C and of \(\lambda \) to J, respectively.

Lemma 3.3

(1) The collection \(\alpha \) forms a countable return time partition of Y, i.e., \({\mathbb {P}}_C \times \lambda _J (\bigcup _{A \in \alpha } A) = 1\), any two different sets \(A, A' \in \alpha \) are disjoint, and on any \(A \in \alpha \) the first return time map \(\varphi _Y\) is constant.

(2) Let \(\pi \) denote the canonical projection onto the second coordinate. Any \(x \in J\) is contained in a set \(\pi (A)\) for some set \(A \in \alpha \).

Proof

The fact that \({\mathbb {P}}_C \times \lambda _J (\bigcup _{A \in \alpha } A) = 1\) follows from Lemma 3.2 and it is clear from the construction that the first return time map \(\varphi _Y\) is constant on any element \(A \in \alpha \). To show that any two elements are disjoint, note that for \(A,A' \in P_n\) this is clear. Suppose there are \(1 \le m <n\), \(A = [{\mathbf {u}} g^{\kappa } t] \times I \in P_n\) and \(A' = [{\mathbf {v}} g^{\kappa } t] \times I' \in P_m\) such that \(A \cap A' \ne \emptyset \). Since \(t \ne g\) we get \(n \ge m+\kappa +1\) and \([{\mathbf {u}} g^\kappa t] = [g^\kappa t v_{\kappa +2}\cdots v_m g^\kappa t u_{m+\kappa +2} \cdots u_n g^\kappa t]\). Moreover, \(I \cap \partial I' \ne \emptyset \) or \(I=I'\). In both cases, note that \(F^{m+\kappa +1} ([{\mathbf {v}} g^{\kappa } t] \times \partial I') \subseteq \Omega \times \{0,1\}\), so by (G1) and (B1) also \(F^n([{\mathbf {v}} g^{\kappa } t] \times \partial I') \subseteq \Omega \times \{0,1\}\), contradicting that \(F^n(A) \subseteq Y\). This proves (1).

For (2) note that, since \(\alpha \) is a partition of Y, for each \(x \in J\) it holds that there is an \(A = [{\mathbf {u}} g^{\kappa } t] \times I \in \alpha \) with \(x \in I\) or \(x \in \partial I\). In the first case there is nothing to prove, so assume that \(x \in \partial I\). Then \(T_{{\mathbf {u}}}(x) \in \partial J_i\) for some \(i \in \{0,1\}\). From the first part of the proof of Lemma 3.2 it then follows that there is an \(n > |{\mathbf {u}}|\) and an \(\omega \in C\) such that \(T^n_\omega (x) \in J\). If we write \(I'\) for the interval in \(T^{-n}_\omega (J)\) containing x, then this means that there exists a set \(A'=[{\mathbf {v}} g^{\kappa } t] \times I' \in \alpha \) with \(x \in \pi (A')\). \(\square \)

The second part of Lemma 3.3 shows that even though the partition elements of \(\alpha \) are disjoint, their projections on the second coordinate are not. The same is true for the first coordinate, as the same word \({\mathbf {u}}\) can lead some points of J to \(J_0\) and others to \(J_1\).

3.2 Properties of the Induced Transformation

It follows from (3.10) and Lemma 3.3 that for each \(A \in \alpha \) we have either \(F_Y(A) = C \times J_0\) or \(F_Y(A) = C \times J_1\). For any \([{\mathbf {u}} g^{\kappa } t] \times I \in \alpha \), the transformation \(T_{{\mathbf {u}}}|_I\) is invertible from I onto one of the sets \(J_0\) or \(J_1\). Define the operator \(\mathcal {P}_{{\mathbf {u}},I}: L^1(J,\lambda _J) \rightarrow L^1(J,\lambda _J)\) by

$$\begin{aligned} \mathcal {P}_{{\mathbf {u}},I}h(x) = {\left\{ \begin{array}{ll} \displaystyle \frac{h(T_{\mathbf {u}}|_I^{-1}(x))}{\big |DT_{\mathbf {u}}|_I(T_{\mathbf {u}}|_I^{-1}(x))\big |}, &{} \text {if } T_{\mathbf {u}}|_I^{-1}\{x\} \ne \emptyset ,\\ 0, &{} \text {otherwise}. \end{array}\right. } \end{aligned}$$
(3.11)

The random Perron–Frobenius-type operator \(\mathcal {P}_Y: L^1(J,\lambda _J) \rightarrow L^1(J,\lambda _J)\) on Y is given by

$$\begin{aligned} \mathcal {P}_Y = \sum _{[{\mathbf {u}} g^{\kappa } t] \times I \in \alpha } \mathbb {P}_C([{\mathbf {u}}]) \mathcal {P}_{{\mathbf {u}},I}. \end{aligned}$$
(3.12)

Note that \({\mathcal {P}}_Y\) is not exactly of the same form as the usual Perron–Frobenius operator in (2.6). Nonetheless, we have the following result.

Lemma 3.4

If \(\varphi \in L^1(J,\lambda _J)\) is a fixed point of \(\mathcal {P}_Y\), then the measure \(\mathbb {P}_C \times \nu \) with \(\nu = \varphi d\lambda _J\) is invariant for \(F_Y\).

Proof

For each cylinder \(K \subseteq C\) and each Borel set \(E \subseteq J\) we have

$$\begin{aligned} \mathbb {P}_C \times \nu \big (F_Y^{-1}(K \times E)\big )&= \sum _{[{\mathbf {u}} g^{\kappa } t] \times I \in \alpha } \mathbb {P}_C([{\mathbf {u}} g^{\kappa } t] \cap \sigma ^{-|{\mathbf {u}}|} K) \nu (I \cap T_{\mathbf {u}}^{-1} E)\\&= \mathbb {P}_C(K) \sum _{[{\mathbf {u}} g^{\kappa } t] \times I \in \alpha } \mathbb {P}_C([{\mathbf {u}}]) \int _E \mathcal {P}_{{\mathbf {u}},I} \varphi d\lambda _J\\&= \mathbb {P}_C(K) \int _E \mathcal {P}_Y \varphi d\lambda _J\\&= \mathbb {P}_C \times \nu (K \times E). \end{aligned}$$

\(\square \)

In Lemma 3.5 below we show that a fixed point of \({\mathcal {P}}_Y\) exists. For \(m \in \mathbb {N}\), set \(\alpha _m = \bigvee _{j=0}^{m-1} F_Y^{-j} \alpha \). Atoms of this partition are the m-cylinders of \(F_Y\). Introducing for each \(Z = \bigcap _{j=0}^{m-1} F_Y^{-j} ([{\mathbf {u}}_j g^{\kappa } t] \times I_j)\) in \(\alpha _m\) the notation

$$\begin{aligned} C_Z = \bigcap _{j=0}^{m-1} \sigma ^{-\sum _{i=0}^{j-1}|\mathbf{u}_i|}[{\mathbf {u}}_j g^{\kappa } t] \quad \text { and } \quad J_Z = \bigcap _{j =0}^{m-1} T_{{\mathbf {u}}_0 {\mathbf {u}}_1\cdots \mathbf{u}_{j-1}}^{-1} (I_j), \end{aligned}$$
(3.13)

we obtain \(Z = C_Z \times J_Z\). Writing \(\sigma _Z = \sigma ^{\sum _{i=0}^{m-1}|{\mathbf {u}}_i|}|_{C_Z}\) and \(T_Z = T_{{\mathbf {u}}_0 {\mathbf {u}}_1\cdots {\mathbf {u}}_{m-1}}|_{J_Z}\) we have \(F_Y^m|_Z = \sigma _Z \times T_Z\). Each \(T_Z \) has non-positive Schwarzian derivative, so we can apply the Koebe Principle. The image \(T_Z(J_Z)\) either equals \(J_0\) or \(J_1\). Choose a \({\bar{\rho }} >0\) such that \(I_0:=[x_\kappa - {\bar{\rho }}, x_\kappa ' + {\bar{\rho }}] \subseteq (0,c)\) and \(I_1:=[y_\kappa '-{\bar{\rho }}, y_\kappa + \bar{\rho }] \subseteq (c,1)\). There is a canonical way to extend the domain of each \(T_Z\) to an interval I containing \(J_Z\), such that \(T_Z(I)\) equals either \(I_0\) or \(I_1\) and \({\mathbf {S}}(T_Z) \le 0\) on I. Then by the Koebe Principle there exist constants \(K^{(\bar{\rho })}>1\) and \(M^{({\bar{\rho }})} > 0\) such that for all \(m \in \mathbb {N}\), \(Z \in \alpha _m\) and \(x,y \in J_Z\),

$$\begin{aligned}&\frac{1}{K^{({\bar{\rho }})}} \le \frac{DT_Z(x)}{DT_Z(y)} \le K^{({\bar{\rho }})}, \end{aligned}$$
(3.14)
$$\begin{aligned}&\Big |\frac{DT_Z(x)}{DT_Z(y)}-1\Big | \le \frac{M^{(\bar{\rho })}}{\min \{ \lambda (I_0), \lambda (I_1)\}} \cdot |T_Z(x)-T_Z(y)|. \end{aligned}$$
(3.15)

Note that for the random Perron–Frobenius-type operator from (3.12) we have for each \(m \ge 1\) that

$$\begin{aligned} \mathcal {P}_Y^m = \frac{1}{\mathbb {P}(C)} \sum _{Z \in \alpha _m} \mathbb {P}_C(C_Z) \mathcal {P}_{T_Z}, \end{aligned}$$
(3.16)

where \(\mathcal {P}_{T_Z}\) is as in (2.5).

Lemma 3.5

(cf. Lemmata V.2.1 and V.2.2 of [17]) \(\mathcal {P}_Y\) admits a fixed point \(\varphi \in L^1(J,\lambda _J)\) that is bounded, Lipschitz and bounded away from zero.

Proof

For each \(m \in \mathbb {N}\) and \(x \in J\),

$$\begin{aligned} \mathcal {P}_Y^m 1(x) = \frac{1}{\mathbb {P}(C)} \sum _{{\mathop { x \in T_Z(J_Z)}\limits ^{Z \in \alpha _m:}}} \frac{\mathbb {P}_C(C_Z)}{|DT_Z(T_Z^{-1}x)|}. \end{aligned}$$
(3.17)

Using the Mean Value Theorem, for all \(m \in \mathbb {N}\) and \(Z \in \alpha _m\) there exists a \(\xi \in J_Z\) such that

$$\begin{aligned} \frac{\lambda \big (T_Z(J_Z)\big )}{\lambda (J_Z)} = |DT_Z(\xi )|. \end{aligned}$$
(3.18)

Set \(K_1 = \frac{\max \{K^{({\bar{\rho }})}, M^{(\bar{\rho })}\}}{\mathbb {P}(C)\cdot \min \{\lambda (J_0),\lambda (J_1)\}}\), where \({\bar{\rho }}\) is as in (3.14) and (3.15). Since \(DT_Z(\xi )\) and \(DT_Z(y)\) have the same sign for any \(y \in J_Z\), (3.18) together with (3.14) implies

$$\begin{aligned} \mathcal {P}_Y^m 1(x) \le \sum _{Z \in \alpha _m} \frac{\mathbb {P}_C(C_Z)}{\mathbb {P}(C)} \cdot K^{({\bar{\rho }})} \frac{\lambda (J_Z)}{\lambda (T_Z(J_Z))} \le K_1 \sum _{Z \in \alpha _m} {\mathbb {P}}_C \times \lambda _J (C_Z \times J_Z) = K_1. \end{aligned}$$
(3.19)

Moreover, if for \(A = [{\mathbf {u}} g^{\kappa } t] \times I \in \alpha \) we take \(x,y \in I\), then for any \(Z \in \alpha _m\) it holds that \(x \in T_Z(J_Z)\) if and only if \(y \in T_Z(J_Z)\). For such Z, let \(x_Z, y_Z \in J_Z\) be such that \(T_Z(x_Z)=x\) and \(T_Z (y_Z)=y\). Then by (3.15)

$$\begin{aligned} |{\mathcal {P}}^m_Y 1(x) - {\mathcal {P}}^m_Y 1(y)|\le & {} \sum _{{\mathop { x \in T_Z(J_Z)}\limits ^{Z \in \alpha _m:}}} \frac{{\mathbb {P}}_C(C_Z)}{\mathbb {P}(C)} \left| \frac{1}{|D T_Z (x_Z)|} - \frac{1}{|D T_Z (y_Z)|} \right| \nonumber \\\le & {} \sum _{{\mathop { x \in T_Z(J_Z)}\limits ^{Z \in \alpha _m:}}} {\mathbb {P}}_C(C_Z) \frac{1}{|D T_Z (x_Z)|} K_1 |T_Z (x_Z)- T_Z (y_Z)|\nonumber \\= & {} K_1 {\mathcal {P}}_Y^m 1 (x) |x-y|. \end{aligned}$$
(3.20)

Together (3.19) and (3.20) imply that the sequence \(\big ( \frac{1}{m} \sum _{j=0}^{m-1} {\mathcal {P}}^j_Y 1 \big )_m\) is uniformly bounded and equicontinuous on I for each \(A = [{\mathbf {u}} g^{\kappa } t] \times I \in \alpha \). By Lemma 3.3(2) it follows that the same holds on J. Hence, by the Arzelà–Ascoli Theorem there exists a subsequence

$$\begin{aligned} \left( \frac{1}{m_k} \sum _{j=0}^{m_k-1} {\mathcal {P}}^j_Y 1 \right) _{m_k}\end{aligned}$$

converging uniformly to a function \(\varphi :J \rightarrow [0,\infty )\) satisfying \(\varphi \le K_1\) and for each \(A = [{\mathbf {u}} g^{\kappa } t] \times I \in \alpha \) and \(x,y \in I\),

$$\begin{aligned} |\varphi (x)-\varphi (y)| \le K_1 \varphi (x) |x-y|. \end{aligned}$$
(3.21)

Hence, \(\varphi \) is bounded and by Lemma 3.3(2) it is clear that \(\varphi \) is Lipschitz (with Lipschitz constant bounded by \(K_1^2\)). It is readily checked that \(\varphi \) is a fixed point of \({\mathcal {P}}_Y\), so that \({\mathbb {P}}_C \times \nu \) with \(\nu = \varphi \, d\lambda \) is an invariant probability measure for \(F_Y\).

What is left is to verify that for each \(A = [{\mathbf {u}} g^{\kappa }t] \times I \in \alpha \) the function \(\varphi \) is bounded away from zero on the interior of I. Suppose that there is such an \(A = [{\mathbf {u}} g^{\kappa }t] \times I\) for which \(\inf _{x \in I} \varphi (x)=0\). Then from (3.21) it follows that \(\varphi (y)=0\) for all \(y \in I\), hence \(\nu (I)=0\). Either \(I \subseteq J_0\) or \(I \subseteq J_1\). If \(I \subseteq J_0\), then for any set \(A' = [{\mathbf {v}} g^{\kappa } t] \times I'\in \alpha \) with \(T_{\mathbf {v}} (I') =J_0\) it holds that

$$\begin{aligned} \mathbb {P}_C \times \lambda _J(A' \cap F_Y^{-1} A) > 0\end{aligned}$$

and, by the \(F_Y\)-invariance of \(\mathbb {P}_C \times \nu \),

$$\begin{aligned} \mathbb {P}_C \times \nu (A' \cap F_Y^{-1} A) \le \mathbb {P}_C \times \nu (F_Y^{-1} A) = \mathbb {P}_C \times \nu (A) = 0,\end{aligned}$$

which together give \(\inf _{x \in I'} \varphi (x) = 0\) and therefore, like before, \(\nu (I') = 0\). There are sets \(A' = [{\mathbf {v}} g^{\kappa } t] \times I'\) with \(I' \subseteq J_1\) and \(T_{\mathbf{v}}(I') = J_0\), so we can repeat the argument to show that also for any set \(A''= [{\mathbf {v}} g^{\kappa } t] \times I'' \in \alpha \) with \(T_{{\mathbf {v}}}(I'')=J_1\) we have \(\nu (I'')=0\). So \({\mathbb {P}}_C \times \nu (A)=0\) for all \(A \in \alpha \). If \(I \subseteq J_1\) we come to the same conclusion. This gives a contradiction, so \(\varphi \) is bounded from below on each interval I. \(\square \)

It follows from Lemma 3.4 that \(\mathbb {P}_C \times \nu \) with \(\nu = \varphi d\lambda _J\) is a finite \(F_Y\)-invariant measure. To show that \({\mathbb {P}}_C \times \lambda _J\) is \(F_Y\)-ergodic we need the following result, which states that the sets \(\pi (A)\) for \(A \in \alpha _m\) shrink uniformly to \(\lambda \)-null sets as \(m \rightarrow \infty \).

Lemma 3.6

\(\displaystyle \lim _{m \rightarrow \infty } \sup \{ \lambda _J(J_Z) \, : \, Z \in \alpha _m \} =0\).

Proof

Set \(\delta = \sup \{ \lambda _J(J_Z) \, : \, Z \in \alpha \} < 1\). Fix an m and let \(Z = \bigcap _{j=0}^{m-1} F_Y^{-j} ([{\mathbf {u}}_j g^{\kappa }t] \times I_j) = C_Z \times J_Z \in \alpha _m\) as in (3.13). Set

$$\begin{aligned} {\tilde{J}}_Z = \bigcap _{j =0}^{m-2} T_{{\mathbf {u}}_0 {\mathbf {u}}_1\cdots {\mathbf {u}}_{j-1}}^{-1} (I_j),\end{aligned}$$

so that \(J_Z = {\tilde{J}}_Z \cap T_{{\mathbf {u}}_0 \cdots \mathbf{u}_{m-2}}^{-1} (I_{m-1})\). Let \(J_i\), \(i \in \{0,1\}\), be such that \(T_{{\mathbf {u}}_0 \cdots {\mathbf {u}}_{m-2}} ({\tilde{J}}_Z) = J_i\). It holds that \(T_{{\mathbf {u}}_0 \cdots {\mathbf {u}}_{m-2}}(J_Z) = I_{m-1}\), so \(\lambda (T_{{\mathbf {u}}_0 \cdots {\mathbf {u}}_{m-2}}(J_Z)) \le \delta \) and thus

$$\begin{aligned} \lambda \big ( T_{{\mathbf {u}}_0 {\mathbf {u}}_1 \cdots {\mathbf {u}}_{m-2}} ( {\tilde{J}}_Z \setminus J_Z)\big ) \ge \lambda (J_i)-\delta .\end{aligned}$$

Since \({\tilde{J}}_Z \setminus J_Z\) consists of at most two intervals, with (3.14) and (2.13) this gives

$$\begin{aligned} 1- \frac{\lambda _J(J_Z)}{\lambda _J({\tilde{J}}_Z)} = \frac{\lambda _J ({\tilde{J}}_Z\setminus J_Z)}{\lambda _J({\tilde{J}}_Z)} \ge \frac{1}{K^{({\bar{\rho }})}} \frac{\lambda _J\big (T_{{\mathbf {u}}_0 \cdots {\mathbf {u}}_{m-2}}({\tilde{J}}_Z \setminus J_Z)\big )}{\lambda _J\big (T_{{\mathbf {u}}_0 \cdots {\mathbf {u}}_{m-2}}({\tilde{J}}_Z)\big )} \ge \frac{1}{K^{({\bar{\rho }})}} \frac{\lambda _J(J_i)-\delta }{\lambda _J(J_i)}.\end{aligned}$$

Set \(K_1 := \max \big \{ 1 - \frac{1}{K^{({\bar{\rho }})}} \frac{\lambda _J(J_i)-\delta }{\lambda _J(J_i)} \, : \, i=0,1 \big \} \in (0,1)\). Then by repeating the same steps, we obtain

$$\begin{aligned} \lambda _J(J_Z) \le K_1 \lambda _J({\tilde{J}}_Z) \le \cdots \le K_1^m \lambda _J(I_0) < K_1^m,\end{aligned}$$

which proves the lemma. \(\square \)

Lemma 3.7

The measure \(\mathbb {P}_C \times \lambda _J\) is \(F_Y\)-ergodic.

Proof

Suppose \(E \subseteq Y\) with \(\mathbb {P}_C \times \lambda _J(E) > 0\) satisfies \(F_Y^{-1} E = E\) mod \(\mathbb {P}_C \times \lambda _J\). We show that \(\mathbb {P}_C \times \lambda _J(E) =1\). The Borel measure \(\rho \) on Y given by

$$\begin{aligned} \rho (V) = \int _V 1_E(\omega ,x) \varphi (x) d\mathbb {P}_C(\omega )d\lambda _J(x)\end{aligned}$$

for Borel sets V is \(F_Y\)-invariant. According to Lemmas 2.2 and 2.1 this yields a stationary measure \(\tilde{\mu }\) on [0, 1] that is absolutely continuous w.r.t. \(\lambda \) and satisfies \((\mathbb {P} \times \tilde{\mu })|_Y = \rho \). Let \(L := \text {supp}(\tilde{\mu }|_J)\) denote the support of the measure \(\tilde{\mu }|_J\). Since \(\rho \) is a product measure, this gives \(\text {supp}(\rho ) = C \times L\) and so by the definition of \(\rho \) we get \(C \times L \subseteq E\) and \(\rho (E \backslash (C \times L)) = 0\). Since \(\varphi \) is bounded away from zero, this yields

$$\begin{aligned} E = C \times L \quad \bmod \mathbb {P}_C \times \lambda _J. \end{aligned}$$
(3.22)

To obtain the result, it remains to show that \(\lambda _J(J \backslash L) = 0\).

We have \(C \times L = \bigcup _{Z \in \alpha _m} C_Z \times (J_Z \cap L)\) and \(F_Y^{-m}(C \times L) = \bigcup _{Z \in \alpha _m} C_Z \times T_Z^{-1} L\). From the non-singularity of \(F_Y\) w.r.t. \(\mathbb {P}_C \times \lambda _J\) it follows that for each \(m \in \mathbb {N}\),

$$\begin{aligned} C \times L = E = F_Y^{-m}E = F_Y^{-m}(C \times L) \quad \bmod \mathbb {P}_C \times \lambda _J, \end{aligned}$$
(3.23)

which, after intersecting with \(C_Z \times J_Z\) and using that \(\mathbb {P}_C(C_Z) > 0\), yields

$$\begin{aligned} J_Z \cap L = T_Z^{-1} L \quad \bmod \lambda _J, \quad \hbox { for each}\ Z \in \alpha _m. \end{aligned}$$
(3.24)

Let \(\varepsilon >0\). Since \(\lambda _J(L) > 0\), it follows from Lemma 3.6 and the Lebesgue Density Theorem that there are \(i\in \{0,1\}\), \(m_i \in \mathbb {N}\) and \(Z_i \in \alpha _{m_i}\) such that

$$\begin{aligned} T_{Z_i}(J_{Z_i}) = J_i \quad \text { and } \quad \lambda _J(J_{Z_i} \cap L) \ge (1-\varepsilon ) \lambda _J(J_{Z_i}).\end{aligned}$$

By (3.24), \(T_{Z_i}^{-1}(J_i\setminus L) = J_{Z_i} \setminus L \bmod \lambda _J\). The Mean Value Theorem gives the existence of a \(\xi \in J_{Z_i}\) such that

$$\begin{aligned} \frac{\lambda _J(T_{Z_i}(J_{Z_i}))}{\lambda _J(J_{Z_i})} = |D T_{Z_i}(\xi )|,\end{aligned}$$

and from (3.14) it follows that

$$\begin{aligned} \lambda _J (T_{Z_i} (J_{Z_i} \setminus L)) = \int _{J_{Z_i} \setminus L} |DT_{Z_i}| d\lambda \le K^{({\bar{\rho }})} |DT_{Z_i}(\xi )| \lambda _J (J_{Z_i} \setminus L).\end{aligned}$$

Hence,

$$\begin{aligned} \frac{\lambda _J(J_i \backslash L)}{\lambda _J(J_i)} = \frac{\lambda _J(T_{Z_i}(J_{Z_i} \setminus L))}{\lambda _J(T_{Z_i}(J_{Z_i}))} \le K^{({\bar{\rho }})} \frac{\lambda _J(J_{Z_i} \backslash L)}{\lambda _J(J_{Z_i})} \le K^{({\bar{\rho }})} \varepsilon . \end{aligned}$$
(3.25)

So, for each \(\varepsilon >0\) we can find an \(i=i(\varepsilon )\) for which (3.25) holds. If for each \(\varepsilon _0 > 0\) and each \(i_0 \in \{0,1\}\) there exists an \(\varepsilon \in (0,\varepsilon _0)\) such that \(i(\varepsilon ) = i_0\), we obtain from (3.25) that \(\lambda _J(J \backslash L) = 0\). Otherwise, there exists \(\varepsilon _0 > 0\) and \(i_0 \in \{0,1\}\) such that \(i(\varepsilon ) = i_0\) for all \(\varepsilon \in (0,\varepsilon _0)\). Without loss of generality, suppose that \(i_0 = 0\). Then (3.25) gives \(\lambda _J(J_0 \backslash L) = 0\). By the equivalence of \(\nu \) and \(\lambda _J\) and the fact that every good map has full branches it follows that

$$\begin{aligned} \mathbb {P}_C \times \nu \big ((C \times J_0) \cap F_Y^{-1}(C \times J_1)\big ) > 0. \end{aligned}$$
(3.26)

Together with the Poincaré Recurrence Theorem this gives that

$$\begin{aligned} A = \{(\omega ,x) \in C \times J_0: F_Y^m(\omega ,x) \in C \times J_1 \hbox { for infinitely many}\ m \in \mathbb {N}\} \end{aligned}$$
(3.27)

satisfies \(\mathbb {P}_C \times \nu (A) > 0\), and therefore \(\mathbb {P}_C \times \lambda _J(A) > 0\). Together with \(\lambda _J(J_0 \backslash L ) = 0\) it follows from the Lebesgue Density Theorem that there exists a Lebesgue point \(x \in \pi (A) \cap L\) of \(1_{\pi (A) \cap L}\). Since \(x \in \pi (A)\), for infinitely many \(m \in \mathbb {N}\) there exists \(Z_m \in \alpha _m\) such that \(x \in J_{Z_m}\) and \(T_{Z_m}(J_{Z_m}) = J_1\). This again together with Lemma 3.6 yields that for each \(\varepsilon > 0\) there exist \(m \in \mathbb {N}\) and \(Z \in \alpha _m\) such that

$$\begin{aligned} T_Z(J_Z) = J_1 \quad \text { and } \quad \lambda _J(J_Z \cap L) \ge (1-\varepsilon ) \lambda _J(J_Z).\end{aligned}$$

Arguing as before, this gives \(\lambda _J(J_1 \backslash L) = 0\), so \(\lambda _J(J \backslash L) = 0\). \(\square \)

3.3 The proof of Theorem 1.2

In the previous paragraphs we collected all the ingredients necessary to prove Theorem 1.2.

    Proof of Theorem 1.2

  1. (1)

    We have constructed a finite \(F_Y\)-invariant measure \(\mathbb P_C \times \nu \) which is absolutely continuous with respect to \({\mathbb {P}}_C\times \lambda _J\). Since F is non-singular with respect to \(\mathbb {P} \times \lambda \), we can therefore by Lemma 2.2 extend \({\mathbb {P}}_C \times \nu \) to an F-invariant measure \({\mathbb {P}} \times \mu \) which is absolutely continuous with respect to \({\mathbb {P}}\times \lambda \). Lemma 3.2 immediately implies that \(\mu \) is \(\sigma \)-finite. What is left to show is that \({\mathbb {P}} \times \mu \) is the unique such measure (up to multiplication by constants) and that it is ergodic.

A well-known result [1, Theorem 1.5.6] states that a conservative ergodic non-singular transformation T on a probability space \((X,{\mathcal {B}},m)\) admits at most one (up to scalar multiplication) m-absolutely continuous \(\sigma \)-finite invariant measure. Therefore, it suffices to show that F is conservative and ergodic with respect to \({\mathbb {P}}\times \lambda \). We are going to deduce these properties of F from the corresponding properties of the induced transformation \(F_Y\).

In the proof of part (2) below we will see that the density \(\frac{d\mu }{d\lambda }\) is bounded away from zero. Hence, \(\lambda \ll \mu \). Combining Lemma 3.2 with Maharam’s Recurrence Theorem gives that F is conservative with respect to \({\mathbb {P}} \times \mu \) and thus also with respect to \({\mathbb {P}} \times \lambda \). Furthermore, from the ergodicity of \(F_Y\) with respect to \({\mathbb {P}}_C \times \lambda _J\) it follows by Lemma 3.2 combined with [1, Proposition 1.5.2(2)] that F is ergodic with respect to \({\mathbb {P}} \times \lambda \).

  2. (2)

    For the density \(\psi := \frac{d\mu }{d\lambda }\) it holds that \(\psi |_J = \varphi \). Since we can take \(\kappa \) in the definition of J as large as we want, \(\psi \) is locally Lipschitz on (0, c) and (c, 1). Moreover, it is a fixed point of the Perron–Frobenius operator from (2.6) and thus for all \(x \in [0,1]\),

$$\begin{aligned} \psi (x) = \mathcal {P}_F^\kappa \psi (x) \ge p_g^\kappa \frac{\varphi (T_g^{-\kappa }x)}{|DT_g^\kappa (T_g^{-\kappa }x)|}. \end{aligned}$$
(3.28)
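To see where the bound (3.28) comes from: assuming the Perron–Frobenius operator from (2.6) takes the usual transfer-operator form, a single application of \(\mathcal {P}_F\) gives

$$\begin{aligned} \mathcal {P}_F \psi (x) = \sum _{j \in \Sigma } p_j \sum _{y \in T_j^{-1}\{x\}} \frac{\psi (y)}{|DT_j(y)|} \ge p_g \frac{\psi (T_g^{-1}x)}{|DT_g(T_g^{-1}x)|} \end{aligned}$$

for any fixed choice of inverse branch \(T_g^{-1}\). Iterating \(\kappa \) times along this branch, applying the chain rule to the derivatives and using \(\psi |_J = \varphi \) at the resulting preimage then yields (3.28).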

From Lemma 3.5 we conclude that \(\psi \) is bounded from below by some constant \(C>0\). It remains to show that \(\psi \) is not in \(L^q\) for any \(q > 1\). To see this, fix a \(b \in \Sigma _B\). Since \(\psi \) is bounded from below by \(C > 0\), we have for all \(k \in \mathbb {Z}_{\ge 0}\) and \(x \in [0,1]\) that

$$\begin{aligned} \psi (x) = \mathcal {P}_F^{k+1} \psi (x) \ge C \cdot p_g p_b^k \sum _{y \in (T_g T_b^k)^{-1}\{x\}} \frac{1}{|D(T_g T_b^k)(y)|}. \end{aligned}$$
(3.29)

Let \(\ell _b,M_b,r_g,M_g,K_g\) be as in (B3) and (G3). From (B3), (G3) and Lemma 2.4 we get

$$\begin{aligned} |D(T_gT_b^k)(y)|= & {} |DT_g (T_b^k(y))| \prod _{i=1}^{k} |DT_b (T_b^{k-i}(y))|\nonumber \\\le & {} M_g |T_b^k(y)-c|^{r_g-1} \prod _{i=0}^{k-1} (M_b|T_b^i(y)-c|^{\ell _b-1})\nonumber \\\le & {} M_g M_b^k ({\tilde{M}}|y-c|)^{\ell _b^k(r_g-1)} \prod _{i=0}^{k-1} ({\tilde{M}} |y-c|)^{\ell _b^i(\ell _b-1)} \nonumber \\= & {} K_1 |y-c|^{\ell _b^k r_g-1}, \end{aligned}$$
(3.30)

for the positive constant \(K_1 = M_g M_b^k {\tilde{M}}^{\ell _b^k r_g-1}\). On the other hand, from (G3) we obtain for any \(y \in (T_g T_b^k)^{-1}\{x\}\) as in the proof of Lemma 2.4 that

$$\begin{aligned} |x-T_g(c)| = |T_gT_b^k(y)-T_g(c)| \ge \frac{K_g}{r_g} |T_b^k(y)-c|^{r_g}\end{aligned}$$

and then Lemma 2.4 yields

$$\begin{aligned} |x-T_g(c)| \ge K_2 |y-c|^{\ell _b^k r_g} \end{aligned}$$
(3.31)

for the positive constant \(K_2 = \frac{K_g}{r_g} {\tilde{K}}^{\ell _b^k r_g}\). Now for any \(q > 1\) we can choose \(k \in \mathbb {Z}_{\ge 0}\) large enough so that \(\tau := (1-\ell _b^{-k} r_g^{-1}) q \ge 1\). Combining (3.29), (3.30) and (3.31) we obtain

$$\begin{aligned} \begin{aligned} \psi ^q(x) \ge \&\Big ( \frac{C p_g p_b^k}{K_1} \Big )^q \Big ( \sum _{y \in (T_g T_b^k)^{-1}\{x\}} |y-c|^{1-\ell _b^k r_g} \Big )^q\\ \ge \&K_3 |x-T_g(c)|^{-\tau } \end{aligned}\end{aligned}$$

for a positive constant \(K_3\). Since \(\tau \ge 1\), the function \(x \mapsto |x-T_g(c)|^{-\tau }\) is not integrable on any neighbourhood of \(T_g(c)\), so \(\psi \notin L^q\). This gives the result. \(\square \)

Remark 3.1

The result from Theorem 1.2 still holds if we allow the critical order \(\ell _b\) from (B3) to be equal to 1 for some b, as long as \(\ell _{\max } > 1\). To see this, note that in the proof of Theorem 1.2 condition (B3) only plays a role in proving that \(\frac{d\mu _{{\mathbf {p}}}}{d\lambda } \not \in L^q\) for any \(q > 1\). Here we refer to Lemma 2.4 and the constants \(\tilde{K}\) and \(\tilde{M}\), which are not well defined if \(\ell _{\min } = 1\). In (3.30) however, we use the estimates from Lemma 2.4 only for one arbitrary fixed \(b \in \Sigma _B\). By the same reasoning as in the proof of Lemma 2.4 it follows that

$$\begin{aligned} \left( \Big (\frac{K_b}{\ell _b}\Big )^{\frac{1}{\ell _b-1}} |x-c| \right) ^{\ell _b^n} \le |T_b^n(x)-c| \le \left( \Big (\frac{M_b}{\ell _b}\Big )^{\frac{1}{\ell _b-1}} |x-c|\right) ^{\ell _b^n} \end{aligned}$$
(3.32)

for any \(b \in \Sigma _B\) with \(\ell _b>1\). Hence, if there exists at least one \(b \in \Sigma _B\) with \(\ell _b>1\), then we can replace the bounds obtained from Lemma 2.4 in (3.30) and (3.31) by constants \(K_1 = M_g M_b^k (\frac{M_b}{\ell _b})^{(\ell _b^k r_g - 1)/(\ell _b-1)}\) and \(K_2 = \frac{K_g}{r_g}(\frac{K_b}{\ell _b})^{\ell _b^k r_g /(\ell _b-1)}\) and obtain the same result. In case \(\ell _{\max } = 1\), most parts of Theorem 1.2 remain valid, with the exception that we can then only say that \(\frac{d\mu _{{\mathbf {p}}}}{d\lambda } \not \in L^q\) if \(q \ge \frac{r_{\max }}{r_{\max }-1}\). This follows from the above reasoning by taking \(k=0\) in the definition of \(\tau \) in the proof of Theorem 1.2 and by noting that \(\tau = (1-r_{\max }^{-1})q \ge 1\) if \(q \ge \frac{r_{\max }}{r_{\max }-1}\).

4 Estimates on the acs Measure

In this section we prove Theorem 1.3. Recall the definition of \(\theta \) from Theorem 1.3:

$$\begin{aligned} \theta = \sum _{b \in \Sigma _B} p_b \ell _b.\end{aligned}$$
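To fix ideas with a hypothetical special case: if there is a single bad map \(b\) with critical order \(\ell _b = 2\), applied with probability \(p_b\), then \(\theta = 2p_b\), so by Theorem 1.3 the acs measure \(\mu _{{\mathbf {p}}}\) is finite precisely when \(p_b < \frac{1}{2}\), regardless of how the remaining probability \(1-p_b\) is distributed over the good maps.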

4.1 The case \(\theta \ge 1\)

To prove one direction of Theorem 1.3, namely that the unique acs measure \(\mu \) from Theorem 1.2 is infinite if \(\theta \ge 1\), we introduce another induced transformation.

Proposition 4.1

Suppose \(\theta \ge 1\). Then the unique acs measure \(\mu \) from Theorem 1.2 is infinite.

Proof

Fix a \(b \in \Sigma _B\). Recall the definition of \(\tilde{M}\) from Lemma 2.4 and of \(\delta \) from the discussion in and below the proof of Lemma 2.4, and set \(\gamma = \min \{\delta ,\frac{1}{2}\tilde{M}^{-1}\}\). Let \(a \in [c-\gamma ,c)\). Then there exists a \(\xi \in (a,c)\) such that \(T_b(a) > \xi \) and \(T_b^2(a) > \xi \). Take \([bb] \times (a, \xi )\) as the inducing domain and let

$$\begin{aligned} \kappa (\omega ,x) = \inf \{k \in \mathbb {N}: F^k(\omega ,x) \in [bb] \times (a,\xi )\} \end{aligned}$$
(4.1)

be the first return time to \([bb] \times (a,\xi )\) under F. If \({\mathbb {P}} \times \mu ([bb] \times (a,\xi ))=\infty \), there is nothing left to prove. If not, then we compute \(\int _{[bb] \times (a, \xi )} \kappa \, d\mathbb {P} \times \mu \) and use Kac’s Formula from Lemma 2.3 to prove the result.
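Recall that, in the form we use it, Kac’s Formula for a conservative ergodic system (Lemma 2.3) states that the first return time integrates over the inducing domain to the total mass,

$$\begin{aligned} \int _{[bb] \times (a, \xi )} \kappa \, d\mathbb {P} \times \mu = \mathbb {P} \times \mu \big (\Sigma ^{\mathbb {N}} \times [0,1]\big ) = \mu ([0,1]), \end{aligned}$$

so showing that the left-hand side is infinite indeed proves that \(\mu \) is an infinite measure.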

So, assume that \({\mathbb {P}} \times \mu ([bb] \times (a,\xi )) < \infty \). The conditions \(T_b(a) > \xi \) and \(T_b^2(a) > \xi \), together with the fact that any bad map has c as a fixed point and is strictly monotone on the intervals [0, c] and [c, 1], guarantee that for each \(n \in \mathbb {N}\) and \(\omega \in \Sigma _B^{\mathbb {N}} \cap [bb]\) we get

$$\begin{aligned} T_{\omega }^n((a, \xi )) \cap (a, \xi ) = \emptyset . \end{aligned}$$
(4.2)

For any \(\omega \in [bb]\) and \(x \in (a,\xi )\) it follows by (4.2) and (2.16) that \(T_\omega ^n(x)\) can only return to \((a,\xi )\) after at least one application of a good map. Assume that \(\omega \in [bb]\) is of the form

$$\begin{aligned} \omega = (b, b, \omega _3, \omega _4, \ldots , \omega _n, g, \omega _{n+2}, \ldots ),\end{aligned}$$

with \(n \ge 2\), \(\omega _i \in \Sigma _B\) for \(3 \le i \le n\) and \(g \in \Sigma _G\), and let \(x \in (a, \xi )\). Then \(\kappa (\omega ,x) \ge n+1\). Lemma 2.4 yields that

$$\begin{aligned} |T_{\omega }^n(x)-c| \le ({\tilde{M}} \gamma )^{\ell _{\omega _1} \cdots \ell _{\omega _n}} < 2^{-\ell _{\omega _1} \cdots \ell _{\omega _n}}. \end{aligned}$$
(4.3)

From (G3) and (4.3) we obtain that

$$\begin{aligned} |T_g T_{\omega }^n(x)-T_g(c)| = \left| \int _c^{T_{\omega }^n(x)} DT_g(y) \, dy \right| \le \frac{M_g}{r_g} |T_\omega ^n(x)-c|^{r_g} < \frac{M_g}{r_g} \cdot 2^{-\ell _{\omega _1} \cdots \ell _{\omega _n} r_g}. \end{aligned}$$
(4.4)
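The inequality \(|T_g T_{\omega }^n(x)-T_g(c)| \le \frac{M_g}{r_g} |T_\omega ^n(x)-c|^{r_g}\) in (4.4) is obtained as in (3.30): (G3) provides the upper bound \(|DT_g(y)| \le M_g |y-c|^{r_g-1}\), and integrating gives, for any \(z \in [0,1]\),

$$\begin{aligned} \Big | \int _c^{z} DT_g(y) \, dy \Big | \le M_g \int _0^{|z-c|} u^{r_g-1} \, du = \frac{M_g}{r_g} |z-c|^{r_g}. \end{aligned}$$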

Set

$$\begin{aligned} \zeta = \sup \{|DT_j(x)|\, :\, j \in \Sigma ,\, x \in [0,1]\}. \end{aligned}$$
(4.5)

Then \(\zeta >1\) by (G4), (B4). Assume \(\kappa (\omega ,x) = m+n\) for some \(m \ge 1\). Then \(T_\omega ^{m+n}(x) \in (a, \xi )\) so that by (G1),

$$\begin{aligned} |T_{\omega }^{m+n}(x)-T_g(c)| \ge \min \{a,1-\xi \}. \end{aligned}$$
(4.6)

Because of (4.4) this implies

$$\begin{aligned} \zeta ^{m-1} \frac{M_g}{r_g} \cdot 2^{-\ell _{\omega _1} \cdots \ell _{\omega _n} r_g} \ge \min \{a,1-\xi \}. \end{aligned}$$
(4.7)

Solving for m yields

$$\begin{aligned} m \ge K_1 + K_2 \ell _{\omega _1} \cdots \ell _{\omega _n} \end{aligned}$$
(4.8)

for constants \(K_1 = \big (1+\log \big ( \frac{\min \{ a, 1-\xi \} r_g}{M_g} \big )\big ) / \log \zeta \in {\mathbb {R}}\) and \(K _2 = \log (2^{r_g})/\log \zeta >0\). Note that \(K_1,K_2\) are independent of \(\omega , x, m\) and n.

We obtain that for any \(g \in \Sigma _G\),

$$\begin{aligned} \begin{aligned} \int _{[bb] \times (a, \xi )} \kappa \, d\mathbb {P} \times \mu \ge \,&\sum _{n \in \mathbb {N}_{\ge 2}} \sum _{\omega _3,\ldots ,\omega _n \in \Sigma _B} \mathbb {P}([bb \omega _3 \cdots \omega _n g]) \mu ((a, \xi )) \Big (n+K_1 + K_2 \ell _b^2 \prod _{i=3}^n \ell _{\omega _i}\Big ). \end{aligned}\end{aligned}$$

Since the coordinates of \(\omega \) are i.i.d. and hence the inner sum factorizes,

$$\begin{aligned} \sum _{n \in \mathbb {N}_{\ge 2}} \sum _{\omega _3,\ldots ,\omega _n \in \Sigma _B} \mathbb {P}([\omega _3 \cdots \omega _n]) \prod _{i=3}^n \ell _{\omega _i} = \sum _{n \in \mathbb {N}_{\ge 2}} \Big (\sum _{b \in \Sigma _B} p_b \ell _b\Big )^{n-2} = 1+\sum _{n \in {\mathbb {N}}} \theta ^n = \infty ,\end{aligned}$$

we get \(\int _{[bb] \times (a, \xi )} \kappa \, d\mathbb {P} \times \mu = \infty \) and from Lemma 2.3 we now conclude that \(\mu \) is infinite. \(\square \)

4.2 The case \(\theta < 1\)

For the other direction of Theorem 1.3, assume \(\theta < 1\). We first obtain a stationary probability measure \(\tilde{\mu }\) for F as in (1.5) using a standard Krylov–Bogolyubov type argument. For this, let \(\mathcal {M}\) denote the set of all finite Borel measures on [0, 1], and define the operator \(\mathcal {P}: \mathcal {M} \rightarrow \mathcal {M}\) by

$$\begin{aligned} \mathcal {P} \nu = \sum _{j \in \Sigma } p_j \nu \circ T_j^{-1}, \qquad \nu \in \mathcal {M}, \end{aligned}$$
(4.9)

where \(\nu \circ T_j^{-1}\) denotes the pushforward measure of \(\nu \) under \(T_j\). Then \(\mathcal {P}\) is a Markov–Feller operator (see e.g. [26]) with dual operator U on the space B([0, 1]) of all bounded Borel measurable functions given by \(U f = \sum _{j \in \Sigma } p_j f \circ T_j\) for \(f \in B([0,1])\). As before, let \(\lambda \) denote the Lebesgue measure on [0, 1], and set \(\lambda _n = \mathcal {P}^n \lambda \) for each \(n \ge 0\). Furthermore, for each \(n \in \mathbb {N}\) define the Cesàro mean \(\mu _n = \frac{1}{n} \sum _{k=0}^{n-1} \lambda _k\). Since the space of probability measures on [0, 1] equipped with the weak topology is sequentially compact, there exists a subsequence \((\mu _{n_k})_{k \in \mathbb {N}}\) of \((\mu _n)_{n \in \mathbb {N}}\) that converges weakly to a probability measure \(\tilde{\mu }\) on [0, 1]. Using that a Markov–Feller operator is weakly continuous, it then follows from a standard argument that \(\mathcal {P} \tilde{\mu } = \tilde{\mu }\), that is, \(\tilde{\mu }\) is a stationary probability measure for F. The next theorem will lead to the estimate (1.7) from Theorem 1.3. For any \({\mathbf {b}} = b_1 \cdots b_k \in \Sigma _B^k\), \(k \ge 0\), recall that we abbreviate \(p_{{\mathbf {b}}} = \prod _{i=1}^kp_{b_i} \) and also let \(\ell _{{\mathbf {b}}} = \prod _{i=1}^k \ell _{b_i}\), where we use \(p_{{\mathbf {b}}}=1=\ell _{{\mathbf {b}}}\) in case \(k=0\).
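The following minimal numerical sketch illustrates the Krylov–Bogolyubov construction just described: the Cesàro means \(\mu _n\) are approximated by Monte Carlo, pooling the first n points of many random orbits whose initial points are Lebesgue distributed. The maps T_good and T_bad and the probability p_bad below are purely hypothetical stand-ins (they are not claimed to satisfy (G1)–(G4) and (B1)–(B4)); only the averaging scheme itself is being illustrated.

```python
# Hedged sketch: Monte Carlo approximation of the Cesaro means
# mu_n = (1/n) sum_{k=0}^{n-1} P^k(lambda) of the operator P above.
import numpy as np

rng = np.random.default_rng(0)

def T_good(x):
    # Hypothetical stand-in "good" map: tent map, full branches, c = 1/2.
    return 1.0 - np.abs(2.0 * x - 1.0)

def T_bad(x):
    # Hypothetical stand-in "bad" map: fixes c = 1/2 with DT_bad(c) = 0.
    return 0.5 + 2.0 * (x - 0.5) ** 3

p_bad = 0.3  # hypothetical probability of applying T_bad at each step

def cesaro_sample(n_orbits=20000, n_steps=100):
    """Pool the first n_steps points of n_orbits random orbits with x0 ~ lambda;
    the pooled sample is then approximately distributed according to mu_n."""
    x = rng.uniform(size=n_orbits)
    pooled = np.empty((n_steps, n_orbits))
    for k in range(n_steps):
        pooled[k] = x
        use_bad = rng.random(n_orbits) < p_bad  # i.i.d. choice of map per orbit
        x = np.where(use_bad, T_bad(x), T_good(x))
    return pooled.ravel()

# Histogram as a crude density estimate for the Cesaro mean mu_n.
hist, _ = np.histogram(cesaro_sample(), bins=50, range=(0.0, 1.0), density=True)
print(hist.round(2))
```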

Theorem 4.1

There exists a constant \(C > 0\) such that for all \(n \in \mathbb {N}\) and all Borel sets \(A \subseteq [0,1]\) we have

$$\begin{aligned} \lambda _n(A) \le C \cdot \sum _{g \in \Sigma _G} p_g \sum _{k=0}^{\infty } \sum _{\mathbf {b} \in \Sigma _B^k} p_{\mathbf {b}} \ell _{\mathbf {b}} \cdot \lambda (A)^{\ell _{\mathbf {b}}^{-1} r_g^{-1}}. \end{aligned}$$
(4.10)
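Note that, since the letters of \({\mathbf {b}}\) are drawn independently, \(\sum _{{\mathbf {b}} \in \Sigma _B^k} p_{{\mathbf {b}}} \ell _{{\mathbf {b}}} = \theta ^k\). In particular, if \(\theta < 1\), then the right-hand side of (4.10) is finite for every Borel set A, since \(\lambda (A) \le 1\) and \(\sum _{g \in \Sigma _G} p_g \le 1\) give

$$\begin{aligned} \sum _{g \in \Sigma _G} p_g \sum _{k=0}^{\infty } \sum _{{\mathbf {b}} \in \Sigma _B^k} p_{{\mathbf {b}}} \ell _{{\mathbf {b}}} \cdot \lambda (A)^{\ell _{{\mathbf {b}}}^{-1} r_g^{-1}} \le \sum _{k=0}^{\infty } \theta ^k = \frac{1}{1-\theta }. \end{aligned}$$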

Before we prove this theorem, we first show how it gives Theorem 1.3.

Proof of Theorem 1.3

The first part of the statement follows from Proposition 4.1. For the second part, assume that \(\theta < 1\) and take Theorem 4.1, which is proven below, for granted. Let \(A \subseteq [0,1]\) be a Borel set. Using the regularity of \(\lambda \), for any \( \delta >0\) there exists an open set \(G \subseteq [0,1]\) such that \(A \subseteq G\) and \(\lambda (G) \le \lambda (A) + \delta \). Using that \((\mu _{n_k})_{k \in \mathbb {N}}\) converges weakly to \(\tilde{\mu }\), we obtain from the Portmanteau Theorem together with Theorem 4.1 that

$$\begin{aligned} \tilde{\mu }(A)&\le \tilde{\mu }(G) \le \liminf _k \mu _{n_k}(G) \nonumber \\&\le C \cdot \sum _{g \in \Sigma _G} p_g \sum _{k=0}^{\infty } \sum _{\mathbf {b} \in \Sigma _B^k} p_{\mathbf {b}} \ell _{\mathbf {b}} \cdot (\lambda (A)+\delta )^{\ell _{\mathbf {b}}^{-1} r_g^{-1}}. \end{aligned}$$
(4.11)

Since \(\theta < 1\), the sum is bounded and with the Dominated Convergence Theorem we can take the limit as \(\delta \rightarrow 0\) to obtain

$$\begin{aligned} \tilde{\mu }(A) \le C \cdot \sum _{g \in \Sigma _G} p_g \sum _{k=0}^{\infty } \sum _{\mathbf {b} \in \Sigma _B^k} p_{\mathbf {b}} \ell _{\mathbf {b}} \cdot \lambda (A)^{\ell _{\mathbf {b}}^{-1} r_g^{-1}}. \end{aligned}$$
(4.12)

This proves that \(\tilde{\mu }\) is absolutely continuous with respect to the Lebesgue measure on [0, 1]. It follows that the probability measure \(\tilde{\mu }\) is equal to the unique acs measure \(\mu _{{\mathbf {p}}}\) from Theorem 1.2. The estimate (1.7) follows directly from (4.12). \(\square \)

It remains to give the proof of Theorem 4.1. We shall do this in a number of steps.

Proposition 4.2

There exists a constant \(K_1 > 0\) such that for all \(n \in \mathbb {N}\), all \({\mathbf {u}} \in \Sigma ^n\) and all Borel sets \(A \subseteq [0,1]\) with \(0< 3\lambda (A) < \frac{1}{2} \min \{c,1-c\} \) we have

$$\begin{aligned} \lambda (T_{{\mathbf {u}}}^{-1}A) \le K_1 \big ( \lambda (T_{{\mathbf {u}}}^{-1}[0,3\eta )) + \lambda (T_{{\mathbf {u}}}^{-1}(c-3\eta ,c+3\eta )) + \lambda (T_{{\mathbf {u}}}^{-1}(1-3\eta ,1])\big ),\end{aligned}$$

where \(\eta = \lambda (A)\).

Proof

Let \(n \in \mathbb {N}\), \({\mathbf {u}} \in \Sigma ^n\) and a Borel set \(A \subseteq [0,1]\) with \(0< 3\lambda (A)< \frac{1}{2} \min \{c,1-c\} < 1\) be given and write \(\eta = \lambda (A)\). The map \(T_{{\mathbf {u}}}\) has non-positive Schwarzian derivative on any of its intervals of monotonicity (see (2.9)) and the image of any such interval is [0, c], [c, 1] or [0, 1]. Set \(A_1 = (\eta ,c-\eta )\) and \(A_2= (2\eta ,c-2\eta )\). Let I be a connected component of \(T_{{\mathbf {u}}}^{-1} A_1\), and set \(f = T_{\mathbf{u}}|_{I}\) and \(I^* = f^{-1} A_2\). The Minimum Principle yields

$$\begin{aligned} |Df(x)| \ge \min _{z \in \partial I^*} |Df(z)|, \qquad \text { for all }x \in I^*. \end{aligned}$$
(4.13)

Suppose the minimal value is attained at \(f^{-1}(2\eta )\) and set \(A_3 = (2\eta ,3\eta )\) and \(J = f^{-1} A_3\). By the condition on the size of A it follows from the Koebe Principle that

$$\begin{aligned} K^{(\eta )} |Df(f^{-1}(2\eta ))| \ge |Df(x)|, \qquad \text { for all } x \in J. \end{aligned}$$
(4.14)

Combining (4.13) and (4.14) gives

$$\begin{aligned} \begin{aligned} \lambda (f^{-1}(A \cap A_2)) =\&\int _{A \cap A_2} \frac{1}{|Df(f^{-1} y)|} \, d\lambda (y) \le \lambda (A) \cdot \frac{1}{|Df(f^{-1}(2\eta ))|} \\ \le \&K^{(\eta )} \int _{A_3} \frac{1}{|Df(f^{-1}y)|} \, d\lambda (y) = K^{(\eta )} \lambda (f^{-1}(A_3)). \end{aligned}\end{aligned}$$

We conclude that

$$\begin{aligned} \lambda \big (T_{{\mathbf {u}}}^{ -1}\big (A \cap (2\eta ,c-2\eta )\big )\big ) \le K^{(\eta )} \lambda \big ( T_{\mathbf{u}}^{-1}(2\eta ,3\eta )\big ). \end{aligned}$$
(4.15)

In case the minimum in (4.13) is attained at \(f^{-1}(c-2\eta )\), a similar reasoning yields

$$\begin{aligned} \lambda \big ( T_{{\mathbf {u}}}^{ -1}\big (A \cap (2\eta ,c-2\eta )\big )\big ) \le K^{(\eta )} \lambda \big ( T_{\mathbf{u}}^{-1}(c-3\eta ,c-2\eta )\big ).\qquad \end{aligned}$$
(4.16)

Furthermore, a similar reasoning can be done for the interval [c, 1] to conclude that

$$\begin{aligned} \lambda \big ( T_{{\mathbf {u}}}^{ -1}\big (A \cap (c+2\eta ,1-2\eta )\big )\big ) \le K^{(\eta )} \Big ( \lambda \big ( T_{{\mathbf {u}}}^{-1}(c+2\eta ,c+3\eta ) \big ) + \lambda \big ( T_{{\mathbf {u}}}^{-1}(1-3\eta ,1-2\eta ) \big ) \Big ). \end{aligned}$$

Hence, setting \(K_1 = \max \{K^{(\eta )},1\}\) gives the desired result. \(\square \)

Proposition 4.2 shows that to get the desired estimate from Theorem 4.1 it suffices to consider small intervals on the left and right of [0, 1] and around c, i.e., sets of the form

$$\begin{aligned} I_c(\varepsilon ):=(c-\varepsilon ,c+\varepsilon ) \quad \text {and} \quad I_0(\varepsilon ):=[0, \varepsilon ) \cup (1-\varepsilon ,1]\end{aligned}$$

for \(\varepsilon >0\). We first focus on estimating the measure of the intervals \(I_c(\varepsilon )\).

Lemma 4.1

There exists a constant \(K_2 \ge 1\) such that for all \(n \in \mathbb {N}\), \(\mathbf {u} \in \Sigma ^{n-1} \times \Sigma _G\) and all \(\varepsilon > 0\) we have

$$\begin{aligned} \lambda (T_{{\mathbf {u}}}^{-1} I_c(\varepsilon ))\le K_2 \varepsilon . \end{aligned}$$
(4.17)

Proof

Let \(n \in \mathbb {N}\) and \({\mathbf {u}} \in \Sigma ^{n-1} \times \Sigma _G\). Let \(\varepsilon > 0\). Suppose that \(\varepsilon \ge \frac{1}{4} \min \{c,1-c\}\). Then

$$\begin{aligned} \lambda (T_{{\mathbf {u}}}^{-1}I_c(\varepsilon )) \le 1 \le \frac{4\varepsilon }{\min \{c,1-c\}}. \end{aligned}$$
(4.18)

Now suppose \(\varepsilon < \frac{1}{4} \min \{c,1-c\}\). Again the map \(T_{{\mathbf {u}}}\) has non-positive Schwarzian derivative on the interior of any of its intervals of monotonicity and since \( u_n \in \Sigma _G\) the image of any such interval is [0, 1]. Use \(\mathcal I\) to denote the collection of connected components of \(T_{\mathbf{u}}^{-1}I_c(\varepsilon )\). Let \(A \in {\mathcal {I}}\) and write \(J = J_A\) and \(I = I_A\) for the intervals that satisfy \(A \subseteq J\), \(A \subseteq I\) and

$$\begin{aligned} \begin{aligned} T_{{\mathbf {u}}}(J) =\&\Big [c-\frac{1}{2} \min \{c,1-c\},c+\frac{1}{2} \min \{c,1-c\}\Big ],\\ T_{{\mathbf {u}}}(I) =\&\Big [c-\frac{3}{4} \min \{c,1-c\},c+\frac{3}{4} \min \{c,1-c\}\Big ]. \end{aligned}\end{aligned}$$

Also, write \(f = T_{{\mathbf {u}}}|_I\). Since f has non-positive Schwarzian derivative, it follows from (2.13) that

$$\begin{aligned} \frac{\lambda (A)}{\lambda (J)} \le K^{(\frac{1}{4})} \frac{\lambda (f(A))}{\lambda (f(J))} = K^{(\frac{1}{4})} \frac{2\varepsilon }{\min \{c,1-c\}}. \end{aligned}$$
(4.19)

We conclude that

$$\begin{aligned} \lambda (T_{{\mathbf {u}}}^{-1} I_c(\varepsilon )) = \sum _{A \in \mathcal I} \lambda (A) \le K^{(\frac{1}{4})} \frac{2\varepsilon }{\min \{c,1-c\}} \sum _{A \in {\mathcal {I}}}\lambda (J_A) \le K^{(\frac{1}{4})} \frac{2\varepsilon }{\min \{c,1-c\}}. \end{aligned}$$
(4.20)

Defining \(K_2 = \frac{2\max \{2,K^{(\frac{1}{4})}\}}{\min \{c,1-c\}} \), the desired result now follows from (4.18) and (4.20). \(\square \)

To estimate \(\lambda _n\big (I_c(\varepsilon )\big )\), first note that from Lemma 2.4 it follows that for all \(\varepsilon >0\), \(n \in {\mathbb {N}}\), \({\mathbf {u}} \in \Sigma _B^n\),

$$\begin{aligned} T_{{\mathbf {u}}}^{-1} \big ( I_c(\varepsilon ) \big ) \subseteq I_c \big ( {\tilde{K}}^{-1}\varepsilon ^{\ell _{{\mathbf {u}}}^{-1}} \big ). \end{aligned}$$
(4.21)
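Indeed, if \(T_{{\mathbf {u}}}(x) \in I_c(\varepsilon )\) for some \({\mathbf {u}} \in \Sigma _B^n\), then the lower bound from Lemma 2.4 (cf. its variant (3.32)) gives

$$\begin{aligned} ({\tilde{K}} |x-c|)^{\ell _{{\mathbf {u}}}} \le |T_{{\mathbf {u}}}(x)-c| < \varepsilon , \end{aligned}$$

so that \(|x-c| < {\tilde{K}}^{-1} \varepsilon ^{\ell _{{\mathbf {u}}}^{-1}}\), which is precisely (4.21).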

By splitting \(\Sigma ^n\) according to the final block of bad indices, we can then write using (4.21) and Lemma 4.1 that

$$\begin{aligned} \begin{aligned} \lambda _n\big (I_c(\varepsilon )\big ) =\&\sum _{k=0}^{n-1} \sum _{\mathbf {v} \in \Sigma ^{n-k-1}} \sum _{g \in \Sigma _G} \sum _{\mathbf {b} \in \Sigma _B^k} p_{\mathbf {v} g \mathbf {b}} \lambda \big (T^{-1}_{\mathbf {v}g\mathbf {b}} I_c(\varepsilon )\big ) + \sum _{\mathbf {b} \in \Sigma _B^n} p_{\mathbf {b}} \lambda \big (T_{\mathbf {b}}^{-1} I_c(\varepsilon )\big )\\ \le \&\sum _{k=0}^{n-1} \sum _{\mathbf {v} \in \Sigma ^{n-k-1}} \sum _{g \in \Sigma _G} \sum _{\mathbf {b} \in \Sigma _B^k} p_{\mathbf {v} g \mathbf {b}} \lambda \big ( T_{{\mathbf {v}} g}^{-1} I_c ({\tilde{K}}^{-1} \varepsilon ^{\ell _{{\mathbf {b}}}^{-1}}) \big ) + \sum _{\mathbf {b} \in \Sigma _B^n} p_{\mathbf {b}} \lambda \big ( I_c( {\tilde{K}}^{-1} \varepsilon ^{\ell _{{\mathbf {b}}}^{-1}})\big )\\ \le \&\sum _{k=0}^{n-1} \sum _{g \in \Sigma _G} \sum _{\mathbf {b} \in \Sigma _B^k} p_g p_{\mathbf {b}} K_2 {\tilde{K}}^{-1} \varepsilon ^{\ell _{{\mathbf {b}}}^{-1}} + \sum _{\mathbf {b} \in \Sigma _B^n} p_{{\mathbf {b}}} 2{\tilde{K}}^{-1} \varepsilon ^{\ell _{\mathbf {b}}^{-1}}. \end{aligned}\end{aligned}$$

Taking \(K_3 = \max \big \{ K_2, 2 \big ( \sum _{g \in \Sigma _G} p_g \big )^{-1} \big \} \cdot {\tilde{K}}^{-1} \ge 1\) then gives

$$\begin{aligned} \lambda _n \big (I_c(\varepsilon )\big ) \le K_3 \sum _{g \in \Sigma _G} \sum _{k=0}^n \sum _{{\mathbf {b}} \in \Sigma _B^k} p_g p_{{\mathbf {b}}} \varepsilon ^{\ell _{{\mathbf {b}}}^{-1}}. \end{aligned}$$
(4.22)

We now focus on \(I_0(\varepsilon ) = [0, \varepsilon ) \cup (1-\varepsilon ,1]\). Fix \(0< \varepsilon _0 < \frac{1}{2} \min \{ c, 1-c\}\) and \(t > 1\) that satisfy

$$\begin{aligned} |DT_j(x)| > t, \qquad \text { for all }x \in I_0(\varepsilon _0)\hbox { and each }j \in \Sigma . \end{aligned}$$
(4.23)

Such \(\varepsilon _0\) and t exist because of (G4) and (B4). From (G3) it follows that for each \(g \in \Sigma _G\) and all \(x \in [0,1]\),

$$\begin{aligned} |T_g(x)-T_g(c)| = \left| \int _c^x DT_g (y) dy \right| \ge \frac{K_g}{r_g} \cdot |x-c|^{r_g}.\end{aligned}$$

Set \(K_4 = \max \{(K_g^{-1} r_g)^{r_g^{-1}}: g \in \Sigma _G\} \ge 1\). Then (G1) implies that for each \(0< \varepsilon < \varepsilon _0\) and \(g \in \Sigma _G\),

$$\begin{aligned} T_g^{-1} I_0(\varepsilon ) \subseteq I_0(\varepsilon t^{-1}) \cup I_c (K_4 \varepsilon ^{r_g^{-1}} ). \end{aligned}$$
(4.24)
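To see (4.24), recall that the good maps have full branches, so that \(T_g(\{0,c,1\}) \subseteq \{0,1\}\). If \(x \in T_g^{-1} I_0(\varepsilon )\) and \(T_g(x)\) lies within \(\varepsilon \) of the critical value \(T_g(c)\), then the previous display gives

$$\begin{aligned} \frac{K_g}{r_g} |x-c|^{r_g} \le |T_g(x)-T_g(c)| < \varepsilon , \qquad \text {so} \qquad |x-c| < (K_g^{-1} r_g \varepsilon )^{r_g^{-1}} \le K_4 \varepsilon ^{r_g^{-1}}. \end{aligned}$$

Otherwise \(T_g(x)\) lies within \(\varepsilon \) of the other endpoint, in which case \(x\) itself lies near \(\{0,1\}\) and the expansion (4.23) forces \(x \in I_0(\varepsilon t^{-1})\).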

Furthermore, from (B1) it follows that for each \(\varepsilon \in (0,\varepsilon _0)\) and \(b \in \Sigma _B\),

$$\begin{aligned} T_b^{-1} I_0(\varepsilon ) \subseteq I_0(\varepsilon t^{-1}). \end{aligned}$$
(4.25)

Write each \(\mathbf {u} \in \Sigma ^n\) as

$$\begin{aligned} \mathbf {u} = \mathbf {b}_1 \mathbf {g}_1 \cdots \mathbf {b}_{\tilde{s}} \mathbf {g}_{\tilde{s}} \end{aligned}$$
(4.26)

for some \(\tilde{s} \in \{1,\ldots ,n\}\), where for each i we have \(\mathbf {b}_i = b_{i,1} \cdots b_{i,k_i}\in \Sigma _B^{k_i}\) and \(\mathbf {g}_i = g_{i,1} \cdots g_{i,m_i} \in \Sigma _G^{m_i}\) for some \(k_1,m_{\tilde{s}} \in \mathbb {Z}_{\ge 0}\) and \(k_2,\ldots ,k_{\tilde{s}}, m_1,\ldots ,m_{\tilde{s}-1} \in \mathbb {N}\). Define

$$\begin{aligned} s= {\left\{ \begin{array}{ll} {\tilde{s}}, &{} \text { if } m_{\tilde{s}} \ge 1,\\ \tilde{s} -1, &{} \text { if } m_{\tilde{s}} = 0 . \end{array}\right. }\end{aligned}$$

Moreover, we introduce notation to indicate the length of the tails of the block \({\mathbf {u}}\):

$$\begin{aligned} \begin{array}{ll} d_i = |\mathbf {b}_i \mathbf {g}_i \cdots \mathbf {b}_{\tilde{s}} \mathbf {g}_{\tilde{s}}|, &{} i \in \{1,\ldots ,\tilde{s}\}, \\ q_{i,j} = |g_{i,j+1} \cdots g_{i,m_i} \mathbf {b}_{i+1} \mathbf {g}_{i+1} \cdots \mathbf {b}_{\tilde{s}} \mathbf {g}_{\tilde{s}}|, &{} i \in \{1,\ldots ,\tilde{s}\}, \, j \in \{0,\ldots ,m_i\}. \end{array}\end{aligned}$$

If necessary to avoid confusion, we write \(s(\mathbf {u})\), \(k_i(\mathbf {u})\), etcetera to emphasize the dependence on \(\mathbf {u}\).
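For instance, for the (hypothetical) word \(\mathbf {u} = b\,g\,g\,b\,g \in \Sigma ^5\) with \(b \in \Sigma _B\) and \(g \in \Sigma _G\), the decomposition (4.26) reads \(\mathbf {b}_1 = b\), \(\mathbf {g}_1 = gg\), \(\mathbf {b}_2 = b\), \(\mathbf {g}_2 = g\), so that \(\tilde{s} = s = 2\), and

$$\begin{aligned} d_1 = 5, \quad d_2 = 2, \quad (q_{1,0},q_{1,1},q_{1,2}) = (4,3,2), \quad (q_{2,0},q_{2,1}) = (1,0). \end{aligned}$$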

Lemma 4.2

There exists a constant \(K_5 > 0\) such that for each \( 0< \varepsilon < \varepsilon _0\), \(n \in \mathbb {N}\) and \(\mathbf {u} = \mathbf {b}_1 \mathbf {g}_1 \cdots \mathbf {b}_{\tilde{s}} \mathbf {g}_{\tilde{s}} \in \Sigma ^n\),

$$\begin{aligned} T_{\mathbf {u}}^{-1} I_0(\varepsilon ) \subseteq&I_0(\varepsilon t^{-d_1}) \cup \bigcup _{i=1}^{s} T^{-1}_{\mathbf {b}_1 \mathbf {g}_1 \cdots \mathbf {b}_{i-1} \mathbf {g}_{i-1}} I_c\Big (K_5(\varepsilon t^{-q_{i,1}})^{\ell _{\mathbf {b}_i}^{-1} r^{-1}_{g_{i,1}}}\Big ) \nonumber \\&\cup \bigcup _{i=1}^s \bigcup _{j=2}^{m_i} T_{\mathbf {b}_1 \mathbf {g}_1 \cdots \mathbf {b}_{i-1} \mathbf {g}_{i-1}\mathbf {b}_i g_{i,1} \cdots g_{i,j-1}}^{-1}I_c(K_5 (\varepsilon t^{-q_{i,j}})^{r_{g_{i,j}}^{-1}}). \end{aligned}$$

Proof

We prove the statement by induction on \(\tilde{s}\). Let \(\mathbf {u}\) be a word with symbols in \(\Sigma \), and write \(\mathbf {u} = \mathbf {b}_1 \mathbf {g}_1 \cdots \mathbf {b}_{\tilde{s}} \mathbf {g}_{\tilde{s}}\) for its decomposition as in (4.26). First suppose that \(\tilde{s} = 1\). If \(m_1 = 0\), then the statement immediately follows from repeated application of (4.25). If \(m_1 \ge 1\), then repeated application of (4.24) gives

$$\begin{aligned} T_{\mathbf {g}_1}^{-1} I_0(\varepsilon ) \subseteq I_0 (\varepsilon t^{-q_{1,0}}) \cup I_c\Big (K_4(\varepsilon t^{-q_{1,1}})^{r^{-1}_{g_{1,1}}}\Big ) \cup \bigcup _{j=2}^{m_1} T_{g_{1,1} \cdots g_{1,j-1}}^{-1} I_c\Big (K_4(\varepsilon t^{-q_{1,j}})^{r^{-1}_{g_{1,j}}}\Big ). \end{aligned}$$
(4.27)

By setting \(K_5 = {\tilde{K}}^{-1} K_4\), applying (4.21) and (4.25) then yields

$$\begin{aligned} T_{\mathbf {b}_1 \mathbf {g}_1}^{-1} I_0 (\varepsilon ) \subseteq I_0 (\varepsilon t^{-d_1}) \cup I_c \Big (K_5(\varepsilon t^{-q_{1,1}})^{\ell _{\mathbf {b}_1}^{-1} r^{-1}_{g_{1,1}}}\Big ) \cup \bigcup _{j=2}^{m_1} T_{\mathbf {b}_1 g_{1,1} \cdots g_{1,j-1}}^{-1} I_c\Big (K_5(\varepsilon t^{-q_{1,j}})^{r^{-1}_{g_{1,j}}}\Big ). \end{aligned}$$

Note that this is true for the case that \(k_1 = 0\) as well. This proves the statement if \(\tilde{s} = 1\). Now suppose \(\tilde{s}(\mathbf {u}) > 1\) and suppose that the statement holds for all words \(\mathbf {v}\) with \(\tilde{s}(\mathbf {v}) = \tilde{s}(\mathbf {u})-1\). In particular, the statement then holds for the word \(\mathbf {b}_2 \mathbf {g}_2 \cdots \mathbf {b}_{\tilde{s}} \mathbf {g}_{\tilde{s}}\). Note that \(m_1 \ge 1\). Again, by repeated application of (4.24) it follows that

$$\begin{aligned} T_{\mathbf {g}_1}^{-1} I_0(\varepsilon t^{-d_2}) \subseteq I_0 (\varepsilon t^{-q_{1,0}}) \cup I_c\Big (K_4(\varepsilon t^{-q_{1,1}})^{r^{-1}_{g_{1,1}}}\Big ) \cup \bigcup _{j=2}^{m_1} T_{g_{1,1} \cdots g_{1,j-1}}^{-1} I_c\Big (K_4(\varepsilon t^{-q_{1,j}})^{r^{-1}_{g_{1,j}}}\Big ).\qquad \end{aligned}$$
(4.28)

Furthermore, applying (4.21) and (4.25) then yields

$$\begin{aligned} T_{\mathbf {b}_1 \mathbf {g}_1}^{-1} I_0 (\varepsilon t^{-d_2}) \subseteq I_0 (\varepsilon t^{-d_1}) \cup I_c\Big (K_5(\varepsilon t^{-q_{1,1}})^{\ell _{\mathbf {b}_1}^{-1} r^{-1}_{g_{1,1}}} \Big ) \cup \bigcup _{j=2}^{m_1} T_{\mathbf {b}_1 g_{1,1} \cdots g_{1,j-1}}^{-1} I_c\Big (K_5(\varepsilon t^{-q_{1,j}})^{r^{-1}_{g_{1,j}}}\Big ).\end{aligned}$$

This together with the statement being true for the word \(\mathbf {b}_2 \mathbf {g}_2 \cdots \mathbf {b}_{\tilde{s}} \mathbf {g}_{\tilde{s}}\) yields the statement for \(\mathbf {u}\). \(\square \)

Combining Lemmas 4.1 and 4.2 gives

$$\begin{aligned}\lambda (T_{{\mathbf {u}}}^{-1}I_0(\varepsilon )) \le 2\varepsilon t^{-d_1} + \sum _{i=1}^s K_2 K_5 (\varepsilon t^{-q_{i,1}})^{\ell _{\mathbf {b}_i}^{-1} r^{-1}_{g_{i,1}}} + \sum _{i=1}^s \sum _{j=2}^{m_i} K_2 K_5 (\varepsilon t^{-q_{i,j}})^{r_{g_{i,j}}^{-1}}.\end{aligned}$$

Let \(r_{\max } = \max \{ r_g \, : \, g \in \Sigma _G\}\) and set \(\alpha := t^{1/r_{\max }}>1\). Then

$$\begin{aligned} \sum _{i=1}^s \sum _{j=2}^{m_i} \alpha ^{-q_{i,j}} \le \sum _{\ell =0}^{\infty } \alpha ^{-\ell } = \frac{1}{1-1/\alpha }, \end{aligned}$$

so that

$$\begin{aligned} \lambda (T_{{\mathbf {u}}}^{-1}I_0(\varepsilon ))&\le \&2 \varepsilon ^{1/r_{\max }} + K_2 K_5 \sum _{i=1}^s \sum _{j=2}^{m_i} \varepsilon ^{1/r_{\max }} \alpha ^{-q_{i,j}} + \sum _{i=1}^s K_2 K_5 (\varepsilon t^{-q_{i,1}})^{\ell _{\mathbf {b}_i}^{-1} r^{-1}_{g_{i,1}}}\nonumber \\&\le \&\bigg ( 2 + \frac{K_2K_5}{1-1/\alpha } \bigg ) \varepsilon ^{1/r_{\max }} + K_2 K_5\sum _{i=1}^s (\varepsilon t^{-q_{i,1}})^{\ell _{\mathbf {b}_i}^{-1} r^{-1}_{g_{i,1}}}. \end{aligned}$$
(4.29)

Proposition 4.3

There exists a constant \(K_6 > 0\) such that for each \(\varepsilon \in (0,\varepsilon _0)\) and \(n \in \mathbb {N}\),

$$\begin{aligned} \lambda _n (I_0(\varepsilon )) \le K_6 \sum _{g \in \Sigma _G} p_g \sum _{k=0}^{n-1} \sum _{\mathbf {b} \in \Sigma _B^k} p_{\mathbf {b}} \ell _{\mathbf {b}} \cdot \varepsilon ^{\ell _{\mathbf {b}}^{-1}r_g^{-1}}. \end{aligned}$$

Proof

Let \(n \in \mathbb {N}\). Then with (4.29) we obtain

$$\begin{aligned} \lambda _n (I_0(\varepsilon ))&= \sum _{\mathbf {u} \in \Sigma ^n} p_{\mathbf {u}} \lambda \big (T_{\mathbf {u}}^{-1} (I_0(\varepsilon ))\big ) \nonumber \\&\le \bigg ( 2 + \frac{K_2K_5}{1-1/\alpha } \bigg ) \varepsilon ^{1/r_{\max }} + K_2 K_5 \sum _{\mathbf {u} \in \Sigma ^n} p_{\mathbf {u}} \sum _{i=1}^{s(\mathbf {u})} \big (\varepsilon t^{-q_{i,1}(\mathbf {u})}\big )^{\ell ^{-1}_{\mathbf {b}_i(\mathbf {u})}r^{-1}_{g_{i,1}(\mathbf {u})}} \nonumber \\&= \bigg ( 2 + \frac{K_2K_5}{1-1/\alpha } \bigg ) \varepsilon ^{1/r_{\max }} + K_2 K_5 \sum _{i=1}^{\tau } \sum _{\mathbf {u} \in \Sigma ^n} 1_{\{1,\ldots ,s(\mathbf {u})\}}(i) p_{\mathbf {u}} \big (\varepsilon t^{-q_{i,1}(\mathbf {u})}\big )^{\ell ^{-1}_{\mathbf {b}_i(\mathbf {u})}r^{-1}_{g_{i,1}(\mathbf {u})}}, \end{aligned}$$
(4.30)

where we defined \(\tau = \lfloor \frac{n+1}{2}\rfloor \) which is the largest value \(s({\mathbf {u}})\) can take. Let us consider the second term in (4.30). First of all, note that a word \(\mathbf {u} \in \Sigma ^n\) satisfies \(s(\mathbf {u}) \ge 1\) if and only if \(m_1(\mathbf {u}) \ge 1\). Therefore,

$$\begin{aligned} \{\mathbf {u} \in \Sigma ^n: s(\mathbf {u}) \ge 1\}&= \bigcup _{k=0}^{n-1} \Sigma _B^k \times \Sigma _G \times \Sigma ^{n-k-1}. \end{aligned}$$

Hence, defining the function \(\chi \) on \(\{0,\ldots ,n-1\}^2\) by

$$\begin{aligned} \chi (k,q) = \sum _{\mathbf {b} \in \Sigma _B^k} \sum _{g \in \Sigma _G} p_{\mathbf {b}} p_g \big (\varepsilon t^{-q}\big )^{\ell _{\mathbf {b}}^{-1} r_g^{-1}}, \qquad (k,q) \in \{0,\ldots ,n-1\}^2, \end{aligned}$$
(4.31)

we can rewrite and bound the term with \(i=1\) in (4.30) as follows:

$$\begin{aligned} \sum _{\mathbf {u} \in \Sigma ^n} 1_{\{1,\ldots ,s(\mathbf {u})\}}(1) p_{\mathbf {u}} \big (\varepsilon t^{-q_{1,1}(\mathbf {u})}\big )^{\ell ^{-1}_{\mathbf {b}_i(\mathbf {u})}r^{-1}_{g_{1,1}(\mathbf {u})}}&= \sum _{k=0}^{n-1} \sum _{\mathbf {v} \in \Sigma ^{n-k-1}} p_{\mathbf {v}} \chi (k,n-k-1) \nonumber \\&\le \varepsilon ^{1/r_{\max }} + \sum _{k=1}^{n-1} \chi (k,n-k-1). \end{aligned}$$
(4.32)

Secondly, note that for each \(i \in \{2,\ldots ,\tau \}\) a word \({\mathbf {u}} \in \Sigma ^n\) satisfies \(s(\mathbf {u}) \ge i\) if and only if \(m_{i-1}(\mathbf {u}),k_i(\mathbf {u}),m_i(\mathbf {u}) \ge 1\). For each \(k \in \{1,\ldots ,n-1\}\) and \(q \in \{0,\ldots ,n-k-2\}\) and \(i \in \{2,\ldots ,\tau \}\) we define

$$\begin{aligned} A_{i,k,q} = \{ \mathbf {v} \in \Sigma ^{n-k-q-1}: \tilde{s}(\mathbf {v}) = i-1, v_{n-k-q-1} \in \Sigma _G\}. \end{aligned}$$
(4.33)

The set \(A_{i,k,q}\) contains all words of length \(n-k-q-1\) that can precede the word \({\mathbf {b}}_i \mathbf {g}_i \cdots \mathbf{b}_{\tilde{s}} {\mathbf {g}}_{\tilde{s}}\) with \(|\mathbf {b}_i| = k\) and \(|g_{i,2} \cdots g_{i,m_i} \mathbf {b}_{i+1} {\mathbf {g}}_{i+1} \cdots {\mathbf {b}}_{\tilde{s}} \mathbf {g}_{\tilde{s}}| = q\). So

$$\begin{aligned} \{\mathbf {u} \in \Sigma ^n: s(\mathbf {u}) \ge i\}&= \bigcup _{k=1}^{n-1} \bigcup _{q=0}^{n-k-2} A_{i,k,q} \times \Sigma _B^k \times \Sigma _G \times \Sigma ^q, \qquad i \in \{2, \ldots , \tau \}. \end{aligned}$$

Hence, using (4.31) we can rewrite and bound the sum in (4.30) that runs from \(i=2\) to \(\tau \) as follows:

$$\begin{aligned}&\sum _{i=2}^{\tau } \sum _{\mathbf {u} \in \Sigma ^n} 1_{\{1,\ldots ,s(\mathbf {u})\}} (i) p_{\mathbf {u}} \big (\varepsilon t^{-q_{i,1}(\mathbf {u})} \big )^{\ell ^{-1}_{\mathbf {b}_i(\mathbf {u})}r^{-1}_{g_{i,1}(\mathbf {u})}} \nonumber \\&\quad =\ \sum _{i=2}^{\tau } \sum _{k=1}^{n-1} \sum _{q=0}^{n-k-2} \sum _{\mathbf {v}_1 \in A_{i,k,q}} \sum _{\mathbf {v}_2 \in \Sigma ^{q}} p_{\mathbf {v}_1} p_{\mathbf {v}_2} \chi (k,q) \nonumber \\&\quad =\ \sum _{k=1}^{n-1}\sum _{q=0}^{n-k-2} \chi (k,q) \sum _{i=2}^{\tau } \sum _{\mathbf {v}_1 \in A_{i,k,q}} \sum _{\mathbf {v}_2 \in \Sigma ^{q}} p_{\mathbf {v}_1} p_{\mathbf {v}_2} \nonumber \\&\quad \le \sum _{k=1}^{n-1} \sum _{q=0}^{n-k-2} \chi (k,q). \end{aligned}$$
(4.34)

Here the last step follows from the fact that

$$\begin{aligned} \sum _{i=2}^{\tau } \sum _{\mathbf {v}_1 \in A_{i,k,q}} p_{\mathbf {v}_1} \le \sum _{\mathbf {v} \in \Sigma ^{n-k-q-2}} \sum _{g \in \Sigma _G} p_{{\mathbf {v}}} p_g \le 1. \end{aligned}$$
(4.35)

Combining (4.32) and (4.34) gives

$$\begin{aligned} \sum _{i=1}^{\tau } \sum _{\mathbf {u} \in \Sigma ^n} 1_{\{1,\ldots ,s(\mathbf {u})\}}(i) p_{\mathbf {u}} \big (\varepsilon t^{-q_{i,1}(\mathbf {u})}\big )^{\ell ^{-1}_{\mathbf {b}_i(\mathbf {u})}r^{-1}_{g_{i,1}(\mathbf {u})}} \le \varepsilon ^{1/r_{\max }} + \sum _{k=1}^{n-1} \sum _{q=0}^{n-k-1} \chi (k,q). \end{aligned}$$
(4.36)

Furthermore, for each \(\mathbf {b} \in \Sigma _B^k\) and \(g \in \Sigma _G\) we have, with \(r_{\max } = \max \{r_j: j \in \Sigma _G\}\) and \(\alpha = t^{1/r_{\max }}\) as before, that

$$\begin{aligned} \sum _{q=0}^{n-k-1} (t^{-q})^{\ell _{\mathbf {b}}^{-1} r_g^{-1}} \le \sum _{q=0}^{n-k-1} \big (\alpha ^{-\ell _{\mathbf {b}}^{-1}}\big )^q \le \frac{1}{1-\alpha ^{-\ell _{\mathbf {b}}^{-1}}} \le \frac{\alpha \ell _{\mathbf {b}}^{-1}}{\alpha ^{\ell _{\mathbf {b}}^{-1}}-1} \ell _{\mathbf {b}} \le \frac{\alpha }{\log (\alpha )} \ell _{\mathbf {b}}, \end{aligned}$$
(4.37)

where the last step follows from the fact that \(f(x) = \frac{x}{\alpha ^x-1}\) is a decreasing function with \(\lim _{x \downarrow 0} f(x) = \frac{1}{\log \alpha }\) (indeed, \(\alpha ^x - 1 \ge x \log \alpha \) for all \(x \ge 0\)). Hence, combining (4.30), (4.36) and (4.37) gives

$$\begin{aligned} \begin{aligned} \lambda _n (I_0(\varepsilon )) \le \&\bigg ( 2 + \frac{K_2K_5}{1-1/\alpha } + K_2 K_5 \bigg ) \varepsilon ^{1/r_{\max }} + K_2 K_5 \sum _{k=1}^{n-1} \sum _{q=0}^{n-k-1} \sum _{\mathbf {b} \in \Sigma _B^k} \sum _{g \in \Sigma _G} p_{\mathbf {b}} p_g \big (\varepsilon t^{-q}\big )^{\ell _{\mathbf {b}}^{-1} r_g^{-1}}\\ \le \&\bigg ( 2 + K_2K_5 \frac{2\alpha -1}{\alpha -1} \bigg ) \varepsilon ^{1/r_{\max }} + K_2 K_5 \sum _{k=1}^{n-1} \sum _{\mathbf {b} \in \Sigma _B^k} \sum _{g \in \Sigma _G} p_{\mathbf {b}} p_g \varepsilon ^{\ell _{\mathbf {b}}^{-1} r_g^{-1}} \frac{\alpha \ell _{{\mathbf {b}}}}{\log (\alpha )}\\ \le \&K_6 \sum _{g \in \Sigma _G} p_g \sum _{k=0}^{n-1} \sum _{\mathbf {b} \in \Sigma _B^k} p_{\mathbf {b}} \ell _{\mathbf{b}}\varepsilon ^{\ell _{\mathbf {b}}^{-1} r_g^{-1}}, \end{aligned}\end{aligned}$$

where \(K_6 = \frac{1}{\min \{ p_g \, : \, g \in \Sigma _G\}} \big ( 2 + K_2K_5 \frac{2\alpha -1}{\alpha -1} \big ) + \frac{K_2K_5 \alpha }{\log \alpha }\). \(\square \)

We are now ready to prove Theorem 4.1.

Proof of Theorem 4.1

Let \(A \subseteq [0,1]\) be a Borel set. First suppose that \(\lambda (A) \ge \frac{\varepsilon _0}{3}\). Then there exists a constant \(C = C(\frac{\varepsilon _0}{3}) > 0\) such that

$$\begin{aligned} \lambda _n(A) \le 1 \le C \sum _{g \in \Sigma _G} p_g \sum _{k=0}^{\infty } \sum _{\mathbf {b} \in \Sigma _B^k} p_{\mathbf {b}} \ell _{\mathbf {b}} \cdot \lambda (A)^{\ell _{\mathbf {b}}^{-1} r_g^{-1}}. \end{aligned}$$
(4.38)

Now suppose that \(\lambda (A) < \frac{\varepsilon _0}{3}\) and set \(\varepsilon = 3\lambda (A)\). It follows from Proposition 4.2 that for all \(n \in \mathbb {N}\) and all \({\mathbf {u}} \in \Sigma ^n\) we have

$$\begin{aligned} \lambda \big (T_{{\mathbf {u}}}^{-1} A\big ) \le K_1 \big (\lambda (T_{{\mathbf {u}}}^{-1}I_0(\varepsilon )) + \lambda (T_{{\mathbf {u}}}^{-1}I_c(\varepsilon ))\big ). \end{aligned}$$

Together with (4.22) and Proposition 4.3 this yields for all \(n \in \mathbb {N}\) that

$$\begin{aligned} \lambda _n(A) \le K_1\cdot (K_3 + K_6) \sum _{g \in \Sigma _G} p_g \sum _{k=0}^{\infty } \sum _{\mathbf {b} \in \Sigma _B^k} p_{\mathbf {b}} \ell _{\mathbf {b}} \cdot \varepsilon ^{\ell _{\mathbf {b}}^{-1} r_g^{-1}}.\end{aligned}$$

This gives the result. \(\square \)

5 Further Results and Final Remarks

5.1 Proof of Corollaries 1.1 and 1.2

In this section we prove Corollaries 1.1 and 1.2.

Proof of Corollary 1.1

We use the bound (1.7) obtained in Theorem 1.3. For convenience, we set \(\ell = \ell _{\max }\) and \(x=\lambda (A)^{1/r_{\max }}\). The asymptotic behaviour is determined by the interplay between \(\theta ^k\searrow 0\) and \(x^{1/\ell ^k}\nearrow 1\). First suppose \(\theta < x^{1/\ell }\). Then \(\lambda (A) > \theta ^{\ell r_{\max }}\), so there exists a constant \(C = C( \theta ^{\ell r_{\max }}) > 0\) such that

$$\begin{aligned} \mu _{{\mathbf {p}}}(A)\le C \cdot \frac{1}{ \log ^{\varkappa } (1/\lambda (A))}.\end{aligned}$$

Now suppose \(\theta \ge x^{1/\ell }\). Note that \(\theta ^N\ge x^{1/\ell ^N}\) if and only if

$$\begin{aligned} \log N + N \log \ell \le \log \bigg ( \frac{\log x}{\log \theta } \bigg ).\end{aligned}$$

Since \(\log N \le N\), this last inequality is satisfied if we take for example

$$\begin{aligned} N=\left\lfloor \frac{1}{1 + \log \ell }\log \bigg ( \frac{\log x}{\log \theta } \bigg ) \right\rfloor = \left\lfloor \frac{1}{1+\log \ell }\log \bigg ( \frac{\log (1/x)}{\log (1/\theta )} \bigg )\right\rfloor , \end{aligned}$$
(5.1)

where \(\lfloor y \rfloor \) denotes the largest integer not exceeding y. Taking N as in (5.1), note that it follows from \(\theta \ge x^{1/\ell }\) that \(N \ge 0\). Then \(\theta ^k\ge x^{1/\ell ^k}\) for all \(k\le N\) as well, and hence

$$\begin{aligned} \begin{aligned} \sum _{k=0}^{\infty } \theta ^k x^{1/\ell ^k}&= \sum _{k=0}^{N} \theta ^k x^{1/\ell ^k} + \sum _{k=N+1}^{\infty } \theta ^k x^{1/\ell ^k} \le \sum _{k=0}^{N} \theta ^k \cdot x^{1/\ell ^N} +\sum _{k=N+1}^{\infty } \theta ^k \cdot 1\\&\le \frac{1}{1-\theta } x^{1/\ell ^N} +\frac{\theta ^{N+1} }{1-\theta }\le \frac{1}{1-\theta } (1+\theta ) \theta ^N. \end{aligned} \end{aligned}$$

From (5.1) we see that \(N \ge \frac{1}{1+\log \ell }\log \big ( \frac{\log x}{\log \theta } \big ) -1\), thus

$$\begin{aligned} \begin{aligned} \theta ^N =\&\exp (N\log \theta ) \le \exp \bigg (\bigg (\frac{1}{1+\log \ell }\log \bigg (\frac{\log (1/x)}{\log (1/\theta )}\bigg ) -1 \bigg )\log \theta \bigg ) \\ =\&\exp \bigg ( \frac{\log \theta }{1+\log \ell }\log {\log (1/x)} + C(\ell ,\theta )\bigg )\\ =\&\overline{C}(\ell ,\theta ) \big ( \log ( 1/x)\big )^{\frac{\log \theta }{1+\log \ell }} = \overline{C}(\ell ,\theta ) \bigg ( \frac{r_{\max }}{\log (1/\lambda (A))} \bigg )^\varkappa , \end{aligned} \end{aligned}$$

where we set \(\varkappa =\frac{\log (1/\theta )}{1+\log \ell }>0\), and where \(C(\ell ,\theta ) \in \mathbb {R}\) and \(\overline{C}(\ell ,\theta ) > 0\) are constants that only depend on \(\ell \) and \(\theta \). We conclude from the bound (1.7) that

$$\begin{aligned} \mu _{{\mathbf {p}}}(A)\le K \cdot \frac{1}{ \log ^{\varkappa } (1/\lambda (A))}\end{aligned}$$

for some positive constant K. \(\square \)
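As a quick numerical sanity check of this asymptotic bound, the following sketch truncates the series \(\sum _{k \ge 0} \theta ^k x^{1/\ell ^k}\) from the proof above and compares it with \(\log ^{-\varkappa }(1/x)\); by the computation above, their ratio should stay bounded as \(x \rightarrow 0\). The parameter values theta and ell below are hypothetical, and x plays the role of \(\lambda (A)^{1/r_{\max }}\).

```python
# Hedged sketch: numerical check of the log^{-kappa} bound of Corollary 1.1
# for hypothetical parameters theta < 1 and ell = ell_max > 1.
import math

theta, ell = 0.6, 2.0
kappa = math.log(1.0 / theta) / (1.0 + math.log(ell))

def series(x, kmax=400):
    """Truncation of sum_{k>=0} theta^k * x^(1/ell^k); the neglected tail is
    O(theta^kmax), which for kmax = 400 is far below double precision."""
    return sum(theta ** k * x ** (1.0 / ell ** k) for k in range(kmax))

for e in (4, 8, 16, 32, 64):
    x = 10.0 ** (-e)  # x plays the role of lambda(A)^(1/r_max)
    ratio = series(x) * math.log(1.0 / x) ** kappa
    print(f"x = 1e-{e:02d}:  series = {series(x):.4e},  ratio = {ratio:.3f}")
```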

The proof of Corollary 1.2 consists of two steps. Firstly we show that any weak limit point of \(\mu _{{\mathbf {p}}_n}\) is a stationary measure, i.e., satisfies (1.6), and secondly that any weak limit point of \(\mu _{{\mathbf {p}}_n}\) is absolutely continuous with respect to the Lebesgue measure. The corollary then follows from the uniqueness of absolutely continuous stationary measures given by Theorem 1.2.

Proof of Corollary 1.2

For each \(n \ge 0\), let \({\mathbf {p}}_n = (p_{n,j})_{j \in \Sigma }\) be a positive probability vector such that \(\sup _{n}\sum _{b \in \Sigma _B} p_{n,b} \ell _b<1\) and assume that \(\lim _{n\rightarrow \infty }{\mathbf {p}}_n={\mathbf {p}}\) in \({\mathbb {R}}_+^N\) for some \({\mathbf {p}}= (p_j)_{j \in \Sigma }\). Let \({\tilde{\mu }}\) be a weak limit point of \(\mu _{{\mathbf {p}}_n}\). Again, note that such a \(\tilde{\mu }\) exists because the space of probability measures on [0, 1] equipped with the weak topology is sequentially compact. After passing to a subsequence we have for any continuous function \(\varphi :[0,1]\rightarrow {\mathbb {R}}\) that

$$\begin{aligned} \lim _{n \rightarrow \infty }\int _{[0,1]} \varphi \, d\mu _{{\mathbf {p}}_n} = \int _{[0,1]} \varphi \, d{\tilde{\mu }}.\end{aligned}$$

Moreover, by the stationarity of the measures \(\mu _{{\mathbf {p}}_n}\) it follows that for each \(n \ge 1\),

$$\begin{aligned} \int _{[0,1]} \varphi \, d\mu _{{\mathbf {p}}_n} =\sum _{j \in \Sigma } p_{n,j} \int _{[0,1]} \varphi \circ T_j \, d\mu _{{\mathbf {p}}_n}.\end{aligned}$$

To prove that \({\tilde{\mu }}\) is stationary for \({\mathbf {p}}\), it is sufficient to show that for each \(j \in \Sigma \),

$$\begin{aligned} \lim _{n \rightarrow \infty } p_{n,j} \int _{[0,1]}\varphi \circ T_j \, d\mu _{{\mathbf {p}}_n} = p_j \int _{[0,1]} \varphi \circ T_j \, d{\tilde{\mu }}. \end{aligned}$$
(5.2)

If \(j\in \Sigma _B\) this is obvious, since then \(\varphi \circ T_j\) is continuous. For \(j \in \Sigma _G\) the map \(\varphi \circ T_j\) might have a discontinuity at c. In this case, we let \(\varphi _\delta \) be the continuous function given by \(\varphi _\delta (x) = \varphi \circ T_j(x)\) for \(x\in [0,1]\setminus (c-\delta , c+\delta )\) and by linear interpolation on \((c-\delta , c+\delta )\). Then we have

$$\begin{aligned} \lim _{n \rightarrow \infty } \left| p_{n,j}\int _{[0,1]} \varphi _\delta \, d\mu _{{\mathbf {p}}_n}-p_{j}\int _{[0,1]}\varphi _\delta \, d\tilde{\mu }\right| =0, \end{aligned}$$

by the weak convergence and since \(p_{n,j}\rightarrow p_{j}\) as \(n \rightarrow \infty \). Also, we have

$$\begin{aligned} \left| p_{n,j} \int _{[0,1]}\varphi \circ T_j \, d\mu _{{\mathbf {p}}_n}- p_{n,j} \int _{[0,1]}\varphi _\delta \,d\mu _{{\mathbf {p}}_n}\right| \le C\mu _{\mathbf{p}_n}([c-\delta ,c+\delta ]) \rightarrow 0 \text { as } \delta \rightarrow 0, \end{aligned}$$

where the convergence is uniform in n because of (1.7). Similarly,

$$\begin{aligned} \left| p_j \int _{[0,1]}\varphi \circ T_j d\tilde{\mu }- p_j \int _{[0,1]}\varphi _\delta d\tilde{\mu }\right| \le C\tilde{\mu }([c-\delta ,c+\delta ]) \rightarrow 0 \text { as } \delta \rightarrow 0. \end{aligned}$$

The last three relations imply (5.2).

To show that \({\tilde{\mu }}\) is absolutely continuous with respect to the Lebesgue measure \(\lambda \) we proceed as in the proof of Theorem 1.3. We set \(\tilde{\theta } = \sup _{n}\sum _{b \in \Sigma _B} p_{n, b} \ell _b<1\). Let \(A \subseteq [0,1]\) be a Borel set. By Theorem 1.2 every \(\mu _{{\mathbf {p}}_n}\) satisfies (1.7), so that

$$\begin{aligned}\mu _{{\mathbf {p}}_n} (A) \le C_n \sum _{k=0}^{\infty } {\tilde{\theta }}^k \lambda (A)^{\ell _{\max }^{-k}r_{\max }^{-1}},\end{aligned}$$

where the constant \(C_n\) depends on \((\sum _{g \in \Sigma _G} p_{n,g}\big )^{-1}\) and \((\min \{ p_{n,g} \, : \, g \in \Sigma _G\})^{-1}\) (and properties of the good and bad maps themselves that are not linked to the probabilities). Since each \({\mathbf {p}}_n\), \(n \ge 0\), is a positive probability vector and \(\lim _{n \rightarrow \infty } {\mathbf {p}}_n = {\mathbf {p}}\), both these quantities can be bounded from above and \({\tilde{C}}:= \sup _n C_n < \infty \). From the weak convergence of \(\mu _{{\mathbf {p}}_n}\) to \({\tilde{\mu }}\) we obtain as in (4.12) using the Portmanteau Theorem that

$$\begin{aligned} {\tilde{\mu }}(A) \le {\tilde{C}} \sum _{k=0}^{\infty } {\tilde{\theta }}^k \lambda (A)^{\ell _{\max }^{-k}r_{\max }^{-1}}.\end{aligned}$$

Hence, \({\tilde{\mu }} \ll \lambda \). By Theorem 1.2 we know that \(\mu _{{\mathbf {p}}}\) is the unique acs probability measure for F and \({\mathbf {p}}\). So, \({\tilde{\mu }}=\mu _{{\mathbf {p}}}\). \(\square \)

5.2 The non-superattracting case

With some modifications the results from Theorem 1.2 and Theorem 1.3 can be extended to the class \(\mathfrak {B}^1 \supseteq {\mathfrak {B}}\) of bad maps whose critical order \(\ell _b\) in (B3) is allowed to equal 1. We will list the modified statements and the necessary modifications to the proofs here. Note that for each \(T \in \mathfrak {B}^1 \setminus \mathfrak B\), we have \(DT(c) \ne 0\) and, due to the Minimum Principle, \(|DT(c)| < 1\). So we consider \(T_1, \ldots , T_N \in {\mathfrak {G}} \cup {\mathfrak {B}}^1\) with \(\Sigma _B^1 = \{1 \le j \le N: T_j \in {\mathfrak {B}}^1\}\) and \(\Sigma _G\), \(\Sigma _B\) as before and such that \(\Sigma _G, \Sigma _B^1 \backslash \Sigma _B \ne \emptyset \). Furthermore, we write again \(\Sigma = \{1,\ldots ,N\} = \Sigma _G \cup \Sigma _B^1\).

Theorem 5.1

Let \(\{T_j: j \in \Sigma \}\) be as above and \({\mathbf {p}} = (p_j)_{j \in \Sigma }\) a positive probability vector.

  1. (1)

    There exists a unique (up to scalar multiplication) stationary \(\sigma \)-finite measure \(\mu _{{\mathbf {p}}}\) for F that is absolutely continuous with respect to the one-dimensional Lebesgue measure \(\lambda \). This measure is ergodic and the density \(\frac{d\mu _{{\mathbf {p}}}}{d\lambda }\) is bounded away from zero and is locally Lipschitz on (0, c) and (c, 1).

  2. (2)

    Suppose \(\ell _{\max } > 1\).

    1. (i)

      The measure \(\mu _{{\mathbf {p}}}\) is finite if and only if \(\theta = \sum _{b \in \Sigma _B^1} p_b \ell _b <1\). In this case, for each \(\hat{\theta } \in (\theta ,1)\) there exists a constant \(C(\hat{\theta }) > 0\) such that

      $$\begin{aligned} \mu _{{\mathbf {p}}}(A) \le C(\hat{\theta }) \cdot \sum _{k=0}^{\infty } \hat{\theta }^k \lambda (A)^{\ell _{\max }^{-k} r_{\max }^{-1}} \end{aligned}$$
      (5.3)

      for any Borel set \(A \subseteq [0,1]\), where \(r_{\max } = \max \{r_g: g \in \Sigma _G\}\) and \(\ell _{\max } = \max \{\ell _b: b \in \Sigma _B\}\).

    2. (ii)

      The density \(\frac{d\mu _{{\mathbf {p}}}}{d\lambda }\) is not in \(L^q\) for any \(q > 1\).

  3. (3)

    Suppose \(\ell _{\max } = 1\).

    1. (i)

      The measure \(\mu _{{\mathbf {p}}}\) is finite, and for each \(\varvec{\eta }= (\eta _b)_{b \in \Sigma _B^1}\) such that \(\eta _b > 1\) for each \(b \in \Sigma _B^1\) and \(\hat{\theta }(\varvec{\eta }) = \sum _{b \in \Sigma _B^1 } p_b \eta _b < 1\) there exists a constant \(C(\varvec{\eta }) > 0\) such that

      $$\begin{aligned} \mu _{{\mathbf {p}}}(A) \le C(\varvec{\eta }) \cdot \sum _{k=0}^{\infty } {\hat{\theta }}(\varvec{\eta })^k \lambda (A)^{\eta _{\max }^{-k} r_{\max }^{-1}} \end{aligned}$$
      (5.4)

      for any Borel set \(A \subseteq [0,1]\), where \(\eta _{\max } = \max \{\eta _b: b \in \Sigma _B^1\}\). If \(\sum _{b \in \Sigma _B^1} \frac{p_b}{|DT_b(c)|} < 1\), that is, if the bad maps are expanding on average at the point c, then we can get the estimate

      $$\begin{aligned} \mu _{{\mathbf {p}}}(A) \le C \cdot \lambda (A)^{r_{\max }^{-1}} \end{aligned}$$
      (5.5)

      for some constant \(C > 0\) and any Borel set \(A \subseteq [0,1]\).

    2. (ii)

      If \(r_{\max } > 1\), then \(\frac{d\mu _{{\mathbf {p}}}}{d\lambda } \not \in L^q\) for any \(q \ge \frac{r_{\max }}{r_{\max }-1}\). If, moreover, \(\sum _{b \in \Sigma _B^1} \frac{p_b}{|DT_b(c)|} < 1\), then \(\frac{d\mu _{{\mathbf {p}}}}{d\lambda } \in L^q\) for all \(1 \le q < \frac{r_{\max }}{r_{\max }-1}\).

    3. (iii)

      If \(r_{\max } =1\) and \(\sum _{b \in \Sigma _B^1} \frac{p_b}{|DT_b(c)|} < 1\), then \(\frac{d\mu _{{\mathbf {p}}}}{d\lambda } \in L^\infty \).

The main issue we need to deal with in order to get Theorem 5.1 is adapting Lemma 2.4, i.e., finding suitable bounds for \(|T_{\omega }^n(x)-c|\), since the constants \({\tilde{K}}\) and \({\tilde{M}}\) from Lemma 2.4 are not well defined in case \(\ell _{\min } = 1\). This is done in the next two lemmata. For the upper bound of \(|T_{\omega }^n(x)-c|\) we assume \(\ell _{\max } > 1\) since we only need it for the proof of part (2)(i).

Lemma 5.1

Let \(\{T_j: j \in \Sigma \}\) be as above. Suppose \(\ell _{\max } > 1\). There are constants \(\hat{M} > 1\) and \(\delta > 0\) such that for all \(n \in \mathbb {N}\), \(\omega \in (\Sigma _B^1)^{\mathbb {N}}\) and \(x \in [c-\delta ,c+\delta ]\) we have

$$\begin{aligned} |T_{\omega }^n(x)-c| \le \Big (\hat{M} |x-c|\Big )^{\ell _{\omega _1} \cdots \ell _{\omega _n}}. \end{aligned}$$
(5.6)

Proof

Similarly to the proof of Lemma 2.4, it follows that there exists an \(M >1\) such that for any \(b \in \Sigma _B\) and \(x \in [0,1]\) we have

$$\begin{aligned} |T_b(x)-c| \le M |x-c|^{\ell _b}. \end{aligned}$$
(5.7)

Furthermore, there exists a \(\delta > 0\) such that \(|DT_b(x)| < 1\) for all \(x \in [c-\delta ,c+\delta ]\) and \(b \in \Sigma _B^1\). This implies

$$\begin{aligned} |T_b(x)-c| < |x-c| \end{aligned}$$
(5.8)

for all \(x \in [c-\delta ,c+\delta ]\) and \(b \in \Sigma _B^1\). Note that \(\Sigma _B \ne \emptyset \) because \(\ell _{\max } > 1\). We set \(\upsilon = \min \{\ell _b: b \in \Sigma _B\} > 1\) and \(\hat{M} = M^{\frac{1}{\upsilon -1}}\). For each \(n \in \mathbb {N}\) and \(\omega \in (\Sigma _B^1)^{\mathbb {N}}\), write

$$\begin{aligned} m(n,\omega ) = \#\{1 \le i\le n \, :\, \ell _{\omega _i} > 1\}. \end{aligned}$$
(5.9)

The statement then follows, since \(M^{(1-\upsilon ^{-m(n,\omega )})/(\upsilon -1)} \le M^{1/(\upsilon -1)} = \hat{M}\), by showing that for all \(n \in \mathbb {N}\), \(\omega \in (\Sigma _B^1)^{\mathbb {N}}\) and \(x \in [c-\delta ,c+\delta ]\) we have

$$\begin{aligned} |T_{\omega }^n(x)-c| \le \Big (M^{(1-\upsilon ^{-m(n,\omega )})/(\upsilon -1)}|x-c|\Big )^{\ell _{\omega _1} \cdots \ell _{\omega _n}}. \end{aligned}$$
(5.10)

We prove (5.10) by induction. From (5.7) and (5.8) it follows that (5.10) holds for \(n=1\). Now suppose (5.10) holds for some \(n \in \mathbb {N}\). Let \(\omega \in (\Sigma _B^1)^{\mathbb {N}}\) and \(y \in [c-\delta ,c+\delta ]\). If \(\ell _{\omega _{n+1}} = 1\), then the desired result follows by applying (5.8) with \(b = \omega _{n+1}\) and \(x = T_{\omega }^n(y)\). Suppose \(\ell _{\omega _{n+1}} > 1\). Then, using (5.7),

$$\begin{aligned} |T_{\omega }^{n+1}(y) -c|&\le M |T_{\omega }^n(y)-c|^{\ell _{\omega _{n+1}}} \\&\le \Big (M^{(1-\upsilon ^{-m(n,\omega )})/(\upsilon -1)+\upsilon ^{-m(n+1,\omega )}}|y-c|\Big )^{\ell _{\omega _1} \cdots \ell _{\omega _{n+1}}}. \end{aligned}$$

Here the extra factor M has been absorbed into the base, using that \(\ell _{\omega _1} \cdots \ell _{\omega _{n+1}} \ge \upsilon ^{m(n+1,\omega )}\). Using that

$$\begin{aligned} \upsilon ^{-m(n+1,\omega )} = \frac{\upsilon ^{-m(n,\omega )}-\upsilon ^{-m(n+1,\omega )}}{\upsilon -1}, \end{aligned}$$
(5.11)

the desired result follows. \(\square \)

Lemma 5.2

Let \(\{T_j: j \in \Sigma \}\) be as above. Let \(\varvec{\eta } = (\eta _b)_{b \in \Sigma _B^1}\) be a vector such that \(\eta _b > 1\) for each \(b \in \Sigma _B^1\). Set \(\hat{\eta }_b = \max \{\eta _b,\ell _b\}\) for each \(b \in \Sigma _B^1\). Then there exists a constant \(\hat{K}(\varvec{\eta }) \in (0,1)\) such that for all \(n \in \mathbb {N}\), \(\omega \in (\Sigma _B^1)^{\mathbb {N}}\) and \(x \in [0,1]\) we have

$$\begin{aligned} \Big (\hat{K}(\varvec{\eta })|x-c|\Big )^{\hat{\eta }_{\omega _1} \cdots \hat{\eta }_{\omega _n}} \le |T_{\omega }^n(x)-c|. \end{aligned}$$
(5.12)

Proof

Note from (B3) that for each \(b \in \Sigma _B^1\) we have

$$\begin{aligned} K_b |x-c|^{\hat{\eta }_b-1} \le K_b |x-c|^{\ell _b-1} \le |DT_b(x)|. \end{aligned}$$

The result now follows in the same way as in the proof of Lemma 2.4 by setting \(\hat{\eta }_{\min } = \min \{\hat{\eta }_b : b \in \Sigma _B^1\}\), \(\hat{\eta }_{\max } = \max \{\hat{\eta }_b : b \in \Sigma _B^1\}\) and \(\hat{K}(\varvec{\eta }) = \big ( \frac{\min \{ K_b \, : \, b \in \Sigma _B^1 \}}{\hat{\eta }_{\max }}\big )^ \frac{1}{\hat{\eta }_{\min }-1}\). \(\square \)

Proof of Theorem 5.1

Firstly, note that (1), (2)(ii) and the first part of (3)(ii) immediately follow from Remark 3.1. Moreover, as in [17, Section 5.4] it can be shown that (5.5) implies that \(\frac{d\mu _{{\mathbf {p}}}}{d\lambda }\) is in \(L^q\) if \(r_{\max } > 1\) and \(1 \le q < \frac{r_{\max }}{r_{\max }-1}\), giving the remainder of (3)(ii). It is immediate that (5.5) implies that \(\frac{d\mu _{{\mathbf {p}}}}{d\lambda }\) is in \(L^{\infty }\) if \(r_{\max } = 1\), so (3)(iii) holds. Hence, it remains to prove (2)(i) and (3)(i).

Suppose \(\theta = \sum _{b \in \Sigma _B^1} p_b \ell _b \ge 1\), which implies \(\ell _{\max } > 1\). The proof that in this case \(\mu _{\mathbf {p}}\) is infinite follows by the same reasoning as in Sect. 4.1, now taking \(\gamma = \min \{\delta ,\frac{1}{2}\hat{M}^{-1}\}\) with \(\delta \) and \(\hat{M}\) as in the proof of Lemma 5.1. Now suppose \(\theta < 1\). Let \(\varvec{\eta } = (\eta _b)_{b \in \Sigma _B^1}\) be a vector such that \(\eta _b > 1\) for each \(b \in \Sigma _B^1\) and \(\hat{\theta }(\varvec{\eta }) = \sum _{b \in \Sigma _B^1} p_b \hat{\eta }_b < 1\), where again \(\hat{\eta }_b = \max \{\eta _b,\ell _b\}\). Applying Lemma 5.2 yields that for all \(\varepsilon >0\), \(n \in {\mathbb {N}}\) and \({\mathbf {b}} \in (\Sigma _B^1)^n\),

$$\begin{aligned} T_{{\mathbf {b}}}^{-1} \big ( I_c(\varepsilon ) \big ) \subseteq I_c \big ( \hat{K}(\varvec{\eta })^{-1}\varepsilon ^{\hat{\eta }_{\mathbf{b}}^{-1}} \big ), \end{aligned}$$
(5.13)

where we used the notation \(\hat{\eta }_{{\mathbf {b}}} = \hat{\eta }_{b_1} \cdots \hat{\eta }_{b_n}\) for a word \({\mathbf {b}} = b_1 \cdots b_n\). Indeed, if \(x \in T_{{\mathbf {b}}}^{-1}(I_c(\varepsilon ))\), then (5.12) gives \((\hat{K}(\varvec{\eta })|x-c|)^{\hat{\eta }_{{\mathbf {b}}}} \le |T_{{\mathbf {b}}}(x)-c| < \varepsilon \), so \(|x-c| < \hat{K}(\varvec{\eta })^{-1}\varepsilon ^{\hat{\eta }_{{\mathbf {b}}}^{-1}}\). Following the line of reasoning in Sect. 4.2 with (5.13) instead of (4.21), we obtain that there exists a constant \(C(\varvec{\eta }) > 0\) such that

$$\begin{aligned} \mu _{{\mathbf {p}}}(A) \le C(\varvec{\eta }) \cdot \sum _{k=0}^{\infty } {\hat{\theta }}(\varvec{\eta })^k \lambda (A)^{{\hat{\eta }}_{\max }^{-k} r_{\max }^{-1}} \end{aligned}$$
(5.14)

for any Borel set \(A \subseteq [0,1]\). If \(\ell _{\max } > 1\), we can choose \(\varvec{\eta }\) such that \(\hat{\eta }_{\max } = \ell _{\max }\) and \(\hat{\theta }(\varvec{\eta }) -\theta \) is arbitrarily small, which yields (2)(i). If \(\ell _{\max } = 1\), then \(\hat{\eta }_{\max } = \eta _{\max }\), and together with (5.14) this yields the first part of (3)(i).
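The content of (5.14) is that the upper bound is finite whenever \(\hat{\theta }(\varvec{\eta }) < 1\) and tends to 0 as \(\lambda (A) \rightarrow 0\), giving absolute continuity, although the decay is very slow because the exponents \(\hat{\eta }_{\max }^{-k} r_{\max }^{-1}\) shrink geometrically in k. The short sketch below evaluates a truncation of the series for illustrative parameter values (not taken from the paper).

```python
# Numerical look at the series bound (5.14). For hat_theta < 1 the series
# converges, and it tends to 0 (slowly) as lambda(A) -> 0.
# All parameter values are illustrative.

def series_bound(lam_A, theta_hat=0.7, eta_max=2.0, r_max=2.0, terms=200):
    """Truncation of sum_k theta_hat^k * lam_A^(eta_max^(-k) / r_max)."""
    return sum(theta_hat**k * lam_A ** (eta_max ** (-k) / r_max)
               for k in range(terms))

for lam_A in (1e-2, 1e-6, 1e-12, 1e-24):
    print(f"lambda(A) = {lam_A:6.0e}  ->  bound ~ {series_bound(lam_A):.4f}")
```

The slow decay of this bound as \(\lambda (A) \rightarrow 0\) is consistent with the weak regularity of the density in this regime.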

Finally, for the second part of (3)(i), suppose \(\ell _{\max } = 1\) and \(\Lambda = \sum _{b \in \Sigma _B^1} \frac{p_b}{|DT_b(c)|} < 1\). Setting \(K_{\mathbf {b}} = |DT_{\mathbf {b}}(c)|\) for each \(\mathbf {b} \in (\Sigma _B^1)^n\) and \(n \in \mathbb {N}\), note that for all \(\varepsilon > 0\), \(n \in \mathbb {N}\), \(\mathbf {b} \in (\Sigma _B^1)^n\),

$$\begin{aligned} T_{{\mathbf {b}}}^{-1} \big ( I_c(\varepsilon ) \big ) \subseteq I_c \big ( K_{\mathbf {b}}^{-1} \varepsilon \big ). \end{aligned}$$
(5.15)

By using (5.15) instead of (4.21), letting \(\tilde{p}_{{\mathbf {b}}} = K_{{\mathbf {b}}}^{-1} p_{{\mathbf {b}}}\) play the role of \(p_{{\mathbf {b}}}\) in the reasoning of Sect. 4.2 and noting that \(\Lambda ^k = \sum _{{\mathbf {b}} \in (\Sigma _B^1)^k} \tilde{p}_{{\mathbf {b}}}\), we arrive, similarly as in the proof of Theorem 4.1, at the conclusion that there exists a constant \(\tilde{C} > 0\) such that for all \(n \in \mathbb {N}\) and all Borel sets \(A \subseteq [0,1]\),

$$\begin{aligned} \lambda _n(A) \le \tilde{C} \cdot \sum _{g \in \Sigma _G} p_g \Big (\sum _{k=0}^{\infty } \Lambda ^k\Big ) \lambda (A)^{r_g^{-1}}. \end{aligned}$$
(5.16)

Since \(\Lambda < 1\), the series \(\sum _{k=0}^{\infty } \Lambda ^k = \frac{1}{1-\Lambda }\) converges, so (5.16) proves the remaining part of (3)(i). \(\square \)

5.3 Final remarks

The results from Theorem 5.1 give one possible extension of our main results to another set of conditions (G1)–(G4), (B1)–(B4). In this section we discuss some questions raised by our main results in this respect, namely whether some of the conditions (G1)–(G4), (B1)–(B4) can be relaxed, as well as other possible future extensions.

A condition that plays a fundamental role in the proofs of Theorems 1.2 and 1.3 is that the critical point is mapped onto a common repelling fixed point of all maps \(T_j\). We considered whether this condition can be relaxed, for instance by allowing the branches of one of the good maps not to be full. However, in this case the critical values of the random system are no longer just 0, c, 1, but include all values of all possible postcritical orbits of c. This has several consequences:

  • An invariant density (if it exists) clearly cannot be locally Lipschitz on (0, c) and (c, 1).

  • Proposition 4.2 and all subsequent arguments fail, since it is not sufficient to restrict to neighbourhoods of only 0, c and 1. One might try to resolve this issue by requiring that the postcritical orbits ‘gain enough expansion’, as was done for instance in [32] for deterministic maps. An analogous condition for random systems, however, would be much stronger, since it would have to hold for all possible random orbits of c.

  • The argument using Kac’s Lemma might fail, because in that case there exist words \({\mathbf {u}}\) with symbols in \(\Sigma \) and neighbourhoods U of c such that \(T_{\mathbf {u}}(x)\) is bounded away from zero and one uniformly in \(x \in U\).

The dynamical behaviour of the system is governed by the interplay between the superexponential convergence at c and the exponential divergence from 0 and 1. In this article we fixed the exponential divergence from 0 and 1, so the two regimes \(\theta < 1\) and \(\theta \ge 1\) in Theorem 1.3 refer only to the convergence at c: for smaller \(\theta \), orbits are less strongly attracted to c. It would be interesting to see under what other conditions on the rates of convergence to c and divergence from 0 and 1 the system admits an acs measure. Could one for example

- take exponential convergence to c and polynomial divergence from 0 and 1, or

- replace the conditions (G4) and (B4), which state that all good and bad maps are expanding at 0 and 1, by the condition that the random system is expanding on average on a sufficiently large neighbourhood of 0 and 1?

There are also some additional questions that our main results raise. It would be interesting, for example, to study further statistical properties of the random system, such as mixing properties and, if possible, mixing rates in case the acs measure is finite. It is not clear a priori whether the behaviour of the good maps dominates the statistical properties of the random system, since trajectories spend long periods of time near the points 0, c and 1. In this respect the dynamics resembles that of the Manneville–Pomeau maps, and mixing rates might be polynomial rather than exponential. One way to approach this problem is to estimate the measures \({\mathbb {P}}\times \lambda (\{\varphi _Y>n\})\), where \(\varphi _Y\) is the first return time to Y defined in Sect. 3, as they give information on the rates of decay of correlations. To obtain the desired decay rates it suffices to estimate \({\mathbb {P}}\times \lambda (\{\varphi _Y=k\})\) for all \(k> n\). Recall that every returning set \(\{\varphi _Y=k\}\) is of the form \(C_k\times J(C_k)\), where \(C_k\subset \Sigma ^{\mathbb N}\) is a cylinder set and \(J(C_k) \subset I\) is an interval with return time k, which depends only on \(C_k\). Obtaining effective estimates on individual intervals \(J(C_k)\) by directly looking at pre-images of Y under the skew product system does not seem feasible at the moment, since cylinders can contain a positive proportion of bad maps. An alternative approach could be a combinatorial construction as in [3] or [13], where a two-step induction process is introduced. To perform a similar construction we would have to find a suitable way to define the binding period or the slow recurrence to the critical set that takes into account the existence of bad maps.
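As a purely numerical illustration of this strategy, and not of the combinatorial constructions of [3, 13], one can estimate the tails \({\mathbb {P}}\times \lambda (\{\varphi _Y>n\})\) by Monte Carlo simulation. The two maps, the set Y and the parameters below are hypothetical stand-ins (one expanding ‘good’ map and one ‘bad’ map with a superattracting fixed point at \(c = \frac{1}{2}\)), not the random system studied in this paper.

```python
# Hedged Monte Carlo sketch: estimate P x lambda({phi_Y > n}) for a toy
# random system. The maps, Y and probabilities are hypothetical stand-ins.
import random

def T_good(x):
    """Toy expanding 'good' map: the doubling map."""
    return (2.0 * x) % 1.0

def T_bad(x):
    """Toy 'bad' map with a superattracting fixed point at c = 1/2."""
    return 0.5 + 2.0 * (x - 0.5) ** 2

def first_return_time(a, b, p_bad, rng, cap=10_000):
    """Draw x uniformly in Y = [a, b], iterate randomly chosen maps until
    the orbit returns to Y, and return the number of steps (capped, to guard
    against orbits collapsing onto fixed points via floating-point rounding)."""
    x = rng.uniform(a, b)
    for n in range(1, cap + 1):
        x = T_bad(x) if rng.random() < p_bad else T_good(x)
        if a <= x <= b:
            return n
    return cap

rng = random.Random(0)
samples = [first_return_time(0.3, 0.4, p_bad=0.3, rng=rng)
           for _ in range(50_000)]
for n in (1, 5, 10, 20, 40):
    tail = sum(s > n for s in samples) / len(samples)
    print(f"P x lambda(phi_Y > {n:2d}) ~ {tail:.4f}")
```

Fitting such empirical tails against polynomial and exponential decay profiles would give a first indication of which mixing rates to expect.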

Finally, in Theorems 1.2 and 5.1 we have seen that the regularity of the density \(\frac{d\mu _{{\mathbf {p}}}}{d\lambda }\) depends on whether or not there is a bad map for which c is superattracting: if \(\ell _{\max } >1\), then \(\frac{d\mu _{\mathbf{p}}}{d\lambda }\) is not in \(L^q\) for any \(q > 1\). On the other hand, if \(\ell _{\max } = 1\) and the bad maps are expanding on average at c, i.e. \(\sum _{b \in \Sigma _B^1} \frac{p_b}{|DT_b(c)|} < 1\), then the density has the same regularity as in the setting of Theorem 1.1 by Nowicki and van Strien. Indeed, in this case, if \(r_{\max } > 1\) we have \(\frac{d\mu _{{\mathbf {p}}}}{d\lambda } \in L^q\) if and only if \(1 \le q < \frac{r_{\max }}{r_{\max } -1}\) (for instance, \(r_{\max } = 2\) gives \(\frac{d\mu _{{\mathbf {p}}}}{d\lambda } \in L^q\) precisely for \(1 \le q < 2\)), and if \(r_{\max } = 1\) we have \(\frac{d\mu _{\mathbf{p}}}{d\lambda } \in L^q\) for all \(q \in [1,\infty ]\). In view of this, one could ask for which \(q > 1\) we have \(\frac{d\mu _{\mathbf{p}}}{d\lambda } \in L^q\) in the intermediate case that \(\ell _{\max } = 1\) and \(\sum _{b \in \Sigma _B^1} \frac{p_b}{|DT_b(c)|} \ge 1\), i.e. when c is not superattracting for any bad map and the bad maps are not expanding on average at c.