1 Introduction

1.1 Statement of results

In this series of papers, we are mainly concerned with the Anderson localization phenomenon for one-dimensional discrete Schrödinger operators \(H_\omega \) in \(\ell ^2({{\mathbb {Z}}})\) acting by

$$\begin{aligned}{}[H_\omega \psi ](n) = \psi (n+1) + \psi (n-1) + V_\omega (n) \psi (n). \end{aligned}$$
(1.1)

Here we assume \(\Omega \) to be any compact metric space, \(T:\Omega \rightarrow \Omega \) a homeomorphism, and \(f : \Omega \rightarrow {{\mathbb {R}}}\) continuous. We consider potentials \(V_\omega : {{\mathbb {Z}}}\rightarrow {{\mathbb {R}}}\) defined by \(V_\omega (n) = f(T^n \omega )\) for \(\omega \in \Omega \) and \(n \in {{\mathbb {Z}}}\). For general background on Schrödinger operators in \(\ell ^2({{\mathbb {Z}}})\) with dynamically generated potentials of this form, we refer the reader to [19,20,21].

Spectral properties of the operators \(H_\omega \) can be investigated by studying the behavior of the solutions to the difference equation

$$\begin{aligned} u(n+1) + u(n-1) + V_\omega (n) u(n) = E u(n), \quad n \in {{\mathbb {Z}}}\end{aligned}$$
(1.2)

with E real (or complex, depending on the problem in question). These solutions in turn can be described with the help of the Schrödinger cocycle \((T,A^E)\) with the cocycle map \(A^{E}:\Omega \rightarrow {\mathrm {SL}}(2,{{\mathbb {R}}})\) (resp., \({\mathrm {SL}}(2,{{\mathbb {C}}})\)) being defined as

$$\begin{aligned} A^E(\omega ) = A^{(E-f)}(\omega ):= \begin{pmatrix} E-f(\omega ) &{} -1 \\ 1 &{} 0 \end{pmatrix}, \end{aligned}$$
(1.3)

where we often leave the dependence on \(f:\Omega \rightarrow {{\mathbb {R}}}\) implicit as it will be fixed most of the time.

Such cocycles describe the transfer matrices associated with Schrödinger operators. Specifically, \(u=u(n)\) solves (1.2) if and only if

$$\begin{aligned} \begin{pmatrix} u(n) \\ u(n-1) \end{pmatrix} = A^{E}_n(\omega ) \begin{pmatrix} u(0) \\ u(-1) \end{pmatrix}, \quad n \in {{\mathbb {Z}}}, \end{aligned}$$
(1.4)

where

$$\begin{aligned} A_n(\omega )= {\left\{ \begin{array}{ll} A(T^{n-1}\omega ) \ldots A(\omega ), &{} n\ge 1;\\ {[}A_{-n}(T^n\omega )]^{-1}, &{}n\le -1, \end{array}\right. } \end{aligned}$$
(1.5)

and we set \(A_0(\omega )\) to be the identity matrix.
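As an informal numerical illustration of (1.3)–(1.5), which is not part of the arguments of this paper, the following minimal Python sketch builds the transfer matrices for a potential given as a function \(n \mapsto V(n)\), standing in for \(f(T^n\omega )\). The function names (`step_matrix`, `transfer_matrix`, `solve`) are hypothetical and chosen only for this sketch.

```python
# Minimal sketch: transfer matrices A_n for a Schrödinger cocycle.
# V is any function n -> V(n); it plays the role of f(T^n omega).

IDENTITY = ((1.0, 0.0), (0.0, 1.0))

def step_matrix(E, v):
    """The matrix in (1.3) at a site where the potential equals v."""
    return ((E - v, -1.0), (1.0, 0.0))

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inverse(a):
    """Inverse of a 2x2 matrix of determinant 1."""
    return ((a[1][1], -a[0][1]), (-a[1][0], a[0][0]))

def transfer_matrix(E, V, n):
    """A_n as in (1.5): product over sites 0..n-1 for n >= 1,
    inverse of the product over sites n..-1 for n <= -1, identity for n = 0."""
    sites = range(0, n) if n >= 0 else range(n, 0)
    m = IDENTITY
    for k in sites:
        m = matmul(step_matrix(E, V(k)), m)  # newest factor goes on the left
    return m if n >= 0 else inverse(m)

def solve(E, V, u0, um1, n_max):
    """Solve (1.2) by the three-term recursion from u(0) = u0, u(-1) = um1."""
    u = {0: u0, -1: um1}
    for n in range(0, n_max):
        u[n + 1] = (E - V(n)) * u[n] - u[n - 1]
    for n in range(-1, -n_max - 1, -1):
        u[n - 1] = (E - V(n)) * u[n] - u[n + 1]
    return u
```

Applying `transfer_matrix(E, V, n)` to the vector \((u(0), u(-1))\) should reproduce \((u(n), u(n-1))\) from `solve`, up to rounding, which is the content of (1.4).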

The Lyapunov exponent (LE) of the Schrödinger cocycle plays a key role in the spectral analysis of the operators. Let \(\mu \) be a T-ergodic probability measure on \(\Omega \). The Lyapunov exponent is given by

$$\begin{aligned} L(A^E,\mu )= & {} \lim _{n \rightarrow \infty } \frac{1}{n} \int \log \Vert A^E_n(\omega )\Vert \, d\mu (\omega )\nonumber \\= & {} \inf _{n \ge 1} \frac{1}{n} \int \log \Vert A^E_n(\omega )\Vert \, d\mu (\omega ). \end{aligned}$$
(1.6)

For simplicity, we write \(L(E)=L(A^E,\mu )\). By Kingman’s subadditive ergodic theorem, we have

$$\begin{aligned} \lim _{n \rightarrow \infty } \frac{1}{n}\log \Vert A^E_n(\omega )\Vert =L(E) \end{aligned}$$

for \(\mu \)-almost every \(\omega \in \Omega \). In particular, certain uniform positivity and uniform large deviation estimates (LDT) for the LE are strong indications of Anderson localization, which in its spectral formulation states that for \(\mu \)-almost every \(\omega \in \Omega \), the operator \(H_\omega \) has pure point spectrum with exponentially decaying eigenfunctions.
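The almost-sure limit above can be explored numerically. The following Python sketch is only a heuristic illustration under stated assumptions: it tracks the growth of a single initial vector \((1,0)\) rather than the matrix norm, which gives the same limit only for typical initial directions, and it is checked against the closed form for the free case \(V \equiv 0\) (a constant potential, which is excluded by the hypotheses of our theorems and serves here purely as a sanity check). The names `lyapunov_estimate` and `free_lyapunov` are hypothetical.

```python
import math

def lyapunov_estimate(E, V, n):
    """Estimate (1/n) log ||A_n^E v|| for v = (1, 0).

    The vector is renormalized at every step so that no overflow occurs;
    the logarithms of the normalizing factors add up to log ||A_n^E v||.
    """
    x, y = 1.0, 0.0
    total = 0.0
    for k in range(n):
        # Apply the matrix (1.3) at site k to the current vector.
        x, y = (E - V(k)) * x - y, x
        r = math.hypot(x, y)  # nonzero, since the matrix is invertible
        total += math.log(r)
        x, y = x / r, y / r
    return total / n

def free_lyapunov(E):
    """Closed form for V = 0: the log of the spectral radius of A^E."""
    return math.acosh(abs(E) / 2.0) if abs(E) > 2.0 else 0.0
```

For instance, `lyapunov_estimate(3.0, lambda n: 0.0, 2000)` should be close to `free_lyapunov(3.0)`, while for an energy inside \([-2,2]\) the estimate should be close to zero.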

On the other hand, positivity and LDT estimates for the LE are extensively studied topics in dynamical systems. In general, the more random the base dynamics \((\Omega ,T,\mu )\) is, the more likely it is that one has positivity and LDT for the LE. For instance, for the well-known Anderson model, where \(V_\omega \) is a realization of independent identically distributed random variables, one does have uniform positivity and uniform LDT on any compact set of energies E. These are classic results that go back to the seminal work of Furstenberg [27]. Combining this with a certain elimination of double resonance argument, these two properties indeed lead to a localization result for the Anderson model; see, for example, [16] for recent proofs of all these results mentioned above.

The Anderson model may be put into the context of the present paper as follows. We consider the Anderson model whose single site measure is an atomic measure supported on a finite number of points, which is the most difficult case. Let \({\mathcal {A}}=\{1, 2, \ldots , \ell \}\) with \(\ell \ge 2\) and let \({{\tilde{\mu }}}\) be a fully supported probability measure on \({{\mathcal {A}}}\). Let \(\Omega ={{\mathcal {A}}}^{{\mathbb {Z}}}\) be the full shift space and consider the left shift \(T : \Omega \rightarrow \Omega \) defined by \((T \omega )_n = \omega _{n+1}\) for \(\omega \in {\mathcal {A}}^{{\mathbb {Z}}}\) and \(n \in {{\mathbb {Z}}}\). Let \(\mu ={{\tilde{\mu }}}^{{\mathbb {Z}}}\), which is strongly mixing with respect to T. The Anderson model may be generated by setting \(V_\omega (n)=f(T^n\omega )\) where \(f:\Omega \rightarrow {{\mathbb {R}}}\) depends only on \(\omega _0\). The potentials generated in this way are the most random among the potentials studied in this paper. It is natural to ask what can be said if the potentials, or rather the base dynamics \((\Omega ,T,\mu )\), are less random. In the language of mathematical physics, what if the \(V_\omega \)’s are weakly correlated? Or in the language of dynamical systems, what if \((\Omega ,T,\mu )\) is a mixing system such as the Arnold cat map or the doubling map? Or more generally, a subshift of finite type with measure of maximal entropy? It turns out that such systems are much more difficult to analyze.

To further explain what this paper accomplishes, we consider a general framework of the base dynamics that includes most of the systems mentioned above as special classes. Let \((\Omega ,T)\) be a subshift of finite type. Let \(\mu \) be a T-ergodic measure that is fully supported on \(\Omega \). We further assume that \(\mu \) admits a local product structure (a detailed definition may be found in Sect. 2.1.2). Let \(f:\Omega \rightarrow {{\mathbb {R}}}\) be \(\alpha \)-Hölder continuous for some \(0<\alpha \le 1\) and non-constant. We define

$$\begin{aligned} {{\mathcal {Z}}}_{f}=\{E: L(E)=0\}. \end{aligned}$$
(1.7)

In the present paper, we address the following question.

Problem

Let \(\Omega \), T, \(\mu \), and f be described as above. How large is \({{\mathcal {Z}}}_{f}\)? In particular, when is it discrete, finite, or even empty?

Note that the discreteness of \({{\mathcal {Z}}}_f\) can be taken as a starting point to show full spectral localization for the corresponding operators; see, for example, the proof of localization in [16, Proof of Theorem 1.3]. We comment on this point in more detail in Remark 1.6 below. Earlier partial results along this line may be found for example in [6, 9, 12, 18, 22, 35, 36, 44], where either the base dynamics or the choice of f are quite restricted, or \({{\mathcal {Z}}}_f\) is still quite large. The main theorem of this paper is:

Theorem 1.1

Suppose \((\Omega , T)\) is a subshift of finite type and \(\mu \) is a T-ergodic probability measure that has a local product structure and is fully supported on \(\Omega \). Suppose T has a fixed point and f is Hölder continuous and non-constant. Then the set \({{\mathcal {Z}}}_f\) is discrete.

Remark 1.2

Let us mention that the concept of local product structure is recalled in detail in Sect. 2.1. If \(\mu \) has a local product structure, then its topological support \(\mathrm {supp} \, \mu \) is a subshift of finite type (see, e.g., [7, Lemma 1.2]) and hence the assumption in Theorem 1.1 that \(\mu \) is fully supported is not a restriction. If the support is not the whole space, then we can replace \((\Omega ,T)\) by \((\mathrm {supp} \, \mu , T_{\mathrm {supp} \, \mu })\). This remark applies whenever we assume in this paper that \(\mu \) is fully supported. Conversely, given any subshift of finite type, the unique equilibrium state associated with a Hölder continuous potential always has a local product structure, see [14, 34] or [8, Section 2.2]. In particular, measures with maximal entropy do have a local product structure.

It is clear that we can add a coupling constant \(\lambda \) to f in the statement of Theorem 1.1. This further indicates that such systems do behave like the Anderson model, as the Anderson model is always localized as long as \(\lambda >0\). If we restrict the choice of f so that it is locally constant or \(\Vert f\Vert _\infty \) is small, then we can improve the result as follows. Let \(C^\alpha (\Omega ,{{\mathbb {R}}})\), \(0<\alpha \le 1\) be the space of \(\alpha \)-Hölder continuous functions.

Theorem 1.3

Let \((\Omega ,T,\mu )\) be as in Theorem 1.1. Suppose \(f \in C^\alpha (\Omega ,{{\mathbb {R}}})\) is globally bunched or locally constant. Assume further that f is non-constant and T has a fixed point. Then \({{\mathcal {Z}}}_f\) is finite.

A detailed definition of global bunching may be found in Sect. 5.2. In particular, f is globally bunched if \(\Vert f\Vert _\infty \) is small. A possible explicit choice of a smallness condition on \(\Vert f\Vert _\infty \) may be found in (7.7). We can again add a coupling constant \(\lambda \) to f in Theorem 1.3 if f is locally constant. If f is globally bunched, then as \(\lambda \) becomes large, \({{\mathcal {Z}}}_{\lambda f}\) might become a discrete set that is no longer guaranteed to be finite. This is because we lose global bunching as \(\lambda \) becomes large, and we then have to apply Theorem 1.1 instead. In Sect. 7, we shall show that Theorems 1.1 and 1.3 are sharp in the sense that \({{\mathcal {Z}}}_f\) may indeed be nonempty for a suitable locally constant f. Thus another natural question is: when can we remove the discrete or finite set \({{\mathcal {Z}}}_f\)? We have the following results.

Theorem 1.4

Suppose \((\Omega ,T)\) is a subshift of finite type and \(\mu \) is a fully supported T-ergodic measure that has a local product structure. Then there is a residual subset \({{\mathcal {G}}}\) of \(C^\alpha (\Omega ,{{\mathbb {R}}})\) such that for each \(f\in {{\mathcal {G}}}\), \({{\mathcal {Z}}}_f\) is empty.

Again, if we restrict the choice of f so that it is either locally constant or globally bunched, then we can obtain a uniform lower bound on L(E) for an even wider class of choices:

Theorem 1.5

Suppose \((\Omega ,T)\) is a subshift of finite type and \(\mu \) is a fully supported T-ergodic measure that has a local product structure. Consider the subspaces of \(C^\alpha (\Omega ,{{\mathbb {R}}})\) consisting of globally bunched or locally constant functions. For each of them, there is an open and dense subset \({{\mathcal {G}}}\) such that for every \(f \in {{\mathcal {G}}}\), we have \(\inf \{ L(E) : E \in {{\mathbb {R}}}\} > 0\).

Applications of Theorems 1.1–1.5 to more concrete base dynamics such as the doubling map, Arnold’s cat map, and Markov shifts may be found in Sect. 7.

Remark 1.6

  (a)

    Let us emphasize that from the perspective of a spectral analysis of the operator family \(\{ H_\omega \}_{\omega \in \Omega }\), and in particular when seeking a proof of spectral localization for this family, the discreteness of \({{\mathcal {Z}}}_f\) is in general the appropriate first milestone towards the eventual goal. It then needs to be combined with control of the Lyapunov exponent away from \({{\mathcal {Z}}}_f\) (the connected components of \({{\mathcal {Z}}}_f^c\) need to be exhausted by intervals on which the Lyapunov exponent is uniformly bounded away from zero; this is often established by proving the continuity of L(E) in E whenever possible), suitable large deviation estimates, and an argument that rules out the presence of infinitely many double resonances for almost every \(\omega \). It then follows for \(\mu \)-almost every \(\omega \in \Omega \) that spectrally almost every energy in \({{\mathcal {Z}}}_f^c\) admits an exponentially decaying eigenfunction for \(H_\omega \). As the discrete set \({{\mathcal {Z}}}_f\) almost surely carries no weight with respect to the spectral measures of \(H_\omega \), this then shows that for \(\mu \)-almost every \(\omega \in \Omega \), the operator \(H_\omega \) admits a basis consisting of exponentially decaying eigenfunctions, and the desired spectral localization statement then follows.

  (b)

    One is nevertheless interested in obtaining stronger results on the size of the exceptional set \({{\mathcal {Z}}}_f\), such as finiteness or emptiness, whenever possible, as this leads to stronger versions of the dynamical version of an Anderson localization statement. Here, one is interested in showing that the solutions of the time-dependent Schrödinger equation \(i \partial _t \psi = H_\omega \psi \) are localized. In other words, one seeks to prove good off-diagonal estimates for the matrix elements of \(e^{-itH_\omega }\) relative to the standard basis of \(\ell ^2({{\mathbb {Z}}})\), uniformly in the time parameter t. Energies E in \({{\mathcal {Z}}}_f\) present an obstacle for proving this and one generally simply projects away from these exceptional energies and considers \(\chi _I(H_\omega ) e^{-itH_\omega }\) with a set \(I \subseteq {{\mathcal {Z}}}_f^c\) that has positive distance from \({{\mathcal {Z}}}_f\). In fact, it has been shown that dynamical localization can actually fail, even when spectral localization holds, if one does not project away from \({{\mathcal {Z}}}_f\); compare, for example, [23, 29]. Clearly, it is then desirable to show that \({{\mathcal {Z}}}_f\) is empty whenever this can be expected to be true. Of course, as pointed out earlier, this will not always be the case.

  (c)

    Let us emphasize that the road map to spectral localization described in part (a) of this remark is applicable in the general setting of ergodic Schrödinger operators, and it has been implemented for special cases ranging from the Anderson model to potentials generated by torus translations, the standard skew-shift, the doubling map, or the Arnold cat map. While the literature is vast, let us just mention a few representative papers, [10,11,12, 16], and refer the reader to [19,20,21] for more information. Regarding the base transformations considered in this paper, the absence of a suitable general and global result showing the discreteness of \({{\mathcal {Z}}}_f\) was the primary obstacle in attempting to implement this road map. Thus, the present paper fills precisely this gap and opens the door to a localization proof, which we intend to work out in detail in the second part of this series [1].

1.2 Strategy of proofs

One of the main tools we use to prove our results is the so-called invariance principle as coined in [3]. The first version of the invariance principle goes back to Ledrappier [33] and it was later generalized in [3]. The version we adopt in this paper is due to Bonatti et al. [7, 39]. A detailed statement of the invariance principle may be found in Proposition 4.5. It says that if the Lyapunov exponent \(L(A,\mu )\) of a cocycle \((T,A)\) is 0 and A depends only on the future or the past, then any \((T,A)\)-invariant measure m on \(\Omega \times {{\mathbb {R}}}{\mathbb {P}}^1\) admits a disintegration \(\{m_\omega :\omega \in \Omega \}\) that depends only on the future or the past, respectively.

Another main tool we use is given by the so-called stable and unstable holonomies, which are defined along the stable or unstable sets of \(\omega \), respectively; see Sect. 2.1.5 for a detailed definition. If \(L(A,\mu )=0\), we can define a measurable family of stable and unstable holonomies for \(\mu \)-almost every \(\omega \). Then one can use the stable or unstable holonomies to conjugate the cocycle \((T,A)\) to one that depends only on the future or the past, respectively.

Combining the two steps above, one can show that the family of invariant measures \(\{m_\omega \}\) are invariant with respect to the stable and unstable holonomies as well. We call such a family an su-state.

It turns out that the existence of su-states is a very rare event in the sense that they can be easily perturbed away by modifying the data of the cocycle map A at certain periodic points. Roughly speaking, this is how [7, 8, 39] show the positivity of the Lyapunov exponent for certain typical \(C^\alpha \)-cocycles. More precisely, [7, 8] did it in case the cocycle is fiber bunched or is locally constant while [39] did it for the general case.

However, to prove Theorems 1.1–1.3, we need to consider Schrödinger cocycles with fixed sampling functions. They are basically fixed cocycle maps parametrized by the energy parameter \(E \in {{\mathbb {R}}}\). So we are not allowed to perturb the cocycle maps to get typicality. Hence, the above strategy is not sufficient to yield the discreteness or finiteness of \({{\mathcal {Z}}}_{f}\) as stated in Sect. 1.1. It turns out that in addition we need to deploy certain tools from spectral theory. In particular, we will consider the spectra associated with certain periodic orbits and invoke a result from inverse spectral theory for periodic operators. Moreover, to make use of the periodic data, among other things, we also need to show that periodic orbits with small Lyapunov exponent belong to the topological support of the sets where one can define continuous holonomies. Finally, to use the periodic data to prove the main results, we have to combine the conformal barycenter concept due to Douady and Earle [25], Bowen’s specification property [13], and Kalinin’s theorem regarding approximating L(E) by the Lyapunov exponent along periodic orbits [30]. In short, the proof is based on a fusion of ideas and results from both dynamical systems and spectral theory.

The structure of the remainder of the paper is as follows. In Sect. 2, we state some necessary preliminaries and lay out our context. In Sect. 3, we give a proof of an additive version of a large deviation estimate for Hölder continuous functions defined on \(\Omega \) and for slightly more restricted measures \(\mu \). These large deviation estimates may be of independent interest. Moreover, they will play a key role in the second paper of this series [1]. In Sect. 4, we introduce our main tools such as the invariance principle and the conformal barycenter, and we also give detailed proofs of certain lemmas. We prove Theorems 1.1–1.3 in Sect. 5 and Theorems 1.4–1.5 in Sect. 6. In Sect. 7, we apply our general Theorems 1.1–1.5 to several concrete models such as the doubling map, Arnold cat map, and Markov chains. In particular, the class of Markov chains includes general locally constant Schrödinger potentials defined on the full shift space as a special case, which yields a generalization of the classical Furstenberg theorem. Many of the results are the first of their kind. We also compute an explicit choice of \(\lambda _0 > 0\) so that \(\Vert f\Vert _\infty \le \lambda _0\) is sufficient for f to be globally bunched. Finally, we present an example where we show the finite set \({{\mathcal {Z}}}_f\) appearing in the statement of Theorem 1.3 may not be removed in general, so that our results are sharp in a suitable sense.

2 Preliminaries

2.1 The setting

In this section we describe the setting we will work in. We have chosen subshifts of finite type with appropriate ergodic measures as base transformations as a compromise between concreteness and generality. Other possible choices would have been concrete classes of smooth hyperbolic transformations and expanding maps. For background and discussion of the material presented below, we refer the reader to [7, 8, 39].

2.1.1 The base space and the base transformation

Let \({\mathcal {A}}=\{1, 2, \ldots , \ell \}\) with \(\ell \ge 2\) be equipped with the discrete topology. Consider the product space \({\mathcal {A}}^{{\mathbb {Z}}}\), whose topology is generated by the cylinder sets, which are the sets of the form

$$\begin{aligned} {[}n; j_0,\ldots ,j_k] = \{ \omega \in {\mathcal {A}}^{{\mathbb {Z}}}: \omega _{n+i} = j_i , \; 0 \le i \le k \} \end{aligned}$$

with \(n \in {{\mathbb {Z}}}\) and \(j_0, \ldots , j_k \in {\mathcal {A}}\). The topology is metrizable and for definiteness we fix the following metric d on \({\mathcal {A}}^{{\mathbb {Z}}}\). Set \(d(\omega ,\omega ) = 0\) for \(\omega \in {\mathcal {A}}^{{\mathbb {Z}}}\) and

$$\begin{aligned} d(\omega , {{\tilde{\omega }}}) =e^{-N(\omega ,{{\tilde{\omega }}})} \end{aligned}$$
(2.1)

for \(\omega , {{\tilde{\omega }}} \in {\mathcal {A}}^{{\mathbb {Z}}}\) with \(\omega \not = {{\tilde{\omega }}}\), where

$$\begin{aligned} N(\omega ,{{\tilde{\omega }}})=\max \{N\ge 0: \omega _n={{\tilde{\omega }}}_n \text{ for } \text{ all } |n|<N\}. \end{aligned}$$
(2.2)

We consider the left shift \(T : {\mathcal {A}}^{{\mathbb {Z}}}\rightarrow {\mathcal {A}}^{{\mathbb {Z}}}\) defined by \((T \omega )_n = \omega _{n+1}\) for \(\omega \in {\mathcal {A}}^{{\mathbb {Z}}}\) and \(n \in {{\mathbb {Z}}}\). Let \(\mathrm {Orb}(\omega )=\{T^n\omega :\ n\in {{\mathbb {Z}}}\}\) be the orbit of \(\omega \) under the dynamics T.
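To make the metric (2.1)–(2.2) and the shift concrete, here is a minimal Python sketch in which a point of \({\mathcal {A}}^{{\mathbb {Z}}}\) is modelled as a function \(n \mapsto \omega _n\). Since a program cannot inspect infinitely many coordinates, the search is truncated at a hypothetical parameter `max_radius`; sequences that agree on the whole window are treated as equal, which is an assumption of this sketch and not part of the definition.

```python
import math

def agreement_radius(w, v, max_radius=1000):
    """N(w, v) from (2.2): the smallest |n| at which w and v differ.

    Returns None if w and v agree for all |n| < max_radius.
    """
    for m in range(max_radius):
        if w(m) != v(m) or w(-m) != v(-m):
            return m
    return None

def distance(w, v, max_radius=1000):
    """d(w, v) = exp(-N(w, v)) as in (2.1), or 0.0 if no difference is found."""
    N = agreement_radius(w, v, max_radius)
    return 0.0 if N is None else math.exp(-N)

def shift(w):
    """The left shift: (T w)_n = w_{n+1}."""
    return lambda n: w(n + 1)
```

If two sequences differ only at the coordinate \(n=3\), then `distance` returns \(e^{-3}\), and after applying `shift` to both the distance becomes \(e^{-2}\), since the disagreement moves to the coordinate \(n=2\).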

Definition 2.1

Let \(Q = (q_{ij})_{1\le i, j\le \ell }\) be an \(\ell \times \ell \) matrix with \(q_{ij}\in \{0,1\}\) and let \(\Omega \) be the subshift of finite type associated to the matrix Q,

$$\begin{aligned} \Omega =\{(\omega _n)_{n\in {\mathbb {Z}}}: q_{\omega _n\omega _{n+1}}=1 \text{ for } \text{ all } n\in {\mathbb {Z}}\}. \end{aligned}$$

Consider the topological dynamical system \((\Omega ,T)\).

We say that a finite word \(j_0 j_1 \ldots j_k\), where \(j_i \in \{1,\ldots , \ell \}\) for \(0 \le i \le k\), is admissible if it occurs in some \(\omega \in \Omega \), that is, there are \(\omega \in \Omega \) and \(n\in {{\mathbb {Z}}}\) such that \(\omega _{n+i}=j_i\) for all \(0\le i\le k\).
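Admissibility can be decided from the matrix Q alone. The Python sketch below is one possible way to do this, under the convention (an assumption of the sketch) that Q is a list of rows with 0/1 entries and that symbols are the integers \(1,\ldots ,\ell \). Note that allowed transitions alone are not sufficient: a word must also extend to a bi-infinite sequence, so symbols without an infinite admissible past or future are pruned first. The function names are hypothetical.

```python
def symbols_in_subshift(Q):
    """Symbols (1-based) that occur in some bi-infinite admissible sequence.

    Repeatedly discard symbols that have no allowed successor or no allowed
    predecessor among the symbols still alive.
    """
    alive = set(range(len(Q)))
    while True:
        kept = {
            i for i in alive
            if any(Q[i][j] for j in alive) and any(Q[j][i] for j in alive)
        }
        if kept == alive:
            return {i + 1 for i in alive}
        alive = kept

def is_admissible(Q, word):
    """True if the finite word occurs in some point of the subshift of Q."""
    live = symbols_in_subshift(Q)
    if any(j not in live for j in word):
        return False
    # Every consecutive pair must be an allowed transition.
    return all(Q[a - 1][b - 1] == 1 for a, b in zip(word, word[1:]))
```

For the matrix `[[1, 1], [1, 0]]`, the word `(1, 2, 1)` is admissible while `(2, 2)` is not. For `[[1, 1, 1], [1, 0, 0], [0, 0, 0]]`, the word `(1, 3)` uses an allowed transition but is not admissible, because the symbol 3 has no successor.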

The local stable set of a point \(\omega \in \Omega \) is defined by

$$\begin{aligned} W^s_\mathrm {loc}(\omega ) = \{ {{\tilde{\omega }}} \in \Omega : \omega _n = {{\tilde{\omega }}}_n \text { for } n \ge 0 \} \end{aligned}$$
(2.3)

and the local unstable set of \(\omega \) is defined by

$$\begin{aligned} W^u_\mathrm {loc}(\omega ) = \{ {{\tilde{\omega }}} \in \Omega : \omega _n = {{\tilde{\omega }}}_n \text { for } n \le 0 \}. \end{aligned}$$
(2.4)

A set is called s-locally saturated (resp., u-locally saturated) if it is a union of local stable (resp., local unstable) sets of the form above.

For each \(j \in {\mathcal {A}}\) and each pair of points \(\omega , {{\tilde{\omega }}}\in [0;j]\), we denote the unique point in \(W^u_{\mathrm {loc}}(\omega )\cap W^s_{\mathrm {loc}}({{\tilde{\omega }}})\) by \(\omega \wedge {{\tilde{\omega }}}\). Throughout this paper, we fix for each \(1\le j\le \ell \) a choice of \(\omega ^{(j)}\in [0;j]\) so that the maps

$$\begin{aligned} \omega \mapsto \omega ^{(\omega _0)}\wedge \omega ,\ \omega \mapsto \omega \wedge \omega ^{(\omega _0)} \end{aligned}$$
(2.5)

are well-defined and continuous on \(\Omega \) and are constant on local stable and unstable sets, respectively.
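The bracket \(\omega \wedge {{\tilde{\omega }}}\) simply glues the past of \(\omega \) to the future of \({{\tilde{\omega }}}\). A minimal Python sketch, with points again modelled as functions \(n \mapsto \omega _n\) and with the hypothetical name `bracket`, reads as follows.

```python
def bracket(w, v):
    """The point w ∧ v: coordinates of w for n <= 0 and of v for n >= 0.

    The two prescriptions agree at n = 0 only if w_0 == v_0, which is
    exactly the requirement that w and v lie in the same cylinder [0; j].
    """
    if w(0) != v(0):
        raise ValueError("the bracket requires w_0 == v_0")
    return lambda n: w(n) if n <= 0 else v(n)
```

The result agrees with `w` at all \(n \le 0\) and with `v` at all \(n \ge 0\), so it lies in \(W^u_{\mathrm {loc}}(\omega )\cap W^s_{\mathrm {loc}}({{\tilde{\omega }}})\). It belongs to \(\Omega \) because every transition of the glued sequence is a transition of either \(\omega \) or \({{\tilde{\omega }}}\).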

2.1.2 Measures with a local product structure

Let the subshift \(\Omega \) be equipped with the Borel \(\sigma \)-algebra and let \(\mu \) be a probability measure on \(\Omega \) that is ergodic with respect to T. We define

$$\begin{aligned}&\Omega ^+=\{(\omega _n)_{n\ge 0}: \omega \in \Omega \}, \end{aligned}$$
(2.6)
$$\begin{aligned}&\Omega ^-=\{(\omega _n)_{n\le 0}: \omega \in \Omega \} \end{aligned}$$
(2.7)

to be the spaces of one-sided right and left infinite sequences, respectively, associated with \(\Omega \). Metrics for \(\Omega ^\pm \) can be defined in a way similar to the definition of the metric for \({{\mathcal {A}}}^{{\mathbb {Z}}}\) in Sect. 2.1.1. Abusing notation slightly, we still let d denote their metrics. Let \(\pi ^{+}\) be the projection from \(\Omega \) to \(\Omega ^+\) and \(\mu ^+=\pi ^+_*(\mu )\) be the pushforward measure of \(\mu \) on \(\Omega ^+\). Similarly, we let \(\pi ^-\) be the projection to \(\Omega ^-\) and \(\mu ^-\) be the pushforward measure on \(\Omega ^-\). Let \(T_+\) be the left shift operator on \(\Omega ^+\) and \(T_-\) be the right shift on \(\Omega ^-\). For \(n\ge 0\), we let \([n;j_0, \ldots , j_k]^+\) denote the cylinder sets in \(\Omega ^+\); for \(n \le -k\), we let \([n;j_0,\ldots , j_k]^-\) denote the cylinder sets in \(\Omega ^-\). Let \(\omega ^\pm \) denote points in \(\Omega ^\pm \), respectively.

For simplicity, for each \(1\le j\le \ell \), we set \(\mu _j=\mu |_{[0;j]}\). Similarly, we set \(\mu ^\pm _j=\mu ^\pm |_{[0;j]^\pm }\), respectively.

Note that we do not have \(\Omega =\Omega ^-\times \Omega ^+\). However, for each \(1\le j\le \ell \) we have a natural homeomorphism

$$\begin{aligned} P: [0;j]\rightarrow [0;j]^-\times [0;j]^+ \text{ where } P(\omega )=(\pi ^-\omega ,\pi ^+\omega ). \end{aligned}$$

Thus, abusing the notation a bit, we may just write \([0;j]=[0;j]^-\times [0;j]^+\). Moreover, we have for all \(\omega \in \Omega \),

$$\begin{aligned} (\pi ^+)^{-1}(\pi ^+\omega )=W^s_{\mathrm {loc}}(\omega ),\ (\pi ^-)^{-1}(\pi ^-\omega )=W^u_{\mathrm {loc}}(\omega ). \end{aligned}$$
(2.8)

Definition 2.2

We say \(\mu \) has a local product structure if there is a \(\psi :\Omega \rightarrow (0,\infty )\) such that for each \(1\le j \le \ell \), \(\psi \in L^1([0;j],\mu ^-_j\times \mu ^+_j)\) and

$$\begin{aligned} d\mu _j=\psi \cdot d(\mu ^-_j\times \mu ^+_j). \end{aligned}$$
(2.9)

The local product structure of \(\mu \) amounts to saying that \(\mu _j^-\times \mu _j^+\) is equivalent to \(\mu _j\). Indeed, (2.9) clearly implies that \(\mu _j\) is absolutely continuous with respect to \(\mu _j^-\times \mu _j^+\). On the other hand, if \(\mu _j(E)=0\), then we must have \((\mu _j^-\times \mu _j^+)(E)=0\) since \(\psi (\omega )>0\) for all \(\omega \in \Omega \). In particular, we may draw the following conclusion. If \(E\subset [0;j]\) is u-locally saturated with \(\mu (E) > 0\) and \(F\subset [0;j]\) is s-locally saturated with \(\mu (F) > 0\), we have

$$\begin{aligned} (\mu ^-_j\times \mu ^+_j)(E\cap F)&= (\mu ^-_j\times \mu ^+_j)(\pi ^-E\times \pi ^+ F)\\&=\mu ^-_j(\pi ^-E)\cdot \mu ^+_j(\pi ^+F)\\&=\mu (E)\cdot \mu (F)\\&> 0, \end{aligned}$$

which implies that

$$\begin{aligned} \mu (E\cap F)=\mu _j(E\cap F)>0. \end{aligned}$$
(2.10)

Conversely, if \(\mu ^-_j\times \mu ^+_j\) is equivalent to \(\mu _j\) for each \(1\le j\le \ell \), then \( d\mu _j=\psi \cdot d(\mu ^-_j\times \mu ^+_j)\), where \(\psi \in L^1([0;j],\mu ^-_j\times \mu ^+_j)\) is the Radon–Nikodym derivative of \(\mu _j\) with respect to \(\mu ^-_j\times \mu ^+_j\). Note \(1/\psi \in L^1([0;j], \mu _j)\) is the Radon–Nikodym derivative of \(\mu ^-_j\times \mu ^+_j\) with respect to \(\mu _j\). Hence we must have that \(\psi (\omega )>0\) for all j and for \(\mu _j\)-a.e. \(\omega \). We can of course modify \(\psi \) so that it is positive everywhere.

Definition 2.3

A Jacobian of the measure \(\mu ^+\) with respect to \(T_+\) on \(\Omega ^+\) is a measurable function \(J_+:\Omega ^+\rightarrow {{\mathbb {R}}}_+\) such that for each \(j \in \{1,\ldots , \ell \}\), we have

$$\begin{aligned} d\mu ^+(T_+\omega ^+)=J_+(\omega ^+)\cdot d((T_+)_*(\mu ^+|_{[0;j]^+}))(T_+\omega ^+). \end{aligned}$$
(2.11)

A Jacobian of \(\mu ^-\) with respect to \(T_-\) can be defined similarly.

One consequence of the local product structure of \(\mu \) is that \(\mu ^\pm \) admit Jacobians with respect to \(T_\pm \) on \([0;j]^\pm \) for each \(1\le j\le \ell \), respectively. The following lemma is essentially contained in [8]. While in [8, Lemma 2.2], \(\psi \) is assumed to be continuous, we note that the same proof can be applied to obtain the following lemma.

Lemma 2.4

The measures \(\mu ^\pm \) admit positive Jacobians \(J_\pm \in L^1([0;j]^\pm ,d\mu ^\pm _j)\) with respect to \(T_\pm \) on \([0;j]^\pm \), respectively, for each \(1\le j\le \ell \).

For \(\underline{l}=(l_1,\ldots , l_n)\in \{1,\ldots , \ell \}^n\), we write the cylinder \([0;l_1,\ldots , l_n, j]\) as \([0;\underline{l}, j]\) and set \(|\underline{l}|:=n\). We use a similar notation for spaces of one-sided sequences. For a cylinder \([0;\underline{l}, j]^+\subset \Omega ^+\), we clearly have a Jacobian for \(T^{|\underline{l}|}_+:[0;\underline{l}, j]^+\rightarrow [0;j]^+\), which is denoted by \(J^{(\underline{l},j)}_+:[0;l_1,\ldots , l_n, j]^+\rightarrow (0,\infty )\) and is given by the formula

$$\begin{aligned} J^{(\underline{l}, j)}_+(\omega ^+)=\prod ^{n-1}_{k=0}J_+(T_+^k\omega ^+). \end{aligned}$$

By the definition of a Jacobian, we have for any integrable function \(f:\Omega ^+\rightarrow {{\mathbb {R}}}\) and any \([0;\underline{l},j]^+\subset \Omega ^+\) that

$$\begin{aligned} \int _{[0;j]^+}f(\eta ) \, d\mu ^+(\eta ) = \int _{[0;\underline{l}, j]^+}f(T^{|\underline{l}|}_+\omega ^+)J^{(\underline{l}, j)}_+(\omega ^+) \, d\mu ^+(\omega ^+). \end{aligned}$$
(2.12)

We first have the following immediate consequence of Lemma 2.4, which will be used in Sect. 5.

Corollary 2.5

Let \(D\subset \Omega ^+\) be such that \(\mu ^{+}(D\cap [0;j]^+)>0\). Then for all \([0;\underline{l}, j]^+\subset \Omega ^+\), we have

$$\begin{aligned} \mu ^+(T^{-|\underline{l}|}_+(D)\cap [0;\underline{l},j]^+)>0. \end{aligned}$$

Similarly, if \(\mu ^-(D\cap [0;j]^{-})>0\) for some \(D\subset \Omega ^-\), then for all \([-|\underline{l}|; j,\underline{l}]^-\subset \Omega ^-\), we have

$$\begin{aligned} \mu ^-(T^{-|\underline{l}|}_-(D)\cap [-|\underline{l}|; j,\underline{l}]^-)>0. \end{aligned}$$

Proof

We only consider the case for \((\Omega ^+, T_+, \mu ^+)\); the case with \((\Omega ^-,T_-, \mu ^-)\) can be handled similarly.

Without loss of generality, we may just consider a Borel set \(D\subset [0;j]^+\) with positive measure. By (2.12), we have

$$\begin{aligned} 0&< \mu ^+(D) \\&=\int _{[0;j]^+}\chi _D(\eta )d\mu ^+(\eta )\\&=\int _{[0;\underline{l}, j]^+}\chi _D (T^{|\underline{l}|}_+\omega ^+)J^{(\underline{l}, j)}_+(\omega ^+)d\mu ^+(\omega ^+)\\&=\int _{[0;\underline{l},j]^+\cap (T_+^{-|\underline{l}|}D)}J^{(\underline{l}, j)}_+(\omega ^+)d\mu ^+(\omega ^+), \end{aligned}$$

which implies that \(\mu ^+\big ([0;\underline{l},j]^+\cap (T_+^{-|\underline{l}|}D)\big )>0.\) \(\square \)

For some results we will need the measure \(\mu \) to obey a quantitative version of local product structure, which is defined as follows.

Definition 2.6

We say that \(\mu \) satisfies the bounded distortion property if there is \(C \ge 1\) such that for all cylinders \([n;j_0,\ldots ,j_{k}]\subset \Omega \) and \([l;i_{0},\ldots , i_{m}]\subset \Omega \), where \(l> n+k\) and \([n;j_0,\ldots ,j_{k}]\cap [l;i_{0},\ldots , i_{m}]\ne \varnothing \), we have

$$\begin{aligned} C^{-1} \le \frac{\mu \left( [n;j_0,\ldots ,j_{k}]\cap [l;i_{0},\ldots , i_{m}] \right) }{\mu \left( [n;j_0,\ldots ,j_{k}] \right) \cdot \mu \left( [l;i_{0},\ldots ,i_{m}] \right) } \le C. \end{aligned}$$
(2.13)

Note that by T-invariance of \(\mu \) and by the definition of \(\mu ^\pm \), \(\mu \) has the bounded distortion property if and only if \(\mu ^+\) or \(\mu ^-\) has the bounded distortion property. For instance, the bounded distortion property of \(\mu ^+\) means that for all \(n\ge 0\), \(l>n+k\), and \([n;j_0,\ldots ,j_{k}]^+\cap [l; i_{0},\ldots , i_{m}]^+\ne \varnothing \), we have

$$\begin{aligned} C^{-1} \le \frac{\mu ^+ \left( [n;j_0,\ldots ,j_{k}]^+\cap {[}l;i_{0},\ldots , i_{m}]^+ \right) }{\mu ^+ \left( {[}n;j_0,\ldots ,j_{k}]^+ \right) \cdot \mu ^+ \left( {[}l;i_{0},\ldots ,i_{m}]^+ \right) } \le C. \end{aligned}$$
(2.14)

In fact, given any subshift of finite type, the unique equilibrium state associated with a Hölder continuous potential always has the bounded distortion property; see Lemma 3.4.
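For a concrete feeling for (2.13), one may look at a stationary Markov measure, where the ratio can be computed in closed form. The Python sketch below is an illustration under assumptions that are not part of the text above: the measure is given by a stochastic matrix `P` with stationary probability vector `pi`, symbols are indexed from 0, and all function names are hypothetical. In this setting the ratio in (2.13) reduces to \((P^{g})_{j_k i_0}/\pi _{i_0}\), where \(g=l-(n+k)\ge 1\) is the gap between the two cylinders.

```python
def cylinder_measure(pi, P, word):
    """Measure of a cylinder [n; word] under the stationary Markov measure.

    By stationarity the value does not depend on the position n.
    """
    m = pi[word[0]]
    for a, b in zip(word, word[1:]):
        m *= P[a][b]
    return m

def mat_power(P, g):
    """The g-th power of the square matrix P, for an integer g >= 0."""
    size = len(P)
    R = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(g):
        R = [
            [sum(R[i][k] * P[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)
        ]
    return R

def distortion_ratio(pi, P, first, second, gap):
    """The ratio in (2.13) for [n; first] and [l; second], gap = l - (n + k)."""
    m_first = cylinder_measure(pi, P, first)
    m_second = cylinder_measure(pi, P, second)
    # Probability of going from the last symbol of `first` to the first
    # symbol of `second` in `gap` steps.
    bridge = mat_power(P, gap)[first[-1]][second[0]]
    joint = m_first * bridge * m_second / pi[second[0]]
    return joint / (m_first * m_second)
```

For a Bernoulli measure (all rows of `P` equal to `pi`) the ratio is identically 1, so one may take \(C=1\). For a matrix `P` with strictly positive entries the ratio is bounded above and below by positive constants, uniformly in the gap and in the words.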

It is not difficult to see that every measure satisfying the bounded distortion property has a local product structure. Indeed, for every cylinder \([-k;j_{-k},\ldots ,j_{-1},j_{0},\ldots , j_{m}]\subset \Omega \), we have by (2.13)

$$\begin{aligned} (\mu ^-_{j_0} \times&\mu ^+_{j_0}) \big ([-k;j_{-k},\ldots ,j_{-1},j_{0},\ldots , j_{m}]\big ) \\&= \mu ^-_{j_0}\big ([-k;j_{-k},\ldots ,j_{-1},j_{0}]^-\big )\cdot \mu ^+_{j_0} \big ([0;j_{0},\ldots , j_{m}]^+\big )\\&= \mu \big ([-k;j_{-k},\ldots ,j_{-1},j_{0}]\big )\cdot \mu \big ([0;j_{0},\ldots , j_{m}]\big )\\&\le \mu \big ([-k;j_{-k},\ldots ,j_{-1}]\big )\cdot \mu \big ([0;j_{0},\ldots , j_{m}]\big )\\&\le C \mu \big ([-k;j_{-k},\ldots ,j_{-1},j_{0},\ldots , j_{m}]\big )\\&= C \mu _{j_0}\big ([-k;j_{-k},\ldots ,j_{-1},j_{0},\ldots , j_{m}]\big ). \end{aligned}$$

Similarly, we can obtain such estimates for all other cylinders. Since every Borel set can be approximated by cylinder sets, these estimates clearly imply that \(\mu ^-_j\times \mu ^+_j\) is absolutely continuous with respect to \(\mu _j\). On the other hand,

$$\begin{aligned} \mu _{j_0}&\big ([-k; j_{-k},\ldots ,j_{-1},j_{0},\ldots , j_{m}]\big )\\&= \mu \big ([-k;j_{-k},\ldots ,j_{-1},j_{0},\ldots , j_{m}]\big )\\&\le C \mu ^-_{j_0}\big ([-k;j_{-k},\ldots ,j_{-1}]^-\big )\cdot \mu ^+_{j_0} \big ([0;j_{0},\ldots , j_{m}]^+\big )\\&\le \frac{C^2}{\mu ^-_{j_0}([0;j_0]^-)} \mu ^-_{j_0}\big ([-k;j_{-k}, \ldots ,j_{-1},j_{0}]^-\big )\cdot \mu ^+_{j_0}\big ([0;j_{0},\ldots , j_{m}]^+\big )\\&\le {\widetilde{C}}(\mu ^-_{j_0} \times \mu ^+_{j_0}) \big ([-k;j_{-k},\ldots ,j_{-1},j_{0},\ldots , j_{m}]\big ) , \end{aligned}$$

where \({\widetilde{C}}=\max \{ \frac{C^2}{\mu ([0;j])}: 1\le j\le \ell \}\) is independent of the choice of the cylinder sets. Note that in the fourth line above we used the bounded distortion property of \(\mu ^-\) in the form

$$\begin{aligned} \mu ^-_{j_0}\big ([-k;j_{-k},\ldots ,j_{-1}]^-\big )\cdot \mu ^-_{j_0}([0;j_0]^-)\le C \mu ^-_{j_0}\big ([-k;j_{-k},\ldots ,j_{-1},j_{0}]^-\big ). \end{aligned}$$

The chain of inequalities above implies that \(\mu _j\) is absolutely continuous with respect to \(\mu ^-_j\times \mu ^+_j\) for each \(1\le j\le \ell \).

2.1.3 \(\mathrm {SL}(2,{{\mathbb {R}}})\)-cocycles and their projectivization

A continuous map \(A : \Omega \rightarrow \mathrm {SL}(2,{{\mathbb {R}}})\) gives rise to the cocycle \((T,A) : \Omega \times {{\mathbb {R}}}^2 \rightarrow \Omega \times {{\mathbb {R}}}^2\), \((\omega , v) \mapsto (T \omega , A(\omega ) v)\). For \(n \in {{\mathbb {Z}}}\), we let \((T,A)^n = (T^n , A_n)\). In particular, we have

$$\begin{aligned} A_n(\omega )= {\left\{ \begin{array}{ll} A(T^{n-1}\omega ) \cdots A(\omega ), &{} n\ge 1;\\ I_2, &{}n=0;\\ {[}A_{-n}(T^n\omega )]^{-1}, &{}n\le -1, \end{array}\right. } \end{aligned}$$

where \(I_2\) is the identity matrix. Now let \(\mu \) be a T-ergodic probability measure with topological support equal to \(\Omega \). The Lyapunov exponent is given by

$$\begin{aligned} L(A,\mu )&= \lim _{n \rightarrow \infty } \frac{1}{n} \int \log \Vert A_n(\omega )\Vert \, d\mu (\omega ) \\&= \inf _{n \ge 1} \frac{1}{n} \int \log \Vert A_n(\omega )\Vert \, d\mu (\omega ). \end{aligned}$$

By Kingman’s subadditive ergodic theorem, we have

$$\begin{aligned} \lim _{n \rightarrow \infty } \frac{1}{n}\log \Vert A_n(\omega )\Vert = L(A,\mu ) \end{aligned}$$

for \(\mu \)-a.e. \(\omega \). By linearity and invertibility of each \(A(\omega )\), we can projectivize the second component and consider \((T,A) : \Omega \times {{\mathbb {R}}}{\mathbb {P}}^1 \rightarrow \Omega \times {{\mathbb {R}}}{\mathbb {P}}^1\).
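The definition of \(A_n(\omega )\) for positive and negative n, and the finite-n quantity \(\frac{1}{n}\log \Vert A_n(\omega )\Vert \), can be sketched numerically as follows. For concreteness we use a one-step matrix of Schrödinger type and encode the values \(f(T^k\omega )\) along the orbit of a periodic point as a periodic list `V`; these choices, and all function names, are assumptions made for illustration only.

```python
import numpy as np

def A(E, v):
    """One-step Schrödinger matrix for energy E and potential value v = f(omega)."""
    return np.array([[E - v, -1.0], [1.0, 0.0]])

def A_n(E, V, n, start=0):
    """A_n(T^start omega), where the p-periodic list V encodes V[k % p] = f(T^k omega)."""
    p = len(V)
    if n < 0:
        # A_n(omega) = [A_{-n}(T^n omega)]^{-1} for n <= -1
        return np.linalg.inv(A_n(E, V, -n, start + n))
    M = np.eye(2)
    for k in range(start, start + n):
        M = A(E, V[k % p]) @ M      # the newest factor goes on the left
    return M

def finite_lyapunov(E, V, n):
    """(1/n) log ||A_n(omega)||, a finite-n proxy for the Lyapunov exponent."""
    return np.log(np.linalg.norm(A_n(E, V, n), 2)) / n

V = [0.5, -1.0, 2.0]                 # hypothetical potential values along a period-3 orbit
print(finite_lyapunov(0.3, V, 60))
```

For a constant potential value v and \(|E-v|>2\), the limit of this quantity is \(\mathrm {arccosh}(|E-v|/2)\), which gives a convenient check. For very large n the entries of the product may overflow in floating point, so a production computation would renormalize along the way.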

2.1.4 Reduction to a topologically mixing subshift

We need to reduce to the case where \(T:\Omega \rightarrow \Omega \) is topologically mixing and collect some standard facts. One may find a detailed discussion of the results stated in this section in [31, Section 1.9].

One says that \((\Omega , T)\) is topologically mixing if for any pair of nonempty open sets \(U, V\subset \Omega \), there is an \(N\ge 1\) such that \(T^n(U)\cap V\ne \varnothing \) for all \(n\ge N\).

Note that a general subshift of finite type \((\Omega , T)\) might not be topologically mixing. But any subshift of finite type that has a dense positive semiorbit has the following decomposition. By the spectral decomposition theorem for hyperbolic basic sets, we can decompose \(\Omega \) as \(\Omega = \bigsqcup ^{s}_{l=1}\Omega _l\) for some \(s\ge 1\) and for closed subsets \(\Omega _l\), so that the following holds true: \(T(\Omega _l)=\Omega _{l+1}\) for \(1 \le l < s\) and \(T(\Omega _s)=\Omega _1\), and \(T^s|_{\Omega _l}\) is a topologically mixing subshift of finite type for each \(1\le l\le s\). In particular, if \((\Omega ,T)\) has a fully supported ergodic measure, then \((\Omega , T)\) has such a decomposition. Moreover, the normalized restriction \(\mu _l\) of \(\mu \) to \(\Omega _l\) is a \(T^s\)-invariant ergodic, fully supported probability measure with local product structure or bounded distortion property, provided the same property is true for \(\mu \) on \(\Omega \). One may also see [17, Section 3.2] for such facts.

Then for a cocycle map \(A : \Omega \rightarrow \mathrm {SL}(2,{{\mathbb {R}}})\), we consider the restriction of the s-step iterate to \(\Omega _l\), that is, \(A_s : \Omega _l \rightarrow \mathrm {SL}(2,{{\mathbb {R}}})\), \(\omega \mapsto A_s(\omega )\), which may be viewed as a cocycle map over the base dynamics \(T^s : \Omega _l \rightarrow \Omega _l\). Clearly, \(L(A_s, \mu _l) > 0\) for some \(1 \le l \le s\) implies that \(L(A,\mu ) > 0\). Since the present paper is only concerned with the positivity of the Lyapunov exponent, we assume from now on that \((\Omega , T)\) is topologically mixing.

Note that \(\mathrm {supp}(\mu )=\Omega \) and ergodicity of \(\mu \) together already imply that \(\overline{\mathrm {Orb}(\omega )}=\Omega \) for \(\mu \)-almost every \(\omega \in \Omega \).

Topological mixing has additional consequences, which are needed in the present paper. First, it implies that the set of periodic orbits is dense in \(\Omega \). Moreover, we have the following more quantitative behavior of periodic points, which is called the specification property. It concerns shadowing finite pieces of segments of different orbits by a single orbit, in particular, by a periodic orbit. It was first introduced by Bowen [13]. The following version for subshifts of finite type is due to Sigmund [37]. For \(a < b \in {{\mathbb {Z}}}\), we let \([a,b] \subset {{\mathbb {Z}}}\) denote the indicated interval of integers. In other words, \([a,b] = \{ n \in {{\mathbb {Z}}}: a \le n \le b \}\).

Proposition 2.7

Let \((\Omega ,T)\) be a topologically mixing subshift of finite type. For each \(\epsilon > 0\), there is an integer \(r = r(\epsilon )>0\) such that for any choice of points \(p^{(i)} \in \Omega \) and intervals of integers \(I_i = [a_i,b_i]\), \(i=1,2\), with \(a_2 - b_1 > r\) and any \(n > b_2 - a_1 + r\), there exists a periodic point p with period n such that

$$\begin{aligned} d(T^j p, T^j p^{(i)}) < \epsilon \text{ for } j \in I_i,\ i=1,2. \end{aligned}$$
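In symbolic terms, the specification property amounts to gluing two admissible words by admissible connecting words of prescribed lengths and closing up the result periodically. The following sketch illustrates this for a subshift given by a 0/1 transition matrix `B`. It is a simplified illustration under our own assumptions: closeness in the metric d is replaced by exact agreement of symbols on the two windows, and the function names `path` and `glue` are hypothetical.

```python
import numpy as np

def path(B, a, b, steps):
    """An admissible path a = x_0, ..., x_steps = b (B[x_i, x_{i+1}] = 1), or None if none exists."""
    k = len(B)
    # reach[t][x] is True when b can be reached from x in exactly t steps
    reach = [np.zeros(k, dtype=bool) for _ in range(steps + 1)]
    reach[0][b] = True
    for t in range(1, steps + 1):
        reach[t] = (B @ reach[t - 1].astype(int)) > 0
    if not reach[steps][a]:
        return None
    x, out = a, [a]
    for t in range(steps, 0, -1):
        x = next(y for y in range(k) if B[x, y] and reach[t - 1][y])
        out.append(x)
    return out

def glue(B, w1, w2, g1, g2):
    """Period word w1 + filler + w2 + filler of a periodic point that repeats w1 and w2 exactly.
    g1 (resp. g2) is the number of steps from the end of w1 to the start of w2 (resp. back)."""
    f1 = path(B, w1[-1], w2[0], g1)
    f2 = path(B, w2[-1], w1[0], g2)
    if f1 is None or f2 is None:
        return None
    return w1 + f1[1:-1] + w2 + f2[1:-1]

B = np.array([[1, 1], [1, 0]])       # golden mean shift: the word 11 is forbidden
print(glue(B, [0, 0, 1], [0, 1, 0], 2, 3))
```

When some power of `B` is strictly positive, connecting paths exist for every sufficiently large number of steps, which is the symbolic counterpart of the gap \(r(\epsilon )\) in Proposition 2.7.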

2.1.5 Stable and unstable holonomies

Given \((\Omega ,T,\mu )\) as above, consider \(A : \Omega \rightarrow \mathrm {SL}(2,{{\mathbb {R}}})\) and the projective cocycle \((T,A) : \Omega \times {{\mathbb {R}}}{\mathbb {P}}^1 \rightarrow \Omega \times {{\mathbb {R}}}{\mathbb {P}}^1\). We will denote the fiber \(\{ \omega \} \times {{\mathbb {R}}}{\mathbb {P}}^1\) by \({\mathcal {E}}_\omega \).

Definition 2.8

A stable holonomy \(h^s\) for A is a family of homeomorphisms \(h^s_{\omega , \omega '} : {\mathcal {E}}_\omega \rightarrow {\mathcal {E}}_{\omega '}\), defined whenever \(\omega \) and \(\omega '\) belong to the same local stable set, satisfying the following properties:

  1. (i)

    \(h^s_{\omega ' , \omega ''} \circ h^s_{\omega , \omega '} = h^s_{\omega , \omega ''}\) and \(h^s_{\omega , \omega } = \mathrm {id}\),

  2. (ii)

    \(A(\omega ') \circ h^s_{\omega , \omega '} = h^s_{T \omega , T \omega '} \circ A(\omega )\),

  3. (iii)

    \((\omega , \omega ') \mapsto h^s_{\omega , \omega '}(\phi )\) is continuous when \(\omega , \omega '\) belong to the same local stable set, uniformly in \(\phi \).

An unstable holonomy \(h^u_{\omega , \omega '} : {\mathcal {E}}_\omega \rightarrow {\mathcal {E}}_{\omega '}\) is defined analogously for pairs of points in the same local unstable set.

By property (i), we have \(h^\tau _{\omega ,\omega '}=(h^\tau _{\omega ',\omega })^{-1}\) for any \(\omega '\in W^\tau _{\mathrm {loc}}(\omega )\), where \(\tau \in \{s,u\}\).

These projective holonomies \(h^s_{\omega , \omega '}, h^u_{\omega , \omega '}\) typically arise via projectivization of \(H^s_{\omega , \omega '}, H^u_{\omega , \omega '} \in \mathrm {SL}(2,{{\mathbb {R}}})\) that are obtained as follows:

$$\begin{aligned} H^s_{\omega ,\omega '}&= \lim _{n \rightarrow \infty } A_n(\omega ')^{-1} A_n(\omega ), \\ H^u_{\omega ,\omega '}&= \lim _{n \rightarrow \infty } A_{-n}(\omega ')^{-1} A_{-n}(\omega ) \end{aligned}$$
(2.15)

for \(\omega , \omega '\) in the same stable (resp., unstable) set. Conditions need to be placed on the cocycle to ensure convergence in (2.15); see, for example, the proof of Lemma 4.2. The analogues of the properties (i)–(iii) for \(H^s_{\omega , \omega '}, H^u_{\omega , \omega '}\) follow directly from the construction and this in turn implies (i)–(iii) for \(h^s_{\omega , \omega '}, h^u_{\omega , \omega '}\) by projectivization. Holonomies that arise from (2.15) are called canonical holonomies of A.

2.1.6 Invariant measures of projective cocycles

Consider a projective cocycle \((T,A) : \Omega \times {{\mathbb {R}}}{\mathbb {P}}^1 \rightarrow \Omega \times {{\mathbb {R}}}{\mathbb {P}}^1\) that has stable and unstable holonomies.

Definition 2.9

Suppose we are given a (T, A)-invariant probability measure m on \(\Omega \times {{\mathbb {R}}}{\mathbb {P}}^1\) that projects to \(\mu \) in the first component. A disintegration of m along the fibers is a measurable family \(\{m_\omega : \omega \in \Omega \}\) of conditional probabilities on \({{\mathbb {R}}}{\mathbb {P}}^1\) such that \(m = \int m_\omega \, d\mu (\omega )\), that is,

$$\begin{aligned} m(D)=\int _\Omega m_\omega (\{z\in {{\mathbb {R}}}{\mathbb {P}}^1:(\omega ,z)\in D\}) \, d\mu (\omega ) \end{aligned}$$
(2.16)

for each measurable set \(D\subset \Omega \times {{\mathbb {R}}}{\mathbb {P}}^1\).

By Rokhlin’s disintegration theorem, such a disintegration exists. Moreover, \(\{{\tilde{m}}_\omega :\omega \in \Omega \}\) is another disintegration of m if and only if \(m_\omega ={\tilde{m}}_\omega \) for \(\mu \)-almost every \(\omega \in \Omega \). By a straightforward calculation one checks that \(\{ A(\omega )_* m_{\omega } : \omega \in \Omega \}\) is a disintegration of \((T,A)_*m\), where \(A(\omega )_*m_{\omega }\) is the measure on the fiber \(\{T\omega \}\times {\mathbb {R}}{\mathbb {P}}^1\). In particular, the (T, A)-invariance of m implies \(A(\omega )_* m_\omega = m_{T\omega }\) for \(\mu \)-almost every \(\omega \in \Omega \). Conversely, if \(\{{\tilde{m}}_\omega : \omega \in \Omega \}\) is a family of probability measures where \({\tilde{m}}_\omega \) is defined on \(\{\omega \}\times {{\mathbb {R}}}{\mathbb {P}}^1\), then we may define a measure \({\tilde{m}}\) on \(\Omega \times {{\mathbb {R}}}{\mathbb {P}}^1\) via the right side of (2.16) by replacing \(m_\omega \) with \({\tilde{m}}_\omega \). Then \({\tilde{m}}\) is (T, A)-invariant if \(A(\omega )_*{\tilde{m}}_\omega ={\tilde{m}}_{T\omega }\) for \(\mu \)-a.e. \(\omega \).

We say m is an s-state (resp., a u-state) if it is in addition invariant under the stable (resp., unstable) holonomies. That is, the disintegration \(\{m_\omega :\omega \in \Omega \}\) satisfies that \((h^s_{\omega ,\omega '})_* m_\omega = m_{\omega '}\) for \(\mu \)-almost every \(\omega \in \Omega \) and for every \(\omega '\in W^s_{\mathrm {loc}}(\omega )\) (resp., \((h^u_{\omega ,\omega '})_* m_\omega = m_{\omega '}\) for \(\mu \)-almost every \(\omega \in \Omega \) and for every \(\omega '\in W^u_{\mathrm {loc}}(\omega )\)). In this case, we say that \(\{m_\omega \}\) is s-invariant (resp. u-invariant). A measure that is both an s-state and a u-state is called an su-state.

2.1.7 Schrödinger operators and cocycles

In this subsection let us initially assume that \(\Omega \) is a compact metric space, \(T:\Omega \rightarrow \Omega \) is a homeomorphism, and \(f : \Omega \rightarrow {{\mathbb {R}}}\) is continuous. We consider potentials \(V_\omega : {{\mathbb {Z}}}\rightarrow {{\mathbb {R}}}\) defined by \(V_\omega (n) = f(T^n \omega )\) for \(\omega \in \Omega \) and \(n \in {{\mathbb {Z}}}\), and associated Schrödinger operators \(H_\omega \) in \(\ell ^2({{\mathbb {Z}}})\) acting by

$$\begin{aligned} {[}H_\omega \psi ](n) = \psi (n+1) + \psi (n-1) + V_\omega (n) \psi (n). \end{aligned}$$

The spectrum \(\sigma (H_\omega )\) is defined as

$$\begin{aligned} \sigma (H_\omega )=\{E\in {{\mathbb {C}}}: H_\omega -E \text{ does } \text{ not } \text{ have } \text{ a } \text{ bounded } \text{ inverse }\}. \end{aligned}$$

For a subset S of a metric space (X, d) and \(\delta > 0\), the open \(\delta \)-neighborhood of S is given by \(B_\delta (S)=\{x\in X: d(x,s)<\delta \text{ for } \text{ some } s\in S\}\). In particular, \(B_\delta (x)\) denotes the open ball centered at the point \(x\in X\). We need the following uniform estimate that relates the spectrum \(\sigma (H_\omega )\) with the orbit \(\mathrm {Orb}(\omega )=\{T^n(\omega ),\ n\in {{\mathbb {Z}}}\}\); see, for example, [43, Theorem 6].

Proposition 2.10

For each \(\varepsilon >0\), there exists a \(\delta >0\), depending on \(\varepsilon \) only, so that the following holds true. If the orbit \(\mathrm {Orb}(\omega _0)\) of some \(\omega _0\in \Omega \) satisfies

$$\begin{aligned} \mathrm {Orb}(\omega _0)\cap B_\delta (\omega )\ne \varnothing \end{aligned}$$

for some \(\omega \in \Omega \), then

$$\begin{aligned} \sigma (H_\omega )\subset B_\varepsilon [\sigma (H_{\omega _0})]. \end{aligned}$$

Proposition 2.10 implies that if \(\mathrm {Orb}(\omega _0)\) is dense in \(\Omega \), then \(\sigma (H_\omega ) \subseteq \sigma (H_{\omega _0})\) for all \(\omega \in \Omega \). In this case, we set

$$\begin{aligned} \Sigma =\sigma (H_{\omega _0}). \end{aligned}$$

Let us now return to the main scenario of this paper, where T is a topologically mixing shift operator on a subshift of finite type \(\Omega \) with an ergodic measure \(\mu \) satisfying \(\mathrm {supp}(\mu ) = \Omega \). Let \(\mathrm {Per}(T)\) be the set of periodic points of T. Recall that \(\overline{\mathrm {Per}(T)} = \Omega \) and that \(\overline{\mathrm {Orb}(\omega )} = \Omega \) for \(\mu \)-almost every \(\omega \). These facts together with Proposition 2.10 imply for \(\mu \)-almost every \(\omega \) that

$$\begin{aligned} \Sigma = \sigma (H_\omega ) = \overline{\bigcup _{\omega _p \in \mathrm {Per}(T)}\sigma (H_{\omega _p})}. \end{aligned}$$
(2.17)

Spectral properties of the operators \(H_\omega \) can be investigated in terms of the behavior of the solutions to the difference equation

$$\begin{aligned} u(n+1) + u(n-1) + V_\omega (n) u(n) = E u(n), \quad n \in {{\mathbb {Z}}}, \end{aligned}$$
(2.18)

with E real or complex (depending on the problem in question). These solutions in turn can be described with the help of the Schrödinger cocycle \((T,A^E)\) with the cocycle map \(A^{E}:\Omega \rightarrow {\mathrm {SL}}(2,{{\mathbb {R}}})\) (resp., \({\mathrm {SL}}(2,{{\mathbb {C}}})\) when \(E \in {{\mathbb {C}}}{\setminus } {{\mathbb {R}}}\)) being defined as

$$\begin{aligned} A^E(\omega ) = A^{(E-f)}(\omega ):= \begin{pmatrix} E-f(\omega ) &{} -1\\ 1 &{} 0 \end{pmatrix}, \end{aligned}$$

where we often leave the dependence on \(f:\Omega \rightarrow {{\mathbb {R}}}\) implicit as it will be fixed most of the time. Such cocycles describe the transfer matrices associated with Schrödinger operators with dynamically defined potentials. Specifically, u solves (2.18) if and only if

$$\begin{aligned} \begin{pmatrix} u(n) \\ u(n-1) \end{pmatrix} = A^{E}_n(\omega ) \begin{pmatrix} u(0) \\ u(-1) \end{pmatrix}, \quad n \in {{\mathbb {Z}}}. \end{aligned}$$
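The equivalence between the difference equation (2.18) and the transfer matrix formulation can be checked directly for \(n \ge 0\). The sketch below iterates the recursion \(u(n+1) = (E - V_\omega (n)) u(n) - u(n-1)\) and compares the result with the matrix product; the potential `V` (a period-3 sequence) and the initial data are hypothetical values chosen only for illustration.

```python
import numpy as np

def solve_forward(V, E, u0, um1, N):
    """Iterate u(n+1) = (E - V(n)) u(n) - u(n-1); returns the dict n -> u(n) for n = -1..N."""
    u = {-1: um1, 0: u0}
    for n in range(N):
        u[n + 1] = (E - V(n)) * u[n] - u[n - 1]
    return u

def transfer(V, E, n):
    """A^E_n = A(T^{n-1} omega) ... A(omega) for n >= 0, with V(k) = f(T^k omega)."""
    M = np.eye(2)
    for k in range(n):
        M = np.array([[E - V(k), -1.0], [1.0, 0.0]]) @ M
    return M

V = lambda n: [1.0, -0.5, 0.25][n % 3]   # hypothetical periodic potential
E, u0, um1 = 0.7, 1.0, -2.0
u = solve_forward(V, E, u0, um1, 10)
for n in range(11):
    lhs = np.array([u[n], u[n - 1]])
    rhs = transfer(V, E, n) @ np.array([u0, um1])
    print(n, np.allclose(lhs, rhs))
```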

For the Schrödinger cocycle \((T,A^E)\), we set \(L(E)=L(A^E,\mu )\). One of the main questions in the spectral analysis of the ergodic family of Schrödinger operators \(\{H_\omega \}_{\omega \in \Omega }\) (with respect to the ergodic measure \(\mu \)) is for how many \(E \in \Sigma \) we have \(L(E) > 0\).

2.2 Periodic potentials

A periodic point \(\omega \) of T gives rise to a periodic potential, that is, if \(T^p \omega = \omega \), then \(V_{\omega }(n + p) = V_{\omega }(n)\) for every \(n \in {{\mathbb {Z}}}\). Since much of our work below will involve the study of periodic points and the associated potentials, let us recall some basic properties of Schrödinger operators with periodic potentials; see [24, 38] for proofs of the results stated in this subsection.

Consider a Schrödinger operator

$$\begin{aligned} {[}H \psi ](n) = \psi (n+1) + \psi (n-1) + V(n) \psi (n) \end{aligned}$$

in \(\ell ^2({{\mathbb {Z}}})\) with a p-periodic potential, \(V(n + p) = V(n)\) for every \(n \in {{\mathbb {Z}}}\). Define, for \(E \in {{\mathbb {C}}}\), the monodromy matrix

$$\begin{aligned} M(E) =\prod ^{0}_{j=p-1} \begin{pmatrix} E - V(j) &{} -1 \\ 1 &{} 0 \end{pmatrix} \end{aligned}$$

and the discriminant \(\Delta (E) = {\mathrm {Tr}}(M(E))\), where \({\mathrm {Tr}}(B)\) is the trace of B. The function \(\Delta (\cdot )\) is a monic polynomial of degree p.

Proposition 2.11

The set \(\Delta ^{-1}((-2,2))\) consists of p disjoint open intervals and on each of them, \(\Delta \) is strictly monotone. Moreover, \(\sigma (H) = \overline{\Delta ^{-1}((-2,2))} = \Delta ^{-1}([-2,2])\).

This shows that the spectrum of H consists of a finite union of closed intervals and, in fact, the number of connected components of the spectrum is bounded by the period of the potential. This suggests an interesting inverse problem. Suppose we are given a set that has such a form, that is, it has finitely many connected components, each being a closed interval. Suppose further that we know that the set is the spectrum of a periodic Schrödinger operator. Can we say anything about the period of the potential?
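The band structure described by Proposition 2.11 can be explored numerically. The following sketch builds the discriminant \(\Delta \) as a polynomial in E by multiplying the one-step matrices symbolically, and then reads off the band edges as the sorted real roots of \(\Delta = \pm 2\). The function names and the sample potential are our own illustrative choices.

```python
import numpy as np
from numpy.polynomial import polynomial as Poly

def discriminant_poly(V):
    """Coefficients (lowest degree first) of Delta(E) = Tr M(E) for the periodic potential V[0..p-1]."""
    # Entries of the running 2x2 product, stored as polynomials in E
    M = [[np.array([1.0]), np.array([0.0])],
         [np.array([0.0]), np.array([1.0])]]
    for v in V:  # multiply on the left by [[E - v, -1], [1, 0]]
        a = np.array([-v, 1.0])
        top = [Poly.polysub(Poly.polymul(a, M[0][j]), M[1][j]) for j in range(2)]
        M = [top, [M[0][0], M[0][1]]]
    return Poly.polyadd(M[0][0], M[1][1])

def bands(V):
    """The p bands of Delta^{-1}([-2, 2]) as (left, right) pairs; touching bands are not merged."""
    d = discriminant_poly(V)
    edges = np.concatenate([Poly.polyroots(Poly.polysub(d, [2.0])),
                            Poly.polyroots(Poly.polyadd(d, [2.0]))])
    edges = np.sort(np.real(edges))
    return [(edges[2 * k], edges[2 * k + 1]) for k in range(len(V))]

print(bands([1.0, -1.0]))   # Delta(E) = E^2 - 3 for this period-2 potential
```

For the period-2 potential with values 1 and -1 one has \(\Delta (E) = E^2 - 3\), so the two bands are \([-\sqrt{5},-1]\) and \([1,\sqrt{5}]\), in agreement with the bound on the number of components.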

Proposition 2.12

Suppose \(V : {{\mathbb {Z}}}\rightarrow {{\mathbb {R}}}\) is periodic. Denote the spectrum of the associated Schrödinger operator by \(\sigma \).

  1. (a)

    For a probability measure m on \(\sigma \), consider its potential energy

    $$\begin{aligned} {\mathcal {E}}(m) = \iint \log \left( |E - E'|^{-1} \right) \, dm(E) \, dm(E') \in {{\mathbb {R}}}\cup \{ \infty \}. \end{aligned}$$
    (2.19)

    Then there is a unique measure, \(m_\sigma \), which minimizes the potential energy among all probability measures on \(\sigma \), and in fact \({\mathcal {E}} (m_\sigma ) = 0\).

  2. (b)

    The measure \(m_\sigma \) assigns rational weight to each connected component of \(\sigma \).

  3. (c)

    The potential V is p-periodic if and only if the weight of each connected component of \(\sigma \) with respect to \(m_\sigma \) is an integer multiple of \(\frac{1}{p}\).

This result shows that the shape of the spectrum of a periodic Schrödinger operator determines the period of the potential. An immediate consequence is the fact that the spectrum of a periodic Schrödinger operator is connected if and only if the period is one, that is, the potential is constant. Another characterization of constant potentials is the following:

Proposition 2.13

Suppose \(V : {{\mathbb {Z}}}\rightarrow {{\mathbb {R}}}\) is periodic. Then the spectrum \(\sigma \) of the associated Schrödinger operator has Lebesgue measure at most 4. Moreover, the Lebesgue measure of \(\sigma \) is equal to 4 if and only if V is constant.
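Proposition 2.13 lends itself to a quick numerical check. The sketch below estimates the Lebesgue measure of \(\{E : |\Delta (E)| \le 2\}\) by a Riemann sum on a grid; it uses the standard fact that the spectrum is contained in \([\min V - 2, \max V + 2]\). The grid size and the function names are assumptions made for illustration, and the result is accurate only up to the grid spacing.

```python
import numpy as np

def discriminant(V, Es):
    """Delta(E) = Tr M(E) evaluated on an array of energies Es."""
    # Track the entries of the running product [[a, b], [c, d]] as arrays over the grid
    a, b = np.ones_like(Es), np.zeros_like(Es)
    c, d = np.zeros_like(Es), np.ones_like(Es)
    for v in V:  # multiply on the left by [[E - v, -1], [1, 0]]
        a, b, c, d = (Es - v) * a - c, (Es - v) * b - d, a, b
    return a + d

def spectrum_length(V, grid=20001):
    """Riemann-sum estimate of the Lebesgue measure of {E : |Delta(E)| <= 2}."""
    lo, hi = min(V) - 2.0, max(V) + 2.0
    Es = np.linspace(lo, hi, grid)
    inside = np.abs(discriminant(V, Es)) <= 2.0
    return inside.mean() * (hi - lo)

print(spectrum_length([0.3]))         # constant potential: close to 4
print(spectrum_length([1.0, -1.0]))   # close to 2 (sqrt(5) - 1), strictly less than 4
```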

Finally, we note the following standard facts. For each \(E\in {{\mathbb {C}}}\) such that \(\Delta (E)\ne \pm 2\), there are exactly two eigendirections s(E) and u(E) in \({{\mathbb {C}}}{\mathbb {P}}^1\) of the monodromy matrix M(E), which are actually the so-called Weyl–Titchmarsh m-functions associated with the operator. Moreover, \(s(E)\ne u(E)\) are real if and only if \(E\in {{\mathbb {R}}}{\setminus } \sigma (H_V)\), and they are the stable and unstable directions of the real hyperbolic matrix M(E). Here we always set s(E) to be the stable direction and u(E) to be the unstable direction. If E is in the upper or lower half plane or is such that \(E\in {{\mathbb {R}}}\) and \(|\Delta (E)|<2\), then s(E) and u(E) are not real. In the latter case, we have \(s(E)=\overline{u(E)}\). For \(\Delta (E) = \pm 2\), we let \(I \subseteq {{\mathbb {R}}}\) be the connected component of \(\sigma (H_V)\) containing E. If E belongs to the boundary of I, then M(E) has a unique real invariant direction. We may think of this case as \(s(E) = u(E)\). If E is a point at which a spectral gap is collapsed (or, in other words, at which the closures of two different components of \(\Delta ^{-1}((-2,2))\) touch), then \(M(E) = \pm I_2\), in which case all directions are invariant.
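For real E, the dichotomy just described can be observed numerically. The sketch below computes the eigenvalues and eigendirections of M(E), representing a direction spanned by \((v_1,v_2)^\top \) by the ratio \(v_1/v_2\); this affine chart on the projective line, the ordering by modulus, and the function names are our own conventions for illustration.

```python
import numpy as np

def monodromy(V, E):
    """M(E) = A(V[p-1]) ... A(V[0]) with A(v) = [[E - v, -1], [1, 0]]."""
    M = np.eye(2)
    for v in V:
        M = np.array([[E - v, -1.0], [1.0, 0.0]]) @ M
    return M

def eigendirections(V, E):
    """Eigenvalues of M(E) sorted by modulus, and the corresponding directions v_1 / v_2.
    Assumes v_2 != 0 for both eigenvectors (the direction (1, 0) is not an eigendirection)."""
    lam, vec = np.linalg.eig(monodromy(V, E))
    order = np.argsort(np.abs(lam))          # contracting eigenvalue first
    dirs = [vec[0, k] / vec[1, k] for k in order]
    return lam[order], dirs

print(eigendirections([0.0], 3.0))   # E in a gap: two distinct real directions (s first, then u)
print(eigendirections([0.0], 1.0))   # E in the spectrum: complex conjugate directions
```

In the first call the eigenvalue of modulus less than one corresponds to the stable direction s(E); in the second call both eigenvalues have modulus one, so the ordering carries no dynamical meaning.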

Based on the description above, we may consider two functions s and u, each of which is holomorphic on the upper half plane \({{\mathbb {H}}}\) and on the lower half plane \({{\mathbb {C}}}{\setminus }{\overline{{{\mathbb {H}}}}}\). When restricted to the real line \({{\mathbb {R}}}\), they both are continuous functions. Moreover, they are analytic on each spectral gap and in the interior of each connected component of \(\sigma (H_V)\). If \(E_0\) is on the boundary of some connected component \(I\subseteq \sigma (H_V)\), then s and u are locally like \(g\big (\sqrt{\pm (E-E_0)}\big )\) near \(E_0\) for some choice of g that is real-analytic near \(E_0\). Here the choice of g depends on s or u, and the sign of \((E-E_0)\) is determined by whether \(E_0\) is the right or left endpoint of I. Moreover, s(E) and u(E) are real only when \(\sqrt{\pm (E-E_0)}\) is real. Thus, we can find an open disk \(D \subseteq {{\mathbb {C}}}\) centered at \(E_0\) and a ramified (at \(E_0\)) double cover \(\pi : {\tilde{D}} \rightarrow D\) of D so that \(s({\tilde{E}})\) and \(u({\tilde{E}})\) are holomorphic in \({\tilde{E}} \in {\tilde{D}}\). Moreover, when \(\pi ({\tilde{E}}) \in D \cap {{\mathbb {R}}}\), \(s({\tilde{E}})\) and \(u({\tilde{E}})\) are real only when \(\sqrt{\pm (\pi ({\tilde{E}})-E_0)}\) is real.

3 Large deviations

The main goal of this section is to prove the following large deviation theorem. Let \(C^\alpha (\Omega ,{{\mathbb {R}}})\), \(0< \alpha \le 1\), be the space of \(\alpha \)-Hölder continuous functions. In other words, \(f\in C^\alpha (\Omega ,{{\mathbb {R}}})\) if there is a \(C>0\) such that

$$\begin{aligned} |f(\omega )-f(\omega ')|\le C\cdot d(\omega ,\omega ')^\alpha \text{ for } \text{ all } \omega ,\omega '\in \Omega . \end{aligned}$$

Note that here \(C^1(\Omega ,{{\mathbb {R}}})\) is the space of Lipschitz continuous functions, not the space of functions with continuous derivatives. Similarly, we can define the space \(C^\alpha (\Omega ^+,{{\mathbb {R}}})\). Throughout this section, \(\mu \), or equivalently \(\mu ^+\), will be assumed to have the bounded distortion property.

Theorem 3.1

Let \((\Omega ,T)\) be a topologically mixing subshift of finite type. Let \(\mu \) be a T-ergodic probability measure that has the bounded distortion property. Let \(f \in C^\alpha (\Omega ,{{\mathbb {R}}})\) for some \(0 < \alpha \le 1\). Then, for each \(\varepsilon >0\), there exist \(C, c > 0\), depending on \(f,\alpha \), and \(\varepsilon \), such that

$$\begin{aligned} \mu \bigg \{ \omega \in \Omega : \bigg | \frac{1}{n} \sum ^{n-1}_{k=0} f(T^k\omega ) - \int _\Omega f \, d\mu \bigg | \ge \varepsilon \bigg \} < Ce^{-cn},\ \forall n\ge 1. \end{aligned}$$
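To get a feeling for the estimate in Theorem 3.1, consider the simplest example: the full shift on two symbols with the Bernoulli \((\frac{1}{2},\frac{1}{2})\) measure (a product measure, so bounded distortion holds with \(C=1\)) and \(f(\omega )=\omega _0\). In this case the measure of the deviation set is a binomial tail and can be computed exactly, and Hoeffding's inequality provides the explicit constants 2 and \(2\varepsilon ^2\). The sketch below compares the two; the function names are hypothetical, and this special case is only an illustration, not the method of proof used in this section.

```python
from fractions import Fraction
from math import comb, exp

def deviation_probability(n, eps):
    """Exact mu{ |S_n f / n - 1/2| >= eps } for f(omega) = omega_0 under the
    Bernoulli(1/2, 1/2) measure on the full 2-shift; eps is a Fraction."""
    # |k/n - 1/2| >= eps is tested in exact arithmetic as |2k - n| >= 2 eps n
    bad = sum(comb(n, k) for k in range(n + 1) if abs(2 * k - n) >= 2 * eps * n)
    return bad / 2 ** n

def hoeffding_bound(n, eps):
    """Hoeffding's inequality: the probability above is at most 2 exp(-2 n eps^2)."""
    return 2.0 * exp(-2.0 * n * float(eps) ** 2)

eps = Fraction(1, 10)
for n in (10, 50, 100, 200, 400):
    print(n, deviation_probability(n, eps), hoeffding_bound(n, eps))
```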

Theorem 3.1 will be a consequence of the following version of large deviations. Recall we have the spaces \((\Omega ^\pm ,T_\pm ,\mu ^\pm )\) of one-sided infinite sequences with nonnegative/nonpositive indices.

Theorem 3.2

Let \((\Omega ^+,T_+,\mu ^+)\) be a topologically mixing one-sided subshift of finite type and suppose that \(\mu ^+\) is \(T_+\)-ergodic and has the bounded distortion property. Let \(f\in C^\alpha (\Omega ^+,{{\mathbb {R}}})\) for some \(0<\alpha \le 1\). Then for each \(\varepsilon >0\), there exist \(C,c>0\), depending on f, \(\alpha \), and \(\varepsilon \), such that

$$\begin{aligned} \mu ^+\bigg \{\omega ^+\in \Omega ^+:\bigg |\frac{1}{n}\sum ^{n-1}_{k=0}f(T_+^k\omega ^+) -\int _{\Omega ^+} f d\mu ^+\bigg |\ge \varepsilon \bigg \}<Ce^{-cn},\ \forall n\ge 1.\nonumber \\ \end{aligned}$$
(3.1)

We first derive Theorem 3.1 from Theorem 3.2. Let us write \(S_nf:=\sum ^{n-1}_{k=0}f\circ T^k\) for the Birkhoff sums. We let \(\varphi (\omega ) = \omega ^{(\omega _0)}\wedge \omega \) which we defined in (2.5). Note that \(\varphi (\omega )\) is continuous on \(\Omega \) and constant on local stable sets.

Proof of Theorem 3.1

Let \(f\in C^\alpha (\Omega ,{{\mathbb {R}}})\). Since f is Hölder continuous and \(\varphi (\omega )\in W^s_{\mathrm {loc}}(\omega )\), a straightforward computation shows that

$$\begin{aligned} h^s(\omega ) := \sum ^{\infty }_{n=0} \big [ f(T^n\omega ) - f(T^n\varphi (\omega )) \big ] \end{aligned}$$

converges uniformly, and hence is continuous. We define

$$\begin{aligned} f^+(\omega ):=f(\omega )+h^s(T\omega )-h^s(\omega ). \end{aligned}$$

In particular, we have

$$\begin{aligned} \int _\Omega f \, d\mu = \int _\Omega f^+ \, d\mu \text{ and } \big \Vert S_n f - S_n f^+ \big \Vert _\infty \le 2\Vert h^s\Vert _\infty , \end{aligned}$$

where \(\Vert \cdot \Vert _\infty \) denotes the supremum norm. It is straightforward to see that

$$\begin{aligned} f^+(\omega ) = f(\varphi (\omega )) + \sum ^{\infty }_{n=0} \big [ f(T^nT\varphi (\omega )) - f(T^n\varphi (T\omega )) \big ], \end{aligned}$$

which implies that \(f^+\) is constant on \(W^s_{\mathrm {loc}}(\omega )\) for all \(\omega \in \Omega \). Moreover, we claim that \(f^+ \in C^{\frac{\alpha }{2}}(\Omega ,{{\mathbb {R}}})\). Indeed, take \(\omega \) and \(\omega ' \in \Omega \). Without loss of generality, we may assume \(N(\omega ,\omega ')\) is large and take \(k = \lfloor \frac{N}{2}\rfloor \). Then we have

$$\begin{aligned}&f^+(\omega ) - f^+(\omega ') \\&\quad = \sum ^{k}_{n=0} \big [f(T^n\varphi (\omega )) - f(T^n\varphi (\omega ')) \big ]\\&\qquad +\sum ^{k-1}_{n=0} \big [ f(T^n\varphi (T\omega ')) - f(T^n\varphi (T\omega )) \big ] \\&\qquad + \sum ^{\infty }_{n=k} \big [ f(T^nT\varphi (\omega )) - f(T^n\varphi (T\omega )) \big ]\\&\qquad - \sum ^{\infty }_{n=k} \big [ f(T^nT\varphi (\omega ')) - f(T^n\varphi (T\omega ')) \big ], \end{aligned}$$

where the absolute values of the first two terms may be bounded by

$$\begin{aligned} C \sum ^{k}_{i=1} e^{-\alpha (N-i)} \le C e^{-\alpha \frac{N}{2}} = C d(\omega ,\omega ')^{\frac{\alpha }{2}}, \end{aligned}$$

and the absolute values of the last two terms may be bounded by

$$\begin{aligned} C e^{-\alpha k} \le C e^{-\alpha \frac{N}{2}} = C d(\omega ,\omega ')^{\frac{\alpha }{2}}. \end{aligned}$$

Thus \(f^+\in C^{\frac{\alpha }{2}}(\Omega ,{{\mathbb {R}}})\). Since \(f^+\) is constant on local stable sets, it descends to a function in \(C^{\frac{\alpha }{2}}(\Omega ^+,{{\mathbb {R}}})\). Abusing notation slightly, let \(f^+\) denote its descended function as well. Clearly, we have \(\int _\Omega f^+ \, d\mu = \int _{\Omega ^+} f^+ \, d\mu ^+\) and \(S_n f^+(\omega ) = S_n f^+(\pi ^+\omega )\). Fix any \(\varepsilon > 0\) and define

$$\begin{aligned} {{\mathcal {B}}}^+_n(\varepsilon ) := \bigg \{ \omega ^+ \in \Omega ^+ : \bigg | \frac{1}{n} S_n f^+(\omega ^+) - \int _{\Omega ^+} f^+ \, d\mu ^+ \bigg | > \varepsilon \bigg \}. \end{aligned}$$

By Theorem 3.2, there are \(C, c > 0\), depending on \(f^+\), \(\alpha \), and \(\varepsilon \), such that

$$\begin{aligned} \mu ^+({{\mathcal {B}}}^+_n(\varepsilon )) < C e^{-cn},\ \forall n\ge 1. \end{aligned}$$

Combining the relations of f and \(f^+\) above, if we choose \(N = N(\varepsilon )\) so that \(4\Vert h^s\Vert _\infty <N\varepsilon \), then we have

$$\begin{aligned} \left\{ \omega \in \Omega : \bigg | \frac{1}{n} S_n f(\omega ) - \int _\Omega f \, d\mu \bigg | > \varepsilon \right\} \subseteq (\pi ^+)^{-1} {{\mathcal {B}}}^+_n(\varepsilon /2),\ \forall n \ge N. \end{aligned}$$

Changing C and c if necessary, we then have for all \(n\ge 1\),

$$\begin{aligned} \mu \left\{ \omega \in \Omega : \bigg | \frac{1}{n} S_n f(\omega ) - \int _\Omega f \, d\mu \bigg | > \varepsilon \right\}&\le \mu [ (\pi ^+)^{-1} {{\mathcal {B}}}^+_n(\varepsilon /2)] \\&= \mu ^+({{\mathcal {B}}}^+_n(\varepsilon /2)) \\&< C e^{-cn}, \end{aligned}$$

as desired. \(\square \)
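The comparison between f and \(f^+\) used in the proof above rests on a telescoping identity: since \(f^+ = f + h^s\circ T - h^s\), one has \(S_nf^+ - S_nf = h^s\circ T^n - h^s\), and hence the Birkhoff sums differ by at most \(2\Vert h^s\Vert _\infty \) while the integrals agree. The following sketch checks this in a toy model, a cyclic rotation on a finite set with the uniform invariant measure. The model and the randomly chosen arrays `f` and `h` are assumptions made purely for illustration and are not the subshift setting of this section.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 12                                   # toy phase space Z/N with T(x) = x + 1 mod N
f = rng.normal(size=N)                   # hypothetical observable
h = rng.normal(size=N)                   # hypothetical transfer function, playing the role of h^s
T = lambda x: (x + 1) % N

# Cohomologous observable f^+ = f + h∘T - h
f_plus = np.array([f[x] + h[T(x)] - h[x] for x in range(N)])

def birkhoff_sum(g, x, n):
    """S_n g(x) = sum_{k=0}^{n-1} g(T^k x)."""
    return sum(g[(x + k) % N] for k in range(n))

worst = max(abs(birkhoff_sum(f_plus, x, n) - birkhoff_sum(f, x, n))
            for x in range(N) for n in range(1, 40))
print(worst, "<=", 2 * np.abs(h).max())
print(np.isclose(f.mean(), f_plus.mean()))   # equal integrals w.r.t. the uniform measure
```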

To prove Theorem 3.2, we first need the following lemma. For \(\underline{l}=(l_1,\ldots , l_n)\) where \(l_1\ldots l_n\) is admissible (in this case we also just say that \(\underline{l}\) is admissible), we set \(\Omega ^+_{\underline{l}}:=[0;l_1,l_2,\ldots ,l_n]^+\), \(|\underline{l}|:=n\), and

$$\begin{aligned} \mu ^+_{\underline{l}}=\frac{1}{\mu ^+\big (\Omega ^+_{\underline{l}}\big )}\big (T_+^{|\underline{l}|}\big )_*\mu ^+\big |_{\Omega ^+_{\underline{l}}}. \end{aligned}$$
(3.2)

In other words, \(\mu ^+_{\underline{l}}\) is the normalized push-forward of \(\mu ^+\) under the injective map \(T_+^{|\underline{l}|}:\Omega ^+_{\underline{l}}\rightarrow \Omega ^+\). Note that \(\mu ^+_{\underline{l}}\) is concentrated on \(T_+^{|\underline{l}|}(\Omega ^+_{\underline{l}})\). By definition, it clearly holds that

$$\begin{aligned} \int _{\Omega ^+}f\, d\mu ^+_{\underline{l}}=\frac{1}{\mu ^+(\Omega ^+_{\underline{l}})}\int _{\Omega ^+_{\underline{l}}}f\circ T_+^{|\underline{l}|}\, d\mu ^+. \end{aligned}$$

If we view \(T^{-|\underline{l}|}_+\) as a map from \(T_+^{|\underline{l}|}(\Omega ^+_{\underline{l}})\) to \(\Omega ^+_{\underline{l}}\), we obtain

$$\begin{aligned} \mu ^+(\Omega ^+_{\underline{l}})\int _{T_+^{|\underline{l}|}(\Omega ^+_{\underline{l}})}f\circ T_+^{-|\underline{l}|}\, d\mu ^+_{\underline{l}}=\int _{\Omega ^+_{\underline{l}}}f\, d\mu ^+. \end{aligned}$$

Since \(\mu ^+_{\underline{l}}\) is concentrated on \(T_+^{|\underline{l}|}(\Omega ^+_{\underline{l}})\), we may simply write the equation above as

$$\begin{aligned} \mu ^+(\Omega ^+_{\underline{l}})\int f\circ T_+^{-|\underline{l}|}\, d\mu ^+_{\underline{l}}=\int _{\Omega ^+_{\underline{l}}}f\, d\mu ^+. \end{aligned}$$
(3.3)

Recall we also write \([n;\underline{l}]^+=[n;l_1,\ldots , l_n]^+\).

Lemma 3.3

Consider a topologically mixing one-sided subshift of finite type \((\Omega ^+,T_+,\mu ^+)\), where \(\mu ^+\) has the bounded distortion property. There exists a \(C \ge 1\) so that, uniformly for all admissible \(\underline{l}\), we have

$$\begin{aligned} \frac{d\mu ^+_{\underline{l}}}{d\mu ^+}(\omega ^+)\le C \text{ for } \mu ^+\text{-a.e. } \omega ^+, \end{aligned}$$
(3.4)

where \(\frac{d\mu ^+_{\underline{l}}}{d\mu ^+}\) is the Radon–Nikodym derivative of \(\mu ^+_{\underline{l}}\) with respect to \(\mu ^+\). In particular, we have for all positive measurable functions f and all admissible \(\underline{l}\),

$$\begin{aligned} \int fd\mu ^+_{\underline{l}}\le C\int fd\mu ^+. \end{aligned}$$
(3.5)

Proof

Fix an admissible \(\underline{l}=(l_1,\ldots , l_n)\). Clearly, (3.4) is equivalent to the existence of a \( C\ge 1\), independent of \(\underline{l}\), such that for every \([n;\underline{i}]^+=[n;i_1,\ldots , i_m]^+ \subseteq \Omega ^+\) (which implies \(n\ge 0\)), we have

$$\begin{aligned} \frac{\mu ^+_{\underline{l}}([n;\underline{i}])}{\mu ^+([n;\underline{i}])}\le C. \end{aligned}$$
(3.6)

By definition of \(\mu ^+_{\underline{l}}\), we have

$$\begin{aligned} \frac{\mu ^+_{\underline{l}}([n;\underline{i}])}{\mu ^+([n;\underline{i}])}=\frac{\mu ^+([0;\underline{l}]\cap [n+|\underline{l}|;\underline{i}])}{\mu ^+([0;\underline{l}])\mu ^+([n;\underline{i}])}. \end{aligned}$$

Hence, (3.6) is equivalent to

$$\begin{aligned} \frac{\mu ^+([0;\underline{l}]\cap [n+|\underline{l}|;\underline{i}])}{\mu ^+([0;\underline{l}])\mu ^+([n;\underline{i}])} \le C. \end{aligned}$$

By \(T_+\)-invariance of \(\mu ^+\), the above estimate is then guaranteed by (2.14). Indeed, if \([0;\underline{l}] \cap [n+|\underline{l}|+r_0;\underline{i}] = \varnothing \), then it is trivial. If \([0;\underline{l}] \cap [n+|\underline{l}|+r_0;\underline{i}] \ne \varnothing \), then it is a consequence of the second inequality of (2.14). \(\square \)
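For completeness, here is a short sketch of how (3.5) follows from (3.4). It uses only the definition of the Radon–Nikodym derivative and the positivity of f:

$$\begin{aligned} \int f \, d\mu ^+_{\underline{l}} = \int f \cdot \frac{d\mu ^+_{\underline{l}}}{d\mu ^+} \, d\mu ^+ \le C \int f \, d\mu ^+ . \end{aligned}$$

Taking f to be the indicator function of a cylinder \([n;\underline{i}]^+\) recovers (3.6) as a special case.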

We are now ready to prove Theorem 3.2. We adopt the strategy of [2, Section 6.1].

Proof of Theorem 3.2

We split the proof into two parts. First, we show

$$\begin{aligned} \mu ^+\bigg \{\omega ^+\in \Omega ^+:\frac{1}{n}S_nf(\omega ^+)-\int f \, d\mu ^+\ge \varepsilon \bigg \}<Ce^{-cn},\ \forall n\ge 1. \end{aligned}$$
(3.7)

For simplicity, we write \(I_n(\omega ^+)=\frac{1}{n} S_nf(\omega ^+)\) and \(\gamma =\int f \, d\mu ^+\). Fix an \(\varepsilon >0\). By the Birkhoff Ergodic Theorem, \(I_n(\omega ^+)\) converges to \(\int f \, d\mu ^+\) pointwise almost everywhere and in \(L^1\). Thus we have

$$\begin{aligned} \lim _{n\rightarrow \infty }\mu ^+\{\omega ^+: |I_n(\omega ^+)-\gamma |> \varepsilon \}=0. \end{aligned}$$

By (3.5) of Lemma 3.3, we have

$$\begin{aligned} \lim _{n\rightarrow \infty }\sup _{\underline{l}}\mu ^+_{\underline{l}}\{\omega ^+: |I_n(\omega ^+)-\gamma |>\varepsilon \}=0, \end{aligned}$$
(3.8)

where the supremum is taken over the set of all admissible \(\underline{l}\). Fix a \(0<\varepsilon '<\varepsilon \). Let \({{\mathcal {B}}}_{n}=\{\omega ^+\in \Omega ^+: I_n(\omega ^+)>\gamma +\varepsilon '\}\) and \(\kappa =\frac{\varepsilon -\varepsilon '}{2}\). By (3.8) (replacing \(\varepsilon \) by \(\varepsilon '\)), we have for all large n that

$$\begin{aligned} \sup _{\underline{l}}\int (I_n(\omega ^+)-\gamma -\varepsilon ) \, d\mu ^+_{\underline{l}}&=\sup _{\underline{l}}\left( \int _{{{\mathcal {B}}}_{n}}+\int _{{{\mathcal {B}}}_{n}^\complement }\right) (I_n(\omega ^+)-\gamma -\varepsilon ) \, d\mu ^+_{\underline{l}}\nonumber \\&\le C\sup _{\underline{l}}\mu ^+_{\underline{l}}({{\mathcal {B}}}_n)+(\varepsilon '-\varepsilon )\inf _{\underline{l}} \mu ^+_{\underline{l}}({{\mathcal {B}}}_n^\complement )\nonumber \\&<-\kappa . \end{aligned}$$
(3.9)
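The estimate (3.9) can be unpacked as follows; this is only a sketch, and the symbol M is our own notation for a bound on the integrand, which plays the role of the constant C in (3.9). Since f is bounded, we may take \(M = 2\Vert f\Vert _\infty + \varepsilon \), and on \({{\mathcal {B}}}_n^\complement \) we have \(I_n(\omega ^+)-\gamma \le \varepsilon '\). Hence, for every admissible \(\underline{l}\),

$$\begin{aligned} \int _{{{\mathcal {B}}}_{n}}(I_n(\omega ^+)-\gamma -\varepsilon ) \, d\mu ^+_{\underline{l}}&\le M \mu ^+_{\underline{l}}({{\mathcal {B}}}_n), \\ \int _{{{\mathcal {B}}}_{n}^\complement }(I_n(\omega ^+)-\gamma -\varepsilon ) \, d\mu ^+_{\underline{l}}&\le (\varepsilon '-\varepsilon ) \mu ^+_{\underline{l}}({{\mathcal {B}}}_n^\complement ). \end{aligned}$$

By (3.8) applied with \(\varepsilon '\), we have \(\sup _{\underline{l}}\mu ^+_{\underline{l}}({{\mathcal {B}}}_n)\rightarrow 0\) and therefore \(\inf _{\underline{l}}\mu ^+_{\underline{l}}({{\mathcal {B}}}_n^\complement )\rightarrow 1\) as \(n\rightarrow \infty \). The sum of the two bounds thus tends to \(\varepsilon '-\varepsilon =-2\kappa \), which is smaller than \(-\kappa \), so the strict inequality holds for all large n.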

Fix a large N so that (3.9) holds true. For any \(\delta '>0\), by (3.5) and boundedness of the integrand below, there clearly exists a \(C'=C'(\delta ',N)>0\) such that for all \(|t|<\delta '\) and all \(1\le n\le N\), we have

$$\begin{aligned} \sup _{\underline{l}}\int e^{tn(I_n(\omega ^+)-\gamma -\varepsilon )} \, d\mu ^+_{\underline{l}}\le C'. \end{aligned}$$
(3.10)

Hence \(\phi _{N,\underline{l}}(t):=\int e^{tN(I_N(\omega ^+)-\gamma -\varepsilon )}\, d\mu ^+_{\underline{l}}\) are uniformly bounded holomorphic functions on \(\{t\in {{\mathbb {C}}}: |t|<\delta '\}\) for all \(\underline{l}\) admissible. In particular, \(\{\phi _{N,\underline{l}}(t)\}_{\underline{l}}\) is a normal family on the open disk \(\{t\in {{\mathbb {C}}}: |t|<\delta '\}\). Shrinking \(\delta '\) if necessary, we see that \(\{\phi '_{N,\underline{l}}(t)\}_{\underline{l}}\) is a normal family on the open disk \(\{t\in {{\mathbb {C}}}: |t|<\delta '\}\) as well. Note that

$$\begin{aligned} \phi _{N,\underline{l}}(0)=1 \text{ and } \phi '_{N,\underline{l}}(0)=\int N(I_N(\omega ^+)-\gamma -\varepsilon ) \, d\mu ^+_{\underline{l}}. \end{aligned}$$

By (3.9) and shrinking \(\delta '\) if necessary, we must have that \(\phi '_{N,\underline{l}}(t)<-N\kappa \) for all \(t\in (-\delta ', \delta ')\) and all admissible \(\underline{l}\). This implies that

$$\begin{aligned} (\log \phi _{N,\underline{l}})'(t)=\frac{\phi '_{N,\underline{l}}(t)}{\phi _{N,\underline{l}}(t)}<-(C')^{-1}N\kappa \end{aligned}$$

for all \(t\in (-\delta ', \delta ')\) and all admissible \(\underline{l}\). Since \((\log \phi _{N,\underline{l}})(0)=0\), we then have

$$\begin{aligned} \sup _{\underline{l}}\big \{\log \phi _{N,\underline{l}}(t)\big \}\le -(C')^{-1}N\kappa t \end{aligned}$$

for all \(0 \le t < \delta '\). Hence, it holds for all \(0< \delta < \delta '\) that

$$\begin{aligned} \sup _{\underline{l}} \int e^{\delta N(I_N(\omega ^+)-\gamma -\varepsilon )} \, d\mu ^+_{\underline{l}} < e^{-(C')^{-1} N \delta \kappa }. \end{aligned}$$
(3.11)
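As a sketch of the two elementary steps behind (3.11): for real t the function \(\phi _{N,\underline{l}}(t)\) is positive and, by (3.10), bounded above by \(C'\), while \(\phi '_{N,\underline{l}}(t)<-N\kappa <0\). Dividing a negative quantity by a positive number that is at most \(C'\), and then integrating from 0 to \(\delta \in (0,\delta ')\), gives

$$\begin{aligned} \frac{\phi '_{N,\underline{l}}(t)}{\phi _{N,\underline{l}}(t)}&\le \frac{\phi '_{N,\underline{l}}(t)}{C'}<-(C')^{-1}N\kappa , \\ \log \phi _{N,\underline{l}}(\delta )&=\int _0^{\delta }(\log \phi _{N,\underline{l}})'(t)\, dt<-(C')^{-1}N\kappa \delta . \end{aligned}$$

Exponentiating the second line yields the bound in (3.11) for each admissible \(\underline{l}\).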

Now we want to extend the estimate above to all \(n\ge 1\) via the bounded distortion property of \(\mu ^+\).

Since \(f \in C^\alpha (\Omega ^+,{{\mathbb {R}}})\) and \(nI_n(\omega )=(S_nf)(\omega )\) is the Birkhoff sum, it is straightforward to see that

$$\begin{aligned} |nI_{n}(\omega ^+) - nI_{n}({{\tilde{\omega }}}^+)| \le C \sum ^{n-1}_{k=0} d(T_+^k\omega ^+,T_+^k{{\tilde{\omega }}}^+)^\alpha \le C_1, \end{aligned}$$
(3.12)

provided \(\omega ^+,{{\tilde{\omega }}}^+\in \Omega ^+_{\underline{l}}\), where \(|\underline{l}|=n\). Note here that \(C_1\) depends only on \(\alpha \) and f. We choose \(\omega ^+_{\underline{l},\max },\ \omega ^+_{\underline{l},\min }\in \Omega ^+_{\underline{l}}\) so that

$$\begin{aligned} I_{n}(\omega ^+_{\underline{l}, \max })=\max _{\omega ^+\in \Omega ^+_{\underline{l}}}\{I_{n}(\omega ^+)\} \text{ and } I_{n}(\omega ^+_{\underline{l}, \min })=\min _{\omega ^+\in \Omega ^+_{\underline{l}}}\{I_{n}(\omega ^+)\}. \end{aligned}$$

In particular we have

$$\begin{aligned} n\big (I_{n}(\omega ^+_{\underline{l}, \max })-I_{n}(\omega ^+_{\underline{l}, \min })\big ) < C_1. \end{aligned}$$
(3.13)
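The constant \(C_1\) in (3.12) can be made explicit under an assumption on the metric that is not restated in this section: suppose, as is common for subshifts, that \(d(\omega ^+,{{\tilde{\omega }}}^+)=e^{-m}\), where m is the first index at which the two sequences differ. If \(\omega ^+\) and \({{\tilde{\omega }}}^+\) agree in their first n coordinates, then \(T_+^k\omega ^+\) and \(T_+^k{{\tilde{\omega }}}^+\) agree in their first \(n-k\) coordinates, so that, under this assumption,

$$\begin{aligned} \sum ^{n-1}_{k=0} d(T_+^k\omega ^+,T_+^k{{\tilde{\omega }}}^+)^\alpha \le \sum ^{n-1}_{k=0} e^{-\alpha (n-k)} = \sum ^{n}_{j=1} e^{-\alpha j} < \frac{1}{e^{\alpha }-1}. \end{aligned}$$

With a different but equivalent choice of metric, the same geometric-series argument applies with a different constant.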

Since \((n+N)I_{n+N}(\omega ^+) = NI_N(T_+^n\omega ^+) + nI_n(\omega ^+)\), we have for all \(n\ge 1\) that

$$\begin{aligned} \int&e^{\delta (n+N)(I_{n+N}(\omega ^+)-\gamma -\varepsilon )} \, d\mu ^+ \\&= \int e^{\delta n(I_{n}(\omega ^+)-\gamma -\varepsilon )} e^{\delta N(I_{N}(T_+^{n}\omega ^+)-\gamma -\varepsilon )} \, d\mu ^+ \\&= \sum _{|\underline{l}|=n} \int _{\Omega ^+_{\underline{l}}} e^{\delta n(I_{n}(\omega ^+) -\gamma -\varepsilon )} e^{\delta N(I_{N}(T_+^{n}\omega ^+)-\gamma -\varepsilon )} \, d\mu ^+ \\&\le \sum _{|\underline{l}|=n} e^{\delta n(I_{n}(\omega ^+_{\underline{l},\max })-\gamma -\varepsilon )} \int _{\Omega ^+_{\underline{l}}} e^{\delta N(I_{N}(T^{n}_+\omega ^+) -\gamma -\varepsilon )} \, d\mu ^+ \\&= \sum _{|\underline{l}|=n} \mu ^+(\Omega ^+_{\underline{l}}) e^{\delta n(I_{n}(\omega ^+_{\underline{l}, \max })-\gamma -\varepsilon )} \int e^{\delta N(I_{N}(\omega ^+)-\gamma -\varepsilon )} \, d\mu ^+_{\underline{l}} \\&\le \bigg (\sup _{\underline{l}}\int _{\Omega ^+}e^{\delta N(I_{N}(\omega ^+)-\gamma -\varepsilon )} \, d \mu ^+_{\underline{l}}\bigg ) \\&\qquad \cdot \sum _{|\underline{l}|=n} \bigg (e^{\delta n[I_{n}(\omega ^+_{\underline{l},\max }) -I_{n}(\omega ^+_{\underline{l},\min })]} \cdot \int _{\Omega ^+_{\underline{l}}}e^{\delta n(I_{n} (\omega ^+)-\gamma -\varepsilon )} \, d\mu ^+\bigg ) \\&\le e^{C_1\delta } \left( \sup _{\underline{l}} \int _{\Omega ^+} e^{\delta N(I_{N}(\omega ^+)-\gamma -\varepsilon )} \, d\mu ^+_{\underline{l}}\right) \cdot \int _{\Omega ^+} e^{\delta n(I_{n}(\omega ^+)-\gamma -\varepsilon )} \, d\mu ^+, \end{aligned}$$

where the third identity follows from (3.3) and the last inequality follows from (3.13). We choose N large so that \((C')^{-1} N \kappa > 2 C_1\) and set \(c = \frac{1}{2} (C')^{-1} \delta \kappa \). Then by (3.11) we have for all \(n\ge 1\),

$$\begin{aligned} \int _{\Omega ^+} e^{\delta (n+N)(I_{n+N}(\omega ^+)-\gamma -\varepsilon )} \, d\mu ^+ \le e^{-cN} \int _{\Omega ^+} e^{\delta n(I_{n}(\omega ^+)-\gamma -\varepsilon )} \, d\mu ^+. \nonumber \\ \end{aligned}$$
(3.14)
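The arithmetic behind (3.14) is short enough to record. Inserting (3.11) into the preceding chain of inequalities and using the choices \((C')^{-1} N \kappa > 2 C_1\) and \(c = \frac{1}{2} (C')^{-1} \delta \kappa \), the prefactor is controlled by

$$\begin{aligned} e^{C_1\delta } \cdot e^{-(C')^{-1}N\delta \kappa } = e^{\delta (C_1-(C')^{-1}N\kappa )} \le e^{-\frac{1}{2}(C')^{-1}N\delta \kappa } = e^{-cN}. \end{aligned}$$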

Now, given any \(n\ge 1\), we may apply the Euclidean division \(n = kN + r\). Using (3.14) several times, we obtain for all \(n\ge 1\) that

$$\begin{aligned} \mu ^+ \left\{ \omega ^+ : I_n(\omega ^+) -\gamma \ge \varepsilon \right\} \le \int _{\Omega ^+} e^{\delta n(I_n(\omega ^+)-\gamma -\varepsilon )} \, d\mu ^+ \le C e^{-cn}.\nonumber \\ \end{aligned}$$
(3.15)

Note that the estimate for small n is absorbed into the constant C due to (3.10). This gives the first part, namely the estimate (3.7).
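The iteration leading to (3.15) may be sketched as follows. The normalization \(n=kN+r\) with \(k\ge 0\) and \(1\le r\le N\), and the constant \(C''\), are our own bookkeeping choices rather than notation from the argument above. The first inequality in (3.15) is Chebyshev's inequality, since the integrand is at least 1 on the set in question. Applying (3.14) k times, and writing \(C'':=\max _{1\le r\le N}\int _{\Omega ^+} e^{\delta r(I_{r}(\omega ^+)-\gamma -\varepsilon )} \, d\mu ^+\), which is finite because f is bounded, we get

$$\begin{aligned} \int _{\Omega ^+} e^{\delta n(I_n(\omega ^+)-\gamma -\varepsilon )} \, d\mu ^+&\le e^{-ckN}\int _{\Omega ^+} e^{\delta r(I_r(\omega ^+)-\gamma -\varepsilon )} \, d\mu ^+ \\&\le C'' e^{-ckN} = C'' e^{cr} e^{-cn} \le C'' e^{cN} e^{-cn}. \end{aligned}$$

Thus one admissible choice of the constant in (3.15) is \(C=C''e^{cN}\).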

To prove the second part which is

$$\begin{aligned} \mu ^+ \left\{ \omega ^+ : \gamma -I_n(\omega )\ge \varepsilon \right\} \le \int _{\Omega ^+} e^{\delta n(\gamma -I_n(\omega ^+)-\varepsilon )} \, d\mu ^+ \le C e^{-cn},\nonumber \\ \end{aligned}$$
(3.16)

we just need to replace \(I_n(\omega ^+)-\gamma -\varepsilon \) by \(\gamma -I_n(\omega ^+)-\varepsilon \) and run the same proof as for (3.7) above. The only difference is that in this case we set \({{\mathcal {B}}}_n=\{\omega ^+: \gamma -I_n(\omega ^+)>\varepsilon '\}\) for some \(0< \varepsilon '<\varepsilon \). All other steps are exactly the same.

The two estimates (3.7) and (3.16) clearly imply the desired large deviation estimate as stated in Theorem 3.2. \(\square \)

The fact that an equilibrium state of a Hölder continuous potential has local product structure may be found in [8, 14]. Here we show that equilibrium states of Hölder continuous potentials have the bounded distortion property as defined in (2.13), which also implies that such a \(\mu \) has a local product structure. In particular this shows that Theorem 3.1 holds true for such measures. Equivalently, we may consider \((\Omega ^+,\mu ^+,T_+)\), where \(\mu ^+\) is an equilibrium state of a Hölder continuous potential and show that such a \(\mu ^+\) has the bounded distortion property. Indeed, equilibrium states of Hölder continuous potentials defined over \((\Omega ,T)\) are lifts of equilibrium states of Hölder continuous potentials defined over \((\Omega ^+, T_+)\); see, for example, [8]. According to [8, 14], such a \(\mu ^+\) has a Hölder continuous Jacobian with respect to \(T_+\). So it suffices to prove the following lemma:

Lemma 3.4

Let \((\Omega ^+,T_+,\mu ^+)\) be a one-sided subshift of finite type, where \(\mu ^+\) is a \(T_+\)-ergodic measure that has a Hölder continuous Jacobian. Then \(\mu ^+\) satisfies the bounded distortion property as defined in (2.14).

Proof

To get (2.14), we fix any \([0;\underline{l}]^+\subset \Omega ^+\) and set \(n=|\underline{l}|\). Choose any \([k;\underline{j}]^+\subset \Omega ^+\) such that \(k\ge n\) and \([0;\underline{l}]^+\cap [k;\underline{j}]^+\ne \varnothing \).

Let \(J_+\in C^\alpha (\Omega ^+,{{\mathbb {R}}}_+)\) be the Jacobian of \(\mu ^+\) with respect to \(T_+\). Since it is positive and continuous on \(\Omega ^+\), we have \(\inf _{\omega ^+\in \Omega ^+} |J_+(\omega ^+)|> c > 0\), which implies that \(\log J_+\in C^\alpha (\Omega ^+,{{\mathbb {R}}})\). Consider the map \(T_+^n:[0;\underline{l}]\rightarrow \Omega ^+\) and let \(J_+^{\underline{l}}\) be its Jacobian. Then we have

$$\begin{aligned} J_+^{\underline{l}}(\omega ^+)=\prod ^{n-1}_{i=0}J_+(T^i\omega ^+). \end{aligned}$$

Suppose \(\omega ^+, {{\tilde{\omega }}}^+\in [0;\underline{l}]\) for some \(|\underline{l}|=n\). Then we have

$$\begin{aligned} \big |\log J_+^{\underline{l}}(\omega ^+)-\log J_+^{\underline{l}}({{\tilde{\omega }}}^+) \big |&\le \sum ^{n-1}_{i=0}\big |\log J_+(T^i_+\omega ^+)-\log J_+(T^i_+{{\tilde{\omega }}}^+) \big |\\&\le C\sum ^{n-1}_{i=0} d(T^i_+\omega ^+,T^i_+{{\tilde{\omega }}}^+)^\alpha \\&<C, \end{aligned}$$

where C is independent of \(\underline{l}\) , \(\omega ^+\), and \({{\tilde{\omega }}}^+\). Thus we have

$$\begin{aligned} C^{-1}<\bigg |\frac{J_+^{\underline{l}}(\omega ^+)}{J_+^{\underline{l}}({{\tilde{\omega }}}^+)}\bigg |<C \text{ for } \text{ all } \omega ^+, {{\tilde{\omega }}}^+\in [0;\underline{l}]^+. \end{aligned}$$
(3.17)

Now by the definition of the Jacobian, we have

$$\begin{aligned} \int _{\Omega ^+}\chi _{[k-n;\underline{j}]^+}(\eta ) \, d\mu ^+(\eta ) = \int _{[0;\underline{l}]^+}\chi _{[k-n;\underline{j}]^+}(T^{n}_+\omega ^+)J_+^{\underline{l}}(\omega ^+) \, d\mu ^+(\omega ^+), \end{aligned}$$

which implies that

$$\begin{aligned} J_+^{\underline{l}}(\omega ^+_{\underline{l}, \min })\le \frac{\mu ^+\big ([k;\underline{j}]^+\big )}{\mu ^+([0;\underline{l}]^+\cap [k;\underline{j}]^+)} \le J_+^{\underline{l}}(\omega ^+_{\underline{l}, \max }). \end{aligned}$$

Here \(\omega ^+_{\underline{l}, \min }\) and \(\omega ^+_{\underline{l}, \max }\) are chosen as in the proof of Theorem 3.2. Using \(1=\int _{\Omega ^+}1d\mu ^+=\int _{[0;\underline{l}]}J_+^{\underline{l}}(\omega ^+)d\mu ^+\), we obtain

$$\begin{aligned} \frac{1}{J_+^{\underline{l}}(\omega ^+_{\underline{l}, \max })}\le \mu ^+([0;\underline{l}]^+)\le \frac{1}{J_+^{\underline{l}}(\omega ^+_{\underline{l}, \min })}. \end{aligned}$$

Combining the two estimates above with (3.17), we clearly get

$$\begin{aligned} C^{-1}\le \frac{\mu ^+\big ([0;\underline{l}]^+\big )\cdot \mu ^+\big ([k;\underline{j}]^+\big )}{\mu ^+\big ([0;\underline{l}]^+\cap [k;\underline{j}]^+\big )}\le C, \end{aligned}$$

which is (2.14). \(\square \)
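To make the final combination explicit, one may multiply the two-sided bound for \(\mu ^+([0;\underline{l}]^+)\) by the two-sided bound for the quotient \(\mu ^+([k;\underline{j}]^+)/\mu ^+([0;\underline{l}]^+\cap [k;\underline{j}]^+)\). This sketch gives

$$\begin{aligned} \frac{J_+^{\underline{l}}(\omega ^+_{\underline{l}, \min })}{J_+^{\underline{l}}(\omega ^+_{\underline{l}, \max })} \le \frac{\mu ^+\big ([0;\underline{l}]^+\big )\cdot \mu ^+\big ([k;\underline{j}]^+\big )}{\mu ^+\big ([0;\underline{l}]^+\cap [k;\underline{j}]^+\big )} \le \frac{J_+^{\underline{l}}(\omega ^+_{\underline{l}, \max })}{J_+^{\underline{l}}(\omega ^+_{\underline{l}, \min })}, \end{aligned}$$

and by (3.17) both outer quotients lie between \(C^{-1}\) and C.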

Remark 3.5

There are many early works concerning large deviation estimates for functions defined on hyperbolic dynamical systems; see, for example, [41]. But we could not find a proof that applies in our framework. In Sect. 4, we shall show that if \(\mu \) has the bounded distortion property, then Theorem 3.1 yields relatively global versions of all the techniques needed. We hope they will be of independent interest. Most importantly, the results, ideas, and proofs given in this section will be used in the second paper [1] of this series.

4 Invariance principle and conformal barycenter

In this section, we develop some tools that are needed to prove our main results in the next two sections. Our main objective is to consider cocycles that have zero Lyapunov exponent. First, we show that a small Lyapunov exponent gives rise to a measurable family of holonomies, which will be integrable if \(\mu \) has the bounded distortion property. In the case of a zero Lyapunov exponent, we shall introduce techniques originally developed in [33], and then generalized in [3, 7, 39], which are referred to as the invariance principle. We will use the invariance principle to show the existence of a continuous su-state on suitable sets. Concretely, in case of a zero Lyapunov exponent and bounded distortion, we will construct an su-state that is continuous on the support of a certain full measure set. In case we have only local product structure, we will construct a local su-state that is continuous on the support of some positive measure set. Then we show that periodic points with small Lyapunov exponents belong to the support in question. Finally, we will construct an su-invariant family of \(\delta \)-measures by using the conformal barycenter.

4.1 Measurable holonomies resulting from small exponents

For the remainder of this subsection, we fix \(0 < \alpha \le 1\) and consider the space of \(\alpha \)-Hölder continuous cocycles \(C^\alpha (\Omega ,{\mathrm {SL}}(2,{{\mathbb {R}}}))\), that is, \(A \in C^\alpha (\Omega ,{\mathrm {SL}}(2,{{\mathbb {R}}}))\) if

$$\begin{aligned} \Vert A(\omega ) - A({{\tilde{\omega }}}) \Vert < C \cdot d(\omega ,{{\tilde{\omega }}})^\alpha \text{ for } \text{ all } \omega ,{{\tilde{\omega }}}\in \Omega . \end{aligned}$$
(4.1)

Thus for every \(\omega ,{{\tilde{\omega }}} \in \Omega \) and \(n \ge 0\), we have

$$\begin{aligned} {\left\{ \begin{array}{ll} \Vert A(T^n \omega )-A(T^n {{\tilde{\omega }}})\Vert \le C e^{-\alpha n} &{} \text{ if } {{\tilde{\omega }}}\in W^s_{\mathrm {loc}}(\omega ),\\ \Vert A(T^{-n} \omega ) - A(T^{-n} {{\tilde{\omega }}})\Vert \le C e^{-\alpha n} &{} \text{ if } {{\tilde{\omega }}}\in W^u_{\mathrm {loc}}(\omega ). \end{array}\right. } \end{aligned}$$
(4.2)

Our goal is to show that a small Lyapunov exponent produces stable and unstable holonomies. Unfortunately, we cannot show that these holonomies satisfy all properties of Definition 2.8. Therefore we introduce the following weaker version of Definition 2.8.

Definition 4.1

A measurable stable holonomy \(h^s\) for A is a measurable family of homeomorphisms \(h^s_{\omega , \omega '} : {\mathcal {E}}_\omega \rightarrow {\mathcal {E}}_{\omega '}\), defined for \(\mu \)-almost every \(\omega \) and every \(\omega '\in W^s_{\mathrm {loc}}(\omega )\), satisfying the following properties:

  1. (i)

    \(h^s_{\omega ' , \omega ''} \circ h^s_{\omega , \omega '} = h^s_{\omega , \omega ''}\) and \(h^s_{\omega , \omega } = \mathrm {id}\),

  2. (ii)

    \(A(\omega ') \circ h^s_{\omega , \omega '} = h^s_{T \omega , T \omega '} \circ A(\omega )\).

A measurable unstable holonomy \(h^u_{\omega , \omega '} : {\mathcal {E}}_\omega \rightarrow {\mathcal {E}}_{\omega '}\) is defined analogously for \(\mu \)-almost every \(\omega \) and every \(\omega '\in W^u_{\mathrm {loc}}(\omega )\).

Using measurable stable and unstable holonomies, we can define the notions of s-state, u-state, and su-state in the same way as in Sect. 2.1.6 since the disintegration \(\{m_\omega :\omega \in \Omega \}\) of the invariant measure m is only measurable anyway.

In Lemma 4.2 below we will indeed show that small Lyapunov exponents ensure the existence of measurable stable and unstable holonomies. In fact, these maps will arise via projectivization from canonical holonomies defined on suitably defined subsets.

The following subsets will play a key role. We define for \(N \in {{\mathbb {Z}}}_+\) and \(\delta > 0\),

$$\begin{aligned} K_s(N,\delta )&= \{ \omega : \Vert A_n(\omega )\Vert ^2 \le e^{(\alpha - \delta )n} \text { for every } n \ge N \}, \end{aligned}$$
(4.3)
$$\begin{aligned} K_u(N,\delta )&=\{\omega : \Vert A_{-n}(\omega )\Vert ^2 \le e^{(\alpha - \delta )n} \text { for every } n \ge N\}. \end{aligned}$$
(4.4)

Lemma 4.2

Assume that \(L(A,\mu )\) and \(\delta > 0\) are such that \(2L(A,\mu ) < \alpha - \delta \). Then, for each \(N \in {{\mathbb {Z}}}_+\), the limit

$$\begin{aligned} H^{s}_{\omega ,{{\tilde{\omega }}}} = \lim _{n \rightarrow \infty } A_n({{\tilde{\omega }}})^{-1} A_n(\omega ) \end{aligned}$$

exists uniformly for each \(\omega \in K_s(N,\delta )\) and \({{\tilde{\omega }}} \in W^s_{\mathrm {loc}}(\omega )\). Similarly, for each \(N \in {{\mathbb {Z}}}_+\), the limit

$$\begin{aligned} H^{u}_{\omega ,{{\tilde{\omega }}}} = \lim _{n \rightarrow \infty } A_{-n}({{\tilde{\omega }}})^{-1} A_{-n}(\omega ) \end{aligned}$$

exists uniformly for each \(\omega \in K_u(N,\delta )\) and \({{\tilde{\omega }}} \in W^u_{\mathrm {loc}}(\omega )\). Moreover, if \(\mu \) has bounded distortion, then the following integrability conditions hold:

$$\begin{aligned} \int _{\Omega } \log \Vert H^s_{\omega ^{(\omega _0)} \wedge \omega , \omega }\Vert \, d\mu (\omega )&< \infty , \end{aligned}$$
(4.5)
$$\begin{aligned} \int _{\Omega } \log \Vert H^u_{\omega \wedge \omega ^{(\omega _0)}, \omega }\Vert \, d\mu (\omega )&< \infty . \end{aligned}$$
(4.6)

Since each of the sets \(\bigcup _N K_s (N,\delta )\) and \(\bigcup _N K_u (N,\delta )\) has full measure and is T-invariant, it follows in particular that there are measurable stable and unstable holonomies in the sense of Definition 4.1.

Proof

In this proof, C will denote a positive constant that depends only on \(\Vert A\Vert _\infty \) and \(\delta \). Conditions will be imposed on it in several places, which leads to finitely many adjustments.

Let \(H^{s,n}_{\omega ,{{\tilde{\omega }}}} = A_n({{\tilde{\omega }}})^{-1} A_n(\omega )\) for \({{\tilde{\omega }}} \in W^s_{\mathrm {loc}}(\omega )\). Define

$$\begin{aligned} \delta ^{s,n}_{\omega ,{{\tilde{\omega }}}} = \left( H^{s,n}_{\omega ,{{\tilde{\omega }}}} \right) ^{-1} \left( H^{s,n+1}_{\omega ,{{\tilde{\omega }}}} - H^{s,n}_{\omega ,{{\tilde{\omega }}}} \right) , \end{aligned}$$
(4.7)

so that

$$\begin{aligned} H^{s,n}_{\omega ,{{\tilde{\omega }}}} \left( \mathrm {Id} + \delta ^{s,n}_{\omega ,{{\tilde{\omega }}}} \right) = H^{s,n+1}_{\omega ,{{\tilde{\omega }}}}. \end{aligned}$$

We first estimate \(\Vert \delta ^{s,n}_{\omega ,{{\tilde{\omega }}}}\Vert \) as follows. In the case where \(\Vert A_n(\omega )\Vert ^2 \le e^{(\alpha - \delta )n}\), we have

$$\begin{aligned} \delta ^{s,n}_{\omega ,{{\tilde{\omega }}}} = A_n(\omega )^{-1} \left( A(T^n {{\tilde{\omega }}})^{-1} A(T^n \omega ) - \mathrm {Id} \right) A_n(\omega ) \end{aligned}$$

and therefore

$$\begin{aligned} \Vert \delta ^{s,n}_{\omega ,{{\tilde{\omega }}}}\Vert \le C e^{(\alpha - \delta )n} e^{-\alpha n} = C e^{-\delta n}. \end{aligned}$$
(4.8)
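The computation behind (4.8) can be sketched in three steps. We assume the usual cocycle convention \(A_{n+1}(\omega )=A(T^n\omega )A_n(\omega )\), and we use that \(\Vert M^{-1}\Vert =\Vert M\Vert \) for every \(M\in {\mathrm {SL}}(2,{{\mathbb {R}}})\) in the operator norm:

$$\begin{aligned} \left( H^{s,n}_{\omega ,{{\tilde{\omega }}}} \right) ^{-1} H^{s,n+1}_{\omega ,{{\tilde{\omega }}}}&= A_n(\omega )^{-1} A(T^n {{\tilde{\omega }}})^{-1} A(T^n \omega ) A_n(\omega ), \\ A(T^n {{\tilde{\omega }}})^{-1} A(T^n \omega ) - \mathrm {Id}&= A(T^n {{\tilde{\omega }}})^{-1} \left( A(T^n \omega ) - A(T^n {{\tilde{\omega }}}) \right) , \\ \Vert \delta ^{s,n}_{\omega ,{{\tilde{\omega }}}}\Vert&\le \Vert A_n(\omega )\Vert ^2 \cdot \Vert A\Vert _\infty \cdot \Vert A(T^n \omega ) - A(T^n {{\tilde{\omega }}}) \Vert . \end{aligned}$$

The last factor is at most \(C e^{-\alpha n}\) by (4.2), which together with \(\Vert A_n(\omega )\Vert ^2 \le e^{(\alpha - \delta )n}\) gives (4.8) after adjusting C.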

The fact \(\lim \limits _{n\rightarrow \infty }\frac{1}{n}\log \Vert A_n(\omega )\Vert =L(A,\mu )\) for \(\mu \)-almost every \(\omega \in \Omega \) implies that \(\frac{1}{n}\log \Vert A_n(\omega )\Vert \) converges to \(L(A,\mu )\) in measure. Thus, the sets \(K_s(N,\delta )\) defined in (4.3) are compact and increasing in N, and their union over N has full measure.

For \(\omega \in K_s(N,\delta )\), we have the uniform summability statement

$$\begin{aligned} \sum _{n = N}^\infty \Vert \delta ^{s,n}_{\omega ,{{\tilde{\omega }}}}\Vert \le C. \end{aligned}$$

Changing C if necessary, the estimate above in turn implies that

$$\begin{aligned} \left\| H^{s,n}_{\omega ,{{\tilde{\omega }}}}\right\|&=\left\| H^{s,N}_{\omega ,{{\tilde{\omega }}}}\prod ^{n}_{k=N}(H^{s,k}_{\omega , {{\tilde{\omega }}}})^{-1}\cdot H^{s,k+1}_{\omega ,{{\tilde{\omega }}}}\right\| \nonumber \\&\le \Vert H^{s,N}_{\omega ,{{\tilde{\omega }}}}\Vert \prod ^{n}_{k=N} \Vert (H^{s,k}_{\omega ,{{\tilde{\omega }}}})^{-1}\cdot H^{s,k+1}_{\omega ,{{\tilde{\omega }}}}\Vert \nonumber \\&\le e^{CN}\exp \left( \sum ^{n}_{k=N}\log (1+\Vert \delta ^{s,k}_{\omega ,{{\tilde{\omega }}}}\Vert ) \right) \nonumber \\&\le e^{CN}\exp \left( C\sum ^{n}_{k=N}\Vert \delta ^{s,k}_{\omega ,{{\tilde{\omega }}}}\Vert \right) \nonumber \\&\le e^{CN}, \end{aligned}$$
(4.9)

which implies for all \(n\ge N\):

$$\begin{aligned} \left\| H^{s,n}_{\omega ,{{\tilde{\omega }}}}-H^{s,n+1}_{\omega ,{{\tilde{\omega }}}} \right\| =\left\| H^{s,n}_{\omega ,{{\tilde{\omega }}}}\delta ^{s,n}_{\omega ,{{\tilde{\omega }}}}\right\| \le Ce^{CN}e^{-\delta n}. \end{aligned}$$

Hence \(\{H^{s,n}_{\omega ,{{\tilde{\omega }}}}\}_{n\ge 0}\) is a Cauchy sequence in \({\mathrm {SL}}(2,{{\mathbb {R}}})\) and is thus convergent. Let us define

$$\begin{aligned} H^s_{\omega ,{{\tilde{\omega }}}} := \lim \limits _{n \rightarrow \infty } H^{s,n}_{\omega ,{{\tilde{\omega }}}} \end{aligned}$$

where the convergence is uniform on \(K_s(N,\delta )\). In particular, \(H^s_{\omega ,{{\tilde{\omega }}}}\) depends continuously on \(\omega \in K_s(N,\delta )\) and \({{\tilde{\omega }}} \in W^s_{\mathrm {loc}}(\omega )\). Changing C if necessary, (4.9) implies for the same \(\omega \) and \({{\tilde{\omega }}}\) that

$$\begin{aligned} \Vert A_n({{\tilde{\omega }}})^{-1}A_n(\omega )\Vert <e^{CN} \end{aligned}$$
(4.10)

for all \(n\ge 1\).
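The Cauchy property used above can be made quantitative by a telescoping sum; the following is a sketch with the same generic constant C. For \(m>n\ge N\),

$$\begin{aligned} \left\| H^{s,m}_{\omega ,{{\tilde{\omega }}}}-H^{s,n}_{\omega ,{{\tilde{\omega }}}} \right\| \le \sum ^{m-1}_{k=n} \left\| H^{s,k+1}_{\omega ,{{\tilde{\omega }}}}-H^{s,k}_{\omega ,{{\tilde{\omega }}}} \right\| \le Ce^{CN} \sum ^{\infty }_{k=n} e^{-\delta k} = \frac{Ce^{CN}e^{-\delta n}}{1-e^{-\delta }}. \end{aligned}$$

Letting \(m\rightarrow \infty \) shows that \(H^{s,n}_{\omega ,{{\tilde{\omega }}}}\) converges to its limit at an exponential rate that depends only on N and \(\delta \), which is the source of the uniformity on \(K_s(N,\delta )\).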

To get the integrability condition (4.5), we define \(\phi (\omega )=\log \Vert A(\omega )\Vert \) and assume without loss of generality that

$$\begin{aligned} 2\int _{\Omega }\phi (\omega ) \, d\mu < \alpha - \delta , \end{aligned}$$
(4.11)

because otherwise we may instead consider \(\phi (\omega ) = \frac{1}{k} \log \Vert A_k(\omega )\Vert \) for some large k, which must satisfy the condition above since \(\int _\Omega \frac{1}{k} \log \Vert A_k(\omega )\Vert \, d\mu \) converges to \(L(A,\mu )\). It is straightforward to see that \(\phi \in C^\alpha (\Omega ,{{\mathbb {R}}})\) since A is \(\alpha \)-Hölder continuous and \(\Vert A(\omega )\Vert \ge 1\) for all \(\omega \in \Omega \).

We want to estimate for some \(C>0\) the measure of the following set,

$$\begin{aligned} {{\mathcal {B}}}_N=\{ \omega : \log \Vert H^s_{\omega ,{{\tilde{\omega }}}}\Vert > CN \text { for some } {{\tilde{\omega }}} \in W^s_{\mathrm {loc}}(\omega ) \}. \end{aligned}$$

Recall \(S_n\phi (\omega )=\sum ^{n-1}_{j=0}\phi (T^j\omega )\). We define for \(\delta '=\frac{\delta }{2}\),

$$\begin{aligned} {{\mathcal {Z}}}_m=\bigg \{\omega : \frac{1}{n} S_n\phi (\omega )<\frac{\alpha -\delta '}{2} \text{ for } \text{ all } n\ge m\bigg \} \end{aligned}$$

If \(\omega \in {{\mathcal {Z}}}_1\), then \(2S_n\phi (\omega )<n(\alpha -\delta ')\) for all \(n\ge 1\), which clearly implies that \(\Vert A_n(\omega )\Vert ^2<e^{(\alpha -\delta ')n}\) for all \(n\ge 1\). Thus by the computation leading to (4.10), we have for all \(\omega \in {{\mathcal {Z}}}_1\),

$$\begin{aligned} \Vert A_n({{\tilde{\omega }}})^{-1}A_n(\omega )\Vert \le C \text{ for } \text{ all } {{\tilde{\omega }}}\in W^s_{\mathrm {loc}}(\omega ) \text{ and } \text{ for } \text{ all } n\ge 1. \end{aligned}$$

If \(\omega \in {{\mathcal {Z}}}_m\) with \(m>1\), then we set \(0<k<m\) to be the largest integer for which \(\omega \notin {{\mathcal {Z}}}_{k}\). A direct computation then shows that \(T^k \omega \in {{\mathcal {Z}}}_1\), which implies that for all \({{\tilde{\omega }}}\in W^s_{\mathrm {loc}}(T^k\omega )\),

$$\begin{aligned} \Vert A_n({{\tilde{\omega }}})^{-1}A_n(T^k\omega )\Vert \le C \text{ for } \text{ all } n\ge 1. \end{aligned}$$

Combining this with \(\Vert A_n(\omega )\Vert <e^{Cm}\) for all \(1\le n\le m\) and for all \(\omega \), we have for all \(\omega \in {{\mathcal {Z}}}_m\) and all \({{\tilde{\omega }}}\in W^s_{\mathrm {loc}}(\omega )\) that

$$\begin{aligned} \Vert A_n({{\tilde{\omega }}})^{-1}A_n(\omega )\Vert \le e^{Cm} \text{ for } \text{ all } n\ge 1, \end{aligned}$$

which clearly implies that \(\log \Vert H^s_{\omega ,{{\tilde{\omega }}}}\Vert <Cm\). Thus by choosing C appropriately, we have

$$\begin{aligned} {{\mathcal {B}}}_N\subset \Omega {\setminus }{{\mathcal {Z}}}_N. \end{aligned}$$

However, by (4.11) it is clear that

$$\begin{aligned} \Omega {\setminus } {{\mathcal {Z}}}_N \subseteq \bigcup ^{\infty }_{n=N} \bigg \{ \omega : \bigg | \frac{1}{n} S_n\phi (\omega ) - \int _\Omega \phi \, d\mu \bigg | > \frac{\delta }{4} \bigg \}. \end{aligned}$$
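The inclusion above, read with \(\phi \) in the Birkhoff sum, can be checked directly. If \(\omega \notin {{\mathcal {Z}}}_N\), then there is some \(n\ge N\) with \(\frac{1}{n} S_n\phi (\omega )\ge \frac{\alpha -\delta '}{2}\). Since \(\delta '=\frac{\delta }{2}\) and (4.11) gives \(\int _\Omega \phi \, d\mu <\frac{\alpha -\delta }{2}\), we obtain for this n

$$\begin{aligned} \frac{1}{n} S_n\phi (\omega ) - \int _\Omega \phi \, d\mu > \left( \frac{\alpha }{2}-\frac{\delta }{4}\right) - \left( \frac{\alpha }{2}-\frac{\delta }{2}\right) = \frac{\delta }{4}. \end{aligned}$$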

Suppose that \(\mu \) has bounded distortion. Note that \(\phi \in C^\alpha (\Omega ,{{\mathbb {R}}})\). Hence, by Theorem 3.1, there exist \(C'>0\) and \(\eta >0\) such that

$$\begin{aligned} \mu \bigg \{ \omega : \bigg | \frac{1}{n} S_n\phi (\omega )- \int _\Omega \phi \, d\mu \bigg | > \frac{\delta }{4} \bigg \} < C' e^{-\eta n} \text{ for } \text{ all } n\ge 1. \end{aligned}$$

Clearly, this implies that \(\mu ({{\mathcal {B}}}_N) < C' e^{-\eta N}\) for all \(N \ge 1\), which in turn implies the integrability statement (4.5) since

$$\begin{aligned} \int _\Omega \log \Vert H^s_{\omega ^{(\omega _0)}\wedge \omega ,\omega } \Vert \, d\mu&= \int _\Omega \log \Vert H^s_{\omega ,\omega ^{(\omega _0)}\wedge \omega }\Vert \, d\mu \\&= \sum ^{\infty }_{N=1} \int _{{{\mathcal {B}}}_{N} {\setminus } {{\mathcal {B}}}_{N+1}} \log \Vert H^s_{\omega ,\omega ^{(\omega _0)}\wedge \omega }\Vert \, d\mu \\&\le \sum ^{\infty }_{N=1} \mu ({{\mathcal {B}}}_N) C N \\&\le \sum ^{\infty }_{N=1} C'C e^{-\eta N}N \\&<\infty . \end{aligned}$$

The case of \(K_u(N,\delta )\) can be done similarly after replacing \(A_n(\omega )\) by \(A_{-n}(\omega )\). \(\square \)
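For the record, the series that closes the integrability estimate in the proof above converges by the standard identity \(\sum _{N\ge 1} N x^N = \frac{x}{(1-x)^2}\) for \(|x|<1\), applied with \(x=e^{-\eta }\):

$$\begin{aligned} \sum ^{\infty }_{N=1} N e^{-\eta N} = \frac{e^{-\eta }}{(1-e^{-\eta })^2} < \infty . \end{aligned}$$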

Definition 4.3

For a periodic point p with period n, we let \(L(A,p)=\lim _{k\rightarrow \infty }\frac{1}{k}\log \Vert A_k(p)\Vert \) be the individual Lyapunov exponent of A at p. We say p is \(\gamma \)-bunched if \(2L(A,p) < \gamma \le \alpha \).
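The exponent at a periodic point can be computed from a single matrix. As a sketch, assume the usual cocycle convention \(A_{j+k}(\omega )=A_k(T^j\omega )A_j(\omega )\) and let \(\rho (M)\) denote the spectral radius of M (our notation). If p has period n, then \(A_{kn}(p)=A_n(p)^k\), and Gelfand's formula gives

$$\begin{aligned} L(A,p) = \lim _{k\rightarrow \infty } \frac{1}{kn}\log \Vert A_n(p)^k\Vert = \frac{1}{n}\log \rho (A_n(p)). \end{aligned}$$

In these terms, p is \(\gamma \)-bunched precisely when \(\rho (A_n(p))^2<e^{\gamma n}\) and \(\gamma \le \alpha \).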

Next, we show the following result, which says that \(\alpha \)-bunched periodic points are in the support of \(K_s(N,\delta )\) for suitable \(\delta > 0\) and for large N. We note that a similar result has appeared in [17] (see Footnote 2).

Lemma 4.4

Suppose \((\Omega , T)\) is a subshift of finite type and \(\mu \) has a local product structure. Assume that \(2L(A,\mu )<\alpha \). Let p be an \(\alpha \)-bunched periodic point. Then for every \(0< \delta _0 < \min \{ \alpha - 2L(A,\mu ), \alpha -2L(A,p) \}\), there exists \(N_0 \in {{\mathbb {Z}}}_+\) such that

$$\begin{aligned} p \in \mathrm {supp}\left( \mu |_{K_{s}(N_0,\delta _0)\cap K_{u}(N_0,\delta _0)}\right) . \end{aligned}$$
(4.12)

Proof

Fix any number \(\delta \) so that \(0< \delta < \min \{ \alpha - 2L(A,\mu ), \alpha -2L(A,p) \}\). Then we have \(2L(A,p) < \alpha - \delta \) and \(2L(A,\mu )<\alpha -\delta \). Assume the period of p is r.

Consider the family \(K_s(N,\delta )\) as in Lemma 4.2. By (4.10), we have for all \(\omega \in K_s(N,\delta )\), all \({{\tilde{\omega }}}\in W^s_{\mathrm {loc}}(\omega )\), and all \(n\ge 1\),

$$\begin{aligned} \Vert A_n({{\tilde{\omega }}})^{-1}A_n(\omega )\Vert <e^{CN}. \end{aligned}$$

Recall (2.8) says \((\pi ^+)^{-1}(\pi ^+\omega )=W^s_\mathrm {loc}(\omega )\). Thus by definition of \(K_s(N,\delta )\) and the estimate above, for each \(0<\delta _1<\delta \), there exists \(N_1>N\) such that for all \(\omega \in (\pi ^+)^{-1}(\pi ^+[K_s(N,\delta )])\) and all \(n\ge N_1\), we have

$$\begin{aligned} \Vert A_n(\omega )\Vert ^2<e^{(\alpha -\delta _1)n}. \end{aligned}$$
(4.13)

Fix such a choice of \(\delta _1\) and \(N_1\). By choosing N large, we may assume that \(\mu ( K_s(N,\delta )\cap [0;i])>0\) for each \(1\le i\le \ell \), which in turn implies that

$$\begin{aligned} \mu ^+(\pi ^+(K_s(N,\delta ))\cap [0;i]^+)=\mu ( K_s(N,\delta )\cap [0;i])>0. \end{aligned}$$

By Corollary 2.5, for each \(n\ge 0\), we have

$$\begin{aligned} \mu ^+(T^{-n}_+[\pi ^+(K_s(N,\delta ))]\cap [0;p_0,\ldots , p_{n}]^+) > 0. \end{aligned}$$
(4.14)

By the same argument as the one leading to (4.10), \(2L(A,p) < \alpha - \delta \) implies that for each \(0< \delta _2 < \delta \), we can find \(m \in {{\mathbb {Z}}}_+\) large enough so that \(\Vert A_{rm}(\omega )\Vert ^2 \le e^{(\alpha - \delta _2)rm}\) for \(\omega \in T^{-rm}W^u_{\mathrm {loc}}(T^{rm} p)\). By periodicity of p, we have for each \(l \ge 1\) and each \(1\le k < l\),

$$\begin{aligned} \Vert A_{rm}(T^{krm}\omega )\Vert ^2 \le e^{(\alpha - \delta _2)rm} \text{ for } \text{ all } \omega \text{ with } T^{-lrm} \omega \in W^u_{\mathrm {loc}}(T^{lrm} p), \end{aligned}$$

which in turn implies that for each \(1\le k\le l\),

$$\begin{aligned} \Vert A_{krm}(\omega )\Vert ^2\le \prod ^{k-1}_{j=0}\Vert A_{rm}(T^{jrm}\omega )\Vert ^2\le e^{(\alpha -\delta _2)krm}. \end{aligned}$$
(4.15)

For each \(l\in {{\mathbb {Z}}}_+\), we define the following s-locally saturated set,

$$\begin{aligned} {{\mathcal {D}}}^l_+=(\pi ^+)^{-1}(T^{-lrm}_+[\pi ^+(K_s(N,\delta ))]\cap [0;p_0,\ldots , p_{lrm}]^+). \end{aligned}$$

By (4.14), we have \(\mu ({{\mathcal {D}}}^l_+)>0\) and \({{\mathcal {D}}}^l_+\subset [0;p_0,\ldots , p_{lrm}]\). For each \(0<\delta _3<\delta _2\), we can fix an \(N'\in {{\mathbb {Z}}}_+\) large enough so that the following holds true. For all l large and for each \(\omega \in {{\mathcal {D}}}^l_+\), we have for all \(N'\le n\le lrm+N_1\) that

$$\begin{aligned} \Vert A_n(\omega )\Vert ^2<\Vert A_{krm}(\omega )\Vert ^2\cdot \Vert A_{n-krm}(T^{krm}\omega )\Vert ^2<e^{(\alpha -\delta _3)n}, \end{aligned}$$

where k is so chosen that \(0\le n-N_1-krm<rm\). On the other hand, if \(n>lrm+N_1\), then we have

$$\begin{aligned} \Vert A_n(\omega )\Vert ^2&\le \Vert A_{lrm}(\omega )\Vert ^2\cdot \Vert A_{n-lrm}(T^{lrm}\omega )\Vert ^2\\&\le \Vert A_{lrm}(\omega \wedge p)^{-1}\cdot A_{lrm}(\omega )\Vert ^2\cdot \Vert A_{lrm}(\omega \wedge p)\Vert ^2\\&\quad \cdot \Vert A_{n-lrm}(T^{lrm}\omega )\Vert ^2. \end{aligned}$$

We estimate each factor in the product above. First we consider the last factor. The fact that \(\omega \in {{\mathcal {D}}}^l_+\) implies that \(T^{lrm}(\omega )\in W^s_{\mathrm {loc}}(\omega ')\) for some \(\omega '\in K_s(N,\delta )\). Thus by (4.13) and the fact that \(n-lrm>N_1\), we have

$$\begin{aligned} \Vert A_{n-lrm}(T^{lrm}\omega )\Vert ^2<e^{(\alpha -\delta _1)(n-lrm)}. \end{aligned}$$

For the second factor, the fact \(\omega \in {{\mathcal {D}}}^l_+\) implies that \(T^{-lrm}(\omega \wedge p)\in W^u_{\mathrm {loc}}(T^{lrm}p)\). Thus by (4.15), we have for each \(1\le k\le l\)

$$\begin{aligned} \Vert A_{krm}(\omega \wedge p)\Vert ^2\le e^{(\alpha -\delta _2)krm}. \end{aligned}$$

In particular, the second factor is taken care of by choosing \(k=l\). Combining the fact \(\omega \in W^u_{\mathrm {loc}}(\omega \wedge p)\) with the estimate above and using

$$\begin{aligned} A_{lrm}(\omega )=\prod ^{0}_{j=l-1}A_{rm}((T^{rm})^j\omega ), \end{aligned}$$

the same argument used to obtain (4.10) yields

$$\begin{aligned} \Vert A_{lrm}(\omega \wedge p)^{-1}\cdot A_{lrm}(\omega )\Vert ^2<C. \end{aligned}$$
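The block factorization of \(A_{lrm}(\omega )\) and the submultiplicative splitting of the norm used above can be checked numerically. The following is a minimal sketch, assuming the standard convention \(A_n(\omega )=A(T^{n-1}\omega )\cdots A(\omega )\) for \(n\ge 1\) and using the Schrödinger matrices from (1.3). The energy, the block length (standing in for rm), the sampled potential values, and all function names are hypothetical choices made only for illustration.

```python
import math
import random

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def op_norm(X):
    """Operator norm of a real 2x2 matrix (its largest singular value)."""
    f2 = sum(x * x for row in X for x in row)
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return math.sqrt((f2 + math.sqrt(max(f2 * f2 - 4 * det * det, 0.0))) / 2)

IDENTITY = ((1.0, 0.0), (0.0, 1.0))

def schrodinger_matrix(E, v):
    """One-step matrix from (1.3) with potential value v = f(omega)."""
    return ((E - v, -1.0), (1.0, 0.0))

def cocycle(E, potential, start, n):
    """A_n(T^start omega) = A(T^{start+n-1} omega) ... A(T^start omega)."""
    M = IDENTITY
    for k in range(start, start + n):
        M = matmul(schrodinger_matrix(E, potential[k]), M)
    return M

random.seed(0)
E, block, l = 0.3, 4, 5                      # block stands in for rm
potential = [random.uniform(-1.0, 1.0) for _ in range(block * l)]

full = cocycle(E, potential, 0, block * l)

# Blocks A_rm((T^rm)^j omega), with j = l-1 leftmost and j = 0 rightmost.
blockwise = IDENTITY
for j in range(l):
    blockwise = matmul(cocycle(E, potential, j * block, block), blockwise)

# Splitting at time k*rm: ||A_n|| <= ||A_{krm}|| * ||A_{n-krm}(T^{krm} omega)||.
k = 2
norm_full = op_norm(full)
norm_split = (op_norm(cocycle(E, potential, 0, k * block))
              * op_norm(cocycle(E, potential, k * block, (l - k) * block)))
max_gap = max(abs(full[a][b] - blockwise[a][b]) for a in range(2) for b in range(2))
```

Here `max_gap` measures the entrywise difference between the direct product and the blockwise product, and `norm_full` is compared with `norm_split` to illustrate submultiplicativity.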

Thus by setting \(\delta '=\min \{\delta _1, \delta _3\}\), we have for all large l and \(n\ge lrm+N_1\) that

$$\begin{aligned} \Vert A_{n}(\omega )\Vert ^2<e^{(\alpha -\delta ')n}. \end{aligned}$$
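For the reader's convenience, one way to see how the three factor bounds combine is sketched below. Here C is the constant bounding the first factor, the second factor is bounded via the choice \(k=l\), and we use \(\delta '\le \delta _1\), \(\delta '\le \delta _3<\delta _2\), and \(n-lrm>0\). This is a sketch under these stated bounds rather than a verbatim step of the original argument.

```latex
\begin{aligned}
\Vert A_n(\omega )\Vert ^2
 &< C\, e^{(\alpha -\delta _2)lrm}\, e^{(\alpha -\delta _1)(n-lrm)} \\
 &= C\, e^{(\alpha -\delta ')n}\, e^{-(\delta _2-\delta ')lrm}\, e^{-(\delta _1-\delta ')(n-lrm)} \\
 &\le C\, e^{-(\delta _2-\delta _3)lrm}\, e^{(\alpha -\delta ')n}.
\end{aligned}
```

Since \(\delta _2-\delta _3>0\), the prefactor \(C e^{-(\delta _2-\delta _3)lrm}\) is at most 1 once l is large enough, which is where the restriction to large l enters.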

Combining this with the estimate in the case \(N'\le n\le lrm+N_1\), we obtain for all large l, all \(\omega \in {{\mathcal {D}}}^l_+\), and all \(n\ge N'\) that

$$\begin{aligned} \Vert A_n(\omega )\Vert ^2<e^{(\alpha -\delta ')n}, \end{aligned}$$

which implies that \({{\mathcal {D}}}^l_+\subset K_s(N',\delta ')\cap [0;p_0,\ldots , p_{lrm}]\) for all large l. Note that \(0<\delta _i<\delta \), \(i=1,2,3\), are arbitrarily chosen, hence \(\delta '\) can be any number in \((0,\delta )\). In particular, for every \(\delta _0\in (0,\delta )\), there is an \(N'\) such that \({{\mathcal {D}}}^l_+\subset K_s(N',\delta _0)\cap [0;p_0,\ldots , p_{lrm}]\) for all large l.

Similarly, for each \(0<\delta _0<\delta \), we can find an \(N''\in {{\mathbb {Z}}}_+\) and a sequence of u-locally saturated sets \({{\mathcal {D}}}^l_-\subset K_u(N'',\delta _0)\cap [-lrm; p_{-lrm},\ldots , p_{-1}, p_0]\) with \(\mu ({{\mathcal {D}}}^l_-)>0\). Taking \(N_0=\max \{N',N''\}\), we have for all large l,

$$\begin{aligned}&{{\mathcal {D}}}^l_-\cap {{\mathcal {D}}}^l_+\subset K_s(N_0,\delta _0)\cap K_u(N_0,\delta _0),\\&{{\mathcal {D}}}^l_-\cap {{\mathcal {D}}}^l_+\subset [-lrm; p_{-lrm},\ldots , p_{lrm}], \end{aligned}$$

where the second line implies that \({{\mathcal {D}}}^l_-\cap {{\mathcal {D}}}^l_+\) is contained in an arbitrarily small neighborhood of p as l tends to infinity. Finally, combining the facts that \({{\mathcal {D}}}^l_+\) is s-locally saturated in \([0;p_0]\) and \({{\mathcal {D}}}^l_-\) is u-locally saturated in \([0;p_0]\) with (2.10), we have for all large l,

$$\begin{aligned} \mu ({{\mathcal {D}}}^l_-\cap {{\mathcal {D}}}^l_+)>0, \end{aligned}$$

which then implies that \(p \in \mathrm {supp}\left( \mu |_{K_{s}(N_0,\delta _0)\cap K_{u}(N_0,\delta _0)}\right) .\) This concludes the proof since the choices of \(\delta _0\) and \(\delta \) such that \(0<\delta _0<\delta <\min \{ \alpha - 2L(A,\mu ), \alpha -2L(A,p) \}\) are all arbitrary. \(\square \)

4.2 Invariance principle and su-states

Let \({{\mathcal {M}}}\) be the Borel \(\sigma \)-algebra of the subshift of finite type \((\Omega ,T,\mu )\), where \(\mu \) has a local product structure. Let \(A:\Omega \rightarrow {\mathrm {SL}}(2,{{\mathbb {R}}})\) be a measurable map. Then the following invariance principle is due to Ledrappier [33], see also [3, 39]:

Proposition 4.5

Let \({{\mathcal {B}}}\subseteq {{\mathcal {M}}}\) be a \(\sigma \)-algebra such that

  (1) \(T^{-1}{{\mathcal {B}}}\subseteq {{\mathcal {B}}}\) mod 0 and \(\{T^n{{\mathcal {B}}}: n\in {{\mathbb {Z}}}\}\) generates \({{\mathcal {M}}}\) mod 0;

  (2) the \(\sigma \)-algebra generated by A is contained in \({{\mathcal {B}}}\) mod 0.

If \(L(A,\mu )=0\), then for any \((T,A)\)-invariant measure m on \(\Omega \times {{\mathbb {R}}}{\mathbb {P}}^1\) that projects to \(\mu \) in the first component, the disintegration \(\{m_\omega \}_{\omega \in \Omega }\) is \({{\mathcal {B}}}\)-measurable mod 0.

Definition 4.6

We say that a function defined on \(\Omega \) only depends on the future (resp., past) if it is constant on every local stable (resp., unstable) set.

The following consequence of Proposition 4.5 is due to [7]. We sketch a proof for the reader’s convenience.

Proposition 4.7

Suppose A only depends on the future and \(L(A,\mu ) = 0\). Then for every \((T,A)\)-invariant measure m on \(\Omega \times {{\mathbb {R}}}{\mathbb {P}}^1\) that projects to \(\mu \) in the first component, its disintegration only depends on the future for \(\mu \)-almost every \(\omega \in \Omega \).

Proof

Let \({{\mathcal {B}}}\subseteq {{\mathcal {M}}}\) be the \(\sigma \)-algebra generated by sets \(\{W^s_{\mathrm {loc}}(\omega ): \omega \in \Omega \}\). It is clear that the sets \(W^s_{\mathrm {loc}}(\omega )\) are mutually disjoint. Thus, \(D \in {{\mathcal {B}}}\) if and only if for each \(\omega \in \Omega \), either \(W^s_{\mathrm {loc}}(\omega )\cap D = \varnothing \) or \(W^s_{\mathrm {loc}}(\omega ) \subseteq D\). Since \(T {{\mathcal {B}}}\) is the \(\sigma \)-algebra generated by \(\{ T W^s_{\mathrm {loc}}(\omega ) : \omega \in \Omega \}\), it is clear that \({{\mathcal {B}}}\subseteq T{{\mathcal {B}}}\), or equivalently \(T^{-1} {{\mathcal {B}}}\subseteq {{\mathcal {B}}}\). More generally, \(T^n {{\mathcal {B}}}\) is generated by \(\{ T^n W^s_{\mathrm {loc}}(\omega ) : \omega \in \Omega \}\). Now for any cylinder \([n;\underline{l}]\subset \Omega \), it is clear that it is \(T^n {{\mathcal {B}}}\)-measurable for some large \(n \in {{\mathbb {Z}}}_+\). Since \({{\mathcal {M}}}\) is generated by cylinders, we then have that \(\{ T^n {{\mathcal {B}}}: n \in {{\mathbb {Z}}}\}\) generates \({{\mathcal {M}}}\) mod 0. The result then follows from Proposition 4.5 and the straightforward fact that A is \({{\mathcal {B}}}\)-measurable if and only if A depends on the future. \(\square \)

An immediate consequence of Proposition 4.7 is that if A is constant along the local unstable sets and \(L(A,\mu )=0\), then for every \((T,A)\)-invariant measure m on \(\Omega \times {{\mathbb {R}}}{\mathbb {P}}^1\) that projects to \(\mu \) in the first component, its disintegration is constant on the local unstable set \(W^u_{\mathrm {loc}}(\omega )\) for \(\mu \)-almost every \(\omega \in \Omega \). Indeed, we just need to define \(\omega ' = \phi (\omega )\) to be the sequence for which \(\omega '_n=\omega _{-n}\) for all \(n\in {{\mathbb {Z}}}\) and set

$$\begin{aligned} \Omega ' := \{ \omega ' = \phi (\omega ) : \omega \in \Omega \}. \end{aligned}$$

Then \(\mu \) is again an ergodic measure of \((\Omega ',T)\) which has a local product structure. Set \(A'(\omega ') = A(\phi (\omega ))\) so that \(A'\) depends only on the future. Then it is a standard result that \(L(A',\mu ) = L(A,\mu )=0\) and m is \((T,A')\)-invariant if it is \((T,A)\)-invariant. Now the conclusion follows from Proposition 4.7.

We have the following consequence regarding the existence of su-states.

Proposition 4.8

Suppose the cocycle map A is measurable, satisfies the integrability condition \(\int _\Omega \log \Vert A(\omega )\Vert \, d\mu < \infty \), and admits measurable canonical stable and unstable holonomies which satisfy the integrability conditions (4.5) and (4.6). If \(L(A,\mu ) = 0\), then every \((T,A)\)-invariant measure m on \(\Omega \times {{\mathbb {R}}}{\mathbb {P}}^1\) that projects to \(\mu \) in the first component has a disintegration that is almost surely invariant under the stable and unstable holonomies.

Proof

First we consider the s-invariance. For simplicity, we define \(\varphi (\omega )=\omega ^{(\omega _0)} \wedge \omega \), which depends only on the future. We define a new cocycle map as follows:

$$\begin{aligned} {\tilde{A}}(\omega ):=H^s_{T\omega , \varphi (T\omega )}\cdot A(\omega )\cdot H^s_{\varphi (\omega ), \omega }. \end{aligned}$$
(4.16)

It is clear that \({\tilde{A}}\) is conjugate to A via the stable holonomy. By the condition (4.5) and the definition of \({\tilde{A}}\), we then obtain \(\int _\Omega \log \Vert {\tilde{A}}(\omega )\Vert \, d\mu < \infty \) and \(L({\tilde{A}},\mu )=0\). On the other hand, by conditions (i)–(ii) of the definition of stable holonomy, we have that

$$\begin{aligned} {\tilde{A}}(\omega )&=H^s_{T\omega , \varphi (T\omega )}\cdot A(\omega )\cdot H^s_{\varphi (\omega ),\omega }\\&= H^s_{T\omega , \varphi (T\omega )}\cdot H^s_{T\varphi (\omega ),T\omega }\cdot A(\varphi (\omega ))\\&= H^s_{T\varphi (\omega ),\varphi (T\omega )}\cdot A(\varphi (\omega )), \end{aligned}$$

which implies that \({\tilde{A}}(\omega )\) depends only on the future. Thus, by Proposition 4.7, the disintegration of every \((T,{\tilde{A}})\)-invariant measure that projects to \(\mu \) in the first component depends only on the future.

Now let m be a \((T,A)\)-invariant measure that projects to \(\mu \) in the first component. Let \(\{m_\omega : \omega \in \Omega \}\) be a disintegration of m. Thus \(A(\omega )_*m_\omega =m_{T\omega }\) for \(\mu \)-almost every \(\omega \). We define

$$\begin{aligned} {\tilde{m}}_\omega =(H^s_{\omega ,\varphi (\omega )})_*m_\omega ,\ \omega \in \Omega . \end{aligned}$$
(4.17)

One readily checks that \({\tilde{A}}(\omega )_*{\tilde{m}}_\omega ={\tilde{m}}_{T(\omega )}\). Thus the family of conditional measures \(\{{\tilde{m}}_\omega :\omega \in \Omega \}\) is a disintegration of a \((T,{\tilde{A}})\)-invariant measure \({\tilde{m}}\), and hence \({\tilde{m}}_\omega \) depends only on the future. In other words, for \(\mu \)-almost every \(\omega \), we have for each \(\omega '\in W^s_{\mathrm {loc}}(\omega )\) that

$$\begin{aligned} (H^s_{\omega ,\varphi (\omega )})_*m_\omega =(H^s_{\omega ',\varphi (\omega ')})_*m_{\omega '}. \end{aligned}$$
(4.18)

Since \(\varphi (\omega )=\varphi (\omega ')\), by condition (i) of the definition of stable holonomy we have

$$\begin{aligned} m_\omega =(H^s_{\varphi (\omega '),\omega }\cdot H^s_{\omega ',\varphi (\omega ')})_*m_{\omega '}=(H^s_{\omega ',\omega })_*m_{\omega '}. \end{aligned}$$
(4.19)

In other words, \(\{m_\omega :\omega \in \Omega \}\) is s-invariant \(\mu \)-almost everywhere, concluding the proof of the s-invariance.
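The identity \({\tilde{A}}(\omega )_*{\tilde{m}}_\omega ={\tilde{m}}_{T\omega }\) used in this argument can be spelled out as follows. This is a sketch that combines (4.16), (4.17), and \(A(\omega )_*m_\omega =m_{T\omega }\), and it assumes that the composition rule for stable holonomies gives \(H^s_{\varphi (\omega ),\omega }\cdot H^s_{\omega ,\varphi (\omega )}=I\).

```latex
\begin{aligned}
{\tilde{A}}(\omega )_*{\tilde{m}}_\omega
 &= \big (H^s_{T\omega ,\varphi (T\omega )}\cdot A(\omega )\cdot H^s_{\varphi (\omega ),\omega }\cdot H^s_{\omega ,\varphi (\omega )}\big )_*m_\omega \\
 &= \big (H^s_{T\omega ,\varphi (T\omega )}\big )_*A(\omega )_*m_\omega \\
 &= \big (H^s_{T\omega ,\varphi (T\omega )}\big )_*m_{T\omega } \\
 &= {\tilde{m}}_{T\omega }.
\end{aligned}
```

The first and last equalities are the definitions of \({\tilde{A}}\) and \({\tilde{m}}\); the third holds for \(\mu \)-almost every \(\omega \).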

As for the u-invariance, we just need to conjugate A to a new \({\tilde{A}}\) via the unstable holonomy so that \({\tilde{A}}\) is constant along the local unstable set \(W^u_{\mathrm {loc}}(\omega )\) for \(\mu \)-almost every \(\omega \in \Omega \). Then by repeating the same argument as above and using the remark following Proposition 4.7, we obtain that \(\{m_\omega \}\) is u-invariant \(\mu \)-almost everywhere. This completes the proof. \(\square \)

Lemma 4.9

Assume that \(L(A,\mu )=0\) and \(\mu \) has the bounded distortion property. Then there exists a full measure set \(K \subset \Omega \) on which one has measurable stable and unstable holonomies. Moreover, every \((T,A)\)-invariant measure m on \(\Omega \times {{\mathbb {R}}}{\mathbb {P}}^1\) that projects to \(\mu \) in the first component has a continuous, su-invariant disintegration over \(\mathrm {supp}(\mu |_K)\cap K\).

Proof

Since \(L(A,\mu )=0\), Lemma 4.2 applies and yields, for each \(\delta \) with \(0<\delta <\alpha \), the sets \(K_s(N,\delta )\), \(K_u(N,\delta )\) along with continuous families of holonomies satisfying the estimates required to apply Proposition 4.8. Thus, for any \((T,A)\)-invariant measure m on \(\Omega \times {{\mathbb {R}}}{\mathbb {P}}^1\) that projects to \(\mu \) in the first component, Proposition 4.8 shows that its disintegration \(\{ m_\omega \}\) is invariant almost everywhere with respect to the stable and unstable holonomies. Recall that both \(\bigcup _{N > 0} K_s(N,\delta )\) and \(\bigcup _{N > 0} K_u(N,\delta )\) have full measure. We let

$$\begin{aligned} K_\delta = \left( \bigcup _{N> 0} K_s(N,\delta ) \right) \cap \left( \bigcup _{N > 0} K_u(N,\delta ) \right) . \end{aligned}$$
(4.20)

As in [7], we can now produce a disintegration \(\{ {\tilde{m}}_\omega \}\) over \(\mathrm {supp} (\mu |_{K_\delta })\cap K_\delta \), which is holonomy-invariant and continuous. For the reader’s convenience we give the argument. In the following argument, we will work in the full measure set \(K_{\delta }\) so that the stable and unstable holonomies are defined on the local stable and unstable set of \(\omega \in K_\delta \), respectively.

For each \(\omega \in K_\delta \), if \(m_{\omega '}\) already exists for some \(\omega '\in W^s_{\mathrm {loc}}(\omega )\) from the original disintegration of m above, then we may define \(m^s_{\omega ''}\) via \(H^s(\omega ', \omega '')_*m_{\omega '}\) for each \(\omega ''\in W^s_{\mathrm {loc}}(\omega )\). If \(m_{\omega '}\) does not exist from the original disintegration of m for all \(\omega '\in W^s_{\mathrm {loc}}(\omega )\), then \(W^s_{\mathrm {loc}}(\omega )\) is a \(\mu \)-zero measure set and we may pick any probability measure \(m^s_\omega \) and extend via \(m^s_{\omega ''}=H^s(\omega , \omega '')_*m^s_{\omega }\) for each \(\omega ''\in W^s_{\mathrm {loc}}(\omega )\). Clearly, the new family \(m^s_\omega \) is invariant under the stable holonomy at every \(\omega \in K_{\delta }\). On the other hand, due to the almost sure invariance of the original disintegration \(m_\omega \) of m, the new family \(m^s_\omega \) coincides with \(m_\omega \) for \(\mu \)-almost every \(\omega \) in \(K_\delta \), and hence it also coincides with \(m_\omega \) for \(\mu \)-almost every \(\omega \). In particular, \(m^s_\omega \) is again a disintegration of m. Similarly, we may construct another disintegration \(m^u_\omega \) of m which is invariant under the unstable holonomy at every \(\omega \in K_\delta \). Note that the set \({\widetilde{K}}=\{\omega \in K_\delta : m^s_{\omega } = m^u_{\omega } \}\) has full \(\mu \)-measure.

Clearly, for each \(1\le j\le \ell \), \([0;j]\cap {\widetilde{K}}\) has full \(\mu \)-measure in [0; j]. By the local product structure of \(\mu \), \([0;j]\cap {\widetilde{K}}\) has full \(\mu ^-\times \mu ^+\)-measure in [0; j]. Thus by Fubini’s theorem, for \(\mu ^-\)-almost every \(\omega ^-\in [0;j]^-\), we have that \(\pi ^+[(\{\omega ^-\}\times [0;j]^+)\cap {\widetilde{K}}]\) has full \(\mu ^+\)-measure in \([0;j]^+\). Note that for each \(\omega \in [0;j]\) with \(\pi ^-(\omega )=\omega ^-\), we have \(\{\omega ^-\}\times [0;j]^+=W^u_{\mathrm {loc}}(\omega )\). Thus for each \(1\le j\le \ell \), we may choose an \(\omega ^{(j)}\in [0;j] \cap K_\delta \) such that

$$\begin{aligned} \mu ^+\left( \pi ^+\big (W^u_{\mathrm {loc}}(\omega ^{(j)})\cap {\widetilde{K}}\big )\right) =\mu ^+([0;j]^+). \end{aligned}$$

By the definition of \(\mu ^+\), we then have that

$$\begin{aligned}&\mu \left( (\pi ^+)^{-1}\bigg [\pi ^+\big (\bigcup ^{\ell }_{j=1}\big (W^u_{\mathrm {loc}} (\omega ^{(j)})\cap {\widetilde{K}}\big )\big )\bigg ]\right) \nonumber \\&\quad = \mu ^+ \left( \pi ^+ \bigg ( \bigcup ^{\ell }_{j=1} \big (W^u_{\mathrm {loc}}(\omega ^{(j)}) \cap {\widetilde{K}}\big )\bigg ) \right) \nonumber \\&\quad = \sum ^{\ell }_{j=1}\mu ^+([0;j]^+) = 1. \end{aligned}$$
(4.21)

In other words, for \(\mu \)-almost every \(\omega \in K_\delta \), we have \(\omega ^{(\omega _0)}\wedge \omega \in W^u_{\mathrm {loc}}(\omega ^{(\omega _0)})\cap {\widetilde{K}}\). Now for each \(\omega \in K_\delta \), we define

$$\begin{aligned} {\tilde{m}}_\omega ^s = H^s_{\omega ^{(\omega _0)}\wedge \omega ,\omega } \cdot m^u_{\omega ^{(\omega _0)}\wedge \omega }= H^s_{\omega ^{(\omega _0)}\wedge \omega ,\omega } \cdot H^u_{\omega ^{(\omega _0)},\omega ^{(\omega _0)}\wedge \omega }\cdot m^u_{\omega ^{(\omega _0)}}. \end{aligned}$$

Recall that by the proof of Lemma 4.7, the stable and unstable holonomies are continuous on each local stable and unstable set, respectively. Thus the equalities in the definition of \({\tilde{m}}^s_\omega \) above imply that for each \(1\le j\le \ell \), \({\tilde{m}}^s_\omega \) is continuous in \(\omega \) on \([0;j]\cap K_\delta \). Thus \({\tilde{m}}^s_\omega \) is continuous on \(K_\delta \). Clearly, the construction also implies that \({\tilde{m}}^s_\omega \) is s-invariant. On the other hand, by invariance with respect to stable holonomies, for each \(\omega \) such that \(\omega ^{(\omega _0)}\wedge \omega \in W^u_{\mathrm {loc}}(\omega ^{(\omega _0)})\cap {\widetilde{K}}\), we have

$$\begin{aligned} {\tilde{m}}^s_\omega = H^s_{\omega ^{(\omega _0)}\wedge \omega ,\omega } \cdot m^u_{\omega ^{(\omega _0)}\wedge \omega } = H^s_{\omega ^{(\omega _0)}\wedge \omega ,\omega } \cdot m^s_{\omega ^{(\omega _0)}\wedge \omega } = m^s_\omega . \end{aligned}$$

Thus, we have \({\tilde{m}}_\omega ^s = m_\omega ^s\) for \(\mu \)-almost every \(\omega \in K_\delta \), and we obtain an s-invariant and continuous disintegration \(\{ {\tilde{m}}^s_\omega \}\) of m. Producing in an analogous fashion a u-invariant and continuous disintegration \(\{ {\tilde{m}}^u_\omega \}\), we find that \({\tilde{m}}^s_\omega = {\tilde{m}}^u_\omega \) on \(\mathrm {supp} (\mu |_{K_\delta })\cap K_\delta \) by continuity and almost everywhere coincidence. This produces an su-invariant continuous disintegration \(\{ {\tilde{m}}_\omega \}\) over \(\mathrm {supp}(\mu |_{K_\delta })\cap K_\delta \) by setting \({\tilde{m}}_\omega ={\tilde{m}}^s_\omega \). By continuity, we also have invariance under \((T,A)\), that is, \(A(\omega )_*{\tilde{m}}_\omega = {\tilde{m}}_{T\omega }\) for every \(\omega \in \mathrm {supp}(\mu |_{K_\delta })\cap K_\delta \). Clearly, any \(K_\delta \) can be chosen to be the desired K. \(\square \)

4.3 Application of conformal barycenter

Let \({\mathbb {H}} \subseteq {{\mathbb {C}}}\) be the upper-half plane, \({{\mathbb {D}}}\) the open unit disk, and \(S^1=\partial {{\mathbb {D}}}\) the unit circle. It is a standard result that the Möbius transformation associated with an element of the group \(\mathrm {SU}(1,1)\) preserves \(S^1\) and \({{\mathbb {D}}}\). Here \(P=\left( {\begin{matrix}a &{} b\\ {\bar{b}} &{} {\bar{a}}\end{matrix}}\right) \in \mathrm {SU}(1,1)\) if \(|a|^2-|b|^2=1\) and the Möbius transformation associated with it is \(P\cdot z=\frac{az+b}{{\bar{b}} z+{\bar{a}}}\). It is a standard result that \(\mathrm {SU}(1,1)\) is conjugate to \({\mathrm {SL}}(2,{{\mathbb {R}}})\) through the \({\mathrm {SL}}(2,{{\mathbb {C}}})\)-matrix \(Q=\frac{-1}{1+i}\left( {\begin{matrix}1 &{} -i\\ 1 &{} i\end{matrix}}\right) \), that is, \(Q^*\mathrm {SU}(1,1)Q={\mathrm {SL}}(2,{{\mathbb {R}}})\). In fact, we have the following commutative diagram:

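$$\begin{aligned} \begin{array}{ccc} {{\mathbb {H}}} & \xrightarrow {\ A\ } & {{\mathbb {H}}}\\ \downarrow Q & & \downarrow Q\\ {{\mathbb {D}}} & \xrightarrow {\ QAQ^*\ } & {{\mathbb {D}}} \end{array} \qquad A\in {\mathrm {SL}}(2,{{\mathbb {R}}}), \end{aligned}$$
(4.22)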

where all transformations are Möbius transformations, as well as homeomorphisms. Moreover, Q is a homeomorphism between their boundaries, that is, a homeomorphism from \({{\mathbb {R}}}{\mathbb {P}}^1 = {{\mathbb {R}}}\cup \{\infty \} = \partial {{\mathbb {H}}}\) to \(S^1 = \partial {{\mathbb {D}}}\). We need the following proposition from [25, Proposition 1].

Proposition 4.10

For each probability measure \(\nu \) on the unit circle \(S^1\) containing no atom of mass \(\ge \frac{1}{2}\), there is a unique point \(B(\nu )\in {\mathbb {D}}\), called the conformal barycenter of \(\nu \), so that the map \(\nu \mapsto B(\nu )\) is equivariant under the Möbius transformations of \(\mathrm {SU}(1,1)\), that is, \(B(P_*\nu ) = P\cdot B(\nu )\) for each \(P\in \mathrm {SU}(1,1)\).
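The equivariance \(B(P_*\nu ) = P\cdot B(\nu )\) can be illustrated on a simple atomic measure. The sketch below relies on an assumption that is not spelled out in this section, namely the Douady–Earle characterisation of \(B(\nu )\) as the unique point w of \({{\mathbb {D}}}\) at which \(\int (\zeta -w)/(1-{\bar{w}}\zeta )\, d\nu (\zeta )\) vanishes. The measure (three equal atoms at the cube roots of unity, so that \(B(\nu )=0\) by symmetry), the matrix P, and all function names are hypothetical choices made for illustration.

```python
import cmath
import math

def mobius(P, z):
    """Moebius action of a 2x2 complex matrix P = ((a, b), (c, d)) on z."""
    (a, b), (c, d) = P
    return (a * z + b) / (c * z + d)

def centering_integral(atoms, weights, w):
    """Integral of (zeta - w) / (1 - conj(w) * zeta) against an atomic measure.

    Assumption: the conformal barycenter is the unique w in the open disk
    at which this integral vanishes.
    """
    return sum(m * (z - w) / (1 - w.conjugate() * z)
               for z, m in zip(atoms, weights))

# nu: three atoms of mass 1/3 (all below 1/2) at the cube roots of unity.
atoms = [cmath.exp(2j * math.pi * k / 3) for k in range(3)]
weights = [1 / 3] * 3

# A hypothetical element of SU(1,1): |a|^2 - |b|^2 = 1.
b = 0.7 - 0.4j
a = math.sqrt(1 + abs(b) ** 2) * cmath.exp(0.9j)
P = ((a, b), (b.conjugate(), a.conjugate()))

pushed_atoms = [mobius(P, z) for z in atoms]   # atoms of P_* nu
candidate = mobius(P, 0j)                      # P . B(nu), with B(nu) = 0

at_zero = centering_integral(atoms, weights, 0j)
at_candidate = centering_integral(pushed_atoms, weights, candidate)
```

Both `at_zero` and `at_candidate` should vanish up to rounding, which is consistent with \(B(\nu )=0\) and \(B(P_*\nu )=P\cdot 0\) under the stated characterisation.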

Lemma 4.11

Let \((\Omega ,T,\mu )\), A, and \(K\subset \Omega \) be as in Lemma 4.9. Then there exists a family of A-invariant, su-invariant measures \({\hat{m}}_\omega \) over \(\mathrm {supp}(\mu |_K)\cap K\) such that for each \(\omega \in \mathrm {supp}(\mu |_K)\cap K\), \({\hat{m}}_\omega \) is supported by at most two points of \({{\mathbb {C}}}{\mathbb {P}}^1\).

Proof

We start with the continuous disintegration \(\{{\tilde{m}}_\omega \}\) of m over \(\mathrm {supp} (\mu |_K)\cap K\) constructed in Lemma 4.9. To produce the family of measures \(\{{\hat{m}}_{\omega }\}\), we distinguish three cases.

If \({\tilde{m}}_\omega \) has an atom \(z(\omega )\in {{\mathbb {R}}}{\mathbb {P}}^1\) of mass \(> 1/2\), we let \({\hat{m}}_\omega =\delta _{z(\omega )}\), that is, the Dirac measure (mass one) supported at the point \(z(\omega )\). By invariance of \({\tilde{m}}_\omega \) under the holonomies, it is clear that if \({\tilde{m}}_\omega \) has such a point \(z(\omega )\), then so does \({\tilde{m}}_{\omega '}\) for each point \(\omega '\) in \(W^s_{\mathrm {loc}}(\omega )\cup W^u_{\mathrm {loc}}(\omega )\). Moreover, \(z(\omega ')=H^*_{\omega ,\omega '}(z(\omega ))\) for \(*\in \{s,u\}\), which exactly implies that \(\delta _{z(\omega )}\) is invariant under the holonomies. Similarly, by invariance of \({\tilde{m}}_\omega \) under \(A(\omega )\), we have that \({\tilde{m}}_{T^n\omega }\) has such a point mass for all \(n\in {{\mathbb {Z}}}\) and \(A(\omega )_*\delta _{z(\omega )}=\delta _{z(T\omega )}\).

If \({\tilde{m}}_\omega \) contains two atoms of mass 1/2 each, we set \({\hat{m}}_\omega = {\tilde{m}}_\omega \). As in the first case above, \({\tilde{m}}_{\omega '}\) falls into this case for each point \(\omega '\) in \(W^s_{\mathrm {loc}}(\omega )\cup W^u_{\mathrm {loc}}(\omega )\cup \mathrm {Orb}(\omega )\), and \({\hat{m}}_\omega \) is invariant under the holonomies and \(A(\omega )\).

In all other cases, by Proposition 4.10, we define \({\hat{m}}_\omega \) to be the Dirac measure supported at

$$\begin{aligned} z(\omega ):=Q^{-1}\cdot B(Q_*{\tilde{m}}_\omega )\in {{\mathbb {H}}}, \end{aligned}$$
(4.23)

where \(B(Q_*{\tilde{m}}_\omega )\) is the conformal barycenter of the measure \(Q_*{\tilde{m}}_\omega \) on the unit circle \(S^1\). Note again that by holonomy invariance, if \({\tilde{m}}_\omega \) does not fall into either of the two cases above, then neither does \({\tilde{m}}_{\omega '}\) for each point \(\omega '\) in \(W^s_{\mathrm {loc}}(\omega )\cup W^u_{\mathrm {loc}}(\omega )\cup \mathrm {Orb}(\omega )\). Moreover, for \(\omega '\in W^s_{\mathrm {loc}}(\omega )\), we have \(QH^s_{\omega ,\omega '}Q^{-1}\in \mathrm {SU}(1,1)\), which together with Proposition 4.10 and the holonomy invariance of \({\tilde{m}}_\omega \) implies that

$$\begin{aligned} H^s_{\omega ,\omega '}\cdot z(\omega )&= H^s_{\omega ,\omega '}\cdot \big (Q^{-1} \cdot B(Q_*{\tilde{m}}_\omega )\big )\\&= Q^{-1}(QH^s_{\omega ,\omega '}Q^{-1})\cdot B(Q_*{\tilde{m}}_\omega )\\&=Q^{-1}\cdot B\left( (QH^s_{\omega ,\omega '}Q^{-1})_*Q_*{\tilde{m}}_\omega \right) \\&=Q^{-1}\cdot B\left( (QH^s_{\omega ,\omega '}Q^{-1}Q)_*{\tilde{m}}_\omega \right) \\&=Q^{-1}\cdot B\left( Q_*(H^s_{\omega ,\omega '})_*{\tilde{m}}_\omega \right) \\&=Q^{-1}\cdot B\left( Q_*{\tilde{m}}_{\omega '}\right) \\&=z(\omega '), \end{aligned}$$

which in turn implies that \({\hat{m}}_\omega \) is invariant under the stable holonomy. By a similar argument we can establish the invariance under the unstable holonomy and under \(A(\omega )\). \(\square \)
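The proof uses that conjugation by Q carries \({\mathrm {SL}}(2,{{\mathbb {R}}})\) into \(\mathrm {SU}(1,1)\) and that Q maps \({{\mathbb {H}}}\) to \({{\mathbb {D}}}\). The following minimal sketch checks these facts numerically for the matrix Q of Sect. 4.3. The sample matrix `H` is a hypothetical stand-in for a holonomy, and the helper names are ours.

```python
# Q as given in Sect. 4.3: Q = -1/(1+i) * [[1, -i], [1, i]].
Q_SCALAR = -1 / (1 + 1j)
Q = ((Q_SCALAR * 1, Q_SCALAR * -1j), (Q_SCALAR * 1, Q_SCALAR * 1j))

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def adjoint(X):
    """Conjugate transpose of a 2x2 complex matrix."""
    return tuple(tuple(complex(X[j][i]).conjugate() for j in range(2))
                 for i in range(2))

def mobius(P, z):
    """Moebius action of P on the complex number z."""
    return (P[0][0] * z + P[0][1]) / (P[1][0] * z + P[1][1])

def is_su11(P, tol=1e-12):
    """Check P = [[a, b], [conj(b), conj(a)]] with |a|^2 - |b|^2 = 1."""
    a, b = P[0][0], P[0][1]
    return (abs(P[1][0] - b.conjugate()) < tol
            and abs(P[1][1] - a.conjugate()) < tol
            and abs(abs(a) ** 2 - abs(b) ** 2 - 1) < tol)

gram = matmul(adjoint(Q), Q)                       # should be the identity
det_Q = Q[0][0] * Q[1][1] - Q[0][1] * Q[1][0]      # should equal 1

# Hypothetical SL(2,R) matrix: determinant 2*2 - 3*1 = 1.
H = ((2.0, 3.0), (1.0, 2.0))
conjugated = matmul(matmul(Q, H), adjoint(Q))      # Q H Q^* = Q H Q^{-1}

center_image = mobius(Q, 1j)                       # the point i of H goes to 0
boundary_images = [mobius(Q, x) for x in (-5.0, -1.0, 0.0, 0.5, 7.0)]
```

Since Q is unitary, \(Q^*=Q^{-1}\), so `conjugated` is exactly the matrix \(QHQ^{-1}\) appearing in the computation above; the boundary images should lie on \(S^1\).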

4.4 Local su-invariance

In this subsection, we drop the assumption that \(\mu \) has bounded distortion and assume only that it has a local product structure. Note that to ensure the existence of the su-state when only assuming that the Lyapunov exponent is small, we need the bounded distortion property of \(\mu \) so that we have the integrability conditions on the measurable stable and unstable holonomies. But to prove our main results concerning the positivity of the Lyapunov exponent, we actually only need a “local su-state” for which a local product structure suffices.

We adapt the techniques from [39] to produce a certain disintegration of m that has local su-invariance. Throughout this subsection, we assume \(A \in C^\alpha (\Omega ,{\mathrm {SL}}(2,{{\mathbb {R}}}))\) to be such that \(L(A,\mu ) = 0\) and we fix a \(\delta \) with \(\frac{\alpha }{2}< \delta < \alpha \).

We start with the following consequence of the proof of Lemma 4.2. Recall that \(K_s(N,\delta )\) was defined in (4.3).

Lemma 4.12

Let \((\Omega ,T,\mu )\), A, and \(\delta \) be as above. Then there exists a \({\tilde{C}}={\tilde{C}}(\delta ,N)\) so that the following holds true. For all \(\omega \in K_s(N,\delta )\), all \({{\tilde{\omega }}} \in W^s_{\mathrm {loc}}(\omega )\), and all \(j \ge 0\), we have

$$\begin{aligned} H^s_{T^j{{\tilde{\omega }}},T^j\omega }:=\lim _{n\rightarrow \infty } H^{s,n}_{T^j{{\tilde{\omega }}},T^j\omega } \text{ exists } \text{ and } \Vert H^s_{T^j{{\tilde{\omega }}},T^j\omega }\Vert \le {\tilde{C}}. \end{aligned}$$
(4.24)

Proof

By (4.9), we have that

$$\begin{aligned} \Vert H^s_{{{\tilde{\omega }}},\omega }\Vert =\Vert \lim _{n\rightarrow \infty }H^{s,n}_{{{\tilde{\omega }}}, \omega }\Vert \le e^{CN}. \end{aligned}$$

A direct computation shows that

$$\begin{aligned} H^{s,n}_{T^j{{\tilde{\omega }}},T^j\omega }=A_j({{\tilde{\omega }}}) H^{s,n+j}_{{{\tilde{\omega }}},\omega }A_j(\omega )^{-1}, \end{aligned}$$

which implies the existence of

$$\begin{aligned}&H^{s}_{T^j{{\tilde{\omega }}},T^j\omega }:=\lim _{n\rightarrow \infty } H^{s,n}_{T^j{{\tilde{\omega }}},T^j\omega } \text{ and } \\&\Vert H^{s}_{T^j{{\tilde{\omega }}},T^j\omega }\Vert \le \Vert H^{s}_{{{\tilde{\omega }}},\omega }\Vert \cdot \Vert A_j(\omega )\Vert \cdot \Vert A_j ({{\tilde{\omega }}})\Vert . \end{aligned}$$

In particular, for all \(0\le j\le N\), we have

$$\begin{aligned} \Vert H^{s}_{T^j{{\tilde{\omega }}},T^j\omega }\Vert \le e^{3CN}. \end{aligned}$$
(4.25)
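The conjugation identity behind (4.25) can be checked numerically. The sketch below assumes the definition \(H^{s,n}_{{{\tilde{\omega }}},\omega }=A_n({{\tilde{\omega }}})^{-1}A_n(\omega )\) (which is consistent with the identity stated above but is not restated in this subsection) and uses Schrödinger matrices as in (1.3). The two lists of one-step matrices are hypothetical data standing in for the orbits of \(\omega \) and \({{\tilde{\omega }}}\).

```python
import math
import random

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inverse_sl2(X):
    """Inverse of a 2x2 matrix of determinant 1."""
    return ((X[1][1], -X[0][1]), (-X[1][0], X[0][0]))

def op_norm(X):
    """Operator norm of a real 2x2 matrix (its largest singular value)."""
    f2 = sum(x * x for row in X for x in row)
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return math.sqrt((f2 + math.sqrt(max(f2 * f2 - 4 * det * det, 0.0))) / 2)

def product(steps, start, n):
    """A_n at the point shifted by `start`: steps[start+n-1] ... steps[start]."""
    M = ((1.0, 0.0), (0.0, 1.0))
    for k in range(start, start + n):
        M = matmul(steps[k], M)
    return M

def approx_holonomy(steps_tilde, steps, start, n):
    """Assumed definition: H^{s,n} = A_n(tilde omega)^{-1} A_n(omega)."""
    return matmul(inverse_sl2(product(steps_tilde, start, n)),
                  product(steps, start, n))

random.seed(1)
E, length, j, n = 0.3, 12, 3, 5
steps = [((E - random.uniform(-1, 1), -1.0), (1.0, 0.0)) for _ in range(length)]
steps_tilde = [((E - random.uniform(-1, 1), -1.0), (1.0, 0.0)) for _ in range(length)]

# Left side: H^{s,n} at the shifted points T^j tilde omega, T^j omega.
lhs = approx_holonomy(steps_tilde, steps, j, n)
# Right side: A_j(tilde omega) H^{s,n+j} A_j(omega)^{-1}.
rhs = matmul(matmul(product(steps_tilde, 0, j),
                    approx_holonomy(steps_tilde, steps, 0, n + j)),
             inverse_sl2(product(steps, 0, j)))

max_gap = max(abs(lhs[a][b] - rhs[a][b]) for a in range(2) for b in range(2))
norm_bound = (op_norm(product(steps_tilde, 0, j))
              * op_norm(approx_holonomy(steps_tilde, steps, 0, n + j))
              * op_norm(product(steps, 0, j)))
```

The quantity `norm_bound` uses \(\Vert A_j(\omega )^{-1}\Vert =\Vert A_j(\omega )\Vert \), which holds for matrices of determinant one.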

Fix a \(j>N\). Then we have \(\Vert A_j(\omega )\Vert ^2 < e^{(\alpha -\delta )j}\) since \(\omega \in K_s(N,\delta )\). Using (4.8), a direct computation shows that

$$\begin{aligned} \Vert \delta ^{s,n}_{T^j{{\tilde{\omega }}},T^j\omega }\Vert&=\Vert A_j(\omega )\delta ^{s,n+j}_{{{\tilde{\omega }}},\omega }A_j(\omega )^{-1}\Vert \\&\le Ce^{(\alpha -\delta )j}e^{(\alpha -\delta )(n+j)}e^{-\alpha (n+j)}\\&= Ce^{(\alpha -2\delta )j}e^{-\delta n} \\&< Ce^{-\delta n}, \end{aligned}$$

where the last inequality follows from the fact \(2\delta >\alpha \). Combining the estimate above with the proof of (4.9), we obtain for all \(j\ge N\)

$$\begin{aligned} \Vert H^{s}_{T^j{{\tilde{\omega }}},T^j\omega }\Vert \le \Vert H^{s,N}_{T^j{{\tilde{\omega }}},T^j\omega }\Vert \cdot \exp \left( C\sum ^{\infty }_{n=N}\Vert \delta ^{s,n}_{T^j{{\tilde{\omega }}},T^j\omega }\Vert \right) \le Ce^{2CN}. \end{aligned}$$
(4.26)
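The uniformity in j of the exponential factor in (4.26) comes from a geometric series. As a sketch, using only the bound \(\Vert \delta ^{s,n}_{T^j{{\tilde{\omega }}},T^j\omega }\Vert <Ce^{-\delta n}\) obtained above, one has

```latex
\begin{aligned}
\sum ^{\infty }_{n=N}\Vert \delta ^{s,n}_{T^j{{\tilde{\omega }}},T^j\omega }\Vert
 < C\sum ^{\infty }_{n=N}e^{-\delta n}
 = \frac{C\, e^{-\delta N}}{1-e^{-\delta }}
 \le \frac{C}{1-e^{-\delta }},
\end{aligned}
```

and the right-hand side depends on neither j nor the pair \((\omega ,{{\tilde{\omega }}})\).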

Combining (4.25) and (4.26), we clearly obtain the latter half of (4.24), where we may take \({\tilde{C}}=\max \{e^{3CN}, Ce^{2CN}\}\). \(\square \)

So we may choose N large so that \(K(N,\delta )=K_s(N,\delta )\cap K_u(N,\delta )\) (which were defined in (4.3)) has measure sufficiently close to 1 and \(\mu (K(N,\delta )\cap [0;j])>0\) for all \(1\le j\le \ell \). We set \(K^j_\tau :=K_\tau (N,\delta )\cap [0;j]\) for \(\tau \in \{s,u\}\) and \(K^j=K^j_s\cap K^j_u\).

Lemma 4.13

Let \((\Omega ,T,\mu )\), A, \(\delta \) be as in Lemma 4.12 and let \(K(N,\delta )\) be as above. Then for every \((T,A)\)-invariant measure m that projects to \(\mu \) in the first component, there is a disintegration \(\{m_\omega : \omega \in \Omega \}\) of m that is su-invariant for \(\mu \)-almost every \(\omega \in K(N,\delta )\).

Proof

We only consider the case of s-invariance, as u-invariance can be established in a completely analogous way. We break the argument into three steps.

Step I. As in the proof of Proposition 4.7, the first step is to construct a certain \(\sigma \)-algebra \({{\mathcal {B}}}\) to which we can apply the invariance principle as formulated in Proposition 4.5. By (4.9), \(K_s(N,\delta )\) is s-saturated. Similarly, \(K_u(N,\delta )\) is u-saturated. Fix an \(\omega ^j\in K^j\) and set \(S=W^u_{\mathrm {loc}}(\omega ^j)\cap K^j\). For each \(\omega '\in S\), we define \(r(\omega ')=1\) if \(T(W^s_{\mathrm {loc}}(\omega '))\cap W^s_{\mathrm {loc}}(\omega '')\ne \varnothing \) for some \(\omega ''\in S\); otherwise, we define \(2\le r(\omega ')\in {{\mathbb {Z}}}_+\cup \{\infty \}\) to be the largest number such that \(T^i(W^s_{\mathrm {loc}}(\omega '))\cap W^s_{\mathrm {loc}}(\omega '') = \varnothing \) for all \(\omega ''\in S\) and for all \(0<i< r(\omega ')\). Now we define the \(\sigma \)-algebra \({{\mathcal {B}}}\subseteq {{\mathcal {M}}}\) to be the one generated by the family

$$\begin{aligned} \{T^i(W^s_{\mathrm {loc}}(\omega ')): \omega '\in S,\ 0\le i< r(\omega ')\}. \end{aligned}$$

By our definition of \(r(\omega ')\), it is clear that the sets in the family above are mutually disjoint. Thus, \({{\mathcal {B}}}\) contains all \(B\in {{\mathcal {M}}}\) such that for all \(\omega '\in S\) and all \(0\le i<r(\omega ')\), either \(B\cap T^i(W^s_{\mathrm {loc}}(\omega '))=\varnothing \) or \(T^i(W^s_{\mathrm {loc}}(\omega '))\subset B\). First we claim that \({{\mathcal {B}}}\) satisfies condition (1) of Proposition 4.5. The proof is analogous to the one of Proposition 4.7. Indeed, \(T{{\mathcal {B}}}\) is the \(\sigma \)-algebra generated by

$$\begin{aligned} \{T^{i+1}(W^s_{\mathrm {loc}}(\omega ')): \omega '\in S,\ 0\le i< r(\omega ')\}, \end{aligned}$$

which is again a family of mutually disjoint sets. Since \(T^{r(\omega ')}(W^s_{\mathrm {loc}}(\omega ')) \subseteq W^s_{\mathrm {loc}}(\omega '')\) for some \(\omega ''\in S\), one readily checks that \(B\in {{\mathcal {B}}}\) implies \(B\in T{{\mathcal {B}}}\). Hence, we have that \(T{{\mathcal {B}}}\) contains \({{\mathcal {B}}}\), or equivalently, \(T^{-1} {{\mathcal {B}}}\subseteq {{\mathcal {B}}}\). More generally, for all \(n\ge 1\), we have that \(T^n{{\mathcal {B}}}\) is generated by \(\{ T^{i+n} (W^s_{\mathrm {loc}}(\omega ')) : \omega '\in S,\ 0\le i < r(\omega ')\}\), which implies that \(T^n {{\mathcal {B}}},\ n\ge 1\) generates \({{\mathcal {M}}}\) mod 0. Indeed, since \({{\mathcal {M}}}\) is generated by cylinders, we just need to show that all \([k;\underline{l}]\) are contained in \(T^n{{\mathcal {B}}}\) for some large n. Taking any \(n\ge |k|\), it is clear that \([k;\underline{l}]\in T^n{{\mathcal {B}}}\).

Step II. Similarly to the proof of Proposition 4.8, our second step is to conjugate A to some \({\tilde{A}}\), which is measurable with respect to \({{\mathcal {B}}}\). We define \({\tilde{A}}\) by

$$\begin{aligned} {\tilde{A}}(\omega ):=H^s_{T\omega ,T^{i+1}\omega '}A(\omega )H^s_{T^i\omega ',\omega } =A(T^i\omega ') \end{aligned}$$
(4.27)

if \(\omega \in T^{i}(W^s_{\mathrm {loc}}(\omega '))\) for some \(\omega '\in S\) (so that \(\omega \in W^s_{\mathrm {loc}}(T^i\omega ')\) ) and \(0\le i<r(\omega ')\); and

$$\begin{aligned} {\tilde{A}}(\omega ):=A(\omega ) \text{ otherwise }. \end{aligned}$$
(4.28)

Clearly, if we set \(B(\omega )\) as

$$\begin{aligned} B(\omega )= {\left\{ \begin{array}{ll}H^s_{\omega ,T^j\omega '}, &{}\omega \in T^j(W^s_{\mathrm {loc}}(\omega ')), \omega '\in S, \text{ and } 0\le j<r(\omega ')\\ I_2, &{}\text{ otherwise }, \end{array}\right. } \end{aligned}$$
(4.29)

then \({\tilde{A}}(\omega )=B(T\omega )A(\omega )B(\omega )^{-1}\). In other words, \({\tilde{A}}\) is conjugate to A via B. Combining this with the fact that \(\omega '\in S\subseteq K_s(N,\delta )\) and Lemma 4.12, we have \(\Vert B(\omega )\Vert \le {\tilde{C}}\) for all \(\omega \in \Omega \). In particular, \(\int \log \Vert {\tilde{A}}\Vert \, d\mu < \infty \) and \(L({\tilde{A}},\mu )=0\). By definition, \({\tilde{A}}\) is constant on \(T^i(W^s_{\mathrm {loc}}(\omega '))\) for any \(\omega '\in S\) and any \(0\le i<r(\omega ')\), which clearly implies that \({\tilde{A}}\) is \({{\mathcal {B}}}\)-measurable.
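The reason a bounded conjugacy does not change the Lyapunov exponent is that the product telescopes, \({\tilde{A}}_n(\omega )=B(T^n\omega )A_n(\omega )B(\omega )^{-1}\), so \(|\log \Vert {\tilde{A}}_n(\omega )\Vert -\log \Vert A_n(\omega )\Vert |\le 2\log {\tilde{C}}\) whenever \(\Vert B\Vert \le {\tilde{C}}\). The sketch below illustrates this numerically; the Schrödinger matrices, the shear matrices standing in for the values of B, and all names are hypothetical choices rather than the objects of the proof.

```python
import math
import random

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inverse_sl2(X):
    """Inverse of a 2x2 matrix of determinant 1."""
    return ((X[1][1], -X[0][1]), (-X[1][0], X[0][0]))

def op_norm(X):
    """Operator norm of a real 2x2 matrix (its largest singular value)."""
    f2 = sum(x * x for row in X for x in row)
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return math.sqrt((f2 + math.sqrt(max(f2 * f2 - 4 * det * det, 0.0))) / 2)

def ordered_product(mats):
    """mats[n-1] ... mats[0], matching A_n = A(T^{n-1} omega) ... A(omega)."""
    M = ((1.0, 0.0), (0.0, 1.0))
    for X in mats:
        M = matmul(X, M)
    return M

random.seed(2)
n, E = 12, 0.0
A = [((E - random.uniform(-1, 1), -1.0), (1.0, 0.0)) for _ in range(n)]  # A(T^k omega)
B = [((1.0, random.uniform(-1, 1)), (0.0, 1.0)) for _ in range(n + 1)]   # B(T^k omega)

# tilde A(T^k omega) = B(T^{k+1} omega) A(T^k omega) B(T^k omega)^{-1}
A_tilde = [matmul(matmul(B[k + 1], A[k]), inverse_sl2(B[k])) for k in range(n)]

A_n = ordered_product(A)
A_tilde_n = ordered_product(A_tilde)
telescoped = matmul(matmul(B[n], A_n), inverse_sl2(B[0]))

max_gap = max(abs(A_tilde_n[a][b] - telescoped[a][b])
              for a in range(2) for b in range(2))
C_tilde = max(op_norm(X) for X in B)                 # uniform bound on ||B||
log_gap = abs(math.log(op_norm(A_tilde_n)) - math.log(op_norm(A_n)))
```

Dividing `log_gap` by n and letting n tend to infinity is what gives \(L({\tilde{A}},\mu )=L(A,\mu )\) in the argument above.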

Step III. Following the second half of the proof of Proposition 4.8, for any given disintegration \(\{m_\omega :\omega \in \Omega \}\) of m, we can set

$$\begin{aligned} {\tilde{m}}_\omega =B(\omega )_*m_\omega ,\ \omega \in \Omega . \end{aligned}$$
(4.30)

Then \(\{{\tilde{m}}_\omega \}\) is a disintegration of a \((T,{\tilde{A}})\)-invariant measure \({\tilde{m}}\). We can now apply Proposition 4.5 to \(({{\mathcal {B}}}, {\tilde{A}}, {\tilde{m}}_\omega )\) and obtain that \(\{{\tilde{m}}_\omega \}\) is \({{\mathcal {B}}}\)-measurable. In particular, \({\tilde{m}}_\omega \) is constant on \(T^i(W^s_{\mathrm {loc}}(\omega '))\) for all \(\omega '\in S\) and all \(0\le i< r(\omega ')\). Taking \(i=0\), a proof similar to that of (4.19) yields

$$\begin{aligned} (H^s_{\omega ,{{\tilde{\omega }}}})_*m_\omega =m_{{{\tilde{\omega }}}} \text{ for } \text{ all } \omega ,{{\tilde{\omega }}}\in W^s_{\mathrm {loc}}(\omega ') \text{ and } \text{ all } \omega '\in S. \end{aligned}$$
(4.31)

Thus we have obtained s-invariance for all points in

$$\begin{aligned} \bigcup _{\omega \in S}\{W^s_{\mathrm {loc}}(\omega ):\omega \in W^u_{\mathrm {loc}}(\omega ^j)\}, \end{aligned}$$

which contains

$$\begin{aligned} \bigcup _{\omega \in W^u_{\mathrm {loc}}(\omega ^j)\cap K^j}\{W^s_{\mathrm {loc}}(\omega ):\omega \in W^u_{\mathrm {loc}}(\omega ^j)\cap K^j\}. \end{aligned}$$

By local product structure of \(\mu \) and following the proof of (4.21), we have that the set above is a full measure subset of \(K^j\). Since \(1\le j\le \ell \) is arbitrarily chosen, we thus obtain s-invariance of \(\{m_\omega \}\) on a full measure subset of \(K(N,\delta )\). \(\square \)

Now we can apply the proof of Lemmas 4.9 and 4.11 (replacing \(K_\delta \) by \(K(N,\delta )\)) to obtain the following corollary.

Corollary 4.14

Using the setup of Lemma 4.13, there is a disintegration \(\{{\tilde{m}}_\omega \}\) of m so that \(\omega \mapsto {\tilde{m}}_\omega \) is continuous and su-invariant on \(\mathrm {supp}(K(N,\delta ))\cap K(N,\delta )\). Moreover, there is a family of measures \(\{{\hat{m}}_\omega \}\) that is su-invariant on \(\mathrm {supp}(K(N,\delta ))\cap K(N,\delta )\) and such that, for each \(\omega \), \(\mathrm {supp}({\hat{m}}_\omega )\) contains at most two points.

The main goal of the present Sect. 4 is to obtain the following corollary.

Corollary 4.15

Suppose \((\Omega ,T)\) is a subshift of finite type and \(\mu \) is a T-ergodic measure that has a local product structure. Let \(A : \Omega \rightarrow {\mathrm {SL}}(2,{{\mathbb {R}}})\) be a cocycle map so that \(L(A,\mu )=0\). Then for every periodic point p (of period n) such that \(2L(A,p) < \frac{\alpha }{2}\), there exists a set \(Z_p\subseteq {{\mathbb {C}}}{\mathbb {P}}^1\), invariant under complex conjugation and under \(A_n(p)\), and consisting of either one or two points, with the following property. Let q be another periodic point such that \(2L(A,q)<\frac{\alpha }{2}\). If \(p_0=q_0\), then

$$\begin{aligned} H^u_{q,q\wedge p}(Z_q)=H^s_{p, q\wedge p}(Z_p). \end{aligned}$$
(4.32)

Proof

Since \(L(A,\mu )=0\) and \(2L(A,p)<\frac{\alpha }{2}\), by Lemma 4.4, we clearly have that \(p \in \mathrm {supp}(\mu |_{K(N,\delta )})\cap K(N,\delta )\) for some \(\frac{\alpha }{2}<\delta <\alpha \). Thus we may apply Corollary 4.14 to obtain the measure \({\hat{m}}_p\) defined at p which is \(A_n(p)\)-invariant and su-invariant. Hence, if we define

$$\begin{aligned} Z_p:=\{z(p), \overline{z(p)}:\ z(p)\in \mathrm {supp}({\hat{m}}_{p})\}, \end{aligned}$$

then \(Z_p\) consists of at most two points. Moreover, it is clear that \(Z_p\) is su-invariant, \(A_n(p)\)-invariant, and invariant under complex conjugation.

Since q is also a periodic point such that \(2L(A,q)<\frac{\alpha }{2}\), we can certainly find \(\frac{\alpha }{2}<\delta <\alpha \) so that both p and q belong to \(\mathrm {supp}(\mu |_{K(N,\delta )})\cap K(N,\delta )\). Thus \(Z_p\) and \(Z_q\) are both defined. By su-invariance of \(Z_p\) and \(p_0=q_0\), we have

$$\begin{aligned} H^u_{q,q\wedge p}(Z_q)=H^s_{p, q\wedge p}(Z_p) \end{aligned}$$

as desired. \(\square \)

5 Positivity of the Lyapunov exponent I

Throughout this section we assume that \(\Omega \subseteq {\mathcal {A}}^{{\mathbb {Z}}}\) is a subshift of finite type and \(\mu \) is a T-ergodic probability measure that is fully supported on \(\Omega \) and has a local product structure. We fix a non-constant \(f\in C^\alpha (\Omega ,{{\mathbb {R}}})\) and consider the one-parameter family of Schrödinger cocycles \((T,A^E)\). We shall apply the techniques from Sect. 4 to study the positivity property of the Lyapunov exponent.

In Sect. 5.1, we show under a very general condition that the set of energies with zero Lyapunov exponent is a discrete set. In Sect. 5.2, we apply the same techniques to the scenario where we have global existence of holonomies and obtain a stronger result for the corresponding Schrödinger cocycles. Namely, we show that under the same general condition, the set of energies with zero Lyapunov exponent is a finite set. Global existence of the holonomies may be obtained if the \(\Vert \cdot \Vert _\infty \) norm of the sampling function is small or if the sampling function is locally constant.

5.1 General case: positivity away from a discrete set

Throughout this subsection, we assume that \(E_0 \in {{\mathbb {R}}}\) is an accumulation point of

$$\begin{aligned} {{\mathcal {Z}}}_f = \{ E : L(E) = 0 \}. \end{aligned}$$

Clearly,

$$\begin{aligned} E_0 \in \Sigma , \end{aligned}$$
(5.1)

since \({{\mathcal {Z}}}_f \subseteq \Sigma \) and \(\Sigma \) is closed.

Definition 5.1

We say that a periodic point q is \(\gamma \)-bunched at E if

$$\begin{aligned} 2L(A^{E},q) < \gamma \le \alpha . \end{aligned}$$
(5.2)

We fix p to be a periodic point that is \(\frac{\alpha }{2}\)-bunched at \(E_0\). We let \(n_p\) denote the period of p.
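As a numerical illustration of Definition 5.1, the following Python sketch evaluates the bunching inequality (5.2) for a periodic potential. It assumes the standard fact that the Lyapunov exponent at a periodic point of period n equals \(\frac{1}{n}\) times the logarithm of the spectral radius of the monodromy matrix; the helper names (`monodromy`, `periodic_lyapunov`, `is_bunched`) and the sample potentials are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence, Tuple

Matrix = Tuple[Tuple[float, float], Tuple[float, float]]


def matmul(X: Matrix, Y: Matrix) -> Matrix:
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def monodromy(E: float, potential: Sequence[float]) -> Matrix:
    """Product A^E(T^{n-1}p) ... A^E(p) over one period of potential values,
    with the one-step matrix taken from (1.3)."""
    M: Matrix = ((1.0, 0.0), (0.0, 1.0))
    for v in potential:
        M = matmul(((E - v, -1.0), (1.0, 0.0)), M)
    return M


def periodic_lyapunov(E: float, potential: Sequence[float]) -> float:
    """Assumed formula: (1/n) * log(spectral radius of the monodromy matrix)."""
    M = monodromy(E, potential)
    tr = M[0][0] + M[1][1]
    if abs(tr) <= 2.0:
        return 0.0  # elliptic or parabolic case: spectral radius equals 1
    rho = (abs(tr) + math.sqrt(tr * tr - 4.0)) / 2.0
    return math.log(rho) / len(potential)


def is_bunched(E: float, potential: Sequence[float], gamma: float) -> bool:
    """Check the inequality 2 L(A^E, q) < gamma from (5.2)."""
    return 2.0 * periodic_lyapunov(E, potential) < gamma
```

For instance, for the period-two potential with values 1 and -1 at \(E=0\), the monodromy matrix has trace \(-3\), so \(2L \approx 0.96\); under the assumptions above this point satisfies (5.2) with \(\gamma = 1\) but not with \(\gamma = \frac{1}{2}\).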

Lemma 5.2

\(L(A^{E_0},p)=0\). In particular, \(E_0\in \sigma (H_{p})\).

Proof

Let \(E_n \rightarrow E_0\), \(E_n \ne E_0\) be a sequence in \({{\mathcal {Z}}}_f\). Recall from (5.1) that \(E_0\in \Sigma \).

Assume that p is hyperbolic for \(E_0\) (i.e., the matrix \(A^{E_0}_{n_p}(p)\), which serves as the monodromy matrix at energy \(E_0\) for the periodic potential associated with p, is hyperbolic). Then p is still hyperbolic and \(\frac{\alpha }{2}\)-bunched in a small neighborhood J of \(E_0\). Let \(E_0\in \sigma (H_{\omega '})\) for some \(\omega '\in \Omega \). By Proposition 2.10 and by choosing \(\delta >0\) small, we have \(J\cap \sigma (H_{\omega })\ne \varnothing \) for any \(\omega \) such that \(\mathrm {Orb}(\omega )\cap B_{\delta }(\omega ')\ne \varnothing \). On the other hand, by Proposition 2.7 there is an \(r = r(\delta )\) so that for any \(I_1=[0,n_1]\subseteq {{\mathbb {Z}}}\), there is a periodic orbit q with period \(n_q=n_1+r+1\) so that \(d(T^jq,T^jp)<\delta \) for all \(1\le j\le n_1\) and \(d(T^{n_1+r+1}q,\omega ')<\delta \). An immediate consequence is that \(\sigma (H_q)\cap J\ne \varnothing \). Moreover, as \(n_1\) goes to infinity, it clearly holds that \(L(A^E,q)\) tends to \(L(A^E,p)\) uniformly for all \(E\in J\). In particular, by choosing \(n_1\) large, we have that q is \(\frac{\alpha }{2}\)-bunched for all \(E\in J\) as well. We fix such a periodic point q.

Clearly, \(p_0=q_0\). Thus we may define

$$\begin{aligned} H^E:=H^{u,E}_{q\wedge p, q}\cdot H^{s,E}_{p,q\wedge p} \end{aligned}$$
(5.3)

for each \(E\in J\). Here \(H^{s,E}_{p,q\wedge p}\) and \(H^{u,E}_{q\wedge p, q}\) are the holonomies corresponding to \(A^{E}\), which are well-defined since both p and q are \(\frac{\alpha }{2}\)-bunched throughout J. Moreover, they are holomorphic on J since they are limits of uniformly convergent sequences of holomorphic functions \(H^{s,n}(E)\) or \(H^{u,n}(E)\) on J. Thus \(E \mapsto H^E\) is analytic on J. Let \(Z_p=Z_p(E_n)\) be as in Corollary 4.15. By passing to a subsequence, we may assume that

$$\begin{aligned} Z_p(E_n)= {\left\{ \begin{array}{ll} \{s(E_n)\}\ &{} \text{ for } \text{ all } n,\\ \{u(E_n)\}\ &{} \text{ for } \text{ all } n, \text{ or } \\ \{s(E_n),u(E_n)\}\ &{} \text{ for } \text{ all } n. \end{array}\right. } \end{aligned}$$
(5.4)

Thus, we may extend the definition of \(Z_p(E)\) to all \(E \in J\) so that \(Z_p(E)\) consists of one or two functions that are analytic on J. By Corollary 4.15,

$$\begin{aligned} Z_q(E):= H^E\cdot Z_p(E) \end{aligned}$$
(5.5)

is invariant under the monodromy matrix of q for infinitely many \(E_n\). By analyticity, it follows that \(Z_q(E)\) is invariant under the monodromy matrix of q for every \(E \in J\). Since \(Z_p(E)\) is real, so is \(Z_q(E)\), which implies that the absolute value of the trace of the monodromy matrix of q cannot become smaller than 2 anywhere in J. Since J is open and no point of \(\sigma (H_q)\) is isolated, we must have \(J\cap \sigma (H_q)=\varnothing \), which contradicts our choice of q. It follows that p is not hyperbolic for \(E_0\), that is, \(L(A^{E_0},p)=0\). In particular, \(E_0 \in \sigma (H_p)\). \(\square \)

Choose a small open disk \(D \subseteq {{\mathbb {C}}}\) around \(E_0\) such that p is \(\frac{\alpha }{2}\)-bunched for all energies E in the closed disk \({\bar{D}}\). Recall that by Proposition 2.11, \(\Delta (E)={\mathrm {Tr}}(A^E_{n_p}(p))\) is monotonic on each connected component of \(\Delta ^{-1}(-2,2)\). Thus we may also assume that D is small enough so that, throughout \({\bar{D}} {\setminus } \{E_0\}\), \(\Delta (E)\) is different from \(-2,2,0\). According to Sect. 2.2, if \(E_0\notin \partial (\sigma (H_p))\), we can then define two holomorphic functions \(u,s: D \rightarrow {{\mathbb {C}}}{\mathbb {P}}^1\), distinct everywhere, such that u(E) and s(E) are eigendirections of \(A^{E}_{n_p}(p)\); otherwise we can still define holomorphic functions \(u, s\) on the ramified (at \(E_0\)) double cover \(\pi : {\tilde{D}}\rightarrow D\), giving (distinct) eigendirections when \({\tilde{E}} \in {\tilde{D}} {\setminus } \{E_0\}\), but taking as value at \(E_0\) the single real eigendirection of \(A^{E_0}_{n_p}(p)\). Moreover, for \(\pi ({\tilde{E}})\in {{\mathbb {R}}}\), \(s({\tilde{E}})\) and \(u({\tilde{E}})\) are real if and only if \(\pi ({\tilde{E}})\) is not in the interior of \(\sigma (H_p)\).
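The dichotomy between real and non-real eigendirections can be checked numerically. The sketch below assumes that a direction spanned by a vector \((x,y)\) with \(y \ne 0\) is identified with \(z = x/y \in {{\mathbb {C}}}{\mathbb {P}}^1\), so that a unimodular matrix acts by a Möbius transformation and its eigendirections are the fixed points of that transformation. The function names are hypothetical, and the free matrix with \(f \equiv 0\) serves only as a test case.

```python
import cmath
from typing import Tuple

Matrix = Tuple[Tuple[float, float], Tuple[float, float]]


def mobius(M: Matrix, z: complex) -> complex:
    """Action of M on the direction z = x / y."""
    (a, b), (c, d) = M
    return (a * z + b) / (c * z + d)


def eigendirections(M: Matrix) -> Tuple[complex, complex]:
    """Fixed points of z -> (a z + b) / (c z + d) for det M = 1 and c != 0.

    They solve c z^2 + (d - a) z - b = 0, whose discriminant equals
    (a + d)^2 - 4 because a d - b c = 1.
    """
    (a, b), (c, d) = M
    if c == 0:
        raise ValueError("this sketch assumes a nonzero lower-left entry")
    root = cmath.sqrt((a + d) ** 2 - 4.0)
    return ((a - d + root) / (2.0 * c), (a - d - root) / (2.0 * c))


def free_matrix(E: float) -> Matrix:
    """One-step Schrodinger matrix from (1.3) with f = 0."""
    return ((E, -1.0), (1.0, 0.0))


def all_real(points: Tuple[complex, complex], tol: float = 1e-12) -> bool:
    """True if both directions are real up to the tolerance tol."""
    return all(abs(z.imag) <= tol for z in points)
```

In this model case, `all_real(eigendirections(free_matrix(E)))` holds exactly when \(|E| \ge 2\), that is, when the trace has absolute value at least 2; for \(|E| < 2\) the two eigendirections form a complex conjugate pair.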

Lemma 5.3

If q is a periodic point that is \(\frac{\alpha }{2}\)-bunched for all \(E \in {\bar{D}}\) and \(q_i=p_j\) for some \(i, j\), then \(\sigma (H_p) \cap D = \sigma (H_q) \cap D\).

Proof

Since \(\sigma (H_{\omega })=\sigma (H_{T^n\omega })\) for any \(\omega \) and for any n, we may assume that \(p_0=q_0\). Then similarly to the proof of Lemma 5.2, we define \(H^E=H^{u,E}_{q\wedge p, q}\cdot H^{s,E}_{p,q\wedge p}\) for \(E \in D\). Since \(E_0\in \sigma (H_p)\), we consider two different cases.

If \(E_0\notin \partial (\sigma (H_p))\), then \(D\cap \sigma (H_p) = D\cap {{\mathbb {R}}}\) by our choice of D, and \(Z_p(E_n)\) is a nonempty subset of \(\{u(E_n),s(E_n)\}\). Following the same argument that showed that \(A^{E_0}_{n_q}(q)\) has a real eigendirection in the proof of Lemma 5.2, we obtain that \(A^{E}_{n_q}(q)\) of the present lemma has a non-real eigendirection for all \(E \in D\cap {{\mathbb {R}}}\). This implies that \(D\cap {{\mathbb {R}}}\subseteq \sigma (H_q)\), and the claim follows in this case.

If \(E_0 \in \partial (\sigma (H_p))\), then by our choice of D the set \(D\cap \sigma (H_p)\) is either \([E_0,E_+)\) or \((E_-,E_0]\), where

$$\begin{aligned} (E_-,E_+) = D\cap {{\mathbb {R}}}. \end{aligned}$$
(5.6)

For simplicity, we will assume that \(D \cap \sigma (H_p) = [E_0,E_+)\). Recall that \(\pi :{\tilde{D}}\rightarrow D\) is the double cover map of D ramified at \(E_0\). For each n, choose a preimage \({\tilde{E}}_n \in \pi ^{-1}(E_n)\). Then \(Z_p(E_n)\) is a subset of \(\{ u({\tilde{E}}_n), s({\tilde{E}}_n) \}\). As in the proof of Lemma 5.2 and up to replacing \(E_n\) by a subsequence, \(Z_p(E_n)\) is always of the form \({\tilde{Z}}_p({\tilde{E}}_n)\), where

$$\begin{aligned} {\tilde{Z}}_p({\tilde{E}})= {\left\{ \begin{array}{ll} \{s({\tilde{E}})\}\ &{} \text{ for } \text{ all } {\tilde{E}}\in {\tilde{D}},\\ \{u({\tilde{E}})\}\ &{} \text{ for } \text{ all } {\tilde{E}}\in {\tilde{D}}, \text{ or } \\ \{s({\tilde{E}}), u({\tilde{E}})\}\ &{} \text{ for } \text{ all } {\tilde{E}}\in {\tilde{D}}. \end{array}\right. } \end{aligned}$$
(5.7)

Notice that if \(\pi ({\tilde{E}}) \in (E_-,E_0)\), then \({\tilde{Z}}_p({\tilde{E}})\) consists of real directions; if \(\pi ({\tilde{E}}) \in (E_0,E_+)\), then \({\tilde{Z}}_p({\tilde{E}})\) consists of non-real directions. We again define

$$\begin{aligned} {\tilde{Z}}_q({\tilde{E}}) := H^{\pi ({\tilde{E}})} \cdot {\tilde{Z}}_p({\tilde{E}}). \end{aligned}$$

Then \({\tilde{Z}}_q({\tilde{E}})\) is invariant under \(A^{\pi ({\tilde{E}})}_{n_q}(q)\) whenever \({\tilde{E}} = {\tilde{E}}_n\). By the fact that \(H^{\pi (\cdot )}\), u, and s are all holomorphic on \({\tilde{D}}\), it follows that \({\tilde{Z}}_q({\tilde{E}})\) is invariant under \(A^{\pi ({\tilde{E}})}_{n_q}(q)\) for all \({\tilde{E}}\in {\tilde{D}}\). This implies that \(A^{E}_{n_q}(q)\) has at least one real eigendirection for \(E \in (E_-,E_0)\) and has at least one non-real eigendirection for \(E \in (E_0,E_+)\). This can only happen when \(D \cap \sigma (H_q) = [E_0,E_+)\), and the claim follows in this case. \(\square \)

Lemma 5.4

If q is any periodic point, then \(\sigma (H_q) \cap D = \sigma (H_p) \cap D\).

Proof

Fix an arbitrary periodic point \(q^*\). Let us say that a periodic point q is \((\epsilon , \delta )\)-good, \(0< \delta< \epsilon < 1\), if it spends at least a \(1 - \epsilon \) proportion of its iterates within distance \(\delta \) of p, and at least an \(\epsilon /2\) proportion of its iterates within distance \(\delta \) of \(q^*\). By Proposition 2.7 and similar to the argument leading to the choice of q in the proof of Lemma 5.2, we see that the set of \((\epsilon ,\delta )\)-good periodic points is not empty for any choice of \(0<\delta<\epsilon <1\). Moreover, if \(\epsilon \) is sufficiently small, then an \((\epsilon ,\delta )\)-good q is \(\frac{\alpha }{2}\)-bunched for energies \(E \in {\bar{D}}\). Furthermore, since certain iterates of q are close to p, we clearly have \(q_i=p_j\) for some \(i, j\). By Lemma 5.3, it then holds that \(\sigma (H_p) \cap D = \sigma (H_q) \cap D\) for all such q’s. We fix such a small \(\epsilon \) for the remainder of this proof.

First we show \(\sigma (H_{q^*}) \cap D \subseteq \sigma (H_p) \cap D\). If this is not true, then there is some \(E_0\in (\sigma (H_{q^*})\cap D){\setminus } \sigma (H_{p})\). In particular, we have

$$\begin{aligned} \varepsilon := \min \{ d(E_0,\sigma (H_{p})), d(E_0,\partial D) \} > 0. \end{aligned}$$

Then for an \((\epsilon , \delta )\)-good periodic point q, we also have that

$$\begin{aligned} \varepsilon = \min \{ d(E_0,\sigma (H_{q})), d(E_0,\partial D) \} > 0. \end{aligned}$$

By Proposition 2.10 and the fact that \(\mathrm {Orb}(q)\cap B_\delta (q^*)\ne \varnothing \) for \((\epsilon , \delta )\)-good points, we have for sufficiently small \(\delta \) and an \((\epsilon ,\delta )\)-good periodic point q that

$$\begin{aligned} \sigma (H_{q^*}) \subseteq B_{\frac{\varepsilon }{2}}(\sigma (H_{q})). \end{aligned}$$

Clearly, this implies that \(d(E_0,\sigma (H_{q})) < \frac{\varepsilon }{2}\) and we obtain a contradiction. So the first part follows.

Now we show \(\sigma (H_p) \cap D \subseteq \sigma (H_{q^*}) \cap D\). Suppose this is not the case. By the first part, there is an \(E_0\in (\sigma (H_{p})\cap D) {\setminus } \sigma (H_{q^*})\). In particular, we have

$$\begin{aligned} {{\tilde{\varepsilon }}}:=\min \{d(E_0,\sigma (H_{q^*})), d(E_0,\partial D)\}>0. \end{aligned}$$

Notice that \(\mathrm {Orb}(q^*) \cap B_\delta (T^mq)\ne \varnothing \) for an \((\epsilon ,\delta )\)-good periodic point q and for some \(m\in {{\mathbb {Z}}}\). Thus by Proposition 2.10, and by choosing \(\delta \) small (the smallness of which is independent of q or \(q^*\)), we have

$$\begin{aligned} \sigma (H_{T^mq}) \subseteq B_{\frac{{{\tilde{\varepsilon }}}}{2}}(\sigma (H_{q^*})). \end{aligned}$$

Since \(\sigma (H_{T^mq}) = \sigma (H_{q})\) and \(E_0 \in \sigma (H_{p})\cap D = \sigma (H_{q})\cap D\), we obtain

$$\begin{aligned} d(E_0,\sigma (H_{q^*}))<\frac{{{\tilde{\varepsilon }}}}{2}, \end{aligned}$$

which is a contradiction and the lemma follows. \(\square \)

By (2.17), the spectrum \(\Sigma \) is the closure of the union of the spectra of periodic points. Thus Lemma 5.4 implies that \(\Sigma \cap D = \sigma (H_p) \cap D\). For each T-ergodic measure \(\nu \) on \(\Omega \), we let \(\Sigma _\nu \) denote the set such that \(\sigma (H_\omega )=\Sigma _\nu \) for \(\nu \) almost every \(\omega \); see, for example, [32].

Lemma 5.5

For any T-ergodic measure \(\nu \) on \(\Omega \), we have \(\Sigma _\nu \cap D=\sigma (H_p) \cap D\). Moreover, \(L(A^E;\nu )=0\) for all \(E \in \sigma (H_p) \cap D\). In particular, \(L(E)=0\) for all such E’s.

Proof

Since we have \(\sigma (H_\omega )\subseteq \Sigma \) for each \(\omega \in \Omega \) and \(\Sigma \cap D = \sigma (H_p) \cap D\), it clearly holds that \(\Sigma _\nu \cap D\subseteq \sigma (H_p) \cap D\). On the other hand, if \(E\notin \Sigma _\nu \), then the sequence \(\{A^E(T^n\omega )\}_{n\in {{\mathbb {Z}}}}\) is uniformly hyperbolic for \(\nu \)-almost every \(\omega \in \Omega \), which in turn implies that \(L(A^E;\nu )>0\); see, for example, [43, Theorem 3]. Thus \(L(A^E;\nu )=0\) implies that \(E\in \Sigma _\nu \). So we only need to prove the second part of the lemma.

Assume that the statement is false. In other words, we have \(L(A^E;\nu )>0\) for some \(E\in \sigma (H_p) \cap D\). By [30, Theorem 3], for each \(\epsilon >0\), there is a periodic point \(q\in \Omega \) so that \(|L(A^E;\nu )-L(A^E,q)|<\epsilon \). Thus there is a periodic point \(q\in \Omega \) so that \(L(A^E,q)>0\). In particular, \(E\notin \sigma (H_q)\), which contradicts Lemma 5.4, concluding the proof. \(\square \)

Lemma 5.6

Let \(E_0\) be an accumulation point of \({{\mathcal {Z}}}_f=\{E: L(E)=0\}\). Assume that there exists a periodic point p that is \(\frac{\alpha }{2}\)-bunched at \(E_0\). Then

  1. (1)

    The connected component I of \(E_0\) in the spectrum is isolated,

  2. (2)

    \(L(A^E;\nu )=0\) for all \(E\in I\) and all T-ergodic measures \(\nu \) on \(\Omega \).

Proof

By Lemma 5.2, \(E_0\in \sigma (H_p)\). Let I be the connected component of \(\sigma (H_p) \) that contains \(E_0\). Notice that p is \(\frac{\alpha }{2}\)-bunched for every \(E \in I\) since \(L(A^E,p)=0\) on I. Let S be the set of accumulation points of \({{\mathcal {Z}}}_f\cap I\). It is clearly a closed and non-empty subset of I since \(E_0\in S\). Moreover, applying Lemma 5.5 to any \(E \in S\), we see that there is an open disk D around E so that \(L(E)=0\) on \(\sigma (H_p)\cap D\), which contains \(I\cap D\). This implies \(I\cap D\subset S\). Thus S is open in I as well, and hence \(S = I\). Clearly \(I\subseteq \Sigma \). Applying Lemma 5.5 to the boundary points of I, we obtain \(\Sigma \cap D=\sigma (H_p)\cap D\) for some disk D around each boundary point. Thus, I is an isolated component of \(\Sigma \). Applying Lemma 5.5 to all \(E \in I\) again, we obtain that \(L(A^E;\nu )=0\) for all \(E\in I\) and for all T-ergodic measures \(\nu \) on \(\Omega \). \(\square \)

We have now collected all the tools to prove our main theorem.

Proof of Theorem 1.1

Suppose to the contrary that there are \(E_0 \in \{ E : L(E) = 0 \}\) and \(E_n \in \{ E : L(E) = 0 \} {\setminus } \{ E_0 \}\), \(n \in {{\mathbb {Z}}}_+\), such that \(E_n \rightarrow E_0\) as \(n \rightarrow \infty \). Since \(L(E_0) = 0\), we can choose a periodic point p that is \(\frac{\alpha }{2}\)-bunched at \(E_0\) by [30, Theorem 3]. It now follows from Lemma 5.6 that \(E_0\) belongs to a non-degenerate compact interval I, which is a connected component of \(\Sigma \), as well as of all periodic spectra \(\sigma (H_p)\). In particular, for the fixed point of T, the unique connected component of its spectrum is an interval of length 4. Since having such a connected component is only possible for constant periodic potentials by Proposition 2.13, it follows that the potential associated with each periodic point must be constant. This implies that f itself must be constant; contradiction. \(\square \)

5.2 Special cases: positivity away from a finite set

In this subsection we consider sampling functions \(f : \Omega \rightarrow {{\mathbb {R}}}\) for which we have global existence of the holonomies in the sense that the cocycle \(A^E\) admits canonical holonomies as defined in Sect. 2.1.5 for all E in a complex neighborhood \({{\mathcal {U}}}\subseteq {{\mathbb {C}}}\) of the convex hull of the spectrum \(\Sigma _f\). Since \(A^E\) depends on E holomorphically, we obtain that the holonomies are holomorphic on \({{\mathcal {U}}}\) as well. In this case, we are able to improve the result we obtained in Sect. 5.1.

There are two types of f for which we have such global existence of holonomies. One is the set of \(f \in C^\alpha (\Omega ,{{\mathbb {R}}})\) for which \(A^{E}\) is fiber bunched in the sense of Definition 5.7 below for every E in the convex hull of the spectrum \(\Sigma \). Such an f will be called globally fiber bunched (or just globally bunched). The other is the set of locally constant f’s.

5.2.1 Fiber bunching and existence of holonomies

Definition 5.7

We say that \(A \in C^\alpha (\Omega , \mathrm {SL}(2,{{\mathbb {C}}}))\) is fiber bunched if there exists \(n_0 \ge 1\) such that for every \(\omega \in \Omega \), we have

$$\begin{aligned} \Vert A_{n_0}(\omega )\Vert ^2 < e^{\alpha n_0}. \end{aligned}$$
(5.8)

Equivalently, there is \(\theta < \alpha \) such that \(\Vert A_{n_0}(\omega )\Vert ^2 < e^{\theta n_0}\) for every \(\omega \in \Omega \).

Note that fiber bunching is clearly a \(C^0\)-open condition. A fiber bunched cocycle has canonical holonomies as defined in Sect. 2.1.5. In fact, we can run the proof of Lemma 4.2 to show that for \(\omega ' \in W^u_{\mathrm {loc}}(\omega )\), \(H^{u,n}_{\omega ,\omega '} = A_{-n}(\omega ')^{-1} \cdot A_{-n}(\omega )\) converges uniformly on \(\Omega \) to the unstable holonomy \(H^u_{\omega ,\omega '}\). Similarly for \(\omega ' \in W^s_{\mathrm {loc}}(\omega )\), we have that \(H^{s,n}_{\omega ,\omega '} = A_n(\omega ')^{-1} A_n(\omega )\) converges uniformly on \(\Omega \) to the stable holonomy \(H^s_{\omega ,\omega '}\). Indeed, to obtain the uniform convergence to holonomies, the only condition we used in the proof of Lemma 4.2 is the condition in (4.3), which is exactly the fiber bunching condition (5.8).

We also note the following: If \(A^t \in C^\alpha (\Omega ,{\mathrm {SL}}(2,{{\mathbb {C}}}))\), t in some domain \(U\subseteq {{\mathbb {C}}}\), is a continuous family such that \(t \mapsto A^t(\omega )\) is holomorphic for every \(\omega \in \Omega \) and \(A^t\) is fiber bunched for every t, then the stable and unstable holonomies depend holomorphically on t. Indeed, in this case, the holonomies are limits of uniformly convergent sequences of holomorphic functions.

In particular, we may consider Schrödinger cocycles \(A^E\) with sampling function \(f \in C^\alpha (\Omega ,{{\mathbb {R}}})\). If \(\Vert f\Vert _{\infty }\) is sufficiently small, then \(A^E\) is fiber bunched in a complex neighborhood of the convex hull of the spectrum \(\Sigma \). To see this, we first see that \(\left( {\begin{matrix} E &{} -1 \\ 1 &{} 0 \end{matrix}}\right) \) is fiber bunched for all \(E\in [-2,2]\) since they are all elliptic or parabolic. By openness of fiber bunching, we then have that \(A^{E}\) is fiber bunched for all E in a complex neighborhood of \([-2,2]\) provided \(\Vert f\Vert _{\infty }\) is sufficiently small. If necessary, we can then choose \(\Vert f\Vert _{\infty }\) smaller so that the convex hull of \(\Sigma _f\) is contained in such an open neighborhood. Thus f is globally bunched.
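The claim that the free matrices are fiber bunched for \(E\in [-2,2]\) can be illustrated numerically. The sketch below treats only the constant cocycle with \(f \equiv 0\) and searches for the smallest \(n_0\) satisfying (5.8); the function names, the search bound `n_max`, and the sample values of \(\alpha \) are hypothetical choices made for illustration.

```python
import math
from typing import Optional, Tuple

Matrix = Tuple[Tuple[float, float], Tuple[float, float]]


def matmul(X: Matrix, Y: Matrix) -> Matrix:
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def norm_squared(M: Matrix) -> float:
    """Squared operator norm of a real 2x2 matrix with determinant 1.

    The eigenvalues of M^T M have sum F (the squared Frobenius norm) and
    product 1, so the larger one is (F + sqrt(F^2 - 4)) / 2.
    """
    F = sum(x * x for row in M for x in row)
    return (F + math.sqrt(max(F * F - 4.0, 0.0))) / 2.0


def smallest_bunching_n(E: float, alpha: float, n_max: int = 50) -> Optional[int]:
    """Smallest n0 <= n_max with ||B^n0||^2 < exp(alpha * n0), where B is the
    free matrix ((E, -1), (1, 0)); None if no such n0 is found."""
    B: Matrix = ((E, -1.0), (1.0, 0.0))
    P: Matrix = ((1.0, 0.0), (0.0, 1.0))
    for n in range(1, n_max + 1):
        P = matmul(B, P)
        if norm_squared(P) < math.exp(alpha * n):
            return n
    return None
```

With \(\alpha = 1\), the elliptic energy \(E=0\) already satisfies (5.8) with \(n_0=1\), while the parabolic energy \(E=2\) needs a larger \(n_0\) because the norms of the powers grow linearly. For the hyperbolic energy \(E=3\) no \(n_0\) exists, since there the squared norms grow at an exponential rate exceeding \(\alpha \).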

5.2.2 Locally constant cocycles

The other class for which the canonical holonomies exist for obvious reasons is defined as follows.

Definition 5.8

We say that \(A:\Omega \rightarrow \mathrm {SL}(2,{{\mathbb {R}}})\) is locally constant if there exists an \(n_0\) such that for each \(\omega \in \Omega \), \(A(\omega )\) depends only on the cylinder set \([-n_0;\omega _{-n_0}, \ldots , \omega _{n_0}]\).

Evidently, locally constant cocycles are \(\alpha \)-Hölder continuous for all \(\alpha >0\). Locally constant cocycles might not be fiber bunched. However, the holonomies exist trivially. Indeed, if A is locally constant, then there is an \(n_0\in {{\mathbb {Z}}}_+\) so that for all \(\omega \) and all \(n>n_0\) we have

$$\begin{aligned} H^{\tau ,n}_{\omega ,\omega ^\tau }=H^{\tau ,n_0}_{\omega ,\omega ^\tau }, \end{aligned}$$

where \(\tau \in \{s,u\}\) and \(\omega ^\tau \in W^\tau _{\mathrm {loc}}(\omega )\). Thus \(H^{\tau ,n_0}_{\omega , \omega ^\tau }\) are exactly the holonomies. Now we consider Schrödinger cocycles \(A^E\) with sampling function \(f : \Omega \rightarrow {{\mathbb {R}}}\). If there is an \(n_0\in {{\mathbb {Z}}}_+\) such that \(f(\omega )\) depends only on \([-n_0;\omega _{-n_0}, \ldots , \omega _{n_0}]\), then \(A^E\) is locally constant for all \(E \in {{\mathbb {C}}}\). In other words, a locally constant sampling function induces locally constant Schrödinger cocycle maps.
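The stabilization of the approximants \(H^{s,n}_{\omega ,\omega '} = A_n(\omega ')^{-1} A_n(\omega )\) can be observed directly in exact integer arithmetic. The sketch below works on the full shift over two symbols and assumes that \(\omega ' \in W^s_{\mathrm {loc}}(\omega )\) means \(\omega '_k = \omega _k\) for all \(k \ge 0\). The sampling function `f`, the points `omega` and `omega_prime`, and the locality radius `N0 = 2` are hypothetical choices made for illustration.

```python
from typing import Callable, Tuple

Matrix = Tuple[Tuple[int, int], Tuple[int, int]]
Point = Callable[[int], int]  # a bi-infinite sequence k -> omega_k

N0 = 2  # hypothetical locality radius of the sampling function


def f(omega: Point) -> int:
    """Sampling function depending only on omega_{-N0}, ..., omega_{N0}."""
    return omega(-2) + 2 * omega(0) - omega(2)


def shift(omega: Point, k: int) -> Point:
    """The point T^k omega, whose j-th coordinate is omega_{j+k}."""
    return lambda j: omega(j + k)


def matmul(X: Matrix, Y: Matrix) -> Matrix:
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def inverse(M: Matrix) -> Matrix:
    """Inverse of a 2x2 matrix with determinant 1."""
    (a, b), (c, d) = M
    return ((d, -b), (-c, a))


def cocycle(E: int, omega: Point, n: int) -> Matrix:
    """A^E_n(omega) = A^E(T^{n-1} omega) ... A^E(omega), with A^E as in (1.3)."""
    M: Matrix = ((1, 0), (0, 1))
    for k in range(n):
        M = matmul(((E - f(shift(omega, k)), -1), (1, 0)), M)
    return M


def stable_approximant(E: int, omega: Point, omega_prime: Point, n: int) -> Matrix:
    """H^{s,n}_{omega, omega'} = A_n(omega')^{-1} A_n(omega)."""
    return matmul(inverse(cocycle(E, omega_prime, n)), cocycle(E, omega, n))


def omega(k: int) -> int:
    """Sample point: symbol 1 at multiples of 3, symbol 0 elsewhere."""
    return 1 if k % 3 == 0 else 0


def omega_prime(k: int) -> int:
    """Agrees with omega on k >= 0 and flips every symbol on k < 0."""
    return omega(k) if k >= 0 else 1 - omega(k)
```

Here the factors \(A^E(T^k\omega )\) and \(A^E(T^k\omega ')\) coincide as soon as \(k \ge \) `N0`, so they cancel in the product, and `stable_approximant(E, omega, omega_prime, n)` returns the same matrix for every \(n \ge \) `N0`.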

5.2.3 Energies admitting an su-state

Again, our objective is to study the energies for which \(L(E) = L(A^{E},\mu ) = 0\). We will point out how the desired statements will follow by simple specialization of the proofs of the lemmas in Sect. 4.

Assume that \(A \in C^0(\Omega , \mathrm {SL}(2,{{\mathbb {R}}}))\) has canonical holonomies \(H^\tau _{\omega , \omega '}\), where \(\tau \in \{s,u\}\). Recall that an su-state for A is a \((T,A)\)-invariant measure m with a disintegration \(\{m_\omega : \omega \in \Omega \}\) that is invariant under the cocycle and the holonomies. In particular, for \(\mu \)-almost every \(\omega \in \Omega \), we have

  1. (1)

    \(A(\omega )_* m_\omega = m_{T \omega }\),

  2. (2)

    \((H^s_{\omega ,\omega '})_* m_\omega = m_{\omega '}\) for every \(\omega ' \in W^s_{\mathrm {loc}}(\omega )\),

  3. (3)

    \((H^u_{\omega ,\omega '})_* m_\omega = m_{\omega '}\) for every \(\omega ' \in W^u_{\mathrm {loc}}(\omega )\).

Then we have the following invariance principle:

Proposition 5.9

If \(L(A,\mu ) = 0\), then there exists an su-state for A.

Proof

This follows from Proposition 4.7, noting that the canonical holonomies exist and are continuous on \(\Omega \), and therefore the conditions (4.5) and (4.6) are automatically satisfied. \(\square \)

One of the main properties of su-states is the following.

Proposition 5.10

If m is an su-state, then it admits a disintegration for which the conditional measures \(m_\omega \) depend continuously on \(\omega \) and are both s-invariant and u-invariant.

Proof

We take an su-state m. Then we run the proof of Lemma 4.9 where we constructed the disintegration \({\tilde{m}}\) which is continuous on \(\mathrm {supp}(K_\delta )\cap K_\delta \). In the present setting, we have \(K_\delta =\Omega \) since we have canonical holonomies. The result follows. \(\square \)

By continuity and almost everywhere coincidence, all the invariance properties in the definition of su-states then hold for every \(\omega \in \Omega \). From now on, we always choose such a disintegration for an su-state m.

5.2.4 Finiteness of the set of energies admitting su-states

Now we return to the Schrödinger case. Recall that by general principles, \(L(E) = 0\) implies \(E \in \Sigma \subseteq {{\mathbb {R}}}\). Since each of these real energies gives rise to an su-state for \(A^{E}\), let us consider the following set (whose dependence on \(\mu \) and f we leave implicit):

$$\begin{aligned} {\mathcal {F}} = \{ E \in \Sigma : \text { there is an { su}-state for } A^{E} \}. \end{aligned}$$
(5.9)

Lemma 5.11

Suppose that \(0 < \alpha \le 1\) and \(f \in C^\alpha (\Omega ,{{\mathbb {R}}})\) is globally bunched or locally constant. Assume that \({\mathcal {F}}\) is infinite. Let \(p , q \in \Omega \) be two periodic points of T. Then \(\sigma (H_{p}) = \sigma (H_{q})\).

Proof

First, we consider the case that \(p_i=q_j\) for some \(i, j\in {{\mathbb {Z}}}\). Since \(\sigma (H_{\omega })=\sigma (H_{T^n\omega })\) for any \(\omega \) and for any n, we may assume that \(p_0=q_0\). Recall that in this case there is a unique \(q\wedge p\in W^u_{\mathrm {loc}}(q)\cap W^s_{\mathrm {loc}}(p)\). Let \(n_p\) denote the period of p and \(n_q\) the period of q. By our choice of f, we may choose \({{\tilde{\Sigma }}}\subseteq {{\mathbb {R}}}\) to be a compact interval containing the spectrum \(\Sigma \) and \({{\mathcal {U}}}\subseteq {{\mathbb {C}}}\) to be a complex neighborhood of \({{\tilde{\Sigma }}}\) where \(A^E\) has canonical holonomies for all \(E\in {{\mathcal {U}}}\). Recall that under the conditions of the present lemma, the holonomies are holomorphic functions on \({{\mathcal {U}}}\).

By the arguments in the proof of Lemma 4.9, Corollary 4.15, and the existence of canonical holonomies, we can find for each periodic \(\omega \in \Omega \), a subset \(Z_\omega \subseteq {{\mathbb {C}}}{\mathbb {P}}^1\) consisting of at most two points that is invariant under \(A(\omega )\) and the holonomies. In particular, for the periodic point p with period \(n_p\), \(Z_{p}\) is invariant under \(A_{n_p}(p)\) and

$$\begin{aligned} H^s_{p,q\wedge p}(Z_p)=H^u_{q,q\wedge p}(Z_q) \text{ whenever } q_0=p_0. \end{aligned}$$
(5.10)

Note that if \({\mathrm {Tr}}(A_{n_p}(p))\ne 0\), then \(Z_p\) must be a subset of the eigendirections of \(A_{n_p}(p)\). In particular, for \(A_{n_p}(p)\) with nonzero trace, \(A_{n_p}(p)\) is elliptic if and only if \(Z_p\) is non-real. We let \(\{s(E),u(E)\}\) denote the pair of eigendirections of \(A^E_{n_p}(p)\). Note that both s(E) and u(E) are continuous on \({{\tilde{\Sigma }}}\) and analytic on each spectral gap or on the interior of each connected component of \(\sigma (H_p)\). We define

$$\begin{aligned} H^E=H^{u,E}_{q\wedge p, q}\cdot H^{s,E}_{p,q\wedge p}, \end{aligned}$$
(5.11)

which are holomorphic in E on a complex neighborhood \({{\mathcal {U}}}\) of \({{\tilde{\Sigma }}}\).

Let \(E_0\) be an accumulation point of \({{\mathcal {F}}}\). Then, similarly to the proof of Lemma 5.2 or 5.3, we can find a sequence \(\{E_n\}_{n\ge 1}\) in \({{\mathcal {F}}}\) so that \(E_n\rightarrow E_0\), \(E_n\ne E_0\), and

$$\begin{aligned} Z_p(E_n)= {\left\{ \begin{array}{ll} \{s(E_n)\} &{} \text{ for } \text{ all } n\ge 1,\\ \{u(E_n)\} &{} \text{ for } \text{ all } n\ge 1, \text{ or } \\ \{s(E_n), u(E_n)\} &{} \text{ for } \text{ all } n\ge 1. \end{array}\right. } \end{aligned}$$
(5.12)

Thus we may extend the domain of \(Z_p(\cdot )\) from \(\{E_n,\ n\ge 1\}\) to \({{\mathcal {U}}}\). Then we define

$$\begin{aligned} Z_q(E):=H^E(Z_p(E)) \end{aligned}$$
(5.13)

and we get that \(Z_q(E_n)\) is invariant under \(A^{E_n}_{n_q}(q)\) for all \(n\ge 1\). By the continuity and analyticity properties of \(H^E\), s(E), and u(E), we obtain the following conclusions: if \(E_0\) is in a spectral gap, then \(Z_q(E)\) is invariant under \(A^{E}_{n_q}(q)\) for all E in the closure of that spectral gap; if \(E_0\) is in the interior of a connected component of \(\sigma (H_p)\), then \(Z_q(E)\) is invariant under \(A^{E}_{n_q}(q)\) for all E in that connected component.

Now by the same arguments as in the proof of Lemma 5.2, if \(E_0\) is in a spectral gap of \(H_p\), then it is away from \(\sigma (H_q)\) with a uniform distance for all periodic points q. But \(E_0\in \Sigma \) since it is an accumulation point of \({{\mathcal {F}}}\). Thus \(E_0\) can be approximated by \(\sigma (H_q)\) for a certain choice of q, a contradiction. We may conclude that \(E_0 \in \sigma (H_p)\). So we may let \(I\subseteq \sigma (H_p)\) be the connected component containing \(E_0\). Now we claim that

$$\begin{aligned} Z_q(E)=H^E(Z_p(E)) \text{ is } \text{ invariant } \text{ under } A^E_{n_q}(q) \text{ for } \text{ all } E\in {{\tilde{\Sigma }}}. \end{aligned}$$
(5.14)

If \(E_0\) is in the interior of I, we have already obtained that \(Z_q(E)\) is invariant under \(A^E_{n_q}(q)\) for all \(E\in I\). If \(E_0\) belongs to the boundary of I, then similarly to the proof of Lemma 5.3, there is an open disk D centered at \(E_0\) with ramified (at \(E_0\)) double cover \(\pi :{\tilde{D}}\rightarrow D\) so that s and u are holomorphic on \({\tilde{D}}\). Thus we may assume \(Z_p(E)={\tilde{Z}}_p({\tilde{E}})\) where \({\tilde{E}}\in \pi ^{-1}(E)\) and

$$\begin{aligned} {\tilde{Z}}_p({\tilde{E}})= {\left\{ \begin{array}{ll} \{s({\tilde{E}})\}\ &{} \text{ for } \text{ all } {\tilde{E}}\in {\tilde{D}},\\ \{u({\tilde{E}})\}\ &{} \text{ for } \text{ all } {\tilde{E}}\in {\tilde{D}}, \text{ or } \\ \{s({\tilde{E}}), u({\tilde{E}})\}\ &{} \text{ for } \text{ all } {\tilde{E}}\in {\tilde{D}}. \end{array}\right. } \end{aligned}$$
(5.15)

Then we define

$$\begin{aligned} {\tilde{Z}}_q({\tilde{E}}):=H^{\pi ({\tilde{E}})}({\tilde{Z}}_p({\tilde{E}})), \end{aligned}$$
(5.16)

so that \({\tilde{Z}}_q({\tilde{E}})\) is invariant under \(A^{\pi ({\tilde{E}})}_{n_q}(q)\) for infinitely many \({\tilde{E}}_n\in {\tilde{D}}\). By the fact that \(H^{\pi (\cdot )}\), s, and u are holomorphic on \({\tilde{D}}\), we obtain that \({\tilde{Z}}_{q}({\tilde{E}})\) is invariant under \(A^{\pi ({\tilde{E}})}_{n_q}(q)\) for all \({\tilde{E}}\in {\tilde{D}}\). Descending to D, we obtain that

$$\begin{aligned} Z_q(E):=H^E(Z_p(E)) \end{aligned}$$
(5.17)

is invariant under \(A^{E}_{n_q}(q)\) for all \(E\in D\). In particular, \(Z_q(E)\) is invariant under \(A^{E}_{n_q}(q)\) for all \(E\in (E_0-\rho ,E_0+\rho )\), where \(\rho >0\) is the radius of D.

By the analysis above, we obtain that no matter whether \(E_0\) belongs to the boundary or to the interior of I, after a finite number of continuations, we get that \(Z_q(E)\) is invariant under \(A^{E}_{n_q}(q)\) for all \(E \in {{\tilde{\Sigma }}}\), as claimed. As in the proof of Lemma 5.3, and by the fact that \(H^E\) is real for E real, we obtain that \(Z_p(E)\) and \(Z_q(E)\) are simultaneously real or non-real for all \(E \in {{\tilde{\Sigma }}} \supseteq \Sigma \). This clearly implies that

$$\begin{aligned} \sigma (H_p) = \sigma (H_q) \text{ whenever } p_0=q_0. \end{aligned}$$
(5.18)

Now we remove the condition \(p_i=q_j\) for some \(i,j\in {{\mathbb {Z}}}\). As in the proof of Lemma 5.4, we can find a periodic point \(p'\) with some iterates very close to p and some very close to q. In particular, \(p'_i=p_j\) for some \(i, j\in {{\mathbb {Z}}}\) and \(p'_k=q_m\) for some \(k, m\in {{\mathbb {Z}}}\). Thus by the first case we consider above, we have

$$\begin{aligned} \sigma (H_{p})=\sigma (H_{p'})=\sigma (H_{q}). \end{aligned}$$
(5.19)

This concludes the proof. \(\square \)

5.2.5 Proof of Theorem 1.3

Theorem 1.3 is an immediate consequence of the following theorem.

Theorem 5.12

Suppose \(0<\alpha \le 1\) and let \(f \in C^\alpha (\Omega ,{{\mathbb {R}}})\) be globally bunched or locally constant. If the periodic spectra associated with periodic points of T in \(\Omega \) are not all identical, then \(\{ E : L(E) = 0 \}\) is finite.

Proof

As \(\{ E : L(E) = 0 \} \subseteq {\mathcal {F}}\), the statement follows from Lemma 5.11. \(\square \)

Remark 5.13

Theorem 5.12 is particularly easy to apply when T has a fixed point, as the latter property ensures the presence of a constant potential, and all one needs to do in order to show that not all periodic spectra are the same is to use the non-constancy of the sampling function to produce a non-constant periodic potential. However, there are certainly cases of interest where the base dynamics given by T is fixed-point-free. In this case Theorem 5.12 still provides a direct tool for proving that \(\{ E : L(E) = 0 \}\) is finite for many globally bunched or locally constant \(f \in C^\alpha (\Omega ,{{\mathbb {R}}})\); one just needs to take a closer look at the resulting periodic spectra.
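
To make the comparison of periodic spectra concrete, the following Python sketch uses the standard Floquet criterion: for a potential of period n, an energy E lies in the periodic spectrum exactly when the trace of the one-period transfer matrix has absolute value at most 2. The function names and the sample values (a constant potential 0, as would arise from a hypothetical fixed point, and the values 1, -1 along a hypothetical orbit of period 2) are illustrative assumptions and are not taken from the text.

```python
import numpy as np

def transfer_matrix(E, v):
    # One-step Schroedinger transfer matrix with potential value v.
    return np.array([[E - v, -1.0], [1.0, 0.0]])

def discriminant(E, potential):
    # Trace of the transfer matrix over one full period,
    # multiplying in orbit order: A(T^{n-1}p) ... A(p).
    M = np.eye(2)
    for v in potential:
        M = transfer_matrix(E, v) @ M
    return float(np.trace(M))

def in_periodic_spectrum(E, potential, tol=1e-12):
    # Floquet criterion: E is in the spectrum iff |discriminant| <= 2.
    return abs(discriminant(E, potential)) <= 2.0 + tol

# Hypothetical sampled values: constant potential vs. a period-2 potential.
constant = [0.0]
period_two = [1.0, -1.0]
print(in_periodic_spectrum(0.0, constant))    # E = 0 for the constant potential
print(in_periodic_spectrum(0.0, period_two))  # E = 0 for the period-2 potential
```

For the period-2 sample the discriminant is \((E-1)(E+1)-2\), so a single energy at which the two membership tests disagree already shows that the two periodic spectra differ.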

Remark 5.14

Consider \((\Omega ^+,T_+, \mu ^+)\) and assume that we can lift \(\mu ^+\) to an ergodic measure \(\mu \) on \((\Omega ,T)\) that has a local product structure. Then all our main results of this section, in particular Theorem 1.1 and Theorem 1.3, can be applied to \(f \in C^\alpha (\Omega ^+,{{\mathbb {R}}})\). Indeed, such an f can be lifted to an \({\bar{f}}\in C^\alpha (\Omega , {{\mathbb {R}}})\) that depends only on the future. Then all our results follow since \(L(\mu , A^{(E-{\bar{f}})})=L(\mu ^+, A^{(E-f)})\).

6 Positivity of the Lyapunov exponent II

We first show that in the scenario of Sect. 5.2, we may remove the finite exceptional set for an open and dense subset of sampling functions. Then we apply similar arguments to the general case discussed in Sect. 5.1 and obtain that for a residual set of sampling functions, the discrete exceptional set can be removed. Throughout this section, we again assume that \((\Omega ,T)\) is a subshift of finite type with a fully supported ergodic measure \(\mu \) that has a local product structure. Note that for \(0 < \alpha \le 1\), the space \(C^\alpha (\Omega ,{\mathrm {SL}}(2,{{\mathbb {R}}}))\) is a Banach space with the \(C^\alpha \) norm defined by

$$\begin{aligned} \Vert A\Vert _{0,\alpha }=\Vert A\Vert _{\infty }+\sup _{\omega \ne \omega '}\frac{\Vert A(\omega ) -A(\omega ')\Vert }{d(\omega ,\omega ')^\alpha }, \end{aligned}$$
(6.1)

where \(\Vert A\Vert _\infty \) is the standard \(C^0\) norm \(\Vert A\Vert _\infty = \sup _{\omega \in \Omega }\Vert A(\omega )\Vert \). Similarly, the space \(C^\alpha (\Omega ,{{\mathbb {R}}})\) is a Banach space with a \(C^\alpha \) norm that can be defined analogously. We say that a subset of \(C^\alpha (\Omega ,{\mathrm {SL}}(2,{{\mathbb {R}}}))\) has codimension infinity if it is locally contained in finite unions of closed submanifolds with arbitrary codimension. The same notion can be defined when we consider a subspace or an open subset of \(C^\alpha (\Omega ,{\mathrm {SL}}(2,{{\mathbb {R}}}))\).

6.1 Special cases: uniform positivity in a dense open set

In this subsection, we assume that \(A\in C^\alpha (\Omega ,{\mathrm {SL}}(2,{{\mathbb {R}}}))\) is fiber bunched or locally constant, and hence admits canonical holonomies by our earlier discussion.

We first introduce the following notion of typical cocycles.

Definition 6.1

We say A is typical if there are two periodic points p and q with periods \(n_p\) and \(n_q\) such that \(p_0=q_0\) and the following properties hold:

  1. (1)

    \(A_{n_p}(p)\ne I_2\) and \({\mathrm {Tr}}(A_{n_p}(p))\ne 0\).

  2. (2)

    Let \(\{s(p),u(p)\}\subseteq {{\mathbb {C}}}{\mathbb {P}}^1\) be the set of eigendirections of \(A_{n_p}(p)\). Then there is no \(Z_p\subseteq \{s(p),u(p)\}\) so that \(H^u_{q\wedge p, q}\cdot H^s_{p,q\wedge p}\cdot Z_p\) is invariant under \(A_{n_q}(q)\).

Since the definition involves two periodic points p and q, we may more precisely say that A is typical with respect to \((p,q)\). Note that A might be typical with respect to many other pairs of periodic points as well. Clearly, the defining conditions of a typical cocycle are open in the \(C^0\) topology. Thus they are open in the \(C^\alpha \) topology as well.

The notion of a typical cocycle in the present scenario was first introduced in [7, 8]. Our version is slightly different from theirs. It is adapted for the proof of Theorem 1.5 below. In particular, employing the arguments from [7, 8], one can show the following result. We only sketch the proof for the convenience of the reader.

Proposition 6.2

The set of typical cocycles as defined above forms a \(C^\alpha \)-open and dense subset in the set of fiber bunched (resp., locally constant) cocycles. Moreover, the complement of the set of typical cocycles has codimension infinity.

Proof

Following the arguments from [7, 8], for each fixed pair of periodic points p and q with \(p_0=q_0\), the complement of the set of cocycles satisfying conditions (1) and (2), denoted by \({{\mathcal {B}}}_{p,q}\), is seen to be contained in the union of a finite number of sets of the form

$$\begin{aligned} \{A:{{\mathcal {H}}}(A)=0\}, \end{aligned}$$

where each \(A\mapsto {{\mathcal {H}}}(A)\) is a \(C^1\) submersion when restricted to suitable sets of \(C^\alpha (\Omega ,{\mathrm {SL}}(2,{{\mathbb {R}}}))\). Thus for each fixed pair \((p,q)\), one can show that \({{\mathcal {B}}}_{p,q}\) is a submanifold of \(C^\alpha (\Omega ,{\mathrm {SL}}(2,{{\mathbb {R}}}))\) with positive codimension. Note that the complement of the set of typical cocycles is

$$\begin{aligned} \bigcap _{p,q\in \mathrm {Per}(T):\ p_0=q_0}{{\mathcal {B}}}_{p,q}. \end{aligned}$$

Since there are infinitely many such pairs \((p,q)\), the set above is contained in a subset of \(C^\alpha (\Omega ,{\mathrm {SL}}(2,{{\mathbb {R}}}))\) with codimension infinity. Thus, the complement of the set of typical cocycles has codimension infinity and the set of typical cocycles is open and dense. \(\square \)

Remark 6.3

Let us mention that one can have the following type of perturbation from [7, 8]: for each fixed pair of periodic points p and q with \(p_0=q_0\), one can modify the values of A at other points without changing its values at p and q as well as without changing its holonomies on the local stable and unstable sets of these two points.

We first note the following consequence of our proof of Lemma 5.11, which also recovers one of the results in [7]:

Lemma 6.4

Assume that the fiber bunched or locally constant \(A\in C^\alpha (\Omega ,{\mathrm {SL}}(2,{{\mathbb {R}}}))\) is typical. Then \(L(A,\mu )>0\). In particular, there is an open and dense subset \({{\mathcal {G}}}\) of fiber bunched or locally constant cocycles whose complement has codimension infinity and \(L(A,\mu )>0\) for all \(A\in {{\mathcal {G}}}\).

Proof

Assume that \(L(A,\mu )=0\). Let p and q be two periodic points satisfying the conditions in the definition of typical cocycles. Then by the proof of Lemma 5.11, we know that there is a set \(Z_p\subseteq {{\mathbb {C}}}{\mathbb {P}}^1\) consisting of at most two points with the following properties:

  1. (1)

    \(Z_p\) is invariant under \(A_{n_p}(p)\),

  2. (2)

    \(H^u_{q\wedge p, q}\cdot H^s_{p,q\wedge p}\cdot Z_p\) is invariant under \(A_{n_q}(q)\).

Since p and q satisfy the conditions stated in the definition of typical cocycles, we have that \(A_{n_p}(p)\ne I_2\) and \({\mathrm {Tr}}(A_{n_p}(p))\ne 0\). Thus property (1) implies that \(Z_p\) is a subset of \(\{s(p),u(p)\}\). As a consequence, property (2) contradicts condition (2) of the definition of typical cocycles, concluding the proof. \(\square \)

Remark 6.5

Although Proposition 6.2 and Lemma 6.4 are stated for the space of general cocycles, \(C^\alpha (\Omega ,{\mathrm {SL}}(2,{{\mathbb {R}}}))\), they hold true if one restricts to the subspace of Schrödinger cocycles, that is, cocycles taking the form

$$\begin{aligned} A(\omega )=\begin{pmatrix}f(\omega ) &{} -1\\ 1 &{} 0\end{pmatrix}. \end{aligned}$$

Note that this subspace is equivalent to the space \(C^\alpha (\Omega ,{{\mathbb {R}}})\). Indeed, it is not difficult to see that the perturbation argument used in the proof of Proposition 6.2 works equally well when considering Schrödinger cocycles.

We note the following consequence of [4, Theorem 2.8].

Proposition 6.6

Suppose \((\Omega ,T)\) is a subshift of finite type and \(\mu \) is T-ergodic with a local product structure. Let \(f \in C^\alpha (\Omega ,{{\mathbb {R}}})\) be globally fiber bunched or locally constant. Then \(E \mapsto L(E)\) is continuous on \({{\mathbb {R}}}\).

Indeed, [4, Theorem 2.8] implies that the Lyapunov exponent is continuous on the subspace of \(C^\alpha (\Omega ,{\mathrm {SL}}(2,{{\mathbb {R}}}))\) of globally fiber bunched or locally constant cocycles. If \(f\in C^\alpha (\Omega ,{{\mathbb {R}}})\) is globally fiber bunched or locally constant, then there is a connected compact interval \({\hat{\Sigma }}\) that contains the spectrum \(\Sigma = \Sigma _f\) so that \(A^{E}\) is fiber bunched or locally constant for all \(E \in {\hat{\Sigma }}\). Thus L(E) is continuous on \({\hat{\Sigma }}\). On the other hand, L(E) is smooth outside of the spectrum as \((T,A^E)\) is uniformly hyperbolic for \(E\notin \Sigma \) and the Lyapunov exponent is pluriharmonic on the set of uniformly hyperbolic cocycles. Thus L(E) is continuous on \({{\mathbb {R}}}\).

Proof of Theorem 1.5

We focus on the case where f is globally fiber bunched as the proof in the locally constant case is completely analogous.

Fix an \(f \in C^\alpha (\Omega ,{{\mathbb {R}}})\) that is non-constant and globally fiber bunched. Thus we may find a compact connected interval \({\hat{\Sigma }}\) whose interior contains the spectrum \(\Sigma _f\) so that \(A^{(E-f)}\) is fiber bunched for each \(E\in {\hat{\Sigma }}\). Note that fiber bunching is a \(C^0\) open condition and

$$\begin{aligned} \forall E \, : \, \Vert A^{(E-f_1)}-A^{(E-f_2)}\Vert _\infty = \Vert f_1-f_2\Vert _\infty . \end{aligned}$$
(6.2)

Thus, for any open neighborhood \({{\mathcal {U}}}_f \subseteq C^\alpha (\Omega ,{{\mathbb {R}}})\) of f that is sufficiently small, we have for each \(g \in {{\mathcal {U}}}_f\) that \(\Sigma _g \subseteq {\hat{\Sigma }}\) and \(A^{(E-g)}\) is fiber bunched for all \(E \in {\hat{\Sigma }}\). In the remaining part of the proof, we fix such a \({{\mathcal {U}}}_f\) and work inside it.
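
Identity (6.2) can be checked pointwise: for fixed \(\omega \), the difference of the two Schrödinger matrices has a single nonzero entry, \(f_2(\omega )-f_1(\omega )\), so its operator norm equals \(|f_1(\omega )-f_2(\omega )|\) regardless of E. The short Python sketch below verifies this numerically; the function names and sample values are hypothetical and chosen only for illustration.

```python
import numpy as np

def schrodinger_matrix(x):
    # Matrix [[x, -1], [1, 0]], where x stands for E - f(omega).
    return np.array([[x, -1.0], [1.0, 0.0]])

def cocycle_distance(E, f1_val, f2_val):
    # Operator (spectral) norm of A^{(E-f1)}(omega) - A^{(E-f2)}(omega)
    # at a single point omega, given the two sampled values there.
    diff = schrodinger_matrix(E - f1_val) - schrodinger_matrix(E - f2_val)
    return float(np.linalg.norm(diff, ord=2))

# Hypothetical sample values f1(omega) = 1.2 and f2(omega) = -0.5.
for E in (-3.0, 0.3, 10.0):
    print(E, cocycle_distance(E, 1.2, -0.5))
```

Taking the supremum over \(\omega \) of these pointwise norms gives the sup-norm identity in (6.2).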

If \(\sigma (H_{p,f})=\sigma (H_{q,f})\) for all periodic points p and q, then by total disconnectedness of \(\Omega \), we can modify the value of f at q without changing its value along the orbit of p. On the other hand, if we choose E on the boundary of \(\sigma (H_{q,f})\), we can certainly perturb f to g so that \(L(A^{(E-g)},q)>0\). Thus we may perturb f to a g that is arbitrarily close to f with the property \(\sigma (H_{p,g})\ne \sigma (H_{q,g})\). Then we can instead work with g. Moreover, when perturbing f to g, we can certainly choose p and q so that \(p_0=q_0\).

Thus, we may assume without loss of generality that f is such that \(\sigma (H_{p,f})\ne \sigma (H_{q,f})\) for suitably chosen periodic points p and q such that \(p_0=q_0\). As described in Sect. 2.2, we again let \(\{s(E),u(E)\}_{E\in \Sigma }\) be the pair of functions associated with the eigendirections of \(A^{(E-f)}_{n_p}(p)\). Define \(H^E=H^{u,E}_{q\wedge p, q}\cdot H^{s,E}_{p,q\wedge p}\). Then by the proof of Lemma 5.11, if we define \(Z_p(E)\) to be

$$\begin{aligned} Z_p(E)= {\left\{ \begin{array}{ll} \{s(E)\}\ &{} \text{ for } \text{ all } E\in {\hat{\Sigma }},\\ \{u(E)\}\ &{} \text{ for } \text{ all } E\in {\hat{\Sigma }}, \text{ or } \\ \{s(E),u(E)\}\ &{} \text{ for } \text{ all } E\in {\hat{\Sigma }}, \end{array}\right. } \end{aligned}$$
(6.3)

then the set

$$\begin{aligned} \left\{ E\in {\hat{\Sigma }}:\ A^{(E-f)}_{n_q}(q)\cdot H^E\cdot Z_p(E)=H^E\cdot Z_p(E)\right\} \end{aligned}$$
(6.4)

is finite. On the other hand, the set

$$\begin{aligned} \left\{ E\in {\hat{\Sigma }}:\ A^{(E-f)}_{n_p}(p)=\pm I_2 \text{ or } {\mathrm {Tr}}(A^{(E-f)}_{n_p}(p))=0\right\} \end{aligned}$$
(6.5)

is finite as well. Combining the facts above, we then have that

$$\begin{aligned} {{\mathcal {B}}}_f:=\left\{ E\in {\hat{\Sigma }}:\ A^{(E-f)} \text{ is } \text{ not } \text{ typical }\right\} \end{aligned}$$
(6.6)

is finite. Note that for all \(E\notin {{\mathcal {B}}}_f\), \(A^{(E-f)}\) is typical with respect to \((p,q)\). By Remark 6.3 we can modify the values of \(A^{(E-f)}\) at different points and keep its values at p and q, as well as their holonomies. In particular, by Remark 6.5, after a finite number of perturbations, we can perturb f to g with the following properties. There is a pair of periodic points \((p',q')\) with \(p'_0=q'_0\) and \(A^{(E-g)}\) is typical with respect to \((p,q)\) for all \(E\notin {{\mathcal {B}}}_f\) and typical with respect to \((p',q')\) for all \(E\in {{\mathcal {B}}}_f\). Thus we have that \(A^{(E-g)}\) is typical for all \(E\in {\hat{\Sigma }}\). By the fact that the defining properties of typical cocycles are open conditions with respect to the \(C^0\) topology, property (6.2), and the compactness of \({\hat{\Sigma }}\), we obtain a neighborhood \({{\mathcal {U}}}_g\subseteq {{\mathcal {U}}}_f\) of g so that for each \(h\in {{\mathcal {U}}}_g\), we have

$$\begin{aligned} L(A^{(E-h)},\mu )>0 \text{ for } \text{ all } E\in {\hat{\Sigma }}. \end{aligned}$$

By Proposition 6.6, \(L(A^{(E-h)})\) is continuous on \({{\mathbb {R}}}\). On the other hand, it is well known that \((T,A^{(E-h)})\) is uniformly hyperbolic outside of \(\Sigma \) and \(L(A^{(E-h)},\mu )\) tends to \(\infty \) as |E| tends to \(\infty \). Combining all these statements, we find that for each \(h\in {{\mathcal {U}}}_g\), we have

$$\begin{aligned} \inf _{E\in {{\mathbb {R}}}}L(A^{(E-h)},\mu )>0. \end{aligned}$$

This concludes the proof. \(\square \)

6.2 General case: full positivity for generic sampling functions

In this subsection, we return to the general setting of Theorem 1.1. Note that in this case we have neither the canonical holonomies, nor global existence of holonomies. Moreover, the discrete set can in principle be infinite. To remove the discrete exceptional set, the price we need to pay is that we can only do it for \(C^\alpha \)-generic sampling functions. For the remaining part of the section, we fix \(0 < \alpha \le 1\) and consider the space \(C^\alpha (\Omega ,{\mathrm {SL}}(2,{{\mathbb {R}}}))\).

We start with a new definition of typical cocycles that is adapted for the purpose of this section.

Definition 6.7

We say \(A\in C^\alpha (\Omega ,{\mathrm {SL}}(2,{{\mathbb {R}}}))\) is typical if there are two periodic points p and q with periods \(n_p\) and \(n_q\), respectively, such that \(p_0=q_0\) and the following properties hold:

  1. (1)

    p and q are \(\frac{\alpha }{2}\)-bunched, that is, \(2L(A,p)<\frac{\alpha }{2}\) and \(2L(A,q)<\frac{\alpha }{2}\).

  2. (2)

    \(A_{n_p}(p)\ne I_2\) and \({\mathrm {Tr}}(A_{n_p}(p))\ne 0\).

  3. (3)

    Let \(\{s(p),u(p)\}\subseteq {{\mathbb {C}}}{\mathbb {P}}^1\) be the set of eigendirections of \(A_{n_p}(p)\). Then there is no \(Z_p\subseteq \{s(p),u(p)\}\) so that \(H^u_{q\wedge p, q}\cdot H^s_{p,q\wedge p}\cdot Z_p\) is invariant under \(A_{n_q}(q)\).

Note that the existence of the holonomies of p and q in condition (3) is guaranteed by condition (1). As in the previous subsection, we may also say that A is typical with respect to \((p,q)\), as the definition involves p and q.
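
Condition (1) of Definition 6.7 can be tested numerically for a given periodic orbit. The Python sketch below assumes that \(L(A,p)\) denotes the Lyapunov exponent along the orbit of p, computed as \(\frac{1}{n_p}\) times the logarithm of the spectral radius of \(A_{n_p}(p)\); this reading is an assumption here, since the definition appears earlier in the paper. The function names and the sample matrices are hypothetical.

```python
import numpy as np

def periodic_lyapunov_exponent(matrices):
    # matrices: A(p), A(Tp), ..., A(T^{n-1}p) listed in orbit order.
    # Assumed formula: L(A,p) = (1/n) * log(spectral radius of A_n(p)).
    product = np.eye(2)
    for A in matrices:
        product = A @ product
    spectral_radius = np.max(np.abs(np.linalg.eigvals(product)))
    return float(np.log(spectral_radius)) / len(matrices)

def is_half_alpha_bunched(matrices, alpha):
    # Condition (1) as stated: 2 L(A,p) < alpha / 2.
    return 2.0 * periodic_lyapunov_exponent(matrices) < alpha / 2.0

# Hypothetical examples: an elliptic (rotation) matrix and a hyperbolic one.
rotation = np.array([[0.0, -1.0], [1.0, 0.0]])
hyperbolic = np.diag([2.0, 0.5])
print(is_half_alpha_bunched([rotation], alpha=0.5))
print(is_half_alpha_bunched([hyperbolic], alpha=1.0))
```

An elliptic or parabolic period matrix has spectral radius 1 and hence passes the test for every \(\alpha >0\), whereas a strongly hyperbolic one fails it.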

Define

$$\begin{aligned} {\mathcal {T}}_\alpha :=\left\{ A\in C^\alpha (\Omega ,{\mathrm {SL}}(2,{{\mathbb {R}}})): A \text{ is } \text{ a } \text{ typical } \text{ cocycle } \right\} . \end{aligned}$$
(6.7)

It is a standard fact that \(A \mapsto L(A,\mu )\) is upper-semicontinuous on \(C^\alpha (\Omega ,{\mathrm {SL}}(2,{{\mathbb {R}}}))\). In particular, the set

$$\begin{aligned} {{\mathcal {L}}}_\alpha = \left\{ A \in C^\alpha (\Omega ,{\mathrm {SL}}(2,{{\mathbb {R}}})) : 2L(A,\mu ) < \frac{\alpha }{2} \right\} \end{aligned}$$
(6.8)

is open in \(C^\alpha (\Omega ,{\mathrm {SL}}(2,{{\mathbb {R}}}))\). Again by [30, Theorem 3], if \(2L(A,\mu ) < \frac{\alpha }{2}\), there exists a periodic point p such that \(2L(A,p) < \frac{\alpha }{2}\), that is, p is \(\frac{\alpha }{2}\)-bunched. Then, as in the proof of Lemma 5.2, we may use the specification property to produce infinitely many pairs of \(\frac{\alpha }{2}\)-bunched periodic points \((p,q)\) so that \(p_0=q_0\). In particular, similarly to Proposition 6.2, we have the following:

Proposition 6.8

Suppose \((\Omega ,T)\) is a subshift of finite type and \(\mu \) is a T-ergodic measure that has a local product structure. Consider the space \(C^\alpha (\Omega ,{\mathrm {SL}}(2,{{\mathbb {R}}}))\) for \(\alpha >0\) and let \({\mathcal {T}}_\alpha \) and \({{\mathcal {L}}}_\alpha \) be defined as above. Then \({\mathcal {T}}_\alpha \cap {{\mathcal {L}}}_\alpha \) forms an open and dense subset of \({{\mathcal {L}}}_\alpha \). Moreover, \({{\mathcal {L}}}_\alpha {\setminus }{\mathcal {T}}_\alpha \) has codimension infinity in \({{\mathcal {L}}}_\alpha \).

Similarly to Lemma 6.4, Proposition 6.8 has the following consequence, which has appeared in [39]. For simplicity, we define

$$\begin{aligned} {{\mathcal {P}}}_\alpha =\left\{ A\in C^\alpha (\Omega ,{\mathrm {SL}}(2,{{\mathbb {R}}})): L(A,\mu )>0\right\} . \end{aligned}$$
(6.9)

Lemma 6.9

We have \({\mathcal {T}}_\alpha \subseteq {{\mathcal {P}}}_\alpha \). In other words, \(L(A,\mu ) > 0\) for each A that is typical. Moreover, the set \({{\mathcal {P}}}_\alpha \) contains an open and dense subset of \(C^\alpha (\Omega ,{\mathrm {SL}}(2,{{\mathbb {R}}}))\) and the complement of \({{\mathcal {P}}}_\alpha \) has codimension infinity.

Proof

If \(A\notin {{\mathcal {L}}}_\alpha \), then \(L(A,\mu )\ge \frac{\alpha }{4}>0\). If \(A\in {{\mathcal {L}}}_\alpha \) is typical, then we may apply the proof of Lemma 6.4 to get \(L(A,\mu )>0\). However, here we have to use the full strength of Sect. 5.1. Specifically, \(\frac{\alpha }{2}\)-bunching of p and q and the proof of Lemma 4.2 guarantee the existence of the holonomies associated with p and q. Then Lemmas 4.4 and 4.9 and Corollary 4.15 can be used to guarantee the existence and holonomy-invariance of \(Z_p\) and \(Z_q\). Once we have all these tools, the proof of \(L(A,\mu )>0\) is then identical to the proof of Lemma 6.4.

Next, we want to show that the set \({{\mathcal {P}}}_\alpha \) contains an open and dense set. To this end, we fix any \(A\in C^\alpha (\Omega ,{\mathrm {SL}}(2,{{\mathbb {R}}}))\). If there is an open neighborhood \({{\mathcal {U}}}_A\) of A such that for each \(B\in {{\mathcal {U}}}_A\), \(L(B,\mu )\ge \frac{\alpha }{4}\), then there is nothing we need to say. Otherwise, in any open neighborhood \({{\mathcal {U}}}\) of A, we can find a \(B\in {{\mathcal {L}}}_\alpha \). Then by Proposition 6.8 and the proof above, we can find an open set \({\mathcal {V}}\subseteq {{\mathcal {U}}}\cap {\mathcal {T}}_\alpha \), which implies that \(L(B,\mu )>0\) for each \(B\in {\mathcal {V}}\).

Finally, it is clear that the complement of \({{\mathcal {P}}}_\alpha \) is contained in \({{\mathcal {L}}}_\alpha {\setminus }{\mathcal {T}}_\alpha \), which has codimension infinity in \({{\mathcal {L}}}_\alpha \). Hence, the complement of \({{\mathcal {P}}}_\alpha \) has codimension infinity in \(C^\alpha (\Omega ,{\mathrm {SL}}(2,{{\mathbb {R}}}))\) as well. \(\square \)

Note that this is an improved version of Lemma 6.4, as here we remove the assumption of global bunching or local constancy of f.

Now we are ready to generically remove the discrete set that appeared in Theorem 1.1.

Proof of Theorem 1.4

By Remark 6.5 and via the arguments from the proof of Lemma 6.9 we can show that the set

$$\begin{aligned} {{\mathcal {Z}}}_\alpha :=\{f\in C^\alpha (\Omega ,{{\mathbb {R}}}): L(A^{(f)},\mu )=0\} \end{aligned}$$
(6.10)

has codimension infinity in \(C^\alpha (\Omega ,{{\mathbb {R}}})\). In other words, \({{\mathcal {Z}}}_\alpha \) is locally contained in finite unions of closed submanifolds with arbitrary codimension. More precisely, for each \(k\in {{\mathbb {Z}}}_+\) and each \(f\in {{\mathcal {Z}}}_\alpha \), we can find an open neighborhood \({{\mathcal {U}}}_f\) of f and submanifolds \({{\mathcal {M}}}_j\), \(1\le j\le m\), each with codimension k, so that

$$\begin{aligned} \left( {{\mathcal {Z}}}_\alpha \cap {{\mathcal {U}}}_f\right) \subseteq \bigcup ^m_{j=1}{{\mathcal {M}}}_j. \end{aligned}$$
(6.11)

On the other hand, if we define the set \({{\mathcal {B}}}_\alpha \) to be

$$\begin{aligned} {{\mathcal {B}}}_\alpha :=\{g \in C^\alpha (\Omega ,{{\mathbb {R}}}): E-g \in {{\mathcal {Z}}}_\alpha \text{ for } \text{ some } E\in {{\mathbb {R}}}\}, \end{aligned}$$
(6.12)

then for each \(g\in {{\mathcal {B}}}_\alpha \), we can find \(f\in {{\mathcal {Z}}}_\alpha \) and \(E\in {{\mathbb {R}}}\) so that \(g=E-f\). Thus \({{\mathcal {B}}}_\alpha \) is locally contained in finite unions of submanifolds of arbitrary codimension as well. Indeed, for the g and f above, we may just assume that f is the one in (6.11). Thus for a fixed \(k\in {{\mathbb {Z}}}_+\), for each \({{\mathcal {M}}}_j\) in (6.11), the set

$$\begin{aligned} {{\mathcal {N}}}_j:=\{h\in C^\alpha (\Omega ,{{\mathbb {R}}}):h-E\in {{\mathcal {M}}}_j \text{ for } \text{ some } E\in {{\mathbb {R}}}\} \end{aligned}$$

may be viewed as a submanifold of \(C^\alpha (\Omega ,{{\mathbb {R}}})\) with codimension \(k-1\) whose local charts can be obtained from those of \({{\mathcal {M}}}_j\) and \(E\in {{\mathbb {R}}}\). In particular, it is nowhere dense if \(k\ge 2\). On the other hand, (6.11) clearly implies that the open neighborhood

$$\begin{aligned} {{\mathcal {U}}}_g={{\mathcal {U}}}_f+E:=\{h\in C^\alpha (\Omega ,{{\mathbb {R}}}): h-E\in {{\mathcal {U}}}_f\} \end{aligned}$$

of g satisfies

$$\begin{aligned} ({{\mathcal {B}}}_\alpha \cap {{\mathcal {U}}}_g)\subseteq \bigcup ^m_{j=1}{{\mathcal {N}}}_j. \end{aligned}$$

Since \(g\in {{\mathcal {B}}}_\alpha \) and \(k\in {{\mathbb {Z}}}_+\) can be arbitrarily chosen, we obtain that \({{\mathcal {B}}}_\alpha \) is nowhere dense. Equivalently, we may say that the complement \({{\mathcal {B}}}^c_\alpha \) of \({{\mathcal {B}}}_\alpha \) is residual in \(C^\alpha (\Omega ,{{\mathbb {R}}})\). By definition of \({{\mathcal {B}}}_\alpha \), we have for each \(f\in {{\mathcal {B}}}^c_\alpha \) that

$$\begin{aligned} L(A^{(E-f)},\mu )>0 \text{ for } \text{ all } E\in {{\mathbb {R}}}, \end{aligned}$$

concluding the proof. \(\square \)

Remark 6.10

Similarly to Remark 5.14, all the main results in this section can be applied to Hölder continuous sampling functions defined on \((\Omega ^+,T_+,\mu ^+)\), where the lift \(\mu \) of \(\mu ^+\) has a local product structure. Indeed, in this case, \(C^\alpha (\Omega ^+,{{\mathbb {R}}})\) can be considered as a closed subspace of \(C^\alpha (\Omega ,{{\mathbb {R}}})\) whose elements depend only on the future. All the perturbations can then be performed within this subspace.

7 Applications

All of the results of this paper may be applied to Hölder continuous cocycles defined over any transitive Anosov diffeomorphism (or transitive, uniformly expanding differentiable map), where \(\mu \) is taken to be the equilibrium state of a Hölder continuous potential. By a standard technique one can reduce the cocycles in question to Hölder continuous cocycles over a subshift of finite type via a Markov partition; see, for example, [14, 31]. Although the applicability is much wider, we will focus on a particular case as follows. It is a standard result that if an invariant measure \(\mu \) of a \(C^2\) transitive Anosov diffeomorphism (or a \(C^2\) transitive, uniformly expanding map) is absolutely continuous with respect to the volume measure, then it is an equilibrium state of a Hölder continuous potential; see, for example, [14].

To illustrate this, we choose three differential models that have been widely studied in both the dynamical systems and mathematical physics communities. The first type of model is given by linear expanding maps of the circle,

$$\begin{aligned} T : {{\mathbb {R}}}/{{\mathbb {Z}}}\rightarrow {{\mathbb {R}}}/{{\mathbb {Z}}}, \quad Tx=kx, \; k\ge 2, \end{aligned}$$
(7.1)

and the measure is taken to be the Lebesgue measure m on \({{\mathbb {R}}}/{{\mathbb {Z}}}\). One may find some existing results for this case in [6, 9, 12, 18, 22, 36, 40, 42, 44]. In particular, the case \(k=2\) corresponds to the doubling map, which is the most difficult map to study within this family of maps, as it is the least mixing among them. The second type is given by hyperbolic automorphisms of \({{\mathbb {R}}}^d/{{\mathbb {Z}}}^d\), where \(\mu \) is taken to be the Lebesgue measure m on \({{\mathbb {R}}}^d/{{\mathbb {Z}}}^d\). The most intensively studied case is the famous Arnold cat map, where

$$\begin{aligned} T : {{\mathbb {R}}}^2/{{\mathbb {Z}}}^2 \rightarrow {{\mathbb {R}}}^2/{{\mathbb {Z}}}^2, \quad T = \begin{pmatrix} 2&{}1\\ 1&{}1 \end{pmatrix} \end{aligned}$$
(7.2)

and \(\mu \) is taken to be the Lebesgue measure m on \({{\mathbb {R}}}^2/{{\mathbb {Z}}}^2\). One may find earlier results for this case in [12, 18, 36, 42]. It is clear that both linear expanding maps of the circle and hyperbolic toral automorphisms meet all the conditions necessary to apply our main theorems in Sects. 5 and 6. In particular, they all have a fixed point.
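
The two structural facts used here for the Arnold cat map, hyperbolicity and the existence of a fixed point, are easy to confirm numerically. The Python sketch below is illustrative only: the helper names are hypothetical, and the hyperbolicity test assumes an integer input matrix and simply checks that the determinant is \(\pm 1\) and that no eigenvalue lies on the unit circle.

```python
import numpy as np

# Matrix of the Arnold cat map from (7.2).
cat = np.array([[2, 1], [1, 1]])

def is_hyperbolic_automorphism(M, tol=1e-9):
    # Assumes M has integer entries. Checks |det M| = 1 and that
    # no eigenvalue has modulus 1.
    det_ok = abs(abs(np.linalg.det(M)) - 1.0) < tol
    eigs = np.linalg.eigvals(M)
    return bool(det_ok and all(abs(abs(lam) - 1.0) > tol for lam in eigs))

def torus_map(M, x):
    # Action of M on a point x of the torus R^d / Z^d.
    return (M @ np.asarray(x, dtype=float)) % 1.0

print(is_hyperbolic_automorphism(cat))
print(torus_map(cat, [0.0, 0.0]))  # the origin is a fixed point
```

The same fixed-point check applies to the linear expanding maps \(Tx=kx\) of the circle, for which \(x=0\) is fixed.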

Our theorems then yield the following results. To unify the statements, we let \((\Omega ,T, \mu )\) be any of the following: \(({{\mathbb {R}}}/{{\mathbb {Z}}}, T_k, m)\), where \(T_kx=kx\) and \(k\ge 2\) is an integer; \(({{\mathbb {R}}}^d/{{\mathbb {Z}}}^d, T_A, m)\) where \(d\ge 2\) and \(T_A\) is the hyperbolic toral automorphism generated by some hyperbolic \(A\in {\mathrm {SL}}(d,{{\mathbb {Z}}})\). Recall that for a sampling function f, we set \(L(E)=L(A^{(E-f)}, \mu )\) and define

$$\begin{aligned} {{\mathcal {Z}}}_f:=\{E: L(E)=0\}\subseteq {{\mathbb {R}}}. \end{aligned}$$
(7.3)

For \(0 < \alpha \le 1\) and \(\lambda > 0\), we set \(C^\alpha _\lambda (\Omega ,{{\mathbb {R}}})=\{f\in C^\alpha (\Omega ,{{\mathbb {R}}}): \Vert f\Vert _\infty <\lambda \}\).

Theorem 7.1

Let \((\Omega ,T,\mu )\) be as above and let \(0<\alpha \le 1\). For all non-constant \(f\in C^\alpha (\Omega , {{\mathbb {R}}})\), \({{\mathcal {Z}}}_f\) is a discrete set. Moreover, \({{\mathcal {Z}}}_f=\varnothing \) for f’s in a residual subset of \(C^\alpha (\Omega ,{{\mathbb {R}}})\). There is \(\lambda _0=\lambda _0(\alpha )>0\) such that \({{\mathcal {Z}}}_f\) is a finite set for all non-constant \(f\in C^\alpha _{\lambda _0}(\Omega ,{{\mathbb {R}}})\). Finally, there is an open and dense subset \({{\mathcal {O}}}^\alpha \) of \(C^\alpha _{\lambda _0}(\Omega ,{{\mathbb {R}}})\) such that for all \(f\in {{\mathcal {O}}}^\alpha \), \(\inf _{E\in {{\mathbb {R}}}}L(E) > 0\).

If we introduce a coupling constant \(\lambda \), then we have the following immediate consequence of Theorem 7.1.

Corollary 7.2

Let \((\Omega ,T,\mu )\) and \(\alpha \) be as in Theorem 7.1. Fix a non-constant \(f\in C^\alpha (\Omega , {{\mathbb {R}}})\). Then \({{\mathcal {Z}}}_{\lambda f}\) is a discrete set for all \(\lambda >0\). Moreover, there is a \(\lambda _0=\lambda _0(\Vert f\Vert _\infty ,\alpha )>0\) such that \({{\mathcal {Z}}}_{\lambda f}\) is finite for all \(0<\lambda <\lambda _0\).

Remark 7.3

To the best of our knowledge, if we take T to be the doubling map for \(d=1\) or the Arnold cat map for \(d\ge 2\), then the results we stated in Theorem 7.1 and Corollary 7.2 are the first global results that do away with smallness or largeness assumptions for the coupling constant. In the large coupling regime, Herman’s subharmonicity trick [28] can be applied (for trigonometric polynomials), and in the (perturbatively!) small coupling regime, the perturbative analysis of Chulaevsky–Spencer [18] and Sadel–Schulz–Baldes [35, 36] can be applied. Other methods get around changing the coupling constant by changing the base dynamics instead, specifically to increase its hyperbolicity; compare Bourgain–Bourgain–Chang [9] and Bjerklöv [6].

Remark 7.4

Taking the doubling map as an example, we give two sample computations. First, we show how to reduce a Hölder continuous cocycle on \({{\mathbb {R}}}/{{\mathbb {Z}}}\times {{\mathbb {R}}}^2\) to one on \(\Omega \times {{\mathbb {R}}}^2\), where \(\Omega \) is the full shift, which is in particular a subshift of finite type. Let \(\Omega ^+=\{0,1\}^{{\mathbb {N}}}\) and \((\Omega ^+,T_+,\mu ^+)\) be the one-sided Bernoulli shift. Here we choose \(\mu ^+={{\tilde{\mu }}}^{{\mathbb {N}}}\) where \({{\tilde{\mu }}}(0)={{\tilde{\mu }}}(1)=\frac{1}{2}\). Then it is well known that the map

$$\begin{aligned} \pi :\Omega ^+\rightarrow {{\mathbb {R}}}/{{\mathbb {Z}}}, \; \omega ^+ \mapsto \sum ^\infty _{n=0}\frac{\omega ^+_n}{2^{n+1}} \end{aligned}$$

codes the dynamics of the doubling map \(({{\mathbb {R}}}/{{\mathbb {Z}}}, T_2, m)\) to that of \((\Omega ^+, T_+, \mu ^+)\) since \(T_2\circ \pi =\pi \circ T_+\) and \(\pi _*\mu ^+=m\). In particular, for any cocycle map \(A:{{\mathbb {R}}}/{{\mathbb {Z}}}\rightarrow {\mathrm {SL}}(2,{{\mathbb {R}}})\), we set \(A^+:\Omega ^+\rightarrow {\mathrm {SL}}(2,{{\mathbb {R}}})\) where \(A^+=A\circ \pi \), and we then have by construction \(L(A,m)=L(A^+,\mu ^+)\). Now we consider the full shift space \((\Omega ,T,\mu )\) whose one-sided shift is \((\Omega ^+,T_+,\mu ^+)\), as described above. By setting \({\bar{A}}(\omega )=A^+(\pi ^+\omega )\), we clearly have \(L(T,{\bar{A}})=L(T_+, A^+)\). It is clear that \({\bar{A}}\) is \(\alpha \)-Hölder continuous as long as \(A^+\) is, since \(d(\pi ^+\omega ,\pi ^+{{\tilde{\omega }}})\le d(\omega ,{{\tilde{\omega }}})\). So we just need to show that the Hölder continuity can be carried over from A to \(A^+\). This in turn follows from the following straightforward estimate:

$$\begin{aligned} |\pi \omega ^+ - \pi {{\tilde{\omega }}}^+|\le d(\omega ^+,{{\tilde{\omega }}}^+)^{\log 2}. \end{aligned}$$

In particular, \(\alpha \)-Hölder continuity of A implies \((\alpha \log 2)\)-Hölder continuity of \(A^+\) since

$$\begin{aligned} \Vert A^+(\omega ^+)-A^+({{\tilde{\omega }}}^+)\Vert&=\Vert A(\pi \omega ^+) -A(\pi {{\tilde{\omega }}}^+)\Vert \\&\le C|\pi \omega ^+-\pi {{\tilde{\omega }}}^+|^\alpha \\&\le Cd(\omega ^+,{{\tilde{\omega }}}^+)^{\alpha \log 2}. \end{aligned}$$
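The semi-conjugacy \(T_2\circ \pi =\pi \circ T_+\) and the estimate on \(\pi \) can be checked on finite words. The following Python sketch is only an illustration under stated assumptions: sequences are truncated to finite words (implicitly padded with zeros), and we assume the metric \(d(\omega ^+,{{\tilde{\omega }}}^+)=e^{-N}\), with N the first index at which the two sequences differ, which is the normalization consistent with the exponent \(\log 2\) above. All function names are hypothetical.

```python
from fractions import Fraction
import random

def coding_map(bits):
    """Truncated coding map: sum of bits[n] / 2^(n+1), computed exactly."""
    return sum(Fraction(b, 2 ** (n + 1)) for n, b in enumerate(bits))

def doubling_map(x):
    """The doubling map x -> 2x mod 1."""
    return (2 * x) % 1

def first_disagreement(u, v):
    """Smallest index n with u[n] != v[n], or len(u) if the words agree."""
    for n, (a, b) in enumerate(zip(u, v)):
        if a != b:
            return n
    return len(u)

def holder_bound_holds(u, v):
    """Check |pi(u) - pi(v)| <= 2^(-N) = (e^(-N))^(log 2), N = first disagreement.

    Assumption: d(u, v) = e^(-N). The real absolute difference used here
    dominates the distance on the circle R/Z.
    """
    n = first_disagreement(u, v)
    return abs(coding_map(u) - coding_map(v)) <= Fraction(1, 2 ** n)

rng = random.Random(0)
word = [rng.randint(0, 1) for _ in range(40)]
other = [rng.randint(0, 1) for _ in range(40)]

# Semi-conjugacy T_2 o pi = pi o T_+ (the one-sided shift drops the first symbol).
semi_conjugacy_ok = doubling_map(coding_map(word)) == coding_map(word[1:])
holder_ok = holder_bound_holds(word, other)
```

Exact rational arithmetic is used so that the semi-conjugacy can be tested with equality rather than up to rounding.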

Next, we compute some explicit choices for the value of \(\lambda _0\) appearing in Theorem 7.1 and Corollary 7.2, when the base dynamics in question are given by the doubling map. Clearly, the above process still works if we replace \(A:{{\mathbb {R}}}/{{\mathbb {Z}}}\rightarrow {\mathrm {SL}}(2,{{\mathbb {R}}})\) by \(f:{{\mathbb {R}}}/{{\mathbb {Z}}}\rightarrow {{\mathbb {R}}}\). Given \(f\in C^\alpha ({{\mathbb {R}}}/{{\mathbb {Z}}},{{\mathbb {R}}})\), we may instead consider the corresponding \({\bar{f}}\in C^{\alpha \log 2}(\Omega ,{{\mathbb {R}}})\). In particular, \(\Vert f\Vert _\infty =\Vert {\bar{f}}\Vert _\infty \). We want to find a \(\lambda _0\) so that f is globally bunched if \(\Vert f\Vert _\infty <\lambda _0\). In other words,

$$\begin{aligned} A^E(\omega )=\begin{pmatrix}E-{\bar{f}}(\omega ) &{}-1\\ 1 &{}0\end{pmatrix} \end{aligned}$$

is fiber bunched for all \(E\in [-2-\Vert f\Vert _\infty ,2+\Vert f\Vert _\infty ]\). To simplify the computation, we ensure that fiber bunching is satisfied with \(n_0 = 1\). That is, we want for all \(E\in [-2-\Vert {\bar{f}}\Vert _\infty , 2+\Vert {\bar{f}}\Vert _\infty ]\) that

$$\begin{aligned} \Vert A^E(\cdot )\Vert _\infty <e^{\frac{\log 2}{2}\alpha }=2^{\frac{\alpha }{2}}. \end{aligned}$$

Recall that the fiber bunching condition is only assumed to ensure the existence of stable and unstable holonomies. Thus, by the construction of the holonomies from the proof of Lemma 4.2, it is clear that we may reduce the condition above to the following condition. For each \(E\in [-2-\Vert {\bar{f}}\Vert _\infty , 2+\Vert {\bar{f}}\Vert _\infty ]\), there is a \(P(E)\in \mathrm {SL}(2,{{\mathbb {R}}})\) so that

$$\begin{aligned} \Vert P(E)^{-1}A^E(\cdot )P(E)\Vert _\infty <2^{\frac{\alpha }{2}}. \end{aligned}$$
(7.4)

First, we take care of the E’s that are away from \(\pm 2\). For each \(E\in (-2,2)\), a direct computation shows that

$$\begin{aligned} P(E)^{-1}\begin{pmatrix}E&{}-1\\ 1 &{}0\end{pmatrix} P(E)\in \mathrm {SO}(2,{{\mathbb {R}}}) \end{aligned}$$

which has norm one and where

$$\begin{aligned} P(E)=\begin{pmatrix}\frac{\sqrt{2}}{(4-E^2)^{\frac{1}{4}}}&{}0\\ \frac{E}{\sqrt{2}(4-E^2)^{\frac{1}{4}}}&{}\frac{(4-E^2)^{\frac{1}{4}}}{\sqrt{2}}\end{pmatrix}. \end{aligned}$$
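The direct computation can be reproduced numerically. The sketch below is a minimal check in Python, not part of the argument: it multiplies out \(P(E)^{-1}\begin{pmatrix}E&{}-1\\ 1 &{}0\end{pmatrix} P(E)\) for a few sample energies and tests that the result has the form of a rotation matrix. The helper names and the sample energies are our own choices.

```python
import math

def mat_mul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inverse_sl2(M):
    """Inverse of a 2x2 matrix of determinant one."""
    return ((M[1][1], -M[0][1]), (-M[1][0], M[0][0]))

def conjugator(E):
    """The matrix P(E) displayed above, valid for -2 < E < 2."""
    s = (4 - E * E) ** 0.25
    return ((math.sqrt(2) / s, 0.0), (E / (math.sqrt(2) * s), s / math.sqrt(2)))

def conjugated_free_matrix(E):
    """P(E)^{-1} [[E, -1], [1, 0]] P(E), computed by direct multiplication."""
    P = conjugator(E)
    return mat_mul(inverse_sl2(P), mat_mul(((E, -1.0), (1.0, 0.0)), P))

def is_rotation(M, tol=1e-9):
    """True if M = [[c, -s], [s, c]] with c^2 + s^2 = 1, up to tol."""
    (m00, m01), (m10, m11) = M
    return (abs(m00 - m11) < tol and abs(m01 + m10) < tol
            and abs(m00 * m00 + m10 * m10 - 1) < tol)

# Hypothetical sample energies in (-2, 2).
rotation_checks = [is_rotation(conjugated_free_matrix(E))
                   for E in (-1.9, -0.5, 0.0, 1.0, 1.99)]
```

For \(E=1\) the diagonal entries of the conjugated matrix evaluate to \(1/2=E/2\), consistent with a rotation by the angle \(\theta \) with \(2\cos \theta =E\).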

If we choose \(\lambda _0\) so that for all \(E\in [-2+\lambda _0,2-\lambda _0]\) and all \(|\lambda |<\lambda _0\), we have

$$\begin{aligned} \left\| P(E)^{-1}\begin{pmatrix}\lambda &{}0\\ 0 &{}0\end{pmatrix} P(E)\right\| < 2^{\frac{\alpha }{2}}-1, \end{aligned}$$

then we have (7.4) for any \(\Vert {\bar{f}}\Vert _\infty =\Vert f\Vert _\infty <\lambda _0\) and all \(E\in [-2+\lambda _0, 2-\lambda _0]\). It is straightforward to see that

$$\begin{aligned} P(E)^{-1}\begin{pmatrix}\lambda &{}0\\ 0 &{}0\end{pmatrix} P(E)=\begin{pmatrix}\lambda &{}0\\ -\frac{E\lambda }{\sqrt{4-E^2} }&{}0\end{pmatrix}. \end{aligned}$$

Thus we have fiber bunching for all \(E\in [-2+\lambda _0, 2-\lambda _0]\) if for all such E’s and for all \(|\lambda |<\lambda _0\), we have

$$\begin{aligned} |\lambda |+\left| \frac{E\lambda }{\sqrt{4-E^2}}\right| <2^\frac{\alpha }{2}-1. \end{aligned}$$

Since the supremum of the left-hand side is attained at \(\lambda =\lambda _0\) and \(E=2-\lambda _0\), one can check that it suffices to have

$$\begin{aligned} \lambda _0+\frac{\lambda _0}{\sqrt{\lambda _0-\lambda _0^2}}<2^{\frac{\alpha }{2}}-1, \end{aligned}$$

which in turn can be guaranteed, for example, by the condition \(3\sqrt{\lambda _0}\le 2^\frac{\alpha }{2}-1\). In particular, if we choose any

$$\begin{aligned} 0<\lambda _0\le \frac{(2^\frac{\alpha }{2}-1)^2}{9}, \end{aligned}$$
(7.5)

then we have fiber bunching for all \(E\in [-2+\lambda _0, 2-\lambda _0]\) and for all \(\Vert f\Vert _\infty <\lambda _0\).
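As a sanity check of (7.4) in this regime, one can evaluate the conjugated norm on a grid. The following Python sketch takes \(\lambda _0\) equal to the right endpoint of (7.5), scans \(E\in [-2+\lambda _0,2-\lambda _0]\) and constant values \(v\in [-\lambda _0,\lambda _0]\) of \({\bar{f}}\), and records the largest operator norm found. It is a numerical illustration only; the grid size and the function names are assumptions of ours.

```python
import math

def mat_mul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def op_norm(M):
    """Operator norm of a real 2x2 matrix (largest singular value)."""
    t = sum(x * x for row in M for x in row)
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return math.sqrt((t + math.sqrt(max(t * t - 4 * det * det, 0.0))) / 2)

def conjugated_cocycle(E, v):
    """P(E)^{-1} [[E - v, -1], [1, 0]] P(E) with P(E) as in the text."""
    s = (4 - E * E) ** 0.25
    P = ((math.sqrt(2) / s, 0.0), (E / (math.sqrt(2) * s), s / math.sqrt(2)))
    P_inv = ((P[1][1], -P[0][1]), (-P[1][0], P[0][0]))
    return mat_mul(P_inv, mat_mul(((E - v, -1.0), (1.0, 0.0)), P))

def worst_conjugated_norm(alpha, grid=60):
    """Largest conjugated norm over a grid of energies E and values v."""
    lam0 = (2 ** (alpha / 2) - 1) ** 2 / 9  # right endpoint of (7.5)
    worst = 0.0
    for i in range(grid + 1):
        E = (-2 + lam0) + (4 - 2 * lam0) * i / grid
        for j in range(grid + 1):
            v = -lam0 + 2 * lam0 * j / grid
            worst = max(worst, op_norm(conjugated_cocycle(E, v)))
    return worst
```

Calling `worst_conjugated_norm(alpha)` and comparing the result with `2 ** (alpha / 2)` gives a quick indication of how much room the triangle-inequality estimate leaves.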

Now we take care of the energies \(E\in [-2-\lambda _0, -2+\lambda _0]\cup [2-\lambda _0, 2+\lambda _0]\). Take \(E=2\) for example. Then we have

$$\begin{aligned} G_a^{-1}\begin{pmatrix}2 &{}-1\\ 1&{} 0\end{pmatrix}G_a=\begin{pmatrix}1 &{} -a \\ 0&{} 1\end{pmatrix}, \end{aligned}$$

where \(a>0\) and

$$\begin{aligned} G_a=\begin{pmatrix}\frac{1}{\sqrt{a}} &{}-\sqrt{a}\\ \frac{1}{\sqrt{a}}&{} 0\end{pmatrix}. \end{aligned}$$

It is easy to see that we have

$$\begin{aligned} \left\| \begin{pmatrix}1 &{}-a\\ 0&{} 1\end{pmatrix}\right\| \le 1 + |a|. \end{aligned}$$

On the other hand, we can see via a straightforward computation that

$$\begin{aligned} G_a^{-1}\begin{pmatrix}\lambda &{}0\\ 0 &{}0\end{pmatrix}G_a=\begin{pmatrix}0&{}0\\ -\frac{\lambda }{a}&{}\lambda \end{pmatrix}. \end{aligned}$$
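Both conjugation identities for \(G_a\) are easy to confirm by machine. The Python sketch below multiplies the matrices out for one sample pair \((a,\lambda )\) and compares with the right-hand sides stated above; the sample values and the helper names are hypothetical.

```python
import math

def mat_mul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def g_matrix(a):
    """The matrix G_a from the text, for a > 0."""
    r = math.sqrt(a)
    return ((1 / r, -r), (1 / r, 0.0))

def conjugate_by_g(a, M):
    """G_a^{-1} M G_a, using that det G_a = 1."""
    G = g_matrix(a)
    G_inv = ((G[1][1], -G[0][1]), (-G[1][0], G[0][0]))
    return mat_mul(G_inv, mat_mul(M, G))

def close(A, B, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

a, lam = 0.2, 0.01  # hypothetical sample values
free_ok = close(conjugate_by_g(a, ((2.0, -1.0), (1.0, 0.0))),
                ((1.0, -a), (0.0, 1.0)))
pert_ok = close(conjugate_by_g(a, ((lam, 0.0), (0.0, 0.0))),
                ((0.0, 0.0), (-lam / a, lam)))
```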

Thus it suffices to choose \(a >0\) and \(\lambda _0 > 0\) so that for all \(|\lambda | \le \lambda _0\), we have

$$\begin{aligned} 1 + a + \frac{|\lambda |}{a}+|\lambda | < 2^\frac{\alpha }{2}, \end{aligned}$$

which may be guaranteed by

$$\begin{aligned} a + \frac{\lambda _0}{a} + \lambda _0 < 2^\frac{\alpha }{2} - 1. \end{aligned}$$

Clearly, we may choose \(a=\frac{1}{2}(2^\frac{\alpha }{2}-1)\). It is then easy to see that if we choose any \(\lambda _0\) such that

$$\begin{aligned} 0<\lambda _0\le \frac{(2^\frac{\alpha }{2}-1)^2}{8}, \end{aligned}$$
(7.6)

then we have fiber bunching for all \(E\in [2-\lambda _0, 2+\lambda _0]\) and for all f with \(\Vert f\Vert _\infty <\lambda _0\). A similar computation shows that the \(\lambda _0\) in (7.6) works for \(E\in [-2-\lambda _0, -2+\lambda _0]\) as well. Combining (7.5) and (7.6), we see that in the statement of Theorem 7.1 and Corollary 7.2 for the doubling map, we may choose

$$\begin{aligned} \lambda _0=\frac{(2^\frac{\alpha }{2}-1)^2}{9}. \end{aligned}$$
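For concrete values of \(\alpha \), the two scalar sufficient conditions used above can be evaluated directly. The following Python sketch encodes them and the final choice of \(\lambda _0\); it merely restates the inequalities of this remark, and the function names are our own.

```python
import math

def bunching_margin(alpha):
    """c = 2^(alpha/2) - 1."""
    return 2 ** (alpha / 2) - 1

def elliptic_condition(alpha, lam0):
    """Sufficient condition leading to (7.5)."""
    return lam0 + lam0 / math.sqrt(lam0 - lam0 ** 2) < bunching_margin(alpha)

def parabolic_condition(alpha, lam0):
    """Sufficient condition leading to (7.6), with the choice a = c/2."""
    c = bunching_margin(alpha)
    a = c / 2
    return a + lam0 / a + lam0 < c

def lambda0(alpha):
    """The value of lambda_0 displayed above."""
    return bunching_margin(alpha) ** 2 / 9
```

For instance, for Lipschitz sampling functions (\(\alpha =1\)), `lambda0(1.0)` evaluates to approximately 0.019.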

Remark 7.5

The computation of \(\lambda _0\) in Remark 7.4 actually works for \(A^E\) defined on any subshift of finite type \((\Omega ,T,\mu )\). Moreover, since we do not have the coding process as in Remark 7.4, we have that \(f\in C^\alpha (\Omega ,{{\mathbb {R}}})\) is globally bunched if

$$\begin{aligned} \Vert f\Vert _\infty \le \lambda _0=\frac{(e^\frac{\alpha }{2}-1)^2}{9}. \end{aligned}$$
(7.7)

In particular, this value of \(\lambda _0\) works for Theorem 7.6 below.

Let us now apply our results to Markov chains. We consider the full shift \(({{\mathcal {A}}}^{{\mathbb {Z}}},T)\), where \({{\mathcal {A}}}=\{1,\ldots , \ell \}\). Let \(P=(P_{ij})_{1\le i,j\le \ell }\) be a stochastic matrix; in other words, \(P_{ij}\ge 0\) and \(\sum ^{\ell }_{j=1}P_{ij}=1\). Assume that P is irreducible, that is, for all \(i,j \in {{\mathcal {A}}}\), there is \(n \in {{\mathbb {Z}}}_+\) such that the (i, j)-entry of \(P^n\) is positive. Then there is a unique probability vector \(\underline{p}=(p_1,\ldots , p_\ell )\) (i.e., \(p_i > 0\) and \(\sum ^{\ell }_{i=1} p_i=1\)) such that \(\sum ^{\ell }_{i=1} p_i P_{ij} = p_j\). Now we define the measure \(\mu \) on \({{\mathcal {A}}}^{{\mathbb {Z}}}\) via

$$\begin{aligned} \mu ([0;k_0, \ldots , k_n])=p_{k_0}\prod ^{n-1}_{i=0}P_{k_ik_{i+1}}. \end{aligned}$$
(7.8)

Such a measure \(\mu \) is called a Markov measure. By a standard result, the topological support of \(\mu \) is a subshift of finite type \(\Omega \) with the adjacency matrix \(A=(a_{ij})\) such that \(a_{ij}=1\) whenever \(P_{ij}>0\) and \(a_{ij}=0\) otherwise. Thus we may instead consider the space \((\Omega , T,\mu )\). Moreover, \(\mu \) is T-ergodic if and only if P is irreducible. Consider its associated one-sided space \((\Omega ^+,T_+,\mu ^+)\). It is a standard result that \(\mu ^+\) is the unique equilibrium state of the potential \(\phi (\omega ^+)=-\log P_{\omega ^+_0\omega ^+_1}\), which is locally constant; see, for example, [41]. Thus by Lemma 3.4, \(\mu \) has the bounded distortion property, and hence a local product structure as well.
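To make formula (7.8) concrete, the following Python sketch computes cylinder measures for a small chain and checks the consistency properties one expects from a shift-invariant probability measure. The two-state stochastic matrix and its stationary vector are hypothetical sample data (with symbols indexed from 0), and the function names are our own.

```python
from fractions import Fraction
from itertools import product

def is_stationary(p, P):
    """Check sum_i p_i P_ij = p_j for every j."""
    n = len(p)
    return all(sum(p[i] * P[i][j] for i in range(n)) == p[j] for j in range(n))

def cylinder_measure(p, P, word):
    """mu([0; k_0, ..., k_n]) = p_{k_0} * prod_i P_{k_i k_{i+1}}, as in (7.8)."""
    mass = p[word[0]]
    for i, j in zip(word, word[1:]):
        mass *= P[i][j]
    return mass

# Hypothetical two-state example; symbols are 0-indexed here.
P = [[Fraction(1, 2), Fraction(1, 2)], [Fraction(1, 3), Fraction(2, 3)]]
p = [Fraction(2, 5), Fraction(3, 5)]

# Cylinders of a fixed length partition the space, so their masses sum to one.
total_mass = sum(cylinder_measure(p, P, w) for w in product(range(2), repeat=4))

# Shift invariance on cylinders: summing over the symbol prepended to a word
# gives back the measure of the word; this uses stationarity of p.
word = (0, 1)
shifted_mass = sum(cylinder_measure(p, P, (k,) + word) for k in range(2))
```

Stationarity of \(\underline{p}\) is exactly what makes the prepended sum collapse, which is why the vector is required to satisfy \(\sum _i p_iP_{ij}=p_j\).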

Theorem 7.6

Let \((\Omega ,T,\mu )\) be a Markov chain as described above. Fix \(0 < \alpha \le 1\). Then we have the following statements:

  1. (a)

    There is a residual set \({{\mathcal {G}}}^\alpha \subseteq C^\alpha (\Omega ,{{\mathbb {R}}})\) such that \({{\mathcal {Z}}}_f=\varnothing \) for all \(f\in {{\mathcal {G}}}^\alpha \).

  2. (b)

    There are \(\lambda _0=\lambda _0(\alpha )>0\) and an open dense subset \({{\mathcal {O}}}^\alpha \subseteq C^\alpha _{\lambda _0}(\Omega ,{{\mathbb {R}}})\) such that for each \(f\in {{\mathcal {O}}}^\alpha \), we have \(\inf _{E\in {{\mathbb {R}}}}L(E)>0\).

    If in addition \((\Omega ,T)\) has a fixed point (which happens if and only if \(P_{ii}>0\) for some \(1\le i\le \ell \)), the following stronger statements hold true:

  3. (c)

    \({{\mathcal {Z}}}_f\) is a discrete set for all non-constant \(f\in C^\alpha (\Omega ,{{\mathbb {R}}})\) and it is a finite set for all non-constant \(f\in C^\alpha _{\lambda _0}(\Omega ,{{\mathbb {R}}})\) or for all non-constant f that are locally constant.

  4. (d)

    In particular, \({{\mathcal {Z}}}_{\lambda f}\) is discrete for all \(\lambda >0\) and finite for all \(0<\lambda <\lambda _0\) for all non-constant \(f\in C^\alpha (\Omega ,{{\mathbb {R}}})\). If f is locally constant and non-constant, then \({{\mathcal {Z}}}_{\lambda f}\) is a finite set for all \(\lambda >0\).

Remark 7.7

Reiterating what we said in Remark 5.13, even if \((\Omega ,T)\) does not have a fixed point (i.e., when \(P_{ii} = 0\) for every \(1\le i\le \ell \)), we can work with periodic spectra of higher periods and test for non-coincidence of two of them. In concrete cases this procedure is easy to implement and will in many cases lead to the desired result. For instance, we can apply it to the last example we present at the end of this section.

Note that the Anderson model is a special case of the Markov chains described above, provided that the single-site measure is supported on a finite set. Indeed, such models may be generated as follows. Let \(\mu \) be a probability measure on the full shift space \({{\mathcal {A}}}^{{\mathbb {Z}}}\) that is generated by a single-site measure \({\bar{\mu }}\{i\}=p_i\) where \(\underline{p}=(p_1,\ldots , p_\ell )\) is a probability vector. It is clearly a Markov chain with the same probability vector and with the stochastic matrix \(P_{ij}=p_j\). Thus, we have the following corollary of Theorem 7.6.

Corollary 7.8

Consider the full shift space \(({{\mathcal {A}}}^{{\mathbb {Z}}},T,\mu )\), where \(\mu ={{\tilde{\mu }}}^{{\mathbb {Z}}}\) and \({{\tilde{\mu }}}\) is a probability measure on \({{\mathcal {A}}}= \{1,\ldots , \ell \}\) that has full support. Then all the conclusions that we stated in Theorem 7.6 hold true. In particular, if f is locally constant and non-constant, then \({{\mathcal {Z}}}_{\lambda f}\) is a finite set for all \(\lambda >0\).

In particular, the Anderson model is generated by a sampling function \(f : {{\mathcal {A}}}^{{\mathbb {Z}}}\rightarrow {{\mathbb {R}}}\) that depends only on the 0th position. Note that such a function is in particular locally constant. Corollary 7.8 implies the finiteness of \({{\mathcal {Z}}}_{\lambda f}\) for all such f’s that are non-constant. Of course, in this case, the celebrated Furstenberg Theorem yields uniform positivity of the Lyapunov exponent. However, the finiteness of \({{\mathcal {Z}}}_f\) for all non-constant locally constant \(f:{{\mathcal {A}}}^{{\mathbb {Z}}}\rightarrow {{\mathbb {R}}}\) may not be obtainable directly from Furstenberg’s Theorem. Moreover, our result is basically sharp. Indeed, there are plenty of examples where \({{\mathcal {Z}}}_f\) is not empty for locally constant and non-constant \(f:{{\mathcal {A}}}^{{\mathbb {Z}}}\rightarrow {{\mathbb {R}}}\); see [15]. Nevertheless, the finiteness of \({{\mathcal {Z}}}_f\) can already be a starting point to prove full spectral localization.

For the reader’s convenience, we provide an example with the property \({{\mathcal {Z}}}_f\ne \varnothing \), where f is a non-constant locally constant function defined over a Markov chain. To give such an example, let us show that the well-known random dimer model (cf., e.g., [5, 26]) is covered by our framework. The random dimer model arises from the standard Bernoulli–Anderson model by doubling up the sites. That is, with \(\{ \omega _n \}_{n \in {{\mathbb {Z}}}}\) i.i.d. random variables taking two different values, say 0 and \(\lambda \), with probabilities p and \(1-p\), respectively, where \(0<p<1\), the potentials are given by \(V_\omega (2n) = V_\omega (2n+1) = \omega _n\). To realize these potentials in our framework, consider the subshift of finite type \(\Omega \) over the alphabet \(\{ 1, 2, 3, 4 \}\) with the adjacency matrix

$$\begin{aligned} A = \begin{pmatrix} 0 &{} 0 &{} 1 &{} 0 \\ 0 &{} 0 &{} 0 &{} 1 \\ 1 &{} 1&{} 0 &{} 0 \\ 1 &{} 1 &{} 0 &{} 0 \end{pmatrix}. \end{aligned}$$
(7.9)

The measure \(\mu \) is the Markov measure generated by the following probability vector and the stochastic matrix

$$\begin{aligned} \underline{p}=\left( \frac{p}{2}, \frac{1-p}{2},\frac{p}{2}, \frac{1-p}{2}\right) ,\ P = \begin{pmatrix} 0 &{} 0 &{} 1 &{} 0 \\ 0 &{} 0 &{} 0 &{} 1 \\ p &{} 1-p&{} 0 &{} 0 \\ p &{} 1-p &{} 0 &{} 0 \end{pmatrix}. \end{aligned}$$
(7.10)

The sampling function \(f:\Omega \rightarrow {{\mathbb {R}}}\) is generated by \({\bar{f}} : \{ 1,2,3,4\} \rightarrow \{ 0 , \lambda \}\), \({\bar{f}}(1) = {\bar{f}}(3) = 0\), \({\bar{f}}(2) = {\bar{f}}(4) = \lambda \) via \(f(\omega )={\bar{f}}(\omega _0)\), which is locally constant. It is readily checked that the resulting model is indeed the random dimer model. It is well known, and in fact easy to see, that for \(-2< \lambda < 2\), \(A^{(E-f)}_n(\omega )\) is uniformly bounded in n at energies 0 and \(\lambda \). Thus \(\{0,\lambda \}\subseteq {{\mathcal {Z}}}_f\). Although this system has no fixed point, we do have that f is constant on the orbit of \(\omega \in \Omega \) where \(\omega _{2n}=1, \omega _{2n+1}=3\). Note that in the statements of Theorems 1.1 and 5.12, the fixed point is only there to produce a constant potential \(V_\omega (n)\). Thus, Theorem 7.6 can still be applied to obtain the finiteness of \({{\mathcal {Z}}}_f\). However, for this model, we can provide more information. It actually follows from Furstenberg’s Theorem that the Lyapunov exponent is positive away from these two energies \(\{0,\lambda \}\). This shows that in this particular case \({{\mathcal {Z}}}_f=\{ 0 , \lambda \}\).
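The boundedness of the transfer matrices at the two exceptional energies can be observed in a short simulation. The Python sketch below samples a dimer potential by "doubling up the sites" as described above (which corresponds to starting the Markov chain at symbol 1 or 2), multiplies the transfer matrices over complete dimers, and compares the norm of the product with \(\Vert P(E')\Vert ^2\), where \(P(\cdot )\) is the conjugating matrix from Remark 7.4 and \(E'=\pm \lambda \). The reasoning behind this bound is ours: at \(E=\lambda \) each \(\lambda \)-dimer contributes \(-I\) and each 0-dimer contributes the square of an elliptic matrix, and similarly at \(E=0\). The function names, sample parameters, and tolerance are assumptions.

```python
import math
import random

def mat_mul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def op_norm(M):
    """Operator norm of a real 2x2 matrix (largest singular value)."""
    t = sum(x * x for row in M for x in row)
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return math.sqrt((t + math.sqrt(max(t * t - 4 * det * det, 0.0))) / 2)

def rotation_conjugator(E):
    """P(E) from Remark 7.4, conjugating [[E, -1], [1, 0]] to a rotation."""
    s = (4 - E * E) ** 0.25
    return ((math.sqrt(2) / s, 0.0), (E / (math.sqrt(2) * s), s / math.sqrt(2)))

def sample_dimer_potential(num_dimers, lam, p, rng):
    """Values on 2 * num_dimers sites: each dimer is 0 (prob. p) or lam (prob. 1 - p)."""
    values = []
    for _ in range(num_dimers):
        v = 0.0 if rng.random() < p else lam
        values += [v, v]
    return values

def transfer_matrix(E, potential):
    """Product A(n-1) ... A(0) with A(k) = [[E - V(k), -1], [1, 0]]."""
    M = ((1.0, 0.0), (0.0, 1.0))
    for v in potential:
        M = mat_mul(((E - v, -1.0), (1.0, 0.0)), M)
    return M

def bounded_at_critical_energies(lam, p=0.5, num_dimers=1000, seed=0):
    """Compare the transfer matrix norm at E = 0 and E = lam with ||P||^2."""
    rng = random.Random(seed)
    potential = sample_dimer_potential(num_dimers, lam, p, rng)
    ok = True
    # At E = 0 the nontrivial blocks are powers of [[-lam, -1], [1, 0]];
    # at E = lam they are powers of [[lam, -1], [1, 0]].
    for E, effective in ((0.0, -lam), (lam, lam)):
        bound = op_norm(rotation_conjugator(effective)) ** 2
        ok = ok and op_norm(transfer_matrix(E, potential)) <= bound + 1e-6
    return ok
```

The product is evaluated only after complete dimers; at odd lengths one extra one-step factor appears, which changes the bound by at most a bounded multiplicative constant.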