1 Introduction

The law of large numbers states that, on average, an i.i.d. process is close to its theoretical mean. It is often used to describe the typical statistical behavior of a random sample and is the basis for understanding general additive processes, with applications in various branches of mathematics such as probability, combinatorics and ergodic theory.

The multiplicative version of the law of large numbers for products of random matrices is the classical theorem of Furstenberg and Kesten [1], which asserts that with probability 1 the logarithmic growth rate of products of random matrices equals its mean growth rate. More formally, a special case of this theorem states that for an i.i.d. sequence of random matrices \(L_1, \, L_2, \,\ldots \,\), with common law given by a compactly supported probability measure \(\mu \) on \(GL_d({\mathbb {R}})\), the following asymptotic equality holds almost surely

$$\begin{aligned} \lim _{n\rightarrow \infty }\frac{1}{n}\log \left\Vert L_n\ldots L_1\right\Vert = \lim _{n\rightarrow \infty }\frac{1}{n}{\mathbb {E}}[\, \log \left\Vert L_n\ldots L_1\right\Vert \, ] , \end{aligned}$$

where the right-hand side, denoted by \(L(\mu )\), is the so-called Lyapunov exponent of the law \(\mu \). The investigation of how the Lyapunov exponent changes as a function of the underlying measure \(\mu \) lies at the core of multiplicative ergodic theory, with many fundamental contributions during the last 60 years.
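The almost sure limit on the left-hand side can be approximated by simulation. The following is a minimal sketch, assuming a finitely supported law on \(2\times 2\) matrices; the function names and the use of the Frobenius norm (which differs from the operator norm by a factor of at most \(\sqrt{2}\), so the limit is unchanged) are illustrative choices, not part of the original text.

```python
import math
import random


def mat_mul(m, n):
    """Product m @ n of two 2x2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def frobenius(m):
    """Frobenius norm of a 2x2 matrix."""
    (a, b), (c, d) = m
    return math.sqrt(a * a + b * b + c * c + d * d)


def lyapunov_estimate(matrices, weights, n, seed=0):
    """Return (1/n) log ||L_n ... L_1|| for one sampled i.i.d. product.

    The running product is renormalized at every step to avoid overflow;
    the logarithms of the scaling factors add up to log ||L_n ... L_1||.
    """
    rng = random.Random(seed)
    product = ((1.0, 0.0), (0.0, 1.0))
    log_norm = 0.0
    for _ in range(n):
        step = rng.choices(matrices, weights=weights, k=1)[0]
        product = mat_mul(step, product)  # the new factor multiplies on the left
        scale = frobenius(product)
        log_norm += math.log(scale)
        product = tuple(tuple(x / scale for x in row) for row in product)
    return log_norm / n
```

For a law supported on the single matrix \(\textrm{diag}(e, e^{-1})\) the estimate is close to 1, as expected. For a genuinely random law the output depends on the sample, and a single run only approximates \(L(\mu )\).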

The continuity of the Lyapunov exponent \(L(\mu )\) as a function of the measure \(\mu \), with respect to the weak* topology, was established by Furstenberg and Kifer [2] under a generic irreducibility assumption. A measure \(\mu \) is called irreducible if there exists no proper subspace of \({\mathbb {R}}^d\) which is \(\mu \)-invariant, i.e., invariant under all matrices in the support of \(\mu \). Otherwise \(\mu \) is called reducible, and any proper \(\mu \)-invariant subspace \(S\subset {\mathbb {R}}^d\) determines the Lyapunov exponent \(L(\mu \vert _S)\) corresponding to the logarithmic growth rate of the norms \(\left\Vert (L_n\vert _S)\, \ldots \, (L_1\vert _S)\right\Vert \). The continuity result of Furstenberg and Kifer actually holds under the weaker quasi-irreducibility assumption. A measure \(\mu \) is called quasi-irreducible if \(L(\mu \vert _S)=L(\mu )\) for every proper \(\mu \)-invariant subspace \(S\subset {\mathbb {R}}^d\). See [3, Theorem 1.46].

In [4], Bocker and Viana proved that for measures supported in \(GL_2({\mathbb {R}})\) the Lyapunov exponent \(\mu \mapsto L(\mu )\) is continuous with respect to the weak* topology and the Hausdorff distance between their supports. Avila, Eskin and Viana announced that the same result holds for measures supported in \(GL_d({\mathbb {R}})\), for any \(d\ge 2\). See the remark after Theorem 10.1 in [5].

It is then natural to raise the question about the precise modulus of continuity of this map.

A lower bound for this regularity was provided by Le Page in [6]. The Lyapunov exponent is locally Hölder continuous over an open and dense set of compactly supported measures on \(GL_d({\mathbb {R}})\), namely the set of quasi-irreducible measures \(\mu \) with a gap between the first and second Lyapunov exponents. See also [7, Theorem 1]. Recall that a function \(E\mapsto f(E)\) is said to be Hölder with exponent \(\alpha \), or \(\alpha \)-Hölder, if there exists a constant \(C<\infty \) such that for all \(E,E'\),

$$\begin{aligned} |f(E)-f(E')| \le C\, |E-E'|^\alpha . \end{aligned}$$

A function which is Hölder in a neighborhood of each point of its domain is called locally Hölder. Alternatively, we say that a function \(E\mapsto f(E)\) is point-wisely \(\alpha \)-Hölder if for every \(E_0\) there exists a constant \(C<\infty \) and a neighborhood of \(E_0\) where for all E,

$$\begin{aligned} |f(E)-f(E_0)| \le C\, |E-E_0|^\alpha . \end{aligned}$$

Notice that point-wise Hölder is weaker than locally Hölder. In fact the modulus of continuity around a point of a point-wisely Hölder function can be arbitrarily bad.

In [8], E. Tall and M. Viana proved that for random \(GL_2({\mathbb {R}})\) cocycles the Lyapunov exponents are always point-wisely log-Hölder, and even point-wisely Hölder when the Lyapunov exponents are distinct.

In the same direction, the quasi-irreducibility hypothesis was discarded in [9], where it was established that for finitely supported measures in \(GL_2({\mathbb {R}})\) with distinct Lyapunov exponents, the function \(\mu \mapsto L(\mu )\) is either locally Hölder or else locally weak-Hölder. Given positive constants \(\alpha , \beta \le 1\), a function \(E\mapsto f(E)\) is said to be \((\alpha ,\beta )\)-weak Hölder if there exists a constant \(C<\infty \) such that for all \(E,E'\),

$$\begin{aligned} |f(E)-f(E')| \le C\, e^{-\alpha \, \left( \log |E-E'|^{-1}\right) ^\beta }. \end{aligned}$$

Notice that \((\alpha ,1)\)-weak Hölder is equivalent to \(\alpha \)-Hölder.
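The equivalence for \(\beta =1\) follows from \(e^{-\alpha \log |E-E'|^{-1}}=|E-E'|^{\alpha }\). A small numerical sketch of the two moduli of continuity, with hypothetical function names and writing \(h=|E-E'|\in (0,1)\), makes the comparison concrete:

```python
import math


def weak_holder_modulus(h, alpha, beta, C=1.0):
    """Right-hand side C * exp(-alpha * (log(1/h))**beta) for 0 < h < 1."""
    return C * math.exp(-alpha * math.log(1.0 / h) ** beta)


def holder_modulus(h, alpha, C=1.0):
    """Right-hand side C * h**alpha of the alpha-Hoelder condition."""
    return C * h ** alpha


if __name__ == "__main__":
    for h in (1e-2, 1e-4, 1e-8):
        print(h, holder_modulus(h, 0.3), weak_holder_modulus(h, 0.3, 0.5))
```

For \(\beta <1\) and \(h\) small enough that \(\log h^{-1}>1\), the weak-Hölder bound is larger than the Hölder bound with the same \(\alpha \), so the weak-Hölder condition is the less restrictive one.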

In the reverse direction, an example due to Halperin [10, Appendix 3A] provides an upper bound on this regularity. The example consists of the following 1-parameter family of measures on \(SL_2({\mathbb {R}})\), \(\mu _{a,b,E}:=\frac{1}{2}\,\delta _{A_E} + \frac{1}{2}\,\delta _{B_E}\), where

$$\begin{aligned} A_E=\begin{pmatrix} a-E &{} -1\\ 1 &{} 0 \end{pmatrix}, \quad B_E=\begin{pmatrix} b-E &{} -1\\ 1 &{} 0 \end{pmatrix}. \end{aligned}$$

It follows from [10, Theorem A.3.1] that the function \(E\mapsto L(\mu _{a,b,E})\) cannot be \(\alpha \)-Hölder continuous for any \(\alpha >\frac{2\,\log 2}{\textrm{arccosh}(1+|a-b|/2)}\) (see also Proposition 4.1). On the other hand, it is not difficult to see that the measures \(\mu _{a,b,E}\) satisfy the assumptions of Le Page’s theorem, which implies that the function \(E\mapsto L(\mu _{a,b,E})\) is indeed Hölder continuous, but with a very small Hölder exponent \(\alpha \) when \(|a-b|\) is large.
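To get a feeling for how restrictive this threshold is, one can tabulate it for a few values of \(|a-b|\). The helper below is an illustrative sketch (the function name is hypothetical) and requires \(a\ne b\), since the denominator vanishes otherwise.

```python
import math


def halperin_threshold(a, b):
    """Threshold 2 log 2 / arccosh(1 + |a - b| / 2); requires a != b."""
    return 2.0 * math.log(2.0) / math.acosh(1.0 + abs(a - b) / 2.0)


if __name__ == "__main__":
    for gap in (1.0, 2.0, 10.0, 100.0):
        print(gap, halperin_threshold(0.0, gap))
```

The threshold decreases as \(|a-b|\) grows and drops below 1 already for moderate gaps, which is consistent with the small Hölder exponents mentioned above.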

In the same spirit, in [11], the authors provide the following example where the Lyapunov exponent is not even weak-Hölder continuous. They consider the measure

$$\begin{aligned} \mu :=\frac{1}{2}\, \delta _{A} + \frac{1}{2}\, \delta _{B},\quad A=\begin{pmatrix} 0 &{} -1\\ 1 &{} 0 \end{pmatrix}, \; B=\begin{pmatrix} e &{} 0\\ 0 &{} e^{-1} \end{pmatrix}\end{aligned}$$

and prove that there exists a curve \(\tilde{\mu }_t=\frac{1}{2}\,\delta _{A_t} +\frac{1}{2}\,\delta _{B_t}\) through \({\tilde{\mu }}_0=\mu \) such that \(t\mapsto L_1({\tilde{\mu }}_t)\) is not weak-Hölder around \(t=0\). Notice that \(\mu \) is not quasi-irreducible and \(L(\mu )=0\) so that \(\mu \) does not satisfy any of the assumptions of Le Page’s theorem.

In contrast with this low regularity, a classical theorem of Ruelle proves the analyticity of the Lyapunov exponent for uniformly hyperbolic measures with 1-dimensional unstable direction, see [12, Theorem 3.1]. A compactly supported measure \(\mu \) on \(GL_d({\mathbb {R}})\) is said to be uniformly hyperbolic if the linear cocycle generated by \(\mu \) is uniformly hyperbolic (see Sect. 3.1).

From now on we focus on the class of finitely supported measures in \(SL_2({\mathbb {R}})\), where the lack of regularity of the Lyapunov exponent can only occur outside of the class of uniformly hyperbolic measures. In [13], Avila, Bochi and Yoccoz gave a characterization of the uniformly hyperbolic cocycles generated by a finitely supported measure in \(SL_2({\mathbb {R}})\) in terms of the existence of an invariant multicone. With this characterization they prove that the complement of the closure of the uniformly hyperbolic measures is the set of elliptic measures, meaning the finitely supported measures such that the semigroup \(\Gamma _{\mu }\) generated by the support of \(\mu \) contains an elliptic element, i.e., a matrix conjugate to a rotation.

Given a hyperbolic matrix \(A\in SL_2({\mathbb {R}})\) we denote by \({{\hat{s}}}(A)\), respectively \({{\hat{u}}}(A)\), the stable direction, respectively the unstable direction of A in the projective space \({\mathbb {P}}^1\). We say that \(\mu \) has a heteroclinic tangency if there are matrices \(A,B,C\in \Gamma _\mu \) such that A and B are hyperbolic and \(C\, {{\hat{u}}}(B)={{\hat{s}}}(A)\). In this case we also say that \((B,\, C,\, A)\) is a tangency for \(\mu \). If moreover \(A=B\), we say that \(\mu \) has a homoclinic tangency. Heteroclinic tangencies are referred to as heteroclinic connections in [13]. If \(\mu \) is not uniformly hyperbolic but \(L(\mu )>0\), i.e., if \(\mu \) is non-uniformly hyperbolic, then \(\Gamma _\mu \) contains hyperbolic matrices. By Theorem 4.1 of [13], in this case the semigroup \(\Gamma _\mu \) contains either a heteroclinic tangency or else a non-hyperbolic matrix, i.e., an elliptic or parabolic matrix. In each of these two cases we can produce heteroclinic tangencies with an arbitrarily small perturbation. See Proposition 7.8. Hence measures with heteroclinic tangencies are dense in the class of non-uniformly hyperbolic measures.
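The geometric condition \(C\, {{\hat{u}}}(B)={{\hat{s}}}(A)\) is easy to test numerically for given matrices. The sketch below uses hypothetical helper names and represents projective points by unit vectors; it computes the stable and unstable directions of a hyperbolic matrix as eigen-directions and checks the tangency condition up to a tolerance. It does not check that the three matrices belong to the semigroup \(\Gamma _\mu \).

```python
import math


def _eigenvector(m, lam):
    """Unit eigenvector of the 2x2 matrix m for the eigenvalue lam."""
    (a, b), (c, d) = m
    # Both candidates lie in the kernel of m - lam * I; keep the longer one.
    candidates = [(b, lam - a), (lam - d, c)]
    v = max(candidates, key=lambda w: math.hypot(*w))
    norm = math.hypot(*v)
    return (v[0] / norm, v[1] / norm)


def eigen_directions(m):
    """Return unit vectors (u, s) on the unstable and stable lines of a
    hyperbolic matrix m = ((a, b), (c, d)) in SL_2(R)."""
    (a, b), (c, d) = m
    tr = a + d
    if abs(tr) <= 2.0:
        raise ValueError("matrix is not hyperbolic")
    root = math.sqrt(tr * tr - 4.0)
    lam_u = (tr + math.copysign(root, tr)) / 2.0  # eigenvalue of modulus > 1
    lam_s = 1.0 / lam_u                           # eigenvalue of modulus < 1
    return _eigenvector(m, lam_u), _eigenvector(m, lam_s)


def apply_projective(m, v):
    """Unit vector on the line obtained by applying m to the line of v."""
    (a, b), (c, d) = m
    w = (a * v[0] + b * v[1], c * v[0] + d * v[1])
    norm = math.hypot(*w)
    return (w[0] / norm, w[1] / norm)


def proj_dist(v, w):
    """Projective distance |v wedge w| between the lines of unit vectors v, w."""
    return abs(v[0] * w[1] - v[1] * w[0])


def is_tangency(B, C, A, tol=1e-12):
    """Check the condition C u(B) = s(A) up to the tolerance tol."""
    u_B, _ = eigen_directions(B)
    _, s_A = eigen_directions(A)
    return proj_dist(apply_projective(C, u_B), s_A) < tol
```

For instance, with \(A=B=\textrm{diag}(2, 1/2)\) and \(C\) the rotation by a quarter turn, the unstable direction \(e_1\) is mapped by \(C\) onto the stable direction \(e_2\), so `is_tangency(B, C, B)` holds and the configuration is a homoclinic tangency in the above sense.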

1.1 Results

Let \(H(\mu )\) be the Shannon entropy (see Sect. 3.3) of the finitely supported measure \(\mu \).

Theorem A

Let \(\mu \) be a finitely supported measure on \(SL_2({\mathbb {R}})\). Assume that \(L(\mu )>0\), \(\mu \) is irreducible and that \(\mu \) has a heteroclinic tangency. Then, there exists an analytic one-parameter family of finitely supported measures \(\{\mu _E\}_E\) such that \(\mu _0 = \mu \) and for any \(\alpha > {H(\mu )}/{L(\mu )}\), the function \(E\mapsto L(\mu _E)\) is not locally \(\alpha \)-Hölder in any neighborhood of \(E=0\).

Remark 1

Our result implies a conclusion similar to that of the Halperin/Simon-Taylor example, with a less sharp threshold. For simplicity we consider the parameter \(a=0\) with energy \(E=0\). In this example \(H(\mu _{0,b,0})=\log 2\) while

$$\begin{aligned} L(\mu _{0,b,0})&\le \frac{1}{2}\,\log \left\Vert B_0 \right\Vert = \frac{1}{2}\, \log \sqrt{\frac{2+b^2 + |b| \sqrt{b^2+4 }}{2}}\\&< \frac{1}{2}\, \textrm{arccosh}\left( 1+\frac{|b|}{2}\right) . \end{aligned}$$

The last two quantities are asymptotically equivalent, which implies that

$$\begin{aligned} \frac{H(\mu _{0,b,0})}{\frac{1}{2}\, \log \left\Vert B_0\right\Vert }\sim \frac{2\,\log 2}{\textrm{arccosh}(1+|b|/2)} \quad \text { as } \; b\rightarrow \infty . \end{aligned}$$
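The closed form for \(\left\Vert B_0\right\Vert \) and the strict inequality above can be checked numerically. The sketch below takes \(B_0\) to be the determinant-one matrix with rows \((b,-1)\) and \((1,0)\), which is the matrix consistent with the displayed norm; the function names are hypothetical.

```python
import math


def op_norm_sl2(m):
    """Operator norm of a 2x2 matrix with determinant 1.

    The squared singular values are the roots of x^2 - t x + 1 = 0,
    where t is the trace of the transpose of m times m.
    """
    (a, b), (c, d) = m
    t = a * a + b * b + c * c + d * d
    return math.sqrt((t + math.sqrt(max(t * t - 4.0, 0.0))) / 2.0)


def closed_form_norm(b):
    """Closed form sqrt((2 + b^2 + |b| sqrt(b^2 + 4)) / 2) for ||B_0||."""
    return math.sqrt((2.0 + b * b + abs(b) * math.sqrt(b * b + 4.0)) / 2.0)


def remark_quantities(b):
    """Return (||B_0||, (1/2) log ||B_0||, (1/2) arccosh(1 + |b|/2))."""
    norm = op_norm_sl2(((b, -1.0), (1.0, 0.0)))
    upper = 0.5 * math.log(norm)
    comparison = 0.5 * math.acosh(1.0 + abs(b) / 2.0)
    return norm, upper, comparison


if __name__ == "__main__":
    for b in (0.5, 3.0, 50.0, 1e6):
        norm, upper, comparison = remark_quantities(b)
        print(b, norm, closed_form_norm(b), upper, comparison, upper / comparison)
```

In this check the two norms agree, the second quantity stays below the third, and their quotient approaches 1 as \(b\) grows, in line with the asymptotic equivalence stated in the remark.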

Set

$$\begin{aligned} \alpha _{\mu }:= \sup \left\{ \alpha >0 :\ L\text { is locally } \alpha \text {-H}\ddot{\hbox {o}}\text {lder around } \mu \right\} . \end{aligned}$$

Corollary A

Let \(\mu \) be a finitely supported measure on \(SL_2({\mathbb {R}})\) with \(L(\mu )>0\). Then, either \(\mu \) is uniformly hyperbolic and L is locally analytic around \(\mu \), or else

$$\begin{aligned} \alpha _{\mu }\le \frac{H(\mu )}{L(\mu )}. \end{aligned}$$

As a consequence of the proof of Theorem A we have the following application in mathematical physics (for precise definitions see Sect. 4).

Corollary B

Consider the Anderson model of the discrete Schrödinger operators associated with a finitely supported measure \(\mu \). Let \(\alpha > \frac{H(\mu )}{L(\mu )}\) and \(E_0\) be an energy in the spectrum. Then, the integrated density of states function \(E\mapsto {\mathcal {N}}(E)\) and the Lyapunov exponent function \(E\mapsto L(E)\) are not \(\alpha \)-Hölder continuous in any neighborhood of \(E_0\).

1.2 Relations with other dimensions

See Sect. 3.3 for a precise description of the objects treated in this subsection.

The study of formulas relating (some type of) dimension, entropy and Lyapunov exponent has a vast history with many contributions in different settings (see for instance [14, 15] for diffeomorphisms of a compact manifold and [16] for self-affine measures).

For \(SL_2({\mathbb {R}})\) supported measures \(\mu \) with \(L(\mu )>0\), Ledrappier proved in [17] that we have a dimension type formula for any (forward) stationary measure \(\eta \) associated with \(\mu \), namely

$$\begin{aligned} \text {Dim }\eta = {\text {min}}\left\{ 1,\, \frac{h_F(\eta )}{2L(\mu )} \right\} . \end{aligned} \tag{1}$$

\(\text {Dim}\) is a different notion of dimension from \({\text {dim}}\) given in Sect. 3.3 (See [3, Remark 2.34]), but in the case that \(\eta \) is exact dimensional they coincide. The exactness of the dimension of the stationary measures was established by Hochman and Solomyak in [18] assuming additionally that \(\mu \) is irreducible. In particular, if \(\eta ^+\) and \(\eta ^-\) denote respectively the forward and backward stationary measures then

$$\begin{aligned} {\text {dim}}\eta ^{\pm } = {\text {min}}\left\{ 1,\, \frac{h_F(\eta ^{\pm })}{2L(\mu )} \right\} \le {\text {min}}\left\{ 1,\, \frac{H(\mu )}{2L(\mu )} \right\} . \end{aligned}$$

Moreover, in [18] they provided, among other things, conditions to obtain Ledrappier-Young type formulas relating the dimension of the stationary measure, the entropy and the Lyapunov exponents, i.e.,

$$\begin{aligned} {\text {dim}}\eta ^{\pm } = {\text {min}}\left\{ 1,\, \frac{H(\mu )}{2L(\mu )} \right\} , \end{aligned}$$

where \(h_F(\eta ^{\pm })\) is the Furstenberg entropy of \(\eta ^{\pm }\). In light of the above discussion we leave the following questions.

Question 1

Assume that \(\alpha > {\text {dim}}\eta ^+ + {\text {dim}}\eta ^-\). Under the assumptions of Theorem A, is it true that the Lyapunov exponent is not \(\alpha \)-Hölder continuous in any neighborhood of \(\mu \)?

Question 2

Is \(\frac{H(\mu )}{L(\mu )}\) a sharp bound for the regularity? In other words, is there an example where \(\alpha _{\mu } = \frac{H(\mu )}{L(\mu )}\)?

Halperin’s example above does not answer this question.

Question 3

In the case that \(\frac{H(\mu )}{L(\mu )} \ge 1\), is it true that the Lyapunov exponent is a Lipschitz continuous function around \(\mu \)?

Question 4

Is it possible to express the lower bound for the regularity in terms of some of the previous measurements?

1.3 Sketch of the proof and organization

In Mathematical Physics the Thouless formula (5) relates the Lyapunov exponent of a Schrödinger cocycle with the integrated density of states (IDS) of the corresponding Schrödinger operator. It follows from this identity (5) that the Lyapunov exponent and the IDS, as functions of the energy, share the same modulus of continuity. See Proposition 4.1. The IDS is a spectral quantity that measures the asymptotic distribution of the eigenvalues of truncation matrices of the Schrödinger operator as the size of the truncation tends to infinity. The strategy to break the Hölder regularity of the IDS in Halperin’s example is to establish, around a certain energy, a very large concentration of eigenvalues of the truncated Schrödinger matrices, which implies a disproportionately large leap of the IDS around that energy, see [10, Appendix 3]. Then, as explained above, the loss of Hölder regularity passes from the IDS to the Lyapunov exponent.

Let \(\mu \) be an irreducible and finitely supported measure on \(SL_2({\mathbb {R}})\) with positive Lyapunov exponent and \({{\textbf {A}}}:\Omega \rightarrow SL_2({\mathbb {R}})\) be the associated locally constant cocycle. In order to use the strategy described above, in Sect. 5 we embed the cocycle \({{\textbf {A}}}\) into a family of locally constant Schrödinger cocycles over a Markov shift.

We call a matching a configuration where the horizontal direction \(e_1=(1,0)\) is mapped in n iterations to the vertical direction \(e_2=(0,1)\) in such a way that \(e_1\) is greatly expanded in the first half of the iterations, followed by a similar contraction in the second half. The number n is referred to as the size of the matching. A matching of size n at some energy \(E_0\) determines an almost eigenvector for an \(n\times n\) truncated Schrödinger matrix, which then implies a true nearby eigenvalue \(E_0^*\approx E_0\) of the same matrix. Hence matchings of size n for energies in some small interval I can be used to count eigenvalues of a truncated Schrödinger operator of size n.

By Proposition 7.14, a heteroclinic tangency of the cocycle \({{\textbf {A}}}={{\textbf {A}}}_{(0)}\) implies many nearby matchings of any chosen large size n, spreading through a small interval of length \(\sim e^{-c\, n}\). Because these matchings are still not enough to break the Hölder regularity in the stated form, we prove in Proposition 7.15 that a single tangency will cause many more tangencies to occur at nearby energies, which are in some sense typical. Propositions 7.14 and 7.15 were designed to be used recursively in the sense that the output of the second feeds the input of the first. They could be used recursively to characterize the fractal structure of matchings and tangencies, a path we do not explore in this work. We do use them in a single cycle to gather the matchings, of some appropriate size, associated to a typical nearby heteroclinic tangency. The matchings coming from a typical tangency are now enough to break the Hölder regularity in the stated form.

Proposition 7.11 plays a key role in the proof of Theorem A, to estimate the number of matchings and tangencies from Propositions 7.14 and 7.15. On the other hand the proof of Proposition 7.11 relies on a characterization of the projective random walk distribution in Proposition 7.10 and a few Linear Algebra facts on the geometry of the projective action in Appendix A. See propositions A.3A.6 and Lemma A.9.

1.3.1 Organization.

This work is organized as follows. Section 2 contains the general definitions that will be used throughout the paper. In Sect. 3 we define and state some properties of locally constant linear cocycles and Furstenberg measures. We discuss general spectral properties of Schrödinger operators in Sect. 4, and in Sect. 5 we show how to embed a general locally constant cocycle into a Schrödinger family over a Markov shift. In Sect. 6 we obtain a lower bound for the oscillation of the integrated density of states in terms of counting matchings. Section 7 contains the core technical results of the work, namely Propositions 7.11, 7.14 and 7.15. Section 8 provides lower bounds for the measure of the set of matchings. In Sect. 9 we give the proof of the results. Appendix A contains the linear algebra tools needed in this work and Appendix B describes some of the formulas for derivatives of projective actions.

1.3.2 Logical structure

Figure 1 describes the logical structure of the proof of Theorem A.

Fig. 1 Logical dependencies

2 Basic Definitions and General Concepts

In this section we establish some of the general notation used throughout this work.

2.1 Preliminary definitions and notations

  • We denote by \(GL_d({\mathbb {R}})\) and \(SL_d({\mathbb {R}})\) respectively the group of \(d\times d\) invertible matrices and its subgroup of matrices with determinant one. Given a \(d\times d\) square matrix H we denote its spectrum by \({\text {Spec}}(H)\) and by \(|{\text {Spec}}(H)|\) the number of elements in \({\text {Spec}}(H)\) counted with multiplicity. Unless otherwise stated, \(\left\Vert H\right\Vert \) refers to the operator norm of the matrix H.

  • The projective space of \({\mathbb {R}}^2\), consisting of all lines in \({\mathbb {R}}^2\), is denoted by \({\mathbb {P}}^1\). Its points are denoted by \({{\hat{v}}}\in {\mathbb {P}}^1\). After introducing a projective point \({{\hat{v}}}\), by convention the letter v will stand for any unit vector aligned with the line \({{\hat{v}}}\). A natural distance in \({\mathbb {P}}^1\) is given by \(d({{\hat{v}}},\, {{\hat{w}}}):=|v\wedge w|=\sin \measuredangle ({{\hat{v}}}, {{\hat{w}}})\). Each \(A\in SL_2({\mathbb {R}})\) induces a projective automorphism \({{\hat{A}}}:{\mathbb {P}}^1\rightarrow {\mathbb {P}}^1\), where \({{\hat{A}}}\, {{\hat{v}}}:=\widehat{A\, v}\) is the line determined by the unit vector \(A\, v/\left\Vert A v\right\Vert \). For the sake of notational simplicity we often write \(A\,{{\hat{v}}}\) instead of \({{\hat{A}}}\, {{\hat{v}}}\).

  • We use the standard classification for \(SL_2({\mathbb {R}})\) matrices as elliptic, parabolic or hyperbolic meaning respectively that the absolute value of the trace is smaller than, equal to, or greater than two.

  • Let X be a compact metric space. The space of all Borel probability measures on X is denoted by \({\mathcal {P}}(X)\). This is a convex and compact set with respect to the weak* topology. Given a sequence of measures \(\eta _n\in {\mathcal {P}}(X)\), we say that \(\eta _n\) converges weak* to \(\eta \) in \({\mathcal {P}}(X)\), and write \(\eta _n{\mathop {\rightharpoonup }\limits ^{*}}\eta \), if for every continuous function \(\varphi \in C^0(X)\),

    $$\begin{aligned} \int \varphi \, d\eta =\lim _{n\rightarrow \infty } \int \varphi \, d\eta _n. \end{aligned}$$
  • Given two probability measures \(\mu _1, \mu _2\in {\mathcal {P}}(SL_2({\mathbb {R}}))\), the convolution between \(\mu _1\) and \(\mu _2\) is the measure

    $$\begin{aligned} \mu _1*\mu _2 := \int _{SL_2({\mathbb {R}})} g_*\mu _2\, d\, \mu _1(g). \end{aligned}$$

    The n-th convolution power, \(\mu ^{*n}\), of a measure \(\mu \in {\mathcal {P}}(SL_2({\mathbb {R}}))\) is defined inductively by \(\mu ^{*n}:= \mu ^{*(n-1)}*\mu \).

  • Let \(\Lambda \) be a finite set and \(\Sigma =\Lambda ^{\mathbb {Z}}\). Given \(k\in {\mathbb {Z}}\) and a finite word \(a=(a_0,a_1,\ldots , a_{m-1})\in \Lambda ^m\) the set

    $$\begin{aligned}{}[k;\, a]:=\left\{ \zeta \in \Sigma \, :\, \zeta _{j+k}=a_j,\, \forall j=0,1,\ldots , m-1 \, \right\} \end{aligned}$$

    is called the cylinder of \(\Sigma \) determined by the word a and the position k. The integer m is referred to as the length of the cylinder.

  • Given a compact metric space \((X,d)\), \(0<\theta <1\) and a continuous function \(\varphi \in C^0(X)\), the \(\theta \)-Hölder constant of \(\varphi \) is defined by

    $$\begin{aligned} v_\theta (\varphi ):=\sup _{\begin{array}{c} x,y\in X\\ x\ne y \end{array}}\frac{|\varphi (x)-\varphi (y)|}{d(x,y)^\theta }. \end{aligned}$$

    The space of \(\theta \)-Hölder continuous functions on X is

    $$\begin{aligned} C^\theta (X):=\left\{ \varphi \in C^0(X)\, :\, v_\theta (\varphi )<\infty \, \right\} \end{aligned}$$

    which endowed with the norm

    $$\begin{aligned} \left\Vert \varphi \right\Vert _\theta := \left\Vert \varphi \right\Vert _\infty + v_\theta (\varphi ) \end{aligned}$$

    becomes a Banach algebra.

  • Given sequences of real numbers \((a_n)\) and \((b_n)\) with \(a_n,\, b_n>0\) we write

    • \(a_n = O(b_n)\) if there exists an absolute constant \(C>0\) and \(n_0\in {\mathbb {N}}\) such that \(a_n \le C\, b_n\) for every \(n\ge n_0\);

    • \(a_n\lesssim b_n\) or \(b_n \gtrsim a_n\) if \(a_n = O(b_n)\);

    • \(a_n\sim b_n\) if \(\lim _{n\rightarrow \infty } a_n/b_n = 1\).

    • For \(\gamma ,\, t>0\) we write \(\gamma \, \gg \, t\) to indicate that t is much smaller than \(\gamma \).

  • Given some interval \(J\subset {\mathbb {R}}\) or \(J\subseteq {\mathbb {P}}^1\), we denote by |J| the length (size) of J. Given a positive number t, we denote by \(t\, J\) the interval with the same center as J and size \(t\, |J|\).
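For finitely supported measures the convolution defined in the list above reduces to a finite sum: \(\mu _1*\mu _2\) assigns mass \(\mu _1(\{g\})\,\mu _2(\{h\})\) to each product \(g\,h\). The following sketch illustrates this and the inductive definition of \(\mu ^{*n}\); representing a measure as a dictionary from matrices to weights is an assumption made here for illustration only.

```python
def mat_mul(g, h):
    """Product g @ h of two 2x2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = g
    (e, f), (p, q) = h
    return ((a * e + b * p, a * f + b * q), (c * e + d * p, c * f + d * q))


def convolve(mu1, mu2):
    """Convolution mu1 * mu2 of finitely supported measures {matrix: weight}."""
    out = {}
    for g, weight_g in mu1.items():
        for h, weight_h in mu2.items():
            gh = mat_mul(g, h)  # the push-forward g_* mu2 puts mass at g h
            out[gh] = out.get(gh, 0.0) + weight_g * weight_h
    return out


def convolution_power(mu, n):
    """n-th convolution power, built inductively as mu^{*(n-1)} * mu, n >= 1."""
    power = dict(mu)
    for _ in range(n - 1):
        power = convolve(power, mu)
    return power


if __name__ == "__main__":
    A = ((0, -1), (1, 0))
    B = ((1, 1), (0, 1))
    print(convolution_power({A: 0.5, B: 0.5}, 2))
```

Integer entries are used in the example so that equal products are merged exactly; with floating point entries, products that coincide mathematically may be stored under different keys because of rounding.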

2.2 Linear cocycles

Let X be a compact metric space, with Borel \(\sigma \)-algebra \({\mathcal {B}}\). Consider a homeomorphism \(T:X\rightarrow X\) which preserves a probability measure \(\xi \) defined on \({\mathcal {B}}\) and such that the system \((T,\xi )\) is ergodic. The triple \((X, T, \xi )\) is referred to as the base dynamics.

Any continuous map \(A:X\rightarrow SL_2({\mathbb {R}})\) defines a linear cocycle, over the base dynamics \((X,T,\xi )\), \(F_A:X\times {\mathbb {R}}^2\rightarrow X\times {\mathbb {R}}^2\) given by \(F_A(x,v):= (Tx, A(x)\, v)\). By linearity of the fiber action we can define the projectivization of \(F_A\) as the map \({\hat{F}}_A:X\times {\mathbb {P}}^1\rightarrow X\times {\mathbb {P}}^1\) given by \({\hat{F}}_A(x,{{\hat{v}}}):= (Tx, {{\hat{A}}}(x)\, {{\hat{v}}})\). We also use the term linear cocycle referring to the map \(A:X\rightarrow SL_2({\mathbb {R}})\) when the base dynamics is fixed.

Note that for each \(n\in {\mathbb {Z}}\), the n-th iteration of the linear cocycle \(F_A\) sends a point \((x,v)\in X\times {\mathbb {R}}^2\) to \((T^nx, A^n(x)\, v)\), where

$$\begin{aligned} A^n(x) = \left\{ \begin{array}{lc} A(T^{n-1}x)\,\ldots \, A(Tx)\, A(x) &{} \text {if } n>0 \\ \; I &{} \text {if } n=0 \\ A(T^nx)^{-1}\,\ldots \, A(T^{-2}x)^{-1}\, A(T^{-1}x)^{-1} &{} \text {if } n<0. \end{array} \right. \end{aligned}$$
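The ordering of the factors, in particular for negative n, can be made explicit in code. The sketch below covers only the special case where T is the shift on bi-infinite sequences and the cocycle depends on the 0-th coordinate; a sequence is modelled as a function from the integers to symbols, and all names are hypothetical.

```python
IDENTITY = ((1, 0), (0, 1))


def mat_mul(m, n):
    """Product m @ n of two 2x2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def inverse_sl2(m):
    """Inverse of a 2x2 matrix with determinant 1."""
    (a, b), (c, d) = m
    return ((d, -b), (-c, a))


def shift(omega, k):
    """The sequence T^k omega, i.e. j -> omega(j + k)."""
    return lambda j: omega(j + k)


def cocycle_iterate(A, omega, n):
    """A^n(omega), where A maps symbols to SL_2 matrices and A(T^j omega)
    is the matrix attached to the symbol omega(j)."""
    product = IDENTITY
    if n >= 0:
        for j in range(n):               # factors j = 0, 1, ..., n-1
            product = mat_mul(A[omega(j)], product)
    else:
        for j in range(-1, n - 1, -1):   # factors j = -1, -2, ..., n
            product = mat_mul(inverse_sl2(A[omega(j)]), product)
    return product
```

With these conventions one has \(A^{-n}(T^n x)\, A^n(x)=I\) and, more generally, \(A^{m+n}(x)=A^m(T^n x)\, A^n(x)\), which is a convenient sanity check for any implementation.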

The Lyapunov exponent of the cocycle \(F_A\) can be defined as the limit

$$\begin{aligned} L(A) = \lim _{n\rightarrow \infty }\frac{1}{n}\log \left\Vert A^n(x)\right\Vert , \end{aligned}$$

which exists and is constant for \(\xi \)-a.e. \(x\in X\) as a consequence of Kingman’s sub-additive ergodic theorem. Notice that the Lyapunov exponent depends on the base dynamics despite the fact that the notation L(A) does not refer to \((X,T,\xi )\). The underlying base dynamics should always be clear from the context.

2.3 Transition kernel and stationary measures

Let X be a compact metric space. We call any continuous map \(K:X\rightarrow {\mathcal {P}}(X)\) a transition kernel. Any transition kernel K induces a linear operator \(K:C^0(X)\rightarrow C^0(X)\)

$$\begin{aligned} (K\varphi )(x) := \int _X \varphi dK_x, \end{aligned}$$

acting on the space \(C^0(X)\) of continuous functions \(\varphi :X\rightarrow {\mathbb {R}}\). This is called the Markov operator associated to the transition kernel K. The adjoint of K, \(K^*\), in the space of probability measures \({\mathcal {P}}(X)\) is given by

$$\begin{aligned} K^*\xi := \int K_x d\xi (x). \end{aligned}$$

We say that a probability measure \(\xi _0\) is stationary for K if \(\xi _0\) is a fixed point of \(K^*\), i.e., if for every \(\varphi \in C^0(X)\),

$$\begin{aligned} \int _X\varphi \, d\xi _0 = \displaystyle \int _X\left( \int _X \varphi (y)\ dK_x(y) \right) \, d\xi _0(x). \end{aligned}$$
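When X is a finite set, a transition kernel is a row-stochastic matrix, the Markov operator acts on column vectors, and the adjoint acts on probability vectors. The sketch below, with hypothetical names, illustrates these definitions in that finite setting and approximates a stationary measure by iterating \(K^*\); this iteration is only a heuristic and converges for well-behaved (for example, aperiodic and irreducible) chains.

```python
def markov_operator(K, phi):
    """(K phi)(x) = sum_y phi(y) K_x(y) on the finite space {0, ..., m-1}."""
    return [sum(K[x][y] * phi[y] for y in range(len(phi))) for x in range(len(K))]


def adjoint_step(K, xi):
    """(K^* xi)(y) = sum_x xi(x) K_x(y)."""
    m = len(K)
    return [sum(xi[x] * K[x][y] for x in range(m)) for y in range(m)]


def stationary_measure(K, iterations=200):
    """Approximate a fixed point of K^* starting from the uniform measure."""
    m = len(K)
    xi = [1.0 / m] * m
    for _ in range(iterations):
        xi = adjoint_step(K, xi)
    return xi


if __name__ == "__main__":
    K = [[0.9, 0.1], [0.5, 0.5]]
    xi0 = stationary_measure(K)
    phi = [2.0, -1.0]
    lhs = sum(p * w for p, w in zip(phi, xi0))
    rhs = sum(p * w for p, w in zip(markov_operator(K, phi), xi0))
    print(xi0, lhs, rhs)
```

For the two-state kernel in the example, the stationary measure is \((5/6, 1/6)\), and the two printed integrals agree, which is the stationarity identity displayed above.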

Consider the process \(e_n:X^{\mathbb {N}}\rightarrow X\), \(e_n(\omega ):= \omega _n\). Given a probability measure \(\xi \in {\mathcal {P}}(X)\) there exists a unique measure \(\tilde{\xi }\) in the space \(X^{{\mathbb {N}}}\) such that:

  (a) \({\tilde{\xi }} (e_0^{-1} (E) )=\xi (E)\), \(\forall \, E\in {\mathcal {B}}\);

  (b) \({\tilde{\xi }} ( e_n^{-1} (E) \, \vert \, e_{n-1}=x )= K_{x}(E)\), \(\forall \, E\in {\mathcal {B}},\, \forall \, x\in X\).

We say that the measure \(\tilde{\xi }\) is the Kolmogorov extension of the pair \((K,\xi )\). The following statements are equivalent:

  (1) \(\xi \) is K-stationary,

  (2) \({\tilde{\xi }}\) is invariant under the shift map \(T:X^{{\mathbb {N}}}\rightarrow X^{{\mathbb {N}}}\), \(T(x_n)_{n\in {\mathbb {N}}}:=(x_{n+1})_{n\in {\mathbb {N}}}\),

  (3) \(e_n:X^{\mathbb {N}}\rightarrow X\) is a stationary Markov process with transition kernel K and common law \(\xi \).

When these conditions hold we refer to \((K,\xi )\) as a Markov system. In this case the Kolmogorov extension \({\tilde{\xi }}\) admits a natural extension to \(X^{\mathbb {Z}}\), still denoted by \({\tilde{\xi }}\), for which the two sided process \(e_n:X^{\mathbb {Z}}\rightarrow X\), \(e_n(\omega ):= \omega _n\), is a stationary Markov process. Moreover \({\tilde{\xi }}\) is invariant under the two sided shift map \(T:X^{{\mathbb {Z}}}\rightarrow X^{{\mathbb {Z}}}\),

$$\begin{aligned} T(\ldots ,x_{-1}, \mathbf {x_0}, x_1, x_2,\ldots ) := (\ldots , x_0, \mathbf {x_1}, x_2,\ldots ), \end{aligned}$$

where the bold term in the above expression indicates the 0-th position of the sequence. The dynamical system \((T, {\tilde{\xi }})\) is then called the Markov shift over X induced by the pair \((K,\xi )\).

We say that a Markov system \((K,\xi )\) is strongly mixing if

$$\begin{aligned} \lim _{n\rightarrow \infty } \left\Vert K^n\varphi - \int \varphi \, d\xi \right\Vert _\infty =0 \end{aligned}$$

with uniform convergence over bounded sets of \(C^0(X)\). It is important to observe that if a Markov system \((K,\xi )\) is strongly mixing then the Markov shift \((T, \tilde{\xi })\) is mixing. See [19, Proposition 5.1].

3 Random Product of Matrices

In this section we describe the base dynamics associated with random i.i.d. products of matrices generated by a probability measure on \(SL_2({\mathbb {R}})\).

In the subsequent sections \(\mu \) is a probability measure on \(SL_2({\mathbb {R}})\) with finite support given by \({\text {supp}}\mu = \{A_1,\ldots , A_{\kappa }\}\subset SL_2({\mathbb {R}})\). We write

$$\begin{aligned} \mu = \sum _{i=1}^{\kappa }\mu _i\, \delta _{A_i}, \end{aligned}$$

where the components \(\mu _i:= \mu (\{A_i\})>0\).

Remark 2

The positivity requirement \(\mu _i>0\) avoids discontinuities as in Kifer's counterexample. See [20] or [4, Remark 7.5].

3.1 General locally constant cocycles

3.1.1 Locally constant cocycles.

Let \(\Omega = \{1,\ldots , \kappa \}^{{\mathbb {Z}}}\) be the space of sequences in the symbols \(\{1,\ldots , \kappa \}\), \({\tilde{\mu }}= (\mu _1, \ldots , \mu _\kappa )^{{\mathbb {Z}}}\) be the Bernoulli product measure on \(\Omega \) and consider \(\sigma : \Omega \rightarrow \Omega \) the shift map. Note that the system \((\sigma , {\tilde{\mu }})\) is ergodic. We say that the triple \((\Omega , \sigma , {\tilde{\mu }})\) is the base dynamics determined by \(\mu \).

It is important to point out that the base dynamics \((\Omega , \sigma , {\tilde{\mu }})\) does not depend on the \(\kappa \)-tuple \((A_1,\dots ,A_{\kappa })\in (SL_2({\mathbb {R}}))^{\kappa }\) but only on the values \(\mu _i = \mu \{A_i\}\).

Consider the map \({{\textbf {A}}}: \Omega \rightarrow SL_2({\mathbb {R}})\) given by

$$\begin{aligned} {{\textbf {A}}}(\ldots , \omega _{-1}, \omega _0,\omega _1,\ldots ) := A_{\omega _0}. \end{aligned}$$

Notice that for each sequence \(\omega \in \Omega \), \({{\textbf {A}}}(\omega )\) only depends on the 0-th coordinate of the sequence \(\omega \). Such maps are known in the literature as locally constant linear cocycles. This is an agreed abuse of the term since for the standard topology in \(\Omega \), locally constant observables include a broader class of functions. Since the base dynamics is fixed, \((A_1,\ldots , A_{\kappa })\) determines the Lyapunov exponent \(L({{\textbf {A}}})\) and for that reason we sometimes write \(L({{\textbf {A}}}) = L(A_1,\ldots ,A_{\kappa })\) to emphasize this dependence. This definition of the Lyapunov exponent of \({{\textbf {A}}}\) agrees with the one given in the introduction for the distribution law \(\mu \), so that \(L(\mu ) = L({{\textbf {A}}}) = L(A_1,\ldots , A_{\kappa })\).

3.1.2 Uniformly hyperbolic cocycles.

The measure \(\mu \), or equivalently, the locally constant cocycle \({{\textbf {A}}}:\Omega \rightarrow SL_2({\mathbb {R}})\) is said to be uniformly hyperbolic if there exist \(C>0\) and \(\gamma >0\) such that for every \(n\ge 1\) and \(\omega \in \Omega \),

$$\begin{aligned} \left\Vert {{\textbf {A}}}^n(\omega )\right\Vert \ge Ce^{\gamma n} . \end{aligned} \tag{2}$$

It is known [5] that this is equivalent to the existence of two \({{\textbf {A}}}\)-invariant continuous sections \(F^u,\, F^s:\Omega \rightarrow {\mathbb {P}}^1\) such that for every \(\omega \), \(F^u(\omega )\oplus \, F^s(\omega ) = {\mathbb {R}}^2\), and there exist \(C>0\) and \(\gamma >0\) such that

$$\begin{aligned} \left\Vert {{\textbf {A}}}^n(\omega )|_{F^s(\omega )} \right\Vert \le Ce^{-\gamma n} \quad \text {and} \quad \left\Vert {{\textbf {A}}}^{-n}(\omega )|_{F^u(\omega )} \right\Vert \le Ce^{-\gamma n}. \end{aligned}$$
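For a locally constant cocycle, the norm \(\left\Vert {{\textbf {A}}}^n(\omega )\right\Vert \) depends only on the word \((\omega _0,\ldots ,\omega _{n-1})\), so the growth condition (2) can be probed for small n by brute force over all words. The sketch below, with hypothetical names, computes the minimal norm over words of length n. A finite computation of this kind gives only numerical evidence and proves nothing about uniform hyperbolicity.

```python
import itertools
import math


def mat_mul(m, n):
    """Product m @ n of two 2x2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def op_norm_sl2(m):
    """Operator norm of a 2x2 matrix with determinant 1."""
    (a, b), (c, d) = m
    t = a * a + b * b + c * c + d * d
    return math.sqrt((t + math.sqrt(max(t * t - 4.0, 0.0))) / 2.0)


def min_norm_over_words(matrices, n):
    """Minimum of ||A_{w_{n-1}} ... A_{w_0}|| over all words w of length n."""
    best = math.inf
    for word in itertools.product(matrices, repeat=n):
        product = ((1, 0), (0, 1))
        for step in word:
            product = mat_mul(step, product)
        best = min(best, op_norm_sl2(product))
    return best


if __name__ == "__main__":
    positive = [((2, 1), (1, 1)), ((3, 2), (1, 1))]
    for n in (2, 4, 6, 8):
        print(n, math.log(min_norm_over_words(positive, n)) / n)
```

In the example both matrices have positive entries, and the minimal norm grows with n. By contrast, if the support contains the rotation by a quarter turn, the word repeating it four times gives the identity, so the minimal norm over words of length 4 equals 1 and no bound of the form (2) can hold.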

3.1.3 Forward and backward stationary measures

Consider the transition kernels \(Q_+:{\mathbb {P}}^1\rightarrow {\mathcal {P}}({\mathbb {P}}^1)\) and \(Q_{-}:{\mathbb {P}}^1\rightarrow {\mathcal {P}}({\mathbb {P}}^1)\) defined, respectively, by

$$\begin{aligned} Q_+({{\hat{v}}}) := \sum _{i=1}^{\kappa }\mu _i\, \delta _{A_i\, {{\hat{v}}}} \quad \text {and}\quad Q_-({{\hat{v}}}) := \sum _{i=1}^{\kappa }\mu _i\, \delta _{A_i^{-1}\, {{\hat{v}}}}. \end{aligned}$$

Definition 1

A measure \(\eta \in {\mathcal {P}}({\mathbb {P}}^1)\) is called forward, resp. backward, stationary for \(\mu \)  if   \(Q_+^*\, \eta =\eta \), resp. \(Q_-^*\, \eta =\eta \), i.e., if \(\eta \) is stationary for \(Q_+\), resp. for \(Q_-\).

Notice that a backward stationary measure for \(\mu \) is a forward stationary measure for the reverse measure \(\mu ^{-1}:=\sum _{i=1}^{\kappa } \mu _i\, \delta _{A_i^{-1}}\).

3.2 Irreducible cocycles

Throughout this section, unless otherwise explicitly said, we assume that the probability measure \(\mu = \sum \mu _i\, \delta _{A_i}\) has positive Lyapunov exponent and is quasi-irreducible.

It follows that the Markov operator \(Q_+:C^0({\mathbb {P}}^1)\rightarrow C^0({\mathbb {P}}^1)\) defined by

$$\begin{aligned} (Q_+\varphi )({\hat{v}}) := \sum _{i=1}^{\kappa }\mu _i\, \varphi (A_i\, {\hat{v}}), \end{aligned}$$

preserves the space of \(\theta \)-Hölder continuous functions \(C^{\theta }({\mathbb {P}}^1)\), for some \(\theta >0\), and \(Q_+|_{C^{\theta }({\mathbb {P}}^1)}:C^{\theta }({\mathbb {P}}^1)\rightarrow C^{\theta }({\mathbb {P}}^1)\) is a quasi-compact and simple operator, i.e., it has a simple largest eigenvalue, namely 1 associated to the constant functions, and all other elements in the spectrum have absolute value strictly less than 1.
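The action of \(Q_+\) on observables can be sketched numerically. The following minimal Python sketch assumes that points of \({\mathbb {P}}^1\) are parametrized by an angle in \([0,\pi )\); the two matrices and the weights are hypothetical illustration data, not taken from the text.

```python
import math

def act(A, theta):
    """Projective action of a 2x2 matrix A on the direction with angle theta."""
    x, y = math.cos(theta), math.sin(theta)
    u = A[0][0] * x + A[0][1] * y
    v = A[1][0] * x + A[1][1] * y
    return math.atan2(v, u) % math.pi  # angle of A v, taken modulo pi

def markov(phi, mats, weights):
    """Return Q_+ phi, i.e. the function theta -> sum_i mu_i * phi(A_i theta)."""
    return lambda theta: sum(w * phi(act(A, theta)) for A, w in zip(mats, weights))

# Hypothetical example data (illustration only).
MATS = [((2.0, 0.0), (0.0, 0.5)), ((1.0, 1.0), (0.0, 1.0))]
WEIGHTS = [0.5, 0.5]

# Constant functions are fixed by Q_+ (eigenvalue 1).
q_one = markov(lambda theta: 1.0, MATS, WEIGHTS)

# Q_+^2 applied to a bounded observable stays bounded by the same constant.
q2_cos = markov(markov(lambda t: math.cos(2 * t), MATS, WEIGHTS), MATS, WEIGHTS)
```

Iterating `markov` gives \(Q_+^n\varphi \); the sketch only illustrates the definition and does not check the spectral gap.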

Proposition 3.1

There exist a unique forward stationary measure \(\eta ^+\) and a unique backward stationary measure \(\eta ^-\) for \(\mu \).

Proof

See [21, Proposition 4.2]. \(\square \)

The operator \(Q_+\) contracts the Hölder seminorm.

Proposition 3.2

There exist positive constants \(0<\theta <1\), C and c such that

$$\begin{aligned} v_\theta (Q_+^n \varphi )\le C\,e^{-c\, n}\, v_\theta (\varphi )\qquad \forall \, n\in {\mathbb {N}}, \end{aligned}$$

for every \(\varphi \in C^\theta ({\mathbb {P}}^1)\).

Proof

See [21, Propositions 4.1 and 4.2]. \(\square \)

Another consequence of the quasi-compactness is that the locally constant linear cocycle \({{\textbf {A}}}:\Omega \rightarrow SL_2({\mathbb {R}})\), associated with \(\mu \) and defined on the product space \(\Omega = \{1,\ldots ,\kappa \}^{{\mathbb {Z}}}\), satisfies uniform large deviation estimates of exponential type in a neighborhood of \({{\textbf {A}}}\).

Proposition 3.3

There exist constants \(\delta >0\), \(C>0\), \(\tau >0\) and \(\varepsilon _0>0\) such that for every \(\varepsilon \in (0,\varepsilon _0)\), for all locally constant \({{\textbf {B}}}:\Omega \rightarrow SL_2({\mathbb {R}})\) with \(\left\Vert {{\textbf {B}}} - {{\textbf {A}}}\right\Vert _{\infty }<\delta \), every \({{\hat{v}}}\in {\mathbb {P}}^1\) with unit representative \(v\in {{\hat{v}}}\), and every \(n\in {\mathbb {N}}\),

$$\begin{aligned} {\tilde{\mu }}\left( \left\{ \omega \in \Omega \, :\, \left| \frac{1}{n}\log \left\Vert {{\textbf {B}}}^n(\omega )\,v\right\Vert - L({{\textbf {B}}}) \right| \ge \varepsilon \right\} \right) \le C\, e^{-\tau \varepsilon ^2n}, \end{aligned}$$

and

$$\begin{aligned} {\tilde{\mu }}\left( \left\{ \omega \in \Omega \, :\, \left| \frac{1}{n}\log \left\Vert {{\textbf {B}}}^n(\omega )\right\Vert - L({{\textbf {B}}}) \right| \ge \varepsilon \right\} \right) \le C\, e^{-\tau \varepsilon ^2n}. \end{aligned}$$

Proof

See Theorem 4.1 and its proof in [21]. \(\square \)

Let \(0<\theta <1\) be the constant in Proposition 3.2.

Proposition 3.4

There exist constants \(C>0\) and \(c>0\) such that for any interval \(I\subset {\mathbb {P}}^1\), every \({\hat{v}}\in {\mathbb {P}}^1\) and \(n\ge 1\) we have,

$$\begin{aligned} \eta ^{\pm }(I/2) - C\frac{e^{-cn}}{|I|^\theta } \le {\tilde{\mu }}\left( \left[ {{\textbf {A}}}^{\pm n}(\cdot )\, {\hat{v}}\in I \right] \right) \le \eta ^{\pm }(2I) + C\frac{e^{-cn}}{|I|^\theta }. \end{aligned}$$

Proof

Set \(\varepsilon = |I|/4\) and consider piece-wise linear functions \(f^\pm _\varepsilon \in C^{\theta }({\mathbb {P}}^1)\) such that \(v_{\theta }(f^\pm _{\varepsilon }) = \varepsilon ^{-\theta }\),   \(0\le f^-_{\varepsilon } \le \chi _I \le f^+_{\varepsilon }\le 1\),   \(f^-_\varepsilon =1\) on I/2  and   \(f^+_{\varepsilon }=0\) out of 2I. By Proposition 3.2 we have that

$$\begin{aligned} \left| Q^n_+(f^{\pm }_{\varepsilon }) - \int f^{\pm }_{\varepsilon }\, d\eta ^+\right|&\le Ce^{-cn}v_{\theta }(f^{\pm }_{\varepsilon }) = Ce^{-cn}\varepsilon ^{-\theta }. \end{aligned}$$

Therefore,

$$\begin{aligned} {\tilde{\mu }}\left( \left[ {{\textbf {A}}}^n(\cdot )\, {\hat{v}}\in I \right] \right)&= Q_+^n(\chi _I) \ge Q_+^n(f^-_{\varepsilon }) \ge \int f^-_{\varepsilon }\, d\eta ^+- Ce^{-cn}\varepsilon ^{-\theta }\\&\ge \eta ^+(I/2) - 4^{\theta }Ce^{-cn}|I|^{-\theta }, \end{aligned}$$

and

$$\begin{aligned} {\tilde{\mu }}\left( \left[ {{\textbf {A}}}^n(\cdot )\, {\hat{v}}\in I \right] \right)&= Q_+^n(\chi _I) \le Q_+^n(f^+_{\varepsilon }) \le \int f^+_{\varepsilon }\, d\eta ^++ Ce^{-cn}\varepsilon ^{-\theta }\\&\le \eta ^+(2I) + 4^{\theta }Ce^{-cn}|I|^{-\theta }. \end{aligned}$$

The argument for \({{\textbf {A}}}^{-n}\) is analogous. \(\square \)

3.3 Entropy and dimensions

3.3.1 Entropies.

For a finitely supported measure \(\mu \in {\mathcal {P}}(SL_2({\mathbb {R}}))\), the Shannon entropy is defined by

$$\begin{aligned} H(\mu ):= -\sum _{g\in {\text {supp}}(\mu )}\mu (\{g\})\log \mu (\{g\}) =-\sum _{i=1}^\kappa \mu _i\, \log \mu _i. \end{aligned}$$

A measurement of how far the semigroup generated by \({\text {supp}}(\mu )\) is from being free is given by

$$\begin{aligned} h_{\text {RW}}(\mu ) := \lim _{n\rightarrow \infty }\frac{1}{n}H(\mu ^{*n}) = \inf _{n\in {\mathbb {N}}}\frac{1}{n}H(\mu ^{*n}), \end{aligned}$$

which is usually called the random walk entropy of \(\mu \). It holds that \(h_{\text {RW}}(\mu )\le H(\mu )\) and the equality is equivalent to the semigroup generated by \({\text {supp}}(\mu )\) being free. This is the typical case.
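The inequality \(h_{\text {RW}}(\mu )\le H(\mu )\) can be seen on a toy example. The sketch below is only an illustration: it uses two hypothetical commuting unipotent matrices with equal weights (so the generated semigroup is not free) and computes \(H(\mu )\) and \(H(\mu ^{*2})\) by brute-force enumeration of words.

```python
import math
from collections import defaultdict
from itertools import product

def shannon(probs):
    """Shannon entropy -sum p log p of a finite probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def mul(X, Y):
    """Product of 2x2 integer matrices stored as nested tuples."""
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def convolution_power(mu, n):
    """Law of the product of n independent matrices with law mu (dict: matrix -> weight)."""
    law = defaultdict(float)
    for word in product(mu.items(), repeat=n):
        g, p = ((1, 0), (0, 1)), 1.0
        for A, w in word:
            g, p = mul(A, g), p * w   # multiply on the left, accumulate the weight
        law[g] += p                   # coinciding products merge their mass
    return dict(law)

# Hypothetical example: two commuting matrices, hence a non-free semigroup.
MU = {((1, 1), (0, 1)): 0.5, ((1, 2), (0, 1)): 0.5}
H1 = shannon(MU.values())                          # H(mu) = log 2
H2 = shannon(convolution_power(MU, 2).values())    # H(mu^{*2}) = (3/2) log 2
```

Here \(\frac{1}{2}H(\mu ^{*2}) < H(\mu )\), and since \(h_{\text {RW}}\) is the infimum of \(\frac{1}{n}H(\mu ^{*n})\), the random walk entropy of this example is strictly smaller than \(H(\mu )\).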

The Furstenberg entropy, also known as boundary entropy, is defined by

$$\begin{aligned} h_F(\eta ) := \int \int \log \frac{dg_*\eta }{d\eta }(v) \, d\eta (v)\, d\mu (g). \end{aligned}$$

We always have that,

$$\begin{aligned} h_F(\eta ) \le h_{\text {RW}}(\mu ) \le H(\mu ). \end{aligned}$$

See [3, Theorem 2.31] for details.

3.3.2 Dimension.

Let \(\eta \) be a probability measure on \({\mathbb {R}}\) (or \({\mathbb {P}}^1\)). For any \(t\in {\mathbb {R}}\), the limits

$$\begin{aligned} {\overline{{\text {dim}}}}(\eta , t)&=\limsup _{\delta \rightarrow 0}\frac{ \log \eta ([t-\delta , t+\delta ]) }{\log \delta },\\ {\underline{{\text {dim}}}}(\eta ,t)&=\liminf _{\delta \rightarrow 0}\frac{ \log \eta ([t-\delta ,t+\delta ]) }{\log \delta } \end{aligned}$$

are called, respectively, the upper local dimension and lower local dimension of \(\eta \) at the point \(t\in {\mathbb {R}}\). We say that \(\eta \) is exact dimensional if there exists a real number \(\alpha \ge 0\) such that \({\overline{{\text {dim}}}}(\eta , t) = {\underline{{\text {dim}}}}(\eta , t) = \alpha \), for \(\eta \)-a.e. \(t\in {\mathbb {R}}\). In this case, the number \(\alpha \) is the dimension of the probability measure \(\eta \) and is denoted just by \({\text {dim}}\eta \).

As mentioned in the introduction, the stationary measures of an irreducible cocycle with positive Lyapunov exponent are always exact dimensional.

3.3.3 Entropy deviations.

Consider \(\mathbf{{p}}_n:\Omega \rightarrow {\mathbb {R}}\), \(\mathbf{{p}}_{n}(\omega ):= \prod _{j=0}^{n-1} p_{\omega _j}\), together with the function \(\varphi :\Omega \rightarrow {\mathbb {R}}\), \(\varphi (\omega ):= -\log p_{\omega _0} \), and notice that

$$\begin{aligned} \displaystyle \int \varphi \, d\, {\tilde{\mu }}= H(\mu )\quad \text { and } \quad (S_n\varphi )(\omega ):=\sum _{j=0}^{n-1} \varphi (\sigma ^j\omega ) =-\log \mathbf{{p}}_n(\omega ). \end{aligned}$$

Proposition 3.5

Assuming \(L(\mu )>0\) let \(h:={\text {max}}_{1\le j\le \kappa } - \log p_j>0\). For every \(n\in {\mathbb {N}}\) and \(\beta >0\),

$$\begin{aligned} {\tilde{\mu }}\left( \left\{ \omega \in \Omega :\, \left| \frac{1}{n}\log \mathbf{{p}}_{n}(\omega ) + H(\mu ) \right| >\beta \right\} \right) \le 2 \, \exp \left( -n\, \frac{ 2\,\beta ^2}{ h^2}\right) . \end{aligned}$$

Proof

The large deviation set in the statement is

$$\begin{aligned} \Delta _{n}:=\left\{ \omega \in \Omega :\left| (S_n\varphi )(\omega )-{\mathbb {E}}(S_n\varphi ) \right| > n\, \beta \right\} \end{aligned}$$

and by Hoeffding’s inequality [22, Theorem 2]

$$\begin{aligned} {\tilde{\mu }}(\Delta _n)\le 2\, \exp \left( -\frac{2\,n^2\,\beta ^2}{n\, h^2}\right) = 2\, \exp \left( - n\, \frac{2 \,\beta ^2}{h^2}\right) . \end{aligned}$$

\(\square \)
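As a sanity check of the bound in Proposition 3.5, the following Python sketch compares the exact deviation probability with the Hoeffding bound in the simplest case of two symbols. The two-symbol Bernoulli setting and the numerical values used in the usage note are assumptions made for illustration only.

```python
import math

def exact_tail(p, n, beta):
    """Exact probability that |(1/n) log p_n + H| > beta for the Bernoulli weights (p, 1-p)."""
    H = -(p * math.log(p) + (1 - p) * math.log(1 - p))
    total = 0.0
    for k in range(n + 1):  # k = number of occurrences of the first symbol
        log_pn = k * math.log(p) + (n - k) * math.log(1 - p)
        if abs(log_pn / n + H) > beta:
            total += math.comb(n, k) * p**k * (1 - p)**(n - k)
    return total

def hoeffding_bound(p, n, beta):
    """Right-hand side 2 exp(-2 n beta^2 / h^2) with h = max_j (-log p_j)."""
    h = max(-math.log(p), -math.log(1 - p))
    return 2 * math.exp(-2 * n * beta**2 / h**2)
```

For instance, `exact_tail(0.3, 50, 0.2)` can be compared with `hoeffding_bound(0.3, 50, 0.2)`; the exact value is expected to be considerably smaller, since Hoeffding's inequality is not sharp.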

4 Schrödinger Cocycles

In this section we present some background on the theory of Schrödinger cocycles. The advantage of dealing with this family is its intrinsic relation with the spectral theory of (discrete) Schrödinger operators, which allows us, among other things, to analyze the behaviour of the Lyapunov exponent in terms of properties of the spectrum of these operators.

4.1 Schrödinger operators and cocycles

Consider the base dynamics \((X, T, \xi )\), where \(T:X\rightarrow X\) is a homeomorphism on the compact metric space X and \(\xi \) is a probability measure on X such that the system \((T, \xi )\) is ergodic. Fix a continuous function \(\phi :X\rightarrow {\mathbb {R}}\).

For each \(x\in X\), the (discrete) Schrödinger operator at x is the self-adjoint bounded linear operator \(H_x:l^2({\mathbb {Z}})\rightarrow l^2({\mathbb {Z}})\) defined, for \(u = (u_n)_{n\in {\mathbb {Z}}}\in l^2({\mathbb {Z}})\), by

$$\begin{aligned} (H_x\, u)_n := -u_{n+1} - u_{n-1} + \phi (T^nx)u_n \end{aligned}$$

or in short notation

$$\begin{aligned} H_x\, u := -\Delta u + \phi _x\, u, \end{aligned}$$

where \(\Delta \) is the Laplace operator and \(\phi _x\) is the multiplication by \((\phi (T^nx))_{n\in {\mathbb {Z}}}\).

It is convenient to express the operator \(H_x\) as a matrix in the canonical basis \((e_i)_{i\in {\mathbb {Z}}}\) of \(l^2({\mathbb {Z}})\), where \((e_i)_n = \delta _{i,n}\).

$$\begin{aligned} H_x = \left( \begin{array}{cccccccc} \ddots &{} \vdots &{} \vdots &{} \vdots &{} &{} \vdots &{} \vdots &{} \\ \dots &{} \phi (T^{-1}x) &{} -1 &{} 0 &{} \dots &{} 0 &{} 0 &{} \dots \\ \dots &{} -1 &{} \phi (x) &{} -1 &{} \dots &{} 0 &{} 0 &{} \dots \\ \dots &{} 0 &{} -1 &{} \phi (Tx) &{} \dots &{} 0 &{} 0 &{} \dots \\ &{} \vdots &{} \vdots &{} \vdots &{} \ddots &{} \vdots &{} \vdots &{} \\ \dots &{} 0 &{} 0 &{} 0 &{} \dots &{} \phi (T^{n-2}x) &{} -1 &{} \dots \\ \dots &{} 0 &{} 0 &{} 0 &{} \dots &{} -1 &{} \phi (T^{n-1}x) &{} \dots \\ &{} \vdots &{} \vdots &{} \vdots &{} &{} \vdots &{} \vdots &{} \ddots \end{array} \right) \end{aligned}$$

A matrix with this structure, where all entries outside the three main diagonals vanish, is usually called a tridiagonal matrix.

Assume that there exists a sequence \(u = (u_n)_{n\in {\mathbb {Z}}}\), not necessarily in \(l^2({\mathbb {Z}})\), which satisfies the eigenvalue equation for some \(E\in {\mathbb {R}}\), i.e.,

$$\begin{aligned} H_x\, u = E\, u. \end{aligned}$$
(3)

Using the definition of \(H_x\), Eq. (3) gives us a second order recurrence equation which can be written in matrix form as

$$\begin{aligned} \left( \begin{array}{cc} \phi (T^{n-1}x) - E &{} -1 \\ 1 &{} 0 \end{array} \right) \, \begin{pmatrix} u_{n-1} \\ u_{n-2} \end{pmatrix}= \begin{pmatrix} u_{n} \\ u_{n-1} \end{pmatrix}. \end{aligned}$$

This implies that

$$\begin{aligned} \left( \begin{array}{cc} \phi (T^{n-1}x) - E &{} -1 \\ 1 &{} 0 \end{array} \right) \cdot \ldots \cdot \left( \begin{array}{cc} \phi (x) - E &{} -1 \\ 1 &{} 0 \end{array} \right) \begin{pmatrix} u_0 \\ u_{-1} \end{pmatrix}= \begin{pmatrix} u_n \\ u_{n-1} \end{pmatrix}. \end{aligned}$$
(4)

Hence, if we define the family of cocycles \(A_E:X\rightarrow SL_2({\mathbb {R}})\)

$$\begin{aligned} A_E(x) := \left( \begin{array}{cc} \phi (x) - E &{} -1 \\ 1 &{} 0 \end{array} \right) , \end{aligned}$$

then equation (4) can be rewritten as

$$\begin{aligned} A^n_E(x)\, \begin{pmatrix} u_0\\ u_{-1} \end{pmatrix} = \begin{pmatrix} u_n\\ u_{n-1} \end{pmatrix}. \end{aligned}$$

In other words, any (formal) eigenvector \(u = (u_n)\) of the Schrödinger operator \(H_x\) associated with an eigenvalue E is completely determined by the orbit of the cocycle \(A_E\) starting at \((u_0, u_{-1})\in {\mathbb {R}}^2\). This is one of the first indications of the close relationship between the action of the cocycle \(A_E\) and the properties of the spectrum of \(H_x\).

The cocycles \(A_E: X\rightarrow SL_2({\mathbb {R}})\) are called Schrödinger cocycles with potential \(\phi :X\rightarrow {\mathbb {R}}\), generated by the dynamical system \((X, T, \xi )\).
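The correspondence between formal eigenvectors and orbits of the cocycle can be illustrated with a short Python sketch. It assumes that the potential along the orbit is given as a finite list, with `potential[j]` standing for \(\phi (T^jx)\); the function names are hypothetical.

```python
def schrodinger_matrix(phi_value, E):
    """A_E(x) = [[phi(x) - E, -1], [1, 0]]."""
    return ((phi_value - E, -1.0), (1.0, 0.0))

def formal_solution(potential, E, u0, u_minus1):
    """Iterate the cocycle; returns [u_{-1}, u_0, u_1, ..., u_n] with n = len(potential)."""
    u = [u_minus1, u0]
    for phi_value in potential:
        a, b = schrodinger_matrix(phi_value, E)[0]
        u.append(a * u[-1] + b * u[-2])  # first row of A_E produces the next entry
    return u

def residuals(potential, E, u):
    """(H_x u)_j - E u_j for j = 0, ..., n-1, where u[j + 1] stores u_j."""
    return [-u[j + 2] - u[j] + potential[j] * u[j + 1] - E * u[j + 1]
            for j in range(len(potential))]
```

By construction the residuals vanish up to rounding errors, which is exactly the statement that the orbit of \((u_0,u_{-1})\) under \(A_E\) solves the eigenvalue equation (3) at the sites \(0,\ldots ,n-1\).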

4.2 Integrated density of states and Thouless formula

For each \(n\in {\mathbb {N}}\) and for each \(x\in X\), \(H^n_x\in {\mathbb {M}}_n({\mathbb {R}})\) denotes the truncated Schrödinger operator defined by

$$\begin{aligned} H^n_x = \left( \begin{array}{cccccccc} \phi (x) &{} -1 &{} 0 &{} \dots &{} 0 &{} 0\\ -1 &{} \phi (Tx) &{} -1 &{} \dots &{} 0 &{} 0\\ 0 &{} -1 &{} \phi (T^2x) &{} \dots &{} 0 &{} 0\\ \vdots &{} \vdots &{} \vdots &{} \ddots &{} \vdots &{} \vdots \\ 0 &{} 0 &{} 0 &{} \dots &{} \phi (T^{n-2}x) &{} -1\\ 0 &{} 0 &{} 0 &{} \dots &{} -1 &{} \phi (T^{n-1}x)\\ \end{array} \right) . \end{aligned}$$

For any interval \(I\subset {\mathbb {R}}\) denote by \(|{\text {Spec}}(H^n_x)\cap I|\) the number of eigenvalues of \(H^n_x\) in I counted with multiplicity. With this notation, set for each \(x\in X\)

$$\begin{aligned} {\mathcal {N}}_{n,x}(t) := \frac{1}{n}\left| {\text {Spec}}(H^n_x)\cap (-\infty , t] \right| . \end{aligned}$$

So, by definition \({\mathcal {N}}_{n,x}(t)\) is a distribution function of a probability measure supported in the spectrum of \(H^n_x\).

It is known [23, Subsections 3.2 and 3.3] that for each \(t\in {\mathbb {R}}\) the limit

$$\begin{aligned} {\mathcal {N}}(t) = \lim _{n\rightarrow \infty } {\mathcal {N}}_{n,x}(t), \end{aligned}$$

exists and by ergodicity of the base dynamics \((X, T, \xi )\) is constant for \(\xi \)-a.e. \(x\in X\). The function \({\mathcal {N}}:{\mathbb {R}}\rightarrow [0,\infty )\) is called the integrated density of states.
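The finite scale function \({\mathcal {N}}_{n,x}\) can be evaluated without diagonalizing \(H^n_x\). The sketch below is one possible approach, not the method of the text: it counts eigenvalues through the signs of the pivots of the \(LDL^T\) factorization of \(H^n_x - t\) (a Sturm sequence count). It counts eigenvalues strictly below t, which agrees with the definition above whenever t is not itself an eigenvalue; the list `potential` stands for \((\phi (T^jx))_{0\le j<n}\).

```python
def count_below(potential, t):
    """Number of eigenvalues of the truncated operator strictly below t."""
    count, pivot = 0, None
    for phi_value in potential:
        # Pivot recurrence for a tridiagonal matrix with off-diagonal entries -1.
        pivot = (phi_value - t) - (0.0 if pivot is None else 1.0 / pivot)
        if pivot == 0.0:
            pivot = 1e-300  # common safeguard against division by zero
        if pivot < 0:
            count += 1      # each negative pivot accounts for one eigenvalue below t
    return count

def finite_ids(potential, t):
    """Approximation of N_{n,x}(t) with n = len(potential)."""
    return count_below(potential, t) / len(potential)
```

For the zero potential with \(n=4\) the eigenvalues are \(2\cos (j\pi /5)\), \(j=1,\ldots ,4\), so `finite_ids([0.0] * 4, 0.1)` returns 0.5.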

The following equation, known as the Thouless formula, relates the Lyapunov exponent of a Schrödinger cocycle with the integrated density of states.

$$\begin{aligned} L(A_E) = \displaystyle \int _{-\infty }^{\infty }\log |E - t|\ d{\mathcal {N}}(t), \qquad \forall \, E\in {\mathbb {R}}. \end{aligned}$$
(5)

See [23, Theorem 3.16]. Integrating by parts the Riemann-Stieltjes integral on the right-hand side of equation (5), we see that this equation expresses \(L(A_E)\) as the Hilbert transform of \({\mathcal {N}}(t)\). This fact implies, by the work of Goldstein and Schlag, see [24, Lemma 10.3], that the Lyapunov exponent and the integrated density of states must share all ‘sufficiently nice’ moduli of continuity. These nice moduli of continuity include the Hölder and weak-Hölder regularities. In particular we have:

Proposition 4.1

\({\mathcal {N}}(E)\) is not \(\beta \)-Hölder  if and only if  \(E\mapsto L(A_E)\) is not \(\beta \)-Hölder.

4.3 Temple’s lemma

The Thouless formula allows us to shift the analysis of the regularity from the Lyapunov exponent to the integrated density of states, and more specifically to the counting of eigenvalues of the truncated Schrödinger operators \(H^n_x\). An important tool is the next linear algebra fact, known as Temple’s lemma, which allows us to count eigenvalues by counting orthonormal almost eigenvectors instead.

Lemma 4.2

(Temple’s lemma) Let \((V, \langle \cdot ,\cdot \rangle )\) be a finite dimensional Hilbert space and let \(H:V\rightarrow V\) be a self-adjoint linear operator on V. Given \(\delta >0\) and \(\lambda _0\in {\mathbb {R}}\), assume that there exists an orthonormal set \(\{u_1,\ldots , u_k\}\subset V\) such that

  1. \(\langle Hu_i, u_j\rangle = \langle Hu_i, H u_j\rangle = 0\)   if   \(i\ne j\),

  2. \(\left\Vert Hu_i - \lambda _0 u_i\right\Vert \le \delta \) for every i.

Then   \(|{\text {Spec}}(H)\cap (\lambda _0 - \delta , \lambda _0 + \delta )|\ge k\).

Proof

See [10, Lemma A.3.2]. \(\square \)

We will say that \(u\in V\backslash \{0\}\) is a \(\delta \)-almost eigenvector associated with an almost eigenvalue \(\lambda _0\) if condition 2 above is satisfied.

5 Embedding Cocycles Into Schrödinger Families

Let \(\mu \), as in the previous section, be a probability measure on \(SL_2({\mathbb {R}})\) supported in \(\{A_1,\ldots , A_{\kappa }\}\).

We use the notation \(S(t)\in SL_2({\mathbb {R}})\) to denote the Schrödinger matrix

$$\begin{aligned} S(t) = \left( \begin{array}{cc} t &{} -1 \\ 1 &{} 0 \end{array} \right) \end{aligned}$$

The following lemma is the basis of the entire section. The fact that we can decompose any given \(SL_2({\mathbb {R}})\) matrix as a product of four Schrödinger matrices provides a way to embed our random cocycle in a Schrödinger cocycle over a Markov shift.

Lemma 5.1

For every \(B\in SL_2({\mathbb {R}})\), there exist real numbers \(t_0, t_1, t_2\) and \(t_3\) such that \(B = S(t_3)\,S(t_2)\,S(t_1)\,S(t_0)\).

Proof

Consider first the map \({\mathbb {R}}^3\ni (t_1,t_2,t_3)\mapsto S(t_3)\, S(t_2)\, S(t_1)\in SL_2({\mathbb {R}})\). A direct calculation shows that the range of this map is the set \(SL_2({\mathbb {R}})\setminus {\mathcal {M}}\) where

$$\begin{aligned} {\mathcal {M}}:=\left\{ \begin{pmatrix} a &{} \lambda \\ -\lambda ^{-1} &{} 0 \end{pmatrix} \, :\, \lambda \ne 0,1, \quad \text {and} \quad a\in {\mathbb {R}}\right\} . \end{aligned}$$

So, the range of the map \({\mathbb {R}}^3\ni (t_1,t_2,t_3)\mapsto S(0)\,S(t_3)\, S(t_2)\, S(t_1)\in SL_2({\mathbb {R}})\) is the set \(SL_2({\mathbb {R}})\setminus S(0)\,{\mathcal {M}}\) where

$$\begin{aligned} S(0)\, {\mathcal {M}} = \left\{ \begin{pmatrix} \lambda ^{-1} &{} 0 \\ a &{} \lambda \end{pmatrix} \, :\, \lambda \ne 0, 1 \quad \text {and} \quad a\in {\mathbb {R}}\right\} . \end{aligned}$$

Another simple calculation shows that if

$$\begin{aligned} (t_1,t_2,t_3,t_4)=(1,\, 1 - \lambda ^{-1},\, -\lambda ,\, \lambda ^{-2} - \lambda ^{-1} - a\lambda ^{-1}) \end{aligned}$$

then

$$\begin{aligned} S(t_1)\,S(t_2)\,S(t_3)\,S(t_4) = \begin{pmatrix} \lambda ^{-1} &{} 0 \\ a &{} \lambda \end{pmatrix}. \end{aligned}$$

Hence every matrix in \(SL_2({\mathbb {R}})\) is a product of four Schrödinger matrices. \(\square \)
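The two cases of the proof can be turned into an explicit procedure. The following Python sketch works in exact rational arithmetic and is meant as an illustration under stated assumptions: the function names are hypothetical, the input is assumed to lie in \(SL_2({\mathbb {Q}})\), and the explicit parameters of the exceptional case are multiplied in the order \(S(t_1)\,S(t_2)\,S(t_3)\,S(t_4)\), which is the order for which the exact check below succeeds.

```python
from fractions import Fraction

def S(t):
    """Schrödinger matrix S(t) = [[t, -1], [1, 0]]."""
    return ((Fraction(t), Fraction(-1)), (Fraction(1), Fraction(0)))

def mul(X, Y):
    """Product of 2x2 matrices stored as nested tuples."""
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def decompose(B):
    """Return (t0, t1, t2, t3) with B = S(t3) S(t2) S(t1) S(t0), for B of determinant 1."""
    (b00, b01), (b10, b11) = [[Fraction(x) for x in row] for row in B]
    if b01 == 0 and b11 != 1:
        # Exceptional case B = [[1/lam, 0], [a, lam]]: explicit parameters from the proof.
        lam, a = b11, b10
        s = (Fraction(1), 1 - 1 / lam, -lam, lam**-2 - 1 / lam - a / lam)
        return (s[3], s[2], s[1], s[0])      # B = S(s1) S(s2) S(s3) S(s4)
    # Generic case: C = S(0)^{-1} B = [[a, b], [c, d]] is a product S(u3) S(u2) S(u1).
    a, b, c, d = b10, b11, -b00, -b01
    if d != 0:
        u2 = -d
        u1, u3 = (c + 1) / u2, (1 - b) / u2
    else:                                     # here b == 1 and c == -1
        u1, u2, u3 = Fraction(0), Fraction(0), -a
    return (u1, u2, u3, Fraction(0))          # B = S(0) S(u3) S(u2) S(u1)

def recompose(ts):
    """Multiply S(t3) S(t2) S(t1) S(t0) for ts = (t0, t1, t2, t3)."""
    t0, t1, t2, t3 = ts
    return mul(S(t3), mul(S(t2), mul(S(t1), S(t0))))
```

For example, `recompose(decompose(((2, 1), (1, 1))))` returns the input matrix, and the same holds for the lower triangular matrix with rows \((2,0)\) and \((5,1/2)\), which falls in the exceptional case.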

5.1 Construction of the embedding

For each \(i=1,\ldots , \kappa \), by Lemma 5.1, there exists \(t^i = (t^i_0,\ldots , t^i_3)\in {\mathbb {R}}^4\) such that

$$\begin{aligned} A_i = S(t^i_3)\, S(t^i_2)\, S(t^i_1)\, S(t^i_0). \end{aligned}$$
(6)

Consider the set \(\Lambda = \{1,\ldots ,\kappa \}\times \{0, 1, 2, 3\}\). We define the following transition kernel \(K:\Lambda \rightarrow {\mathcal {P}}(\Lambda )\), for each element \((i,j)\in \Lambda \),

$$\begin{aligned} K_{(i,j)} := \left\{ \begin{array}{ll} \delta _{(i,j+1)} &{} \text {if } j\in \{0,1,2\} \\ \sum _{k=1}^{\kappa }\mu _k\delta _{(k,0)} &{} \text { if } j=3, \end{array} \right. \end{aligned}$$

where \(\mu _k = \mu (A_k)\), for any \(k\in \{1,\ldots , \kappa \}\) and \(\delta _{(k,l)}\) denotes the Dirac measure supported in \((k,l)\). Note that the measure

$$\begin{aligned} \nu = \frac{1}{4}\sum _{j=0}^3\sum _{i=1}^{\kappa }\mu _i\delta _{(i,j)} \end{aligned}$$

defines a \(K\)-stationary measure on \(\Lambda \). Let \(\tilde{\nu }\) be the Kolmogorov extension of \((K, \nu )\) on the product space \(\Sigma = \Lambda ^{{\mathbb {Z}}}\). This defines the base dynamics \((\Sigma , \sigma , \tilde{\nu })\), where \(\sigma : \Sigma \rightarrow \Sigma \) is the shift map and \({\text {supp}}\tilde{\nu }\) is the set of K-admissible sequences.
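The stationarity of \(\nu \) is a finite computation. The Python sketch below verifies it in exact arithmetic for hypothetical weights \(\mu _1=1/3\), \(\mu _2=2/3\); states of \(\Lambda \) are encoded as pairs `(i, j)`.

```python
from fractions import Fraction

def kernel(state, mu):
    """Transition probabilities K_{(i,j)} as a dict next_state -> probability."""
    i, j = state
    if j < 3:
        return {(i, j + 1): Fraction(1)}      # deterministic step inside a block
    return {(k, 0): mu_k for k, mu_k in enumerate(mu, start=1)}  # fresh symbol

def push_forward(nu, mu):
    """One step of the chain: (nu K)(s') = sum_s nu(s) K_s(s')."""
    out = {}
    for state, weight in nu.items():
        for nxt, p in kernel(state, mu).items():
            out[nxt] = out.get(nxt, Fraction(0)) + weight * p
    return out

# Hypothetical weights mu_1, mu_2 (illustration only).
MU = [Fraction(1, 3), Fraction(2, 3)]
NU = {(i, j): m / 4 for i, m in enumerate(MU, start=1) for j in range(4)}
```

With these definitions `push_forward(NU, MU) == NU`, which is the identity \(K^*\nu =\nu \) for this example.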

5.2 Conjugating the embedded and original cocycle

Consider the real function \(\phi :\Sigma \rightarrow {\mathbb {R}}\) defined by

$$\begin{aligned} \phi (\zeta ) := t^{i_0}_{j_0}, \quad \text { where } \; \zeta = ((i_n,j_n))_n, \end{aligned}$$

where the numbers \(t^{i_0}_{j_0}\) were defined in (6). We can express the family of Schrödinger cocycles, \({{\textbf {A}}}_E: \Sigma \rightarrow SL_2({\mathbb {R}})\), with potential \(\phi \), generated by the Markov shift \((\Sigma ,\, \sigma ,\, \tilde{\nu })\), by

$$\begin{aligned} {{\textbf {A}}}_E(\zeta ) = S(\phi (\zeta ) - E), \end{aligned}$$

for every \(E\in {\mathbb {R}}\) and \(\zeta \in \Sigma \). It is important to notice that iterating the cocycle \({{\textbf {A}}}_0\) four times we recover the locally constant cocycle \({{\textbf {A}}}: \Omega \rightarrow SL_2({\mathbb {R}})\). More precisely, for each element \(\zeta = ((i_n,j_n))_n\in \Sigma \), with \(j_0=0\), consider the sequence \(\omega = (i_{4n})_n\in \Omega \). By (6) we have that

$$\begin{aligned} {{\textbf {A}}}_0^4(\zeta )&= {{\textbf {A}}}_0(\sigma ^3(\zeta ))\, {{\textbf {A}}}_0(\sigma ^2(\zeta ))\, {{\textbf {A}}}_0(\sigma (\zeta ))\, {{\textbf {A}}}_0(\zeta )\\&= S(t^{i_0}_3)\, S(t^{i_0}_2)\, S(t^{i_0}_1)\, S(t^{i_0}_0)\\&= A_{i_0} = {{\textbf {A}}}(\omega ), \end{aligned}$$

where we used that \(i_1=i_2=i_3=i_0\) for every K-admissible sequence with \(j_0=0\).

In this case, we say that \({{\textbf {A}}}_0:\Sigma \rightarrow SL_2({\mathbb {R}})\) is the embedding of the cocycle \({{\textbf {A}}}:\Omega \rightarrow SL_2({\mathbb {R}})\) into the Schrödinger family \(\{{{\textbf {A}}}_E:\Sigma \rightarrow SL_2({\mathbb {R}})\}_{E\in {\mathbb {R}}}\) over \((\Sigma , \sigma , \tilde{\nu })\).

For each \(j\in \{0,1,2,3\}\), set \(\Sigma _j:= \{(i_n,j_n)_n\in \Sigma ;\ j_0 = j\}\). Note that

$$\begin{aligned} \Sigma = \bigcup _{j=0}^3\Sigma _j, \end{aligned}$$

is a partition of the set \(\Sigma \) and for each \(j\in \{0,1,2,3\}\), \(\sigma (\Sigma _j) = \Sigma _{j+1\!\!\!\mod \!4}\). In particular, for every \(j=0,1,2,3\), \(\Sigma _j\) is \(\sigma ^4\)-invariant. Denote by \(\pi :\Sigma \rightarrow \Omega \) the natural projection mapping \(\Sigma \ni (i_n,j_n)_n\mapsto (i_{4n})_n\in \Omega \).

Using the notation above we see that \((\Omega , \sigma , {\tilde{\mu }})\) is a factor of \((\Sigma , \sigma , \tilde{\nu })\) in the following sense.

Lemma 5.2

The map \(\pi :\Sigma \rightarrow \Omega \) is surjective,   \(\pi _*\tilde{\nu } = {\tilde{\mu }}\)   and   \(\sigma \circ \pi = \pi \circ \sigma \).

Moreover, for each \(j\in \{0,1,2,3\}\), \(\pi |_{\Sigma _j}\) conjugates \((\Sigma _j, \sigma ^4, 4\tilde{\nu })\) to \((\Omega , \sigma , {\tilde{\mu }})\), where \(4\tilde{\nu }\) denotes the normalization of the restriction of \(\tilde{\nu }\) to \(\Sigma _j\).

For the linear cocycle we have:

Lemma 5.3

For every \(j=0,1,2,3\) the linear cocycle

$$\begin{aligned} \Sigma _j\times {\mathbb {R}}^2 \ni (\zeta , v) \mapsto (\sigma ^{4}(\zeta ), {{\textbf {A}}}^4_0(\zeta )\, v)\in \Sigma _j\times {\mathbb {R}}^2 \end{aligned}$$

is conjugated to the linear cocycle

$$\begin{aligned} \Omega \times {\mathbb {R}}^2\ni (\omega , v) \mapsto (\sigma (\omega ), {{\textbf {A}}}(\omega )\, v)\in \Omega \times {\mathbb {R}}^2. \end{aligned}$$

In particular, taking \(j=0\) we have that \(F^4_{{{\textbf {A}}}_0}:\Sigma _0\times {\mathbb {R}}^2\rightarrow \Sigma _0\times {\mathbb {R}}^2\) is conjugated to \(F_{{{\textbf {A}}}}:\Omega \times {\mathbb {R}}^2\rightarrow \Omega \times {\mathbb {R}}^2\). The same considerations hold for the projectivized cocycles.

As a consequence of the previous lemmas we have

Lemma 5.4

\(\displaystyle L(\mu ) = L({{\textbf {A}}}) = 4\,L({{\textbf {A}}}_0). \)

Using the conjugation in Lemma 5.2 we build the one parameter family of cocycles \({{\textbf {A}}}_{(E)}:\Omega \rightarrow SL_2({\mathbb {R}})\),

$$\begin{aligned} {{\textbf {A}}}_{(E)}(\omega ) := {{\textbf {A}}}^4_E(\zeta ), \end{aligned}$$

where \(\zeta = (\pi |_{\Sigma _0})^{-1}(\omega )\). The cocycles of this family are locally constant and determined by the probability measures \(\mu _E\) on \(SL_2({\mathbb {R}})\) defined by

$$\begin{aligned} \mu _E = \sum _{i=1}^{\kappa }\mu _i\delta _{{{\textbf {A}}}_{(E)}(\bar{i})}, \end{aligned}$$

where \(\bar{i}\) is any sequence \(\omega \in \Omega \) such that \(\omega _0=i\), for every \(i=1,\ldots , \kappa \). This family is the smooth curve of measures through \(\mu \) whose existence is claimed in Theorem A. It depends analytically on E in the sense that the function \(E\mapsto \int \varphi \, d\mu _E=\sum _{i=1}^\kappa \mu _i\, \varphi ( {{\textbf {A}}}_{(E)}(\bar{i}))\) is analytic for every analytic function \(\varphi (A)\) on \(SL_2({\mathbb {R}})\). In particular the curve \(E\mapsto \mu _{E}\) is continuous with respect to the weak* topology.

Corollary 5.5

For every \(E\in {\mathbb {R}}\),   \(\displaystyle L(\mu _E) = L({{\textbf {A}}}_{(E)}) = 4\,L({{\textbf {A}}}_E)\).

6 Oscillations of the IDS

Consider the family of Schrödinger cocycles \({{\textbf {A}}}_E:\Sigma \rightarrow SL_2({\mathbb {R}})\) with potential \(\phi :\Sigma \rightarrow {\mathbb {R}}\) over the base dynamics \((\Sigma , \sigma , \tilde{\nu })\), as in Sect. 5. The purpose of this section is to get a lower bound on the oscillations of the finite scale IDS \({\mathcal {N}}_{n,\zeta }\) in terms of counting certain configurations along the orbit of \(\zeta \), referred to as \(\delta \)-matchings.

Let \(\{e_1, e_2\}\) be the canonical basis of \({\mathbb {R}}^2\). Given \(\delta >0\) and \(k\in {\mathbb {N}}\), we say that \(\zeta \in \Sigma \) has a \(\delta \)-matching of size k at E, or a \((\delta ,k,E)\)-matching, if

$$\begin{aligned} {{\textbf {A}}}_E^k(\zeta )\, {{\hat{e}}}_1={{\hat{e}}}_2 \quad \text { and } \quad \tau _k(\zeta , E) := \frac{ \left\Vert {{\textbf {A}}}_E^k(\zeta )\, e_1\right\Vert }{ \underset{0\le j\le k-1}{{\text {max}}}\ \left\Vert {{\textbf {A}}}^j_E(\zeta )\, e_1 \right\Vert } <\delta . \end{aligned}$$

For each \(\zeta \in \Sigma \) and \(k\in {\mathbb {N}}\), we consider the truncated Schrödinger operator \(H^k_{\zeta }:{\mathbb {R}}^k\rightarrow {\mathbb {R}}^k\) defined in Sect. 4.2, which can be described, for \(u\in {\mathbb {R}}^k\), by

$$\begin{aligned} H^k_{\zeta }\, u := \left( (H^k_{\zeta }u)_0,\ldots , (H^k_{\zeta }u)_{k-1} \right) , \end{aligned}$$

where

$$\begin{aligned} (H^k_{\zeta }\, u)_j := \left\{ \begin{array}{ll} -u_1 + \phi (\zeta )\, u_0, &{} \quad \text {if }\quad j = 0 \\ -u_{j+1} - u_{j-1} + \phi (\sigma ^j(\zeta ))\, u_j, &{} \quad \text {if }\quad j\ne 0, k-1 \\ -u_{k-2} + \phi (\sigma ^{k-1}(\zeta ))\, u_{k-1}, &{} \quad \text {if } \quad j= k-1. \end{array} \right. \end{aligned}$$

Let \(E\in {\mathbb {R}}\) and \((v_0,v_{-1})\in {\mathbb {R}}^2\). Define the sequence \((v_j)_{j\in {\mathbb {Z}}}\) by the following equation

$$\begin{aligned} {{\textbf {A}}}_E(\sigma ^j(\zeta ))\, \begin{pmatrix} v_j\\ v_{j-1} \end{pmatrix}= \begin{pmatrix} v_{j+1}\\ v_j \end{pmatrix}, \end{aligned}$$

which is equivalent to saying that for every \(j\in {\mathbb {Z}}\),

$$\begin{aligned} -v_{j-1} - v_{j+1} + \phi (\sigma ^j(\zeta ))\, v_j = E\,v_j . \end{aligned}$$
(7)

Let \({\textbf{e}}_0, \ldots , {\textbf{e}}_{k-1}\) be the canonical basis of \({\mathbb {R}}^{k}\).

Lemma 6.1

Given \((v_j)_{j\in {\mathbb {Z}}}\) solution of (7), the vector \(v^* = (v_0,\ldots , v_{k-1})\in {\mathbb {R}}^k\) satisfies

$$\begin{aligned} H^k_{\zeta }\, v^* - Ev^* = v_{-1}{\textbf{e}}_0 + v_k{\textbf{e}}_{k-1}. \end{aligned}$$

Moreover, \({{\textbf {A}}}^k_E(\zeta )\, {{\hat{e}}}_1 = {{\hat{e}}}_2\)  if and only if  there exists a solution \((v_j)_{j\in {\mathbb {Z}}}\) of (7) such that \(v^*\) is an eigenvector of \(H^k_{\zeta }\) with the eigenvalue E.

Proof

The first statement follows from (7). For the second part observe that \({{\textbf {A}}}^k_E(\zeta ) {{\hat{e}}}_1 = {\hat{e}}_2\) if and only if there is a solution of (7) such that \(v_{-1} = v_k = 0\). \(\square \)
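The identity in Lemma 6.1 can be checked on a small example. The Python sketch below uses exact rational arithmetic; the list `phi` stands for \((\phi (\sigma ^j(\zeta )))_{0\le j<k}\) and all numerical values in the usage note are hypothetical.

```python
from fractions import Fraction

def solve_recurrence(phi, E, v0, v_minus1):
    """Return {j: v_j} for j = -1, ..., k solving (7) at the sites 0, ..., k-1."""
    v = {-1: v_minus1, 0: v0}
    for j in range(len(phi)):
        v[j + 1] = (phi[j] - E) * v[j] - v[j - 1]
    return v

def truncated_apply(phi, u):
    """Apply the truncated operator H^k to u in R^k, with k = len(phi)."""
    k = len(phi)
    out = []
    for j in range(k):
        left = u[j - 1] if j > 0 else 0        # no neighbour to the left of site 0
        right = u[j + 1] if j < k - 1 else 0   # no neighbour to the right of site k-1
        out.append(-right - left + phi[j] * u[j])
    return out

def defect(phi, E, v):
    """Coordinates of H^k v* - E v* for v* = (v_0, ..., v_{k-1})."""
    v_star = [v[j] for j in range(len(phi))]
    return [h - E * x for h, x in zip(truncated_apply(phi, v_star), v_star)]
```

For `phi = [Fraction(1, 2), Fraction(-1), Fraction(2), Fraction(1, 3)]`, `E = Fraction(1, 4)` and initial data \((v_0,v_{-1})=(1,3)\), the defect equals `[v[-1], 0, 0, v[4]]`, in agreement with the lemma. In particular it vanishes exactly when \(v_{-1}=v_k=0\).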

Consider a large integer \(N=m\, (k+2)\) and split the interval \([0,N-1]\) into m disjoint slots of length k, namely \(S_j:=[j\,(k+2), j\, (k+2)+k-1]\) for \(j=0,1,\ldots , m-1\). The integers \(j\,(k+2)-1\) and \(j\, (k+2)+k\), adjacent to the two ends of \(S_j\), are referred to as the boundary points of the slot \(S_j\). Notice that \(\cup _{j=0}^{m-1} S_j\) has \(m\, k\) elements, which exclude the boundary points of the slots. We say that \(\zeta \in \Sigma \) has a \((\delta , k,E)\)-matching in the slot \(S_j\) if \(\sigma ^{j\, (k+2)}(\zeta )\) has a \((\delta , k,E)\)-matching. The next lemma says that when the sequence \(\zeta \) has a \((\delta , k,E)\)-matching in the slot \(S_j\) we can construct a \(\delta \)-almost eigenvector for \(H^{N}_\zeta \) which is supported in that slot \(S_j\). Moreover, because consecutive slots share no boundary points, if \(\zeta \) admits several \((\delta , k, E)\)-matchings in different slots then the corresponding \(\delta \)-almost eigenvectors are pairwise orthogonal.

Lemma 6.2

Given \(\zeta \in \Sigma \) and \(j_0\in {\mathbb {N}}\) such that the sequence \(\sigma ^{j_0(k+2)}(\zeta )\) has a \((2^{-1/2} \delta ,k,E)\)-matching consider the vector \(v^*=(v_0,\ldots , v_{k-1})\in {\mathbb {R}}^k\) with components determined by

$$\begin{aligned} \begin{pmatrix}v_j \\ v_{j-1} \end{pmatrix} = {{\textbf {A}}}_E^j(\sigma ^{j_0(k+2)} \zeta )\, \begin{pmatrix}v_0 \\ 0 \end{pmatrix} \end{aligned}$$

where \(v_0\) is fixed so that \({\text {max}}_{0\le j\le k-1} |v_j|=1\). Then the vector \(v_{j_0,k}(\zeta )\in {\mathbb {R}}^{N}\), with all coordinates zero except those in the slot \(S_{j_0}\) which coincide with the respective coordinates of \(v^*\), satisfies

$$\begin{aligned} \left\Vert H^{N}_{\zeta }\, v_{j_0,k}(\zeta ) - E\, v_{j_0, k}(\zeta ) \right\Vert < \delta . \end{aligned}$$

In other words, \(v_{j_0,k}(\zeta )\) is a \(\delta \)-almost eigenvector of \(H^{N}_{\zeta }\) in the sense of Lemma 4.2.

Proof

For the sake of simplicity let \(j_0=0\) so that \(\zeta =\sigma ^{j_0 (k+2)}(\zeta )\) is the sequence with a \((\delta /\sqrt{2},k,E)\)-matching. By definition of \(v_{j_0, k}(\zeta )\) and Lemma 6.1 we have that

$$\begin{aligned} (H^{N}_{\zeta } -E)\, v_{j_0, k}(\zeta ) = -v_{k-1}{\textbf{e}}_k. \end{aligned}$$

Therefore,

$$\begin{aligned} \left\Vert H^{N}_{\zeta }\, v_{j_0,k}(\zeta ) - E\, v_{j_0, k}(\zeta ) \right\Vert&\le |v_{k-1}| \le |v_0|\, \left\Vert {{\textbf {A}}}_E^k(\zeta )\, e_1\right\Vert \le \sqrt{2}\, \tau _k(\zeta ,E) < \delta \end{aligned}$$

because \(v_{k-1}\) is one of the components of \(v_0\,{{\textbf {A}}}^k_E\, e_1 \) and

$$\begin{aligned} \frac{v_0}{\sqrt{2}}\, \underset{0\le j\le k-1}{{\text {max}}} \left\Vert {{\textbf {A}}}^j_E(\zeta )\, e_1\right\Vert \le v_0\, \underset{0\le j\le k-1}{{\text {max}}}\left| \langle {{\textbf {A}}}^j_E(\zeta )\, e_1, e_1 \rangle \right| =\underset{0\le j\le k-1}{{\text {max}}}\ |v_j| = 1. \end{aligned}$$

\(\square \)

From the point of view of Mathematical Physics, a \(\delta \)-matching determines a \(\delta \)-almost eigenvector of the Schrödinger operator.

Dynamically, these configurations correspond to stable-unstable matchings in the following sense: let \(k=k_1+k_2\) be some decomposition of k such that both factors in the product \({{\textbf {A}}}^{k}_E(\zeta )={{\textbf {A}}}^{k_2}_E(\sigma ^{k_1}(\zeta ))\, {{\textbf {A}}}^{k_1}_E(\zeta )\) are very hyperbolic, with nearly horizontal unstable direction and almost vertical stable one. If \(k_1, k_2\) are large then \({{\textbf {A}}}^{k_1}_E(\zeta ) {{\hat{e}}}_1\) is a good approximation of the Oseledets unstable direction \(E^u(\sigma ^{k_1}\zeta )\) at the point \(\sigma ^{k_1} (\zeta )\), while \({{\textbf {A}}}^{-k_2}_E(\zeta ) {{\hat{e}}}_2\) is a good approximation of the stable direction \(E^s(\sigma ^{k_1}(\zeta ))\) at the same point. The condition \({{\textbf {A}}}^k_E(\zeta )\, {{\hat{e}}}_1={{\hat{e}}}_2\) is equivalent to the matching \({{\textbf {A}}}^{k_1}_E(\zeta ) {{\hat{e}}}_1 = {{\textbf {A}}}^{-k_2}_E(\zeta ) {{\hat{e}}}_2\) between these two approximate stable and unstable directions at the middle point. This nearly stable-unstable matching also explains why \(\tau _k(\zeta ,E)\) should be very small.

The oscillations of the non-decreasing function \({\mathcal {N}}\) and of its finite scale analogue \({\mathcal {N}}_{N,\zeta }\) on an interval \(I=[\alpha ,\beta ]\) are denoted by

$$\begin{aligned} \Delta _I {\mathcal {N}}:= {\mathcal {N}}(\beta )-{\mathcal {N}}(\alpha ), \; \text { resp. }\; \Delta _I {\mathcal {N}}_{N,\zeta }:= {\mathcal {N}}_{N,\zeta }(\beta )-{\mathcal {N}}_{N,\zeta }(\alpha ). \end{aligned}$$

Denote by \(\Sigma (\delta , k, I)\) the subset of \(\Sigma \) formed by \((\delta , k, E)\)-matching sequences \(\zeta \in \Sigma \) with \(E\in I\).

Lemma 6.3

For any interval \(I\subset {\mathbb {R}}\) and \(\zeta \in \Sigma \),

$$\begin{aligned} \Delta _{I_\delta }{\mathcal {N}}_{N,\zeta } \ge \frac{1}{N}\displaystyle \sum _{j=0}^{m-1}\chi _{\Sigma (\delta ,k,I)}(\sigma ^{j(k+2)} \zeta ) \end{aligned}$$

where \(I_\delta :=I+[-\delta ,\delta ]\) is the \(\delta \)-neighborhood of I.

Proof

Let \(m\in {\mathbb {N}}\) and set

$$\begin{aligned} {\mathcal {Z}}_{m,k}(\zeta ) := \left\{ 0\le j \le m-1\, :\, \sigma ^{j(k+2)} \zeta \in \Sigma (\delta , k,I) \right\} . \end{aligned}$$

The set of vectors \(\displaystyle \{ v_{j,k}(\zeta ) \in {\mathbb {R}}^{N}\, :\ j\in {\mathcal {Z}}_{m,k}(\zeta )\}\) is orthonormal and by Lemma 6.2 these are \(\delta \)-almost eigenvectors. By Lemma 4.2, \(H^{N}_{\zeta }\) has at least as many eigenvalues in \(I_\delta \) (counted with multiplicity) as there are vectors in this set. Whence,

$$\begin{aligned} \displaystyle \sum _{j=0}^{m-1}\chi _{\Sigma (\delta ,k,I)}(\sigma ^{j\,(k+2)} \zeta ) = \left| {\mathcal {Z}}_{m,k}(\zeta ) \right| \, \le \, |{\text {Spec}}(H^{N}_{\zeta })\cap I_\delta | = N\, \Delta _{I_\delta } {\mathcal {N}}_{N,\zeta } . \end{aligned}$$

\(\square \)

Applying Birkhoff’s ergodic theorem and letting \(m\rightarrow \infty \) in the previous lemma, we obtain the following corollary.

Corollary 6.4

For any interval \(I\subseteq {\mathbb {R}}\),

$$\begin{aligned} \Delta _{I_\delta } {\mathcal {N}}\ge \frac{1}{k+2}\tilde{\nu }\left( \Sigma (\delta ,k,I) \right) . \end{aligned}$$

7 Variation with Respect to the Energy

This is the main technical section of the work.

7.1 Trace property

The main purpose of this subsection is to prove that if \(R_0={{\textbf {A}}}^{4 n_0}_0(\zeta _0)\) is elliptic then as we move the parameter E the rotation angle of \(R_E\) varies with non-zero speed around \(E=0\). This will be a consequence of the following proposition, which is a general fact about Schrödinger matrices. Recall that

$$\begin{aligned} S(t) = \left( \begin{array}{cc} t &{} -1 \\ 1 &{} 0 \end{array} \right) \end{aligned}$$

denotes a Schrödinger type matrix. For a vector \(x = (x_1,\ldots , x_n)\in {\mathbb {R}}^n\), write

$$\begin{aligned} S^n(x-E) := S(x_n-E)\,\ldots \, S(x_1-E). \end{aligned}$$

Lemma 7.1

If \(E\in {\mathbb {C}}\setminus {\mathbb {R}}\) then the matrix \(S^n(x-E)\) is hyperbolic.

Proof

See [25, Lemma 2.4]. For the sake of completeness we provide a proof of this fact. By induction the entries in the main diagonal of \(S^n(x)\) are polynomials in the variables \(x_1,\ldots , x_n\) of degrees n and \(n-2\), respectively, whose monomials have degrees with same parity as n, while the entries on the second diagonal are polynomials of degrees \(n-1\), whose monomials have degrees with same parity as \(n-1\). It follows that for all \(x\in {\mathbb {R}}^n\),

$$\begin{aligned} {\text {tr}}(S^n(-x))=(-1)^n\, {\text {tr}}(S^n(x)). \end{aligned}$$

In particular \({\text {tr}}(S^n(x-E))=\pm \, {\text {tr}}(S^n(E-x))\) and we only need to consider the case \(\textrm{Im}E<0\). In this case the open set \(U:=\{z\in {\mathbb {C}}:\textrm{Im}z>0\}\) determines an open cone in \({\mathbb {P}}({\mathbb {C}}^2)\equiv {\mathbb {C}}\cup \{\infty \}\). The projective action of the matrices \(S^n(x-E)\) with \(x\in {\mathbb {R}}^n\) and \(\textrm{Im}E<0\) sends \({\overline{U}}\) inside of U. In fact, if \(n=1\) and \(z\in {\overline{U}}\backslash \{0\}\) (possibly \(z=\infty \)) then \(-z^{-1}\in {\overline{U}}\) and since \(\textrm{Im}E<0\),

$$\begin{aligned} \textrm{Im}(S(x_1 - E)\cdot z) = \textrm{Im}\left( x_1 - E -\frac{1}{z} \right) \ge -\textrm{Im}(E)>0, \end{aligned}$$

for every \(x_1\in {\mathbb {R}}\). Otherwise if \(z=0\), then \(S(x_1 - E)\cdot z = \infty \) and the statement follows iterating once again. The existence of this invariant cone implies that \(S^n(x-E)\) is hyperbolic. Similarly, if \(\textrm{Im}E>0\) we consider the open set \(U^-:=\{z\in {\mathbb {C}}:\textrm{Im}z<0\}\) and prove that, under the projective action, \(S^n(x-E)\) sends \(\overline{U^-}\) inside of \(U^-\). \(\square \)
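As a numerical sanity check of Lemma 7.1, the following minimal sketch multiplies out \(S^n(x-E)\) for a few non-real energies and reports the moduli of its two eigenvalues. The potential values in `x` and the sample energies are hypothetical choices made for illustration only; the function names are not taken from the paper.

```python
import cmath


def schrodinger_product(x, E):
    """Return S^n(x - E) = S(x_n - E) ... S(x_1 - E) as a 2x2 nested list."""
    m = [[1, 0], [0, 1]]
    for xj in x:
        t = xj - E
        # Left-multiply by S(t) = [[t, -1], [1, 0]].
        m = [[t * m[0][0] - m[1][0], t * m[0][1] - m[1][1]],
             [m[0][0], m[0][1]]]
    return m


def eigenvalue_moduli(m):
    """Moduli (largest first) of the eigenvalues of a 2x2 matrix with determinant 1."""
    tr = m[0][0] + m[1][1]
    disc = cmath.sqrt(tr * tr - 4)
    moduli = [abs((tr + disc) / 2), abs((tr - disc) / 2)]
    return max(moduli), min(moduli)


x = [0.3, -1.1, 0.7, 1.9]                      # hypothetical real potential values
energies = (0.2 + 0.5j, -1.0 - 0.1j, 3.0 + 2.0j)  # hypothetical non-real energies
for E in energies:
    big, small = eigenvalue_moduli(schrodinger_product(x, E))
    print(E, big, small)
```

For each non-real energy one modulus should exceed 1 and the other should be below 1, which is the hyperbolicity asserted by the lemma.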

Proposition 7.2

For any \(n\in {\mathbb {N}}\), if \(|{\text {tr}}(S^n(x))|<2\), then

$$\begin{aligned} \frac{d}{dE}{\text {tr}}\left( S^n(x - E) \right) \biggr |_{E = 0}\ne 0. \end{aligned}$$

Proof

Define the analytic function \(\psi :{\mathbb {C}}\rightarrow {\mathbb {C}}\) given by

$$\begin{aligned} \psi (E) := {\text {tr}}\left( S^n(x - E) \right) . \end{aligned}$$

Observe that \(\psi \) is real in the sense that \(\psi (E)\in {\mathbb {R}}\) for every \(E\in {\mathbb {R}}\). By assumption \(|\psi (0)| < 2\) and so there exists a radius \(r_0>0\) such that for every E in the disk centered in 0 and radius \(r_0\), \({\mathbb {D}}_{r_0}(0)\), we have that \(|\psi (E)|< 2\).

Assume by contradiction that \(\psi '(0) = 0\). By analyticity of \(\psi \), we can write

$$\begin{aligned} \psi (E) = \psi (0) + E^k\Psi (E), \end{aligned}$$

in a neighborhood of 0, where \(k\ge 2\) and \(\Psi (0)\ne 0\). In particular, there exists \(E^*\in {\mathbb {D}}_{r_0}(0)\backslash {\mathbb {R}}\) such that \(\psi (E^*)\in {\mathbb {R}}\). As a consequence, we conclude that if \(\lambda \) and \(1/\lambda \) are the eigenvalues of \(S^n(x-E^*)\), then \(\lambda + \lambda ^{-1}\in {\mathbb {R}}\). But, that can only happen if either \(|\lambda | = 1\) or else \(|\lambda |\ne 1\) and \(\lambda \) itself is real. The former can not happen since by Lemma 7.1 the matrix \(S^n(x - E^*)\) is hyperbolic.

The latter implies that

$$\begin{aligned} \left| {\text {tr}}\left( S^n(x - E^*) \right) \right| = \left| \lambda + \frac{1}{\lambda } \right| > 2 . \end{aligned}$$

This contradicts the fact that \(|\psi (E^*)|<2\) and proves the result. \(\square \)
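Proposition 7.2 can be illustrated on the smallest non-trivial case. For \(n=2\) one has \({\text {tr}}(S^2(x-E)) = (x_1-E)(x_2-E)-2\), so with the hypothetical choice \(x=(1,1)\) the trace at \(E=0\) equals \(-1\) (elliptic) and its derivative equals \(-2\). The sketch below recovers this with a central finite difference; the step size `h` is an assumption of the sketch.

```python
def trace_schrodinger(x, E):
    """Trace of S^n(x - E) = S(x_n - E) ... S(x_1 - E), with S(t) = [[t, -1], [1, 0]]."""
    a, b, c, d = 1.0, 0.0, 0.0, 1.0
    for xj in x:
        t = xj - E
        # Left-multiply the running product [[a, b], [c, d]] by S(t).
        a, b, c, d = t * a - c, t * b - d, a, b
    return a + d


def trace_derivative(x, E=0.0, h=1e-5):
    """Central finite-difference approximation of d/dE tr(S^n(x - E))."""
    return (trace_schrodinger(x, E + h) - trace_schrodinger(x, E - h)) / (2 * h)


x = [1.0, 1.0]                      # hypothetical potential with |tr S^2(x)| < 2
print(trace_schrodinger(x, 0.0))    # elliptic: the trace is -1
print(trace_derivative(x))          # close to -2, in particular non-zero
```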

Lemma 7.3

If \({{\textbf {A}}}_0^m(\zeta _0)\) is an elliptic element for some \(\zeta _0\in \Sigma \), then

$$\begin{aligned} \frac{d}{dE}{\text {tr}}(A^m_E(\zeta _0))\biggr |_{E=0}\ne 0. \end{aligned}$$

Proof

Direct consequence of Proposition 7.2. \(\square \)

Proposition 7.4

Given a Schrödinger cocycle \(A_E: X\rightarrow SL_2({\mathbb {R}})\) with continuous potential \(\phi :X\rightarrow {\mathbb {R}}\) and generated by the dynamical system \((X, T, \xi )\), for all \(n\in {\mathbb {N}}\), \(\rho \in (-2,\, 2)\) and \(x\in X\), the polynomial \(f_{\rho }:{\mathbb {R}}\rightarrow {\mathbb {R}}\), \(f_{\rho }(E):={\text {tr}}( A_E^n(x)) - \rho \), has n distinct real roots.

Proof

This polynomial cannot have a real root \(E_0\) with multiplicity \(\ge 2\) because this would imply that \(f_{\rho }(E_0)=f'_{\rho }(E_0)=0\) with \(|{\text {tr}}(A_{E_0}^n(x))|=|\rho |<2\), contradicting Proposition 7.2. To see that it cannot have complex non-real roots, assume that there exists \(E_0\in {\mathbb {C}}\setminus {\mathbb {R}}\) such that \(f_{\rho }(E_0)=0\). By Lemma 7.1, the matrix \(S^n(x-E_0)\) is hyperbolic. Denoting by \(\lambda \) and \(\lambda ^{-1}\) the eigenvalues of \(S^n(x-E_0)\) we have

$$\begin{aligned} \rho = {\text {tr}}(S^n(x-E_0)) = \lambda + \lambda ^{-1} \end{aligned}$$

which implies that \(|\lambda |= 1\). Therefore the matrix \(S^n(x-E_0)\) can not be hyperbolic. This contradiction proves that \(f_{\rho }(E)\) can not have complex non-real roots. \(\square \)
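The root count in Proposition 7.4 can be checked numerically by counting sign changes of \(f_\rho \) on a fine grid. This is only a sketch: the potential values, the scanning window (all roots are assumed to lie within distance 3 of the range of `x`) and the grid resolution are hypothetical choices, and a grid scan could in principle miss two roots that are closer than one grid step.

```python
def trace_schrodinger(x, E):
    """Trace of S^n(x - E) = S(x_n - E) ... S(x_1 - E), with S(t) = [[t, -1], [1, 0]]."""
    a, b, c, d = 1.0, 0.0, 0.0, 1.0
    for xj in x:
        t = xj - E
        a, b, c, d = t * a - c, t * b - d, a, b
    return a + d


def count_real_roots(x, rho, steps=20000):
    """Count sign changes of E -> tr(S^n(x - E)) - rho on a uniform grid."""
    lo, hi = min(x) - 3.0, max(x) + 3.0   # assumed to contain every root
    prev = trace_schrodinger(x, lo) - rho
    changes = 0
    for i in range(1, steps + 1):
        cur = trace_schrodinger(x, lo + (hi - lo) * i / steps) - rho
        if cur == 0.0:
            continue                      # skip exact zeros; the next sample decides
        if prev * cur < 0:
            changes += 1
        prev = cur
    return changes


x = [0.3, -1.1, 0.7, 1.9]                 # hypothetical potential, n = 4
for rho in (-1.0, 0.0, 0.5):
    print(rho, count_real_roots(x, rho))
```

For each \(\rho \in (-2,2)\) the count should equal \(n\), the length of `x`.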

Corollary 7.5

In the previous context, \(f:{\mathbb {R}}\rightarrow {\mathbb {R}}\), \(f(E):={\text {tr}}( A_E^n(x))\), is a Morse function, with \(f(E)\ge 2\) at local maxima, and \(f(E)\le -2\) at local minima.

Proof

Since f has n different real roots, \(f'\) has \(n-1\) different real roots and for any pair \(a < b\) of roots of f, there exists a unique \(c\in (a,b)\) root of \(f'\). Moreover, by Proposition 7.4, if \(f''(c)<0\), then \(f(c)\ge 2\) and similarly, \(f''(c)>0\) implies \(f(c)\le -2\). \(\square \)
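Corollary 7.5 says that the graph of the trace leaves the strip \((-2,2)\) at every turning point. The sketch below locates the discrete local extrema of \(f\) on a grid and returns their values; the potential, the window and the grid size are hypothetical, and sampled extrema only approximate the true critical values (from below at maxima, from above at minima).

```python
def trace_schrodinger(x, E):
    """Trace of S^n(x - E) = S(x_n - E) ... S(x_1 - E), with S(t) = [[t, -1], [1, 0]]."""
    a, b, c, d = 1.0, 0.0, 0.0, 1.0
    for xj in x:
        t = xj - E
        a, b, c, d = t * a - c, t * b - d, a, b
    return a + d


def sampled_extrema(x, steps=20000):
    """Return (maxima, minima): values of f at discrete local extrema on a grid."""
    lo, hi = min(x) - 3.0, max(x) + 3.0   # hypothetical scanning window
    f = [trace_schrodinger(x, lo + (hi - lo) * i / steps) for i in range(steps + 1)]
    maxima, minima = [], []
    for i in range(1, steps):
        if f[i - 1] < f[i] >= f[i + 1]:
            maxima.append(f[i])
        elif f[i - 1] > f[i] <= f[i + 1]:
            minima.append(f[i])
    return maxima, minima


maxima, minima = sampled_extrema([0.3, -1.1, 0.7, 1.9])
print("local maxima:", maxima)
print("local minima:", minima)
```

Up to the sampling error, the maxima should be at least 2 and the minima at most \(-2\), and there should be \(n-1\) extrema in total.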

7.2 Density of tangencies

In this subsection we prove that cocycles with heteroclinic tangencies are dense outside the class of uniformly hyperbolic cocycles.

Let \({{\textbf {A}}}^{4\ell _0}_0(\zeta _0)\) be an elliptic matrix and \(\delta _0>0\) be such that \(R_E = {{\textbf {A}}}^{4\ell _0}_E(\zeta _0)\) is elliptic for every \(|E|\le \delta _0\).

Lemma 7.6

There exists \(c>0\) such that for every \(m\ge 1\), every \(E\in [- \delta _0,\delta _0]\) and every \({\hat{v}}\in {\mathbb {P}}^1\) we have

$$\begin{aligned} \left| \frac{d}{dE}R^m_E\, {\hat{v}} \right| \ge m\, c. \end{aligned}$$

Proof

Take \(E_0\in [-\delta _0,\delta _0]\) and an inner product in \({\mathbb {R}}^2\) for which \(R_{E_0}\) is a rotation. Then by Proposition B.1 and Lemma B.3

$$\begin{aligned} \frac{1}{m}\, \left| \frac{d}{dE} R_E^m\, {\hat{v}}\biggr |_{E=E_0} \right|&= \frac{1}{m}\, \sum _{j=1}^{m} R_{E_0} \, {\textbf{v}}_{j-1} \wedge \dot{R}_{E_0}\, {\textbf{v}}_{j-1} \end{aligned}$$

is bounded away from 0. Notice that by compactness of \([-\delta _0,\delta _0]\), all norms associated with inner products that turn the matrices \(R_E\) into rotations are uniformly equivalent. \(\square \)
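The linear growth in Lemma 7.6 can be observed on a toy elliptic block. The sketch below takes the hypothetical matrix \(R_E = S(1-E)\,S(1-E)\), which has trace \(-1\) at \(E=0\), and estimates the angular speed of \(E\mapsto R_E^m v\) at \(E=0\) through the wedge formula \((w\wedge \dot{w})/\Vert w\Vert ^2\), with \(\dot{w}\) computed by finite differences. The speed is measured in the Euclidean metric rather than in an adapted inner product, so it is only comparable to the quantity in the lemma up to a bounded factor.

```python
def apply_power(x, E, m, v):
    """Return R_E^m v, where R_E = S(x_n - E) ... S(x_1 - E) and S(t)(a, b) = (t*a - b, a)."""
    a, b = v
    for _ in range(m):
        for xj in x:
            a, b = (xj - E) * a - b, a
    return a, b


def angular_speed(x, E, m, v, h=1e-6):
    """Finite-difference estimate of d/dE of the angle of R_E^m v (Euclidean metric)."""
    w = apply_power(x, E, m, v)
    wp = apply_power(x, E + h, m, v)
    wm = apply_power(x, E - h, m, v)
    dw = ((wp[0] - wm[0]) / (2 * h), (wp[1] - wm[1]) / (2 * h))
    return (w[0] * dw[1] - w[1] * dw[0]) / (w[0] ** 2 + w[1] ** 2)


x = [1.0, 1.0]            # hypothetical block: R_0 = S(1) S(1) is elliptic
v = (1.0, 0.0)            # hypothetical starting direction
for m in (1, 5, 10, 30):
    print(m, angular_speed(x, 0.0, m, v) / m)
```

The printed ratios (speed divided by \(m\)) should stay bounded away from zero as \(m\) grows, in line with the bound \(m\,c\).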

Proposition 7.7

Given \(\zeta \in \Sigma \), \(E_0\in [-\delta _0,\delta _0]\) and \(\ell \in {\mathbb {N}}\), if \({{\textbf {A}}}_{E_0}^{\ell }(\zeta )\) is parabolic then there exists E arbitrarily close to \(E_0\) such that \({{\textbf {A}}}_{E}^{\ell }(\zeta )\) is irrational elliptic.

Proof

Consider the function \(f:{\mathbb {R}}\rightarrow {\mathbb {R}}\), \(f(E):={\text {tr}}(A_E^\ell (\zeta ))\). Assume \({{\textbf {A}}}_{E_0}^{\ell }(\zeta )\) parabolic, i.e., \(f(E_0)=\pm 2\). When \(f'(E_0)\ne 0\), all matrices \(A_E^{\ell }(\zeta )\) are elliptic in a 1-sided neighborhood of \(E_0\). On the other hand, if \(f'(E_0)=0\) by Corollary 7.5 all matrices \(A_E^\ell (\zeta )\) are elliptic in a 2-sided neighborhood of \(E_0\). \(\square \)

Fig. 2 Creation of tangencies

Proposition 7.8

Assume \(L(\mu _{E_0})>0\) and that \({{\textbf {A}}}^{4l}_{E_0}\) is not uniformly hyperbolic. Then there exist E arbitrarily close to \(E_0\) at which \(\mu _{E}\) admits heteroclinic tangencies.

Proof

By [13, Theorem 4.1], either \(\mu _{E_0}\) has a heteroclinic tangency, or else the semigroup generated by \({\text {supp}}\mu _{E_0}\) contains a parabolic or an elliptic matrix. Since \(L(\mu _{E_0})>0\), \({\text {supp}}(\mu _{E_0})\) admits hyperbolic matrices \(A_{E_0}\) and \(B_{E_0}\). By Proposition 7.7 we can assume that \(C_{E_0}:={{\textbf {A}}}_{E_0}^{4 \ell }(\zeta )\) is an irrational elliptic rotation, which implies that the distance \(d(C_{E_0}^m\, u(B_{E_0}), s(A_{E_0}))\) gets arbitrarily small for some large m. On the other hand, the curves \(E\mapsto u(B_E), s(A_E)\) are smooth, see Proposition B.4, while by Lemma 7.6 the projective curve \(E\mapsto C^m_E\, u(B_E)\) has large speed when m is large. Hence the equation \(C_E^m\, u(B_E)=s(A_E)\) has infinitely many solutions with E arbitrarily close to \(E_0\). See Figure 2. \(\square \)

7.3 Projective random walk distribution

In this subsection we establish some estimates on the distribution of the projective random walk, needed to prove Proposition 7.11.

Proposition 7.9

Assume that \(L(\mu ) > 0\) and \(\mu \) is irreducible. There exist \(C>0\) and \(t\in (0,1)\) such that

$$\begin{aligned} \sup _{{\hat{y}}\in {\mathbb {P}}^1}\displaystyle \int _{{\mathbb {P}}^1}\frac{1}{d({\hat{x}},\, {\hat{y}})^t}\, d\, \eta ({\hat{x}}) \le C. \end{aligned}$$

In particular, \(\eta \) is t-Hölder, i.e., for every \({\hat{x}}\in {\mathbb {P}}^1\) and \(r>0\)

$$\begin{aligned} \eta \left( B({\hat{x}},\, r) \right) \le Cr^t. \end{aligned}$$

Proof

Since \(\mu \) is an irreducible measure on \(SL_2({\mathbb {R}})\) and \(L(\mu )>0\) we have that \(\mu \) is strongly irreducible (see [26, Theorem 6.1]). Thus the proof follows from [27] or [28, Theorem 13.1]. \(\square \)

The first item of the next proposition corresponds to (13.8) of Proposition 13.3 in [28].

Proposition 7.10

Assume that \(L(\mu _{E_0}) > 0\) and \(\mu _{E_0}\) is irreducible. Given \(\beta >0\), there exist constants \(C,\, c_1,\, c_2 >0\) and \(k_0\in {\mathbb {N}}\) such that for every \(\ell , n\in {\mathbb {N}}\), \(n\ge k_0\, \ell \) and directions \({\hat{v}}, {\hat{w}}\in {\mathbb {P}}^1\), the sets

  1. (1)

    \(\left\{ \omega \in \Omega :\, \exists \,E,\, |E - E_0|\le e^{-c_1n},\, {{\textbf {A}}}^{ n}_{(E)}(\omega )\, {\hat{v}}\in B({\hat{w}},\, e^{-\beta \ell }) \right\} \);

  2. (2)

    \(\left\{ \omega \in \Omega :\, \exists \, E,\, |E- E_0|\le e^{-c_1n},\, {{\textbf {A}}}^{- n}_{(E)}(\omega )\, {\hat{w}}\in B({\hat{v}},\, e^{-\beta \ell }) \right\} \);

  3. (3)

    \(\left\{ (\omega ,\tilde{\omega })\in \Omega ^2:\, \exists \,E,\, |E-E_0|\le e^{-c_1n},\, d({{\textbf {A}}}^n_{(E)}(\omega )\, {\hat{v}},\, {{\textbf {A}}}^{-n}_{(E)}(\tilde{\omega })\, {\hat{w}}) \le e^{-\beta \ell } \right\} \).

have probability \(\le Ce^{-c_2\ell }\).

Proof

By Lemma B.2, there exist constants \(C^*,\, c_1^*,\, c_2^* > 0\) such that

$$\begin{aligned} d\left( {{\textbf {A}}}^n_{(E_0)}(\omega )\, {\hat{v}}^+,\, {{\textbf {A}}}^n_{(E)}(\omega )\, {\hat{v}}^+ \right) \le C^*e^{-c_1^*n}, \end{aligned}$$
(8)

for every E with \(|E - E_0|\le e^{-c_2^*n}\). By Proposition 3.4 and Proposition 7.9, we have that

$$\begin{aligned}&{\tilde{\mu }}\left( \left[ \exists \, E,\, |E-E_0| \le e^{-c_1n}, {{\textbf {A}}}^n_{(E)}(\cdot )\, {\hat{v}}\in B({\hat{w}},\, e^{-\beta \ell })\, \right] \right) \\&\qquad \qquad \qquad \qquad \qquad \lesssim \eta ^+\left( B({\hat{w}},\, e^{-\beta \ell }) \right) + C\frac{e^{-cn}}{e^{-\beta \theta \ell }}\\&\qquad \qquad \qquad \qquad \qquad \lesssim e^{-t\beta \ell } + C\frac{e^{-cn}}{e^{-\beta \theta \ell }}, \end{aligned}$$

where \(n\ge k_0\, \ell \) with \(k_0> {\beta (t+\theta )}/{c}\) and \(c_2:=t\beta \). The argument to estimate the probability in 2) is entirely analogous, making use of \(\eta ^-\) instead of \(\eta ^+\).

We now study the probability of the set in 3). We extend the Markov operators \(Q_\pm \) to the product space \({\mathbb {P}}^1\times {\mathbb {P}}^1\) defining a new operator \({{\textbf {Q}}}:C^{\theta }({\mathbb {P}}^1\times {\mathbb {P}}^1)\rightarrow C^{\theta }({\mathbb {P}}^1\times {\mathbb {P}}^1)\) by

$$\begin{aligned} ({{\textbf {Q}}}\varphi )({\hat{x}},\, {\hat{y}}) := \sum _{i,j = 1}^{\kappa }\mu _i\, \mu _j\, \varphi (A_{i,E_0}\, {\hat{x}}, A_{j,E_0}^{-1}\, {\hat{y}}) . \end{aligned}$$

This operator is also quasi-compact and there exist constants \(c, C>0\) such that for every observable \(\varphi \in C^{\theta }({\mathbb {P}}^1\times {\mathbb {P}}^1)\) we have

$$\begin{aligned} v_{\theta }({{\textbf {Q}}}^n\varphi ) \lesssim e^{-cn}v_{\theta }(\varphi ). \end{aligned}$$

For each \(r>0\), let \(\Delta _r:= \{({{\hat{x}}}, {{\hat{y}}})\in {\mathbb {P}}^1\times {\mathbb {P}}^1 :d({{\hat{x}}},{{\hat{y}}})\le r\}\) and \(\rho _r:[0,+\infty [\rightarrow [0,1]\) be a piece-wise linear function supported in [0, 3r] such that \(\rho _r(t)= 1\) for \(t\in [0,2r]\). Define the \(\theta \)-Hölder observable \(\psi _{r}({{\hat{x}}}, {{\hat{y}}}):= \rho _r(d({{\hat{x}}}, {{\hat{y}}}))\), with \(v_{\theta }(\psi _r) = (2r)^{-\theta }\) and \(\chi _{\Delta _r}\le \psi _r\). Writing \(r = 2e^{-c_1n} + e^{-\beta \ell }\), we can use Markov’s inequality and Proposition 7.9 to conclude that

$$\begin{aligned}&{\tilde{\mu }}\times {\tilde{\mu }}\left( \left[ \exists \,E, |E-E_0|\le e^{-c_1n}, d\left( {{\textbf {A}}}^n_{(E)}(\cdot )\, {\hat{v}}^+, {{\textbf {A}}}^{-n}_{(E)}(\tilde{\cdot })\, {\hat{v}}^- \right) \le e^{-\beta \ell } \right] \right) \\&\le {\tilde{\mu }}\times {\tilde{\mu }}\left( \left\{ (\omega ,\tilde{\omega })\in \Omega ^2:\, d\left( {{\textbf {A}}}^n_{(E_0)}(\omega )\, {\hat{v}}^+,\, {{\textbf {A}}}^{-n}_{(E_0)}(\tilde{\omega })\, {\hat{v}}^- \right) \le 2e^{-c_1n} + e^{-\beta \ell } \right\} \right) \\&= {{\textbf {Q}}}^n(\chi _{\Delta _r})({\hat{v}}^+,\, {\hat{v}}^-) \le {{\textbf {Q}}}^n(\psi _{r})({\hat{v}}^+,\, {\hat{v}}^-)\\&\le \left| {{\textbf {Q}}}^n(\psi _{r})({\hat{v}}^+,\, {\hat{v}}^-) - \displaystyle \int _{{\mathbb {P}}^1\times {\mathbb {P}}^1}\psi _{r}\, d\, (\eta _{E_0}^+\times \eta _{E_0}^-) \right| + \displaystyle \int _{{\mathbb {P}}^1\times {\mathbb {P}}^1}\psi _{r}\, d\, (\eta _{E_0}^+\times \eta _{E_0}^-)\\&\lesssim e^{-cn}v_{\theta }(\psi _{r}) + (\eta _{E_0}^+\times \eta _{E_0}^-)\left( \Delta _{3r} \right) \\&\lesssim \frac{e^{-cn}}{e^{-\beta \theta \ell }} + 3^t\, (2e^{-c_1n} + e^{-\beta \ell })^t\displaystyle \int _{{\mathbb {P}}^1\times {\mathbb {P}}^1}\frac{1}{d({\hat{x}},\, {\hat{y}})^t}\, d\, (\eta _{E_0}^+\times \eta _{E_0}^-)({\hat{x}}, {\hat{y}})\\&\lesssim \frac{e^{-cn}}{e^{-\beta \theta \ell }} + 3^t\, (2e^{-c_1n} + e^{-\beta \ell })^t\sup _{{\hat{y}}\in {\mathbb {P}}^1}\displaystyle \int _{{\mathbb {P}}^1}\, \frac{1}{ d\left( {\hat{x}},\, {\hat{y}} \right) ^t }\, d\, \eta _{E_0}^+({\hat{x}})\\&\lesssim \frac{e^{-cn}}{e^{-\beta \theta \ell }} + (2e^{-c_1 n} + e^{-\beta \ell })^t\; \lesssim e^{-c_2\ell }. \end{aligned}$$

In the last two inequalities we have used Proposition 7.9 and the fact that we can increase \(k_0\) so that \(k_0> \frac{\beta (t+\theta )}{c}\) and decrease \(c_2\) so that \(c_2\le \beta \theta t\). This completes the proof of the proposition. \(\square \)

7.4 Variation of the ‘hyperbolic’ elements

In this subsection we establish one of the core propositions for the proof of Theorem A, providing plenty of good hyperbolic words. We will be using the notation introduced in Sect. 5.2.

From now on we also use the following notation: given \(A\in SL_2({\mathbb {R}})\) with \(\left\Vert A\right\Vert >1\), denote by \({{\hat{v}}}_1(A)\), \({{\hat{v}}}_2(A)\), \({{\hat{v}}}_1^*(A)\) and \({{\hat{v}}}_2^*(A)\) the unique projective points such that, taking unit vectors \(v_i\in {{\hat{v}}}_i(A)\) and \(v_j^*\in {{\hat{v}}}_j^*(A)\), with \(i,j=1,2\), the sets \(\{v_1, v_2\}\) and \(\{v_1^*, v_2^*\}\) are singular bases of A characterized by the relations \(A\,v_1=\left\Vert A\right\Vert v_1^*\) and \(A\, v_2=\left\Vert A\right\Vert ^{-1} v_2^*\).
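The singular bases just described can be computed explicitly for a \(2\times 2\) matrix. The following minimal sketch finds the most expanded direction by maximizing the quadratic form of \(A^{T}A\); the function name and the sample matrix are hypothetical, and the returned unit vectors are representatives of the projective points, so they are only determined up to sign.

```python
import math


def singular_basis(A):
    """Return (norm, v1, v2, v1s, v2s) with A v1 = norm * v1s and A v2 = v2s / norm.

    A is a 2x2 real matrix of determinant 1, given as ((a, b), (c, d)).
    """
    (a, b), (c, d) = A
    # Entries of the symmetric matrix A^T A = [[p, q], [q, r]].
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    # Angle of the direction maximizing the quadratic form of A^T A.
    phi = 0.5 * math.atan2(2 * q, p - r)
    v1 = (math.cos(phi), math.sin(phi))
    v2 = (-math.sin(phi), math.cos(phi))
    Av1 = (a * v1[0] + b * v1[1], c * v1[0] + d * v1[1])
    Av2 = (a * v2[0] + b * v2[1], c * v2[0] + d * v2[1])
    norm = math.hypot(Av1[0], Av1[1])
    v1s = (Av1[0] / norm, Av1[1] / norm)
    v2s = (Av2[0] * norm, Av2[1] * norm)
    return norm, v1, v2, v1s, v2s


A = ((2.0, 1.0), (1.0, 1.0))     # hypothetical matrix in SL_2(R)
norm, v1, v2, v1s, v2s = singular_basis(A)
print(norm, v1, v2, v1s, v2s)
```

Since \(\det A = 1\), the image \(A\,v_2\) has length \(\left\Vert A\right\Vert ^{-1}\), which is why rescaling it by `norm` yields a unit vector orthogonal to `v1s`.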

Take \(\delta _1 = \delta _1(E_0)>0\) as in Proposition 3.3, in the sense that the large deviations hold uniformly for all cocycles \({{\textbf {A}}}_{(E)}\) with \(|E - E_0|\le \delta _1\) and also so that

$$\begin{aligned} \lambda := {\text {min}}_{|E - E_0|\le \delta _1} L(\mu _{(E)})>0. \end{aligned}$$

Proposition 7.11

Assume \(L(\mu _{E_0})>0\) and \(\mu _{E_0}\) irreducible. Given \(\beta >0\) there exist constants \(\tau >0\) and \(N_0\in {\mathbb {N}}\) such that for every \(n\ge N_0\) and every \({\hat{v}},\, {\hat{w}}\in {\mathbb {P}}^1\), the set \({\mathcal {G}}_n({\hat{v}},{\hat{w}},\beta , \tau , E_0)\) of all \(\omega \in \Omega \) satisfying for all \(|E - E_0|\le e^{-\tau \,n^{1/4}}\):

  1. (1)

    \(\left\Vert {{\textbf {A}}}^n_{(E)}(\omega )\, v\right\Vert \gtrsim e^{(\lambda - \beta )n}\) and \(\left\Vert {{\textbf {A}}}^{-n}_{(E)}(\sigma ^n\omega )\, w\right\Vert \gtrsim e^{(\lambda - \beta )n}\);

  2. (2)

    \({{\textbf {A}}}^n_{(E)}(\omega )\) is hyperbolic and \(\lambda ({{\textbf {A}}}^n_{(E)}(\omega ))\gtrsim e^{(\lambda - \beta )n}\);

  3. (3)

    \(d({\hat{v}}_1^*({{\textbf {A}}}^n_{(E)}(\omega )),\, {\hat{v}}_2({{\textbf {A}}}^n_{(E)}(\omega ))) \gtrsim e^{-\beta n^{1/8}}\).

has measure   \({\tilde{\mu }}\left( {\mathcal {G}}_n ({\hat{v}},{\hat{w}},\beta ,\tau , E_0) \right) >1 - \beta \).

Proof

Split n into blocks of size \(n_0\asymp n^{1/4}\). For the sake of simplicity we assume that \(n=m\, n_0\) with \(n_0=n^{1/4}\) and \(E_0=0\). Consider the set \({\mathcal {B}}_{n_0}\) of \(\omega \in \Omega \) such that there exists \(0\le j\le m-1\) with

$$\begin{aligned} \left\Vert {{\textbf {A}}}_{(0)}^{n_0}(\sigma ^{j\, n_0}\omega ) \right\Vert < e^{ (\lambda -\frac{\beta }{10})\,n_0} \, \vee \, \left\Vert {{\textbf {A}}}_{(0)}^{n_0}(\sigma ^{j\, n_0}\omega ) \right\Vert > e^{ (\lambda +\frac{\beta }{10})\,n_0}. \end{aligned}$$

Write \({\mathcal {B}}_{n_0}^*:={\mathcal {B}}_{n_0}\cup {\mathcal {B}}_{2 n_0}\), where \({\mathcal {B}}_{2 n_0}\) is similarly defined. By large deviations, Proposition 3.3, there exists a constant \(\tau _1>0\) such that for all large enough n, \({\tilde{\mu }}({\mathcal {B}}_{n_0}^*)\le 2\,n^{3/4}\, e^{-\tau _1\, n^{1/4}}\). By finite scale continuity, there exists \(\tau >0\) such that for all \(|E|\le e^{-\tau \, n^{1/4}}=e^{-\tau n_0}\) and \(\omega \notin {\mathcal {B}}_{n_0}^*\),

$$\begin{aligned} e^{ (\lambda -\frac{\beta }{5})\,n_0}\le \left\Vert {{\textbf {A}}}_{(E)}^{n_0}(\sigma ^{j n_0}\omega )\right\Vert \le e^{ (\lambda +\frac{\beta }{5}) \,n_0} \qquad \forall 0\le j<m \end{aligned}$$

and

$$\begin{aligned} e^{ 2\, (\lambda -\frac{\beta }{5}) \,n_0}\le \left\Vert {{\textbf {A}}}_{(E)}^{2 n_0}(\sigma ^{j n_0}\omega )\right\Vert \le e^{2\, (\lambda +\frac{\beta }{5})\, n_0} \qquad \forall 0\le j<m-1. \end{aligned}$$

Consider \(({\hat{v}},\, {\hat{w}})\in {\mathbb {P}}^1\times {\mathbb {P}}^1\). We will apply Lemma A.9 with the data

  • \(A_j = {{\textbf {A}}}^{n_0}_{(E)}(\sigma ^{jn_0}\omega )\), \(j=0,\ldots m-1\);

  • \({\hat{v}}= {\hat{v}}\) and \({\hat{w}}= {\hat{w}}\);

  • \(t:= \beta n_0^{1/2}\), \(\gamma := \frac{4}{5}\beta n_0\) and \(\tilde{\lambda }:= (\lambda - \frac{\beta }{5})n_0\).

Notice that if \(\omega \notin {\mathcal {B}}_{n_0}^*\) the assumptions (a)-(c) of Lemma A.9 are automatically satisfied.

Consider \(C, c_1, c_2>0\) and \(k_0\) given by Proposition 7.10 applied with \(n = n_0\) and \(\ell = n_0^{1/2}\). Denote by \({\mathcal {C}}_{n_0}({\hat{v}},\, {\hat{w}})\) the set of sequences \(\omega \in \Omega \) such that for every \(|E|\le e^{-\tau n_0}\) (\(\tau > c_1\)):

  1. (a)

    \({\text {min}}\left\{ d\left( {{\textbf {A}}}^{n_0}_{(E)}(\sigma ^{(m-1)n_0}\omega )\, {\hat{v}}, {\hat{w}}\right) ,\, d\left( {\hat{v}}, {{\textbf {A}}}^{-n_0}_{(E)}(\sigma ^{n_0m}\omega )\, {\hat{w}}\right) \right\} \ge e^{-\beta n_0^{1/2}}\);

  2. (b)

    \({\text {min}}\left\{ d\left( {{\textbf {A}}}^{n_0}_{(E)}(\omega )\, {\hat{v}},\, {\hat{w}}\right) ,\, d\left( {\hat{v}},\, {{\textbf {A}}}^{-n_0}_{(E)}(\sigma ^{n_0}\omega )\, {\hat{w}}\right) \right\} \ge e^{-\beta n_0^{1/2}}\);

  3. (c)

    \(d\left( {{\textbf {A}}}^{n_0}_{(E)}(\sigma ^{(m-1)n_0}\omega )\, {\hat{v}},\, {{\textbf {A}}}^{-n_0}_{(E)}(\sigma ^{n_0}\omega )\, {\hat{w}}\right) \ge e^{-\beta n_0^{1/2}}\).

If \(n_0/\ell = n_0^{1/2}\ge k_0\), then by Proposition 7.10 the set \({\mathcal {C}}_{n_0}^*:= {\mathcal {C}}_{n_0}({\hat{v}},\, {\hat{w}})\backslash \, {\mathcal {B}}_{n_0}^*\) satisfies

$$\begin{aligned} {\tilde{\mu }}\left( \Omega \, \backslash \, {\mathcal {C}}^*_{n_0} \right) \le Ce^{-c_2n_0^{1/2}} = Ce^{-c_2n^{1/8}}. \end{aligned}$$

If \(\omega \in {\mathcal {C}}^*_{n_0}\), the above conditions (a–c) ensure that hypotheses (d–f) of Lemma A.9 hold. Therefore items 1, 2, and 3 are direct consequences of Lemma A.9. This concludes the proof of the proposition. \(\square \)

Proposition 7.12

If the cocycle \({{\textbf {A}}}_{(E_0)}\) is not irreducible with \(L({{\textbf {A}}}_{(E_0)})>0\) then there exists \(\delta >0\) such that for all \(0<|E-E_0|\le \delta \), the cocycle \({{\textbf {A}}}_{(E)}\) is irreducible.

Proof

The cocycle \({{\textbf {A}}}_{(E_0)}\) has either one or two invariant lines, i.e., invariant under all matrices in the support of \(\mu _{E_0}\). Since \({{\textbf {A}}}_{(E_0)}\) is not uniformly hyperbolic there exist hyperbolic periodic points \(\omega _1\) and \(\omega _2\), with periods \(n_1\) and \(n_2\), respectively, such that \({{\hat{u}}}({{\textbf {A}}}_{(E_0)}^{n_1}(\omega _1))={{\hat{s}}}({{\textbf {A}}}_{(E_0)}^{n_2}(\omega _2))\) and/or \({{\hat{s}}}({{\textbf {A}}}_{(E_0)}^{n_1}(\omega _1))={{\hat{u}}}({{\textbf {A}}}_{(E_0)}^{n_2}(\omega _2))\), for otherwise a simple argument implies that the reducible cocycle is uniformly hyperbolic, see inequality (2). By Proposition B.4 the directions \({{\hat{u}}}({{\textbf {A}}}_{(E)}^{n_i}(\omega _i))\) and \({{\hat{s}}}({{\textbf {A}}}_{(E)}^{n_i}(\omega _i))\) move in opposite directions with the parameter E. Hence in any case, for \(E\ne E_0\) close enough to \(E_0\), together the two matrices \({{\textbf {A}}}_{(E)}^{n_i}(\omega _i)\), \(i=1,2\), have four distinct invariant directions. This implies that the cocycle \({{\textbf {A}}}_{(E)}\) is irreducible. \(\square \)

7.5 Variation of the heteroclinic tangencies

In this subsection we establish the core Propositions 7.14 and 7.15 for the proof of Theorem A, which allow us to derive matchings and typical tangencies from an existing tangency.

Consider a family of cocycles \({{\textbf {A}}}_{(E)}:\Omega \rightarrow SL_2({\mathbb {R}})\) as introduced in Sect. 5.2. This family has a heteroclinic tangency at E if and only if there exist periodic orbits \(\omega _0,\omega _1\in \Omega \) with periods \(\ell _0, \ell _1\ge 1\) such that \({{\textbf {A}}}_{(E)}^{\ell _0}(\omega _0)\) and \({{\textbf {A}}}_{(E)}^{\ell _1}(\omega _1)\) are hyperbolic matrices, and there exists a heteroclinic orbit \(\omega \in W^u_{\textrm{loc}}(\omega _0)\cap \sigma ^{-k} W^s_{\textrm{loc}}(\omega _1)\) such that

$$\begin{aligned} {{\textbf {A}}}_{(E)}^k(\omega )\, {{\hat{u}}}({{\textbf {A}}}_{(E)}^{\ell _0}(\omega _0)) = {{\hat{s}}}({{\textbf {A}}}_{(E)}^{\ell _1}(\omega _1)). \end{aligned}$$

In this case we say that \((B_E,\, C_E,\, A_E)\) is a tangency for \(A_{(E)}\) where \(A_{E}={{\textbf {A}}}_{(E)}^{\ell _1}(\omega _1)\), \(B_{E}={{\textbf {A}}}_{(E)}^{\ell _0}(\omega _0)\) and \(C_{E}={{\textbf {A}}}_{(E)}^{k}(\omega ) \) are respectively the target, the source and the transition matrix of this heteroclinic tangency. See Figure 3. The size of the tangency is by definition the size of the full word \(B_E\, C_E\, A_E\) determined by the tangency.

Before turning to the main technical results of this section we state a version of Lemma B.3 suitable for our purposes. We identify the derivative of projective curves such as \(E\mapsto {{\textbf {A}}}^n_{(E)}(\omega )\, {\hat{v}}\) with its scalar value.

Lemma 7.13

There exists \(c_* >0\) such that for all \(n\ge 2\), \(\omega \in \Omega \), \(E\in {\mathbb {R}}\) and \({\hat{v}}\in {\mathbb {P}}^1\), we have

$$\begin{aligned} \frac{d}{dE}{{\textbf {A}}}^{-n}_{(E)}(\omega )\, {\hat{v}}< -c_*< 0< c_* <\frac{d}{dE}{{\textbf {A}}}^n_{(E)}(\omega )\, {\hat{v}}. \end{aligned}$$

Proof

Recall that for each \(\omega \in \Omega \), \({{\textbf {A}}}^n_{(E)}(\omega ) = {{\textbf {A}}}^n_E(\zeta )\) for some \(\zeta \in \Sigma \) and that the cocycle \({{\textbf {A}}}_E:\Sigma \rightarrow SL_2({\mathbb {R}})\) is a Schrödinger cocycle. Therefore, this lemma is a direct consequence of Lemma B.3. \(\square \)

Definition 2

Given \(\gamma ,\, t,\, \rho >0\), we say that a tangency \((B_{E_0},\, C_{E_0},\, A_{E_0})\) for a cocycle \({{\textbf {A}}}_{(E_0)}\) is \((\gamma ,\, \rho ,\, t)\)-controlled if the following conditions are satisfied:

  1. 1.

    \({\text {min}}\{\lambda (A_{E_0}),\, \lambda (B_{E_0})\} \ge e^{\gamma }\);

  2. 2.

    \(\left\Vert C_{E_0}\right\Vert \le e^{\rho }\);

  3. 3.

    \({\text {min}}\{d({\hat{v}}_1^*(A_{E_0}),\, {\hat{v}}_2(A_{E_0})),\, d({\hat{v}}_1^*(B_{E_0}),\, {\hat{v}}_2(B_{E_0}))\} \ge e^{-t}\).

Fig. 3 Unfolding the heteroclinic tangency. Vertical lines represent \({\mathbb {P}}^1 \)

Proposition 7.14

There exists \(c_*>0\) such that for every \(\beta >0\) and \(R>0\) we can find \(\gamma _0\) with the following property: for every \(\gamma \ge \gamma _0\), if \((B_{E_0},\, C_{E_0},\, A_{E_0})\) is a \((\gamma ,\, \gamma ^{1/2},\, \gamma ^{1/7})\)-controlled tangency for \({{\textbf {A}}}_{E_0}\), then defining

$$\begin{aligned} I:=[ E_0 - 2c_*^{-1}(1+\beta )\, R\, e^{-2\gamma \, (1-\beta )},\, E_0 + 2c_*^{-1}(1+\beta )\, R\, e^{-2\gamma \, (1-\beta )} ], \end{aligned}$$

for every pair of smooth curves \({\hat{v}}^+,\, {\hat{v}}^-:I\rightarrow {\mathbb {P}}^1\) satisfying

  1. A1.

    \(d({\hat{v}}^+(E_0),\, {\hat{s}}(B_0))> R^{-1}\) and \(d({\hat{v}}^-(E_0),\, {\hat{u}}(A_0)) > R^{-1}\);

  2. A2.

    \(\frac{d}{dE}{\hat{v}}^+(E)\ge 0\) and \(\frac{d}{dE}{\hat{v}}^-(E)\le 0\), for every \(E\in I\);

the equation

$$\begin{aligned} A_E\, C_E\, B_E\, {\hat{v}}^+(E) = {\hat{v}}^-(E), \end{aligned}$$

has at least one solution \(E_*\in I\).

Proof

We assume for the sake of simplicity \(E_0 = 0\). First observe that using the triangle inequality, condition A1, Proposition A.3 and the given control of the tangency, there exists \(K_0>0\) such that

$$\begin{aligned} d({\hat{v}}^+(0),\, {\hat{v}}_2(B_0))&\ge d({\hat{v}}^+(0),\, {\hat{s}}(B_0)) - d({\hat{s}}(B_0),\, {\hat{v}}_2(B_0))\\&\ge R^{-1} - \frac{K_0}{d({\hat{v}}_1^*(B_0),\, {\hat{v}}_2(B_0))\left\Vert B_0\right\Vert ^2}\\&\ge R^{-1}\left( 1 - K_0R\, e^{-2\gamma (1 - \frac{1}{2\gamma ^{6/7}})} \right) \end{aligned}$$

and

$$\begin{aligned} d({\hat{v}}^-(0),\, {\hat{v}}_1^*(A_0))&\ge d({\hat{v}}^-(0),\, {\hat{u}}(A_0)) - d({\hat{u}}(A_0),\, {\hat{v}}_1^*(A_0))\\&\ge R^{-1} - \frac{K_0}{d({\hat{v}}_1^*(A_0),\, {\hat{v}}_2(A_0))\left\Vert A_0\right\Vert ^2}\\&\ge R^{-1}\left( 1 - K_0R\, e^{-2\gamma (1 - \frac{1}{2\gamma ^{6/7}})} \right) . \end{aligned}$$

Now using the previous inequalities jointly with item (b) of Lemma A.2 and the control of the transition matrix,

$$\begin{aligned} d(C_0\, B_0\, {\hat{v}}^+(0),\, C_0\, {\hat{u}}(B_0))&\le \left\Vert C_0\right\Vert ^2d(B_0\, {\hat{v}}^+(0),\, {\hat{u}}(B_0))\\&\le \left\Vert C_0\right\Vert ^2\frac{1}{d({\hat{v}}^+(0),\, {\hat{v}}_2(B_0))\left\Vert B_0\right\Vert ^2}\\&\le R\, \left( 1- K_0R\, e^{-2\gamma (1 - \frac{1}{2\gamma ^{6/7}})} \right) ^{-1}\, e^{-2\gamma (1 - \gamma ^{-1/2})} \end{aligned}$$

and

$$\begin{aligned} d(A_0^{-1}\, {\hat{v}}^-(0),\, {\hat{s}}(A_0))&\le \frac{1}{d({\hat{v}}^-(0),\, {\hat{v}}_1^*(A_0))\left\Vert A_0\right\Vert ^2}\\&\le R\, \left( 1- K_0R\, e^{-2\gamma (1 - \frac{1}{2\gamma ^{6/7}})} \right) ^{-1}\, e^{-2\gamma }. \end{aligned}$$

Taking

$$\begin{aligned} \gamma _0 := {\text {max}}\left\{ \beta ^{-2},\, (2\beta )^{-7/6},\, \frac{1}{2(1-\beta )}\log \left( \frac{K_0R(1+\beta )}{\beta } \right) \right\} , \end{aligned}$$

we conclude that for every \(\gamma \ge \gamma _0\)

$$\begin{aligned} d(C_0\, B_0\, {\hat{v}}^+(0),\, A_0^{-1}\, {\hat{v}}^-(0)) \le 2(1+\beta )Re^{-2\gamma (1-\beta )}. \end{aligned}$$
(9)

Choose appropriate projective coordinates in such a way as to preserve the natural orientation. Consider the functions \(f_+,\, f_-: I\rightarrow {\mathbb {P}}^1\) given by

$$\begin{aligned} f_+(E) = C_E\, B_E\, {\hat{v}}^+(E) \quad \text {and} \quad f_-(E) = A^{-1}_E\, {\hat{v}}^-(E). \end{aligned}$$

By condition A2 and Lemma 7.13 we have that there exists \(c_*>0\) such that \( f'_-(E)< -c_*< 0< c_* < f'_+(E)\) for every \(E\in I\). Moreover, by inequality (9), \(d(f_+(0),\, f_-(0)) \le 2(1+\beta )Re^{-2\gamma (1-\beta )}\). Therefore, there exists \(|E_*| \le 2(1+\beta )c_*^{-1}Re^{-2\gamma (1-\beta )}\) such that \(f_+(E_*) = f_-(E_*)\), i.e.,

$$\begin{aligned} A_{E_*}\, C_{E_*}\, B_{E_*}\, {\hat{v}}^+(E_*) = {\hat{v}}^-(E_*). \end{aligned}$$

\(\square \)
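The last step of the proof is an intermediate value argument: two curves that start within distance \(d\) of each other and move in opposite directions with speed at least \(c_*\) must cross at some parameter with \(|E_*|\le d/c_*\). The sketch below illustrates this with real-valued lifts of the projective curves; the curves `f_plus` and `f_minus`, the constant `c_star` and the initial gap are hypothetical stand-ins, and the crossing is located by bisection.

```python
def find_crossing(f_plus, f_minus, radius, tol=1e-12):
    """Bisection for f_plus(E) = f_minus(E) on [-radius, radius].

    Assumes f_plus - f_minus is continuous and increasing on the interval.
    """
    def gap(E):
        return f_plus(E) - f_minus(E)

    lo, hi = -radius, radius
    if gap(lo) > 0 or gap(hi) < 0:
        raise ValueError("no sign change on the interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


c_star = 0.5                        # hypothetical lower bound on the speeds
initial_gap = 1e-3                  # hypothetical value of d(f_+(0), f_-(0))


def f_plus(E):                      # increasing, derivative 0.7 + 3 E^2 >= c_star
    return 0.3 + 0.7 * E + E ** 3


def f_minus(E):                     # decreasing, derivative -0.6 <= -c_star
    return 0.3 - initial_gap - 0.6 * E


E_star = find_crossing(f_plus, f_minus, initial_gap / c_star)
print(E_star, f_plus(E_star) - f_minus(E_star))
```

The window `initial_gap / c_star` plays the role of the half-length of the interval \(I\) in the proposition.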

We finish this section showing that if the cocycle \(A_{E_0}\) has a tangency we can perturb the parameter to produce plenty of new tangencies which are typical with respect to the Lyapunov exponent and the Shannon entropy in a finite scale. Recall the notation of Sect. 3.3.

Proposition 7.15

Assume the cocycle \({{\textbf {A}}}_{(E_0)}\) has a heteroclinic tangency and is irreducible. Given \(\beta >0\), there exist constants \(C^*_1, C^*_2, c_1^*, c_2^* >0\), a sequence \((l_k)_k\subset {\mathbb {N}}\), \(l_k\rightarrow \infty \), and \(k_0\in {\mathbb {N}}\) such that for every \(k\ge k_0\) we can find a set \({\mathcal {X}}_k(\beta )\subset \Omega \) with \({\tilde{\mu }}({\mathcal {X}}_k(\beta )) \ge C_1^*e^{-c_1^*l_k^{1/3}}\) with the following property: for every \(\omega \in {\mathcal {X}}_k(\beta )\) there exists \(E_k = E_k(\omega )\) with \(|E_k - E_0|\le C_2^*e^{-c_2^*l_k^{1/3}}\) such that \({{\textbf {A}}}_{(E_k)}\) has a tangency \((P_{E_k}, T_{E_k}, S_{E_k})\), of size \(l_k\), satisfying

  1. \((P_{E_k}, T_{E_k}, S_{E_k})\) is \((\gamma _k,\, \gamma _k^{1/2},\, \gamma _k^{1/7})\)-controlled with \(\gamma _k = \frac{l_k}{2}(\lambda - 3\beta )\);

  2. \(\displaystyle \mathbf{{p}}_{l_k}(\omega ):=\prod _{j=0}^{l_k-1} p_{\omega _j} \ge e^{-(H(\mu )+\beta )\, l_k}.\)

Proof

To lighten the notation, assume \(E_0=0\). Fix \(\beta >0\) and let \((B_{0}, C_{0}, A_{0})\) be a tangency for \({{\textbf {A}}}_{(E_0)}\). Take integers \(p_k, q_k\ge 1\) such that

$$\begin{aligned} \left| \frac{p_k}{q_k}-\frac{\log \lambda (B_0)}{\log \lambda (A_0)} \right| <\frac{1}{q_k^2}, \end{aligned}$$

or equivalently

$$\begin{aligned} \lambda (B_0)^{q_k} \, \lambda (A_0)^{-\frac{1}{q_k}}< \lambda (A_0)^{p_k} < \lambda (B_0)^{q_k} \, \lambda (A_0)^{\frac{1}{q_k}}, \end{aligned}$$
(10)

and write \(\lambda _k:=\lambda (A_0)^{p_k}\sim \lambda (B_0)^{q_k}\). Consider for each \(k\ge 1\) the new tangency \((B_{0}^{q_k},\, C_{0},\, A_{0}^{p_k})\) of size \(m_k\), also for \({{\textbf {A}}}_{(E_0)}\). We claim that, for k sufficiently large, this tangency is \((\gamma ,\, \gamma ^{1/2},\, \gamma ^{1/7})\)-controlled with \(\gamma := (1-\beta )\log \lambda _k\). Indeed, by inequality (10),

$$\begin{aligned} {\text {min}}\{\lambda (A_0^{p_k}),\, \lambda (B_0^{q_k})\} \ge e^{\gamma }, \end{aligned}$$

for every k sufficiently large. Furthermore, the upper bound for \(\left\Vert C_0\right\Vert \) and the lower bounds for the distances \(d({\hat{v}}_1^*(A_0^{p_k}),\, {\hat{v}}_2(A_0^{p_k}))\) and \(d({\hat{v}}_1^*(B_0^{q_k}),\, {\hat{v}}_2(B_0^{q_k}))\) can be taken independently of k, and so the conditions of Definition 2 are automatically satisfied for every k sufficiently large.
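One standard way to produce integers \(p_k, q_k\) as in the approximation preceding (10) is through continued fractions, whose convergents \(p/q\) of a real number x always satisfy \(|x - p/q| < 1/q^2\). The Python sketch below illustrates this for the hypothetical values \(\lambda (A_0)=2\) and \(\lambda (B_0)=3\), which are chosen only for illustration and do not come from the text; the function names are ours. It then checks inequality (10) in logarithmic form.

```python
import math


def convergents(x, count):
    """Return up to `count` continued-fraction convergents (p, q) of x > 1.

    Uses floating point, so only the first few convergents are reliable.
    """
    out = []
    p_prev, q_prev = 0, 1  # p_{-2}, q_{-2}
    p, q = 1, 0            # p_{-1}, q_{-1}
    t = x
    for _ in range(count):
        a = math.floor(t)                  # next partial quotient a_n
        p, p_prev = a * p + p_prev, p      # p_n = a_n p_{n-1} + p_{n-2}
        q, q_prev = a * q + q_prev, q      # q_n = a_n q_{n-1} + q_{n-2}
        out.append((p, q))
        frac = t - a
        if frac == 0:                      # x is rational: expansion stops
            break
        t = 1.0 / frac
    return out


def satisfies_inequality_10(lam_a, lam_b, p, q):
    """Check lam_b^q * lam_a^(-1/q) < lam_a^p < lam_b^q * lam_a^(1/q) in logs."""
    log_a, log_b = math.log(lam_a), math.log(lam_b)
    return q * log_b - log_a / q < p * log_a < q * log_b + log_a / q


# Hypothetical spectral radii, for illustration only.
LAM_A, LAM_B = 2.0, 3.0
target = math.log(LAM_B) / math.log(LAM_A)
for p, q in convergents(target, 6):
    print(p, q, satisfies_inequality_10(LAM_A, LAM_B, p, q))
```

For instance, the pair \((p, q) = (19, 12)\) appears among these convergents, reflecting that \(2^{19} = 524288\) is close to \(3^{12} = 531441\).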

For each \(R>0\), consider the projective intervals \(J^s:= \textrm{B}({\hat{s}}(B_{0}),\, R^{-1})\) and \(J^u:= \textrm{B}({\hat{u}}(A_{0}),\, R^{-1})\) as well as

$$\begin{aligned} I_k := [E_0 - 2 c_*^{-1}(1+\beta ) R\lambda _k^{-2(1-2\beta )}, E_0 + 2 c_*^{-1}(1+\beta ) R\lambda _k^{-2(1-2\beta )}]. \end{aligned}$$

Denote by \(\tau _k\) the finite word of size \(m_k\) determined by the tangency, i.e., for every \(\omega \in [0;\tau _k]\), \({{\textbf {A}}}^{m_k}_{(0)}(\omega ) = A_{0}^{p_k}\, C_{0}\, B_{0}^{q_k}\).

Since \({{\textbf {A}}}_{0}\) is strongly irreducible, the forward and backward stationary measures \(\eta _0^+\) and \(\eta _0^-\) are non-atomic. Hence, we can choose R sufficiently large so that

$$\begin{aligned} \eta _{0}^+(6J^s) \le 1/4 \quad \text {and}\quad \eta _{0}^-(6J^u) \le 1/4. \end{aligned}$$
(11)

Given \({\hat{v}}, {\hat{w}}\in {\mathbb {P}}^1\), consider the set \({\mathcal {G}}_n:={\mathcal {G}}_n({\hat{v}},\, {\hat{w}},\, \beta ,\, \tau , 0)\) given by Proposition 7.11. For each \(n\ge 1\), define

$$\begin{aligned} {\mathcal {G}}_n^u := \left\{ \omega \in {\mathcal {G}}_n:\, {\hat{u}}({{\textbf {A}}}^n_{(0)}(\omega ))\notin 2J^s \right\} \end{aligned}$$

and

$$\begin{aligned} {\mathcal {G}}_n^s := \left\{ \omega \in {\mathcal {G}}_n:\, {\hat{s}}({{\textbf {A}}}^n_{(0)}(\omega ))\notin 2J^u \right\} . \end{aligned}$$

Notice that by item 3 of Proposition 7.11

$$\begin{aligned} \left\{ \omega \in {\mathcal {G}}_n:{{\textbf {A}}}^n_{(0)}(\omega )\, {\hat{v}}\notin 3J^s \right\} \subset {\mathcal {G}}_n^u \quad \text {and } \left\{ \omega \in {\mathcal {G}}_n:{{\textbf {A}}}^{-n}_{(0)}(\omega )\, {\hat{w}}\notin 3J^u \right\} \subset {\mathcal {G}}_n^s. \end{aligned}$$

Thus, by inequality (11), Propositions 3.4 and 7.11 we have that

$$\begin{aligned} {\tilde{\mu }}({\mathcal {G}}^u_n) \ge {\tilde{\mu }}({\mathcal {G}}_n) - {\tilde{\mu }}([ {{\textbf {A}}}^n_{(0)}(\cdot )\, {\hat{v}}\in 3J^s]) \ge 1 - \beta - 2\eta _{0}^+(6J^s) \ge \frac{1}{2} - \beta , \end{aligned}$$
(12)

and similarly,

$$\begin{aligned} {\tilde{\mu }}({\mathcal {G}}^s_n) \ge {\tilde{\mu }}({\mathcal {G}}_n) - {\tilde{\mu }}([ {{\textbf {A}}}^{-n}_{(0)}(\cdot )\, {\hat{w}}\in 3J^u]) \ge 1 - \beta - 2\eta _{0}^-(6J^u) \ge \frac{1}{2} - \beta . \end{aligned}$$
(13)

We define the set \({\mathcal {T}}_k\) of tangencies by

$$\begin{aligned} {\mathcal {T}}_k := {\mathcal {G}}_{ m_k^3}^u \cap [ m_k^3;\, \tau _k] \cap \sigma ^{-d_k}\left( {\mathcal {G}}_{ m_k^3}^s \right) , \end{aligned}$$

where \(d_k:= m_k^3 + m_k\). Take \(\omega \in {\mathcal {T}}_k\) and define the functions \({\hat{v}}^+,\, {\hat{v}}^-:I_k\rightarrow {\mathbb {P}}^1\)

$$\begin{aligned} {\hat{v}}^+(E) := {\hat{u}}({{\textbf {A}}}^{m_k^3}_{(E)}(\omega )) \quad \text {and} \quad {\hat{v}}^-(E):= {\hat{s}}({{\textbf {A}}}^{m_k^3}_{(E)}(\sigma ^{d_k}\omega )). \end{aligned}$$

Notice that by definition of \({\mathcal {T}}_k\),

$$\begin{aligned} {\hat{v}}^+(0) \notin 2J^s \supset \textrm{B}({\hat{s}}(B_0),\, R^{-1}) \quad \text {and} \quad {\hat{v}}^-(0)\notin 2J^u\supset \textrm{B}({\hat{u}}(A_0),\, R^{-1}). \end{aligned}$$

Moreover, by Lemma B.4,

$$\begin{aligned} \frac{d}{dE}{\hat{v}}^+(E)\ge 0 \quad \text {and} \quad \frac{d}{dE}{\hat{v}}^-(E)\le 0. \end{aligned}$$

Thus, we can apply Proposition 7.14 to guarantee that there exists \(E_k = E_k(\omega )\in I_k\) satisfying

$$\begin{aligned} A_{E_k}^{p_k}\,C_{E_k}\,B_{E_k}^{q_k}\, {\hat{u}}({{\textbf {A}}}^{m_k^3}_{(E_k)}(\omega )) = {\hat{s}}({{\textbf {A}}}^{m_k^3}_{(E_k)}(\sigma ^{d_k}\omega )). \end{aligned}$$

Set \(l_k:= 2 m_k^3+m_k\) and consider the set

$$\begin{aligned} {\mathcal {F}}_k(\beta ) := \left\{ \omega \in \Omega :\, \mathbf{{p}}_{l_k}(\omega ) \ge e^{-(H(\mu )+ \beta )\,l_k} \right\} . \end{aligned}$$

By Proposition 3.5 with \(n=l_k\) and \(\varepsilon =\beta \),

$$\begin{aligned} {\tilde{\mu }}({\mathcal {F}}_k(\beta )) \ge 1 - 2e^{-\frac{4}{h^2}\, l_k\,\beta ^2}, \end{aligned}$$
(14)

where h is a positive constant depending only on \(\mu \).

For each \(\omega \in {\mathcal {T}}_k \cap {\mathcal {F}}_{k}(\beta )\), define

$$\begin{aligned} P_k := {{\textbf {A}}}^{m_k^3}_{(E_k)}(\omega ), \quad T_k := A_{E_k}^{p_k}\,C_{E_k}\,B_{E_k}^{q_k} \quad \text {and} \quad S_k := {{\textbf {A}}}^{m_k^3}_{(E_k)}(\sigma ^{d_k}\omega ). \end{aligned}$$

Observe that, by Proposition 7.11, \(P_k\) and \(S_k\) are hyperbolic and, if \(\gamma _k:= (\lambda - 3\beta )\frac{l_k}{2}\), then \((P_k,\, T_k,\, S_k)\) is a \((\gamma _k,\, \gamma _k^{1/2},\, \gamma _k^{1/7})\)-controlled tangency for the cocycle \({{\textbf {A}}}_{(E_k)}\) of size \(l_k\). Moreover,

$$\begin{aligned} \lambda (P_k) \gtrsim e^{(\lambda - \beta )m_k^3}> e^{(\lambda - 2\beta )\frac{l_k}{2}} \quad \text {and} \quad \lambda (S_k) \gtrsim e^{(\lambda - \beta ) m_k^3 }> e^{(\lambda - 3\beta )\frac{l_k}{2}}. \end{aligned}$$

This proves item 1. Item 2 holds because \(\omega \in {\mathcal {F}}_k(\beta )\).
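The strict inequalities in the last display are comparisons of exponents. As a sketch for the first one, recall that \(l_k = 2m_k^3 + m_k\), so that \(\frac{l_k}{2} = m_k^3 + \frac{m_k}{2}\) and

$$\begin{aligned} (\lambda - \beta )\, m_k^3 - (\lambda - 2\beta )\, \frac{l_k}{2} = \beta \, m_k^3 - (\lambda - 2\beta )\, \frac{m_k}{2}, \end{aligned}$$

which is positive as soon as \(m_k^2 > \frac{\lambda - 2\beta }{2\beta }\), hence for every k sufficiently large (assuming, as the construction suggests, that \(m_k\rightarrow \infty \)). The second inequality follows in the same way with \(3\beta \) in place of \(2\beta \).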

To finish the proof of the proposition, notice that by inequalities (12), (13) and (14)

$$\begin{aligned} {\tilde{\mu }}({\mathcal {T}}_k\cap {\mathcal {F}}_{k}(\beta ))&\ge {\tilde{\mu }}({\mathcal {T}}_k) - {\tilde{\mu }}(\Omega \backslash \, {\mathcal {F}}_k(\beta ))\\&\ge (1/2 - \beta )^2\, {\tilde{\mu }}([0;\, \tau _k]) - 2e^{-\frac{4}{h^2}l_k\beta ^2} \ge (1/2-\beta )^2\, e^{-c\,m_k} \end{aligned}$$

for some constant \(c>0\). Taking \({\mathcal {X}}_k(\beta ):= {\mathcal {T}}_k\cap {\mathcal {F}}_{k}(\beta )\) completes the proof. \(\square \)
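The bound \(e^{-c\,m_k}\) has the form required in the statement. A short verification, using only \(l_k = 2m_k^3+m_k\):

$$\begin{aligned} l_k \ge 2m_k^3 \quad \Longrightarrow \quad m_k \le (l_k/2)^{1/3} \quad \Longrightarrow \quad (1/2-\beta )^2\, e^{-c\,m_k} \ge (1/2-\beta )^2\, e^{-c\, 2^{-1/3}\, l_k^{1/3}}. \end{aligned}$$

So one admissible choice of constants (they are not unique) is \(C_1^*:= (1/2-\beta )^2\) and \(c_1^*:= c\, 2^{-1/3}\).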

8 Counting Matchings

The purpose of this section is to give a lower bound for the \(\tilde{\nu }\)-measure of the set of sequences for which we have a \((\delta , k, I)\)-matching for some small interval of energies I. Throughout this section we assume that the cocycle \({{\textbf {A}}}_0\) has a heteroclinic tangency and is irreducible. We keep the notation used in Proposition 7.15.

Recall that for a suitable \(\delta _1>0\) we use the notation

$$\begin{aligned} \lambda = {\text {min}}_{|E|\le \delta _1}L(\mu _E) > 0. \end{aligned}$$

8.1 Subset of matchings

Take \(\beta >0\) and let \({\mathbb {N}}'\) be the set of sizes \(l\in {\mathbb {N}}\) of the heteroclinic tangencies of \({{\textbf {A}}}_{E_l}\), \((P_{E_l}, T_{E_l}, S_{E_l})\), given by Proposition 7.15, applied with \(E_0 = 0\), where \(E_l = E_l(\omega )\) for some \(\omega \in \Omega \) and \(|E_l| \le C_2^*\, e^{-c_2^*\, l^{1/3}}\) is such that \({{\textbf {A}}}^{l}_{(E_l)}(\omega ) = S_{E_l}\, T_{E_l}\, P_{E_l}\). Denote by \(\tau _l\in \Omega \) the finite word of length l associated with the block \(S_{E_l}\, T_{E_l}\, P_{E_l}\), for a given size \(l\in {\mathbb {N}}'\). By item 2 of Proposition 7.15

$$\begin{aligned} {\tilde{\mu }}([0; \tau _l])= \mathbf{{p}}_{l}(\omega ) \ge e^{-(H(\mu )+ \beta )\,l}. \end{aligned}$$
(15)

To apply Proposition 7.14 with the tangency \((P_{E_l},\, T_{E_l},\, S_{E_l})\) we consider the balls \(J^s_l\) and \(J^u_l\) in \({\mathbb {P}}^1\) centered respectively at \({\hat{s}}(P_{E_l})\) and \({\hat{u}}(S_{E_l})\) with radius \(R^{-1}>0\). Consider the interval

$$\begin{aligned} I_l := [E_l-C\, e^{-l\,(\lambda -\beta )},\, E_l + C\, e^{-l\,(\lambda -\beta )}] \end{aligned}$$

provided by this proposition with \(C:= 2 c_*^{-1} (1+\beta ) R\). Choosing \(\gamma := \frac{l}{2}(\lambda - \beta )\), by the said proposition the heteroclinic tangency \((P_{E_l},\, T_{E_l},\, S_{E_l})\) is \((\gamma ,\, \gamma ^{1/2},\, \gamma ^{1/7})\)-controlled, so that the initial assumptions of Proposition 7.14 are automatically satisfied.

Fix \(\tau >0\) given by Proposition 7.11 and set \({\mathcal {G}}_{l^3}:= {\mathcal {G}}_{l^3}({{\hat{e}}}_1, {{\hat{e}}}_2,\beta , \tau , 0)\). Define the sets

$$\begin{aligned} \Theta _l^u := \left\{ \omega \in {\mathcal {G}}_{l^3} \, :\, {{\textbf {A}}}_{E_l}^{l^3}(\omega )\, {\hat{e}}_1\notin 2J^s_l \right\} \end{aligned}$$

and

$$\begin{aligned} \Theta _l^s := \left\{ \omega \in {\mathcal {G}}_{l^3} \, :\, {{\textbf {A}}}_{E_l}^{-l^3}(\omega )\, {\hat{e}}_2\notin 2J^u_l \right\} . \end{aligned}$$

Notice that taking R sufficiently large and applying Proposition 7.10 we have

$$\begin{aligned} {\tilde{\mu }}(\Theta _l^u) \ge 1/2 - \beta \quad \text {and} \quad {\tilde{\mu }}(\Theta _l^s) \ge 1/2 - \beta . \end{aligned}$$
(16)

Now, we finally define our subset of matchings as

$$\begin{aligned} {\mathcal {M}}_l := \Theta _l^u\cap \, \sigma ^{-l^3}([0;\tau _l])\cap \, \sigma ^{-(2 l^3+l)}(\Theta _l^s) . \end{aligned}$$

Lemma 8.1

For every \(l\in {\mathbb {N}}'\) and \(\omega \in {\mathcal {M}}_l\) there exists \(E_l^*\in I_l\) such that

  1. \({{\textbf {A}}}_{(E_l^*)}^{2 l^3+ l}(\omega )\, {{\hat{e}}}_1 = {{\hat{e}}}_2\);

  2. \(e^{(\lambda -\beta )\, l^3} \le \left\Vert {{\textbf {A}}}_{(E_l^*)}^{l^3}(\omega )\, e_1\right\Vert \le \left\Vert {{\textbf {A}}}_{(E_l^*)}^{l^3}(\omega )\right\Vert \le e^{(\lambda + \beta )\, l^3}\);

  3. \(e^{(\lambda -\beta )\, l^3} \le \left\Vert {{\textbf {A}}}_{(E_l^*)}^{-l^3}(\sigma ^{2 l^3+l}\omega )\, e_2\right\Vert \le \left\Vert {{\textbf {A}}}_{(E_l^*)}^{-l^3}(\sigma ^{2 l^3+l} \omega )\right\Vert \le e^{(\lambda + \beta )\, l^3}\);

  4. \(\left\Vert {{\textbf {A}}}_{(E_l^*)}^{2l^3+l}(\omega )\, e_1\right\Vert \le e^{3 \, \beta \, l^3}\).

Moreover, \(\displaystyle {\tilde{\mu }}\left( {\mathcal {M}}_l \right) \ge (1/2 - \beta )^2\, e^{ -l \,(H(\mu )+ \beta ) }\).

Proof

Fix \(\omega \in {\mathcal {M}}_l\). By Proposition 7.14 we conclude that the equation

$$\begin{aligned} {{\textbf {A}}}^{2 l^3+l}_E(\omega )\, {\hat{e}}_1 = {\hat{e}}_2, \end{aligned}$$
(17)

has at least one solution \(E_l^*\in I_l\). In fact, as explained above, the heteroclinic tangency \((P_{E_l},\, T_{E_l},\, S_{E_l})\) is \((\gamma ,\, \gamma ^{1/2},\, \gamma ^{1/7})\)-controlled, so that the initial assumptions of Proposition 7.14 are automatically satisfied. Next consider the curves \({{\hat{v}}}^+(E):= {{\textbf {A}}}^{l^3}_E(\omega )\, {\hat{e}}_1\) and \({{\hat{v}}}^-(E):= {{\textbf {A}}}^{-l^3}_E(\sigma ^{2\,l^3+l}\omega )\, {\hat{e}}_2\). Assumption A1 holds because \(\omega \in \Theta _l^u\) and \(\sigma ^{2\,l^3+l}\omega \in \Theta _l^s\). Assumption A2 holds by Lemma 7.13.

The lower bounds in items 2 and 3 follow from item 1 of Proposition 7.11 and the fact that \(\omega \in {\mathcal {G}}_{l^3}({{\hat{e}}}_1, {{\hat{e}}}_2,\beta , \tau , 0)\). From items 1 and 3 of the said proposition together with conclusion 2) of Proposition A.3 we get the upper bounds in items 2 and 3.

Taking unit vectors \(w_1\in {{\textbf {A}}}_{(E^*_l)}^{l^3}(\omega )\, {{\hat{e}}}_1\) and \(w_2\in {{\textbf {A}}}_{(E^*_l)}^{l}(\sigma ^{l^3}\omega )\, {{\hat{w}}}_1\), by items 2 and 3 above,

$$\begin{aligned} \left\Vert {{\textbf {A}}}_{(E_l^*)}^{2l^3+l}(\omega )\, e_1\right\Vert&= \left\Vert {{\textbf {A}}}_{(E_l^*)}^{l^3}(\sigma ^{l^3+l}\omega )\, w_2\right\Vert \, \left\Vert {{\textbf {A}}}_{(E_l^*)}^{l}(\sigma ^{l^3} \omega )\, w_1\right\Vert \, \left\Vert {{\textbf {A}}}_{(E_l^*)}^{l^3}(\omega )\, e_1\right\Vert \\&\le e^{-(\lambda -\beta )\, l^3}\, e^{C\, l} \, e^{(\lambda +\beta ) l^3} \le e^{2 \, \beta \, l^3 + C\, l} \le e^{3\beta l^3} , \end{aligned}$$

which proves item 4.
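The last inequality in the display above uses that l is large. Explicitly, with C denoting the constant in the factor \(e^{C\, l}\),

$$\begin{aligned} 2\, \beta \, l^3 + C\, l \le 3\, \beta \, l^3 \quad \Longleftrightarrow \quad C\, l \le \beta \, l^3 \quad \Longleftrightarrow \quad l \ge \sqrt{C/\beta }, \end{aligned}$$

so the bound in item 4 holds for every \(l\in {\mathbb {N}}'\) with \(l\ge \sqrt{C/\beta }\).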

To finish, using the inequalities in (15) and (16) we have

$$\begin{aligned} {\tilde{\mu }}({\mathcal {M}}_l) = {\tilde{\mu }}(\Theta _l^u)\, {\tilde{\mu }}(\Theta _l^s)\, {\tilde{\mu }}([0; \tau _l]) \ge (1/2 - \beta )^2\, e^{-l\, (H(\mu ) + \beta )}. \end{aligned}$$

This completes the proof of the lemma. \(\square \)

Now we can give a lower bound for the measure of the set of matchings. Recall the notation of Sect. 6.

Corollary 8.2

For all large \(l\in {\mathbb {N}}'\), if \(n_l:=4(2l^3 + l)\) then

$$\begin{aligned} \tilde{\nu }\left( \Sigma (e^{-l^3(\lambda - 4\beta )},\, n_l,\,I_l) \right) \ge \frac{1}{4}(1/2 - \beta )^2\, e^{-l\, (H(\mu ) + \beta )}. \end{aligned}$$

Proof

Let \(\pi _0:= \pi |_{\Sigma _0}:\Sigma _0\rightarrow \Omega \) be the conjugation given by Lemma 5.2. We claim that \(\pi _0^{-1}({\mathcal {M}}_l) \subset \Sigma (e^{-l^3(\lambda - 4\beta )},\, n_l,\,I_l)\). Indeed, by Lemma 8.1, if \(\pi _0(\zeta )\in {\mathcal {M}}_l\), there exists \(E_l^*\in I_l\) such that

$$\begin{aligned} {{\textbf {A}}}_{E_l^*}^{n_l}(\zeta )\, {\hat{e}}_1 = {{\textbf {A}}}^{2l^3 + l}_{(E_l^*)}(\pi _0(\zeta ))\, {\hat{e}}_1 = {\hat{e}}_2. \end{aligned}$$

Moreover,

$$\begin{aligned} \tau _{n_l}(\zeta , E_l^*)&\le \frac{ \left\Vert {{\textbf {A}}}_{E_l^*}^{n_l}(\zeta )\, e_1 \right\Vert }{ \left\Vert {{\textbf {A}}}_{E_l^*}^{4\,l^3}(\zeta )\, e_1 \right\Vert } = \frac{ \left\Vert {{\textbf {A}}}_{E_l^*}^{2 l^3 + l}(\pi _0(\zeta ))\, e_1 \right\Vert }{ \left\Vert {{\textbf {A}}}_{E_l^*}^{l^3}(\pi _0(\zeta ))\, e_1 \right\Vert } \le e^{3\,\beta \, l^3 - (\lambda - \beta )\,l^3} = e^{-l^3(\lambda - 4\beta )}. \end{aligned}$$

This proves that any \(\zeta \in \pi _0^{-1}({\mathcal {M}}_l)\) is a \((e^{-l^3\,(\lambda - 4\beta )}, n_l, E^*_l)\)-matching for some \(E_l^*\in I_l\).

To finish, since \(4\tilde{\nu }\) is the normalization of \(\tilde{\nu }\) restricted to \(\Sigma _0\),

$$\begin{aligned} \tilde{\nu }\left( \Sigma (e^{-l^3(\lambda - 4\beta )},\, n_l,\,I_l) \right)&\ge \tilde{\nu }(\pi _0^{-1}({\mathcal {M}}_l)) = \frac{1}{4}{\tilde{\mu }}({\mathcal {M}}_l)\\&\ge \frac{1}{4}(1/2 - \beta )^2\, e^{-l\, (H(\mu ) + \beta )}. \end{aligned}$$

This completes the proof of the corollary. \(\square \)

9 Proof of the Results

We keep the notation of the previous section.

9.1 Proof of theorem A

Take \(\alpha > \frac{H(\mu )}{L(\mu )}\) and choose \(\delta >0\) such that \(\lambda := {\text {min}}_{|E|\le \delta } L(\mu _E)\) satisfies \(\lambda \, \alpha -H(\mu )>0\). Then take \(0<\beta <\lambda \) small enough so that \(\lambda \, \alpha -H(\mu )>2\,\beta +\alpha \, \beta \), which implies that

$$\begin{aligned} -H(\mu )-\beta + \alpha \, (\lambda -\beta )>\beta . \end{aligned}$$
(18)
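Inequality (18) is a rearrangement of the condition imposed on \(\beta \). Spelled out,

$$\begin{aligned} -H(\mu )-\beta + \alpha \, (\lambda -\beta ) = (\lambda \, \alpha - H(\mu )) - \beta - \alpha \, \beta > (2\,\beta + \alpha \, \beta ) - \beta - \alpha \, \beta = \beta . \end{aligned}$$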

By Proposition 4.1, to prove Theorem A it is enough to prove that the integrated density of states \({\mathcal {N}}\) is not \(\alpha \)-Hölder continuous. By Corollaries 6.4 and 8.2, writing \(\delta _l: = e^{-l^3(\lambda - 4\beta )}\),

$$\begin{aligned} \Delta _{I_l+ [-\delta _l,\, \delta _l]}{\mathcal {N}}\ge \frac{1}{n_l}\tilde{\nu }\left( \Sigma (\delta _l,\, n_l,\, I_l) \right) \ge \frac{1}{4n_l}(1/2-\beta )^2\, e^{-l(H(\mu ) + \beta )}. \end{aligned}$$

Thus by inequality (18),

$$\begin{aligned} \frac{ \Delta _{I_l+ [-\delta _l,\, \delta _l]}{\mathcal {N}}}{ |I_l+ [-\delta _l,\, \delta _l]|^{\alpha } } \gtrsim e^{ l\left( -H(\mu ) - \beta + \alpha (\lambda - \beta ) \right) } \gtrsim e^{\beta \, l}. \end{aligned}$$

Taking \(l\in {\mathbb {N}}'\), \(l\rightarrow \infty \) we conclude that \({\mathcal {N}}\) cannot be \(\alpha \)-Hölder continuous. \(\square \)
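The growth of the Hölder quotient can be checked numerically. The Python sketch below evaluates the logarithm of the lower bound \(\frac{1}{4n_l}(1/2-\beta )^2 e^{-l(H(\mu )+\beta )}\) divided by \(|I_l+[-\delta _l,\delta _l]|^{\alpha }\), where the interval has length \(2(Ce^{-l(\lambda -\beta )}+\delta _l)\). All numerical values (H = 0.5, \(\lambda \) = 1, \(\alpha \) = 0.8, \(\beta \) = 0.05, C = 1) are hypothetical and chosen only so that the hypotheses of this subsection hold; the function name is ours. It shows that the polynomial factor \(1/n_l\) is eventually dominated by the exponential one.

```python
import math


def log_holder_quotient_bound(l, H, lam, alpha, beta, C=1.0):
    """Log of the lower bound for Delta N / |I_l + [-delta_l, delta_l]|^alpha."""
    n_l = 4 * (2 * l**3 + l)
    # log of (1 / (4 n_l)) * (1/2 - beta)^2 * exp(-l (H + beta))
    log_numerator = (-math.log(4 * n_l)
                     + 2 * math.log(0.5 - beta)
                     - l * (H + beta))
    # Interval length is 2 * (C e^{-l(lam-beta)} + delta_l) with
    # delta_l = e^{-l^3 (lam - 4 beta)}; computed in logs to avoid underflow.
    rel = math.exp(l * (lam - beta) - l**3 * (lam - 4 * beta)) / C
    log_length = math.log(2 * C) - l * (lam - beta) + math.log1p(rel)
    return log_numerator - alpha * log_length


# Hypothetical parameters, for illustration only.
H, LAM, ALPHA, BETA = 0.5, 1.0, 0.8, 0.05
for l in (100, 200, 400):
    print(l, log_holder_quotient_bound(l, H, LAM, ALPHA, BETA), BETA * l)
```

With these sample values the exponent \(-H-\beta +\alpha (\lambda -\beta )\) exceeds \(\beta \), as in (18), and the computed logarithm overtakes \(\beta \, l\) once l is moderately large.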

9.2 Proof of corollary A

By [13, Theorem 4.1], if \(\mu \) is not uniformly hyperbolic, then the semigroup generated by \({\text {supp}}\mu \) must contain a parabolic or elliptic matrix. In either case, by Proposition 7.8 and Proposition 7.12, we can approximate \(\mu \) by measures with finite support admitting tangencies which are irreducible. The result follows by continuity of the quotient \(E\mapsto \frac{H(\mu )}{L(\mu _E)}\). \(\square \)

9.3 Proof of corollary B

By Johnson’s theorem [29], if \(E_0\) is an energy in the almost sure spectrum of the Schrödinger operator, then the associated Schrödinger cocycle \(A_{E_0}\) is not uniformly hyperbolic. Therefore, we can again apply Propositions 7.8 and 7.12 to find energies \(E\) close to \(E_0\) such that the cocycle \(A_{E}\) is irreducible and has heteroclinic tangencies. The result follows by continuity of the quotient \(E\mapsto \frac{H(\mu )}{L(\mu _E)}\). \(\square \)