1 Context and main result

The SSH model (Su–Schrieffer–Heeger [23]) is the prototype of a chiral topological insulator in dimension one. Here, a slightly generalized and disordered or dirty version of it will be considered. In such systems, one can associate to the Fermi projection a non-commutative winding number as a topological invariant, provided that the Fermi level lies in a spectral region of Anderson localization. If one modifies the parameters of the system (such as the strength of the disorder in the hopping and in the on-site masses, see below), the topological invariant may change, and the transition points make up the so-called topological phase boundary. It is known that there is no dynamical Anderson localization on the phase boundary, even for higher-dimensional models (Section 6.6 in [18] and Section 5.5 in [21]). In the disordered SSH model, one can determine the phase boundary as those points at which the (smallest non-negative) Lyapunov exponent at energy \(E_c=0\) vanishes [9, 10, 15]. Away from these points, one can prove Anderson localization throughout the whole spectrum [22]. The novel contribution of this work is that the density of states (DOS) vanishes at zero energy for parameters away from the phase boundary (this is often referred to as a pseudo-gap), while it has a characteristic divergence at the phase boundary.

To formulate the main result, let us write out the generalized dirty SSH Hamiltonian H considered here, which extends the model studied in [15]. Over each site of the lattice \({\mathbb {Z}}\), the system has a quantum cavity with 2L orbitals so that the total Hilbert space is \(\ell ^2({\mathbb {Z}},{\mathbb {C}}^{2L})\). On each site acts a chiral symmetry operator \(J=\textrm{diag}({\textbf{1}}_L,-{\textbf{1}}_L)\) which naturally extends to a symmetry on \(\ell ^2({\mathbb {Z}},{\mathbb {C}}^{2L})\). Within the cavity over site n, the Hamiltonian is off-diagonal in the grading of J, with entry given by a random invertible matrix \(M_{n}\). Furthermore, neighboring sites are connected by rank-one operators \(B_n=v_{n-1}{\hat{v}}_{n}^*\) with unit vectors \(v_{n-1},{\hat{v}}_{n}\in {\mathbb {C}}^L\) and random couplings \(t_n\). Hence the action of H on \(\psi =(\psi _{n})_{n\in {\mathbb {Z}}}\in \ell ^2({\mathbb {Z}},{\mathbb {C}}^{2L})\) is given by

$$\begin{aligned} (H\psi )_{n} \;=\; -\,t_{n+1} \begin{pmatrix} 0 &{}\quad {B_{n+1}} \\ 0 &{}\quad 0 \end{pmatrix} \psi _{n+1} \,+\, \begin{pmatrix} 0 &{} \quad \!\!\! M_{n} \\ M_{n}^* &{}\quad 0 \end{pmatrix} \psi _{n} \,-\, \overline{t_{n}} \begin{pmatrix} 0 &{}\quad 0 \\ {B_{n}^*} &{}\quad 0 \end{pmatrix} \psi _{n-1}. \end{aligned}$$
(1)

Here, \(t_{n}\in {\mathbb {C}}\setminus \{0\}\) with complex conjugate \(\overline{t_n}\). The quantities \(t_n\), \(M_{n}\) and \(B_{n}\) are random variables of the form \(t_{n}=e^{\imath \phi _n}(1+\lambda \omega _n)\), \(M_{n}=\frac{1}{2}(m\,{\textbf{1}}_L+\mu \omega '_n)\), \(B_{n}=v_{n-1}{\hat{v}}_{n}^{*}\), where \(\phi _n\in [0,2\pi )\), \(\omega _n\in [-\frac{1}{2},\frac{1}{2}]\), \(\omega '_n\in {\mathbb {C}}^{L\times L}\), and \(v_n,{\hat{v}}_n \in {\mathbb {C}}^{L}\) with \( \Vert v_{n}\Vert =\Vert {\hat{v}}_{n}\Vert =1\). Moreover, \(\lambda \) and \(\mu \) are coupling constants of the randomness. The phases \(\phi _n\) may be chosen deterministically. To model the randomness, let us set \(\sigma _n=(\omega _n,\omega '_n,v_n,{\hat{v}}_n)\), on which the following assumptions will be made.

Main hypothesis The random variables \((\sigma _{n})_{n\in {\mathbb {Z}}}\) are i.i.d. with compactly supported distribution. The parameter \(m\mu ^{-1}\) is sufficiently large, so that there is a uniform (almost sure) lower bound on \(M_{n}^{*} M_{n}\). Finally, \(|{{\hat{v}}}_{n}^{*} M_{n}^{-1} v_{n}t_{n}|\) is a random variable with positive variance and a uniform (almost sure) lower bound.

In the following, we will also simply write \(\sigma =(\omega _\sigma ,\omega '_\sigma ,v_\sigma ,{\hat{v}}_\sigma )\) for a random variable with the above distribution. Note that then \((t_{\sigma },M_{\sigma },v_{\sigma },{{\hat{v}}}_{\sigma })\) can also be considered as a random vector depending on \(\sigma \). Let us briefly discuss the assumptions on the model (1). The hypothesis that \(B_n\) in (1) is of rank one is essential for the techniques of the present paper. In combination with the lower bound on \(|{{\hat{v}}}_{n}^{*} M_{n}^{-1} v_{n}t_{n}|\), which assures a uniform transfer through each cavity, it allows one to work with reduced \(2\times 2\) transfer matrices. Thus, a one-dimensional rotation number calculation for the density of states is feasible via scalar Prüfer phases. The paper then analyzes the associated random dynamics on the unit circle. While it is known how to deal with block Jacobi operators, hence allowing for higher rank r of the \(B_n\) [20], the random dynamics is then given by the matrix Möbius action on the higher-dimensional unitary group U(r) and is thus considerably more difficult to control.
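To make the block structure of (1) concrete, the following sketch builds the finite-volume restriction \(H_N\) for \(L=1\) with illustrative real parameters (the concrete values of the couplings \(t_n\) and the scalar masses are assumptions for demonstration only, with all phases \(\phi _n=0\)) and verifies the chiral symmetry \(JHJ=-H\) together with self-adjointness:

```python
import random

def build_H(N, ts, ms):
    """Finite-volume version of (1) for L=1: index 2n is the upper
    (J=+1) orbital of site n and 2n+1 the lower (J=-1) orbital."""
    H = [[0.0] * (2 * N) for _ in range(2 * N)]
    for n in range(N):
        H[2 * n][2 * n + 1] = ms[n]        # on-site block entry M_n
        H[2 * n + 1][2 * n] = ms[n]        # and its adjoint M_n^*
    for n in range(N - 1):
        H[2 * n][2 * n + 3] = -ts[n + 1]   # coupling -t_{n+1} B_{n+1}
        H[2 * n + 3][2 * n] = -ts[n + 1]   # and its adjoint
    return H

N = 5
rng = random.Random(0)
ts = [1.0 + 0.5 * (rng.random() - 0.5) for _ in range(N)]   # t_n = 1 + lambda*omega_n
ms = [1.0 + 0.3 * (rng.random() - 0.5) for _ in range(N)]   # scalar masses M_n
H = build_H(N, ts, ms)
J = [1 if i % 2 == 0 else -1 for i in range(2 * N)]         # chiral grading
chiral = max(abs(J[i] * H[i][j] * J[j] + H[i][j])
             for i in range(2 * N) for j in range(2 * N))   # = 0 iff JHJ = -H
symmetric = max(abs(H[i][j] - H[j][i])
                for i in range(2 * N) for j in range(2 * N))
```

Every non-vanishing entry of H connects the two chiral sectors, which is exactly why conjugation by J flips the sign of H.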

Just as for any random Schrödinger operator, the generalized SSH model has a well-defined integrated density of states (IDOS), namely the non-decreasing function \(E\in {\mathbb {R}}\mapsto {\mathcal {N}}(E)\) given by

$$\begin{aligned} {\mathcal {N}}(E) :=\; \lim _{N\rightarrow \infty }\frac{1}{N}\;\frac{1}{2L}\;\#\{\text{eigenvalues of } H_N\,\le \,E\}, \end{aligned}$$

where \(H_N\) is the restriction of H to \(\ell ^2(\{1,\ldots ,N\},{\mathbb {C}}^{2L})\). The limit is known to exist almost surely [17]. Furthermore, such a one-dimensional random model has a (smallest non-negative) Lyapunov exponent \(\gamma (E)\ge 0\) for every energy \(E\in {\mathbb {R}}\). Further down, this will be introduced more carefully, and it will also be shown that at the critical energy one has \(\gamma (0)=|{\mathbb {E}}(\log (\kappa ))|\), where \(\kappa _{\sigma }\) is the positive random variable defined by

$$\begin{aligned} \kappa _{\sigma } \;=\; { \frac{1}{|{{{\hat{v}}}_{\sigma }^* M_\sigma ^{-1} v_{\sigma } t_\sigma |}}. } \end{aligned}$$
(2)

Let us now introduce the set \({\mathcal {P}}\) of parameters \(\lambda ,\mu \) for which the Lyapunov exponent at the center of the band vanishes:

$$\begin{aligned} {\mathcal {P}} \;=\; \big \{ (\lambda ,\mu )\in {\mathbb {R}}^2:\;\gamma (0)=|{\mathbb {E}}(\log (\kappa ))|=0 \big \}. \end{aligned}$$

All topological phase transitions of the generalized SSH model lie on \({\mathcal {P}}\) [9, 10, 15]. The main result of this work now states that one can read off from the IDOS whether a model lies on \({\mathcal {P}}\) or not.

Theorem 1

Let \((\lambda ,\mu )\) be such that the main hypothesis is fulfilled. For \((\lambda ,\mu )\not \in {\mathcal {P}}\) lying off the phase boundary, i.e., \({\mathbb {E}}(\log (\kappa ))\ne 0\), the IDOS of the dirty SSH model has a pseudo-gap at 0 in the sense that

$$\begin{aligned} \lim _{E \rightarrow 0}\, \frac{\log \big |\,{\mathcal {N}}(E)\,-\,{\mathcal {N}}(0)\,\big |}{\log ({|E|})} \;=\; \nu , \end{aligned}$$
(3)

where \(\nu >0\) is determined as the unique positive solution of \({\mathbb {E}}(\kappa ^{\nu })=1\) if \({\mathbb {E}} (\log (\kappa ))<0\), and as the unique positive solution of \({\mathbb {E}}(\kappa ^{-\nu })=1\) otherwise. On the other hand, for \((\lambda ,\mu )\in {\mathcal {P}}\) on the phase boundary, i.e., \({\mathbb {E}}(\log (\kappa ))=0\), the DOS has a characteristic divergence at 0, specified in terms of some constant C by

$$\begin{aligned} \left| \,{\mathcal {N}}(E)\,-\,{\mathcal {N}}(0)\,-\,\tfrac{1}{4L}\,{\mathbb {E}}\big (\big (\log (\kappa )\big )^2\big )\,\big (\log ({|E|})\big )^{-2}\,\right| \;\le \; C \;|\log ({|E|})|^{-3}. \end{aligned}$$
(4)

Let us compare Theorem 1 with the literature on the random hopping model which, as will be explained in Sect. 2, is essentially the particular case \(L=1\) of the generalized SSH model. For the random hopping model, the upper bound \(|{\mathcal {N}}(E)-{\mathcal {N}}(0)| \le C_\delta |E|^{\nu -\delta }\) was proved in [2] for all \(\delta >0\). Hence (3) also provides the corresponding lower bound. For the random hopping model and points on \({\mathcal {P}}\), the characteristic divergence (4) is referred to as Dyson’s spike, due to his work [6] showing this for a particular distribution of the random hopping terms. Apart from Dyson’s work, there are several non-rigorous works on both regimes covered by Theorem 1; Section V.E in the review [8] contains relevant references. The behavior of the integrated density of states as in (4) was more recently proved rigorously to hold under even weaker assumptions by Kotowski and Virág [13] (no independence is assumed in their work, merely a sufficiently rapid correlation decay). However, no explicit error bound of order \(|\log (E)|^{-3}\) was provided, merely a bound of order \(o(E) \log (E)^{-2}\). Here we place all of these results in the joint context of topological phases and provide considerable technical improvements on both [13] and [2].

In order to justify this last statement, let us provide a more detailed technical comparison with the work of Kotowski and Virág [13]. First of all, both works analyze the perturbation of the rotation number of the induced dynamics on projective space \(P({\mathbb {R}}^2)\cong {\mathbb {S}}^1\) driven by the transfer matrices. The unperturbed dynamics at \(E=0\) leaves two critical points, and the semicircles in between, invariant. The energy-dependent perturbation adds some rotation around the critical points (all in the same direction), and one needs to analyze the number of passages by the critical points into the next semicircle. In the long-time limit, one then obtains the rotation number, which is equivalent to the density of states. In [13], the free dynamics (which does not rotate) is subtracted by conjugations which involve products of the \(\kappa \)’s, and these products can push the \({\mathcal {O}}(E)\) rotation induced by the perturbation. The process is then compared to a family of different dynamics (with an additional parameter \(\delta \)) that are partially slower/faster (under certain conditions, and for a number of steps \(n<\delta (\tfrac{1}{{|E|}})^{\delta /4}\) not too large). Then the number of crossings into the ‘next’ semicircle is shown to be approximately equal to the number of times at which the log-transformed free dynamics \(\sum _{l=1}^n \log (\kappa _l)\) makes jumps of order \(|\log ({|E|})|\). Tuning all the parameters, one arrives at a scaling limit for \(E=e^{-\sqrt{n}} \rightarrow 0\) and \(n \rightarrow \infty \). Finally, using probabilistic techniques (cf. [13, Theorem 3.10]), the authors obtain bounds on the rotation number for small E, meaning they can keep E fixed and let \(n \rightarrow \infty \), namely a statement similar to (4), but only with an error o(E).

In contradistinction, in this work the dynamics is analyzed directly, without conjugation of the free dynamics. It is then compared to suitable slower and faster dynamics which allow one to estimate the crossings at fixed E. In essence, the slower dynamics drops the \({\mathcal {O}}(E)\) rotation except in the region close to the critical points, and the faster dynamics essentially replaces the \({\mathcal {O}}(E)\) perturbations by an o(E) drift going forward. Formulating the rotation number in terms of an expectation of a certain stopping time, one can use the optional stopping theorem to obtain the claimed estimates. The present approach is more direct and considerably less technical than the one of [13]. Moreover, the constructions work immediately for both cases treated in Theorem 1; only the constructed martingales and the subsequent usage of the optional stopping theorem are of a different nature, leading to the different behavior at \(E=0\). It is hard to see how to modify the techniques of [13] for the case \({\mathbb {E}}(\log (\kappa ))\ne 0\) (which still remains possible, of course).

To conclude this introductory section, let us mention that we are currently investigating several interesting open questions on the generalized SSH model. First, one would like to have a controlled perturbation theory for the Lyapunov exponent in the vicinity of the critical energy for models on the transition (as in [3, 12, 17]). This depends on a good understanding of the Furstenberg measure. As illustrated numerically in Fig. 1, the Lyapunov exponent (or inverse localization length) has a similar singular behavior to the IDOS, as predicted by theoretical physicists (see [8]). Second, we expect that all these states at energies with large localization length (as exhibited in Theorem 1) lead to a quantitative lower bound on the quantum dynamics (going beyond the statements of, e.g., [18, 21], showing that models at the topological phase boundaries cannot be dynamically Anderson localized). The mechanism behind this quantitative delocalization phenomenon is similar to that in the random dimer model [4], the random polymer model [12] or the random Kronig–Penney model [3], but a proof is much more subtle due to the presence of the singularities of the DOS and the Lyapunov exponent. Another question concerns the fate of the (likely enhanced) area law in these models [16]. Let us note that the nature of the level statistics near the critical energy for models on the transition was already determined in [13]. Finally, it is a challenging open issue to analyze both the IDOS and the Lyapunov exponents for the model (1) when the \(B_n\) are of higher and possibly varying rank.

2 Transfer matrices and critical energies

The proof of Theorem 1 uses the transfer matrix formalism for the study of quasi-one-dimensional Jacobi operators. Clearly, the SSH Hamiltonian (1) is such a block Jacobi matrix with \(2L\times 2L\) block entries on every site. However, the off-diagonal entries are not invertible, so that one cannot define the \(4L\times 4L\) transfer matrices in the usual form (which involves working with the inverse of the off-diagonal terms). One rather has to pass to the so-called reduced transfer matrices [5, 19, 20, 24]. In the present situation, the matrices \(B_n\) are of rank one, and therefore the reduced transfer matrices will be of size \(2\times 2\), satisfying

$$\begin{aligned} T^*\,I\,T \;=\; I, \qquad I :=\; \begin{pmatrix} 0 &{} -1 \\ 1 &{} 0 \end{pmatrix} . \end{aligned}$$
(5)

Then the ranges of the lower and upper off-diagonal entries of the block Jacobi matrix are the one-dimensional spaces \({\mathcal {H}}_n^-=\textrm{span}\{\left( {\begin{array}{c}0\\ {\hat{v}}_n\end{array}}\right) \}\) and \({\mathcal {H}}^+_n=\textrm{span}\{\left( {\begin{array}{c}v_n\\ 0\end{array}}\right) \}\) in \({\mathbb {C}}^{2L}\), respectively. These two spaces are orthogonal, as required in [20]. The relevant part of the resolvent of the diagonal part is

$$\begin{aligned} {\begin{pmatrix} 0 &{} v_n \\ {\hat{v}}_n &{} 0 \end{pmatrix}^* \left( E\,{\textbf{1}}\,-\, \begin{pmatrix} 0 &{} \!\! M_{n} \\ M_{n}^* &{} 0 \end{pmatrix} \right) ^{-1} \begin{pmatrix} 0 &{} v_n \\ {\hat{v}}_n &{} 0 \end{pmatrix} \;=\; \begin{pmatrix} G^{E,-,-}_n &{} G^{E,-,+}_n \\ G^{E,+,-}_n &{} G^{E,+,+}_n \end{pmatrix} ,} \end{aligned}$$

by definition of the four scalar entries on the r.h.s. An explicit evaluation of the inverse shows

$$\begin{aligned} {\begin{pmatrix} G^{E,-,-}_n &{} G^{E,-,+}_n \\ G^{E,+,-}_n &{} G^{E,+,+}_n \end{pmatrix} \;=\; \begin{pmatrix} E\,{{\hat{v}}}_{n}^{*} (E^2{\textbf{1}}-M_n^*M_n)^{-1} {{\hat{v}}}_n &{} {{\hat{v}}}_{n}^{*} M_n^*(E^2{\textbf{1}}-M_nM_n^*)^{-1} v_n \\ v_{n}^{*} M_n(E^2{\textbf{1}}-M_n^*M_n)^{-1} {\hat{v}}_n &{} E\,v_{n}^{*} (E^2{\textbf{1}}-M_nM_n^*)^{-1} v_n \end{pmatrix} . } \end{aligned}$$
(6)

Note that the main hypothesis assures that, for E sufficiently small, \(M_nM_n^*-E^2\textbf{1}\) and \(M_n^*M_n-E^2\textbf{1}\) have uniform lower and upper bounds and that the off-diagonal entry \(G^{E,-,+}_n\) is non-vanishing. Therefore, the reduced transfer matrices, given in (15) or (17) of [20], are

$$\begin{aligned} T^E_n \;=\; -\, \begin{pmatrix} (G^{E,-,+}_n)^{-1} &{} (G^{E,-,+}_n)^{-1}G^{E,-,-}_n\\ -G^{E,+,+}_n(G^{E,-,+}_n)^{-1}&{} G^{E,+,-}_n-G^{E,+,+}_n ( G^{E,-,+}_n)^{-1}G^{E,-,-}_n \end{pmatrix} \begin{pmatrix} {(t_{n})^{-1}} &{} 0\\ 0 &{} \overline{t_{n}} \end{pmatrix} . \end{aligned}$$
(7)

Let us note that this SSH model also fits into the scheme of one-channel operators and the transfer matrices above correspond exactly to (1.10) in [19]. As already stressed above, one then knows that the \(T^E_n\) satisfy (5) and are analytic in E in a small ball around 0. This implies that \(T_n^{E}\) is of the form \(T^{E}_{n}=e^{\imath \varphi (E,n)} {\widetilde{T}}^{E}_{n}\) with \({\widetilde{T}}^{E}_{n} \in \textrm{SL}(2,{\mathbb {R}})\) where the \({\widetilde{T}}^{E}_{n}\) may be chosen analytically in E. For the rotation number calculation, one may simply consider the products of the \({\widetilde{T}}^{E}_{n}\) and ignore the products of the phases (of course they show up for the eigenvectors). Alternatively, one may eliminate the phases by an energy dependent gauge transformation. In the following, \(T^{E}_{n}\) will be assumed to be in \(\textrm{SL}(2,{\mathbb {R}})\). Using (6) in (7) and expanding around \(E_{c}=0\), one finds that

$$\begin{aligned} T^{E}_{n} \;=\; \pm \begin{pmatrix} \kappa _{n} &{} -E \,\kappa _{n} |t_{n}|^{2} \, {{\hat{v}}}_{n}^{*} (M_{n}^{*}M_{n})^{-1} {{\hat{v}}}_{n} \\ E\,\kappa _{n} v_{n}^{*}(M_{n}M_{n}^{*})^{-1}v_{n} &{} (\kappa _{n})^{-1} \end{pmatrix}\,+\,{\mathcal {O}}(E^{2}) , \end{aligned}$$
(8)

where \(\kappa _n=\kappa _{\sigma _n}\) is given by (2) and the irrelevant sign stems from the phase factors. Hence \(E_c=0\) is indeed a hyperbolic critical energy in the sense of [2], namely all transfer matrices commute and some of them are hyperbolic (provided one of the distributions is non-trivial, so that \(\kappa _n\) is not identically equal to 1).
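Indeed, at \(E=0\) the matrices (8) reduce to \(\pm \,\textrm{diag}(\kappa _n,\kappa _n^{-1})\). A short check with illustrative values of \(\kappa \) (an assumption for demonstration only) confirms both properties: the matrices commute, and their traces exceed 2 in absolute value whenever \(\kappa \ne 1\):

```python
def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def D(kappa):
    """Transfer matrix at the critical energy E_c = 0 (up to sign)."""
    return ((kappa, 0.0), (0.0, 1.0 / kappa))

A, B = D(2.0), D(0.7)                    # illustrative kappa values
AB, BA = matmul(A, B), matmul(B, A)
commutator = max(abs(AB[i][j] - BA[i][j]) for i in range(2) for j in range(2))
traces = [abs(M[0][0] + M[1][1]) for M in (A, B)]   # kappa + 1/kappa > 2
```

The trace condition \(|\kappa +\kappa ^{-1}|>2\) for \(\kappa \ne 1\) is exactly the hyperbolicity of the critical energy.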

Let us briefly specify how to obtain the model studied in [15], as well as the random hopping model from [2, 6, 13]. One chooses \(L=1\) and \(B_{{n}}=1\), and then the matrices \(M_n\) are scalars \(m_n\). If one denotes \(\lambda =W_1\), \(\mu =W_2\), and lets \(\omega _n\) and \(\omega '_n\) have a uniform distribution on \([-\frac{1}{2},\frac{1}{2}]\), one obtains, after conjugation with a suitable Cayley transform, exactly the random Hamiltonian of [15]. For these particular distributions, the Lyapunov exponent at \(E_c=0\) can be calculated explicitly [15], but the results of this paper do not depend on these particular choices. Then, keeping also the quadratic terms in E, (8) reduces to

$$\begin{aligned} T^E_n \;=\; \begin{pmatrix} \frac{m_{n}}{t_{n}} &{} 0\\ 0 &{} \frac{t_{n}}{m_{n}} \end{pmatrix} \,+\,E \begin{pmatrix} 0 &{} -\,\frac{t_{n}}{m_{n}}\\ \,\frac{1}{m_{n}\,t_{n}} &{} 0 \end{pmatrix} \,-\,E^2 \begin{pmatrix} \frac{1}{m_{n}\,t_{n}}&{} 0\\ 0 &{} 0 \end{pmatrix} . \end{aligned}$$

This is actually also connected to Dyson’s random hopping model studied in [6, 13]. More precisely, set \({\hat{t}}_{2n}=t_n\) and \({\hat{t}}_{2n+1}=m_n\) and suppose that they are identically distributed; then one can check that

$$\begin{aligned} T^E_n \;=\; \begin{pmatrix} -\,E\,\tfrac{1}{{\hat{t}}_{2n+1}} &{} -{\hat{t}}_{2n+1} \\ \tfrac{1}{{\hat{t}}_{2n+1}} &{} 0 \end{pmatrix} \begin{pmatrix} -\,E\,\tfrac{1}{{\hat{t}}_{2n}} &{} -{\hat{t}}_{2n} \\ \tfrac{1}{{\hat{t}}_{2n}} &{} 0 \end{pmatrix} , \end{aligned}$$

which is indeed the two-step transfer matrix of a random hopping Hamiltonian on \(\ell ^2({\mathbb {Z}})\) given by

$$\begin{aligned} (H\psi )_n \;=\; -\,{\hat{t}}_{n+1} \psi _{n+1}\,-\,{\hat{t}}_{n}\psi _{n-1}, \qquad \psi =(\psi _n)_{n\in {\mathbb {Z}}}\in \ell ^2({\mathbb {Z}}). \end{aligned}$$

Hence, both the dirty SSH Hamiltonian from [15] and the random hopping model are particular cases of the generalized SSH Hamiltonian (1).
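The displayed factorization can be checked numerically: with illustrative values for \({\hat{t}}_{2n}\), \({\hat{t}}_{2n+1}\) and E (assumptions for demonstration only), the product of the two one-step hopping transfer matrices reproduces the matrix \(T^E_n\) displayed above, up to an overall sign that is irrelevant for the projective dynamics, and has determinant one:

```python
def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def hop(t, E):
    """One-step transfer matrix of the random hopping model."""
    return ((-E / t, -t), (1.0 / t, 0.0))

t0, t1, E = 0.8, 1.3, 0.05     # hat-t_{2n}, hat-t_{2n+1} and energy (illustrative)
m, t = t1, t0                  # identifications hat-t_{2n} = t_n, hat-t_{2n+1} = m_n
two_step = matmul(hop(t1, E), hop(t0, E))
T = ((m / t - E * E / (m * t), -E * t / m),
     (E / (m * t), t / m))     # the expansion of T^E_n displayed above
det = two_step[0][0] * two_step[1][1] - two_step[0][1] * two_step[1][0]
mismatch = max(abs(two_step[i][j] + T[i][j])   # two_step = -T entrywise
               for i in range(2) for j in range(2))
```

The sign discrepancy reflects the \(\pm \) in (8) stemming from the phase factors.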

3 Prüfer phase formalism near critical energy

In the theory of products of random \(2\times 2\) matrices, the associated Lyapunov exponent can be accessed via the random action of the matrices on projective space, which in turn is bijectively mapped to the unit circle, making up the so-called Prüfer phases. By a Cayley transform, they are mapped to real numbers which are then called Dyson–Schmidt variables. In both cases, the action is implemented by a Möbius transformation. This way of approaching the Lyapunov exponent is particularly efficient for perturbative expansions [14, 17]. Furthermore, if the random matrices are the transfer matrices of a given one-dimensional random operator and the Prüfer phases are suitably lifted to \({\mathbb {R}}\), then oscillation theory also allows one to extract the DOS from the Prüfer phases, and again this is a good way to tackle perturbative problems.

Traditionally, perturbation theory is done in a coupling constant of the randomness, corresponding to a weak coupling limit of the randomness (e.g., in the one-dimensional Anderson model). However, there are other situations where the perturbative parameter is the energy distance to some critical energy, and is hence intrinsic to the model rather than an external parameter. The first example of this type is the random dimer model [4] and its generalization, the random polymer model [12]. In these models, there exists a so-called critical energy at which all (random) transfer matrices commute and, moreover, the spectra of all these matrices lie on the unit circle, so that they can be simultaneously diagonalized into (random) rotations. Due to the latter fact, a critical energy of this type is called elliptic. On the other hand, in a random Kronig–Penney model there can be a critical energy at which the transfer matrices are all similar to a Jordan block [3]; the critical energy is then called parabolic. Finally, it was pointed out in [2] that the random hopping model and the SSH model have a hyperbolic critical energy with transfer matrices having their spectra off the unit circle. Sections 5 and 6 treat the case in which the Lyapunov exponent is non-vanishing at the critical energy, which is then called unbalanced. Section 7 then concerns the so-called balanced case with a vanishing Lyapunov exponent.

In order to cover other possible applications of hyperbolic critical energies and to stress structural aspects, let us consider the same set-up as in [2]. Suppose \((\Sigma ,\textbf{p})\) is a compact probability space and \(\sigma \in \Sigma \mapsto T^{E_c+\epsilon }_\sigma \in \text{ SL }(2,{\mathbb {R}})\) a family of transfer matrices over polymer blocks of length \(L_\sigma \in {\mathbb {N}}\) which is of the form

$$\begin{aligned} T^{E_c+\epsilon }_{\sigma } \;=\; \pm \left[ {\textbf{1}}\,+\,a_\sigma \epsilon \begin{pmatrix} 0 &{} -1 \\ 1 &{} 0 \end{pmatrix} \,+\,b_\sigma \epsilon \begin{pmatrix} 0 &{} 1 \\ 1 &{} 0 \end{pmatrix} \,+\,c_\sigma \epsilon \begin{pmatrix} 1 &{} 0 \\ 0 &{} -1 \end{pmatrix} \,+\,{\mathcal {O}}(\epsilon ^2)\right] D_{\kappa _{\sigma }}. \end{aligned}$$
(9)

Here \(a_\sigma ,b_\sigma ,c_\sigma \) are real numbers, \(\kappa _{\sigma }>0\) and furthermore

$$\begin{aligned} D_{\kappa } \;=\; \begin{pmatrix} \kappa &{} 0 \\ 0 &{}\frac{1}{\kappa } \end{pmatrix} . \end{aligned}$$

The hyperbolic critical energy will be called unbalanced if \({\mathbb {E}}(\log (\kappa ))\not =0\) and balanced if \({\mathbb {E}}(\log (\kappa ))=0\). In the former situation, we will always focus on the case \({\mathbb {E}}(\log (\kappa ))<0\), as otherwise one can simply conjugate by the matrix \(\left( {\begin{array}{c}0\;-1\\ 1\;\,0\end{array}}\right) \). The particular form (9) covers the reduced transfer matrices \(T^E_n\) of the generalized SSH model given in (7) due to (8). Comparing (8) with (9), one obtains for the SSH model

$$\begin{aligned} { a_{\sigma }\,+\,b_{\sigma } \;=\; v_{\sigma }^{*}(M_{\sigma }M_{\sigma }^{*})^{-1}v_{\sigma }, \quad a_{\sigma }\,-\,b_{\sigma } \;=\; \kappa _{\sigma }^{2} |t_{\sigma }|^{2} {{\hat{v}}}_{\sigma }^{*}(M_{\sigma }^{*} M_{\sigma })^{-1} {{\hat{v}}}_{\sigma } ,\qquad c_{\sigma }=0.} \end{aligned}$$
(10)

In more general situations (such as random polymer models), one may use so-called modified transfer matrices to attain (9); see [2, 12] for details. In all arguments below, it is possible to absorb the contribution of \(c_\sigma \) into the diagonal term by replacing \(\kappa _{\sigma }\) by \(\kappa _{\sigma }(1+\epsilon c_\sigma )\). Therefore, models with such a term could be handled as well. In order to simplify notation somewhat, we will suppose \(c_\sigma =0\) for all \(\sigma \in \Sigma \). Let us note that, for Jacobi matrices with scalar entries and \(T^E_\sigma \) obtained by regrouping a finite number of blocks, one can verify (see Proposition 3 in [2]) that the inequalities \(a_{\sigma }\ge 0\) and \(a^2_\sigma \ge b_\sigma ^2+c_\sigma ^2\) hold for all \(\sigma \in \Sigma \). This also holds for the generalized SSH model, as will be shown after the technical hypothesis stated below. It will be useful to rewrite (9) as

$$\begin{aligned} T^{E_c+\epsilon }_{\sigma } \;=\; R^\epsilon _{\sigma }\,D_{\kappa _{\sigma }}, \end{aligned}$$
(11)

with the notations

$$\begin{aligned} R^{\epsilon }_{\sigma } \;=\; {\textbf{1}}\,+\,a_\sigma \epsilon \begin{pmatrix} 0 &{} -1 \\ 1 &{} 0 \end{pmatrix} \,+\,b_\sigma \epsilon \begin{pmatrix} 0 &{} 1 \\ 1 &{} 0 \end{pmatrix} \,+\, \epsilon ^2 \,A^\epsilon _\sigma , \qquad A_{\sigma }^{\epsilon } \;=\; \begin{pmatrix} \alpha _{\sigma }^{\epsilon } &{} \beta _{\sigma }^{\epsilon } \\ \gamma _{\sigma }^{\epsilon } &{} \delta _{\sigma }^{\epsilon } \end{pmatrix} . \end{aligned}$$

The overall sign in (9) is neglected as it merely leads to a shift by \(\pi \) in the Prüfer phase dynamics below that is irrelevant for the Prüfer phases relative to the critical energy.

In the following, let us consider a random polymer Hamiltonian with hyperbolic critical energy, so that the nth (possibly modified) transfer matrices are of the form (11) with coefficients drawn from the probability space \((\Sigma ,{\textbf{p}})\) (in which the \(\kappa \) coefficients are taken to be independently and identically distributed). Hence \(\omega =(\sigma _n)_{n\in {\mathbb {Z}}}\) is a configuration from \(\Omega =\Sigma ^{\mathbb {Z}}\). The expectations w.r.t. the probability measure \({\mathbb {P}}\) on \(\Omega \) will be denoted by \({\mathbb {E}}\). Associated are random coefficients and matrices \(a_{\sigma _n}\), \(b_{\sigma _n}\), \(T^\epsilon _{\sigma _n}\), \(\kappa _{\sigma _n}\), etc., which for the sake of notational convenience will simply be denoted by \(a_n\), \(b_n\), \(T^\epsilon _n\), \(\kappa _n\), etc., unless there is some danger of misunderstanding. Associated to each configuration is a random sequence of Prüfer phases \(\theta ^{\epsilon }_{n}\in {\mathbb {R}}\) at \(\epsilon \) (and relative to the critical energy \(E_c\)) which can be introduced by

$$\begin{aligned} e_{\theta ^\epsilon _n} \;=\; \frac{T^{\epsilon }_n\, e_{\theta ^\epsilon _{n-1}}}{ \Vert T^{\epsilon }_n\, e_{\theta ^\epsilon _{n-1}}\Vert }, \qquad e_{\theta } :=\; \begin{pmatrix} \cos (\theta ) \\ \sin (\theta ) \end{pmatrix} , \end{aligned}$$

with a given (and irrelevant) initial condition \(\theta ^\epsilon _0\) and the lifting condition \(\theta ^{\epsilon }_{n+1} - \theta ^{\epsilon }_{n}\in (-\frac{\pi }{2},\frac{3\pi }{2})\) fixing the branch. Note that this definition is induced by a group action of \(\text{ SL }(2,{\mathbb {R}})\) on \({\mathbb {R}}\), and hence \((\theta ^{\epsilon }_{n})_{n\in {\mathbb {Z}}}\) is a Markov process on \({\mathbb {R}}\). As explained in detail in [12] and [2], the IDOS of the random polymer model is then given by

$$\begin{aligned} {\mathcal {N}}(E_c+\epsilon ) \;=\; {\mathcal {N}}(E_c) \,+\, \frac{1}{\pi }\,\frac{1}{{\mathbb {E}}(L_\sigma )}\;\lim _{N \rightarrow \infty }\,\frac{1}{N}\,{\mathbb {E}}(\theta ^\epsilon _N). \end{aligned}$$

For the generalized SSH model this also holds by combining the arguments of [2] with the oscillation theory as described in [20]. The r.h.s. is the so-called rotation number, here relative to the critical energy. It is helpful to write it as a Birkhoff sum

$$\begin{aligned} {\mathcal {N}}(E_c+\epsilon ) \,-\, {\mathcal {N}}(E_c) \;=\;\frac{1}{\pi }\,\frac{1}{{\mathbb {E}}(L_\sigma )}\; \lim _{N \rightarrow \infty } \,\frac{1}{N}\, \sum _{n=1}^N {\mathbb {E}}(\theta ^\epsilon _n\,-\,\theta ^\epsilon _{n-1}), \end{aligned}$$
(12)

because, by the above, each summand lies in the interval \((-\frac{\pi }{2},\frac{3\pi }{2})\); it is called a phase shift. Before going into an intuitive description of the random dynamics of Prüfer phases, let us furthermore recall from [1] the definition of the Lyapunov exponent

$$\begin{aligned} \gamma (E_c+\epsilon ) \;=\; \lim _{N \rightarrow \infty } \,\frac{1}{N}\, {\mathbb {E}}\big (\log (\Vert T^\epsilon _{N}\cdots T^\epsilon _{1}\Vert )\big ) . \end{aligned}$$

(one may include a factor \(\frac{1}{{\mathbb {E}}(L_\sigma )}\) here) and that it can be expressed as a Birkhoff sum of the Prüfer phases just as the IDOS [2, 12]:

$$\begin{aligned} \gamma (E_c+\epsilon ) \;=\; \lim _{N \rightarrow \infty } \,\frac{1}{N}\, \sum _{n=1}^N {\mathbb {E}}\big (\log (\Vert T^\epsilon _{n+1}e_{\theta ^\epsilon _n}\Vert )\big ). \end{aligned}$$
(13)

The two formulas (12) and (13) allow one to compute the IDOS and the Lyapunov exponent for the random hopping model numerically with great precision. As an example, both formulas are implemented in the balanced case of the random hopping model in Fig. 1. In particular, this illustrates (4) and shows that the Lyapunov exponent has a similar behavior, as argued in the physics literature (see again Section V.E in [8]).
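A minimal implementation of the Birkhoff sums (12) and (13) iterates the lifted Prüfer dynamics directly. The sketch below does this for a balanced toy version of (11) in which \(\kappa \in \{\frac{1}{2},2\}\) with equal probability (so \({\mathbb {E}}(\log (\kappa ))=0\)), \(a_\sigma =1\), \(b_\sigma =c_\sigma =0\), and \(R^\epsilon \) is taken without its \({\mathcal {O}}(\epsilon ^2)\) corrections; all of these choices, as well as the orbit length (far below the \(N=10^7\) of Fig. 1), are illustrative assumptions:

```python
import math, random

def birkhoff(eps, N, seed=0):
    """Estimate N(E_c+eps) - N(E_c) and gamma(E_c+eps) via the
    Birkhoff sums (12) and (13) for T = R_eps * D_kappa."""
    rng = random.Random(seed)
    theta0 = theta = 0.1
    log_sum = 0.0
    for _ in range(N):
        kappa = rng.choice((0.5, 2.0))
        # T = (1 + eps*[[0,-1],[1,0]]) * diag(kappa, 1/kappa)
        T = ((kappa, -eps / kappa), (eps * kappa, 1.0 / kappa))
        c, s = math.cos(theta), math.sin(theta)
        x, y = T[0][0] * c + T[0][1] * s, T[1][0] * c + T[1][1] * s
        log_sum += 0.5 * math.log(x * x + y * y)    # summand of (13)
        phi = math.atan2(y, x)
        while phi - theta <= -math.pi / 2:          # lifting condition:
            phi += 2 * math.pi                      # the phase shift must
        while phi - theta >= 3 * math.pi / 2:       # lie in (-pi/2, 3*pi/2)
            phi -= 2 * math.pi
        theta = phi
    return (theta - theta0) / (math.pi * N), log_sum / N

rot, lyap = birkhoff(eps=1e-3, N=200_000)
# rot approximates the l.h.s. of (12); it is small here, consistent
# with the (log|E|)^{-2} behavior in (4)
```

Decreasing eps over several orders of magnitude and plotting rot and lyap against \(|\log (\epsilon )|\) reproduces the qualitative picture of Fig. 1.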

Fig. 1

Numerical plot of the IDOS \({\mathcal {N}}(\epsilon )-{\mathcal {N}}(0)={\mathcal {N}}(\epsilon )-\frac{1}{2}\) relative to the center of band \(E_c=0\) and the Lyapunov exponent \(\gamma (\epsilon )\) for the balanced random hopping model, both in a log-log plot as well as without logarithms in the inlay plot. All points on these curves are computed via the Birkhoff sums (12) and (13) over orbits of length \(N=10^7\)

For the convenience of the reader, let us briefly recall from [2] the intuitive description of the Prüfer phase dynamics for \(\epsilon \ge 0\). According to (11) and the group action property, it is useful to split the dynamics into two steps, the first induced by \(D_{\kappa _{\sigma }}\) and the second by \(R^\epsilon _{\sigma }\). Thus, let us set for half-integers \(n'=n-\frac{1}{2}\)

$$\begin{aligned} e_{\theta ^\epsilon _n} \;=\; \frac{R^\epsilon _{n}\, e_{\theta ^\epsilon _{n'}}}{\Vert R^\epsilon _{n}\, e_{\theta ^\epsilon _{n'}}\Vert }, \qquad e_{\theta ^\epsilon _{n'}} \;=\; \frac{D_{n}\, e_{\theta ^\epsilon _{n-1}}}{\Vert D_{n}\, e_{\theta ^\epsilon _{n-1}}\Vert }, \end{aligned}$$
(14)

where \(D_n=D_{\kappa _n}\). The first step of the random dynamics, induced by \(D_{n}\), has fixed points at \(\frac{\pi }{2}\,{\mathbb {Z}}\) and leaves each interval \((k\frac{\pi }{2},(k+1)\frac{\pi }{2})\) invariant. It will be explained and used below that, in a logarithmic representation of the associated Dyson–Schmidt variables, this dynamics becomes a random walk, with a supplementary drift in the unbalanced case. The second step, induced by \(R^\epsilon _{n}\), is a right shift (or clockwise rotation on the projected circle) by random angles of order \(\epsilon \), because \(a_n-|b_n|\ge C_1>0\) a.s. and

$$\begin{aligned} \theta _n^\epsilon \;=\; \theta _{n'}^\epsilon \,+\, \epsilon \big (a_n + b_n\cos (2\theta _{n'}^\epsilon )\big ) \,+\, {\mathcal {O}}(\epsilon ^2). \end{aligned}$$
(15)

Hence the combined dynamics passes through the fixed points \(\frac{\pi }{2}\,{\mathbb {Z}}\) only in the increasing direction. This is illustrated in Fig. 2. Note that, in particular, the random dynamics passes in an alternating manner through a fixed point from \(\pi {\mathbb {Z}}\) and one from \(\pi (\frac{1}{2}+{\mathbb {Z}})\). Furthermore, one readily deduces a crucial order-preserving property of the random dynamics: if one considers two further sequences \({\widehat{\theta }}^{\epsilon }_{n}\) and \({\widetilde{\theta }}^{\epsilon }_{n}\) constructed as in (14) with the same realization \(\omega \), then

$$\begin{aligned} {\widehat{\theta }}^{\epsilon }_{0} \;<\; {\theta }^{\epsilon }_{0} \;<\; {\widetilde{\theta }}^{\epsilon }_{0} \quad \Longrightarrow \quad {\widehat{\theta }}^{\epsilon }_{n} \;<\; {\theta }^{\epsilon }_{n} \;<\; {\widetilde{\theta }}^{\epsilon }_{n}, \end{aligned}$$
(16)

for all \(n\in \frac{1}{2}\,{\mathbb {Z}}\). Based on this, it will be shown how to bound the dynamics above and below by two constructed processes. Then the rotation number in (12) can, via the elementary renewal theorem, be estimated by the inverse of their expected passage times through the intervals \((k\frac{\pi }{2},(k+1)\frac{\pi }{2})\).

Fig. 2 The dynamics \(\theta ^\epsilon _n\) for \(\epsilon =0\) has fixed points at \(0\,\textrm{mod}\,\frac{\pi }{2}\) and therefore has many invariant intervals. However, for \(\epsilon >0\) the dynamics crosses the fixed points by steps of size \(\epsilon \) to the right. The hypothesis \({\mathbb {E}}(\log (\kappa _0)) < 0\) induces local drifts indicated by the arrows below the axis

4 Dyson–Schmidt variables and renewal processes

The Dyson–Schmidt variable \(x^\epsilon _n\in {\mathbb {R}}\) for \(n\in \frac{1}{2}{\mathbb {Z}}\) associated with the Prüfer phases is defined by

$$\begin{aligned} x^\epsilon _n :=\; -\, \cot (\theta ^\epsilon _n). \end{aligned}$$

This establishes an orientation-preserving bijection of every interval \([k\pi ,(k+1)\pi )\) with \(k\in {\mathbb {Z}}\) onto \(\overline{{\mathbb {R}}}={\mathbb {R}}\cup \{\infty \}\), with the central point \((k+\frac{1}{2})\pi \) being mapped to 0. Then (14) becomes for \(n\in {\mathbb {Z}}\) and \(n'=n-\frac{1}{2}\)

$$\begin{aligned} x^\epsilon _n :=\; -(R^\epsilon _{n}\cdot (-x^\epsilon _{n'})) \,=\; Q^\epsilon _{n}\cdot x^\epsilon _{n'}, \qquad x^\epsilon _{n'} :=\; D_{n}\cdot x^\epsilon _{n-1}, \end{aligned}$$
(17)

where the dot \(\cdot \) denotes the standard Möbius action and

$$\begin{aligned} Q^\epsilon _{n} :=\; \begin{pmatrix} 1 &{} 0 \\ 0 &{} -1 \end{pmatrix} R^\epsilon _{n} \begin{pmatrix} 1 &{} 0 \\ 0 &{} -1 \end{pmatrix}, \end{aligned}$$

namely \(Q^\epsilon _{n}\) is obtained from \(R^\epsilon _{n}\) by flipping the signs on the off-diagonals. Due to the explicit forms of \(Q^\epsilon _{n}\) and \(D_{n}\), the action here becomes

$$\begin{aligned} x^\epsilon _n \;=\; \frac{(1+\epsilon ^2\alpha ^{\epsilon }_n)x^\epsilon _{n'}+(a_n{-b_n}-\epsilon \beta ^{\epsilon }_n)\epsilon }{1+\epsilon ^2\delta ^{\epsilon }_n-(a_n{+b_n}+\epsilon \gamma ^{\epsilon }_n)\epsilon x^\epsilon _{n'}}, \qquad x^\epsilon _{n'} \;=\; \kappa _n^2 \, x^\epsilon _{n-1}. \end{aligned}$$
(18)
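To make this concrete for the computationally inclined reader, here is a minimal sketch of the standard Möbius action on \(\overline{{\mathbb {R}}}\) underlying (17) and (18); the matrices and values are illustrative placeholders, not the actual \(Q^\epsilon _n\) or \(D_n\).

```python
import math

INF = math.inf

def moebius(M, x):
    """Standard Moebius action x -> (a*x + b)/(c*x + d) of M = ((a, b), (c, d)) on R u {inf}."""
    (a, b), (c, d) = M
    if x == INF:
        return a / c if c != 0 else INF
    denom = c * x + d
    if denom == 0:
        return INF
    return (a * x + b) / denom

# The D-step of (18), x -> kappa^2 * x, is the Moebius action of diag(kappa, 1/kappa).
kappa, x = 0.8, 1.5
D = ((kappa, 0.0), (0.0, 1.0 / kappa))
assert abs(moebius(D, x) - kappa**2 * x) < 1e-12

# Group action property: acting with a product equals acting twice (cf. (14)).
A, B = ((1.0, 2.0), (0.0, 1.0)), ((3.0, 0.0), (1.0, 1.0))
AB = tuple(tuple(A[i][0] * B[0][j] + A[i][1] * B[1][j] for j in range(2)) for i in range(2))
assert abs(moebius(AB, x) - moebius(A, moebius(B, x))) < 1e-12
```

The point at infinity is handled explicitly so that the action is a genuine bijection of \(\overline{{\mathbb {R}}}\).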

Note that the interval \([k\pi ,(k+1)\pi )\) of Prüfer variables contains two fixed points \(k\pi \) and \((k+\frac{1}{2})\pi \) of the dynamics generated by \(D_\kappa \), so that one copy \(\overline{{\mathbb {R}}}\) of the Dyson–Schmidt variable also contains two such fixed points 0 and \(\infty \). For the moment, only the half-axis \([0,\infty ) \subset \overline{{\mathbb {R}}}\) will be considered. Below, it will be justified that this is indeed sufficient to show the results in Theorem 1. On this interval, it will be useful to take the logarithm. For this purpose, let us now state the main technical assumptions throughout the remainder of the paper.

Technical hypothesis The family \((\log (\kappa _n))_{n \ge 0}\) of random variables is supposed to be independent and identically distributed with a non-trivial distribution in the sense that \({\mathbb {P}}\big (\{\log (\kappa _0)> 0\}\big )>0\). This distribution is also assumed to have compact support, that is,

$$\begin{aligned} C_0 :=\; \mathop {\mathrm{ess\,sup}}\,|\log (\kappa _0)| \,\in \, (0,+\infty ), \end{aligned}$$

where the essential supremum is taken over (the suppressed index) \(\sigma \in \Sigma \) w.r.t. the given distribution thereon. Furthermore the following constants are supposed to be positive and finite:

$$\begin{aligned} C_1 :=\; \mathop {\mathrm{ess\,inf}}\left( a_{\sigma }-|b_{\sigma }|\right) , \quad C_2 :=\; \mathop {\mathrm{ess\,sup}}\big (a_{\sigma }+|b_{\sigma }|\big ), \quad C_3 :=\; \sup \limits _{|\epsilon |\le 1}\mathop {\mathrm{ess\,sup}}\Vert A_{\sigma }^{\epsilon }\Vert . \end{aligned}$$

Let us note that for the generalized SSH model satisfying the main hypothesis, the above technical hypothesis holds for \(\epsilon =E/E_0\) with an adequate constant \(E_0>0\). Then, the uniform lower bound on \(M_{\sigma }^{*}M_{\sigma }-E^{2}\,{\textbf{1}}\) and the compact support of \(\sigma \) imply the boundedness of \(C_{3}\). Using (10), the uniform lower and upper bounds on \(M_{\sigma }^{*}M_{\sigma }\) and again the compact support of \(\sigma \), one obtains uniform positive lower bounds and uniform upper bounds for \(a_{\sigma }\pm b_{\sigma }\). These in turn imply \(C_{1}>0\) and \(C_{2}<\infty \) for the SSH model.

The technical hypothesis implies, in particular, that one has at least for \(\epsilon =0\)

$$\begin{aligned} \log (x^0_{n+1}) \;=\; \log (x^0_{n})\;+\;2\,\log (\kappa _n) \;=\; \log (x^0_{n})\;+\;2\,C_0\,\chi _n, \end{aligned}$$
(19)

where \(\chi _n:=\frac{1}{C_0}\log (\kappa _n)\) is a random variable satisfying \(-1 \le \chi _n \le 1\) almost surely. In the unbalanced case, \({\mathbb {E}}(\chi _n) <0\), while in the balanced case \({\mathbb {E}}(\chi _n) = 0\). In both cases, \(\log (x^0_{n})\) is a random walk on \({\mathbb {R}}\). It will be shown in the next two sections that in this logarithmic representation the random walk roughly has to go from \(\log (\epsilon )\) to \(-\log (\epsilon )\). In the balanced case, the central limit theorem indicates that it takes of the order of \(\log (\epsilon )^2\) time steps to cross this distance of order \(|\log (\epsilon )|\), while in the unbalanced case it turns out that of the order of \(\epsilon ^{-\nu }\) time steps are needed. This provides an intuitive understanding for the behavior in Theorem 1. Controlling the \(\epsilon \)-dependent perturbations is quite delicate and constitutes the main technical endeavor of this work.
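This heuristic can be illustrated by a toy simulation; the idealized \(\pm 1\) walk below is an assumption for illustration only (it is not the actual distribution of \(\chi _n\)), with \(L\) playing the role of \(|\log (\epsilon )|\).

```python
import random

def two_sided_exit_time(L, rng):
    """Number of steps of a balanced +/-1 walk until it leaves the window (-L, L)."""
    pos, steps = 0, 0
    while abs(pos) < L:
        pos += 1 if rng.random() < 0.5 else -1
        steps += 1
    return steps

rng = random.Random(0)
L = 10
mean_T = sum(two_sided_exit_time(L, rng) for _ in range(4000)) / 4000
# For the symmetric simple walk started at 0, the expected exit time from (-L, L)
# is exactly L^2 = 100, matching the log(eps)^2 scaling of the balanced case.
```

In the unbalanced case the walk has to climb against its drift, which replaces the polynomial time scale by an exponentially large one of order \(\epsilon ^{-\nu }\).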

The outcome will be bounds for the r.h.s. of (12). In order to control the rotation number, it is useful to look at the order statistics of the following set of random variables

$$\begin{aligned} \big \lbrace N \in {\mathbb {N}} :\, x^\epsilon _{N-1}< 0 \le x^\epsilon _{N} \text{ or } x^\epsilon _{N} < 0 \le x^\epsilon _{N-1} \big \rbrace , \end{aligned}$$
(20)

which will be denoted by the random increasing times \(N^\epsilon _{(1)}<N^\epsilon _{(2)}<N^\epsilon _{(3)}<\ldots \). These are the random passage times over the two points 0 and \(\infty \) (which are fixed points of the action induced by \(D^0_{n}\), so without \(\epsilon \)-perturbation). Recall that the two conditions in (20) are realized in an alternating manner. For the sake of concreteness, let us fix the initial condition \(x_0^\epsilon \in (-\infty ,0)\cup \{\infty \}\) such that for all k one has \(x^\epsilon _{N^\epsilon _{(2k-1)}} < 0 \le x^\epsilon _{N^\epsilon _{(2k-1)}-1}\) and \(x^\epsilon _{N^\epsilon _{(2k)}-1} < 0 \le x^\epsilon _{N^\epsilon _{(2k)}}\). The (random) differences \(N^\epsilon _{(k+1)}-N^\epsilon _{(k)}\) are the durations of the passages of x through the intervals \([0,+\infty )\) and \(\overline{{\mathbb {R}}} \backslash [0,+\infty )\). Clearly, these quantities depend on the precise value of the initial condition \(x_0^\epsilon \) and therefore they are not identically distributed (nor independent). To circumvent this difficulty, two families of random dynamical processes \({\widehat{x}}^\epsilon _k=({\widehat{x}}^\epsilon _{k,n})_{n\ge 0}\) and \({\widetilde{x}}^\epsilon _k=({\widetilde{x}}^\epsilon _{k,n})_{n\ge 0}\) on \([0,\infty )\) will be constructed for all \(k \in {\mathbb {N}}\) in the following two sections, providing lower and upper bounds on the original process, respectively. It will be imposed that \(({\widehat{x}}^\epsilon _{2k-1})_{k\in {\mathbb {N}}}\), \(({\widehat{x}}^\epsilon _{2k})_{k\in {\mathbb {N}}}\), \(({\widetilde{x}}^\epsilon _{2k-1})_{k\in {\mathbb {N}}}\) and \(({\widetilde{x}}^\epsilon _{2k})_{k\in {\mathbb {N}}}\) are all families of non-negative i.i.d. random variables. Note that these processes will not exactly correspond to the notations \({\widehat{\theta }}^\epsilon _n\) and \({\widetilde{\theta }}^\epsilon _n\) in (16). 
They will obey almost surely that for all \(k \in {\mathbb {N}}\) and then all \(n \in \lbrace 0,1,\dots ,N^\epsilon _{(k+1)} - N^\epsilon _{(k)}-1\rbrace \)

$$\begin{aligned} {\widehat{x}}_{k,n}^\epsilon \,\le \, x^\epsilon _{N^\epsilon _{(k)} + n}\, \le \, {\widetilde{x}}_{k,n}^\epsilon \qquad \text{ or }\qquad {\widehat{x}}_{k,n}^\epsilon \,\le \, -\big (x^\epsilon _{N^\epsilon _{(k)} + n}\big )^{-1}\, \le \, {\widetilde{x}}_{k,n}^\epsilon \end{aligned}$$
(21)

and furthermore, for the next step

$$\begin{aligned} {\widetilde{x}}^\epsilon _{k,N^\epsilon _{(k+1)} - N^\epsilon _{(k)}} \;=\; \infty . \end{aligned}$$
(22)

Note that the left condition in (21) always applies during passages through \((0,\infty )\) and the right condition for passages through \((-\infty ,0)\). Moreover, the constructed comparison processes are only constrained on the first \(N^\epsilon _{(k+1)} - N^\epsilon _{(k)}\) times. Associated with these two families of processes, there are now two families of random passage times

$$\begin{aligned} {\widehat{T}}^\epsilon _k :=\; \inf \lbrace n \in {\mathbb {N}}_0 :\, {\widehat{x}}^\epsilon _{k,n} = \infty \rbrace , \qquad {\widetilde{T}}^\epsilon _k :=\; \inf \lbrace n \in {\mathbb {N}}_0 :\, {\widetilde{x}}^\epsilon _{k,n} = \infty \rbrace . \end{aligned}$$
(23)

Then (21) and (22) imply that a.s. \({\widetilde{T}}^\epsilon _k \le N^\epsilon _{(k+1)} - N^\epsilon _{(k)} \le {\widehat{T}}^\epsilon _k\). Furthermore, by construction the families \(({\widehat{T}}^\epsilon _{2k-1})_{k\in {\mathbb {N}}}\), \(({\widehat{T}}^\epsilon _{2k})_{k\in {\mathbb {N}}}\), \(({\widetilde{T}}^\epsilon _{2k-1})_{k\in {\mathbb {N}}}\) and \(({\widetilde{T}}^\epsilon _{2k})_{k\in {\mathbb {N}}}\) are i.i.d. random variables. As \(({\widehat{T}}^\epsilon _{2k-1}+{\widehat{T}}^\epsilon _{2k})_{k\in {\mathbb {N}}}\) and \(({\widetilde{T}}^\epsilon _{2k-1}+{\widetilde{T}}^\epsilon _{2k})_{k\in {\mathbb {N}}}\) are then interarrival times [11], each family specifies a renewal process, given by

$$\begin{aligned} {\widehat{P}}^\epsilon _N :=\, \max \Big \lbrace K \in {\mathbb {N}} :\, \sum _{k=1}^K ({\widehat{T}}^\epsilon _{2k-1}+{\widehat{T}}^\epsilon _{2k}) \le N\Big \rbrace , \qquad {\widetilde{P}}^\epsilon _N :=\, \max \Big \lbrace K \in {\mathbb {N}} :\, \sum _{k=1}^K ({\widetilde{T}}^\epsilon _{2k-1}+{\widetilde{T}}^\epsilon _{2k}) \le N\Big \rbrace . \end{aligned}$$

These random variables can be interpreted as the number of times the slower or faster process has passed through \(\overline{{\mathbb {R}}}\) up to the time N. Each such passage corresponds to a passage of the Prüfer variables through \([k\pi ,(k+1)\pi )\). Thus, it follows that

$$\begin{aligned} {\widehat{P}}^\epsilon _N - 1 \;\le \; \frac{\theta ^\epsilon _N}{\pi } \;\le \; {\widetilde{P}}^\epsilon _N + 1 \end{aligned}$$

a.s. for \(N \in {\mathbb {N}}_0\) and \(\theta ^\epsilon _0 \in \left[ -\frac{\pi }{2},\frac{\pi }{2}\right) \). Finally, the elementary renewal theorem [11] yields

$$\begin{aligned} \frac{1}{{\mathbb {E}}({\widehat{T}}_1^\epsilon )+{\mathbb {E}}({\widehat{T}}_2^\epsilon )}= & {} \lim _{N \rightarrow \infty } \frac{{\widehat{P}}^\epsilon _N}{N} \;\le \; \lim _{N \rightarrow \infty } \frac{1}{N}\frac{{\mathbb {E}}(\theta ^\epsilon _N)}{\pi } \;\le \; \lim _{N \rightarrow \infty } \frac{{\widetilde{P}}^\epsilon _N}{N} \nonumber \\= & {} \frac{1}{{\mathbb {E}}({\widetilde{T}}_1^\epsilon )+{\mathbb {E}}({\widetilde{T}}_2^\epsilon )}. \end{aligned}$$
(24)

These bounds hold for both the unbalanced and the balanced case.
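The way the elementary renewal theorem enters here can be sketched numerically; the interarrival distribution below is an arbitrary stand-in for the law of \({\widehat{T}}^\epsilon _{2k-1}+{\widehat{T}}^\epsilon _{2k}\).

```python
import random

rng = random.Random(1)
values = [2, 3, 4, 5]          # hypothetical interarrival distribution, mean mu = 3.5

N = 200_000
t, count = 0, 0
while True:
    t += rng.choice(values)    # next interarrival time
    if t > N:
        break
    count += 1                 # one more completed renewal up to time N

rate = count / N               # renewal theorem: rate -> 1/mu = 1/3.5
```

Here `count` plays the role of \({\widehat{P}}^\epsilon _N\), and the long-time renewal rate is the inverse of the mean interarrival time, exactly as used in (24).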

Let us now first address the unbalanced case. The opposite directions of the drifts in Fig. 2 clearly show that \({\mathbb {E}}({\widehat{T}}_1^\epsilon ) \ge {\mathbb {E}}({\widehat{T}}_2^\epsilon )\). Thus, the inverse of \(2\,{\mathbb {E}}({\widehat{T}}_1^\epsilon )\) provides a lower bound on the l.h.s. of (24), while the r.h.s. can simply be estimated by the inverse of \({\mathbb {E}}({\widetilde{T}}_1^\epsilon )\). These rough estimates are actually superfluous, since the contributions of \({\mathbb {E}}({\widehat{T}}_1^\epsilon )\) and \({\mathbb {E}}({\widetilde{T}}_1^\epsilon )\) turn out to dominate those of \({\mathbb {E}}({\widehat{T}}_2^\epsilon )\) and \({\mathbb {E}}({\widetilde{T}}_2^\epsilon )\) as \(\epsilon \) goes to 0. For this reason, passages through \((-\infty ,0)\) need not be taken into account, so restricting the following to the case \(k=1\) is sufficient (this corresponds to the first passage with positive x variables). The proof of the following result will be provided in the next two sections. Combining it with (24) directly implies the claim (3) in Theorem 1.

Proposition 2

Given a family of random matrices of the form (11), suppose that the above hypothesis holds and \({\mathbb {E}}(\log (\kappa _0)) < 0\). Then, there exists a unique positive solution \(\nu >0\) of \({\mathbb {E}}(\kappa _0^{\nu }) = 1\). Moreover, there exist constants \(C_-, C_+ \in (0,\infty )\) such that for all \({\widetilde{\nu }} < \nu \) there exists some \(\epsilon _0>0\) such that for all \(\epsilon \in (0,\epsilon _0)\)

$$\begin{aligned} \frac{1}{{\mathbb {E}}({\widehat{T}}_1^\epsilon )} \;\ge \; C_-\epsilon ^{\nu }\,\big (1+{\mathcal {O}}(\epsilon ^{\nu }|\log (\epsilon )|)\big ) , \qquad \frac{1}{{\mathbb {E}}({\widetilde{T}}_1^\epsilon )} \;\le \; C_+\epsilon ^{{\widetilde{\nu }}}\,\big (1+{\mathcal {O}}(\epsilon ^{{\widetilde{\nu }}}|\log (\epsilon )|)\big ) . \end{aligned}$$

In the balanced case, \({\mathbb {E}}({\widehat{T}}_k^\epsilon )={\mathbb {E}}({\widetilde{T}}_k^\epsilon )\) to lowest order. This value is independent of k and can be computed, as the next result shows. Together with (24) this shows (4) in Theorem 1.

Proposition 3

Given a family of random matrices of the form (11), suppose that the above hypothesis holds and \({\mathbb {E}}(\log (\kappa _0)) = 0\). Then, for all k it holds that

$$\begin{aligned} \frac{1}{{\mathbb {E}}({\widehat{T}}_k^\epsilon )} \;=\; \frac{{\mathbb {E}}(\log (\kappa _0)^2)}{\log (\epsilon )^2}\,\big (1+{\mathcal {O}}(|\log (\epsilon )|^{-1})\big ) , \qquad \frac{1}{{\mathbb {E}}({\widetilde{T}}_k^\epsilon )} \;=\; \frac{{\mathbb {E}}(\log (\kappa _0)^2)}{\log (\epsilon )^2}\,\big (1+{\mathcal {O}}(|\log (\epsilon )|^{-1})\big ) . \end{aligned}$$

5 Lower bound on the rotation number

The first task of this section is to construct the slower comparison process satisfying the first inequality of (21) for \(k=1\). To improve readability from this point on, we will suppress the indices \(\epsilon \), \(\sigma \), n, etc., as long as no confusion can arise. Let us start by providing some basic properties of the dynamics, such as the following observation.

Lemma 4

For each realization and \(\epsilon \) small enough, \(x \in [0,\infty )\) and \(Q \cdot x \ge 0\) imply \(Q \cdot x \ge x\).

Proof

The technical hypothesis implies for \(x \in [0,1]\) the estimates

$$\begin{aligned} Q \cdot x \;=\; \tfrac{(1+\epsilon ^2\alpha )x + (a-b-\epsilon \beta )\epsilon }{1+\epsilon ^2\delta - (a+b+\epsilon \gamma )\epsilon x} \,\ge \, x\,\tfrac{1+\epsilon ^2\alpha + (a-b-\epsilon \beta )\frac{\epsilon }{x}}{1+\epsilon ^2\delta } \,\ge \, x\,\tfrac{1+C_1\,\epsilon -2\,C_3\,\epsilon ^2}{1+C_3\,\epsilon ^2}, \end{aligned}$$

while for \(x \in (1,\infty )\)

$$\begin{aligned} Q \cdot x \,\ge \, x\,\tfrac{1+\epsilon ^2\alpha }{1+\epsilon ^2\delta - (a+b+\epsilon \gamma )\epsilon x} \,\ge \, x\,\tfrac{1+\epsilon ^2\alpha }{1+\epsilon ^2\delta - (a+b+\epsilon \gamma )\epsilon } \,\ge \, x\,\tfrac{1-C_3\,\epsilon ^2}{1-C_1\,\epsilon +2\,C_3\,\epsilon ^2}. \end{aligned}$$

In both cases, the statement directly follows. (Note that the lemma can also be deduced from (15).) \(\square \)

The following lemma states more properties of the given action and relies on the quantities

$$\begin{aligned} {\widehat{x}}_- :=\; \tfrac{C_1\,\epsilon }{2}, \qquad {\widehat{x}}_c :=\; \tfrac{C_1\epsilon }{2}(e^{-2C_0}+1) , \qquad {\widehat{x}}_+ :=\; \tfrac{2\,e^{2C_0}}{C_1\epsilon }. \end{aligned}$$

These points and the lemma itself are graphically illustrated in the left part of Fig. 3.

Fig. 3 The arrows on the left part illustrate properties of the original Dyson–Schmidt dynamics on \((0,\infty )\) as stated in Lemma 5. The right part illustrates the notation after the logarithmic transformation \({\widehat{f}}\) to \({\mathbb {R}}\)

Lemma 5

For each realization, one has

$$\begin{aligned}&x \,\in \, [0,\infty )&\qquad \Longrightarrow \qquad&Q \cdot (D \cdot x) \,\notin \, [0,{\widehat{x}}_-)\,, \end{aligned}$$
(25)
$$\begin{aligned}&x \,\in \, [{\widehat{x}}_-,\infty )&\qquad \Longrightarrow \qquad&Q \cdot (D \cdot x) \,\notin \, [0,{\widehat{x}}_c)\,, \end{aligned}$$
(26)
$$\begin{aligned}&Q \cdot (D \cdot x) \,\in \, [0,\infty )&\qquad \Longrightarrow \qquad&x \,\notin \, [{\widehat{x}}_+,\infty )\,. \end{aligned}$$
(27)

Proof

For (25), first note that \(x \in [0,\infty )\) implies \(D \cdot x \in [0,\infty )\). Combining (15) with the order-preserving property (16), any non-negative \(Q \cdot (D \cdot x)\) then obeys, due to (18) and the hypothesis,

$$\begin{aligned} Q \cdot (D \cdot x) \;\ge \; Q \cdot 0 \;=\; \tfrac{(a-b-\epsilon \beta )\epsilon }{1+\epsilon ^2\delta } \;\ge \; \tfrac{(C_1-C_3\epsilon )\epsilon }{1+C_3\epsilon ^2} \;\ge \; \tfrac{C_1\,\epsilon }{2} \;=\; {\widehat{x}}_-. \end{aligned}$$

For the proof of (26) let us use that, if \(x \in [{\widehat{x}}_-,\infty )\), then clearly \(D \cdot x \ge e^{-2C_0}{\widehat{x}}_-\). Similarly to the proof of (25), non-negative \(Q \cdot (D \cdot x)\) then obey

$$\begin{aligned} Q \cdot (D \cdot x) \;\ge \; Q \cdot \tfrac{e^{-2C_0}C_1\epsilon }{2} \;\ge \; \tfrac{(1-C_3\epsilon ^2)\frac{e^{-2C_0}C_1\epsilon }{2} + (C_1-C_3\epsilon )\epsilon }{1+C_3\epsilon ^2 - (C_1-C_3\epsilon )\frac{e^{-2C_0}C_1\epsilon ^2}{2}} \;\ge \; (e^{-2C_0} + 1)\tfrac{C_1\epsilon }{2} \;=\;{\widehat{x}}_c. \end{aligned}$$

Finally let us verify (27) by contraposition. If \(x \in [{\widehat{x}}_+,\infty )\), then clearly \(D \cdot x \ge e^{-2C_0}{\widehat{x}}_+=\tfrac{2}{C_1\epsilon }\). Then, the order-preserving property (16) implies that \(Q \cdot (D \cdot x) \notin [0,\infty )\), since as in the proof of (25) it holds that

$$\begin{aligned} 0 \;>\; Q \cdot \infty \;\ge \; Q \cdot (D \cdot x) \;\ge \; Q \cdot \tfrac{2}{C_1\epsilon } \;\ge \; \tfrac{(1-C_3\epsilon ^2)\frac{2}{C_1\epsilon } + (C_1-C_3\epsilon )\epsilon }{1+C_3\epsilon ^2 - (C_1-C_3\epsilon )\frac{2}{C_1}} , \end{aligned}$$

which is also negative. \(\square \)

Now a new process \({\widehat{x}}=({\widehat{x}}_n)_{n\ge 0}\) is constructed by setting \({\widehat{x}}_0 = 0\), \({\widehat{x}}_1={\widehat{x}}_-\) and for \(n\ge 1\)

$$\begin{aligned} {\widehat{x}}_{n+1} \;=\; {\left\{ \begin{array}{ll} {\widehat{x}}_c ,&{}\text { if } {\widehat{x}}_n \le {\widehat{x}}_-,\\ D_n \cdot {\widehat{x}}_n ,&{}\text { if } {\widehat{x}}_n \in ({\widehat{x}}_-,{\widehat{x}}_+), \\ \infty , &{}\text { else, so if } {\widehat{x}}_n \ge {\widehat{x}}_+. \end{array}\right. } \end{aligned}$$

Comparing with (17), the main case \({\widehat{x}}_{n+1}=D_n \cdot {\widehat{x}}_n\) of this process merely omits the action of \(Q_n\) for \(n\ge 2\). Let us now argue why this process satisfies the first inequality in (21). Indeed, omitting the action of \(Q_n\) slows the process down because of the order-preserving property (16) and Lemma 4. Carefully analyzing the first case in the definition of \({\widehat{x}}_{n+1}\) in combination with (25) and (26) shows that a.s. \({\widehat{x}}_n \le x_{N_{(1)} + n}\) for all \(n \in \lbrace 0,1,\dots ,N_{(2)} - 1 - N_{(1)}\rbrace \), that is, as long as \(x_{N_{(1)} + n} \in [0,\infty )\). Moreover, by (27) it is impossible that \({\widehat{x}}_n = \infty \) for \(n \in \lbrace 3,4,\dots ,N_{(2)} - 1 - N_{(1)}\rbrace \), as this would imply \(x_{N_{(1)} + n-1} < {\widehat{x}}_+ \le {\widehat{x}}_{n-1}\). Finally, as a.s. \({\widehat{x}}_{{\widehat{T}}_1} = \infty \), it indeed follows that \(N_{(2)} - N_{(1)} \le {\widehat{T}}_1\).
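The construction of \({\widehat{x}}\) and its stopping time translate directly into a simulation sketch; the constants \(C_0\), \(C_1\), \(\epsilon \) and the two-point law of \(\kappa _n\) below are hypothetical choices, subject only to \({\mathbb {E}}(\log (\kappa _0))<0\).

```python
import math
import random

rng = random.Random(2)
C0, C1, eps = math.log(2.0), 1.0, 1e-2    # hypothetical constants
x_minus = C1 * eps / 2
x_c = x_minus * (math.exp(-2 * C0) + 1)
x_plus = 2 * math.exp(2 * C0) / (C1 * eps)

def kappa_sq():
    # log(kappa) in {-C0, +C0} with P(+C0) = 1/3, so E(log kappa) < 0
    return math.exp(2 * rng.choice([-C0, -C0, C0]))

def T_hat_1():
    """Stopping time (23) of the slower comparison process x-hat."""
    x, n = x_minus, 1                     # x-hat_0 = 0 and x-hat_1 = x_minus
    while True:
        if x >= x_plus:
            return n + 1                  # next value is x-hat_{n+1} = infinity
        x = x_c if x <= x_minus else kappa_sq() * x   # reset or D-step
        n += 1

mean_T = sum(T_hat_1() for _ in range(200)) / 200     # finite, cf. Lemma 7
```

Each run climbs from the reset point \({\widehat{x}}_c\) against the negative drift until it exceeds \({\widehat{x}}_+\), so the sample mean of \({\widehat{T}}_1\) grows rapidly as \(\epsilon \) shrinks.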

Now let us come to the second task, namely analyzing the \(\epsilon \)-dependence of \(\big ({\mathbb {E}}({\widehat{T}}_1)\big )^{-1}\) and thereby proving the first statement of Proposition 2. Similarly to (19), it will be advantageous to pass to a shifted logarithm of the Dyson–Schmidt variables, via the map \({\widehat{f}}:(0,\infty )\rightarrow {\mathbb {R}}\) given by

$$\begin{aligned} {\widehat{f}}(x) :=\; \frac{1}{2C_0}\,\log \Big (\frac{x}{{\widehat{x}}_c}\Big ). \end{aligned}$$

By construction, \({\widehat{f}}({\widehat{x}}_c) = 0\). Furthermore, for n such that \({\widehat{x}}_{n+2}<\infty \), let us introduce

$$\begin{aligned} {\widehat{y}}_n :=\; {\widehat{f}}({\widehat{x}}_{n+2}) , \qquad {\widehat{y}}_-:=\;{\widehat{f}}({\widehat{x}}_-) , \qquad {\widehat{y}}_+:=\;{\widehat{f}}({\widehat{x}}_+) , \end{aligned}$$

and the stopping time

$$\begin{aligned} {\widehat{T}}_{-,+} :=\; \inf \big \{ n \in {\mathbb {N}} :\, {\widehat{y}}_n\notin ({\widehat{y}}_-,{\widehat{y}}_+) \big \}. \end{aligned}$$

Again these quantities are illustrated in Fig. 3. As long as \(n \le {\widehat{T}}_{-,+}\), it holds that

$$\begin{aligned} {\widehat{y}}_n= & {} \tfrac{1}{2C_0}\,\log \Big (\frac{{\widehat{x}}_{n+2}}{{\widehat{x}}_c}\Big ) \;=\; \tfrac{1}{2C_0}\log \Big (\frac{D^n \cdot {\widehat{x}}_2}{{\widehat{x}}_c}\Big ) \;=\; \tfrac{1}{2C_0}\log \Big (\prod _{j=N_{(1)}+1}^{N_{(1)}+n} \kappa _j^2\Big ) \nonumber \\= & {} \sum _{j=N_{(1)}+1}^{N_{(1)}+n} \chi _j, \end{aligned}$$
(28)

namely \( {\widehat{y}}_n \) is a random walk with a drift in the negative direction starting at \({\widehat{y}}_0=0\). The following two lemmata collect properties of these newly introduced quantities.

Lemma 6

\({\widehat{y}}_- \in (-\infty ,0)\) is independent of \(\epsilon \) and \(\lim _{\epsilon \rightarrow 0} \frac{{\widehat{y}}_+}{-\log (\epsilon )} = \frac{1}{C_0}\).

Proof

The explicit expressions

$$\begin{aligned} {\widehat{y}}_- \,=\, -\tfrac{1}{2\,C_0}\,\log (1+e^{-2\,C_0}), \qquad {\widehat{y}}_+ \,=\, \tfrac{1}{2\,C_0}\big (2\log \big (\tfrac{2}{C_1\,\epsilon }\big ) -\log (1+e^{-2\,C_0})\big ) +1 , \end{aligned}$$

immediately imply the claims. \(\square \)

Lemma 7

\({\mathbb {E}}({\widehat{T}}_{-,+} ) < +\infty \).

Proof

Since the cumulative distribution function of \(\chi \) is right-continuous and \({\mathbb {P}}(\{\chi> 0\}) > 0\), there exists some \(\ell \in (0,1]\) such that \( {\widehat{p}}:={\mathbb {P}}(\{\chi \ge \ell \})\) satisfies \({\widehat{p}} > 0\). Denoting \({\widehat{E}}:= \lceil \frac{{\widehat{y}}_+-{\widehat{y}}_-}{\ell }\rceil \) and introducing the random variable

$$\begin{aligned} {\widehat{N}} :=\; \min \lbrace n \in {\mathbb {N}} :\, \chi _{(n-1){\widehat{E}} + 1} \ge \ell ,\; \chi _{(n-1){\widehat{E}} + 2} \ge \ell , \dots ,\; \chi _{n{\widehat{E}}} \ge \ell \rbrace , \end{aligned}$$

the latter then is geometrically distributed with success probability \({\widehat{p}}^{{\widehat{E}}}\). In particular, one has \({\mathbb {E}}({\widehat{N}}) < \infty \). Moreover, \({\widehat{T}}_{-,+} < {\widehat{E}}\,{\widehat{N}}\) a.s. by construction, so \({\mathbb {E}}({\widehat{T}}_{-,+})< {\widehat{E}}\,{\mathbb {E}}({\widehat{N}}) < \infty \). \(\square \)
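The window construction in this proof can be mimicked numerically; the values of \({\widehat{p}}\) and \({\widehat{E}}\) below are hypothetical.

```python
import random

rng = random.Random(3)
p_hat, E_hat = 0.5, 6      # hypothetical P(chi >= l) and window length E-hat

def N_hat():
    """Index of the first length-E window consisting entirely of successes."""
    n = 1
    while sum(rng.random() < p_hat for _ in range(E_hat)) < E_hat:
        n += 1
    return n

mean_N = sum(N_hat() for _ in range(2000)) / 2000
# N-hat is geometric with success probability p^E, hence E(N-hat) = p^(-E) = 2^6 = 64;
# this makes the bound E(T_{-,+}) < E * E(N-hat) < infinity concrete.
```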

In order to connect the two stopping times \({\widehat{T}}_{-,+}\) and \({\widehat{T}}_1\), one further random variable will be introduced. Suppose that \({\widehat{T}}_{-,+} = m\) for some \(m \in {\mathbb {N}}\) and that \({\widehat{y}}_{{\widehat{T}}_{-,+}}\le {\widehat{y}}_-\); then let us introduce the stopping time reinitialized at \(m+1\), in analogy to (23), by

$$\begin{aligned} {\widehat{T}}^{(m)}_1 :=\; \inf \big \{ n \in {\mathbb {N}} :\, {\widehat{x}}_{m+1+n} = \infty \big \}. \end{aligned}$$

It then clearly follows that \({\widehat{T}}_1 = m + 1 + {\widehat{T}}^{(m)}_1\), provided that \({\widehat{T}}_{-,+} = m\) and \({\widehat{y}}_{m}\le {\widehat{y}}_-\). Now the Markov property allows one to compute the conditional expectations

$$\begin{aligned} {\mathbb {E}}\big ({\widehat{T}}^{(m)}_1 \,\big |\, {\widehat{y}}_{m} \le {\widehat{y}}_-,\; {\widehat{T}}_{-,+}=m\big ) \;=\; {\mathbb {E}}({\widehat{T}}_1) , \qquad {\mathbb {E}}\big ({\widehat{T}}_1 \,\big |\, {\widehat{y}}_{{\widehat{T}}_{-,+}} \ge {\widehat{y}}_+\big ) \;=\; {\mathbb {E}}\big ({\widehat{T}}_{-,+} + 3 \,\big |\, {\widehat{y}}_{{\widehat{T}}_{-,+}} \ge {\widehat{y}}_+\big ) , \end{aligned}$$

where the 3 stems from an index shift by 2 when the process is started and 1 additional step at the end. As by construction \({\mathbb {P}}\big (\{{\widehat{y}}_{{\widehat{T}}_{-,+}}\in ({\widehat{y}}_-,{\widehat{y}}_+)\}\big )=0\) and as by Lemma 7 one has \({\widehat{T}}_{-,+}<\infty \) a.s., it follows that

$$\begin{aligned} \begin{aligned} {\mathbb {E}}({\widehat{T}}_1)&\;=\; {\mathbb {E}}\big ({\widehat{T}}_1 \,\big |\, {\widehat{y}}_{{\widehat{T}}_{-,+}} \ge {\widehat{y}}_+\big )\,{\mathbb {P}}\big (\big \{{\widehat{y}}_{{\widehat{T}}_{-,+}} \ge {\widehat{y}}_+\big \}\big ) \\&\qquad + \sum _{m=0}^{\infty } {\mathbb {E}}\big ({\widehat{T}}_1 \,\big |\, {\widehat{y}}_m \le {\widehat{y}}_-\,, {\widehat{T}}_{-,+} = m\big )\,{\mathbb {P}}\big (\big \{{\widehat{y}}_m \le {\widehat{y}}_-\,, {\widehat{T}}_{-,+} = m\big \}\big )\\&\;=\; {\mathbb {P}}\big (\big \{{\widehat{y}}_{{\widehat{T}}_{-,+}} \ge {\widehat{y}}_+\big \}\big )\,{\mathbb {E}}\big ({\widehat{T}}_1 \,\big |\, {\widehat{y}}_{{\widehat{T}}_{-,+}} \ge {\widehat{y}}_+\big ) \\&\qquad + \sum _{m=0}^{\infty } {\mathbb {P}}\big (\big \{{\widehat{y}}_m \le {\widehat{y}}_-\,, {\widehat{T}}_{-,+} = m\big \}\big ){\mathbb {E}}\big (m + 1 + {\widehat{T}}^{(m)}_1 \,\big |\, {\widehat{y}}_{m} \le {\widehat{y}}_-\,, {\widehat{T}}_{-,+} = m\big ) \\&\;= \;{\mathbb {P}}\big (\big \{{\widehat{y}}_{{\widehat{T}}_{-,+}} \ge {\widehat{y}}_+\big \}\big )\,{\mathbb {E}}\big ({\widehat{T}}_{-,+} + 3 \,\big |\, {\widehat{y}}_{{\widehat{T}}_{-,+}} \ge {\widehat{y}}_+\big ) \\&\qquad + \sum _{m=0}^{\infty } {\mathbb {P}}\big (\big \{{\widehat{y}}_m \le {\widehat{y}}_-\,, {\widehat{T}}_{-,+} = m\big \}\big )\left( {\mathbb {E}}\big ({\widehat{T}}_{-,+} \,\big |\, {\widehat{y}}_{{\widehat{T}}_{-,+}} \le {\widehat{y}}_-\,, {\widehat{T}}_{-,+} = m\big ) + 1 + {\mathbb {E}}({\widehat{T}}_1)\right) \\&\;=\; {\mathbb {E}}({\widehat{T}}_{-,+} )\, +\, 3\,{\mathbb {P}}\big (\big \{{\widehat{y}}_{{\widehat{T}}_{-,+}} \ge {\widehat{y}}_+\big \}\big )\, +\, \left( 1-{\mathbb {P}}\big (\big \{{\widehat{y}}_{{\widehat{T}}_{-,+}} \ge {\widehat{y}}_+\big \}\big )\right) \left( 1 + {\mathbb {E}}({\widehat{T}}_1)\right) \,, \end{aligned} \end{aligned}$$

which is equivalent to

$$\begin{aligned} \big ({\mathbb {E}}({\widehat{T}}_1)\big )^{-1} \;=\; \left[ \frac{{\mathbb {E}}({\widehat{T}}_{-,+}) + 1}{{\mathbb {P}}\big (\big \{{\widehat{y}}_{{\widehat{T}}_{-,+}} \ge {\widehat{y}}_+\big \}\big )}\, +\, 2\right] ^{-1}. \end{aligned}$$
(29)

It now remains to compute the probability and the expectation on the r.h.s. of (29). This will essentially follow from the optional stopping theorem. It is convenient to define the quantities

$$\begin{aligned} {\widehat{y}}'_-&\;:=\; {\mathbb {E}}\big ({\widehat{y}}_{{\widehat{T}}_{-,+}} \,\big |\, {\widehat{y}}_{{\widehat{T}}_{-,+}} \le {\widehat{y}}_-\big )\,,&\qquad&{\widehat{y}}''_- \;:=\; \tfrac{1}{C_0\nu }\log \Big ({\mathbb {E}}\big (e^{C_0\nu {\widehat{y}}_{{\widehat{T}}_{-,+}}} \,\big |\, {\widehat{y}}_{{\widehat{T}}_{-,+}} \le {\widehat{y}}_-\big )\Big )\,,\\ {\widehat{y}}'_+&\;:=\; {\mathbb {E}}\big ({\widehat{y}}_{{\widehat{T}}_{-,+}} \,\big |\, {\widehat{y}}_{{\widehat{T}}_{-,+}} \ge {\widehat{y}}_+\big )\,,&\qquad&{\widehat{y}}''_+ \;:=\; \tfrac{1}{C_0\nu }\log \Big ({\mathbb {E}}\big (e^{C_0\nu {\widehat{y}}_{{\widehat{T}}_{-,+}}} \,\big |\, {\widehat{y}}_{{\widehat{T}}_{-,+}} \ge {\widehat{y}}_+\big )\Big )\,, \end{aligned}$$

in which \({\widehat{y}}'_-, {\widehat{y}}''_- \in \left[ {\widehat{y}}_--1,{\widehat{y}}_-\right] \) and \({\widehat{y}}'_+, {\widehat{y}}''_+ \in \left[ {\widehat{y}}_+,{\widehat{y}}_++1\right] \). Now (28) implies that \({\widehat{y}}_n - n{\mathbb {E}}(\chi )\) is a martingale. As \(|\chi | \le 1\) a.s., its increments are a.s. bounded, namely more precisely \(|{\widehat{y}}_{n+1} - (n+1){\mathbb {E}}(\chi ) - {\widehat{y}}_n + n{\mathbb {E}}(\chi )| \le 2\). Then, with \({\mathbb {E}}({\widehat{T}}_{-,+}) < \infty \) from Lemma 7, one can use the optional stopping theorem to find \(0 = {\mathbb {E}}({\widehat{y}}_0 - 0 \cdot {\mathbb {E}}(\chi )) = {\mathbb {E}}({\widehat{y}}_{{\widehat{T}}_{-,+}} - {\widehat{T}}_{-,+} \cdot {\mathbb {E}}(\chi ))\), or

$$\begin{aligned} {\mathbb {E}}({\widehat{T}}_{-,+}) \;=\; \frac{{\mathbb {E}}\big ({\widehat{y}}_{{\widehat{T}}_{-,+}}\big )}{{\mathbb {E}}(\chi )} \;=\; \frac{{\widehat{y}}'_-\big (1 - {\mathbb {P}}\big (\{{\widehat{y}}_{{\widehat{T}}_{-,+}} \ge {\widehat{y}}_+ \}\big )\big ) \,+\, {\widehat{y}}'_+{\mathbb {P}}\big (\{{\widehat{y}}_{{\widehat{T}}_{-,+}} \ge {\widehat{y}}_+ \}\big )}{{\mathbb {E}}(\chi )}. \end{aligned}$$
(30)

Now an expression for \({\mathbb {P}}\big (\{ {\widehat{y}}_{{\widehat{T}}_{-,+}} \ge {\widehat{y}}_+ \}\big )\) is needed. A standard technique (see, e.g., [7]) is based on the following lemma.

Lemma 8

There is a unique solution \(\nu \in (0,\infty )\) for the equation \({\mathbb {E}}(\kappa _0^{\nu }) = 1\), implying that \(e^{C_0\nu {\widehat{y}}_n}\) is a martingale.

Proof

The technical hypothesis states that \({\mathbb {P}}(\{\chi> 0\}) > 0\), hence \(\lim \limits _{\rho \rightarrow \infty } {\mathbb {E}}(e^{C_0\rho \chi }) = \infty \). Consider the map \(\rho \in {\mathbb {R}} \mapsto {\mathbb {E}}(e^{C_0\rho \chi }) \in (0,\infty )\), which is differentiable at \(\rho = 0\) with derivative

$$\begin{aligned} \partial _{\rho }\;{\mathbb {E}}(e^{C_0\rho \chi })\big |_{\rho = 0} \;=\; {\mathbb {E}} (\log (\kappa _0)) \;<\; 0. \end{aligned}$$

This map is continuous, equals 1 at \(\rho =0\) with negative derivative there, and diverges as \(\rho \rightarrow \infty \), so the intermediate value theorem yields a solution \(\nu \in (0,\infty )\) of \({\mathbb {E}}(e^{C_0\rho \chi }) = 1\), that is, of \({\mathbb {E}}(\kappa _0^{\nu })=1\). As the map is strictly convex, this solution is unique. \(\square \)
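Numerically, \(\nu \) can be located by bisection; the following sketch uses a hypothetical two-point law for \(\kappa _0\) with \({\mathbb {E}}(\log (\kappa _0))<0\), for which \(\nu \) happens to be explicitly computable.

```python
# Hypothetical example: kappa_0 = 1/2 with probability 2/3 and kappa_0 = 2 with
# probability 1/3, so E(log kappa_0) = -(1/3) * log 2 < 0.
def moment(rho):
    """E(kappa_0^rho) for the two-point example."""
    return (2 / 3) * 0.5**rho + (1 / 3) * 2.0**rho

# moment(rho) equals 1 at rho = 0, dips below 1, and diverges as rho -> infinity,
# so the second root nu can be bracketed and bisected:
lo, hi = 0.5, 10.0          # moment(0.5) < 1 < moment(10)
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if moment(mid) < 1.0:
        lo = mid
    else:
        hi = mid
nu = 0.5 * (lo + hi)        # here nu = 1 exactly, since (2/3)*(1/2) + (1/3)*2 = 1
```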

As \({\widehat{y}}_n \in [{\widehat{y}}_--1, {\widehat{y}}_++1]\) and \(|\chi | \le 1\) a.s., the increments of the martingale \(e^{C_0\nu {\widehat{y}}_n}\) of Lemma 8 are uniformly bounded by \(|e^{C_0\nu {\widehat{y}}_n} - e^{C_0\nu {\widehat{y}}_{n+1}}| = e^{C_0\nu {\widehat{y}}_n}|e^{C_0\nu \chi _n} - 1| \le e^{C_0\nu ({\widehat{y}}_++1)}|e^{C_0\nu } - 1|\). Using \({\mathbb {E}}({\widehat{T}}_{-,+}) < \infty \) from Lemma 7 to apply the optional stopping theorem to this martingale yields

$$\begin{aligned} 1 \;=\; e^{C_0\nu \cdot 0} \;=\; e^{C_0\nu {\widehat{y}}_0} \;=\; {\mathbb {E}}(e^{C_0\nu {\widehat{y}}_{{\widehat{T}}_{-,+}}}) \;=\; e^{C_0\nu {\widehat{y}}''_-}\big (1 - {\mathbb {P}}(\{ {\widehat{y}}_{{\widehat{T}}_{-,+}} \ge {\widehat{y}}_+ \})\big ) + e^{C_0\nu {\widehat{y}}''_+}{\mathbb {P}}(\{ {\widehat{y}}_{{\widehat{T}}_{-,+}} \ge {\widehat{y}}_+ \}). \end{aligned}$$
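As a standalone sanity check of this type of optional stopping argument, consider a hypothetical walk with steps \(+1\) with probability \(\frac{1}{3}\) and \(-1\) with probability \(\frac{2}{3}\), for which \(2^{S_n}\) is a martingale; the resulting exit probability is the classical gambler's-ruin formula, which a short simulation reproduces.

```python
import random

rng = random.Random(4)
M, L = 5, 5                 # absorbing barriers at -M and +L, walk starts at 0

def hits_top():
    s = 0
    while -M < s < L:
        s += 1 if rng.random() < 1 / 3 else -1   # up w.p. 1/3, down w.p. 2/3
    return s == L

p_hat = sum(hits_top() for _ in range(20000)) / 20000
# The martingale 2^{S_n} (note E(2^{step}) = (1/3)*2 + (2/3)*(1/2) = 1) and
# optional stopping give P(top) = (2^M - 1)/(2^{M+L} - 1) = 31/1023.
p_exact = (2**M - 1) / (2**(M + L) - 1)
```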

Inserting (30) into (29) and combining this with the foregoing finally gives

$$\begin{aligned} \big ({\mathbb {E}}({\widehat{T}}_1)\big )^{-1}&\;=\; \left[ \frac{{\widehat{y}}'_- + {\mathbb {E}}(\chi )}{{\mathbb {E}}(\chi ){\mathbb {P}}\big (\big \{{\widehat{y}}_{{\widehat{T}}_{-,+}} \ge {\widehat{y}}_+\big \}\big )}\, +\,\frac{{\widehat{y}}'_+ - {\widehat{y}}'_-}{{\mathbb {E}}(\chi )}\, +\, 2\right] ^{-1}\\&\;=\; \left[ \left( 1 + \frac{C_0\,{\widehat{y}}'_-}{{\mathbb {E}}(\log (\kappa _0))}\right) \,\frac{e^{C_0\nu {\widehat{y}}''_+}-e^{C_0\nu {\widehat{y}}''_-}}{1-e^{C_0\nu {\widehat{y}}''_-}}\, +\, \frac{C_0\,({\widehat{y}}'_+ - {\widehat{y}}'_-)}{{\mathbb {E}}(\log (\kappa _0))}\, +\, 2\right] ^{-1} \,, \end{aligned}$$

which together with the two statements of Lemma 6 implies that

$$\begin{aligned} C_- :=\; \left( 1 + \frac{C_0\,({\widehat{y}}_--1)}{{\mathbb {E}}(\log (\kappa _0))}\right) ^{-1}\frac{1-e^{C_0\nu ({\widehat{y}}_--1)}}{e^{C_0\nu }} \end{aligned}$$

satisfies

$$\begin{aligned} C_- \;\le \; \lim _{\epsilon \rightarrow 0} \left( 1 + \frac{C_0\,{\widehat{y}}'_-}{{\mathbb {E}}(\log (\kappa _0))}\right) ^{-1}\,\frac{1-e^{C_0\nu {\widehat{y}}''_-}}{\epsilon ^{\nu }e^{C_0\nu {\widehat{y}}''_+}} = \lim _{\epsilon \rightarrow 0} \frac{\big ({\mathbb {E}}({\widehat{T}}_1)\big )^{-1}}{\epsilon ^{\nu }}, \end{aligned}$$

namely the first statement of Proposition 2.
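To make the preceding optional-stopping computation concrete, the following Python sketch runs it on a toy walk with two-point increments. All parameters (\(p\), \(m\), \(k\)) are illustrative and not from the model above; since the toy walk has no overshoot, the exit values are exactly \(-m\) and \(+k\), and the exponential-martingale identity can be checked against a direct fixed-point computation of the exit probability.

```python
import math

# Toy walk, not the paper's process: increments chi = +1 w.p. p and -1 w.p. 1-p
# with p < 1/2, so E(chi) < 0; absorbing barriers at -m and +k (no overshoot).
p, m, k, C0 = 0.3, 4, 6, 1.0

# E(exp(C0*nu*chi)) = p e^{C0 nu} + (1-p) e^{-C0 nu} = 1 has the explicit
# positive solution e^{C0 nu} = (1-p)/p.
nu = math.log((1 - p) / p) / C0

# Optional stopping of the bounded martingale exp(C0*nu*y_n), with exact exit
# values -m and +k, gives 1 = e^{-C0 nu m}(1-P) + e^{C0 nu k} P, hence:
P_mart = (1 - math.exp(-C0 * nu * m)) / (math.exp(C0 * nu * k) - math.exp(-C0 * nu * m))

# Cross-check: h(y) = P(hit +k before -m | start in y) solves the harmonic
# equation h(y) = p h(y+1) + (1-p) h(y-1); iterate to the fixed point.
h = [0.0] * (m + k + 1)      # index i corresponds to y = i - m
h[m + k] = 1.0
for _ in range(5000):
    for i in range(1, m + k):
        h[i] = p * h[i + 1] + (1 - p) * h[i - 1]

assert abs(P_mart - h[m]) < 1e-10    # martingale formula matches direct solution
```

In the paper's setting the overshoot prevents such exact exit values, which is precisely why the conditional quantities \({\widehat{y}}''_\pm \) enter above.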

6 Upper bound on the rotation number

This section is structured just as the previous one, namely first a faster comparison process satisfying (21) and (22) is constructed and then the \(\epsilon \)-dependence of its expected stopping time is analyzed in order to prove the second statement of Proposition 2. It will be useful to introduce a suitable positive-valued function \(\lambda \) of \(\epsilon \), satisfying the defining properties

$$\begin{aligned} \lim _{\epsilon \rightarrow 0}\; \lambda \;=\; 0,\qquad \lim _{\epsilon \rightarrow 0}\; \frac{\log (\lambda )}{\log (\epsilon )} \;=\; 0, \qquad \lim _{\epsilon \rightarrow 0} \;\frac{\epsilon }{\lambda } \;=\; 0 , \end{aligned}$$
(31)

where the last property actually follows from the first two. For conciseness, an additional notation is introduced:

$$\begin{aligned} \Lambda :=\; e^{2C_0\lambda }. \end{aligned}$$

Note that \(\Lambda \) also depends on \(\epsilon \). Similarly to Sect. 5, three reference points \(0<{\widetilde{x}}_-<{\widetilde{x}}_c<{\widetilde{x}}_+<\infty \) will be needed w.r.t. which the dynamics has uniform properties schematically described in Fig. 4. While similar to Fig. 3, note that the original dynamics is now bounded above by these points, see Lemma 10. For its proof, let us start out with a counterpart to Lemma 4.
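As a concrete test of (31), the following sketch evaluates the three quantities along a sequence \(\epsilon \rightarrow 0\) for the choice \(\lambda = [\log (\epsilon )]^{-2}\) made in Sect. 7, and verifies the identity \(\epsilon /\lambda = \exp \big (\log (\epsilon )\,(1 - \log (\lambda )/\log (\epsilon ))\big )\) behind the claim that the last property follows from the first two.

```python
import math

# Test case for the defining properties (31), using the concrete choice
# lambda(eps) = (log eps)^{-2} made later in the paper for the balanced case.
prev = None
for eps in [1e-4, 1e-8, 1e-16, 1e-32]:
    lam = 1.0 / math.log(eps) ** 2
    r1, r2, r3 = lam, math.log(lam) / math.log(eps), eps / lam
    # the third property is forced by the first two, via the identity
    # eps/lam = exp(log(eps) * (1 - log(lam)/log(eps))):
    assert abs(r3 - math.exp(math.log(eps) * (1.0 - r2))) <= 1e-12 * r3
    if prev is not None:
        assert r1 < prev[0] and r2 < prev[1] and r3 < prev[2]  # all three decay
    prev = (r1, r2, r3)
```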

Lemma 9

There exist \({\widetilde{x}}_-\) and \({\widetilde{x}}_+\) depending on \(C_0\), \(C_2\), \(C_3\) and \(\epsilon \) such that

$$\begin{aligned} x \,\in \,[{\widetilde{x}}_-,{\widetilde{x}}_+] \qquad \Longrightarrow \qquad Q \cdot (D \cdot x) \,\le \, \Lambda (D \cdot x) , \end{aligned}$$

as well as

$$\begin{aligned} \lim _{\epsilon \rightarrow 0} \;\frac{\lambda \,{\widetilde{x}}_-}{\epsilon } \;=\; \frac{C_2\,e^{2C_0}}{2C_0} , \qquad \lim _{\epsilon \rightarrow 0} \;\frac{\epsilon \,{\widetilde{x}}_+}{\lambda } \;=\; \frac{2C_0}{C_2\,e^{2C_0}}. \end{aligned}$$
(32)

Proof

For \(x \in [0,+\infty )\), one can estimate

$$\begin{aligned} Q \cdot (D \cdot x) \;=\; \frac{(1+\epsilon ^2\alpha )(D \cdot x) + (a-b-\epsilon \beta )\epsilon }{1+\epsilon ^2\delta - (a+b+\epsilon \gamma )\epsilon (D \cdot x)} \;\le \; \frac{(1+C_3\epsilon ^2)(D \cdot x) + (C_2+C_3\epsilon )\epsilon }{1-C_3\epsilon ^2 - (C_2+C_3\epsilon )\epsilon (D \cdot x)}. \end{aligned}$$

The latter is smaller than or equal to \(\Lambda (D \cdot x)\) if and only if

$$\begin{aligned} \Lambda (C_2+C_3\epsilon )\epsilon (D \cdot x)^2 - \big (\Lambda -1-C_3\epsilon ^2(\Lambda +1)\big )(D \cdot x) + (C_2+C_3\epsilon )\epsilon \;\le \; 0. \end{aligned}$$

Let us equate this to \(A(D \cdot x)^2 - B(D \cdot x) + C\), namely set

$$\begin{aligned} A :=\; \Lambda (C_2+C_3\epsilon )\epsilon , \qquad B := \; \Lambda -1-C_3\epsilon ^2(\Lambda +1) , \qquad C := \; (C_2+C_3\epsilon )\epsilon . \end{aligned}$$

Note that for \(\epsilon \rightarrow 0\), \(\frac{A}{C_2\epsilon }\), \(\frac{B}{2C_0\lambda }\) and \(\frac{C}{C_2\epsilon }\) all converge to 1 by (31), hence \(A,B,C \in (0,\infty )\) for \(\epsilon \) small. Now let us search for real solutions of the quadratic equation \(A(D \cdot x)^2 - B(D \cdot x) + C=0\). They exist whenever \(B^2 - 4AC \ge 0\), which follows from the foregoing limits and (31). Then, denoting the two real zeroes of the quadratic equation by \(x_- \le x_+\), the polynomial \(A(D \cdot x)^2 - B(D \cdot x) + C\) equals \(A\left[ (D \cdot x)-x_+\right] \left[ (D \cdot x)-x_-\right] \). The fact that \(\sqrt{1-r} \ge 1-\frac{r}{2}-\frac{r^2}{2} \ge 1-r\) for all \(r \in [0,1]\) implies after some algebra that \(x_- \le \frac{(B^2+4AC)C}{B^3}\) and \(x_+\ge \frac{B^2-2AC}{AB} \). Hence let us set

$$\begin{aligned} {\widetilde{x}}_-&\;:=\; e^{2C_0}\,\frac{(B^2+4AC)C}{B^3} \\&\;=\; \frac{\big [\big (\Lambda -1-C_3\epsilon ^2(\Lambda +1)\big )^2 + 4\Lambda (C_2+C_3\epsilon )^2\epsilon ^2\big ](C_2+C_3\epsilon )\,e^{2C_0}\,\epsilon }{\left( \Lambda -1-C_3\epsilon ^2\left[ \Lambda +1\right] \right) ^3}\,,\\ {\widetilde{x}}_+&\;:=\; e^{-2C_0}\,\frac{B^2-2AC}{AB} \;=\; \frac{\big (\Lambda -1-C_3\epsilon ^2(\Lambda +1)\big )^2 \,- \,2\,\Lambda \,(C_2+C_3\epsilon )^2\,\epsilon ^2}{\big (\Lambda -1-C_3\epsilon ^2(\Lambda +1)\big )\,\Lambda \,(C_2+C_3\epsilon )\,e^{2C_0}\,\epsilon }\,. \end{aligned}$$

For \(x \in [{\widetilde{x}}_-,{\widetilde{x}}_+]\), one has due to \(e^{-2C_0}x \le (D \cdot x) \le e^{2C_0}x\) that \((D \cdot x) \in [e^{-2C_0}{\widetilde{x}}_-,e^{2C_0}{\widetilde{x}}_+]\subset [x_-,x_+]\), which by the above implies the first statement. The limits (32) follow again by the given limit behavior of A, B and C as well as from (31). \(\square \)
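The two root bounds used in the proof can be probed numerically. In the sketch below, the grid of coefficients is illustrative, chosen to mimic the small-\(\epsilon \) regime \(A \approx C \approx C_2\epsilon \ll B \approx 2C_0\lambda \), so that \(4AC/B^2 \in (0,1)\).

```python
import math

# Check of the root bounds from the proof: for A, B, C > 0 with B^2 >= 4AC,
# the zeroes x_- <= x_+ of A z^2 - B z + C satisfy
#   x_- <= (B^2 + 4AC) C / B^3   and   x_+ >= (B^2 - 2AC)/(AB).
for s in [1e-2, 1e-4, 1e-6, 1e-8]:
    A = C = s                        # illustrative: A ~ C ~ eps
    B = 3.0 * math.sqrt(s)           # then 4AC/B^2 = 4s/9 < 1
    disc = math.sqrt(B * B - 4 * A * C)
    x_minus = 2 * C / (B + disc)     # numerically stable form of (B - disc)/(2A)
    x_plus = (B + disc) / (2 * A)
    assert x_minus <= (B * B + 4 * A * C) * C / B ** 3 * (1 + 1e-12)
    assert x_plus >= (B * B - 2 * A * C) / (A * B) * (1 - 1e-12)
```

The stable formula \(x_-=2C/(B+\sqrt{B^2-4AC})\) avoids the cancellation in \((B-\sqrt{B^2-4AC})/(2A)\) when \(4AC \ll B^2\), which is exactly the regime of interest here.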

Let us now complete the left part of Fig. 4 by setting

$$\begin{aligned} {\widetilde{x}}_c :=\; e^{2C_0}\,\Lambda \,{\widetilde{x}}_-. \end{aligned}$$

The next statement corresponds to Lemma 5.

Fig. 4

The arrows on the left part illustrate properties of the original Dyson–Schmidt dynamics on \((0,\infty )\) as stated in Lemma 10. The right part illustrates the notations after the logarithmic transformation \({\widetilde{f}}\) to \({\mathbb {R}}\)

Lemma 10

For each realization, one has

$$\begin{aligned}&x \,\notin \, [0,\infty )&\qquad \Longrightarrow \qquad&Q \cdot (D \cdot x) \,\notin \, [{\widetilde{x}}_-,\infty )\,, \end{aligned}$$
(33)
$$\begin{aligned}&x \,\notin \, [{\widetilde{x}}_-,\infty )&\qquad \Longrightarrow \qquad&Q \cdot (D \cdot x) \,\notin \, [{\widetilde{x}}_c,\infty )\,, \end{aligned}$$
(34)
$$\begin{aligned}&Q \cdot (D \cdot x) \,\notin \, [0,\infty )&\qquad \Longrightarrow \qquad&x \,\notin \, [0,{\widetilde{x}}_+)\,. \end{aligned}$$
(35)

Proof

For (33), let \(x \notin [0,\infty )\), so then \(D \cdot x \notin [0,\infty )\). By combining the order-preserving property (16) with (18) and the hypothesis, one has for non-negative \(Q \cdot (D \cdot x)\) that

$$\begin{aligned} Q \cdot (D \cdot x) \;<\; Q \cdot 0 \;=\; \tfrac{(a-b-\epsilon \beta )\epsilon }{1+\epsilon ^2\delta } \;\le \; \tfrac{(C_2+C_3\epsilon )\epsilon }{1-C_3\epsilon ^2} \;\le \; C_2\,e^{2C_0}\epsilon . \end{aligned}$$

By (31) and (32) it indeed follows that \(C_2e^{2C_0}\epsilon < {\widetilde{x}}_-\) for \(\epsilon \) small enough. For the proof of (34), combining its hypothesis, the order-preserving property (16), Lemma 9 and the fact that \(D \cdot x' \le e^{2C_0}x'\) for non-negative \(x'\), yields

$$\begin{aligned} Q \cdot (D \cdot x) \;<\; Q \cdot (D \cdot {\widetilde{x}}_-) \;\le \; \Lambda \,e^{2C_0}\,{\widetilde{x}}_- \;=\;{\widetilde{x}}_c. \end{aligned}$$

Finally let us verify (35) by contraposition. If \(x \in [0,{\widetilde{x}}_+)\), then the order-preserving property (16) and Lemma 9 imply

$$\begin{aligned} 0 \;\le \; Q \cdot 0 \;\le \; Q \cdot (D \cdot x) \;\le \; Q \cdot (D \cdot {\widetilde{x}}_+) \;\le \; \Lambda (D \cdot {\widetilde{x}}_+) \;<\;\infty , \end{aligned}$$

just as claimed. \(\square \)

Now a new process \({\widetilde{x}}=({\widetilde{x}}_n)_{n\ge 0}\) on \([0,\infty ]\) is constructed by setting for \(n\ge 0\)

$$\begin{aligned} {\widetilde{x}}_0 \;=\; {\widetilde{x}}_-, \qquad {\widetilde{x}}_{n+1} \;=\; {\left\{ \begin{array}{ll} {\widetilde{x}}_c, &{} \text { if } {\widetilde{x}}_n \le {\widetilde{x}}_-, \\ \Lambda (D_n \cdot {\widetilde{x}}_n) ,&{}\text { if } {\widetilde{x}}_n \in ({\widetilde{x}}_-,{\widetilde{x}}_+), \\ \infty , &{}\text { else, so if } {\widetilde{x}}_n \ge {\widetilde{x}}_+. \end{array}\right. } \end{aligned}$$

Comparing with (17), the main case \({\widetilde{x}}_{n+1}=\Lambda (D_n \cdot {\widetilde{x}}_n)\) of this process simply bounds the action of \(Q_n\) from above for \(n\ge 2\). Let us now argue why this process satisfies the first inequality in (21). Indeed, replacing the action of \(Q_n\) by multiplication by \(\Lambda \) speeds up the process, due to the order-preserving property (16) and Lemma 9, which applies in the case \({\widetilde{x}}_n \in ({\widetilde{x}}_-,{\widetilde{x}}_+)\). Carefully analyzing the first case in the definition of \({\widetilde{x}}_{n+1}\) in combination with (33) and (34) shows that a.s. \(x_{N_{(1)} + n} \le {\widetilde{x}}_n\) for all \(n \in \lbrace 0,1,\dots ,N_{(2)} - 1 - N_{(1)}\rbrace \), that is, as long as \(x_{N_{(1)} + n} \in [0,\infty )\). Moreover, by (35) it is impossible that \({\widetilde{x}}_{N_{(2)} - N_{(1)}} \ne \infty \), as this would imply the contradiction \(x_{N_{(2)}-1} \ge {\widetilde{x}}_+ > {\widetilde{x}}_{N_{(2)} - N_{(1)}-1}\). Therefore also (22) holds, and indeed \({\widetilde{T}}_1 \le N_{(2)} - N_{(1)}\) since a.s. \({\widetilde{x}}_{{\widetilde{T}}_1} = \infty \).

Now let us proceed to prove the limit behavior of \(\big ({\mathbb {E}}({\widetilde{T}}_1)\big )^{-1}\) as given in Proposition 2. As in the previous section, this is achieved by passing to a shifted logarithm of the Dyson–Schmidt variables, here via the map \({\widetilde{f}}:(0,\infty )\rightarrow {\mathbb {R}}\) given by

$$\begin{aligned} {\widetilde{f}}(x) :=\; \frac{1}{2C_0}\,\log \Big (\frac{x}{{\widetilde{x}}_c}\Big ). \end{aligned}$$

By construction, \({\widetilde{f}}({\widetilde{x}}_c) = 0\). Furthermore, for n such that \({\widetilde{x}}_{n+1}<\infty \), let us introduce

$$\begin{aligned} {\widetilde{y}}_n :=\; {\widetilde{f}}({\widetilde{x}}_{n+1}) , \qquad {\widetilde{y}}_-:=\;{\widetilde{f}}({\widetilde{x}}_-) , \qquad {\widetilde{y}}_+:=\;{\widetilde{f}}({\widetilde{x}}_+) , \end{aligned}$$

and the stopping time

$$\begin{aligned} {\widetilde{T}}_{-,+} :=\; \inf \big \{ n \in {\mathbb {N}} :\, {\widetilde{y}}_n\notin ({\widetilde{y}}_-,{\widetilde{y}}_+) \big \}. \end{aligned}$$

As long as \(n \le {\widetilde{T}}_{-,+}\), it holds that

$$\begin{aligned} {\widetilde{y}}_n \;=\; \tfrac{1}{2C_0}\log \Big (\frac{\Lambda ^{n} (D^n \cdot {\widetilde{x}}_1)}{{\widetilde{x}}_c}\Big ) \;=\; \tfrac{1}{2C_0}\log \Big (\prod _{j=N_{(1)}+1}^{N_{(1)}+n} e^{2C_0\lambda }\kappa _j^2\Big ) \;=\; \sum _{j=N_{(1)}+1}^{N_{(1)}+n} (\chi _j+\lambda ), \end{aligned}$$
(36)

namely \( {\widetilde{y}}_n \) is a random walk starting at \({\widetilde{y}}_0=0\). For \(\lambda \) small enough, it still has a drift in the negative direction. The following two lemmata collect properties of these newly introduced quantities.

Lemma 11

For \(\epsilon \rightarrow 0\), both \(\frac{C_0{\widetilde{y}}_+}{-\log (\epsilon )}\) and \(-{\widetilde{y}}_-\) converge to 1.

Proof

The first statement follows from the limit behavior of \({\widetilde{x}}_-\) and \({\widetilde{x}}_+\) as given in (32):

$$\begin{aligned} \lim _{\epsilon \rightarrow 0} \frac{C_0{\widetilde{y}}_+}{-\log (\epsilon )} \;=\; \lim _{\epsilon \rightarrow 0} \frac{\log ({\widetilde{x}}_+)-\log (e^{2C_0}\Lambda )-\log ({\widetilde{x}}_-)}{-2\log (\epsilon )} \;=\; \lim _{\epsilon \rightarrow 0} \frac{\log (\lambda ) - \log (\epsilon )}{-\log (\epsilon )} \;=\; 1 \end{aligned}$$

by (31). The second statement follows from the observation that \({\widetilde{y}}_- = -1-\lambda \). \(\square \)
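Lemma 11 can be illustrated numerically by inserting the closed-form expressions for \({\widetilde{x}}_\pm \) from Lemma 9 together with \({\widetilde{x}}_c = e^{2C_0}\Lambda {\widetilde{x}}_-\). The constants \(C_0=C_2=C_3=1\) are illustrative, and \(\lambda =[\log (\epsilon )]^{-2}\) is the choice from Sect. 7; note that the first ratio approaches 1 only very slowly, reflecting the \(\log (\lambda )/\log (\epsilon )\) correction.

```python
import math

# Illustrative check of Lemma 11 with C0 = C2 = C3 = 1 and lambda = (log eps)^{-2};
# x_-, x_+ are the closed-form expressions from Lemma 9, x_c = e^{2 C0} Lambda x_-.
ratios = []
for eps in [1e-4, 1e-8, 1e-12]:
    lam = 1.0 / math.log(eps) ** 2
    L = math.exp(2 * lam)                        # Lambda = e^{2 C0 lambda}
    A = L * (1 + eps) * eps
    B = L - 1 - eps ** 2 * (L + 1)
    C = (1 + eps) * eps
    x_minus = math.e ** 2 * (B * B + 4 * A * C) * C / B ** 3
    x_plus = math.e ** -2 * (B * B - 2 * A * C) / (A * B)
    x_c = math.e ** 2 * L * x_minus
    y_minus = 0.5 * math.log(x_minus / x_c)      # f~ with C0 = 1
    y_plus = 0.5 * math.log(x_plus / x_c)
    assert abs(y_minus + 1 + lam) < 1e-9         # y_- = -(1 + lambda) exactly
    ratios.append(y_plus / (-math.log(eps)))     # C0 y_+ / (-log eps)
assert ratios[0] < ratios[1] < ratios[2] < 1     # slow increase towards 1
```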

Lemma 12

\({\mathbb {E}}({\widetilde{T}}_{-,+}) < +\infty \).

Proof

As \(\lambda \rightarrow 0\) for \(\epsilon \rightarrow 0\) and \({\mathbb {P}}(\{\chi> 0\})> 0\), the probability \({\widetilde{p}}:={\mathbb {P}}(\{\chi \ge \lambda \})\) is strictly positive for \(\epsilon \) small enough. Denoting \({\widetilde{E}}:= \lceil \frac{{\widetilde{y}}_+-{\widetilde{y}}_-}{\lambda }\rceil \) and introducing the random variable

$$\begin{aligned} {\widetilde{N}} :=\; \min \big \{ n \in {\mathbb {N}} :\, \chi _{(n-1){\widetilde{E}} + 1} \ge \lambda ,\;\; \chi _{(n-1){\widetilde{E}} + 2} \ge \lambda , \,\dots ,\;\; \chi _{n{\widetilde{E}}} \ge \lambda \big \} , \end{aligned}$$

the latter is then geometrically distributed with success probability \({\widetilde{p}}^{{\widetilde{E}}}\). In particular, one has \({\mathbb {E}}({\widetilde{N}}) < \infty \). Moreover, \({\widetilde{T}}_{-,+} < {\widetilde{E}}\,{\widetilde{N}}\) a.s. by construction, so \({\mathbb {E}}({\widetilde{T}}_{-,+})< {\widetilde{E}}\,{\mathbb {E}}({\widetilde{N}}) < \infty \). \(\square \)
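The finiteness statement can be illustrated by a seeded Monte Carlo run. The increment distribution, the value of \(\lambda \) and the barriers below are toy choices, not those of the model; the empirical mean stopping time is finite and lies far below the (very crude) geometric-block bound \({\widetilde{E}}\,{\mathbb {E}}({\widetilde{N}})\).

```python
import math
import random

# Toy Monte Carlo for the stopping time of a walk with increments chi_j + lambda,
# chi_j uniform on [-1, 1/2] (so E(chi) < 0 and |chi| <= 1); barriers are toy values.
random.seed(0)
lam, y_minus, y_plus = 0.05, -1.05, 1.0
E_blocks = math.ceil((y_plus - y_minus) / lam)    # block length E~
p = (0.5 - lam) / 1.5                             # P(chi >= lambda)
bound = E_blocks / p ** E_blocks                  # crude bound E~ * E(N~)

times = []
for _ in range(2000):
    y, t = 0.0, 0
    while y_minus < y < y_plus:
        y += random.uniform(-1.0, 0.5) + lam
        t += 1
    times.append(t)
mean_T = sum(times) / len(times)
assert 1.0 <= mean_T < 100.0   # empirically finite and small
assert mean_T < bound          # consistent with E(T) < E~ E(N~)
```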

The connection between the two stopping times \({\widetilde{T}}_{-,+}\) and \({\widetilde{T}}_1\) is almost identical to that in the previous section: this time it holds that \({\mathbb {E}}\big ({\widetilde{T}}_1 \,\big |\, {\widetilde{y}}_{{\widetilde{T}}_{-,+}} \ge {\widetilde{y}}_+\big ) = {\mathbb {E}}\big ({\widetilde{T}}_{-,+} + 2 \,\big |\, {\widetilde{y}}_{{\widetilde{T}}_{-,+}} \ge {\widetilde{y}}_+\big )\). Therefore, up to this single change, the argument leading to (29) directly transposes (simply by replacing all hats with tildes, and with 2 instead of 3 everywhere), so that one has

$$\begin{aligned} \big ({\mathbb {E}}({\widetilde{T}}_1)\big )^{-1} \;=\; \left[ \frac{{\mathbb {E}}({\widetilde{T}}_{-,+}) + 1}{{\mathbb {P}}\big (\big \{{\widetilde{y}}_{{\widetilde{T}}_{-,+}} \ge {\widetilde{y}}_+\big \}\big )}\, +\, 1\right] ^{-1}. \end{aligned}$$
(37)

In complete analogy with the previous section, one can next define

$$\begin{aligned} {\widetilde{y}}'_-&\;:=\; {\mathbb {E}}\big ({\widetilde{y}}_{{\widetilde{T}}_{-,+}} \,\big |\, {\widetilde{y}}_{{\widetilde{T}}_{-,+}} \le {\widetilde{y}}_-\big )\,,&\qquad&{\widetilde{y}}''_- \;:=\; \tfrac{1}{C_0\nu }\log \Big ({\mathbb {E}}\big (e^{C_0\nu {\widetilde{y}}_{{\widetilde{T}}_{-,+}}} \,\big |\, {\widetilde{y}}_{{\widetilde{T}}_{-,+}} \le {\widetilde{y}}_-\big )\Big )\,,\\ {\widetilde{y}}'_+&\;:=\; {\mathbb {E}}\big ({\widetilde{y}}_{{\widetilde{T}}_{-,+}} \,\big |\, {\widetilde{y}}_{{\widetilde{T}}_{-,+}} \ge {\widetilde{y}}_+\big )\,,&\qquad&{\widetilde{y}}''_+ \;:=\; \tfrac{1}{C_0\nu }\log \Big ({\mathbb {E}}\big (e^{C_0\nu {\widetilde{y}}_{{\widetilde{T}}_{-,+}}} \,\big |\, {\widetilde{y}}_{{\widetilde{T}}_{-,+}} \ge {\widetilde{y}}_+\big )\Big )\,, \end{aligned}$$

in which \({\widetilde{y}}'_-, {\widetilde{y}}''_- \in \left[ {\widetilde{y}}_--1+\lambda ,{\widetilde{y}}_-\right] \) and \({\widetilde{y}}'_+, {\widetilde{y}}''_+ \in \left[ {\widetilde{y}}_+,{\widetilde{y}}_++1+\lambda \right] \). Now (36) implies that this time \({\widetilde{y}}_n - n({\mathbb {E}}(\chi ) + \lambda )\) is a martingale. As \(|\chi | \le 1\) a.s., its increments are a.s. bounded, more precisely \(|{\widetilde{y}}_{n+1} - (n+1)({\mathbb {E}}(\chi ) + \lambda ) - {\widetilde{y}}_n + n({\mathbb {E}}(\chi ) + \lambda )| \le 2\). With \({\mathbb {E}}({\widetilde{T}}_{-,+}) < \infty \) from Lemma 12, the optional stopping theorem yields \(0 = {\mathbb {E}}({\widetilde{y}}_0 - 0 \cdot ({\mathbb {E}}(\chi ) + \lambda )) = {\mathbb {E}}({\widetilde{y}}_{{\widetilde{T}}_{-,+}} - {\widetilde{T}}_{-,+} \cdot ({\mathbb {E}}(\chi ) + \lambda ))\), or

$$\begin{aligned} {\mathbb {E}}({\widetilde{T}}_{-,+}) \;=\; \frac{{\mathbb {E}}\big ({\widetilde{y}}_{{\widetilde{T}}_{-,+}}\big )}{{\mathbb {E}}(\chi ) + \lambda } \;=\; \frac{{\widetilde{y}}'_-\big (1 - {\mathbb {P}}\big (\{{\widetilde{y}}_{{\widetilde{T}}_{-,+}} \ge {\widetilde{y}}_+ \}\big )\big ) \,+\, {\widetilde{y}}'_+{\mathbb {P}}\big (\{{\widetilde{y}}_{{\widetilde{T}}_{-,+}} \ge {\widetilde{y}}_+ \}\big )}{{\mathbb {E}}(\chi ) + \lambda }.\nonumber \\ \end{aligned}$$
(38)

Lemma 13

For \(\lambda \) close enough to 0, i.e., for \(\epsilon \) small enough, there is a unique solution \({\widetilde{\nu }} \in (0,\infty )\) of the equation \({\mathbb {E}}(e^{C_0\rho (\chi + \lambda )}) = 1\) in the variable \(\rho \), implying that \(e^{C_0{\widetilde{\nu }}{\widetilde{y}}_n}\) is a martingale. Moreover, \({\widetilde{\nu }} < \nu \) and \(\lim _{\lambda \rightarrow 0} {\widetilde{\nu }} = \nu \).

Proof

As \({\mathbb {E}}(\chi ) + \lambda < 0\) for \(\lambda \) small enough and \({\mathbb {P}}(\{\chi + \lambda > 0\})\) is still positive, the proof of existence and uniqueness of \({\widetilde{\nu }}\) is identical to that of Lemma 8. Now, for \(\rho \in (0,\infty )\) the value of \(e^{C_0\rho \lambda }\) strictly decreases as \(\lambda \) decreases to 0. The strict convexity thus implies that \({\widetilde{\nu }}\) strictly increases as \(\lambda \) decreases to 0; hence \({\widetilde{\nu }} < \nu \). If \({\widetilde{\nu }} \le \nu ^{\prime }\) for all \(\lambda > 0\) and some \(\nu ^{\prime } < \nu \), then \({\widetilde{\nu }}< \frac{\nu ^{\prime }+\nu }{2} < \nu \) so that \({\mathbb {E}}\big (\exp (C_0\frac{\nu ^{\prime }+\nu }{2}(\chi + \lambda ))\big ) \ge 1\) for \(\lambda \) sufficiently small, contradicting the uniqueness of the solution \(\rho = \nu \) on \((0,\infty )\) of \({\mathbb {E}}(e^{C_0\rho \chi }) = 1\). \(\square \)
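For a two-point \(\chi \) the defining equations for \(\nu \) and \({\widetilde{\nu }}\) can be solved numerically. The sketch below uses an illustrative distribution (\(\chi =+1\) w.p. \(0.3\), \(-1\) w.p. \(0.7\), \(C_0=1\), none of which is from the model) and confirms \({\widetilde{\nu }}<\nu \), the monotonicity in \(\lambda \), and the convergence \({\widetilde{\nu }}\rightarrow \nu \).

```python
import math

# Illustrative check of Lemma 13 for a two-point chi (= +1 w.p. 0.3, -1 w.p. 0.7,
# so E(chi) < 0) and C0 = 1: solve E(exp(C0 rho (chi + lam))) = 1 by bisection.
p, C0 = 0.3, 1.0
nu = math.log((1 - p) / p) / C0           # explicit solution for lam = 0

def nu_tilde(lam):
    def g(s):
        return p * math.exp(C0 * s * (1 + lam)) + (1 - p) * math.exp(C0 * s * (-1 + lam)) - 1.0
    lo, hi = 1e-6, nu                     # g(lo) < 0 < g(hi) for these small lam
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid) < 0 else (lo, mid)
    return 0.5 * (lo + hi)

vals = [nu_tilde(lam) for lam in (0.1, 0.01, 0.001)]
assert vals[0] < vals[1] < vals[2] < nu   # nu~ < nu, increasing as lam decreases
assert nu - vals[-1] < 0.01               # and nu~ -> nu
```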

As \({\widetilde{y}}_n \in [{\widetilde{y}}_--1+\lambda , {\widetilde{y}}_++1+\lambda ]\) and \(|\chi | \le 1\) a.s., the increments of the martingale \(e^{C_0{\widetilde{\nu }}{\widetilde{y}}_n}\) of Lemma 13 are uniformly bounded by \(|e^{C_0{\widetilde{\nu }}{\widetilde{y}}_n} - e^{C_0{\widetilde{\nu }}{\widetilde{y}}_{n+1}}| = e^{C_0{\widetilde{\nu }}{\widetilde{y}}_n}|e^{C_0{\widetilde{\nu }}(\chi _n + \lambda )} - 1| \le e^{C_0{\widetilde{\nu }}({\widetilde{y}}_++1+\lambda )}|e^{C_0{\widetilde{\nu }}(1+\lambda )} - 1|\). Then, exactly as in the previous section, replacing \(\chi \) by \(\chi + \lambda \) and all hats by tildes, one gets

$$\begin{aligned} \big ({\mathbb {E}}({\widetilde{T}}_1)\big )^{-1} \;=\; \left[ \left( 1 + \frac{C_0\,{\widetilde{y}}'_-}{{\mathbb {E}}(\log (\kappa _0)) + C_0\lambda }\right) \,\frac{e^{C_0{\widetilde{\nu }}{\widetilde{y}}''_+}-e^{C_0{\widetilde{\nu }}{\widetilde{y}}''_-}}{1-e^{C_0{\widetilde{\nu }}{\widetilde{y}}''_-}}\, +\, \frac{C_0\,({\widetilde{y}}'_+ - {\widetilde{y}}'_-)}{{\mathbb {E}}(\log (\kappa _0)) + C_0\lambda }\, +\, 1\right] ^{-1} , \end{aligned}$$
(39)

which together with the two statements of Lemma 11 implies the second statement of Proposition 2 with

$$\begin{aligned} C_+ :=\; \left( 1 + \frac{C_0\,{\widetilde{y}}_-}{{\mathbb {E}}(\log (\kappa _0))}\right) ^{-1}(1-e^{C_0\nu {\widetilde{y}}_-}) , \end{aligned}$$

because

$$\begin{aligned} C_+ \;\ge \; \lim _{\epsilon \rightarrow 0} \left( 1 + \frac{C_0\,{\widetilde{y}}'_-}{{\mathbb {E}}(\log (\kappa _0)) + C_0\lambda }\right) ^{-1}\,\frac{1-e^{C_0{\widetilde{\nu }}{\widetilde{y}}''_-}}{\epsilon ^{{\widetilde{\nu }}}e^{C_0{\widetilde{\nu }}{\widetilde{y}}''_+}} \;=\; \lim _{\epsilon \rightarrow 0} \frac{\big ({\mathbb {E}}({\widetilde{T}}_1)\big )^{-1}}{\epsilon ^{{\widetilde{\nu }}}}. \end{aligned}$$
Fig. 5

The dynamics of \(\theta _n\) on the real line in the balanced case where \({\mathbb {E}}(\log (\kappa _0)) = 0\). Contrary to the unbalanced case depicted in Fig. 2, there are no drifts in this situation

7 Modifications for the balanced case

This final section considers the balanced case \({\mathbb {E}}(\log (\kappa _0)) = 0\). Hence Fig. 2 is no longer valid, but rather has to be modified to Fig. 5. The action induced by \(D_{\kappa }\) now yields no average drift anywhere on \(\overline{{\mathbb {R}}}\). Therefore, the random dynamics on the two half-axes \((-\infty ,0)\cup \{\infty \}\) and \([0,\infty )\) is essentially the same, up to flipping the sign of \(b_n\), swapping \(\alpha ^\epsilon _n\) for \(\delta ^\epsilon _n\) and \(\beta ^\epsilon _n\) for \(-\gamma ^\epsilon _n\), as well as changing \(\kappa _n\) to \(\kappa _n^{-1}\). Indeed, the bijective orientation-preserving map \(x \in (-\infty ,0)\cup \{\infty \}\mapsto -x^{-1}\in [0,\infty )\) identifies these intervals and, moreover,

$$\begin{aligned} Q^\epsilon _n \cdot (-x^{-1}) \;=\; -\left[ \frac{(1+\epsilon ^2\delta ^\epsilon _n)x + (a_n+b_n+\epsilon \gamma ^\epsilon _n)\epsilon }{1+\epsilon ^2\alpha ^\epsilon _n - (a_n-b_n-\epsilon \beta ^\epsilon _n)\epsilon x}\right] ^{-1}, \quad D_n \cdot (-x^{-1}) \;=\; -\,\left[ \kappa _n^{-2}x\right] ^{-1}. \end{aligned}$$

As all estimates making use of the constants \(C_1\), \(C_2\), \(C_3\) and \({\mathbb {E}}(\log (\kappa _0)^2)\) are invariant under the above swapping, it is sufficient to analyze the random dynamics on \([0,\infty )\). So, (24) changes to

$$\begin{aligned} \frac{1}{2{\mathbb {E}}({\widehat{T}}_1^\epsilon )} \;\le \; \lim _{N \rightarrow \infty } \frac{1}{N}\frac{{\mathbb {E}}(\theta ^\epsilon _N)}{\pi } \;\le \; \frac{1}{2{\mathbb {E}}({\widetilde{T}}_1^\epsilon )}, \end{aligned}$$
(40)

for which again families of non-negative i.i.d. random variables \(({\widehat{x}}^\epsilon _{2k-1})_{k\in {\mathbb {N}}}\), \(({\widehat{x}}^\epsilon _{2k})_{k\in {\mathbb {N}}}\), \(({\widetilde{x}}^\epsilon _{2k-1})_{k\in {\mathbb {N}}}\) and \(({\widetilde{x}}^\epsilon _{2k})_{k\in {\mathbb {N}}}\) can be constructed. Note that this time the bounds do not differentiate between the processes with odd and even index k. Applying logarithmic transformations similar to \({\widehat{f}}\) and \({\widetilde{f}}\) to the processes \({\widehat{x}}^\epsilon _1\) and \({\widetilde{x}}^\epsilon _1\), one obtains exactly the same processes \({\widehat{y}}\) and \({\widetilde{y}}\) as in (28) and (36) (again for n smaller than some stopping time similar to \({\widehat{T}}_{-,+}\) or \({\widetilde{T}}_{-,+}\), respectively). Also the calculations leading to (29) and (37) remain valid. As in the previous two sections, each expectation in (40) can be found by applying the optional stopping theorem to two martingales. In addition to the primed constants that were introduced before, let us set

$$\begin{aligned} {\widehat{y}}'''_-&\;:=\; -\sqrt{{\mathbb {E}}\big ({\widehat{y}}^2_{{\widehat{T}}_{-,+}} \,\big |\, {\widehat{y}}_{{\widehat{T}}_{-,+}} \le {\widehat{y}}_-\big )}\,,&\qquad&{\widetilde{y}}'''_- \;:=\; -\sqrt{{\mathbb {E}}\big ({\widetilde{y}}^2_{{\widetilde{T}}_{-,+}} \,\big |\, {\widetilde{y}}_{{\widetilde{T}}_{-,+}} \le {\widetilde{y}}_-\big )}\,,\\ {\widehat{y}}'''_+&\;:=\; \sqrt{{\mathbb {E}}\big ({\widehat{y}}^2_{{\widehat{T}}_{-,+}} \,\big |\, {\widehat{y}}_{{\widehat{T}}_{-,+}} \ge {\widehat{y}}_+\big )}\,,&\qquad&{\widetilde{y}}'''_+ \;:=\; \sqrt{{\mathbb {E}}\big ({\widetilde{y}}^2_{{\widetilde{T}}_{-,+}} \,\big |\, {\widetilde{y}}_{{\widetilde{T}}_{-,+}} \ge {\widetilde{y}}_+\big )}\,. \end{aligned}$$

The estimates \({\widehat{y}}'''_- \in \left[ {\widehat{y}}_--1,{\widehat{y}}_-\right] \), \({\widetilde{y}}'''_- \in \left[ {\widetilde{y}}_--1+\lambda ,{\widetilde{y}}_-\right] \), \({\widehat{y}}'''_+ \in \left[ {\widehat{y}}_+,{\widehat{y}}_++1\right] \), \({\widetilde{y}}'''_+ \in \left[ {\widetilde{y}}_+,{\widetilde{y}}_++1+\lambda \right] \) will again be used in what follows to bound these quantities. In the slower case (with hats), applying the optional stopping theorem to the martingales \({\widehat{y}}_n\) and \({\widehat{y}}_n^2 - n{\mathbb {E}}(\chi ^2)\) then yields

$$\begin{aligned} \big ({\mathbb {E}}({\widehat{T}}_1)\big )^{-1} \;=\; \frac{{\mathbb {E}}(\chi ^2)}{({\widehat{y}}'''_+)^2}\left[ 1 \,+\, \frac{{\mathbb {E}}(\chi ^2)({\widehat{y}}'_+ - 3{\widehat{y}}'_-) + {\widehat{y}}'_+({\widehat{y}}'''_-)^2}{-{\widehat{y}}'_-({\widehat{y}}'''_+)^2}\right] ^{-1}, \end{aligned}$$

which together with the limits of Lemma 6 then shows the first statement of Proposition 3. In addition to the defining properties of \(\lambda \) in (31), it will be necessary to require

$$\begin{aligned} \lim _{\epsilon \rightarrow 0} \lambda \log (\epsilon ) \;=\; 0 \end{aligned}$$

to control the faster process (with tildes) in the balanced case, as this implies \(\lim _{\epsilon \rightarrow 0} \lambda {\widetilde{y}}_+ = 0\). One can check that the choice \(\lambda := [\log (\epsilon )]^{-2}\) meets all the given conditions. A first martingale is now given by \({\widetilde{y}}_n - n\lambda \), which leads to exactly the same result as (38) with \({\mathbb {E}}(\chi ) = 0\) in this case. Next, similar to the proofs of Lemmata 8 and 13, one can show that there exists a unique real solution \({\widetilde{\rho }}\) of the equation \({\mathbb {E}}(e^{C_0\rho (\chi + \lambda )}) = 1\) in the variable \(\rho \). This quantity must be negative, clearly depends on \(\lambda \) (so on \(\epsilon \)) and obeys \(\lim _{\lambda \rightarrow 0} {\widetilde{\rho }} = 0\). The implicit definition of \({\widetilde{\rho }}\) as a function of \(\lambda \) can be written as \(\lambda = \frac{\log ({\mathbb {E}}(e^{C_0{\widetilde{\rho }}\chi }))}{-C_0{\widetilde{\rho }}}\). By Fubini's theorem, this is an analytic function in \({\widetilde{\rho }}\) (as it is also well-defined for \({\widetilde{\rho }}\) positive, so \(\lambda \) negative), with \(\frac{-C_0{\mathbb {E}}(\chi ^2)}{2} \ne 0\) as its first derivative w.r.t. \({\widetilde{\rho }}\). This allows one to use the Lagrange inversion theorem for analytic functions, which shows

$$\begin{aligned} {\widetilde{\rho }} \;=\; -\frac{2}{C_0\,{\mathbb {E}}(\chi ^2)}\,\lambda \,-\, \frac{4\,{\mathbb {E}}(\chi ^3)}{3\,C_0\left( {\mathbb {E}}(\chi ^2)\right) ^3}\,\lambda ^2 \,+\, {\mathcal {O}}(\lambda ^3). \end{aligned}$$
(41)
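The expansion (41) can be verified numerically. The sketch below uses an illustrative balanced three-point \(\chi \) (values \(+0.8\) w.p. \(1/3\) and \(-0.4\) w.p. \(2/3\), so \({\mathbb {E}}(\chi )=0\)) and \(C_0=1\), both toy choices; the bisection root is then compared with the first two terms of the series.

```python
import math

# Illustrative check of (41) with C0 = 1 and a balanced chi: +0.8 w.p. 1/3,
# -0.4 w.p. 2/3, so E(chi) = 0, E(chi^2) = 0.32, E(chi^3) = 0.128.
C0, vals, probs = 1.0, (0.8, -0.4), (1 / 3, 2 / 3)
m2 = sum(q * v ** 2 for v, q in zip(vals, probs))
m3 = sum(q * v ** 3 for v, q in zip(vals, probs))

def rho_tilde(lam):
    # unique negative solution of E(exp(C0 rho (chi + lam))) = 1, by bisection
    def g(s):
        return sum(q * math.exp(C0 * s * (v + lam)) for v, q in zip(vals, probs)) - 1.0
    lo, hi = -1.0, -1e-6                  # g(lo) > 0 > g(hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if g(mid) > 0 else (lo, mid)
    return 0.5 * (lo + hi)

lam = 1e-3
series1 = -2 * lam / (C0 * m2)                                # first term of (41)
series2 = series1 - 4 * m3 * lam ** 2 / (3 * C0 * m2 ** 3)    # two terms of (41)
rho = rho_tilde(lam)
assert abs(rho - series2) < abs(rho - series1)   # second order improves on first
assert abs(rho - series2) < 1e-6                 # remainder is O(lam^3)
```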

Inserting this after applying the optional stopping theorem to the martingale \(e^{C_0{\widetilde{\rho }}{\widetilde{y}}_n}\) then yields

$$\begin{aligned} 1&\;=\; {\mathbb {E}}\big (e^{C_0{\widetilde{\rho }} \cdot 0}\big ) \;=\; {\mathbb {E}}\big (e^{C_0{\widetilde{\rho }}{\widetilde{y}}_0}\big ) \;=\; {\mathbb {E}}\big (e^{C_0{\widetilde{\rho }}{\widetilde{y}}_{{\widetilde{T}}_{-,+}}}\big )\\&\;=\; 1 \,-\, {\mathbb {E}}({\widetilde{y}}_{{\widetilde{T}}_{-,+}}) \;\left[ \frac{2\,\lambda }{{\mathbb {E}}(\chi ^2)} + \frac{4\,{\mathbb {E}}(\chi ^3)\,\lambda ^2}{3\big ({\mathbb {E}}(\chi ^2)\big )^2}\right] \,+\,\frac{1}{2}\;{\mathbb {E}}({\widetilde{y}}^2_{{\widetilde{T}}_{-,+}})\;\frac{4\,\lambda ^2}{\big ({\mathbb {E}}(\chi ^2)\big )^2}\\&\qquad + \;{\mathcal {O}}\left( \lambda ^3\right) \cdot \big (1 - {\mathbb {P}}\big (\{ {\widetilde{y}}_{{\widetilde{T}}_{-,+}} \ge {\widetilde{y}}_+ \}\big )\big ) \,+\, {\mathcal {O}}\big ((\lambda {\widetilde{y}}_+)^3\big ) \cdot {\mathbb {P}}\big (\{ {\widetilde{y}}_{{\widetilde{T}}_{-,+}} \ge {\widetilde{y}}_+ \}\big )\,, \end{aligned}$$

where the big \({\mathcal {O}}\)-notation makes sense as it was required earlier that \(\lambda {\widetilde{y}}_+ \rightarrow 0\) for \(\epsilon \rightarrow 0\). Hence,

$$\begin{aligned} \frac{1}{{\mathbb {P}}(\{ {\widetilde{y}}_{{\widetilde{T}}_{-,+}} \ge {\widetilde{y}}_+ \})} \;=\; \frac{{\widetilde{y}}'_+-{\widetilde{y}}'_-}{-{\widetilde{y}}'_-}\left[ 1 - \frac{\big (({\widetilde{y}}'''_-)^2{\widetilde{y}}'_+ - {\widetilde{y}}'_-({\widetilde{y}}'''_+)^2\big )\lambda }{-{\widetilde{y}}'_-({\widetilde{y}}'_+-{\widetilde{y}}'_-){\mathbb {E}}(\chi ^2)} \,+\,{\mathcal {O}}\left( [\lambda {\widetilde{y}}_+]^2\right) \right] , \end{aligned}$$

when carefully treating the error terms. Inserting this and (38) (with \({\mathbb {E}}(\chi ) = 0\)) into (37) yields

$$\begin{aligned} \left( {\mathbb {E}}({\widetilde{T}}_1)\right) ^{-1}&\;=\; \left[ \frac{1}{{\mathbb {P}}(\{{\widetilde{y}}_{{\widetilde{T}}_{-,+}} \ge {\widetilde{y}}_+\})}\left[ \frac{{\widetilde{y}}'_-}{\lambda } \,+\, 1\right] \, +\, \frac{{\widetilde{y}}'_+ - {\widetilde{y}}'_-}{\lambda }\, +\, 1\right] ^{-1}\\&\;=\; \frac{{\mathbb {E}}(\chi ^2)}{({\widetilde{y}}''_+)^2}\left[ 1\; +\; \frac{({\widetilde{y}}''_-)^2{\widetilde{y}}'_+\, +\, {\mathbb {E}}(\chi ^2)({\widetilde{y}}'_+ - 2\,{\widetilde{y}}'_-)}{-{\widetilde{y}}'_-({\widetilde{y}}''_+)^2} + {\mathcal {O}}(\lambda {\widetilde{y}}_+)\right] ^{-1}\,, \end{aligned}$$

which together with the limits stated in Lemma 11 implies the second statement of Proposition 3.
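The two balanced-case martingales can be sanity-checked on a symmetric \(\pm 1\) toy walk (illustrative, not the paper's process), for which there is no overshoot and optional stopping of \(y_n\) and \(y_n^2 - n{\mathbb {E}}(\chi ^2)\) predicts exit probability \(m/(m+k)\) at the upper barrier and expected exit time \(mk\).

```python
# Symmetric +-1 walk with absorbing barriers at -m and +k, E(chi^2) = 1.
# h(y): exit-at-top probability; t(y): expected exit time; both solve the
# fixed-point equations h(y) = (h(y+1)+h(y-1))/2, t(y) = 1 + (t(y+1)+t(y-1))/2.
m, k = 3, 5
h = [0.0] * (m + k + 1)      # index i corresponds to y = i - m
h[m + k] = 1.0
t = [0.0] * (m + k + 1)
for _ in range(5000):
    for i in range(1, m + k):
        h[i] = 0.5 * (h[i + 1] + h[i - 1])
        t[i] = 1.0 + 0.5 * (t[i + 1] + t[i - 1])

assert abs(h[m] - m / (m + k)) < 1e-9   # from the martingale y_n
assert abs(t[m] - m * k) < 1e-9         # from y_n^2 - n E(chi^2)
```

In the paper's setting the overshoot and the \(\lambda \)-regularization replace these exact identities by the conditional quantities with primes and triple primes above.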

Fig. 6

If E and \(|{\mathbb {E}}(\log (\kappa ))|\) simultaneously tend to 0, two types of scaling behavior for \(|{\mathcal {N}}(E)-{\mathcal {N}}(0)|\) are separated by the curves on which \(|{\mathbb {E}}(\log (\kappa ))|\) is proportional to \(|\log (E)|^{-1}\)

Remark

It is possible to analyze the scaling of the quantity \(|{\mathcal {N}}(E)-{\mathcal {N}}(0)|\) when both E and \(|{\mathbb {E}}(\log (\kappa ))|\) (or, equivalently, \(|{\mathbb {E}}(\chi )|\)) tend to zero. There are two different regimes, as depicted in Fig. 6. If \(\lim _{E \rightarrow 0} |{\mathbb {E}}(\chi )\log (E)| < \infty \), then \(|{\mathcal {N}}(E)-{\mathcal {N}}(0)|\) is proportional to \(|\log (E)|^{-2}\). If the given limit equals zero, then the analysis in this section applies with \(|{\mathbb {E}}(\chi )|\) taking the role of \(\lambda \) (and E that of \(\epsilon \)). For a non-vanishing limit, a lower (with hats instead of tildes and \(\lambda \) set to zero) and an upper bound on \(|{\mathcal {N}}(E)-{\mathcal {N}}(0)|\) are given by (39), in which \({\widetilde{\nu }}\) needs to be replaced by \({\widetilde{\rho }}\). In its expansion (41), one then needs to replace \(\lambda \) by \(|{\mathbb {E}}(\chi )|\) and \(|{\mathbb {E}}(\chi )|+\lambda \), respectively. If \({\mathbb {E}}(\chi )\log (E)\) converges to a nonzero constant (the case on the separating line in Fig. 6), then also the factor \(e^{C_0{\widetilde{\rho }}{\widetilde{y}}''_+}\) tends to a positive constant, and a further expansion in \(|{\mathbb {E}}(\chi )|\) shows that \(|{\mathcal {N}}(E)-{\mathcal {N}}(0)|\) is also in this case proportional to \(|\log (E)|^{-2}\). Finally, if \(|{\mathbb {E}}(\chi )\log (E)| \rightarrow \infty \) for \(E \rightarrow 0\), then also \(e^{C_0{\widetilde{\rho }}{\widetilde{y}}''_+} \rightarrow \infty \). The dominant term between brackets in (39) contains the latter factor, which then implies the scaling \(|{\mathcal {N}}(E)-{\mathcal {N}}(0)|\sim E^{\frac{2|{\mathbb {E}}(\log (\kappa ))|}{{\mathbb {E}}((\log (\kappa ))^2)}}|{\mathbb {E}}(\log (\kappa ))|^2\) as indicated for the grey region in Fig. 6. \(\diamond \)