1 Introduction

Random matrix products have applications in a wide range of disciplines, such as statistical and nuclear physics (Crisanti et al. 1993), population dynamics (Heyde and Cohen 1985) and quantum mechanics (Bougerol and Lacroix 1985). Their rigorous study began over sixty years ago, when Bellman (1954) studied the asymptotic behaviour of products of random matrices with strictly positive entries, corresponding to a weak law of large numbers. The seminal work of Furstenberg and Kesten strengthened this to a strong law for more general matrices. In particular, consider the random product of N matrices drawn i.i.d. from a set of m \(d \times d\) matrices \(\{A_1,A_2, \ldots , A_m\}\),

$$\begin{aligned} M_N = \prod _{k=1}^N A_{i_k}, \quad i_k \in \{1,2, \ldots , m\}. \end{aligned}$$
(1)

The asymptotic growth of \(M_N\) is typically quantified using a Lyapunov exponent.

Definition 1

The Lyapunov exponent \(\lambda \) is defined by

$$\begin{aligned} \lambda = \lim _{N\rightarrow \infty } \frac{1}{N}\, \mathbb {E}\log ||M_N|| \end{aligned}$$
(2)

where the limit exists, and where \(||\cdot ||\) is some (submultiplicative) matrix norm.

The Furstenberg–Kesten theorem (Furstenberg and Kesten (1960), Furstenberg (1963)) states that the limit (2) exists, and is positive under fairly weak assumptions on the \(A_i\), satisfied by the matrices we will be using. The Lyapunov exponent can be equivalently defined using a vector norm rather than a matrix norm, in a formulation arguably more familiar in a dynamical systems context.

Definition 2

The Lyapunov exponent \(\lambda \) can be written

$$\begin{aligned} \lambda = \lim _{N\rightarrow \infty } \frac{1}{N}\, \mathbb {E}\log ||X_N||, \qquad X_N = M_N X_0, \end{aligned}$$
(3)

whenever the limit exists.

The multiplicative ergodic theorem of Oseledets (1968) shows that the limit exists almost surely, and that there are at most d distinct Lyapunov exponents. While the above results apply to the general linear group \(GL(d, \mathbb {R})\) of \( d \times d\) invertible matrices, we are interested in the special linear group \(SL(2, \mathbb {R})\) of \(2\times 2\) matrices with unit determinant, and in particular in shear matrices, which form one-parameter subgroups of \(SL(2, \mathbb {R})\) and arise naturally in many problems of fluid mixing. This means that the limit (3) takes at most two distinct values, and since shear matrices have unit determinant, the two Lyapunov exponents sum to zero. In this paper we use the formulation given by Definition 2, and the choice of initial \(X_0\) will be clear.

The principle of mixing by chaotic advection can be briefly summarised as repeated stretching in transverse directions (Ottino 1989; Sturman et al. 2006). Such behaviour can be seen in the blinking vortex flow (Aref 1984), in which fluid is efficiently stirred by the alternating operation of a pair of off-centre rotating rods (or vortices), resulting in hyperbolic dynamics and exponential growth of fluid filaments. The development of chaotic advection as a practical concept for fluid mixing has led to a wide variety of theoretical, numerical and phenomenological methods for quantifying, predicting and controlling mixing in an equally wide variety of applications (Aref et al. 2017). For example, many industrial mixing devices are designed on this basis, with the most fundamental model being periodic application of transverse shear matrices (D’Alessandro et al. 1999; Stroock et al. 2002; Khakhar et al. 1987; Aref 1984). Although periodic composition is natural in many situations, there has also been interest in aperiodic protocols, and indeed an aperiodic sequence can produce improved mixing over periodic sequences of shears (Kang et al. 2008; Pacheco et al. 2008). Mixing devices in which the stationarity of the fluid is broken by temporal (rather than spatial) periodicity, such as blinking vortices, pulsed source–sink devices (Jones and Aref 1988; Stremler and Cola 2006) and electro-osmotic mixers (Qian and Bau 2002; Pacheco et al. 2008), can all be operated in an aperiodic, or random, manner.

Another application of random matrices is for taffy pullers, devices for making candy by stretching and folding that provide a paradigm for studying many aspects of chaotic mixing (Boyland et al. 2000; Thiffeault and Finn 2006; Finn and Thiffeault 2011; Thiffeault 2018). Figure 1 shows such a device for three rods, with the taffy being represented as a closed curve stretched on the rods. The key question is how fast the taffy grows asymptotically. This is determined by the spectral radius of a product of triangular matrices

$$\begin{aligned} A = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix},\qquad B = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}. \end{aligned}$$

For the case in which the taffy puller is operated in the simplest periodic manner, so that the matrices A and B are alternated, this spectral radius is easily computed to be \((3+\sqrt{5})/2\), the largest eigenvalue of the matrix AB, the logarithm of which is the Lyapunov exponent for AB. This measure of exponential growth of taffy (corresponding to fluid filaments in a mixing device) is equally easy to compute for other periodic sequences of A and B, but when the sequence is chosen at random, this question becomes a classic and famously difficult one in the theory of random products of matrices.
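To make the contrast concrete, the following short script (an illustrative sketch of ours, not code from any cited work; the norm, seed and iteration count are arbitrary choices) computes the periodic growth rate from the spectral radius of AB, and estimates the Lyapunov exponent of the random protocol by direct iteration of Definition 2 with renormalisation.

```python
import numpy as np

rng = np.random.default_rng(1)
A = np.array([[1.0, 0.0], [1.0, 1.0]])
B = np.array([[1.0, 1.0], [0.0, 1.0]])

# Periodic protocol ABAB...: growth per matrix is half the log of the
# spectral radius of AB, i.e. (1/2) log((3 + sqrt(5))/2) ~ 0.4812.
lam_periodic = 0.5 * np.log(max(abs(np.linalg.eigvals(A @ B))))

# Random protocol: iterate X_N = M_N X_0 (Definition 2), renormalising at
# each step to avoid overflow while accumulating log ||X_N||.
X = np.array([1.0, 1.0])
log_growth, N = 0.0, 10**6
for _ in range(N):
    X = (A if rng.random() < 0.5 else B) @ X
    n = np.linalg.norm(X)
    log_growth += np.log(n)
    X /= n

print(f"periodic: {lam_periodic:.5f}, random estimate: {log_growth / N:.5f}")
```

The random estimate settles near 0.39625, the value quoted in Sect. 3 below, noticeably different from the periodic value.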

Allowing the shear factors to vary, that is, considering the matrices

$$\begin{aligned} A = \begin{pmatrix} 1 & 0 \\ \alpha & 1 \end{pmatrix},\qquad B = \begin{pmatrix} 1 & \beta \\ 0 & 1 \end{pmatrix}, \end{aligned}$$

where \(\alpha \) and \(\beta \) are parameters, allows the study of asymmetric mixing devices. These matrices also constitute Dyson’s random chain model (1953) of the vibrations of a one-dimensional chain of point masses, where \(\alpha \) represents the spacing between successive point masses and \(\beta \) the masses (Comtet and Tourigny 2017).

Fig. 1 A random taffy puller for the sequence ABABBA

There is a paucity of exact results concerning Lyapunov exponents for random matrices, as famously lamented by Kingman (1973, p. 897). One well-known upper bound is easily derived from the submultiplicativity of \(||\cdot ||\). For two matrices chosen with equal probability, let

$$\begin{aligned} E_k = \frac{1}{k}\, \mathbb {E}\log ||C ||, \end{aligned}$$
(4)

with \(C\) drawn uniformly from \(\mathcal {A}^k\), the set of all \(2^k\) products of matrices of length k. The numbers \(E_k\) converge monotonically to \(\lambda \) from above as \(k \rightarrow \infty \) for any choice of matrix norm, although, as noted by Protasov and Jungers (2013), the Euclidean norm is the usual choice. In Key (1990) the bound is described as “easy, if not efficient”, since the number of matrix product calculations required increases exponentially with k.
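To illustrate the cost, the following sketch (an illustration of ours, not code from any cited work) evaluates (4) by brute force for the taffy-puller matrices A and B above, using the Euclidean norm; the number of products grows as \(2^k\).

```python
import itertools
from functools import reduce
import numpy as np

A = np.array([[1.0, 0.0], [1.0, 1.0]])
B = np.array([[1.0, 1.0], [0.0, 1.0]])

def E_k(k):
    """The bound (4): average of (1/k) log ||C|| over all 2^k products C
    of length k, using the Euclidean (spectral) matrix norm."""
    total = sum(np.log(np.linalg.norm(reduce(np.matmul, word), 2))
                for word in itertools.product((A, B), repeat=k))
    return total / (k * 2**k)

for k in (2, 6, 10, 12):
    print(k, f"{E_k(k):.5f}")  # decreases towards lambda = 0.39625... from above
```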

Further progress in this direction has tended to be either for specific simple cases, or via algorithmic procedures leading to (sometimes very accurate) approximations. For example, Key (1987) and Pincus (1985) discuss cases where the Lyapunov exponent can be computed exactly, in particular when matrices can be grouped in commuting blocks. Chassaing et al. (1984) establish the distribution for the matrix product, in terms of a continued fraction, in the case that the matrices are \(2 \times 2\) shear matrices, but observe that even for these simple matrices, the Lyapunov exponent is still unobtainable. A similar approach allowed Viswanath (2000) to give a formula for the exponent in the case of matrices which give rise to a random Fibonacci sequence. [This was extended by Janvresse et al. (2007).] An exact expression for \(\lambda \) as the sum of a convergent series in the case for which one matrix is singular was given by Lima and Rahibe (1994). Analytic expressions for \(\lambda \) have also been obtained for large, sparse matrices (Cook and Derrida 1990), and for classes of \(2 \times 2\) matrices in terms of explicitly expressed probability distributions (Mannion 1993; Marklof et al. 2008). Pollicott (2010) recently gave a cycle expansion formula that allows a very accurate computation for a class of matrices. Protasov and Jungers (2013) obtain an efficient algorithm for Lyapunov exponent bounds using invariant cones for matrices with non-negative entries, concentrating on generality and efficiency. (They test their algorithm on examples up to dimension 60.)

For the problems of passive scalar decay and random taffy pullers, knowledge of the Lyapunov exponent is insufficient (Antonsen et al. 1996; Haynes and Vanneste 2005; Thiffeault 2008). We require more refined information via the growth rate of the qth moment of the matrix product norm. This is often characterised using generalised Lyapunov exponents (Crisanti et al. 1988, 1993), which can be defined using a matrix norm by

$$\begin{aligned} \ell _{\text {mat}}(q) = \lim _{N\rightarrow \infty } \frac{1}{N}\,\log \mathbb {E}||M_N ||^q. \end{aligned}$$
(5)

Here, we prefer a formulation using a vector norm.

Definition 3

The generalised Lyapunov exponent is defined by

$$\begin{aligned} \ell (q) = \lim _{N\rightarrow \infty } \frac{1}{N}\,\log \mathbb {E}||X_N ||^q, \end{aligned}$$
(6)

where \(X_N\) is as defined in (3).

Here, we must observe that (5) and (6) are not equivalent, particularly for \(q<0\). There are even greater difficulties in computing generalised Lyapunov exponents for products of random matrices, and various numerical schemes have been proposed and employed (Vanneste 2010). The rigorous bounds we obtain are thus useful for benchmarking such numerical methods.

The paper is organised as follows: In Sect. 2 we state our results. Section 3 is devoted to testing the accuracy of our bounds, in particular investigating numerically the tightness of upper and lower bounds using different norms. Section 4 contains the construction of invariant cones and the bounds on the growth of vector norms required to prove the theorems. We discuss possible extensions in Sect. 5.

2 Rigorous Bounds for Lyapunov Exponents

We derive rigorous and explicit bounds for Lyapunov exponents and generalised Lyapunov exponents by reformulating the problem, grouping the matrices together. Assume without loss of generality that the first matrix in the product (1) is \(A_{i_1}=A\). By grouping A’s and B’s together into J blocks, the random product (1) can be written

$$\begin{aligned} M_{N_J} = \prod _{j=1}^J A^{a_j} B^{b_j}, \quad a_j+b_j = n_j, \quad \sum _{j=1}^J n_j = N_J, \end{aligned}$$
(7)

with \(1 \le a_j,b_j < n_j\), so \(n_j \ge 2\). Now it is the \(a_j\) and \(b_j\) that are the i.i.d. random variables, each with probability distribution \(P(x) = 2^{-x}\), \(x\ge 1\). Hence, the length of each block is governed by the joint distribution \(P(a,b) = P(a)\,P(b) = 2^{-(a+b)}\). We have the expected values \(\mathbb {E}a = \mathbb {E}b = 2\), so \(\mathbb {E}n = 4\).
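This reformulation is easy to check by simulation; the sketch below (ours) uses numpy's geometric sampler, which has support \(\{1,2,\ldots \}\) and matches \(P(x)=2^{-x}\) at parameter 1/2.

```python
import numpy as np

rng = np.random.default_rng(0)

# Run lengths of a fair-coin sequence are geometric: P(a) = 2^{-a}, E a = 2.
a = rng.geometric(0.5, size=10**6)
b = rng.geometric(0.5, size=10**6)
print(a.mean(), b.mean(), (a + b).mean())                 # ~2, ~2, ~4 = E n
print((a == 1).mean(), (a == 2).mean(), (a == 3).mean())  # ~1/2, 1/4, 1/8
```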

Let us now take the specific matrices

$$\begin{aligned} A = \begin{pmatrix} 1 & 0 \\ \alpha & 1 \end{pmatrix},\quad B = \begin{pmatrix} 1 & \beta \\ 0 & 1 \end{pmatrix}, \quad K_{ab} := A^a B^b = \begin{pmatrix} 1 & b\beta \\ a\alpha & 1+a\alpha b\beta \end{pmatrix}. \end{aligned}$$
(8)

We consider first the case \(\alpha , \beta >0\), for which \(K_{ab}\) has strictly positive entries and is hyperbolic (that is, has eigenvalues off the unit circle) for all \(a, b \ge 1\). Although our technique holds for all positive \(\alpha , \beta \), we state our results for \(\alpha , \beta \ge 1\). This is partly for ease of exposition, but also because in many applications \(\alpha \) and \(\beta \) would be assumed to be integers, so that the map induced by \(K_{ab}\) is continuous on the 2-torus. In particular, the algebraically simplest case \(\alpha = \beta = 1\) corresponds to the generators of the 3-braid group seen in many studies of topological mixing (Boyland et al. 2000; Thiffeault and Finn 2006; Finn and Thiffeault 2011). We then allow negative entries; in particular, we consider \(\alpha<0< \beta \) (the case \(\alpha>0>\beta \) is essentially similar, while \(\alpha<0, \beta <0\) is no different from the positive \(\alpha , \beta \) case). Now hyperbolicity is only guaranteed when the product \(|\alpha \beta |\) is sufficiently large, and we require this property to obtain our results. We gain different bounds by considering different vector norms, a valid approach since the limit in (3) is independent of the choice of norm. In particular, we will consider the \(L_1\), \(L_2\) and \(L_{\infty }\) norms. Which norm produces the most accurate bound depends on \(\alpha \) and \(\beta \), and is easily discerned by computation.

2.1 Lyapunov Exponents

Our theorems are stated in terms of infinite sums whose terms are products of an exponentially decreasing factor and the logarithm of an algebraically growing function, so all of the sums converge.

Theorem 1

The Lyapunov exponent \(\lambda (\alpha , \beta )\) for the product \(M_N\) for \(\alpha , \beta \ge 1\) satisfies

$$\begin{aligned} \max _{k \in \{ 1, 2, \infty \}} \mathcal {L}_k (\alpha , \beta ) \le 4\lambda (\alpha , \beta ) \le \min _{k \in \{ 1, 2, \infty \}} \mathcal {U}_k (\alpha , \beta ) \end{aligned}$$

where

$$\begin{aligned} \mathcal {L}_k(\alpha ,\beta ) &= \sum _{a,b=1}^\infty 2^{-a-b}\log \phi _k (a, b, \alpha , \beta )\\ \mathcal {U}_k(\alpha ,\beta ) &= \sum _{a,b=1}^\infty 2^{-a-b}\log \psi _k (a, b, \alpha , \beta ), \end{aligned}$$

and

$$\begin{aligned} \phi _1 (a, b, \alpha , \beta ) &= 1+\tfrac{\alpha }{1+\alpha } \left( a+ b\beta + a\alpha b\beta \right) \\ \phi _2 (a, b, \alpha , \beta ) &= \min \left\{ \left( (1+a\alpha b\beta )^2 + b^2\beta ^2\right) ^{1/2},\ \left( \tfrac{1}{1+\alpha ^2}\left( \alpha ^2(1+a+a\alpha b\beta )^2 + (1+\alpha b\beta )^2\right) \right) ^{1/2} \right\} \\ \phi _{\infty } (a, b, \alpha , \beta ) &= 1+a\alpha b\beta \\ \psi _1 (a, b, \alpha , \beta ) &= 1+b\beta +a\alpha b\beta \\ \psi _2 (a, b, \alpha , \beta ) &= \left( \tfrac{1}{2} \left( 2 + \mathcal {C}_{a\alpha b\beta } + \sqrt{\mathcal {C}_{a\alpha b\beta } (\mathcal {C}_{a\alpha b\beta } + 4)} \right) \right) ^{1/2} \\ \psi _{\infty } (a, b, \alpha , \beta ) &= 1+a+a\alpha b\beta \end{aligned}$$

where \(\mathcal {C}_{a\alpha b\beta } = (a\alpha +b\beta )^2 + (a\alpha b \beta )^2\).

Losing a little sharpness, the \(L_{\infty }\)-norm bounds provide a pair of simpler expressions, involving only the single numerical constant \(\kappa \), stated in:

Corollary 1

The Lyapunov exponent \(\lambda (\alpha , \beta )\) for the product \(M_N\) for \(\alpha , \beta \ge 1\) satisfies

$$\begin{aligned} \kappa + \log \alpha \beta \le 4\lambda \le \kappa + \log (\sqrt{\alpha \beta } + 1/\sqrt{\alpha \beta }) + \tfrac{1}{2} \log (1+\alpha \beta ), \end{aligned}$$

where

$$\begin{aligned} \kappa = \sum _{a,b=1}^{\infty } 2^{-a-b} \log ab = 1.0157\ldots . \end{aligned}$$
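Both \(\kappa \) and the bounds of Corollary 1 are cheap to evaluate numerically; the following sketch (ours) uses the identity \(\kappa = 2\sum _{a\ge 1} 2^{-a}\log a\), which follows from \(\log ab = \log a + \log b\) and the symmetry of the weights.

```python
import numpy as np

def kappa(p=60):
    a = np.arange(1, p + 1)
    # kappa = sum_{a,b} 2^{-a-b} log(ab) = 2 sum_a 2^{-a} log(a); the tail
    # beyond p = 60 is far below double precision.
    return 2 * np.sum(2.0**-a * np.log(a))

def corollary1_bounds(alpha, beta):
    k = kappa()
    s = np.sqrt(alpha * beta)
    lower = (k + np.log(alpha * beta)) / 4
    upper = (k + np.log(s + 1 / s) + 0.5 * np.log(1 + alpha * beta)) / 4
    return lower, upper

print(f"{kappa():.4f}")              # 1.0157
print(corollary1_bounds(1.0, 1.0))   # brackets the true value 0.39625...
```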

Theorem 1 is obtained by considering a cone in tangent space which is invariant for all a and b. We can improve these estimates by recognising that a smaller cone can be used for certain iterates of the map. In particular, we use the fact that, since \(a_j\) and \(b_j\) are independent geometrically distributed random variables, \(P(a=b) = P(a>b) = P(b>a) = 1/3\), to give

Theorem 2

The Lyapunov exponent \(\lambda (\alpha , \beta )\) for the product \(M_N\) for \(\alpha , \beta \ge 1\) satisfies

$$\begin{aligned} \max _{k \in \{ 1, 2, \infty \}} \hat{\mathcal {L}}_k (\alpha , \beta ) \le 4\lambda (\alpha , \beta ) \le \min _{k \in \{ 1, 2, \infty \}} \hat{\mathcal {U}}_k (\alpha , \beta ) \end{aligned}$$

where

$$\begin{aligned} \hat{\mathcal {L}}_k(\alpha ,\beta ) &= \sum _{a,b=1}^\infty 2^{-a-b}\log \left( \frac{1}{3} \sum _{m=1}^3 \hat{\phi }_k^{(m)} (a, b, \alpha , \beta )\right) \\ \hat{\mathcal {U}}_k(\alpha ,\beta ) &= \sum _{a,b=1}^\infty 2^{-a-b}\log \left( \frac{1}{3} \sum _{m=1}^3 \hat{\psi }_k^{(m)} (a, b, \alpha , \beta ) \right) , \end{aligned}$$

and

$$\begin{aligned} \hat{\phi }_1^{(1)} (a, b, \alpha , \beta ) &= \phi _1(a, b, \alpha , \beta ) \\ \hat{\phi }_1^{(2)} (a, b, \alpha , \beta ) &= \frac{\alpha \left( \alpha \beta + 2\right) \left( a \alpha b \beta + b \beta + 1\right) + \left( a \alpha + 1\right) \left( \alpha \beta + 1\right) }{\alpha \left( \alpha \beta + \beta + 2\right) + 1} \\ \hat{\phi }_1^{(3)}(a, b, \alpha , \beta ) &= \frac{\alpha \left( 2 \alpha \beta + 3\right) \left( a \alpha b \beta + b \beta + 1\right) + \left( a \alpha + 1\right) \left( \alpha \beta + 1\right) }{\alpha \left( 2 \alpha \beta + \beta + 3\right) + 1} \\ \hat{\phi }_2^{(1)}(a, b, \alpha , \beta ) &= \phi _2(a, b, \alpha , \beta ) \\ \hat{\phi }_2^{(2)}(a, b, \alpha , \beta ) &= \min \left\{ \left( (1+a\alpha b\beta )^2 + b^2\beta ^2\right) ^{1/2},\ \left( \frac{ (1+\alpha \beta + \alpha b\beta (2+\alpha \beta ))^2 + (a\alpha (1+\alpha \beta ) + \alpha (2+\alpha \beta )(1+a\alpha b\beta ))^2 }{(1+\alpha \beta )^2+\alpha ^2(2+\alpha \beta )^2} \right) ^{1/2} \right\} \\ \hat{\phi }_2^{(3)}(a, b, \alpha , \beta ) &= \min \left\{ \left( (1+a\alpha b\beta )^2 + b^2\beta ^2\right) ^{1/2},\ \left( \frac{ (1+\alpha \beta + \alpha b\beta (3+2\alpha \beta ))^2 + (a\alpha (1+\alpha \beta ) + \alpha (3+2\alpha \beta )(1+a\alpha b\beta ))^2 }{(1+\alpha \beta )^2+\alpha ^2(3+2\alpha \beta )^2} \right) ^{1/2} \right\} \\ \hat{\phi }_{\infty }^{(m)} (a, b, \alpha , \beta ) &= \phi _{\infty }(a, b, \alpha , \beta ) \quad \text{ for } m = 1, 2, 3 \end{aligned}$$

and

$$\begin{aligned} \hat{\psi }_{1}^{(m)} (a, b, \alpha , \beta ) &= \psi _{1}(a, b, \alpha , \beta ) \quad \text{ for } m = 1, 2, 3 \\ \hat{\psi }_{2}^{(m)} (a, b, \alpha , \beta ) &= \psi _{2}(a, b, \alpha , \beta )\quad \text{ for } m = 1, 2, 3\\ \hat{\psi }_{\infty }^{(1)}(a, b, \alpha , \beta ) &= \psi _{\infty }(a, b, \alpha , \beta ) \\ \hat{\psi }_{\infty }^{(2)}(a, b, \alpha , \beta ) &= 1+a\alpha b\beta + \frac{a(1+\alpha \beta )}{2+\alpha \beta } \\ \hat{\psi }_{\infty }^{(3)}(a, b, \alpha , \beta ) &= 1+a\alpha b\beta + \frac{a(1+\alpha \beta )}{3+2\alpha \beta }\,. \end{aligned}$$

2.2 Generalised Lyapunov Exponents

We can use the functions defined above to bound the generalised Lyapunov exponents for each q:

Theorem 3

We have, for \(\alpha , \beta \ge 1\),

$$\begin{aligned} 4\ell (q, \alpha , \beta ) &\ge {\left\{ \begin{array}{ll} \max _{k \in \{ 1, 2, \infty \} } \left\{ \log \sum _{a, b=1}^{\infty } 2^{-a-b} (\phi _k (a,b,\alpha , \beta ))^q\right\} &\quad q\ge 0 \\ \max _{k \in \{ 1, 2, \infty \} } \left\{ \log \sum _{a, b=1}^{\infty } 2^{-a-b} (\psi _k (a,b,\alpha , \beta ))^q\right\} &\quad q<0 \end{array}\right. } \\ 4\ell (q, \alpha , \beta ) &\le {\left\{ \begin{array}{ll} \min _{k \in \{ 1, 2, \infty \} } \left\{ \log \sum _{a, b=1}^{\infty } 2^{-a-b} (\psi _k (a,b,\alpha , \beta ))^q\right\} &\quad q\ge 0 \\ \min _{k \in \{ 1, 2, \infty \} } \left\{ \log \sum _{a, b=1}^{\infty } 2^{-a-b} (\phi _k (a,b,\alpha , \beta ))^q\right\} &\quad q<0 \end{array}\right. } \end{aligned}$$

and the more accurate expressions

$$\begin{aligned} 4\ell (q, \alpha , \beta ) &\ge {\left\{ \begin{array}{ll} \max _{k \in \{ 1, 2, \infty \} } \left\{ \log \frac{1}{3} \sum _{a, b=1}^{\infty } 2^{-a-b} \sum _{m=1}^3 (\hat{\phi }_k^{(m)} (a,b,\alpha , \beta ))^q\right\} &\quad q\ge 0 \\ \max _{k \in \{ 1, 2, \infty \} } \left\{ \log \frac{1}{3}\sum _{a, b=1}^{\infty } 2^{-a-b} \sum _{m=1}^3 (\hat{\psi }_k^{(m)} (a,b,\alpha , \beta ))^q\right\} &\quad q<0 \end{array}\right. } \\ 4\ell (q, \alpha , \beta ) &\le {\left\{ \begin{array}{ll} \min _{k \in \{ 1, 2, \infty \} } \left\{ \log \frac{1}{3} \sum _{a, b=1}^{\infty } 2^{-a-b} \sum _{m=1}^3 (\hat{\psi }_k^{(m)} (a,b,\alpha , \beta ))^q\right\} &\quad q\ge 0 \\ \min _{k \in \{ 1, 2, \infty \} } \left\{ \log \frac{1}{3}\sum _{a, b=1}^{\infty } 2^{-a-b} \sum _{m=1}^3 (\hat{\phi }_k^{(m)} (a,b,\alpha , \beta ))^q\right\} &\quad q<0 \end{array}\right. } \end{aligned}$$

with \(\phi _k, \psi _k, \hat{\phi }_k^{(m)}\) and \(\hat{\psi }_k^{(m)}\) defined as above.

An immediate observation is that since all the functions \(\phi , \psi , \hat{\phi }_k^{(m)}\) and \(\hat{\psi }_k^{(m)}\) are greater than 1 for all \(a,b \ge 1\), \(\alpha , \beta > 0\), and since \(\sum _{a,b=1}^{\infty } 2^{-a-b} = 1\), the bounds for \(\ell (q,\alpha ,\beta )\) increase from zero for positive q and decrease from zero for negative q. This apparently contradicts Proposition 2 of Vanneste (2010), which states that there is always a minimum in the curve of \(\ell (q)\), and in particular that \(\ell (-2)=0\) if the \(2\times 2\) matrices in question have determinant 1. The existence of the invariant cone for these shear matrices guarantees that a vector is expanded at every application of A or B, which forces \(\ell (q)\) to be monotonic. In Vanneste (2010), the assumption is made that the linear operator corresponding to the generalised Lyapunov exponent has the same spectrum as its adjoint, a property precluded by the invariant cone. The fact that \(\phi , \psi , \hat{\phi }_k^{(m)}\) and \(\hat{\psi }_k^{(m)} \ge 1\) is also the reason why \(\phi \) and \(\psi \) exchange roles in the upper and lower bounds for positive and negative q.

When q is a positive integer, we can evaluate the bounds in Theorem 3 in many cases simply by expanding the power q. Since

$$\begin{aligned} \sum _{a, b=1}^\infty 2^{-a-b} a^nb^m = \left( \sum _{a=1}^{\infty } 2^{-a} a^n\right) \left( \sum _{b=1}^{\infty } 2^{-b} b^m\right) \end{aligned}$$

such an expansion requires values of the polylogarithm \({{\mathrm{Li}}}_{-n}(\tfrac{1}{2})\), defined by

$$\begin{aligned} {{\mathrm{Li}}}_s(z) = \sum _{a=1}^\infty \frac{z^a}{a^s}\,. \end{aligned}$$

For integer \(n = -s\) we have special values

$$\begin{aligned} \sum _{a=1}^{\infty } 2^{-a} a^n = 1,2,6,26,150,1082,9366, \ldots \text{ for } n = 0,1,2,3,4,5,6,\ldots \end{aligned}$$

and so the \(L_{\infty }\) norm, for example, gives

Corollary 2

Generalised Lyapunov exponents in the case \(\alpha = \beta = 1\) are bounded by:

$$\begin{aligned} \tfrac{1}{4}\log 5 &\le \ell (1,1,1) \le \tfrac{1}{4}\log 7 \\ \tfrac{1}{4}\log 45 &\le \ell (2,1,1) \le \tfrac{1}{4}\log 79 \\ \tfrac{1}{4}\log 797 &\le \ell (3,1,1) \le \tfrac{1}{4}\log 1543 \\ \tfrac{1}{4}\log 25437 &\le \ell (4,1,1) \le \tfrac{1}{4}\log 50531 \\ \tfrac{1}{4}\log 1290365 &\le \ell (5,1,1) \le \tfrac{1}{4}\log 2578567 . \end{aligned}$$
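These integers can be checked directly by truncating the double sums; the sketch below (ours) evaluates \(\sum _{a,b} 2^{-a-b}\,\phi _{\infty }^q\) and \(\sum _{a,b} 2^{-a-b}\,\psi _{\infty }^q\) for \(\alpha =\beta =1\).

```python
import numpy as np

p = 100
a = np.arange(1, p + 1).astype(float)
W = np.outer(2.0**-a, 2.0**-a)              # weights 2^{-a-b}
aa, bb = np.meshgrid(a, a, indexing="ij")

for q in range(1, 6):
    lower = (W * (1 + aa * bb)**q).sum()        # phi_inf = 1 + ab
    upper = (W * (1 + aa + aa * bb)**q).sum()   # psi_inf = 1 + a + ab
    print(q, round(lower), round(upper))        # 5 7, 45 79, 797 1543, ...
```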

Theorem 3 also allows explicit estimates on topological entropy for the random matrix product, given by the generalised Lyapunov exponent with \(q=1\).

Corollary 3

The topological entropy \(\ell (1,\alpha ,\beta )\) in the case \(\alpha , \beta \ge 1\) is bounded by

$$\begin{aligned} \log (1+4\alpha \beta ) \le 4\ell (1,\alpha ,\beta ) \le \log (3+4\alpha \beta ). \end{aligned}$$

2.3 Matrices with Negative Entries

The case where the direction of one of the shears is reversed (that is, allowing negative entries in the matrix) can be tackled in an almost identical manner, with one important condition. Taking \(\alpha<0<\beta \) (the case \(\alpha>0>\beta \) is essentially identical), the matrix \(K_{11} = AB\) is hyperbolic only when the product \(|\alpha \beta | >4\). In the following, for simplicity, we will assume \(\alpha < -2\), \(\beta >2\) to achieve this.

Theorem 4

The Lyapunov exponent \(\lambda (\alpha , \beta )\) for the product \(M_N\) in the case \(\alpha <-2, \beta >2\) satisfies

$$\begin{aligned} \max _{k \in \{ 1, 2, \infty \}} \tilde{\mathcal {L}}_k (\alpha , \beta ) \le 4\lambda (\alpha , \beta ) \le \min _{k \in \{ 1, 2, \infty \}} \tilde{\mathcal {U}}_k (\alpha , \beta ) \end{aligned}$$

where

$$\begin{aligned} \tilde{\mathcal {L}}_k(\alpha ,\beta ) &= \sum _{a,b=1}^\infty 2^{-a-b}\log \tilde{\phi }_k (a, b, \alpha , \beta )\\ \tilde{\mathcal {U}}_k(\alpha ,\beta ) &= \sum _{a,b=1}^\infty 2^{-a-b}\log \tilde{\psi }_k (a, b, \alpha , \beta ), \end{aligned}$$

and

$$\begin{aligned} \tilde{\phi }_1 (a, b, \alpha , \beta ) &= \tfrac{1}{1-\Gamma } \left( b\beta + \Gamma - a\alpha b\beta - 1- a\alpha \Gamma \right) \\ \tilde{\phi }_2 (a, b, \alpha , \beta ) &= \left( \tfrac{1}{1+\Gamma ^2}\left( (\Gamma +b\beta )^2 + (1+a\alpha \Gamma + a\alpha b\beta )^2 \right) \right) ^{1/2} \\ \tilde{\phi }_{\infty } (a, b, \alpha , \beta ) &= -a\alpha b\beta - a\alpha \Gamma -1 \\ \tilde{\psi }_1 (a, b, \alpha , \beta ) &= b\beta -a\alpha b\beta -1 \\ \tilde{\psi }_2 (a, b, \alpha , \beta ) &= ((-a\alpha b\beta -1)^2 + b^2\beta ^2)^{1/2} \\ \tilde{\psi }_{\infty } (a, b, \alpha , \beta ) &= -a\alpha b\beta -1 \end{aligned}$$

where

$$\begin{aligned} \Gamma = - \frac{\beta }{2} + \sqrt{\left( \frac{\beta }{2} \right) ^2 + \frac{\beta }{\alpha }}. \end{aligned}$$

Again we can straightforwardly improve on the lower bounds by considering separately the cases when either, or both, of a and b are equal to 1.

Theorem 5

The Lyapunov exponent \(\lambda (\alpha , \beta )\) for the product \(M_N\) in the case \(\alpha <-2, \beta >2\) satisfies

$$\begin{aligned} \max _{k \in \{ 1, 2, \infty \}} \hat{\tilde{\mathcal {L}}}_k (\alpha , \beta ) \le 4\lambda (\alpha , \beta ) \le \hat{\tilde{\mathcal {U}}}_{\infty } (\alpha , \beta ) \end{aligned}$$

where

$$\begin{aligned} \hat{\tilde{\mathcal {L}}}_k(\alpha ,\beta ) &= \sum _{a,b=1}^\infty 2^{-a-b}\log \frac{1}{4} \sum _{m_a, m_b=1}^{2} \hat{\tilde{\phi }}_k^{(m_a,m_b)} (a, b, \alpha , \beta )\\ \hat{\tilde{\mathcal {U}}}_{\infty } (\alpha ,\beta ) &= \sum _{a,b=1}^\infty 2^{-a-b}\log \frac{1}{4} \sum _{m=1}^{4} \hat{\tilde{\psi }}_{\infty }^{(m)} (a, b, \alpha , \beta ) \end{aligned}$$

and

$$\begin{aligned} \hat{\tilde{\phi }}_1^{(m_a,m_b)} (a, b, \alpha , \beta ) &= \tfrac{1}{1-\Gamma _{m_a,m_b}} \left( b\beta + \Gamma _{m_a,m_b} - a\alpha b\beta - 1- a\alpha \Gamma _{m_a,m_b}\right) \\ \hat{\tilde{\phi }}_2^{(m_a,m_b)} (a, b, \alpha , \beta ) &= \left( \tfrac{1}{1+\Gamma _{m_a,m_b}^2}\left( (\Gamma _{m_a,m_b}+b\beta )^2 + (1+a\alpha \Gamma _{m_a,m_b} + a\alpha b\beta )^2 \right) \right) ^{1/2} \\ \hat{\tilde{\phi }}_{\infty }^{(m_a,m_b)} (a, b, \alpha , \beta ) &= -a\alpha b\beta - a\alpha \Gamma _{m_a,m_b} -1\\ \hat{\tilde{\psi }}_{\infty }^{(1)} (a,b,\alpha ,\beta ) &= -a\alpha b\beta -1 -a\alpha \beta /(1+\alpha \beta ) \\ \hat{\tilde{\psi }}_{\infty }^{(2)} (a,b,\alpha ,\beta ) &= -a\alpha b\beta -1 -a \\ \hat{\tilde{\psi }}_{\infty }^{(3)} (a,b,\alpha ,\beta ) &= \hat{\tilde{\psi }}_{\infty }^{(4)} (a,b,\alpha ,\beta ) = \tilde{\psi }_{\infty } (a, b, \alpha , \beta ) \end{aligned}$$

with

$$\begin{aligned} \Gamma _{m_a,m_b} = \frac{\Gamma +m_b\beta }{m_a\alpha \Gamma + m_a m_b \alpha \beta +1} \end{aligned}$$
(9)

for \(m_a, m_b = 1,2\). Note that \(\Gamma _{1,1} = \Gamma \).

We could also write improved upper estimates using \(L_1\) and \(L_2\) norms, but since these produce worse bounds than the \(L_{\infty }\) norm in all cases we study here, we do not give these explicitly.
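For a concrete instance, the following sketch (ours; \(L_{\infty }\) bounds only, sums truncated at \(a,b\le 200\)) evaluates \(\Gamma \) and the Theorem 4 bounds for \(\alpha =-3\), \(\beta =3\).

```python
import numpy as np

alpha, beta = -3.0, 3.0
Gamma = -beta / 2 + np.sqrt((beta / 2)**2 + beta / alpha)   # lies in (-1, 0)

p = 200
a = np.arange(1, p + 1).astype(float)
W = np.outer(2.0**-a, 2.0**-a)
aa, bb = np.meshgrid(a, a, indexing="ij")

phi_inf = -aa * alpha * bb * beta - aa * alpha * Gamma - 1  # tilde-phi_inf
psi_inf = -aa * alpha * bb * beta - 1                       # tilde-psi_inf
lower = (W * np.log(phi_inf)).sum() / 4
upper = (W * np.log(psi_inf)).sum() / 4
print(f"Gamma = {Gamma:.4f}, {lower:.4f} <= lambda <= {upper:.4f}")
```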

As before, we can use the same estimates to give explicit bounds for generalised Lyapunov exponents.

Theorem 6

We have, for \(\alpha <-2\) and \(\beta >2\),

$$\begin{aligned} 4\ell (q, \alpha , \beta ) &\le {\left\{ \begin{array}{ll} \min _{k \in \{ 1, 2, \infty \} } \left\{ \log \sum _{a, b=1}^{\infty } 2^{-a-b} (\tilde{\psi }_k (a,b,\alpha , \beta ))^q\right\} &\quad q\ge 0 \\ \min _{k \in \{ 1, 2, \infty \} } \left\{ \log \sum _{a, b=1}^{\infty } 2^{-a-b} (\tilde{\phi }_k (a,b,\alpha , \beta ))^q\right\} &\quad q<0 \end{array}\right. } \\ 4\ell (q, \alpha , \beta ) &\ge {\left\{ \begin{array}{ll} \max _{k \in \{ 1, 2, \infty \} } \left\{ \log \sum _{a, b=1}^{\infty } 2^{-a-b} (\tilde{\phi }_k (a,b,\alpha , \beta ))^q\right\} &\quad q\ge 0 \\ \max _{k \in \{ 1, 2, \infty \} } \left\{ \log \sum _{a, b=1}^{\infty } 2^{-a-b} (\tilde{\psi }_k (a,b,\alpha , \beta ))^q\right\} &\quad q<0 \end{array}\right. } \end{aligned}$$

and the more accurate, but more complicated expressions

$$\begin{aligned} 4\ell (q, \alpha , \beta ) &\le {\left\{ \begin{array}{ll} \min _{k \in \{ 1, 2, \infty \} } \left\{ \log \frac{1}{4} \sum _{a, b=1}^{\infty } 2^{-a-b} \sum _{m=1}^4 (\hat{\tilde{\psi }}_k^{(m)} (a,b,\alpha , \beta ))^q\right\} &\quad q\ge 0 \\ \min _{k \in \{ 1, 2, \infty \} } \left\{ \log \frac{1}{4}\sum _{a, b=1}^{\infty } 2^{-a-b} \sum _{m_a=1}^2 \sum _{m_b=1}^2 (\hat{\tilde{\phi }}_k^{(m_a,m_b)} (a,b,\alpha , \beta ))^q\right\} &\quad q<0 \end{array}\right. } \\ 4\ell (q, \alpha , \beta ) &\ge {\left\{ \begin{array}{ll} \max _{k \in \{ 1, 2, \infty \} } \left\{ \log \frac{1}{4} \sum _{a, b=1}^{\infty } 2^{-a-b} \sum _{m_a=1}^2 \sum _{m_b=1}^2 (\hat{\tilde{\phi }}_k^{(m_a,m_b)} (a,b,\alpha , \beta ))^q\right\} &\quad q\ge 0 \\ \max _{k \in \{ 1, 2, \infty \} } \left\{ \log \frac{1}{4}\sum _{a, b=1}^{\infty } 2^{-a-b} \sum _{m=1}^4 (\hat{\tilde{\psi }}_k^{(m)} (a,b,\alpha , \beta ))^q\right\} &\quad q<0 \end{array}\right. } \end{aligned}$$

with \(\tilde{\phi }_k, \tilde{\psi }_k, \hat{\tilde{\phi }}_k^{(m_a,m_b)}\) and \(\hat{\tilde{\psi }}_k^{(m)}\) defined as above.

3 Accuracy of the Bounds

Our expressions are given as infinite sums, and a natural question is whether these represent a meaningful improvement over the infinite product in the definition of the Lyapunov exponent. The answer is yes: our expressions converge far faster than computing (3) directly. In practice, to evaluate our bounds one will truncate the infinite sum. Since all summands are positive, any truncation of a lower bound is also a lower bound, and few terms are necessary to approximate \(\mathcal {L}_k(\alpha , \beta )\) well. For example, consider the \(L_{\infty }\) norm. Truncating the sum after p terms in each index, the error between \(\mathcal {L}_{\infty }(1,1)\) and this approximation satisfies (for \(p \ge 2\))

$$\begin{aligned} \sum _{a, b = p+1}^{\infty } 2^{-a-b} \log (1+a b ) < \sum _{a, b = p+1}^{\infty } 2^{-a-b} a b = 4^{-p} (p+2)^2. \end{aligned}$$

This error decreases rapidly with p. For example, taking just \(p=20\) terms in the truncation gives an error of around \(10^{-9}\). Truncations of \(\mathcal {U}_k (\alpha , \beta )\) are of course not rigorous upper bounds on the Lyapunov exponent, but again, taking around twenty terms in the sum gives an evaluation of the rigorous upper bound to around 8 or 9 decimal places. By contrast, computing the actual value of \(\lambda (\alpha , \beta )\) using a standard algorithm (employing Gram–Schmidt orthonormalisation at each step) (Parker and Chua 2012) takes typically \(10^6\) iterates to compute a value to within 3 decimal places. As discussed above, generalised Lyapunov exponents are far more difficult to compute with any accuracy.
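This convergence is easy to observe directly; a minimal sketch (ours) of the truncated \(L_{\infty }\) lower bound:

```python
import numpy as np

def truncated_lower(alpha, beta, p):
    """Partial sum of the L-infinity lower bound of Theorem 1, divided by 4."""
    a = np.arange(1, p + 1).astype(float)
    W = np.outer(2.0**-a, 2.0**-a)
    aa, bb = np.meshgrid(a, a, indexing="ij")
    return (W * np.log1p(aa * alpha * bb * beta)).sum() / 4

for p in (5, 10, 20, 40):
    print(p, f"{truncated_lower(1.0, 1.0, p):.12f}")  # ~9 digits settled by p = 20
```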

3.1 Lyapunov Exponents

For \(\alpha =\beta =1\) we have bounds on the Lyapunov exponent given in Table 1. The lowest upper bound (\(\mathcal {U}_2\)) and largest lower bound (\(\hat{\mathcal {L}}_1\)) differ by about \(2.5\%\). The true value [from explicit calculation via a standard algorithm (Parker and Chua 2012)] is 0.39625\(\ldots \), so the upper bound here is rather tighter than the lower.

Table 1 Bounds to five significant figures for the maximal Lyapunov exponent for the matrix product (1) in the case \(\alpha = \beta = 1\), where the true value (from numerical computation) is 0.39625\(\ldots \)

Figure 2 shows the accuracy of each bound from Theorem 1 increasing as \(\alpha \) increases, with \(\alpha = \beta \in [1,10]\). In Fig. 2a the bounds are all so close to the true value of \(\lambda \) that the details of the graph are difficult to resolve. It is clear, however, that the standard bound given by (4) (plotted in cyan) is a worse upper bound than all others in the figure, despite being calculated from all \(2^{12}\) matrix products of length \(k=12\), and decreases in accuracy for this fixed k as \(\alpha \) increases.

Figure 2b shows the difference between the bounds and the true (numerically calculated) Lyapunov exponent. In this and other figures we colour bounds originating from \(L_1\)-, \(L_2\)- and \(L_{\infty }\)-norms green, red and blue, respectively. It shows that for increasing \(\alpha \), upper bounds (solid lines) appear tighter than lower bounds (dashed lines). In black are shown the upper and lower bounds given in Corollary 1, which are typically worse than those of Theorem 1, but have the advantage of being explicit, finite expressions rather than infinite sums.

Figure 2c shows the envelope formed from the difference between upper and lower bounds derived from each norm, which does not require the explicit numerical calculation of the Lyapunov exponent to compute. To this figure we add, in Fig. 2d, the corresponding bounds from Theorem 2 which improve on Theorem 1 by considering the expected relation between the random variables \(a_i\) and \(b_i\). In black is the envelope formed from taking the minimum upper bound, and maximum lower bound for each value of \(\alpha \). This improves on all bounds produced from a single norm.

Fig. 2 Four different views of the accuracy of the upper and lower bounds for the Lyapunov exponent for the random matrix product (1) for \(\alpha = \beta \in [1,10]\). In each case, bounds obtained from the \(L_1\) (green), \(L_2\) (red) and \(L_{\infty }\) (blue) norms are shown, with bounds from the global cone shown dashed, and from the improved cone shown solid. When shown, the standard bound is cyan. a Numerical estimate of the Lyapunov exponent obtained by random multiplication of the matrices (8), with bounds as given in Theorem 1. Only the standard bound is appreciably far from the true value. b Errors in bounds from the numerically calculated value. c Difference between upper and lower bounds. d Envelope of bounds when the improved cone is included

The increase in accuracy of the bounds with increasing strength of shear can be understood geometrically, as the size of the cone narrows with increasing shear, and dynamically, as the approach to the unstable eigenvalue is more rapid for matrices whose largest eigenvalue is greater. In Figs. 2 and 4 it is clear that the upper and lower bounds approach the same curve for large \(\alpha = \beta \). A simple calculation (using the \(L_{\infty }\)-norm) gives

$$\begin{aligned} 4\lambda &\ge \sum _{a,b=1}^{\infty } 2^{-a-b} \log (1+a\alpha b \beta ) \\ &\ge \sum _{a,b=1}^{\infty } 2^{-a-b} \log (a\alpha b \beta ) \\ &= \sum _{a,b=1}^{\infty } 2^{-a-b} \log (ab) + \sum _{a,b=1}^{\infty } 2^{-a-b} \log \alpha \beta \\ &= \kappa + \log \alpha \beta . \end{aligned}$$

Meanwhile, for large \(\alpha , \beta \) we also have

$$\begin{aligned} 4\lambda &\le \sum _{a,b=1}^{\infty } 2^{-a-b} \log (1+a+a\alpha b \beta ) \\ &\sim \sum _{a,b=1}^{\infty } 2^{-a-b} \log (a\alpha b \beta ) \\ &= \kappa + \log \alpha \beta , \end{aligned}$$

and this indeed appears to be the asymptotic limit in the graphs shown for \(\alpha = \beta \rightarrow \infty \).

3.2 Generalised Lyapunov Exponents

In Fig. 3 we show bounds for generalised Lyapunov exponents for \(\alpha = \beta = 1\). The corresponding figures become increasingly accurate with increasing \(\alpha , \beta \). Figure 3a confirms that for this choice of matrices we do not have \(\ell (-2) = 0\), and that there is no minimum in the curve of \(\ell (q)\). Green, red and blue lines again show bounds originating from the \(L_1\)-, \(L_2\)- and \(L_{\infty }\)-norms, respectively, with the explicit expressions from Corollary 2 shown as black circles. Figure 3b shows the envelope of the difference between upper and lower bounds for each norm and, in black, the envelope of best combined bounds.
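As a crude benchmark of the kind these bounds are intended to support (ours, and not the scheme of Vanneste (2010); the sample-mean estimator of \(\mathbb {E}||X_N||\) has heavy tails, so N and the number of trials are kept modest), one can estimate \(\ell (1,1,1)\) by Monte Carlo and check that it lands inside \([\tfrac{1}{4}\log 5, \tfrac{1}{4}\log 7] \approx [0.402, 0.486]\) from Corollary 2.

```python
import numpy as np

rng = np.random.default_rng(3)
A = np.array([[1.0, 0.0], [1.0, 1.0]])
B = np.array([[1.0, 1.0], [0.0, 1.0]])

N, trials = 40, 20000
logs = np.empty(trials)
for t in range(trials):
    X = np.array([1.0, 1.0])
    s = 0.0
    for _ in range(N):
        X = (A if rng.random() < 0.5 else B) @ X
        n = np.linalg.norm(X, np.inf)
        s += np.log(n)
        X /= n
    logs[t] = s                 # log ||X_N|| for this realisation

m = logs.max()                  # log-sum-exp for a stable sample mean
ell_1 = (m + np.log(np.mean(np.exp(logs - m)))) / N
print(f"{ell_1:.3f}")           # should fall within [0.402, 0.486]
```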

Fig. 3 Bounds for the generalised Lyapunov exponent for the matrix product (8) with \(\alpha = \beta =1\). As before, estimates originating from the \(L_1\)-, \(L_2\)- and \(L_{\infty }\)-norms are given in green, red and blue, respectively. For integer values of \(q>0\), values from Corollary 2 are given as black circles. a The bounds confirm that the curve of generalised Lyapunov exponents has no minimum at \(q=-2\). b Difference between upper and lower bounds. Dashed lines represent bounds from Theorem 1, while solid lines are those from Theorem 2. The black line represents the minimal envelope over all norms, and thus our best bounds

3.3 Negative Shears

Figure 4 shows the bounds for the case \(\alpha <-2\), \(\beta >2\). In these figures we set \(\alpha = -\beta \), and observe that again, the increasing hyperbolicity from increasing \(|\alpha |\) results in improving bounds. In this case the \(L_{\infty }\)-norm always gives the minimal envelope, as seen in Fig. 4b. Generalised Lyapunov exponents for \(\alpha = -3\), \(\beta =3\) are shown in Fig. 5.

Fig. 4 Bounds for the negative entry case, in which \(\alpha <-2, \beta >2\). In this case the \(L_{\infty }\)-norm always produces the most accurate bounds. a Bounds from Theorem 4. b Envelope of bounds from Theorem 4, shown dashed, and from Theorem 5, shown as solid lines

4 Bounds on the Growth of Vector Norms

4.1 Invariant Cones

We obtain bounds for Lyapunov exponents by computing explicit bounds for the norm of tangent vectors under the action of \(K_{ab}\). The limit (3) is independent of the norm used, and we give bounds derived from three standard norms.

Fig. 5 Bounds for the generalised Lyapunov exponent for the matrix product (8) with \(\alpha = -3\), \(\beta =3\). As before, estimates originating from the \(L_1\)-, \(L_2\)- and \(L_{\infty }\)-norms are given in green, red and blue, respectively. a The curve of generalised Lyapunov exponents for \(\alpha <0\), \(\beta >0\). b Difference between upper and lower bounds. Dashed lines represent bounds from Theorem 4, while solid lines are those from Theorem 5. The black line represents the minimal envelope over all norms, and thus our best bounds

Fig. 6 The invariant cones \(C^+\) and \(C^-\) in both the \(\alpha >0\) and \(\alpha <0\) cases, with expanding and contracting eigenvectors, \(v_+\) and \(v_-\), respectively, of the matrix \(K_{ab}^T K_{ab}\). a The \(\alpha >0\) case. Here, we show the cone \(C^+\) for \(\alpha >1\), where it lies inside the line \(u/v = 1\). The expanding eigenvector \(v_+\) also lies inside the cone \(C^+\), and so the orthogonal contracting eigenvector \(v_-\) lies outside \(C^+\), yielding Lemma 3. b The \(\alpha <0\) case. For \(\alpha \beta <-4\), when the matrix \(K_{ab}\) is hyperbolic, the cone \(C^-\) lies inside the line \(u/v=\Gamma \in (-1,0)\). The eigenvectors \(v_+\) and \(v_-\) both lie outside the invariant cone for all \(\alpha <-2\), \(\beta >2\), which produces Lemma 9

Lemma 1

The cone \(C^+ = \{ (u,v) : 0 \le u/v \le 1/\alpha \}\) (shown in Fig. 6a) is invariant under \(K_{ab}\) for all \(a,b \ge 1\), and is the smallest such cone.

Proof

The vector

$$\begin{aligned} \begin{pmatrix} u' \\ v' \end{pmatrix} = \begin{pmatrix} 1 & b\beta \\ a\alpha & 1+a\alpha b\beta \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} \end{aligned}$$

is such that

$$\begin{aligned} \frac{u'}{v'} = \frac{u+b\beta v}{a\alpha u+(1+a\alpha b\beta )v} \ge 0, \end{aligned}$$

clearly, and since \(a\alpha u+(1+a\alpha b\beta )v \ge \alpha (u+b\beta v)\) for \(a\ge 1\), we also have \(u'/v' \le 1/\alpha \). Letting \(b\rightarrow \infty \) with \(a=1\) gives \(u'/v'\rightarrow 1/\alpha \), while letting \(a\rightarrow \infty \) with \(b=1\) gives \(u'/v'\rightarrow 0\), showing that the cone cannot be made smaller. \(\square \)

Throughout this section, we will consider a vector \(X = (u,v)^T \in C^+\), and assume without loss of generality that \(u,v \ge 0\) (the calculations for \(u, v < 0\) are entirely analogous). We will consider the norm of the vector \(K_{ab}X\), given by

$$\begin{aligned} K_{ab} X = \begin{pmatrix} u+b\beta v \\ a\alpha u+(1+a\alpha b\beta )v\end{pmatrix}. \end{aligned}$$

First we consider the \(L_1\)-norm, given by \(||X ||_1 = |u|+ |v|\).

Lemma 2

The norm \(||K_{ab}X||_1\) for a vector \(X \in C^+\) satisfies

  1. (i)

    the lower bound

    $$\begin{aligned} \frac{||K_{ab} X||_1}{||X||_1} \ge 1+\tfrac{\alpha }{1+\alpha } \left( a+ b\beta + a\alpha b\beta \right) ; \end{aligned}$$
    (10)
  2. (ii)

    the upper bound

    $$\begin{aligned} \frac{||K_{ab} X||_1}{||X||_1} \le 1+b\beta +a\alpha b\beta . \end{aligned}$$
    (11)

Proof

For any \(X \in C^+\) we have \(||X ||_1 = u+v\). With \(a,b \ge 1\) we have

$$\begin{aligned} \frac{||K_{ab} X ||_1}{||X ||_1} = 1+\frac{b\beta v + a \alpha u + a\alpha b \beta v}{u+v}\,. \end{aligned}$$

With \(\alpha , \beta \ge 1\) this has no local minima or maxima in the cone \(C^+\). Thus, the lower and upper bounds are attained at the boundaries of \(C^+\), given by \((u,v) = (\frac{1}{1+\alpha },\frac{\alpha }{1+\alpha })\) and \((u,v) = (0,1)\), respectively. \(\square \)

For the \(L_2\)-norm \(||X ||_2 = \sqrt{u^2+v^2}\), the calculations are more involved, but the following holds:

Lemma 3

The norm \(||K_{ab}X||_2\) for a vector \(X \in C^+\) satisfies

  1. (i)

    the lower bound

    $$\begin{aligned} \frac{||K_{ab} X||_2^2}{||X||_2^2} \ge \min \left\{ (1+a\alpha b\beta )^2 + b^2\beta ^2\,,\, \tfrac{1}{1+\alpha ^2}\left( \alpha ^2(1+a+a\alpha b\beta )^2 + (1+\alpha b\beta )^2\right) \right\} ; \end{aligned}$$
    (12)
  2. (ii)

    the upper bound

$$\begin{aligned} \frac{||K_{ab} X||_2^2}{||X||_2^2} \le \tfrac{1}{2} \left( 2 + \mathcal {C}_{a\alpha b\beta } + \sqrt{\mathcal {C}_{a\alpha b\beta } (\mathcal {C}_{a\alpha b\beta } + 4)}\right) , \quad \text{ where } \mathcal {C}_{a\alpha b \beta } := (a\alpha +b\beta )^2 + (a\alpha b \beta )^2. \end{aligned}$$
    (13)

Proof

The real \(2\times 2\) matrix \(K_{ab}\) is non-singular, and so \(\forall X \in \mathbb {R}^2\), \(\frac{||K_{ab} X||_2}{||X||_2}\) is maximised (minimised) by \(\frac{||K_{ab} v_+ ||_2}{||v_+||_2}\)\(\left( \frac{||K_{ab} v_- ||_2}{||v_-||_2}\right) \), where \(v_+\) (\(v_-\)) is the eigenvector corresponding to \(e_+\) (\(e_-\)), the larger (smaller) eigenvalue of \(K_{ab}^TK_{ab}\), by the definition of the spectral matrix norm (and by singular value decomposition). Moreover, the value of \(\frac{||K_{ab} X||_2}{||X||_2}\) varies monotonically between these extremes. Since \(K_{ab}^TK_{ab}\) is symmetric, \(v_-\) and \(v_+\) are orthogonal.

The eigenvector \(v_+ = (r,s)\) satisfies

$$\begin{aligned} \frac{r}{s} = \frac{2(a\alpha (a\alpha b\beta +1)+b\beta )}{\mathcal {C}_{a\alpha b\beta } - 2a^2\alpha ^2 + \sqrt{\mathcal {C}_{a\alpha b\beta } (\mathcal {C}_{a\alpha b\beta } + 4)}}\,. \end{aligned}$$
(14)

Clearly \(r>0\), while \(s=\mathcal {C}_{a\alpha b\beta } - 2a^2\alpha ^2 + \sqrt{\mathcal {C}_{a\alpha b\beta } (\mathcal {C}_{a\alpha b\beta } + 4)}> 2\mathcal {C}_{a\alpha b\beta } - 2a^2\alpha ^2 = 2b^2\beta ^2 + 4a\alpha b\beta + 2(a\alpha b\beta )^2 >0\), and so \(v_+\) lies in the positive quadrant of tangent space. Moreover, we have \(s> 2\mathcal {C}_{a\alpha b\beta } - 2a^2\alpha ^2 = 4a\alpha b\beta + 2b^2\beta ^2 + 2(a\alpha b\beta )^2 > 2a\alpha + 2b\beta + 2a^2\alpha ^2 b\beta = r\) (since \(a, b, \alpha , \beta \ge 1\)), and so \(v_+ \in C^+\), giving the upper bound. The orthogonality of the eigenvectors then implies that \(v_- \notin C^+\), and the lower bound is given by the minimum of the value of the spectral norm on the boundaries of \(C^+\). \(\square \)

Next, consider the \(L_\infty \)-norm, given by \(||X ||_{\infty } = \max (|u|, |v|).\)

Lemma 4

The norm \(||K_{ab}X||_{\infty }\) for a vector \(X \in C^+\) satisfies, for \(\alpha \ge 1\),

  1. (i)

    the lower bound

    $$\begin{aligned} \frac{||K_{ab} X||_{\infty }}{||X||_{\infty }} \ge 1+a\alpha b\beta \,; \end{aligned}$$
    (15)
  2. (ii)

    the upper bound

    $$\begin{aligned} \frac{||K_{ab} X||_{\infty }}{||X||_{\infty }} \le 1+a+a\alpha b\beta \,. \end{aligned}$$
    (16)

Proof

For \(\alpha \ge 1\) we have \(||X ||_{\infty } = v\). Then

$$\begin{aligned} \frac{||K_{ab} X ||_{\infty }}{||X ||_{\infty }} = a\alpha \tfrac{u}{v}+(1+a\alpha b\beta ). \end{aligned}$$

This takes its minimum and maximum values at the minimum and maximum values of \(u/v\), respectively. For the cone \(C^+\) these are given by 0 and \(1/\alpha \), and the bounds follow immediately. \(\square \)
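The bounds of Lemmas 2 and 4 are easy to stress-test numerically; the sketch below (ours; \(\alpha =\beta =1\), \(L_1\) and \(L_{\infty }\) norms only, with a small tolerance for floating-point error) samples directions in \(C^+\) and checks the claimed inequalities.

```python
import numpy as np

rng = np.random.default_rng(2)
alpha = beta = 1.0

def check(a, b, samples=10**4):
    K = np.array([[1.0, b * beta], [a * alpha, 1 + a * alpha * b * beta]])
    t = rng.uniform(0.0, 1.0 / alpha, samples)   # u/v in [0, 1/alpha]
    X = np.stack([t, np.ones(samples)])          # vectors in the cone C+
    bounds = [
        (1, 1 + alpha / (1 + alpha) * (a + b * beta + a * alpha * b * beta),
            1 + b * beta + a * alpha * b * beta),        # Lemma 2 (L1 norm)
        (np.inf, 1 + a * alpha * b * beta,
            1 + a + a * alpha * b * beta),               # Lemma 4 (L-inf norm)
    ]
    for ord_, phi, psi in bounds:
        r = (np.linalg.norm(K @ X, ord_, axis=0)
             / np.linalg.norm(X, ord_, axis=0))
        assert (phi <= r + 1e-12).all() and (r <= psi + 1e-12).all()

for a in range(1, 6):
    for b in range(1, 6):
        check(a, b)
print("Lemmas 2 and 4 verified on sampled vectors")
```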

We can now use these bounds and the invariant cone to prove Theorem 1.

Proof of Theorem 1

Taking i.i.d. copies of \(K_{ab}\) and defining \(X_{N_k} = K_{a_kb_k}X_{N_{k-1}}\), \(k=1,\ldots ,J\), we have for an initial vector \(X_0 \in C^+\),

$$\begin{aligned} \frac{||X_{N_J}||}{||X_{0}||}= \frac{||K_{a_Jb_J} X_{N_{J-1}} ||}{||X_{N_{J-1}} ||}\, \frac{||K_{a_{J-1}b_{J-1}} X_{N_{J-2}} ||}{||X_{N_{J-2}} ||} \cdots \frac{||K_{a_1b_1} X_{0} ||}{||X_{0} ||}. \end{aligned}$$
(17)

By Lemma 1, each vector \(X_{N_k}\) lies in \(C^+\), and so each factor in the product is bounded according to Lemmas 2, 3 and 4. Hence

$$\begin{aligned} \prod _{j=1}^{J}\phi _k(a_j,b_j,\alpha ,\beta ) \le \frac{||X_{N_J} ||}{||X_{0} ||} \le \prod _{j=1}^{J}\psi _k (a_j,b_j,\alpha ,\beta ) \end{aligned}$$

for \(k = 1, 2, \infty \). Now

$$\begin{aligned} \lambda = \lim _{N \rightarrow \infty } \frac{1}{N} \mathbb {E}\log ||X_N ||= \lim _{J \rightarrow \infty } \frac{1}{4J} \mathbb {E}\log ||X_{N_J} ||\end{aligned}$$
(18)

since \(\mathbb {E}n = 4\), and so using the probability distribution \(P(a,b)\) we have

$$\begin{aligned} \lim _{J \rightarrow \infty } \frac{1}{J}\sum _{a, b=1}^{\infty }2^{-a-b} \log (\phi _k (a,b,\alpha ,\beta ))^J \le 4\lambda \le \lim _{J \rightarrow \infty } \frac{1}{J} \sum _{a,b=1}^{\infty } 2^{-a-b} \log (\psi _k (a,b,\alpha ,\beta ))^J \end{aligned}$$

and hence

$$\begin{aligned} \sum _{a,b=1}^{\infty }2^{-a-b} \log \phi _k (a,b,\alpha ,\beta ) \le 4\lambda \le \sum _{a,b=1}^{\infty } 2^{-a-b} \log \psi _k (a,b,\alpha ,\beta ) \end{aligned}$$

as required. \(\square \)

To obtain Corollary 1 we select the algebraically simplest bounds (the \(L_{\infty }\) bounds), and evaluate the infinite sums where possible.

Proof of Corollary 1

The lower \(L_{\infty }\) bound immediately gives:

$$\begin{aligned} 4\lambda &\ge \sum _{a,b=1}^{\infty } 2^{-a-b} \log (1+a\alpha b\beta ) \\ &\ge \sum _{a,b=1}^{\infty } 2^{-a-b} \log (a\alpha b\beta ) \\ &= \sum _{a,b=1}^{\infty } 2^{-a-b} \log a b + \log (\alpha \beta ) \sum _{a,b=1}^{\infty } 2^{-a-b} \\ &= \kappa + \log \alpha \beta . \end{aligned}$$

A little more work is required for the upper bound. We have

$$\begin{aligned} 4\lambda &\le \sum _{a,b=1}^{\infty } 2^{-a-b} \log (1+a+ a\alpha b\beta ) \\ &= \sum _{a=1}^{\infty } 2^{-a-1}\log (1+a+a\alpha \beta ) + \sum _{a=1}^{\infty } \sum _{b=2}^{\infty } 2^{-a-b} \log (1+a+a\alpha b\beta )\\ &\le \tfrac{1}{2}\sum _{a=1}^{\infty } 2^{-a}\log \left( a( \sqrt{\alpha \beta } + 1/{\sqrt{\alpha \beta }} )^2 \right) + \sum _{a=1}^{\infty } \sum _{b=2}^{\infty } 2^{-a-b} \log (ab(1+\alpha \beta )) \end{aligned}$$

since \(a\left( \sqrt{\alpha \beta } + {1}/{\sqrt{\alpha \beta }} \right) ^2 = a(\alpha \beta + 2 + 1/\alpha \beta ) > a\alpha \beta + a+1\) for \(a\ge 1\), and since \(ab(1+\alpha \beta )\ge 1+a+a\alpha b\beta \) for \(b \ge 2\). The logarithms can then be separated, reinstating the \(b=1\) term in the second sum and subtracting it again, to give

$$\begin{aligned} 4\lambda &\le \tfrac{1}{2}\sum _{a=1}^{\infty } 2^{-a} \log a + \log (\sqrt{\alpha \beta } + 1/\sqrt{\alpha \beta }) + \kappa \\ &\quad + \log (1+\alpha \beta ) - \sum _{a=1}^{\infty }2^{-a-1} \log \left( a(1+\alpha \beta )\right) \end{aligned}$$

and hence

$$\begin{aligned} 4\lambda \le \kappa + \log (\sqrt{\alpha \beta } + 1/\sqrt{\alpha \beta }) + \tfrac{1}{2} \log (1+\alpha \beta ). \end{aligned}$$

\(\square \)

4.2 Cone Improvement

In this section we improve on the lower bound by considering the relationship between two identical geometric distributions.

Lemma 5

When a and b are independent geometric random variables with parameter 1/2, we have

$$\begin{aligned} P(a=b) = P(a>b) = P(b>a) = 1/3. \end{aligned}$$

Proof

We have

$$\begin{aligned} P(a=b) = \sum _{i=1}^{\infty } P(a=i \cap b=i) = \sum _{i=1}^{\infty } 2^{-2i} = \frac{1/4}{1-1/4} = \frac{1}{3}. \end{aligned}$$

Then, the remaining two equalities follow by symmetry. \(\square \)

Lemma 6

The cone \(C^+ = \{ 0 \le \frac{u}{v} \le \frac{1}{\alpha }\}\) is mapped into the following cones, in the following cases:

  1. 1.

when \(a<b\), \(K_{ab} (C^+) = C^+\);

  2. 2.

when \(a=b\), \(K_{ab} (C^+) = \{ 0 \le \frac{u}{v} \le \frac{1+\alpha \beta }{2\alpha +\alpha ^2 \beta } \}\);

  3. 3.

when \(a>b\), \(K_{ab} (C^+) = \{ 0 \le \frac{u}{v} \le \frac{1+\alpha \beta }{3\alpha + 2\alpha ^2 \beta } \}\). Consequently, we have

$$\begin{aligned} \hat{\phi }_k^{(m)}(a,b,\alpha ,\beta ) \le \frac{||K_{ab} X ||}{||X||} \le \hat{\psi }_k^{(m)}(a,b,\alpha ,\beta ), \end{aligned}$$

for \(k=1, 2, \infty \), and for \(m=1,2,3\) corresponding to the cases above, with \(\hat{\phi }_k^{(m)}\) and \(\hat{\psi }_k^{(m)}\) as given in Theorem 2.

Proof

In each case, the cone boundary \((0,1)^T\) is mapped onto \((b\beta ,1+a\alpha b\beta )^T\), which lies arbitrarily close to \((0,1)^T\) for large a, regardless of the relationship between a and b, and for all \(\alpha \), \(\beta >0\). The other cone boundary \((1,\alpha )^T\) is mapped onto \((1+\alpha b\beta ,\,\alpha (1+ a +a\alpha b\beta ))^T\), and then, we observe that:

  1. 1.

    if \(a=b\), the ratio \(\frac{1+a\alpha \beta }{a\alpha +\alpha +a^2 \alpha ^2 \beta }\) is maximised when \(a=1\);

  2. 2.

    if \(a>b\), the ratio \(\frac{1+b\alpha \beta }{a\alpha +\alpha +a\alpha ^2 b\beta }\) is maximised when \(a=2\) and \(b=1\);

  3. 3.

    if \(b>a\), the ratio \(\frac{1+\alpha b\beta }{a\alpha +\alpha +a\alpha ^2 b \beta }\) approaches \(\frac{1}{\alpha }\) for \(a=1\) and \(b \rightarrow \infty \).

The bounds then follow using the same derivations as in Lemmas 2, 3 and 4, substituting these new cone boundaries where appropriate. \(\square \)

Proof of Theorem 2

This follows the same argument as the proof of Theorem 1, except that whenever \(a=b\) or \(a>b\), on the following iterate the growth of \(||X_i ||\) is bounded according to Lemma 6. Since by Lemma 5 each of these cases occurs with probability 1/3, the result follows. \(\square \)

4.3 Negative Shears

As in Sect. 2.3, we reverse one of the shears, taking (without loss of generality) \(\alpha <-2, \beta >2\), with \(a, b >0\). Eigenvalues of \(K_{ab}\) are then given by

$$\begin{aligned} e_{\pm } = \frac{2+a\alpha b\beta \pm \sqrt{a\alpha b \beta (a\alpha b\beta +4)}}{2}\,. \end{aligned}$$

The expanding eigenvalue \(e_-\) has eigenvector \((u,v)^T\) with

$$\begin{aligned} \frac{u}{v} = -\frac{b\beta }{2} + \sqrt{\left( \frac{b\beta }{2} \right) ^2 + \frac{b\beta }{a\alpha }} < 0. \end{aligned}$$

In the case \(\alpha <-2\) the minimal cone is bounded by this eigenvector when \(a=b=1\), so setting

$$\begin{aligned} \Gamma = -\frac{\beta }{2} + \sqrt{\left( \frac{\beta }{2} \right) ^2 + \frac{\beta }{\alpha }} \in (-1,0) \end{aligned}$$

we have:

Lemma 7

The cone \(C^{-} = \{ (u,v) : \Gamma \le u/v \le 0 \}\) is invariant under \(K_{ab}\) for all \(a, b \ge 1\), and for all \(\alpha <-2\), \(\beta >2\), and is the smallest such cone.

Proof

Without loss of generality we will take an initial vector \((u,v) \in C^-\) with \(u\le 0\), \(v>0\) (an initial vector in the opposite sector proceeds exactly analogously), so that \(-v < u \le 0\). Then, we consider

$$\begin{aligned} \begin{pmatrix} u' \\ v' \end{pmatrix} = \begin{pmatrix} 1 & b\beta \\ a\alpha & 1+a\alpha b\beta \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} u+b\beta v \\ a\alpha u+(1+a\alpha b\beta )v \end{pmatrix}. \end{aligned}$$

Now \(u' = u+b\beta v>v(b\beta -1)>0\), and \(v' = a\alpha u+(1+a\alpha b\beta )v< a\alpha u +u(-1-a\alpha b\beta ) = u(-a\alpha (b\beta -1)-1)<0\), and so \(u'/v' <0\).

Since \(e_- = 1+\beta /\Gamma \), the characteristic equation for \(K_{11}\) is \(\alpha \Gamma ^2 + \alpha \beta \Gamma - \beta =0\). Then, since \(\alpha \beta \Gamma > |\alpha \Gamma ^2|\) (since \(\beta> 2 > |\Gamma |\)) we have \(a\alpha \Gamma ^2 + a\alpha \beta \Gamma - \beta \ge 0\) for \(a\ge 1\). We also have \(a\alpha \beta \Gamma > \beta \), and so for \(b \ge 1\),

$$\begin{aligned} a \alpha \Gamma ^2 + a \alpha b \beta \Gamma - b \beta \ge 0 \end{aligned}$$

and hence

$$\begin{aligned} a \alpha \Gamma ^2 + a \alpha b \beta \Gamma +\frac{u}{v} \ge b\beta + \frac{u}{v}. \end{aligned}$$

Now we use the facts that \(u/v \ge \Gamma \) and \(a\alpha \Gamma \ge \alpha \Gamma = \beta /(\beta +\Gamma ) > 1\) to replace two of these terms while respecting the inequality:

$$\begin{aligned} a \alpha \Gamma \frac{u}{v} + a \alpha b \beta \Gamma + \Gamma \ge b\beta + \frac{u}{v}. \end{aligned}$$

Rearranging then gives \(u'/v' \ge \Gamma \). This is the smallest such invariant cone, since setting \((a,b) = (1,1)\) gives \(u'/v' = \Gamma \) when \(u/v = \Gamma \), and setting \((a,b) = (\infty ,1)\) gives \(u'/v' = 0\). \(\square \)

As before, the \(L_{\infty }\)-norm gives bounds easily:

Lemma 8

The norm \(||K_{ab}X||_{\infty }\) for a vector \(X \in C^-\), when \(\alpha <-2, \beta >2\), satisfies

  1. (i)

    the lower bound

    $$\begin{aligned} \frac{||K_{ab} X||_{\infty }}{||X||_{\infty }} \ge -a\alpha b\beta - a\alpha \Gamma -1; \end{aligned}$$
    (19)
  2. (ii)

    the upper bound

    $$\begin{aligned} \frac{||K_{ab} X||_{\infty }}{||X||_{\infty }} \le -a\alpha b\beta -1. \end{aligned}$$
    (20)

Proof

Since \(\Gamma >-1\), for any \(X \in C^-\) we have \(||X ||_{\infty } = |v|\), and since \(C^-\) is invariant under \(K_{ab}\), we have \(||K_{ab} X ||_{\infty } / ||X ||_{\infty } = |a\alpha \frac{u}{v}+1+a\alpha b\beta |\), which takes its minimum at the boundary \((u,v) = (\Gamma ,1)\) and its maximum at the boundary \((u,v) = (0,1)\) of the cone \(C^-\), and the bounds follow immediately. \(\square \)

For this invariant cone, the \(L_2\)-norm \(||\cdot ||_2\) cannot attain the spectral maximum, and the following holds:

Lemma 9

The norm \(||K_{ab}X||_2\) for a vector \(X \in C^-\) satisfies

  1. (i)

    the lower bound

    $$\begin{aligned} \frac{||K_{ab} X||_2^2}{||X||_2^2} \ge \tfrac{1}{1+\Gamma ^2}\left( (\Gamma +b\beta )^2 + (1+a\alpha \Gamma +a\alpha b\beta )^2\right) ; \end{aligned}$$
    (21)
  2. (ii)

    the upper bound

    $$\begin{aligned} \frac{||K_{ab} X||_2^2}{||X||_2^2} \le (1+ab\alpha \beta )^2 + b^2\beta ^2\,. \end{aligned}$$
    (22)

Proof

As in Lemma 3, we consider eigenvectors of \(K_{ab}^TK_{ab}\). For \(\alpha <-2\), \(\beta >2\), the expanding eigenvector \(v_+ = (r,s)\) still lies in the northeast–southwest quadrant, outside \(C^-\). But since \(v_- = (-s,r)\), we have

$$\begin{aligned} s &= \mathcal {C}_{a\alpha b\beta } -2a^2\alpha ^2+\sqrt{\mathcal {C}_{a\alpha b\beta }(\mathcal {C}_{a\alpha b\beta }+4)} \\ &> 2\mathcal {C}_{a\alpha b\beta } - 2a^2\alpha ^2 \\ &= 4a\alpha b\beta +2 b^2\beta ^2 +2a^2\alpha ^2 b^2\beta ^2 \\ &> 2a\alpha + 2b\beta +2a^2\alpha ^2b\beta \\ &= r, \end{aligned}$$

and so \(-s/r <-1\), and hence \(v_-\) also lies outside \(C^-\). Since the norm in question increases monotonically between the two extremes, neither of which lie in the cone, the lower and upper bounds are achieved at the minimum and maximum values (respectively) at the boundaries of \(C^-\). At the boundary given by \((u,v)=(0,1)\), we have \(\frac{||K_{ab} X||_2^2}{||X||_2^2} = b^2\beta ^2 +(1+a\alpha b\beta )^2\), while at the other boundary, given by \((u,v) = (\Gamma /\sqrt{1+\Gamma ^2},1/\sqrt{1+\Gamma ^2})\), we have

$$\begin{aligned} \frac{||K_{ab} X||_2^2}{||X||_2^2} &= \tfrac{1}{1+\Gamma ^2} \left( (\Gamma +b\beta )^2 + (1+a\alpha \Gamma +a\alpha b\beta )^2 \right) \\ &< (\Gamma +b\beta )^2 + (1+a\alpha \Gamma +a\alpha b\beta )^2 \\ &< b^2\beta ^2 +(1+a\alpha b\beta )^2, \end{aligned}$$

since \(-1<\Gamma <0\). \(\square \)

Lemma 10

The norm \(||K_{ab}X||_1\) for a vector \(X \in C^-\) satisfies

  1. (i)

    the lower bound

$$\begin{aligned} \frac{||K_{ab} X||_1}{||X||_1} \ge \tfrac{1}{1-\Gamma } \left( b\beta + \Gamma - a\alpha b \beta - 1 - a\alpha \Gamma \right) ; \end{aligned}$$
    (23)
  2. (ii)

    the upper bound

    $$\begin{aligned} \frac{||K_{ab} X||_1}{||X||_1} \le b\beta - a\alpha b\beta -1\,. \end{aligned}$$
    (24)

Proof

With the \(L_1\)-norm we have \(||K_{ab} X ||_1 = |u+b\beta v| + |a\alpha u +(1+a\alpha b\beta )v|\), which takes the given values at the boundaries \((u,v) = (0,1)\) and \((u,v) = (\Gamma /(1-\Gamma ),1/(1-\Gamma ))\) of \(C^-\). \(\square \)

Proof of Theorem 4

This follows exactly the argument of Theorem 1, using Lemma 7 to guarantee an invariant cone, and using Lemmas 8, 9 and 10 to bound each term in the matrix product. \(\square \)

4.4 Cone Improvement

In the \(\alpha <0\) case we can make a significant improvement on the bounds given by Theorem 4 by recognising that the boundary \(u/v = \Gamma \) of the cone \(C^-\) can only be attained when \(a=b=1\), which occurs with probability \(P(a=b=1) = 1/4\). Whenever a or b (or both) is greater than 1, we can assume a smaller cone for the following iterate. More precisely, since \(P(a=1, b\ge 2) = P(a\ge 2, b=1)= P(a\ge 2, b\ge 2) = 1/4\), we have

Lemma 11

The cone \(C^- = \{ \Gamma \le \frac{u}{v} \le 0\}\) is mapped into the following cones with equal probability:

  1. 1.

    When \(a=b =1\), \(K_{ab} (C^-) =\left\{ \Gamma \le \frac{u}{v} \le \frac{\beta }{1+\alpha \beta } \right\} \);

  2. 2.

    when \(a\ge 2, b = 1\), \(K_{ab} (C^-) = \left\{ \Gamma _{2,1} \le \frac{u}{v} \le 0 \right\} \);

  3. 3.

    when \(a = 1, b\ge 2\), \(K_{ab} (C^-) =\left\{ \Gamma _{1,2} \le \frac{u}{v} \le \frac{1}{\alpha } \right\} \);

  4. 4.

    when \(a \ge 2, b\ge 2\), \(K_{ab} (C^-) = \left\{ \Gamma _{2,2} \le \frac{u}{v} \le 0 \right\} \).

These cones then produce the functions \(\hat{\tilde{\phi }}_k^{(m_a,m_b)}(a,b,\alpha ,\beta )\) and \(\hat{\tilde{\psi }}_{\infty }^{(m)} (a,b,\alpha ,\beta )\), for \(m_a, m_b = 1,2\) and \(m = 1,\ldots ,4\), as detailed in Theorem 5.

Proof

Any vector (uv) is mapped by \(K_{ab}\) into \((u',v')\) such that

$$\begin{aligned} \frac{u'}{v'} = \frac{\frac{u}{v}+b\beta }{a\alpha \frac{u}{v}+1+a\alpha b\beta }\,. \end{aligned}$$

Inserting the boundaries of \(C^-\), given by \(\frac{u}{v} = 0\) and \(\frac{u}{v} = \Gamma \) into this expression produces the required inequalities. The bounding functions are then obtained using analogous arguments to Lemmas 8, 9 and 10, with the new cone boundaries. \(\square \)

Proof of Theorem 5

Again this follows the same argument as the proof of Theorem 1, using the improved bounds given by Lemma 11, each of which applies 1/4 of the time, on average. \(\square \)

4.5 Generalised Lyapunov Exponents

The expressions for \(\ell (q)\) can be obtained in largely the same way, bounding the expansion of vectors at each application of a block \(K_{ab}\).

Proof of Theorems 3 and 6

Using properties of expectation, and the independence of the blocks \((a_j, b_j)\), we have

$$\begin{aligned} \mathbb {E}||X_{N_J} ||^q &= \mathbb {E}\left( \frac{||K_{a_Jb_J} X_{N_{J-1}} ||}{||X_{N_{J-1}} ||}\, \frac{||K_{a_{J-1}b_{J-1}} X_{N_{J-2}} ||}{||X_{N_{J-2}} ||} \cdots \frac{||K_{a_1b_1} X_{0} ||}{||X_{0} ||} \right) ^q \\ &= \mathbb {E}\left( \frac{||K_{a_Jb_J} X_{N_{J-1}} ||}{||X_{N_{J-1}} ||} \right) ^q\, \mathbb {E}\left( \frac{||K_{a_{J-1}b_{J-1}} X_{N_{J-2}} ||}{||X_{N_{J-2}} ||} \right) ^q \cdots \mathbb {E}\left( \frac{||K_{a_1b_1} X_{0} ||}{||X_{0} ||} \right) ^q \end{aligned}$$

and so, since the \(a_j, b_j\) are i.i.d., we have (for \(q \ge 0\); the inequalities are reversed for \(q<0\))

$$\begin{aligned} \left( \sum _{a,b=1}^{\infty } 2^{-a-b}\, \phi ^q \right) ^{J} \le \mathbb {E}||X_{N_J} ||^q \le \left( \sum _{a,b=1}^{\infty } 2^{-a-b}\, \psi ^q \right) ^{J}. \end{aligned}$$

Then, from the definition of \(\ell (q)\) given in (6) the results follow immediately. \(\square \)

5 Conclusions and Discussion

In this paper we addressed the question of obtaining rigorous bounds for Lyapunov exponents, generalised Lyapunov exponents and topological entropy for randomised mixing devices. The matrices under discussion are \(2 \times 2\) shear matrices, but a related technique will work for any set of matrices that share an invariant cone. This is proved formally in Protasov and Jungers (2013), who give a rapid algorithm involving unconstrained minimisation problems. Here, the optimisation is achieved analytically, giving explicit upper and lower bounds. We also obtain bounds in the novel case of shear matrices with negative entries. A pair of hyperbolic matrices sharing an invariant cone was shown to enjoy exponential decay of correlations in Ayyer and Stenlund (2007), where the rate of decay depends on the Lyapunov exponent, but there the Lyapunov exponent is simply bounded from below by global expansion and contraction rates in the invariant cone. The method in this paper could be adapted to tighten their lower bound, and to provide an upper bound.

Bounds on generalised Lyapunov exponents are of particular interest since they are known to be extremely difficult to compute numerically. Generalised Lyapunov exponents are related to the \(L_p\)-norm joint spectral radius, or p-radius, given, for a set of m matrices \(\{ A_1, A_2, \ldots , A_m\}\), by

$$\begin{aligned} \rho _p = \lim _{N \rightarrow \infty } \left[ m^{-N} \sum _{B \in \mathcal {A}^N} \Vert B \Vert ^p \right] ^{1/pN} = \lim _{N \rightarrow \infty } \left[ \mathbb {E} \Vert M_N \Vert ^p \right] ^{1/pN}, \end{aligned}$$
(25)

where, as above, \(\mathcal {A}^N\) is the set of all \(m^N\) products of matrices of length N. The parameter q in (5) plays the same role as p in the \(L_p\)-norm in (25), but the quantities differ, since the generalised Lyapunov exponent uses a logarithm, while the p-radius employs an Nth root to prevent the value from growing without bound. Nevertheless, our method could be used to bound the p-radius from above and below quite straightforwardly.

The assumption that the matrices A and B should be chosen with equal probability at each iterate can be relaxed. Altering these probabilities does not change the invariant cone, or the resulting bounds on vector norms; only the probability distribution \(P(a,b) = 2^{-a-b}\) is changed. For example, choosing A with probability p and B with probability \(q = 1-p\) at each iterate gives \(P(a,b) = p^aq^b\), and then \(\mathbb {E}a=q^{-1}\), \(\mathbb {E}b=p^{-1}\), and \(\mathbb {E}n = (pq)^{-1}\). Similarly, one may choose from k matrices \(A_i\) with probabilities \(p_i\) at each iterate. The crucial element is that the expected length of a block should be obtainable, since, as in (18), N iterates must be equated with J blocks.
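For instance, under a biased coin the \(L_{\infty }\) lower bound of Theorem 1 generalises by replacing the weights \(2^{-a-b}\) and the factor \(\mathbb {E}n = 4\); the sketch below (ours, assuming only that the cone argument is unchanged, which holds since the cone does not depend on the probabilities) evaluates the result.

```python
import numpy as np

def biased_lower(alpha, beta, prob_A, p_max=400):
    """L-infinity lower bound with P(a, b) = p^a q^b, q = 1 - p, E n = 1/(pq)."""
    p, q = prob_A, 1.0 - prob_A
    a = np.arange(1, p_max + 1).astype(float)
    W = np.outer(p**a, q**a)                 # weights P(a, b) = p^a q^b
    aa, bb = np.meshgrid(a, a, indexing="ij")
    # Divide by E n, i.e. multiply by p*q, in place of the factor 1/4.
    return (W * np.log1p(aa * alpha * bb * beta)).sum() * (p * q)

print(f"{biased_lower(1.0, 1.0, 0.5):.5f}")   # recovers the Theorem 1 value
print(f"{biased_lower(1.0, 1.0, 0.3):.5f}")   # a biased protocol
```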

Theorem 2 improves on (1) by involving the relative values of a and b in one block to shrink the cone for the next, in the three cases \(a=b\), \(a<b\) and \(a>b\). Similarly, the nine cases comprising the relative values of a and b in two consecutive blocks can increase the tightness of bounds in the following block. This procedure could be extended to further improve bounds, but the number of cases increases exponentially—in k blocks there are \(3^k\) combinations of relative values of a and b. Our original explicit bounds are appealing in their simplicity and accuracy.