1 Introduction

1.1 The Context

The hyperbolic spaces (that is rank 1 symmetric spaces of non-compact type) are \(\mathbf{H}^n_{\mathbb {F}}\), where \({\mathbb {F}}\) is one of the real numbers, the complex numbers, the quaternions or the octonions (and in the last case \(n=2\)); see Chen and Greenberg [5]. A map in \(\mathrm{Isom}(\mathbf{H}^n_{\mathbb {F}})\) is parabolic if it has a unique fixed point and this point lies on \(\partial \mathbf{H}^n_{\mathbb {F}}\). Parabolic isometries of \(\mathbf{H}^2_{\mathbb {R}}\) and \(\mathbf{H}^3_{\mathbb {R}}\), that is parabolic elements of \(\mathrm{PSL}(2,{\mathbb {R}})\) and \(\mathrm{PSL}(2,{\mathbb {C}})\), are particularly simple: they are (conjugate to) Euclidean translations. In all the other cases, there are more complicated parabolic maps, which are conjugate to Euclidean screw motions.

Shimizu’s lemma [23] gives a necessary condition for a subgroup of \(\mathrm{PSL}(2,{\mathbb {R}})\) containing a parabolic element to be discrete. If one normalises so that the parabolic fixed point is \(\infty \), then Shimizu’s lemma says that the isometric sphere of any group element not fixing infinity has bounded radius, the bound being the Euclidean translation length. Equivalently, it says that the horoball with height the Euclidean translation length is precisely invariant (that is elements of the group either map the horoball to itself or to a disjoint horoball). Therefore, Shimizu’s lemma may be thought of as an effective version of the Margulis lemma in the case of cusps. Shimizu’s lemma was generalised to \(\mathrm{PSL}(2,{\mathbb {C}})\) by Leutbecher [17] and to subgroups of \(\mathrm{Isom}(\mathbf{H}^n_{\mathbb {R}})\) containing a translation by Wielenberg [25]. Ohtake gave examples showing that, for \(n\ge 4\), subgroups of \(\mathrm{Isom}(\mathbf{H}^n_{\mathbb {R}})\) containing a more general parabolic map can have isometric spheres of arbitrarily large radius, or equivalently there can be no precisely invariant horoball [19]. Finally, Waterman [24] gave a version of Shimizu’s lemma for more general parabolic maps, by showing that each isometric sphere is bounded by a function of the parabolic translation length at its centre. Recently, Erlandsson and Zakeri [6, 7] have constructed precisely invariant regions contained in a horoball with better asymptotics than those of Waterman; see also [22].

It is then natural to ask for versions of Shimizu’s lemma associated to other rank 1 symmetric spaces. The holomorphic isometry groups of \(\mathbf{H}^n_{\mathbb {C}}\) and \(\mathbf{H}^n_{\mathbb H}\) are \(\mathrm{PU}(n,1)\) and \(\mathrm{PSp}(n,1)\), respectively. Kamiya generalised Shimizu’s lemma to subgroups of \(\mathrm{PU}(n,1)\) or \(\mathrm{PSp}(n,1)\) containing a vertical Heisenberg translation [13]. For subgroups of \(\mathrm{PU}(n,1)\) containing a general Heisenberg translation, Parker [20, 21] gave versions of Shimizu’s lemma both in terms of a bound on the radius of isometric spheres and in terms of a precisely invariant horoball or sub-horospherical region. This was generalised to \(\mathrm{PSp}(n,1)\) by Kim and Parker [16]. Versions of Shimizu’s lemma for subgroups of \(\mathrm{PU}(2,1)\) containing a screw parabolic map were given by Jiang et al. [10, 14]. Kim claimed that the main result of [10] holds for \(\mathrm{PSp}(2,1)\) [15]. However, he did not consider all possible types of screw parabolic map (in the language below, he assumed \(\mu =1\)). Our result completes the project begun by Kamiya [13] by giving a full version of Shimizu’s lemma for any parabolic isometry of \(\mathbf{H}^n_{\mathbb {C}}\) or \(\mathbf{H}^n_{\mathbb H}\) for all \(n\ge 2\).

Shimizu’s lemma is a special case of Jørgensen’s inequality [12], which is among the most important results about real hyperbolic 3-manifolds. Jørgensen’s inequality has also been generalised to other hyperbolic spaces. Versions for isometry groups of \(\mathbf{H}^2_{\mathbb {C}}\) containing a loxodromic or elliptic map were given by Basmajian and Miner [1] and Jiang et al. [9]. These results were extended to \(\mathbf{H}^2_{\mathbb H}\) by Kim and Parker [16] and Kim [15]. Cao and Parker [3] and Cao and Tan [4] obtained generalised Jørgensen’s inequalities in \(\mathbf{H}^n_{\mathbb H}\) for groups containing a loxodromic or elliptic map. Finally, Markham and Parker [18] obtained a version of Jørgensen’s inequality for the isometry groups of \(\mathbf{H}^2_{\mathbb O}\) with certain types of loxodromic map.

1.2 Statements of the Main Results

The purpose of this paper is to obtain a generalised version of Shimizu’s lemma for parabolic isometries of quaternionic hyperbolic n-space, and in particular for screw parabolic isometries. In order to state our main results, we need to use some notation and facts about quaternions and quaternionic hyperbolic n-space.

We will show in Sect. 2.3 that a general parabolic isometry of quaternionic hyperbolic space \(\mathbf{H}_{{\mathbb {H}}}^n\) can be normalised to the form

$$\begin{aligned} T=\left( \begin{array}{c@{\quad }c@{\quad }c} \mu &{} -\sqrt{2}\tau ^*\mu &{} (-\Vert \tau \Vert ^2+t)\mu \\ 0 &{} U &{} \sqrt{2}\tau \mu \\ 0 &{} 0 &{} \mu \end{array}\right) , \end{aligned}$$
(1)

where \(\tau \in {\mathbb H}^{n-1}\), t is a purely imaginary quaternion, \(U\in \mathrm{Sp}(n-1)\) and \(\mu \) is a unit quaternion satisfying

$$\begin{aligned} {\left\{ \begin{array}{ll} U\tau =\mu \tau ,\ U^*\tau =\overline{\mu }\tau ,\ \mu \tau \ne \tau \overline{\mu } &{} \hbox {if }\tau \ne 0 \hbox { and }\mu \ne \pm 1, \\ U\tau =\mu \tau ,\ U^*\tau =\overline{\mu }\tau &{} \hbox {if }\tau \ne 0 \hbox { and }\mu =\pm 1, \\ \mu t\ne t\overline{\mu } &{} \hbox {if }\tau =0\hbox { and }\mu \ne \pm 1, \\ t\ne 0 &{} \hbox {if }\tau =0\hbox { and }\mu =\pm 1. \end{array}\right. } \end{aligned}$$
(2)

We call a parabolic element of form (1) a Heisenberg translation if \(\mu =\pm 1\) and \(U=\mu I_{n-1}\), and we say that it is screw parabolic otherwise. We remark that even for \(n=2\) it is possible to find screw parabolic maps with \(\mu \ne \pm 1\) and \(\tau \ne 0\). This is the point overlooked by Kim [15].

If \(\mu \) is a unit quaternion and \(\zeta \in {\mathbb H}^{n-1}\), the map \(\zeta \longmapsto \mu \zeta \overline{\mu }\) is linear. For U and \(\mu \) as above, consider the following linear maps:

$$\begin{aligned} B_{U,\mu }:\zeta \longmapsto U\zeta -\zeta \mu , \quad B_{\mu }:\zeta \longmapsto \mu \zeta -\zeta \mu . \end{aligned}$$

Define \(N_{U,\mu }\) and \(N_{\mu }\) to be their spectral norms, that is

$$\begin{aligned} N_{U,\mu }= & {} \max \{\Vert B_{U,\mu }\zeta \Vert :\ \zeta \in {\mathbb H}^{n-1}\ \text{ and }\ \Vert \zeta \Vert =1\}, \end{aligned}$$
(3)
$$\begin{aligned} N_{\mu }= & {} \max \{\Vert B_{\mu }\zeta \Vert :\ \zeta \in {\mathbb H}^{n-1}\ \text{ and }\ \Vert \zeta \Vert =1\}=2|\mathrm{Im}(\mu )|. \end{aligned}$$
(4)

Note that \(U^*\zeta -\zeta \overline{\mu } =U^*\zeta \mu \overline{\mu }-U^*U\zeta \overline{\mu } =-U^*(U\zeta -\zeta \mu )\overline{\mu }\). Therefore, \(N_{U^*,\overline{\mu }}=N_{U,\mu }\). We remark that \(N_{\mu }=0\) if and only if \(\mu =\pm 1\), and \(N_{U,\mu }= 0\) if and only if both \(\mu =\pm 1\) and \(U=\mu I_{n-1}\), that is \(N_{U,\mu }=0\) if and only if T is a Heisenberg translation.
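As a numerical illustration of (4) in the simplest case \(n=2\) (so that \(\zeta \in {\mathbb H}\)), the following Python sketch checks that \(\Vert B_\mu \zeta \Vert \le 2|\mathrm{Im}(\mu )|\) on random unit quaternions and that the bound is attained. The tuple representation of quaternions, the helper names and the sample value of \(\mu \) are illustrative assumptions and are not part of the paper.

```python
import math
import random

def qmul(p, q):
    """Hamilton product of quaternions stored as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qabs(q):
    return math.sqrt(sum(x * x for x in q))

def b_mu_norm(mu, zeta):
    """Norm of B_mu(zeta) = mu*zeta - zeta*mu."""
    left, right = qmul(mu, zeta), qmul(zeta, mu)
    return qabs(tuple(x - y for x, y in zip(left, right)))

theta = 0.1                                   # sample angle (assumption)
mu = (math.cos(theta), math.sin(theta), 0.0, 0.0)
n_mu = 2 * qabs((0.0,) + mu[1:])              # claimed value 2|Im(mu)|

# The bound is attained at zeta = j, which is orthogonal to Im(mu).
attained = b_mu_norm(mu, (0.0, 0.0, 1.0, 0.0))

# Sample random unit quaternions and record the largest value seen.
rng = random.Random(0)
largest = 0.0
for _ in range(1000):
    raw = [rng.gauss(0.0, 1.0) for _ in range(4)]
    scale = math.sqrt(sum(x * x for x in raw))
    zeta = tuple(x / scale for x in raw)
    largest = max(largest, b_mu_norm(mu, zeta))
```

For \(n>2\) the map \(B_\mu \) acts on each coordinate of \(\zeta \) separately, so the same value is expected; the sketch does not test this.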

We may identify the boundary of \(\mathbf{H}^n_{\mathbb H}\) with the \((4n-1)\)-dimensional generalised Heisenberg group with 3-dimensional centre, which is \({\mathfrak N}_{4n-1}={\mathbb H}^{n-1}\times \mathrm{Im}({\mathbb H})\) with the group law

$$\begin{aligned} (\zeta _1,v_1)\cdot (\zeta _2,v_2)= (\zeta _1+\zeta _2,v_1+v_2+2\mathrm{Im}(\zeta _2^*\zeta _1)). \end{aligned}$$
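The group law can be made concrete with a short Python sketch for \(n=2\), where \(\zeta \in {\mathbb H}\) and \(\zeta _2^*\zeta _1=\overline{\zeta }_2\zeta _1\). The representation of group elements as pairs of 4-tuples and the function names are assumptions made for illustration.

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def qadd(p, q):
    return tuple(x + y for x, y in zip(p, q))

def heis_mul(g, h):
    """(zeta1, v1).(zeta2, v2) = (zeta1 + zeta2, v1 + v2 + 2 Im(conj(zeta2) zeta1))."""
    (z1, v1), (z2, v2) = g, h
    prod = qmul(qconj(z2), z1)
    twice_im = (0, 2 * prod[1], 2 * prod[2], 2 * prod[3])
    return (qadd(z1, z2), qadd(qadd(v1, v2), twice_im))

def heis_inv(g):
    """Inverse (zeta, v)^{-1} = (-zeta, -v)."""
    z, v = g
    return (tuple(-x for x in z), tuple(-x for x in v))

IDENTITY = ((0, 0, 0, 0), (0, 0, 0, 0))

# Integer sample points (assumption), so every comparison is exact.
g = ((1, 2, 0, -1), (0, 3, 1, 4))
h = ((0, 1, 1, 2), (0, -2, 5, 1))
k = ((2, 0, -3, 1), (0, 1, 1, 1))
central = ((0, 0, 0, 0), (0, 7, -2, 3))   # zeta = 0: an element of the centre
```

With these sample points the product is associative and non-commutative, and the element with \(\zeta =0\) commutes with \(g\), in line with the description of the centre as \(\mathrm{Im}({\mathbb H})\).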

There is a natural metric, called the Cygan metric, on \({\mathfrak N}_{4n-1}\). Any parabolic map T fixing \(\infty \) is a Cygan isometry of \({\mathfrak N}_{4n-1}\). The natural projection from \({\mathfrak N}_{4n-1}\) to \({\mathbb H}^{n-1}\) given by \(\Pi :(\zeta ,v)\longmapsto \zeta \) is called vertical projection. The vertical projection of T is a Euclidean isometry of \({\mathbb H}^{n-1}\).

An element S of \(\mathrm{Sp}(n,1)\) not fixing \(\infty \) is clearly not a Cygan isometry. However, there is a Cygan sphere with centre \(S^{-1}(\infty )\), called the isometric sphere of S, that is sent by S to the Cygan sphere of the same radius, centred at \(S(\infty )\). We call this radius \(r_S=r_{S^{-1}}\). Our first main result is the following theorem relating the radius of the isometric spheres of S and \(S^{-1}\), the Cygan translation length of T at their centres and the Euclidean translation length of the vertical projection of T at the vertical projections of the centres.

Theorem 1.1

Let \(\Gamma \) be a discrete subgroup of \(\mathrm{PSp}(n,1)\) containing the parabolic map T given by (1). Let \(\Pi :{\mathfrak N}_{4n-1}\longrightarrow {\mathbb H}^{n-1}\) be the vertical projection given by \(\Pi :(\zeta ,v)\longmapsto \zeta \). Suppose that the quantities \(N_{U,\mu }\) and \(N_\mu \) defined by (3) and (4) satisfy \(N_\mu < 1/4\) and \(N_{U,\mu } < (3-2\sqrt{2+N_\mu })/2\). Define

$$\begin{aligned} K=\frac{1}{2}\Bigl (1+2N_{U,\mu }+\sqrt{1-12N_{U,\mu }+4N_{U,\mu }^2-4N_\mu }\Bigr ). \end{aligned}$$
(5)

If S is any other element of \(\Gamma \) not fixing \(\infty \) and with isometric sphere of radius \(r_S\), then

$$\begin{aligned} r_S^2\le & {} \frac{\ell _T(S^{-1}(\infty ))\ell _T(S(\infty ))}{K}\nonumber \\&+\frac{4\Vert \Pi TS^{-1}(\infty )-\Pi S^{-1}(\infty )\Vert \, \Vert \Pi TS(\infty )-\Pi S(\infty )\Vert }{K(K-2N_{U,\mu })}. \end{aligned}$$
(6)
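The hypotheses and the constant K of Theorem 1.1 are easy to evaluate numerically. The following Python sketch checks the two inequalities on \(N_\mu \) and \(N_{U,\mu }\), computes K from (5) and evaluates the right-hand side of (6). The function names, the argument names (`ell_in` and `ell_out` for \(\ell _T(S^{-1}(\infty ))\) and \(\ell _T(S(\infty ))\), `proj_in` and `proj_out` for the two Euclidean translation lengths) and the sample values are assumptions made for illustration.

```python
import math

def shimizu_constant(n_u_mu, n_mu):
    """Return K from (5), after checking the hypotheses of Theorem 1.1."""
    if not 0.0 <= n_mu < 0.25:
        raise ValueError("need 0 <= N_mu < 1/4")
    if not 0.0 <= n_u_mu < (3.0 - 2.0 * math.sqrt(2.0 + n_mu)) / 2.0:
        raise ValueError("need N_{U,mu} < (3 - 2 sqrt(2 + N_mu)) / 2")
    disc = 1.0 - 12.0 * n_u_mu + 4.0 * n_u_mu ** 2 - 4.0 * n_mu
    # The hypotheses give disc >= 0; max() only guards against rounding.
    return 0.5 * (1.0 + 2.0 * n_u_mu + math.sqrt(max(disc, 0.0)))

def radius_squared_bound(k, n_u_mu, ell_in, ell_out, proj_in, proj_out):
    """Right-hand side of (6)."""
    return (ell_in * ell_out / k
            + 4.0 * proj_in * proj_out / (k * (k - 2.0 * n_u_mu)))

k_translation = shimizu_constant(0.0, 0.0)    # Heisenberg translation: K = 1
k_screw = shimizu_constant(0.05, 0.1)         # sample screw parabolic data
```

The upper bound on \(N_{U,\mu }\) is exactly the condition that the quantity under the square root in (5) is non-negative, and K is the larger root of \(K^2-(1+2N_{U,\mu })K+4N_{U,\mu }+N_\mu =0\).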

If \(\mu =1\) then Theorem 1.1 becomes simpler and it also applies to subgroups of \(\mathrm{PU}(n,1)\).

Corollary 1.2

Let \(\Gamma \) be a discrete subgroup of \(\mathrm{PU}(n,1)\) or \(\mathrm{PSp}(n,1)\) containing the parabolic map T given by (1) with \(\mu =1\). Suppose \(N_U=N_{U,1}\) defined by (3) satisfies \(N_U < (\sqrt{2}-1)^2/2\). Define

$$\begin{aligned} K=\frac{1}{2}\Bigl (1+2N_{U}+\sqrt{1-12N_{U}+4N_{U}^2}\Bigr ). \end{aligned}$$

If S is any other element of \(\Gamma \) not fixing \(\infty \) and with isometric sphere of radius \(r_S\) then

$$\begin{aligned} r_S^2\le & {} \frac{\ell _T(S^{-1}(\infty ))\ell _T(S(\infty ))}{K}\nonumber \\&+\frac{4\Vert \Pi TS^{-1}(\infty )-\Pi S^{-1}(\infty )\Vert \, \Vert \Pi TS(\infty )-\Pi S(\infty )\Vert }{K(K-2N_{U})}. \end{aligned}$$

As we remarked above, T is a Heisenberg translation if and only if \(N_{U,\mu }=0\), which implies \(N_\mu =0\) and \(K=1\). In this case

$$\begin{aligned} \Vert \Pi TS^{-1}(\infty )-\Pi S^{-1}(\infty )\Vert = \Vert \Pi TS(\infty )-\Pi S(\infty )\Vert =\Vert \tau \Vert \end{aligned}$$

and so Theorem 1.1, or Corollary 1.2, is just Theorem 4.8 of Kim–Parker [16]. If in addition \(\tau =0\) then \(\ell _T(S^{-1}(\infty ))=\ell _T(S(\infty ))=|t|^{1/2}\), and we recover Kamiya [13, Thm. 3.2].

For a parabolic map T of the form (1), consider the following sub-horospherical region:

$$\begin{aligned} {\mathcal {U}}_T= & {} \left\{ (\zeta ,v,u)\in \mathbf{H}^n_{\mathbb H}:u> \frac{\ell _T(z)^2}{K-N_\mu } \right. \nonumber \\&\quad +\left. \frac{4(2K-N_\mu )\Vert \Pi T(z)-\Pi (z)\Vert ^2}{(K-N_\mu )((K-N_\mu )(K-2N_{U,\mu })-2N_{U,\mu }K)} \right\} . \end{aligned}$$
(7)

Using the definitions of \(N_{U,\mu }\), \(N_\mu \) and K, one may check that

$$\begin{aligned} (K-N_\mu )(K-2N_{U,\mu })-2N_{U,\mu }K = (K-4N_{U,\mu })2N_{U,\mu }+K(K-2N_{U,\mu })^2, \end{aligned}$$

which is positive since \(K-4N_{U,\mu }> (1-6N_{U,\mu })/2>0\). Note that when \(\mu =\pm 1\), including the case of \(\mathrm{PU}(n,1)\), we have the much simpler formula, generalising [21, eq. (3.1)]:

$$\begin{aligned} {\mathcal {U}}_T=\left\{ (\zeta ,v,u)\in \mathbf{H}^n_{\mathbb H}:u> \frac{\ell _T(z)^2}{K} +\frac{8\Vert \Pi T(z)-\Pi (z)\Vert ^2}{K(K-4N_{U,\mu })}\right\} . \end{aligned}$$
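Both the identity for the denominator in (7) and the reduction to the simpler formula when \(N_\mu =0\) can be checked numerically. The Python sketch below does this for sample values; the function names and the sample values are assumptions, and the hypotheses of Theorem 1.1 are taken for granted rather than verified.

```python
import math

def shimizu_constant(n_u_mu, n_mu):
    """K from (5); the caller is assumed to have checked the hypotheses."""
    disc = 1.0 - 12.0 * n_u_mu + 4.0 * n_u_mu ** 2 - 4.0 * n_mu
    return 0.5 * (1.0 + 2.0 * n_u_mu + math.sqrt(disc))

def height_threshold(ell, proj, n_u_mu, n_mu):
    """Right-hand side of the inequality on u in (7)."""
    k = shimizu_constant(n_u_mu, n_mu)
    denom = (k - n_mu) * (k - 2.0 * n_u_mu) - 2.0 * n_u_mu * k
    return (ell ** 2 / (k - n_mu)
            + 4.0 * (2.0 * k - n_mu) * proj ** 2 / ((k - n_mu) * denom))

def height_threshold_real_mu(ell, proj, n_u):
    """Simplified right-hand side when mu = +-1, so that N_mu = 0."""
    k = shimizu_constant(n_u, 0.0)
    return ell ** 2 / k + 8.0 * proj ** 2 / (k * (k - 4.0 * n_u))

# Sample values satisfying the hypotheses (assumption); here K = 0.6.
n_u_mu, n_mu = 0.05, 0.1
k = shimizu_constant(n_u_mu, n_mu)
lhs = (k - n_mu) * (k - 2.0 * n_u_mu) - 2.0 * n_u_mu * k
rhs = (k - 4.0 * n_u_mu) * 2.0 * n_u_mu + k * (k - 2.0 * n_u_mu) ** 2
```

For these values both sides of the identity equal 0.19, and with \(N_\mu =0\) the two threshold functions agree up to rounding.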

If H is a subgroup of G, then we say a set \({\mathcal {U}}\) is precisely invariant under H in G if \(T({\mathcal {U}})={\mathcal {U}}\) for all \(T\in H\) and \(S({\mathcal {U}})\cap {\mathcal {U}}=\emptyset \) for all \(S\in G- H\). Our second main result is a restatement of Theorem 1.1 in terms of a precisely invariant sub-horospherical region.

Theorem 1.3

Let G be a discrete subgroup of \(\mathrm{PSp}(n,1)\). Suppose that \(G_\infty \), the stabiliser of \(\infty \) in G, is a cyclic group generated by a parabolic map T of the form (1). Suppose that \(N_{U,\mu }\) and \(N_\mu \) defined by (3) and (4) satisfy \(N_\mu < 1/4\) and \(N_{U,\mu } < (3-2\sqrt{2+N_\mu })/2\) and let K be given by (5). Then the sub-horospherical region \({\mathcal {U}}_T\) given by (7) is precisely invariant under \(G_\infty \) in G.

1.3 Outline of the Proofs

All proofs of Shimizu’s lemma, and indeed of Jørgensen’s inequality, follow the same general pattern; see [10, 13, 16]. One considers the sequence \(S_{j+1}= S_jTS_j^{-1}\). From this sequence one constructs a dynamical system in terms of algebraic or geometric quantities associated with \(S_j\). The aim is to give conditions under which \(S_0\) lies in a basin of attraction, which guarantees that \(S_j\) tends to T as j tends to infinity.
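To illustrate this pattern in the simplest classical setting, the Python sketch below iterates \(S_{j+1}=S_jTS_j^{-1}\) in \(\mathrm{SL}(2,{\mathbb R})\) with T the unit translation. A direct computation shows that the lower-left entries satisfy \(c_{j+1}=-c_j^2\), so \(|c_0|<1\) gives a basin of attraction. This is the real hyperbolic plane case only, not the quaternionic dynamical system constructed in Sect. 4, and the sample matrix is an assumption.

```python
def mat_mul(a, b):
    """Product of 2x2 real matrices stored as nested tuples."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2))
                 for i in range(2))

def inverse_det_one(s):
    """Inverse of a 2x2 matrix of determinant 1."""
    (a, b), (c, d) = s
    return ((d, -b), (-c, a))

T = ((1.0, 1.0), (0.0, 1.0))          # unit translation z -> z + 1
S = ((1.0, 0.0), (0.5, 1.0))          # sample element with |c| = 0.5 < 1

# Iterate S_{j+1} = S_j T S_j^{-1} and record the lower-left entries c_j.
lower_left = [S[1][0]]
for _ in range(10):
    S = mat_mul(mat_mul(S, T), inverse_det_one(S))
    lower_left.append(S[1][0])

distance_to_T = max(abs(S[i][j] - T[i][j])
                    for i in range(2) for j in range(2))
```

Since the \(S_j\) are distinct group elements accumulating at T, no discrete group can contain both T and an element with \(0<|c|<1\); this is Shimizu’s lemma for \(\mathrm{PSL}(2,{\mathbb R})\).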

The structure of the remaining sections of this paper is as follows. In Sect. 2, we give the necessary background material for quaternionic hyperbolic space. In Sect. 3, we prove that Theorem 1.3 follows from Theorem 1.1. In Sect. 4, we construct our dynamical system. This involves the radius of the isometric spheres of \(S_j\) and \(S_j^{-1}\) and the translation lengths of T and its vertical projection at their centres. We establish recurrence relations expressing these quantities for \(S_{j+1}\) in terms of the same quantities for \(S_j\). This lays a foundation for our proof of Theorem 1.1 in Sects. 5 and 6. In Sect. 5, we rewrite the condition (6) in terms of this dynamical system (Theorem 5.1), and show that it means we are in a basin of attraction. Finally, in Sect. 6, we show this implies \(S_j\) converges to T as j tends to infinity. Thus, our proof follows the existing structure, but it is far from easy to construct a suitable dynamical system and to find a basin of attraction.

2 Background

2.1 Quaternionic Hyperbolic Space

In this section we give the necessary background material on quaternionic hyperbolic geometry. Much of it can be found in [5, 8, 16].

We begin by recalling some basic facts about the quaternions \({\mathbb {H}}\). Elements of \({\mathbb {H}}\) have the form \(z=z_1+z_2\mathbf{i}+z_3\mathbf{j}+z_4\mathbf{k}\in {\mathbb {H}}\) where \(z_i\in {\mathbb {R}}\) and \(\mathbf{i}^2 = \mathbf{j}^2 = \mathbf{k}^2 = \mathbf{i}{} \mathbf{j}{} \mathbf{k} = -1\). Let \(\overline{z}=z_1-z_2\mathbf{i}-z_3\mathbf{j}-z_4\mathbf{k}\) be the conjugate of z, and \(|z|=\sqrt{\overline{z}z}=\sqrt{z_1^2+z_2^2+z_3^2+z_4^2}\) be the modulus of z. We define \(\mathrm{Re}(z)=(z+\overline{z})/2\) to be the real part of z, and \(\mathrm{Im}(z)=(z-\overline{z})/2\) to be the imaginary part of z. Two quaternions z and w are similar if there is a non-zero quaternion q so that \(w=qzq^{-1}\). Equivalently, z and w have the same modulus and the same real part. Let \(X=(x_{ij})\in M_{p\times q}\) be a \(p\times q\) matrix over \({\mathbb {H}}\). Define the Hilbert–Schmidt norm of X to be \(\Vert X\Vert =\sqrt{\sum _{i,j}|x_{ij}|^2}\). Also the Hermitian transpose of X, denoted \(X^*\), is the conjugate transpose of X in \(M_{q\times p}\).
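These conventions can be sketched in a few lines of Python. Quaternions are stored as tuples \((z_1,z_2,z_3,z_4)\); the helper names and the sample values are assumptions made for illustration.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions stored as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def qabs(q):
    """Modulus |z| = sqrt(conj(z) z)."""
    return math.sqrt(sum(x * x for x in q))

def qinv(q):
    """Inverse of a non-zero quaternion: conj(q) / |q|^2."""
    norm_sq = sum(x * x for x in q)
    return tuple(x / norm_sq for x in qconj(q))

QI, QJ, QK = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
MINUS_ONE = (-1, 0, 0, 0)

# The defining relations i^2 = j^2 = k^2 = ijk = -1 (exact in integers).
relations = [qmul(QI, QI), qmul(QJ, QJ), qmul(QK, QK),
             qmul(qmul(QI, QJ), QK)]

# A similar quaternion w = q z q^{-1} has the same real part and modulus as z.
z = (0.5, 1.0, -2.0, 0.25)
q = (1.0, 2.0, 3.0, 4.0)
w = qmul(qmul(q, z), qinv(q))
```

The check on w illustrates one direction of the equivalence only: conjugation preserves the real part and the modulus.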

Let \({\mathbb {H}}^{n,1}\) be the quaternionic vector space of quaternionic dimension \(n+1\) with the quaternionic Hermitian form

$$\begin{aligned} \langle \mathbf{z},\mathbf{w}\rangle =\mathbf{w}^*H\mathbf{z}= \overline{w}_1z_{n+1}+\overline{w}_2z_{2}+\cdots +\overline{w}_nz_{n} +\overline{w}_{n+1}z_{1}, \end{aligned}$$
(8)

where \(\mathbf{z}\) and \(\mathbf{w}\) are the column vectors in \({\mathbb {H}}^{n,1}\) with entries \(z_1,\ldots ,z_{n+1}\) and \(w_1,\ldots ,w_{n+1}\), respectively, and H is the Hermitian matrix

$$\begin{aligned} H=\left( \begin{array}{c@{\quad }c@{\quad }c} 0 &{} 0 &{} 1 \\ 0 &{} I_{n-1} &{} 0 \\ 1 &{} 0 &{} 0\\ \end{array} \right) . \end{aligned}$$

Following [5, Sec. 2], let

$$\begin{aligned} V_0 = \{\mathbf{z} \in {\mathbb {H}}^{n,1}-\{0\}:\langle \mathbf{z},\mathbf{z}\rangle =0\},\quad V_{-} = \{\mathbf{z} \in {\mathbb {H}}^{n,1}:\langle \mathbf{z},\mathbf{z}\rangle <0\}. \end{aligned}$$

We define an equivalence relation \(\sim \) on \({\mathbb {H}}^{n,1}\) by \(\mathbf{z}\sim \mathbf{w}\) if and only if there exists a non-zero quaternion \(\lambda \) so that \(\mathbf{w}=\mathbf{z}\lambda \). Let \([\mathbf{z}]\) denote the equivalence class of \(\mathbf{z}\). Let \({\mathbb {P}}:{\mathbb {H}}^{n,1}-\{0\}\longrightarrow {\mathbb {H}}{\mathbb {P}}^n\) be the right projection map given by \({\mathbb {P}}:\mathbf{z}\longmapsto [\mathbf{z}]\). If \(z_{n+1}\ne 0\) then \({\mathbb {P}}\) is given by

$$\begin{aligned} {\mathbb {P}}(z_1,\ldots ,z_n, z_{n+1})^\mathrm{T}=(z_1z_{n+1}^{-1},\ldots ,z_n z_{n+1}^{-1})^\mathrm{T}\in {{\mathbb {H}}}^n. \end{aligned}$$

We also define

$$\begin{aligned} {\mathbb {P}}(z_1, 0, \ldots ,0,0)^\mathrm{T}=\infty . \end{aligned}$$

The Siegel domain model of quaternionic hyperbolic n-space is defined to be \(\mathbf{H}_{{\mathbb {H}}}^n={\mathbb {P}}(V_-)\) with boundary \(\partial \mathbf{H}_{{\mathbb {H}}}^n={\mathbb {P}}(V_0)\). It is clear that \(\infty \in \partial \mathbf{H}_{{\mathbb {H}}}^n\). The Bergman metric on \(\mathbf{H}_{{\mathbb {H}}}^n\) is given by the distance formula

$$\begin{aligned} \cosh ^2\frac{\rho (z,w)}{2}= \frac{\langle \mathbf{z},\mathbf{w}\rangle \langle \mathbf{w},\mathbf{z}\rangle }{\langle \mathbf{z},\mathbf{z}\rangle \langle \mathbf{w},\mathbf{w}\rangle }, \quad \hbox {where}\ z,w \in \mathbf{H}_{{\mathbb {H}}}^n, \ \mathbf{z}\in {\mathbb {P}}^{-1}(z),\mathbf{w}\in {\mathbb {P}}^{-1}(w). \end{aligned}$$

This expression is independent of the choice of lifts \(\mathbf{z}\) and \(\mathbf{w}\).
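The distance formula and its independence of the lift can be checked numerically. The Python sketch below works with \(n=2\) and uses the vectors \((-u,0,1)^\mathrm{T}\), which satisfy \(\langle \mathbf{z},\mathbf{z}\rangle =-2u<0\), as lifts of two points; the helper names and sample values are assumptions.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions stored as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def herm(z, w):
    """<z, w> = conj(w1) z3 + conj(w2) z2 + conj(w3) z1, the form (8) for n = 2."""
    total = (0.0, 0.0, 0.0, 0.0)
    for wi, zi in ((w[0], z[2]), (w[1], z[1]), (w[2], z[0])):
        total = tuple(x + y for x, y in zip(total, qmul(qconj(wi), zi)))
    return total

def bergman_distance(z, w):
    """rho(z, w) from cosh^2(rho/2) = <z,w><w,z> / (<z,z><w,w>)."""
    num = qmul(herm(z, w), herm(w, z))[0]
    den = herm(z, z)[0] * herm(w, w)[0]
    return 2.0 * math.acosh(math.sqrt(num / den))

def scale_right(z, lam):
    """Another lift of the same point: multiply each entry on the right by lam."""
    return tuple(qmul(zi, lam) for zi in z)

ZERO, ONE = (0.0, 0.0, 0.0, 0.0), (1.0, 0.0, 0.0, 0.0)
z = ((-1.0, 0.0, 0.0, 0.0), ZERO, ONE)      # <z, z> = -2
w = ((-4.0, 0.0, 0.0, 0.0), ZERO, ONE)      # <w, w> = -8
rho = bergman_distance(z, w)
rho_other_lift = bergman_distance(scale_right(z, (1.0, 2.0, 0.0, -1.0)), w)
```

For these two points \(\cosh ^2(\rho /2)=25/16\), so \(\rho =\log 4\), and replacing \(\mathbf{z}\) by \(\mathbf{z}\lambda \) leaves the value unchanged.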

Quaternionic hyperbolic space is foliated by horospheres based at a boundary point, which we take to be \(\infty \). Each horosphere has the structure of the \((4n-1)\)-dimensional Heisenberg group \({\mathfrak N}_{4n-1}\) with three-dimensional centre. We define horospherical coordinates on \(\overline{\mathbf{H}^n_{\mathbb H}}-\{\infty \}\) as \(z=(\zeta ,v,u)\), where \(u\in [0,\infty )\) is the height of the horosphere containing z and \((\zeta ,v)\in {\mathfrak N}_{4n-1}\) is a point of this horosphere. If \(u=0\) then z is in \(\partial \mathbf{H}^n_{\mathbb H}-\{\infty \}\) which we identify with \({\mathfrak N}_{4n-1}\) by writing \((\zeta ,v,0)=(\zeta ,v)\). Where necessary, we lift points of \(\overline{\mathbf{H}^n_{\mathbb H}}\) written in horospherical coordinates to \(V_0\cup V_-\) via the map \(\psi :(\mathfrak {N}_{4n-1}\times [0,\infty ))\cup \{\infty \} \longrightarrow V_0\cup V_-\) given by

$$\begin{aligned} \psi (\zeta ,v,u)=\left( \begin{matrix} -\Vert \zeta \Vert ^2-u+v \\ \sqrt{2}\zeta \\ 1 \\ \end{matrix}\right) ,\quad \psi (\infty )=\left( \begin{matrix} 1 \\ 0 \\ \vdots \\ 0 \end{matrix}\right) . \end{aligned}$$
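A direct computation gives \(\langle \psi (\zeta ,v,u),\psi (\zeta ,v,u)\rangle =-2u\), so \(\psi \) does take values in \(V_0\cup V_-\). The Python sketch below confirms this for \(n=2\); the helper names and the sample point are assumptions.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions stored as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def herm(z, w):
    """<z, w> = conj(w1) z3 + conj(w2) z2 + conj(w3) z1, the form (8) for n = 2."""
    total = (0.0, 0.0, 0.0, 0.0)
    for wi, zi in ((w[0], z[2]), (w[1], z[1]), (w[2], z[0])):
        total = tuple(x + y for x, y in zip(total, qmul(qconj(wi), zi)))
    return total

def psi(zeta, v, u):
    """Lift of the point with horospherical coordinates (zeta, v, u), for n = 2."""
    norm_sq = sum(x * x for x in zeta)
    first = (-norm_sq - u, v[1], v[2], v[3])        # -|zeta|^2 - u + v
    middle = tuple(math.sqrt(2.0) * x for x in zeta)
    return (first, middle, (1.0, 0.0, 0.0, 0.0))

PSI_INFINITY = ((1.0, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 0.0))

zeta = (0.3, -1.2, 0.5, 2.0)       # sample point (assumption)
v = (0.0, 0.7, -0.4, 1.1)          # purely imaginary
u = 0.8
norm_interior = herm(psi(zeta, v, u), psi(zeta, v, u))
norm_boundary = herm(psi(zeta, v, 0.0), psi(zeta, v, 0.0))
norm_infinity = herm(PSI_INFINITY, PSI_INFINITY)
```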

The Cygan metric on the Heisenberg group is the metric corresponding to the norm

$$\begin{aligned} |(\zeta ,v)|_H =|\Vert \zeta \Vert ^2+v|^{1/2} =(\Vert \zeta \Vert ^4+|v|^2)^{1/4}. \end{aligned}$$

It is given by

$$\begin{aligned} d_H((\zeta _1, v_1),(\zeta _2, v_2))= & {} |(\zeta _1,v_1)^{-1}(\zeta _2, v_2)|_H \\= & {} |\Vert \zeta _1-\zeta _2\Vert ^2 -v_1+v_2-2\mathrm{Im}(\zeta _2^*\zeta _1)|^{1/2}. \end{aligned}$$

As in [16, p. 303], we extend the Cygan metric to \(\overline{\mathbf{H}^n_{\mathbb H}}-\{\infty \}\) by

$$\begin{aligned} d_H((\zeta _1,v_1,u_1),(\zeta _2,v_2,u_2)) =|\Vert \zeta _1-\zeta _2\Vert ^2+|u_1-u_2| -v_1+v_2-2\mathrm{Im}(\zeta _2^*\zeta _1)|^{1/2}. \end{aligned}$$
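On the boundary, the explicit formula for \(d_H\) agrees with the definition via the group law, and \(d_H\) is invariant under left Heisenberg translation. The Python sketch below checks both facts for \(n=2\) at sample points; the helper names and the points are assumptions.

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def cygan_norm(zeta, v):
    """|(zeta, v)|_H = (|zeta|^4 + |v|^2)^(1/4)."""
    norm_sq = sum(x * x for x in zeta)
    return (norm_sq ** 2 + sum(x * x for x in v)) ** 0.25

def cygan_distance(p, q):
    """d_H(p, q) = | |zeta1 - zeta2|^2 - v1 + v2 - 2 Im(conj(zeta2) zeta1) |^(1/2)."""
    (z1, v1), (z2, v2) = p, q
    real = sum((x - y) ** 2 for x, y in zip(z1, z2))
    prod = qmul(qconj(z2), z1)
    imag = [-v1[i] + v2[i] - 2.0 * prod[i] for i in (1, 2, 3)]
    return (real ** 2 + sum(x * x for x in imag)) ** 0.25

def heis_mul(g, h):
    """Group law (zeta1 + zeta2, v1 + v2 + 2 Im(conj(zeta2) zeta1))."""
    (z1, v1), (z2, v2) = g, h
    prod = qmul(qconj(z2), z1)
    twice_im = (0.0, 2.0 * prod[1], 2.0 * prod[2], 2.0 * prod[3])
    return (tuple(x + y for x, y in zip(z1, z2)),
            tuple(x + y + w for x, y, w in zip(v1, v2, twice_im)))

def heis_inv(g):
    z, v = g
    return (tuple(-x for x in z), tuple(-x for x in v))

p = ((0.3, -1.2, 0.5, 2.0), (0.0, 0.7, -0.4, 1.1))
q = ((1.0, 0.4, -0.6, 0.2), (0.0, -0.3, 0.9, 0.5))
g = ((-0.5, 0.8, 0.1, 0.3), (0.0, 1.5, -2.0, 0.4))

via_formula = cygan_distance(p, q)
via_group_law = cygan_norm(*heis_mul(heis_inv(p), q))
after_translation = cygan_distance(heis_mul(g, p), heis_mul(g, q))
```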

2.2 The Group \(\mathrm{Sp}(n,1)\)

The group \(\mathrm{Sp}(n,1)\) is the subgroup of \(\mathrm{GL}(n+1,{\mathbb H})\) preserving the Hermitian form given by (8). That is, \(S\in \mathrm{Sp}(n,1)\) if and only if \(\langle S(\mathbf{z}),S(\mathbf{w})\rangle =\langle \mathbf{z},\mathbf{w}\rangle \) for all \(\mathbf{z}\) and \(\mathbf{w}\) in \({\mathbb {H}}^{n,1}\). From this we find \(S^{-1}=H^{-1} S^*H\). That is S and \(S^{-1}\) have the form:

$$\begin{aligned} S=\left( \begin{array}{c@{\quad }c@{\quad }c} a&{} \gamma ^*&{} b \\ \alpha &{} A&{} \beta \\ c &{} \delta ^*&{} d\\ \end{array} \right) ,\quad S^{-1}=\left( \begin{array}{c@{\quad }c@{\quad }c} \overline{d}&{} \beta ^*&{} \overline{b} \\ \delta &{} A^*&{} \gamma \\ \overline{c} &{} \alpha ^*&{} \overline{a}\\ \end{array} \right) , \end{aligned}$$
(9)

where \(a, b, c, d\in {\mathbb {H}}\), A is an \((n-1)\times (n-1)\) matrix over \({\mathbb {H}}\), and \(\alpha , \beta , \gamma , \delta \) are column vectors in \({\mathbb {H}}^{n-1}\).
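As a sanity check of the formula \(S^{-1}=H^{-1}S^*H\), the Python sketch below applies it to the matrix (1) from Sect. 1.2 in the case \(n=2\), taking the \(1\times 1\) block U to be \(\mu \), and verifies that the product with T is the identity. The helper names and the sample values of \(\mu \), \(\tau \) and t are assumptions.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions stored as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def qscale(s, q):
    return tuple(s * x for x in q)

ZERO = (0.0, 0.0, 0.0, 0.0)
ONE = (1.0, 0.0, 0.0, 0.0)

def mat_mul(a, b):
    """Product of 3x3 quaternionic matrices (entries multiplied in order)."""
    out = []
    for i in range(3):
        row = []
        for j in range(3):
            entry = ZERO
            for k in range(3):
                term = qmul(a[i][k], b[k][j])
                entry = tuple(x + y for x, y in zip(entry, term))
            row.append(entry)
        out.append(row)
    return out

def mat_star(a):
    """Hermitian transpose."""
    return [[qconj(a[j][i]) for j in range(3)] for i in range(3)]

H = [[ZERO, ZERO, ONE], [ZERO, ONE, ZERO], [ONE, ZERO, ZERO]]
IDENTITY = [[ONE, ZERO, ZERO], [ZERO, ONE, ZERO], [ZERO, ZERO, ONE]]

# Sample data (assumption): a unit quaternion mu, with U = mu as a 1x1 block.
theta = 0.1
mu = (math.cos(theta), math.sin(theta), 0.0, 0.0)
tau = (1.0, 0.5, -0.3, 0.2)
t = (0.0, 0.4, -0.1, 0.7)
tau_norm_sq = sum(x * x for x in tau)
corner = qmul((-tau_norm_sq, t[1], t[2], t[3]), mu)     # (-|tau|^2 + t) mu
T = [[mu, qscale(-math.sqrt(2.0), qmul(qconj(tau), mu)), corner],
     [ZERO, mu, qscale(math.sqrt(2.0), qmul(tau, mu))],
     [ZERO, ZERO, mu]]

T_inverse = mat_mul(mat_mul(H, mat_star(T)), H)   # H^{-1} = H
product = mat_mul(T, T_inverse)
error = max(abs(product[i][j][c] - IDENTITY[i][j][c])
            for i in range(3) for j in range(3) for c in range(4))
```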

Using the identity \(I_{n+1}=SS^{-1}\) we see that the entries of S must satisfy:

$$\begin{aligned} 1= & {} a\overline{d}+\gamma ^*\delta +b\overline{c}, \end{aligned}$$
(10)
$$\begin{aligned} 0= & {} a\overline{b}+\Vert \gamma \Vert ^2+b\overline{a}, \end{aligned}$$
(11)
$$\begin{aligned} 0= & {} \alpha \overline{d}+A\delta +\beta \overline{c}, \end{aligned}$$
(12)
$$\begin{aligned} I_{n-1}= & {} \alpha \beta ^*+AA^*+\beta \alpha ^*, \end{aligned}$$
(13)
$$\begin{aligned} 0= & {} \alpha \overline{b}+A\gamma +\beta \overline{a}, \end{aligned}$$
(14)
$$\begin{aligned} 0= & {} c\overline{d}+\Vert \delta \Vert ^2+d\overline{c}. \end{aligned}$$
(15)

Similarly, equating the entries of \(I_{n+1}=S^{-1}S\) yields:

$$\begin{aligned} 1= & {} \overline{d}a+\beta ^*\alpha +\overline{b}c,\\ 0= & {} \overline{d}\gamma ^*+\beta ^*A+\overline{b}\delta ^*,\\ 0= & {} \overline{d}b+\Vert \beta \Vert ^2+\overline{b}d,\\ 0= & {} \delta a+A^*\alpha +\gamma c,\\ I_{n-1}= & {} \delta \gamma ^*+A^*A+\gamma \delta ^*,\\ 0= & {} \overline{c}a+\Vert \alpha \Vert ^2+\overline{a}c. \end{aligned}$$

An \((n-1)\times (n-1)\) quaternionic matrix U is in \(\mathrm{Sp}(n-1)\) if and only if \(UU^*=U^*U=I_{n-1}\). Using the above equations, we can verify the following lemma.

Lemma 2.1

(cf. [16, Lem. 1.1]) If S is as above then \(A-\alpha c^{-1}\delta ^*\) and \(A-\beta b^{-1}\gamma ^*\) are in \(\mathrm{Sp}(n-1).\) Also we have

$$\begin{aligned} \beta -\alpha c^{-1}d= & {} -(A-\alpha c^{-1}\delta ^*)\delta \overline{c}^{-1}, \\ \gamma -\delta \overline{c}^{-1}\overline{a}= & {} -(A-\alpha c^{-1}\delta ^*)^*\alpha c^{-1},\\ \alpha -\beta b^{-1}a= & {} -(A-\beta b^{-1}\gamma ^*)\gamma \overline{b}^{-1}, \\ \delta -\gamma \overline{b}^{-1}\overline{d}= & {} -(A-\beta b^{-1}\gamma ^* )^*\beta b^{-1}. \end{aligned}$$

It is obvious that \(V_0\) and \(V_{-}\) are invariant under \(\mathrm{Sp}(n,1)\). This means that if we can show that the action of \(\mathrm{Sp}(n,1)\) is compatible with the projection \({\mathbb {P}}\), then we can make \(\mathrm{Sp}(n,1)\) act on quaternionic hyperbolic space and its boundary. The action of \(S\in \mathrm{Sp}(n,1)\) on \(\mathbf{H}_{\mathbb {H}}^n\cup \partial \mathbf{H}_{\mathbb {H}}^n\) is given as follows. Let \(\mathbf{z}\in V_-\cup V_0\) be a vector that projects to z. Then

$$\begin{aligned} S(z)= {\mathbb {P}}S\mathbf{z}. \end{aligned}$$

Note that if \(\widetilde{\mathbf{z}}\) is any other lift of z, then \(\widetilde{\mathbf{z}}=\mathbf{z}\lambda \) for some non-zero quaternion \(\lambda \). We have

$$\begin{aligned} {\mathbb {P}}S\widetilde{\mathbf{z}}={\mathbb {P}}S\mathbf{z}\lambda ={\mathbb {P}}S\mathbf{z}=S(z), \end{aligned}$$

and so this action is independent of the choice of lift. The key point here is that the group acts on the left and projection acts on the right, hence they commute.

Let S have the form (9). If \(c=0\) then from (15) we have \(\Vert \delta \Vert =0\) and so \(\delta \) is the zero vector in \({\mathbb H}^{n-1}\). Similarly, \(\alpha \) is also the zero vector. This means that S (projectively) fixes \(\infty \). On the other hand, if \(c\ne 0\) then S does not fix \(\infty \). Moreover, \(S^{-1}(\infty )\) and \(S(\infty )\) in \({\mathfrak N}_{4n-1}=\partial \mathbf{H}^n_{\mathbb H}-\{\infty \}\) have Heisenberg coordinates

$$\begin{aligned} S^{-1}(\infty )=(\delta \overline{c}^{-1}/\sqrt{2},\, \mathrm{Im}(\overline{d}\overline{c}^{-1})),\quad S(\infty )=(\alpha c^{-1}/\sqrt{2},\,\mathrm{Im}(ac^{-1})). \end{aligned}$$

For any \(r>0\), it is not hard to check (compare [21, Lem. 3.4]) that S sends the Cygan sphere with centre \(S^{-1}(\infty )\) and radius r to the Cygan sphere with centre \(S(\infty )\) and radius \(\widetilde{r}=1/(|c|r)\). The isometric sphere of S is the Cygan sphere with radius \(r_S=1/|c|^{1/2}\) centred at \(S^{-1}(\infty )\). It is sent by S to the isometric sphere of \(S^{-1}\), which is the sphere with centre \(S(\infty )\) and radius \(r_S\). In particular, if r and \(\widetilde{r}\) are as above, then \(\widetilde{r}=r_S^2/r\).

We define \(\mathrm{PSp}(n,1)=\mathrm{Sp}(n,1)/\{\pm I_{n+1}\}\), which is the group of holomorphic isometries of \(\mathbf{H}_{\mathbb {H}}^n\). Following Chen and Greenberg [5], we say that a non-trivial element g of \(\mathrm{Sp}(n,1)\) is:

  (i) elliptic if it has a fixed point in \(\mathbf{H}_{{\mathbb {H}}}^n\);

  (ii) parabolic if it has exactly one fixed point, and this point lies in \(\partial \mathbf{H}_{{\mathbb {H}}}^n\);

  (iii) loxodromic if it has exactly two fixed points, both lying in \(\partial \mathbf{H}_{{\mathbb {H}}}^n\).

2.3 Parabolic Elements of \(\mathrm{Sp}(n,1)\)

The main aim of this section is to show that any parabolic motion T can be normalised to the form given by (1). We use the following result, which we refer to as Johnson’s theorem.

Lemma 2.2

(Johnson [11]) Consider the affine map on \({\mathbb H}\) given by \(T_0:z\longmapsto \nu z\overline{\mu }+\tau \) where \(\tau \in {\mathbb H}-\{0\}\) and \(\mu ,\,\nu \in {\mathbb H}\) with \(|\mu |=|\nu |=1.\)

  (i) If \(\nu \) is not similar to \(\mu \) then \(T_0\) has a fixed point in \({\mathbb H}\).

  (ii) If \(\nu =\mu \) and \(\mu \ne \pm 1\) then \(T_0\) has a fixed point in \({\mathbb H}\) if and only if \(\mu \tau =\tau \overline{\mu }\).

We now characterise parabolic elements of \(\mathrm{Sp}(n,1)\) (compare [2, Thm. 3.1 (iii)]).

Proposition 2.3

Let \(T\in \mathrm{Sp}(n,1)\) be a parabolic map that fixes \(\infty .\) Then T may be conjugated into the standard form (1). That is

$$\begin{aligned} T=\left( \begin{array}{c@{\quad }c@{\quad }c} \mu &{} -\sqrt{2}\tau ^*\mu &{} (-\Vert \tau \Vert ^2+t)\mu \\ 0 &{} U &{} \sqrt{2}\tau \mu \\ 0 &{} 0 &{} \mu \end{array}\right) , \end{aligned}$$

where \((\tau ,t)\in {\mathfrak N}_{4n-1}\), \(U\in \mathrm{Sp}(n-1)\) and \(\mu \in {\mathbb H}\) with \(|\mu |=1\) satisfying (2). That is

$$\begin{aligned} {\left\{ \begin{array}{ll} U\tau =\mu \tau ,\ U^*\tau =\overline{\mu }\tau ,\ \mu \tau \ne \tau \overline{\mu } &{} \hbox {if }\tau \ne 0 \hbox { and }\mu \ne \pm 1, \\ U\tau =\mu \tau ,\ U^*\tau =\overline{\mu }\tau &{} \hbox {if }\tau \ne 0 \hbox { and }\mu =\pm 1, \\ \mu t\ne t\overline{\mu } &{} \hbox {if }\tau =0\hbox { and }\mu \ne \pm 1, \\ t\ne 0 &{} \hbox {if }\tau =0\hbox { and }\mu =\pm 1. \end{array}\right. } \end{aligned}$$

Recall that if \(U=I_{n-1}\) and \(\mu =1\) (or \(U=-I_{n-1}\) and \(\mu =-1\)), then T is a Heisenberg translation. Otherwise, we say that T is screw parabolic.

Note that if \(U\tau =\mu \tau =\tau \overline{\mu }\) and \(\mu \ne \pm 1\), then \(\zeta =\tau (1-\overline{\mu }^2)^{-1}\) is a fixed point of \(\zeta \longmapsto U\zeta \overline{\mu }+\tau \). Furthermore, if \(\tau =0\), \(\mu t=t\overline{\mu }\) and \(\mu \ne \pm 1\), then \((\zeta ,v)=(0,t(1-\overline{\mu }^2)^{-1})\) is a fixed point of T (note that, when \(\mu t=t\overline{\mu }\), if t is pure imaginary then so is \(t(1-\overline{\mu }^2)^{-1}\)).
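The fixed point formula in this remark can be verified numerically. The Python sketch below takes \(n=2\), \(U=\mu =\cos \theta +\mathbf{i}\sin \theta \) and \(\tau =\mathbf{j}\), which satisfy \(\mu \tau =\tau \overline{\mu }\), and checks that \(\zeta =\tau (1-\overline{\mu }^2)^{-1}\) is fixed by \(\zeta \longmapsto U\zeta \overline{\mu }+\tau \). The helper names and sample values are assumptions.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions stored as (real, i, j, k) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def qinv(q):
    """Inverse of a non-zero quaternion: conj(q) / |q|^2."""
    norm_sq = sum(x * x for x in q)
    return tuple(x / norm_sq for x in qconj(q))

theta = 0.7                                        # sample angle (assumption)
mu = (math.cos(theta), math.sin(theta), 0.0, 0.0)
mu_bar = qconj(mu)
tau = (0.0, 0.0, 1.0, 0.0)                         # tau = j

# Size of mu*tau - tau*conj(mu); it vanishes for this choice of tau.
commutation_defect = max(abs(x - y)
                         for x, y in zip(qmul(mu, tau), qmul(tau, mu_bar)))

# zeta = tau (1 - conj(mu)^2)^{-1}
one_minus = tuple(x - y for x, y in zip((1.0, 0.0, 0.0, 0.0), qmul(mu_bar, mu_bar)))
zeta = qmul(tau, qinv(one_minus))

# Image of zeta under zeta -> U zeta conj(mu) + tau with U = mu.
image = tuple(x + y for x, y in zip(qmul(qmul(mu, zeta), mu_bar), tau))
residual = max(abs(x - y) for x, y in zip(image, zeta))
```

The same computation with \(\tau \) replaced by a purely imaginary t satisfying \(\mu t=t\overline{\mu }\) covers the second statement; in this example the resulting quaternion has zero real part, as claimed.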

Proof

Suppose that T, written in the general form (9), fixes \(\infty \). Then it must be block upper triangular, that is \(c=0\) and \(\alpha =\delta =0\), the zero vector in \({\mathbb H}^{n-1}\). This means that \(\psi (\infty )\) is an eigenvector of T with (left) eigenvalue a. Thus, if T is non-loxodromic, we must have \(|a|=1\). From (10) we also have \(a\overline{d}=1\). Using \(|a|=1\), we see that \(a=d\). We define \(\mu :=a=d\in {\mathbb H}\) with \(|\mu |=1\).

If \(o=(0,0)\) is the origin in \({\mathfrak N}_{4n-1}\), then suppose T maps o to \((\tau ,t)\in {\mathfrak N}_{4n-1}\). This means that

$$\begin{aligned} bd^{-1}=-\Vert \tau \Vert ^2+t,\quad \beta d^{-1}=\sqrt{2}\tau . \end{aligned}$$

Hence \(b=(-\Vert \tau \Vert ^2+t)\mu \) and \(\beta =\sqrt{2}\tau \mu \). Also, \(A\in \mathrm{Sp}(n-1)\) and so we write \(A=U\). It is easy to see from (14) that \(U\gamma +\sqrt{2}\tau =0\). Hence, T has the form

$$\begin{aligned} T=\left( \begin{array}{c@{\quad }c@{\quad }c} \mu &{} -\sqrt{2}\tau ^*U &{} (-\Vert \tau \Vert ^2+t)\mu \\ 0 &{} U &{} \sqrt{2}\tau \mu \\ 0 &{} 0 &{} \mu \end{array}\right) . \end{aligned}$$

Since T fixes \(\infty \) and is assumed to be parabolic, we need to find conditions on U, \(\mu \) and \(\tau \) that imply T does not fix any finite point of \({\mathfrak N}_{4n-1}=\partial \mathbf{H}^n_{\mathbb H}-\{\infty \}\).

Without loss of generality, we may suppose that U is a diagonal map whose entries \(u_i\) all satisfy \(|u_i|=1\). Writing the entries of \(\zeta \) and \(\tau \in {\mathbb H}^{n-1}\) as \(\zeta _i\) and \(\tau _i\) for \(i=1,\ldots ,n-1\), we see that a fixed point \((\zeta ,v)\) of T is a simultaneous solution to the equations

$$\begin{aligned} -\Vert \zeta \Vert ^2+v= & {} \mu (-\Vert \zeta \Vert ^2+v)\overline{\mu } -2\tau ^*U\zeta \overline{\mu }-\Vert \tau \Vert ^2+t, \\ \zeta _i= & {} u_i\zeta _i\overline{\mu }+\tau _i, \end{aligned}$$

for \(i=1,\ldots ,n-1\). If any of the equations \(\zeta _i = u_i\zeta _i\overline{\mu }+\tau _i\) has a solution, then conjugating by a translation if necessary, we assume this solution is 0.

If all the equations \(\zeta _i = u_i\zeta _i\overline{\mu }+\tau _i\) have a solution, then, as above, \(\zeta =0\) and so \(\tau =0\). The first equation becomes

$$\begin{aligned} v=\mu v\overline{\mu }+t. \end{aligned}$$

By Johnson’s theorem, Lemma 2.2, if \(\mu \ne \pm 1\) this has no solution provided \(\mu t\ne t\overline{\mu }\). Clearly, if \(\mu =\pm 1\) then it has no solution if and only if \(t\ne 0\).

On the other hand, if there are some values of i for which \(\zeta _i=u_i\zeta _i\overline{\mu }+\tau _i\) has no solution, then by Johnson’s theorem, Lemma 2.2, for each such value of i, the corresponding \(u_i\) must be similar to \(\mu \) (and \(\tau _i\ne 0\) else 0 is a solution). Hence, without loss of generality, we may choose coordinates so that whenever \(\tau _i\ne 0\) we have \(u_i=\mu \). In particular, \(u_i\tau _i=\mu \tau _i\) and so \(U\tau =\mu \tau \). Furthermore, again using Johnson’s theorem, Lemma 2.2, if \(\mu \ne \pm 1\) then \(\mu \tau \ne \tau \overline{\mu }\).

Observe that \(u_i\tau _i=\mu \tau _i\) and \(\tau _i\ne 0\) imply

$$\begin{aligned} \overline{u}_i\tau _i =\overline{u}_i(\mu \tau _i)(\tau _i^{-1}\overline{\mu }\tau _i) =\overline{u}_i(u_i\tau _i)(\tau _i^{-1}\overline{\mu }\tau _i) =\overline{\mu }\tau _i. \end{aligned}$$

Hence \(U^*\tau =\overline{\mu }\tau \), or equivalently \(\tau ^*U=\tau ^*\mu \) and so T has the required form. \(\square \)

The action of T on \(\overline{\mathbf{H}^n_{\mathbb H}}-\{\infty \}\) is given by

$$\begin{aligned} T(\zeta ,v,u) =(U\zeta \overline{\mu }+\tau ,t+\mu v\overline{\mu }-2\mathrm{Im}(\tau ^*\mu \zeta \overline{\mu }),u). \end{aligned}$$

Observe that T maps the horosphere of height \(u\in [0,\infty )\) to itself. The Cygan translation length of T at \((\zeta ,v)\), denoted \(\ell _T(\zeta ,v)=d_H(T(\zeta ,v),(\zeta ,v)) =d_H(T(\zeta ,v,u),(\zeta ,v,u))\), is:

$$\begin{aligned} \ell _T(\zeta ,v)= & {} |(U\zeta \overline{\mu }+\tau -\zeta ,\, t+\mu v\overline{\mu }-v +2\mathrm{Im}((\zeta ^*-\tau ^*)(U\zeta \overline{\mu }+\tau )) )|_H \nonumber \nonumber \\= & {} |\Vert U\zeta \overline{\mu }+\tau -\zeta \Vert ^2 +t+\mu v\overline{\mu }-v +2\mathrm{Im}((\zeta ^*-\tau ^*)(U\zeta \overline{\mu }+\tau )) |^{1/2} \nonumber \nonumber \\= & {} |2\zeta ^*U\zeta \overline{\mu }-2\tau ^*\mu \zeta \overline{\mu } +2\zeta ^*\tau -\Vert \tau \Vert ^2+t-2\Vert \zeta \Vert ^2 +\mu v\overline{\mu }-v|^{1/2}.\nonumber \\ \end{aligned}$$
(16)

The vertical projection of T acting on \({\mathbb H}^{n-1}\) is \(\zeta \longmapsto U\zeta \overline{\mu }+\tau \). Its Euclidean translation length is \(\Vert \Pi T(\zeta ,v)-\Pi (\zeta ,v)\Vert =\Vert U\zeta \overline{\mu }+\tau -\zeta \Vert \). The following corollary is easy to show.

Corollary 2.4

Let \((\zeta ,v)\in {\mathfrak N}_{4n-1}\) and let \(\Pi :{\mathfrak N}_{4n-1}\longrightarrow {\mathbb H}^{n-1}\) be the vertical projection given by \(\Pi :(\zeta ,v)\longmapsto \zeta .\) If T is given by (1) then

$$\begin{aligned} \Vert \Pi T(\zeta ,v)-\Pi (\zeta ,v)\Vert \le \ell _T(\zeta ,v). \end{aligned}$$
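The inequality can be sanity-checked numerically in the lowest-dimensional case. The following is a minimal sketch, not part of the argument: it assumes \(n-1=1\), so that \(\zeta \), \(\tau \) and \(U=u\) are single quaternions, and it takes \(u=\mu \), which is one simple way to satisfy \(U\tau =\mu \tau \) and \(U^*\tau =\overline{\mu }\tau \). The helper names (`qmul`, `qconj`, `translation_lengths`, and so on) are hypothetical. It evaluates the quaternion inside the modulus in the last line of (16) and compares its real part with \(-\Vert U\zeta \overline{\mu }+\tau -\zeta \Vert ^2\).

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions stored as tuples (w, x, y, z)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(q):
    """Quaternionic conjugate."""
    return (q[0], -q[1], -q[2], -q[3])

def qadd(*qs):
    """Sum of any number of quaternions."""
    return tuple(sum(c) for c in zip(*qs))

def qscale(s, q):
    """Multiply a quaternion by a real scalar."""
    return tuple(s * c for c in q)

def qabs(q):
    """Modulus of a quaternion."""
    return math.sqrt(sum(c * c for c in q))

def qprod(*qs):
    """Ordered product of several quaternions."""
    out = (1.0, 0.0, 0.0, 0.0)
    for q in qs:
        out = qmul(out, q)
    return out

def translation_lengths(mu, tau, t, zeta, v):
    """Return (Euclidean length, Cygan length, E) for n-1 = 1 and u = mu.

    mu is a unit quaternion, t and v are purely imaginary quaternions,
    and E is the quaternion inside the modulus in the last line of (16).
    """
    u = mu  # assumption: u = mu, so u*tau = mu*tau and conj(u)*tau = conj(mu)*tau
    mub, zb, tb = qconj(mu), qconj(zeta), qconj(tau)
    xi = qadd(qprod(u, zeta, mub), tau, qscale(-1.0, zeta))
    E = qadd(
        qscale(2.0, qprod(zb, u, zeta, mub)),
        qscale(-2.0, qprod(tb, mu, zeta, mub)),
        qscale(2.0, qprod(zb, tau)),
        (-qabs(tau) ** 2 - 2.0 * qabs(zeta) ** 2, 0.0, 0.0, 0.0),
        t,
        qprod(mu, v, mub),
        qscale(-1.0, v),
    )
    return qabs(xi), math.sqrt(qabs(E)), E
```

In this special case the real part of `E` equals minus the square of the Euclidean length, so the Cygan length, which also sees the imaginary part, can only be larger.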

The following proposition relates the Cygan translation lengths of T at two points of \({\mathfrak N}_{4n-1}\). It is a generalisation of [21, Lem. 1.5].

Proposition 2.5

Let T be given by (1). Let \((\zeta ,v)\) and \((\xi ,r)\) be two points in \({\mathfrak {N}}_{4n-1}.\) Write \((\zeta ,v)^{-1}(\xi ,r)=(\eta ,s).\) Then

$$\begin{aligned} \ell _T(\xi ,r)^2\le \ell _T(\zeta ,v)^2 +4\Vert \Pi T(\zeta ,v)-\Pi (\zeta ,v)\Vert \,\Vert \eta \Vert +2N_{U,\mu }\Vert \eta \Vert ^2+N_\mu |s|. \end{aligned}$$

Proof

We write \((\xi ,r)=(\zeta ,v)(\eta ,s)=(\zeta +\eta ,\,v+s+\eta ^*\zeta -\zeta ^*\eta )\). Then

$$\begin{aligned}&2\xi ^*U\xi \overline{\mu }-2\tau ^*\mu \xi \overline{\mu }+2\xi ^*\tau - \Vert \tau \Vert ^2+t-2\Vert \xi \Vert ^2+\mu r\overline{\mu }-r \\&\quad =2(\zeta +\eta )^*U(\zeta +\eta )\overline{\mu }-2\tau ^* \mu (\zeta +\eta )\overline{\mu }+2(\zeta +\eta )^*\tau -\Vert \tau \Vert ^2+t \\&\qquad -\,2\Vert \zeta +\eta \Vert ^2+\mu (v+s+\eta ^*\zeta -\zeta ^*\eta ) \overline{\mu }-v-s-\eta ^*\zeta +\zeta ^*\eta \\&\quad = 2\zeta ^*U\zeta \overline{\mu }-2\tau ^*\mu \zeta \overline{\mu }+2\zeta ^*\tau -\Vert \tau \Vert ^2+t -2\Vert \zeta \Vert ^2+\mu v\overline{\mu }-v \\&\qquad +\,2\eta ^*(U\zeta \overline{\mu }+\tau -\zeta ) -2(\mu \zeta ^*U^*+\tau ^*-\zeta ^*)U\eta \overline{\mu } +2\eta ^*(U\eta -\eta \mu )\overline{\mu }\\&\qquad +\,(\mu s-s\mu )\overline{\mu }. \end{aligned}$$

Therefore, using (16),

$$\begin{aligned} \ell _T(\xi ,r)^2= & {} |2\xi ^*U\xi \overline{\mu }-2\tau ^*\mu \xi \overline{\mu }+2\xi ^*\tau -\Vert \tau \Vert ^2+t -2\Vert \xi \Vert ^2+\mu r\overline{\mu }-r| \\\le & {} |2\zeta ^*U\zeta \overline{\mu }-2\tau ^*\mu \zeta \overline{\mu } +2\zeta ^*\tau -\Vert \tau \Vert ^2+t -2\Vert \zeta \Vert ^2 +\mu v\overline{\mu }-v| \\&+\,2|\eta ^*(U\zeta \overline{\mu }+\tau -\zeta )| +2|(\mu \zeta ^*U^*+\tau ^*-\zeta ^*)U\eta \overline{\mu }|\\&+\,2\Vert \eta \Vert \,\Vert U\eta \overline{\mu }-\eta \Vert + |\mu s-s\mu | \\\le & {} \ell _T(\zeta ,v)^2+4\Vert \eta \Vert \, \Vert U\zeta \overline{\mu }+\tau -\zeta \Vert +2N_{U,\mu }\Vert \eta \Vert ^2 +N_\mu |s|. \end{aligned}$$

The result follows since \(U\zeta \overline{\mu }+\tau -\zeta =\Pi T(\zeta ,v)-\Pi (\zeta ,v)\). \(\square \)

3 A Precisely Invariant Sub-horospherical Region

In this section, we show how Theorem 1.3 follows from Theorem 1.1. This argument follows [21, Lem. 3.3, Lem. 3.4].

Proof of Theorem 1.3

Let \(z=(\zeta ,v,u)\) be any point on the Cygan sphere with radius r and centre \((\zeta _0,v_0,0)=(\zeta _0,v_0)\in {\mathfrak N}_{4n-1}\subset \partial \mathbf{H}^n_{\mathbb H}\) and write \((\eta ,s)=(\zeta ,v)^{-1}(\zeta _0,v_0)\). Then we have

$$\begin{aligned} r^2=d_H((\zeta ,v,u),(\zeta _0,v_0,0))^2 =|\Vert \eta \Vert ^2+u+s| =((\Vert \eta \Vert ^2+u)^2+|s|^2)^{1/2}. \end{aligned}$$

In particular, \(r^2\ge \Vert \eta \Vert ^2+u\) and \(r^2\ge |s|\). We claim that the Cygan sphere with centre \((\zeta _0,v_0)\) and radius r does not intersect \({\mathcal {U}}_T\) when r satisfies:

$$\begin{aligned} r^2\le \frac{\ell _T(\zeta _0,v_0)^2}{K} +\frac{4\Vert \Pi T(\zeta _0,v_0)-\Pi (\zeta _0,v_0)\Vert ^2}{K(K-2N_{U,\mu })}. \end{aligned}$$
(17)

To see this, using Proposition 2.5 to compare \(\ell _T(\zeta _0,v_0)\) with \(\ell _T(\zeta ,v)=\ell _T(z)\), we have

$$\begin{aligned} u\le & {} r^2-\Vert \eta \Vert ^2 \\= & {} \frac{K}{K-N_\mu }\,r^2-\frac{N_\mu }{K-N_\mu }\,r^2 -\Vert \eta \Vert ^2 \\\le & {} \frac{K}{K-N_\mu }\left( \frac{\ell _T(\zeta _0,v_0)^2}{K} \!+\!\frac{4\Vert \Pi T(\zeta _0,v_0)-\Pi (\zeta _0,v_0)\Vert ^2}{K(K-2N_{U,\mu })} \right) -\frac{N_\mu }{K-N_\mu }\,|s|-\Vert \eta \Vert ^2 \\\le & {} \frac{1}{K-N_\mu }(\ell _T(z)^2+4\Vert \Pi T(z)-\Pi (z) \Vert \,\Vert \eta \Vert +2N_{U,\mu }\Vert \eta \Vert ^2+N_\mu |s|) \\&+\frac{4}{(K-N_{\mu })(K-2N_{U,\mu })}(\Vert \Pi T(z)-\Pi (z) \Vert +N_{U,\mu }\Vert \eta \Vert )^2\\&-\,\frac{N_\mu }{K-N_\mu }\,|s|-\Vert \eta \Vert ^2 \\= & {} \frac{\ell _T(z)^2}{K-N_\mu }+\frac{4\Vert \Pi T(z)-\Pi (z)\Vert ^2}{(K-N_\mu )(K-2N_{U,\mu })} +\frac{4K\Vert \Pi T(z)-\Pi (z)\Vert }{(K-N_\mu )(K-2N_{U,\mu })}\,\Vert \eta \Vert \\&-\,\frac{(K-N_\mu )(K-2N_{U,\mu })-2N_{U,\mu }K}{(K-N_\mu )(K-2N_{U,\mu })} \,\Vert \eta \Vert ^2 \\\le & {} \frac{\ell _T(z)^2}{K-N_\mu } +\frac{4(2K-N_\mu )\Vert \Pi T(z)-\Pi (z)\Vert ^2}{(K-N_\mu )((K-N_\mu )(K-2N_{U,\mu })-2N_{U,\mu }K)}, \end{aligned}$$

where the last inequality follows by finding the value of \(\Vert \eta \Vert \) maximising the previous line. Hence, when r satisfies (17) the Cygan sphere with centre \((\zeta _0,v_0)\) and radius r lies outside \({\mathcal {U}}_T\).
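For completeness, here is a sketch of the maximisation used in the last inequality. The abbreviations \(P\), \(D\), \(a\) and \(b\) are introduced only for this computation and are not notation used elsewhere. Write \(P=\Vert \Pi T(z)-\Pi (z)\Vert \) and \(D=(K-N_\mu )(K-2N_{U,\mu })-2N_{U,\mu }K\), and assume \(D>0\) so that the quadratic in \(\Vert \eta \Vert \) is bounded above. The terms involving \(\Vert \eta \Vert \) have the form \(a\Vert \eta \Vert -b\Vert \eta \Vert ^2\), and so

$$\begin{aligned} a= & {} \frac{4KP}{(K-N_\mu )(K-2N_{U,\mu })},\quad b=\frac{D}{(K-N_\mu )(K-2N_{U,\mu })}, \\ a\Vert \eta \Vert -b\Vert \eta \Vert ^2\le & {} \frac{a^2}{4b} =\frac{4K^2P^2}{(K-N_\mu )(K-2N_{U,\mu })D}, \\ \frac{4P^2}{(K-N_\mu )(K-2N_{U,\mu })} +\frac{4K^2P^2}{(K-N_\mu )(K-2N_{U,\mu })D}= & {} \frac{4(D+K^2)P^2}{(K-N_\mu )(K-2N_{U,\mu })D} =\frac{4(2K-N_\mu )P^2}{(K-N_\mu )D}, \end{aligned}$$

where the final equality uses the identity \(D+K^2=(K-2N_{U,\mu })(2K-N_\mu )\).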

Now suppose that the radius \(r_S\) of the isometric sphere of S satisfies the bound (6). Consider the Cygan sphere with centre \(S^{-1}(\infty )=(\zeta _0,v_0)\) and radius r with equality in (17). That is

$$\begin{aligned} r^2=\frac{\ell _T(\zeta _0,v_0)^2}{K} +\frac{4\Vert \Pi T(\zeta _0,v_0)-\Pi (\zeta _0,v_0)\Vert ^2}{K(K-2N_{U,\mu })}. \end{aligned}$$
(18)

We know that S sends this sphere to the Cygan sphere with centre \(S(\infty )=(\widetilde{\zeta }_0,\widetilde{v}_0)\) and radius \(\widetilde{r}=r_S^2/r\). We claim that \(\widetilde{r}\) satisfies (17). It will follow from this claim that both spheres are disjoint from \({\mathcal {U}}_T\). Since S sends the exterior of the first sphere to the interior of the second, it will follow that \(S({\mathcal {U}}_T)\cap {\mathcal {U}}_T=\emptyset \).

In order to verify the claim, use (18) and (6) to check that:

$$\begin{aligned} \widetilde{r}^2= & {} r_S^4/r^2 \\\le & {} \frac{1}{r^2}\left( \frac{\ell _T(\zeta _0,v_0)\ell _T(\widetilde{\zeta _0},\widetilde{v}_0)}{K} +\frac{4\Vert \Pi T(\zeta _0,v_0)-\Pi (\zeta _0,v_0)\Vert \, \Vert \Pi T(\widetilde{\zeta }_0,\widetilde{v}_0) -\Pi (\widetilde{\zeta }_0,\widetilde{v}_0)\Vert }{K(K-2N_{U,\mu })}\right) ^2 \\\le & {} \left( \frac{\ell _T(\widetilde{\zeta }_0,\widetilde{v}_0)^2}{K} +\frac{4\Vert \Pi T(\widetilde{\zeta }_0,\widetilde{v}_0) -\Pi (\widetilde{\zeta }_0,\widetilde{v}_0)\Vert ^2}{K(K-2N_{U,\mu })}\right) . \end{aligned}$$

Thus \(\widetilde{r}\) satisfies (17) as claimed.
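The final inequality in this verification is an instance of the Cauchy–Schwarz inequality. As a sketch, with shorthand introduced only here, write \(\ell =\ell _T(\zeta _0,v_0)\), \(\widetilde{\ell }=\ell _T(\widetilde{\zeta }_0,\widetilde{v}_0)\), \(p=\Vert \Pi T(\zeta _0,v_0)-\Pi (\zeta _0,v_0)\Vert \) and \(\widetilde{p}=\Vert \Pi T(\widetilde{\zeta }_0,\widetilde{v}_0)-\Pi (\widetilde{\zeta }_0,\widetilde{v}_0)\Vert \). Since \(K>2N_{U,\mu }\), both weights \(1/K\) and \(4/K(K-2N_{U,\mu })\) are positive, and so

$$\begin{aligned} \left( \frac{\ell \,\widetilde{\ell }}{K} +\frac{4p\,\widetilde{p}}{K(K-2N_{U,\mu })}\right) ^2\le & {} \left( \frac{\ell ^2}{K}+\frac{4p^2}{K(K-2N_{U,\mu })}\right) \left( \frac{\widetilde{\ell }^2}{K} +\frac{4\widetilde{p}^2}{K(K-2N_{U,\mu })}\right) \\= & {} r^2\left( \frac{\widetilde{\ell }^2}{K} +\frac{4\widetilde{p}^2}{K(K-2N_{U,\mu })}\right) , \end{aligned}$$

where the last line uses (18). Dividing by \(r^2\) gives the stated bound on \(\widetilde{r}^2\).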

Therefore, if \(S\in G-G_\infty \) then \({\mathcal {U}}_T\) does not intersect its image under S. On the other hand, clearly T maps \({\mathcal {U}}_T\) to itself. Thus every element of \(G_\infty =\langle T\rangle \) maps \({\mathcal {U}}_T\) to itself. Hence \({\mathcal {U}}_T\) is precisely invariant under \(G_\infty \) in G. This proves Theorem 1.3. \(\square \)

4 The Dynamical System Involving S and T

4.1 The Sequence \(S_{j+1}=S_jTS_j^{-1}\)

Let T be a parabolic map fixing \(\infty \) written in the normal form (1) and let S be a general element of \(\mathrm{Sp}(n,1)\) written in the standard form (9). We are particularly interested in the case where S does not fix \(\infty \). We define a sequence of elements \(\{S_j\}\) in the group \(\langle S,\,T\rangle \) by \(S_0=S\) and \(S_{j+1}=S_jTS_j^{-1}\) for \(j\ge 0\). We write \(S_j\) in the standard form (9) with each entry having the subscript j. Then \(S_{j+1}\) is given by:

$$\begin{aligned} \left( \begin{array}{c@{\quad }c@{\quad }c} a_{j+1}&{} \gamma ^*_{j+1}&{} b_{j+1} \\ \alpha _{j+1} &{} A_{j+1}&{} \beta _{j+1}\\ c_{j+1} &{} \delta ^*_{j+1}&{} d_{j+1}\\ \end{array} \right)= & {} \left( \begin{array}{c@{\quad }c@{\quad }c} a_{j} &{} \gamma ^*_{j} &{} b_{j} \\ \alpha _{j} &{} A_{j}&{} \beta _{j}\\ c_{j} &{} \delta ^*_{j}&{} d_{j}\\ \end{array} \right) \left( \begin{array}{c@{\quad }c@{\quad }c} \mu &{} -\sqrt{2}\tau ^*\mu &{} (-\Vert \tau \Vert ^2+t)\mu \\ 0 &{} U &{} \sqrt{2}\tau \mu \\ 0 &{} 0 &{} \mu \end{array}\right) \nonumber \\&\times \left( \begin{array}{c@{\quad }c@{\quad }c} \overline{d}_{j}&{} \beta ^*_{j}&{} \overline{b}_{j} \\ \delta _{j} &{} A^*_{j}&{} \gamma _{j}\\ \overline{c}_{j} &{} \alpha ^*_{j}&{} \overline{a}_{j}\\ \end{array} \right) . \end{aligned}$$
(19)

Performing the matrix multiplication of (19), we obtain recurrence relations relating the entries of \(S_{j+1}\) with the entries of \(S_j\):

$$\begin{aligned} a_{j+1}= & {} \gamma _j^*U\delta _j-\sqrt{2}a_j\tau ^*\mu \delta _j +\sqrt{2}\gamma _j^*\tau \mu \overline{c}_j -a_j(\Vert \tau \Vert ^2-t)\mu \overline{c}_j\nonumber \\&+\,a_j\mu \overline{d}_j+b_j\mu \overline{c}_j, \end{aligned}$$
(20)
$$\begin{aligned} \gamma _{j+1}= & {} A_jU^*\gamma _j-\sqrt{2}A_j\overline{\mu }\tau \overline{a}_j +\sqrt{2}\alpha _j\overline{\mu }\tau ^*\gamma _j -\alpha _j\overline{\mu }(\Vert \tau \Vert ^2+t)\overline{a}_j\nonumber \\&+\,\alpha _j\overline{\mu }\,\overline{b}_j+\beta _j\overline{\mu }\,\overline{a}_j, \end{aligned}$$
(21)
$$\begin{aligned} b_{j+1}= & {} \gamma _j^*U\gamma _j-\sqrt{2}a_j\tau ^*\mu \gamma _j+ \sqrt{2}\gamma _j^*\tau \mu \overline{a}_j-a_j(\Vert \tau \Vert ^2-t)\mu \overline{a}_j\nonumber \\&+\,a_j\mu \overline{b}_j+b_j\mu \overline{a}_j, \end{aligned}$$
(22)
$$\begin{aligned} \alpha _{j+1}= & {} A_jU\delta _j-\sqrt{2}\alpha _j\tau ^*\mu \delta _j+\sqrt{2}A_j\tau \mu \overline{c}_j -\alpha _j(\Vert \tau \Vert ^2-t)\mu \overline{c}_j\nonumber \\&+\,\alpha _j\mu \overline{d}_j+\beta _j\mu \overline{c}_j, \end{aligned}$$
(23)
$$\begin{aligned} A_{j+1}= & {} A_jUA_j^*-\sqrt{2}\alpha _j\tau ^*\mu A_j^*+\sqrt{2}A_j\tau \mu \alpha _j^*-\alpha _j(\Vert \tau \Vert ^2-t) \mu \alpha _j^*\nonumber \\&+\,\alpha _j\mu \beta _j^*+\beta _j\mu \alpha _j^*, \end{aligned}$$
(24)
$$\begin{aligned} \beta _{j+1}= & {} A_jU\gamma _j-\sqrt{2}\alpha _j\tau ^*\mu \gamma _j+ \sqrt{2}A_j\tau \mu \overline{a}_j-\alpha _j(\Vert \tau \Vert ^2-t) \mu \overline{a}_j\nonumber \\&+\,\alpha _j\mu \overline{b}_j+\beta _j\mu \overline{a}_j, \end{aligned}$$
(25)
$$\begin{aligned} c_{j+1}= & {} \delta _j^*U\delta _j- \sqrt{2}c_j\tau ^*\mu \delta _j+\sqrt{2}\delta _j^*\tau \mu \overline{c}_j-c_j(\Vert \tau \Vert ^2-t)\mu \overline{c}_j\nonumber \\&+\,c_j\mu \overline{d}_j+d_j\mu \overline{c}_j, \end{aligned}$$
(26)
$$\begin{aligned} \delta _{j+1}= & {} A_jU^*\delta _j-\sqrt{2}A_j\overline{\mu }\tau \overline{c}_j+ \sqrt{2}\alpha _j\overline{\mu }\tau ^*\delta _j-\alpha _j \overline{\mu }(\Vert \tau \Vert ^2+t)\overline{c}_j\nonumber \\&+\,\beta _j\overline{\mu }\,\overline{c}_j+\alpha _j\overline{\mu }\,\overline{d}_j, \end{aligned}$$
(27)
$$\begin{aligned} d_{j+1}= & {} \delta _j^*U\gamma _j-\sqrt{2}c_j\tau ^*\mu \gamma _j+\sqrt{2} \delta _j^*\tau \mu \overline{a}_j-c_j(\Vert \tau \Vert ^2-t)\mu \overline{a}_j\nonumber \\&+\,c_j\mu \overline{b}_j+d_j\mu \overline{a}_j. \end{aligned}$$
(28)

We also define \(\widetilde{S}_{j+1}=S_j^{-1}TS_j\) and we denote its entries \(\widetilde{a}_{j+1}\) and so on. We will only need

$$\begin{aligned} \widetilde{c}_{j+1}= & {} \alpha _j^*U\alpha _j -\sqrt{2}\overline{c}_j\tau ^*\mu \alpha _j+\sqrt{2}\alpha _j^*\tau \mu c_j -\overline{c}_j(\Vert \tau \Vert ^2-t)\mu c_j \nonumber \\&+\,\overline{c}_j\mu a_j+\overline{a}_j\mu c_j. \end{aligned}$$
(29)

These recurrence relations are rather complicated. We want to simplify them by extracting geometrical information. Specifically, we want to find relations between the radii of the isometric spheres of \(S_j^{\pm 1}\) and \(S_{j+1}^{\pm 1}\), the Cygan translation lengths of T at the centres of these isometric spheres and the Euclidean translation lengths of T at the vertical projections of these centres.

Suppose \(S_j^{-1}(\infty )\) and \(S_j(\infty )\) have Heisenberg coordinates \((\zeta _j,r_j)\) and \((\omega _j,s_j)\), respectively. So:

$$\begin{aligned} S_j^{-1}(\infty )= & {} \left( \begin{array}{c} -\Vert \zeta _j\Vert ^2+r_j \\ \sqrt{2}\zeta _j \\ 1 \end{array}\right) =\left( \begin{array}{c} \overline{d}_j\overline{c}_j^{-1} \\ \delta _j\overline{c}_j^{-1} \\ 1 \end{array}\right) ,\nonumber \\ S_j(\infty )= & {} \left( \begin{array}{c} -\Vert \omega _j\Vert ^2+s_j \\ \sqrt{2}\omega _j \\ 1 \end{array}\right) =\left( \begin{array}{c} a_jc_j^{-1} \\ \alpha _jc_j^{-1} \\ 1 \end{array}\right) . \end{aligned}$$
(30)

We now show how to relate \(c_{j+1}\) to \(c_j\) and \((\zeta _j,r_j)=S_j^{-1}(\infty )\) and how to relate \(\widetilde{c}_{j+1}\) to \(c_j\) and \((\omega _j,s_j)=S_j(\infty )\). Geometrically, this enables us to relate the radii of the isometric spheres of \(S_j^{\pm 1}TS_j^{\mp 1}\) to the radii and centres of the isometric spheres of \(S_j\) and \(S_j^{-1}\). Specifically, using (26) and (29) we have:

$$\begin{aligned} c_j^{-1}c_{j+1}\overline{c}_j^{-1}= & {} 2\zeta _j^*U\zeta _j-2\tau ^*\mu \zeta _j+2\zeta _j^*\tau \mu -\Vert \tau \Vert ^2\mu +t\mu \nonumber \\&-\,2\Vert \zeta _j\Vert ^2\mu +\mu r_j-r_j\mu , \end{aligned}$$
(31)
$$\begin{aligned} \overline{c}_j^{-1}\widetilde{c}_{j+1}c_j^{-1}= & {} 2\omega _j^*U\omega _j-2\tau ^*\mu \omega _j+2\omega _j^*\tau \mu -\Vert \tau \Vert ^2\mu +t\mu \nonumber \\&-\,2\Vert \omega _j\Vert ^2\mu +\mu s_j-s_j\mu . \end{aligned}$$
(32)

Furthermore, the vertical projections of the centres of the isometric spheres of \(S_j\) and \(S_j^{-1}\) are \(\Pi (S_j^{-1}(\infty ))=\zeta _j\) and \(\Pi (S_j(\infty ))=\omega _j\). Their images under the vertical projection of T are \(\Pi (TS_j^{-1}(\infty ))=U\zeta _j\overline{\mu }+\tau \) and \(\Pi (TS_j(\infty ))=U\omega _j\overline{\mu }+\tau \). We define

$$\begin{aligned} \xi _j:= & {} \Pi (TS_j^{-1}(\infty ))-\Pi (S_j^{-1}(\infty )) =U\zeta _j\overline{\mu }+\tau -\zeta _j\nonumber \\= & {} \frac{1}{\sqrt{2}}(U\delta _j\overline{c}_j^{-1} \overline{\mu }-\delta _j\overline{c}_j^{-1})+\tau , \end{aligned}$$
(33)
$$\begin{aligned} \eta _j:= & {} \Pi (TS_j(\infty ))-\Pi (S_j(\infty ))=U\omega _j\overline{\mu }+\tau -\omega _j\nonumber \\= & {} \frac{1}{\sqrt{2}}(U\alpha _jc_j^{-1}\overline{\mu }-\alpha _jc_j^{-1})+\tau , \end{aligned}$$
(34)
$$\begin{aligned} B_j:= & {} A_j-\alpha _jc_j^{-1}\delta ^*_j. \end{aligned}$$
(35)

Note that Lemma 2.1 implies \(B_j\in \mathrm{Sp}(n-1)\). Also, \(\Vert \xi _j\Vert \) and \(\Vert \eta _j\Vert \) are the Euclidean translation lengths of the vertical projection of T at the vertical projections of the centres of the isometric spheres of \(S_j\) and \(S_j^{-1}\), respectively. The next lemma enables us to get information about these translation lengths in terms of the radii of the isometric spheres of \(S_j\) and \(S_j^{\pm 1}TS_j^{\mp 1}\).

Lemma 4.1

If \(c_j,\) \(\widetilde{c}_j,\) \(\xi _j\) and \(\eta _j\) are given by (26),  (29),  (33) and (34),  then

$$\begin{aligned} 0 = 2\Vert \xi _j\Vert ^2 +2\mathrm{Re}(c_j^{-1}c_{j+1}\overline{c}_j^{-1}\overline{\mu }), \quad 0 = 2\Vert \eta _j\Vert ^2 +2\mathrm{Re}(\overline{c}_j^{-1}\widetilde{c}_{j+1}c_j^{-1}\overline{\mu }). \end{aligned}$$

Proof

We only prove the first identity. Writing out \(2\mathrm{Re}(c_j^{-1}c_{j+1}\overline{c}_j^{-1}\overline{\mu })\) from (31), we obtain

$$\begin{aligned} 2\mathrm{Re}(c_j^{-1}c_{j+1}\overline{c}_j^{-1}\overline{\mu })= & {} 2\zeta _j^*U\zeta _j\overline{\mu }-2\tau ^*\mu \zeta _j\overline{\mu } +2\zeta _j^*\tau -\Vert \tau \Vert ^2+t\\&-\,2\Vert \zeta _j\Vert ^2+\mu r_j\overline{\mu }-r_j +2\mu \zeta _j^*U^*\zeta _j-2\mu \zeta _j^*\overline{\mu }\tau +2\tau ^*\zeta _j\\&-\,\Vert \tau \Vert ^2-t-2\Vert \zeta _j\Vert ^2 -\mu r_j\overline{\mu }+r_j \\= & {} -2(\mu \zeta _j^*U^*+\tau ^*-\zeta _j^*) (U\zeta _j\overline{\mu }+\tau -\zeta _j), \end{aligned}$$

where we have used \(\tau ^*\mu =\tau ^*U\). The result follows since \(\xi _j=U\zeta _j\overline{\mu }+\tau -\zeta _j\). \(\square \)

We now find the centres of the isometric spheres of \(S_{j+1}\) and \(S_{j+1}^{-1}\) in terms of the other geometric quantities we have discussed above.

Lemma 4.2

Let \(S_j^{-1}(\infty )=(\zeta _j,r_j)\) and \(S_j(\infty )=(\omega _j,s_j)\). Let \(\xi _j\) and \(\eta _j\) be given by (33) and (34). Then

$$\begin{aligned} \zeta _{j+1}= & {} \frac{1}{\sqrt{2}}\delta _{j+1}\overline{c}_{j+1}^{-1} =\omega _j-B_jU^*\xi _j\overline{c}_j\overline{c}_{j+1}^{-1}, \end{aligned}$$
(36)
$$\begin{aligned} -\Vert \zeta _{j+1}\Vert ^2+r_{j+1}= & {} \overline{d}_{j+1}\overline{c}_{j+1}^{-1}\nonumber \\= & {} -\Vert \omega _j\Vert ^2+s_j +\overline{c}_j^{-1}\overline{\mu }\,\overline{c}_j\overline{c}_{j+1}^{-1}\nonumber \\&+\,2\omega _j^*(B_jU^*\xi _j\overline{c}_j\overline{c}_{j+1}^{-1}), \end{aligned}$$
(37)
$$\begin{aligned} \omega _{j+1}= & {} \frac{1}{\sqrt{2}}\alpha _{j+1}c_{j+1}^{-1} =\omega _j+B_j\xi _j\mu \overline{c}_jc_{j+1}^{-1}, \end{aligned}$$
(38)
$$\begin{aligned} -\Vert \omega _{j+1}\Vert ^2+s_{j+1}= & {} a_{j+1}c_{j+1}^{-1}\nonumber \\= & {} -\Vert \omega _j\Vert ^2+s_j+\overline{c}_j^{-1}\mu \overline{c}_jc_{j+1}^{-1}\nonumber \\&-\,2\omega _j^*(B_j\xi _j\mu \overline{c}_jc_{j+1}^{-1}). \end{aligned}$$
(39)

In particular, 

$$\begin{aligned} \xi _{j+1}= & {} U\zeta _{j+1}\overline{\mu }+\tau -\zeta _{j+1} =\eta _j -U(B_jU^*\xi _j\overline{c}_j\overline{c}_{j+1}^{-1})\overline{\mu }\nonumber \\&+\,(B_jU^*\xi _j\overline{c}_j\overline{c}_{j+1}^{-1}), \end{aligned}$$
(40)
$$\begin{aligned} \eta _{j+1}= & {} U\omega _{j+1}\overline{\mu }+\tau -\omega _{j+1} =\eta _j+U(B_j\xi _j\mu \overline{c}_jc_{j+1}^{-1})\overline{\mu }\nonumber \\&-\,(B_j\xi _j\mu \overline{c}_jc_{j+1}^{-1}). \end{aligned}$$
(41)

Proof

We have

$$\begin{aligned} a_{j+1}= & {} \gamma _j^*U\delta _j-\sqrt{2}a_j\tau ^*\mu \delta _j +\sqrt{2}\gamma _j^*\tau \mu \overline{c}_j -a_j(\Vert \tau \Vert ^2-t)\mu \overline{c}_j+a_j\mu \overline{d}_j +b_j\mu \overline{c}_j\\= & {} a_jc_j^{-1}c_{j+1}+(\gamma _j^*-a_jc_j^{-1}\delta _j^*) (U\delta _j\overline{c}_j^{-1}\overline{\mu } -\delta _j\overline{c}_j^{-1}+\sqrt{2}\tau )\mu \overline{c}_j \\&+\,(\gamma _j^*\delta _j\overline{c}_j^{-1} -a_jc_j^{-1}\delta _j^*\delta _j\overline{c}_j^{-1} +b_j-a_jc_j^{-1}d_j)\mu \overline{c}_j, \\= & {} a_jc_j^{-1}c_{j+1}+\overline{c}_j^{-1}\mu \overline{c}_j -\overline{c}_j^{-1}\alpha _j^*B_j(U\delta _j\overline{c}_j^{-1}\overline{\mu } -\delta _j\overline{c}_j^{-1}+\sqrt{2}\tau )\mu \overline{c}_j. \end{aligned}$$

In the last line we used (10) and (15) to substitute for \(\gamma _j^*\delta _j\) and \(\delta _j^*\delta _j\) and Lemma 2.1 to write \(\gamma ^*_j-a_jc_j^{-1}\delta _j^*=-\overline{c}_j^{-1}\alpha _j^*B_j\). Now using the definitions of \(s_j\), \(\omega _j\) and \(\xi _j\) from (30) and (33) we obtain (39).

The other identities follow similarly. When proving the identities for \(\zeta _{j+1}\) and \(-\Vert \zeta _{j+1}\Vert ^2+r_{j+1}\), we also use \(U^*\tau =\overline{\mu }\tau \). \(\square \)

The following corollary, along with Proposition 2.5, will enable us to compare the Cygan translation length of T at \(S_{j+1}^{-1}(\infty )\) and \(S_{j+1}(\infty )\) with its Cygan translation lengths at \(S_j^{-1}(\infty )\) and \(S_j(\infty )\).

Corollary 4.3

Write \(S_j^{-1}(\infty )=(\zeta _j,r_j)\) and \(S_j(\infty )=(\omega _j,s_j)\) in Heisenberg coordinates. Then

$$\begin{aligned} (\omega _j,s_j)^{-1}(\zeta _{j+1},r_{j+1})= & {} (-B_jU^*\xi _j\overline{c}_j\overline{c}_{j+1}^{-1},\ \mathrm{Im}( \overline{c}_j^{-1}\overline{\mu }\,\overline{c}_j\overline{c}_{j+1}^{-1} )), \\ (\omega _j,s_j)^{-1}(\omega _{j+1},s_{j+1})= & {} (B_j\xi _j\mu \overline{c}_jc_{j+1}^{-1},\ \mathrm{Im}(\overline{c}_j^{-1}\mu \overline{c}_jc_{j+1}^{-1})). \end{aligned}$$

4.2 Translation Lengths of T at \(S_j^{-1}(\infty )\) and \(S_j(\infty )\)

We are now ready to define the main quantities which we use for defining the recurrence relation between \(S_{j+1}\) and \(S_j\). Recall that \(S_j\) and \(S_j^{-1}\) have isometric spheres of radius \(r_{S_j}\) with centres \(S_j^{-1}(\infty )\) and \(S_j(\infty )\), respectively. We write \(\ell _T(S_j^{\mp 1}(\infty ))\) for the Cygan translation length of T at the centres of these isometric spheres and \(\Vert \Pi TS_j^{\mp 1}(\infty )-\Pi S_j^{\mp 1}(\infty )\Vert \) for the Euclidean translation length of the vertical projection of T at the images of these centres under the vertical projection. The quantities \(X_j\), \(\widetilde{X}_j\), \(Y_j\) and \(\widetilde{Y}_j\) are each the ratio of one of these translation lengths to the radius of the isometric sphere. Specifically, they are defined by:

$$\begin{aligned} X_j= & {} \frac{\ell _T(S_j^{-1}(\infty ))}{r_{S_j}}, \quad {Y_j=\frac{\Vert \Pi TS_j^{-1}(\infty )-\Pi S_j^{-1}(\infty )\Vert }{r_{S_j}},} \\ \widetilde{X}_j= & {} \frac{\ell _T(S_j(\infty ))}{r_{S_j}}, \quad ~~~\widetilde{Y}_j=\frac{\Vert \Pi TS_j(\infty )-\Pi S_j(\infty )\Vert }{r_{S_j}}. \end{aligned}$$

Observe that Corollary 2.4 immediately implies \(Y_j\le X_j\) and \(\widetilde{Y}_j\le \widetilde{X}_j\). Using (16), (31) and (32), we see that in terms of the matrix entries they are given by:

$$\begin{aligned} X_j^2= & {} |c_j^{-1}c_{j+1}\overline{c}_j^{-1}|\,|c_j|\nonumber \\= & {} |2\zeta _j^*U\zeta _j-2\tau ^*\mu \zeta _j+2\zeta _j^*\tau \mu -(\Vert \tau \Vert ^2-t)\mu -2\Vert \zeta _j\Vert ^2\mu \nonumber \\&+\,\mu r_j-r_j\mu |\,|c_j|, \end{aligned}$$
(42)
$$\begin{aligned} \widetilde{X}_j^2= & {} |\overline{c}_j^{-1}\widetilde{c}_{j+1}c_j^{-1}|\,|c_j|\nonumber \\= & {} |2\omega _j^*U\omega _j-2\tau ^*\mu \omega _j+2\omega _j^*\tau \mu -(\Vert \tau \Vert ^2-t)\mu -2\Vert \omega _j\Vert ^2\mu \nonumber \\&+\,\mu s_j-s_j\mu |\,|c_j|, \end{aligned}$$
(43)
$$\begin{aligned} Y_j^2= & {} \Vert \xi _j\Vert ^2|c_j|=\Vert U\zeta _j\overline{\mu }+\tau -\zeta _j\Vert ^2|c_j|, \end{aligned}$$
(44)
$$\begin{aligned} \widetilde{Y}_j^2= & {} \Vert \eta _j\Vert ^2|c_j|= \Vert U\omega _j\overline{\mu }+\tau -\omega _j\Vert ^2|c_j|. \end{aligned}$$
(45)

In Sect. 6, we will show that if the condition (6) of our main theorem does not hold then the sequence \(S_{j+1}=S_jTS_j^{-1}\) converges to T in the topology induced by the Hilbert–Schmidt norm on \(\mathrm{PSp}(n,1)\). To do so, we need the following two lemmas giving \(X_{j+1}\), \(\widetilde{X}_{j+1}\), \(Y_{j+1}\) and \(\widetilde{Y}_{j+1}\) in terms of \(X_j\), \(\widetilde{X}_j\), \(Y_j\) and \(\widetilde{Y}_j\).

Lemma 4.4

We claim that

$$\begin{aligned} X_{j+1}^2\le & {} X_j^2\widetilde{X}_j^2+4Y_j\widetilde{Y}_j+2N_{U,\mu }+N_\mu , \end{aligned}$$
(46)
$$\begin{aligned} \widetilde{X}_{j+1}^2\le & {} X_j^2\widetilde{X}_j^2 +4Y_j\widetilde{Y}_j+2N_{U,\mu }+N_\mu . \end{aligned}$$
(47)

Proof

Writing \(S_j^{-1}(\infty )\) and \(S_j(\infty )\) in Heisenberg coordinates and using Proposition 2.5 and Corollary 4.3, we have

$$\begin{aligned}&\ell _T(S_{j+1}^{-1}(\infty ))^2 \\&\quad \le \ell _T(S_j(\infty ))^2 +4\Vert \Pi TS_j(\infty )-\Pi S_j(\infty )\Vert \, \Vert -B_jU^*\xi _j\overline{c}_j\overline{c}_{j+1}^{-1} \Vert \\&\qquad +\,2N_{U,\mu }\Vert -B_jU^*\xi _j\overline{c}_j\overline{c}_{j+1}^{-1} \Vert ^2+N_\mu | \mathrm{Im}(\overline{c}_j^{-1} \overline{\mu }\,\overline{c}_j\overline{c}_{j+1}^{-1})| \\&\quad \le \ell _T(S_j(\infty ))^2 +4\Vert \eta _j\Vert \,\Vert \xi _j\Vert \,|c_j|\,|c_{j+1}|^{-1} +2N_{U,\mu }\Vert \xi _j\Vert ^2|c_j|^2|c_{j+1}|^{-2}\\&\qquad +\,N_\mu |c_{j+1}|^{-1}. \end{aligned}$$

Now multiply both sides by \(|c_{j+1}|=1/r_{S_{j+1}}^2\) and use \(\ell _T(S_{j+1}^{-1}(\infty ))=X_{j+1}r_{S_{j+1}}\) and \(\ell _T(S_j(\infty ))=\widetilde{X}_jr_{S_j}\). This gives

$$\begin{aligned} X_{j+1}^2\le \widetilde{X}_j^2|c_{j+1}|\,|c_j|^{-1} +4\Vert \eta _j\Vert \,\Vert \xi _j\Vert \,|c_j|+2N_{U,\mu }\Vert \xi _j\Vert ^2|c_j|^2 |c_{j+1}|^{-1}+N_\mu . \end{aligned}$$

Finally, we use \(|c_{j+1}|\,|c_j|^{-1}=X_j^2\), \(\Vert \xi _j\Vert \,|c_j|^{1/2}=Y_j\) and \(\Vert \eta _j\Vert \,|c_j|^{1/2}=\widetilde{Y}_j\). This gives

$$\begin{aligned} X_{j+1}^2 \le X_j^2\widetilde{X}_j^2 +4Y_j\widetilde{Y}_j+2N_{U,\mu }Y_j^2X_j^{-2}+N_\mu . \end{aligned}$$

The inequality (46) follows since \(Y_j\le X_j\). The inequality (47) follows similarly. \(\square \)

We now estimate \(Y_{j+1}\) and \(\widetilde{Y}_{j+1}\) in terms of \(X_j\), \(\widetilde{X}_j\), \(Y_j\) and \(\widetilde{Y}_j\).

Lemma 4.5

We claim that

$$\begin{aligned} Y_{j+1}^2\le & {} \widetilde{Y}_j^2X_j^2+2N_{U,\mu }Y_j\widetilde{Y}_j +N_{U,\mu }^2, \end{aligned}$$
(48)
$$\begin{aligned} \widetilde{Y}_{j+1}^2\le & {} \widetilde{Y}_j^2X_j^2 +2N_{U,\mu }Y_j\widetilde{Y}_j +N_{U,\mu }^2. \end{aligned}$$
(49)

Proof

Using the definition of \(Y_j\) from (44) and the identity for \(\xi _{j+1}\) from (40), we have:

$$\begin{aligned} Y_{j+1}= & {} \Vert \xi _{j+1}\Vert \, |c_{j+1}|^{1/2} \\= & {} \Vert \eta _j -U(B_jU^*\xi _j\overline{c}_j\overline{c}_{j+1}^{-1})\overline{\mu } +(B_jU^*\xi _j\overline{c}_j\overline{c}_{j+1}^{-1})\Vert \,|c_{j+1}|^{1/2} \\\le & {} \widetilde{Y}_j|c_j|^{-1/2}|c_{j+1}|^{1/2} +N_{U,\mu }Y_j|c_j|^{1/2}|c_{j+1}|^{-1/2} \\= & {} \widetilde{Y}_jX_j+N_{U,\mu }Y_jX_j^{-1}. \end{aligned}$$

Squaring and using \(Y_j\le X_j\) gives (48). A similar argument gives the inequality (49).   \(\square \)

Therefore, we have recurrence relations bounding \(X_{j+1}\), \(\widetilde{X}_{j+1}\), \(Y_{j+1}\) and \(\widetilde{Y}_{j+1}\) (that is translation lengths and radii) in terms of the same quantities for the index j. In the next section, we find a basin of attraction for this dynamical system.

5 Convergence of the Dynamical System

In this section, we interpret the condition (6) of Theorem 1.1 in terms of our dynamical system involving translation lengths, and we show that if (6) does not hold then \(X_j\), \(\widetilde{X}_j\), \(Y_j\) and \(\widetilde{Y}_j\) are all bounded. Broadly speaking, the argument is based on that of Parker [21] for subgroups of \(\mathrm{SU}(n,1)\) containing a Heisenberg translation. This argument was also used by Kim and Parker [16] for subgroups of \(\mathrm{Sp}(n,1)\) containing a Heisenberg translation. If \(N_{U,\mu }=0\) then T is a Heisenberg translation, since \(\mu =\pm 1\) and \(U=\mu I_{n-1}\). Moreover, \(K=1\). These conditions make the inequalities from Lemmas 4.4 and 4.5 much simpler (see [16, p. 307]), and so Theorem 1.1 reduces to [16, Thm. 4.8].

Recall the definition of K from (5). The only properties of K that we need are that \(2N_{U,\mu }<(1+2N_{U,\mu })/2< K<1-2N_{U,\mu }<1\) and that K satisfies the equation:

$$\begin{aligned} (K-2N_{U,\mu })(1-K)=2N_{U,\mu }+N_\mu . \end{aligned}$$
(50)
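As a worked remark, the two properties just stated determine K. This is a sketch derived only from (50) and the stated inequality, assuming the discriminant below is positive, and is intended to be consistent with (5) rather than to replace it. Expanding (50) gives a quadratic in K:

$$\begin{aligned} K^2-(1+2N_{U,\mu })K+4N_{U,\mu }+N_\mu= & {} 0, \\ K= & {} \frac{1}{2}\left( 1+2N_{U,\mu }\pm \sqrt{(1+2N_{U,\mu })^2-4(4N_{U,\mu }+N_\mu )}\right) . \end{aligned}$$

The requirement \(K>(1+2N_{U,\mu })/2\) selects the root with the plus sign. When \(N_{U,\mu }=N_\mu =0\) this root is \(K=1\).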

Observe that (46)–(49) together with (50) imply

$$\begin{aligned} \max \{X_{j+1}^2,\,\widetilde{X}_{j+1}^2\}\le & {} X_j^2\widetilde{X}_j^2+4Y_j\widetilde{Y}_j+(K-2N_{U,\mu })(1-K), \end{aligned}$$
(51)
$$\begin{aligned} \max \{Y_{j+1}^2,\,\widetilde{Y}_{j+1}^2\}\le & {} X_j^2\widetilde{Y}_j^2+2N_{U,\mu }Y_j\widetilde{Y}_j+N_{U,\mu }(K-2N_{U,\mu })(1-K)/2.\qquad \quad \end{aligned}$$
(52)

Our goal in this section is to prove the following theorem.

Theorem 5.1

Assume that \(N_{U,\mu }\ne 0.\) Suppose \(X_j,\) \(\widetilde{X}_j,\) \(Y_j\) and \(\widetilde{Y}_j\) satisfy (51) and (52). If

$$\begin{aligned} X_0\widetilde{X}_0+\frac{4Y_0\widetilde{Y}_0}{K-2N_{U,\mu }}< K \end{aligned}$$
(53)

then for all \(\varepsilon >0\) there exists \(J_\varepsilon \) so that for all \(j\ge J_\varepsilon {:}\)

$$\begin{aligned} \max \{X_j^2,\,\widetilde{X}_j^2\}<1-K+\varepsilon ,\quad \max \{Y_j^2,\,\widetilde{Y}_j^2\}<N_{U,\mu }(1-K)/2+\varepsilon . \end{aligned}$$
(54)

Note that (53) is simply the statement that (6) fails written in terms of \(X_0\), \(\widetilde{X}_0\), \(Y_0\) and \(\widetilde{Y}_0\). In the case where T is a Heisenberg translation, that is \(N_{U,\mu }=0\) and \(K=1\), the theorem implies that \(X_j\), \(\widetilde{X}_j\), \(Y_j\) and \(\widetilde{Y}_j\) all converge to 0. In the general case we have the weaker conclusion that these sequences are uniformly bounded. In particular, we can find a compact set containing \(X_j\), \(\widetilde{X}_j\), \(Y_j\) and \(\widetilde{Y}_j\) for all \(j\ge J_\varepsilon \). Hence there is a subsequence on which we have convergence of each of these variables.

In order to simplify the notation, for each \(j\ge 1\) we define

$$\begin{aligned} x_j=\max \{X_j^2,\widetilde{X}_j^2\},\quad y_j=\max \{Y_j^2,\widetilde{Y}_j^2\}. \end{aligned}$$

It is clear that (51) and (52) imply that for \(j\ge 1\) we have:

$$\begin{aligned} x_{j+1}\le & {} x_j^2+4y_j+(K-2N_{U,\mu })(1-K), \end{aligned}$$
(55)
$$\begin{aligned} y_{j+1}\le & {} x_jy_j+2N_{U,\mu }y_j+N_{U,\mu }(K-2N_{U,\mu })(1-K)/2. \end{aligned}$$
(56)

The proof of Theorem 5.1 will be by way of three lemmas. The first one converts the hypothesis (53) of Theorem 5.1 to an initial condition for this dynamical system involving \(x_1\) and \(y_1\). Assuming this initial condition, the second and third lemmas, respectively, show that for each \(\varepsilon >0\) there is \(J_\varepsilon \) so that for \(j\ge J_{\varepsilon }\)

$$\begin{aligned} x_j<1-K+\varepsilon ,\quad y_j<N_{U,\mu }(1-K)/2+\varepsilon . \end{aligned}$$

This is just a restatement of the conclusion of Theorem 5.1.

Before giving the proof, we give a geometrical interpretation of Theorem 5.1. Consider the dynamical system where we impose equality in (55) and (56) for each j. It has an attractive fixed point at \((x,y)=((1-K),\,N_{U,\mu }(1-K)/2)\) and a saddle fixed point at \((x,y)=((K-2N_{U,\mu }),\,N_{U,\mu }(K-2N_{U,\mu })/2)\). Points on the line

$$\begin{aligned} x+\frac{4y}{K-2N_{U,\mu }} =K \end{aligned}$$

are attracted to the saddle point and points below this line are attracted to the attractive fixed point. Since we only have inequalities, we cannot describe fixed points. However, our main result says that points below the line accumulate in a neighbourhood of the rectangle \(x\le (1-K)\), \(y\le N_{U,\mu }(1-K)/2\).
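The geometric picture above can be explored numerically. The following is a minimal sketch under stated assumptions: K is taken to be the larger root of (50), the constants \(N_{U,\mu }=N_\mu =0.02\) used in the usage note are purely illustrative values chosen so that this root is real, and the function names are hypothetical. It imposes equality in (55) and (56), so it describes only the model system and not the sequences \(x_j\), \(y_j\) themselves.

```python
import math

def K_from_50(N_U, N_mu):
    """Larger root of (K - 2*N_U)*(1 - K) = 2*N_U + N_mu, that is, of (50)."""
    disc = (1 + 2 * N_U) ** 2 - 4 * (4 * N_U + N_mu)
    if disc <= 0:
        raise ValueError("no real K for these constants")
    return 0.5 * (1 + 2 * N_U + math.sqrt(disc))

def step(x, y, N_U, N_mu):
    """One step of (55)-(56) with equality imposed."""
    K = K_from_50(N_U, N_mu)
    c = (K - 2 * N_U) * (1 - K)
    return x * x + 4 * y + c, x * y + 2 * N_U * y + N_U * c / 2

def fixed_points(N_U, N_mu):
    """Return the (attractive, saddle) fixed points described in the text."""
    K = K_from_50(N_U, N_mu)
    attractive = (1 - K, N_U * (1 - K) / 2)
    saddle = (K - 2 * N_U, N_U * (K - 2 * N_U) / 2)
    return attractive, saddle

def iterate(x, y, N_U, N_mu, n):
    """Apply `step` n times starting from (x, y)."""
    for _ in range(n):
        x, y = step(x, y, N_U, N_mu)
    return x, y
```

For example, with `N_U = N_mu = 0.02` the starting point `(0.5, 0.05)` lies below the line \(x+4y/(K-2N_{U,\mu })=K\), and `iterate(0.5, 0.05, 0.02, 0.02, 200)` can be compared with the first point returned by `fixed_points(0.02, 0.02)`.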

Lemma 5.2

Suppose that \(X_1^2,\) \(\widetilde{X}_1^2,\) \(Y_1^2\) and \(\widetilde{Y}_1^2\) satisfy the recursive inequalities (51) and (52). If (53) holds, that is:

$$\begin{aligned} X_0\widetilde{X}_0+\frac{4Y_0\widetilde{Y}_0}{K-2N_{U,\mu }} < K, \end{aligned}$$

then

$$\begin{aligned} x_1+\frac{4y_1}{K-2N_{U,\mu }} =\max \{X_{1}^2,\widetilde{X}_{1}^2\} +\frac{4\max \{Y_{1}^2,\widetilde{Y}_{1}^2\}}{K-2N_{U,\mu }} <K. \end{aligned}$$

Proof

Suppose that (53) holds. Interchanging \(S_0\) and \(S_0^{-1}\) if necessary, we also suppose that \(X_0\widetilde{Y}_0\le \widetilde{X}_0Y_0\). Using (51) and (52) we have:

$$\begin{aligned}&x_1+\frac{4y_1}{K-2N_{U,\mu }} \\&\quad = \max \{X_{1}^2,\widetilde{X}_{1}^2\} +\frac{4\max \{Y_{1}^2,\widetilde{Y}_{1}^2\}}{K-2N_{U,\mu }} \\&\quad \le (X_0^2\widetilde{X}_0^2+4Y_0\widetilde{Y}_0 + 2N_{U,\mu } + N_{\mu })\\&\qquad +\,(X_0^2\widetilde{Y}_0^2+2N_{U,\mu }Y_0\widetilde{Y}_0+\,N_{U,\mu }^2)\frac{4}{K-2N_{U,\mu }}\\&\quad \le (X_0^2\widetilde{X}_0^2+4Y_0\widetilde{Y}_0 + 2N_{U,\mu } + N_{\mu })\\&\qquad +\,(X_0\widetilde{X}_0Y_0\widetilde{Y}_0 +2N_{U,\mu }Y_0\widetilde{Y}_0+\,N_{U,\mu }^2)\frac{4}{K-2N_{U,\mu }} \\&\quad = \left( X_0\widetilde{X}_0+\frac{4Y_0\widetilde{Y}_0}{K-2N_{U,\mu }}\right) X_0\widetilde{X}_0+\frac{4KY_0\widetilde{Y}_0}{K-2N_{U,\mu }} + \frac{2KN_{U,\mu }}{K-2N_{U,\mu }}+N_\mu \\&\quad< K\left( X_0\widetilde{X}_0+\frac{4Y_0\widetilde{Y}_0}{K-2N_{U,\mu }} \right) +\frac{2KN_{U,\mu }}{K-2N_{U,\mu }}+\frac{KN_\mu }{K-2N_{U,\mu }} \\&\quad < K^2+K(1-K) \\&\quad = K. \end{aligned}$$

This proves the lemma. \(\square \)
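The chain of inequalities in this proof can be spot-checked numerically. The sketch below is illustrative only: it assumes K is the larger root of (50), it replaces \(x_1\) and \(y_1\) by the upper bounds coming from the right-hand sides of (46) and (48), and the function name `lemma_5_2_check` is hypothetical. It applies the same normalisation \(X_0\widetilde{Y}_0\le \widetilde{X}_0Y_0\) as the proof by swapping the roles of the two pairs when necessary.

```python
import math

def lemma_5_2_check(X0, Xt0, Y0, Yt0, N_U, N_mu):
    """Return (lhs53, bound, K).

    lhs53 is the left-hand side of (53); bound is an upper bound for
    x_1 + 4*y_1/(K - 2*N_U) obtained from the right-hand sides of (46), (48).
    """
    # Assumption: K is the larger root of (50).
    K = 0.5 * (1 + 2 * N_U + math.sqrt((1 + 2 * N_U) ** 2 - 4 * (4 * N_U + N_mu)))
    # Normalisation from the proof: arrange X0*Yt0 <= Xt0*Y0 by swapping roles.
    if X0 * Yt0 > Xt0 * Y0:
        X0, Xt0, Y0, Yt0 = Xt0, X0, Yt0, Y0
    lhs53 = X0 * Xt0 + 4 * Y0 * Yt0 / (K - 2 * N_U)
    x1 = X0 ** 2 * Xt0 ** 2 + 4 * Y0 * Yt0 + 2 * N_U + N_mu
    y1 = X0 ** 2 * Yt0 ** 2 + 2 * N_U * Y0 * Yt0 + N_U ** 2
    return lhs53, x1 + 4 * y1 / (K - 2 * N_U), K
```

Whenever the first returned value is less than the third, the lemma predicts that the second is also less than the third.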

We now use this lemma to give an upper bound on \(x_j\).

Lemma 5.3

Suppose that \(x_j\) and \(y_j\) satisfy the recursive inequalities (55) and (56) and also that

$$\begin{aligned} x_1+\frac{4y_1}{K-2N_{U,\mu }}<K. \end{aligned}$$

Then for any \(\varepsilon _x>0\) there exists \(J_x\in {\mathbb N}\) so that for all \(j\ge J_x\) we have

$$\begin{aligned} x_j<1-K+\varepsilon _x. \end{aligned}$$

Proof

Using (55) and (56) we have

$$\begin{aligned}&x_{j+1}+\frac{4y_{j+1}}{K-2N_{U,\mu }} \\&\quad \le x_j^2+4y_j+(K-2N_{U,\mu })(1-K)+\frac{4}{K-2N_{U,\mu }}(x_jy_j+2N_{U,\mu }y_j)\\&\qquad +\, 2N_{U,\mu }(1-K)\\&\quad = K-(x_j+K)\left( K-x_j-\frac{4y_j}{K-2N_{U,\mu }}\right) . \end{aligned}$$

Since \(x_1+4y_1/(K-2N_{U,\mu })<K\), the above inequality implies that, for each \(j\ge 2\), we have

$$\begin{aligned} \left( K-x_j-\frac{4y_j}{K-2N_{U,\mu }}\right) \ge \left( K-x_1-\frac{4y_1}{K-2N_{U,\mu }}\right) \prod _{i=1}^{j-1}(x_i+K)>0. \end{aligned}$$

If there exists \(\varepsilon >0\) so that \(x_j\ge (1-K+\varepsilon )\) for all but finitely many values of j, then the right-hand side of the above inequality tends to infinity as j tends to infinity. However, the left-hand side is at most K, which is a contradiction. \(\square \)
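
The equality in the first display of the proof of Lemma 5.3 is a direct expansion. As a supplementary check, which is not part of the original argument, write \(\kappa =K-2N_{U,\mu }\) (a shorthand introduced only for this remark). Then

$$\begin{aligned} (x_j+K)\left( K-x_j-\frac{4y_j}{\kappa }\right)&= K^2-x_j^2-\frac{4(x_j+K)y_j}{\kappa }, \\ x_j^2+4y_j+\kappa (1-K)+\frac{4(x_jy_j+2N_{U,\mu }y_j)}{\kappa }+2N_{U,\mu }(1-K)&= x_j^2+K(1-K)+\frac{4(x_j+K)y_j}{\kappa }, \end{aligned}$$

where the second line uses \(4y_j+8N_{U,\mu }y_j/\kappa =4Ky_j/\kappa \) and \(\kappa (1-K)+2N_{U,\mu }(1-K)=K(1-K)\). Subtracting the first line from K gives \(K(1-K)+x_j^2+4(x_j+K)y_j/\kappa \), which agrees with the second line.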

Finally, we use the upper bound on \(x_j\) to obtain an upper bound on \(y_j\).

Lemma 5.4

Suppose that \(y_j\) satisfies the recursive inequality (56) and also that for all \(\varepsilon _x>0\) there exists \(J_x\in {\mathbb N}\) so that for all \(j\ge J_x\), we have \(x_j<1-K+\varepsilon _x.\) Then for any \(\varepsilon _y>0\) there exists \(J_y\ge J_x\) so that for all \(j\ge J_y\), we have

$$\begin{aligned} y_j\le N_{U,\mu }(1-K)/2+\varepsilon _y. \end{aligned}$$

Proof

Given \(\varepsilon _y>0\) choose \(\varepsilon _x\) with \(0<\varepsilon _x<K-2N_{U,\mu }\) so that

$$\begin{aligned} \frac{N_{U,\mu }(K-2N_{U,\mu })(1-K)}{K-2N_{U,\mu }-\varepsilon _x} \le N_{U,\mu }(1-K)+\varepsilon _y. \end{aligned}$$

Using (56) for \(j\ge J_x\), we have

$$\begin{aligned} y_{j+1}\le & {} x_jy_j+2N_{U,\mu }y_j+N_{U,\mu }(K-2N_{U,\mu })(1-K)/2 \\\le & {} x_jy_j+2N_{U,\mu }y_j+ (K-2N_{U,\mu }-\varepsilon _x)(N_{U,\mu }(1-K)/2+\varepsilon _y/2)\\\le & {} N_{U,\mu }(1-K)/2 + \varepsilon _y/2 \\&+\,(1-K+2N_{U,\mu }+\varepsilon _x) (y_j-N_{U,\mu }(1-K)/2-\varepsilon _y/2). \end{aligned}$$

If \(y_j\le N_{U,\mu }(1-K)/2+\varepsilon _y/2\) then so is \(y_{j+1}\) and the result follows. Otherwise, we have

$$\begin{aligned}&{y_{j+1}-N_{U,\mu }(1-K)/2 - \varepsilon _y/2} \\&\quad \le (1-K+2N_{U,\mu }+\varepsilon _x) (y_j-N_{U,\mu }(1-K)/2-\varepsilon _y/2) \\&\quad \le (1-K+2N_{U,\mu }+\varepsilon _x)^{j+1-J_x} (y_{J_x}-N_{U,\mu }(1-K)/2-\varepsilon _y/2). \end{aligned}$$

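Since \(K-2N_{U,\mu }-\varepsilon _x>0\), we have \(1-K+2N_{U,\mu }+\varepsilon _x<1\), and so the right-hand side tends to zero; equivalently, the resulting upper bound on \(y_{j+1}\) tends to \(N_{U,\mu }(1-K)/2+\varepsilon _y/2\). Therefore, we can find \(J_y\ge J_x\) so that for all \(j\ge J_y\), we have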

$$\begin{aligned} (1-K+2N_{U,\mu }+\varepsilon _x)^{j+1-J_x} (y_{J_x}-N_{U,\mu }(1-K)/2-\varepsilon _y/2) <\varepsilon _y/2. \end{aligned}$$

This gives the result. \(\square \)

Finally, Theorem 5.1 follows by taking \(\varepsilon =\min \{\varepsilon _x,\varepsilon _y\}\) and \(J_\varepsilon =\max \{J_x,J_y\}=J_y\). This completes the proof.

6 Convergence of \(S_j\) to T

We are now ready to prove that the \(S_j\) converge to T as j tends to infinity under the condition (53) of Theorem 5.1. We claim that the sequence \(\{S_j\}\) is not eventually constant and so this convergence implies that the group \(\langle S,T\rangle \) is not discrete.

In order to verify the claim, suppose the sequence \(\{S_j\}\) converges to T and is eventually constant. Then \(S_j=T\) for sufficiently large j, and so \(S_{j+1}\) fixes \(\infty \) for some \(j\ge 0\). Since \(\infty \) is the only fixed point of T, the point \(S_j(\infty )\) is the only fixed point of \(S_{j+1}=S_jTS_j^{-1}\). Hence, if \(S_{j+1}\) fixes \(\infty \) then so does \(S_j\). Repeating this argument, we see that all the \(S_j\) must fix \(\infty \). However, we assumed that \(S_0=S\) does not fix \(\infty \), which is a contradiction.

In this section, we will show that the condition (53) implies that each of the nine entries of \(S_j\) converges to the corresponding entry of T. We divide our proof into subsections, each containing convergence of certain entries. The main steps are:

  • We will first show that \(c_j\) tends to zero as j tends to infinity (Proposition 6.2).

  • After showing \(\Vert \alpha _j c_j^{-1/2}\Vert \), \(\Vert \delta _j \overline{c}_j^{-1/2}\Vert \) are bounded (Lemma 6.3), we can show that \(\alpha _j\) and \(\delta _j\) both tend to \(0\in {\mathbb H}^{n-1}\) as j tends to infinity (Proposition 6.4).

  • We then show the remaining matrix entries are bounded (Lemmas 6.6, 6.7 and Corollaries 6.8, 6.9).

  • Using the results obtained so far, we can show that \(a_j\) and \(d_j\) both tend to \(\mu \) and \(A_j\) tends to U as j tends to infinity (Propositions 6.10 and 6.11).

  • Finally, we show that \(\beta _j\), \(\gamma _j\) and \(b_j\) tend to \(\sqrt{2}\tau \mu \), \(-\sqrt{2}\overline{\mu }\tau \) and \((-\Vert \tau \Vert ^2+t)\mu \), respectively, as j tends to infinity (Propositions 6.12 and 6.13).

Throughout this proof we use Theorem 5.1 to show that the hypothesis (53) implies that (54) holds, that is for large enough j:

$$\begin{aligned} \max \{X_j^2,\widetilde{X}_j^2\}<1-K+\varepsilon ,\quad \max \{Y_j^2,\widetilde{Y}_j^2\}<N_{U,\mu }(1-K)/2+\varepsilon . \end{aligned}$$

We will repeatedly use the following elementary lemma to show certain entries are bounded and others converge.

Lemma 6.1

Let \(\lambda _1,\) \(\lambda _2,\) D be positive real constants with \(\lambda _i<1\) and \(\lambda _1\ne \lambda _2.\) Let \(C_j\in {\mathbb {R}}^+\) be defined iteratively.

  (i) If \(C_{j+1}\le \lambda _1 C_j +D\) for \(j\ge 0\) then

    $$\begin{aligned} C_j\le D/(1-\lambda _1)+\lambda _1^j(C_0-D/(1-\lambda _1)). \end{aligned}$$

    In particular, given \(\varepsilon >0\) there exists \(J_\varepsilon \) so that for all \(j\ge J_\varepsilon \) we have

    $$\begin{aligned} C_j\le D/(1-\lambda _1)+\varepsilon . \end{aligned}$$

  (ii) If \(C_{j+1}\le \lambda _1 C_j + \lambda _2^j D\) for \(j\ge 0\) then

    $$\begin{aligned} C_j\le \lambda _1^j C_0+D(\lambda _2^j-\lambda _1^j)/(\lambda _2-\lambda _1). \end{aligned}$$

    In particular, \(C_j\le C_0\lambda _1^j+\max \{\lambda _1^j,\lambda _2^j\}D/|\lambda _1-\lambda _2|\).
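
Lemma 6.1 is stated without proof. The following is a brief sketch of one way to verify it by induction; the shorthand \(E=D/(1-\lambda _1)\) is introduced only for this remark. For (i), since \(D-E=-\lambda _1E\), the hypothesis gives

$$\begin{aligned} C_{j+1}-E\le \lambda _1C_j+D-E=\lambda _1(C_j-E), \quad \text {and hence}\quad C_j-E\le \lambda _1^j(C_0-E). \end{aligned}$$

For (ii), iterating the hypothesis and summing the resulting geometric-type sum gives

$$\begin{aligned} C_j\le \lambda _1^jC_0+D\sum _{i=0}^{j-1}\lambda _1^{j-1-i}\lambda _2^{i} =\lambda _1^jC_0+D\,\frac{\lambda _2^j-\lambda _1^j}{\lambda _2-\lambda _1}. \end{aligned}$$

The final bound in (ii) then follows from \(|\lambda _2^j-\lambda _1^j|\le \max \{\lambda _1^j,\lambda _2^j\}\). In both cases the bound tends to its limiting value geometrically, which is how the lemma is used in the rest of this section.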

6.1 Convergence of \(c_j\)

The easiest case is to show that \(c_j\) tends to zero. Geometrically, this means that the isometric spheres of \(S_j\) have radii tending to infinity as j tends to infinity.

Proposition 6.2

Suppose that (53) holds. Then \(c_j\) tends to zero as j tends to infinity.

Proof

Using Theorem 5.1, given \(\varepsilon >0\), the hypothesis (53) implies that for large enough j we have \(X_j^2<1-K+\varepsilon \). Since \(K>1/2\) we can choose \(\varepsilon \) so that \(0<\varepsilon <K-1/2\). Then there exists \(J_\varepsilon \) so that \(X_j^2<(1-K)+\varepsilon <1/2\) for all \(j\ge J_\varepsilon \). From (42) and (54) for \(j\ge J_\varepsilon \) we have

$$\begin{aligned} |c_{j+1}|=X_j^2|c_j|< |c_j|/2< \cdots < |c_{J_\varepsilon }|/2^{j-J_\varepsilon +1}. \end{aligned}$$

Thus \(c_j\) tends to zero as j tends to infinity. \(\square \)

6.2 Convergence of \(\alpha _j\) and \(\delta _j\)

In this section, we show that \(\alpha _j\) and \(\delta _j\) both tend to the zero vector as j tends to infinity. To do so, we first show their norms are bounded by a constant multiple of \(|c_j|^{1/2}\).

Lemma 6.3

Suppose that (53) holds. For any \(\varepsilon >0\) there exists \(J_\varepsilon >0\) so that for all \(j\ge J_\varepsilon \) we have

$$\begin{aligned} \Vert \alpha _jc_j^{-1/2}\Vert< \frac{\sqrt{2}}{1-\sqrt{1-K}}+\varepsilon , \quad \Vert \delta _j\overline{c}_j^{-1/2}\Vert < \frac{\sqrt{2}}{1-\sqrt{1-K}}+\varepsilon . \end{aligned}$$

Proof

Again, using Theorem 5.1, given \(\varepsilon _1>0\) there exists \(J_1\) so that for \(j\ge J_1\)

$$\begin{aligned} X_j^2\le (1-K)+\varepsilon _1. \end{aligned}$$

Observe that \(\alpha _jc_j^{-1/2}=\sqrt{2}\omega _jc_j^{1/2}\). Therefore, Eq. (38) implies that for \(j\ge J_1\) we have

$$\begin{aligned} \Vert \alpha _{j+1}c_{j+1}^{-1/2}\Vert= & {} \sqrt{2}\Vert \omega _{j+1}\Vert \,|c_{j+1}|^{1/2} \\= & {} \sqrt{2} \Vert \omega _j +B_j\xi _j\mu \overline{c}_jc_{j+1}^{-1}\Vert \,|c_{j+1}|^{1/2} \\\le & {} \sqrt{2}\Vert \omega _j\Vert \,|c_{j+1}|^{1/2} +\sqrt{2}\Vert \xi _j\Vert \,|c_j|\,|c_{j+1}|^{-1/2} \\= & {} \Vert \alpha _jc_j^{-1/2}\Vert \,|c_j|^{-1/2}|c_{j+1}|^{1/2} +\sqrt{2}\Vert \xi _j\Vert \,|c_j|\,|c_{j+1}|^{-1/2} \\= & {} X_j\Vert \alpha _jc_j^{-1/2}\Vert +\sqrt{2}Y_jX_j^{-1} \\\le & {} \sqrt{1-K+\varepsilon _1}\,\Vert \alpha _jc_j^{-1/2}\Vert +\sqrt{2}. \end{aligned}$$

Therefore, using Lemma 6.1, given \(\varepsilon _2>0\) we can find \(J_2\ge J_1\) so that for \(j\ge J_2\) we have

$$\begin{aligned} \Vert \alpha _jc_j^{-1/2}\Vert \le \frac{\sqrt{2}}{1-\sqrt{1-K+\varepsilon _1}} +\varepsilon _2. \end{aligned}$$

Given any \(\varepsilon >0\) it is possible to find \(\varepsilon _1>0\) and \(\varepsilon _2>0\) so that

$$\begin{aligned} \frac{\sqrt{2}}{1-\sqrt{1-K+\varepsilon _1}} +\varepsilon _2 \le \frac{\sqrt{2}}{1-\sqrt{1-K}}+\varepsilon . \end{aligned}$$

This proves the first part. A similar argument holds for \(\Vert \delta _j\overline{c}_j^{-1/2}\Vert \). \(\square \)

Proposition 6.4

Suppose that (53) holds. Then \(\alpha _j\) and \(\delta _j\) both tend to \(0\in {\mathbb H}^{n-1}\) as j tends to infinity.

Proof

Clearly \(\Vert \alpha _j\Vert =\Vert \alpha _jc_j^{-1/2}\Vert \,|c_j|^{1/2}\) and \(\Vert \delta _j\Vert =\Vert \delta _j\overline{c}_j^{-1/2}\Vert \,|c_j|^{1/2}\). Using Proposition 6.2 and Lemma 6.3 we see that \(c_j\) tends to zero and \(\Vert \alpha _jc_j^{-1/2}\Vert \) and \(\Vert \delta _j\overline{c}_j^{-1/2}\Vert \) are bounded. Thus \(\alpha _j\) and \(\delta _j\) both tend to \(0\in {\mathbb H}^{n-1}\) as j tends to infinity. \(\square \)

The following estimate will be useful later.

Corollary 6.5

Suppose that (53) holds. Given \(\varepsilon >0\) there exists \(J_0\) so that for \(j\ge J_0\) we have

$$\begin{aligned} Y_j\Vert \alpha _jc_j^{-1/2}\Vert < \frac{\sqrt{N_{U,\mu }}}{\sqrt{2}-1}+\varepsilon . \end{aligned}$$

Proof

From (54) we have

$$\begin{aligned} 2Y_j^2\le N_{U,\mu }(1-K)+\varepsilon _1, \end{aligned}$$

and from Lemma 6.3 we have

$$\begin{aligned} \Vert \alpha _j c_j^{-1/2}\Vert ^2 \le \frac{2}{(1-\sqrt{1-K})^2}+\varepsilon _2. \end{aligned}$$

Given \(\varepsilon >0\), combining these inequalities for suitable \(\varepsilon _1,\varepsilon _2>0\), we obtain

$$\begin{aligned} Y_j\Vert \alpha _jc_j^{-1/2}\Vert \le \frac{\sqrt{N_{U,\mu }(1-K)}}{1-\sqrt{1-K}}+\varepsilon . \end{aligned}$$

Since \((1-K)< 1/2\) we have

$$\begin{aligned} \frac{N_{U,\mu }(1-K)}{(1-\sqrt{1-K})^2}< \frac{N_{U,\mu }(1/2)}{(1-\sqrt{1/2})^2} =\frac{N_{U,\mu }}{(\sqrt{2}-1)^2}. \end{aligned}$$

This completes the proof. \(\square \)
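
The final inequality in the proof of Corollary 6.5 relies on monotonicity in \(1-K\). As a supplementary check, with s a variable introduced only for this remark, for \(0<s<1\) we have

$$\begin{aligned} \frac{s}{(1-\sqrt{s})^2} =\left( \frac{\sqrt{s}}{1-\sqrt{s}}\right) ^2 =\left( \frac{1}{1-\sqrt{s}}-1\right) ^2, \end{aligned}$$

which is increasing in s because \(1/(1-\sqrt{s})-1\) is positive and increasing on this interval. Taking \(s=1-K<1/2\) gives the stated inequality, and at \(s=1/2\) the expression equals \((1/2)/(1-1/\sqrt{2})^2=1/(\sqrt{2}-1)^2\).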

6.3 The Remaining Matrix Entries are Bounded

In this section, we show that the norms of the remaining matrix entries are bounded. Later, this will enable us to show they converge. We begin by showing that \(|a_j|\), \(|d_j|\) and \(|b_j|\) are bounded.

Lemma 6.6

Suppose that (53) holds. There exists \(J\in {\mathbb N}\) so that for \(j\ge J\) we have

$$\begin{aligned} |a_j|<4,\quad |d_j|<4. \end{aligned}$$

Proof

We use (39) to obtain

$$\begin{aligned} |a_{j+1}|= & {} |a_jc_j^{-1}c_{j+1}+\overline{c}_j^{-1}\mu \overline{c}_j -\sqrt{2}\,\overline{c}_j^{-1}\alpha _j^*(B_j\xi _j\mu \overline{c}_j)| \\\le & {} |a_j|\,|c_{j+1}|\,|c_j|^{-1} +1+\sqrt{2}\Vert \xi _j\Vert \,|c_j|^{1/2}\Vert \alpha _jc_j^{-1/2}\Vert \\= & {} X_j^2|a_j| +1+\sqrt{2}\,Y_j\Vert \alpha _jc_j^{-1/2}\Vert . \end{aligned}$$

Using (54) and Corollary 6.5, since \(1-K<1/2\), for any \(\varepsilon _1>0\) we can find \(J_1\) so that for \(j\ge J_1\) we have

$$\begin{aligned} X_j^2\le \frac{1}{2},\quad \sqrt{2}\,Y_j\Vert \alpha _jc_j^{-1/2}\Vert <\frac{\sqrt{2N_{U,\mu }}}{\sqrt{2}-1}+\varepsilon _1. \end{aligned}$$

Therefore, using Lemma 6.1(i) with \(\lambda _1=1/2\) and \(D=1+\frac{\sqrt{2N_{U,\mu }}}{\sqrt{2}-1}+\varepsilon _1\), for any \(\varepsilon _2>0\) there is a \(J_2\ge J_1\) so that for all \(j\ge J_2\) we have

$$\begin{aligned} |a_j|<\frac{1+\sqrt{2N_{U,\mu }}/(\sqrt{2}-1)+\varepsilon _1}{1-1/2}+\varepsilon _2 =2+\frac{2\sqrt{2N_{U,\mu }}}{\sqrt{2}-1}+2\varepsilon _1+\varepsilon _2. \end{aligned}$$

Now, using our assumptions about \(N_{U,\mu }\) and \(N_\mu \), we have:

$$\begin{aligned} N_{U,\mu }<\frac{3-2\sqrt{2+N_\mu }}{2}< \frac{(\sqrt{2}-1)^2}{2}. \end{aligned}$$

Therefore, we can choose \(\varepsilon _1\) and \(\varepsilon _2\) so that

$$\begin{aligned} \frac{\sqrt{2N_{U,\mu }}}{\sqrt{2}-1}+\varepsilon _1+\varepsilon _2/2<1. \end{aligned}$$

Hence \(|a_j|<4\) for \(j\ge J_2\). A similar argument shows that \(|d_j|<4\) for large enough j. \(\square \)
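
The constants in the proof of Lemma 6.6 can be checked directly. As a supplementary remark, since \((\sqrt{2}-1)^2=3-2\sqrt{2}\) and \(\sqrt{2+N_\mu }\ge \sqrt{2}\), we have

$$\begin{aligned} N_{U,\mu }<\frac{3-2\sqrt{2+N_\mu }}{2}\le \frac{(\sqrt{2}-1)^2}{2} \quad \Longrightarrow \quad \frac{\sqrt{2N_{U,\mu }}}{\sqrt{2}-1}<1. \end{aligned}$$

Consequently, once \(\varepsilon _1\) and \(\varepsilon _2\) are small enough that \(\sqrt{2N_{U,\mu }}/(\sqrt{2}-1)+\varepsilon _1+\varepsilon _2/2<1\), the bound on \(|a_j|\) becomes

$$\begin{aligned} 2+\frac{2\sqrt{2N_{U,\mu }}}{\sqrt{2}-1}+2\varepsilon _1+\varepsilon _2 =2+2\left( \frac{\sqrt{2N_{U,\mu }}}{\sqrt{2}-1}+\varepsilon _1+\frac{\varepsilon _2}{2}\right) <4. \end{aligned}$$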

Lemma 6.7

Suppose that (53) holds. Then \(|b_j|\) is bounded above as j tends to infinity.

Proof

If \(a_j=0\) then \(\gamma _j=0\) and so \(b_{j+1}=0\). Hence we take \(a_j\ne 0\). Then (11) gives

$$\begin{aligned} 0= & {} (a_j\overline{b}_j+ \gamma _j^*\gamma _j+b_j\overline{a}_j) \overline{a}_j^{-1}\mu \overline{a}_j =a_j\overline{b}_j\overline{a}_j^{-1}\mu \overline{a}_j + \gamma _j^*\gamma _j\overline{a}_j^{-1}\mu \overline{a}_j +b_j\mu \overline{a}_j, \\ \Vert \gamma _j\Vert ^2= & {} -(a_j\overline{b}_j+b_j\overline{a}_j)\le 2|a_j|\,|b_j|. \end{aligned}$$

Hence, using (22), we have

$$\begin{aligned} b_{j+1}= & {} \gamma _j^*U\gamma _j-\sqrt{2}a_j\tau ^*\mu \gamma _j+ \sqrt{2}\gamma _j^*\tau \mu \overline{a}_j-a_j(\Vert \tau \Vert ^2-t)\mu \overline{a}_j +a_j\mu \overline{b}_j+b_j\mu \overline{a}_j \\= & {} \gamma _j^*U(\gamma _j\overline{a}_j^{-1})\overline{a}_j -\sqrt{2}a_j\tau ^*\mu \gamma _j+\sqrt{2}\gamma _j^*\tau \mu \overline{a}_j-a_j(\Vert \tau \Vert ^2-t)\mu \overline{a}_j\\&+\,a_j\mu (\overline{b}_j\overline{a}_j^{-1})\overline{a}_j+b_j\mu \overline{a}_j -\gamma _j^*(\gamma _j\overline{a}_j^{-1})\mu \overline{a}_j -a_j(\overline{b}_j\overline{a}_j^{-1})\mu \overline{a}_j-b_j\mu \overline{a}_j \\= & {} \gamma _j^*(U\gamma _j\overline{a}_j^{-1}-\gamma _j\overline{a}_j^{-1}\mu ) \overline{a}_j-\sqrt{2}a_j\tau ^*\mu \gamma _j +\sqrt{2}\gamma _j^*\tau \mu \overline{a}_j-a_j(\Vert \tau \Vert ^2-t)\mu \overline{a}_j\\&+\,a_j(\mu \overline{b}_j\overline{a}_j^{-1}-\overline{b}_j\overline{a}_j^{-1}\mu )\overline{a}_j. \end{aligned}$$

Using Lemma 6.6 we suppose j is large enough that \(|a_j|<4\). Then we have

$$\begin{aligned} |b_{j+1}|\le & {} |\gamma _j^*(U\gamma _j\overline{a}_j^{-1}-\gamma _j\overline{a}_j^{-1}\mu ) \overline{a}_j|+\sqrt{2}|a_j\tau ^*\mu \gamma _j|+ \sqrt{2}|\gamma _j^*\tau \mu \overline{a}_j|\\&+\,|a_j(\Vert \tau \Vert ^2-t)\mu \overline{a}_j| +|a_j(\mu \overline{b}_j\overline{a}_j^{-1} -\overline{b}_j\overline{a}_j^{-1}\mu )\overline{a}_j| \\\le & {} N_{U,\mu }\Vert \gamma _j\Vert ^2 +2\sqrt{2}|a_j|\,\Vert \tau \Vert \, \Vert \gamma _j\Vert +|a_j|^2|\Vert \tau \Vert ^2-t|+N_{\mu }|a_j|\,|b_j|\\\le & {} (2N_{U,\mu }+N_{\mu })|a_j|\,|b_j| +4|a_j|^{3/2}\Vert \tau \Vert \,|b_j|^{1/2}+|a_j|^2|\Vert \tau \Vert ^2-t|\\\le & {} 4(2N_{U,\mu }+N_{\mu })|b_j| +32\Vert \tau \Vert \,|b_j|^{1/2}+16|\Vert \tau \Vert ^2-t|. \end{aligned}$$

Observe that our hypotheses \(N_\mu < 1/4\) and \(N_{U,\mu }<(3-2\sqrt{2+N_\mu })/2\) imply that

$$\begin{aligned} 2N_{U,\mu }+N_\mu< N_{\mu }+3-2\sqrt{2+N_\mu } = (\sqrt{2+N_\mu }-1)^2 < (3/2-1)^2=1/4. \nonumber \\ \end{aligned}$$
(57)

Hence we can find \(\lambda >0\) with \(4(2N_{U,\mu }+N_\mu )<\lambda ^2<1\) and

$$\begin{aligned} |b_{j+1}|\le & {} \lambda ^2|b_j| +32\Vert \tau \Vert \,|b_j|^{1/2}+16|\Vert \tau \Vert ^2-t|\nonumber \\< & {} \left( \lambda |b_j|^{1/2} +16|\Vert \tau \Vert ^2-t|^{1/2}/\lambda \right) ^2. \end{aligned}$$

Then, using Lemma 6.1(i), given \(\varepsilon _1>0\) we can find \(J_1\) so that for \(j\ge J_1\) we have

$$\begin{aligned} |b_j|^{1/2}\le \frac{16|\Vert \tau \Vert ^2-t|^{1/2}/\lambda }{1-\lambda }+\varepsilon _1. \end{aligned}$$

Hence \(|b_j|\) is bounded above. \(\square \)
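
The second inequality in the display bounding \(|b_{j+1}|\) by a square in the proof of Lemma 6.7 can be verified by expanding that square. As a supplementary check, write \(m=|\Vert \tau \Vert ^2-t|^{1/2}\) (a shorthand introduced only for this remark) and assume that t is purely imaginary, which is implicit in the conjugation used in the proof of Proposition 6.13. Then \(m^2=(\Vert \tau \Vert ^4+|t|^2)^{1/2}\ge \Vert \tau \Vert ^2\), and

$$\begin{aligned} \left( \lambda |b_j|^{1/2}+\frac{16m}{\lambda }\right) ^2&= \lambda ^2|b_j|+32m|b_j|^{1/2}+\frac{256m^2}{\lambda ^2} \\&\ge \lambda ^2|b_j|+32\Vert \tau \Vert \,|b_j|^{1/2}+16m^2, \end{aligned}$$

using \(m\ge \Vert \tau \Vert \) and \(256/\lambda ^2>16\). The inequality is strict whenever \(m>0\). Taking square roots gives \(|b_{j+1}|^{1/2}<\lambda |b_j|^{1/2}+16m/\lambda \), which is the form required by Lemma 6.1(i).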

Corollary 6.8

Suppose that (53) holds. Then \(\Vert \beta _j\Vert \) and \(\Vert \gamma _j\Vert \) are bounded above as j tends to infinity.

Proof

Note that \(\Vert \gamma _j\Vert ^2=-(a_j\overline{b}_j+b_j\overline{a}_j)\le 2|a_j||b_j|\) and \(\Vert \beta _j\Vert ^2=-(\overline{b}_jd_j+\overline{d}_jb_j)\le 2|b_j||d_j|\). Thus Lemmas 6.6 and 6.7 imply that \(\Vert \beta _j\Vert \) and \(\Vert \gamma _j\Vert \) are bounded. \(\square \)

Finally, we show that \(\Vert A_j\Vert \) and \(\Vert A_j-U\Vert \) are bounded.

Corollary 6.9

Suppose that (53) holds. Then \(\Vert A_j\Vert \) and \(\Vert A_j-U\Vert \) are bounded as j tends to \(\infty .\)

Proof

Using (13) we have

$$\begin{aligned} I_{n-1}= & {} A_jA_j^*+\alpha _j\beta _j^*+\beta _j\alpha _j^* \\= & {} (A_j-U)(A_j^*-U^*)+U(A_j^*-U^*)+(A_j-U)U^*+I_{n-1}\\&+\,\alpha _j\beta _j^*+\beta _j\alpha _j^*. \end{aligned}$$

Therefore

$$\begin{aligned} \Vert A_j\Vert ^2\le \Vert I_{n-1}\Vert +2\Vert \alpha _j\Vert \,\Vert \beta _j\Vert , \quad \Vert A_j-U\Vert ^2\le 2\Vert A_j-U\Vert +2\Vert \alpha _j\Vert \,\Vert \beta _j\Vert . \end{aligned}$$

The latter implies that

$$\begin{aligned} \Vert A_j-U\Vert \le 1+\sqrt{1+2\Vert \alpha _j\Vert \,\Vert \beta _j\Vert }. \end{aligned}$$
(58)

Hence \(\Vert A_j-U\Vert \) and \(\Vert A_j\Vert \) are bounded. \(\square \)
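
The passage to (58) in the proof of Corollary 6.9 is a completion of the square. As a supplementary remark, the inequality \(\Vert A_j-U\Vert ^2\le 2\Vert A_j-U\Vert +2\Vert \alpha _j\Vert \,\Vert \beta _j\Vert \) can be rewritten as

$$\begin{aligned} \left( \Vert A_j-U\Vert -1\right) ^2\le 1+2\Vert \alpha _j\Vert \,\Vert \beta _j\Vert , \end{aligned}$$

so that \(\Vert A_j-U\Vert -1\le \sqrt{1+2\Vert \alpha _j\Vert \,\Vert \beta _j\Vert }\), which is (58).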

6.4 Convergence of \(a_j\) and \(d_j\)

Having shown that all the entries of \(S_j\) are bounded as j tends to infinity, we can now show that the matrix entries of \(S_j\) tend to the corresponding entries of T. Recall that we have already shown in Proposition 6.2 that \(c_j\) tends to \(0\in {\mathbb H}\) and in Proposition 6.4 that \(\alpha _j\) and \(\delta _j\) tend to the zero vector in \({\mathbb H}^{n-1}\).

We now show \(a_j\) and \(d_j\) both tend to \(\mu \).

Proposition 6.10

Suppose that (53) holds. Then both \(a_j\) and \(d_j\) tend to \(\mu \) as j tends to infinity.

Proof

Recall from (10) that \(1=a_j\overline{d}_j+\gamma _j^*\delta _j+b_j\overline{c}_j\). Using (20), we have

$$\begin{aligned} a_{j+1} -\mu= & {} \gamma _j^*U\delta _j-\sqrt{2}a_j\tau ^*\mu \delta _j +\sqrt{2}\gamma _j^*\tau \mu \overline{c}_j -a_j(\Vert \tau \Vert ^2-t)\mu \overline{c}_j +a_j\mu \overline{d}_j\\&+\,b_j\mu \overline{c}_j -\mu \gamma _j^*\delta _j-\mu a_j\overline{d}_j-\mu b_j\overline{c}_j \\= & {} (\gamma _j^*U-\mu \gamma _j^*)\delta _j -\sqrt{2}a_j\tau ^*\mu \delta _j +\sqrt{2}\gamma _j^*\tau \mu \overline{c}_j -a_j(\Vert \tau \Vert ^2-t)\mu \overline{c}_j \\&+\,((a_j-\mu )\mu -\mu (a_j-\mu )) \overline{d}_j +(b_j\mu -\mu b_j) \overline{c}_j. \end{aligned}$$

Using Lemma 6.6, we suppose that j is large enough that \(|d_j|<4\). Then:

$$\begin{aligned} |a_{j+1}-\mu |\le & {} N_{U,\mu }\Vert \gamma _j\Vert \,\Vert \delta _j\Vert +\sqrt{2}\Vert \tau \Vert \,|a_j|\,\Vert \delta _j\Vert +\sqrt{2}\Vert \tau \Vert \,|c_j|\,\Vert \gamma _j\Vert +|\Vert \tau \Vert ^2\\&-\,t|\,|a_j|\,|c_j| +N_\mu |d_j|\,|a_j-\mu |+N_\mu |b_j|\,|c_j| \\\le & {} N_{\mu }|d_j|\,|a_j-\mu | +(N_{U,\mu }\Vert \gamma _j\Vert +\sqrt{2}\Vert \tau \Vert \,|a_j|) \Vert \delta _j \overline{c}_j^{-1/2}\Vert \,|c_j|^{1/2} \\&+\,(\sqrt{2}\Vert \tau \Vert \,\Vert \gamma _j\Vert + |\Vert \tau \Vert ^2-t|\,|a_j|+N_{\mu }|b_j|)|c_j| \\\le & {} 4N_{\mu }|a_j-\mu | +(N_{U,\mu }\Vert \gamma _j\Vert +\sqrt{2}\Vert \tau \Vert \,|a_j|) \Vert \delta _j \overline{c}_j^{-1/2}\Vert \,|c_j|^{1/2} \\&+\,(\sqrt{2}\Vert \tau \Vert \,\Vert \gamma _j\Vert + |\Vert \tau \Vert ^2-t|\,|a_j|+N_{\mu }|b_j|)|c_j|. \end{aligned}$$

Note that \(4N_\mu <1\). Moreover, for \(j\ge J_1\), with \(J_1\) as in the proof of Lemma 6.6, we have \(X_j^2\le 1/2\). Therefore \(|c_j|\le |c_{J_1}|/2^{j-J_1}\). Also, \(\Vert \gamma _j\Vert \), \(\Vert \delta _j\overline{c}_j^{-1/2}\Vert \), \(|a_j|\) and \(|b_j|\) are all bounded. Then using Lemma 6.1(ii) with \(\lambda _1=4N_\mu <1\) and \(\lambda _2=1/\sqrt{2}\), which is valid since \(|c_j|^{1/2}\le |c_{J_1}|^{1/2}/2^{(j-J_1)/2}\), we see that \(|a_j-\mu |\) tends to 0 as j tends to infinity.

Similarly \(|d_j-\mu |\) tends to zero as j tends to infinity. \(\square \)

6.5 Convergence of \(A_j\)

We now show that \(A_j\) tends to U.

Proposition 6.11

Suppose that (53) holds. Then \(A_j\) tends to U as j tends to infinity.

Proof

Recall from Corollary 6.9 that \(\Vert A_j\Vert \) and \(\Vert A_j-U\Vert \) are bounded. Note that

$$\begin{aligned} A_jU-UA_j= & {} ((A_j-U)U-\mu (A_j-U)) +(\mu (A_j-U)-(A_j-U)\mu )\\&-\,(U(A_j-U)-(A_j-U)\mu ). \end{aligned}$$

Therefore

$$\begin{aligned} \Vert A_jU-UA_j \Vert \le (2N_{U,\mu }+N_\mu )\Vert A_j-U\Vert . \end{aligned}$$

Hence

$$\begin{aligned} \Vert A_jUA_j^*-UA_jA_j^* \Vert= & {} \Vert (A_jU-UA_j)(A_j^*-U^*)+ (A_jU-UA_j)U^*\Vert \\\le & {} \Vert A_jU-UA_j\Vert (\Vert A_j-U\Vert +1) \\\le & {} (2N_{U,\mu }+N_\mu )\Vert A_j-U\Vert (\Vert A_j-U\Vert +1). \end{aligned}$$

From (58) we have

$$\begin{aligned} (2N_{U,\mu }+N_\mu )(\Vert A_j-U\Vert +1) \le (2N_{U,\mu }+N_\mu )\left( 2+\sqrt{1+2\Vert \alpha _j\Vert \,\Vert \beta _j\Vert } \right) . \end{aligned}$$

Since \(2N_{U,\mu }+N_\mu <1/4\) by (57), \(\Vert \beta _j\Vert \) is bounded and \(\Vert \alpha _j\Vert \) tends to zero, we can find J so that for all \(j\ge J\) we have

$$\begin{aligned} \Vert A_jUA_j^*-UA_jA_j^* \Vert < \frac{2+\sqrt{2}}{4} \Vert A_j-U\Vert . \end{aligned}$$

Noting that \(U=U\alpha _j \beta _j^*+UA_jA_j^*+U\beta _j\alpha _j^* \), we use (24) to find that

$$\begin{aligned} A_{j+1}-U= & {} A_jUA_j^*-\sqrt{2}\alpha _j\tau ^*\mu A_j^* +\sqrt{2}A_j\tau \mu \alpha _j^*-\alpha _j(\Vert \tau \Vert ^2-t)\mu \alpha _j^*\\&+\,\alpha _j\mu \beta _j^*+\beta _j\mu \alpha _j^* -UA_jA_j^*-U\alpha _j\beta _j^*-U\beta _j\alpha _j^* \\= & {} A_jUA_j^*-UA_jA_j^* -\sqrt{2}\alpha _j\tau ^*\mu (A_j^*-U^*) +\sqrt{2}(A_j-U)\tau \mu \alpha _j^*\\&-\,\alpha _j(\Vert \tau \Vert ^2-t)\mu \alpha _j^* -\sqrt{2}\alpha _j\tau ^*+\sqrt{2}U\tau \mu \alpha _j^*\\&-\,(U\alpha _j-\alpha _j\mu )\beta _j^*-(U\beta _j-\beta _j\mu )\alpha _j^*. \end{aligned}$$

Note, we have used \(\tau ^*U=\tau ^*\mu \). Thus for \(j\ge J\),

$$\begin{aligned} \Vert A_{j+1}-U\Vert\le & {} \Vert A_jUA_j^*-UA_jA_j^*\Vert +2\sqrt{2}\Vert A_j-U\Vert \, \Vert \alpha _j\Vert \, \Vert \tau \Vert +|\Vert \tau \Vert ^2\\&-\,t\bigr |\,\Vert \alpha _j\Vert ^2 +2\sqrt{2}\Vert \tau \Vert \,\Vert \alpha _j\Vert +2N_{U,\mu }\Vert \alpha _j\Vert \,\Vert \beta _j\Vert \\< & {} \frac{2+\sqrt{2}}{4}\Vert A_j-U\Vert +|\Vert \tau \Vert ^2-t|\,\Vert \alpha _jc_j^{-1/2}\Vert ^2|c_j| \\&+\, (2\sqrt{2}\Vert A_j-U\Vert \,\Vert \tau \Vert +2\sqrt{2}\Vert \tau \Vert +2N_{U,\mu }\Vert \beta _j\Vert )\Vert \alpha _jc_j^{-1/2}\Vert \,|c_j|^{1/2}. \end{aligned}$$

Suppose that J is large enough that for \(j\ge J\) we have \(|c_j|\le |c_J|/2^{j-J}\). Now apply Lemma 6.1 with \(\lambda _1=(2+\sqrt{2})/4\) and \(\lambda _2=1/\sqrt{2}\), and so \(\Vert A_j-U\Vert \) tends to zero as j tends to infinity. \(\square \)
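
The choice of the constant \((2+\sqrt{2})/4\) in the proof of Proposition 6.11 can be checked as follows. As a supplementary remark, since \(\Vert \alpha _j\Vert \) tends to zero and \(\Vert \beta _j\Vert \) is bounded, we have \(\sqrt{1+2\Vert \alpha _j\Vert \,\Vert \beta _j\Vert }\rightarrow 1\), and so by (57)

$$\begin{aligned} (2N_{U,\mu }+N_\mu )\left( 2+\sqrt{1+2\Vert \alpha _j\Vert \,\Vert \beta _j\Vert }\right) \longrightarrow 3(2N_{U,\mu }+N_\mu )<\frac{3}{4}<\frac{2+\sqrt{2}}{4}. \end{aligned}$$

Hence the left-hand side is less than \((2+\sqrt{2})/4\approx 0.854\) for all sufficiently large j, and this constant is less than 1, as Lemma 6.1 requires of \(\lambda _1\).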

6.6 Convergence of \(\beta _j\) and \(\gamma _j\)

We are now ready to show convergence of \(\beta _j\) and \(\gamma _j\).

Proposition 6.12

Suppose that (53) holds. Then \(\beta _j\) and \(\gamma _j\) tend to \(\sqrt{2}\tau \mu \) and \(-\sqrt{2}\overline{\mu }\tau \), respectively, as j tends to infinity.

Proof

Using \(U\beta _j\overline{a}_j+UA_j\gamma _j+U\alpha _j\overline{b}_j=0\), which follows from (14), we have

$$\begin{aligned}&\beta _{j+1} -\sqrt{2}\tau \mu \\&\quad = A_jU\gamma _j-\sqrt{2}\alpha _j\tau ^*\mu \gamma _j +\sqrt{2}A_j\tau \mu \overline{a}_j -\alpha _j(\Vert \tau \Vert ^2-t)\mu \overline{a}_j\\&\qquad +\,\alpha _j\mu \overline{b}_j+\beta _j\mu \overline{a}_j-\sqrt{2}\tau \mu \\&\quad = A_jU\gamma _j-\sqrt{2}\alpha _j\tau ^*\mu \gamma _j +\sqrt{2}A_j\tau \mu \overline{a}_j -\alpha _j(\Vert \tau \Vert ^2-t)\mu \overline{a}_j\\&\qquad +\,\alpha _j\mu \overline{b}_j+\beta _j\mu \overline{a}_j-\sqrt{2}\tau \mu \\&\qquad -\,UA_j\gamma _j-U\alpha _j\overline{b}_j-U\beta _j\overline{a}_j \\&\quad = (A_jU-UA_j)\gamma _j-\sqrt{2}\alpha _j\tau ^*\mu \gamma _j -\alpha _j(\Vert \tau \Vert ^2-t)\mu \overline{a}_j -(U\alpha _j-\alpha _j\mu )\overline{b}_j \\&\qquad +\,\sqrt{2}(A_j-U)\tau \mu \overline{a}_j -(U(\beta _j-\sqrt{2}\tau \mu )\\&\qquad -\,(\beta _j-\sqrt{2}\tau \mu )\mu ) \overline{a}_j+\sqrt{2}\tau \mu ^2(\overline{a}_j-\overline{\mu }). \end{aligned}$$

Therefore

$$\begin{aligned}&\Vert \beta _{j+1}-\sqrt{2}\tau \mu \Vert \\&\quad \le N_{U,\mu }|a_j| \, \Vert \beta _j-\sqrt{2}\tau \mu \Vert +(2\Vert \gamma _j\Vert +\sqrt{2}\Vert \tau \Vert \,|a_j|) \Vert A_j-U\Vert \\&\qquad +\,(\sqrt{2}\Vert \tau \Vert \,\Vert \gamma _j\Vert +|\Vert \tau \Vert ^2-t|\,|a_j| +N_{U,\mu }|b_j| ) \Vert \alpha _jc_j^{-1/2}\Vert \,|c_j|^{1/2}. \end{aligned}$$

Using Lemma 6.6, suppose j is large enough that \(|a_j|<4\) and so \(N_{U,\mu }|a_j|<4N_{U,\mu }\). Note that

$$\begin{aligned} 4N_{U,\mu }<2(3-2\sqrt{2+N_\mu })<2(\sqrt{2}-1)^2<1. \end{aligned}$$

Since \(|c_j|^{1/2}\) and \(\Vert A_j-U\Vert \) both decay geometrically (the former is bounded by a constant multiple of \(2^{-j/2}\), and the latter tends to zero at a geometric rate by the proof of Proposition 6.11), we can apply Lemma 6.1(ii) to show that \(\Vert \beta _j-\sqrt{2}\tau \mu \Vert \) tends to zero as j tends to infinity. A similar argument shows that \(\Vert \gamma _j+\sqrt{2}\overline{\mu }\tau \Vert \) tends to zero as j tends to infinity. This argument uses \(U^*\tau =\overline{\mu }\tau \). \(\square \)

6.7 Convergence of \(b_j\)

Finally, we show that \(b_j\) converges as j tends to infinity.

Proposition 6.13

Suppose that (53) holds. Then \(b_j\) tends to \(-(\Vert \tau \Vert ^2-t)\mu \) as j tends to infinity.

Proof

Note that if \(b_j\) tends to \(-(\Vert \tau \Vert ^2-t)\mu \) then \(\overline{b}_j\) tends to \(-\overline{\mu }(\Vert \tau \Vert ^2+t)\).

Using \(0=\gamma _j^*\gamma _j\mu +a_j\overline{b}_j\mu +b_j\overline{a}_j\mu \), we have

$$\begin{aligned}&b_{j+1}+(\Vert \tau \Vert ^2-t)\mu \\&\quad =\gamma _j^*U\gamma _j-\sqrt{2}a_j\tau ^*\mu \gamma _j+\sqrt{2} \gamma _j^*\tau \mu \overline{a}_j -a_j(\Vert \tau \Vert ^2-t)\mu \overline{a}_j +a_j\mu \overline{b}_j+b_j\mu \overline{a}_j \\&\qquad -\,\gamma _j^*\gamma _j\mu -a_j\overline{b}_j\mu -b_j\overline{a}_j\mu +(\Vert \tau \Vert ^2-t)\mu \\&\quad = \gamma _j^*U(\gamma _j+\sqrt{2}\overline{\mu }\tau ) -\gamma _j^*(\gamma _j+\sqrt{2}\overline{\mu }\tau )\mu +\sqrt{2}(\gamma _j^*+\sqrt{2}\tau ^*\mu )\overline{\mu }\tau \mu -2\Vert \tau \Vert ^2\mu \\&\qquad -\,\sqrt{2}a_j\tau ^*\mu (\gamma _j+\sqrt{2} \overline{\mu }\tau )+2a_j\Vert \tau \Vert ^2 +\sqrt{2}\gamma _j^*\tau \mu (\overline{a}_j-\overline{\mu })\\&\qquad -\,a_j(\Vert \tau \Vert ^2-t)\mu (\overline{a}_j-\overline{\mu }) -a_j(\Vert \tau \Vert ^2-t)\\&\qquad +\,a_j\mu (\overline{b}_j+\overline{\mu }(\Vert \tau \Vert ^2+t)) -a_j(\Vert \tau \Vert ^2+t) +b_j\mu (\overline{a}_j-\overline{\mu })\\&\qquad -\,a_j(\overline{b}_j+\overline{\mu }(\Vert \tau \Vert ^2+t))\mu +a_j\overline{\mu }(\Vert \tau \Vert ^2+t)\mu -b_j(\overline{a}_j-\overline{\mu })\mu +(\Vert \tau \Vert ^2-t)\mu \\&\quad = \gamma _j^*(U(\gamma _j+\sqrt{2}\overline{\mu }\tau ) -(\gamma _j+\sqrt{2}\overline{\mu }\tau )\mu )\\&\qquad +\,\sqrt{2}(\gamma _j^*+\sqrt{2}\tau ^*\mu )\overline{\mu }\tau \mu -\sqrt{2}a_j\tau ^*\mu (\gamma _j+\sqrt{2}\overline{\mu }\tau ) \\&\qquad +\,\sqrt{2}\gamma _j^*\tau \mu (\overline{a}_j-\overline{\mu }) -a_j(\Vert \tau \Vert ^2-t)\mu (\overline{a}_j-\overline{\mu }) +b_j(\mu (\overline{a}_j-\overline{\mu }) -(\overline{a}_j-\overline{\mu })\mu ) \\&\qquad +\,(a_j-\mu )\overline{\mu }(\Vert \tau \Vert ^2+t)\mu +a_j( \mu (\overline{b}_j+\overline{\mu }(\Vert \tau \Vert ^2+t)) -(\overline{b}_j+\overline{\mu }(\Vert \tau \Vert ^2+t))\mu ). \end{aligned}$$

Therefore

$$\begin{aligned} |b_{j+1}+(\Vert \tau \Vert ^2-t)\mu |\le & {} (N_{U,\mu }\Vert \gamma _j\Vert +\sqrt{2}\Vert \tau \Vert (|a_j|+1)) \Vert \gamma _j+\sqrt{2}\overline{\mu }\tau \Vert \\&+\,(\sqrt{2}\Vert \gamma _j\Vert \,\Vert \tau \Vert +|\Vert \tau \Vert ^2-t|(|a_j|+1)+N_\mu |b_j|)|a_j-\mu | \\&+\, N_\mu |a_j|\,|b_j+(\Vert \tau \Vert ^2-t)\mu |. \end{aligned}$$

We can take j large enough that \(N_\mu |a_j|<4N_\mu <1\). Also, by the proofs of Propositions 6.10 and 6.12, we know that \(\Vert \gamma _j+\sqrt{2}\overline{\mu }\tau \Vert \) and \(|a_j-\mu |\) decay geometrically, that is, they are bounded by constant multiples of \(\lambda ^{j-J}\) for some \(0<\lambda <1\). Therefore, we can apply Lemma 6.1(ii) to conclude that \(|b_j+(\Vert \tau \Vert ^2-t)\mu |\) tends to zero. \(\square \)

Propositions 6.2 to 6.13 imply that \(S_j\) tends to T as j tends to infinity, which completes the proof of Theorem 1.1.