1 Introduction

A Jacobi operator is a selfadjoint operator on \(\ell ^2 = \ell ^2(\{0,1,2,\ldots \})\), which with respect to the standard orthonormal basis \(\{e_0, e_1,e_2, \ldots \}\) has a tridiagonal matrix representation,

$$\begin{aligned} J = \left( \begin{array}{cccc} \alpha _0 &{} \beta _0 &{} &{} \\ \beta _0 &{} \alpha _1 &{} \beta _1 &{} \\ &{} \beta _1 &{} \alpha _2 &{} \ddots \\ &{} &{} \ddots &{} \ddots \end{array} \right) , \end{aligned}$$
(1.1)

where \(\alpha _k\) and \(\beta _k\) are real numbers with \(\beta _k > 0\). In the special case where \(\{\beta _k\}_{k=0}^\infty \) is a constant sequence, this operator can be interpreted as a discrete Schrödinger operator on the half line, and its spectral theory is arguably its most important aspect.

The spectral theorem for Jacobi operators guarantees the existence of a probability measure \(\mu \) supported on the spectrum \(\sigma (J) \subset {\mathbb {R}}\), called the spectral measure, and a unitary operator \(U:\ell ^2 \rightarrow L^2_\mu ({\mathbb {R}})\) such that

$$\begin{aligned} U J U^* [f](s) = sf(s), \end{aligned}$$
(1.2)

for all \(f \in L^2_\mu ({\mathbb {R}})\) [16]. The coefficients \(\{\alpha _k\}_{k\ge 0}\) and \(\{\beta _k\}_{k\ge 0}\) are the three–term recurrence coefficients of the orthonormal polynomials \(\{P_k\}_{k\ge 0}\) with respect to \(\mu \), which are given by \(P_k = Ue_k\).

Suppose we have a second Jacobi operator, D, for which the spectral theory is known analytically. The point of this paper is to show that for certain classes of Jacobi operators J and D, the computation and theoretical study of the spectrum and spectral measure of J can be conducted effectively using the connection coefficient matrix between J and D, combined with known properties of D.

Definition 1.1

The connection coefficient matrix \(C=C_{J\rightarrow D} = (c_{ij})_{i,j = 0}^\infty \) is the upper triangular matrix representing the change of basis between the orthonormal polynomials \((P_k)_{k=0}^\infty \) of J, and the orthonormal polynomials \((Q_k)_{k=0}^\infty \) of D, whose entries satisfy,

$$\begin{aligned} P_k(s) = c_{0k}Q_0(s) + c_{1k}Q_1(s) + \cdots + c_{kk} Q_k(s). \end{aligned}$$
(1.3)

We pay particular attention to the case where D is the so-called free Jacobi operator,

$$\begin{aligned} \Delta = \left( \begin{array}{cccc} 0 &{} \frac{1}{2} &{} &{} \\ \frac{1}{2} &{} 0 &{} \frac{1}{2} &{} \\ &{} \frac{1}{2} &{} 0 &{} \ddots \\ &{} &{} \ddots &{} \ddots \end{array} \right) , \end{aligned}$$
(1.4)

and J is a Jacobi operator of the form \(J = \Delta + K\), where K is compact. In the discrete Schrödinger operator setting, K is a diagonal potential function which decays to zero at infinity [41]. Another reason this class of operators is well studied is that the Jacobi operators for the classical Jacobi polynomials are of this form [36]. Since scaling and shifting by the identity operator affect the spectrum in a trivial way, results about these Jacobi operators J apply to all Jacobi operators which are compact perturbations of a Toeplitz operator (Toeplitz-plus-compact).

The spectral theory of Toeplitz operators such as \(\Delta \) is known explicitly [7]. The spectral measure of \(\Delta \) is the semi-circle \(\mathrm {d}\mu _\Delta (s) = \frac{2}{\pi }(1-s^2)^{\frac{1}{2}}\,\mathrm {d}s\) (restricted to \([-1,1]\)), and the orthonormal polynomials are the Chebyshev polynomials of the second kind, \(U_k(s)\). We prove the following new results.

If J is a Jacobi operator which is a finite rank perturbation of \(\Delta \) (Toeplitz-plus-finite-rank), i.e. there exists n such that

$$\begin{aligned} \alpha _k = 0, \quad \beta _{k-1} = \frac{1}{2} \text { for all } k \ge n, \end{aligned}$$
(1.5)
  • Theorem 4.8: The connection coefficient matrix \(C_{J\rightarrow \Delta }\) can be decomposed into \(C_{\mathrm{Toe}} + C_{\mathrm{fin}}\) where \(C_{\mathrm{Toe}}\) is Toeplitz, upper triangular and has bandwidth \(2n-1\), and the entries of \(C_{\mathrm{fin}}\) are zero outside the \((n-1) \times (2n-2)\) principal submatrix.

  • Theorem 4.22: Let c be the Toeplitz symbol of \(C_{\mathrm{Toe}}\). It is a degree \(2n-1\) polynomial with \(r \le n\) roots inside the complex open unit disc \({\mathbb {D}}\), all of which are real and simple. The spectrum of J is

    $$\begin{aligned} \sigma (J) = [-1,1] \cup \left\{ \lambda (z_k) : c(z_k) = 0, z_k \in {\mathbb {D}}\right\} , \end{aligned}$$
    (1.6)

    where \(\lambda (z) := \frac{1}{2}(z+z^{-1}) : {\mathbb {D}} \rightarrow {\mathbb {C}}\setminus [-1,1]\) is the Joukowski transformation. The spectral measure of J is given by the formula

    $$\begin{aligned} \mathrm {d}\mu (s) = \frac{\mathrm {d}\mu _\Delta (s)}{|c(e^{i\theta })|^2} + \sum _{k=1}^r \frac{(z_k-z_k^{-1})^2}{z_kc'(z_k)c(z_k^{-1})}\mathrm {d}\delta _{\lambda (z_k)}(s), \end{aligned}$$
    (1.7)

    where \(\cos (\theta ) = s\). The denominator in the first term can be expressed using the polynomial, \(|c(e^{i\theta })|^2= \sum _{k=0}^{2n-1} \langle C^Te_k ,C^T e_0 \rangle U_k(s)\).
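
The formulae above are directly computational. As a simple illustration, here is a minimal Julia sketch of our own (it is not the interface of the SpectralMeasures.jl package mentioned in Appendix A, and the names `discrete_part`, `csym` and `dcsym` are ours): given the coefficients \(t_0,\ldots ,t_{2n-1}\) of the Toeplitz symbol c, it recovers the discrete eigenvalues via (1.6) and the Dirac weights via (1.7).

```julia
using LinearAlgebra

# Sketch: discrete part of the spectral measure from the Toeplitz symbol.
# t = [t0, t1, ..., t_{2n-1}] are the coefficients of c(z) = sum_k t_k z^k.
csym(z, t) = evalpoly(z, t)                                    # c(z)
dcsym(z, t) = evalpoly(z, [k*t[k+1] for k in 1:length(t)-1])   # c'(z)

function discrete_part(t)
    m = length(t) - 1
    A = zeros(ComplexF64, m, m)                # companion matrix of c
    A[2:m, 1:m-1] = Matrix(I, m-1, m-1)
    A[:, m] = -t[1:m] ./ t[end]
    zs = filter(z -> abs(z) < 1, eigvals(A))   # roots z_k in the unit disc
    λs = [(z + 1/z)/2 for z in zs]             # eigenvalues λ(z_k); real and simple by Theorem 4.22
    ws = [(z - 1/z)^2 / (z * dcsym(z, t) * csym(1/z, t)) for z in zs]
    return λs, ws
end

# Example: c(z) = 1 - 2z (a rank-one perturbation of Δ, cf. Sect. 4.2)
# gives one eigenvalue λ = 1.25 with Dirac weight 0.75.
discrete_part([1.0, -2.0])
```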

Fig. 1

These are the spectral measures of three different Jacobi operators; each differs from \(\Delta \) in only one entry. The left plot shows the spectral measure of the Jacobi operator which is \(\Delta \) except that the (0, 0) entry is 1; in the middle plot it is the (1, 1) entry which is 1, and in the right plot the (4, 4) entry. Each operator can be interpreted as a discrete Schrödinger operator with a single Dirac potential at a different point along \([0,\infty )\). The continuous parts of the measures are given exactly by the computable formula in equation (1.7), and each has a single Dirac delta corresponding to discrete spectrum (the weight of the delta gets progressively smaller in each plot), the location of which can be computed with guaranteed error using interval arithmetic (see Appendix A).

For \(R > 0 \), define the Banach space \(\ell ^1_R\) to be the space of scalar sequences \(v\) such that \(\sum _{k=0}^\infty |v_k| R^k < \infty \). If J is a trace class perturbation of \(\Delta \) (Toeplitz-plus-trace-class), i.e.,

$$\begin{aligned} \sum _{k=0}^\infty \left| \alpha _k\right| + \left| \beta _k-\frac{1}{2}\right| < \infty , \end{aligned}$$
(1.8)
  • Theorem 5.11: \(C = C_{J\rightarrow \Delta }\) is bounded as an operator from \(\ell ^1_{R}\) into itself, for all \(R > 1\). Further, we have the decomposition \(C = C_{\mathrm{Toe}} + C_K\) where \(C_{\mathrm{Toe}}\) is upper triangular Toeplitz and \(C_K\) is compact as an operator from \(\ell ^1_{R}\) into itself for all \(R > 1\).

  • Theorem 5.13 and Theorem 5.15: The Toeplitz symbol c of \(C_{\mathrm{Toe}}\) is analytic in the complex unit disc. The discrete eigenvalues, as in the Toeplitz-plus-finite-rank case, are of the form \(\frac{1}{2}(z_k+z_k^{-1})\) where \(z_k\) are the roots of c in the open unit disc.

The relevance of the space \(\ell ^1_R\) here is that for an upper triangular Toeplitz matrix which is bounded as an operator from \(\ell ^1_R\) to itself for all \(R>1\), the symbol of that operator is analytic in the open unit disc.

Following the pioneering work of Ben-Artzi–Colbrook–Hansen–Nevanlinna–Seidel on the Solvability Complexity Index [4, 5, 8, 9, 28], we prove two theorems about computability. We assume real number arithmetic, and the results do not necessarily apply to algorithms using floating point arithmetic.

  • Theorem 6.8: If J is a Toeplitz-plus-finite-rank Jacobi operator, then in a finite number of operations, the absolutely continuous part of the spectral measure is computable exactly, and the locations and weights of the discrete part of the spectral measure are computable to any desired accuracy. If the rank of the perturbation is known a priori then the algorithm can be designed to terminate with guaranteed error control.

  • Theorem 6.10: If \(J=\Delta +K\) is a Toeplitz-plus-compact Jacobi operator, then in a finite number of operations, the spectrum of J is computable to any desired accuracy in the Hausdorff metric on subsets of \({\mathbb {R}}\). If the quantity \(\sup _{k \ge m} |\alpha _k| + \sup _{k\ge m} |\beta _k-\frac{1}{2}|\) can be estimated for all m, then the algorithm can be designed to terminate with guaranteed error control.

The present authors consider these results to be the beginning of a long term project on the computation of spectra of structured operators. Directions for future research are outlined in Sect. 7.

1.1 Relation to existing work on connection coefficients

Recall that J and D are Jacobi operators with orthonormal polynomials \(\{P_k\}_{k=0}^\infty \) and \(\{Q_k\}_{k=0}^\infty \) respectively, and spectral measures \(\mu \) and \(\nu \) respectively. Uvarov gave expressions for the relation between \(\{P_k\}_{k=0}^\infty \) and \(\{Q_k\}_{k=0}^\infty \) in the case where the Radon–Nikodym derivative \(\frac{\mathrm {d}\nu }{\mathrm {d}\mu }\) is a rational function, utilising connection coefficients [47, 48]. Decades later, Kautsky and Golub related properties of J, D and the connection coefficients matrix \(C = C_{J\rightarrow D}\) to the Radon–Nikodym derivative \(\frac{\mathrm {d}\nu }{\mathrm {d}\mu }\), with applications to practical computation of Gauss quadrature rules in mind [32]. In the setting of Gauss quadrature rules, the connection coefficients are usually known as modified or mixed moments (see [23, 40, 55]). The following results, which we prove in Sect. 3 for completeness, are straightforward generalisations of what can be found in the papers cited in this paragraph from the 1960s, 1970s and 1980s.

The Jacobi operators and the connection coefficients satisfy

$$\begin{aligned} CJ = DC, \end{aligned}$$
(1.9)

which makes sense as operators acting on finitely supported sequences (this is made clear and proved in Theorem 3.4). A finite-dimensional version of this result with a rank-one remainder term first appears in [32, Lem. 1]. The connection coefficients matrix also determines the existence and certain properties of the Radon–Nikodym derivative \(\frac{\mathrm {d}\nu }{\mathrm {d}\mu }\):

  • Proposition 3.6: \(\frac{\mathrm {d}\nu }{\mathrm {d}\mu } \in L^2_\mu ({\mathbb {R}})\) if and only if the first row of C is an \(\ell ^2\) sequence, in which case

    $$\begin{aligned} \frac{\mathrm {d}\nu }{\mathrm {d}\mu } = \sum _{k=0}^\infty c_{0,k} P_k. \end{aligned}$$
    (1.10)
  • Proposition 3.9: If \(\frac{\mathrm {d}\nu }{\mathrm {d}\mu } \in L^\infty _\mu ({\mathbb {R}})\) then C is a bounded operator on \(\ell ^2\) and

    $$\begin{aligned} \Vert C\Vert _2^2 = \mathop {\mu {\text {-ess}}\,{\text {sup}}}\limits _{s \in \sigma (J)} \left| \frac{\mathrm {d}\nu }{\mathrm {d}\mu }(s)\right| . \end{aligned}$$
    (1.11)
  • Corollary 3.10: If both \(\frac{\mathrm {d}\nu }{\mathrm {d}\mu } \in L^\infty _\mu ({\mathbb {R}})\) and \(\frac{\mathrm {d}\mu }{\mathrm {d}\nu } \in L^\infty _\nu ({\mathbb {R}})\) then C is bounded and invertible on \(\ell ^2\).

1.2 Relation to existing work on spectra of Jacobi operators

There has been extensive work in the last 20 years or so on the spectral theory of Jacobi operators which are compact perturbations of the free Jacobi operator, particularly with applications to quantum theory and random matrix theory in mind [14, 15, 20, 22, 33, 42]. The literature focuses on so-called sum rules, which are certain remarkable relationships between the spectral measures and the entries of Jacobi operators, and builds upon some 20th century developments in the Szegő theory of orthogonal polynomials [18, 25, 30, 35, 44, 49–52].

For a Jacobi operator J, which is a compact perturbation of the free Jacobi operator \(\Delta \), with orthogonal polynomials \(\{P_k\}_{k=0}^\infty \) and spectral measure \(\mathrm {d}\mu \), the central analytical object of interest is the function

$$\begin{aligned} u_0(z) = \lim _{k\rightarrow \infty } (1-z^2) z^k P_k\left( \frac{1}{2}\left( z+z^{-1}\right) \right) , \qquad z \in {\mathbb {D}}. \end{aligned}$$
(1.12)

Conditions on the perturbation \(J-\Delta \) yield a nicer \(u_0\). For instance, when \(J-\Delta \) is trace class, \(u_0\) is analytic and its roots in the unit disc are in one-to-one correspondence with the eigenvalues of J, and when \(J - \Delta \) is of finite rank, \(u_0\) is a polynomial [33].

The function \(u_0\) defined as in (1.12) can be arrived at through several different definitions: the Jost functions, the perturbation determinant, the Szegő function and the Geronimo–Case functions (see the introduction of [33]). One contribution of the present paper is that it provides a new interpretation of \(u_0\): the Toeplitz symbol c(z). Furthermore, this new interpretation also has a matrix associated to it, the connection coefficients matrix C, which is defined for any two Jacobi operators, not only perturbations of the free Jacobi operator. This opens the door to generalisation, something the present authors intend to pursue in the future.

Very recently, Colbrook, Roman, and Hansen have introduced techniques for computing spectra with error control [10] which work on quite broad classes of operators. Colbrook [11] extended these techniques to computing spectral measures of operators, including Jacobi operators as a special case. While this work applies to broader classes of operators than ours, it gives less accurate results for the class of Jacobi operators we consider, and in particular does not produce explicit formulae such as Theorem 4.22 or as precise results as Theorem 6.10. In particular, our assumptions on the structure of the operator are sufficient to produce better results in terms of the SCI hierarchy: we can compute the discrete spectra and spectral measures of trace-class perturbations or compact perturbations with known decay with error control (\(\Delta _1\) in the notation of [11]), the spectral measure of Jacobi operators that are finite rank perturbations of \(\Delta \) in finitely many operations (\(\Delta _0\)), and the absolutely continuous spectrum is always \([-1,1]\). This compares favourably to [11], which proves \(\Delta _2\) classification results (one limit, with no error control) for the spectral measures, projections, functional calculus and Radon–Nikodym derivatives of a larger class of operators. Furthermore, in the general setting of [11] the locations of the absolutely continuous and pure point spectra are no longer known a priori, as they are for our class of operators. This makes the computation of spectral decompositions considerably more difficult and places it higher up in the SCI hierarchy.

1.3 Outline of the paper

  • In Sect. 2 we outline basic, established results about spectral theory of Jacobi operators.

  • In Sect. 3 we discuss the basic properties of the connection coefficients matrix \(C_{J \rightarrow D}\) for general Jacobi operators J and D, and how they relate to the spectra of J and D.

  • In Sect. 4 we show how connection coefficient matrices apply to Toeplitz-plus-finite-rank Jacobi operators, and in Sect. 5 we extend these results to the Toeplitz-plus-trace-class case.

  • Section 6 is devoted to issues of computability.

  • Appendix A gives an array of numerical examples produced using an open source Julia package SpectralMeasures.jl [53] that implements the ideas of this paper. It makes extensive use of the open source Julia package ApproxFun.jl [37, 38], in particular the features for defining and manipulating functions and infinite-dimensional operators.

2 Spectral Theory of Jacobi Operators

In this section we present well known results about the spectra of Jacobi operators. This gives a self-contained account of what is required to prove the results later in the paper, and sets the notation.

Definition 2.1

Define the principal resolvent function for \(\lambda \in {\mathbb {C}}\setminus \sigma (J)\),

$$\begin{aligned} G(\lambda ) = \langle e_0, (J-\lambda )^{-1} e_0\rangle \end{aligned}$$
(2.1)

where

$$ \langle x,y \rangle := \sum _{k=0}^\infty {{\bar{x}}}_k y_k. $$

Theorem 2.2

([16, 41, 45]). Let J be a bounded Jacobi operator.

  1. (i)

    There exists a unique compactly supported probability measure \(\mu \) on \({\mathbb {R}}\), called the spectral measure of J, such that

    $$\begin{aligned} G(\lambda ) = \int (s-\lambda )^{-1} \,\mathrm {d}\mu (s). \end{aligned}$$
    (2.2)
  2. (ii)

    For any \(s_1 < s_2\) in \({\mathbb {R}}\),

    $$\begin{aligned} \frac{1}{2} \mu (\{s_1\}) + \mu ((s_1,s_2))+\frac{1}{2}\mu (\{s_2\}) = \lim _{\epsilon \searrow 0} \frac{1}{\pi } \int _{s_1}^{s_2} \mathrm {Im}\,G(s+i\epsilon ) \, \mathrm {d}s. \end{aligned}$$
    (2.3)
  3. (iii)

    The spectrum of J is

    $$\begin{aligned} \sigma (J) = \mathrm {supp}(\mu ) = \overline{\{ s \in {\mathbb {R}}: \liminf _{\epsilon \searrow 0} \mathrm {Im}\,G(s+i\epsilon ) > 0 \}}. \end{aligned}$$
    (2.4)

    The point spectrum \(\sigma _p(J)\) of J is the set of points \(s \in {\mathbb {R}}\) such that the limit

    $$\begin{aligned} \mu (\{s\}) = \lim _{\epsilon \searrow 0} \frac{\epsilon }{i} G(s+i\epsilon ) \end{aligned}$$
    (2.5)

    exists and is positive. The continuous spectrum of J is the set of points \(s \in {\mathbb {R}}\) such that \(\mu (\{s\}) = 0\) but

    $$\begin{aligned} \liminf _{\epsilon \searrow 0} \mathrm {Im}\,G(s + i\epsilon ) > 0. \end{aligned}$$
    (2.6)

The measure \(\mu \) is the spectral measure that appears in the spectral theorem for self-adjoint operators on Hilbert space [21], as demonstrated by the following definition and theorem [16, 41, 45].

Definition 2.3

The orthonormal polynomials for J are \(P_0, P_1, P_2, \ldots \) defined by the three term recurrence

$$\begin{aligned} sP_k(s)&= \beta _{k-1} P_{k-1}(s) + \alpha _k P_k(s) + \beta _k P_{k+1}(s), \end{aligned}$$
(2.7)
$$\begin{aligned} P_{-1}(s)&= 0, \quad P_0(s) = 1. \end{aligned}$$
(2.8)
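
The recurrence can be used directly to evaluate the orthonormal polynomials. The following minimal Julia sketch is our own illustration (not taken from the paper or its software); the paper's 0-based coefficients are stored 1-based, so \(\alpha _k\) is α[k+1], and the coefficient vectors are assumed to have length at least N.

```julia
# Evaluate P_0(s), ..., P_N(s) via the three-term recurrence (2.7)-(2.8).
function opvals(s, α, β, N)
    P = zeros(N + 1)
    P[1] = 1.0                             # P_0(s) = 1
    P[2] = (s - α[1]) / β[1]               # P_1(s)
    for k in 1:N-1                         # (2.7) rearranged for P_{k+1}
        P[k+2] = ((s - α[k+1]) * P[k+1] - β[k] * P[k]) / β[k+1]
    end
    return P
end

# For the free Jacobi operator (α_k = 0, β_k = 1/2) this reproduces the
# Chebyshev polynomials of the second kind, e.g. P_2(s) = 4s^2 - 1 = U_2(s).
opvals(0.3, zeros(4), fill(0.5, 4), 3)
```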

Theorem 2.4

([16]). Let J be a bounded Jacobi operator and let \(P_0, P_1, P_2, \ldots \) be as defined in Definition 2.3. Then we have the following.

  1. (i)

    The polynomials are such that \(P_k(J)e_0 = e_k\).

  2. (ii)

    The polynomials are orthonormal with respect to the spectral measure of J,

    $$\begin{aligned} \int P_j(s)P_k(s)\,\mathrm {d}\mu (s) = \delta _{jk}. \end{aligned}$$
    (2.9)
  3. (iii)

    Define the unitary operator \(U : \ell ^2 \rightarrow L^2_\mu ({\mathbb {R}})\) such that \(Ue_k = P_k\). Then for all \(f \in L^2_\mu ({\mathbb {R}})\),

    $$\begin{aligned} UJU^*[f](s) = sf(s). \end{aligned}$$
    (2.10)
  4. (iv)

    For all polynomials f, the entries of f(J) are equal to,

    $$\begin{aligned} \langle e_i, f(J) e_j\rangle = \int f(s) P_i(s)P_j(s) \,\mathrm {d}\mu (s). \end{aligned}$$
    (2.11)

    For \(f \in L^1_\mu ({\mathbb {R}})\), this formula defines the matrix f(J).

The following definition is standard in orthogonal polynomial theory.

Definition 2.5

([24, 50]). The first associated polynomials for J are \(P^{\mu }_0,P^{\mu }_1,P^{\mu }_2,\ldots \) defined by the three term recurrence

$$\begin{aligned} \lambda P^{\mu }_k(\lambda )&= \beta _{k-1} P^{\mu }_{k-1}(\lambda ) + \alpha _k P^{\mu }_k(\lambda ) + \beta _k P^{\mu }_{k+1}(\lambda ), \end{aligned}$$
(2.12)
$$\begin{aligned} P^{\mu }_0(\lambda )&= 0, \quad P^{\mu }_1(\lambda ) = \beta _0^{-1}. \end{aligned}$$
(2.13)

The relevance of the first associated polynomials for this work is the following integral formula.

Lemma 2.6

([24, pp. 17,18]) The first associated polynomials are given by the integral formula

$$\begin{aligned} P^{\mu }_k(\lambda ) = \int \frac{P_k(s) - P_k(\lambda )}{s-\lambda } \,\mathrm {d}\mu (s), \quad \lambda \in {\mathbb {C}}\setminus \sigma (J). \end{aligned}$$
(2.14)

For notational convenience we also define the \(\mu \)-derivative of a general polynomial.

Definition 2.7

Let \(\mu \) be a probability measure compactly supported on the real line and let f be a polynomial. The \(\mu \)-derivative of f is the polynomial defined by

$$\begin{aligned} f^{\mu }(\lambda ) = \int \frac{f(s)-f(\lambda )}{s-\lambda } \,\mathrm {d}\mu (s). \end{aligned}$$
(2.15)
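
For example, for the spectral measure \(\mu _\Delta \) of the free Jacobi operator (see Sect. 4.1) and \(f(s) = U_1(s) = 2s\), we have \(f^{\mu _\Delta }(\lambda ) = \int \frac{2s - 2\lambda }{s-\lambda } \,\mathrm {d}\mu _\Delta (s) = 2\); this is the \(k=1\) case of the relations (4.3) below.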

3 Connection Coefficient Matrices

In this section we give preliminary results to indicate the relevance of connection coefficient matrices to spectral theory of Jacobi operators.

3.1 Basic properties

As in the introduction, consider a second bounded Jacobi operator,

$$\begin{aligned} D = \left( \begin{array}{cccc} \gamma _0 &{} \delta _0 &{} &{} \\ \delta _0 &{} \gamma _1 &{} \delta _1 &{} \\ &{} \delta _1 &{} \gamma _2 &{} \ddots \\ &{} &{} \ddots &{} \ddots \end{array} \right) , \end{aligned}$$

with principal resolvent function H(z), spectral measure \(\nu \) and orthonormal polynomials denoted \(Q_0, Q_1, Q_2, \ldots \). In the introduction (Definition 1.1) we defined the connection coefficient matrix between J and D, \(C = C_{J\rightarrow D}\), to have entries satisfying

$$\begin{aligned} P_k(s) = c_{0k}Q_0(s) + c_{1k}Q_1(s) + \cdots + c_{kk} Q_k(s). \end{aligned}$$
(3.1)

Definition 3.1

Denote the space of complex-valued sequences with finitely many nonzero elements by \(\ell _{{\mathcal {F}}}\), and its algebraic dual, the space of all complex-valued sequences, by \(\ell _{{\mathcal {F}}}^\star \).

Note that \(C : \ell _{{\mathcal {F}}} \rightarrow \ell _{{\mathcal {F}}}\), because it is upper triangular, and \(C^T : \ell _{{\mathcal {F}}}^\star \rightarrow \ell _{{\mathcal {F}}}^\star \), because it is lower triangular, and thus we may write

$$\begin{aligned} \left( \begin{array}{c} P_0(s) \\ P_1(s) \\ P_2(s) \\ \vdots \end{array} \right) = C^T \left( \begin{array}{c} Q_0(s) \\ Q_1(s) \\ Q_2(s) \\ \vdots \end{array} \right) \text { for all } s \in {\mathbb {C}}. \end{aligned}$$

By orthonormality of the polynomial sequences the entries can also be interpreted as

$$\begin{aligned} c_{ij} = \langle Q_i, P_j \rangle _\nu = \int Q_i(s) P_j(s) \,\mathrm {d}\nu (s), \end{aligned}$$
(3.2)

where \(\langle \cdot , \cdot \rangle _\nu \) is the standard inner product on \(L^2_\nu ({\mathbb {R}})\).

A recurrence relationship for the connection coefficients matrix was discovered by Sack and Donovan [40] and independently by Wheeler [55], in the context of Gauss quadrature formulae.

Lemma 3.2

[40, 55] The entries of the connection coefficients matrix \(C_{J \rightarrow D}\) satisfy the following 5-point discrete system:

$$\begin{aligned} -\delta _{i-1}c_{i-1,j} + \beta _{j-1}c_{i,j-1} + (\alpha _j-\gamma _i)c_{ij} + \beta _{j}c_{i,j+1} -\delta _{i}c_{i+1,j} = 0, \text { for all } 0 \le i < j, \end{aligned}$$

with boundary conditions

$$\begin{aligned} c_{ij} = {\left\{ \begin{array}{ll} 1 &{}\text { if } i=j=0, \\ 0 &{}\text { if } j = 0 \text { and } i\ne 0, \\ 0 &{}\text { if } j = -1 \text { or } i = -1. \end{array}\right. } \end{aligned}$$

Proof

Assume by convention that \(c_{ij} = 0\) if \(i = -1\) or \(j=-1\). Now using this boundary condition and the three term recurrences for the polynomial sequences, we see that

$$\begin{aligned} \langle Q_i(s),sP_j(s) \rangle _\nu&= \beta _{j-1} \langle Q_i,P_{j-1}\rangle _\nu + \alpha _j \langle Q_i,P_j \rangle _\nu + \beta _j \langle Q_i,P_{j+1} \rangle _\nu \\&= \beta _{j-1} c_{i,j-1} + \alpha _j c_{ij} + \beta _j c_{i,j+1}, \end{aligned}$$

and

$$\begin{aligned} \langle sQ_i(s),P_j(s) \rangle _\nu&= \delta _{i-1} \langle Q_{i-1},P_j\rangle _\nu + \gamma _i \langle Q_i,P_j \rangle _\nu + \delta _i \langle Q_{i+1},P_j \rangle _\nu \\&= \delta _{i-1} c_{i-1,j} + \gamma _i c_{ij} + \delta _i c_{i+1,j}. \end{aligned}$$

Since \(\langle sQ_i(s),P_j(s) \rangle _\nu = \langle Q_i(s),sP_j(s) \rangle _\nu \), we have the result for the interior points \(0 \le i < j\).

The remaining boundary conditions come from \(c_{i0} = \langle Q_i, P_0 \rangle _\nu \) which equals 1 if \(i=0\) and 0 otherwise. \(\square \)

The 5-point recurrence formula can be restated as infinite-vector-valued three-term recurrence relations for rows and columns of C.

Corollary 3.3

The columns of C satisfy

$$\begin{aligned} c_{*,0}&= e_0 \\ Dc_{*,0}&= \alpha _0 c_{*,0} + \beta _0 c_{*,1} \\ Dc_{*,j}&= \beta _{j-1}c_{*,j-1} + \alpha _j c_{*,j} + \beta _j c_{*,j+1}. \end{aligned}$$

Consequently the jth column can be written \(c_{*,j} = P_j(D)e_0\).

The rows of C satisfy

$$\begin{aligned} c_{0,*}J&= \gamma _0 c_{0,*} + \delta _0 c_{1,*}, \\ c_{i,*}J&= \delta _{i-1} c_{i-1,*} + \gamma _i c_{i,*} + \delta _i c_{i+1,*}. \end{aligned}$$

Consequently, the ith row can be written \(c_{i,*} = c_{0,*}Q_i(J)\).

Proof

The 5-point discrete system described in Lemma 3.2 can be used to find an explicit linear recurrence to compute the entries of C,

$$\begin{aligned} c_{0,0}&= 1 \end{aligned}$$
(3.3)
$$\begin{aligned} c_{0,1}&= (\gamma _0-\alpha _0)/\beta _0 \end{aligned}$$
(3.4)
$$\begin{aligned} c_{1,1}&= \delta _0/\beta _0 \end{aligned}$$
(3.5)
$$\begin{aligned} c_{0,j}&= \left( (\gamma _0-\alpha _{j-1})c_{0,j-1} + \delta _0c_{1,j-1} - \beta _{j-2} c_{0,j-2}\right) /\beta _{j-1} \end{aligned}$$
(3.6)
$$\begin{aligned} c_{i,j}&= \left( \delta _{i-1}c_{i-1,j-1} + (\gamma _i-\alpha _{j-1})c_{i,j-1} + \delta _ic_{i+1,j-1} - \beta _{j-2}c_{i,j-2}\right) /\beta _{j-1}. \end{aligned}$$
(3.7)

The recurrences for the rows and columns of C are these equations written in vector form.

The consequences follow from the uniqueness of solutions to second order difference equations given two initial conditions (adding \(c_{-1,*} = 0\) and \(c_{*,-1} = 0\)). \(\square \)
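
The recurrences (3.3)–(3.7) translate directly into an \({\mathcal {O}}(N^2)\) algorithm for the principal \(N \times N\) block of C. The following Julia function is a minimal sketch of our own (it is not the SpectralMeasures.jl implementation); it uses 1-based arrays, so \(\alpha _k\) is stored in α[k+1] and so on, and it assumes \(N \ge 3\) with coefficient vectors of length at least N.

```julia
# Compute the principal N x N block of C = C_{J->D} via (3.3)-(3.7).
# (α, β) are the coefficients of J; (γ, δ) are those of D.
function connectioncoeffs(α, β, γ, δ, N)
    C = zeros(N, N)                      # C[i+1, j+1] stores c_{i,j}
    C[1, 1] = 1.0                        # (3.3)
    C[1, 2] = (γ[1] - α[1]) / β[1]       # (3.4)
    C[2, 2] = δ[1] / β[1]                # (3.5)
    for j in 3:N
        C[1, j] = ((γ[1] - α[j-1]) * C[1, j-1] + δ[1] * C[2, j-1] -
                   β[j-2] * C[1, j-2]) / β[j-1]              # (3.6)
        for i in 2:j
            below = i < N ? δ[i] * C[i+1, j-1] : 0.0         # entries below the diagonal are zero
            C[i, j] = (δ[i-1] * C[i-1, j-1] + (γ[i] - α[j-1]) * C[i, j-1] +
                       below - β[j-2] * C[i, j-2]) / β[j-1]  # (3.7)
        end
    end
    return C
end
```

For instance, taking D to be the free Jacobi operator (\(\gamma _k = 0\), \(\delta _k = \frac{1}{2}\)) and J a one-entry perturbation of it reproduces the structured matrices of Sect. 4.2 below.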

3.2 Connection coefficients and spectral theory

The following theorems give precise results about how the connection coefficients matrix C can be useful for studying and computing the spectra of Jacobi operators.

Theorem 3.4

([32]). Let J and D be bounded Jacobi operators and \(C = C_{J \rightarrow D}\) the connection coefficients matrix. For all polynomials p, we have the following as operators from \(\ell _{{\mathcal {F}}}\) to \(\ell _{{\mathcal {F}}}\),

$$ Cp(J) = p(D)C. $$

Remark 3.5

This is a generalisation of the result of Kautsky and Golub [32, Lem. 1], that

$$\begin{aligned} C_{N\times N}J_{N\times N} = D_{N\times N}C_{N\times N} + e_Nc_N^T, \end{aligned}$$

where \(C_{N\times N},J_{N\times N},D_{N\times N}\) are the principal \(N\times N\) submatrices of C, J and D respectively, and \(c_N\) is a certain vector in \({\mathbb {R}}^N\).

Proof

First we begin with the case \(p(z) = z\). By definition,

$$\begin{aligned} CJe_0&= C(\alpha _0 e_0 + \beta _0 e_1) \\&= \alpha _0 Ce_0 + \beta _0 C e_1 \\&= \alpha _0 c_{*,0}+ \beta _0 c_{*,1}. \end{aligned}$$

Then by Corollary 3.3, this is equal to \(Dc_{*,0}\), which is equal to \(DCe_0\). Now, for any \(j > 0\),

$$\begin{aligned} CJe_j&= C(\beta _{j-1} e_{j-1} + \alpha _j e_j + \beta _j e_{j+1}) \\&= \beta _{j-1} c_{*,j-1} + \alpha _j c_{*,j} + \beta _j c_{*,j+1}. \end{aligned}$$

Then by Corollary 3.3, this is equal to \(Dc_{*,j}\), which is equal to \(DCe_j\). Hence \(CJ = DC\).

Now, when \(p(z) = z^k\) for any \(k >0\), \(D^kC = D^{k-1}CJ = \cdots = CJ^k\). By linearity \(Cp(J) = p(D)C\) for all polynomials p. \(\square \)

We believe that the basic properties relating the Radon-Nikodym derivative \(\frac{\mathrm {d}\nu }{\mathrm {d}\mu }\) and the connection coefficients matrix C have not been given in the literature before, but follow naturally from discussions in, for example, [27, Ch. 5], on mixed moments and modifications of weight functions for orthogonal polynomials.

Proposition 3.6

Let J and D be bounded Jacobi operators with spectral measures \(\mu \) and \(\nu \) respectively, and connection coefficient matrix \(C = C_{J\rightarrow D}\). Then

$$\begin{aligned} \frac{\mathrm {d}\nu }{\mathrm {d}\mu } \in L^2_\mu ({\mathbb {R}}) \text { if and only if } c_{0,*} \in \ell ^2, \end{aligned}$$

in which case

$$\begin{aligned} \frac{\mathrm {d}\nu }{\mathrm {d}\mu } = \sum _{k=0}^\infty c_{0,k} P_k. \end{aligned}$$
(3.8)

Proof

Suppose first that \(\frac{\mathrm {d}\nu }{\mathrm {d}\mu } \in L^2_\mu ({\mathbb {R}})\). Then \(\frac{\mathrm {d}\nu }{\mathrm {d}\mu } = \sum _{k=0}^\infty a_k P_k\), for some \(a\in \ell ^2\), because \(P_0,P_1,P_2,\ldots \) is an orthonormal basis of \(L^2_\mu ({\mathbb {R}})\). The coefficients are given by,

$$\begin{aligned} a_k&= \int P_k(s) \frac{\mathrm {d}\nu }{\mathrm {d}\mu }(s) \,\mathrm {d}\mu (s) \\&= \int P_k(s) \,\mathrm {d}\nu (s) \qquad \text {(definition of R--N derivative)}\\&= c_{0,k} \qquad \text {(equation} \, (3.2)). \end{aligned}$$

Hence \(c_{0,*} \in \ell ^2\) and gives the \(P_k\) coefficients of \(\frac{\mathrm {d}\nu }{\mathrm {d}\mu }\).

Conversely, suppose that \(c_{0,*} \in \ell ^2\). Then the function \(\sum _{k=0}^\infty c_{0,k}P_k\) is in \(L^2_\mu ({\mathbb {R}})\), and by the same manipulations as above its projections onto polynomial subspaces are equal to those of \(\frac{\mathrm {d}\nu }{\mathrm {d}\mu }\). \(\square \)

Remark 3.7

If \(c_{0,*} \in \ell ^2\), then by Proposition 3.6 and the existence of the Radon–Nikodym derivative on \(\mathrm {supp}(\mu )\) we may deduce that \(\sigma (D) \subset \sigma (J)\) and that the function defined by \(\sum _{k=0}^\infty c_{0,k}P_k\) is zero on \(\sigma (J) \setminus \sigma (D)\). This observation translates into a rootfinding problem in Sect. 4.

Lemma 3.8

Let J and D be bounded Jacobi operators with spectral measures \(\mu \) and \(\nu \) respectively, and connection coefficient matrix \(C = C_{J\rightarrow D}\). If \(\nu \) is absolutely continuous with respect to \(\mu \), then as operators mapping \(\ell _{{\mathcal {F}}} \rightarrow \ell _{{\mathcal {F}}}^\star \),

$$\begin{aligned} C^TC = \frac{\mathrm {d}\nu }{\mathrm {d}\mu }(J). \end{aligned}$$

Here the matrix \(\frac{\mathrm {d}\nu }{\mathrm {d}\mu }(J)\) is interpreted as in Theorem 2.4 part (iv).

Proof

Note first that since \(C:\ell _{{\mathcal {F}}} \rightarrow \ell _{{\mathcal {F}}}\) and \(C^T : \ell _{{\mathcal {F}}}^\star \rightarrow \ell _{{\mathcal {F}}}^\star \), \(C^TC\) is well-defined \(\ell _{{\mathcal {F}}} \rightarrow \ell _{{\mathcal {F}}}^\star \). Then we have,

$$\begin{aligned} \langle e_i, C^T C e_j \rangle&= \langle e_0, P_i(D)P_j(D) e_0\rangle \qquad \text {(Corollary}~3.3\text {)}\\&= \int P_i(s) P_j(s) \,\mathrm {d}\nu (s) \qquad \text {(Theorem}~2.4 \, \text {part (iv))}\\&= \int P_i(s) P_j(s) \frac{\mathrm {d}\nu }{\mathrm {d}\mu }(s) \,\mathrm {d}\mu (s) \qquad \text {(definition of R--N derivative)} \\&= \left\langle e_i, \frac{\mathrm {d}\nu }{\mathrm {d}\mu }(J) e_j \right\rangle \qquad \text {(Theorem}~2.4 \, \text {part (iv))}. \end{aligned}$$

This completes the proof. \(\square \)

Proposition 3.9

Let J and D be bounded Jacobi operators with spectral measures \(\mu \) and \(\nu \) respectively, and connection coefficient matrix \(C = C_{J\rightarrow D}\). If \(\frac{\mathrm {d}\nu }{\mathrm {d}\mu } \in L^\infty _\mu ({\mathbb {R}})\) then C is a bounded operator on \(\ell ^2\) and

$$\begin{aligned} \Vert C\Vert _2^2 = \mathop {\mu {\text {-ess}}\,{\text {sup}}}\limits _{s \in \sigma (J)} \left| \frac{\mathrm {d}\nu }{\mathrm {d}\mu }(s)\right| . \end{aligned}$$

Here \(\Vert \cdot \Vert _2\) is the operator norm from \(\ell ^2 \rightarrow \ell ^2\) and \(\mathop {\mu {\text {-ess}}\,{\text {sup}}}\) is the supremum up to \(\mu \)-almost everywhere equivalence of functions.

Proof

Since \(\mu \) is a probability measure (and hence \(\sigma \)-finite), we have the standard characterisation,

$$ \mathop {\mu {\text {-ess}}\,{\text {sup}}}\limits _{s \in \sigma (J)} \left| \frac{\mathrm {d}\nu }{\mathrm {d}\mu }(s)\right| = \sup _{\begin{array}{c} g \in L_\mu ^1({\mathbb {R}}) \\ \Vert g\Vert _1 \le 1 \end{array}} \int \frac{\mathrm {d}\nu }{\mathrm {d}\mu }(s) |g(s)| \, \mathrm {d}\mu (s), $$

which can be modified to

$$ \mathop {\mu {\text {-ess}}\,{\text {sup}}}\limits _{s \in \sigma (J)} \left| \frac{\mathrm {d}\nu }{\mathrm {d}\mu }(s)\right| = \sup _{\begin{array}{c} f \in L_\mu ^2({\mathbb {R}}) \\ \Vert f\Vert _2 \le 1 \end{array}} \int \frac{\mathrm {d}\nu }{\mathrm {d}\mu }(s) (f(s))^2 \, \mathrm {d}\mu (s), $$

by associating positive functions \(g \in L^1_\mu ({\mathbb {R}})\) with their square-roots \(f \in L_\mu ^2({\mathbb {R}})\). Now, since \(\frac{\mathrm {d}\nu }{\mathrm {d}\mu } \in L_\mu ^\infty ({\mathbb {R}})\), this supremum can actually be taken over all polynomials by the following argument. If \(f \in L_\mu ^2({\mathbb {R}})\) with \(\Vert f\Vert _{2,\mu } \le 1\) then for any \(\epsilon > 0\) there exists a polynomial p such that \(\Vert p\Vert _{2,\mu } \le 1\) and \(\Vert f-p\Vert _{2,\mu } \le \epsilon \), since polynomials are dense in \(L_\mu ^2({\mathbb {R}})\) (this follows, for example, from the compact support of \(\mu \) [1]). It is readily shown using the Hölder and triangle inequalities that

$$\begin{aligned} \int \frac{\mathrm {d}\nu }{\mathrm {d}\mu }(s) \left( (f(s))^2 - (p(s))^2 \right) \, \mathrm {d}\mu (s) \le \mathop {\mu {\text {-ess}}\,{\text {sup}}}\limits _{s \in \sigma (J)} \left| \frac{\mathrm {d}\nu }{\mathrm {d}\mu }(s)\right| 2\epsilon , \end{aligned}$$

so that the supremum can be approximated arbitrarily well using polynomials.

Since \(\{P_k \}_{k=0}^\infty \) is a complete orthonormal basis of \(L_\mu ^2({\mathbb {R}})\) (completeness holds as a result of \(\mu \) having compact support), then \(f \in L_\mu ^2({\mathbb {R}})\) if and only if there is a unique sequence \(v \in \ell ^2\) such that \(f = \sum _{k=0}^\infty v_k P_k\) with the series converging in the \(L_\mu ^2({\mathbb {R}})\) norm. Furthermore, \(\Vert f\Vert _2 = \Vert v\Vert _2\). Hence,

$$\begin{aligned} \mathop {\mu {\text {-ess}}\,{\text {sup}}}\limits _{s \in \sigma (J)} \left| \frac{\mathrm {d}\nu }{\mathrm {d}\mu }(s)\right|= & {} \sup _{\begin{array}{c} v \in \ell _{{\mathcal {F}}} \\ \Vert v\Vert _2 = 1 \end{array}} \int \frac{\mathrm {d}\nu }{\mathrm {d}\mu }(s) \left( \sum _{k} v_k P_k(s) \right) ^2 \, \mathrm {d}\mu (s) \\= & {} \sup _{\begin{array}{c} v \in \ell _{{\mathcal {F}}} \\ \Vert v\Vert _2 = 1 \end{array}} \int \sum _{j,k} v_j v_k \frac{\mathrm {d}\nu }{\mathrm {d}\mu }(s) P_j(s) P_k(s) \, \mathrm {d}\mu (s) \\= & {} \sup _{\begin{array}{c} v \in \ell _{{\mathcal {F}}} \\ \Vert v\Vert _2 = 1 \end{array}} \sum _{j,k} v_j v_k \int \frac{\mathrm {d}\nu }{\mathrm {d}\mu }(s) P_j(s) P_k(s) \, \mathrm {d}\mu (s). \end{aligned}$$

Since \(\ell _{{\mathcal {F}}}\) is a dense subspace of \(\ell ^2\), we have \(\Vert C\Vert _2 = \sup _{v \in \ell _{{\mathcal {F}}}, \Vert v\Vert _2 = 1} \Vert Cv\Vert _2\). Now, \(\Vert Cv\Vert _2^2 = \langle v, C^T C v\rangle \), and by Lemma 3.8, \(C^TC = \frac{\mathrm {d}\nu }{\mathrm {d}\mu }(J)\). Therefore,

$$\begin{aligned} \Vert C\Vert _2^2= & {} \sup _{\begin{array}{c} v \in \ell _{{\mathcal {F}}} \\ \Vert v\Vert _2 = 1 \end{array}} \left\langle v, \frac{\mathrm {d}\nu }{\mathrm {d}\mu }(J) v \right\rangle = \sup _{\begin{array}{c} v \in \ell _{{\mathcal {F}}} \\ \Vert v\Vert _2 = 1 \end{array}} \sum _{j,k} v_j v_k \left[ \frac{\mathrm {d}\nu }{\mathrm {d}\mu }(J)\right] _{j,k}. \end{aligned}$$

By Theorem 2.4 part (iv), we conclude that

$$\begin{aligned} \Vert C\Vert _2^2 = \sup _{\begin{array}{c} v \in \ell _{{\mathcal {F}}} \\ \Vert v\Vert _2 = 1 \end{array}} \sum _{j,k=0}^\infty v_j v_k \int \frac{\mathrm {d}\nu }{\mathrm {d}\mu }(s) P_j(s) P_k(s) \, \mathrm {d}\mu (s). \end{aligned}$$
(3.9)

Therefore, \(\Vert C\Vert _2^2 = \mathop {\mu {\text {-ess}}\,{\text {sup}}}\limits _{s \in \sigma (J)} \left| \frac{\mathrm {d}\nu }{\mathrm {d}\mu }(s)\right| \) and C is bounded. \(\square \)

Corollary 3.10

Let J and D be bounded Jacobi operators with spectral measures \(\mu \) and \(\nu \) respectively, and connection coefficient matrix \(C = C_{J\rightarrow D}\). If \(\frac{\mathrm {d}\nu }{\mathrm {d}\mu } \in L^\infty _\mu ({\mathbb {R}})\) and \(\frac{\mathrm {d}\mu }{\mathrm {d}\nu } \in L^\infty _\nu ({\mathbb {R}})\) then C is bounded and invertible on \(\ell ^2\).

Proof

By Proposition 3.9, \(C_{J\rightarrow D}\) is bounded if \(\frac{\mathrm {d}\nu }{\mathrm {d}\mu } \in L^\infty _\mu ({\mathbb {R}})\), and \(C_{D\rightarrow J}\) is bounded if \(\frac{\mathrm {d}\mu }{\mathrm {d}\nu } \in L^\infty _\nu ({\mathbb {R}})\). Combining this with the fact that \(C_{J\rightarrow D}^{-1} = C_{D \rightarrow J}\), as operators from \(\ell _{{\mathcal {F}}}\) to itself, completes the proof. \(\square \)

The following definition and lemma are useful later.

Definition 3.11

Given polynomial sequences \(P_0,P_1,P_2,\ldots \) and \(Q_0,Q_1,Q_2,\ldots \) for Jacobi operators J and D respectively, we define the matrix \(C^\mu \) to be the connection coefficients matrix between \(P_0^\mu ,P_1^\mu ,P_2^\mu ,\ldots \) and \(Q_0,Q_1,Q_2,\ldots \) as in Definition 1.1, where \(P_0^\mu ,P_1^\mu ,P_2^\mu ,\ldots \) are the first associated polynomials for J as in Definition 2.5. Noting that the lower triangular matrix \((C^\mu )^T\) is a well defined operator from \(\ell _{{\mathcal {F}}}^\star \) into itself, we have

$$\begin{aligned} \left( \begin{array}{c} P^\mu _0(s) \\ P^\mu _1(s) \\ P^\mu _2(s) \\ \vdots \end{array} \right) = (C^\mu )^T \left( \begin{array}{c} Q_0(s) \\ Q_1(s) \\ Q_2(s) \\ \vdots \end{array} \right) \text { for all } s. \end{aligned}$$

Remark 3.12

Note that \(C^\mu \) is strictly upper triangular, because the first associated polynomials have their degrees one less than their indices.

Lemma 3.13

The operator \(C^\mu \) as defined above for \(C_{J\rightarrow D}\) is in fact \(\beta _0^{-1}\left( 0 , C_{J^\mu \rightarrow D} \right) \), where

$$\begin{aligned} J^\mu = \left( \begin{array}{cccc} \alpha _1 &{} \beta _1 &{} &{} \\ \beta _1 &{} \alpha _2 &{} \beta _2 &{} \\ &{} \beta _2 &{} \alpha _3 &{} \ddots \\ &{} &{} \ddots &{} \ddots \end{array} \right) \end{aligned}$$

is J with its first row and column removed, and \(\left( 0 , C_{J^\mu \rightarrow D} \right) \) denotes \(C_{J^\mu \rightarrow D}\) with a column of zeros prepended.

Proof

The (unique) orthonormal polynomials for \(J^\mu \) are \(\beta _0 P_1^\mu ,\beta _0P_2^\mu ,\beta _0P_3^\mu ,\ldots \), and \(P^\mu _0 = 0\). \(\square \)

4 Toeplitz-Plus-Finite-Rank Jacobi Operators

In this section we present several novel results which show how the connection coefficient matrices can be used for computing the spectral measure of a Toeplitz-plus-finite-rank Jacobi operator.

4.1 Jacobi operators for Chebyshev polynomials

There are two particular Jacobi operators with Toeplitz-plus-finite-rank structure that are of great interest,

$$\begin{aligned} \Delta = \left( \begin{array}{cccc} 0 &{} \frac{1}{2} &{} &{} \\ \frac{1}{2} &{} 0 &{} \frac{1}{2} &{} \\ &{} \frac{1}{2} &{} 0 &{} \ddots \\ &{} &{} \ddots &{} \ddots \end{array} \right) , \qquad \Gamma = \left( \begin{array}{cccc} 0 &{} \frac{1}{\sqrt{2}} &{} &{} \\ \frac{1}{\sqrt{2}} &{} 0 &{} \frac{1}{2} &{} \\ &{} \frac{1}{2} &{} 0 &{} \ddots \\ &{} &{} \ddots &{} \ddots \end{array} \right) . \end{aligned}$$
(4.1)

The spectral measures of \(\Delta \) and \(\Gamma \) are

$$\begin{aligned} \mathrm {d}\mu _\Delta (s) = \frac{2}{\pi } \sqrt{1-s^2}\mathrm {d}s, \quad \mathrm {d}\mu _\Gamma (s) = \frac{1}{\pi }\frac{1}{\sqrt{1-s^2}} \mathrm {d}s, \end{aligned}$$

supported on \([-1,1]\).

Using results of Stieltjes in his seminal paper [43, 1, App.], the principal resolvent can be written elegantly as a continued fraction,

$$\begin{aligned} G(\lambda ) = \frac{-1}{\lambda - \alpha _0 - \frac{\beta _0^2}{\lambda - \alpha _1 - \frac{\beta _1^2}{\lambda - \alpha _2 -...}}}. \end{aligned}$$
(4.2)

Using this gives explicit expressions for the principal resolvents,

$$\begin{aligned} G_\Delta (\lambda ) = 2\sqrt{\lambda +1}\sqrt{\lambda -1}-2\lambda , \quad G_\Gamma (\lambda ) = \frac{-1}{\sqrt{\lambda +1}\sqrt{\lambda -1}}. \end{aligned}$$
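
As a numerical sanity check (our own, using only standard Julia; the name `G_cf` and the truncation depth M are ours), truncating the continued fraction (4.2) for \(\Delta \) reproduces the closed form for \(G_\Delta \) to machine precision, using the branch convention of Remark 4.1 below.

```julia
# Evaluate the truncated continued fraction (4.2), bottom-up.
# α[k+1] stores α_k and β[k+1] stores β_k (the paper's indices are 0-based).
function G_cf(λ, α, β, M)
    F = zero(λ)
    for m in M:-1:1
        F = β[m]^2 / (λ - α[m+1] - F)
    end
    return -1 / (λ - α[1] - F)
end

λ = 2.0 + 0.0im
GΔ = 2sqrt(λ + 1) * sqrt(λ - 1) - 2λ                  # principal branches
abs(G_cf(λ, zeros(101), fill(0.5, 100), 100) - GΔ)    # ~ 1e-16
```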

Remark 4.1

We must be careful about which branch we refer to when we write the resolvents in this explicit form. Wherever a square root is written above we mean the standard branch that is positive on \((0,\infty )\) with branch cut \((-\infty ,0]\). This gives a branch cut along \([-1,1]\) in both cases, the discontinuity of G across which makes the Perron–Stieltjes inversion formula in Theorem 2.2 work. It also ensures the \({\mathcal {O}}(\lambda ^{-1})\) decay that resolvents enjoy as \(\lambda \rightarrow \infty \).

The orthonormal polynomials for \(\Delta \) are the Chebyshev polynomials of the second kind, which we denote \(U_k(s)\),

$$\begin{aligned} U_k(s) = \frac{\sin ((k+1)\cos ^{-1}(s))}{\sin (\cos ^{-1}(s))}. \end{aligned}$$

The orthonormal polynomials for \(\Gamma \) are the normalised Chebyshev polynomials of the first kind, which we denote \({{\tilde{T}}}_k(s)\). Note that these are not the usual Chebyshev polynomials of the first kind (denoted \(T_k(s)\)) [16, 24]. We in fact have,

$$\begin{aligned} {\tilde{T}}_0(s) = 1, \quad {\tilde{T}}_k(s) = \sqrt{2}\cos (k\cos ^{-1}(s)). \end{aligned}$$

The first associated polynomials have simple relationships with the orthonormal polynomials,

$$\begin{aligned} U^{\mu _\Delta }_k = 2U_{k-1}, \quad {\tilde{T}}^{\mu _\Gamma }_k = \sqrt{2}U_{k-1}. \end{aligned}$$
(4.3)

4.2 Basic perturbations

In this section we demonstrate, for two simple rank-one perturbations of \(\Delta \), how the connection coefficient matrix captures spectral properties of the perturbed operator. This will give some intuition as to what to expect in more general cases.

Example 4.2

(Basic perturbation 1). Let \(\alpha \in {\mathbb {R}}\), and define

$$\begin{aligned} J_\alpha = \left( \begin{array}{cccc} \frac{\alpha }{2} &{} \frac{1}{2} &{} &{} \\ \frac{1}{2} &{} 0 &{} \frac{1}{2} &{} \\ &{} \frac{1}{2} &{} 0 &{} \ddots \\ &{} &{} \ddots &{} \ddots \end{array} \right) . \end{aligned}$$

Then the connection coefficient matrix \(C_{J_\alpha \rightarrow \Delta }\) is the bidiagonal Toeplitz matrix

$$\begin{aligned} C_{J_\alpha \rightarrow \Delta } = \left( \begin{array}{ccccc} 1 &{} -\alpha &{} &{} &{} \\ &{} 1 &{} -\alpha &{} &{} \\ &{} &{} 1 &{} -\alpha &{} \\ &{} &{} &{} \ddots &{} \ddots \end{array} \right) . \end{aligned}$$
(4.4)

This can be computed using the explicit recurrences (3.3)–(3.7). The connection coefficient matrix \(C_{\Delta \rightarrow J_\alpha }\) (which is the inverse of \(C_{J_\alpha \rightarrow \Delta }\) on \(\ell _{{\mathcal {F}}}\)) is the full Toeplitz matrix

$$\begin{aligned} C_{\Delta \rightarrow J_\alpha } = \left( \begin{array}{ccccc} 1 &{} \alpha &{} \alpha ^2 &{} \alpha ^3 &{} \cdots \\ &{} 1 &{} \alpha &{} \alpha ^2 &{} \ddots \\ &{} &{} 1 &{} \alpha &{} \ddots \\ &{} &{} &{} \ddots &{} \ddots \end{array} \right) . \end{aligned}$$
From this we see that \(C=C_{J_\alpha \rightarrow \Delta }\) has a bounded inverse in \(\ell ^2\) if and only if \(|\alpha |<1\). Hence by Theorem 3.4, if \(|\alpha |<1\) then \(CJ_\alpha C^{-1} = \Delta \) with each operator bounded on \(\ell ^2\), so that \(\sigma (J_\alpha ) = \sigma (\Delta ) = [-1,1]\). We will discuss what happens when \(|\alpha | \ge 1\) later in the section.
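
A quick numerical experiment (our own, using only standard Julia) corroborates the discussion: for \(|\alpha | > 1\), a large finite section of \(J_\alpha \) exhibits an eigenvalue close to \(\frac{1}{2}(\alpha + \alpha ^{-1})\), the value derived in Example 4.17 below.

```julia
using LinearAlgebra

# Finite section of J_α for α = 2: the top eigenvalue approximates the
# predicted point spectrum ½(α + 1/α) = 1.25 (Example 4.17). The
# eigenvector decays geometrically, so the truncation error is tiny.
α, N = 2.0, 200
Jα = SymTridiagonal([α/2; zeros(N - 1)], fill(0.5, N - 1))
maximum(eigvals(Jα))        # ≈ 1.25
```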

Example 4.3

(Basic perturbation 2). Let \(\beta > 0\), and define

$$\begin{aligned} J_\beta = \left( \begin{array}{cccc} 0 &{} \frac{\beta }{2} &{} &{} \\ \frac{\beta }{2} &{} 0 &{} \frac{1}{2} &{} \\ &{} \frac{1}{2} &{} 0 &{} \ddots \\ &{} &{} \ddots &{} \ddots \end{array} \right) . \end{aligned}$$

Then the connection coefficient matrix \(C_{J_\beta \rightarrow \Delta }\) is the banded Toeplitz-plus-rank-1 matrix

$$\begin{aligned} C_{J_\beta \rightarrow \Delta } = \left( \begin{array}{cccccc} 1 &{} 0 &{} \beta ^{-1}-\beta &{} &{} &{} \\ &{} \beta ^{-1} &{} 0 &{} \beta ^{-1}-\beta &{} &{} \\ &{} &{} \beta ^{-1} &{} 0 &{} \beta ^{-1}-\beta &{} \\ &{} &{} &{} \ddots &{} \ddots &{} \ddots \end{array} \right) . \end{aligned}$$
(4.5)

Just as in Example 4.2, this can be computed using the explicit recurrences (3.3)–(3.7). The connection coefficient matrix \(C_{\Delta \rightarrow J_\beta }\) (which is the inverse of \(C_{J_\beta \rightarrow \Delta }\) on \(\ell _{{\mathcal {F}}}\)) is the Toeplitz-plus-rank-1 matrix

$$\begin{aligned} C_{\Delta \rightarrow J_\beta } = \left( \begin{array}{cccccc} 1 &{} 0 &{} \beta ^2-1 &{} 0 &{} (\beta ^2-1)^2 &{} \cdots \\ &{} \beta &{} 0 &{} \beta (\beta ^2-1) &{} 0 &{} \ddots \\ &{} &{} \beta &{} 0 &{} \beta (\beta ^2-1) &{} \ddots \\ &{} &{} &{} \ddots &{} \ddots &{} \ddots \end{array} \right) . \end{aligned}$$
From this we see that \(C = C_{J_\beta \rightarrow \Delta }\) has a bounded inverse on \(\ell ^2\) if and only if \(\beta <\sqrt{2}\). Hence by Theorem 3.4, if \(\beta <\sqrt{2}\) then \(CJ_\beta C^{-1} = \Delta \) with each operator bounded on \(\ell ^2\), so that \(\sigma (J_\beta ) = \sigma (\Delta ) = [-1,1]\). We will discuss what happens when \(\beta \ge \sqrt{2}\) later in the section. Note that the case \(\beta = \sqrt{2}\) gives the Jacobi operator \(\Gamma \) in equation (4.1).
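
Using the `connectioncoeffs` sketch from Sect. 3.1, the structure of (4.5) can be reproduced numerically; for instance (our own illustration):

```julia
# J_β versus Δ: the computed block shows the Toeplitz-plus-rank-1 structure
# of (4.5): diagonal 1/β (except the (0,0) entry, which is 1) and second
# superdiagonal 1/β - β.
β, N = 1.2, 8
γ = zeros(N); δ = fill(0.5, N)                 # coefficients of Δ
αJ = zeros(N); βJ = [β / 2; fill(0.5, N - 1)]  # coefficients of J_β
connectioncoeffs(αJ, βJ, γ, δ, N)
```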

4.3 Fine properties of the connection coefficients

The two basic perturbations of \(\Delta \) discussed above give connection coefficient matrices that are highly structured. The following lemmata and theorems prove that this is no coincidence; in fact, if a Jacobi operator J is a finite-rank perturbation of \(\Delta \) then \(C_{J \rightarrow \Delta }\) is also a finite-rank perturbation of a Toeplitz matrix.

Remark 4.4

Note for the following results that all vectors and matrices are indexed starting from 0.

Lemma 4.5

If \(\delta _j = \beta _j\) for \(j \ge n\) then \(c_{jj} = c_{nn}\) for all \(j \ge n\).

Proof

By the recurrence in Lemma 3.2, \(c_{jj} = (\delta _{j-1}/\beta _{j-1})c_{j-1,j-1}\). The result follows by induction. \(\square \)

Lemma 4.6

Let J and D be Jacobi operators with coefficients \(\{\alpha _k,\beta _k\}\) and \(\{\gamma _k,\delta _k\}\) respectively, such that there exists an n such that

$$\begin{aligned} \alpha _k = \gamma _k = \alpha _n, \quad \beta _{k-1} = \delta _{k-1} = \beta _{n-1} \text { for all } k\ge n. \end{aligned}$$

Then the entries of the connection coefficient matrix \(C = C_{J \rightarrow D}\) satisfy

$$\begin{aligned} c_{i,j} = c_{i-1,j-1} \text { for all } i,j > 0 \text { such that } i \ge n. \end{aligned}$$

Remark 4.7

This means that C is of the form \(C = C_{\mathrm{Toe}} + C_{\mathrm{fin}}\) where \(C_{\mathrm{Toe}}\) is Toeplitz and \(C_{\mathrm{fin}}\) is zero except in the first \(n-1\) rows. For example, when \(n = 4\), we have the following structure,

$$\begin{aligned} C = \left( \begin{array}{ccccccc} \times &{} \times &{} \times &{} \times &{} \times &{} \times &{} \cdots \\ &{} \times &{} \times &{} \times &{} \times &{} \times &{} \cdots \\ &{} &{} \times &{} \times &{} \times &{} \times &{} \cdots \\ &{} &{} &{} t_0 &{} t_1 &{} t_2 &{} \cdots \\ &{} &{} &{} &{} t_0 &{} t_1 &{} \ddots \\ &{} &{} &{} &{} &{} \ddots &{} \ddots \end{array} \right) , \end{aligned}$$

where \(t_0,t_1,t_2,\ldots \) denote the entries of the Toeplitz part and \(\times \) marks entries that may differ from the corresponding Toeplitz entries.

Proof

We prove by induction on \(k = 0,1,2,\ldots \) that

$$\begin{aligned} c_{i,i+k} = c_{i-1,i+k-1} \text { for all } i \ge n. \end{aligned}$$
(4.6)

We use the recurrences in Lemma 3.2 and equations (3.3)–(3.7). The base case \(k=0\) is proved in Lemma 4.5. Now we deal with the second base case, \(k=1\). For any \(i \ge n\), we have \(\beta _i = \delta _i = \beta _{i-1} = \delta _{i-1}\), and \(\alpha _i = \gamma _i\), so

$$\begin{aligned} c_{i,i+1}&=\left( \delta _{i-1}c_{i-1,i} + (\gamma _i-\alpha _{i})c_{i,i} + \delta _ic_{i+1,i} - \beta _{i-1}c_{i,i-1}\right) /\beta _{i} \\&= 1 \cdot c_{i-1,i} + 0 \cdot c_{i,i} + 1 \cdot 0 - 1 \cdot 0 \\&= c_{i-1,i}. \end{aligned}$$

Now we deal with the case \(k > 1\). For any \(i \ge n\), we have \(\delta _i = \delta _{i-1} = \beta _{i+k-2} = \beta _{i+k-1}\), and \(\alpha _{i+k-1} = \gamma _i\), so

$$\begin{aligned} c_{i,i+k}&= \left( \delta _{i-1}c_{i-1,i+k-1} + (\gamma _i-\alpha _{i+k-1})c_{i,i+k-1} + \delta _ic_{i+1,i+k-1} - \beta _{i+k-2}c_{i,i+k-2}\right) /\beta _{i+k-1} \\&= 1 \cdot c_{i-1,i+k-1} + 0 \cdot c_{i,i+k-1} + 1 \cdot c_{i+1,i+k-1} - 1\cdot c_{i,i+k-2} \\&= c_{i-1,i+k-1} + c_{i+1,i+k-1} - c_{i,i+k-2} \\&= c_{i-1,i+k-1}. \end{aligned}$$

The last line follows from the induction hypothesis for the case \(k-2\) (hence why we needed two base cases). \(\square \)

The special case in which D is Toeplitz gives even more structure to C, as demonstrated by the following theorem. We state the results for a finite-rank perturbation of the free Jacobi operator \(\Delta \), but they apply to general Toeplitz-plus-finite rank Jacobi operators because the connection coefficients matrix C is unaffected by a scaling and shift by the identity applied to both J and D.

Theorem 4.8

Let J be a Jacobi operator such that there exists an n such that

$$\begin{aligned} \alpha _k = 0, \quad \beta _{k-1} = \frac{1}{2} \text { for all } k \ge n, \end{aligned}$$

i.e. it is equal to the free Jacobi operator \(\Delta \) outside the \(n\times n\) principal submatrix. Then the entries of the connection coefficient matrix \(C = C_{J \rightarrow \Delta }\) satisfy

$$\begin{aligned} c_{i,j}&= c_{i-1,j-1} \text { for all } i,j > 0 \text { such that } i+j \ge 2n \end{aligned}$$
(4.7)
$$\begin{aligned} c_{0,j}&= 0 \text { for all } j \ge 2n. \end{aligned}$$
(4.8)

Remark 4.9

This means that C is of the form \(C = C_{\mathrm{Toe}} + C_{\mathrm{fin}}\) where \(C_{\mathrm{Toe}}\) is Toeplitz with bandwidth \(2n-1\) and \(C_{\mathrm{fin}}\) zero except for entries in the \((n-1) \times (2n-2)\) principal submatrix. For example when \(n = 4\), we have the following structure,

$$\begin{aligned} C = \left( \begin{array}{ccccccccccc} \times &{} \times &{} \times &{} \times &{} \times &{} \times &{} t_6 &{} t_7 &{} &{} &{} \\ &{} \times &{} \times &{} \times &{} \times &{} t_4 &{} t_5 &{} t_6 &{} t_7 &{} &{} \\ &{} &{} \times &{} \times &{} t_2 &{} t_3 &{} t_4 &{} t_5 &{} t_6 &{} t_7 &{} \\ &{} &{} &{} t_0 &{} t_1 &{} t_2 &{} t_3 &{} t_4 &{} t_5 &{} t_6 &{} \ddots \\ &{} &{} &{} &{} t_0 &{} t_1 &{} t_2 &{} t_3 &{} t_4 &{} t_5 &{} \ddots \\ &{} &{} &{} &{} &{} \ddots &{} \ddots &{} \ddots &{} \ddots &{} \ddots &{} \ddots \end{array} \right) , \end{aligned}$$

where \(t_0,\ldots ,t_{2n-1}\) denote the entries of the Toeplitz symbol and \(\times \) marks entries affected by \(C_{\mathrm{fin}}\).

Proof

First we prove (4.7). Fix i, j such that \(i+j \ge 2n\). Note that the case \(i \ge n\) is proven in Lemma 4.6. Hence we assume \(i < n\), and therefore \(j > n\). Using Lemma 3.2 and equations (3.3)–(3.7), and substituting \(\delta _i = \frac{1}{2}\), \(\gamma _i = 0\) for all i, and \(\alpha _{k} = 0\), \(\beta _{k-1} =\frac{1}{2}\) for \(k \ge n\) into the recurrence, we have

$$\begin{aligned} c_{i,j}&= \left( \delta _{i-1}c_{i-1,j-1} + (\gamma _i-\alpha _{j-1})c_{i,j-1} + \delta _ic_{i+1,j-1} - \beta _{j-2}c_{i,j-2}\right) /\beta _{j-1}. \\&= \left( \frac{1}{2} c_{i-1,j-1} - \alpha _{j-1}c_{i,j-1} + \frac{1}{2}c_{i+1,j-1} - \beta _{j-2}c_{i,j-2} \right) /\beta _{j-1} \\&= c_{i-1,j-1} + c_{i+1,j-1} - c_{i,j-2}. \end{aligned}$$

Repeating this process on \(c_{i+1,j-1}\) in the above expression gives

$$\begin{aligned} c_{i,j} = c_{i-1,j-1} + c_{i+2,j-2} - c_{i+1,j-3}. \end{aligned}$$

Repeating the process on \(c_{i+2,j-2}\) and so on eventually gives

$$\begin{aligned} c_{i,j} = c_{i-1,j-1} + c_{n,i+j-n} - c_{n-1,i+j-n-1}. \end{aligned}$$

By Lemma 4.6, \(c_{n,i+j-n} = c_{n-1,i+j-n-1}\), so we are left with \(c_{i,j} = c_{i-1,j-1}\). This completes the proof of (4.7).

Now we prove (4.8). Let \(j \ge 2n\). Then

$$\begin{aligned} c_{0,j}&= \left( (\gamma _0-\alpha _{j-1})c_{0,j-1} + \delta _0c_{1,j-1} - \beta _{j-2} c_{0,j-2}\right) /\beta _{j-1} \\&= \left( -\alpha _{j-1}c_{0,j-1} + \frac{1}{2}c_{1,j-1} - \beta _{j-2} c_{0,j-2}\right) /\beta _{j-1} \\&= c_{1,j-1} - c_{0,j-2}. \end{aligned}$$

This is equal to zero by (4.7), because \(1 + (j-1) \ge 2n\). \(\square \)

Corollary 4.10

Let \(C^\mu \) be as defined in Definition 3.11 for C as in Theorem 4.8. Then \(C^\mu = C^\mu _{\mathrm{Toe}} + C^\mu _{\mathrm{fin}}\), where \(C^\mu _{\mathrm{Toe}}\) is Toeplitz with bandwidth \(2n-2\) and \(C^\mu _{\mathrm{fin}}\) is zero outside the \((n-2) \times (2n-1)\) principal submatrix.

Proof

This follows from Theorem 4.8 applied to \(J^\mu \) as defined in Lemma 3.13. \(\square \)

Remark 4.11

A technical point worth noting for use in proofs later is that for Toeplitz-plus-finite-rank Jacobi operators like J and D occurring in Theorem 4.8 and Corollary 4.10, the operators C, \(C^T\), \(C^\mu \) and \((C^\mu )^T\) all map \(\ell _{{\mathcal {F}}}\) to \(\ell _{{\mathcal {F}}}\). Consequently, combinations such as \(CC^T\), \(C^\mu C^T\) are all well defined operators from \(\ell _{{\mathcal {F}}}\) to \(\ell _{{\mathcal {F}}}\).

4.4 Properties of the resolvent

When the Jacobi operator J is Toeplitz-plus-finite rank, as a consequence of the structure of the connection coefficients matrix proved in Sect. 4.3, the principal resolvent G (see Definition 2.1) and spectral measure (see Theorem 2.2) are also highly structured. As usual these proofs are stated for a finite-rank perturbation of the free Jacobi operator \(\Delta \), but apply to general Toeplitz-plus-finite rank Jacobi operators by applying appropriate scaling and shifting.

Theorem 4.12

Let J be a Jacobi operator such that there exists an n such that

$$\begin{aligned} \alpha _k = 0, \quad \beta _{k-1} = \frac{1}{2} \text { for all } k \ge n, \end{aligned}$$

i.e. it is equal to the free Jacobi operator \(\Delta \) outside the \(n\times n\) principal submatrix. Then the principal resolvent for J is

$$\begin{aligned} G(\lambda ) = \frac{G_\Delta (\lambda ) - p_C^{\mu }(\lambda )}{p_C(\lambda )}, \end{aligned}$$
(4.9)

where

$$\begin{aligned} p_C(\lambda )&= \sum _{k = 0}^{2n-1} c_{0,k} P_k(\lambda ) = \sum _{k=0}^{2n-1} \langle C^Te_k ,C^T e_0 \rangle U_k(\lambda ), \end{aligned}$$
(4.10)
$$\begin{aligned} p^\mu _C(\lambda )&= \sum _{k = 1}^{2n-1} c_{0,k} P^\mu _k(\lambda ) = \sum _{k=0}^{2n-1} \langle (C^\mu )^T e_k, C^T e_0 \rangle U_k(\lambda ), \end{aligned}$$
(4.11)

\(P_k\) are the orthonormal polynomials for J, \(P_k^\mu \) are the first associated polynomials for J as in Definition 2.5, and \(U_k\) are the Chebyshev polynomials of the second kind.

Remark 4.13

\(p_C^{\mu }\) is the \(\mu \)-derivative of \(p_C\) as in Definition 2.7.

Proof

Using Theorem 2.2 and Proposition 3.6,

$$\begin{aligned} G_\Delta (\lambda )&= \int (s-\lambda )^{-1} \mathrm {d}\mu _\Delta (s) \\&= \int (s-\lambda )^{-1} p_C(s) \mathrm {d}\mu (s). \end{aligned}$$

Now, since \(p_C\) is a polynomial we can split this into

$$\begin{aligned} G_\Delta (\lambda ) = \int (s-\lambda )^{-1} p_C(\lambda ) \mathrm {d}\mu (s) + \int (s-\lambda )^{-1} (p_C(s)-p_C(\lambda )) \mathrm {d}\mu (s). \end{aligned}$$

The first term is equal to \(p_C(\lambda )G(\lambda )\), and the second term is equal to \(p_C^\mu (\lambda )\) by Lemma 2.6 and Remark 4.13. The equation can now be immediately rearranged to obtain (4.9).

To see the equality in equation (4.10), note that by the definition of the connection coefficient matrix C,

$$\begin{aligned} \sum _{k=0}^{2n-1}c_{0,k}P_k(\lambda )&= \sum _{k=0}^{2n-1} c_{0,k}\sum _{j=0}^{2n-1} c_{j,k}U_j(\lambda ) \\&= \sum _{j=0}^{2n-1} \left( \sum _{k=0}^{2n-1} c_{0,k}c_{j,k} \right) U_j(\lambda ) \\&= \sum _{j=0}^{2n-1} \langle C^Te_j,C^T e_0 \rangle U_j(\lambda ). \end{aligned}$$

Equation (4.11) follows by the same algebra. \(\square \)

Theorem 4.14

Let J be a Jacobi operator such that there exists an n such that

$$\begin{aligned} \alpha _k = 0, \quad \beta _{k-1} = \frac{1}{2} \text { for all } k \ge n, \end{aligned}$$

i.e. it is equal to the free Jacobi operator \(\Delta \) outside the \(n\times n\) principal submatrix. Then the spectral measure for J is

$$\begin{aligned} \mu (s) = \frac{1}{p_C(s)}\mu _\Delta (s) + \sum _{k=1}^{r} w_k \delta _{\lambda _k}(s), \end{aligned}$$
(4.12)

where \(\lambda _1, \ldots ,\lambda _{r}\) are the roots of \(p_C\) in \({\mathbb {R}}\setminus \{1,-1\}\) such that

$$\begin{aligned} w_k = \lim _{\epsilon \searrow 0} \frac{\epsilon }{i} G(\lambda _k+i\epsilon ) \ne 0. \end{aligned}$$

There are no roots of \(p_C\) inside \((-1,1)\), but there may be simple roots at \(\pm 1\).

Remark 4.15

We will see in Theorem 4.22 that the number of roots of \(p_C\) for which \(w_k \ne 0\) is at most n (i.e. \(r\le n\)). Hence, while \(p_C\) may have as many as \(2n-1\) roots, at least \(n-1\) of them are cancelled out by factors in the numerator.

Proof

Let G and \(\mu \) be the principal resolvent and spectral measure of J respectively. By Theorem 4.12,

$$\begin{aligned} G(\lambda ) = \frac{G_\Delta (\lambda ) - p_C^{\mu }(\lambda )}{p_C(\lambda )}. \end{aligned}$$

Letting \(\lambda _1,\ldots ,\lambda _{2n-1}\) be the roots of \(p_C\) in the complex plane, define the set

$$\begin{aligned} S = [-1,1] \cup (\{\lambda _1,\ldots ,\lambda _{2n-1}\} \cap {\mathbb {R}}). \end{aligned}$$

By inspection of the above formula for G, and because resolvents of selfadjoint operators are analytic off the real line, we have that G is continuous outside of S. Therefore, for any \(s\in {\mathbb {R}}\) such that \(\mathrm {dist}(s,S) > 0\), we have

$$\begin{aligned} \lim _{\epsilon \searrow 0} \mathrm {Im}\,G(s+i\epsilon ) = \mathrm {Im}\,G(s) = 0. \end{aligned}$$

Hence by Theorem 2.2 part (ii), for any interval \((s_1,s_2)\) such that \(\mathrm {dist}(S,(s_1,s_2)) > 0\), we have \(\mu ((s_1,s_2)) + \frac{1}{2} \mu (\{s_1\}) + \frac{1}{2} \mu (\{s_2\}) = 0\). Therefore the essential support of \(\mu \) is contained within S.

We are interested in the real roots of \(p_C\). Let us consider the potential for roots of \(p_C\) in the interval \([-1,1]\). By Proposition 3.6, \(\mathrm {d}\mu _\Delta (s) = p_C(s) \mathrm {d}\mu (s)\) for all \(s \in {\mathbb {R}}\). For any \(s\in [-1,1]\) such that \(p_C(s) \ne 0\), it follows that \(\mathrm {d}\mu (s) = \frac{2}{\pi }\frac{\sqrt{1-s^2}}{p_C(s)}\mathrm {d}s\). From this we have

$$\begin{aligned} 1 \ge \mu ((-1,1)) = \int _{-1}^1 \frac{2}{\pi }\frac{\sqrt{1-s^2}}{p_C(s)} \,\mathrm {d}s. \end{aligned}$$

This integral can only be finite if \(p_C\) has no roots in \((-1,1)\) and at most simple roots at \(\pm 1\). One example where we have simple roots at \(\pm 1\) is seen in Example 4.18 with \(\beta = \sqrt{2}\).

Since S is a disjoint union of \([-1,1]\) and a finite set \(S'\) we can write

$$\begin{aligned} \mu (s) = \frac{1}{p_C(s)} \mu _\Delta (s) + \sum _{\lambda _k \in S'} \mu (\{\lambda _k\}) \delta _{\lambda _k}(s). \end{aligned}$$

By Theorem 2.2 part (iii), \( \mu (\{s\}) = \lim _{\epsilon \searrow 0} \frac{\epsilon }{i} G(s+i\epsilon ) \text { for all } s\in {\mathbb {R}}\). This gives the desired formula for \(w_k\). \(\square \)

Remark 4.16

Theorem 4.14 gives an explicit formula for the spectral measure of J, when J is a Toeplitz-plus-finite-rank Jacobi operator. The entries of C can be computed in \({\mathcal {O}}(n^2)\) operations (for an \(n \times n\) perturbation of Toeplitz). Hence, the absolutely continuous part of the measure can be computed exactly in finite time. It would appear at first that we may compute the locations of the point spectrum by computing the roots of \(p_C\), but as stated in Remark 4.15 we find that not all real roots of \(p_C\) have \(w_k \ne 0\). Hence we rely on cancellation between the numerator and denominator in the formula for \(G(\lambda )\), which is a dangerous game: if the roots of polynomials are only known approximately then it is impossible to distinguish between cancellation and the case where a pole and a root are merely extremely close. Section 4.5 remedies this situation.

Example 4.17

(Basic perturbation 1 revisited). The polynomial \(p_C\) in Theorem 4.12 is

$$\begin{aligned} p_C(\lambda ) = c_{0,0} P_0(\lambda ) +c_{0,1} P_1(\lambda ) = 1 - \alpha (2\lambda - \alpha ) = 2\alpha \left( \frac{1}{2}(\alpha + \alpha ^{-1}) - \lambda \right) , \end{aligned}$$

and the \(\mu \)-derivative is \(p_C^\mu (\lambda ) = -2\alpha \). Theorem 4.12 gives

$$\begin{aligned} G(\lambda ) = \frac{G_\Delta (\lambda ) + 2\alpha }{2\alpha \left( \frac{1}{2}(\alpha + \alpha ^{-1})-\lambda \right) }. \end{aligned}$$

Consider the case \(|\alpha | \le 1\). Then a brief calculation reveals \(G_\Delta (\frac{1}{2}(\alpha +\alpha ^{-1})) = -2\alpha \). Hence the root \(\lambda = \frac{1}{2}(\alpha + \alpha ^{-1})\) of the denominator is always cancelled out. Hence G has no poles, and so J has no eigenvalues.

In the case where \(|\alpha | > 1\), we have a different situation. Here \(G_\Delta (\frac{1}{2}(\alpha +\alpha ^{-1})) = -2\alpha ^{-1}\). Therefore the root \(\lambda = \frac{1}{2}(\alpha + \alpha ^{-1})\) of the denominator is never cancelled out. Hence there is always a pole of G at \(\lambda = \frac{1}{2}(\alpha +\alpha ^{-1})\), and therefore also an eigenvalue of J there.

Notice a heavy reliance on cancellations in the numerator and denominator for the existence of eigenvalues. The approach in Sect. 4.5 avoids this.

Example 4.18

(Basic perturbation 2 revisited). The polynomial \(p_C\) in Theorem 4.12 is

$$\begin{aligned} p_C(\lambda ) = c_{0,0}P_0(\lambda ) + c_{0,2}P_2(\lambda ) = 1 + (\beta ^{-1}-\beta )(4\beta ^{-1}\lambda ^2 - \beta ). \end{aligned}$$

This simplifies to \(p_C(\lambda ) = 4(1-\beta ^{-2})\left( \frac{\beta ^4}{4(\beta ^2-1)} - \lambda ^2\right) \). Using Definition 2.5, the \(\mu \)-derivative is \(p_C^\mu (\lambda ) = c_{0,2}P_2^\mu (\lambda ) = 4\beta ^{-1}\lambda \). Theorem 4.12 gives

$$\begin{aligned} G(\lambda ) = \frac{G_\Delta (\lambda ) + 4(1-\beta ^{-2})\lambda }{4(1-\beta ^{-2})\left( \frac{\beta ^4}{4(\beta ^2-1)}-\lambda ^2\right) }. \end{aligned}$$

Clearly the only points at which G may have a pole are \(\lambda = \pm \frac{\beta ^2}{2\sqrt{\beta ^2-1}}\). However, it is difficult to see whether there is cancellation in the numerator. In the previous discussion of this example we noted that there are no poles when \(|\beta |<\sqrt{2}\), which means that the numerator must vanish at these points, but this is far from clear from the formula. The techniques we develop in the sequel illuminate this issue, especially for examples which are much more complicated than the two trivial ones given so far.

4.5 The Joukowski transformation

The following two lemmata and two theorems prove that the issues of cancellation and of counting the discrete spectrum in Theorems 4.12 and 4.14 can be resolved by making the change of variables

$$\begin{aligned} \lambda (z) = \frac{1}{2}(z+z^{-1}) \end{aligned}$$

This map is known as the Joukowski map. It is an analytic bijection from \({\mathbb {D}}\setminus \{0\}\), where \({\mathbb {D}} = \{z \in {\mathbb {C}}: |z| < 1 \}\), to \({\mathbb {C}}\setminus [-1,1]\) (with \(z = 0\) corresponding to \(\infty \)), and it maps the unit circle two-to-one onto the interval \([-1,1]\).

The Joukowski map has special relevance for the principal resolvent of \(\Delta \). A brief calculation reveals that for \(z \in {\mathbb {D}}\),

$$\begin{aligned} G_\Delta (\lambda (z)) = -2z. \end{aligned}$$
(4.13)

Further, we will see that the polynomials \(p_C(\lambda )\) and \(p_C^\mu (\lambda )\) occurring in our formula for G can be expressed neatly as polynomials in z and \(z^{-1}\). This is a consequence of a special property of the Chebyshev polynomials of the second kind, that for any \(k \in {\mathbb {Z}}\) and \(z \in {\mathbb {D}}\)

$$\begin{aligned} \frac{U_{m-k}(\lambda (z))}{U_m(\lambda (z))} \rightarrow z^k \text { as } m \rightarrow \infty . \end{aligned}$$
(4.14)

These convenient facts allow us to remove any square roots involved in the formulae in Theorem 4.12.
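The limit (4.14) is easily observed numerically; in the following sketch the helper `cheb_U`, which evaluates \(U_m\) by its three-term recurrence, is ours.

```python
import numpy as np

def cheb_U(m, x):
    # U_m(x) via the recurrence U_{k+1} = 2x U_k - U_{k-1}; x may be complex.
    prev, cur = 1.0, 2.0 * x
    if m == 0:
        return prev
    for _ in range(m - 1):
        prev, cur = cur, 2.0 * x * cur - prev
    return cur

z = 0.3 + 0.4j                      # a point in the open unit disc
lam = 0.5 * (z + 1.0 / z)           # the Joukowski map
for m in [10, 20, 40]:
    ratio = cheb_U(m - 3, lam) / cheb_U(m, lam)
    print(m, abs(ratio - z**3))     # -> 0, illustrating (4.14) with k = 3
```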

Lemma 4.19

Let \(p_C(\lambda ) = \sum _{k=0}^{2n-1} \langle e_0, CC^T e_k \rangle U_k(\lambda )\) as in Theorem 4.12 and let c be the symbol of \(C_{\mathrm{Toe}}\), the Toeplitz part of C as guaranteed by Theorem 4.8. Then

$$\begin{aligned} p_C(\lambda (z)) = c(z)c(z^{-1}), \end{aligned}$$
(4.15)

where \(\lambda (z) = \frac{1}{2}(z + z^{-1})\).

Proof

The key quantity to observe for this proof is

$$\begin{aligned} \frac{\sum _{k=0}^{2n-1}c_{m,m+k}P_{m+k}(\lambda (z))}{U_m(\lambda (z))}, \end{aligned}$$
(4.16)

for \(z \in {\mathbb {D}}\) as \(m\rightarrow \infty \). We will show that this quantity equals the left-hand side of equation (4.15) for every m, and that it converges to the right-hand side as \(m\rightarrow \infty \). Consider the polynomial \(U_m \cdot p_C\). The jth coefficient in an expansion in the basis \(P_0,P_1,P_2,\ldots \) is given by

$$\begin{aligned} \int \left( U_m(s)p_C(s)\right) P_j(s) \,\mathrm {d}\mu (s) = \int U_m(s)P_j(s) \,\mathrm {d}\mu _{\Delta }(s) = c_{m,j}, \end{aligned}$$

because \(p_C = \frac{\mathrm {d}\mu _{\Delta }}{\mathrm {d}\mu }\) by Proposition 3.6. Hence

$$\begin{aligned} p_C(\lambda (z)) = \frac{\sum _{k=0}^{2n-1}c_{m,m+k}P_{m+k}(\lambda (z))}{U_m(\lambda (z))}, \end{aligned}$$

for all \(m \in {\mathbb {N}}\) and all \(z \in {\mathbb {D}}\).

Now we show that (4.16) converges to \(c(z)c(z^{-1})\) as \(m\rightarrow \infty \). By the definition of the connection coefficients, \(P_{m+k} = \sum _{j=0}^{2n-1} c_{m+k-j,m+k} U_{m+k-j}\). Therefore,

$$\begin{aligned} \frac{\sum _{k=0}^{2n-1}c_{m,m+k}P_{m+k}(\lambda (z))}{U_m(\lambda (z))} = \sum _{j,k=0}^{2n-1} c_{m,m+k} c_{m+k-j,m+k}\frac{U_{m+k-j}(\lambda (z))}{U_{m}(\lambda (z))}. \end{aligned}$$

Now, by Theorem 4.8, \(C = C_{\mathrm{Toe}} + C_{\mathrm{fin}}\), where \(C_{\mathrm{fin}}\) is zero outside the \((n-1) \times (2n-1)\) principal submatrix. Hence for m sufficiently large we have \(c_{m,m+k} = t_k\) for a sequence \((t_k)_{k\in {\mathbb {Z}}}\) such that \(t_k = 0\) for \(k \notin \{0,1,\ldots ,2n-1\}\). Hence, for m sufficiently large,

$$\begin{aligned} \frac{\sum _{k=0}^{2n-1}c_{m,m+k}P_{m+k}(\lambda (z))}{U_m(\lambda (z))}&= \sum _{k=0}^{2n-1} \sum _{j=0}^{2n-1} t_k t_j \frac{U_{m+k-j}(\lambda (z))}{U_m(\lambda (z))}. \end{aligned}$$

By equation (4.14), this tends to \(\sum _{k=0}^{2n-1} \sum _{j=0}^{2n-1} t_k t_j z^{j-k}\) as \(m \rightarrow \infty \). This is equal to \(c(z)c(z^{-1})\), as required to complete the proof. \(\square \)

Lemma 4.20

Let \(p^\mu _C(\lambda ) = \sum _{k=0}^{2n-1} \langle e_k ,C^\mu C^T e_0 \rangle U_k(\lambda )\) as in Theorem 4.12 and let \(c_\mu \) be the symbol of \(C_{\mathrm{Toe}}^\mu \), the Toeplitz part of \(C^\mu \) as guaranteed by Corollary 4.10. Then

$$\begin{aligned} p_C^\mu (\lambda (z)) = c(z^{-1})c_\mu (z) - 2z, \end{aligned}$$
(4.17)

where \(\lambda (z) = \frac{1}{2}(z+z^{-1})\) and \(z \in {\mathbb {D}}\).

Proof

The key quantity to observe for this proof is

$$\begin{aligned} \frac{\sum _{k=0}^{2n-1}c_{m,m+k}P^\mu _{m+k}(\lambda (z))}{U_m(\lambda (z))}, \end{aligned}$$
(4.18)

for \(z\in {\mathbb {D}}\), as \(m \rightarrow \infty \). We will compute two equivalent expressions for this quantity to derive equation (4.17). In the proof of Lemma 4.19, it was shown that \(U_m(\lambda ) p_C(\lambda ) = \sum _{k=0}^{2n-1} c_{m,m+k}P_{m+k}(\lambda )\). We take the \(\mu \)-derivative (see Definition 2.7) of both sides as follows.

$$\begin{aligned} \int \frac{U_m(\lambda )p_C(\lambda ) - U_m(s)p_C(s)}{\lambda - s} \mathrm {d}\mu (s)&= U_m(\lambda )\int \frac{p_C(\lambda ) - p_C(s)}{\lambda - s} \mathrm {d}\mu (s) \\&\quad + \int \frac{U_m(\lambda ) - U_m(s)}{\lambda - s} p_C(s)\mathrm {d}\mu (s) \\&= U_m(\lambda ) p^\mu _C(\lambda ) + U_m^{\mu _{\Delta }}(\lambda ), \end{aligned}$$

because by Proposition 3.6, \(p_C = \frac{\mathrm {d}\mu _{\Delta }}{\mathrm {d}\mu }\). Using the formula from equation (4.3), \(U_m^{\mu _{\Delta }} = 2U_{m-1}\), we find that the \(\mu \)-derivative of \(U_m(s) p_C(s)\) is equal to \( U_m(\lambda ) p^\mu _C(\lambda ) + 2U_{m-1}(\lambda )\). Taking the \(\mu \)-derivative from the other side gives \(\sum _{k=0}^{2n-1} c_{m,m+k}P^\mu _{m+k}(\lambda )\). Taking the limit as \(m\rightarrow \infty \) and using equation (4.14), we obtain our first expression for the limit of the quantity in equation (4.18):

$$\begin{aligned} \frac{\sum _{k=0}^{2n-1}c_{m,m+k}P^\mu _{m+k}(\lambda (z))}{U_m(\lambda (z))} \rightarrow p_C^\mu (\lambda (z)) + 2z \text { as } m \rightarrow \infty . \end{aligned}$$

Now we show that (4.18) converges to \(c_\mu (z)c(z^{-1})\) as \(m\rightarrow \infty \). By the definition of the connection coefficients matrix \(C^\mu \), \(P^\mu _{m+k} = \sum _{j=1}^{2n-2} c^\mu _{m+k-j,m+k} U_{m+k-j}\). Therefore,

$$\begin{aligned} \frac{\sum _{k=0}^{2n-1}c_{m,m+k}P^\mu _{m+k}(\lambda (z))}{U_m(\lambda (z))} = \sum _{k=0}^{2n-1}\sum _{j=1}^{2n-2} c_{m,m+k} c^\mu _{m+k-j,m+k}\frac{U_{m+k-j}(\lambda (z))}{U_{m}(\lambda (z))}. \end{aligned}$$

By Corollary 4.10, \(C^\mu = C^\mu _{\mathrm{Toe}} + C^\mu _{\mathrm{fin}}\), where \(C^\mu _{\mathrm{fin}}\) is zero outside the principal \((n-2) \times (2n-1)\) submatrix. Hence for m sufficiently large we have \(c^\mu _{m,m+k} = t^\mu _k\) for a sequence \((t^\mu _k)_{k\in {\mathbb {Z}}}\) such that \(t^\mu _k = 0\) for \(k \notin \{1,\ldots ,2n-2\}\). Hence we have for sufficiently large m,

$$\begin{aligned} \frac{\sum _{k=0}^{2n-1}c_{m,m+k}P^\mu _{m+k}(\lambda (z))}{U_m(\lambda (z))}&= \sum _{k=0}^{2n-1} \sum _{j=1}^{2n-2} t_k t^\mu _j \frac{U_{m+k-j}(\lambda (z))}{U_m(\lambda (z))}. \end{aligned}$$

By equation (4.14), this tends to \(\sum _{k=0}^{2n-1} \sum _{j=1}^{2n-2} t_k t^\mu _j z^{j-k}\) as \(m \rightarrow \infty \), which is equal to \(c_\mu (z)c(z^{-1})\). Equating this with the other expression for the limit of the quantity in equation (4.18) gives \(p^\mu _C(\lambda (z)) = c(z^{-1})c_\mu (z) - 2z\) as required. \(\square \)

The following theorem recasts Theorem 4.12 under the change of variables induced by the Joukowski map. The remarkable fact is that the principal resolvent is then expressible as a rational function inside the unit disc.

Theorem 4.21

Let J be a Jacobi operator such that there exists an n such that

$$\begin{aligned} \alpha _k = 0, \quad \beta _{k-1} = \frac{1}{2} \text { for all } k \ge n, \end{aligned}$$

i.e. it is equal to the free Jacobi operator \(\Delta \) outside the \(n\times n\) principal submatrix. By Theorem 4.8 the connection coefficient matrix can be decomposed into \(C = C_{\mathrm{Toe}} + C_{\mathrm{fin}}\). By Corollary 4.10, we similarly have \(C^\mu = C^\mu _{\mathrm{Toe}} + C^\mu _{\mathrm{fin}}\). If c and \(c_\mu \) are the Toeplitz symbols of \(C_{\mathrm{Toe}}\) and \(C^\mu _{\mathrm{Toe}}\) respectively, then for \(\lambda (z) = \frac{1}{2}(z+z^{-1})\) with \(z\in {\mathbb {D}}\), the principal resolvent G is given by the rational function

$$\begin{aligned} G(\lambda (z)) = -\frac{c_\mu (z)}{c(z)}. \end{aligned}$$
(4.19)

Proof

Combining Theorem 4.12, equation (4.13) and Lemmata 4.19 and 4.20, we have

$$\begin{aligned} G(\lambda (z))&= \frac{G_\Delta (\lambda (z)) - p^\mu _C(\lambda (z))}{p_C(\lambda (z))} \\&= \frac{-2z - (c(z^{-1})c_\mu (z)-2z)}{c(z)c(z^{-1})} \\&= -\frac{c_\mu (z)}{c(z)}. \end{aligned}$$

This completes the proof. \(\square \)

The following theorem gives a better description of the weights \(w_k\) in Theorem 4.14, utilising the Joukowski map and the Toeplitz symbol c.

Theorem 4.22

Let J be a Jacobi operator such that there exists an n such that

$$\begin{aligned} \alpha _k = 0, \quad \beta _{k-1} = \frac{1}{2} \text { for all } k \ge n, \end{aligned}$$

i.e. it is equal to the free Jacobi operator \(\Delta \) outside the \(n\times n\) principal submatrix. By Theorem 4.8 the connection coefficient matrix can be written \(C = C_{\mathrm{Toe}} + C_{\mathrm{fin}}\). If c is the Toeplitz symbol of \(C_{\mathrm{Toe}}\), then the spectral measure of J is

$$\begin{aligned} \mu (s) = \frac{1}{p_C(s)}\mu _\Delta (s) + \sum _{k=1}^r \frac{(z_k-z_k^{-1})^2}{z_kc'(z_k)c(z_k^{-1})}\delta _{\lambda (z_k)}(s). \end{aligned}$$

Here \(z_k\) are the roots of c that lie in the open unit disc, which are all real and simple. The only possible roots of c on the unit circle are \(\pm 1\), and these can only be simple. Further, \(r \le n\).

Proof

By Theorem 4.14,

$$\begin{aligned} \mu (s) = \frac{1}{p_C(s)}\mu _\Delta (s) + \sum _{k=1}^{r} w_k \delta _{\lambda _k}(s), \end{aligned}$$

where \(r \le n\). Hence we need only prove the more specific statements about the roots of c, the locations \(\lambda _1,\ldots ,\lambda _r\), and the weights \(w_1,\ldots ,w_r\).

By Theorem 4.21, \(G(\lambda (z)) = -c_\mu (z)/c(z)\) for \(z \in {\mathbb {D}}\). By Lemma 4.20, \(c(z^{-1})c_\mu (z)-2z = p^\mu _C(\lambda (z)) = p^\mu _C(\lambda (z^{-1})) = c(z)c_\mu (z^{-1})-2z^{-1}\), so

$$\begin{aligned} c(z^{-1})c_\mu (z) - c(z)c_\mu (z^{-1}) = 2(z - z^{-1}). \end{aligned}$$
(4.20)

Therefore c(z) and \(c_\mu (z)\) cannot simultaneously be zero unless \(z = z^{-1}\), which only happens at \(z = \pm 1\). By the same reasoning, c(z) and \(c(z^{-1})\) also cannot be simultaneously zero unless \(z = \pm 1\). Since the Joukowski map \(\lambda \) is a bijection from \({\mathbb {D}}\setminus \{0\}\) to \({\mathbb {C}}\setminus [-1,1]\), this shows that the (simple and real) poles of G in \({\mathbb {C}}\setminus [-1,1]\) are precisely \(\lambda (z_1), \ldots ,\lambda (z_r)\), where \(z_1,\ldots ,z_r\) are the (necessarily simple and real) roots of c in \({\mathbb {D}}\).

What are the values of the weights of the Dirac deltas, \(w_1,\ldots ,w_r\)? By Theorem 4.14,

$$\begin{aligned} w_k&= \lim _{\epsilon \searrow 0}\frac{\epsilon }{i} G(\lambda (z_k)+i\epsilon ) \\&= \lim _{\lambda \rightarrow \lambda (z_k)} (\lambda (z_k)-\lambda ) G(\lambda ) \\&= \lim _{z \rightarrow z_k} \frac{1}{2}(z_k+z_k^{-1} - z-z^{-1}) (-1) \frac{c_\mu (z)}{c(z)} \\&= \lim _{z \rightarrow z_k} \frac{1}{2} z^{-1}(z-z_k)(z-z_k^{-1}) \frac{c_\mu (z)}{c(z)} \\&= \frac{1}{2} z_k^{-1}(z_k-z_k^{-1})c_\mu (z_k) \lim _{z \rightarrow z_k}\frac{(z-z_k)}{c(z)} \\&= \frac{1}{2} z_k^{-1}(z_k-z_k^{-1})\frac{c_\mu (z_k)}{c'(z_k)}. \end{aligned}$$

By equation (4.20), since \(c(z_k) = 0\), we have \(c_\mu (z_k) = 2(z_k - z_k^{-1})/c(z_k^{-1})\). This gives

$$\begin{aligned} w_k = \frac{(z_k - z_k^{-1})^2}{z_k c(z_k^{-1}) c'(z_k)}. \end{aligned}$$

Note that if \(c(z) = 0\) then \(c({\overline{z}}) = 0\) because c has real coefficients. If c has a root \(z_0\) on the unit circle, then \(c(z_0) = c(z_0^{-1}) = 0\) because \(\overline{z_0} = z_0^{-1}\), which earlier in the proof we showed only occurs if \(z_0 = \pm 1\). Hence c does not have roots on the unit circle except possibly \(\pm 1\). \(\square \)

Example 4.23

(Basic perturbation 1 re-revisited). Considering the connection coefficient matrix in equation (4.4), we see that the Toeplitz symbol c is \(c(z) = 1 - \alpha z\). By Theorem 4.22 the roots of c in the unit disc correspond to eigenvalues of \(J_\alpha \). Consistent with our previous considerations, c has a root in the unit disc if and only if \(|\alpha | > 1\), in which case the unique eigenvalue is \(\lambda (\alpha ^{-1}) = \frac{1}{2}(\alpha + \alpha ^{-1})\). See Appendix A for figures depicting the spectral measure and the resolvent.
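As a quick numerical sanity check (ours, not from the paper): for \(|\alpha | > 1\), the predicted eigenvalue \(\lambda (\alpha ^{-1})\) should be visible in the spectrum of a large finite section of \(J_\alpha \). Here we assume, consistently with the formulas of Example 4.17, that \(J_\alpha \) equals \(\Delta \) except for the \((0,0)\) entry \(\alpha /2\); the finite section is a heuristic check only, whereas Sect. 6 computes eigenvalues with error control.

```python
import numpy as np

alpha, N = 2.0, 500
# Finite section of J_alpha: free Jacobi operator with (0,0) entry alpha/2.
J = np.diag(0.5 * np.ones(N - 1), 1) + np.diag(0.5 * np.ones(N - 1), -1)
J[0, 0] = alpha / 2
print(np.linalg.eigvalsh(J).max())   # ~ 1.25
print(0.5 * (alpha + 1.0 / alpha))   # lambda(1/alpha) = 1.25
```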

Example 4.24

(Basic perturbation 2 re-revisited). Considering the connection coefficient matrix in equation (4.5), we see that the Toeplitz symbol c is \(c(z) = \beta ^{-1} + (\beta ^{-1}-\beta ) z^2\). By Theorem 4.22 the roots of c in the unit disc correspond to eigenvalues of \(J_\beta \). The roots of c are \(\pm \frac{1}{\sqrt{\beta ^2-1}}\). If \(\beta \in \left( 0,\sqrt{2}\right] \setminus \{1\}\) then \(\left| \pm \frac{1}{\sqrt{\beta ^2-1}} \right| \ge 1\), so there are no roots of c in the unit disc, consistent with our previous observations. What was difficult to see before is that if \(\beta > \sqrt{2}\), then \(\left| \pm \frac{1}{\sqrt{\beta ^2-1}} \right| < 1\), so both roots of c lie inside \({\mathbb {D}}\), and they correspond to the eigenvalues

$$\begin{aligned} \lambda \left( \pm \frac{1}{\sqrt{\beta ^2-1}}\right) = \pm \frac{1}{2}\left( \frac{1}{\sqrt{\beta ^2-1}} + \sqrt{\beta ^2-1}\right) = \pm \frac{\beta ^2}{2\sqrt{\beta ^2-1}}. \end{aligned}$$

See Appendix A for figures depicting the spectral measure and the resolvent.

5 Toeplitz-Plus-Trace-Class Jacobi Operators

In this section we extend the results of the previous section to the case where the Jacobi operator is Toeplitz-plus-trace-class. This cannot be done as a direct extension of the work in the previous section: the formulae obtained there depended on the functions involved being polynomials, so that they defined functions for all \(\lambda \) in an a priori known region of the complex plane. It may be possible to perform the analysis directly, but it is not straightforward. Since we are interested in feasible (finite) computation, we are content to deal directly with the Toeplitz-plus-finite-rank case and perform a limiting process. The crucial question for computation is: can we approximate the spectral measure of a Toeplitz-plus-trace-class Jacobi operator whilst reading only finitely many entries of the matrix?

Here we make clear the definition of a Toeplitz-plus-trace-class Jacobi operator.

Definition 5.1

An operator \(K : \ell ^2 \rightarrow \ell ^2\) is said to be trace class if \(\sum _{k=0}^{\infty } e_k^T (K^T K)^{1/2} e_k < \infty \). Accordingly, we say that a Jacobi operator J with \(\alpha _k \rightarrow 0\), \(\beta _k \rightarrow \frac{1}{2}\) as \(k \rightarrow \infty \) is Toeplitz-plus-trace-class if

$$\begin{aligned} \sum _{k=0}^\infty \left| \beta _k - \frac{1}{2}\right| + |\alpha _k| < \infty . \end{aligned}$$

5.1 Jacobi operators for Jacobi polynomials

The best-known class of orthogonal polynomials is the Jacobi polynomials, whose measure of orthogonality is

$$\begin{aligned} \mathrm {d}\mu (s) = \left( 2^{\alpha +\beta +1}B(\alpha +1,\beta +1)\right) ^{-1}(1-s)^\alpha (1+s)^\beta \bigg |_{s \in [-1,1]} \mathrm {d}s, \end{aligned}$$

where \(\alpha ,\beta > -1\) and B is Euler’s Beta function. The three-term recurrence coefficients for the orthonormal Jacobi polynomials with respect to this probability measure, and hence the entries of the corresponding Jacobi operator, are given by [36],

$$\begin{aligned} \alpha _k&= \frac{\beta ^2-\alpha ^2}{(2k+\alpha +\beta )(2k+\alpha +\beta +2)}, \\ \beta _{k-1}&= 2\sqrt{\frac{k(k+\alpha )(k+\beta )(k+\alpha +\beta )}{ (2k+\alpha +\beta -1)(2k+\alpha +\beta )^2(2k+\alpha +\beta +1)}}. \end{aligned}$$

Note that \(|\alpha _k| = {\mathcal {O}}(k^{-2})\) and

$$\begin{aligned} \beta _{k-1} = \frac{1}{2}\sqrt{1+\frac{(4-8\alpha ^2-8\beta ^2)k^2+{\mathcal {O}}(k)}{(2k+\alpha +\beta -1)(2k+\alpha +\beta )^2(2k+\alpha +\beta +1)}} = \frac{1}{2} + {\mathcal {O}}(k^{-2}). \end{aligned}$$

Hence the Jacobi operators for the Jacobi polynomials are Toeplitz-plus-trace-class for all \(\alpha ,\beta > -1\).
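The decay rates are easy to inspect numerically. The sketch below (the helper name `jacobi_coeffs` is ours) evaluates the displayed formulas and scales by \(k^2\):

```python
import numpy as np

def jacobi_coeffs(k, a, b):
    """Recurrence coefficients (alpha_k, beta_{k-1}) of the orthonormal
    Jacobi polynomials, taken directly from the displayed formulas."""
    s = 2 * k + a + b
    alpha_k = (b**2 - a**2) / (s * (s + 2))
    beta_km1 = 2 * np.sqrt(k * (k + a) * (k + b) * (k + a + b)
                           / ((s - 1) * s**2 * (s + 1)))
    return alpha_k, beta_km1

a, b = 0.3, -0.4
for k in [1, 10, 100, 1000]:
    al, be = jacobi_coeffs(k, a, b)
    # k^2 |alpha_k| and k^2 |beta_{k-1} - 1/2| remain bounded: O(k^{-2})
    print(k, k**2 * abs(al), k**2 * abs(be - 0.5))
```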

The Chebyshev polynomials \(T_k\) and \(U_k\) discussed in the previous section are specific cases of Jacobi polynomials, with \(\alpha ,\beta = -\frac{1}{2},-\frac{1}{2}\) for \(T_k\) and \(\alpha ,\beta = \frac{1}{2},\frac{1}{2}\) for \(U_k\).

In Appendix A numerical computations of the spectral measures and resolvents of these Jacobi operators are presented.

5.2 Toeplitz-plus-finite-rank approximations

We propose to use the techniques from Sect. 4. To that end, for a Jacobi operator J we define the Toeplitz-plus-finite-rank approximations \(J^{[m]}\), where

$$\begin{aligned} J^{[m]}_{i,j} = {\left\{ \begin{array}{ll} J_{i,j} &{} \text { if } 0 \le i,j < m \\ \Delta _{i,j} &{} \text { otherwise.} \end{array}\right. } \end{aligned}$$
(5.1)

Each Jacobi operator \(J^{[m]}\) has a spectral measure \(\mu ^{[m]}\) which can be computed using Theorem 4.22. The main question for this section is: how do the computable measures \(\mu ^{[m]}\) approximate the spectral measure \(\mu \) of J?
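For completeness, here is a small sketch (ours) of the truncation (5.1) expressed in terms of coefficient functions; it can be fed to a routine such as the `connection_coeffs` sketch of Sect. 4 to compute \(\mu ^{[m]}\) via Theorem 4.22.

```python
def truncated_coeffs(alpha, beta, m):
    """Coefficients of the Toeplitz-plus-finite-rank approximation J^[m]
    in equation (5.1): entries of J inside the m x m principal submatrix
    are kept; free Jacobi entries are used elsewhere."""
    alpha_m = lambda k: alpha(k) if k < m else 0.0
    # beta(k) occupies positions (k, k+1) and (k+1, k) of the matrix,
    # which lie inside the m x m block iff k + 1 < m
    beta_m = lambda k: beta(k) if k + 1 < m else 0.5
    return alpha_m, beta_m
```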

Proposition 5.2

Let J be a Jacobi operator (bounded, but with no further structure assumed) and let \(\mu \) be its spectral measure. Then the measures \(\mu ^{[1]},\mu ^{[2]},\ldots \), the spectral measures of \(J^{[1]},J^{[2]},\ldots \), converge to \(\mu \) in a weak sense. Precisely,

$$\begin{aligned} \lim _{m \rightarrow \infty } \int f(s) \,\mathrm {d}\mu ^{[m]}(s) = \int f(s) \,\mathrm {d}\mu (s), \end{aligned}$$

for all \(f \in C_b({\mathbb {R}})\).

Proof

The spectral measures \(\mu ^{[m]}\) and \(\mu \) are supported on the spectra of \(J^{[m]}\) and J respectively, which are contained within \([-\Vert J^{[m]}\Vert _2,\Vert J^{[m]}\Vert _2]\) and \([-\Vert J\Vert _2,\Vert J\Vert _2]\). Since \(\Vert J^{[m]}\Vert _2\) and \(\Vert J\Vert _2\) are both less than

$$\begin{aligned} M = 3\left( \sup _{k \ge 0} |\alpha _k| + \sup _{k \ge 0} |\beta _k|\right) , \end{aligned}$$

we have that all the spectral measures involved are supported within the interval \([-M,M]\). Hence we can consider integrating functions \(f \in C([-M,M])\) without ambiguity.

By Weierstrass’ Theorem, polynomials are dense in \(C([-M,M])\), so we need only consider polynomials as test functions, and by linearity we need only consider the orthonormal polynomials for J. For the first polynomial \(P_0\) convergence is immediate, since the measures are all probability measures. Now consider \(P_k\) for some \(k>0\), which satisfies \(\int P_k(s) \,\mathrm {d}\mu (s) = 0\). For \(m > k\), \(P_k\) is also the kth orthonormal polynomial for \(J^{[m]}\), hence \(\int P_k(s) \,\mathrm {d}\mu ^{[m]}(s) = 0\). This completes the proof. \(\square \)

5.3 Asymptotics of the connection coefficients

Here we formulate a lower triangular block operator equation \({\mathcal {L}}{\underline{c}} = e_0^0\), where \(e_0^0 = (e_0, 0, 0, \ldots )^\top \), satisfied by the entries of the connection coefficient matrices encoded into a vector \({\underline{c}}\). For Toeplitz-plus-trace-class Jacobi operators we give appropriate Banach spaces upon which the operator \({\mathcal {L}}\) is bounded and invertible, enabling precise results about the asymptotics of the connection coefficients to be derived.

Lemma 5.3

Let J and D be Jacobi operators with entries \(\{\alpha _k,\beta _k\}_{k=0}^\infty \) and \(\{\gamma _k,\delta _k\}_{k=0}^\infty \) respectively. If we decompose the upper triangular part of \(C_{J\rightarrow D}\) into a sequence of sequences, stacking each diagonal on top of each other, we get the following block linear system,

(5.2)

where for each i,

For \(B_{-1}\) to make sense we define \(\beta _{-1} = 1/2\).

Proof

This is simply the 5-point discrete system in Lemma 3.2 rewritten. \(\square \)

We write the system (5.2), whose blocks are themselves infinite-dimensional, in the form

$$\begin{aligned} {\mathcal {L}} {\underline{c}} = e_0^0. \end{aligned}$$
(5.3)

For general Jacobi operators J and D, the operators \(A_i\) and \(B_i\) are well-defined linear operators from \(\ell _{{\mathcal {F}}}^\star \) to \(\ell _{{\mathcal {F}}}^\star \). The block operator \({\mathcal {L}}\) is then considered as a linear operator from the space of sequences of real sequences, \(\ell _{{\mathcal {F}}}^\star (\ell _{{\mathcal {F}}}^\star )\), to itself. We will use this kind of notation for other spaces as follows.

Definition 5.4

(Vector-valued sequences). If \(\ell _X\) is a vector space of scalar-valued sequences, and Y is another vector space, then we let \(\ell _X(Y)\) denote the vector space of sequences of elements of Y. When \(\ell _X\) and Y are both normed spaces, \(\ell _X(Y)\) naturally defines a normed space in which the norm is derived from that of \(\ell _X\) by replacing each instance of the absolute value with the norm on Y. For example, \(\ell ^p(\ell ^\infty )\) is a normed space with norm \(\Vert (a_k)_{k=0}^\infty \Vert _{\ell ^p(\ell ^\infty )} = \left( \sum _{k=0}^\infty \Vert a_k\Vert _\infty ^p\right) ^{\frac{1}{p}}\).

The following two spaces are relevant for the Toeplitz-plus-trace-class Jacobi operators.

Definition 5.5

(Sequences of bounded variation). Following [21, Ch. IV.2.3], denote by bv the Banach space of all sequences with bounded variation, that is sequences such that the norm

$$\begin{aligned} \Vert a\Vert _{bv} = |a_0| + \sum _{k=0}^\infty |a_{k+1}-a_k|, \end{aligned}$$

is finite.

The following result is immediate from the definition of the norm on bv.

Lemma 5.6

There is a continuous embedding of bv into the Banach space of convergent sequences (endowed with the supremum norm) i.e. for all \((a_k)_{k=0}^\infty \in bv\), \(\lim _{k\rightarrow \infty } a_k\) exists, and \(\sup _{k}|a_k| \le \Vert (a_k)_{k=0}^\infty \Vert _{bv}\). Furthermore, \(\lim _{k\rightarrow \infty } |a_k| \le \Vert a\Vert _{bv}\).

Definition 5.7

(Geometrically weighted \(\ell ^1\)). For any \(R > 0\), we define the Banach space \(\ell ^1_R\) to be the space of sequences such that the norm

$$\begin{aligned} \Vert v\Vert _{\ell _R^1} = \sum _{k=0}^\infty R^k |v_k|, \end{aligned}$$

is finite.

Proposition 5.8

The norm of an operator \(A = (a_{ij})_{i,j=0}^\infty \) on \(\ell ^1_R\) is equal to

$$\begin{aligned} \Vert A\Vert _{\ell _R^1 \rightarrow \ell _R^1} = \sup _{j} \sum _{i} R^{i-j} |a_{ij}|. \end{aligned}$$
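This is the weighted analogue of the classical fact that the \(\ell ^1 \rightarrow \ell ^1\) operator norm is the maximal absolute column sum. A sketch of the standard reasoning, with \(W = \mathrm {diag}(1,R,R^2,\ldots )\) the isometric isomorphism from \(\ell ^1_R\) to \(\ell ^1\):

$$\begin{aligned} \Vert A\Vert _{\ell _R^1 \rightarrow \ell _R^1} = \Vert WAW^{-1}\Vert _{\ell ^1 \rightarrow \ell ^1} = \sup _{j} \sum _{i} \left| R^i a_{ij} R^{-j}\right| = \sup _{j} \sum _{i} R^{i-j} |a_{ij}|. \end{aligned}$$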

The following lemma and its corollary show that it is natural to think of \({\underline{c}}\) as lying in the space \(\ell ^1_R(bv)\).

Lemma 5.9

Let \(J = \Delta + K\) be a Jacobi operator where K is trace class and let \(D = \Delta \). Then for any \(R \in (0,1)\) the operator \({\mathcal {L}}\) in equation (5.3) is bounded and invertible as an operator from \(\ell _R^1(bv)\) to \(\ell ^1_R(\ell ^1)\). Furthermore, if \({\mathcal {L}}^{[m]}\) is the operator in equation (5.3) generated by the Toeplitz-plus-finite-rank truncation \(J^{[m]}\), then

$$\begin{aligned} \Vert {\mathcal {L}}-{\mathcal {L}}^{[m]}\Vert _{\ell ^1_R(bv)\rightarrow \ell ^1_R(\ell ^1)} \rightarrow 0 \text { as } m \rightarrow \infty . \end{aligned}$$

Proof

We can write \({\mathcal {L}}\) in equation (5.3) in the form \({\mathcal {L}} = {\mathcal {T}} + {\mathcal {K}}\) where

and

This decomposition will allow us to prove that \({\mathcal {L}}\) is bounded and invertible as follows. We will show that as operators from \(\ell ^1_R(bv)\) to \(\ell ^1_R(\ell ^1)\), \({\mathcal {T}}\) is bounded and invertible, and \({\mathcal {K}}\) is compact. This implies that \({\mathcal {L}}\) is a Fredholm operator with index 0. Therefore, by the Fredholm Alternative Theorem, \({\mathcal {L}}\) is invertible if and only if it is injective. It is indeed injective, because it is block lower triangular with invertible diagonal blocks, so forward substitution on the system \({\mathcal {L}} {\underline{v}} = {\underline{0}}\) implies that each entry of \({\underline{v}}\) must be zero.

First let us prove that \({\mathcal {T}}\) is bounded and invertible. It is elementary that T is an isometric isomorphism from bv to \(\ell ^1\) and \(T^T\) is bounded with norm at most 1. Hence using Proposition 5.8 we have

$$\begin{aligned} \Vert {\mathcal {T}}\Vert _{\ell ^1_R(bv) \rightarrow \ell ^1_R(\ell ^1)} = R^0\Vert T\Vert _{bv \rightarrow \ell ^1} + R^2 \Vert T^T\Vert _{bv \rightarrow \ell ^1} \le 1 + R^2. \end{aligned}$$

Because each operator is lower triangular, the two-sided inverse of \({\mathcal {T}} : \ell _{{\mathcal {F}}}(\ell _{{\mathcal {F}}}) \rightarrow \ell _{{\mathcal {F}}}(\ell _{{\mathcal {F}}})\) is

This matrix is block-lower triangular and block-Toeplitz, with the 2ith block of its first column of the form \(T^{-1}(-T^T T^{-1})^i\) and the \((2i+1)\)th block zero. We must check that \({\mathcal {T}}^{-1}\) is bounded from \(\ell _R^1(\ell ^1)\) to \(\ell _R^1(bv)\), so that it may be extended to all of \(\ell _R^1(\ell ^1)\) from the dense subspace \(\ell _{{\mathcal {F}}}(\ell _{{\mathcal {F}}})\). Again using Proposition 5.8 we have

$$\begin{aligned} \Vert {\mathcal {T}}^{-1}\Vert _{\ell _R^1(\ell ^1)\rightarrow \ell _R^1(bv)}&= \sup _j \sum _{i=j}^\infty R^{2(i-j)}\Vert T^{-1}(-T^T T^{-1})^{i-j}\Vert _{\ell ^1\rightarrow bv} \\&= \sum _{k=0}^\infty R^{2k} \Vert T^{-1}(-T^T T^{-1})^k\Vert _{\ell ^1\rightarrow bv} \\&\le \sum _{k=0}^\infty R^{2k} \Vert T^{-1}\Vert _{\ell ^1\rightarrow bv}\left( \Vert T^T\Vert _{bv \rightarrow \ell ^1}\Vert T^{-1}\Vert _{\ell ^1\rightarrow bv}\right) ^k \\&\le \sum _{k=0}^\infty R^{2k} = (1-R^2)^{-1} < \infty . \end{aligned}$$

Now let us prove that \({\mathcal {K}} : \ell _R^1(bv) \rightarrow \ell _R^1(\ell ^1)\) is compact. Consider the finite rank operator \({\mathcal {K}}^{[m]}\), where all elements are the same as in \({\mathcal {K}}\), except that all occurrences of \(\alpha _i\) and \(2\beta _i - 1\) are replaced by 0 for \(i \ge m\). Using Proposition 5.8 we have

$$\begin{aligned} \Vert {\mathcal {K}}-{\mathcal {K}}^{[m]}\Vert _{\ell _R^1(bv) \rightarrow \ell _R^1(\ell ^1)}&= \sup _j \Big ( R^0\Vert K_{j-1} - K^{[m]}_{j-1}\Vert _{bv \rightarrow \ell ^1} + R^1\Vert A_j - A^{[m]}_{j}\Vert _{bv \rightarrow \ell ^1} \\&\quad + R^2\Vert K_j - K^{[m]}_j\Vert _{bv \rightarrow \ell ^1}\Big ). \end{aligned}$$

By the continuous embedding in Lemma 5.6, \(\Vert \cdot \Vert _{bv \rightarrow \ell ^1} \le \Vert \cdot \Vert _{\ell ^\infty \rightarrow \ell ^1}\). Hence

$$\begin{aligned} \Vert {\mathcal {K}}-{\mathcal {K}}^{[m]}\Vert _{\ell _R^1(bv) \rightarrow \ell _R^1(\ell ^1)}&\le \sum _{k=m}^\infty R^0|2\beta _{k-1} -1| + R^1|\alpha _k| + R^2|2\beta _k - 1| \\&\rightarrow 0 \text { as } m \rightarrow \infty . \end{aligned}$$

This convergence is due to the fact that \(J-\Delta \) is trace class. Since \({\mathcal {K}}\) is a norm limit of finite rank operators it is compact. This completes the proof that \({\mathcal {L}}\) is bounded and invertible.

Now consider the operator \({\mathcal {L}}^{[m]}\) defined in the statement of the Lemma, which is equal to \({\mathcal {T}} + {\mathcal {K}}^{[m]}\) (where \({\mathcal {K}}^{[m]}\) is precisely that which was considered whilst proving \({\mathcal {K}}\) is compact). Hence,

$$\begin{aligned} \Vert {\mathcal {L}}-{\mathcal {L}}^{[m]}\Vert _{\ell ^1_R(bv)\rightarrow \ell ^1_R(\ell ^1)} = \Vert {\mathcal {K}}-{\mathcal {K}}^{[m]}\Vert _{\ell ^1_R(bv)\rightarrow \ell ^1_R(\ell ^1)} \rightarrow 0 \text { as } m \rightarrow \infty . \end{aligned}$$

This completes the proof. \(\square \)

Corollary 5.10

Let \(J = \Delta + K\) be a Jacobi operator where K is trace class and let \({\underline{c}} \in \ell _{{\mathcal {F}}}^\star (\ell _{{\mathcal {F}}}^\star )\) be the vector of diagonals of \(C_{J\rightarrow \Delta }\) as in equation (5.3). Then \({\underline{c}} \in \ell ^1_R(bv)\). If J has Toeplitz-plus-finite-rank approximations \(J^{[m]}\) and \({\underline{c}}^{[m]}\) denotes the vector of diagonals of \(C^{[m]}\), then

$$\begin{aligned} \Vert {\underline{c}} - {\underline{c}}^{[m]}\Vert _{\ell ^1_R(bv)} \rightarrow 0 \text { as } m \rightarrow \infty . \end{aligned}$$

Proof

By equation (5.3)

$$\begin{aligned} {\underline{c}} - {\underline{c}}^{[m]} = ({\mathcal {L}}^{-1} - ({\mathcal {L}}^{[m]})^{-1})e_0^0. \end{aligned}$$

Since \(\Vert e_0^0\Vert _{\ell _R^1(\ell ^1)} = 1\), the proof is completed if we show \(\Vert {\mathcal {L}}^{-1} - ({\mathcal {L}}^{[m]})^{-1}\Vert _{\ell _R^1(\ell ^1) \rightarrow \ell _R^1(bv)} \rightarrow 0\) as \(m \rightarrow \infty \).

Suppose that m is sufficiently large so that \(\Vert {\mathcal {L}} - {\mathcal {L}}^{[m]}\Vert < \Vert {\mathcal {L}}^{-1}\Vert ^{-1}\) (guaranteed by Lemma 5.9). Note that \({\mathcal {L}}^{-1}\) is bounded by the Inverse Mapping Theorem and Lemma 5.9. Then by a well-known result (see for example, [2]),

$$\begin{aligned} \Vert {\mathcal {L}}^{-1} - ({\mathcal {L}}^{[m]})^{-1}\Vert \le \frac{\Vert {\mathcal {L}}^{-1}\Vert ^2 \Vert {\mathcal {L}} - {\mathcal {L}}^{[m]}\Vert }{1-\Vert {\mathcal {L}}^{-1}\Vert \Vert {\mathcal {L}}-{\mathcal {L}}^{[m]}\Vert }. \end{aligned}$$

This tends to zero as \(m \rightarrow \infty \), by Lemma 5.9. \(\square \)

Theorem 5.11

Let \(J = \Delta + K\) be a Jacobi operator where K is trace class. Then \(C = C_{J\rightarrow \Delta }\) can be decomposed into

$$\begin{aligned} C = C_{\mathrm{Toe}} + C_{\mathrm{com}}, \end{aligned}$$

where \(C_{\mathrm{Toe}}\) is upper triangular, Toeplitz and bounded as an operator from \(\ell ^1_{R}\) to \(\ell ^1_{R}\), and \(C_{\mathrm{com}}\) is compact as an operator from \(\ell ^1_{R}\) to \(\ell ^1_{R}\), for all \(R > 1\). Also, if J has Toeplitz-plus-finite-rank approximations \(J^{[m]}\) with connection coefficient matrices \(C^{[m]}=C_{\mathrm{Toe}}^{[m]}+C_{\mathrm{com}}^{[m]}\), then

$$\begin{aligned} C^{[m]} \rightarrow C, \quad C_{\mathrm{Toe}}^{[m]} \rightarrow C_{\mathrm{Toe}}, \quad C_{\mathrm{com}}^{[m]} \rightarrow C_{\mathrm{com}} \text { as } m\rightarrow \infty , \end{aligned}$$

in the operator norm topology over \(\ell ^1_{R}\) for all \(R>1\).

Proof

By Lemma 5.9, for each k the sequence \((c_{0,0+k},c_{1,1+k},c_{2,2+k},\ldots )\) is an element of bv. By Lemma 5.6 each is therefore a convergent sequence, whose limit we call \(t_k\). Hence we can define an upper triangular Toeplitz matrix \(C_{\mathrm{Toe}}\) whose \((i,j)\)th element is \(t_{j-i}\), and define \(C_{\mathrm{com}} = C-C_{\mathrm{Toe}}\).

The Toeplitz matrix \(C_{\mathrm{Toe}}\) is bounded from \(\ell ^1_{R}\) to \(\ell ^1_{R}\) for all \(R>1\) by the following calculation.

$$\begin{aligned} \Vert C_{\mathrm{Toe}}\Vert _{\ell ^1_{R}\rightarrow \ell ^1_{R}}&= \sup _j \sum _{i=0}^j R^{i-j}|t_{j-i}| \\&= \sum _{k=0}^\infty R^{-k}|t_k| \\&\le \sum _{k=0}^\infty R^{-k} \Vert c_{*,*+k}\Vert _{bv} \\&= \Vert {\underline{c}}\Vert _{\ell ^1_{R^{-1}}(bv)}. \end{aligned}$$

By Lemma 5.9 this quantity is finite (since \(R^{-1} \in (0,1)\)).

Now we show convergence results. The compactness of \(C_{\mathrm{com}}\) will follow at the end. For all \(R>1\),

$$\begin{aligned} \Vert C-C^{[m]}\Vert _{\ell ^1_{R}\rightarrow \ell ^1_{R}}&= \sup _j \sum _{i=0}^j R^{i-j} |c_{i,j}-c_{i,j}^{[m]}| \\&= \sup _j \sum _{k=0}^j R^{-k} |c_{j-k,j} - c_{j-k,j}^{[m]}| \\&\le \sup _j \sum _{k=0}^j R^{-k} \Vert c_{*,*+k} - c^{[m]}_{*,*+k}\Vert _{bv} \\&= \sum _{k=0}^\infty R^{-k} \Vert c_{*,*+k} - c^{[m]}_{*,*+k}\Vert _{bv} \\&= \Vert {\underline{c}}-{\underline{c}}^{[m]}\Vert _{\ell ^1_{R^{-1}}(bv)}. \end{aligned}$$

For the third line of the above sequence of equations, note that for fixed k, \(c_{0,k}-c^{[m]}_{0,k},c_{1,1+k}-c^{[m]}_{1,1+k},c_{2,2+k}-c^{[m]}_{2,2+k},\ldots \) is a bv sequence, and refer to Lemma 5.6.

$$\begin{aligned} \Vert C_{\mathrm{Toe}}-C_{\mathrm{Toe}}^{[m]}\Vert _{\ell ^1_{R}\rightarrow \ell ^1_{R}}&= \sup _j \sum _{i=0}^j R^{i-j} |t_{j-i}-t_{j-i}^{[m]}| \\&=\sum _{k=0}^\infty R^{-k} |t_k -t_k^{[m]}| \\&\le \sum _{k=0}^\infty R^{-k} \Vert c_{*,*+k} - c^{[m]}_{*,*+k}\Vert _{bv} \\&= \Vert {\underline{c}}-{\underline{c}}^{[m]}\Vert _{\ell ^1_{R^{-1}}(bv)}. \end{aligned}$$

For the third line of the above sequence, note that \(t_k -t_k^{[m]}\) is the limit of the bv sequence \(c_{*,*+k} - c^{[m]}_{*,*+k}\), and refer to Lemma 5.6.

$$\begin{aligned} \Vert C_{\mathrm{com}}-C_{\mathrm{com}}^{[m]}\Vert _{\ell ^1_{R}\rightarrow \ell ^1_{R}} \le \Vert C-C^{[m]}\Vert + \Vert C_{\mathrm{Toe}}-C_{\mathrm{Toe}}^{[m]}\Vert \le 2\Vert {\underline{c}}-{\underline{c}}^{[m]}\Vert _{\ell ^1_{R^{-1}}(bv)}. \end{aligned}$$

Using Corollary 5.10 with R replaced by \(R^{-1} \in (0,1)\), so that \(\Vert {\underline{c}}-{\underline{c}}^{[m]}\Vert _{\ell ^1_{R^{-1}}(bv)} \rightarrow 0\) as \(m\rightarrow \infty \), we have the convergence results.

By Theorem 4.8, \(C_{\mathrm{com}}^{[m]}\) has finite rank. Therefore, since \(C_{\mathrm{com}} = \lim _{m\rightarrow \infty } C_{\mathrm{com}}^{[m]}\) in the operator norm topology over \(\ell ^1_{R}\), we have that \(C_{\mathrm{com}}\) is compact in that topology. \(\square \)

Corollary 5.12

Let \(C^\mu \) be as defined in Definition 3.11 for C as in Theorem 5.11. Then \(C^\mu \) can be decomposed into \(C^\mu = C^\mu _{\mathrm{Toe}} + C^\mu _{\mathrm{com}}\) where \(C^\mu _{\mathrm{Toe}}\) is upper triangular, Toeplitz and bounded as an operator from \(\ell ^1_{R}\) to \(\ell ^1_{R}\), and \(C^\mu _{\mathrm{com}}\) is compact as an operator from \(\ell ^1_{R}\) to \(\ell ^1_{R}\), for all \(R > 1\). Furthermore, if J has Toeplitz-plus-finite-rank approximations \(J^{[m]}\) with connection coefficient matrices \((C^\mu )^{[m]}=(C^\mu _{\mathrm{Toe}})^{[m]}+(C^\mu _{\mathrm{com}})^{[m]}\), then

$$\begin{aligned} (C^\mu )^{[m]} \rightarrow C^\mu , \quad (C_{\mathrm{Toe}}^\mu )^{[m]} \rightarrow C^\mu _{\mathrm{Toe}}, \quad (C^\mu _{\mathrm{com}})^{[m]} \rightarrow C^\mu _{\mathrm{com}} \text { as } m\rightarrow \infty , \end{aligned}$$

in the operator norm topology over \(\ell ^1_{R}\).

Proof

This follows from Theorem 5.11 applied to \(J^\mu \) as defined in Lemma 3.13. \(\square \)

Theorem 5.13

Let J be a Jacobi operator such that \(J = \Delta + K\) where K is trace class. The Toeplitz symbols c and \(c_\mu \) of the Toeplitz parts of \(C_{J\rightarrow \Delta }\) and \(C^\mu _{J\rightarrow \Delta }\) are both analytic in the unit disc. Furthermore, if J has Toeplitz-plus-finite-rank approximations \(J^{[m]}\) with Toeplitz symbols \(c^{[m]}\) and \(c_\mu ^{[m]}\), then \(c^{[m]} \rightarrow c\) and \(c^{[m]}_\mu \rightarrow c_\mu \) as \(m \rightarrow \infty \) uniformly on compact subsets of \({\mathbb {D}}\).

Proof

Let \(R > 1\), and let \(0 \le |z| \le R^{-1} < 1\). Then by Lemma 5.6 we have

$$\begin{aligned} \left| \sum _{k=0}^\infty t_k z^k \right|&\le \sum _{k=0}^\infty |t_k| R^{-k} \le \sum _{k=0}^\infty \Vert c_{*,*+k}\Vert _{bv} R^{-k} = \Vert {\underline{c}}\Vert _{\ell ^1_{R^{-1}}(bv)}, \end{aligned}$$

where \({\underline{c}}\) is as defined in equation (5.3). By Lemma 5.9 this quantity is finite. Since \(R > 1\) is arbitrary, the radius of convergence of the series is at least 1, so c is analytic in the unit disc. The same is true for \(c_\mu \) by Lemma 3.13.

Now we prove that the Toeplitz symbols corresponding to the Toeplitz-plus-finite-rank approximations converge.

$$\begin{aligned} \sup _{|z|\le R^{-1}} |c(z)-c^{[m]}(z)|&= \sup _{|z|\le R^{-1}} \left| \sum _{k=0}^\infty (t_k-t_k^{[m]})z^k \right| \\&\le \sum _{k=0}^\infty |t_k - t^{[m]}_{k}| R^{-k} \\&\le \sum _{k=0}^\infty \Vert c_{*,*+k} - c^{[m]}_{*,*+k}\Vert _{bv}R^{-k} = \Vert {\underline{c}}-{\underline{c}}^{[m]}\Vert _{\ell _{R^{-1}}^1(bv)}, \end{aligned}$$

To go between the first and second lines, note that for each k, \(c_{*,*+k} - c^{[m]}_{*,*+k}\) is a bv sequence whose limit is \(t_k - t^{[m]}_{k}\) and refer to Lemma 5.6. Now, \(\Vert {\underline{c}}-{\underline{c}}^{[m]}\Vert _{\ell _{R^{-1}}^1(bv)} \rightarrow 0\) as \(m \rightarrow \infty \) by Corollary 5.10. The same is true for \(\sup _{|z|\le R^{-1}} |c_\mu (z)-c_\mu ^{[m]}(z)|\) by Lemma 3.13. \(\square \)

Theorem 5.14

(See [31]). Let A and B be bounded self-adjoint operators on \(\ell ^2\). Then

$$\begin{aligned} \mathrm {dist}(\sigma (A),\sigma (B)) \le \Vert A-B\Vert _2. \end{aligned}$$

Theorem 5.15

Let \(J = \Delta + K\) be a Toeplitz-plus-trace-class Jacobi operator, and let c and \(c_\mu \) be the analytic functions as defined in Theorem 5.11 and Corollary 5.12. Then for \(\lambda (z) = \frac{1}{2}(z+z^{-1})\) with \(z\in {\mathbb {D}}\) such that \(\lambda (z) \notin \sigma (J)\), the principal resolvent G is given by the meromorphic function

$$\begin{aligned} G(\lambda (z)) = -\frac{c_\mu (z)}{c(z)}. \end{aligned}$$
(5.4)

Therefore, all eigenvalues of J are of the form \(\lambda (z_k)\), where \(z_k\) is a root of c in \({\mathbb {D}}\).

Proof

Let \(z \in {\mathbb {D}}\) such that \(\lambda (z) \notin \sigma (J)\), and let \(J^{[m]}\) denote the Toeplitz-plus-finite-rank approximations of J with principal resolvents \(G^{[m]}\). Then \(J^{[m]} \rightarrow J\) as \(m\rightarrow \infty \), so by Theorem 5.14 there exists M such that for all \(m\ge M\), \(\lambda (z) \notin \sigma (J^{[m]})\). For such m, both \(G(\lambda (z))\) and \(G^{[m]}(\lambda (z))\) are well defined, and using a well-known result on the difference of inverses (see for example, [2]), we have

$$\begin{aligned} |G^{[m]}(\lambda ) - G(\lambda )|&= \left| \left\langle e_0,\left( (J^{[m]} - \lambda )^{-1} - (J-\lambda )^{-1}\right) e_0\right\rangle \right| \\&\le \Vert (J^{[m]} - \lambda )^{-1} - (J-\lambda )^{-1}\Vert _2\\&\le \frac{\Vert (J-\lambda )^{-1}\Vert _2^2\Vert J-J^{[m]}\Vert _2}{1-\Vert (J-\lambda )^{-1}\Vert _2\Vert J-J^{[m]}\Vert _2}\\&\rightarrow 0 \text { as } m \rightarrow \infty . \end{aligned}$$

Theorem 5.13 shows that \(\lim _{m\rightarrow \infty } c^{[m]}_\mu (z)/c^{[m]}(z) = c_\mu (z)/c(z)\). Therefore, by Theorem 4.21 applied to each \(J^{[m]}\), the two limits agree and we have equation (5.4). \(\square \)

6 Computability Aspects

In this section we discuss computability questions à la Ben-Artzi–Colbrook–Hansen–Nevanlinna–Seidel [4, 5, 28]. This involves an informal definition of the Solvability Complexity Index (SCI), a recent development that rigorously describes the extent to which various scientific computing problems can be solved. It is in contrast to classical computability theory à la Turing, in which problems are solvable exactly in finite time. In scientific computing we are often interested in problems whose solutions can only be approximated in finite time, with the hope that the approximation can be made as accurate as desired: for example, the solution of a differential equation, the roots of a polynomial, or the spectrum of a linear operator.

Throughout this section we will consider only real number arithmetic, and the results do not necessarily apply to algorithms using floating point arithmetic.

The Solvability Complexity Index (SCI) has a rather lengthy definition, but in the end is quite intuitive [4].

Definition 6.1

(Computational problem). A computational problem is a 4-tuple, \(\{\Xi ,\Omega ,\Lambda ,{\mathcal {M}}\}\), where \(\Omega \) is a set, called the input set, \(\Lambda \) is a set of functions from \(\Omega \) into the complex numbers, called the evaluation set, \({\mathcal {M}}\) is a metric space, and \(\Xi : \Omega \rightarrow {\mathcal {M}}\) is the problem function.

Definition 6.2

(General Algorithm). Given a computational problem \(\{\Xi ,\Omega ,\Lambda ,{\mathcal {M}}\}\), a general algorithm is a function \(\Gamma : \Omega \rightarrow {\mathcal {M}}\) such that for each \(A \in \Omega \),

  (i) The action of \(\Gamma \) on A only depends on the set \(\{f(A) : f \in \Lambda _\Gamma (A) \}\), where \(\Lambda _\Gamma (A)\) is a finite subset of \(\Lambda \);

  (ii) For every \(B \in \Omega \), \(f(B) = f(A)\) for all \(f \in \Lambda _\Gamma (A)\) implies \(\Lambda _\Gamma (B) = \Lambda _\Gamma (A)\).

This definition of an algorithm is very general indeed. There are no assumptions on how \(\Gamma \) computes its output; requirement (i) ensures that it can only use a finite amount of information about its input, and requirement (ii) ensures that the algorithm will only be affected by changes in the inputs which are actually measured. In short, \(\Gamma \) depends only on, and is determined by, finitely many evaluable elements of each input.

Definition 6.3

(Solvability Complexity Index). A computational problem \(\{\Xi ,\Omega ,\Lambda ,{\mathcal {M}}\}\) has Solvability Complexity Index k if k is the smallest integer such that, for each \((n_1,\ldots ,n_k) \in {\mathbb {N}}^k\), there exists a general algorithm \(\Gamma _{n_1,\ldots ,n_k} : \Omega \rightarrow {\mathcal {M}}\) such that for all \(A \in \Omega \),

$$\begin{aligned} \Xi (A) = \lim _{n_k \rightarrow \infty } \lim _{n_{k-1}\rightarrow \infty }\ldots \lim _{n_1 \rightarrow \infty } \Gamma _{n_1,\ldots ,n_k}(A), \end{aligned}$$

where each limit is taken in the metric on \({\mathcal {M}}\). In other words, the output of \(\Xi \) can be computed using a sequence of k limits.

We require a metric space for the SCI.

Definition 6.4

The Hausdorff metric for two compact subsets of the complex plane A and B is defined to be

$$\begin{aligned} d_H(A,B) = \mathrm {max} \left\{ \sup _{a \in A} \mathrm {dist}(a,B), \sup _{b \in B} \mathrm {dist}(b,A) \right\} . \end{aligned}$$

If a sequence of sets \(A_1,A_2,A_3,\ldots \) converges to A in the Hausdorff metric, we write \(A_n \xrightarrow {H} A\) as \(n \rightarrow \infty \).
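For finite sets, such as computed approximations to point spectra, the Hausdorff metric is straightforward to evaluate; a small illustrative sketch (ours):

```python
import numpy as np

def hausdorff(A, B):
    """Hausdorff distance between two finite subsets of the complex
    plane, passed as 1-D arrays (illustrative only)."""
    D = np.abs(A[:, None] - B[None, :])   # all pairwise distances
    return max(D.min(axis=1).max(),       # sup over a of dist(a, B)
               D.min(axis=0).max())       # sup over b of dist(b, A)

print(hausdorff(np.array([0.0, 1.0, 2.0]),
                np.array([0.1, 1.0, 2.5])))   # 0.5
```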

The computational problems considered in the remainder of this paper have \(\Omega \) a set of bounded self-adjoint operators on \(\ell ^2\), \(\Lambda \) the set of functions which return the individual entries of the matrix representation of the operator, \({\mathcal {M}}\) the set of compact subsets of \({\mathbb {R}}\) equipped with the Hausdorff metric, and \(\Xi \) the map sending an operator to its spectrum.

Theorem 6.5

([4]). The Solvability Complexity Index of the problem of computing the spectrum of a self-adjoint operator \(A \in {\mathcal {B}}(\ell ^2)\) is equal to 2 with respect to the Hausdorff metric on \({\mathbb {R}}\). For compact operators and banded self-adjoint operators the SCI reduces to 1.

Theorem 6.5 implies that the SCI of computing the spectrum of bounded Jacobi operators in the Hausdorff metric is 1. In loose terms, the problem is solvable using only one limit of computable outputs. What more can we prove about the computability?

The results of Sect. 4 reduce the computation of the spectrum of a Toeplitz-plus-finite-rank Jacobi operator to finding the roots of a polynomial. From an uninformed position, one is led to believe that polynomial rootfinding is a solved problem, with many standard approaches used every day. One common method is to use the QR algorithm to find the eigenvalues of the companion matrix of the polynomial. This can be done stably and efficiently in practice [3]. However, the QR algorithm is not necessarily convergent for non-normal matrices (companion matrices are normal if and only if they are unitary, which is exceptional). Fortunately, the SCI of polynomial rootfinding with respect to the Hausdorff metric for subsets of \({\mathbb {C}}\) is 1, but if one requires the multiplicities of the roots then the SCI is not yet known [4].
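For reference, `numpy.roots` implements exactly the companion-matrix approach mentioned above: it forms the companion matrix of the polynomial and passes it to an eigensolver. Applied to the symbol \(c(z) = 1 - \alpha z\) of basic perturbation 1, for instance:

```python
import numpy as np

alpha = 2.0
# Coefficients of c(z) = 1 - alpha z, highest degree first.
print(np.roots([-alpha, 1.0]))   # [0.5], i.e. 1/alpha, inside the unit disc
```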

A globally convergent polynomial rootfinding algorithm is given in [29]. For any degree d polynomial the authors describe a procedure guaranteed to compute fewer than \(1.11 d(\log d)^2\) points in the complex plane, such that for each root of the polynomial, a Newton iteration starting from at least one of these points will converge to this root.

Let \(\epsilon > 0\). If a polynomial p of degree d has r roots, how do we know when to stop so that we have r points in the complex plane each within \(\epsilon \) of a distinct root of p? This leads us to the concept of error control.

Definition 6.6

(Error control). A function \(\Gamma \) which takes inputs to elements in a metric space \({\mathcal {M}}\) is computable with error control if it has solvability complexity index 1, and for each \(\epsilon \) we can compute n to guarantee that

$$\begin{aligned} d_{{\mathcal {M}}}(\Gamma _{n}(A),\Gamma (A)) < \epsilon . \end{aligned}$$

In other words, the output of \(\Gamma \) can be computed using a single limit, and an upper bound for the error committed by each \(\Gamma _n\) is also computable.

Besides providing \({\mathcal {O}}(d(\log d)^2)\) initial data for the Newton iteration (to find the complex roots of a degree d polynomial), the authors of [29] discuss stopping criteria. In Section 9 of [29], it is noted that for Newton iterates \(z_1, z_2, \ldots \), if \(|z_{k}-z_{k-1}| < \epsilon /d\), then there exists a root \(\xi \) of the polynomial in question such that \(|z_k-\xi | < \epsilon \). It is also noted, however, that if there are multiple roots then it is in general impossible to compute their multiplicities with complete certainty, because the Newton iterates can pass arbitrarily close to a root to which they do not, in the end, converge. Another consequence of this possibility is that roots could be missed altogether, because all of the iterates can be found to be close to a strict subset of the roots.
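A sketch of a Newton iteration equipped with this stopping rule (the function name is ours; the certificate is the one quoted above from [29, Sec. 9]):

```python
def newton_with_stop(p, dp, z0, eps, d, maxit=1000):
    """Newton iteration stopping once |z_k - z_{k-1}| < eps/d, which
    certifies that some root xi satisfies |z_k - xi| < eps.
    p, dp: the polynomial and its derivative; d: the degree."""
    z = z0
    for _ in range(maxit):
        z_new = z - p(z) / dp(z)
        if abs(z_new - z) < eps / d:
            return z_new
        z = z_new
    return None  # no certificate obtained within maxit iterations

# Degree-2 example p(z) = z^2 - 2, started near 1.5:
print(newton_with_stop(lambda z: z * z - 2.0, lambda z: 2.0 * z,
                       1.5, 1e-10, 2))   # ~ 1.4142135623...
```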

To salvage the situation, we give the following lemma, which adds some assumptions to the polynomial in question.

Lemma 6.7

Let p be a polynomial and \(\Omega \subset {\mathbb {C}}\) an open set such that, a priori, the degree d is known, and it is known that there are r distinct roots of p in \(\Omega \) and no roots on the boundary of \(\Omega \). Then the set of roots of p in \(\Omega \) is computable with error control in the Hausdorff metric (see Definitions 6.4 and 6.6).

Proof

Use Newton’s method with the \({\mathcal {O}}(d(\log d)^2)\) complex initial data given in [29]. Using the stopping criteria in the discussion preceding this lemma, the algorithm at each iteration produces \({\mathcal {O}}(d(\log d)^2)\) discs in the complex plane, within which all roots of p must lie. To be clear, these discs have centres \(z_k\) and radii \(d\cdot |z_k - z_{k-1}|\). Let \(R_k \subset \Omega \) denote the union of the discs which lie entirely inside \(\Omega \) and have radius less than \(\epsilon \) (the desired error). Note that this set may be empty if none of the discs are sufficiently small.

Because the Newton iterations are guaranteed to converge from these initial data, eventually, for sufficiently large k, \(R_k\) has r connected components, each with diameter less than \(\epsilon \). Terminate when this verifiable condition has been fulfilled. \(\square \)

Theorem 6.8

Let \(J = \Delta + F\) be a Toeplitz-plus-finite-rank Jacobi operator such that the rank of F is known a priori. Then its point spectrum \(\sigma _p(J)\) is computable with error control in the Hausdorff metric (see Definition 6.4 and Definition 6.6).

Remark 6.9

Note that the full spectrum is simply \([-1,1] \cup \sigma _p(J)\).

Proof

Suppose F is zero outside the \(n \times n\) principal submatrix. The value of n can be computed given that we know the rank of F. Compute the principal \(2n \times 2n\) submatrix of the connection coefficients matrix \(C_{J \rightarrow \Delta }\) using formulae (3.3)–(3.7). The entries in the final column of this \(2n \times 2n\) matrix give the coefficients of the Toeplitz symbol c, which is a degree \(2n-1\) polynomial.

Decide whether \(\pm 1\) are roots by evaluating \(c(\pm 1)\). Divide by the corresponding linear factors if necessary to obtain a polynomial \({\tilde{c}}\) such that \({\tilde{c}}(\pm 1) \ne 0\). Use Sturm’s Theorem to determine the number of roots of \({\tilde{c}}\) in \((-1,1)\), which we denote r [39]. Since all roots of c in \(\overline{{\mathbb {D}}}\) are real, there are r roots of \({\tilde{c}}\) in the open unit disc \({\mathbb {D}}\) and none on the boundary.

By Lemma 6.7, the roots \(z_1,\ldots ,z_r\) of this polynomial c which lie in \((-1,1)\) can be computed with error control. By Theorem 4.22, for the point spectrum of J we actually require \(\lambda _k = \frac{1}{2}(z_k + z_k^{-1})\) to be computed with error control. Note that since \(|\lambda _k| \le \Vert J\Vert _2\) for each k, we have that \(|z_k| \ge (1+2\Vert J\Vert _2)^{-1}\). We should ensure that this holds for the computed roots \({\hat{z}}_k \in {\mathbb {D}}\) too. By the mean value theorem,

$$\begin{aligned} |\lambda (z_k) - \lambda ({\hat{z}}_k)|&\le \sup _{|z|\ge (1+2\Vert J\Vert _2)^{-1}}|\lambda '(z)||z_k - {\hat{z}}_k| \\&= \frac{1}{2}\left( (1+2\Vert J\Vert _2)^2-1\right) |z_k - {\hat{z}}_k| \\&= 2\Vert J\Vert _2(1+\Vert J\Vert _2)|z_k - {\hat{z}}_k| \\&\le 2(1+\Vert F\Vert _2)(2+\Vert F\Vert _2)|z_k - {\hat{z}}_k|. \end{aligned}$$

Therefore it suffices to compute \({\hat{z}}_k\) such that \(|z_k - {\hat{z}}_k| \le \frac{\epsilon }{2}(1+\Vert F\Vert _2)^{-1}(2+\Vert F\Vert _2)^{-1}\), where \(\epsilon \) is the desired error in the eigenvalues. \(\square \)
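The root-counting step in the proof can be carried out in exact arithmetic. A sketch (ours) using sympy, whose `count_roots` counts real roots in an interval by Sturm-type methods, applied to basic perturbation 2 with \(\beta = 2\), where \(c(z) = \frac{1}{2} - \frac{3}{2}z^2\) by Example 4.24:

```python
import sympy as sp

z = sp.symbols('z')
c = sp.Rational(1, 2) - sp.Rational(3, 2) * z**2   # symbol for beta = 2
r = sp.Poly(c, z).count_roots(-1, 1)   # number of real roots in [-1, 1]
print(r)   # 2: the roots are +-1/sqrt(3), giving eigenvalues +-2/sqrt(3)
```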

The following theorem shows that taking Toeplitz-plus-finite-rank approximations of a Toeplitz-plus-compact Jacobi operator suffices for computing the spectrum with error control with respect to the Hausdorff metric.

Theorem 6.10

Let \(J = \Delta + K\) be a Toeplitz-plus-compact Jacobi operator. If for all \(\epsilon > 0\) an integer m can be computed such that

$$\begin{aligned} \sup _{k\ge m} |\alpha _k| + \sup _{k\ge m} \left| \beta _k -\frac{1}{2} \right| < \epsilon , \end{aligned}$$
(6.1)

then the spectrum can be computed with error control in the Hausdorff metric.

Proof

Let \(\epsilon > 0\). By the oracle assumed in the statement of the theorem, compute m such that

$$\begin{aligned} \sup _{k\ge m} |\alpha _k| + \sup _{k\ge m} \left| \beta _k -\frac{1}{2} \right| < \frac{\epsilon }{6}. \end{aligned}$$

Now, using Theorem 6.8 (and Remark 6.9 to include the continuous part \([-1,1]\)), compute the spectrum of the Toeplitz-plus-finite-rank approximation \(J^{[m]}\) to accuracy \(d_H(\Sigma ,\sigma (J^{[m]})) < \epsilon /2\), where \(\Sigma \) denotes the computed set. Then, using Theorem 5.14, we have

$$\begin{aligned} d_H(\Sigma ,\sigma (J))&\le d_H(\Sigma ,\sigma (J^{[m]})) + d_H(\sigma (J^{[m]}),\sigma (J)) \\&\le \frac{\epsilon }{2} + \Vert J^{[m]}-J\Vert _2 \\&\le \frac{\epsilon }{2} + 3 \frac{\epsilon }{6}\\&= \epsilon . \end{aligned}$$

Here we used the fact that for a self-adjoint tridiagonal operator A,

$$ \Vert A\Vert _2 \le 3(\sup _{k\ge 0} |a_{k,k}| + \sup _{k\ge 0}|a_{k,k+1}|). $$

This completes the proof. \(\square \)

An immediate question following Lemma 6.7 and Theorem 6.8 is why we have opted to use a Newton iteration in the complex plane instead of a purely real algorithm. We do this because Lemma 6.7 is an interesting point in and of itself with regard to the Solvability Complexity Index of polynomial rootfinding with error control. The key point is that while there exist algorithms to compute all of the roots of a polynomial (without multiplicity) in a single limit (i.e. with SCI equal to 1), one does not necessarily know when to stop the algorithm to achieve a desired error. Lemma 6.7 provides a basic condition on the polynomial that allows such control, and it applies to this specific spectral problem.

7 Conclusions

In this paper we have proven new results about the relationship between the connection coefficients matrix between two different families of orthonormal polynomials, and the spectral theory of their associated Jacobi operators. We specialised the discussion to finite-rank perturbations of the free Jacobi operator and demonstrated explicit formulas for the principal resolvent and the spectral measure in terms of entries of the connection coefficients matrix. We showed that the results extend to trace class perturbations. Finally, we discussed computability aspects of the spectra of Toeplitz-plus-compact Jacobi operators. We showed that the spectrum of a Toeplitz-plus-compact Jacobi operator can be computed with error control, as long as the tail of the coefficients can be suitably estimated.

There are some immediate questions. Regarding regularity properties of the Radon-Nikodym derivative \(\frac{\mathrm {d}\nu }{\mathrm {d}\mu }\) between the spectral measures \(\nu \) and \(\mu \) of Jacobi operators D and J respectively, given in Propositions 3.6 and 3.9 and Corollary 3.10: can weaker regularity of \(\frac{\mathrm {d}\nu }{\mathrm {d}\mu }\) be related to weaker properties of \(C = C_{J\rightarrow D}\)? For example, the present authors conjecture that the Kullback–Leibler divergence,

$$\begin{aligned} K(\mu |\nu ) = \left\{ \begin{array}{rl} \int \frac{\mathrm {d}\nu }{\mathrm {d}\mu }(s) \log \frac{\mathrm {d}\nu }{\mathrm {d}\mu }(s) \,\mathrm {d}\nu (s) &{} \text {if } \nu \text { is absolutely continuous w.r.t. } \mu \\ \infty &{} \text {otherwise,} \end{array} \right. \end{aligned}$$

is finite if and only if the function of operators \(C^T C \log (C^T C)\) is well-defined as an operator mapping \(\ell _{{\mathcal {F}}}\rightarrow \ell _{{\mathcal {F}}}^\star \). The reasoning comes from Lemma 3.8. Making such statements more precise for the case where \(D = \Gamma \) or \(D = \Delta \) (see equation (4.1)) could give greater insight into Szegő and quasi-Szegő asymptotics (respectively) for orthogonal polynomials [14, 22, 33].

Regarding computability: is there a theorem that covers the ground between Theorem 6.8 (for Toeplitz-plus-finite-rank Jacobi operators) and Theorem 6.10 (for Toeplitz-plus-compact Jacobi operators)? What can be said about the convergence of the continuous part of the spectral measure of the Toeplitz-plus-finite-rank truncations of a Toeplitz-plus-trace-class Jacobi operator? Proposition 5.2 implies that this convergence holds at least in a weak sense when tested against \(f \in C_b({\mathbb {R}})\).

The computability theorems in Sect. 6 all assume real arithmetic. What can be said about floating point arithmetic? In what situations can the computation fail to give an unambiguously accurate solution? Answering this question is related to the mathematical problem of stability of the spectral measure under small perturbations of the Jacobi operator.

This paper also opens some broader avenues for future research. The connection coefficient matrix can be defined for any two Jacobi operators J and D. It is natural to explore what structure \(C_{J\rightarrow D}\) has when D is a different reference operator to \(\Delta \), and J is a finite rank, trace class, or compact perturbation of D. For example, do perturbations of the Jacobi operator with periodic entries [12, 26] have structured connection coefficient matrices? Beyond periodic Jacobi operators, it would be interesting from the viewpoint of ergodic theory if we could facilitate the study and computation of almost-periodic Jacobi operators, such as the discrete almost-Mathieu operator [17]. Perturbations of the Jacobi operators for Laguerre polynomials and the Hermite polynomials could also be of interest, but challenges associated with the unboundedness of these operators could hamper progress [36]. Discrete Schrödinger operators with non-decaying potentials will also be of interest in this direction.

Spectra of banded self-adjoint operators may be accessible with these types of techniques too. Using connection coefficient matrices between matrix orthogonal polynomials [13], or developing tridiagonalisation techniques, are possible approaches, but the authors consider this to be nothing more than conjecture at present. The multiplicity of the spectrum for operators with bandwidth greater than 1 appears to be a major challenge here. This becomes even more challenging for non-banded operators, such as Schrödinger operators on \({{\mathbb {Z}}}^d\) lattices.

Lower Hessenberg operators define polynomials orthogonal with respect to Sobolev inner products [24, pp. 40–43]. Therefore, we have two families of (Sobolev) orthogonal polynomials with which we may define connection coefficient matrices, as discussed in [27, p. 77]. Whether the connection coefficient matrices (which are still upper triangular) have structure which can be exploited for studying and computing the spectra of lower Hessenberg operators is yet to be studied.

Besides spectra of discrete operators defined on \(\ell ^2\), we conjecture that the results of this paper will also be applicable to continuous Schrödinger operators on \(L^2({\mathbb {R}})\), which are of the form \(L_V[\phi ](x) = -\phi ''(x) + V(x)\phi (x)\) for a potential function \(V :{\mathbb {R}}\rightarrow {\mathbb {R}}\). The reference operator is the negative Laplacian \(L_0\) (which is the “free” Schrödinger operator). In this scenario, whereas the entries of a discrete connection coefficient matrix satisfy a discrete second order recurrence relation on \({\mathbb {N}}_0^2\) (see Lemma 3.2), the continuous analogue of the connection coefficient operator \(C_{L_V\rightarrow L_0}\) is an integral operator whose (distributional) kernel satisfies a second order PDE on \({\mathbb {R}}^2\).