1 Introduction

1.1 The Operators and their Matrices

Discretizing the (self-adjoint) 1D Schrödinger operator \(\Delta + v\cdot \) by finite differences, one gets an infinite tridiagonal matrix of the form A below, where the sequence \((b_k)_{k\in {\mathbb {Z}}}\) comes from samples of the function v.

When we speak of generalized Schrödinger operators here, we think of bounded linear operators on \(\ell ^2({\mathbb {Z}})\) with matrix representation of the form B, where

$$\begin{aligned} A=\begin{pmatrix} \ddots &{} \ddots \\ \smash \ddots &{} b_{-1} &{} 1 \\ &{} 1 &{} b_0 &{} 1\\ &{}&{} 1 &{} b_1 &{} 1\\ &{}&{}&{} 1&{} b_2&{}\smash \ddots \\ &{}&{}&{}&{}\ddots &{}\ddots \end{pmatrix}, \quad B=\begin{pmatrix} \ddots &{}\ddots \\ \smash \ddots &{} *&{}\bullet \\ \smash \ddots &{} b_{-1}&{}*&{}\bullet \\ &{} \star &{}b_0&{}*&{}\bullet \\ &{}&{} \star &{}b_1&{}*&{}\smash \ddots \\ &{}&{}&{}\ddots &{}\ddots &{}\ddots \end{pmatrix} \end{aligned}$$

and \(\star \), \(*\) and \(\bullet \) mark constant (but possibly different) diagonals. The number of nonzero diagonals is not important, as long as it is finite, and the one varying diagonal, carrying the so-called potential \(b=(b_k)_{k\in {\mathbb {Z}}}\), could be anywhere. In Sect. 4.9 below we explain how (and how far) also these limitations can be overcome. We denote the generalized Schrödinger operator with matrix B by H(b).

1.2 Our Results

The aim of this paper is to demonstrate how the set \({{\mathcal {W}}}(b)\) of all finite subwords (vectors of consecutive entries, a.k.a. factors) of b determines spectral quantities of H(b), such as the spectrum, the pseudospectra and the smallest singular value. We compare these spectral quantities between two operators, H(b) and H(c), and prove their approximation by those of a sequence, \(H(b_m)\), purely by looking at the sets \({{\mathcal {W}}}(b)\), \({{\mathcal {W}}}(c)\) and \({{\mathcal {W}}}(b_m)\). More precisely:

  • If \({{\mathcal {W}}}(b)={{\mathcal {W}}}(c)\), then all aforementioned spectral quantities of H(b) and H(c) coincide, irrespective of the position at which a subword \(w\in {{\mathcal {W}}}(b)\) appears in c. For example, H(b) with a random \(\{0,1\}\)-potential has the same spectrum as H(c) with \(c\in \{0,1\}^{\mathbb {Z}}\) listing all natural numbers in binary form.

  • We explicitly compare the same quantities of one band operator, between the axis and the half-axis (with homogeneous Dirichlet boundary condition).

  • We approximate the pseudospectrum (and in the normal case also the spectrum) of H(b) in the Hausdorff distance by the pseudospectra of \(H(b_m)\), where \(b_m\) and b have the same subwords of length N and where \(N=N(m)\rightarrow \infty \) as \(m\rightarrow \infty \).

  • For the corresponding approximation result on the half-axis, we have to assume, in addition, that \(b_m\rightarrow b\) pointwise.
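The example in the first item above can be checked on a finite prefix. The following Python sketch assumes that c lists 1, 2, 3, … in binary without separators; the precise arrangement of c, in particular on the negative half-axis, is not specified here, so this is an assumption, and the names `binary_listing` and `subwords` are illustrative only. It verifies that the listing up to \(2^{N+1}-1\) already contains every \(\{0,1\}\)-word of length N.

```python
from itertools import product


def binary_listing(max_number):
    """Concatenate the binary expansions of 1, 2, ..., max_number."""
    return "".join(format(k, "b") for k in range(1, max_number + 1))


def subwords(word, length):
    """All subwords (factors) of the given length of a finite word."""
    return {word[k:k + length] for k in range(len(word) - length + 1)}


N = 4
# A word starting with 1 is itself a binary number below 2^N; a word w
# starting with 0 occurs inside the (N+1)-bit number "1" + w.
prefix = binary_listing(2 ** (N + 1) - 1)
all_words = {"".join(w) for w in product("01", repeat=N)}
contains_all = subwords(prefix, N) == all_words
```

The prefix needed grows exponentially in N, which is the cost that a condensed surrogate potential is meant to avoid.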

We demonstrate our approximations of spectra and pseudospectra for different configurations including the self-adjoint and a non-self-adjoint Fibonacci Hamiltonian, the non-self-adjoint Anderson model, Feinberg and Zee’s randomly hopping particle and a one-way version of it with two varying diagonals. The potentials in these examples are aperiodic or pseudoergodic [30]. See Sects. 2.4 and 2.5 and Fig. 1 for the details.

Our results might not be too surprising for experts on self-adjoint Schrödinger operators \(H(b)=A\) from above, see [3, 41], but our tools are such that everything works as well for the generalized version \(H(b)=B\), also with more than one varying diagonal, and in \(\ell ^p({\mathbb {Z}}^d)\) with entries in a Banach space X, see Sect. 4.9.

Fig. 1

Pseudospectral approximations of the 3-state randomly hopping particle [17, 28] and a non-self-adjoint version of the Fibonacci Hamiltonian [22]; see Sect. 2.5 for details and other operators

1.3 Our Message

Perhaps the message is that spectral approximation is about approximating \(\Vert (A-\lambda )^{-1}\Vert \), not \((A-\lambda )^{-1}\). In particular, it is not about cutting larger and larger portions out of the operator, but instead about capturing the variety of its finite subpatterns: Finding all the \(2^N\) subwords of length N in a random \(\{0,1\}\)-potential b can take unpredictably large sections and create huge numerical costs. Arranging all of them in a new “surrogate” potential c takes \(N\cdot 2^N\) letters if arranged naively, and just \(2^N\) if in a clever, condensed arrangement [27].

So if a model H(b) allows a priori knowledge of the set \({{\mathcal {W}}}(b)\) of finite subwords, then we suggest building a surrogate potential that contains all those subwords of length N (but no others), maybe neatly overlapping, and periodizing it. Doing this for an increasing sequence \(N=N_1, N_2,\dots \) Hausdorff-approximates the pseudospectrum of H(b) and, in normal models, also the spectrum. Spectra of periodic band operators can be explicitly computed by so-called Floquet-Bloch theory, see e.g. [26, Thm. 4.4.9] (two-sided case) and [36, Thm. 4.42] (one-sided case).

1.4 Related Literature

Always a great inspiration is [25] by Brian Davies. The way he expresses one essence of random behavior via pseudoergodicity—maximal variety of subwords—is guiding our view on spectral theory, not only here and not only for random operators. Aperiodic models are also characterized by their subword variety but now it is minimal (among the nontrivial classes). Spectral theory of aperiodic operators, see e.g. [2, 21,22,23, 58, 59], was actually our starting point for this research. We learned about asymptotics of pseudospectra, in particular in the Hausdorff topology, from Böttcher, Hagen, Roch and Silbermann [7, 33]—going from concrete results for Toeplitz and related operators to more generic statements for band-dominated operators. One of our main tools, reducing spectral studies to finite column sets—a.k.a. one-sided or rectangular finite sections—is the localization of the lower norm (Lemma 3.1), developed in [11, 49] and extended in [50].

In parallel, Ben-Artzi, Colbrook, Hansen, Nevanlinna, Roman and Seidel [4, 19] also looked at spectral approximations via rectangular submatrices but combined with so-called \((N,\varepsilon )\)-pseudospectra [38, 54]. Of particular relevance are their complexity studies for many computational, including spectral, problems.

Of course, this list is incomplete; many other groups are studying spectral quantities of non-self-adjoint operators, e.g. [5, 6, 8, 29], in terms of pseudospectra but also numerical ranges, higher order spectra, polynomial hulls, etc.

1.5 Structure of the Paper

After introducing the main actors, our philosophy and the results in this introduction, we give proper definitions of the operator classes and discuss concrete examples from mathematical physics in Sect. 2. Sections 3 and 4 introduce our main tools and techniques and finally prove our results.

2 Spectra, Operators and Examples

2.1 Spectrum and Pseudospectra

Given a bounded linear operator A on a Banach space X, we denote its spectrum and pseudospectra [61], respectively, by

$$\begin{aligned} \sigma (A)\,=\ \{\lambda \in {\mathbb {C}}: A-\lambda \text { is not invertible}\} \end{aligned}$$

and

$$\begin{aligned} \sigma _\varepsilon (A)\,= \left\{ \lambda \in {\mathbb {C}}: \Vert (A-\lambda )^{-1}\Vert>\frac{1}{\varepsilon }\right\} ,\qquad \varepsilon >0, \end{aligned}$$
(1)

where we identify \(\Vert B^{-1}\Vert :=\infty >\frac{1}{\varepsilon }\) if B is not invertible, so that \(\sigma (A)\subseteq \sigma _\varepsilon (A)\) for all \(\varepsilon >0\).

2.2 Band Operators

We focus on operators on \(\ell ^2({\mathbb {Z}})\). Our basic building blocks are multiplication operators \((M_ax)_n=a_nx_n\) with some \(a\in \ell ^\infty ({\mathbb {Z}})\) and the shift operator, \((S x)_n = x_{n-1}\), both acting on \(\ell ^2({\mathbb {Z}})\).

A band operator is a finite sum of finite products of those two ingredients, i.e.,

$$\begin{aligned} A = \sum _{k=-w}^w M_{a^{(k)}}S^k \end{aligned}$$

with all \(a^{(k)}\in \ell ^\infty ({\mathbb {Z}})\). The number \(w\in {\mathbb {N}}\) here is called the band-width of A.

We identify an operator A on \(\ell ^2({\mathbb {Z}})\) with its usual matrix representation \((A_{ij})_{i,j \in {\mathbb {Z}}}\) with respect to the canonical basis in \(\ell ^2({\mathbb {Z}})\). Band operators exactly correspond to infinite band matrices with bounded diagonals.

Band operators act boundedly on every \(\ell ^p({\mathbb {Z}})\) with \(p\in [1,\infty ]\). Even their invertibility and spectrum are independent of p ([42, 5.27] and [53, Cor. 2.5.4]):

Proposition 2.1

Let B be a band operator on \(\ell ^p({\mathbb {I}})\) for some \({\mathbb {I}}\subseteq {\mathbb {Z}}\) and \(p \in [1,\infty ]\). If B is invertible, then B is also invertible as an operator on \(\ell ^q({\mathbb {I}})\) for all \(q \in [1,\infty ]\).

So when we focus on \(p=2\) for spectral analysis, that is not much of a restriction. Many of our results will be formulated for the following class of band operators.

2.3 Generalized Schrödinger Operators

Discretizing the 1D Schrödinger operator \(\Delta + v\cdot \) by finite differences, one gets a tridiagonal self-adjoint matrix with only one varying diagonal: the main diagonal. We generalize this as follows:

Definition 2.2

(Generalized Schrödinger operator) For \(b\in \ell ^\infty ({\mathbb {Z}})\), let H(b) denote the generalized (discrete) Schrödinger operator,

$$\begin{aligned} H(b)\,=\ H(b,L,\gamma )\,=\ L\ +\ S^\gamma M_b, \end{aligned}$$

on \(\ell ^2({\mathbb {Z}})\), where L is a fixed translation invariant (\(LS=SL\)) band operator and \(\gamma \in {\mathbb {Z}}\). The sequence b is called the potential of H(b).

The matrix of L has constant diagonals, only finitely many of them nonzero. We add the sequence b to the \(\gamma \)-th diagonal of L, resulting in H(b). In particular, H(b) is a band operator. Denote its bandwidth by w throughout. The standard and self-adjoint (discrete) Schrödinger operator is a special case of H(b), when \(L=S+S^{-1}\), \(\gamma =0\), \(w=1\) and b is real-valued.
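To make the position of the potential concrete, here is a small Python sketch that writes out a finite block of the matrix of \(H(b)=L+S^\gamma M_b\). With \((Sx)_n=x_{n-1}\), the operator \(S^k\) has ones at the positions \(i-j=k\), so \(S^\gamma M_b\) has the entry \(b_j\) at position \((j+\gamma ,j)\). The function name `finite_section`, the dictionary encoding of L and the square truncation are illustrative assumptions; the truncation is for display only and is not the approximation method of this paper.

```python
def finite_section(b, shift_coeffs, gamma, lo, hi):
    """Entries H_ij, lo <= i, j <= hi, of H(b) = L + S^gamma M_b.

    b            -- function Z -> C returning the potential value b(k)
    shift_coeffs -- dict {k: c_k} encoding L = sum_k c_k S^k
    gamma        -- index of the diagonal that carries the potential
    """
    idx = range(lo, hi + 1)
    rows = []
    for i in idx:
        row = []
        for j in idx:
            entry = shift_coeffs.get(i - j, 0)   # constant diagonals of L
            if i - j == gamma:
                entry += b(j)                    # varying diagonal
            row.append(entry)
        rows.append(row)
    return rows


# Standard Schroedinger case: L = S + S^{-1}, gamma = 0.
H_std = finite_section(lambda k: 10 + k, {1: 1, -1: 1}, 0, 0, 3)

# Hopping-particle type: L = S, gamma = -1 (potential on the superdiagonal).
H_hop = finite_section(lambda k: [5, 6, 7][k], {1: 1}, -1, 0, 2)
```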

2.4 Classes of Potentials Considered Here

Our examples have either pseudoergodic or aperiodic potentials. We explain these in the language of finite subwords.

2.5 Words

An alphabet is a non-empty compact set \(\Sigma \subseteq {\mathbb {C}}\). The elements of an alphabet are called letters. For \(n\in {\mathbb {N}}\), vectors \(w=(s_1,\dots ,s_n)\in \Sigma ^n\) are referred to as words over \(\Sigma \). We also write \(w=s_1\dots s_n\) or \(w:\{1,\dots ,n\}\rightarrow \Sigma \) with \(w(k)=s_k\) and call \(|w|:=n\) the length of w.

Together with the empty word, \(\epsilon \) with \(|\epsilon |=0\), \(\Sigma ^0 = \{\epsilon \}\), and the operation of word concatenation, \((v_1,\dots ,v_m)(w_1,\dots ,w_n):=(v_1,\dots ,v_m,w_1,\dots ,w_n)\in \Sigma ^{m+n}\), the set \(\Sigma ^*:= \cup _{n=0}^\infty \ \Sigma ^n\) of all finite words forms a monoid, the so-called free monoid over \(\Sigma \). We also study infinite words \(w\in \Sigma ^{\mathbb {I}}\) with infinite \({\mathbb {I}}\subseteq {\mathbb {Z}}\), understood as \(w:{\mathbb {I}}\rightarrow \Sigma \). For words \(w\in \Sigma ^*\), \(b\in \Sigma ^{\mathbb {I}}\) with \({\mathbb {I}}\in \{{\mathbb {Z}},{\mathbb {N}}\}\) and \(N\in {\mathbb {N}}\), we write

  • \(\textrm{pos}(w,b):=\{k\in {\mathbb {I}}: b(k)\dots b(k+|w|-1)=w\}\),

  • \(\#(w,b):=|\textrm{pos}(w,b)|\), meaning the number of occurrences,

  • \({{\mathcal {W}}}(b) \,=\ \{w\in \Sigma ^*:\#(w,b)\ge 1\}\), the set of all finite subwords of b, and

  • \({{\mathcal {W}}}_N(b):=\{w\in {{\mathcal {W}}}(b):|w|=N\}\).
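For a finite stretch of b, stored as a Python string, the four notations above can be sketched as follows. This is an illustration only: positions are 0-based offsets into the stored stretch rather than indices in \({\mathbb {I}}\), and the function names are not taken from any library.

```python
def pos(w, b):
    """Start offsets k (0-based) with b[k], ..., b[k+|w|-1] equal to w."""
    n = len(w)
    return [k for k in range(len(b) - n + 1) if b[k:k + n] == w]


def count(w, b):
    """Number of occurrences #(w, b) of w in the finite stretch b."""
    return len(pos(w, b))


def subwords_of_length(b, N):
    """The set W_N(b) for the finite stretch b."""
    return {b[k:k + N] for k in range(len(b) - N + 1)}


stretch = "0100101001"
occurrences = pos("01", stretch)
words_2 = subwords_of_length(stretch, 2)
```

Here "01" occurs at offsets 0, 3, 5 and 8, and the stretch has the three subwords 00, 01 and 10 of length 2; the word 11 does not occur.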

2.6 Pseudoergodicity

\(b\in \Sigma ^{\mathbb {Z}}\) is called pseudoergodic over the finite alphabet \(\Sigma \) if \({{\mathcal {W}}}_N(b)=\Sigma ^N\) for every \(N\in {\mathbb {N}}\).

This notion was introduced by Davies [25] (and has been successfully employed since then, e.g. [15, 16, 18, 25, 43]) in order to capture spectral properties of random operators while eliminating stochastic details. Our results are very much in line with this philosophy of connecting spectral properties purely with subword variety instead of locations, probability and distribution.

This spirit is reflected by the (anything but random) construction of our approximants: For \(m\in {\mathbb {N}}\), let \(b_m\) be the periodic extension of a concatenation of all elements of \(\Sigma ^m\). Listing those \(|\Sigma |^m\) words of length m takes \(m|\Sigma |^m\) letters if done naively and \(|\Sigma |^m\) in a clever condensed arrangement as a de Bruijn sequence [27]; for example, the word \(u=000\,001\,010\,011\,100\,101\,110\,111\) and the cyclic word \(v=00010111\) both contain all \(w\in \{0,1\}^3\). For both arrangements, Theorem 4.10 below, which is our main result on spectral approximation for the full axis, applies and yields Hausdorff convergence \(\sigma _\varepsilon (H(b_m))\rightarrow \sigma _\varepsilon (H(b))\). In particular, condition (18) of Theorem 4.10, specifying the desired approximation of the potential in terms of its subwords, is satisfied with the particular choice \(m_0:=N\).
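The two arrangements can be compared in a few lines of Python. The sketch below uses the standard recursive construction of a de Bruijn sequence by concatenating Lyndon words; it is one possible construction and not necessarily the one used in [27] or for the figures. The names `de_bruijn` and `cyclic_subwords` are illustrative.

```python
from itertools import product


def de_bruijn(alphabet, n):
    """Cyclic word of length |alphabet|^n containing every word of length n."""
    k = len(alphabet)
    a = [0] * (k * n)
    seq = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                seq.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in seq)


def cyclic_subwords(word, N):
    """Subwords of length N of the periodic extension of word (N <= len(word) + 1)."""
    doubled = word + word[:N - 1]
    return {doubled[k:k + N] for k in range(len(word))}


m = 3
all_words = {"".join(w) for w in product("01", repeat=m)}
naive = "".join("".join(w) for w in product("01", repeat=m))   # m * 2^m letters
condensed = de_bruijn("01", m)                                 # 2^m letters
```

For \(m=3\) this reproduces the words u and v from the text: `naive` has 24 letters, `condensed` has 8, and the periodic extensions of both contain all of \(\{0,1\}^3\).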

2.7 Aperiodicity

Let \(\Sigma =\{0,1\}\). Aperiodicity of \(b\in \Sigma ^{\mathbb {N}}\) is also characterized by the element count of \({{\mathcal {W}}}_N(b)\). Unlike for pseudoergodicity, where this count is \(2^N\) (hence, maximal), for aperiodicity it is \(N+1\) (which is minimal among not eventually periodic sequences, by the Morse-Hedlund theorem [20]).

A sequence \(b\in \Sigma ^{\mathbb {Z}}\) is aperiodic if both its restrictions, to the negative and to the non-negative half-axis, are aperiodic. This is not equivalent to \(|{{\mathcal {W}}}_N(b)|=N+1\); the latter is already satisfied by bi-infinite sequences as simple as \(\chi _{\{0\}}\) or \(\chi _{\mathbb {N}}\), where \(\chi _J\) is the characteristic function of a set \(J\subseteq {\mathbb {Z}}\).
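The count \(N+1\) for the two simple sequences just mentioned can be confirmed on finite windows. In the Python sketch below, the window half-width K is an arbitrary choice, large enough to capture all subwords up to length 10; the helper name `complexity` is illustrative.

```python
def complexity(window, N):
    """Number of distinct subwords of length N in a finite window."""
    return len({window[k:k + N] for k in range(len(window) - N + 1)})


K = 20
chi_point = "0" * K + "1" + "0" * K   # chi_{0}: a single 1 in a sea of zeros
chi_half = "0" * K + "1" * K          # chi_N: zeros to the left, ones to the right

counts_point = [complexity(chi_point, N) for N in range(1, 11)]
counts_half = [complexity(chi_half, N) for N in range(1, 11)]
```

Both lists equal 2, 3, …, 11, although both sequences are constant far to the left and far to the right and hence not aperiodic in the sense above.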

One of the most famous aperiodic words is the Fibonacci word \(b\in \Sigma ^{\mathbb {Z}}\) with

$$\begin{aligned} b(n)\ =\ \chi _{[1-\alpha ,1)}(n\alpha \bmod 1),\qquad n\in {\mathbb {Z}}, \end{aligned}$$
(2)

where \(\alpha =\frac{1}{2}(\sqrt{5}-1)\).
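Formula (2) is easy to evaluate numerically. The following Python sketch uses floating point arithmetic, which we assume to be reliable only for moderate \(|n|\) and away from \(n=-1\), where \(n\alpha \bmod 1\) hits the boundary point \(1-\alpha \) exactly; the function names are illustrative.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2


def fibonacci_letter(n):
    """b(n) from (2): 1 if (n * alpha mod 1) lies in [1 - alpha, 1), else 0."""
    return 1 if (n * ALPHA) % 1.0 >= 1 - ALPHA else 0


def fibonacci_window(lo, hi):
    """The letters b(lo), ..., b(hi) as a string."""
    return "".join(str(fibonacci_letter(n)) for n in range(lo, hi + 1))


window = fibonacci_window(1, 200)
# Subword counts for N = 1, ..., 5 within this window.
counts = [
    len({window[k:k + N] for k in range(len(window) - N + 1)})
    for N in range(1, 6)
]
```

The window starts with 1011010110, and the subword counts for \(N=1,\dots ,5\) are \(N+1\), in line with the characterization of aperiodicity above.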

Writing down periodic approximants \(b_m\) to an aperiodic b such that (18) holds with an explicitly given \(m_0\) is not as simple as in the pseudoergodic case since putting “legal” subwords \(u,v\in {{\mathcal {W}}}_N(b)\) one after the other often creates “illegal” ones, \(w\not \in {{\mathcal {W}}}_N(b)\), in the transition zone from u to v. For the Fibonacci word, \(b_m\) can be constructed via (2) with \(\alpha \) replaced by the continued fraction expansion of \(\alpha \), truncated after m divisions [31]. Below is another approach for the Fibonacci word and some of its relatives.

2.8 Words Generated by Primitive Substitutions

A third class of potentials b has some overlap with the second; the Fibonacci word (2), for example, belongs to both. Take a map \(M:\Sigma \rightarrow \Sigma ^* {\setminus } \{\epsilon \}\) and extend it via concatenation, \(M(uv):=M(u)M(v)\), to an endomorphism on \(\Sigma ^*\) and even on \(\Sigma ^{\mathbb {N}}\). We call such a morphism M a substitution if it satisfies

  (i) There is an \(a\in \Sigma \) with \(M(a)=au\) for some \(u\in \Sigma ^*\),

  (ii) For all \(c\in \Sigma \), \(|M^n(c)| \rightarrow \infty \) for \(n\rightarrow \infty \).

Then \(M^n(a)\) converges pointwise to the substitution word \(d \in \Sigma ^{\mathbb {N}}\), a fixed point of M. If there is a \(k\in {\mathbb {Z}}_+\) such that for all \(c_1,c_2 \in \Sigma \), \(c_2\) is a subword of \(M^k(c_1)\), then we call M primitive. For the Fibonacci word, the primitive substitution is given by

$$\begin{aligned} M_{\text {Fib}}:\quad 0\mapsto 1, \qquad 1\mapsto 10. \end{aligned}$$

Other famous primitive substitutions on \(\{0,1 \}^*\) include the Thue-Morse substitution,

$$\begin{aligned} M_{\text {TM}}:\quad 0\mapsto 01, \qquad 1\mapsto 10, \end{aligned}$$

and the period doubling substitution,

$$\begin{aligned} M_{\text {PD}}:\quad 0\mapsto 01, \qquad 1\mapsto 00. \end{aligned}$$

To obtain two-sided infinite words \(b\in \Sigma ^{\mathbb {Z}}\) from a substitution word \(d\in \Sigma ^{\mathbb {N}}\), we take accumulation points of \(S^{-k}(d)\), \(k\rightarrow \infty \), where S is the right-shift. These are exactly the words \(b\in \Sigma ^{\mathbb {Z}}\) that satisfy \({{\mathcal {W}}}(b) = {{\mathcal {W}}}(d)\). For a given word \(b\in \Sigma ^{\mathbb {Z}}\) that is generated by a primitive substitution as described above, we can approximate spectral quantities of H(b) by those of \(H(b_m)\), with \(b_m=M^m(a)\) for \(a\in \Sigma \) with (i), see also [3].
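The three substitutions above and the primitivity condition can be explored with a short Python sketch. The dictionary encoding of M, the function names and the search bound `max_power` are illustrative assumptions.

```python
def substitute(rule, word):
    """Apply the morphism letter by letter: M(uv) = M(u) M(v)."""
    return "".join(rule[c] for c in word)


def iterate(rule, seed, n):
    """Return M^n(seed)."""
    word = seed
    for _ in range(n):
        word = substitute(rule, word)
    return word


def primitivity_power(rule, max_power=10):
    """Smallest k <= max_power such that every M^k(c) contains every letter, else None."""
    letters = set(rule)
    for k in range(1, max_power + 1):
        if all(letters <= set(iterate(rule, c, k)) for c in letters):
            return k
    return None


FIB = {"0": "1", "1": "10"}
TM = {"0": "01", "1": "10"}
PD = {"0": "01", "1": "00"}

fib_4 = iterate(FIB, "1", 4)   # condition (i) holds for a = 1
tm_3 = iterate(TM, "0", 3)     # condition (i) holds for a = 0
pd_3 = iterate(PD, "0", 3)     # condition (i) holds for a = 0
```

Because of (i), each \(M^n(a)\) is a prefix of \(M^{n+1}(a)\), which is what makes the pointwise limit d well defined. For the Fibonacci substitution, `fib_4` is 10110101, which agrees with \(b(1)\dots b(8)\) computed from (2).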

2.9 Operator Examples from Mathematical Physics

2.10 Anderson Model: Self-adjoint and Non-self-adjoint

The famous Anderson model [1] of 1958 studies localization and delocalization of eigenvectors for a better understanding of electric conductivity in 1D disordered media. In this model, one looks at the self-adjoint Schrödinger operator H(b) from Definition 2.2 on the axis with \(L=S+S^{-1}\), \(\gamma =0\), \(b\in \Sigma ^{\mathbb {Z}}\) pseudoergodic and \(\Sigma \subseteq {\mathbb {R}}\). It is straightforward to prove (e.g. [47]) that \(\sigma (H(b))=\Sigma +[-2,2]\).
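For a finite real alphabet, the set \(\Sigma +[-2,2]\) is a finite union of closed intervals. The Python sketch below merely evaluates this formula by merging the intervals \([s-2,s+2]\), \(s\in \Sigma \); the function name `anderson_spectrum` is illustrative, and finiteness of \(\Sigma \) is an assumption of the sketch.

```python
def anderson_spectrum(sigma):
    """Sigma + [-2, 2] as a sorted list of disjoint closed intervals (lo, hi)."""
    merged = []
    for lo, hi in sorted((s - 2, s + 2) for s in sigma):
        if merged and lo <= merged[-1][1]:
            # Overlapping or touching intervals are joined.
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


two_bands = anderson_spectrum({-3, 3})
one_band = anderson_spectrum({0, 1})
```

For \(\Sigma =\{-3,3\}\) the spectrum consists of the two bands \([-5,-1]\) and \([1,5]\), whereas for \(\Sigma =\{0,1\}\) the bands merge into \([-2,3]\).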

In the late 1990s, the Anderson model reemerged in a non-self-adjoint (NSA) setting: The only change is that now \(L=e^gS+e^{-g}S^{-1}\), where \(g>0\) is the strength of an external magnetic field. The now also famous paper of Hatano and Nelson [39] looks at flux lines in type II superconductors under the influence of a tilted external magnetic field. Within a short time, the NSA Anderson model reappeared in population dynamics [52] and other areas.

Mathematicians from stochastics [32] and spectral theory [24, 25, 51] studied its spectrum and how it invades the complex plane (see Fig. 2), and the subject of pseudospectra (see [61] and the references therein) received an additional uplift.

Fig. 2

Here is an approximation of the pseudospectrum of the NSA Anderson model with \(e^g=\frac{1}{2}\), i.e. H(b) with \(L=\frac{1}{2}S+ 2S^{-1}\), \(\gamma =0\), \(\Sigma =\{-3,3\}\) and \(b\in \Sigma ^{\mathbb {Z}}\) pseudoergodic. We see in blue the pseudospectrum of \(H(b_{14})\) [30], where \(b_m\) is a periodic sequence over \(\Sigma \) containing all subwords of b of length m. By Theorem 4.10, \(\sigma _\varepsilon (H(b_m))\) approximates \(\sigma _\varepsilon (H(b))\) in Hausdorff distance as \(m\rightarrow \infty \). For the spectrum, known bounds, e.g. from Theorem 14 in [25], are displayed: dark orange is guaranteed to be spectrum, bright orange shows how far the spectrum could go at most

2.11 Quasicrystals and the Fibonacci Hamiltonian: Self-adjoint and Non-self-adjoint

Again, the question was electric conductivity but now the medium was not disordered (random) but fairly ordered (periodic)—or maybe not quite—when Shechtman [57] discovered the first so-called quasicrystal in 1982 in his laboratory.

Mathematicians were excited [2, 21, 22, 58, 59] to see Hamiltonians with zero measure Cantor spectrum that was purely singular continuous, and finally, in 2011, after years of being called a quasiscientist, Shechtman received the Nobel Prize. The term “aperiodic” was coined to label this scenario of slightly disorderly order.

The most famous model in 2D is the Penrose tiling, whose spectral analysis is so notoriously difficult that analysts resort to 1D and, e.g., the Fibonacci Hamiltonian [21]. In Figs. 3 and 4, we approximate both the standard (self-adjoint) and a modified (non-self-adjoint) Fibonacci Hamiltonian H(b) by operators \(H(b_m)\) with periodic potentials \(b_m\) that have \({{\mathcal {W}}}_N(b_m)={{\mathcal {W}}}_N(b)\) for \(N=N(m)\rightarrow \infty \) as \(m\rightarrow \infty \). By Theorem 4.10, \(\sigma _\varepsilon (H(b_m))\) approximates \(\sigma _\varepsilon (H(b))\) in Hausdorff distance as \(m\rightarrow \infty \). In the self-adjoint case, the same holds for spectra.

Fig. 3

Here is an approximation of the spectrum of the standard (self-adjoint) Fibonacci Hamiltonian, H(b) with \(L=S+S^{-1}\), \(\gamma =0\) and b from (2). The spectra of the periodic approximations (here shown in blue), \(H(b_m)\), are a union of finitely many closed intervals and Hausdorff-approximate, by Theorem 4.10 (the normal case), the spectrum of H(b), which is known to be a Cantor set on the real line [22]. Here we show, stacked in the vertical direction, many copies of the real line together with an m axis to better envisage this approximation, \(\sigma (H(b_m))\rightarrow \sigma (H(b))\) as \(m\rightarrow \infty \)

Fig. 4

In contrast to Fig. 3, here is an approximation of a NSA Fibonacci Hamiltonian. The potential b from (2) was multiplied by minus the imaginary unit. We see the spectrum (dark blue) and the \(\varepsilon \)-pseudospectrum (light blue, \(\varepsilon =10^{-4}\)) of \(H(b_{18})\) [30]

2.12 Feinberg and Zee’s Randomly Hopping Particle

Motivated by [39], other NSA models popped up in the late 1990s, for example, Feinberg & Zee’s paper [28] on localization and delocalization in models of a randomly hopping particle on a 1D grid. The particle can jump one node to the left or right, and it has a state (e.g. spin) in \(\Sigma =\{\pm 1\}\) that changes randomly at every jump. A variant was also studied in which the state changes randomly when jumping right but not when jumping left. By a similarity transform, the two models can be shown to have identical spectra.

Fig. 5

Approximation of \(\sigma (H(b))\) for the hopping particle with \(L=S\), \(\gamma =-1\), \(\Sigma =\{\pm 1\}\) and \(b\in \Sigma ^{\mathbb {Z}}\) pseudoergodic. We see the spectrum and pseudospectrum of \(H(b_{12})\) [30]

The spectrum looks self-similar; it is invariant under many transformations [35] but it is not explicitly known. See [10, 13, 34, 35, 40] for extensive studies, including provable subsets and supersets. Figure 5 shows a spectral approximation of H(b) by \(H(b_m)\) with our periodic approximation \(b_m\) of a pseudoergodic b.

Fig. 6

Approximation of \(\sigma (H(b))\) for the hopping particle with \(q=3\) states, with \(L=S\), \(\gamma =-1\), \(\Sigma =\{z\in {\mathbb {C}}:z^3=1\}\) and \(b\in \Sigma ^{\mathbb {Z}}\) pseudoergodic. On the left, we see the union of spectra of \(H(b_m)\) for \(m=1,\ldots ,6\) [30]; each m corresponds to a different shade of blue. On the right, we see the pseudospectrum of \(H(b_{6})\) in blue and the numerical range of H(b) in yellow

Cicuta, Contedini and Molinari [17] generalized [28] to a particle with \(q\in {\mathbb {N}}\) different states, say \(\Sigma =\{z\in {\mathbb {C}}:z^q=1\}\). Many results of the case \(q=2\) are preserved, see [12, 62]. The numerical range (an upper bound on the spectrum) is a regular 2q-polygon with outer radius 2 and center at the origin, see Fig. 6.

2.13 A One-Way Model

Brezin, Feinberg and Zee [9, 28] also modeled a random particle that can only jump in one direction and then randomly change its state, a so-called one-way model. The spectral analysis is simpler as the matrix is only supported on two adjacent diagonals. This is why the spectrum is explicitly known, see [60] for one constant and one random diagonal and [44] for two stochastically independent random diagonals. We display a spectral approximation of the latter case in Fig. 7.

Fig. 7

Here is an approximation of the one-way model \(H(b,c):=M_b+SM_c\) with two pseudoergodic diagonals \(b\in \Sigma _b^{\mathbb {Z}}\) and \(c\in \Sigma _c^{\mathbb {Z}}\). Here we use \(\Sigma _b=\{-2,2\}\) and \(\Sigma _c=\{3,4\}\). This case is still subject to Proposition 4.9. We see in blue the pseudospectrum of \(H(b_7,c_7)\) [30], where \(b_m\) and \(c_m\) are periodic and the pair \((b_m,c_m)\) contains all \(\Sigma _b\times \Sigma _c\)-subwords of \((b,c)\) of length m. In orange, we see \(\sigma (H(b,c))\) based on [44]

3 Further Notations and Tools

3.1 Discrete Intervals and Submatrices

We use the following abbreviations for discrete intervals. Given \(a,b\in {\mathbb {Z}}\), we write

$$\begin{aligned} a..b&:= \{n\in {\mathbb {Z}}:a\le n\le b\},\\ a..&:= \{n\in {\mathbb {Z}}:a\le n\},\\ ..b&:= \{n\in {\mathbb {Z}}:n\le b\}. \end{aligned}$$

For an operator \(A:\ell ^2({\mathbb {Z}})\rightarrow \ell ^2({\mathbb {Z}})\) with matrix representation \((A_{ij})_{i,j\in {\mathbb {Z}}}\) and a discrete interval \({\mathbb {I}}\subseteq {\mathbb {Z}}\), we abbreviate the restriction of A to \(\ell ^2({\mathbb {I}})\),

$$\begin{aligned} A|_{\ell ^2({\mathbb {I}})}:\ \ell ^2({\mathbb {I}})\rightarrow \ell ^2({\mathbb {Z}}), \end{aligned}$$

by \(A|_{\mathbb {I}}\). The corresponding matrix representation is \((A_{ij})_{i \in {\mathbb {Z}},j\in {\mathbb {I}}}\). Together with our shorthands for discrete intervals, this explains the notations \(A|_{a..b},\ A|_{a..}\) and \(A|_{..b}\).

Furthermore, the operator \(\ell ^2({\mathbb {N}})\rightarrow \ell ^2({\mathbb {N}})\), corresponding to \(A^+:= (A_{ij})_{i,j\in {\mathbb {N}}}\), is called the compression of A to \({\mathbb {N}}\).

3.2 Approximate Equality and Bounds

We, moreover, find the following notations useful: For \(a,b\in {\mathbb {R}}\) and \(\varepsilon >0\), let us write \(a{\mathop {\approx }\limits ^{\varepsilon }}b\ \) if \(\ b\in (a-\varepsilon ,a+\varepsilon )\) and \(a{\mathop {\preceq }\limits ^{\varepsilon }}b\ \) if \(\ b\in [a,a+\varepsilon )\). In particular, if \(a {\mathop {\approx }\limits ^{\delta }}b\) and \(b {\mathop {\approx }\limits ^{\varepsilon }}c\) then \(a {\mathop {\approx }\limits ^{\delta +\varepsilon }}c\). The same holds for the relation \({\mathop {\preceq }\limits ^{\varepsilon }}\).

3.3 The Lower Norm

An important spectral quantity that we use a lot in our arguments is the so-called lower norm of an operator A on \(\ell ^2({\mathbb {I}})\), meaning

$$\begin{aligned} \nu (A)\,=\ \inf \{\Vert Ax\Vert : x\in \ell ^2({\mathbb {I}}), \Vert x\Vert =1\}. \end{aligned}$$

Note that it is not a norm and that the name “lower norm” is used, as in [43, 53], to address its role as a counterpart to the operator norm, \(\Vert A\Vert :=\sup _{\Vert x\Vert =1}\Vert Ax\Vert \).

\(\nu (A)\) turns out to be a fairly accessible quantity to study \(\Vert A^{-1}\Vert \). Indeed,

$$\begin{aligned} \Vert A^{-1}\Vert \ =\ 1/\min \big \{\,\nu (A),\nu (A^*)\,\big \}. \end{aligned}$$
(3)

Here \(A^*\) is the adjoint of A, and \(\Vert A^{-1}\Vert =\infty \) if and only if A is not invertible. In Hilbert space, \(\nu (A)\) is the smallest singular value of A. For normal A, it is the smallest (in modulus) spectral value,

$$\begin{aligned} \nu (A)={{\,\mathrm{\textrm{dist}}\,}}(0,\sigma (A)), \quad \text {whence}\quad \nu (A-\lambda )={{\,\mathrm{\textrm{dist}}\,}}(\lambda ,\sigma (A)). \end{aligned}$$
(4)

For non-normal operators, we have “\(\le \)” instead in both equalities of (4).

For band operators, \(\nu (A)\) can be conveniently approximated / localized via

$$\begin{aligned} \nu _N(A)\searrow \nu (A)\quad \text {as}\quad N\rightarrow \infty , \end{aligned}$$
(5)

where \(\nu _N(A)\), for \(N\in {\mathbb {N}}\), refers to the local lower norm, defined as

$$\begin{aligned} \nu _N(A)\ :=\ \inf \big \{\Vert Ax\Vert : x\in \ell ^2({\mathbb {I}}), \Vert x\Vert =1, {{\,\textrm{diam}\,}}({{\,\textrm{supp}\,}}(x))<N\big \}, \end{aligned}$$
(6)

where \({{\,\textrm{supp}\,}}(x) = \{k \in {\mathbb {I}}: x_k \ne 0\}\) and \({{\,\textrm{diam}\,}}(S)=\sup _{s,t\in S}|s-t|\). Together with (3) we get, as \(N\rightarrow \infty \),

$$\begin{aligned} \min \big \{\nu _N(A),\nu _N(A^*)\big \}\ \searrow \ \min \big \{\nu (A),\nu (A^*)\big \}\ =\ 1\,/\,\Vert A^{-1}\Vert . \end{aligned}$$
(7)

In Lemma 3.4 below we will see for which kinds of operators A we can neglect the terms involving \(A^*\) in (3) and (7). For the approximation of spectral quantities, it is important to know that (5) holds in a very uniform sense, as follows:

Lemma 3.1

([49, Prop. 6]) Let \(\varepsilon >0\), \(r>0\) and \(w\in {\mathbb {N}}\). Then there is an \(N\in {\mathbb {N}}\) such that, for all band operators A with band-width less than w and \(\Vert A\Vert < r\),

$$\begin{aligned} \nu (A) {\mathop {\preceq }\limits ^{\varepsilon }}\nu _N(A). \end{aligned}$$

One can explicitly quantify N vs. \(\varepsilon \). An analogous localization of the operator norm is in [37, Prop. 3.4].

For an even finer localization of the lower norm, put, for \({\mathbb {I}}\subseteq {\mathbb {Z}}\),

$$\begin{aligned} \nu _{\mathbb {I}}(A)\,=\ \inf \{\Vert Ax\Vert :{{\,\textrm{supp}\,}}(x)\subseteq {\mathbb {I}}, \Vert x\Vert =1\}. \end{aligned}$$
(8)

Corollary 3.2

For every band operator A and all \(\varepsilon >0\), there are \(l,r\in {\mathbb {Z}}\) such that \(\nu (A){\mathop {\preceq }\limits ^{\varepsilon }}\nu _{l..r}(A)\).

Proof

With the help of Lemma 3.1 choose N large enough that \(\nu (A){\mathop {\preceq }\limits ^{\varepsilon /2}}\nu _N(A)=\inf _j\nu _{j..j+N-1}(A)\) and then j such that the infimum is \({\mathop {\preceq }\limits ^{\varepsilon /2}}\nu _{j..j+N-1}(A)\). Then put \(l:=j\) and \(r:=j+N-1\). \(\square \)

3.4 Submatrices of Consecutive Columns

Let A be a band matrix with band-width w.

For \(N\in {\mathbb {N}}\), we say that C is an N-column submatrix of A if C consists of N consecutive columns of A. For simplicity and comparability, restrict C to size \((N+2w)\times N\), capturing exactly the “banded part” of those N columns of A, that is, \(C=(C_{ij})_{i\in 1-w..N+w,\ j\in 1..N}\) with

$$\begin{aligned} \exists k\in {\mathbb {Z}}:\quad C_{i,j}=A_{k+i,k+j},\quad i\in 1-w..N+w,\ j\in 1..N, \end{aligned}$$

where \(C_{i,j}:=0\) if \(A_{k+i,k+j}\) is not defined. Let

  • \({{\mathcal {C}}}_N(A)\) denote the set of all N-column submatrices of A and

  • \({{\mathcal {C}}}(A):=\cup _{N\in {\mathbb {N}}}\ {{\mathcal {C}}}_N(A)\).
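As an illustration, the following Python sketch collects \({{\mathcal {C}}}_N(A)\) for \(A=H(b)\) with \(L=S+S^{-1}\), \(\gamma =0\) (so \(w=1\)) and a periodic potential. The choice of the period 00010111, the entry-function encoding of A and the names `column_submatrix` and `schroedinger_entry` are assumptions made for this example only.

```python
def column_submatrix(entry, k, N, w):
    """C_ij = A_{k+i,k+j} for i in 1-w..N+w and j in 1..N, as a tuple of rows."""
    return tuple(
        tuple(entry(k + i, k + j) for j in range(1, N + 1))
        for i in range(1 - w, N + w + 1)
    )


def schroedinger_entry(b):
    """Entry function (i, j) -> A_ij of H(b) with L = S + S^{-1}, gamma = 0, on Z."""
    def entry(i, j):
        if i == j:
            return b(j)
        return 1 if abs(i - j) == 1 else 0
    return entry


period = "00010111"
b = lambda k: int(period[k % len(period)])   # periodic potential on Z
A = schroedinger_entry(b)

N, w = 3, 1
# By periodicity, the offsets k = 0, ..., 7 already produce all of C_N(A).
C_N = {column_submatrix(A, k, N, w) for k in range(len(period))}
first = column_submatrix(A, 0, N, w)
```

Each element of `C_N` is a \(5\times 3\) matrix, and there are 8 of them, one for each subword of length 3 of the periodic potential. In this tridiagonal example, an N-column submatrix carries exactly the information of one subword of length N.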

3.5 Self-contained Operators

We call a band operator A self-contained if every \(C\in {{\mathcal {C}}}(A)\) appears infinitely often in A. For \(A=H(b)\) with some word b this means that every \(w\in {{\mathcal {W}}}(b)\) appears infinitely often in b. For example, operators with aperiodic or pseudoergodic matrix diagonals (see Sect. 2.4) are self-contained.

Lemma 3.3

For a band operator A, the statements

  (i) A is self-contained,

  (ii) A is self-similar (in the sense of [14, 46]),

  (iii) A is invertible if and only if it is Fredholm,

are related as follows: \(\quad (i) \Rightarrow (ii) \Rightarrow (iii)\).

If the set of all entries of A (as a matrix) is finite then \((i)\Leftrightarrow (ii)\).

Proof

This is immediate from Lemmas 4.3 and 2.2 in [46]. \(\square \)

For self-contained operators, many of our formulas and arguments simplify:

Lemma 3.4

If A is self-contained, then \(\nu (A)=\nu (A^*)\), so that we can discard \(\nu (A^*)\) and \(\nu _N(A^*)\) from (3) and (7), i.e.

$$\begin{aligned} \nu _N(A) \searrow \nu (A) =\frac{1}{\Vert A^{-1}\Vert } \quad \text {as } N\rightarrow \infty . \end{aligned}$$

Proof

We distinguish two cases for \(\nu (A)\).

  • Case 1: \(\nu (A)>0\)

    • then \(\dim \ker (A)=0\) and \(\textrm{im}(A)\) is closed, e.g. Lemma 2.32 in [43],

    • so A is semi-Fredholm (in terms of [56], \(\Phi _+\)),

    • by Theorem 4.3 in [56], A is Fredholm,

    • by Lemma 3.3 (iii), A is invertible,

    • but then \(\nu (A^*)=\nu (A)>0\), e.g. Lemma 2.10 in [37].

  • Case 2: \(\nu (A)=0\)

    If \(\nu (A^*)\) were nonzero then, arguing as in case 1, also \(\nu (A)>0\); contradiction. So \(\nu (A^*)=0\).

\(\square \)

4 Subwords, Spectra and Approximation

4.1 Column-Submatrices, Spectra and Pseudospectra

Because the submatrices \(C\in {{\mathcal {C}}}_N(A)\) are what vectors x with \({{\,\textrm{diam}\,}}({{\,\textrm{supp}\,}}(x))<N\) get to “see” of A, the set \({{\mathcal {C}}}_N(A)\) has obvious connections to \(\nu _N(A)\) and \(\nu (A)\) but then also to the resolvent norm and the spectrum of A:

Lemma 4.1

Let A be a band operator on \(\ell ^2({\mathbb {I}})\) with a discrete interval \({\mathbb {I}}\subseteq {\mathbb {Z}}\). Then, for every \(N\in {\mathbb {N}}\),

$$\begin{aligned} \nu _N(A)\ =\ \inf \{\nu _{j..j+N-1}(A): j..j+N-1\subseteq {\mathbb {I}}\} =\ \inf \{\nu (C):C\in {{\mathcal {C}}}_N(A)\}. \end{aligned}$$

Proof

Rewriting the definition (6) in terms of (8) gives the first equality. The second equality holds by the definition of N-column submatrices and \({{\mathcal {C}}}_N(A)\). \(\square \)

Proposition 4.2

Let \({\mathbb {I}}_A,{\mathbb {J}}_A,{\mathbb {I}}_B,{\mathbb {J}}_B\subseteq {\mathbb {Z}}\) be finite or infinite discrete intervals and let \(A:\ell ^2({\mathbb {J}}_A)\rightarrow \ell ^2({\mathbb {I}}_A)\) and \(B:\ell ^2({\mathbb {J}}_B)\rightarrow \ell ^2({\mathbb {I}}_B)\) be band operators, associated with matrices \((A_{ij})_{i\in {\mathbb {I}}_A,j\in {\mathbb {J}}_A}\) and \((B_{ij})_{i\in {\mathbb {I}}_B,j\in {\mathbb {J}}_B}\) with the same band-width w.

  1. (a)

    If, for some \(N\in {\mathbb {N}}\), \({{\mathcal {C}}}_N(A)\subseteq {{\mathcal {C}}}_N(B)\) then \(\nu _N(A)\ge \nu _N(B)\).

  2. (b)

    If \({{\mathcal {C}}}(A)\subseteq {{\mathcal {C}}}(B)\) then \(\nu (A)\ge \nu (B)\).

  3. (c)

    If \(N\in {\mathbb {N}}\) and \({{\mathcal {C}}}_{N+2w}(A)\subseteq {{\mathcal {C}}}_{N+2w}(B)\) then also \({{\mathcal {C}}}_N(A^*)\subseteq {{\mathcal {C}}}_N(B^*)\).

  4. (d)

    If \({{\mathcal {C}}}(A)\subseteq {{\mathcal {C}}}(B)\) then also \({{\mathcal {C}}}(A^*)\subseteq {{\mathcal {C}}}(B^*)\).

Now suppose \({\mathbb {I}}_A={\mathbb {J}}_A\) and \({\mathbb {I}}_B={\mathbb {J}}_B\), so that A and B are endomorphisms of \(\ell ^2({\mathbb {I}}_A)\) and \(\ell ^2({\mathbb {I}}_B)\), respectively.

  1. (e)

    If \({{\mathcal {C}}}(A)\subseteq {{\mathcal {C}}}(B)\) then

    $$\begin{aligned} \nu (A-\lambda )\ \ge \ \nu (B-\lambda ),\quad \forall \lambda \in {\mathbb {C}}\end{aligned}$$
    (9)

    and

    $$\begin{aligned} \Vert (A-\lambda )^{-1}\Vert \ \le \ \Vert (B-\lambda )^{-1}\Vert ,\quad \forall \lambda \in {\mathbb {C}}, \end{aligned}$$
    (10)

    whence

    $$\begin{aligned} \sigma (A)\subseteq \sigma (B) \qquad \text {and}\qquad \sigma _\varepsilon (A)\subseteq \sigma _\varepsilon (B),\quad \varepsilon >0. \end{aligned}$$

Proof

  1. (a)

    This is immediate from Lemma 4.1.

  2. (b)

    If \({{\mathcal {C}}}(A)\subseteq {{\mathcal {C}}}(B)\), i.e. \({{\mathcal {C}}}_N(A)\subseteq {{\mathcal {C}}}_N(B)\) holds for all \(N\in {\mathbb {N}}\), then, by (a) and (5), it follows that \(\nu (A) \ge \nu (B)\).

  3. (c)

    Let \({{\mathcal {C}}}_{N+2w}(A)\subseteq {{\mathcal {C}}}_{N+2w}(B)\) and take \(C\in {{\mathcal {C}}}_N(A^*)\). Then C has N columns and \(N+2w\) rows. So \(C^*\) is contained in \(N+2w\) consecutive columns of A and, hence, in a matrix \(D\in {{\mathcal {C}}}_{N+2w}(A)\subseteq {{\mathcal {C}}}_{N+2w}(B)\). Using the same arguments backwards, \(C\in {{\mathcal {C}}}_N(B^*)\).

  4. (d)

    If \({{\mathcal {C}}}(A)\subseteq {{\mathcal {C}}}(B)\) then \({{\mathcal {C}}}_{N+2w}(A)\subseteq {{\mathcal {C}}}_{N+2w}(B)\), and, by (c), \({{\mathcal {C}}}_N(A^*)\subseteq {{\mathcal {C}}}_N(B^*)\) for all \(N\in {\mathbb {N}}\), so that \({{\mathcal {C}}}(A^*)\subseteq {{\mathcal {C}}}(B^*)\).

  5. (e)

    From \({{\mathcal {C}}}(A)\subseteq {{\mathcal {C}}}(B)\) it follows that \({{\mathcal {C}}}(A-\lambda )\subseteq {{\mathcal {C}}}(B-\lambda )\) for all \(\lambda \in {\mathbb {C}}\), so that (9) follows from (b), applied to \(A-\lambda \) and \(B-\lambda \). By (d), we also have \({{\mathcal {C}}}((A-\lambda )^*)\subseteq {{\mathcal {C}}}((B-\lambda )^*)\) for all \(\lambda \in {\mathbb {C}}\), so that, again by (b), (9) also holds for the adjoints, i.e.,

    $$\begin{aligned} \nu (A-\lambda )\ \ge \ \nu (B-\lambda ) \quad \text {and}\quad \nu ((A-\lambda )^*)\ \ge \ \nu ((B-\lambda )^*) \end{aligned}$$

    for all \(\lambda \in {\mathbb {C}}\). Now (10) follows from (3). The inclusion of spectra and pseudospectra is now immediate, by their definition.

\(\square \)
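The step from (9) to (10) can be spelled out as follows. This is a sketch under the assumption that (3) is the identity \(1/\Vert T^{-1}\Vert =\min \{\nu (T),\nu (T^*)\}\), with the convention \(\Vert T^{-1}\Vert =\infty \) for non-invertible T, which is how (3) is used in Lemma 3.4:

$$\begin{aligned} \frac{1}{\Vert (A-\lambda )^{-1}\Vert }\ &=\ \min \{\nu (A-\lambda ),\ \nu ((A-\lambda )^*)\}\\ &\ge \ \min \{\nu (B-\lambda ),\ \nu ((B-\lambda )^*)\}\ =\ \frac{1}{\Vert (B-\lambda )^{-1}\Vert }. \end{aligned}$$

Taking reciprocals gives (10). In particular, if the left-hand side is zero, i.e. \(\lambda \in \sigma (A)\), then so is the right-hand side, i.e. \(\lambda \in \sigma (B)\).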

4.2 Two Schrödinger Operators on the Axis: Subword Variety and Spectrum

In the setting of a generalized Schrödinger operator, \(A=H(b)\) with \(b\in \Sigma ^{\mathbb {Z}}\), the matrix is constant on all but one diagonal, so that \({{\mathcal {C}}}_N(H(b))\) corresponds directly to the set \({{\mathcal {W}}}_N(b)\) of all length-N subwords of b. We get a first simple result on how the subword variety of b has implications on resolvent and spectrum of H(b).

Theorem 4.3

If \(b,c\in \Sigma ^{\mathbb {Z}}\) with \({{\mathcal {W}}}(b)\subseteq {{\mathcal {W}}}(c)\) then \(\nu (H(b))\ge \nu (H(c))\) and

$$\begin{aligned} \Vert (H(b)-\lambda )^{-1}\Vert \le \Vert (H(c)-\lambda )^{-1}\Vert , \quad \lambda \in {\mathbb {C}}, \end{aligned}$$

so that

$$\begin{aligned} \sigma (H(b))\subseteq \sigma (H(c)) \qquad \text {and}\qquad \sigma _\varepsilon (H(b))\subseteq \sigma _\varepsilon (H(c)),\quad \varepsilon >0. \end{aligned}$$

Proof

The result follows directly from Proposition 4.2 and (3) since \({{\mathcal {W}}}(b)\subseteq {{\mathcal {W}}}(c)\) implies \({{\mathcal {C}}}(H(b))\subseteq {{\mathcal {C}}}(H(c))\) and \({{\mathcal {C}}}(H(b)^*)\subseteq {{\mathcal {C}}}(H(c)^*)\). \(\square \)
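The correspondence between \({{\mathcal {C}}}_N(H(b))\) and \({{\mathcal {W}}}_N(b)\) can be explored numerically. The following Python sketch assumes the tridiagonal model A from Sect. 1.1 (ones on both off-diagonals, band-width \(w=1\)) and uses finite lists as stand-ins for the infinite potentials; the helper names `column_block`, `subwords` and `nu_n` are ours and purely illustrative. For each length-N word it builds the corresponding \((N+2)\times N\) column submatrix and takes its smallest singular value, in the spirit of Lemma 4.1.

```python
import itertools
import numpy as np


def column_block(word, lam=0.0):
    """(N+2) x N column submatrix of H(b) - lam carrying `word` on its diagonal.

    Assumption: tridiagonal model with ones on the sub- and superdiagonal.
    """
    n = len(word)
    block = np.zeros((n + 2, n), dtype=complex)
    for k, b_k in enumerate(word):
        block[k, k] = 1.0            # superdiagonal entry of column k
        block[k + 1, k] = b_k - lam  # main-diagonal entry (potential minus lambda)
        block[k + 2, k] = 1.0        # subdiagonal entry of column k
    return block


def subwords(seq, n):
    """Set of all length-n subwords of the finite sequence `seq`."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}


def nu_n(words, lam=0.0):
    """Minimum over `words` of the smallest singular value of the column block."""
    return min(
        np.linalg.svd(column_block(w, lam), compute_uv=False).min() for w in words
    )


n = 4
b = [0, 1] * 20  # finite stand-in for the 2-periodic potential ...0101...
# Concatenating all binary words of length n yields a word containing each of them.
c = [x for w in itertools.product([0, 1], repeat=n) for x in w]

wb, wc = subwords(b, n), subwords(c, n)
print("W_N(b) subset of W_N(c):", wb <= wc)
print("nu_N(H(b)) >= nu_N(H(c)):", nu_n(wb) >= nu_n(wc))
```

Since `wb` is a subset of `wc`, the minimum over `wc` cannot exceed the one over `wb`, which is the inequality of Proposition 4.2 (a) in this toy setting. Passing a complex `lam` evaluates the same quantity for \(H(b)-\lambda \).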

4.3 Band Operator: Axis Versus Half-Axis

Let A be a band operator on the axis and let w denote its band-width. We want to compare the inverses, resolvent norms and spectra of A and its half-axis compression \(A^+\).

As an intermediate operator between A and \(A^+\), look at its restriction

$$\begin{aligned} A|_{\mathbb {N}}\,=\ A|_{\ell ^2({\mathbb {N}})}\,:\ \ell ^2({\mathbb {N}})\rightarrow \ell ^2({\mathbb {Z}}) \end{aligned}$$

to \(\ell ^2({\mathbb {N}})\). The matrix of \(A|_{\mathbb {N}}\) is \({\mathbb {Z}}\times {\mathbb {N}}\) and consists of the columns of A with index in \({\mathbb {N}}\); the matrix of \(A^+\) is \({\mathbb {N}}\times {\mathbb {N}}\) and consists of the rows of \(A|_{\mathbb {N}}\) with index in \({\mathbb {N}}\):

$$\begin{aligned} A = \left( \begin{array}{ccc|ccccc} \smash \ddots &{}\smash \ddots &{}\smash \ddots &{}\\ \smash \ddots &{}+&{}*&{}*\\ \smash \ddots &{}*&{}+&{}*&{}*\\ \hline &{}*&{}*&{}+&{}*&{}*\\ &{}&{}*&{}*&{}+&{}*&{}*\\ &{}&{}&{}*&{}*&{}+&{}*&{}\smash \ddots \\ &{}&{}&{}&{}*&{}*&{}+&{}\smash \ddots \\ &{}&{}&{}&{}&{}\smash \ddots &{}\smash \ddots &{}\smash \ddots \end{array} \right) \ \ \begin{array}{l} \text {The band-width here is }w=2,\\ ``+'' \text { marks the main diagonal},\\ \text {the right half is }A|_{\mathbb {N}},\\ \text {the lower right quarter is }A^+. \end{array} \end{aligned}$$

Another way to look at these restrictions is that

$$\begin{aligned} A|_{\mathbb {N}}=AP:\textrm{im}(P)\rightarrow \ell ^2({\mathbb {Z}}) \quad \text {and}\quad A^+=PAP:\textrm{im}(P)\rightarrow \textrm{im}(P), \end{aligned}$$
(11)

where P is the orthogonal projection from \(\ell ^2({\mathbb {Z}})\) onto \(\textrm{im}(P)=\ell ^2({\mathbb {N}})\).

Let us start with the assumption that

$$\begin{aligned} {{\mathcal {C}}}(A)\ =\ {{\mathcal {C}}}(A|_{\mathbb {N}}), \end{aligned}$$
(12)

so that no other patterns appear on the columns with index \(j\in ..0\) of A and the only aspect here is what effect the truncation (or zero Dirichlet condition), from A to \(A^+\), has on the resolvent and the spectrum. Before we come to this, let us study some further implications of (12).

Lemma 4.4

Let A be a band operator. If A satisfies (12) then

  1. (a)

    A is self-contained,

  2. (b)

    Also \(A^*\) is subject to (12) in place of A, and

  3. (c)

    One also has \({{\mathcal {C}}}(A)={{\mathcal {C}}}(A|_{k..})\) for all \(k\in {\mathbb {Z}}\).

Self-containedness of a band operator B can even be characterized via the restrictions \(B|_{\mathbb {N}}\) and \(B|_{-{\mathbb {N}}}\) in the following sense:

  1. (d)

    B is self-contained if and only if \({{\mathcal {C}}}(B) = {{\mathcal {C}}}(B|_{\mathbb {N}}) \cup {{\mathcal {C}}}(B|_{-{\mathbb {N}}})\).

Proof

  1. (a)

    Let \(C\in {{\mathcal {C}}}(A|_{\mathbb {N}})\) and let l..r be the corresponding column numbers. We show that C can be found in infinitely many positions of A:

    By (12), the submatrix \(D\in {{\mathcal {C}}}(A)\) at columns \(-r..r\) of A can be found in \(A|_{\mathbb {N}}\), and hence at columns \(1..2r+1\) or later. In particular, the rightmost \(r-l+1\) columns of D, forming C, are found at columns \(l+r+1..2r+1\) or later, which is disjoint from the location, l..r, (it is further to the right) of the original \(C\in {{\mathcal {C}}}(A|_{\mathbb {N}})\). Now keep repeating the argument for the newly found copy of C in \(A|_{\mathbb {N}}\) to find a further copy of C, even further to the right. Hence, C appears infinitely many times in A, i.e. A is self-contained.

  2. (b)

    Follows from Proposition 4.2 (c).

  3. (c)

    Follows from (a).

  4. (d)

    Since self-containedness of B implies that every \(C\in {{\mathcal {C}}}(B)\) appears infinitely often, it is clear that C also appears in \(B|_{\mathbb {N}}\) or in \(B|_{-{\mathbb {N}}}\). Hence it remains to show that \({{\mathcal {C}}}(B) = {{\mathcal {C}}}(B|_{\mathbb {N}}) \cup {{\mathcal {C}}}(B|_{-{\mathbb {N}}})\) implies that B is self-contained. But this can be done via similar arguments as in (a).

\(\square \)

A first quick judgement on the spectrum of \(A^+\) vs. that of A: The step from A to \(A|_{\mathbb {N}}\) does not change the lower norm, by the assumption (12) and Proposition 4.2 (b). But the step from \(A|_{\mathbb {N}}\) to \(A^+\), chopping off some nonzero rows and hence deleting the entries \(y_{1-w},\dots ,y_0\) of every \(y:=A|_{\mathbb {N}}x\), can only decrease the lower norm and hence can only increase the norm of the inverse, the resolvent norm and the spectrum.

The proper analysis is straightforward: For \(x\in \ell ^2({\mathbb {N}})\), in the notations of (11),

$$\begin{aligned} \Vert A^+x\Vert =\Vert PAPx\Vert \le \Vert APx\Vert =\Vert A|_{\mathbb {N}}x\Vert =\Vert A{{\hat{x}}}\Vert , \end{aligned}$$

where \({{\hat{x}}}\in \ell ^2({\mathbb {Z}})\) is x, extended by zeros. But \(PAPx=APx\) if \({{\,\textrm{supp}\,}}(x)\subseteq w+1..\,\). So, for all \(N\in {\mathbb {N}}\),

$$\begin{aligned} \nu _{j..j+N-1}(A^+)&\le \nu _{j..j+N-1}(A|_{\mathbb {N}}) = \nu _{j..j+N-1}(A),\quad j\in 1..w, \end{aligned}$$
(13)
$$\begin{aligned} \nu _{j..j+N-1}(A^+)&= \nu _{j..j+N-1}(A|_{\mathbb {N}}) = \nu _{j..j+N-1}(A),\quad j\in w+1..\ . \end{aligned}$$
(14)

Columns with index in 1..w may lose some nonzero entries when passing from A via \(A|_{\mathbb {N}}\) to \(A^+\); columns in \(w+1..\) do not. With this preparation, we prove:

Proposition 4.5

Let A be a band operator with band-width w on the axis. Then, for all \(T\in \{A-\lambda , (A-\lambda )^*:\lambda \in {\mathbb {C}}\}\) and all \(N\in {\mathbb {N}}\),

$$\begin{aligned} \nu _N(T^+)\ =\ \min \left\{ \min _{j=1}^w \nu _{j..j+N-1}(T^+),\,\ \nu _N(T|_{\mathbb {N}})\right\} , \end{aligned}$$
(15)

where \(\nu _{l..r}(T^+)\) is the smallest singular value of the rectangular submatrix

$$\begin{aligned} (T^+_{ij})_{i\in 1..r+w,\, j\in l..r},\qquad l,r\in {\mathbb {N}}, \end{aligned}$$

of \(T^+\). If, additionally, (12) holds for A, then \(\nu _N(T|_{\mathbb {N}})=\nu _N(T)\), so that

$$\begin{aligned} \nu _N(T^+)\ =\ \min \left\{ \min _{j=1}^w \nu _{j..j+N-1}(T^+),\,\ \nu _N(T)\right\} . \end{aligned}$$
(16)

In particular, by (5) and (3),

$$\begin{aligned} \nu (T^+)\ \le \ \nu (T) \qquad \text {and}\qquad \Vert (T^+)^{-1}\Vert \ \ge \ \Vert T^{-1}\Vert , \end{aligned}$$

so that,

$$\begin{aligned} \sigma (A)\subseteq \sigma (A^+) \qquad \text {and}\qquad \sigma _\varepsilon (A)\subseteq \sigma _\varepsilon (A^+),\quad \varepsilon >0. \end{aligned}$$

Proof

We fix \(N\in {\mathbb {N}}\), apply (13) and (14) to \(T=A-\lambda \), and conclude (15) via Lemma 4.1:

$$\begin{aligned} \nu _N(T^+)\ {}&=\ \inf _{j\in {\mathbb {N}}}\nu _{j..j+N-1}(T^+)\\&=\ \min \left\{ \min _{j=1}^w \nu _{j..j+N-1}(T^+)\ ,\ \inf _{j\ge w+1}\nu _{j..j+N-1}(T^+)\right\} \\&{\mathop {=}\limits ^{(14)}}\ \min \left\{ \min _{j=1}^w \nu _{j..j+N-1}(T^+)\ ,\ \inf _{j\ge w+1}\nu _{j..j+N-1}(T)\right\} \\&{\mathop {=}\limits ^{(13)}}\ \min \left\{ \min _{j=1}^w\Big \{\nu _{j..j+N-1}(T^+),\ \nu _{j..j+N-1}(T)\Big \}\ ,\ \inf _{j\ge w+1}\nu _{j..j+N-1}(T)\right\} \\&=\ \min \left\{ \min _{j=1}^w \nu _{j..j+N-1}(T^+)\ ,\ \inf _{j\in {\mathbb {N}}}\nu _{j..j+N-1}(T)\right\} \\&=\ \min \left\{ \min _{j=1}^w \nu _{j..j+N-1}(T^+)\ ,\ \nu _N(T|_{\mathbb {N}})\right\} . \end{aligned}$$

Since also (12) transfers from A to all \(A-\lambda \), we get \(\nu _N(T)=\nu _N(T|_{\mathbb {N}})\) and hence (16), by Proposition 4.2. The rest is by Lemma 4.4 and Lemma 3.4. \(\square \)
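As a minimal illustration of (13) and (14), added here as a sanity check and not part of the original argument, consider the tridiagonal matrix A of Sect. 1.1 (so \(w=1\)) and \(N=1\). Then \(\nu _{j..j}(\cdot )\) is simply the Euclidean norm of the j-th column, and column j of A carries the entries 1, \(b_j\), 1 in rows \(j-1\), j, \(j+1\):

$$\begin{aligned} \nu _{1..1}(A^+)&=\sqrt{|b_1|^2+1}\ <\ \sqrt{1+|b_1|^2+1}=\nu _{1..1}(A|_{\mathbb {N}})=\nu _{1..1}(A),\\ \nu _{j..j}(A^+)&=\sqrt{1+|b_j|^2+1}=\nu _{j..j}(A|_{\mathbb {N}})=\nu _{j..j}(A),\qquad j\ge 2. \end{aligned}$$

Only the first column loses an entry (the one in row 0) when passing to \(A^+\), which is why the inequality in (13) is confined to \(j\in 1..w\).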

Remark 4.6

Letting \(N\rightarrow \infty \) in (15) yields \(\nu (T^+) = \min \{c,\nu (T|_{\mathbb {N}})\}\), where

$$\begin{aligned} c\,=\ \lim _{N\rightarrow \infty } \min _{j=1}^w \nu _{j..j+N-1}(T^+)\ =\ \min _{j=1}^w\nu _{j..}(T^+)\ =\ \nu _{1..}(T^+)\ =\ \nu (T^+) \end{aligned}$$

since \(\lim \) and \(\min \) commute and since \(\nu _{l..r}(T^+)\rightarrow \nu _{l..}(T^+)\) as \(r\rightarrow \infty \) (by monotonicity and Corollary 3.2). This equality shows that the second term in the minimum of (15) and (16) is asymptotically irrelevant.

Now we apply Proposition 4.5 to generalized Schrödinger operators \(A=H(b)\):

Corollary 4.7

Let \(b\in \Sigma ^{\mathbb {Z}}\). If \({{\mathcal {W}}}(b)={{\mathcal {W}}}(b|_{\mathbb {N}})\) then, for all \(\lambda \in {\mathbb {C}}\),

$$\begin{aligned} \Vert (H(b)^+-\lambda )^{-1}\Vert \ge \Vert (H(b)-\lambda )^{-1}\Vert ,\quad \text {so that}\quad \sigma (H(b)^+)\supseteq \sigma (H(b)), \end{aligned}$$

where (16) applies to \(T=H(b)-\lambda \).

Proof

Note (12) for \(A=H(b)\), by \({{\mathcal {W}}}(b)={{\mathcal {W}}}(b|_{\mathbb {N}})\), and use Proposition 4.5. \(\square \)

4.4 Two Schrödinger Operators on the Half Axis

Proposition 4.8

Let \(b,c\in \Sigma ^{\mathbb {N}}\) and let \(H(b)^+\) and \(H(c)^+\) denote the corresponding generalized Schrödinger operators on the half axis. Denote their band-width by w. If \(N\in {\mathbb {N}}\) and

$$\begin{aligned} b|_{1..w+N-1}\ =\ c|_{1..w+N-1} \qquad \text {and}\qquad {{\mathcal {W}}}_N(b)\ \subseteq \ {{\mathcal {W}}}_N(c) \end{aligned}$$

then

$$\begin{aligned} \nu _N(H(b)^+)\ \ge \ \nu _N(H(c)^+). \end{aligned}$$

Proof

Let \(N\in {\mathbb {N}}\) and note that

$$\begin{aligned} \nu _N(H(b)^+)&{\mathop {=}\limits ^{(15)}} \min \left\{ \min _{j=1}^w \nu _{j..j+N-1}(H(b)^+)\ ,\ \nu _N(H(b)|_{\mathbb {N}})\right\} \\&\ge \min \left\{ \min _{j=1}^w \nu _{j..j+N-1}(H(c)^+)\ ,\ \nu _N(H(c)|_{\mathbb {N}})\right\} {\mathop {=}\limits ^{(15)}} \nu _N(H(c)^+) \end{aligned}$$

since the two \(\min _{j=1}^w \nu _{j..j+N-1}\dots \) terms are equal, by \(b|_{1..w+N-1}\ =\ c|_{1..w+N-1}\), and since \(\nu _N(H(b)|_{\mathbb {N}})\ge \nu _N(H(c)|_{\mathbb {N}})\), by \({{\mathcal {W}}}_N(b)\subseteq {{\mathcal {W}}}_N(c)\) and Proposition 4.2 (a). \(\square \)

For a direct comparison of \(\nu (H(b)^+)\) and \(\nu (H(c)^+)\) in the same style, we have to send \(N\rightarrow \infty \), enforcing \(b=c\) via \(b|_{1..w+N-1}=c|_{1..w+N-1}\) for all \(N\in {\mathbb {N}}\). For an approximation \(\nu (H(b_m)^+)\rightarrow \nu (H(b)^+)\) however, we have much less restrictive conditions (including pointwise convergence \(b_m\rightarrow b\)), see Theorem 4.12 below.

4.5 Spectral Approximation via Approximation of Submatrices

We start with the general case, not necessarily generalized Schrödinger. In particular, many diagonals could be varying.

Proposition 4.9

Let \(A,A_1,A_2,\ldots \) be band operators on the axis with a uniform upper bound on their norms and on their band-widths. If

$$\begin{aligned} \forall N\in {\mathbb {N}}:\ \exists m_0\in {\mathbb {N}}:\ \forall m\ge m_0:\quad {{\mathcal {C}}}_N(A_m)\ =\ {{\mathcal {C}}}_N(A) \end{aligned}$$
(17)

then \(\nu (A_m)\rightarrow \nu (A)\) and, in fact,

$$\begin{aligned} \nu (A_m-\lambda )\ \rightarrow \ \nu (A-\lambda ),\qquad \lambda \in {\mathbb {C}}. \end{aligned}$$

Proof

Let \(\varepsilon >0\) and take \(N\in {\mathbb {N}}\) so that Lemma 3.1 applies, with \(\frac{\varepsilon }{2}\) in place of \(\varepsilon \), to A and all \(A_m\) with \(m\in {\mathbb {N}}\). This is possible since we have a uniform upper bound on their norms and on their band-widths.

Now, in accordance with N, take m large enough that (17) holds. Then, by Lemma 3.1, Proposition 4.2 (a) and again Lemma 3.1, in this order,

$$\begin{aligned} \nu (A_m){\mathop {\approx }\limits ^{\varepsilon /2}}\nu _N(A_m)=\nu _N(A){\mathop {\approx }\limits ^{\varepsilon /2}}\nu (A), \quad \text {so that}\quad \nu (A_m){\mathop {\approx }\limits ^{\varepsilon }}\nu (A). \end{aligned}$$

Since \(\varepsilon \) was arbitrary, it follows that \(\nu (A_m)\rightarrow \nu (A)\) as \(m\rightarrow \infty \). Repeating the same argument for \(A_m-\lambda \) and \(A-\lambda \) in place of \(A_m\) and A proves the claim. \(\square \)

4.6 Spectral Approximation via Approximation of Subwords: The Axis

We directly conclude the same (and more) for a sequence of generalized Schrödinger operators on the axis. Note that a similar result is already known in the self-adjoint case [3].

Theorem 4.10

Let \(b,b_1,b_2,\ldots \in \Sigma ^{\mathbb {Z}}\) and look at the corresponding generalized Schrödinger operators on the axis, \(A:=H(b)\) as well as \(A_m:=H(b_m)\) for \(m\in {\mathbb {N}}\). If

$$\begin{aligned} \forall N\in {\mathbb {N}}:\ \exists m_0\in {\mathbb {N}}:\ \forall m\ge m_0:\quad {{\mathcal {W}}}_N(b_m)\ =\ {{\mathcal {W}}}_N(b) \end{aligned}$$
(18)

then, for all \(\lambda \in {\mathbb {C}}\) and \(\varepsilon >0\),

$$\begin{aligned} \Vert (A_m-\lambda )^{-1}\Vert \ \rightarrow \ \Vert (A-\lambda )^{-1}\Vert \qquad \text {and}\qquad \sigma _\varepsilon (A_m)\rightarrow \sigma _\varepsilon (A) \end{aligned}$$

in Hausdorff distance, as \(m\rightarrow \infty \). If \(A_m\) and A are normal, one also has

$$\begin{aligned} \sigma (A_m)\rightarrow \sigma (A). \end{aligned}$$

Proof

We apply Proposition 4.9, which is possible since all our operators have the same band-width and \(\Vert A\Vert , \Vert A_m\Vert \le \Vert L\Vert +\max \limits _{\sigma \in \Sigma } |\sigma |\) for all \(m\in {\mathbb {N}}\).

With the same argument for the adjoints of our operators, we get, after recalling (3), that \(\Vert (A_m-\lambda )^{-1}\Vert \rightarrow \Vert (A-\lambda )^{-1}\Vert \) for all \(\lambda \in {\mathbb {C}}\). Concluding Hausdorff convergence of the pseudospectra from here is a standard result (e.g. [18, Section 2.3] or [48]). In general, one cannot conclude Hausdorff convergence of the spectra. But in the normal case this is possible, due to (4). \(\square \)

Remark 4.11

For pseudoergodic \(b\in \Sigma ^{\mathbb {Z}}\), it is well-known [25] that

$$\begin{aligned} \sigma (H(b)) \ =\ \bigcup _{c\in \Sigma ^{\mathbb {Z}}} \sigma (H(c)), \end{aligned}$$
(19)

and an important question, for example in [12, 36, 44, 51], is whether restricting the union on the right to periodic \(c\in \Sigma ^{\mathbb {Z}}\) yields a dense subset. We do not answer this question in full generality, but we shed light on the corresponding problem for pseudospectra:

In analogy to (19), one can prove (e.g. [16])

$$\begin{aligned} \sigma _\varepsilon (H(b))\ =\ \bigcup _{c\in \Sigma ^{\mathbb {Z}}} \sigma _\varepsilon (H(c)) \supset \bigcup _{\text {periodic }c\in \Sigma ^{\mathbb {Z}}} \sigma _\varepsilon (H(c)),\qquad \varepsilon >0. \end{aligned}$$
(20)

Our construction, approximating \(\sigma _\varepsilon (H(b))\) in Hausdorff distance by \(\varepsilon \)-pseudospectra of \(H(b_m)\) with periodic \(b_m\in \Sigma ^{\mathbb {Z}}\), shows that the subset on the right of (20) is dense in the set on the left, \(\sigma _\varepsilon (H(b))\). If \(H(b_m)\) and H(b) are normal then also the original question about spectra and (19) is answered affirmatively.

Note that Theorem 4.10, in particular (18), does not require any convergence \(b_m\rightarrow b\). This changes if we switch to the half-axis.

4.7 Spectral Approximation via Approximation of Subwords: The Half-Axis

Here is the corresponding result for the half-axis; for the self-adjoint case, see [41].

Theorem 4.12

Let \(\Sigma \subseteq {\mathbb {C}}\) be finite, let \(b,b_1,b_2,\ldots \in \Sigma ^{\mathbb {N}}\) and look at the corresponding half-axis generalized Schrödinger operators \(A^+ :=H(b)^+\) as well as \(A_m^+ :=H(b_m)^+\) for \(m\in {\mathbb {N}}\).

If, again, (18) holds and, in addition, \(b_m\rightarrow b\) pointwise, then, for all \(\lambda \in {\mathbb {C}}\) and \(\varepsilon >0\),

$$\begin{aligned} \Vert (A_m^+-\lambda )^{-1}\Vert \ \rightarrow \ \Vert (A^+-\lambda )^{-1}\Vert \qquad \text {and}\qquad \sigma _\varepsilon (A_m^+)\rightarrow \sigma _\varepsilon (A^+) \end{aligned}$$

in Hausdorff distance, as \(m\rightarrow \infty \). If \(A_m^+\) and \(A^+\) are normal, one also has

$$\begin{aligned} \sigma (A_m^+)\rightarrow \sigma (A^+). \end{aligned}$$

Proof

The proof is almost identical to that of Theorem 4.10: again, given \(\varepsilon >0\), choose \(N\in {\mathbb {N}}\) so that Lemma 3.1 holds with \(\frac{\varepsilon }{2}\) for \(A^+\) and all \(A_m^+\). Then, and this is a bit different, in accordance with N, take \(m_0\) large enough that

  • (18) holds, as well as

  • \(b_m=b\) on \(1..w+N-1\) for all \(m\ge m_0\),

the latter is possible since \(b_m\rightarrow b\) pointwise and \(\Sigma \) is discrete.

Then, by Proposition 4.8 (applied twice, once with the roles of \(b_m\) and b interchanged), \(\nu _N(A_m^+)=\nu _N(A^+)\), and we proceed as in the proof of Proposition 4.9. \(\square \)

4.8 Sufficient Conditions for (18)

Condition (18), in short: for all \(N\in {\mathbb {N}}\), eventually, as \(m\rightarrow \infty \), \({{\mathcal {W}}}_N(b_m)={{\mathcal {W}}}_N(b)\), plays a crucial role in Proposition 4.9 and Theorems 4.10 and 4.12. Here we discuss some arguably more handy criteria that are sufficient for (18). We look at the case of the whole axis but the half-axis case works the same way (with obvious modifications, like replacing \(-r..r\) by 1..r).

For aperiodic potentials and potentials generated by substitutions, this condition can always be satisfied with periodic potentials \(b_m\) that can be obtained constructively. We will leave the details to [3] and [31].

In the general case, we discuss the inclusions \({{\mathcal {W}}}_N(b_m)\supseteq {{\mathcal {W}}}_N(b)\) and \({{\mathcal {W}}}_N(b_m)\subseteq {{\mathcal {W}}}_N(b)\) separately because each is interesting in its own right: “\(\supseteq \)” guarantees a spectral inclusion of \(\sigma (H(b))\), and “\(\subseteq \)” at least avoids spectral pollution.

For both inclusions, we make the following assumptions:

  1. (1)

    \(|\Sigma |<\infty \),

    A bounded \(\Sigma \subseteq {\mathbb {C}}\) is discrete if and only if it is finite (Bolzano-Weierstrass).

  2. (2)

    \(b_m\in \Sigma ^{\mathbb {Z}}\) for all \(m\in {\mathbb {N}}\),

    Together with (1) and (3) we also conclude \(b\in \Sigma ^{\mathbb {Z}}\).

  3. (3)

    \(b_m\rightarrow b\) pointwise.

Note: In the half-axis case (Theorem 4.12), we already make all three assumptions. By (1) and (3), we have that

$$\begin{aligned} \forall r\in {\mathbb {N}}:\ \ \exists m(r):\ \forall m\ge m(r): \quad b_m|_{-r..r}=b|_{-r..r}. \end{aligned}$$
(21)

4.9 Sufficient Conditions for \({{\mathcal {W}}}_N(b_m)\supseteq {{\mathcal {W}}}_N(b)\), Eventually

This inclusion holds without further conditions; (1)–(3) are enough. Indeed: Let \(N\in {\mathbb {N}}\) and \(w\in {{\mathcal {W}}}_N(b)\). Take \(r\in {\mathbb {N}}\) large enough for \(w\in {{\mathcal {W}}}(b|_{-r..r})\). For \(m\ge m_0:=m(r)\) from (21) we then have \(b_m|_{-r..r}=b|_{-r..r}\), whence \(w\in {{\mathcal {W}}}_N(b_m)\).

4.10 Sufficient Conditions for \({{\mathcal {W}}}_N(b_m)\subseteq {{\mathcal {W}}}_N(b)\), Eventually

This property is not for free. As a negative example, look at \(b_m:=\chi _{\{m\}}\rightarrow b\equiv 0\), so that subwords of \(b_m\) containing “1” do not show in the limit b. To rule out these kinds of examples, we will impose that each pattern that appears once in \(b_m\) appears infinitely often (this is not enough yet, e.g. \(b_m=\chi _{2m{\mathbb {Z}}+m}\rightarrow b\equiv 0\)) and that the gap between two occurrences remains bounded as \(m\rightarrow \infty \) (see e.g. [23] and Remark 4.5 in [46]).

For \(u\in \Sigma ^{\mathbb {Z}}\), \(N\in {\mathbb {N}}\) and \(w\in {{\mathcal {W}}}_N(u)\), recall

$$\begin{aligned} \textrm{pos}(w,u) = \{k\in {\mathbb {Z}}: u(k)\cdots u(k+N-1)=w\} \end{aligned}$$

and further put

$$\begin{aligned} \textrm{gap}(w,u)&:= \min \{r\in {\mathbb {N}}: \textrm{pos}(w,u)+(-r..r)={\mathbb {Z}}\},\\ \textrm{gap}(N,u)&:= \max \{\textrm{gap}(w,u):w\in {{\mathcal {W}}}_N(u)\}. \end{aligned}$$

If \(\textrm{gap}(w,u)<\infty \), we say that w has bounded gaps in u. Then \(\textrm{gap}(N,u)<\infty \) means that every subword of length N has bounded gaps in u. (By (1), there are only finitely many \(w\in {{\mathcal {W}}}_N(u)\), so that a uniform upper bound is automatic.)
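For instance, take the second negative example from above, \(u=b_m=\chi _{2m{\mathbb {Z}}+m}\), and the one-letter word \(w=\text {``1''}\). The following computation is ours, added for illustration:

$$\begin{aligned} \textrm{pos}(w,b_m)=m+2m{\mathbb {Z}},\qquad \textrm{pos}(w,b_m)+(-r..r)={\mathbb {Z}}\ \iff \ 2r+1\ge 2m\ \iff \ r\ge m, \end{aligned}$$

so that \(\textrm{gap}(w,b_m)=m\). Each single \(b_m\) has bounded gaps, but \(\sup _m \textrm{gap}(w,b_m)=\infty \), so there is no bound that holds for all m at once.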

The next step is to expect this property for all \(b_m\), uniformly in m:

Proposition 4.13

If, in addition to (1)–(3), one has property

  1. (4)

    \(\quad \forall N\in {\mathbb {N}}:\ \ g(N):=\sup \limits _{m\in {\mathbb {N}}}\ \textrm{gap}(N,b_m)<\infty \)

then the condition (18) is satisfied.

Proof

Let \(N\in {\mathbb {N}}\), choose \(r\in {\mathbb {N}}\) with \(r\ge g(N)+N-1\), where g(N) is from (4), and then take \(m\ge m_0:=m(r)\) from (21). Now let \(w\in {{\mathcal {W}}}_N(b_m)\). By (4), we have

$$\begin{aligned} \textrm{gap}(w,b_m)\ \le \ \textrm{gap}(N,b_m)\ \le \ g(N)\ \le \ r-N+1. \end{aligned}$$

So, by the definition of \(\textrm{gap}(w,b_m)\), there exists a \(k\in \textrm{pos}(w,b_m)\) with \(|k|\le g(N)\), whence \(k..k+N-1\subseteq -r..r\). This is an occurrence of w in \(b_m|_{-r..r}\) and, by (21), also in \(b|_{-r..r}\). So \(w\in {{\mathcal {W}}}_N(b)\).

This finishes the proof that, eventually, as \(m\rightarrow \infty \), \({{\mathcal {W}}}_N(b_m)\subseteq {{\mathcal {W}}}_N(b)\). The reverse inclusion already follows from (1) to (3), see the previous subsection. \(\square \)

While Proposition 4.13 might help to decide whether some given b and \((b_m)\) satisfy (18), it does not help to construct \((b_m)\) from b. Let us therefore sketch a straightforward way to create a sequence \((b_m)\) such that (18) is always satisfied. For that purpose we have to restrict ourselves to the case where \({{\mathcal {W}}}(b)={{\mathcal {W}}}(b|_{\mathbb {N}})\), in which case, by Lemma 4.4, \(\#(w,b|_{\mathbb {N}}) = \infty \) for all \(w\in {{\mathcal {W}}}(b)\).

Let \(N\in {\mathbb {N}}\) and find \(r\in {\mathbb {N}}\) such that \({{\mathcal {W}}}_N(b) = {{\mathcal {W}}}_N(b|_{1..r})\). Then take \({\widetilde{r}} :=\min \{\textrm{pos}(b|_{1..N},b|_{r+1..}) \}-1\) and set \(b_N\) to be the \({\widetilde{r}}\)-periodic word repeating \(b|_{1..{\widetilde{r}}}\). It is easy to see that (18) is always satisfied with \(m_0=N\).
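The construction above can be tried out on a concrete word. The sketch below is a hedged illustration, not part of the original argument: as an assumed example it uses a finite prefix of the Fibonacci word (the fixed point of the substitution 0 → 01, 1 → 0), it takes the length-N subwords of that long prefix as a stand-in for \({{\mathcal {W}}}_N(b)\), and all function names are hypothetical. Python strings are 0-based, so `b[0]` plays the role of \(b_1\).

```python
def fibonacci_word(length):
    """Prefix of the Fibonacci word, obtained by iterating 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if ch == "0" else "0" for ch in word)
    return word[:length]


def subwords(word, n):
    """Set of all length-n subwords of a finite word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def periodic_subwords(block, n):
    """Length-n subwords of the bi-infinite periodic word repeating `block`."""
    reps = n // len(block) + 2  # enough copies to read n letters from any offset
    ext = block * reps
    return {ext[i:i + n] for i in range(len(block))}


def periodic_block(b, n, target):
    """Period block b_1..b_{r~} as in the text; `target` stands in for W_N(b)."""
    # Smallest r with W_N(b|_{1..r}) = W_N(b).
    r = next(r for r in range(n, len(b) + 1) if subwords(b[:r], n) == target)
    # First reappearance of b_1..b_N at a 1-based position >= r + 1,
    # i.e. at a 0-based index >= r; that index equals r~.
    start = b.find(b[:n], r)
    if start < 0:
        raise ValueError("prefix too short: b_1..b_N does not reappear after r")
    return b[:start]


b = fibonacci_word(500)
n = 5
target = subwords(b, n)
block = periodic_block(b, n, target)
print("period length:", len(block))
print("W_N(b_N) == W_N(b):", periodic_subwords(block, n) == target)
```

The check works because the block is followed in b by another copy of \(b|_{1..N}\), so the subwords that wrap around a period boundary already occur in b itself.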

4.11 Directions of Extension

We have seen (Proposition 4.9, one-way model in Sect. 2.5) that the limitation to just one varying diagonal was merely for convenience. But there is more room for extensions: Instead of band operators on \(\ell ^2({\mathbb {Z}})\), we could pass to uniform limits of band operators on \(\ell ^p({\mathbb {Z}}^d,X)\), where \(p\in [1,\infty ]\), \(d\in {\mathbb {N}}\) and X is any Banach space.

  • How to generalize Lemma 3.1 from the set of band operators to its closure, the so-called band-dominated operators [43, 53], is shown in [49].

  • The generalization from \(p=2\) to arbitrary \(p\in [1, \infty ]\) is possible using so-called \({{\mathcal {P}}}\)-theory [55], but it does not change the spectra, see Lemma 2.1, as long as we do not move too far away from band operators. Precisely, it holds in the so-called Wiener algebra [45], which is somewhere between the classes of band and band-dominated operators. In particular, Lemma 3.1 is not restricted to the case \(p=2\).

  • Passing from scalar to X-valued sequence spaces is again covered by \({{\mathcal {P}}}\)-theory [55] and enables us to study, for example, \(L^p({\mathbb {R}})\), by identifying it with \(\ell ^p({\mathbb {Z}},L^p[0,1])\).

  • Our tools (e.g. Lemma 3.1) and ideas are in fact not limited to the 1D case, \(d=1\). The case \(d\ge 2\) just needs some obvious modifications:

    • Discrete intervals have to be replaced by, e.g. cartesian products of discrete intervals or, more flexibly, by bounded sets \(S\subseteq {\mathbb {Z}}^d\) for which \(S+[-\frac{1}{2},\frac{1}{2}]^d\) is connected in \({\mathbb {R}}^d\);

    • A finite subword of \(b\in \Sigma ^{({\mathbb {Z}}^d)}\) is then \(b|_S\) with S from above;

    • In the same style, a finite submatrix of consecutive columns is then \((A_{i,j})_{i\in {\mathbb {Z}}^d,\,j\in S}\), truncated to the band part of A and shifted to a common region;

    • The full space results are again straightforward;

    • For compressions to an infinite set \(U\subseteq {\mathbb {Z}}^d\) (like the half-axis in 1D), the new Proposition 4.8 will again have to require exactness in the w-neighborhood of the boundary of U and a match of the sets of finite subwords otherwise.

    • Note however that Lemma 3.4 does not transfer to \(d\ge 2\) since the main results of [56] rely on 1D. (Also see Example 30 in [55].)