1 Introduction

Traditional stability analysis of linear dynamic models is based on eigenvalues. Determining the eigenvalues of a matrix or, more generally, the spectrum of a linear operator is thus a major task in analysis and numerics. The explicit computation of the whole spectrum of a linear operator by analytical or numerical techniques is only possible in rare cases. Moreover, the spectrum is in general quite sensitive to small perturbations of the operator; this is particularly true for non-normal matrices and operators. Therefore, one is interested in supersets of the spectrum that are easier to compute and that are also robust under perturbations. One suitable superset is the \(\varepsilon \)-pseudospectrum, a notion which has been introduced independently by Landau [15], Varah [25], Godunov [14], Trefethen [22] and Hinrichsen and Pritchard [11]. The \(\varepsilon \)-pseudospectrum of a linear operator A on a Hilbert space H is the union of the spectra of all operators on H of the form \(A+P\) with \(\Vert P\Vert <\varepsilon \). Besides being robust under perturbations, the pseudospectrum is also suitable for determining the transient growth behavior of linear dynamic models in finite time, which may be far from the asymptotic behavior. For an overview of the pseudospectrum and its applications we refer the reader to [8, 24].

Numerical computation of the pseudospectrum of a matrix has been studied intensively in the literature. Most algorithms use simple grid-based methods, where one computes the smallest singular value of \(A-z\) at the points z of a grid, or path-following methods; see the survey [23] or the overview in [8]. Both approaches face several challenges. The main problem of grid-based methods is first to find a suitable region in the complex plane and then to perform the computation at a usually very large number of grid points. The main difficulty of path-following algorithms is to find a starting point, that is, a point on the boundary of the pseudospectrum. Moreover, as the pseudospectrum may be disconnected, it is difficult to find every component. However, several speedup techniques are available, see [23], which are essential for applications.
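
For orientation, the grid-based approach mentioned above (not the method proposed in this paper) reduces to evaluating the smallest singular value of \(A-zI\) on a grid, since \(\sigma _{\min }(A-zI)<\varepsilon \) is equivalent to \(\Vert (A-zI)^{-1}\Vert >\frac{1}{\varepsilon }\). A minimal numpy sketch for an illustrative \(2\times 2\) matrix (the matrix, \(\varepsilon \) and the grid are arbitrary choices):

```python
import numpy as np

# Illustrative normal 2x2 test matrix (not from the paper)
A = np.array([[1.0, 0.0], [0.0, -1.0]])
eps = 0.5

# sigma_min(A - zI) < eps  <=>  ||(A - zI)^{-1}|| > 1/eps  <=>  z in sigma_eps(A)
def smin(z):
    return np.linalg.svd(A - z * np.eye(2), compute_uv=False)[-1]

xs = np.linspace(-2, 2, 81)
ys = np.linspace(-2, 2, 81)
inside = [x + 1j * y for x in xs for y in ys if smin(x + 1j * y) < eps]

# For this normal matrix, sigma_eps(A) is the union of the open
# eps-balls around the eigenvalues 1 and -1.
assert all(min(abs(z - 1), abs(z + 1)) < eps for z in inside)
```

The cost is one singular value decomposition per grid point, which is exactly the expense that the enclosure proposed in this article tries to reduce.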

A simple method to enclose the pseudospectrum is in terms of the numerical range. More precisely, under an additional weak assumption, the \(\varepsilon \)-pseudospectrum is contained in an \(\varepsilon \)-neighborhood of the numerical range of the operator, see Remark 2.8. While this superset is easy to compute for matrices, it cannot distinguish disconnected components of the pseudospectrum since the numerical range is convex.

In this article we propose a new method to enclose the pseudospectrum via the numerical range of the inverse of the matrix or linear operator. More precisely, for a linear operator A on a Hilbert space and \(\varepsilon >0\) we show

$$\begin{aligned} \sigma _\varepsilon (A)\subset \bigcap _{s\in S} \left[ \bigl (B_{\delta _s}(W((A-s)^{-1}))\bigr )^{-1}+s\right] , \end{aligned}$$
(1)

see Theorem 2.2. Here \(\sigma _\varepsilon (A)\) denotes the \(\varepsilon \)-pseudospectrum of A, \(W((A-s)^{-1})\) is the numerical range of the resolvent operator \((A-s)^{-1}\), \(B_{\delta _s}(U)\) is the \({\delta _s}\)-neighborhood of a set U, and S is a suitable subset of the complex plane. This inclusion holds for matrices as well as for linear operators on Hilbert spaces. Further, we show that the enclosure of the pseudospectrum in (1) becomes optimal if the set S is chosen optimally, see Theorem 2.5. The idea to study the numerical range of the inverses stems from the fact that the spectrum of a matrix can be expressed in terms of inverses of shifted matrices [12].

From a numerical point of view this new method faces similar challenges as grid-based methods as a suitable set S of points has to be found and then the numerical ranges of a large number of matrices have to be computed. However, this new method has the advantage that it enables us to enclose the pseudospectrum of an infinite-dimensional operator by a set which is expressed by the approximating matrices.

The usual procedure to compute the pseudospectrum of a linear operator on an infinite-dimensional Hilbert space is to approximate it by matrices and then to calculate the pseudospectrum of one of the approximating matrices. In [24, Chapter 43] spectral methods are used for the approximation, but no convergence properties of the pseudospectrum under discretization are proved. So far only a few results are available concerning the relation between the pseudospectra of the discretized operator and those of the infinite-dimensional operator. Convergence properties of the pseudospectrum under discretization have been studied for the linearized Navier-Stokes equation [9], for band-dominated bounded operators [18] and for Toeplitz operators [5]. Bögli and Siegl [3, 4] prove local and global convergence of the pseudospectra of a sequence of linear operators which converge in a generalized resolvent sense. Further, Wolff [27] shows some abstract convergence results for the approximate point spectrum of a linear operator using the pseudospectra of the approximations.

In this article we refine the enclosure (1) of the pseudospectrum of linear operators further and show that it is sufficient to calculate the numerical ranges of approximating matrices. More precisely, we show in Theorem 3.6 that

$$\begin{aligned} \sigma _\varepsilon (A)\subset \bigcap _{s\in S} \left[ \bigl (B_{\delta _s}(W((A_n-s)^{-1}))\bigr )^{-1}+s\right] \end{aligned}$$
(2)

if n is sufficiently large. Here \(A_n\) is a sequence of matrices which approximates the operator A strongly. We refer to Sect. 3 for the precise definition of strong approximation. If we even have a uniform approximation of the operator A, then we are able to prove an estimate for the index n such that (2) holds in intersections with compact subsets of the complex plane, see Sect. 4. In Sect. 5 we show that finite element discretizations of elliptic partial differential operators yield uniform approximations. Further, as an example of strong approximation we study in Sect. 6 a class of structured block operator matrices. In the final section we apply our obtained results to the advection–diffusion operator and the Hain–Lüst operator.

We conclude this introduction with some remarks on the notation used. Let H be a Hilbert space. Throughout this article we assume that \(A:{\mathcal {D}}(A)\subset H\rightarrow H\) is a closed, densely defined, linear operator. We denote the range of A by \({\mathcal {R}}(A)\) and the spectrum by \(\sigma (A)\). The resolvent set is \(\varrho (A)={\mathbb {C}}\backslash \sigma (A)\). Let \({\mathcal {L}}(H_1, H_2)\) denote the set of linear, bounded operators from the Hilbert space \(H_1\) to the Hilbert space \(H_2\). The operator norm of \(T\in {\mathcal {L}}(H_1, H_2)\) will be denoted by \(\Vert T\Vert _{{\mathcal {L}}(H_1, H_2)}\). To shorten notation, we write \({\mathcal {L}}(H)={\mathcal {L}}(H, H)\) and denote the operator norm of \(T\in {\mathcal {L}}(H)\) by \(\Vert T\Vert \). The identity operator is denoted by I. For every \(\lambda \in \varrho (A)\), the resolvent \((A-\lambda )^{-1}:=(A-\lambda I)^{-1}\) satisfies \((A-\lambda )^{-1}\in {\mathcal {L}}(H)\). For a set of complex numbers \(S\subset {\mathbb {C}}\) we denote the \(\delta \)-neighborhood by \(B_\delta (S)\), i.e., \(B_\delta (S)=\left\{ z\in {\mathbb {C}}\,\big |\,{{\,\mathrm{dist}\,}}(z,S)<\delta \right\} \), and we also use the notation \(S^{-1}=\left\{ z^{-1}\,\big |\,z\in S{\setminus }\{0\}\right\} \). Further, we use the notation \({\mathbb {C}}^*:={\mathbb {C}}\backslash \{0\}\).

2 Pseudospectrum Enclosures Using the Numerical Range

In this section we present the basic idea of considering numerical ranges of shifted inverses of an operator in order to obtain an enclosure of its pseudospectrum. We start by recalling the notions of the numerical range and the \(\varepsilon \)-pseudospectrum.

The numerical range of an operator A is defined as the set

$$\begin{aligned} W(A)=\left\{ \langle Ax,x\rangle \,\big |\,x\in {\mathcal {D}}(A),\,\Vert x\Vert =1\right\} , \end{aligned}$$

see e.g. [13]. It is always a convex set and, if A is additionally bounded, then W(A) is bounded too. The numerical radius is \(w(A)=\sup \left\{ |z|\,\big |\,z\in W(A)\right\} \). The numerical range satisfies the inclusions

$$\begin{aligned} \sigma _p(A)\subset W(A),\qquad \sigma _{\mathrm {app}}(A)\subset \overline{W(A)}, \end{aligned}$$

where \(\sigma _p(A)\) is the point spectrum of A, i.e., the set of all eigenvalues and \(\sigma _{\mathrm {app}}(A)\) is the so-called approximate point spectrum defined by

$$\begin{aligned} \sigma _{\mathrm {app}}(A)=\left\{ \lambda \in {\mathbb {C}}\,\big |\,\exists x_n\in {\mathcal {D}}(A),\, \Vert x_n\Vert =1 : \lim _{n\rightarrow \infty }(A-\lambda )x_n=0\right\} . \end{aligned}$$

The spectrum, point spectrum and approximate point spectrum are related by \(\sigma _p(A)\subset \sigma _{\mathrm {app}}(A)\subset \sigma (A)\). If A has a compact resolvent, then the spectrum consists of eigenvalues only and hence we have equality.
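
For matrices, the numerical range can be computed by the classical rotation technique: for each angle \(\theta \), the largest eigenvalue of the Hermitian part of \(e^{i\theta }A\) gives a supporting line of the convex set W(A), and a corresponding unit eigenvector x yields the boundary point \(\langle Ax,x\rangle \). This standard algorithm is not part of the text, but it is the natural tool for evaluating the enclosures below numerically; a sketch:

```python
import numpy as np

def numerical_range_boundary(A, num_angles=180):
    """Boundary points of W(A) via supporting lines of the convex set W(A)."""
    pts = []
    for theta in np.linspace(0, 2 * np.pi, num_angles, endpoint=False):
        # Hermitian part of the rotated matrix; its top eigenvector
        # attains the supporting line of W(A) in the rotated direction.
        H = (np.exp(1j * theta) * A + (np.exp(1j * theta) * A).conj().T) / 2
        _, V = np.linalg.eigh(H)
        x = V[:, -1]                      # unit eigenvector of the largest eigenvalue
        pts.append(np.vdot(x, A @ x))     # <Ax, x> with ||x|| = 1
    return np.array(pts)

# For a normal matrix, W(A) is the convex hull of the eigenvalues,
# so the numerical radius equals the spectral radius.
A = np.diag([1.0 + 1.0j, -1.0 - 1.0j])
pts = numerical_range_boundary(A)
assert abs(np.max(np.abs(pts)) - np.sqrt(2)) < 1e-8
```

Each boundary point is a Rayleigh quotient, so the computed points always lie in W(A); refining `num_angles` refines the polygonal approximation of the boundary.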

For \(\varepsilon >0\) the \(\varepsilon \)-pseudospectrum of A is given by

$$\begin{aligned} \sigma _\varepsilon (A)=\sigma (A)\cup \left\{ \lambda \in \varrho (A)\,\big |\,\Vert (A-\lambda )^{-1}\Vert >\frac{1}{\varepsilon }\right\} . \end{aligned}$$

If we understand \(\Vert (A-\lambda )^{-1}\Vert \) to be infinity for \(\lambda \in \sigma (A)\), then this can be shortened to

$$\begin{aligned} \sigma _\varepsilon (A)=\left\{ \lambda \in {\mathbb {C}}\,\big |\,\Vert (A-\lambda )^{-1}\Vert >\frac{1}{\varepsilon }\right\} . \end{aligned}$$

Hence

$$\begin{aligned} {\mathbb {C}}{\setminus }\sigma _\varepsilon (A)=\left\{ \lambda \in \varrho (A)\,\big |\,\Vert (A-\lambda )^{-1}\Vert \le \frac{1}{\varepsilon }\right\} . \end{aligned}$$

The central idea of this article is the following: If \(\lambda \in {\mathbb {C}}\) is such that \(1/\lambda \) has a certain positive distance \(\delta \) to the numerical range of the inverse operator \(A^{-1}\), then this yields an estimate of the form

$$\begin{aligned} \Vert (A-\lambda )x\Vert \ge \varepsilon \Vert x\Vert ,\qquad x\in {\mathcal {D}}(A), \end{aligned}$$

with some constant \(\varepsilon >0\), which will in turn be used to show \(\lambda \in \varrho (A)\) with \(\Vert (A-\lambda )^{-1}\Vert \le \frac{1}{\varepsilon }\), i.e., \(\lambda \not \in \sigma _\varepsilon (A)\). This is made explicit with the next proposition:

Proposition 2.1

Suppose that \(0\in \varrho (A)\). Then for every \(0<\varepsilon < \frac{1}{\Vert A^{-1}\Vert }\) and \(\delta =\frac{\Vert A^{-1}\Vert ^2\varepsilon }{1-\Vert A^{-1}\Vert \varepsilon }\) we have

$$\begin{aligned} \sigma _\varepsilon (A)\subset \bigl (B_\delta (W(A^{-1}))\bigr )^{-1}. \end{aligned}$$

Proof

Let us denote \(U=\bigl (B_\delta (W(A^{-1}))\bigr )^{-1}\). As a first step we show that

$$\begin{aligned} \Vert (A-\lambda )x\Vert \ge \varepsilon \quad \text {for all}\quad \lambda \in {\mathbb {C}}{\setminus } U,\,x\in {\mathcal {D}}(A),\,\Vert x\Vert =1. \end{aligned}$$
(3)

So let \(\lambda \in {\mathbb {C}}{\setminus } U\). We consider two cases. First suppose that \(|\lambda |>\frac{1}{\Vert A^{-1}\Vert }-\varepsilon \). Then \(\lambda \ne 0\), \(\lambda ^{-1}\not \in B_{\delta }(W(A^{-1}))\) and hence \({{\,\mathrm{dist}\,}}(\lambda ^{-1},W(A^{-1}))\ge \delta \). For \(x\in {\mathcal {D}}(A)\), \(\Vert x\Vert =1\) we find

$$\begin{aligned} \delta \le |\lambda ^{-1}-\langle A^{-1}x,x\rangle | =|\langle (\lambda ^{-1}-A^{-1})x,x\rangle | \le \Vert (\lambda ^{-1}-A^{-1})x\Vert . \end{aligned}$$

Consequently

$$\begin{aligned} \Vert (A-\lambda )x\Vert&=|\lambda |\Vert A(\lambda ^{-1}-A^{-1})x\Vert \ge \frac{|\lambda |}{\Vert A^{-1}\Vert }\Vert (\lambda ^{-1}-A^{-1})x\Vert \\&\ge \frac{\delta }{\Vert A^{-1}\Vert }\left( \frac{1}{\Vert A^{-1}\Vert }-\varepsilon \right) = \frac{\delta (1-\Vert A^{-1}\Vert \varepsilon )}{\Vert A^{-1}\Vert ^2}=\varepsilon . \end{aligned}$$

In the other case if \(|\lambda |\le \frac{1}{\Vert A^{-1}\Vert }-\varepsilon \) then \(|\lambda |\Vert A^{-1}\Vert \le 1-\Vert A^{-1}\Vert \varepsilon \) and hence \(I-\lambda A^{-1}\) is invertible by a Neumann series argument with \(\Vert (I-\lambda A^{-1})^{-1}\Vert \le \frac{1}{\Vert A^{-1}\Vert \varepsilon }\). For \(x\in {\mathcal {D}}(A)\), \(\Vert x\Vert =1\) this implies

$$\begin{aligned} \Vert (A-\lambda )x\Vert =\Vert A(I-\lambda A^{-1})x\Vert \ge \frac{1}{\Vert A^{-1}\Vert \Vert (I-\lambda A^{-1})^{-1}\Vert }\ge \varepsilon . \end{aligned}$$

We have thus shown (3). In particular, \(\lambda \in {\mathbb {C}}{\setminus } U\) implies \(\lambda \not \in \sigma _{\mathrm {app}}(A)\), i.e.,

$$\begin{aligned} \sigma _{\mathrm {app}}(A)\cap {\mathbb {C}}{\setminus } U=\varnothing . \end{aligned}$$
(4)

Since \(B_\delta (W(A^{-1}))\) is convex and bounded, the set \({\mathbb {C}}^*{\setminus } B_\delta (W(A^{-1}))\) is connected and hence also

$$\begin{aligned} {\mathbb {C}}^*{\setminus } U=\left( {\mathbb {C}}^*{\setminus } B_\delta (W(A^{-1}))\right) ^{-1}, \end{aligned}$$

the image under the homeomorphism \({\mathbb {C}}^*\rightarrow {\mathbb {C}}^*\), \(z\mapsto z^{-1}\). On the other hand, the boundedness of \(B_\delta (W(A^{-1}))\) implies that a neighborhood around 0 belongs to \({\mathbb {C}}{\setminus } U=({\mathbb {C}}^*{\setminus } U) \cup \{0\}\). Consequently, the set \({\mathbb {C}}{\setminus } U\) is connected and satisfies \(0\in \varrho (A)\cap {\mathbb {C}}{\setminus } U\). Using (4) and the fact that \(\partial \sigma (A)\subset \sigma _{\mathrm {app}}(A)\), we conclude that

$$\begin{aligned} {\mathbb {C}}{\setminus } U\subset \varrho (A). \end{aligned}$$

Here \(\partial \sigma (A)\) denotes the boundary of the spectrum of A. Now (3) implies that if \(\lambda \in {\mathbb {C}}{\setminus } U\) then \(\Vert (A-\lambda )^{-1}\Vert \le \frac{1}{\varepsilon }\) and therefore we obtain \(\lambda \not \in \sigma _\varepsilon (A)\). \(\square \)
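
Proposition 2.1 can be illustrated numerically. For a normal matrix, \(W(A^{-1})\) is the convex hull of the eigenvalues of \(A^{-1}\), so membership in the enclosure is easy to test exactly. The following sketch (the matrix, \(\varepsilon \) and the random sampling are arbitrary illustrative choices) samples points \(\lambda \) outside \(\bigl (B_\delta (W(A^{-1}))\bigr )^{-1}\) and confirms \(\Vert (A-\lambda )^{-1}\Vert \le \frac{1}{\varepsilon }\):

```python
import numpy as np

# Illustrative normal matrix with 0 in rho(A); here W(A^{-1}) is the
# real segment [-1/2, 1/2] (convex hull of the eigenvalues of A^{-1}).
A = np.diag([2.0, -2.0])
eps = 0.5                                            # 0 < eps < 1/||A^{-1}|| = 2
Ainv_norm = 0.5
delta = Ainv_norm**2 * eps / (1 - Ainv_norm * eps)   # = 1/6

def dist_to_W_inv(z):
    # distance of z to the segment [-1/2, 1/2] on the real axis
    return abs(z - np.clip(z.real, -0.5, 0.5))

def resolvent_norm(lam):
    return np.linalg.norm(np.linalg.inv(A - lam * np.eye(2)), ord=2)

rng = np.random.default_rng(0)
for _ in range(200):
    lam = complex(rng.uniform(-5, 5), rng.uniform(-5, 5))
    # lam lies outside (B_delta(W(A^{-1})))^{-1} iff dist(1/lam, W(A^{-1})) >= delta
    if dist_to_W_inv(1 / lam) >= delta:
        assert resolvent_norm(lam) <= 1 / eps + 1e-9
```

No sampled point outside the enclosure belongs to \(\sigma _\varepsilon (A)\), exactly as the proposition asserts.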

Applying the last result to the shifted operator \(A-s\) and then taking the intersection over a suitable set of shifts, we obtain our first main result on an enclosure of the pseudospectrum:

Theorem 2.2

Consider a set \(S\subset \varrho (A)\) such that

$$\begin{aligned} M:=\sup _{s\in S}\Vert (A-s)^{-1}\Vert <\infty . \end{aligned}$$

Then for \(0<\varepsilon < \frac{1}{M}\) we get the inclusion

$$\begin{aligned} \sigma _\varepsilon (A)\subset \bigcap _{s\in S} \left[ \bigl (B_{\delta _s}(W((A-s)^{-1}))\bigr )^{-1}+s\right] \end{aligned}$$
(5)

where \(\delta _s=\frac{\Vert (A-s)^{-1}\Vert ^2\varepsilon }{1-\Vert (A-s)^{-1}\Vert \varepsilon }\).

Proof

For every \(s\in S\) we can apply Proposition 2.1 to the operator \(A-s\) and obtain

$$\begin{aligned} \sigma _\varepsilon (A)-s=\sigma _\varepsilon (A-s)\subset \bigl (B_{\delta _s}(W((A-s)^{-1}))\bigr )^{-1}. \end{aligned}$$

\(\square \)

The following simple example demonstrates that the \(\delta \)-neighborhood around the numerical range is actually needed to obtain an enclosure of the pseudospectrum.

Example 2.3

Let \(A={{\,\mathrm{diag}\,}}(-1+i,-1-i,1+i,1-i)\in {\mathbb {C}}^{4\times 4}\). Then \(A^{-1}=\frac{1}{2}{{\,\mathrm{diag}\,}}(-1-i,-1+i,1-i,1+i)\). Since \(A^{-1}\) is normal, its numerical range is simply the convex hull of its eigenvalues. Thus \(W(A^{-1})\) is the square with vertices \(\frac{1}{2}(\pm 1\pm i)\).

Then, using the fact that \(z\mapsto \frac{1}{z}\) is a Möbius transformation, we obtain that \(W(A^{-1})^{-1}\) consists of a closed curve (the image of the boundary of this square under inversion) together with its exterior.

We see that \(W(A^{-1})^{-1}\) touches the spectrum of A. This is of course clear: if an eigenvalue \(1/\lambda \) of \(A^{-1}\) is on the boundary of \(W(A^{-1})\), then the eigenvalue \(\lambda \) of A is on the boundary of \(W(A^{-1})^{-1}\). In particular in this example we do not have \(\sigma _\varepsilon (A)\subset W(A^{-1})^{-1}\) for any \(\varepsilon >0\) since \(\sigma _\varepsilon (A)\) contains discs with radius \(\varepsilon \) around the eigenvalues.
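
A quick computation confirms the touching behavior in this example: each \(1/\lambda \) with \(\lambda \in \sigma (A)\) is a vertex of the square \(W(A^{-1})\) and hence lies on its boundary. A sketch with the eigenvalues hard-coded:

```python
import numpy as np

# sigma(A) and the corresponding points 1/lambda in sigma(A^{-1})
eigs = np.array([-1 + 1j, -1 - 1j, 1 + 1j, 1 - 1j])
inv_eigs = 1 / eigs

# W(A^{-1}) = {z : max(|Re z|, |Im z|) <= 1/2}; its boundary is where
# the maximum equals 1/2. Every 1/lambda is a vertex of this square.
for z in inv_eigs:
    assert abs(max(abs(z.real), abs(z.imag)) - 0.5) < 1e-12
```

Since each \(1/\lambda \) lies on \(\partial W(A^{-1})\), each \(\lambda \) lies on \(\partial \bigl (W(A^{-1})^{-1}\bigr )\), so no \(\varepsilon \)-disc around \(\lambda \) fits inside the enclosure without the \(\delta \)-neighborhood.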

Proposition 2.4

For \(s\in \varrho (A)\), \(0<\varepsilon <\frac{1}{\Vert (A-s)^{-1}\Vert }\) and \(\delta _s = \frac{\Vert (A-s)^{-1}\Vert ^2\varepsilon }{1-\Vert (A-s)^{-1}\Vert \varepsilon }\) we have that

$$\begin{aligned} \overline{B_{\rho _s}(s)} \cap \left[ \bigl (B_{\delta _{{s}}}(W((A-{s})^{-1}))\bigr )^{-1}+{s}\right] = \varnothing \end{aligned}$$

where \(\rho _s = \frac{1}{w((A-s)^{-1})+\delta _s} \ge \frac{1}{\Vert (A-s)^{-1}\Vert +\delta _s}\).

Proof

Let \(s\in \varrho (A)\) and \(t\in \bigl (B_{\delta _s}(W((A-s)^{-1}))\bigr )^{-1}+s\). Then

$$\begin{aligned} \frac{1}{t-s}\in B_{\delta _s}(W((A-s)^{-1})) \end{aligned}$$

and we can estimate

$$\begin{aligned} \frac{1}{|t-s|}&< \delta _s + \sup _{\Vert x\Vert =1}|\langle (A-s)^{-1}x,x\rangle | =\delta _s+w((A-s)^{-1}) = \frac{1}{\rho _s}. \end{aligned}$$

This implies \(|t-s|>\rho _s\) and therefore \(t\notin \overline{B_{\rho _s}(s)}\). \(\square \)

The following theorem shows that the enclosure of the pseudospectrum in Theorem 2.2 becomes optimal if the shifts are chosen optimally.

Theorem 2.5

Let \(\varepsilon >0\) be such that \(\frac{1}{\varepsilon }\) is not the global minimum of the norm of the resolvent of A, and set \(S_\gamma :=\left\{ s\in \varrho (A)\,\big |\,\Vert (A-s)^{-1}\Vert =\frac{1}{\varepsilon +\gamma }\right\} \) for \(\gamma >0\). Let further \(\delta _s=\frac{\Vert (A-s)^{-1}\Vert ^2\varepsilon }{1-\Vert (A-s)^{-1}\Vert \varepsilon }\). Then:

  1. (a)
    $$\begin{aligned} \sigma _\varepsilon (A)&\subset \bigcap _{\gamma >0}\bigcap _{s\in S_\gamma }\left[ \left( B_{\delta _s}(W((A-s)^{-1}))\right) ^{-1}+s\right] \\&\subset \left\{ \lambda \in {\mathbb {C}}\,\big |\,\Vert (A-\lambda )^{-1}\Vert \ge \frac{1}{\varepsilon }\right\} = \overline{\sigma _\varepsilon (A)}, \end{aligned}$$
  2. (b)
    $$\begin{aligned} \overline{\sigma _\varepsilon (A)} = \bigcap _{\gamma >0}\bigcap _{s\in S_\gamma }\left[ \left( \overline{B_{\delta _s}(W((A-s)^{-1}))}\right) ^{-1}+s\right] , \end{aligned}$$
  3. (c)

    Assume additionally that A is normal with compact resolvent, and let \(L>0\). Then there exists an \(\varepsilon _0>0\) (depending on L) such that for all \(\varepsilon <\varepsilon _0\)

    $$\begin{aligned} \sigma _\varepsilon (A)\cap \overline{B_L(0)} = \bigcap _{\gamma >0}\bigcap _{s\in S_\gamma }\left[ \left( B_{\delta _s}(W((A-s)^{-1}))\right) ^{-1}+s\right] \cap \overline{B_L(0)}. \end{aligned}$$

Proof

  1. (a)

    The first inclusion follows from Theorem 2.2. In order to prove the second inclusion first note that

    $$\begin{aligned} S_\gamma \cap \bigcap _{{s}\in S_\gamma } \left[ \bigl (B_{\delta _{{s}}}(W((A-{s})^{-1}))\bigr )^{-1}+{s}\right] = \varnothing \end{aligned}$$

    for every \(\gamma >0\) by Proposition 2.4. Hence,

    $$\begin{aligned}&\bigcap _{\gamma>0}\bigcap _{s\in S_\gamma }\left[ \left( B_{\delta _s}(W((A-s)^{-1}))\right) ^{-1}+s\right] \subset \bigcap _{\gamma>0}{\mathbb {C}}{\setminus } S_\gamma = {\mathbb {C}}{\setminus }\bigcup _{\gamma >0} S_\gamma \\&\quad ={\mathbb {C}}{\setminus }\left\{ s\in \varrho (A)\,\big |\,\Vert (A-s)^{-1}\Vert <\frac{1}{\varepsilon }\right\} = \left\{ s\in {\mathbb {C}}\,\big |\,\Vert (A-s)^{-1}\Vert \ge \frac{1}{\varepsilon }\right\} . \end{aligned}$$

    From [4, Theorem 3.2] we have that the norm of the resolvent can be constant on an open subset of \(\varrho (A)\) only at its global minimum. Since by assumption \(\frac{1}{\varepsilon }\) is not this minimum, the set on the right-hand side equals the closure of the pseudospectrum.

  2. (b)

    Taking the closure in Theorem 2.2 yields

    $$\begin{aligned} \overline{\sigma _\varepsilon (A)} \subset \bigcap _{\gamma >0}\bigcap _{s\in S_\gamma }\left[ \left( \overline{B_{\delta _s}(W((A-s)^{-1}))}\right) ^{-1}+s\right] . \end{aligned}$$

    The other inclusion can be shown as in part (a) since we also have

    $$\begin{aligned} S_\gamma \cap \bigcap _{{s}\in S_\gamma } \left[ \bigl (\overline{B_{\delta _{{s}}}(W((A-{s})^{-1}))}\bigr )^{-1}+{s}\right] = \varnothing \end{aligned}$$

    for every \(\gamma >0\) as a consequence of Proposition 2.4.

  3. (c)

    By (a) it suffices to show that \(\lambda \in \varrho (A)\cap \overline{B_L(0)}\), \(\Vert (A-\lambda )^{-1}\Vert =\frac{1}{\varepsilon }\) implies \(\lambda \notin \left( B_{\delta _s}(W((A-s)^{-1}))\right) ^{-1}+s\) for some \(\gamma >0\) and \(s\in S_\gamma \). Let

    $$\begin{aligned} \varepsilon _1 =\frac{1}{2}\min \left\{ {{\,\mathrm{dist}\,}}(\mu ,\sigma (A){\setminus } \{\mu \})\,\big |\,\mu \in \sigma (A)\cap \overline{B_L(0)}\right\} . \end{aligned}$$

    Since A has compact resolvent, the minimum exists and is positive. With

    $$\begin{aligned} \varepsilon _0 = \frac{1}{2}\min \left\{ {{\,\mathrm{dist}\,}}(\mu ,\sigma (A){\setminus } \{\mu \})\,\big |\,\mu \in \sigma (A)\cap \overline{B_{L+3\varepsilon _1}(0)}\right\} \end{aligned}$$
    (6)

    we then have \(0<\varepsilon _0\le \varepsilon _1\). Let now \(\varepsilon <\varepsilon _0\) and \(\lambda \in \varrho (A)\cap \overline{B_L(0)}\) with \(\Vert (A-\lambda )^{-1}\Vert =\frac{1}{\varepsilon }\). Since A is normal, we get \({{\,\mathrm{dist}\,}}(\lambda ,\sigma (A))=\varepsilon \) and hence there exists a \(\mu \in \sigma (A)\) such that \(|\lambda -\mu |=\varepsilon \). In particular we have \(\mu \in B_{L+\varepsilon _1}(0)\). Choose now \(\gamma \in (0,\varepsilon _0-\varepsilon )\), i.e. \(\varepsilon<\varepsilon +\gamma <\varepsilon _0\), and set

    $$\begin{aligned} s = \mu +\frac{\varepsilon +\gamma }{\varepsilon }(\lambda -\mu ). \end{aligned}$$

    Then \(s\in B_{\varepsilon _0}(\mu )\) and

    $$\begin{aligned} {{\,\mathrm{dist}\,}}(s,\sigma (A)) = |\mu -s|=\varepsilon +\gamma . \end{aligned}$$

    Indeed if \(\mu '\in \sigma (A)\cap \overline{B_{L+3\varepsilon _1}(0)}\) with \(\mu \ne \mu '\), then \(B_{\varepsilon _0}(\mu )\cap B_{\varepsilon _0}(\mu ')=\varnothing \) and hence \(|\mu '-s|>\varepsilon _0\). If \(\mu '\in \sigma (A)\) and \(|\mu '|>L+3\varepsilon _1\), then \({{\,\mathrm{dist}\,}}(\mu ',B_{\varepsilon _0}(\mu ))>\varepsilon _1\) since \(B_{\varepsilon _0}(\mu )\subset B_{L+\varepsilon _1+\varepsilon _0}(0)\) and thus \(|\mu '-s|>\varepsilon _1\ge \varepsilon _0\). Due to \(|\mu -s|<\varepsilon _0\) we therefore obtain \({{\,\mathrm{dist}\,}}(s,\sigma (A))=|\mu -s|\) and because A is normal we can conclude

    $$\begin{aligned} \Vert (A-s)^{-1}\Vert = \frac{1}{\varepsilon +\gamma }, \end{aligned}$$

    i.e. \(s\in S_\gamma \). Since

    $$\begin{aligned} \frac{1}{\delta _s+\Vert (A-s)^{-1}\Vert } = \left( \frac{\Vert (A-s)^{-1}\Vert }{1-\Vert (A-s)^{-1}\Vert \varepsilon }\right) ^{-1} = \frac{1}{\Vert (A-s)^{-1}\Vert }-\varepsilon = \gamma , \end{aligned}$$
    (7)

    Proposition 2.4 implies

    $$\begin{aligned} \overline{B_\gamma (s)}\cap \left[ \left( B_{\delta _s}(W((A-s)^{-1}))\right) ^{-1}+s\right] =\varnothing . \end{aligned}$$

    By our choice of s we have \(\lambda \in \overline{B_\gamma (s)}\) and thus

    $$\begin{aligned} \lambda \notin \left( B_{\delta _s}(W((A-s)^{-1}))\right) ^{-1}+s. \end{aligned}$$

\(\square \)

Remark 2.6

  1. (a)

    The statements of part (a) and (b) of the previous theorem continue to hold under the weaker assumption \(\delta _s\ge \frac{\Vert (A-s)^{-1}\Vert ^2\varepsilon }{1-\Vert (A-s)^{-1}\Vert \varepsilon },\) i.e., equality is not needed there.

  2. (b)

    Note that \(\varepsilon _0\) in part (c) depends on L. For instance if we consider an operator A with

    $$\begin{aligned} \sigma (A) = \left\{ \mu _n=\sum _{k=1}^n\frac{1}{k}\,\big |\,n\in {\mathbb {N}}\right\} \end{aligned}$$

    we have \(\lim _{n\rightarrow \infty }|\mu _n-\mu _{n+1}|=0\), \(\lim _{n\rightarrow \infty }\mu _n=\infty \) and from (6) we obtain \(\varepsilon _0\rightarrow 0\) for \(L\rightarrow \infty \).

  3. (c)

    The cutoff with the large ball \(\overline{B_L(0)}\) in part (c) is not needed in the matrix case (i.e. \(\dim H<\infty \)), or if the eigenvalues of A satisfy a uniform gap condition. On the other hand, the equality in (c) will typically not hold for all \(\varepsilon >0\), i.e. the restriction \(\varepsilon <\varepsilon _0\) is needed, even in the matrix case. This is illustrated with the next (counter-)example.

Example 2.7

Let the normal matrix A be given by

$$\begin{aligned} A = \begin{pmatrix} 1 &{} 0\\ 0 &{} -1 \end{pmatrix} \end{aligned}$$

and consider \(\varepsilon =1\). Then \(\sigma _\varepsilon (A)=B_1(1)\cup B_1(-1)\) and in particular \(0\notin \sigma _\varepsilon (A)\). See Fig. 2 for the pseudospectrum with an enclosure. We will show now that \(0\in \left( B_{\delta _s}(W((A-s)^{-1}))\right) ^{-1}+s\) for all \(s\in S_\gamma \), \(\gamma >0\). Hence

$$\begin{aligned} \sigma _\varepsilon (A)\subsetneqq \bigcap _{\gamma >0}\bigcap _{s\in S_\gamma }\left[ \left( B_{\delta _s}(W((A-s)^{-1}))\right) ^{-1}+s\right] \end{aligned}$$

in this case. First observe that for \(s\in S_\gamma \), i.e. \(\Vert (A-s)^{-1}\Vert =\frac{1}{\varepsilon +\gamma }\), we have \(\frac{1}{\delta _s+\Vert (A-s)^{-1}\Vert } = \gamma \), see (7). This implies

$$\begin{aligned} \delta _s = \frac{1}{\gamma }-\Vert (A-s)^{-1}\Vert = \frac{1}{\gamma }-\frac{1}{\varepsilon +\gamma } = \frac{\varepsilon }{\gamma (\varepsilon +\gamma )} = \frac{1}{\gamma (1+\gamma )} \end{aligned}$$

since \(\varepsilon =1\). We also have

$$\begin{aligned} (A-s)^{-1} = \begin{pmatrix} (1-s)^{-1} &{} 0\\ 0 &{} (-1-s)^{-1} \end{pmatrix} \end{aligned}$$

and hence

$$\begin{aligned} W((A-s)^{-1}) = \left\{ r(1-s)^{-1}+(1-r)(-1-s)^{-1}\,\big |\,r\in [0,1]\right\} . \end{aligned}$$

Due to A being normal, \(S_\gamma \) is the boundary of the \((1+\gamma )\)-neighborhood of \(\{-1,1\}\). Thus by taking \(s_0\in S_\gamma \) with \({{\,\mathrm{Re}\,}}s_0 = 0\) we have

$$\begin{aligned} |s|^2\ge |s_0|^2 = (1+\gamma )^2-1^2 = \gamma ^2+2\gamma \end{aligned}$$

and hence \(|s|>\gamma \), see Fig. 1.

Fig. 1

1-pseudospectrum of A with \(S_\gamma \)

From

$$\begin{aligned} |(\pm 1-s)^{-1}-(-s^{-1})| = \left| \frac{1}{\pm 1-s}+\frac{1}{s}\right| = \frac{1}{|s||\pm 1-s|}\le \frac{1}{|s|(1+\gamma )} \end{aligned}$$

we get

$$\begin{aligned}&{{\,\mathrm{dist}\,}}(-s^{-1},W((A-s)^{-1}))\\&\quad \le \min _{r\in [0,1]}r\left| (1-s)^{-1}-(-s^{-1})\right| +(1-r)\left| (-1-s)^{-1}-(-s)^{-1}\right| \\&\quad \le \frac{1}{|s|(1+\gamma )}<\frac{1}{\gamma (1+\gamma )} = \delta _s. \end{aligned}$$

This shows that \(-s^{-1}\in B_{\delta _s}(W((A-s)^{-1}))\) and therefore

$$\begin{aligned} 0\in \left( B_{\delta _s}(W((A-s)^{-1}))\right) ^{-1}+s. \end{aligned}$$
Fig. 2

Exemplary enclosure of the 1-pseudospectrum of A from Example 2.7. The blue lines depict the boundaries of the sets \(\left( B_{\delta _s}(W((A-s)^{-1}))\right) ^{-1}+s\) for some s in an \(S_\gamma \) (color figure online)
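
The computations of Example 2.7 can be verified numerically for a concrete choice of \(\gamma \); here \(\gamma =\frac{1}{2}\) and the shift s directly above the eigenvalue 1 are illustrative choices:

```python
import numpy as np

A = np.diag([1.0, -1.0])
eps = 1.0
gamma = 0.5                                  # illustrative choice of gamma > 0

# s in S_gamma: boundary of the (1+gamma)-neighborhood of sigma(A) = {-1, 1};
# take the point directly above the eigenvalue 1.
s = 1 + (1 + gamma) * 1j
Tnorm = np.linalg.norm(np.linalg.inv(A - s * np.eye(2)), ord=2)
assert abs(Tnorm - 1 / (eps + gamma)) < 1e-12        # ||(A-s)^{-1}|| = 1/(eps+gamma)

delta_s = Tnorm**2 * eps / (1 - Tnorm * eps)
assert abs(delta_s - 1 / (gamma * (1 + gamma))) < 1e-12

# W((A-s)^{-1}) is the segment between the diagonal entries of (A-s)^{-1};
# project -1/s onto it to compute the distance.
a, b = 1 / (1 - s), 1 / (-1 - s)
target = -1 / s
t = np.clip(((target - b).conjugate() * (a - b)).real / abs(a - b) ** 2, 0, 1)
dist = abs(target - (b + t * (a - b)))
assert dist < delta_s    # -1/s lies in B_{delta_s}(W((A-s)^{-1})),
                         # i.e. 0 lies in (B_{delta_s}(W((A-s)^{-1})))^{-1} + s
```

The final assertion reproduces, for this particular s, the estimate \({{\,\mathrm{dist}\,}}(-s^{-1},W((A-s)^{-1}))<\delta _s\) established in the example.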

Remark 2.8

Note that under the assumption \(\sigma (A)\subset \overline{W(A)}\) (which holds for example if A has a compact resolvent) it is known (see e.g. [24] for the matrix case) that the pseudospectrum can also be enclosed by an \(\varepsilon \)-neighborhood of the numerical range, namely

$$\begin{aligned} \sigma _\varepsilon (A)\subset B_\varepsilon (W(A)). \end{aligned}$$
(8)

Indeed for \(\lambda \in \sigma _\varepsilon (A){\setminus }\sigma (A)\) we have \(\Vert (A-\lambda )^{-1}\Vert >\frac{1}{\varepsilon }\) and therefore

$$\begin{aligned} \Vert (A-\lambda )x\Vert <\varepsilon \qquad \text { for some } x\in {\mathcal {D}}(A),\ \Vert x\Vert =1. \end{aligned}$$

This implies

$$\begin{aligned} |\langle Ax,x\rangle -\lambda | = |\langle (A-\lambda )x,x\rangle | \le \Vert (A-\lambda )x\Vert <\varepsilon \end{aligned}$$

for this particular x, so that \({{\,\mathrm{dist}\,}}(\lambda ,W(A))<\varepsilon \). For \(\lambda \in \sigma (A)\) the inclusion follows directly from the assumption \(\sigma (A)\subset \overline{W(A)}\). See Sect. 7 for a comparison of the enclosure (8) with our method (5).
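
The convexity of the numerical range limits the enclosure (8): for the matrix \(A={{\,\mathrm{diag}\,}}(1,-1)\) of Example 2.7, the neighborhood \(B_\varepsilon (W(A))\) contains 0 for every \(\varepsilon >0\), although \(0\notin \sigma _\varepsilon (A)\) for \(\varepsilon <1\). A minimal numerical check (the value of \(\varepsilon \) is an illustrative choice):

```python
import numpy as np

A = np.diag([1.0, -1.0])
eps = 0.5

# W(A) is the real segment [-1, 1], so dist(0, W(A)) = 0 < eps:
# the point 0 lies in the numerical-range enclosure B_eps(W(A)).
# Yet 0 is not in sigma_eps(A), since ||(A - 0)^{-1}|| = 1 <= 1/eps.
res_norm_at_0 = np.linalg.norm(np.linalg.inv(A), ord=2)
assert res_norm_at_0 <= 1 / eps
```

The enclosure (8) therefore necessarily fills the convex gap between the two components of the pseudospectrum, which the intersection-based enclosure (5) can avoid.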

3 A Strong Approximation Scheme

In this section we consider finite-dimensional approximations \(A_n\) to the full operator A. Our aim is to prove a version of Theorem 2.2 which provides a pseudospectrum enclosure for the full operator A in terms of numerical ranges of the approximating matrices \(A_n\); this will allow us to compute the enclosure by numerical methods.

We suppose that \(0\in \varrho (A)\) and consider a sequence of approximations \(A_n\) of the operator A of the following form:

  1. (a)

    \(U_n\subset H\), \(n\in {\mathbb {N}}\), are finite-dimensional subspaces of the Hilbert space H.

  2. (b)

    \(P_n\in {\mathcal {L}}(H)\) are projections (not necessarily orthogonal) onto \(U_n\), i.e. \({\mathcal {R}}(P_n)=U_n\), such that

    $$\begin{aligned} \lim _{n\rightarrow \infty }P_nx=x\qquad \text {for all}\qquad x\in H. \end{aligned}$$
    (9)
  3. (c)

    \(A_n\in {\mathcal {L}}(U_n)\) are invertible such that

    $$\begin{aligned} \lim _{n\rightarrow \infty }A_n^{-1}P_nx= A^{-1}x \qquad \text {for all}\qquad x\in H. \end{aligned}$$
    (10)

In this case we say that the family \((P_n,A_n)_{n\in {\mathbb {N}}}\) approximates A strongly. Note that (9) implies that \(\bigcup _{n\in {\mathbb {N}}}U_n\) is dense in H and that \(\sup _{n\in {\mathbb {N}}}\Vert P_n\Vert <\infty \) by the uniform boundedness principle.
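
A hypothetical minimal example of a strong approximation scheme is the finite-section method for the diagonal operator \(A={{\,\mathrm{diag}\,}}(1,2,3,\dots )\) on \(\ell ^2\): take \(U_n\) spanned by the first n basis vectors, \(P_n\) the orthogonal truncation and \(A_n\) the \(n\times n\) finite section. The following sketch checks the convergence (10) on one vector, representing \(\ell ^2\) elements by long finite arrays (the truncation length N and the test vector are arbitrary choices):

```python
import numpy as np

N = 2000                       # truncation used to represent vectors in l^2
k = np.arange(1, N + 1)
diag = k.astype(float)         # A = diag(1, 2, 3, ...), so A^{-1} = diag(1, 1/2, ...)
x = 1.0 / k                    # a fixed x in l^2 (the tail beyond N is negligible)
Ainv_x = x / diag              # A^{-1} x

def approx(n):
    """A_n^{-1} P_n x, embedded back into the ambient space (finite section)."""
    y = np.zeros(N)
    y[:n] = x[:n] / diag[:n]
    return y

# The error ||A^{-1}x - A_n^{-1}P_n x|| decreases as n grows;
# note also ||A_n^{-1}|| = 1 uniformly in n, matching Lemma 3.1(b).
errs = [np.linalg.norm(Ainv_x - approx(n)) for n in (10, 100, 1000)]
assert errs[0] > errs[1] > errs[2]
```

Here the uniform bound \(\sup _n\Vert A_n^{-1}\Vert =1\) holds trivially, so Lemma 3.1 applies; for non-diagonal discretizations such a uniform resolvent bound is exactly the nontrivial hypothesis.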

Lemma 3.1

Let \(U_n\), \(P_n\) be such that (9) holds and let \(A_n\in {\mathcal {L}}(U_n)\) be invertible. Then the following assertions are equivalent:

  1. (a)

    \(\lim _{n\rightarrow \infty }A_n^{-1}P_n x= A^{-1}x\) for all \(x\in H\), i.e., (10) holds.

  2. (b)

    \(\sup _{n\in {\mathbb {N}}}\Vert A_n^{-1}\Vert _{{\mathcal {L}}(U_n)}<\infty \) and for all \(x\in {\mathcal {D}}(A)\) there exist \(x_n\in U_n\) such that

    $$\begin{aligned} \lim _{n\rightarrow \infty }x_n=x,\quad \lim _{n\rightarrow \infty }A_nx_n=Ax. \end{aligned}$$

Proof

\((a)\Rightarrow (b)\). The uniform boundedness principle yields

$$\begin{aligned} \sup _{n\in {\mathbb {N}}}\Vert A_n^{-1}P_n\Vert _{{\mathcal {L}}(H)}<\infty . \end{aligned}$$

Since \(\Vert A_n^{-1}u\Vert =\Vert A_n^{-1}P_nu\Vert \le \Vert A_n^{-1}P_n\Vert _{{\mathcal {L}}(H)}\Vert u\Vert \) for all \(u\in U_n\), this shows the first part. For the second, let \(x\in {\mathcal {D}}(A)\) and set \(y=Ax\) and \(x_n=A_n^{-1}P_ny\). Then \(x_n\rightarrow A^{-1}y=x\) and \(A_nx_n=P_ny\rightarrow y=Ax\) as \(n\rightarrow \infty \).

\((b)\Rightarrow (a)\). Let \(y\in H\). Set \(x=A^{-1}y\) and choose \(x_n\in U_n\) according to (b). Then

$$\begin{aligned} A_n^{-1}P_ny= A_n^{-1}P_nAx= A_n^{-1}(P_nAx-A_nx_n)+x_n. \end{aligned}$$

Since both \(P_nAx\rightarrow Ax\) and \(A_nx_n\rightarrow Ax\) as \(n\rightarrow \infty \) and \(\Vert A_n^{-1}\Vert \) is uniformly bounded, we obtain (a). \(\square \)

The following lemma shows that if A is approximated by \(A_n\) strongly, then \(A-\lambda \) is approximated by \(A_n-\lambda \) strongly too, provided \(\Vert (A_n-\lambda )^{-1}\Vert \) is uniformly bounded in n.

Lemma 3.2

Suppose that \((P_n,A_n)_{n\in {\mathbb {N}}}\) approximates A strongly. If \(\lambda \in \varrho (A)\) is such that \(\lambda \in \varrho (A_n)\) for all \(n\in {\mathbb {N}}\) and \(\sup _{n\in {\mathbb {N}}}\Vert (A_n-\lambda )^{-1}\Vert <\infty \), then

$$\begin{aligned} \lim _{n\rightarrow \infty }(A_n-\lambda )^{-1}P_nx=(A-\lambda )^{-1}x \qquad \text {for all}\qquad x\in H. \end{aligned}$$

Proof

This follows immediately from Lemma 3.1 since

$$\begin{aligned} \lim _{n\rightarrow \infty }A_nx_n=Ax \quad \Longleftrightarrow \quad \lim _{n\rightarrow \infty }(A_n-\lambda )x_n=(A-\lambda )x \end{aligned}$$

whenever \(\lim _{n\rightarrow \infty }x_n=x\). \(\square \)

Remark 3.3

In the literature there is a variety of notions describing the approximation of a linear operator. Two notions that are close to our definition of a strong approximation scheme are generalized strong resolvent convergence, considered in [2, 3, 26], and discrete-stable convergence, see [6]. There are however subtle differences between these two notions and our setting: First, we do not assume that \(P_n({\mathcal {D}}(A))\subset U_n\). Second, in Lemma 3.1(b) we do not have the convergence of \(A_nP_nx\) to Ax, which would be the case for discrete-stable convergence. Up to these differences, the results of Lemmas 3.1 and 3.2 are well known in the literature, see [2, Lemma 1.2.2, Theorem 1.2.9] and [6, Lemma 3.16].

We now prove a convergence result for the numerical range of the inverse operator under strong approximations.

Lemma 3.4

Suppose that \((P_n,A_n)_{n\in {\mathbb {N}}}\) approximates A strongly. Then

  1. (a)

    for every \(x\in H\), \(\Vert x\Vert =1\) there exists a sequence \(y_n\in U_n\), \(\Vert y_n\Vert =1\) such that

    $$\begin{aligned} \lim _{n\rightarrow \infty }\langle A_n^{-1}y_n,y_n\rangle =\langle A^{-1}x,x\rangle ; \end{aligned}$$
  2. (b)

    for all \(\delta >0\) there exists \(n_0\in {\mathbb {N}}\) such that

    $$\begin{aligned} W(A^{-1})\subset B_\delta \left( W(A_n^{-1})\right) , \qquad n\ge n_0. \end{aligned}$$

Proof

  1. (a)

    We set \(y_n=P_nx/\Vert P_nx\Vert \). Note that \(y_n\) is well defined for almost all n since \(\Vert P_nx\Vert \rightarrow \Vert x\Vert =1\). We get \(y_n\rightarrow x\) as \(n\rightarrow \infty \) and

    $$\begin{aligned}&|\langle A^{-1}x,x\rangle -\langle A_n^{-1}y_n,y_n\rangle |\\&\quad \le |\langle A^{-1}x-A_n^{-1}P_nx,x\rangle |+|\langle A_n^{-1}P_nx,x-y_n\rangle | +|\langle A_n^{-1}(P_nx-y_n),y_n\rangle |\\&\quad \le \Vert A^{-1}x-A_n^{-1}P_nx\Vert +\Vert A_n^{-1}\Vert \Vert P_nx\Vert \Vert x-y_n\Vert +\Vert A_n^{-1}\Vert \Vert P_nx-y_n\Vert , \end{aligned}$$

    which yields the assertion.

  2. (b)

    Since \(W(A^{-1})\) is bounded, it is precompact and hence there exist points \(z_1,\dots ,z_m\in W(A^{-1})\) such that

    $$\begin{aligned} W(A^{-1})\subset \bigcup _{j=1}^m B_{\delta /2}(z_j). \end{aligned}$$

    For every j we have \(z_j=\langle A^{-1}x_j,x_j\rangle \) with some \(x_j\in H\), \(\Vert x_j\Vert =1\), and by (a) there exists \(n_j\in {\mathbb {N}}\) such that for all \(n\ge n_j\) there is a \(y_j\in U_n\), \(\Vert y_j\Vert =1\) such that

    $$\begin{aligned} |\langle A^{-1}x_j,x_j\rangle -\langle A_n^{-1}y_j,y_j\rangle |<\frac{\delta }{2}. \end{aligned}$$

    Hence

    $$\begin{aligned} W(A^{-1})\subset \bigcup _{j=1}^mB_\delta \left( \langle A_n^{-1}y_j,y_j\rangle \right) \subset B_\delta \left( W(A_n^{-1})\right) \end{aligned}$$

    for all \(n\ge n_0=\max \{n_1,\dots ,n_m\}\).

\(\square \)

The previous lemma makes it easy to prove an approximation version of the basic enclosure result Proposition 2.1.

Proposition 3.5

Suppose that \((P_n,A_n)_{n\in {\mathbb {N}}}\) approximates A strongly. For \(0<\varepsilon < \frac{1}{\Vert A^{-1}\Vert }\) and \(\delta >\frac{\Vert A^{-1}\Vert ^2\varepsilon }{1-\Vert A^{-1}\Vert \varepsilon }\) there exists \(n_0\in {\mathbb {N}}\) such that

$$\begin{aligned} \sigma _\varepsilon (A)\subset \left( B_\delta (W(A_n^{-1}))\right) ^{-1} \quad \text {for all}\quad n\ge n_0. \end{aligned}$$

Proof

By Proposition 2.1 we have

$$\begin{aligned} \sigma _\varepsilon (A)\subset \left( B_{\delta '}(W(A^{-1}))\right) ^{-1} \end{aligned}$$

where \(\delta '=\frac{\Vert A^{-1}\Vert ^2\varepsilon }{1-\Vert A^{-1}\Vert \varepsilon }\). Since \(\delta -\delta '>0\), Lemma 3.4 yields an index \(n_0\in {\mathbb {N}}\) such that

$$\begin{aligned} W(A^{-1})\subset B_{\delta -\delta '}\left( W(A_n^{-1})\right) , \qquad n\ge n_0. \end{aligned}$$

Consequently \(B_{\delta '}(W(A^{-1}))\subset B_\delta (W(A_n^{-1}))\) for \(n\ge n_0\) and the proof is complete. \(\square \)
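At the matrix level, that is, taking \(A_n=A\) (the situation of the underlying Proposition 2.1), the enclosure can be checked numerically. In the sketch below the non-normal test matrix and the sampling grid are hypothetical choices; membership in \(B_\delta (W(A^{-1}))\) is tested via the support function of the convex set \(W(A^{-1})\), whose value in direction \(e^{it}\) is the largest eigenvalue of the Hermitian part of \(e^{-it}A^{-1}\).

```python
import numpy as np

rng = np.random.default_rng(0)
N = 3
# hypothetical non-normal test matrix with 0 in the resolvent set
A = np.diag([1.0, 2.0, 3.0]) + 0.3 * np.triu(rng.standard_normal((N, N)), 1)
Ainv = np.linalg.inv(A)
nrm = np.linalg.norm(Ainv, 2)

eps = 0.2 / nrm                              # eps < 1 / ||A^{-1}||
delta = nrm**2 * eps / (1 - nrm * eps)       # delta of Proposition 3.5

def support(M, t):
    """max Re(e^{-it} w) over the numerical range W(M)."""
    H = (np.exp(-1j * t) * M + (np.exp(-1j * t) * M).conj().T) / 2
    return np.linalg.eigvalsh(H)[-1]

thetas = np.linspace(0, 2 * np.pi, 240, endpoint=False)

# grid points z with smallest singular value of A - z below eps, i.e. points
# of the eps-pseudospectrum of A
grid = [complex(x, y) for x in np.linspace(0.5, 3.5, 40)
                      for y in np.linspace(-1.0, 1.0, 40)]
pseudo = [z for z in grid
          if np.linalg.svd(A - z * np.eye(N), compute_uv=False)[-1] < eps]

# every such z satisfies 1/z in B_delta(W(A^{-1})): check all supporting
# half-planes of this convex set
ok = all(np.real(np.exp(-1j * t) / z) <= support(Ainv, t) + delta
         for z in pseudo for t in thetas)
print(len(pseudo) > 0 and ok)
```

Sampling the support function in finitely many directions yields a necessary membership condition only, so the check can never fail for a point that actually lies in the enclosure.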

Combining the previous proposition with shifts of the operator, we get our second main result. It is analogous to Theorem 2.2, but provides an enclosure of the pseudospectrum of the infinite-dimensional operator in terms of numerical ranges of the approximating matrices.

Theorem 3.6

Suppose that \((P_n,A_n)_{n\in {\mathbb {N}}}\) approximates A strongly. Let the shifts \(s_1,\dots ,s_m\in \varrho (A)\) be such that

$$\begin{aligned} \sup _{n\in {\mathbb {N}}}\Vert (A_n-s_j)^{-1}\Vert <\infty \quad \text {for all}\quad j=1,\dots ,m. \end{aligned}$$

Let \(0<\varepsilon < \frac{1}{\max _{j=1,\dots ,m}\Vert (A-s_j)^{-1}\Vert }\) and \(\delta _j>\frac{\Vert (A-s_j)^{-1}\Vert ^2\varepsilon }{1-\Vert (A-s_j)^{-1}\Vert \varepsilon }\) for all j. Then there exists \(n_0\in {\mathbb {N}}\) such that

$$\begin{aligned} \sigma _\varepsilon (A)\subset \bigcap _{j=1}^m \left[ \left( B_{\delta _j}(W((A_n-s_j)^{-1}))\right) ^{-1}+s_j\right] \quad \text {for all}\quad n\ge n_0. \end{aligned}$$

Proof

In view of Lemma 3.2, Proposition 3.5 can be applied to every \(A-s_j\). Hence there exists \(n_j\in {\mathbb {N}}\) such that

$$\begin{aligned} \sigma _\varepsilon (A-s_j)\subset \left( B_{\delta _j}(W((A_n-s_j)^{-1}))\right) ^{-1}, \qquad n\ge n_j. \end{aligned}$$

Since \(\sigma _\varepsilon (A)=\sigma _\varepsilon (A-s_j)+s_j\), the claim follows with \(n_0=\max \{n_1,\dots ,n_m\}\). \(\square \)
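The shifted enclosure can likewise be sketched at matrix level (again with \(A_n=A\)); the shifts, the test matrix and the sample points below are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 4
# hypothetical non-normal (upper triangular) test matrix
A = np.diag([1.0, 2.0, 3.0, 4.0]) + 0.4 * np.triu(rng.standard_normal((N, N)), 1)

def support(M, t):
    """max Re(e^{-it} w) over the numerical range W(M)."""
    H = (np.exp(-1j * t) * M + (np.exp(-1j * t) * M).conj().T) / 2
    return np.linalg.eigvalsh(H)[-1]

thetas = np.linspace(0, 2 * np.pi, 240, endpoint=False)
shifts = [0.0, 5.0 + 2.0j]                 # hypothetical shifts in rho(A)
res = [np.linalg.inv(A - s * np.eye(N)) for s in shifts]
norms = [np.linalg.norm(R, 2) for R in res]
eps = 0.5 / max(norms)                     # eps < 1 / max_j ||(A - s_j)^{-1}||
deltas = [m**2 * eps / (1 - m * eps) for m in norms]

# pseudospectral sample points near the eigenvalues of A
zs = [z for lam in np.linalg.eigvals(A) for z in (lam, lam + 0.03, lam + 0.03j)
      if np.linalg.svd(A - z * np.eye(N), compute_uv=False)[-1] < eps]

# each z lies in (B_{delta_j}(W((A - s_j)^{-1})))^{-1} + s_j for every shift
ok = all(np.real(np.exp(-1j * t) / (z - s)) <= support(R, t) + d
         for z in zs for s, R, d in zip(shifts, res, deltas) for t in thetas)
print(len(zs) >= N and ok)
```

Intersecting over several shifts typically cuts away spurious regions that a single inverse numerical range would leave in the enclosure.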

4 A Uniform Approximation Scheme

In this section we impose additional assumptions on the approximations \(A_n\) of the infinite-dimensional operator A that will allow us to estimate the starting index \(n_0\) for which the pseudospectrum enclosures from Proposition 3.5 and Theorem 3.6 hold on bounded sets.

Throughout this section we assume that A has a compact resolvent, \(0\in \varrho (A)\) and that \({\mathcal {D}}(A)\subset W\subset H\) where the Hilbert space W is continuously and densely embedded into H. The closed graph theorem then implies \(A^{-1}\in {\mathcal {L}}(H,W)\). Further, we suppose that there is a sequence of approximations of the operator A in the following sense:

  1. (a)

    \(U_n\subset H\), \(n\in {\mathbb {N}}\), are finite-dimensional subspaces of H.

  2. (b)

    There exist projections \(P_n\in {\mathcal {L}}(H)\) onto \(U_n\), \(n\in {\mathbb {N}}\), not necessarily orthogonal, with \(\sup _{n\in {\mathbb {N}}} \Vert P_n\Vert <\infty \) and \(\Vert (I-P_n)|_W\Vert _{{\mathcal {L}}(W,H)}\rightarrow 0\) as \(n\rightarrow \infty \).

  3. (c)

    There exist invertible operators \(A_n\in {\mathcal {L}}(U_n)\), \(n\in {\mathbb {N}}\), such that \(\Vert A^{-1}-A^{-1}_nP_n\Vert \rightarrow 0\) as \(n\rightarrow \infty \).

We say that \((P_n,A_n)_{n\in {\mathbb {N}}}\) approximates A uniformly. We abbreviate \(\Vert (I-P_n)|_W\Vert _{{\mathcal {L}}(W,H)}\) as \(\Vert I-P_n\Vert _{{\mathcal {L}}(W,H)}\).

Remark 4.1

  1. (a)

    Property (c) already implies that A has compact resolvent: indeed \(A^{-1}\) is the uniform limit of the finite rank operators \(A_n^{-1}P_n\) and hence compact.

  2. (b)

If \((P_n,A_n)_{n\in {\mathbb {N}}}\) approximates A uniformly, then it also approximates A strongly. Note here that from (b) we first obtain \(P_nx\rightarrow x\) for \(x\in W\); this then extends to all \(x\in H\) by the density of W in H and the uniform boundedness of the \(P_n\). One particular consequence of the strong approximation is

    $$\begin{aligned} \sup _{n\in {\mathbb {N}}} \Vert A^{-1}_n\Vert <\infty , \end{aligned}$$

    see Lemma 3.1.

  3. (c)

Property (c) amounts to the convergence of \(A_n\) to A in generalized norm resolvent sense, see [2, 3, 26] for this notion. Note however that our setting has the additional assumption that \(P_n\rightarrow I\) uniformly in \({\mathcal {L}}(W,H)\), where \({\mathcal {D}}(A)\subset W\subset H\). For generalized norm resolvent convergence this is not the case, but it will be a crucial element in the following proofs.

In order to obtain, under a uniform approximation scheme, explicit estimates of the starting index \(n_0\) for which the pseudospectrum enclosures from Proposition 3.5 and Theorem 3.6 hold on bounded sets, we refine the results from Sect. 2 in terms of certain subsets of the full numerical range of \(A^{-1}\). For \(d>0\) we define

$$\begin{aligned} W(A^{-1},d)= \left\{ \langle A^{-1}x,x\rangle \,\big |\,\Vert x\Vert =1,\,x\in W,\, \Vert x\Vert _W\le d\right\} . \end{aligned}$$
(11)

Clearly \(W(A^{-1},d)\subset W(A^{-1})\). Moreover since W is dense in H we get

$$\begin{aligned} \overline{\bigcup _{d>0}W(A^{-1},d)}=\overline{W(A^{-1})}. \end{aligned}$$
(12)

Proposition 4.2

Let \(L>0\) and \(d=L\Vert A^{-1}\Vert _{{\mathcal {L}}(H,W)}\). Then

  1. (a)

    \(\sigma (A)\cap \overline{B_L(0)}\subset W(A^{-1},d)^{-1}\).

  2. (b)

    If in addition \(0<\varepsilon < \frac{1}{\Vert A^{-1}\Vert }\), \(L>\varepsilon \) and \(\delta =\frac{\Vert A^{-1}\Vert ^2\varepsilon }{1-\Vert A^{-1}\Vert \varepsilon }\) then

    $$\begin{aligned} \sigma _\varepsilon (A)\cap \overline{B_{L-\varepsilon }(0)} \subset \left( B_\delta (W(A^{-1},d))\right) ^{-1}. \end{aligned}$$

Proof

  1. (a)

    Let \(\lambda \in \sigma (A)\) with \(|\lambda |\le L\). Then there exists \(x\in {\mathcal {D}}(A)\) with \(\Vert x\Vert =1\) and \(Ax=\lambda x\). This implies

    $$\begin{aligned} \frac{1}{|\lambda |}\Vert x\Vert _W= \Vert A^{-1}x\Vert _W\le \Vert A^{-1}\Vert _{{\mathcal {L}}(H,W)} \Vert x\Vert =\Vert A^{-1}\Vert _{{\mathcal {L}}(H,W)} \end{aligned}$$

    and thus we obtain

    $$\begin{aligned} \Vert x\Vert _W \le \Vert A^{-1}\Vert _{{\mathcal {L}}(H,W)}|\lambda |\le L \Vert A^{-1}\Vert _{{\mathcal {L}}(H,W)}=d. \end{aligned}$$

    Consequently \(\lambda ^{-1}= \langle A^{-1}x,x\rangle \in W(A^{-1},d)\).

  2. (b)

    The proof is similar to the one of Proposition 2.1. We set

    $$\begin{aligned} U=\left( B_\delta (W(A^{-1},d))\right) ^{-1} \end{aligned}$$

    and first show

    $$\begin{aligned} \Vert (A-\lambda )x\Vert \ge \varepsilon \quad \text {for all}\quad \lambda \in \overline{B_{L-\varepsilon }(0)}{\setminus } U,\,x\in {\mathcal {D}}(A),\,\Vert x\Vert =1. \end{aligned}$$
    (13)

    Let \(\lambda \in \overline{B_{L-\varepsilon }(0)}{\setminus } U\), \(x\in {\mathcal {D}}(A)\), \(\Vert x\Vert =1\). We consider three cases. Suppose first that \(|\lambda |>\frac{1}{\delta +\Vert A^{-1}\Vert }\) and \(\Vert x\Vert _W\le d\). From \(\lambda \not \in U\) we obtain \({{\,\mathrm{dist}\,}}(\lambda ^{-1},W(A^{-1},d))\ge \delta \), which implies

    $$\begin{aligned} \delta \le |\lambda ^{-1}-\langle A^{-1}x,x\rangle | =|\langle (\lambda ^{-1}-A^{-1})x,x\rangle | \le \Vert (\lambda ^{-1}-A^{-1})x\Vert \end{aligned}$$

    and thus

    $$\begin{aligned} \Vert (A-\lambda )x\Vert \ge \frac{|\lambda |}{\Vert A^{-1}\Vert }\Vert (\lambda ^{-1}-A^{-1})x\Vert \ge \frac{\delta }{\Vert A^{-1}\Vert (\delta +\Vert A^{-1}\Vert )}=\varepsilon . \end{aligned}$$

    In the second case assume \(\Vert x\Vert _W\ge d\). Then

    $$\begin{aligned} d\le \Vert x\Vert _W \le \Vert A^{-1}\Vert _{{\mathcal {L}}(H,W)} \Vert Ax\Vert , \end{aligned}$$

    which in view of \(\lambda \in \overline{B_{L-\varepsilon }(0)}\) implies

    $$\begin{aligned} \Vert (A-\lambda )x\Vert \ge \Vert Ax\Vert -|\lambda | \ge \frac{d}{\Vert A^{-1}\Vert _{{\mathcal {L}}(H,W)} }-|\lambda |=L-|\lambda | \ge \varepsilon . \end{aligned}$$

    Finally if \(|\lambda |\le \frac{1}{\delta +\Vert A^{-1}\Vert }\), the same reasoning as in the proof of Proposition 2.1 yields once again that \(\Vert (A-\lambda )x\Vert \ge \varepsilon \), and therefore (13) is proved. Now, since A has a compact resolvent (13) implies that

    $$\begin{aligned} \lambda \in \overline{B_{L-\varepsilon }(0)}{\setminus } U \quad \Rightarrow \quad \lambda \in \varrho (A),\, \Vert (A-\lambda )^{-1}\Vert \le \frac{1}{\varepsilon }. \end{aligned}$$

    Consequently \(\sigma _\varepsilon (A)\cap \overline{B_{L-\varepsilon }(0)}\subset U\).

\(\square \)

From Proposition 4.2 we get again a shifted version:

Theorem 4.3

Let \(S\subset \varrho (A)\) be such that

$$\begin{aligned} M_0:=\sup _{s\in S}\Vert (A-s)^{-1}\Vert<\infty ,\qquad M_1:=\sup _{s\in S}\Vert (A-s)^{-1}\Vert _{{\mathcal {L}}(H,W)}<\infty . \end{aligned}$$

For \(0<\varepsilon < \frac{1}{M_0}\), \(L>\varepsilon \), \(d=LM_1\) and \(\delta _s=\frac{\Vert (A-s)^{-1}\Vert ^2\varepsilon }{1-\Vert (A-s)^{-1}\Vert \varepsilon }\) we get the inclusion

$$\begin{aligned} \sigma _\varepsilon (A)\cap \bigcap _{s\in S}\overline{B_{L-\varepsilon }(s)} \subset \bigcap _{s\in S}\left[ \left( B_{\delta _s}(W((A-s)^{-1},d))\right) ^{-1}+s\right] . \end{aligned}$$

Proof

Apply Proposition 4.2(b) to \(A-s\) for all \(s\in S\) and note that

$$\begin{aligned} \lambda \in \sigma _\varepsilon (A-s)\cap \overline{ B_{L-\varepsilon }(0)} \quad \Leftrightarrow \quad \lambda +s\in \sigma _\varepsilon (A)\cap \overline{B_{L-\varepsilon }(s)}. \end{aligned}$$

\(\square \)

Remark 4.4

By the continuity of the embedding \(W\hookrightarrow H\), the condition \(M_1<\infty \) already implies \(M_0<\infty \).

For a uniform approximation scheme, the numerical range of \(A^{-1}\) can now be approximated with explicit control on the starting index \(n_0\):

Lemma 4.5

Suppose that \((P_n,A_n)_{n\in {\mathbb {N}}}\) approximates A uniformly. Let

$$\begin{aligned} C_0=\sup _{n\in {\mathbb {N}}}\left( \Vert A^{-1}_n\Vert \Vert P_n\Vert + 6 \Vert A^{-1}_n\Vert \Vert P_n\Vert ^2\right) . \end{aligned}$$
(14)
  1. (a)

    If \(d>0\), \(0<\delta \le \frac{C_0}{2}\) and \(n_0\in {\mathbb {N}}\) are such that for every \(n\ge n_0\)

    $$\begin{aligned}\Vert A^{-1}- A^{-1}_n P_n\Vert + d C_0 \Vert I-P_n\Vert _{{\mathcal {L}}(W,H)}<\delta ,\end{aligned}$$

    then

    $$\begin{aligned} W(A^{-1},d)\subset B_\delta (W(A^{-1}_n)), \qquad n\ge n_0. \end{aligned}$$
  2. (b)

    If \(\delta >0\) and \(n_0\in {\mathbb {N}}\) are such that for every \(n\ge n_0\) we have \(\Vert A^{-1}- A^{-1}_n P_n\Vert <\delta \), then

    $$\begin{aligned} W(A^{-1}_n)\subset B_\delta (W(A^{-1})),\qquad n\ge n_0. \end{aligned}$$

Proof

Let \(x\in W\) with \(\Vert x\Vert =1\) and \(\Vert x\Vert _W\le d\). Then we obtain

$$\begin{aligned}&|\langle A^{-1}x,x\rangle - \langle A^{-1}_n P_nx,P_nx\rangle |\\&\quad \le |\langle A^{-1}x- A^{-1}_n P_nx,x\rangle | + |\langle A^{-1}_n P_nx,x-P_nx\rangle |\\&\quad \le \Vert A^{-1}- A^{-1}_n P_n\Vert \Vert x\Vert ^2 + \Vert A^{-1}_n\Vert \Vert P_n\Vert \Vert x\Vert \Vert I-P_n\Vert _{{\mathcal {L}}(W,H)} \Vert x \Vert _W\\&\quad \le \Vert A^{-1}- A^{-1}_n P_n\Vert + d \Vert A^{-1}_n\Vert \Vert P_n\Vert \Vert I-P_n\Vert _{{\mathcal {L}}(W,H)}, \end{aligned}$$

as well as

$$\begin{aligned} |1-\Vert P_nx\Vert | \le \Vert x-P_n x\Vert \le \Vert I-P_n\Vert _{{\mathcal {L}}(W,H)}\Vert x\Vert _W \le d\Vert I-P_n\Vert _{{\mathcal {L}}(W,H)}. \end{aligned}$$

Let \(n\ge n_0\). Then

$$\begin{aligned} |1-\Vert P_nx\Vert |\le d\Vert I-P_n\Vert _{{\mathcal {L}}(W,H)} <\frac{\delta }{C_0}\le \frac{1}{2} \end{aligned}$$

and hence \(\Vert P_nx\Vert \ge \frac{1}{2}\). Let \(x_n= \frac{P_nx}{\Vert P_nx\Vert }\). Then \(\Vert x_n\Vert =1\) and

$$\begin{aligned} \left| 1-\frac{1}{\Vert P_nx\Vert ^2}\right|&= \left| \frac{\Vert P_nx\Vert ^2-1}{\Vert P_nx\Vert ^2}\right| \\&= \frac{(\Vert P_nx\Vert +1)|\Vert P_nx\Vert -1|}{\Vert P_nx\Vert ^2}\\&= \left( \frac{1}{\Vert P_nx\Vert }+\frac{1}{\Vert P_nx\Vert ^2}\right) |1-\Vert P_nx\Vert |\\&\le 6 |1-\Vert P_nx\Vert |\\&\le 6d \Vert I-P_n\Vert _{{\mathcal {L}}(W,H)}. \end{aligned}$$

This implies

$$\begin{aligned}&|\langle A^{-1}_n P_nx,P_nx\rangle - \langle A^{-1}_nx_n,x_n\rangle |\\&\quad =\left| \langle A^{-1}_n P_nx,P_nx\rangle - \frac{\langle A^{-1}_nP_n x,P_n x\rangle }{\Vert P_nx\Vert ^2}\right| \\&\quad = \left| 1 - \frac{1}{\Vert P_nx\Vert ^2}\right| |\langle A^{-1}_n P_nx,P_nx\rangle |\\&\quad \le 6 d \Vert I-P_n\Vert _{{\mathcal {L}}(W,H)} \Vert A^{-1}_n\Vert \Vert P_n\Vert ^2, \end{aligned}$$

and thus for \(n\ge n_0\) we arrive at

$$\begin{aligned}&|\langle A^{-1}x,x\rangle -\langle A^{-1}_n x_n,x_n\rangle |\\&\quad \le \Vert A^{-1}- A^{-1}_n P_n\Vert + d \Vert I-P_n\Vert _{{\mathcal {L}}(W,H)}( \Vert A^{-1}_n\Vert \Vert P_n\Vert +6\Vert A_n^{-1}\Vert \Vert P_n\Vert ^2)\\&\quad \le \Vert A^{-1}- A^{-1}_n P_n\Vert + d C_0 \Vert I-P_n\Vert _{{\mathcal {L}}(W,H)}\\&\quad <\delta . \end{aligned}$$

This yields \(\langle A^{-1}x,x\rangle \in B_\delta (W(A^{-1} _n))\) if \(n\ge n_0\) and proves (a).

In order to show part (b), let \(x\in U_n\) with \(\Vert x\Vert =1\). As \(x=P_nx\) we have

$$\begin{aligned} |\langle A^{-1}_nx,x\rangle - \langle A^{-1}x,x\rangle |&\le \Vert A^{-1}_n x- A^{-1}x\Vert \Vert x\Vert \\&= \Vert A^{-1}_nP_n x- A^{-1}x\Vert \le \Vert A^{-1}- A^{-1}_n P_n\Vert . \end{aligned}$$

Thus \(\langle A^{-1}_nx,x\rangle \in B_\delta (W(A^{-1}))\) for \(n\ge n_0\). \(\square \)
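At matrix level, Lemma 4.5(b) can be checked directly. The sketch below uses a hypothetical upper-bidiagonal matrix as a finite-dimensional stand-in for A and its principal submatrices as the compressions \(A_n\); note that the proof of part (b) only uses \(\Vert A^{-1}-A_n^{-1}P_n\Vert <\delta \) and \(x=P_nx\) for \(x\in U_n\), both of which hold here. The inclusion \(W(A_n^{-1})\subset B_\delta (W(A^{-1}))\) is verified through support functions of the convex numerical ranges.

```python
import numpy as np

# hypothetical stand-in: a "large" upper-bidiagonal A, compressions A_n onto
# the span U_n of the first n coordinates (P_n the orthogonal projection)
N, n = 12, 6
A = np.diag(np.arange(1.0, N + 1)) + 0.2 * np.diag(np.ones(N - 1), 1)
An = A[:n, :n]                                   # matrix of P_n A |_{U_n}

Ainv = np.linalg.inv(A)
AninvPn = np.zeros((N, N))
AninvPn[:n, :n] = np.linalg.inv(An)              # A_n^{-1} P_n on all of H
delta = np.linalg.norm(Ainv - AninvPn, 2) + 1e-12

def support(M, t):
    """max Re(e^{-it} w) over the numerical range W(M)."""
    H = (np.exp(-1j * t) * M + (np.exp(-1j * t) * M).conj().T) / 2
    return np.linalg.eigvalsh(H)[-1]

# W(A_n^{-1}) subset of B_delta(W(A^{-1})): compare support functions
thetas = np.linspace(0, 2 * np.pi, 180, endpoint=False)
ok = all(support(np.linalg.inv(An), t) <= support(Ainv, t) + delta
         for t in thetas)
print(ok)
```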

Corollary 4.6

If \((P_n,A_n)_{n\in {\mathbb {N}}}\) approximates A uniformly, then

$$\begin{aligned} \overline{W(A^{-1})}=\{ \lambda \in {\mathbb {C}}\mid \exists (\lambda _n)_{n\in {\mathbb {N}}} \text { with } \lambda _n\in W(A_n^{-1}) \text { and }\lim _{n\rightarrow \infty } \lambda _n= \lambda \} \end{aligned}$$

or, equivalently,

$$\begin{aligned} \displaystyle \overline{W(A^{-1})}=\bigcap _{m\in {\mathbb {N}}} \overline{\bigcup _{n\ge m} W(A_n^{-1})}. \end{aligned}$$

Proof

We first show the inclusion “\(\supset \)”. Let \((\lambda _n)_{n\in {\mathbb {N}}}\) be a convergent sequence in \({\mathbb {C}}\) with \(\lambda _n\in W(A_n^{-1})\) and define \(\lambda =\lim _{n\rightarrow \infty } \lambda _n\). Let \(\delta >0\) be arbitrary. Lemma 4.5(b) implies that there exists \(n_0\in {\mathbb {N}}\) such that \(\lambda _n\in B_\delta (W(A^{-1}))\) for every \(n\ge n_0\). This implies \(\lambda \in B_\delta (W(A^{-1}))\) for every \(\delta >0\), and thus \(\lambda \in \overline{W(A^{-1})}\).

Conversely, let \(\lambda \in W(A^{-1},d)\) for some \(d>0\). Using Lemma 4.5(a), we can construct a sequence \((\lambda _n)_{n\in {\mathbb {N}}}\) in \({\mathbb {C}}\) with \(\lambda _n\in W(A_n^{-1})\) and \(\lambda =\lim _{n\rightarrow \infty } \lambda _n\). The statement now follows from (12). \(\square \)

The last result shows that \(\overline{W(A^{-1})}\) can be represented as the pointwise limit of the finite-dimensional numerical ranges \(W(A_n^{-1})\). Lemma 4.5 even yields a uniform approximation, but this is asymmetric, since one inclusion only holds for the restricted numerical range \(W(A^{-1},d)\). A more symmetric result is discussed in the next remark:

Remark 4.7

If \(U_n\subset W\) for some \(n\in {\mathbb {N}}\) then, due to the fact that the space \(U_n\) is finite-dimensional,

$$\begin{aligned} d_n:=\sup _{x\in U_n}\frac{\Vert x\Vert _W}{\Vert x\Vert }<\infty . \end{aligned}$$

Using the same reasoning as in the proof of Lemma 4.5(b), we then obtain

$$\begin{aligned} W(A^{-1}_n)\subset B_\delta (W(A^{-1}, d_n)) \end{aligned}$$

if \(\Vert A^{-1}- A^{-1}_n P_n\Vert <\delta \).

Note, however, that for finite element discretization schemes the condition \(U_n\subset W\) is usually not fulfilled. In our examples, for instance, \(U_n\) consists of piecewise linear finite elements while \(W\subset H^2(\Omega )\) is a second-order Sobolev space, and thus \(U_n\not \subset W\).

Under a uniform approximation scheme the pseudospectrum can be approximated as follows.

Proposition 4.8

Suppose that \((P_n,A_n)_{n\in {\mathbb {N}}}\) approximates A uniformly. Let

$$\begin{aligned} r>0,\quad 0<\varepsilon< \frac{1}{\Vert A^{-1}\Vert } \quad \text {and}\quad \frac{\Vert A^{-1}\Vert ^2\varepsilon }{1-\Vert A^{-1}\Vert \varepsilon }<\delta \le \frac{\Vert A^{-1}\Vert ^2\varepsilon }{1-\Vert A^{-1}\Vert \varepsilon } + \frac{7}{2}\Vert A^{-1}\Vert . \end{aligned}$$

If we choose \(n_0\in {\mathbb {N}}\) such that for every \(n\ge n_0\)

$$\begin{aligned} \Vert A^{-1}- A^{-1}_n P_n\Vert + (r+\varepsilon ) \Vert A^{-1}\Vert _{{\mathcal {L}}(H,W)} C_0 \Vert I-P_n\Vert _{{\mathcal {L}}(W,H)} <\delta - \frac{\Vert A^{-1}\Vert ^2\varepsilon }{1-\Vert A^{-1}\Vert \varepsilon }, \end{aligned}$$

where \(C_0\) is defined in (14), then we obtain

$$\begin{aligned} \sigma _\varepsilon (A)\cap \overline{B_{r}(0)} \subset \left( B_\delta (W(A^{-1}_n))\right) ^{-1} \quad \text {for all}\quad n\ge n_0. \end{aligned}$$

Proof

Let \(\delta '= \frac{\Vert A^{-1}\Vert ^2\varepsilon }{1-\Vert A^{-1}\Vert \varepsilon }\), \(L=r+\varepsilon \) and \(d=L \Vert A^{-1}\Vert _{{\mathcal {L}}(H,W)}\). Proposition 4.2 implies

$$\begin{aligned} \sigma _\varepsilon (A)\cap \overline{B_{r}(0)} \subset (B_{\delta '}(W(A^{-1},d)))^{-1}. \end{aligned}$$

Next note that

$$\begin{aligned} \delta -\delta '&\le \frac{7}{2}\Vert A^{-1}\Vert =\lim _{n\rightarrow \infty }\frac{7}{2}\Vert A_n^{-1}P_n\Vert \\&\le \frac{1}{2}\limsup _{n\rightarrow \infty }\left( \Vert A_n^{-1}\Vert \Vert P_n\Vert +6\Vert A_n^{-1}\Vert \Vert P_n\Vert ^2\right) \le \frac{C_0}{2}, \end{aligned}$$

because \(\Vert P_n\Vert \ge 1\) for the nonzero projections \(P_n\). We can therefore apply Lemma 4.5(a) with \(\delta \) replaced by \(\delta -\delta '\) and \(n_0\) chosen as stated above and obtain

$$\begin{aligned} W(A^{-1},d)\subset B_{\delta -\delta '}(W(A_n^{-1})) \quad \text {for}\quad n\ge n_0 \end{aligned}$$

and hence the assertion. \(\square \)

5 Finite Element Discretization of Elliptic Partial Differential Operators

As an example for a uniform approximation scheme defined in Sect. 4 we now consider finite element discretizations. We use the standard textbook approach via form methods, which can be found e.g. in [1, 21].

Let V and H be Hilbert spaces with \(V\subset H\) densely and continuously embedded. In particular there is a constant \(c>0\) such that

$$\begin{aligned} \Vert x\Vert \le c\Vert x\Vert _V,\qquad x\in V. \end{aligned}$$
(15)

Moreover, we consider a bounded and coercive sesquilinear form \(a:V\times V\rightarrow {\mathbb {C}}\), that is, there exist constants \(M, \gamma >0\) such that

$$\begin{aligned} |a(x,y)|\le M \Vert x\Vert _V \Vert y\Vert _V \quad \text{ and }\quad {{\,\mathrm{Re}\,}}a(x,x)\ge \gamma \Vert x\Vert ^2_V, \quad x,y\in V. \end{aligned}$$
(16)

Let \(A:{\mathcal {D}}(A)\subset H\rightarrow H\) be the operator associated with a, which is given by

$$\begin{aligned} {\mathcal {D}}(A)&= \{ x\in V\mid \exists c_x>0: |a(x,y)|\le c_x\Vert y\Vert \text{ for } y\in V\},\\ a(x,y)&= \langle Ax,y\rangle , \quad x\in {\mathcal {D}}(A),y\in V. \end{aligned}$$

Then A is a densely defined, closed operator with \(0\in \varrho (A)\) and \(\Vert A^{-1}\Vert \le \frac{c^2}{\gamma }\), where \(c>0\) is the constant from (15).

Let \((U_n)_{n\in {\mathbb {N}}}\) be a sequence of finite-dimensional subspaces of V which are nested, that is \(U_n\subset U_{n+1}\). We denote by \(a_n=a|_{U_n}\) the restriction of a from V to \(U_n\). The form \(a_n\) is again bounded and coercive with the same constants M and \(\gamma \). Let \(A_n\in {\mathcal {L}}(U_n)\) be the operator associated with \(a_n\), i.e.

$$\begin{aligned} a(x,y) =\langle A_n x,y\rangle , \qquad x, y \in U_n. \end{aligned}$$

Then again \(0\in \varrho (A_n)\) and \(\Vert A^{-1}_n\Vert \le \frac{c^2}{\gamma }\). Let \(P_n\in {\mathcal {L}}(H)\) be the orthogonal projection onto \(U_n\). Thus \(\Vert P_n\Vert =1\) and \(A_n=P_n A_{n+1}|_{U_n}\), that is, \(A_n\) is a compression of \(A_{n+1}\).

To obtain a uniform approximation scheme, we now consider an additional Hilbert space W which is densely and continuously embedded into H such that \({\mathcal {D}}(A)\subset W\subset V\). We assume that there exists a sequence of operators \(Q_n\in {\mathcal {L}}(W,V)\) with \({\mathcal {R}}(Q_n)\subset U_n\) and

$$\begin{aligned} \lim _{n\rightarrow \infty }\Vert I-Q_n\Vert _{{\mathcal {L}}(W,V)}=0. \end{aligned}$$
(17)

Lemma 5.1

For all \(n\in {\mathbb {N}}\) the estimates

$$\begin{aligned} \Vert I-P_n\Vert _{{\mathcal {L}}(W,H)}&\le c \Vert I-Q_n\Vert _{{\mathcal {L}}(W,V)}, \\ \Vert A^{-1}- A^{-1}_n P_n\Vert&\le \frac{cM}{\gamma } \Vert A^{-1}\Vert _{{\mathcal {L}}(H,W)} \Vert I-Q_n\Vert _{{\mathcal {L}}(W,V)} \end{aligned}$$

hold. In particular, the family \((P_n,A_n)_{n\in {\mathbb {N}}}\) approximates A uniformly.

Proof

For \(w\in W\) we calculate

$$\begin{aligned} \Vert w-P_nw\Vert&= \inf _{u\in U_n}\Vert w-u\Vert \le \Vert w-Q_nw\Vert \le c\Vert w-Q_nw\Vert _V\\&\le c \Vert I-Q_n\Vert _{{\mathcal {L}}(W,V)} \Vert w\Vert _W, \end{aligned}$$

which shows the first assertion. Moreover, for \(f\in H\) we set \(x=A^{-1}f\) and \(x_n=A_n^{-1}P_nf\). Then we obtain

$$\begin{aligned} a(x,y)&= \langle Ax,y\rangle = \langle f,y\rangle ,\quad y\in V,\\ a(x_n,u)&=\langle A_nx_n,u\rangle = \langle P_nf,u\rangle = \langle f,u\rangle ,\quad u\in U_n. \end{aligned}$$

Using Céa's lemma [21, Theorem VII.5.A], we find

$$\begin{aligned} \Vert A^{-1}f-A_n^{-1}P_nf\Vert&=\Vert x-x_n\Vert \le c\Vert x-x_n\Vert _V\le \frac{cM}{\gamma } \inf _{u\in U_n}\Vert x-u\Vert _V\\&\le \frac{cM}{\gamma } \Vert x-Q_n x\Vert _V \le \frac{cM}{\gamma } \Vert I-Q_n\Vert _{{\mathcal {L}}(W,V)}\Vert x\Vert _W \\&\le \frac{cM}{\gamma } \Vert I-Q_n\Vert _{{\mathcal {L}}(W,V)} \Vert A^{-1}\Vert _{{\mathcal {L}}(H,W)}\Vert f\Vert , \end{aligned}$$

which implies the second assertion. \(\square \)

Theorem 5.2

Let A be the operator associated with the coercive form a and let \(A_n\), \(Q_n\) be as above. Let

$$\begin{aligned} r>0,\quad 0<\varepsilon< \frac{1}{\Vert A^{-1}\Vert } \quad \text {and}\quad \frac{\Vert A^{-1}\Vert ^2\varepsilon }{1-\Vert A^{-1}\Vert \varepsilon }<\delta \le \frac{\Vert A^{-1}\Vert ^2\varepsilon }{1-\Vert A^{-1}\Vert \varepsilon } + \frac{7}{2}\Vert A^{-1}\Vert . \end{aligned}$$

If \(n_0\in {\mathbb {N}}\) is such that for every \(n\ge n_0\)

$$\begin{aligned} \Vert I-Q_n\Vert _{{\mathcal {L}}(W,V)}< \frac{\delta -\frac{\Vert A^{-1}\Vert ^2\varepsilon }{1-\Vert A^{-1}\Vert \varepsilon }}{c\Vert A^{-1}\Vert _{{\mathcal {L}}(H,W)}\left( \frac{M}{\gamma } + (r+\varepsilon )\frac{7c^2}{\gamma }\right) }, \end{aligned}$$

then

$$\begin{aligned} \sigma _\varepsilon (A)\cap \overline{B_{r}(0)} \subset \left( B_\delta (W(A^{-1}_n))\right) ^{-1} \quad \text {for all}\quad n\ge n_0. \end{aligned}$$

Proof

We check that the conditions of Proposition 4.8 are satisfied: Using Lemma 5.1, we estimate for \(n\ge n_0\) and with \(C_0\) from (14),

$$\begin{aligned}&\Vert A^{-1}-A_n^{-1}P_n\Vert +(r+\varepsilon )\Vert A^{-1}\Vert _{{\mathcal {L}}(H,W)}C_0\Vert I-P_n\Vert _{{\mathcal {L}}(W,H)}\\&\quad \le c \Vert A^{-1}\Vert _{{\mathcal {L}}(H,W)}\Vert I-Q_n\Vert _{{\mathcal {L}}(W,V)} \left( \frac{M}{\gamma } + (r+\varepsilon ) \frac{7c^2}{\gamma } \right) <\delta -\frac{\Vert A^{-1}\Vert ^2\varepsilon }{1-\Vert A^{-1}\Vert \varepsilon }. \end{aligned}$$

\(\square \)

Example 5.3

Let \(\Omega \subset {\mathbb {R}}^2\) be a bounded, open, convex domain with polygonal boundary \(\Gamma \) and let \(\Gamma _D\subset \Gamma \) be a union of sides of \(\Gamma \). Let

$$\begin{aligned} V=H_0^1(\Omega ), \end{aligned}$$

equipped with the \(H^1\)-norm. On V we consider the sesquilinear form

$$\begin{aligned} a(u,v)=\int _\Omega \left( \sum _{i,j=1}^2 a_{ij} u_{x_i}{\overline{v}}_{x_j}+ \sum _{i=1}^2 b_{i} u_{x_i}{\overline{v}}+cu{\overline{v}}\right) dx, \end{aligned}$$
(18)

where \(a_{ij}\in C^{0,1}({\overline{\Omega }})\) and \(b_i,c\in L^\infty (\Omega )\). We suppose that a is coercive and uniformly elliptic. Let \(\{\mathcal {T}_n\}_{n\in {\mathbb {N}}}\) be a family of nested, admissible and quasi-uniform triangulations of \(\Omega \) satisfying \(\sup _{T\in \mathcal {T}_n} \mathrm{diam}(T)\le \frac{1}{n}\). Let

$$\begin{aligned} W= H^2(\Omega )\cap H_0^1(\Omega ), \end{aligned}$$

equipped with the \(H^2\)-norm, and

$$\begin{aligned} U_n=\left\{ u\in C^0(\overline{\Omega })\,\big |\, u|_T\in {\mathbb {P}}_1(T), T\in \mathcal {T}_n, u|_{\Gamma }=0\right\} ,\quad n\in {\mathbb {N}}. \end{aligned}$$

Here \({\mathbb {P}}_1(T)\) denotes the set of polynomials of degree 1 on the triangle T. We get \(U_n\subset V\). Moreover, the operator A associated with a is given by

$$\begin{aligned} Au&=-\sum _{i,j=1}^2 \partial _{x_j}(a_{ij} u_{x_i}) + \sum _{i=1}^2 b_{i} u_{x_i}+cu,\\ {\mathcal {D}}(A)&=W. \end{aligned}$$

For the proof of \({\mathcal {D}}(A)=W\) we refer to [10, Theorem 3.2.1.2 and §2.4.2].

By the Sobolev embedding theorem we have \(H^2(\Omega )\hookrightarrow C^0(\overline{\Omega })\). For \(u\in W\) we define \(Q_nu\) as the unique element of \(U_n\) satisfying \((Q_nu)(x)=u(x)\) for every vertex x of the triangulation \(\mathcal {T}_n\). Then \(Q_n\in {\mathcal {L}}(W,V)\) with \({\mathcal {R}}(Q_n)\subset U_n\). Moreover, [1, Theorem 9.27] implies that there is a constant \(K>0\) such that

$$\begin{aligned} \Vert I-Q_n\Vert _{{\mathcal {L}}(W,V)}\le \frac{K}{n},\qquad n\in {\mathbb {N}}. \end{aligned}$$

We conclude that Theorem 5.2 can be applied in this example with \(n_0\in {\mathbb {N}}\) chosen such that

$$\begin{aligned} n_0>\frac{Kc\Vert A^{-1}\Vert _{{\mathcal {L}}(H,W)} \left( \frac{M}{\gamma } + (r+\varepsilon )\frac{7c^2}{\gamma }\right) }{\delta -\frac{\Vert A^{-1}\Vert ^2\varepsilon }{1-\Vert A^{-1}\Vert \varepsilon }}. \end{aligned}$$

Note that in Example 5.3 we can also consider \(\Omega \) to be an open interval in \({\mathbb {R}}\). All results continue to hold in an analogous way.

6 Discretization of a Structured Block Operator Matrix

In this section we investigate discretizations of a certain kind of block operator matrices. We consider block matrices of the form

$$\begin{aligned} {\mathcal {A}}=\begin{pmatrix}A&{}\quad B\\ B^*&{}\quad D\end{pmatrix} \end{aligned}$$

where A is a closed, densely defined operator \(A:{\mathcal {D}}(A)\subset H\rightarrow H\) on the Hilbert space H, and \(B,D\in {\mathcal {L}}(H)\). Then the block matrix \({\mathcal {A}}\) is a closed, densely defined operator on the product space \(H\times H\) with domain \({\mathcal {D}}({\mathcal {A}})={\mathcal {D}}(A)\times H\). Additionally we assume that \(0\in \varrho (A)\), \(0\in \varrho (D)\) and that both A and \(-D\) are uniformly accretive, i.e., there exist constants \(\gamma _A,\gamma _D>0\) such that

$$\begin{aligned} {{\,\mathrm{Re}\,}}\langle Ax,x\rangle\ge & {} \gamma _A\Vert x\Vert ^2, \qquad x\in {\mathcal {D}}(A), \end{aligned}$$
(19)
$$\begin{aligned} {{\,\mathrm{Re}\,}}\langle Dx,x\rangle\le & {} -\gamma _D\Vert x\Vert ^2, \qquad x\in H. \end{aligned}$$
(20)

In the next lemma we show that under the above assumptions there is a gap in the spectrum of \({\mathcal {A}}\) along the imaginary axis, and we also prove an estimate for the norm of the resolvent. Similar results were obtained in [16, 17] under the additional assumption that A is sectorial and, in [17], without the condition that B and D are bounded. However, no corresponding resolvent estimates were shown. We remark that the boundedness of D is not essential in Lemma 6.1 but will be used thereafter.

Lemma 6.1

We have \(\left\{ \lambda \in {\mathbb {C}}\,\big |\,-\gamma _D<{{\,\mathrm{Re}\,}}\lambda <\gamma _A\right\} \subset \varrho ({\mathcal {A}})\) and

$$\begin{aligned} \Vert ({\mathcal {A}}-\lambda )^{-1}\Vert \le \frac{1}{\min \{\gamma _A-{{\,\mathrm{Re}\,}}\lambda , \gamma _D+{{\,\mathrm{Re}\,}}\lambda \}},\quad -\gamma _D<{{\,\mathrm{Re}\,}}\lambda <\gamma _A. \end{aligned}$$

Proof

Consider the block operator matrix

$$\begin{aligned} J=\begin{pmatrix}I&{}\quad 0\\ 0&{}\quad -\,I\end{pmatrix}. \end{aligned}$$

A simple calculation shows that for \(\lambda \in U:=\left\{ \lambda \in {\mathbb {C}}\,\big |\,-\gamma _D<{{\,\mathrm{Re}\,}}\lambda <\gamma _A\right\} \) and \(x\in {\mathcal {D}}(A)\), \(y\in H\),

$$\begin{aligned}&{{\,\mathrm{Re}\,}}\left\langle J({\mathcal {A}}-\lambda )\begin{pmatrix}x\\ y\end{pmatrix},\begin{pmatrix}x\\ y\end{pmatrix}\right\rangle \\&\quad ={{\,\mathrm{Re}\,}}\bigl (\langle (A-\lambda )x,x\rangle +\langle By,x\rangle -\langle B^*x,y\rangle -\langle (D-\lambda )y,y\rangle \bigr )\\&\quad ={{\,\mathrm{Re}\,}}\langle (A-\lambda )x,x\rangle -{{\,\mathrm{Re}\,}}\langle (D-\lambda )y,y\rangle \\&\quad \ge (\gamma _A-{{\,\mathrm{Re}\,}}\lambda )\Vert x\Vert ^2+(\gamma _D+{{\,\mathrm{Re}\,}}\lambda )\Vert y\Vert ^2\\&\quad \ge c_\lambda \left\| \begin{pmatrix}x\\ y\end{pmatrix}\right\| ^2, \end{aligned}$$

where \(c_\lambda =\min \{\gamma _A-{{\,\mathrm{Re}\,}}\lambda ,\gamma _D+{{\,\mathrm{Re}\,}}\lambda \}\). It follows that

$$\begin{aligned} \Vert J({\mathcal {A}}-\lambda )v\Vert \Vert v\Vert \ge |\langle J({\mathcal {A}}-\lambda )v,v\rangle | \ge c_\lambda \Vert v\Vert ^2, \qquad v\in {\mathcal {D}}({\mathcal {A}}), \end{aligned}$$

and therefore, since \(\Vert Jw\Vert =\Vert w\Vert \) for all \(w\in H\times H\),

$$\begin{aligned} \Vert ({\mathcal {A}}-\lambda )v\Vert \ge c_\lambda \Vert v\Vert ,\qquad v\in {\mathcal {D}}({\mathcal {A}}). \end{aligned}$$
(21)

In particular \(\lambda \not \in \sigma _{\mathrm {app}}({\mathcal {A}})\), i.e., \(U\cap \sigma _{\mathrm {app}}({\mathcal {A}})=\varnothing \). The adjoint of \({\mathcal {A}}\) is the block operator matrix

$$\begin{aligned} {\mathcal {A}}^*=\begin{pmatrix}A^*&{}\quad B\\ B^*&{}\quad D^*\end{pmatrix}, \end{aligned}$$

which also satisfies the assumptions of this lemma. Indeed, (20) obviously also holds for \(D^*\). Moreover, the uniform accretivity (19) of A together with \(0\in \varrho (A)\) implies that \(A-\gamma _A\) is m-accretive, see [13, §V.3.10]. This in turn yields that \(A^*-\gamma _A\) is m-accretive too and hence

$$\begin{aligned} {{\,\mathrm{Re}\,}}\langle A^*x,x\rangle \ge \gamma _A\Vert x\Vert ^2,\qquad x\in {\mathcal {D}}(A^*). \end{aligned}$$

It follows that (21) also holds for \({\mathcal {A}}^*\). In particular \(\ker {\mathcal {A}}^*=\{0\}\) or, equivalently, \({\mathcal {R}}({\mathcal {A}})\subset H\times H\) is dense. On the other hand, (21) implies that \(\ker {\mathcal {A}}=\{0\}\) and that \({\mathcal {R}}({\mathcal {A}})\) is closed. Consequently \({\mathcal {R}}({\mathcal {A}})=H\times H\) and therefore \(0\in \varrho ({\mathcal {A}})\). Using \(\partial \sigma ({\mathcal {A}})\subset \sigma _{\mathrm {app}}({\mathcal {A}})\) and the connectedness of the set U, we obtain \(U\subset \varrho ({\mathcal {A}})\). Now (21) implies \(\Vert ({\mathcal {A}}-\lambda )^{-1}\Vert \le 1/c_\lambda \) for all \(\lambda \in U\). \(\square \)
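The resolvent bound of Lemma 6.1 is easy to check numerically in finite dimensions, where the proof applies verbatim. The following sketch (the matrices are our own illustrative choices, not taken from the paper) builds a random block matrix whose diagonal blocks satisfy (19) and (20) and verifies \(\Vert ({\mathcal {A}}-\lambda )^{-1}\Vert \le 1/c_\lambda \) via the smallest singular value; note that the bound holds for an arbitrary bounded coupling B, since the coupling terms cancel in the J-form:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
gamma_A, gamma_D = 2.0, 3.0

# A uniformly accretive with constant gamma_A, cf. (19): Hermitian part
# >= gamma_A * I plus an arbitrary skew-Hermitian part (which does not
# affect Re<Ax,x>).
H = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = gamma_A * np.eye(n) + H @ H.conj().T + (H - H.conj().T)
# -D uniformly accretive with constant gamma_D, cf. (20).
K = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
D = -gamma_D * np.eye(n) - K @ K.conj().T + (K - K.conj().T)
B = rng.standard_normal((n, n))          # arbitrary bounded coupling

calA = np.block([[A, B], [B.conj().T, D]])

# For -gamma_D < Re(lam) < gamma_A, Lemma 6.1 gives
# ||(calA - lam)^{-1}|| <= 1/c_lam, i.e. sigma_min(calA - lam) >= c_lam.
for lam in [0.0 + 0j, 1.0 + 5j, -2.5 - 3j]:
    c = min(gamma_A - lam.real, gamma_D + lam.real)
    s_min = np.linalg.svd(calA - lam * np.eye(2 * n), compute_uv=False)[-1]
    assert s_min >= c - 1e-10
```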

We consider approximations \({\mathcal {A}}_n\) of \({\mathcal {A}}\) of the form

$$\begin{aligned} {\mathcal {A}}_n=\begin{pmatrix}A_n&{}\quad B_n\\ B_n^*&{}\quad D_n\end{pmatrix} \end{aligned}$$

where

  (a)

    \((P_n,A_n)_{n\in {\mathbb {N}}}\) is a family which approximates A strongly in the sense of Sect. 3;

  (b)

    all projections \(P_n\) are orthogonal and all \(A_n\) are uniformly accretive with the same constant \(\gamma _A\) as in (19);

  (c)

    \(B_n=P_nB|_{U_n}\), \(D_n=P_nD|_{U_n}\) where \(U_n={\mathcal {R}}(P_n)\).

Lemma 6.2

  (a)

    \(\left\{ \lambda \in {\mathbb {C}}\,\big |\,-\gamma _D<{{\,\mathrm{Re}\,}}\lambda <\gamma _A\right\} \subset \varrho ({\mathcal {A}}_n)\) and

    $$\begin{aligned} \Vert ({\mathcal {A}}_n-\lambda )^{-1}\Vert \le \frac{1}{\min \{\gamma _A-{{\,\mathrm{Re}\,}}\lambda , \gamma _D+{{\,\mathrm{Re}\,}}\lambda \}},\quad -\gamma _D<{{\,\mathrm{Re}\,}}\lambda <\gamma _A. \end{aligned}$$
  (b)

    \(({\mathcal {P}}_n,{\mathcal {A}}_n)_{n\in {\mathbb {N}}}\) approximates \({\mathcal {A}}\) strongly where \({\mathcal {P}}_n={{\,\mathrm{diag}\,}}(P_n,P_n)\).

Proof

  (a)

    From

    $$\begin{aligned} \langle D_nx,x\rangle =\langle P_nDx,x\rangle =\langle Dx,x\rangle ,\quad x\in U_n, \end{aligned}$$

    it follows that \(-D_n\) is uniformly accretive with constant \(\gamma _D\) from (20). Consequently Lemma 6.1 can be applied to \({\mathcal {A}}_n\).

  (b)

    In view of (a) and Lemma 3.1 it suffices to show that for all \((x,y)\in {\mathcal {D}}(A)\times H\) there exist \((x_n,y_n)\in U_n\times U_n\) such that

    $$\begin{aligned} \lim _{n\rightarrow \infty }\begin{pmatrix}x_n\\ y_n\end{pmatrix}=\begin{pmatrix}x\\ y\end{pmatrix},\qquad \lim _{n\rightarrow \infty }{\mathcal {A}}_n\begin{pmatrix}x_n\\ y_n\end{pmatrix}={\mathcal {A}}\begin{pmatrix}x\\ y\end{pmatrix}. \end{aligned}$$
    (22)

    Let \((x,y)\in {\mathcal {D}}(A)\times H\). From Lemma 3.1 we get \(x_n\in U_n\) with \(x_n\rightarrow x\) and \(A_nx_n\rightarrow Ax\) as \(n\rightarrow \infty \). Set \(y_n=P_ny\). Then \(y_n\rightarrow y\) and

    $$\begin{aligned} \Vert D_ny_n-Dy\Vert&\le \Vert P_n(Dy_n-Dy)\Vert +\Vert P_nDy-Dy\Vert \\&\le \Vert Dy_n-Dy\Vert +\Vert P_nDy-Dy\Vert \rightarrow 0, \quad n\rightarrow \infty , \end{aligned}$$

    i.e., \(D_ny_n\rightarrow Dy\). The proofs of \(B_ny_n\rightarrow By\) and \(B_n^*x_n\rightarrow B^*x\) are analogous, using the additional observation that \(B_n^*=P_nB^*|_{U_n}\). Hence we have shown (22).

\(\square \)

Theorem 6.3

Let \(s_1,\dots ,s_m\in \left\{ \lambda \in {\mathbb {C}}\,\big |\,-\gamma _D<{{\,\mathrm{Re}\,}}\lambda <\gamma _A\right\} \). Let \(0<\varepsilon < \min _{j=1,\dots ,m}\left( \min \{\gamma _A-{{\,\mathrm{Re}\,}}s_j, \gamma _D+{{\,\mathrm{Re}\,}}s_j\}\right) \) and

$$\begin{aligned} \delta _j>\frac{\varepsilon }{\min \{\gamma _A-{{\,\mathrm{Re}\,}}s_j, \gamma _D+{{\,\mathrm{Re}\,}}s_j\}^2-\varepsilon \min \{\gamma _A-{{\,\mathrm{Re}\,}}s_j, \gamma _D+{{\,\mathrm{Re}\,}}s_j\}} \end{aligned}$$

for \(j=1,\dots ,m\). Then there exists \(n_0\in {\mathbb {N}}\) such that

$$\begin{aligned} \sigma _\varepsilon ({\mathcal {A}})\subset \bigcap _{j=1}^m \left[ \left( B_{\delta _j}(W(({\mathcal {A}}_n-s_j)^{-1}))\right) ^{-1}+s_j\right] \quad \text {for all}\quad n\ge n_0. \end{aligned}$$

Proof

Lemmas 6.1 and 6.2 imply

$$\begin{aligned} \Vert ({\mathcal {A}}-s_j)^{-1}\Vert \le \frac{1}{\min \{\gamma _A-{{\,\mathrm{Re}\,}}s_j, \gamma _D+{{\,\mathrm{Re}\,}}s_j\}}\le \frac{1}{\varepsilon },\quad \Vert ({\mathcal {A}}_n-s_j)^{-1}\Vert \le \frac{1}{\varepsilon }, \end{aligned}$$

and hence the assertion follows from Theorem 3.6. \(\square \)
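In finite dimensions, membership of a point z in one of the enclosing sets of Theorem 6.3 can be tested via support functions: since the numerical range \(W(({\mathcal {A}}_n-s)^{-1})\) is convex (Toeplitz–Hausdorff), z lies in \((B_\delta (W(({\mathcal {A}}_n-s)^{-1})))^{-1}+s\) if and only if \({{\,\mathrm{Re}\,}}(e^{-it}(z-s)^{-1})\) stays below the support function of the \(\delta \)-fattened numerical range in every direction t. A sketch (the matrix, shift and \(\delta \) below are illustrative choices consistent with the theorem, not data from the paper):

```python
import numpy as np

def in_enclosure(z, An, s, delta, n_dirs=720):
    """Test whether z lies in (B_delta(W((An - s)^{-1})))^{-1} + s.

    Equivalently, w = (z - s)^{-1} must lie in the closed
    delta-neighborhood of the (convex) numerical range of
    R = (An - s)^{-1}, which is checked via support functions.
    """
    R = np.linalg.inv(An - s * np.eye(An.shape[0]))
    w = 1.0 / (z - s)
    for t in np.linspace(0.0, 2 * np.pi, n_dirs, endpoint=False):
        M = np.exp(-1j * t) * R
        h = np.linalg.eigvalsh((M + M.conj().T) / 2)[-1]  # support of W(R)
        if (np.exp(-1j * t) * w).real > h + delta:
            return False                # a hyperplane separates w from the set
    return True

# Illustrative data: a normal matrix, a shift in its resolvent set, and a
# delta admissible in Theorem 6.3.
An = np.diag([1.0, 2.0, 3.0]).astype(complex)
s = 10.0 + 0j
assert in_enclosure(2.0 + 0j, An, s, delta=0.03)        # an eigenvalue is enclosed
assert not in_enclosure(100.0 + 0j, An, s, delta=0.03)  # a far-away point is not
```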

Remark 6.4

Suppose that A is the operator associated with a coercive sesqui-linear form a on \(V\subset H\) and that \(U_n\), W, \(P_n\in {\mathcal {L}}(H)\), \(A_n\in {\mathcal {L}}(U_n)\) are chosen as in Sect. 5. Then \((P_n,A_n)\) approximates A uniformly, and hence also strongly, see Remark 4.1. Moreover, the coercivity of a implies that A and all \(A_n\) are uniformly accretive with constant \(\gamma _A=\gamma \) from (16). Hence all assumptions of this section are fulfilled in this case.

7 Numerical Examples

In order to illustrate the theory developed above, we present the results of numerical computations. We describe the steps involved in discretizing a given operator and visualize supersets of the pseudospectrum.

Example 7.1

In this example we will examine the Hain–Lüst operator which fits into the framework of Sect. 6. See [19, 20] for results on the approximation of the quadratic numerical range of such a block operator. The Hain–Lüst operator under consideration here is defined by

$$\begin{aligned} {\mathcal {A}}= \begin{pmatrix}A &{}\quad B \\ B^* &{}\quad D\end{pmatrix} \end{aligned}$$

on the Hilbert space \(L^2(0,1)\times L^2(0,1)\) where \(A=-\frac{1}{100}\frac{\partial ^2}{\partial x^2} +2\), \(B= I\) and \(D = 2\mathrm {e}^{2\pi \mathrm {i}\cdot }-3\) with \({\mathcal {D}}(A) = \{u\in H^2(0,1) \mid u(0)=u(1)=0\}\), \({\mathcal {D}}(B)={\mathcal {D}}(D)=L^2(0,1)\) and \({\mathcal {D}}({\mathcal {A}}) = {\mathcal {D}}(A)\oplus {\mathcal {D}}(D)\). Hence, for \(u\in {\mathcal {D}}({\mathcal {A}})\) and \(v\in C^\infty (0,1)\times C^\infty (0,1)\) with \(v(0)=v(1)=0\) we have

$$\begin{aligned} \langle {\mathcal {A}}u,v\rangle= & {} \int _0^1\left( \left( -\frac{1}{100}\frac{\partial ^2}{\partial x^2} +2\right) u_1(x)+u_2(x)\right) \overline{v_1(x)}\,\mathrm {d}x \nonumber \\&+ \,\int _0^1\left( u_1(x) + \left( 2\mathrm {e}^{2\pi \mathrm {i}x}-3\right) u_2(x)\right) \overline{v_2(x)}\,\mathrm {d}x \nonumber \\= & {} \int _0^1\frac{1}{100}\frac{\partial }{\partial x}u_1(x)\frac{\partial }{\partial x}\overline{v_1(x)} + (2u_1(x)+u_2(x))\overline{v_1(x)}\,\mathrm {d}x\nonumber \\&\quad + \int _0^1\left( u_1(x) + \left( 2\mathrm {e}^{2\pi \mathrm {i}x}-3\right) u_2(x)\right) \overline{v_2(x)}\,\mathrm {d}x. \end{aligned}$$
(23)

Let \(\{\mathcal {T}_{\frac{1}{n}}\}_{n\in {\mathbb {N}}}\) be the family of decompositions of the interval (0, 1) where every subinterval \(T\in \mathcal {T}_\frac{1}{n}\) is of length \(\frac{1}{n}\) and let

$$\begin{aligned} U_n = \{u\in C(0,1) \mid u|_T \in {\mathbb {P}}_1(T), T\in \mathcal {T}_{\frac{1}{n}}, u(0)=u(1)=0\},\quad n\in {\mathbb {N}}. \end{aligned}$$

Here \({\mathbb {P}}_1(T)\) denotes the set of polynomials of degree 1 on the subinterval T. The piecewise linear functions

$$\begin{aligned} {\widetilde{\varphi }}_i = {\left\{ \begin{array}{ll} nx-i+1, \qquad &{} x\in (\frac{i-1}{n},\frac{i}{n}),\\ i+1-nx, \qquad &{} x\in (\frac{i}{n},\frac{i+1}{n}),\\ 0, \qquad &{} \text {else}, \end{array}\right. } \end{aligned}$$

for \(i\in \{1,\dots ,n-1\}\) form a basis of \(U_n\) and therefore the functions

$$\begin{aligned} \varphi _i = {\left\{ \begin{array}{ll} ({\widetilde{\varphi }}_i,0), \qquad &{}i\le n-1,\\ (0,{\widetilde{\varphi }}_{i-n+1}), \qquad &{}i>n-1, \end{array}\right. } \end{aligned}$$

for \(i\in \{1,\dots ,2(n-1)\}\) form a basis of \(U_n\times U_n\). Evaluating (23) on these basis functions, the finite-element discretization matrices \({\mathcal {A}}_n\) of \({\mathcal {A}}\) are given by

$$\begin{aligned} {\mathcal {A}}_n = \left( \left( \langle {\mathcal {A}}\varphi _i,\varphi _j\rangle \right) _{i,j}\cdot \left( \langle \varphi _i,\varphi _j\rangle \right) _{i,j}^{-1}\right) ^\intercal . \end{aligned}$$

Due to Lemma 6.2, Theorem 3.6 can be applied here. In order to illustrate the inclusion specified therein, the boundaries of the sets

$$\begin{aligned} \bigl (B_{\delta _j}(W(({\mathcal {A}}_n-s_j)^{-1}))\bigr )^{-1}+s_j \end{aligned}$$

(blue) are depicted in Fig. 3 for shifts \(s_1,\dots ,s_m\in \varrho ({\mathcal {A}})\). The choice of the shifts was guided by the expected shape of the pseudospectrum, with the aim of obtaining a relatively small superset thereof. They are located on two circles around \(-3\) with radii greater and smaller than 2 and on lines parallel to the real axis in the right half plane. Here \(n=600\), \(\delta _j=1.1\frac{\Vert ({\mathcal {A}}_n-s_j)^{-1}\Vert ^2\varepsilon }{1-\Vert ({\mathcal {A}}_n-s_j)^{-1}\Vert \varepsilon }\) and \(\varepsilon \approx 0.4\). The red dots are the eigenvalues of \({\mathcal {A}}_n\), while the black lines correspond to the boundary of the pseudospectrum \(\sigma _\varepsilon ({\mathcal {A}}_n)\) of the approximation matrix, computed by eigtool, see [7]. Note that according to Theorem 3.6 the intersection of the blue areas forms an enclosure of the pseudospectrum \(\sigma _\varepsilon ({\mathcal {A}})\) of the actual operator, while the black lines only give this information for the discretized operator. Furthermore, the spectral gap established in Lemma 6.1 becomes visible.
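The assembly of \({\mathcal {A}}_n\) from the weak form (23) can be sketched as follows. This is our own minimal numpy reimplementation, using a smaller n than in the figure; the two-point Gauss rule for the variable coefficient of D is our choice, not specified in the paper:

```python
import numpy as np

n = 100                     # the figure uses n = 600
h, m = 1.0 / n, n - 1       # mesh width, number of interior nodes

# P1 element matrices on the uniform mesh (hat basis, Dirichlet BCs):
S = (1/h) * (2*np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1))        # int phi_i' phi_j'
M = h * (2/3*np.eye(m) + (1/6)*(np.eye(m, k=1) + np.eye(m, k=-1)))  # int phi_i phi_j

# Weighted mass matrix int q(x) phi_i phi_j dx for the coefficient of D,
# assembled element-wise with a two-point Gauss rule:
q = lambda x: 2*np.exp(2j*np.pi*x) - 3
g = np.array([0.5 - 0.5/np.sqrt(3), 0.5 + 0.5/np.sqrt(3)])  # reference Gauss points
Dw = np.zeros((m, m), dtype=complex)
for k in range(n):                          # element (k*h, (k+1)*h)
    x = (k + g) * h
    vals = {k: 1 - g, k + 1: g}             # node -> hat values at the Gauss points
    for i, Ni in vals.items():
        for j, Nj in vals.items():
            if 1 <= i <= m and 1 <= j <= m:
                Dw[i-1, j-1] += (h/2) * np.sum(q(x) * Ni * Nj)

# Blocks of the weak form (23) and the Galerkin matrix; since all blocks
# are symmetric, the transpose formula reduces to Gram^{-1} K:
K = np.block([[S/100 + 2*M, M], [M, Dw]])
Gram = np.block([[M, np.zeros((m, m))], [np.zeros((m, m)), M]])
A_n = np.linalg.solve(Gram, K)

eigs = np.linalg.eigvals(A_n)
# Spectral gap from Lemmas 6.1/6.2: no eigenvalues with -gamma_D < Re < gamma_A;
# here gamma_D = 1 (since Re(3 - 2 e^{2 pi i x}) >= 1) and gamma_A >= 2.
assert not np.any((eigs.real > -0.99) & (eigs.real < 1.99))
```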

Fig. 3

Pseudospectrum approximation for the Hain–Lüst operator

Example 7.2

Let us consider the advection–diffusion operator \(A:{\mathcal {D}}(A)\subset L^2(0,1)\rightarrow L^2(0,1)\) defined by

$$\begin{aligned} A = \eta \frac{\partial ^2}{\partial x^2} + \frac{\partial }{\partial x} \end{aligned}$$

with \({\mathcal {D}}(A) = \{u\in H^2(0,1) \mid u(0)=u(1)=0\}\), which has also been examined in [24, pp. 115]. For \(u\in {\mathcal {D}}(A)\) and \(v\in C^\infty (0,1)\) we have

$$\begin{aligned} \langle Au,v\rangle= & {} \int _0^1 \left( \eta \frac{\partial ^2}{\partial x^2}u(x) + \frac{\partial }{\partial x}u(x)\right) \overline{v(x)}\,\mathrm {d}x \nonumber \\= & {} \int _0^1 \frac{\partial }{\partial x}u(x)\overline{v(x)} - \eta \frac{\partial }{\partial x}u(x)\frac{\partial }{\partial x}\overline{v(x)}\,\mathrm {d}x. \end{aligned}$$
(24)

As in the previous example let \(\{\mathcal {T}_{\frac{1}{n}}\}_{n\in {\mathbb {N}}}\) be the family of decompositions of the interval (0, 1) where every subinterval \(T\in \mathcal {T}_\frac{1}{n}\) is of length \(\frac{1}{n}\) and let

$$\begin{aligned} U_n = \{u\in C(0,1) \mid u|_T \in {\mathbb {P}}_1(T), T\in \mathcal {T}_{\frac{1}{n}}, u(0)=u(1)=0\},\quad n\in {\mathbb {N}}. \end{aligned}$$

Here \({\mathbb {P}}_1(T)\) denotes the set of polynomials of degree 1 on the subinterval T. The piecewise linear functions

$$\begin{aligned} \varphi _i = {\left\{ \begin{array}{ll} nx-i+1, \qquad &{} x\in (\frac{i-1}{n},\frac{i}{n}),\\ i+1-nx, \qquad &{} x\in (\frac{i}{n},\frac{i+1}{n}),\\ 0, \qquad &{} \text {else}, \end{array}\right. } \end{aligned}$$

for \(i\in \{1,\dots ,n-1\}\) form a basis of \(U_n\). Evaluating (24) on these basis functions, the finite-element discretization matrices \(A_n\) of A are given by

$$\begin{aligned} A_n = \left( \left( \langle A\varphi _i,\varphi _j\rangle \right) _{i,j}\cdot \left( \langle \varphi _i,\varphi _j\rangle \right) _{i,j}^{-1}\right) ^\intercal . \end{aligned}$$

With the choice of \(\eta = 0.015\), Fig. 4 shows the eigenvalues of \(A_n\) for \(n=40\) (red) and the sets

$$\begin{aligned} \bigl (B_{\delta _j}(W((A_n-s_j)^{-1}))\bigr )^{-1}+s_j \end{aligned}$$

(blue) for a number of shifts \(s_1,\dots ,s_m\), where \(\delta _j=1.1\frac{\Vert (A_n-s_j)^{-1}\Vert ^2\varepsilon }{1-\Vert (A_n-s_j)^{-1}\Vert \varepsilon }\) and \(\varepsilon \approx 16\). The shifts are located at a certain distance from the expected pseudospectrum so as to obtain a relatively small superset thereof. The black line corresponds to the boundary of \(\sigma _\varepsilon (A_n)\) computed by eigtool, see [7]. This demonstrates the result of Theorem 3.6, which actually yields an enclosure for the pseudospectrum of the operator A, while the black line only shows the boundary of the pseudospectrum of the approximation matrix \(A_n\).
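The assembly for the advection–diffusion operator is simpler, since all element integrals in (24) are available in closed form. A sketch (again our own minimal reimplementation of the discretization described above):

```python
import numpy as np

eta, n = 0.015, 40
h, m = 1.0 / n, n - 1       # mesh width, number of interior nodes

# P1 element matrices: stiffness, convection and mass
S = (1/h) * (2*np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1))        # int phi_j' phi_i'
C = 0.5 * (np.eye(m, k=1) - np.eye(m, k=-1))                        # int phi_j' phi_i
M = h * (2/3*np.eye(m) + (1/6)*(np.eye(m, k=1) + np.eye(m, k=-1)))  # int phi_i phi_j

# Weak form (24): <A phi_j, phi_i> = int phi_j' phi_i - eta int phi_j' phi_i'
K = C - eta * S
A_n = np.linalg.solve(M, K)        # Galerkin matrix

eigs = np.linalg.eigvals(A_n)
# A is dissipative (Re<Au,u> = -eta ||u'||^2), and the Galerkin projection
# inherits this: every eigenvalue has negative real part.
assert np.max(eigs.real) < 0
```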

Fig. 4

Pseudospectrum approximation for the advection–diffusion operator

As already mentioned in Remark 2.8 we also have the enclosure

$$\begin{aligned} \sigma _\varepsilon (A)\subset B_\varepsilon (W(A)) \end{aligned}$$

for operators A with a compact resolvent. Note that, since both sides of this enclosure involve the same operator A, applying it numerically only yields an enclosure for the discretized operator, not for the full operator. So let us take a look at the discretizations of the Hain–Lüst (Fig. 5) and the advection–diffusion operator (Fig. 6) again. Here, the \(\varepsilon \)-neighborhoods of the numerical ranges are depicted by green lines. As can be seen, this approach leads to a very similar result in the case of the advection–diffusion operator (where the pseudospectrum is convex), while it fails to resolve the disconnected components of the pseudospectrum in the case of the Hain–Lüst operator.
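The boundary of \(B_\varepsilon (W(A_n))\) can be traced by the standard rotation method: for each direction \(e^{it}\) an extreme point of \(W(A_n)\) is obtained from the top eigenvector of the Hermitian part of \(e^{-it}A_n\), and the \(\varepsilon \)-neighborhood is obtained by pushing each boundary point outward along its normal. A sketch (the test matrix is illustrative; for it, \(W(A)\) is known to be the disk of radius 1/2 about 0):

```python
import numpy as np

def nbhd_of_numerical_range(A, eps, n_dirs=360):
    """Boundary points of B_eps(W(A)) via the rotation method."""
    pts = []
    for t in np.linspace(0.0, 2*np.pi, n_dirs, endpoint=False):
        H = (np.exp(-1j*t)*A + np.exp(1j*t)*A.conj().T) / 2  # Herm(e^{-it} A)
        w, V = np.linalg.eigh(H)
        x = V[:, -1]                         # eigenvector of the largest eigenvalue
        p = x.conj() @ A @ x                 # extreme point of W(A) in direction e^{it}
        pts.append(p + eps*np.exp(1j*t))     # push outward along the normal
    return np.array(pts)

# Check on the 2x2 Jordan block: W(A) is the disk of radius 1/2 centered
# at 0, so the inflated boundary lies on the circle of radius 1/2 + eps.
A = np.array([[0.0, 1.0], [0.0, 0.0]], dtype=complex)
pts = nbhd_of_numerical_range(A, eps=0.25)
assert np.allclose(np.abs(pts), 0.75)
```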

Fig. 5

\(\varepsilon \)-neighborhood of the numerical range of the Hain–Lüst operator

Fig. 6

\(\varepsilon \)-neighborhood of the numerical range of the advection–diffusion operator