Abstract
Let A and E be self-adjoint matrices or operators on \(\ell ^2({{\mathbb {N}}})\), where A is fixed and E is a small perturbation. We study how the eigenvalues of \(A+E\) depend on E, with the aim of obtaining second order formulas that are explicitly computable in terms of the spectral decomposition of A and a certain block decomposition of E. In particular we extend the classical Rayleigh-Schrödinger formulas for the one-parameter perturbation \(A+tE\) where \(t\in {{\mathbb {R}}}\) varies and E is held fixed, by dropping t and considering E as the variable.
1 Introduction
1.1 Perturbation of Eigenvalues of Self-Adjoint Matrices
In the introduction we will primarily focus on the finite dimensional case, although these results will be extended to operators on separable Hilbert spaces. We denote by \({{\mathcal {H}}}_n\) the set of self-adjoint matrices of size \(n\times n\). Given \(A\in {{\mathcal {H}}}_n\), the spectral decomposition gives a unitary matrix \(U_A\) and a diagonal matrix \(\Lambda _\alpha \) such that
\( A=U_A\Lambda _\alpha U_A^*. \)
Here the columns of \(U_A\) are the eigenvectors, \(\alpha \) is the vector of eigenvalues and \(\Lambda _\alpha =\mathrm {diag}(\alpha _1,\ldots ,\alpha _n)\) denotes the corresponding diagonal matrix.
Given a “small” self-adjoint matrix E, we ask how \(\alpha \) changes upon replacing A with \(A+E\). The literature on this topic is immense and can roughly be divided into two groups. One group “freezes” the matrix E and considers \(A+tE\) as a function of the complex variable t, giving rise to a beautiful and rich connection with algebra and analytic function theory [3, 15, 31]. However, this approach lacks a global perspective, in the sense that E is fixed and not a free variable. The second group of results does not “freeze” E, with weaker but more general results as a consequence, such as the estimates by Geršgorin, Weyl, Stewart and Bauer-Fike, to name a few. In this article we present uniform third order bounds for the eigenvalues of the perturbation \(A+E\) under the condition that E is small. Hence we work in a framework which in a sense fits in between, by providing estimates that are global in E but only interesting for small \(\Vert E\Vert \).
A key result in this field is the fact that the eigenvalues \(\xi (t)\) of \(A+tE\) are real analytic functions of \(t\in {{\mathbb {R}}}\) (given a suitable ordering). This result is due to F. Rellich in a sequence of articles from the 30’s [23], and a simple proof in the finite dimensional setting is found in his monograph [24] (or consult Theorem 6.1 Ch. II in [15]). Even before that, the coefficients of the corresponding series expansion were computed by Lord Rayleigh and later Schrödinger, although they lacked a general proof that the corresponding series converged. These coefficients are typically found in the literature on mathematical physics and quantum physics, rather than books on pure mathematics such as Kato’s seminal work [15] or Rellich’s own monograph [24], for that matter. For example, they are computed in Reed-Simon’s book [22] Ch. XII.1 using complex analytic tools, while assuming that the eigenvalues of A are simple. Even without this assumption, Courant and Hilbert [8] compute them by making a simple ansatz and backing out their values from a set of equation systems, see Ch. 5.13. While these coefficients have very complicated expressions, the first and second order terms are manageable: if we suppose for simplicity that a basis has been chosen such that \(A=\Lambda _\alpha \) and moreover such that \(E(i,j)=0\) whenever \(\alpha _i=\alpha _j\) and \(i\ne j\), then
\( \xi _j(t)=\alpha _j+tE(j,j)+t^2\sum _{k:\,\alpha _k\ne \alpha _j}\frac{|E(j,k)|^2}{\alpha _j-\alpha _k}+O(t^3),\qquad 1\le j\le n. \)  (1.2)
Despite the beauty of this formula, it lacks a global perspective. In his 1953 monograph, Rellich himself points out that even introducing two unknown parameters in the perturbation leads to lack of analyticity and unpredictable behavior. In this paper we prove a generalization of (1.2) which holds for all E (with uniform control on the \(O(\Vert E\Vert ^3)\)-error term).
1.2 Novelties
Let \(A\in {{\mathcal {H}}}_n\) be a fixed given matrix, let \(\lambda \) be any one of its eigenvalues, and let m be the dimension of the eigenspace corresponding to \(\lambda \). Let \(U_A\) be a unitary matrix such that
\( U_A^*AU_A=\begin{pmatrix} \lambda I_m &{}\quad 0\\ 0 &{}\quad {\hat{A}}_{22} \end{pmatrix}, \)
where \(I_m\) is the \(m\times m\) identity matrix and \({\hat{A}}_{22}\) is a matrix whose eigenvalues are distinct from \(\lambda \). Now consider an arbitrary self-adjoint perturbation \(E\in {{\mathcal {H}}}_n\). Let \(\hat{E} = U_A^*E U_A\) and introduce the block decomposition
\( \hat{E}=\begin{pmatrix} {\hat{E}}_{11} &{}\quad {\hat{E}}_{12}\\ {\hat{E}}_{21} &{}\quad {\hat{E}}_{22} \end{pmatrix}, \)
where \({\hat{E}}_{11}\) is \(m\times m\). We now introduce a matrix B by
\( B=\lambda I_m+{\hat{E}}_{11}+{\hat{E}}_{12}(\lambda I-{\hat{A}}_{22})^{-1}{\hat{E}}_{21}. \)
For simplicity, we here only present the results in the finite-dimensional setting, since the results are new also for matrices (for \(m>1\) at least).
Theorem 1.1
Let \(\lambda \) be a fixed eigenvalue of \(A\in {{\mathcal {H}}}_n\) of multiplicity m and \(\{\beta _j\}_{j=1}^{m}\) be the eigenvalues of B. Then the eigenvalues \(\{\xi _j\}_{j=1}^{n}\) of \(A+E\) can be arranged such that
\( \xi _j=\beta _j+O(\Vert E\Vert ^3),\qquad 1\le j\le m. \)
Example
As an example, we took the diagonal operator
and perturbed it with a random self-adjoint matrix E with norm \(10^{-1}\). In this manner we obtained \((-0.04172, -0.01581, 0.03698)\) for the three smallest values of \(\xi \), whereas the corresponding eigenvalues \(\beta _j\) of B were \((-0.04181, -0.01581, 0.03698)\), so the maximal error is \(9\cdot 10^{-5}\), well below \(\Vert E\Vert ^3=10^{-3}\). When lowering \(\Vert E\Vert \) to \(10^{-4}\), the corresponding error drops to just below \(10^{-12}\), in perfect harmony with Theorem 1.1.
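An experiment of this kind is easy to repeat in a few lines of NumPy. The diagonal matrix below is a hypothetical stand-in (the paper's concrete operator is not reproduced here), with the eigenvalue \(\lambda =0\) of multiplicity two, and the norm of E is taken to be \(10^{-2}\); the asserted tolerance only illustrates the \(O(\Vert E\Vert ^3)\) behavior and is not a sharp constant.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the diagonal operator: the eigenvalue
# lam = 0 has multiplicity m = 2, the remaining eigenvalues are simple.
diag = np.array([0.0, 0.0, 1.0, 2.0, 3.0])
A = np.diag(diag)
lam, m, n = 0.0, 2, 5

# Random self-adjoint E, rescaled so that ||E|| = 1e-2
E = rng.standard_normal((n, n))
E = (E + E.T) / 2
normE = 1e-2
E *= normE / np.linalg.norm(E, 2)

# Block decomposition relative to the eigenspace of lam, and the
# second order approximant B = lam*I + E11 + E12 (lam*I - A22)^{-1} E21
E11, E12, E21 = E[:m, :m], E[:m, m:], E[m:, :m]
A22 = A[m:, m:]
B = lam * np.eye(m) + E11 + E12 @ np.linalg.solve(lam * np.eye(n - m) - A22, E21)

beta = np.sort(np.linalg.eigvalsh(B))
xi = np.sort(np.linalg.eigvalsh(A + E))[:m]  # the m eigenvalues near lam = 0
err = np.max(np.abs(xi - beta))
assert err < 1e-4  # third order accuracy: err = O(||E||^3), ||E||^3 = 1e-6
```

Lowering `normE` further makes the error shrink like \(\Vert E\Vert ^3\), as in the example above.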
We shall also extend the theorem to certain operators on separable Hilbert spaces, which is needed for applications to quantum physics. However, the above result is, to the best of our knowledge, new also for matrices. The proof in the infinite dimensional setting is more involved, mainly since we cannot rely on Geršgorin’s circle theorem.
As a corollary, in Sect. 4 we show that
\( \xi _j=\lambda +\varepsilon _j+O(\Vert E\Vert ^2),\qquad 1\le j\le m, \)  (1.3)
where \(\{\varepsilon _j\}_{j=1}^m\) are the eigenvalues of \({\hat{E}}_{11}\). Although the latter estimate is much simpler to prove, we have not been able to find it in the literature when \(m>1\). Another corollary, also new for multiplicity greater than one, pertains to the singular value decomposition, but since this is a bit lengthy to state we leave the details to Sect. 5.
The only results that we have found in the existing literature that partially overlap with our results are those by G. W. Stewart, see e.g. [27], and in collaboration with J. Sun [28]. In particular, Theorem 1.1 is an extension of the key result of [27] (which also appears as Lemma 4.5 in [28]) to the case of multiplicity higher than one. In the case of \(\lambda \) being a simple eigenvalue of A, our method can be used iteratively to provide formulas with error \(O(\left\| E\right\| ^k)\), which is another extension of the above mentioned results, see Corollary 4.2. The application to singular values extends Theorem 4.6 of [28], again with the only difference that we include the case of multiplicity greater than one. We consider this application in the final section. Although not immediately obvious, the formula (1.2) for the first and second order Rayleigh-Schrödinger coefficients is a special case of the above theorem, obtained by inserting tE in place of E. (We refer to the article [5], which is a preliminary version of the present work containing a number of additional results and observations.)
Our proof will be based on an iteration of a lemma due to Issai Schur and avoids the use of Geršgorin type estimates, although based on new results [6] it would have been possible to do a proof along those lines as well. However, we prefer the present self-contained version.
1.3 Related Works
It seems that E. Schrödinger was one of the first to postulate some results and conjectures concerning perturbation series in [25], in particular this paper contains the first coefficients in the series expansion (1.2), which Schrödinger attributes to Lord Rayleigh’s investigations of harmonic vibrations of a string [29]. Such results are of key interest to mathematical physics and in particular quantum physics, where more complicated systems are considered as perturbations of simpler systems for which closed form solutions do exist. The classical example is the study of the hydrogen atom, see Example 3, Section XII.2, in Reed and Simon [22]. Many more interesting examples are found in the same chapter, and the “Notes”-section contains a more extensive historical exposition. Other books on quantum mechanics that treat perturbation theory include [8, 16, 17]. For a more recent application to quantum information theory, see [10].
F. Rellich was the first to systematically study the topic in a sequence of papers in the 30’s and 40’s (Störungstheorie der Spektralzerlegung I-V), and in particular he proved Schrödinger’s conjectures and established analyticity of eigenvalues and eigenprojections for perturbations (depending analytically on one parameter) of self-adjoint operators. The area was very active through the 50’s and 60’s, which led to the classic [15] by T. Kato (see in particular Sec. 6, Ch. II), still today a key reference on perturbation theory. This work is continued e.g. in Baumgärtel [3].
In parallel, global bounds for the perturbation of eigenvalues go back to H. Weyl around 1910 [32], where in particular the famous “Weyl perturbation theorem” is established. Improvements were then given e.g. by Hoffman-Wielandt, Bauer-Fike, Mirsky and later Bhatia. It seems that (1.3) is a sort of local improvement of Weyl’s, Bauer-Fike’s and Hoffman-Wielandt’s theorems, in the sense that these results give less accurate information on the eigenvalues of \(A+E\) than (1.3) for small E. We refer to [14] (Ch. 6), [28] (Ch. IV and V) and [4] (Ch. VI and VII) for more information on this type of result.
From the numerical perspective we mention the books [9, 11, 21] and [33]. Other more recent contributions to perturbation theory for self-adjoint matrices include [2, 12, 18, 20, 30], but the results are of a different nature than those presented here. For example, Section 9 of [18] tries to understand the local behavior of eigenvalues using so called Clarke subdifferentials. The recent article [26] treats the use of Schur complements in spectral problems, but seems to have no overlap with the present article. See also Ch. 15 of [13] for an overview of modern results. The fairly recent book [34] contains a compilation of known uses of Schur complements, see in particular Section 2.5 containing a large amount of estimates on eigenvalues in the self-adjoint case.
Finally, we also mention the works [1, 7, 19] which, among other things, consider the topic of localizing the spectrum as well as contain a number of results on convergence of the spectrum when \((A_n)_{n=1}^{\infty }\) is a sequence of operators converging, in some sense, to A. However, there seems to be no overlap between those results and the theory presented in this paper.
2 The Case of Operators
Let A be a bounded self-adjoint operator on a separable Hilbert space. In general, the spectrum \(\sigma (A)\) of such an operator can be rather complicated, but the discrete spectrum behaves much like the eigenvalues of matrices (we follow [22] for terminology, which differs slightly between various books; see in particular Sec. XII.2). We recall that the discrete spectrum is the complement of the essential spectrum, and that \(\lambda \) lies in the discrete spectrum if and only if it is an isolated point in \(\sigma (A)\) which has a finite dimensional spectral projection (see the Notes section), defined e.g. by the Riesz-Cauchy functional calculus
\( P_{\lambda }=\frac{1}{2\pi i}\oint _{\partial \Omega }(\zeta I-A)^{-1}\,d\zeta , \)
where \(\Omega \) is a disc around \(\lambda \) at a positive distance from the remaining spectrum. The “isolation distance” is the distance between \(\lambda \) and the remainder of the spectrum. For a matrix, the algebraic multiplicity of an eigenvalue coincides with the rank of the corresponding spectral projection. In the infinite dimensional setting we define the algebraic multiplicity of an eigenvalue \(\lambda \) as the rank of the corresponding projection \(P_{\lambda }\). Hence, if \(\lambda \) has algebraic multiplicity m as an eigenvalue of the self-adjoint operator A, then \(A+E\) will have precisely m eigenvalues inside \(\Omega \) for small enough E, and these are part of the discrete spectrum of \(A+E\). In what follows we denote them by \(\xi _1,\ldots ,\xi _m\), and we assume that they are ordered non-increasingly. Upon introducing an orthonormal basis, we may identify the Hilbert space in question with \(\ell ^2({{\mathbb {N}}})\), and subsequently any linear operator can be identified with an infinite matrix in the obvious way.
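The counting mechanism behind the Riesz-Cauchy projection is easy to illustrate numerically: approximating the contour integral by the trapezoid rule on a circle separating \(\lambda \) from the rest of the spectrum yields a matrix whose trace counts the eigenvalues of \(A+E\) inside the contour. The concrete 4-by-4 matrix below is a hypothetical illustration, not an example from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
A = np.diag([0.0, 0.0, 1.0, 2.0])    # lambda = 0 with multiplicity 2
E = rng.standard_normal((4, 4))
E = (E + E.T) / 2
E *= 1e-2 / np.linalg.norm(E, 2)

# Riesz projection P = (1/2 pi i) contour integral of (zeta I - (A+E))^{-1},
# over a circle of radius 1/2 around 0, via the trapezoid rule.
N, r = 400, 0.5
ts = 2 * np.pi * np.arange(N) / N
P = np.zeros((4, 4), dtype=complex)
for z in r * np.exp(1j * ts):
    # (1/2 pi i) R(zeta) dzeta with dzeta = i*zeta dt gives R(z)*z/N per node
    P += np.linalg.inv(z * np.eye(4) - (A + E)) * z / N

count = round(np.trace(P).real)              # rank of the projection
assert count == 2                            # two eigenvalues stay inside
assert np.linalg.norm(P @ P - P) < 1e-8      # P is (numerically) a projection
```

Since the rank is an integer and the projection depends continuously on the perturbation, the count stays equal to two for all sufficiently small E, which is precisely the stability statement used above.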
Now let \(\lambda \) be a particular eigenvalue of multiplicity m in the discrete spectrum of A, and assume that the basis has been chosen so that A has the form
\( A=\begin{pmatrix} \lambda I_m &{}\quad 0\\ 0 &{}\quad A_{22} \end{pmatrix}, \)
where \(A_{22}\) is not necessarily diagonal. Denote by \(E_{11},~E_{12},~E_{21}\) and \(E_{22}\) the operators in the corresponding decomposition of E (where \(E_{11}\) is a finite-dimensional matrix). Set
\( B=\lambda I_m+E_{11}+E_{12}(\lambda I-A_{22})^{-1}E_{21} \)  (2.2)
and let \(\beta \) be the corresponding eigenvalues, which we assume are ordered non-increasingly. The extension of Theorem 1.1 then reads as follows.
Theorem 2.1
Let A and E be bounded self-adjoint operators on a separable Hilbert space. If \(\lambda \) is an eigenvalue of A of multiplicity m, then there are eigenvalues \(\xi _1,\dotsc ,\xi _m\) of \(A+E\), ordered non-increasingly, such that
\( \xi _j=\beta _j+O(\Vert E\Vert ^3),\qquad 1\le j\le m, \)
where \(\beta _j\) are the eigenvalues of B from (2.2), ordered non-increasingly.
3 The Proofs
It is easy to see that Theorem 1.1 can be obtained from Theorem 2.1, so we content ourselves with proving the latter. A key ingredient will be the concept of the Schur complement, as introduced by Issai Schur. Given a matrix representation
\( F=\begin{pmatrix} F_{11} &{}\quad F_{12}\\ F_{21} &{}\quad F_{22} \end{pmatrix}, \)  (3.1)
the Schur complement of F with respect to the block \(F_{22}\) is denoted by \(F/F_{22}\) and is defined via
\( F/F_{22}=F_{11}-F_{12}F_{22}^{-1}F_{21}, \)
provided that \(F_{22}\) is invertible.
If we recall (2.2) we see that the operator B introduced there is in fact a Schur complement.
Lemma 3.1
(Schur) Let F be an operator acting on \(\ell ^2({{\mathbb {N}}})\) whose matrix representation is given as in (3.1), with \(F_{22}\) invertible. The operator F is via a change of basis similar to both operators
\( \begin{pmatrix} F/F_{22} &{}\quad F_{12}\\ F_{22}^{-1}F_{21}(F/F_{22}) &{}\quad F_{22}+F_{22}^{-1}F_{21}F_{12} \end{pmatrix} \)  (3.3)
and
\( \begin{pmatrix} F/F_{22} &{}\quad (F/F_{22})F_{12}F_{22}^{-1}\\ F_{21} &{}\quad F_{22}+F_{21}F_{12}F_{22}^{-1} \end{pmatrix}. \)  (3.4)
Proof
The matrix \( J= \begin{pmatrix} I&{}\quad 0 \\ F_{22}^{-1}F_{21} &{} \quad I \end{pmatrix} \) is invertible with inverse \(J^{-1}=\begin{pmatrix} I&{}\quad 0\\ -F_{22}^{-1}F_{21} &{}\quad I \end{pmatrix}\). The first result follows by computing \(JFJ^{-1}\), and the second identity can be proved by a similar argument (or by applying the first to the adjoint). \(\square \)
Decompositions similar to the one in Lemma 3.1 appear throughout the study of Schur complements. This particular result appears for instance as Theorem 1.6 in [34].
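The similarity produced by J is easy to sanity-check numerically. The block forms asserted below are our direct computation of \(JFJ^{-1}\) with the J from the proof, for a hypothetical random F; note how the lower left corner picks up a factor \(F_{22}^{-1}F_{21}(F/F_{22})\), which is one order smaller when \(F_{21}\) and \(F/F_{22}\) are both small. This is what drives the iteration in the proof of Theorem 2.1.

```python
import numpy as np

rng = np.random.default_rng(2)
p, q = 3, 4
F = rng.standard_normal((p + q, p + q))
F11, F12 = F[:p, :p], F[:p, p:]
F21, F22 = F[p:, :p], F[p:, p:]

S = F11 - F12 @ np.linalg.solve(F22, F21)   # Schur complement F/F22
W = np.linalg.solve(F22, F21)               # F22^{-1} F21

J = np.block([[np.eye(p), np.zeros((p, q))],
              [W, np.eye(q)]])
G = J @ F @ np.linalg.inv(J)

# Conjugation by J keeps F/F22 in the upper left and F12 in the upper
# right, while the lower left becomes W @ S and the lower right
# becomes F22 + W @ F12.
assert np.allclose(G[:p, :p], S)
assert np.allclose(G[:p, p:], F12)
assert np.allclose(G[p:, :p], W @ S)
assert np.allclose(G[p:, p:], F22 + W @ F12)
```

In particular the eigenvalues of F coincide with those of the displayed block operator, as similarity preserves the spectrum.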
Concerning the proof of Theorem 2.1, there is clearly no loss in generality in assuming that \(\lambda = 0\), since \(A+E-\lambda I\) has the same eigenvalues as \(A+E\) apart from a translation by \(\lambda \). Moreover we may assume that an orthonormal basis has been chosen so that A has the form
\( A=\begin{pmatrix} 0 &{}\quad 0\\ 0 &{}\quad A_{22} \end{pmatrix}, \)
where \(A_{22}\) is self-adjoint and invertible. We let E denote a self-adjoint perturbation and write its corresponding block representation as
\( E=\begin{pmatrix} E_{11} &{}\quad E_{12}\\ E_{21} &{}\quad E_{22} \end{pmatrix}. \)  (3.6)
Note that E here coincides with \(\hat{E}=U_A^*E U_A\) from the introduction, as we are assuming that A is diagonal to start with. The Schur complement of \(A+E\) with respect to \(A_{22}+E_{22}\) equals
\( {\tilde{B}}=E_{11}-E_{12}(A_{22}+E_{22})^{-1}E_{21}, \)
which should be compared with the operator B from the introduction, which in the present setting takes the form
\( B=E_{11}-E_{12}A_{22}^{-1}E_{21}. \)
We will need the following result relating the eigenvalues of B with those of \(\tilde{B}\). Note that both matrices are self-adjoint.
Lemma 3.2
Let the eigenvalues of B and \({\tilde{B}}\), ordered non-increasingly, be denoted \(\beta \) and \(\tilde{\beta }\) respectively. Then
\( \max _{1\le j\le m}|\beta _j-{\tilde{\beta }}_j|=O(\Vert E\Vert ^3). \)
Proof
We consider the difference
\( B-{\tilde{B}}=E_{12}\big ((A_{22}+E_{22})^{-1}-A_{22}^{-1}\big )E_{21}=-E_{12}(A_{22}+E_{22})^{-1}E_{22}A_{22}^{-1}E_{21}. \)
Thus \(\Vert B-\tilde{B}\Vert =O(\Vert E\Vert ^3)\), and the desired result then follows from Weyl’s perturbation inequality, which implies that \(|\beta _j-\tilde{\beta }_j|\le \Vert B-\tilde{B}\Vert \). \(\square \)
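A short numerical illustration of Lemma 3.2, with \(\lambda =0\) and a hypothetical invertible \(A_{22}\): the second order approximant B and the exact Schur complement \(\tilde{B}\) differ in norm by \(O(\Vert E\Vert ^3)\), and by Weyl's inequality their eigenvalues differ by at most that norm.

```python
import numpy as np

rng = np.random.default_rng(3)
m, k = 2, 3
A22 = np.diag([1.0, 2.0, 3.0])          # hypothetical, invertible, delta = 1
E = rng.standard_normal((m + k, m + k))
E = (E + E.T) / 2
normE = 1e-2
E *= normE / np.linalg.norm(E, 2)
E11, E12, E21, E22 = E[:m, :m], E[:m, m:], E[m:, :m], E[m:, m:]

# Second order approximant versus the exact Schur complement of A+E
B = E11 - E12 @ np.linalg.solve(A22, E21)
Bt = E11 - E12 @ np.linalg.solve(A22 + E22, E21)

gap = np.linalg.norm(B - Bt, 2)
assert gap < 10 * normE**3              # ||B - Btilde|| = O(||E||^3)

# Weyl: the (sorted) eigenvalues differ by at most the operator norm gap
beta, betat = np.linalg.eigvalsh(B), np.linalg.eigvalsh(Bt)
assert np.max(np.abs(beta - betat)) <= gap + 1e-12
```

Shrinking `normE` by a factor ten shrinks `gap` by roughly a factor thousand, matching the third order rate.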
Armed with these results, we are now in a position to prove the main result.
Proof of Theorem 2.1
Due to Lemma 3.2, it suffices to prove that the eigenvalues \(\xi _1,\ldots ,\xi _m\) of \(A+E\) near \(\lambda \) can be arranged such that
\( \xi _j={\tilde{\beta }}_j+O(\Vert E\Vert ^3),\qquad 1\le j\le m, \)
keeping in mind that we have set \(\lambda =0\). We can then rewrite \(A+E\) as
\( A+E=\begin{pmatrix} E_{11} &{}\quad E_{12}\\ E_{21} &{}\quad A_{22}+E_{22} \end{pmatrix}. \)
Applying (3.3) from Lemma 3.1, we find that \(A+E\) is similar to
For sufficiently small values of \(\left\| E\right\| \) the operator \(A_{22}+O(\left\| E\right\| )\) is invertible (since \(A_{22}\) is invertible by construction). Therefore the similarity of (3.3) in Lemma 3.1 is applicable once more, and we find that \(A+E\) is similar to
where we used that \(\tilde{B}=O(\Vert E\Vert )\). Again the lower right corner of the matrix is of the form \(A_{22}+O(\Vert E\Vert )\) and hence this block will be invertible for small enough E. A final application of (3.3) from Lemma 3.1 gives us that
(This step is only needed for the improved estimate (3.14) below.)
At this point, it is possible to conclude Theorem 1.1 by carefully invoking a new extension of Geršgorin’s theorem to the Hilbert space setting, see [6]. However, we prefer to present a self-contained proof as follows.
We now apply Lemma 3.1 to obtain a similar operator where also the upper right corner is \(O(\Vert E\Vert ^4)\), and then rely on a careful use of the Riesz-Cauchy functional calculus. To begin with, we apply (3.4) from Lemma 3.1. Note that the Schur complement denoted \(F/F_{22}\) in the lemma is in this case equal to \(\tilde{B} -E_{12}A_{22}^{-2}E_{21}E_{11}+O(\Vert E\Vert ^4)\) while \(\tilde{B} = O(\Vert E\Vert )\). We apply the lemma three times to conclude that
where H(E) denotes an operator that is not necessarily self-adjoint but \(O(\Vert E\Vert )\). We denote the first operator in the last expression by \({\tilde{A}}\) and the latter by \({\tilde{E}}\), so that both depend on E and \(\Vert {\tilde{E}}\Vert =O(\Vert E\Vert ^3)\). Let \(\delta \) be the distance from the spectrum of \(A_{22}\) to 0. We will now fix E with the only restriction that \(\Vert E\Vert <r\), where we assume that \(r>0\) is small enough so that \(\Vert H(E)\Vert \le \delta /3\) and \(2\Vert {\tilde{E}}\Vert <\delta /6\). Since also \(\tilde{B}\) depends continuously on E, we can in addition assume that r is such that \(|\tilde{\beta _j}|<\delta /6\) holds for all \(1\le j\le m\).
Given any \(\zeta \in {{\mathbb {C}}}\) with \(|\zeta |<\delta /3\), we shall now prove that
\( \big (\zeta I-(A_{22}+H(E))\big )^{-1} \)  (3.10)
exists and is bounded by \(3/\delta \). To see this, first observe that since \((\zeta I-A_{22})^{-1}\) is a normal operator its norm equals its spectral radius, and the spectrum of \(\zeta I-A_{22}\) is outside the disc with radius \(2\delta /3\). Hence \(\Vert (\zeta I-A_{22})^{-1}\Vert \le \frac{3}{2\delta }\), so \(\Vert (\zeta I-A_{22})^{-1}H(E)\Vert \le \frac{3}{2\delta }\cdot \frac{\delta }{3}=\frac{1}{2}\) and therefore the Neumann series
\( \big (I-(\zeta I-A_{22})^{-1}H(E)\big )^{-1}=\sum _{k=0}^{\infty }\big ((\zeta I-A_{22})^{-1}H(E)\big )^{k} \)
converges and has norm less than or equal to 2. Since \(2\cdot \frac{3}{2\delta }=\frac{3}{\delta }\), the desired estimate follows from (3.10).
Let \(\Omega \) be the union of open discs with center at \({{\tilde{\beta }}}_j\), \(j=1,\ldots ,m\), and radius \(2\Vert {\tilde{E}}\Vert \). Since \(2\Vert {\tilde{E}}\Vert <\delta /6\) and \(|{{\tilde{\beta }}}_j|<\delta /6\), \(j=1,\ldots ,m\), by the previous assumptions, \(\Omega \) is inside the disc with center 0 and radius \(\delta /3\). Moreover its boundary does not intersect \(\sigma ({\tilde{A}})\). (To see this, note that \(\sigma ({\tilde{A}})=\sigma ({\tilde{B}})\cup \sigma (A_{22}+H(E))\), and the latter stays away from \(\partial \Omega \) by the existence of (3.10).) Given any F, the spectral projection of \(\tilde{A}+F\) onto the eigenspace corresponding to the eigenvalues in \(\Omega \) is then equal to
\( \frac{1}{2\pi i}\oint _{\partial \Omega }\big (\zeta I-({\tilde{A}}+F)\big )^{-1}\,d\zeta , \)
and it depends continuously on F as long as F is small enough (so that the inverse exists for all \(\zeta \in \partial \Omega \)). Since the rank of a projection is an integer, this implies that the number of eigenvalues (counted with algebraic multiplicity) in \(\Omega \) is constant for all F in some neighborhood of 0. We now show that this neighborhood includes a disc centered at 0 with radius \(2\Vert {\tilde{E}}\Vert \). To this end, note that for any \(\zeta \in \partial \Omega \) we have
\( (\zeta I-{\tilde{A}})^{-1}=\begin{pmatrix} (\zeta I-{\tilde{B}})^{-1} &{}\quad 0\\ 0 &{}\quad \big (\zeta I-(A_{22}+H(E))\big )^{-1} \end{pmatrix} \)  (3.12)
whenever the inverse exists. Let \(\Omega _1,\ldots ,\Omega _K\) be the connected components of \(\Omega \), ordered so that \(\Omega _{k+1}\) always lies to the left of \(\Omega _k\), which can be done since the sets are symmetric around the real axis (as \(\tilde{B}\) is self-adjoint). Note that \(\Vert (\zeta I-\tilde{B})^{-1}\Vert \le (2\Vert {\tilde{E}}\Vert )^{-1}\) (since \(\zeta I-\tilde{B}\) is normal with spectrum outside the disc with center 0 and radius \(2\Vert {\tilde{E}}\Vert \)). By (3.12) it therefore follows that \(\Vert (\zeta I-{\tilde{A}})^{-1}\Vert \le (2\Vert {\tilde{E}}\Vert )^{-1}\) whenever \(\zeta \in \partial \Omega \) (since \(3/\delta \) is a bound for the lower right operator and \((2\Vert {\tilde{E}}\Vert )^{-1}>6/\delta \) by assumption). Finally, using a series expansion similar to (3.10), it follows that \((\zeta I-({\tilde{A}}+F))^{-1}\) exists for any operator F with \(\Vert F\Vert <2\Vert {\tilde{E}}\Vert \), as desired.
In fact, since the rank of a projection is an integer, continuity implies that each \(\Omega _k\) contains precisely as many eigenvalues of \(\tilde{A}+F\) as \(\tilde{A}\) does, given that \(\Vert F\Vert <2\Vert {\tilde{E}}\Vert \). In particular, setting \(F={\tilde{E}}\) and recalling that \(\tilde{A}+\tilde{E}=A+E\), we find that every set \(\Omega _k\) contains precisely as many \(\xi _j\)’s as \({{\tilde{\beta }}}_j\)’s. Due to the ordering of the sets \(\Omega _k\) and the fact that the \(\xi _j\)’s are also ordered non-increasingly, we conclude that \(\{j:~{{\tilde{\beta }}}_j\in \Omega _k\}=\{j:~\xi _j\in \Omega _k\}\) holds for each k. Since the diameter of each \(\Omega _k\) is at most \(2(2\Vert {\tilde{E}}\Vert )\) times the number of discs it is made up of (which is at most m), we see that
\( \max _{1\le j\le m}|\xi _j-{\tilde{\beta }}_j|\le 4m\Vert {\tilde{E}}\Vert , \)  (3.13)
as desired. \(\square \)
We remark that the estimate in Theorem 2.1 can be made slightly more precise, as follows:
\( \max _{1\le j\le m}|\xi _j-\beta _j|\le 5m\,\delta ^{-2}\Vert E\Vert ^3+O(\Vert E\Vert ^4), \)  (3.14)
where \(\delta \), as above, is the isolation distance of \(\lambda \). To derive the above inequality, first note that
\( \Vert B-{\tilde{B}}\Vert \le \delta ^{-2}\Vert E\Vert ^3+O(\Vert E\Vert ^4) \)  (3.15)
(by the computation in the proof of Lemma 3.2), and similarly that \({\tilde{E}} \) has the structure \(\begin{pmatrix} {\tilde{E}}_{11} &{} \quad {\tilde{E}}_{12} \\ {\tilde{E}}_{21} &{} \quad 0 \end{pmatrix} \) where \(\max (\Vert \tilde{E}_{12}\Vert ,\Vert \tilde{E}_{21}\Vert ) = O(\Vert E\Vert ^4)\) and \(\Vert \tilde{E}_{11}\Vert \le \delta ^{-2}\Vert E\Vert ^3+O(\left\| E\right\| ^4)\), from which it easily follows that \(\Vert {\tilde{E}}\Vert \le \delta ^{-2}\Vert E\Vert ^3+ O(\Vert E\Vert ^4)\). Combining (3.13) with (3.15), this implies that the left hand side of (3.14) is bounded by \((1+4m)\delta ^{-2}\Vert E\Vert ^3+O(\left\| E\right\| ^4)\) which, since \(1\le m\), gives the desired bound.
4 Further Results
We first prove the estimate (1.3) from the introduction, which to the best of our knowledge is new as well. Recall \(E_{11}\) as defined in (3.6).
Corollary 4.1
Let A and E be bounded self-adjoint operators on a separable Hilbert space. If \(\lambda \) is an eigenvalue of A of multiplicity m, then there are eigenvalues \(\xi _1,\dotsc ,\xi _m\) of \(A+E\) such that
\( \xi _j=\lambda +\varepsilon _j+O(\Vert E\Vert ^2),\qquad 1\le j\le m, \)
where \(\varepsilon _j\) are the eigenvalues of \(E_{11}\) (ordered non-increasingly).
Proof
By Weyl’s inequality the eigenvalues \(\beta \) of B from Theorem 2.1 satisfy
\( \max _{1\le j\le m}|\beta _j-(\lambda +\varepsilon _j)|\le \Vert E_{12}(\lambda I-A_{22})^{-1}E_{21}\Vert . \)
The right hand side is \(O(\Vert E\Vert ^2)\), and hence the desired estimate directly follows from Theorem 2.1. \(\square \)
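Corollary 4.1 can likewise be checked numerically. The diagonal matrix below is hypothetical, with \(\lambda =0\) of multiplicity two; the point is only that the crude first order approximation by the eigenvalues of \(E_{11}\) already carries a second order error.

```python
import numpy as np

rng = np.random.default_rng(4)
diag = np.array([0.0, 0.0, 1.0, 2.0, 3.0])   # hypothetical A, lam = 0, m = 2
A = np.diag(diag)
m, n = 2, 5

E = rng.standard_normal((n, n))
E = (E + E.T) / 2
normE = 1e-2
E *= normE / np.linalg.norm(E, 2)

eps = np.sort(np.linalg.eigvalsh(E[:m, :m]))  # eigenvalues of E11
xi = np.sort(np.linalg.eigvalsh(A + E))[:m]   # eigenvalues of A+E near lam = 0

# First order accuracy: xi_j = lam + eps_j + O(||E||^2)
err1 = np.max(np.abs(xi - eps))
assert err1 < 10 * normE**2
```

Comparing with the earlier third order experiment, the error here is visibly larger for the same E, in line with the weaker \(O(\Vert E\Vert ^2)\) rate.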
The main problem with extending the proof of Theorem 2.1 to higher order formulas is that the operator which takes the role of \({\tilde{B}}\) in this setting is no longer self-adjoint. Indeed, (3.9) implies that
and here \(E_{12}A_{22}^{-2}E_{21}E_{11}\) need not be self-adjoint. For these reasons we refrain from pursuing this in the general case. However, if the eigenvalue is simple, i.e. when \(m=1\), the latter issue disappears. In this case we conclude that a generalized form of the eigenvalue approximation found in Lemma 4.5 in Ch. V.4 of [28] holds.
Corollary 4.2
Let A and E be bounded self-adjoint operators on a separable Hilbert space. Assume that \(\lambda \) is an isolated eigenvalue of A of multiplicity 1. Let \(\xi \) be the corresponding eigenvalue of \(A+E\), and denote the number \(E_{11}\) from (3.6) by \(\varepsilon \). Then
as \(\Vert E\Vert \rightarrow 0\).
Proof
As before we assume that \(\lambda = 0\). By (3.9) \(A+E\) is similar to
so the result follows from the same argument as provided in the proof of Theorem 2.1 (with \(m=1\)). \(\square \)
Remark
We should remark that iterating the process described in the proof of Theorem 1.1 k times gives an explicit approximation with an error term of the form \(O(\left\| E\right\| ^{k+1})\), which can be used to extend the above corollary to higher orders. However, we have not observed any particular structure of the approximant that is suitable for a closed formula, and hence we refrain from pursuing this further.
5 An Application: Singular Values
We recall that every \({n_1}\times {n_2}\) matrix A has a singular value decomposition. That is, we can find unitary matrices U and V of size \({n_1}\times {n_1}\) and \({n_2}\times {n_2}\) respectively such that
\( A=U\Sigma V^*, \)
where \(\Sigma \) is an \({n_1}\times {n_2}\) rectangular diagonal matrix with non-negative entries. If E is a given perturbation, then \(A+E\) shares singular values with \(\Sigma +U^*EV\), since these are invariant under multiplication by unitary matrices. We can therefore restrict our study of perturbed singular values to matrices where the fixed term is diagonal and non-negative. We consider the following theorem, which directly generalizes the singular value estimate given as Theorem 4.6 in Chapter V of [28] to the case of singular values of arbitrary multiplicity. To simplify the notation we assume that \({n_1}>{n_2}\), although this is no real restriction since we can always consider the transpose, which shares the singular spectrum. Let \(\varsigma \) be a particular singular value of multiplicity m, and assume that
\( \Sigma =\begin{pmatrix} \varsigma I_m &{}\quad 0\\ 0 &{}\quad \Lambda _{\tau }\\ 0 &{}\quad 0 \end{pmatrix}, \)
where \(\Lambda _{\tau }\) is diagonal and contains the remaining singular values (possibly both larger and smaller than \(\varsigma \)). Consider the perturbation \(\Sigma +E\), where E has the corresponding block representation
\( E=\begin{pmatrix} E_{11} &{}\quad E_{12}\\ E_{21} &{}\quad E_{22}\\ E_{31} &{}\quad E_{32} \end{pmatrix} \)
with \(E_{11}\) of size \(m\times m\), and let \((\sigma _j)_{j=1}^{n_2}\) be the corresponding singular values, ordered non-increasingly. Let \(k(\varsigma )\) denote the number of singular values of \(\Sigma \) larger than \(\varsigma \), counting multiplicity. We then have
Theorem 5.1
Let \(\{\mu _j\}_{j=1}^{m}\) be the non-increasing enumeration of the eigenvalues of
Then
\( \sigma _{k(\varsigma )+j}=\sqrt{\mu _j}+O(\Vert E\Vert ^3),\qquad 1\le j\le m. \)
Proof
The singular values of \(\Sigma +E\) are the square roots of the eigenvalues of \((\Sigma +E)^*(\Sigma +E)\). By expanding the parentheses we obtain
\( (\Sigma +E)^*(\Sigma +E)=\Sigma ^*\Sigma +\Sigma ^*E+E^*\Sigma +E^*E. \)  (5.3)
In order to connect with the notation in the previous sections we note that the perturbation now becomes \(\Sigma ^*E+E^*\Sigma +E^*E\) while the fixed term is \(\Sigma ^*\Sigma \). In block form the summands of (5.3) become
The matrix B (recall (2.2)) thus becomes
and hence Theorem 2.1 implies that the singular values of \(\Sigma +E\) satisfy
\( \sigma _{k(\varsigma )+j}^2=\beta _j+O(\Vert E\Vert ^3),\qquad 1\le j\le m \)  (5.4)
(where \((\beta _j)_{j=1}^m\) are the eigenvalues of B). Note that the matrix M, as defined in (5.1), differs from B by
which clearly is \(O(\Vert E\Vert ^3)\), and therefore Weyl’s perturbation theorem implies that
\( \max _{1\le j\le m}|\mu _j-\beta _j|=O(\Vert E\Vert ^3). \)
The desired result now follows immediately by combining this with (5.4). \(\square \)
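The reduction used in the proof can be illustrated numerically. The rectangular \(\Sigma \) below is hypothetical (singular value 1 of multiplicity two, together with 2 and 3); we apply the second order approximant of Theorem 2.1 to the self-adjoint operator \(\Sigma ^*\Sigma \) with perturbation \(P=\Sigma ^*E+E^*\Sigma +E^*E\), and compare with the squared singular values of \(\Sigma +E\).

```python
import numpy as np

rng = np.random.default_rng(5)
n1, n2 = 5, 4
# Hypothetical Sigma: singular value 1 of multiplicity m = 2, plus 2 and 3
Sigma = np.zeros((n1, n2))
Sigma[:n2, :n2] = np.diag([1.0, 1.0, 2.0, 3.0])
m = 2

E = rng.standard_normal((n1, n2))
normE = 1e-3
E *= normE / np.linalg.norm(E, 2)

# Perturbation of Sigma^T Sigma as in (5.3)
P = Sigma.T @ E + E.T @ Sigma + E.T @ E
A22 = np.diag([4.0, 9.0])          # remaining squared singular values
B = np.eye(m) + P[:m, :m] + P[:m, m:] @ np.linalg.solve(
    np.eye(n2 - m) - A22, P[m:, :m])

beta = np.sort(np.linalg.eigvalsh(B))
# Squared singular values of Sigma+E closest to 1 (the two smallest here)
sq = np.sort(np.linalg.svd(Sigma + E, compute_uv=False) ** 2)[:m]
err = np.max(np.abs(sq - beta))
assert err < 1e-5                  # third order agreement in ||E||
```

Note that \(\Vert P\Vert =O(\Vert E\Vert )\) (with a constant involving \(\Vert \Sigma \Vert \)), so the third order error in P translates into a third order error in E, as in the theorem.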
Notes
Be aware that some authors use “spectral idempotent”, reserving “projection” for “orthogonal projection”.
References
Ahues, M., Largillier, A., Limaye, B.: Spectral Computations for Bounded Operators. CRC Press, Boca Raton (2001)
Axelsson, O., Neytcheva, M.: Eigenvalue estimates for preconditioned saddle point matrices. Numer. Linear Algebra Appl. 13(4), 339–360 (2006)
Baumgärtel, H.: Analytic Perturbation Theory for Matrices and Operators. Birkhäuser Verlag, Basel (1985)
Bhatia, R.: Matrix Analysis, vol. 169. Springer, Berlin (2013)
Carlsson, M.: Perturbation theory for the spectral decomposition of Hermitian matrices. arXiv:1809.09480 (2018)
Carlsson, M., Rubin, O.: A Hilbert space variant of Geršgorin’s Circle Theorem (to appear)
Chatelin, F.: Spectral Approximation of Linear Operators. SIAM, Philadelphia (2011)
Courant, R., Hilbert, D.: Methods of Mathematical Physics. CUP Archive, Cambridge (1955)
Golub, G.H., Van Loan, C.F.: Matrix Computations, vol. 3. JHU Press, Baltimore (2012)
Grace, M.R., Guha, S.: Perturbation theory for quantum information. In: 2022 IEEE Information Theory Workshop (ITW). IEEE, pp. 500–505 (2022)
Higham, N.J.: Functions of Matrices: Theory and Computation, vol. 104. SIAM, Philadelphia (2008)
Hiriart-Urruty, J.-B., Ye, D.: Sensitivity analysis of all eigenvalues of a symmetric matrix. Numer. Math. 70(1), 45–72 (1995)
Hogben, L.: Handbook of Linear Algebra. Chapman and Hall/CRC, Boca Raton (2006)
Horn, R.A., Johnson, C.R.: Matrix Analysis. Cambridge University Press, Cambridge (1990)
Kato, T.: Perturbation Theory for Linear Operators, vol. 132. Springer, Berlin (2013)
Kemble, E.C.: The Fundamental Principles of Quantum Mechanics: With Elementary Applications. McGraw-Hill, New York (1937)
Landau, L.D., Lifshitz, E.M.: Quantum Mechanics: Non-relativistic Theory, vol. 3. Elsevier, Amsterdam (2013)
Lewis, A.S.: Nonsmooth analysis of eigenvalues. Math. Program. 84(1), 1–24 (1999)
Limaye, B.V., et al.: Spectral Perturbation and Approximation with Numerical Experiments. [s.n.] (1987)
Mathias, R.: Spectral perturbation bounds for positive definite matrices. SIAM J. Matrix Anal. Appl. 18(4), 959–980 (1997)
Parlett, B.N.: The Symmetric Eigenvalue Problem, vol. 20. SIAM, Philadelphia (1998)
Reed, M., Simon, B.: Methods of Modern Mathematical Physics IV: Analysis of Operators. Academic Press, Cambridge (1978)
Rellich, F.: Störungstheorie der spektralzerlegung. Mathematische Annalen 113(1), 600–619 (1937)
Rellich, F., Berkowitz, J.: Perturbation Theory of Eigenvalue Problems. CRC Press, Boca Raton (1969)
Schrödinger, E.: Quantisierung als eigenwertproblem. Annalen der physik 385(13), 437–490 (1926)
Sjöstrand, J., Zworski, M.: Elementary linear algebra for advanced spectral problems. In Annales de l’institut Fourier 57, 2095–2141 (2007)
Stewart, G.W.: A second order perturbation expansion for small singular values. Linear Algebra Appl. 56, 231–235 (1984)
Stewart, G.W., Sun, J.: Matrix Perturbation Theory. Elsevier Science, Amsterdam (1990)
Strutt, J.W. (Baron Rayleigh): The Theory of Sound, vol. I. Macmillan, London (1894)
Sun, D., Sun, J.: Strong semismoothness of eigenvalues of symmetric matrices and its application to inverse eigenvalue problems. SIAM J. Numer. Anal. 40(6), 2352–2367 (2002)
Torki, M.: Second-order directional derivatives of all eigenvalues of a symmetric matrix. Nonlinear Anal. 46(8), 1133–1150 (2001)
Weyl, H.: Das asymptotische verteilungsgesetz der eigenwerte linearer partieller differentialgleichungen (mit einer anwendung auf die theorie der hohlraumstrahlung). Mathematische Annalen 71(4), 441–479 (1912)
Wilkinson, J.H.: The Algebraic Eigenvalue Problem, vol. 87. Clarendon Press, Oxford (1965)
Zhang, F.: The Schur Complement and its Applications, vol. 4. Springer, Berlin (2006)
Funding
Open access funding provided by Lund University.
Contributions
Both authors have participated equally in the discovery of these results and the writing of the manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Additional information
Communicated by Eric Weber.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Cite this article
Carlsson, M., Rubin, O. On Perturbation of Operators and Rayleigh-Schrödinger Coefficients. Complex Anal. Oper. Theory 18, 47 (2024). https://doi.org/10.1007/s11785-024-01482-9