# Iterative refinement for symmetric eigenvalue decomposition


## Abstract

An efficient refinement algorithm is proposed for symmetric eigenvalue problems. The structure of the algorithm is straightforward, primarily comprising matrix multiplications. We show that the proposed algorithm converges quadratically if a modestly accurate initial guess is given, including the case of multiple eigenvalues. Our convergence analysis can be extended to Hermitian matrices. Numerical results demonstrate excellent performance of the proposed algorithm in terms of convergence rate and overall computational cost, and show that the proposed algorithm is considerably faster than a standard approach using multiple-precision arithmetic.

## Keywords

Accurate numerical algorithm · Iterative refinement · Symmetric eigenvalue decomposition · Quadratic convergence

## Mathematics Subject Classification

65F15 · 15A18 · 15A23

## 1 Introduction

Let *A* be a real symmetric \(n \times n\) matrix. We are concerned with the standard symmetric eigenvalue problem \(Ax = \lambda x\), where \(\lambda \in \mathbb {R}\) is an eigenvalue of *A* and \(x \in \mathbb {R}^{n}\) is an eigenvector of *A* associated with \(\lambda \). Solving this problem is important because it plays a significant role in scientific computing. For example, highly accurate computations of a few or all eigenvectors are crucial for large-scale electronic structure calculations in material physics [31, 32], in which specific interior eigenvalues with associated eigenvectors need to be computed. Excellent overviews on the symmetric eigenvalue problem can be found in references [25, 30].

Throughout this paper, *I* and *O* denote the identity and the zero matrices of appropriate size, respectively. For matrices, \(\Vert \cdot \Vert _{2}\) and \(\Vert \cdot \Vert _{\mathrm {F}}\) denote the spectral norm and the Frobenius norm, respectively. Unless otherwise specified, \(\Vert \cdot \Vert \) means \(\Vert \cdot \Vert _{2}\). For legibility, if necessary, we distinguish between the approximate quantities and the computed results, e.g., for some quantity \(\alpha \) we write \(\widetilde{\alpha }\) and \(\widehat{\alpha }\) as an approximation of \(\alpha \) and a computed result for \(\alpha \), respectively.

This paper aims to compute an accurate eigenvalue decomposition

\[ A = XDX^{\mathrm {T}}, \tag{1} \]

where \({X}\) is an \(n \times n\) orthogonal matrix whose *i*-th columns are eigenvectors \(x_{(i)}\) of *A* (called an eigenvector matrix) and \({D}\) is an \(n \times n\) diagonal matrix whose diagonal elements are the corresponding eigenvalues \(\lambda _{i} \in \mathbb {R}\), i.e., \(D_{ii} = \lambda _{i}\) for \(i = 1, \ldots , n\). For this purpose, we discuss an iterative refinement algorithm for (1) together with a convergence analysis. Throughout the paper, we assume that

Several efficient numerical algorithms for computing (1) have been developed such as the bisection method with inverse iteration, the QR algorithm, the divide-and-conquer algorithm or the MRRR (multiple relatively robust representations) algorithm via Householder reduction, and the Jacobi algorithm. For details, see [10, 11, 14, 15, 25, 30] and references cited therein. Since such algorithms have been studied actively in numerical linear algebra for decades, there are highly reliable implementations for them, such as LAPACK routines [4]. We stress that we do not intend to compete with such existing algorithms, i.e., we aim to develop an algorithm to improve the results obtained by any of them. Such a refinement algorithm is useful if the quality of the results is not satisfactory. In other words, our proposed algorithm can be regarded as a supplement to existing algorithms for computing (1).

*A*are

There exist several refinement algorithms for eigenvalue problems that are based on Newton’s method for nonlinear equations (cf. e.g., [3, 7, 12, 29]). Since this sort of algorithm is designed to improve eigenpairs \((\lambda , x) \in \mathbb {R}\times \mathbb {R}^{n}\) individually, applying such a method to all eigenpairs requires \(\mathcal {O}(n^{4})\) arithmetic operations. To reduce the computational cost, one may consider preconditioning by Householder reduction of *A* in ordinary floating-point arithmetic such as \(T \approx \widehat{H}^{\mathrm {T}}A\widehat{H}\), where *T* is a tridiagonal matrix, and \(\widehat{H}\) is an approximate orthogonal matrix involving rounding errors. However, this is not a similarity transformation; thus, the original problem is slightly perturbed. Assuming \(\widehat{H} = H + \varDelta _{H}\) for some orthogonal matrix *H* with \(\epsilon _{H} = \Vert \varDelta _{H}\Vert \ll 1\), we have \(\widetilde{T} = \widehat{H}^{\mathrm {T}}A\widehat{H} = H^{\mathrm {T}}(A + \varDelta _{A})H\) with \(\Vert \varDelta _{A}\Vert = \mathcal {O}(\Vert A\Vert \epsilon _{H})\) and \(\epsilon _{H} \approx \Vert I - \widehat{H}^{\mathrm {T}}\widehat{H}\Vert \), where \(\varDelta _{A}\) can be considered a backward error, and the accuracy of eigenpairs using \(\widehat{H}\) is limited by its lack of orthogonality.
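The loss of orthogonality discussed above can be observed directly. The following is a minimal sketch, not part of the original experiments: it forms a small Householder reflector in binary64 and then evaluates \(I - \widehat{H}^{\mathrm {T}}\widehat{H}\) exactly in rational arithmetic. The helper names `householder` and `orthogonality_defect` are hypothetical, and the largest entry in absolute value is used only as a rough proxy for the spectral norm \(\Vert I - \widehat{H}^{\mathrm {T}}\widehat{H}\Vert \).

```python
from fractions import Fraction

def householder(v):
    """Householder reflector H = I - 2 v v^T / (v^T v), formed in binary64."""
    n = len(v)
    vtv = sum(x * x for x in v)
    return [[(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / vtv
             for j in range(n)] for i in range(n)]

def orthogonality_defect(H):
    """Largest entry of |I - H^T H|, evaluated exactly with rationals."""
    n = len(H)
    Hq = [[Fraction(x) for x in row] for row in H]  # binary64 -> exact rational
    worst = Fraction(0)
    for i in range(n):
        for j in range(n):
            gram = sum(Hq[k][i] * Hq[k][j] for k in range(n))
            worst = max(worst, abs((1 if i == j else 0) - gram))
    return float(worst)

H_hat = householder([1.0, 2.0, 3.0])
eps_H = orthogonality_defect(H_hat)
print(f"max |I - H^T H| entry: {eps_H:.3e}")
```

The defect is tiny but nonzero, so \(\widehat{H}^{\mathrm {T}}A\widehat{H}\) is only approximately a similarity transformation of *A*.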

A possible approach to achieve an accurate eigenvalue decomposition is to use multiple-precision arithmetic for Householder reduction and the subsequent algorithm. In general, however, we do not know in advance how much arithmetic precision is sufficient to obtain results with desired accuracy. Moreover, the use of such multiple-precision arithmetic for entire computations is often much more time-consuming than ordinary floating-point arithmetic such as IEEE 754 binary64. Therefore, we prefer the iterative refinement approach rather than simply using multiple-precision arithmetic.

Simultaneous iteration or Grassmann–Rayleigh quotient iteration [1] can potentially be used to refine eigenvalue decompositions. However, such methods require higher-precision arithmetic for the orthogonalization of approximate eigenvectors. Thus, we cannot restrict the higher-precision arithmetic to matrix multiplication. Wilkinson [30, Chapter 9, pp.637–647] explained the refinement of eigenvalue decompositions for general square matrices with reference to Jahn’s method [6, 19]. Such methods rely on a similarity transformation \(C := \widehat{X}^{-1}A\widehat{X}\) with high accuracy for a computed result \(\widehat{X}\) for \({X}\), which requires an accurate solution of the linear system \(\widehat{X}C = A\widehat{X}\) for *C*, and slightly breaks the symmetry of *A* due to nonorthogonality of \(\widehat{X}\). Davies and Modi [9] proposed a direct method to complete the symmetric eigenvalue decomposition of nearly diagonal matrices. The Davies–Modi algorithm assumes that *A* is preconditioned to a nearly diagonal matrix such as \(\widehat{X}^{\mathrm {T}}A\widehat{X}\), where \(\widehat{X}\) is an approximate orthogonal matrix involving rounding errors. Again, as mentioned above, this is not a similarity transformation, and a problem similar to the preconditioning by Householder reduction remains unsolved, i.e., the Davies–Modi algorithm does not refine the nonorthogonality of \(\widehat{X}\). See Appendix for details regarding the relationship between the Davies–Modi algorithm and ours.

Given this background, we try to derive a simple and efficient iterative refinement algorithm to simultaneously improve the accuracy of all eigenvectors with quadratic convergence, which requires \(\mathcal {O}(n^{3})\) operations for each iteration. The proposed algorithm can be regarded as a variant of Newton’s method, and therefore, its quadratic convergence is naturally derived.

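Our approach seeks a correction matrix *E* such that \(X = \widehat{X}(I + E)\). The existence of such an *E* requires that \(\widehat{X}\) is nonsingular, which is usually satisfied in practice. We assume that \(\widehat{X}\) is modestly accurate, e.g., it is obtained by some backward stable algorithm. Then, we aim to compute a sufficiently precise approximation \(\widetilde{E}\) of *E* using the following two relations: the orthogonality \(X^{\mathrm {T}}X = I\) and the diagonality \(X^{\mathrm {T}}AX = D\).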

The main benefit of our algorithm is its adaptivity, i.e., it allows the adaptive use of high-precision arithmetic until the desired accuracy of the computed results is achieved. In other words, the use of high-precision arithmetic is mandatory; however, it is primarily restricted to matrix multiplication, which accounts for most of the computational cost. Note that a multiple-precision arithmetic library, such as MPFR [21] with GMP [13], could be used for this purpose. With approaches such as quad-double precision arithmetic [16] and arbitrary precision arithmetic [26], multiple-precision arithmetic can be simulated using ordinary floating-point arithmetic. There are more specific approaches for high-precision matrix multiplication. For example, XBLAS (extra precise BLAS) [20] and other accurate and efficient algorithms for dot products [22, 27, 28] and matrix products [24] based on error-free transformations are available for practical implementation.
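To make the idea of error-free transformations concrete, here is a minimal sketch of a compensated dot product built from the transformations commonly known as TwoSum and TwoProduct (the latter using Veltkamp splitting). It illustrates the general technique only and is not the implementation used in this paper; the function names are illustrative, and the sketch assumes binary64 arithmetic without overflow or underflow.

```python
def two_sum(a, b):
    """Return (s, e) with s = fl(a + b) and a + b = s + e exactly."""
    s = a + b
    z = s - a
    e = (a - (s - z)) + (b - z)
    return s, e

def split(a):
    """Veltkamp splitting of a binary64 number into high and low parts."""
    c = 134217729.0 * a  # factor 2**27 + 1
    hi = c - (c - a)
    return hi, a - hi

def two_product(a, b):
    """Return (p, e) with p = fl(a * b) and a * b = p + e exactly."""
    p = a * b
    a_hi, a_lo = split(a)
    b_hi, b_lo = split(b)
    e = a_lo * b_lo - (((p - a_hi * b_hi) - a_lo * b_hi) - a_hi * b_lo)
    return p, e

def dot2(x, y):
    """Dot product accumulated together with its rounding errors."""
    p, s = 0.0, 0.0
    for xi, yi in zip(x, y):
        h, r = two_product(xi, yi)
        p, q = two_sum(p, h)
        s += q + r  # collect the rounding errors of product and sum
    return p + s

x = [1e16, 1.0, -1e16]
y = [1.0, 1.0, 1.0]
naive = 0.0
for xi, yi in zip(x, y):  # plain recursive summation in binary64
    naive += xi * yi
accurate = dot2(x, y)
print(naive, accurate)  # 0.0 1.0
```

The plain loop loses the contribution of the middle term to cancellation, whereas the compensated version recovers the exact result 1.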

The remainder of the paper is organized as follows. In Sect. 2, we present a refinement algorithm for the symmetric eigenvalue decomposition. In Sect. 3, we provide a convergence analysis of the proposed algorithm. In Sect. 4, we present some numerical results showing the behavior and performance of the proposed algorithm. Finally, we conclude the paper in Sect. 5. To put our results into context, we review existing work in the Appendix.

For simplicity, we mainly treat real matrices. The discussions in this paper can be extended to Hermitian matrices. Moreover, the discussion of the standard symmetric eigenvalue problem can readily be extended to the generalized symmetric (or Hermitian) definite eigenvalue problem \(Ax = \lambda Bx\), where *A* and *B* are real symmetric (or Hermitian) with *B* being positive definite.

## 2 Proposed algorithm

Let \(A = A^{\mathrm {T}} \in \mathbb {R}^{n \times n}\). The eigenvalues of *A* are denoted by \(\lambda _{i} \in \mathbb {R}\), \(i = 1, \ldots , n\). Then \(\Vert A\Vert = \max _{1 \le i \le n}|\lambda _{i}|\). Let \({X} \in \mathbb {R}^{n \times n}\) denote an orthogonal eigenvector matrix comprising normalized eigenvectors of *A*, and let \(\widehat{X}\) denote an approximation of \({X}\) computed by some numerical algorithm.

*E*. Here, we assume that

*A* such that \({X}^{\mathrm {T}}A{X} = {D}\). From this, we obtain \({D} = {X}^{\mathrm {T}}A{X} = (I + E)^{\mathrm {T}}\widehat{X}^{\mathrm {T}}A\widehat{X}(I + E)\) and

*i*-th column of \(\widehat{X}\).

(*i*, *j*) satisfying \(\widetilde{\lambda }_{i} \ne \widetilde{\lambda }_{j}\), the linear systems have unique solutions

*i*-th and *j*-th columns of \(\widehat{X}\), respectively. Moreover, the accuracy of \(\widehat{x}_{(i)}\) and \(\widehat{x}_{(j)}\) is improved as shown in Sect. 3.2.

*A* in Algorithm 1, which is designed to be applied iteratively.
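As an illustration, one refinement step can be sketched from the relations quoted in this paper: \(R = I - \widehat{X}^{\mathrm {T}}\widehat{X}\), \(S = \widehat{X}^{\mathrm {T}}A\widehat{X}\), \(\widetilde{E} + \widetilde{E}^{\mathrm {T}} = R\), and \(\widetilde{e}_{ij} = \widetilde{e}_{ji} = r_{ij}/2\) for clustered eigenvalue estimates. Dropping second-order terms in \(D = (I + E)^{\mathrm {T}}S(I + E)\) suggests \(\widetilde{D} - \widetilde{D}\widetilde{E} - \widetilde{E}^{\mathrm {T}}\widetilde{D} = S\), which gives \(\widetilde{\lambda }_{i} = s_{ii}/(1 - r_{ii})\) and \(\widetilde{e}_{ij} = (s_{ij} + \widetilde{\lambda }_{j}r_{ij})/(\widetilde{\lambda }_{j} - \widetilde{\lambda }_{i})\) for \(i \ne j\). The sketch below rests on several assumptions that are not confirmed by the text above: the threshold \(\delta = 2(\Vert S - \widetilde{D}\Vert + \Vert A\Vert \Vert R\Vert )\), the use of Frobenius norms, the approximation of \(\Vert A\Vert \) by \(\max _{i}|\widetilde{\lambda }_{i}|\), and all function names. Exact rational arithmetic stands in for high-precision arithmetic.

```python
from fractions import Fraction
from math import sqrt

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    return [list(col) for col in zip(*P)]

def fro(P):
    """Frobenius norm as a float (the input may hold exact rationals)."""
    return sqrt(float(sum(x * x for row in P for x in row)))

def residual_R(X):
    """R = I - X^T X, the departure of X from orthogonality."""
    n = len(X)
    G = matmul(transpose(X), X)
    return [[(1 if i == j else 0) - G[i][j] for j in range(n)] for i in range(n)]

def refine_step(A, X):
    """One step X' = X (I + E~); returns (X', eigenvalue estimates)."""
    n = len(A)
    R = residual_R(X)
    S = matmul(transpose(X), matmul(A, X))
    lam = [S[i][i] / (1 - R[i][i]) for i in range(n)]
    # ASSUMPTION: delta = 2 (||S - D~|| + ||A|| ||R||) with Frobenius norms and
    # ||A|| approximated by max |lam_i|; this formula is not given in the text.
    S_minus_D = [[S[i][j] - (lam[i] if i == j else 0) for j in range(n)]
                 for i in range(n)]
    delta = 2 * (fro(S_minus_D) + float(max(abs(v) for v in lam)) * fro(R))
    # Default r_ij / 2 covers the diagonal and clustered eigenvalue estimates.
    E = [[R[i][j] / 2 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and abs(lam[i] - lam[j]) > delta:
                E[i][j] = (S[i][j] + lam[j] * R[i][j]) / (lam[j] - lam[i])
    XE = matmul(X, E)
    return [[X[i][j] + XE[i][j] for j in range(n)] for i in range(n)], lam

A = [[Fraction(v) for v in row] for row in [[2, 1, 0], [1, 3, 1], [0, 1, 4]]]
# Two-digit approximation of the eigenvectors of A
# (its eigenvalues are 3 - sqrt(3), 3, and 3 + sqrt(3)).
X = [[Fraction(v, 100) for v in row]
     for row in [[79, 58, 21], [-58, 58, 58], [21, -58, 79]]]
defects = [fro(residual_R(X))]
for _ in range(3):
    X, lam = refine_step(A, X)
    defects.append(fro(residual_R(X)))
print(defects)
```

In this small example the orthogonality defect should shrink roughly quadratically from step to step, in line with the behavior analyzed in Sect. 3.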

### Remark 1

In the practical computation of \(\delta \) at line 6 of Algorithm 1, one may prefer the Frobenius norm to the spectral norm, since the former is much easier to compute. For any real \(n \times n\) matrix *C*, it is known (cf., e.g., [14, p. 72]) that \(\Vert C\Vert _{2} \le \Vert C\Vert _{\mathrm {F}} \le \sqrt{n}\Vert C\Vert _{2}\). Thus, using the Frobenius norm may overestimate \(\delta \) and thereby affect the behavior of the algorithm. \(\square \)
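The two-sided bound in Remark 1 can be checked numerically. The sketch below is illustrative only: `spectral_norm` is a hypothetical helper that estimates the largest singular value by power iteration on \(C^{\mathrm {T}}C\), which is adequate for small examples but is not how a production routine would compute it. The identity matrix attains the upper bound and a rank-one matrix attains the lower bound.

```python
from math import sqrt

def frobenius_norm(C):
    return sqrt(sum(x * x for row in C for x in row))

def spectral_norm(C, iters=200):
    """Largest singular value via power iteration on C^T C (simple sketch)."""
    m, n = len(C), len(C[0])
    v = [1.0 / (k + 1) for k in range(n)]
    for _ in range(iters):
        w = [sum(C[i][j] * v[j] for j in range(n)) for i in range(m)]      # C v
        u = [sum(C[i][j] * w[i] for i in range(m)) for j in range(n)]      # C^T (C v)
        norm_u = sqrt(sum(x * x for x in u))
        v = [x / norm_u for x in u]
    w = [sum(C[i][j] * v[j] for j in range(n)) for i in range(m)]
    return sqrt(sum(x * x for x in w))  # ||C v|| for the unit vector v

C = [[4.0, 1.0], [2.0, 3.0]]
identity3 = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
ones3 = [[1.0] * 3 for _ in range(3)]

for name, M in [("C", C), ("identity", identity3), ("ones", ones3)]:
    print(name, spectral_norm(M), frobenius_norm(M))
```

For the identity matrix the Frobenius norm exceeds the spectral norm by the full factor \(\sqrt{n}\), which indicates how large the overestimate can become.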

### Remark 2

For the generalized symmetric definite eigenvalue problem \(Ax = \lambda Bx\) where *A* and *B* are real symmetric with *B* being positive definite, a similar algorithm can readily be derived by replacing line 2 in Algorithm 1 with \(R \leftarrow I - \widehat{X}^{\mathrm {T}}B\widehat{X}\). \(\square \)

## 3 Convergence analysis

In this section, we prove quadratic convergence of Algorithm 1 under the assumption that the approximate solutions are modestly close to the exact solutions. Our analysis is divided into two parts. First, if we assume that *A* does not have multiple eigenvalues, then quadratic convergence is proven. Next, we consider a general analysis for any *A*.

*E* and \(\widetilde{E}\), let \(\epsilon \) be defined as in (3) and

### 3.1 Simple eigenvalues

We focus on the situation where the eigenvalues of *A* are all simple and a given \(\widehat{X}\) is sufficiently close to an orthogonal eigenvector matrix *X*. First, we derive a sufficient condition that (17) is chosen for all (*i*, *j*), \(i \ne j\) in Algorithm 1.

### Lemma 1

Let *A* be a real symmetric \(n \times n\) matrix with simple eigenvalues \(\lambda _{i}\), \(i = 1, 2, \ldots , n\) and a corresponding orthogonal eigenvector matrix \(X \in \mathbb {R}^{n \times n}\). For a given nonsingular \(\widehat{X} \in \mathbb {R}^{n\times n}\), suppose that Algorithm 1 is applied to *A* and \(\widehat{X}\) in exact arithmetic, and \(\widetilde{D} = \mathrm {diag}(\widetilde{\lambda }_{i})\), *R*, *S*, and \(\delta \) are the quantities calculated in Algorithm 1. Define *E* such that \(X=\widehat{X}(I+E)\). If

### Proof

(*i*, *j*), \(i \ne j\),

The assumption (29) is crucial for the first iteration in the iterative process (20). In the following, monotone convergence of \(\Vert E^{(\nu )}\Vert \) is proven under the assumption (29) for a given initial guess \(\widehat{X} = X^{(0)}\) and \(E = E^{(0)}\), so that \(\Vert E^{(\nu + 1)}\Vert < \Vert E^{(\nu )}\Vert \) for \(\nu = 0, 1, \ldots \) . Thus, in the iterative refinement using Algorithm 1, Lemma 1 ensures that the condition \(|\widetilde{\lambda }_{i}^{(\nu )} - \widetilde{\lambda }_{j}^{(\nu )}| > \delta ^{(\nu )}\) for all (*i*, *j*), \(i \ne j\), as in (30), is satisfied for \(X^{(\nu )}\) at every step of the iterative process. In addition, recall that our aim is to prove the quadratic convergence in the asymptotic regime. To this end, we derive a key lemma that shows (25).

### Lemma 2

Let *A* be a real symmetric \(n \times n\) matrix with simple eigenvalues \(\lambda _{i}\), \(i = 1, 2, \ldots , n\) and a corresponding orthogonal eigenvector matrix \(X \in \mathbb {R}^{n \times n}\). For a given nonsingular \(\widehat{X} \in \mathbb {R}^{n\times n}\), suppose that Algorithm 1 is applied to *A* and \(\widehat{X}\) in exact arithmetic, and \(\widetilde{E}\) is the quantity calculated in Algorithm 1. Define *E* such that \(X=\widehat{X}(I+E)\). Under the assumption (29) in Lemma 1, we have

### Proof

(*i*, *j*) elements of \(\widetilde{\varDelta }_{2}\). In addition, from (22), it follows that

Using the above lemmas, we obtain a main theorem that states the quadratic convergence of Algorithm 1 if all eigenvalues are simple and a given \(\widehat{X}\) is sufficiently close to *X*.

### Theorem 1

Let *A* be a real symmetric \(n \times n\) matrix with simple eigenvalues \(\lambda _{i}\), \(i = 1, 2, \ldots , n\) and a corresponding orthogonal eigenvector matrix \(X \in \mathbb {R}^{n\times n}\). For a given nonsingular \(\widehat{X} \in \mathbb {R}^{n\times n}\), suppose that Algorithm 1 is applied to *A* and \(\widehat{X}\) in exact arithmetic, and \(X'\) is the quantity calculated in Algorithm 1. Define *E* and \(E'\) such that \(X=\widehat{X}(I+E)\) and \(X=X'(I+E')\), respectively. Under the assumption (29) in Lemma 1, we have

### Proof

Our analysis indicates that Algorithm 1 may not be convergent for very large *n*. However, in practice, *n* is much smaller than \(1/\epsilon \) for \(\epsilon :=\Vert E\Vert \) when the initial guess \(\widehat{X}\) is computed by some backward stable algorithm, e.g., in IEEE 754 binary64 arithmetic, unless *A* has nearly multiple eigenvalues. In such a situation, the iterative refinement works well.

### Remark 3

### 3.2 Multiple eigenvalues

Multiple eigenvalues require some care. If \(\widetilde{\lambda }_{i}\approx \widetilde{\lambda }_{j}\) corresponding to multiple eigenvalues \(\lambda _{i}=\lambda _{j}\), we might not be able to solve the linear system given by (21) and (22). Therefore, we use equation (21) only, i.e., \(\widetilde{e}_{ij}=\widetilde{e}_{ji}={r}_{ij}/2\) if \(|\widetilde{\lambda }_{i}-\widetilde{\lambda }_{j}|\le \delta \).

To investigate the above exceptional process, let us consider a simple case as follows. Suppose \(\lambda _{i}\), \(i \in \mathcal {M} := \{1, 2, \ldots , p\}\) are multiple, i.e., \(\lambda _{1}=\cdots =\lambda _{p}<\lambda _{p+1}<\cdots < \lambda _{n}\). Then, the eigenvectors corresponding to \(\lambda _{i}\), \(1\le i \le p\) are not unique. Suppose \(X = [x_{(1)},\ldots ,x_{(n)}] \in \mathbb {R}^{n\times n}\) is an orthogonal eigenvector matrix of *A*, where \(x_{(i)}\) are the normalized eigenvectors corresponding to \(\lambda _{i}\) for \(i = 1, \ldots , n\). Define \(X_{\mathcal {M}} := [x_{(1)},\ldots ,x_{(p)}] \in \mathbb {R}^{n\times p}\) and \(X_{\mathcal {S}} := [x_{(p+1)},\ldots ,x_{(n)}] \in \mathbb {R}^{n\times (n - p)}\). Then, the columns of \(X_{\mathcal {M}}Q\) are also the eigenvectors of *A* for any orthogonal matrix \(Q \in \mathbb {R}^{p\times p}\). Thus, let \(\mathcal {V}\) be the set of \(n \times n\) orthogonal eigenvector matrices of *A* and \(\mathcal {E}:= \{ \widehat{X}^{-1}X-I : X \in \mathcal {V}\}\) for a given nonsingular \(\widehat{X}\).
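The non-uniqueness described above is easy to observe on a small example. The sketch below uses an illustrative \(3 \times 3\) matrix with eigenvalues 1, 1, 4 (not taken from this paper) and rotates an orthonormal basis of the two-dimensional eigenspace by an arbitrary orthogonal *Q*; the rotated columns remain orthonormal eigenvectors. The helper names and the rotation angle are arbitrary choices.

```python
from math import cos, sin, sqrt

A = [[2.0, 1.0, 1.0], [1.0, 2.0, 1.0], [1.0, 1.0, 2.0]]  # eigenvalues 1, 1, 4

# Orthonormal basis of the eigenspace of the double eigenvalue 1 (columns of X_M).
x1 = [1 / sqrt(2), -1 / sqrt(2), 0.0]
x2 = [1 / sqrt(6), 1 / sqrt(6), -2 / sqrt(6)]

def rotate(u, v, theta):
    """Columns of [u, v] Q, where Q is the 2x2 rotation by angle theta."""
    c, s = cos(theta), sin(theta)
    y1 = [c * a + s * b for a, b in zip(u, v)]
    y2 = [-s * a + c * b for a, b in zip(u, v)]
    return y1, y2

def residual(A, x, lam):
    """Max-norm of A x - lam x."""
    n = len(x)
    return max(abs(sum(A[i][j] * x[j] for j in range(n)) - lam * x[i])
               for i in range(n))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

y1, y2 = rotate(x1, x2, 0.7)
print(residual(A, y1, 1.0), residual(A, y2, 1.0), dot(y1, y2))
```

Because every such rotation yields a valid eigenvector matrix, the error of \(\widehat{X}\) has to be measured against a suitably chosen member of \(\mathcal {V}\), which motivates the construction that follows.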

*p* rows \(V_{\alpha } \in \mathbb {R}^{p \times p}\) and the remaining \((n - p)\) rows \(W_{\alpha } \in \mathbb {R}^{(n - p) \times p}\), we have

*C* is independent of *Q*. In other words, we have

*Q*, the last equality in (53) is due to the polar decomposition \(V_{\alpha }{Q}= C(Q_{\alpha }Q)\). Hence, we have an eigenvector matrix

*Q*. Thus, we define the unique matrix \(Y := [X_{\mathcal {M}}Q_{\alpha }^{\mathrm {T}}, X_{\mathcal {S}}]\) for all matrices in \(\mathcal {V}\), where

*Y* depends only on \(\widehat{X}\). Then, the corresponding error term \(F = (f_{ij})\) is uniquely determined as

(*i*, *j*) elements of \(\varDelta _{1}\) for all (*i*, *j*). Now, let us consider the situation where \(\widehat{X}\) is an exact eigenvector matrix. In (52), noting \(\widehat{X}^{-1}=\widehat{X}^{\mathrm {T}}\), we have \(W_{\alpha }=O\) and \(C=I\) in the polar decomposition of \(V_{\alpha }\). Combining the features with (54), we see \(F=O\) for the exact eigenvector matrix \(\widehat{X}\).

Our aim is to prove \(\Vert F\Vert \rightarrow 0\) in the iterative refinement for \(\widehat{X}\approx Y \in \mathcal {V}\), where *Y* depends on \(\widehat{X}\). To this end, for the refined \(X'\) as a result of Algorithm 1, we also define an eigenvector matrix \(Y'\in \mathcal {V}\) and \(F':=(X')^{-1}Y'-I\) such that the submatrices of \((X')^{-1}Y'\) corresponding to the multiple eigenvalues are symmetric and positive definite. Note that the eigenvector matrix *Y* is changed to \(Y'\) corresponding to \(X'\) after the refinement.

### Lemma 3

Let *A* be a real symmetric \(n \times n\) matrix with the eigenvalues \(\lambda _{i}\), \(i = 1, 2, \ldots , n\). Suppose *A* has multiple eigenvalues with index sets \(\mathcal {M}_{k}\), \(k = 1, 2, \ldots , M\), satisfying (56). Let \(\mathcal {V}\) be the set of \(n \times n\) orthogonal eigenvector matrices of *A*. For a given nonsingular \(\widehat{X} \in \mathbb {R}^{n\times n}\), define \(\mathcal {E}\) as

*k*, the \(n_{k}\times n_{k}\) submatrices of \(\widehat{X}^{-1}Y\) corresponding to \(\{\lambda _{i}\}_{i \in \mathcal {M}_{k}}\) are symmetric and positive semidefinite. Moreover, define \(F \in \mathcal {E}\) such that \(Y=\widehat{X}(I+F)\). Then, for any \(E_{\alpha } \in \mathcal {E}\),

*Y* such that \(\Vert F\Vert <1\), then \(Y\in \mathcal {V}\) is uniquely determined.

### Proof

*H* is a symmetric and positive semidefinite matrix and \(U_{\alpha }\) is an orthogonal matrix. Note that, similarly to *C* in (52), *H* is unique and independent of the choice of \(X_{\alpha } \in \mathcal {V}\) that satisfies \(X_{\alpha }=\widehat{X}(I+E_{\alpha })\), whereas \(U_{\alpha }\) is not always uniquely determined. Then, we have

*Y* and *F*, (60), (57), and (59), respectively. Here, we see that

*H* are the singular values of \(HU_{\alpha }\) in (59) that range over the interval \([1-\Vert E_{\mathrm {diag}}\Vert , 1+\Vert E_{\mathrm {diag}}\Vert ]\) from Weyl’s inequality for singular values. In addition, note that

Finally, we prove that *Y* is unique if \(\Vert F\Vert <1\). In the above discussion, if \(X_{\alpha }\) is replaced with some \(Y \in \mathcal {V}\), then \(E_{\alpha }=F\), and thus \(\Vert E_{\mathrm {diag}}\Vert \le \Vert E_{\alpha }\Vert = \Vert F\Vert <1\) in (59). Therefore, \(U_{\alpha }=I\) in (59) due to the uniqueness of the polar decomposition of the nonsingular matrix \(I+E_\mathrm{diag}\), which implies the uniqueness of *Y* from (60). \(\square \)

Moreover, we have the next lemma, corresponding to Lemmas 1 and 2.

### Lemma 4

Let *A* be a real symmetric \(n \times n\) matrix with the eigenvalues \(\lambda _{i}\), \(i = 1, 2, \ldots , n\). Suppose *A* has multiple eigenvalues with index sets \(\mathcal {M}_{k}\), \(k = 1, 2, \ldots , M\), satisfying (56). For a given nonsingular \(\widehat{X} \in \mathbb {R}^{n\times n}\), suppose that Algorithm 1 is applied to *A* and \(\widehat{X}\) in exact arithmetic, and \(\widetilde{D} = \mathrm {diag}(\widetilde{\lambda }_{i})\), \(\widetilde{E}\), and \(\delta \) are the quantities calculated in Algorithm 1. Let *F* be defined as in Lemma 3. Assume that

### Proof

(*i*, *j*) corresponding to \(\lambda _{i}\not =\lambda _{j}\),

On the basis of Lemmas 3 and 4 and Theorem 1, we see the quadratic convergence for a real symmetric matrix *A* that has multiple eigenvalues.

### Theorem 2

Let *A* be a real symmetric \(n \times n\) matrix with the eigenvalues \(\lambda _{i}\), \(i = 1, 2, \ldots , n\). Suppose *A* has multiple eigenvalues with index sets \(\mathcal {M}_{k}\), \(k = 1, 2, \ldots , M\), satisfying (56). Let \(\mathcal {V}\) be the set of \(n \times n\) orthogonal eigenvector matrices of *A*. For a given nonsingular \(\widehat{X} \in \mathbb {R}^{n\times n}\), suppose that Algorithm 1 is applied to *A* and \(\widehat{X}\) in exact arithmetic, and \(X'\) and \(\delta \) are the quantities calculated in Algorithm 1. Let \(Y, Y' \in \mathcal {V}\) be defined such that, for all *k*, the \(n_{k}\times n_{k}\) submatrices of \(\widehat{X}^{-1}Y\) and \((X')^{-1}Y'\) corresponding to \(\{\lambda _{i}\}_{i \in \mathcal {M}_{k}}\) are symmetric and positive definite. Define *F* and \(F'\) such that \(Y=\widehat{X}(I+F)\) and \(Y'=X'(I+F')\), respectively. Furthermore, suppose that (64) in Lemma 4 is satisfied for \(\epsilon _{F} := \Vert F\Vert \). Then, we obtain

### Proof

In the iterative refinement, Theorem 2 shows that the error term \(\Vert F\Vert \) is quadratically convergent to zero. Note that \(\widehat{X}\) is also convergent to some fixed eigenvector matrix *X* because Theorem 2 and (65) imply \(\Vert \widetilde{E}\Vert /\Vert F\Vert \rightarrow 1\) as \(\Vert F\Vert \rightarrow 0\) in \(X':=\widehat{X}(I+\widetilde{E})\), where \(\Vert F\Vert \) is quadratically convergent to zero.

### 3.3 Complex case

For a Hermitian matrix \(A\in \mathbb {C}^{n \times n}\), we must note that, for any unitary diagonal matrix *U*, *XU* is also an eigenvector matrix, i.e., there is a continuum of normalized eigenvector matrices in contrast to the real case. Related to this, note that \(R:=I-\widehat{X}^{\mathrm {H}}\widehat{X}\) and \(S:=\widehat{X}^{\mathrm {H}}A\widehat{X}\), and (14) is replaced with \(\widetilde{E}+\widetilde{E}^{\mathrm {H}}=R\) in the complex case; thus, the diagonal elements \(\widetilde{e}_{ii}\) for \(i=1,\ldots ,n\) are not uniquely determined in \(\mathbb {C}\). Now, select \(\widetilde{e}_{ii}=r_{ii}/2\in \mathbb {R}\) for \(i=1,\ldots ,n\). Then, we can prove quadratic convergence using the polar decomposition in the same way as in the discussion of multiple eigenvalues in the real case. More precisely, we define a normalized eigenvector matrix *Y* as follows. First, we focus on the situation where all eigenvalues are simple. For a given nonsingular \(\widehat{X}\), let *Y* be defined such that all diagonal elements of \(\widehat{X}^{-1}Y\) are positive real numbers. In addition, let \(F:=\widehat{X}^{-1}Y-I\). Then, we see the quadratic convergence of *F* in the same way as in Theorem 2.

### Corollary 1

*A*. For a given nonsingular \(\widehat{X} \in \mathbb {C}^{n\times n}\), suppose that Algorithm 1 is applied to *A* and \(\widehat{X}\), and a nonsingular \(X'\) is obtained. Define \(Y,Y' \in \mathcal {V}\) such that all the diagonal elements of \(\widehat{X}^{-1}Y\) and \((X')^{-1}Y'\) are positive real numbers. Furthermore, define *F* and \(F'\) such that \(Y=\widehat{X}(I+F)\) and \(Y'=X'(I+F')\), respectively. If

For a general Hermitian matrix having multiple eigenvalues, we define *Y* in the same manner as in Theorem 2, resulting in the following corollary.

### Corollary 2

*A* has multiple eigenvalues with index sets \(\mathcal {M}_{k}\), \(k = 1, 2, \ldots , M\), satisfying (56). For a given nonsingular \(\widehat{X} \in \mathbb {C}^{n\times n}\), let *Y*, \(Y'\), *F*, \(F'\), and \(\delta \) be defined as in Corollary 1. Suppose that, for all *k*, the \(n_{k}\times n_{k}\) submatrices of \(\widehat{X}^{-1}Y\) and \((X')^{-1}Y'\) corresponding to \(\{\lambda _{i}\}_{i \in \mathcal {M}_{k}}\) are Hermitian and positive definite. Furthermore, suppose that (70) is satisfied. Then, we obtain

Note that \(\widehat{X}\) in Corollaries 1 and 2 is convergent to some fixed eigenvector matrix *X* of *A* in the same manner as in the real case.

## 4 Numerical results

Here, we present numerical results to demonstrate the effectiveness of the proposed algorithm (Algorithm 1). The numerical experiments discussed in this section were conducted using MATLAB R2016b on a PC with two CPUs (3.0 GHz Intel Xeon E5-2687W v4) and 1 TB of main memory. Let \(\mathbf {u}\) denote the relative rounding error unit (\(\mathbf {u}= 2^{-24}\) for IEEE binary32, and \(\mathbf {u}= 2^{-53}\) for binary64). To realize multiple-precision arithmetic, we adopt Advanpix Multiprecision Computing Toolbox version 4.2.3 [2], which utilizes well-known, fast, and reliable multiple-precision arithmetic libraries including GMP and MPFR. In all cases, we use the MATLAB function `norm` for computing the spectral norms \(\Vert R\Vert \) and \(\Vert S - \widetilde{D}\Vert \) in Algorithm 1 in binary64 arithmetic, and approximate \(\Vert A\Vert \) by \(\max _{1 \le i \le n}|\widetilde{\lambda }_{i}|\). We ran the numerical experiments with several dozen seeds for the random number generator, and all results were similar to those provided in this section. Therefore, we adopt the default seed as a typical example, using the MATLAB command `rng('default')` for reproducibility.

### 4.1 Convergence property

Here, we confirm the convergence property of the proposed algorithm for various eigenvalue distributions.

#### 4.1.1 Various eigenvalue distributions

*A* can be controlled by the input arguments \(\mathtt {mode} \in \{1,2,3,4,5\}\) and \(\mathtt {cnd} =: \alpha \ge 1\), as follows:

1. one large: \(\lambda _{1} \approx 1\), \(\lambda _{i} \approx \alpha ^{-1}\), \(i = 2,\ldots ,n\)
2. one small: \(\lambda _{n} \approx \alpha ^{-1}\), \(\lambda _{i} \approx 1\), \(i = 1,\ldots ,n-1\)
3. geometrically distributed: \(\lambda _{i} \approx \alpha ^{-(i - 1)/(n - 1)}\), \(i = 1,\ldots ,n\)
4. arithmetically distributed: \(\lambda _{i} \approx 1 - (1 - \alpha ^{-1})(i - 1)/(n - 1)\), \(i = 1,\ldots ,n\)
5. random with uniformly distributed logarithm: \(\lambda _{i} \approx \alpha ^{-r(i)}\), \(i = 1,\ldots ,n\), where *r*(*i*) are pseudo-random values drawn from the standard uniform distribution on (0, 1).

*A* is generated using randsvd, i.e., all clustered eigenvalues are slightly separated.
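The five nominal distributions listed above can be written down directly. The following sketch returns the nominal eigenvalues only; it does not reproduce MATLAB's randsvd generator, which additionally applies random orthogonal transformations (hence the \(\approx \) signs above). The function name `eigenvalue_distribution` and its signature are hypothetical, and Python's `random.random()` draws from [0, 1) rather than (0, 1).

```python
import random

def eigenvalue_distribution(n, cnd, mode, rng=None):
    """Nominal eigenvalues for mode 1..5 with alpha = cnd >= 1 (0-based index i)."""
    if n < 2 or cnd < 1:
        raise ValueError("need n >= 2 and cnd >= 1")
    alpha = float(cnd)
    if mode == 1:   # one large
        return [1.0] + [1.0 / alpha] * (n - 1)
    if mode == 2:   # one small
        return [1.0] * (n - 1) + [1.0 / alpha]
    if mode == 3:   # geometrically distributed
        return [alpha ** (-i / (n - 1)) for i in range(n)]
    if mode == 4:   # arithmetically distributed
        return [1.0 - (1.0 - 1.0 / alpha) * i / (n - 1) for i in range(n)]
    if mode == 5:   # random with uniformly distributed logarithm
        rng = rng or random.Random(0)
        return [alpha ** (-rng.random()) for _ in range(n)]
    raise ValueError("mode must be in {1, 2, 3, 4, 5}")

for mode in range(1, 6):
    print(mode, eigenvalue_distribution(5, 1e8, mode))
```

Modes 1 and 2 place \(n - 1\) eigenvalues at the same nominal value, so their eigenvalues are tightly clustered, whereas modes 3 to 5 spread the eigenvalues over \([\alpha ^{-1}, 1]\).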

We start with small examples such as \(n = 10\), since they are sufficiently illustrative to observe the convergence behavior of the algorithm. Moreover, we set \(\mathtt {cnd} = 10^{8}\) to generate moderately ill-conditioned problems in binary64. We compute \(X^{(0)}\) as an initial approximate eigenvector matrix using the MATLAB function eig for the eigenvalue decomposition in binary64 arithmetic, which adopts the LAPACK routine DSYEV. Therefore, \(X^{(0)}\) suffers from rounding errors. To see the behavior of Algorithm 1 precisely, we use multiple-precision arithmetic with sufficiently long precision to simulate the exact arithmetic in Algorithm 1. Then, we expect that Algorithm 1 (RefSyEv) works effectively for \(\mathtt {mode} \in \{3,4,5\}\), but does not for \(\mathtt {mode} \in \{1,2\}\). We also use the built-in function eig in the multiple-precision toolbox to compute the eigenvalues \(\lambda _{i}\), \(i = 1, 2, \ldots , n\) for evaluation. The results are shown in Fig. 1, which provides \(\max _{1 \le i \le n}|\widehat{\lambda }_{i} - \lambda _{i}|/|\lambda _{i}|\) as the maximum relative error of the computed eigenvalues \(\widehat{\lambda }_{i}\), \(\Vert \mathrm {offdiag}(\widehat{X}^{\mathrm {T}}A\widehat{X})\Vert /\Vert A\Vert \) as the diagonality of \(\widehat{X}^{\mathrm {T}}A\widehat{X}\), \(\Vert I - \widehat{X}^{\mathrm {T}}\widehat{X}\Vert \) as the orthogonality of a computed eigenvector matrix \(\widehat{X}\), and \(\Vert \widehat{E}\Vert \), where \(\widehat{E}\) is a computed result of \(\widetilde{E}\) in Algorithm 1. Here, \(\mathrm {offdiag}(\cdot )\) denotes the off-diagonal part. The horizontal axis shows the number of iterations \(\nu \) of Algorithm 1.

#### 4.1.2 Multiple eigenvalues

Next, we consider the case where *A* has exactly multiple eigenvalues. It is not trivial to generate such matrices using floating-point arithmetic because rounding errors slightly perturb the eigenvalues and the multiplicity is broken. To avoid this, we utilize a Hadamard matrix *H* of order *n* whose elements are 1 or \(-1\) with \(H^{\mathrm {T}}H = nI\). For a given \(k < n\), let \(A = \frac{1}{n}HDH^{\mathrm {T}}\), where \(D = \mathrm {diag}(-1, \ldots , -1, 1, 2, \ldots , n - k)\) with \(-1\) repeated *k* times. Then *A* has *k*-fold eigenvalues \(\lambda _{i} = -1\), \(i = 1, \ldots , k\), and \(\lambda _{i+k} = i\), \(i = 1, \ldots , n - k\). We set \(n = 256\) and \(k = 10\), where *A* is exactly representable in binary64 format, and compute \(X^{(0)}\) using eig in binary64 arithmetic. We apply Algorithm 1 to *A* and \(X^{(\nu )}\) for \(\nu = 0, 1, 2, \ldots \) . The results are shown in Fig. 2. As can be seen, Algorithm 1 converges quadratically for matrices with multiple eigenvalues, which is consistent with our convergence analysis (Theorem 2).
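The construction of a test matrix with exactly multiple eigenvalues can be sketched as follows. Two points are assumptions of this sketch rather than statements from the text: the Hadamard matrix is built with Sylvester's construction (which requires *n* to be a power of two), and a small size \(n = 8\), \(k = 3\) is used instead of \(n = 256\), \(k = 10\) to keep the pure-Python loops fast. The function names are hypothetical.

```python
from fractions import Fraction

def sylvester_hadamard(n):
    """Hadamard matrix of order n = 2^m via Sylvester's construction."""
    if n < 1 or n & (n - 1):
        raise ValueError("this construction needs n to be a power of two")
    H = [[1]]
    while len(H) < n:
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H

def multiple_eigenvalue_matrix(n, k):
    """A = (1/n) H D H^T with D = diag(-1 (k times), 1, 2, ..., n - k)."""
    H = sylvester_hadamard(n)
    d = [-1] * k + list(range(1, n - k + 1))
    A = [[Fraction(sum(H[i][m] * d[m] * H[j][m] for m in range(n)), n)
          for j in range(n)] for i in range(n)]
    return H, d, A

n, k = 8, 3
H, d, A = multiple_eigenvalue_matrix(n, k)
# Column m of H is an eigenvector of A for d[m]: check A h = d[m] h exactly.
exact_eigenpairs = all(
    sum(A[i][j] * H[j][m] for j in range(n)) == d[m] * H[i][m]
    for m in range(n) for i in range(n))
# Every entry survives a round trip through binary64 unchanged.
representable = all(Fraction(float(a)) == a for row in A for a in row)
print(exact_eigenpairs, representable)  # True True
```

Since the entries of *A* are small integers divided by a power of two, they are stored without rounding, so the multiplicity of the eigenvalue \(-1\) is preserved exactly in the input data.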

### 4.2 Computational speed

Results for a pseudo-random real symmetric matrix, \(n = 100\)

Algorithm 1 | eig (binary64) | \(\nu = 1\) | \(\nu = 2\) | \(\nu = 3\) |
---|---|---|---|---|
\(\Vert \widehat{E}\Vert \) | \(5.6 \times 10^{-14}\) | \(1.8 \times 10^{-27}\) | \(3.9 \times 10^{-54}\) | \(2.0 \times 10^{-107}\) |
Elapsed time (s) | 0.01 | 0.35 | 0.31 | 0.51 |
Elapsed time, accumulated (s) | 0.01 | 0.36 | 0.67 | 1.18 |

\(\hbox {NS}+\hbox {DM}\) | eig (binary64) | \(\nu = 1\) | \(\nu = 2\) | \(\nu = 3\) |
---|---|---|---|---|
\(\Vert I - \widehat{U}\Vert \) | \(5.5 \times 10^{-14}\) | \(2.2 \times 10^{-28}\) | \(6.7 \times 10^{-53}\) | \(4.4 \times 10^{-106}\) |
Elapsed time (s) | 0.01 | 0.48 | 0.52 | 1.02 |
Elapsed time, accumulated (s) | 0.01 | 0.49 | 1.01 | 2.04 |

MP-approach | \(\mathtt {mp.Digits(d)}\) | \(\mathtt {d} = 34\) | \(\mathtt {d} = 55\) | \(\mathtt {d} = 108\) |
---|---|---|---|---|
Elapsed time (s) | | 0.11 | 0.53 | 0.62 |

Results for a pseudo-random real symmetric matrix, \(n = 500\)

Algorithm 1 | eig (binary64) | \(\nu = 1\) | \(\nu = 2\) | \(\nu = 3\) |
---|---|---|---|---|
\(\Vert \widehat{E}\Vert \) | \(5.1 \times 10^{-13}\) | \(1.3 \times 10^{-25}\) | \(2.5 \times 10^{-50}\) | \(9.5 \times 10^{-100}\) |
Elapsed time (s) | 0.04 | 1.32 | 3.54 | 9.35 |
Elapsed time, accumulated (s) | 0.04 | 1.37 | 4.91 | 14.25 |

\(\hbox {NS}+\hbox {DM}\) | eig (binary64) | \(\nu = 1\) | \(\nu = 2\) | \(\nu = 3\) |
---|---|---|---|---|
\(\Vert I - \widehat{U}\Vert \) | \(5.1 \times 10^{-13}\) | \(1.7 \times 10^{-27}\) | \(5.1 \times 10^{-50}\) | \(1.9 \times 10^{-99}\) |
Elapsed time (s) | 0.04 | 3.39 | 6.84 | 22.31 |
Elapsed time, accumulated (s) | 0.04 | 3.43 | 10.26 | 32.58 |

MP-approach | \(\mathtt {mp.Digits(d)}\) | \(\mathtt {d} = 34\) | \(\mathtt {d} = 52\) | \(\mathtt {d} = 101\) |
---|---|---|---|---|
Elapsed time (s) | | 3.64 | 39.35 | 48.12 |

Results for a pseudo-random real symmetric matrix, \(n = 1000\)

Algorithm 1 | eig (binary64) | \(\nu = 1\) | \(\nu = 2\) | \(\nu = 3\) |
---|---|---|---|---|
\(\Vert \widehat{E}\Vert \) | \(7.5 \times 10^{-13}\) | \(2.8 \times 10^{-25}\) | \(1.2 \times 10^{-49}\) | \(2.2 \times 10^{-98}\) |
Elapsed time (s) | 0.10 | 5.21 | 14.45 | 40.20 |
Elapsed time, accumulated (s) | 0.10 | 5.31 | 19.76 | 59.96 |

\(\hbox {NS}+\hbox {DM}\) | eig (binary64) | \(\nu = 1\) | \(\nu = 2\) | \(\nu = 3\) |
---|---|---|---|---|
\(\Vert I - \widehat{U}\Vert \) | \(7.5 \times 10^{-13}\) | \(6.2 \times 10^{-27}\) | \(2.1 \times 10^{-48}\) | \(4.0 \times 10^{-97}\) |
Elapsed time (s) | 0.10 | 14.34 | 33.11 | 121.54 |
Elapsed time, accumulated (s) | 0.10 | 14.44 | 47.55 | 169.09 |

MP-approach | \(\mathtt {mp.Digits(d)}\) | \(\mathtt {d} = 34\) | \(\mathtt {d} = 51\) | \(\mathtt {d} = 100\) |
---|---|---|---|---|
Elapsed time (s) | | 17.23 | 291.24 | 356.39 |
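To put the timings side by side, the following sketch divides the MP-approach time by the accumulated time of Algorithm 1 in the last column of each table. Pairing \(\nu = 3\) with the largest \(\mathtt {d}\) is an assumption made here for illustration, based only on the column alignment of the tables; the dictionary name `timings` is likewise hypothetical.

```python
# Accumulated time of Algorithm 1 at nu = 3 and the MP-approach time in the
# same column (d = 108, 101, 100), copied from the three tables above.
# The pairing of these two columns is an assumption for illustration.
timings = {
    100: (1.18, 0.62),
    500: (14.25, 48.12),
    1000: (59.96, 356.39),
}

# Ratio > 1 means Algorithm 1 needed less time than the MP-approach.
speedups = {n: mp / alg1 for n, (alg1, mp) in timings.items()}

for n, s in speedups.items():
    print(f"n = {n:4d}: MP-approach / Algorithm 1 = {s:.2f}")
```

Under this pairing the ratio is about 0.5 for \(n = 100\) and grows to roughly 3.4 and 5.9 for \(n = 500\) and \(n = 1000\), so the advantage of Algorithm 1 appears as the problem size increases.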

## 5 Conclusion

We have proposed a refinement algorithm (Algorithm 1) for eigenvalue decomposition of real symmetric matrices that can be applied iteratively. Quadratic convergence of the proposed algorithm was proven for a sufficiently accurate initial guess, similar to Newton’s method.

The proposed algorithm benefits from the availability of efficient matrix multiplication in higher-precision arithmetic. The numerical results demonstrate the excellent performance of the proposed algorithm in terms of convergence rate and measured computing time.

In practice, an approximation \(\widehat{X}\) to an eigenvector matrix \({X}\) of a given symmetric matrix *A* is likely to be computed in ordinary floating-point arithmetic, such as IEEE 754 binary32 or binary64. As in the numerical experiments, \(\widehat{X}\) can then serve as the initial guess \(X^{(0)}\) in Algorithm 1. However, if *A* has nearly multiple eigenvalues, it is difficult to obtain, in ordinary floating-point arithmetic, an \(X^{(0)}\) accurate enough for Algorithm 1 to work well. Overcoming this problem is left for future work.
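The workflow described here, computing \(\widehat{X}\) in ordinary precision and then refining it in higher precision, can be sketched in Python with NumPy, with binary32 standing in for the ordinary precision and binary64 for the higher one. The update inside `refine_step` is an assumption reconstructed from the linearized orthogonality and diagonalization conditions; Algorithm 1 as stated earlier in the paper is authoritative. The names `refine_step` and `defect` and the test matrix are introduced here for illustration only.

```python
import numpy as np

def refine_step(A, X):
    """One refinement step for an approximate eigenvector matrix X of symmetric A.

    Assumed update (not copied from the paper): solve the linearized conditions
    E + E^T = R and D - D E - E^T D = S for the correction E, then X <- X + X E.
    """
    n = A.shape[0]
    R = np.eye(n) - X.T @ X                  # departure from orthogonality
    S = X.T @ A @ X                          # approximately diagonal
    lam = np.diag(S) / (1.0 - np.diag(R))    # refined eigenvalue estimates
    D = np.diag(lam)
    # Threshold below which two eigenvalue estimates are treated as a cluster.
    delta = 2.0 * (np.linalg.norm(S - D, 2)
                   + np.linalg.norm(A, 2) * np.linalg.norm(R, 2))
    gap = lam[None, :] - lam[:, None]        # gap[i, j] = lam_j - lam_i
    separated = np.abs(gap) > delta
    safe_gap = np.where(separated, gap, 1.0)  # avoids division by zero
    E = np.where(separated, (S + R * lam[None, :]) / safe_gap, R / 2.0)
    return X + X @ E, lam

def defect(A, X):
    """Orthogonality defect plus the scaled off-diagonal part of X^T A X."""
    S = X.T @ A @ X
    off = S - np.diag(np.diag(S))
    orth = np.linalg.norm(np.eye(A.shape[0]) - X.T @ X, 2)
    return orth + np.linalg.norm(off, 2) / np.linalg.norm(A, 2)

# Hypothetical test matrix with well-separated eigenvalues 1, 2, ..., n.
rng = np.random.default_rng(0)
n = 20
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q @ np.diag(np.arange(1.0, n + 1)) @ Q.T
A = (A + A.T) / 2                            # enforce exact symmetry

# Initial guess from a binary32 eigensolver, promoted to binary64 for refinement.
_, X0 = np.linalg.eigh(A.astype(np.float32))
X0 = X0.astype(np.float64)

X1, lam1 = refine_step(A, X0)
print(f"defect before: {defect(A, X0):.1e}, after one step: {defect(A, X1):.1e}")
```

The test matrix has eigenvalue gaps far larger than `delta`, which is the easy case. With nearly multiple eigenvalues, the binary32 initial guess may be too inaccurate for such a step to help, which is the limitation noted above.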

## Notes

### Acknowledgements

The first author would like to express his sincere thanks to Professor Chen Greif at the University of British Columbia for his valuable comments and helpful suggestions.

## References

- 1. Absil, P.A., Mahony, R., Sepulchre, R., Van Dooren, P.: A Grassmann-Rayleigh quotient iteration for computing invariant subspaces. SIAM Rev. **44**, 57–73 (2006)
- 2. Advanpix: Multiprecision Computing Toolbox for MATLAB, Code and documentation. http://www.advanpix.com/ (2016)
- 3. Ahuesa, M., Largillier, A., D’Almeida, F.D., Vasconcelos, P.B.: Spectral refinement on quasi-diagonal matrices. Linear Algebra Appl. **401**, 109–117 (2005)
- 4. Anderson, E., Bai, Z., Bischof, C., Blackford, S., Demmel, J., Dongarra, J., Du Croz, J., Greenbaum, A., Hammarling, S., McKenney, A., Sorensen, D.: LAPACK Users’ Guide, 3rd edn. SIAM, Philadelphia (1999)
- 5. Atkinson, K., Han, W.: Theoretical Numerical Analysis, 3rd edn. Springer, New York (2009)
- 6. Collar, A.R.: Some notes on Jahn’s method for the improvement of approximate latent roots and vectors of a square matrix. Quart. J. Mech. Appl. Math. **1**, 145–148 (1948)
- 7. Davies, P.I., Higham, N.J., Tisseur, F.: Analysis of the Cholesky method with iterative refinement for solving the symmetric definite generalized eigenproblem. SIAM J. Matrix Anal. Appl. **23**, 472–493 (2001)
- 8. Davies, P.I., Smith, M.I.: Updating the singular value decomposition. J. Comput. Appl. Math. **170**, 145–167 (2004)
- 9. Davies, R.O., Modi, J.J.: A direct method for completing eigenproblem solutions on a parallel computer. Linear Algebra Appl. **77**, 61–74 (1986)
- 10. Demmel, J.W.: Applied Numerical Linear Algebra. SIAM, Philadelphia (1997)
- 11. Dhillon, I.S., Parlett, B.N.: Multiple representations to compute orthogonal eigenvectors of symmetric tridiagonal matrices. Linear Algebra Appl. **387**, 1–28 (2004)
- 12. Dongarra, J.J., Moler, C.B., Wilkinson, J.H.: Improving the accuracy of computed eigenvalues and eigenvectors. SIAM J. Numer. Anal. **20**, 23–45 (1983)
- 13. GMP: GNU Multiple Precision Arithmetic Library, Code and documentation. http://gmplib.org/ (2015)
- 14. Golub, G.H., Van Loan, C.F.: Matrix Computations, 4th edn. The Johns Hopkins University Press, Baltimore (2013)
- 15. Gu, M., Eisenstat, S.C.: A divide-and-conquer algorithm for the symmetric tridiagonal eigenproblem. SIAM J. Matrix Anal. Appl. **16**, 172–191 (1995)
- 16. Hida, Y., Li, X.S., Bailey, D.H.: Algorithms for quad-double precision floating point arithmetic. In: Proceedings of the 15th IEEE Symposium on Computer Arithmetic, pp. 155–162. IEEE Computer Society Press (2001)
- 17. Higham, N.J.: Accuracy and Stability of Numerical Algorithms, 2nd edn. SIAM, Philadelphia (2002)
- 18. Higham, N.J.: Functions of Matrices: Theory and Computation. SIAM, Philadelphia (2008)
- 19. Jahn, H.A.: Improvement of an approximate set of latent roots and modal columns of a matrix by methods akin to those of classical perturbation theory. Quart. J. Mech. Appl. Math. **1**, 131–144 (1948)
- 20. Li, X.S., Demmel, J.W., Bailey, D.H., Henry, G., Hida, Y., Iskandar, J., Kahan, W., Kang, S.Y., Kapur, A., Martin, M.C., Thompson, B.J., Tung, T., Yoo, D.: Design, implementation and testing of extended and mixed precision BLAS. ACM Trans. Math. Softw. **28**, 152–205 (2002)
- 21. MPFR: The GNU MPFR Library, Code and documentation. http://www.mpfr.org/ (2013)
- 22. Ogita, T., Rump, S.M., Oishi, S.: Accurate sum and dot product. SIAM J. Sci. Comput. **26**, 1955–1988 (2005)
- 23. Oishi, S.: Fast enclosure of matrix eigenvalues and singular values via rounding mode controlled computation. Linear Algebra Appl. **324**, 133–146 (2001)
- 24. Ozaki, K., Ogita, T., Oishi, S., Rump, S.M.: Error-free transformations of matrix multiplication by using fast routines of matrix multiplication and its applications. Numer. Algorithms **59**, 95–118 (2012)
- 25. Parlett, B.N.: The symmetric eigenvalue problem, vol. 20, 2nd edn. Classics in Applied Mathematics. SIAM, Philadelphia (1998)
- 26. Priest, D.M.: Algorithms for arbitrary precision floating point arithmetic. In: Proceedings of the 10th Symposium on Computer Arithmetic, pp. 132–145. IEEE Computer Society Press (1991)
- 27. Rump, S.M., Ogita, T., Oishi, S.: Accurate floating-point summation part I: faithful rounding. SIAM J. Sci. Comput. **31**, 189–224 (2008)
- 28. Rump, S.M., Ogita, T., Oishi, S.: Accurate floating-point summation part II: sign, \(K\)-fold faithful and rounding to nearest. SIAM J. Sci. Comput. **31**, 1269–1302 (2008)
- 29. Tisseur, F.: Newton’s method in floating point arithmetic and iterative refinement of generalized eigenvalue problems. SIAM J. Matrix Anal. Appl. **22**, 1038–1057 (2001)
- 30. Wilkinson, J.H.: The Algebraic Eigenvalue Problem. Clarendon Press, Oxford (1965)
- 31. Yamamoto, S., Fujiwara, T., Hatsugai, Y.: Electronic structure of charge and spin stripe order in \(\text{La}_{2-x}\text{Sr}_{x}\text{NiO}_{4}\) (\(x =\frac{1}{3}, \frac{1}{2}\)). Phys. Rev. B **76**, 165114 (2007)
- 32. Yamamoto, S., Sogabe, T., Hoshi, T., Zhang, S.-L., Fujiwara, T.: Shifted conjugate-orthogonal-conjugate-gradient method and its application to double orbital extended Hubbard model. J. Phys. Soc. Jpn. **77**, 114713 (2008)

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.