Iterative refinement for symmetric eigenvalue decomposition II: clustered eigenvalues
Abstract
We are concerned with accurate eigenvalue decomposition of a real symmetric matrix A. In the previous paper (Ogita and Aishima in Jpn J Ind Appl Math 35(3): 1007–1035, 2018), we proposed an efficient refinement algorithm for improving the accuracy of all eigenvectors, which converges quadratically if a sufficiently accurate initial guess is given. However, since the accuracy of eigenvectors depends on the eigenvalue gap, it is difficult to provide such an initial guess to the algorithm in the case where A has clustered eigenvalues. To overcome this problem, we propose a novel algorithm that can refine approximate eigenvectors corresponding to clustered eigenvalues on the basis of the algorithm proposed in the previous paper. Numerical results are presented showing excellent performance of the proposed algorithm in terms of convergence rate and overall computational cost and illustrating an application to a quantum materials simulation.
Keywords
Accurate numerical algorithm · Iterative refinement · Symmetric eigenvalue decomposition · Clustered eigenvalues
Mathematics Subject Classification
65F15 · 15A18 · 15A23
1 Introduction
Let A be a real symmetric \(n \times n\) matrix. Since solving a standard symmetric eigenvalue problem \(Ax = \lambda x\), where \(\lambda \in {\mathbb {R}}\) is an eigenvalue of A and \(x \in {\mathbb {R}}^{n}\) is an eigenvector of A associated with \(\lambda \), is ubiquitous in scientific computing, it is important to develop reliable numerical algorithms for calculating eigenvalues and eigenvectors accurately. Excellent overviews on the symmetric eigenvalue problem can be found in references [20, 23].
We here collect notation used in this paper. Let I and O denote the identity matrix and the zero matrix of appropriate size, respectively. Unless otherwise specified, \(\Vert \cdot \Vert \) means \(\Vert \cdot \Vert _{2}\), which denotes the Euclidean norm for vectors and the spectral norm for matrices. For legibility, if necessary, we distinguish between the approximate quantities and the computed results, e.g., for some quantity \(\alpha \) we write \(\widetilde{\alpha }\) and \(\widehat{\alpha }\) as an approximation of \(\alpha \) and a computed result for \(\alpha \), respectively.
In [17], we proposed a refinement algorithm for the eigenvalue decomposition of A, which works not on an individual eigenvector but on all eigenvectors at once. Since the algorithm is based on Newton’s method, it converges quadratically, provided that an initial guess is sufficiently accurate. In practice, although the algorithm refines computed eigenvectors corresponding to sufficiently separated simple eigenvalues, it cannot refine computed eigenvectors corresponding to “nearly” multiple eigenvalues. This is because it is difficult for standard numerical algorithms in floating-point arithmetic to provide sufficiently accurate initial approximate eigenvectors corresponding to nearly multiple eigenvalues, as shown in (2). The purpose of this paper is to remedy this problem, i.e., we aim to develop a refinement algorithm for the eigenvalue decomposition of a symmetric matrix with clustered eigenvalues.
One might notice that the above procedure is similar to the classical shift-invert technique for transforming eigenvalue distributions. In addition, the MRRR algorithm [6] also employs a shift strategy to increase relative gaps between clustered eigenvalues for computing the associated eigenvectors. In other words, it is well known that a diagonal shift is useful for solving eigenvalue problems accurately. Our contribution is to show its effectiveness on the basis of appropriate error analysis with the adaptive use of higher precision arithmetic, which leads to the derivation of the proposed algorithm.
In the same spirit as the previous paper [17], our proposed algorithm primarily comprises matrix multiplication, which accounts for the majority of the computational cost. Therefore, we can utilize higher precision matrix multiplication efficiently. For example, XBLAS [13] and other efficient algorithms [16, 19, 22] based on so-called error-free transformations for accurate matrix multiplication are available for practical implementation.
The remainder of the paper is organized as follows. In Sect. 2, we recall the refinement algorithm (Algorithm 1) proposed in the previous paper [17] together with its convergence theory. For practical use, we present a rounding error analysis of Algorithm 1 in finite precision arithmetic in Sect. 3, which is useful for setting working precision and shows achievable accuracy of approximate eigenvectors obtained by using Algorithm 1. In Sect. 4, we show the behavior of Algorithm 1 for clustered eigenvalues, which explains the effect of nearly multiple eigenvalues on computed results and leads to the derivation of the proposed algorithm. On the basis of Algorithm 1, we propose a refinement algorithm (Algorithm 2: \(\mathsf {RefSyEvCL}\)) that can also be applied to matrices with clustered eigenvalues in Sect. 5. In Sect. 6, we present some numerical results showing the behavior and performance of the proposed algorithm together with an application to a quantum materials simulation as a realworld problem.
For simplicity, we basically handle only real matrices. As mentioned in the previous paper [17], the discussions in this paper can also be extended to generalized symmetric (Hermitian) definite eigenvalue problems.
2 Basic algorithm and its convergence theory
In this section, we introduce the refinement algorithm proposed in the previous paper [17], which is the basis of the algorithm proposed in this paper.
In [17, Theorem 1], we presented the following theorem that states the quadratic convergence of Algorithm 1 if all eigenvalues are simple and a given \(\widehat{X}\) is sufficiently close to X.
Theorem 1
In the following, we review the discussion in [17, §3.2] for exactly multiple eigenvalues. If \(\widetilde{\lambda }_{i}\approx \widetilde{\lambda }_{j}\) correspond to multiple eigenvalues \(\lambda _{i}=\lambda _{j}\), we compute \(\widetilde{e}_{ij}=\widetilde{e}_{ji}={r}_{ij}/2\) for (i, j) such that \(|\widetilde{\lambda }_{i}-\widetilde{\lambda }_{j}|\le \omega \).
In a similar way to Newton’s method (cf. e.g., [3, p. 236]), dropping the second order terms in (7) and (8) yields Algorithm 1, and the following convergence theorem is provided in [17, Theorem 2].
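For concreteness, the Newton-type update behind Algorithm 1 can be sketched in a few lines. The formulas below are a paraphrase of the first-order equations obtained by dropping the second-order terms (the diagonal rule \(e_{ii}=r_{ii}/2\) and the fallback \(e_{ij}=r_{ij}/2\) for nearly multiple eigenvalues follow the text; the threshold `omega` is an illustrative choice, not the paper's exact one):

```python
import numpy as np

def refine_sym_eig(A, X, omega=None):
    """One refinement step in the spirit of Algorithm 1 (RefSyEv), sketch.

    In the actual algorithm the matrix products would be evaluated in
    higher precision; here everything is plain float64.
    """
    n = A.shape[0]
    R = np.eye(n) - X.T @ X                  # orthogonality residual
    S = X.T @ A @ X                          # nearly diagonal for good X
    lam = np.diag(S) / (1.0 - np.diag(R))    # refined eigenvalue estimates
    if omega is None:
        omega = 1e-8 * np.max(np.abs(lam))   # illustrative cluster threshold
    E = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                E[i, j] = R[i, i] / 2.0
            elif abs(lam[i] - lam[j]) > omega:
                E[i, j] = (S[i, j] + lam[j] * R[i, j]) / (lam[j] - lam[i])
            else:                            # nearly multiple eigenvalues
                E[i, j] = R[i, j] / 2.0
    return X + X @ E, lam
```

Applied to an approximate eigenvector matrix of a matrix with well-separated eigenvalues, one such step reduces both the orthogonality residual and the off-diagonal part of \(X^{\mathrm T}AX\) roughly quadratically, until the working precision limit is reached.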
Theorem 2
3 Rounding error analysis for basic algorithm
If Algorithm 1 is performed in finite precision arithmetic with the relative rounding error unit \({\mathbf {u}}_{h}\), the accuracy of a refined eigenvector matrix \(X'\) is restricted by \({\mathbf {u}}_{h}\). Since \(\widehat{X}\) is improved quadratically when using real arithmetic, \({\mathbf {u}}_{h}\) must correspond to \(\Vert E\Vert ^{2}\) to preserve the convergence property of Algorithm 1. We explain the details in the following. For simplicity, we consider the real case. The extension to the complex case is obvious.
Remark 1
As can be seen from (12), with a fixed \({\mathbf {u}}_{h}\), iterative use of Algorithm 1 eventually computes an approximate eigenvector matrix that is accurate to \({\mathcal {O}}(\beta {\mathbf {u}}_{h})\), provided that the assumption (3) in Theorem 1 holds in each iteration. This will be confirmed numerically in Sect. 6. \(\square \)
Suppose that \(\Vert \widehat{X} - X\Vert = c\beta {\mathbf {u}}\) and \(\Vert \widehat{X}' - X\Vert = c'\beta ^{3}{\mathbf {u}}^{2}\), where c and \(c'\) are some constants. If \(c''\beta ^{2}{\mathbf {u}}< 1\) for \(c'' := c'/c\), then the approximation of X is improved in the sense that \(\Vert \widehat{X}' - X\Vert < \Vert \widehat{X} - X\Vert \). In other words, if \(\beta \) is so large that \(c''\beta ^{2}{\mathbf {u}}\ge 1\), Algorithm 1 may not work well.
In general, define \(E^{(\nu )} \in {\mathbb {R}}^{n \times n}\) such that \(X = \widehat{X}^{(\nu )}(I + E^{(\nu )})\) for \(\nu = 0, 1, \ldots \), where \(\widehat{X}^{(0)}\) is an initial guess and \(\widehat{X}^{(\nu )}\) is the result of the \(\nu \)th iteration of Algorithm 1 with working precision \({\mathbf {u}}_{h}^{(\nu )}\) for \(\nu = 1, 2, \ldots \). To preserve the convergence speed, we need to set \({\mathbf {u}}_{h}^{(\nu )}\) satisfying \({\mathbf {u}}_{h}^{(\nu )} < \Vert E^{(\nu - 1)}\Vert ^{2}\), as can be seen from (12). Although we do not know \(\Vert E^{(\nu - 1)}\Vert \), we can estimate it by \(\Vert \widetilde{E}^{(\nu - 1)}\Vert \), where \(\widetilde{E}^{(\nu - 1)}\) is computed at the (\(\nu - 1\))st iteration of Algorithm 1.
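The precision rule \({\mathbf {u}}_{h}^{(\nu )} < \Vert E^{(\nu -1)}\Vert ^{2}\) translates directly into a heuristic for choosing the number of working digits per iteration. A minimal sketch (the safety `margin` and the `min_digits` floor are illustrative choices, not from the paper):

```python
import math

def next_working_digits(err_norm, margin=2, min_digits=16):
    """Pick decimal digits d with 10**(-d) < err_norm**2, i.e. a working
    precision fine enough to preserve quadratic convergence (sketch)."""
    if not 0.0 < err_norm < 1.0:
        raise ValueError("expected 0 < err_norm < 1")
    return max(math.ceil(-2.0 * math.log10(err_norm)) + margin, min_digits)
```

For example, an estimated error norm of \(10^{-8}\) calls for roughly 18 decimal digits in the next iteration, which is how the adaptive `mp.Digits` settings in the experiments of Sect. 6 arise.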
4 Effect of nearly multiple eigenvalues in basic algorithm
In general, a given matrix A in floating-point format does not have exactly multiple eigenvalues. It is therefore necessary to discuss the behavior of Algorithm 1 for A with nearly multiple eigenvalues \(\lambda _{i}\approx \lambda _{j}\) such that \(|\widetilde{\lambda }_{i}-\widetilde{\lambda }_{j}|\le \omega \) in line 7. We basically discuss the behavior in real arithmetic. The effect of rounding errors is briefly explained in Remark 2 at the end of this section.
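A straightforward way to detect such nearly multiple eigenvalues is to scan the sorted approximate eigenvalues and group consecutive ones whose gaps are at most \(\omega \). The following sketch returns index sets of clusters with at least two members (the grouping rule is illustrative, not necessarily the paper's exact criterion):

```python
def find_clusters(lams, omega):
    """Group indices of sorted approximate eigenvalues into clusters where
    consecutive gaps are at most omega; singletons are not reported."""
    clusters, current = [], [0]
    for i in range(1, len(lams)):
        if abs(lams[i] - lams[i - 1]) <= omega:
            current.append(i)
        else:
            if len(current) > 1:
                clusters.append(current)
            current = [i]
    if len(current) > 1:
        clusters.append(current)
    return clusters
```

This is the kind of bookkeeping needed to obtain the index sets \({\mathcal {J}}_{k}\) used by the proposed algorithm in Sect. 5.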
Lemma 1
Proof
For the perturbation analysis of \(\widetilde{E}\), the next lemma is crucial.
Lemma 2
Proof
Remark 2
In this section, we proved that \(\widetilde{E}\) is sufficiently close to \(F_{\omega }\) under the mild assumptions. In Sect. 3, the effect of rounding errors on \(\widetilde{E}\) is evaluated as in (10), i.e., \(\varDelta _{E}:=\widehat{E}-\widetilde{E}\) is sufficiently small, where \(\widehat{E}\) is computed in finite precision arithmetic. The rounding error analysis is independent of the perturbation analysis with respect to \(F_{\omega }\) in this section. Thus, \(\Vert \widehat{E}-F_{\omega }\Vert \le \Vert \widehat{E}-\widetilde{E}\Vert +\Vert \widetilde{E}-F_{\omega }\Vert \) simply holds, combining the individual estimates of \(\Vert \widehat{E}-\widetilde{E}\Vert \) and \(\Vert \widetilde{E}-F_{\omega }\Vert \), and hence the computed \(\widehat{E}\) is sufficiently close to \(F_{\omega }\) corresponding to \(A_{\omega }\). \(\square \)
5 Proposed algorithm for nearly multiple eigenvalues
On the basis of the basic algorithm (Algorithm 1), we propose a practical version of an algorithm for improving the accuracy of computed eigenvectors of symmetric matrices that can also deal with nearly multiple eigenvalues.
5.1 Observation
In the following, we overcome such a problem for general symmetric matrices.
5.2 Outline of the proposed algorithm
As mentioned in Sect. 1, the \(\sin \theta \) theorem by Davis–Kahan suggests that backward stable algorithms can provide, for each cluster, a sufficiently accurate initial guess of the subspace spanned by the eigenvectors associated with the clustered eigenvalues. We explain how to refine approximate eigenvectors by correctly extracting them from this subspace.
Now the problem is how to refine \(X'(:,{\mathcal {J}}_{k}) \in {\mathbb {R}}^{n \times n_{k}}\), which denotes the matrix comprising approximate eigenvectors corresponding to the clustered approximate eigenvalues \(\{\widetilde{\lambda }_{i}\}_{i \in {\mathcal {J}}_{k}}\).
1. Find clusters of approximate eigenvalues of A and obtain the index sets \({\mathcal {J}}_{k}\), \(k = 1, 2, \ldots , n_{{\mathcal {J}}}\), for those clusters.
2. Define \(V_{k} := X'(:,{\mathcal {J}}_{k}) \in {\mathbb {R}}^{n \times n_{k}}\), where \(n_{k} := |{\mathcal {J}}_{k}|\).
3. Compute \(T_{k} = V_{k}^{\mathrm {T}}(A - \mu _{k}I)V_{k}\), where \(\mu _{k} := (\min _{i \in {\mathcal {J}}_{k}}\widetilde{\lambda }_{i} + \max _{i \in {\mathcal {J}}_{k}}\widetilde{\lambda }_{i})/2\).
4. Perform the following procedure for each \(T_{k} \in {\mathbb {R}}^{n_{k} \times n_{k}}\):
   (i) Compute an eigenvector matrix \(W_{k}\) of \(T_{k}\).
   (ii) Update \(X'(:,{\mathcal {J}}_{k}) \in {\mathbb {R}}^{n \times n_{k}}\) by \(V_{k}W_{k}\).
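Steps 2–4 above can be sketched as follows for a single cluster. This is plain float64 numpy; in the actual algorithm the product \(V_{k}^{\mathrm T}(A-\mu _{k}I)V_{k}\) would be evaluated with higher-precision matrix multiplication, which is precisely where the shift pays off:

```python
import numpy as np

def refine_cluster(A, Xp, J, lams):
    """Shift by the cluster midpoint, project A onto the approximate
    invariant subspace, and rotate the basis by the small eigenproblem
    (sketch of steps 2-4 of the outline)."""
    V = Xp[:, J]                                   # n x n_k approximate basis
    mu = 0.5 * (min(lams[j] for j in J) + max(lams[j] for j in J))
    T = V.T @ (A - mu * np.eye(A.shape[0])) @ V    # small n_k x n_k problem
    _, W = np.linalg.eigh(T)                       # eigenvector matrix W_k
    Xp[:, J] = V @ W                               # extracted eigenvectors
    return Xp
```

Even when the columns of \(V_{k}\) are individually inaccurate, they span the invariant subspace well; diagonalizing the shifted projection \(T_{k}\) then recovers accurate individual eigenvectors.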
5.3 Proposed algorithm
In Algorithm 2, the function \({\mathsf {f}}{\mathsf {l}}(C)\) rounds an input matrix \(C \in {\mathbb {R}}^{n \times n}\) to a matrix \(T \in {\mathbb {F}}^{n \times n}\), where \({\mathbb {F}}\) is a set of floating-point numbers in ordinary precision, such as the IEEE 754 binary64 format. Here, “round-to-nearest” rounding is not required; however, some faithful rounding, such as chopping, is desirable. Moreover, the function \(\mathsf {eig}(T)\) is similar to the MATLAB function, which computes all approximate eigenvectors of an input matrix \(T \in {\mathbb {F}}^{n \times n}\) in working precision arithmetic. This is expected to adopt some backward stable algorithm as implemented in the LAPACK routine xSYEV [2]. In lines 13–17 of Algorithm 2, we aim to obtain sufficiently accurate approximate eigenvectors \(X'(:,{\mathcal {J}}_{k})\) of A, where the columns of \(X'(:,{\mathcal {J}}_{k})\) correspond to \({\mathcal {J}}_{k}\). For this purpose, we iteratively apply Algorithm 1 (\(\mathsf {RefSyEv}\)) to \(A - \mu _{k}I\) and \(V_{k}^{(\nu )}\) until \(V_{k}^{(\nu )}\) for some \(\nu \) becomes as accurate as the other eigenvectors associated with well-separated eigenvalues. Note that the spectral norms \(\Vert \widetilde{E}\Vert _{2}\) and \(\Vert \widetilde{E}_{k}\Vert _{2}\) can be replaced by the Frobenius norms \(\Vert \widetilde{E}\Vert _{\mathrm {F}}\) and \(\Vert \widetilde{E}_{k}\Vert _{\mathrm {F}}\).
Remark 3
- In Algorithm 1 called at line 2 in Algorithm 2, replace \(R \leftarrow I - \widehat{X}^{\mathrm {T}}\widehat{X}\) with \(R \leftarrow I - \widehat{X}^{\mathrm {T}}B\widehat{X}\).
- Replace \(A_{k} \leftarrow A - \mu _{k}I\) with \(A_{k} \leftarrow A - \mu _{k}B\) in line 8 of Algorithm 2.
6 Numerical results
We present numerical results to demonstrate the effectiveness of the proposed algorithm (Algorithm 2: RefSyEvCL). All numerical experiments discussed in this section were conducted using MATLAB R2016b on our workstation with two CPUs (3.0 GHz Intel Xeon E5-2687W v4, 12 cores each) and 1 TB of main memory, unless otherwise specified. Let \({\mathbf {u}}\) denote the relative rounding error unit (\({\mathbf {u}}= 2^{-24}\) for IEEE binary32 and \({\mathbf {u}}= 2^{-53}\) for binary64). To realize multiple-precision arithmetic, we adopt the Advanpix Multiprecision Computing Toolbox version 4.2.3 [1], which utilizes well-known, fast, and reliable multiple-precision arithmetic libraries including GMP [8] and MPFR [15]. We also use multiple-precision arithmetic with sufficiently long precision to simulate real arithmetic. In all cases, we use the MATLAB function norm to compute the spectral norms \(\Vert R\Vert \) and \(\Vert S - \widetilde{D}\Vert \) in Algorithm 1 in binary64 arithmetic, and we approximate \(\Vert A\Vert \) by \(\max (|\widetilde{\lambda }_{1}|,|\widetilde{\lambda }_{n}|)\). We ran the experiments with several dozen seeds for the random number generator, and all results were similar to those presented in this section. Therefore, we adopt the default seed as a typical example using the MATLAB command rng('default') to ensure reproducibility of the problems.
6.1 Convergence property
Here, we confirm the convergence property of the proposed algorithm for various eigenvalue distributions.
6.1.1 Various eigenvalue distributions
1. one large: \(\lambda _{1} \approx 1\), \(\lambda _{i} \approx \alpha ^{-1}\), \(i = 2,\ldots ,n\)
2. one small: \(\lambda _{n} \approx \alpha ^{-1}\), \(\lambda _{i} \approx 1\), \(i = 1,\ldots ,n-1\)
3. geometrically distributed: \(\lambda _{i} \approx \alpha ^{-(i - 1)/(n - 1)}\), \(i = 1,\ldots ,n\)
4. arithmetically distributed: \(\lambda _{i} \approx 1 - (1 - \alpha ^{-1})(i - 1)/(n - 1)\), \(i = 1,\ldots ,n\)
5. random with uniformly distributed logarithm: \(\lambda _{i} \approx \alpha ^{-r(i)}\), \(i = 1,\ldots ,n\), where the r(i) are pseudorandom values drawn from the standard uniform distribution on (0, 1).
As in [17], we set \(n = 10\) and \(\texttt {cnd} = 10^{8}\) to generate moderately illconditioned problems in binary64 and consider the computed results obtained using multipleprecision arithmetic with sufficiently long precision as the exact eigenvalues \(\lambda _{i}\), \(i = 1, 2, \ldots , n\). We compute \(X^{(0)}\) as an initial approximate eigenvector matrix using the MATLAB function eig in binary64 arithmetic.
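Test matrices of this kind are commonly generated as \(A = Q D Q^{\mathrm T}\) with a random orthogonal Q and prescribed diagonal D; a minimal sketch (this is a standard construction, and the paper's generator may differ in details):

```python
import numpy as np

def sym_with_spectrum(lams, seed=0):
    """Random symmetric matrix with (approximately) prescribed eigenvalues:
    A = Q diag(lams) Q^T with Q from the QR factorization of a random matrix."""
    rng = np.random.default_rng(seed)
    n = len(lams)
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return Q @ np.diag(lams) @ Q.T

# e.g. the geometrically distributed case with n = 10 and cnd = alpha = 1e8:
n, alpha = 10, 1e8
geo = [alpha ** (-(i - 1) / (n - 1)) for i in range(1, n + 1)]
A = sym_with_spectrum(geo)
```

The spectrum of the generated matrix matches the prescribed one up to rounding, so the multiple-precision eigenvalues of A can serve as reference values.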
6.1.2 Clustered eigenvalues
6.2 Computational speed
To evaluate the computational speed of the proposed algorithm (Algorithm 2), we first compare the computing time of Algorithm 2 with that of an approach using multiple-precision arithmetic (MP-approach). Note that the timings should be regarded as reference values, since the computing time for Algorithm 2 strongly depends on the implementation of accurate matrix multiplication. We adopt an efficient method proposed by Ozaki et al. [19] that utilizes fast matrix multiplication routines such as xGEMM in BLAS. To simulate multiple-precision numbers and arithmetic in Algorithm 2, we represent \(\widehat{X} = \widehat{X}_{1} + \widehat{X}_{2} + \cdots + \widehat{X}_{m}\) with \(\widehat{X}_{k}\), \(k = 1, 2, \dots , m\), being floating-point matrices in working precision, analogously to the “double-double” (\(m = 2\)) and “quad-double” (\(m = 4\)) precision formats [10], and use the concept of error-free transformations [16].
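Error-free transformations are the building block of such multi-term representations. For instance, Knuth's TwoSum computes, entirely in working precision, both the rounded sum and its exact rounding error, and it applies elementwise to matrices (a minimal sketch; the actual implementation [16, 19] is more elaborate):

```python
import numpy as np

def two_sum(a, b):
    """Knuth's TwoSum: returns (s, t) with s = fl(a + b) and s + t = a + b
    exactly (barring overflow); works elementwise on numpy arrays."""
    s = a + b
    bb = s - a
    t = (a - (s - bb)) + (b - bb)
    return s, t

# splitting a sum into a two-term ("double-double"-like) representation:
X1, X2 = two_sum(np.array([1.0]), np.array([2.0 ** -60]))
```

Here \(2^{-60}\) is far below the unit roundoff of binary64, so it is lost in the rounded sum `X1` but fully recovered in the correction term `X2`; summing such term pairs columnwise yields the representation \(\widehat{X} = \widehat{X}_{1} + \widehat{X}_{2}\).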
In the multipleprecision toolbox [1], the MRRR algorithm [6] via Householder reduction is implemented sophisticatedly with parallelism to solve symmetric eigenvalue problems.
Results for a pseudorandom real symmetric matrix with clustered eigenvalues on our workstation: \(n = 1000\), 5 clusters with 100 eigenvalues each, and \(\beta = 10^{12}\) for each cluster
Algorithm 2  eig (binary64)  \(\nu = 1\)  \(\nu = 2\)  \(\nu = 3\)
\(\Vert \widehat{E}\Vert \)  \(6.4 \times 10^{-4}\)  \(2.1 \times 10^{-7}\)  \(8.4 \times 10^{-25}\)  \(1.1 \times 10^{-38}\)
\(n_{{\mathcal {J}}}\)  0  5  0
Elapsed time (s)  0.15  6.63  23.14  45.19
(accumulated)  0.15  6.77  29.91  75.10
MP-approach  \(\texttt {mp.Digits(d)}\)  \(\texttt {d} = 34\)  \(\texttt {d} = 36\)  \(\texttt {d} = 49\)
Elapsed time (s)  16.21  256.67  285.33
Results for a pseudorandom real symmetric matrix with clustered eigenvalues on our laptop PC: \(n = 500\), 5 clusters with 10 eigenvalues each, and \(\beta = 10^{12}\) for each cluster
Algorithm 2  eig (binary64)  \(\nu = 1\)  \(\nu = 2\)  \(\nu = 3\)
\(\Vert \widehat{E}\Vert \)  \(4.5 \times 10^{-4}\)  \(1.4 \times 10^{-7}\)  \(5.8 \times 10^{-26}\)  \(2.6 \times 10^{-39}\)
\(n_{{\mathcal {J}}}\)  0  5  0
Elapsed time (s)  0.07  1.41  5.99  9.20
(accumulated)  0.07  1.48  7.47  16.67
MP-approach  \(\texttt {mp.Digits(d)}\)  \(\texttt {d} = 34\)  \(\texttt {d} = 37\)  \(\texttt {d} = 50\)
Elapsed time (s)  5.12  33.89  35.93
Results of iterative refinement by Algorithm 2 (\(\mathsf {RefSyEvCL}\)) for pseudorandom symmetric matrices on a workstation
\(n = 2000\)  eig (binary32)  \(\nu = 1\)  \(\nu = 2\)  \(\nu = 3\)
\(\Vert \widehat{E}\Vert \)  \(2.2 \times 10^{-3}\)  \(2.6 \times 10^{-6}\)  \(9.5 \times 10^{-12}\)  \(9.4 \times 10^{-17}\)
\(n_{{\mathcal {J}}}\)  0  0  0
Elapsed time (s)  0.35  1.68  0.98  1.76
(accumulated)  0.35  2.03  3.01  4.77
\(n = 5000\)  eig (binary32)  \(\nu = 1\)  \(\nu = 2\)  \(\nu = 3\)
\(\Vert \widehat{E}\Vert \)  \(3.0 \times 10^{-3}\)  \(4.6 \times 10^{-6}\)  \(3.0 \times 10^{-11}\)  \(9.4 \times 10^{-17}\)
\(n_{{\mathcal {J}}}\)  80  16  0
Elapsed time (s)  1.69  15.09  10.67  13.68
(accumulated)  1.69  16.78  27.45  41.13
\(n = 10{,}000\)  eig (binary32)  \(\nu = 1\)  \(\nu = 2\)  \(\nu = 3\)
\(\Vert \widehat{E}\Vert \)  \(3.9 \times 10^{-3}\)  \(8.6 \times 10^{-6}\)  \(9.4 \times 10^{-11}\)  \(9.4 \times 10^{-17}\)
\(n_{{\mathcal {J}}}\)  1276  587  0
Elapsed time (s)  15.81  406.52  211.36  82.93
(accumulated)  15.81  422.33  633.68  716.61
6.3 Application to a realworld problem
Finally, we apply the proposed algorithm to a quantum materials simulation that aims to understand electronic structures in material physics. The problems can be reduced to generalized eigenvalue problems, where eigenvalues and eigenvectors correspond to electronic energies and wave functions, respectively. To understand properties of materials correctly, it is crucial to determine the order of eigenvalues [12] and to obtain accurate eigenvectors [24, 25].
We deal with a generalized eigenvalue problem \(Ax = \lambda Bx\) arising from a vibrating carbon nanotube within a supercell with s, p, d atomic orbitals [4]. The matrices A and B are taken from ELSES matrix library [7] as VCNT22500, where A and B are real symmetric \(n \times n\) matrices with B being positive definite and \(n = 22500\). Our goal is to compute accurate eigenvectors and separate all the eigenvalues of the problem for determining their order. To this end, we use a numerical verification method in [14] based on the Gershgorin circle theorem (cf. e.g. [9, Theorem 7.2.2] and [23, pp. 71ff]), which can rigorously check whether all eigenvalues are separated and determine an existing range of each eigenvalue.
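For intuition, a crude (non-rigorous) version of such a separation check can be written down directly: for an accurate orthonormal X, \(S = X^{\mathrm T}AX\) is nearly diagonal, its Gershgorin disks enclose the eigenvalues, and pairwise disjoint disks determine the eigenvalue order. A verified implementation as in [14] would additionally bound all rounding errors and handle the matrix B, which this sketch omits:

```python
import numpy as np

def gershgorin_separated(A, X):
    """Check whether the Gershgorin disks of S = X^T A X are pairwise
    disjoint (crude separation test for the standard problem; sketch)."""
    S = X.T @ A @ X
    centers = np.diag(S)
    radii = np.sum(np.abs(S), axis=1) - np.abs(centers)   # row sums off diagonal
    order = np.argsort(centers)
    c, r = centers[order], radii[order]
    return all(c[i] + r[i] < c[i + 1] - r[i + 1] for i in range(len(c) - 1))
```

With a poor X the disks overlap and the order cannot be certified; after refinement the off-diagonal entries of S shrink, the disks separate, and each eigenvalue is confined to its own interval.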
We first computed an approximate eigenvector matrix \(\widehat{X}\) of \(B^{-1}A\) using the MATLAB function \(\mathsf {eig}(A,B)\) in binary64 arithmetic as an initial guess; \(\widehat{X}\) was obtained in 235.17 s. We then had \(\max _{1 \le i \le n}e_{i} = 2.75 \times 10^{-7}\), and 10 eigenvalues in 5 clusters could not be separated due to relatively small eigenvalue gaps. We next applied Algorithm 2 to A, B, and \(\widehat{X}\) in higher precision arithmetic in a similar way to Sect. 6.2 and obtained a refined approximate eigenvector matrix \(\widehat{X}'\) in 597.52 s. Finally, we obtained \(\max _{1 \le i \le n}e_{i} = 1.58 \times 10^{-14}\) and confirmed that all the eigenvalues were successfully separated.
Acknowledgements
The first author would like to express his sincere thanks to Professor Chen Greif at the University of British Columbia for his valuable comments and helpful suggestions.
References
1. Advanpix: Multiprecision Computing Toolbox for MATLAB, code and documentation. http://www.advanpix.com/ (2016)
2. Anderson, E., Bai, Z., Bischof, C., Blackford, S., Demmel, J., Dongarra, J., Du Croz, J., Greenbaum, A., Hammarling, S., McKenney, A., Sorensen, D.: LAPACK Users’ Guide, 3rd edn. SIAM, Philadelphia (1999)
3. Atkinson, K., Han, W.: Theoretical Numerical Analysis, 3rd edn. Springer, New York (2009)
4. Cerdá, J., Soria, F.: Accurate and transferable extended Hückel-type tight-binding parameters. Phys. Rev. B 61, 7965–7971 (2000)
5. Davis, C., Kahan, W.M.: The rotation of eigenvectors by a perturbation. III. SIAM J. Numer. Anal. 7, 1–46 (1970)
6. Dhillon, I.S., Parlett, B.N.: Multiple representations to compute orthogonal eigenvectors of symmetric tridiagonal matrices. Linear Algebra Appl. 387, 1–28 (2004)
7. ELSES matrix library, data and documentation. http://www.elses.jp/matrix/ (2018)
8. GMP: GNU Multiple Precision Arithmetic Library, code and documentation. http://gmplib.org/ (2018)
9. Golub, G.H., Van Loan, C.F.: Matrix Computations, 4th edn. The Johns Hopkins University Press, Baltimore (2013)
10. Hida, Y., Li, X.S., Bailey, D.H.: Algorithms for quad-double precision floating point arithmetic. In: Proceedings of the 15th IEEE Symposium on Computer Arithmetic, pp. 155–162. IEEE Computer Society Press (2001)
11. Higham, N.J.: Accuracy and Stability of Numerical Algorithms, 2nd edn. SIAM, Philadelphia (2002)
12. Lee, D., Hoshi, T., Sogabe, T., Miyatake, Y., Zhang, S.-L.: Solution of the \(k\)th eigenvalue problem in large-scale electronic structure calculations. J. Comput. Phys. 371, 618–632 (2018)
13. Li, X.S., Demmel, J.W., Bailey, D.H., Henry, G., Hida, Y., Iskandar, J., Kahan, W., Kang, S.Y., Kapur, A., Martin, M.C., Thompson, B.J., Tung, T., Yoo, D.: Design, implementation and testing of extended and mixed precision BLAS. ACM Trans. Math. Softw. 28, 152–205 (2002)
14. Miyajima, S.: Numerical enclosure for each eigenvalue in generalized eigenvalue problem. J. Comput. Appl. Math. 236, 2545–2552 (2012)
15. MPFR: The GNU MPFR Library, code and documentation. http://www.mpfr.org/ (2018)
16. Ogita, T., Rump, S.M., Oishi, S.: Accurate sum and dot product. SIAM J. Sci. Comput. 26, 1955–1988 (2005)
17. Ogita, T., Aishima, K.: Iterative refinement for symmetric eigenvalue decomposition. Jpn. J. Ind. Appl. Math. 35(3), 1007–1035 (2018)
18. Oishi, S.: Fast enclosure of matrix eigenvalues and singular values via rounding mode controlled computation. Linear Algebra Appl. 324, 133–146 (2001)
19. Ozaki, K., Ogita, T., Oishi, S., Rump, S.M.: Error-free transformations of matrix multiplication by using fast routines of matrix multiplication and its applications. Numer. Algorithms 59, 95–118 (2012)
20. Parlett, B.N.: The Symmetric Eigenvalue Problem. Classics in Applied Mathematics, vol. 20, 2nd edn. SIAM, Philadelphia (1998)
21. Rump, S.M.: Fast and parallel interval arithmetic. BIT Numer. Math. 39, 534–554 (1999)
22. Rump, S.M., Ogita, T., Oishi, S.: Accurate floating-point summation, part II: sign, \(K\)-fold faithful and rounding to nearest. SIAM J. Sci. Comput. 31, 1269–1302 (2008)
23. Wilkinson, J.H.: The Algebraic Eigenvalue Problem. Clarendon Press, Oxford (1965)
24. Yamamoto, S., Fujiwara, T., Hatsugai, Y.: Electronic structure of charge and spin stripe order in \({\rm La}_{2-x}\,{\rm Sr}_{x}\,{\rm NiO}_{4}\) (\(x = \frac{1}{3}, \frac{1}{2}\)). Phys. Rev. B 76, 165114 (2007)
25. Yamamoto, S., Sogabe, T., Hoshi, T., Zhang, S.-L., Fujiwara, T.: Shifted conjugate-orthogonal-conjugate-gradient method and its application to double orbital extended Hubbard model. J. Phys. Soc. Jpn. 77, 114713 (2008)
26. Yamamoto, T.: Error bounds for approximate solutions of systems of equations. Jpn. J. Appl. Math. 1, 157–171 (1984)
Copyright information
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.