Computing Interior Eigenvalues of Large Sparse Symmetric Matrices

In this paper we present new restarted Krylov methods for calculating interior eigenvalues of large sparse symmetric matrices. The proposed methods are compact versions of the Heart iteration, modified to retain the monotonicity property. Numerical experiments illustrate the usefulness of the proposed approach.


Introduction
The Heart iteration is a new type of restarted Krylov method. Given a symmetric matrix G ∈ R^{n×n}, the method is aimed at calculating a cluster of k exterior eigenvalues of G. Like other Krylov methods it is best suited for large sparse matrices in which a matrix-vector product needs only O(n) flops. Another underlying assumption is that k^2 is considerably smaller than n.

The use of restarted Krylov methods for solving large eigenvalue problems was considered by several authors, e.g., [1]-[22]. Most of the methods are based on implicitly restarted Lanczos tridiagonalization algorithms in which the starting vector is determined by "polynomial filtering" or "thick restart". The Heart iteration does not use these tools. It is based on Gram-Schmidt orthogonalizations and a simple explicit choice of the starting vector. This results in a simple iteration that enjoys rapid convergence.

Former implementations of the Heart iteration concentrate on the computation of exterior eigenvalues, e.g., [4]-[8]. In this case the method possesses an important monotonicity property: the computed Ritz values proceed monotonically toward the desired eigenvalues of G. However, when moving to compute interior eigenvalues the monotonicity property is not guaranteed, and the Heart iteration needs some amendments. These amendments are derived in this paper. Note that although we speak of clusters of eigenvalues, the algorithm is carried out by computing the related invariant subspace of G, that is, a subspace that is spanned by the corresponding eigenvectors.
The shift-and-invert technique is a useful tool for calculating a cluster of eigenvalues. Let λ_i, i = 1, …, n, denote the eigenvalues of G and assume that we want to compute k eigenvalues of G that are close to some interior point ν. The term "interior" means that at least one eigenvalue is smaller than ν, and at least one eigenvalue is larger than ν. If ν ≠ λ_j for j = 1, …, n, then the matrix G − νI is invertible. The inverse of this matrix has the eigenvalues 1/(λ_j − ν), j = 1, …, n, and an eigenvector of G that corresponds to λ_j is an eigenvector of (G − νI)^{−1} that corresponds to 1/(λ_j − ν), and vice versa. This relation turns the eigenvalues of G − νI that are smallest in modulus into the eigenvalues of (G − νI)^{−1} that are largest in modulus. Therefore, applying a restarted Krylov method to (G − νI)^{−1} is likely to achieve fast convergence toward the desired invariant subspace.
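The effect of the shift-and-invert mapping can be illustrated in a few lines. The sketch below (Python/NumPy, with a made-up spectrum) shows that the eigenvalue of G closest to the target point ν becomes the eigenvalue of (G − νI)^{−1} of largest modulus, which is exactly what exterior Krylov methods find fastest.

```python
import numpy as np

# Illustrative values only: an assumed spectrum of G and an interior target nu.
lam = np.array([-3.0, -1.0, 0.5, 2.0, 10.0])   # assumed eigenvalues of G
nu = 0.4                                       # interior target point

# Eigenvalues of (G - nu*I)^{-1} are 1/(lambda_j - nu).
mu = 1.0 / (lam - nu)

# The transformed eigenvalue of largest modulus recovers the eigenvalue of G
# nearest to nu (here 0.5).
nearest = lam[np.argmax(np.abs(mu))]
```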
After the shift we need to compute the k smallest eigenvalues of a given symmetric indefinite matrix, and this is the problem that we actually solve. The term "indefinite" means that the matrix has at least one negative eigenvalue, and at least one positive eigenvalue. This feature is the reason for losing the monotonicity property.
The algorithms proposed in this paper are aimed at solving the following problem. Let G ∈ R^{n×n} be a given large sparse symmetric indefinite matrix whose eigenvalues are sorted to satisfy

|λ_1| ≤ |λ_2| ≤ ⋯ ≤ |λ_n|.  (1)

Then it is desired to compute the k smallest eigenvalues of G. As before, if |λ_1| > 0 then applying a restarted Krylov method to G^{−1} is likely to achieve rapid convergence. However, since G is a large matrix, it is expensive to compute G^{−1}, and matrix-vector products of the form G^{−1}b are obtained by solving the linear system

G z = b.  (3)

If G is factorizable then we can use a direct solution method. Otherwise the linear system is solved by some iterative method. An important advantage of the Heart iteration is that it allows the use of inexact inversions. That is, there is no need to compute an accurate solution of the linear system, e.g., [4]. Some linear solvers are restricted to positive definite matrices. In this case one may solve the linear system

G^T G z = G^T b  (4)

instead of (3). The algorithms described in this paper rely on the assumption that we have effective methods for solving these systems.

A straightforward way to extract Ritz vectors is by using the spectral decomposition of the related Rayleigh quotient matrix. Yet, as we shall see, this option lacks the monotonicity property. One way to retain monotonicity is by moving to compute the largest eigenvalues of G^{−1}. This approach has an effective "compact" version that avoids direct computation of G^{−1} but requires accurate solutions of (3).
A second way to achieve monotonicity is by moving from eigenvalue computation to singular value computation. Here the algorithm computes the k smallest singular values of G and the related n × k orthonormal matrix, V, whose columns are the corresponding right singular vectors. Once V is computed, the desired eigenpairs of G are retrieved from the Rayleigh quotient matrix V^T G V. This method can be viewed as a modified version of the basic Heart iteration that is used in [5]. The main improvements are the introduction of a semi-compact expansion scheme and the use of classical Gram-Schmidt orthogonalization.
The plan of the paper is as follows. The SVD iteration is presented in the next section. It computes the k smallest singular values of G and the corresponding right singular vectors. Hence it is necessary to convert the singular vectors into appropriate eigenvectors. This issue is resolved in Sect. 3. Then, in Sect. 4, we establish the monotonicity property. A semi-compact spectral iteration is presented in Sect. 5. This iteration is more intuitive but lacks the monotonicity property. The compact iteration that is described in Sect. 6 gains monotonicity by moving to compute the largest eigenvalues of G^{−1}. This iteration becomes attractive whenever we have an efficient algorithm for accurate solution of (3). The paper ends with numerical experiments that illustrate the behavior of the proposed methods.

A Semi-compact SVD Iteration
The algorithm described below is aimed at computing the k smallest singular values of G and the corresponding right singular vectors. The qth iteration, q = 0, 1, 2, …, has two parts. It starts with a matrix V_q ∈ R^{n×k} that contains "old" information on the target space, a matrix Y_q ∈ R^{n×ℓ} that contains "new" information, and a matrix X_q = [V_q, Y_q] ∈ R^{n×(k+ℓ)} that includes all the known information. The matrix X_q has p = k + ℓ orthonormal columns. That is,

X_q^T X_q = I_p.  (5)

(A typical value for ℓ is ℓ = k.)

Part 1: Information extraction (subspace contraction). First compute the matrix

H_q = G X_q.  (6)

Then compute the k smallest singular values of H_q and the corresponding k right singular vectors. The singular vectors are assembled into a matrix

U_q ∈ R^{p×k},  (7)

which is used to compute the related matrix of Ritz vectors,

V_{q+1} = X_q U_q.  (8)

Since both X_q and U_q have orthonormal columns, the matrix V_{q+1} inherits this property.

Part 2: Information retrieval (subspace expansion). First set the starting matrix

X = V_{q+1}  (9)

and the starting vector (10), which is built from the vector e = (1, 1, …, 1)^T ∈ R^k and the inverting operator C (see below). Then for j = k + 1, …, k + ℓ, the jth column of X is computed as follows:
a) Compute r = X^T z and set z = z − X r.
b) Repeat step a).
c) Normalize z, append it to X as the jth column, and set z = C z.

Explanatory Comments.

Comment 1: The expansion process starts with the matrix X = V_{q+1} ∈ R^{n×k} and ends with the matrix X = X_{q+1} ∈ R^{n×p} that satisfies

X_{q+1}^T X_{q+1} = I_p.

Note that the first k columns of X_{q+1} are the columns of V_{q+1}. This feature stands behind the monotonicity property. (The description of the expansion process omits indices to ease the notations.)

Comment 2: Steps a) and b) achieve Gram-Schmidt orthogonalizations of z against the former columns of X. (One orthogonalization is not sufficient, but twice is enough.) Here we use classical G-S orthogonalization, while in earlier versions of the Heart iteration we have used modified G-S orthogonalization, e.g., [4]-[6]. The classical G-S is somewhat simpler and may save time.

Comment 3: The normalization of z is carried out by replacing z with z/‖z‖_2, where ‖·‖_2 denotes the Euclidean vector norm.

Comment 4: The matrix C is a symbol for an inverting operator. That is, matrix-vector products of the form Cz are computed by solving a suitable linear system such as (3) or (4). If the linear system is solved by some iterative method, then there is no need to compute a highly accurate solution. In other words, it is possible to use "inexact inversions". If the linear solver requires a factorization of G (or G^T G), then the factorization is done only once, before starting the iterative process. Another alternative is to define C as a suitable polynomial filter, e.g., [13].

Comment 5: The expansion process builds a Krylov subspace whose starting vector is defined by (10). In earlier explicit methods there was no orthogonalization against the columns of V_{q+1}, which causes certain difficulties, e.g., [14,16].

Comment 6: The above iteration is a "semi-compact Heart iteration". The name "Heart iteration" comes from the similarity to the heart's systole-diastole cardiac cycle: Part 1 achieves subspace contraction, while in Part 2 the subspace is expanded. Former versions of the Heart iteration are proposed in [4]-[8]. It is a "semi-compact" scheme since the building of G X_q is separated from the expansion process. In compact versions the Rayleigh quotient matrix is built during the expansion process, see Sect. 6.
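The two parts of the iteration can be sketched in a few lines of Python/NumPy for dense matrices. This is our own illustrative stand-in, not the authors' code: `C` represents the inverting operator, and since the exact formula for the starting vector is not reproduced above, `C(V e)` is used as a plausible substitute. Each call performs one expansion from V_{q+1} followed by one contraction.

```python
import numpy as np

def heart_svd_iteration(G, V, ell, C):
    """One semi-compact SVD Heart iteration (illustrative sketch).

    G : dense symmetric matrix; V : n x k orthonormal Ritz basis;
    ell : number of expansion steps; C : the inverting operator,
    e.g. an (inexact) solver for (3) or (4)."""
    n, k = V.shape
    # Part 2: expansion, starting from X = V with a stand-in starting vector
    X = V.copy()
    z = C(X @ np.ones(k))                 # built from e = (1,...,1)^T and C
    for _ in range(ell):
        z = z - X @ (X.T @ z)             # classical Gram-Schmidt ...
        z = z - X @ (X.T @ z)             # ... applied twice (Comment 2)
        z = z / np.linalg.norm(z)         # normalization (Comment 3)
        X = np.column_stack([X, z])
        z = C(z)                          # next Krylov direction
    # Part 1: contraction via the SVD of H = G X
    H = G @ X
    _, s, Wt = np.linalg.svd(H, full_matrices=False)   # s is descending
    U = Wt.T[:, :-k - 1:-1]               # right sing. vectors of the k smallest
    return X @ U, s[:-k - 1:-1]           # V_{q+1} and ascending Ritz sing. values
```

Running two such iterations on a small diagonal test matrix shows the orthonormality of the Ritz basis and the monotone decrease of the Ritz singular values established in Sect. 4.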

Retrieving the Smallest Eigenpairs
At the end of the iterative process we obtain the k smallest singular values of G and an orthonormal matrix V ∈ R^{n×k} whose columns are the corresponding right singular vectors of G. Hence this information must be converted into the k smallest eigenvalues of G and the corresponding eigenvectors. Usually a right singular vector of G is also an eigenvector of G. Yet this rule can be violated when G has a pair of nonzero eigenvalues, say λ_j and λ_{j+1}, such that λ_{j+1} = −λ_j. In this case it is possible to have a right singular vector that corresponds to |λ_j| = |λ_{j+1}| which is not an eigenvector of G. However, as explained below, it is possible to ensure that V will have the "retrieving ability", which means that the columns of V provide an orthonormal basis for the desired invariant subspace of G.
(That is, an invariant subspace that corresponds to the k smallest eigenvalues of G.) In this case the related eigenpairs are obtained as follows. First compute the Rayleigh quotient matrix

R = V^T G V,  (12)

then achieve a spectral decomposition of R to obtain the related Ritz values and Ritz vectors. More precisely, let Q ∈ R^{k×k} be an orthonormal matrix such that

Q^T R Q = diag{ρ_1, …, ρ_k};

then ρ_1, …, ρ_k are the smallest eigenvalues of G, and the columns of the matrix V Q are the corresponding eigenvectors of G.
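The retrieval step above amounts to a small k × k eigenproblem. The sketch below (our own illustration, dense NumPy) reads the eigenpairs of G off the Rayleigh quotient matrix when V spans the desired invariant subspace.

```python
import numpy as np

def retrieve_eigenpairs(G, V):
    """Retrieve eigenpairs of G from an orthonormal basis V of an invariant
    subspace, via the Rayleigh quotient matrix R = V^T G V."""
    R = V.T @ G @ V
    rho, Q = np.linalg.eigh(R)          # spectral decomposition, ascending rho
    return rho, V @ Q                   # Ritz values and Ritz vectors of G

# Toy check: V spans the invariant subspace of the eigenvalues {-4, 1}.
G = np.diag([-4.0, 1.0, 3.0, 9.0])
V = np.eye(4)[:, :2]
rho, W = retrieve_eigenpairs(G, V)      # rho = [-4, 1]; columns of W eigenvectors
```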
The question of whether V has the retrieving ability is related to the following property. Let λ_j be a given eigenvalue of G. Then λ_j has the "sign property" if any other eigenvalue of G, say λ_i, that satisfies |λ_i| = |λ_j| also satisfies λ_i = λ_j. In other words, an eigenvalue λ_j has the "sign property" if all the other eigenvalues that have the same modulus as λ_j also have the same sign as λ_j. The importance of this feature comes from the following observations.

Theorem 1 Let λ_j be an eigenvalue of G that has the sign property. Let E_j denote the eigenspace of G that corresponds to λ_j. Let W_j denote the eigenspace of G^2 that corresponds to λ_j^2. In other words, W_j is spanned by all the right singular vectors of G that correspond to |λ_j|. Then

W_j = E_j.

That is, a right singular vector of G that corresponds to |λ_j| is also an eigenvector of G that corresponds to λ_j.
Proof Let v_j ∈ R^n be an eigenvector of G that corresponds to λ_j. Then the equality G^2 v_j = λ_j^2 v_j means that v_j is also a singular vector of G that corresponds to |λ_j|. This shows that E_j is contained in W_j. Therefore, since both subspaces have the same dimension, W_j = E_j.

Theorem 2 Let λ_j be an eigenvalue of G that does not have the sign property. In this case there exist singular vectors of G that correspond to |λ_j| which are not eigenvectors of G, but we have the following relations. Let W_j be as above. Let E_j^+ and E_j^− denote the eigenspaces of G that correspond to |λ_j| and −|λ_j|, respectively. Then

W_j = E_j^+ ⊕ E_j^−.  (15)

That is, W_j is a direct sum of the eigenspaces E_j^+ and E_j^−.

Proof Both E_j^+ and E_j^− are contained in W_j, and these two eigenspaces are orthogonal to each other. Counting the multiplicity of the singular value |λ_j| shows that the direct sum has the same dimension as W_j, which proves (15).
As before it is possible to assume that the eigenvalues of G satisfy (1). Then, since G is symmetric, the singular values of G are |λ_j|, j = 1, …, n, and the k smallest singular values that we compute are

|λ_1| ≤ |λ_2| ≤ ⋯ ≤ |λ_k|.

This means that |λ_k| is the largest singular value that we compute, while the next singular value in size, which is not computed, is |λ_{k+1}|. It is also possible to assume that the columns of V are ordered in a similar manner, where v_j, j = 1, …, k, is a right singular vector of G that corresponds to |λ_j|. With these notations and assumptions, Theorems 1 and 2 lead to the following conclusions.

Theorem 3 Consider the Rayleigh quotient matrix (12). If λ_j has the sign property then the jth column of this matrix equals λ_j e_j, where e_j denotes the jth column of the k × k identity matrix. Otherwise the jth column is a linear combination of the vectors e_i, i = i_1, …, i_2, where the indices i_1 < i_2 are such that the set λ_i, i = i_1, …, i_2, includes all the eigenvalues that satisfy |λ_i| = |λ_j|.

Proof The jth column of V^T G V has the form V^T G v_j, j = 1, …, k, and the entries of this column are v_i^T G v_j, i = 1, …, k. If λ_j has the sign property then G v_j = λ_j v_j and V^T G v_j = λ_j e_j. Otherwise, when λ_j does not have the sign property, there exist indices i_1 < i_2 for which the set λ_i, i = i_1, …, i_2, includes all the eigenvalues λ_i that satisfy |λ_i| = |λ_j|. In this case (15) tells us that v_j is a linear combination of eigenvectors whose eigenvalues come from this set. Consequently the vector G v_j is also a linear combination of these eigenvectors, and the vector V^T G v_j is a linear combination of the vectors e_i, i = i_1, …, i_2.

The last observation indicates that the key for determining whether V has the retrieving ability lies in the properties of λ_k.

Corollary 6 Assume that |λ_k| = |λ_{k+1}|. If λ_k has the sign property then V has the retrieving ability. Otherwise, when λ_k does not have the sign property, the retrieving ability is not ensured.
If V has the retrieving ability then the absolute values of the computed Ritz values should equal the corresponding singular values (which have been computed together with V). This gives a simple practical test to verify that the retrieving ability indeed holds. Otherwise, when this test fails, the situation described in Corollary 6 holds. That is, |λ_k| = |λ_{k+1}| and the set of all the eigenvalues of G whose modulus equals |λ_k| includes both a negative eigenvalue and a positive eigenvalue. Consequently a tiny shift of G, say by τ, is likely to split this set into two separate sets: in one set all the eigenvalues have the same positive value, in the other set all the eigenvalues have the same negative value, while the corresponding absolute values differ by the amount 2|τ|. Therefore, if the retrieving test fails then Step 1 is repeated with a slightly different shift, which ensures the retrieving ability.
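The practical retrieving test can be stated in a few lines. The sketch below is our own illustration: it compares the moduli of the Ritz values of V^T G V with the singular values computed together with V, and it shows how a vector that mixes the eigenspaces of −2 and +2 (a modulus tie with opposite signs) makes the test fail.

```python
import numpy as np

def retrieving_test(G, V, sigma, tol=1e-8):
    """Return True when the moduli of the Ritz values of V^T G V match the
    computed singular values sigma, i.e. when the retrieving ability holds."""
    rho = np.linalg.eigvalsh(V.T @ G @ V)
    return np.allclose(np.sort(np.abs(rho)), np.sort(sigma), atol=tol)

# Eigenvalues -2 and 2 share a modulus with opposite signs, so a singular
# vector for the singular value 2 need not be an eigenvector of G:
G = np.diag([-2.0, 2.0, 5.0])
v_mixed = np.array([[1.0], [1.0], [0.0]]) / np.sqrt(2.0)  # mixes both eigenvectors
ok = retrieving_test(G, v_mixed, sigma=np.array([2.0]))   # the test fails here
```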

The Monotonicity Property
The monotonicity property of the computed singular values is based on the following extensions of the Cauchy interlace theorem and the Poincaré separation theorem. (The original theorems of Cauchy and Poincaré refer to eigenvalues of symmetric matrices. See, for example, [11] and [15].)

Theorem 7 Let G be an n × n matrix with singular values

0 ≤ σ_1 ≤ σ_2 ≤ ⋯ ≤ σ_n.

Let the n × k matrix F be obtained from G by deleting n − k columns of G, and let

0 ≤ η_1 ≤ η_2 ≤ ⋯ ≤ η_k

denote the singular values of F. Then

σ_j ≤ η_j, j = 1, …, k,  (20)

and

η_j ≤ σ_{j+n−k}, j = 1, …, k.  (21)

In particular, for k = n − 1 we obtain the interlacing relations

σ_1 ≤ η_1 ≤ σ_2 ≤ η_2 ≤ ⋯ ≤ η_{n−1} ≤ σ_n.

Proof The proof is concluded by applying the Cauchy interlace theorem to the matrices G^T G and F^T F.

Corollary 8 Let the n × p matrix X have orthonormal columns, and let

0 ≤ ξ_1 ≤ ξ_2 ≤ ⋯ ≤ ξ_p

denote the singular values of the n × p matrix G X. Let the n × k matrix V be obtained from X by deleting ℓ = p − k columns of X, and let

η_1 ≤ η_2 ≤ ⋯ ≤ η_k

denote the singular values of the n × k matrix G V. Then

ξ_j ≤ η_j, j = 1, …, k,  (23)

and

η_j ≤ ξ_{j+ℓ}, j = 1, …, k.  (24)

Proof Let Q ∈ R^{n×n} be an orthonormal matrix whose first p columns are those of X. Then the matrix G Q has the same singular values as G. Now the inequalities (23) and (24) are concluded by following the proof of Theorem 7.
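The column-deletion interlacing of Corollary 8 is easy to verify numerically. The sketch below (our own illustration, with arbitrary small sizes) checks that the ascending singular values ξ of G X and η of G V satisfy ξ_j ≤ η_j ≤ ξ_{j+ℓ}.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))
G = (A + A.T) / 2.0                                 # symmetric, generically indefinite
X, _ = np.linalg.qr(rng.standard_normal((6, 4)))    # p = 4 orthonormal columns
V = X[:, :2]                                        # keep k = 2 columns, delete 2

xi = np.sort(np.linalg.svd(G @ X, compute_uv=False))   # ascending xi_1..xi_4
eta = np.sort(np.linalg.svd(G @ V, compute_uv=False))  # ascending eta_1..eta_2

# xi_j <= eta_j <= xi_{j+l} with l = 2, for j = 1, 2
interlaced = all(xi[j] <= eta[j] + 1e-10 and eta[j] <= xi[j + 2] + 1e-10
                 for j in range(2))
```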
Observe that the proofs do not use the symmetry of G. In fact, similar arguments show that the above bounds hold for any m × n matrix with m ≥ n. Moreover, the upper bounds (21) and (24) remain valid when m < n. Yet the lower bounds (20) and (23) are not guaranteed when m < n.

Theorem 9 (Monotonicity) Consider the qth iteration of the new method, q = 1, 2, …, and let

σ_1^{(q)} ≤ σ_2^{(q)} ≤ ⋯ ≤ σ_p^{(q)}

denote the singular values of the matrix H_q = G X_q. Then the k smallest singular values of these matrices satisfy

σ_j^{(q+1)} ≤ σ_j^{(q)}  (26)

for j = 1, …, k and q = 1, 2, 3, ….

Proof The k Ritz singular values which are computed at Part 1 of the (q+1)th iteration are the smaller ones,

σ_j^{(q+1)}, j = 1, …, k.

Similarly, the former iteration has computed σ_j^{(q)}, j = 1, …, k, which are the singular values of the matrix G V_{q+1}. Therefore, since the first k columns of X_{q+1} are the columns of V_{q+1}, these matrices satisfy the conditions of Corollary 8, and (26) is a direct corollary of (23).
The proofs of Corollary 8 and Theorem 9 emphasize the importance of the orthonormality relations, and provide the motivation behind the proposed Heart iteration. The monotonicity property (26) ensures that σ_j^{(q)} moves closer to σ_j at each iteration. That is, each iteration improves the computed estimates of the smallest singular values. (In eigenvalue terminology the computed estimates are called Ritz values.)

A Semi-compact Spectral Iteration
The traditional way to achieve extraction is by using the spectral decomposition of the related Rayleigh quotient matrix. Former restarted Krylov methods compute this matrix via the Lanczos tridiagonalization algorithm, e.g., [3,13,17,21,22]. Yet the Heart iteration avoids the Lanczos process and builds the Rayleigh quotient matrix directly. This takes Part 1 into the following form, while Part 2 remains unchanged.

Part 1: Spectral extraction. First compute the Rayleigh quotient matrix

S_q = X_q^T G X_q.

Then compute the k smallest eigenvalues of S_q. The corresponding k eigenvectors of S_q are assembled into a matrix

U_q ∈ R^{p×k},

which is used to compute the related matrix of Ritz vectors

V_{q+1} = X_q U_q.

Indeed, this approach works very well when calculating small eigenvalues of positive definite matrices, e.g., [4]. However, when G turns out to be indefinite the monotonicity property is not guaranteed, while loss of monotonicity may prevent the algorithm from achieving fast convergence. See Table 4.
A second weakness of this iteration is that C is restricted to (3), since solving (4) may bring inappropriate information that retards convergence. The reasons lie in the difficulties mentioned in Sect. 3.
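The spectral extraction step above replaces the SVD of G X by a small symmetric eigenproblem. The sketch below is our own dense illustration of this Part 1 (Part 2 is unchanged from the SVD iteration).

```python
import numpy as np

def spectral_extraction(G, X, k):
    """Part 1 of the semi-compact spectral iteration (illustrative sketch):
    extract Ritz pairs from the Rayleigh quotient S = X^T G X."""
    S = X.T @ G @ X
    d, Q = np.linalg.eigh(S)             # eigenvalues of S in ascending order
    return X @ Q[:, :k], d[:k]           # Ritz vectors and k smallest Ritz values

# Toy usage: X spans the leading coordinates of a diagonal matrix.
G = np.diag([-3.0, 1.0, 4.0, 7.0])
X = np.eye(4)[:, :3]
V, d = spectral_extraction(G, X, 2)      # d = [-3, 1]
```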

A Compact Spectral Iteration
This iteration is aimed at computing the k largest eigenvalues of G^{−1} and the related eigenvectors, which are the actual eigenvectors that we seek. Therefore, the extraction stage is carried out by computing the k largest eigenpairs of the Rayleigh quotient matrix

S_q = X_q^T G^{−1} X_q.  (30)

As before, G^{−1} is not available, and matrix-vector products of the form G^{−1}b are computed by solving (3). But here S_q is computed "on the fly", during the expansion process. This can be done by noting that the columns of S_q have the form

s_j = X_q^T (G^{−1} x_j),

where x_j denotes the jth column of X_q. Hence the vector z = G^{−1} x_j can be used both to construct S_q and to expand the Krylov subspace. The resulting "compact" iteration has the following form. (We omit the iteration indices for simplicity.)

Part I: Contraction

Compute the k largest eigenvalues of S and the corresponding eigenvectors. The eigenvalues construct a diagonal matrix, D ∈ R^{k×k}. The eigenvectors are assembled into the matrix

U ∈ R^{p×k},

which is used to compute the related matrix of Ritz vectors

V = X U.

Since both X and U have orthonormal columns, the matrix V inherits this property.

Part II: Expansion

First set the starting vector z and compute r = X^T z. Then each new column of X is obtained by the following steps:
(a) Set z = z − X r.
(b) Compute r = X^T z and repeat step (a).
(c) Normalize z and append it to X; the products r = X^T z computed for z = G^{−1} x_j provide the corresponding columns of S.

The above iteration is a variant of the compact Heart algorithm [7]. Here the compact scheme is implicitly applied to G^{−1} and computes the k largest eigenpairs of this matrix. Assume for a moment that η is an eigenvalue of G^{−1} that has the largest modulus, and let η^{(q)} denote the related Ritz value at the qth iteration, q = 0, 1, 2, …. Then 1/η^{(q)} is the corresponding estimate for the smallest eigenvalue of G, while the monotonicity property for large eigenvalues [7] implies the relations

η^{(q)} ≤ η^{(q+1)} ≤ η when η > 0,

and

η ≤ η^{(q+1)} ≤ η^{(q)} when η < 0.

Similar relations hold for the other Ritz values, forcing them to proceed monotonically toward the desired eigenvalues of G^{−1}.
The compact spectral iteration becomes attractive when we have an effective method for solving (3) accurately. The accuracy is needed in order to ensure that the computed Rayleigh quotient matrix indeed equals (30). Thus, clearly, in this approach it is not possible to use inexact solutions.
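A dense sketch of the compact scheme is given below (our own illustration, not the authors' code). `solve(b)` must return an accurate solution of G z = b; each solve w = G^{−1} x_j is reused, supplying one column of S = X^T G^{−1} X and, orthogonalized, the next Krylov column. The starting vector of the expansion is an illustrative stand-in, and "largest" eigenvalues are taken in modulus, matching the monotonicity relations above.

```python
import numpy as np

def compact_heart_iteration(solve, V, ell):
    """One compact Heart iteration applied implicitly to G^{-1} (sketch).

    solve : accurate solver for G z = b;  V : n x k orthonormal Ritz basis;
    ell : number of expansion columns.  Returns new Ritz basis and the
    corresponding Ritz values of G (reciprocals of those of G^{-1})."""
    n, k = V.shape
    X, W = V.copy(), []
    for j in range(k + ell):
        w = solve(X[:, j])                    # w = G^{-1} x_j, reused twice
        W.append(w)
        if X.shape[1] < k + ell:              # expansion: orthogonalize and append
            z = w.copy()
            for _ in range(2):
                z = z - X @ (X.T @ z)         # r = X^T z; z = z - X r, twice
            X = np.column_stack([X, z / np.linalg.norm(z)])
    S = X.T @ np.column_stack(W)              # S = X^T G^{-1} X, from the solves
    S = (S + S.T) / 2.0                       # symmetrize against rounding
    d, Q = np.linalg.eigh(S)
    idx = np.argsort(-np.abs(d))[:k]          # k largest-modulus eigenvalues of G^{-1}
    return X @ Q[:, idx], 1.0 / d[idx]        # Ritz vectors; Ritz values of G
```

On a small diagonal test matrix a handful of iterations recovers the two eigenvalues of G of smallest modulus.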

The Initial Orthonormal Matrix
To start the algorithm we need an "initial" orthonormal matrix X_0 ∈ R^{n×p}, where p = k + ℓ. In our experiments X_0 is obtained by starting the expansion process with an explicitly chosen unit vector x_1 and the vector z = C x_1.
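The initialization can be sketched as follows (our own illustration; the paper's explicit formula for x_1 is not reproduced here, and a normalized all-ones vector is used as a stand-in).

```python
import numpy as np

def initial_basis(C, n, p):
    """Build an initial orthonormal X0 in R^{n x p} by running the expansion
    process from a unit vector x1 (here: normalized all-ones, a stand-in)."""
    x1 = np.ones(n) / np.sqrt(n)
    X = x1[:, None]
    z = C(x1)
    while X.shape[1] < p:
        z = z - X @ (X.T @ z)               # classical Gram-Schmidt ...
        z = z - X @ (X.T @ z)               # ... applied twice
        z = z / np.linalg.norm(z)
        X = np.column_stack([X, z])
        z = C(z)                            # next Krylov direction
    return X

G = np.diag([2.0, -3.0, 5.0, -7.0, 11.0])
X0 = initial_basis(lambda b: np.linalg.solve(G, b), 5, 4)
```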

Table 5
Computation times and number of iterations for a Uniform Dist matrix, n = 500, 000

Numerical Experiments
In this section we describe some experiments with the proposed methods. The semi-compact SVD iteration is the algorithm described in Sect. 2 with C defined by solving (4). The semi-compact spectral iteration is the algorithm of Sect. 5, while the compact spectral iteration is the algorithm described in Sect. 6. The test matrices are large sparse symmetric indefinite matrices that have the form

G = H D H^T,

where D ∈ R^{n×n} is a diagonal matrix and H ∈ R^{n×n} is a sparse orthonormal matrix,

H = H_1 H_2 H_3 H_4.

The matrices H_i, i = 1, …, 4, are n × n sparse random Householder matrices,

H_i = I − 2 h_i h_i^T / (h_i^T h_i),

where h_i ∈ R^n is a sparse random vector. To generate these vectors we have used MATLAB's command h = sprand(n, 1, density) with density = 200/n. Since H is a product of Householder matrices, it is an orthonormal matrix, and the diagonal entries of D are the eigenvalues of G. The details of the test matrices are specified in Table 1. Tables 2-3 provide the number of iterations that were needed to satisfy the stopping condition, which is based on the eigenvalue error defined in (43).
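A small dense analogue of this construction can be written as follows (our own illustration in NumPy; the paper uses sparse random vectors h_i, while dense ones suffice to show that the eigenvalues of G are the diagonal entries of D).

```python
import numpy as np

def householder_test_matrix(d, rng):
    """Build G = H D H^T with H a product of four random Householder
    matrices H_i = I - 2 h_i h_i^T / (h_i^T h_i).  Dense illustration."""
    n = len(d)
    G = np.diag(np.asarray(d, dtype=float))
    for _ in range(4):
        h = rng.standard_normal(n)
        H = np.eye(n) - 2.0 * np.outer(h, h) / (h @ h)   # orthonormal reflector
        G = H @ G @ H.T                                  # similarity transform
    return G

rng = np.random.default_rng(3)
d = [-4.0, -1.0, 0.5, 2.0, 6.0]
G = householder_test_matrix(d, rng)     # eigenvalues of G are the entries of d
```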

As before, the eigenvalues of G are assumed to satisfy (1), and λ_j^{(q)}, j = 1, …, k, are the Ritz values at the qth iteration, q = 0, 1, 2, …. (In the SVD algorithm the Ritz values were computed from the spectral decomposition of the related Rayleigh quotient matrix (12).) If the number of iterations equals 0 it means that the stopping condition is satisfied at the starting point, which is computed as described in Sect. 7.
The experiments in Tables 2 and 3 were carried out with ℓ = max{40, k}. Comparing these tables shows that the tested methods behave in a similar way regarding the number of iterations. The first three test problems are difficult to solve and require several iterations. A similar difficulty is shared by MATLAB's "eigs" function. For example, using "eigs" to compute the k = 10 smallest eigenvalues of these matrices ends with the warning "0 of the 10 requested eigenvalues converged". Yet in the other test matrices we observe fast convergence.
The semi-compact spectral method of Sect. 5 lacks the monotonicity property. This deficiency is illustrated in Table 4, which tracks the behavior of the eigenvalue error defined in (43). The results of Table 4 were obtained when computing the k = 10 smallest eigenvalues of the Harmonic Roots matrix. A similar lack of monotonicity was noticed in the Normal Dist matrix and the Uniform Dist matrix. These examples illustrate the importance of having the monotonicity property.
Given k, the number of eigenvalues that we want to compute, it is desired to use an "optimal" value of ℓ, one which minimizes the overall computation time that is needed to achieve our stopping condition. Clearly, increasing ℓ is likely to reduce the number of iterations. Yet at the same time it increases the computational effort per iteration. Hence, finding an optimal value for ℓ is not that straightforward. The experiments in Table 5 illustrate the way k and ℓ affect the number of iterations and the overall computation time. The results are obtained by applying the semi-compact SVD algorithm to a huge "Uniform Dist" matrix with n = 500,000. To make this algorithm more effective we have used MATLAB's function "ichol" to compute an incomplete Cholesky factorization of G^T G. (This was done before starting the iterative process, and the timing results include the factorization time.) The last column of Table 5 includes timing results for MATLAB's "eigs" function. This enables us to compare the SVD iteration with "eigs". We see that for ℓ = k the computation time of the semi-compact SVD algorithm is nearly twice that of "eigs". This is quite encouraging, since the SVD algorithm was programmed (in MATLAB) exactly as described in Sect. 2. Yet, as in other restarted Krylov methods, the basic iteration can be improved in several ways, for example, by more effective orthogonalization schemes, locking of converged eigenvectors, and an improved rule for the choice of ℓ.

Concluding Remarks
The monotonicity property ensures that each iteration provides an improved set of Ritz values. This property holds when using the Heart iteration to calculate a cluster of exterior eigenvalues. Yet, as Table 4 shows, it does not hold when calculating interior eigenvalues, which raises the question of how to overcome this difficulty.
The compact spectral method of Sect. 6 attains monotonicity by replacing G with G^{−1}. The replacement requires an effective algorithm for calculating accurate solutions of (3). But such an algorithm is not always available when solving large indefinite linear systems. A second way that ensures monotonicity is the SVD iteration of Sect. 2. This option has the advantage that it is able to use inexact solutions of (3) or (4). Both iterations are very simple and enjoy a fast rate of convergence. The experiments that we have done are quite encouraging.