Introduction

The Heart iteration is a new type of restarted Krylov method. Given a symmetric matrix \(G \in \mathbb {R}^{n\times n}\), the method is aimed at calculating a cluster of k exterior eigenvalues of G. Like other Krylov methods, it is best suited for large sparse matrices in which a matrix-vector product needs only O(n) flops. Another underlying assumption is that \(k^2\) is considerably smaller than n. The use of restarted Krylov methods for solving large eigenvalue problems has been considered by several authors, e.g., [1]–[22]. Most of these methods are based on implicitly restarted Lanczos tridiagonalization algorithms in which the starting vector is determined by “polynomial filtering” or “thick restart”. The Heart iteration does not use these tools. It is based on Gram-Schmidt orthogonalizations and a simple explicit choice of the starting vector. This results in a simple iteration that enjoys rapid convergence. Former implementations of the Heart iteration concentrate on the computation of exterior eigenvalues, e.g., [4]–[8]. In this case the method possesses an important monotonicity property: the computed Ritz values proceed monotonically toward the desired eigenvalues of G. However, when moving to compute interior eigenvalues the monotonicity property is not guaranteed, and the Heart iteration needs some amendments. These amendments are derived in this paper.

Note that although we are speaking of clusters of eigenvalues, the algorithm is carried out by computing the related invariant subspace of G, that is, a subspace that is spanned by the corresponding eigenvectors.

The shift-and-invert technique is a useful tool for calculating a cluster of eigenvalues. Let \(\uplambda _i, \, i = 1,\dots , n\), denote the eigenvalues of G and assume that we want to compute k eigenvalues of G that are close to some interior point \(\nu \). The term “interior” means that at least one eigenvalue is smaller than \(\nu \), and at least one eigenvalue is larger than \(\nu \). If \(\nu \ne \uplambda _j\) for \(j = 1, \dots , n\), then the matrix \(G - \nu I\) is invertible. The inverse of this matrix has the eigenvalues \(1/(\uplambda _j - \nu )\), \(j = 1, \dots , n\), and an eigenvector of G that corresponds to \(\uplambda _j\) is an eigenvector of \((G - \nu I)^{-1}\) that corresponds to \(1/(\uplambda _j - \nu )\), and vice versa. This relation turns the eigenvalues of \(G-\nu I\) that are smallest in modulus into the largest (in modulus) eigenvalues of \((G-\nu I)^{-1}\). Therefore, applying a restarted Krylov method to \((G-\nu I)^{-1}\) is likely to achieve fast convergence toward the desired invariant subspace.
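The following small sketch illustrates this mapping with hypothetical values that are not taken from the paper: the eigenvalues of G that are closest to \(\nu \) become the eigenvalues of \((G-\nu I)^{-1}\) that are largest in modulus.

    import numpy as np

    # Hypothetical spectrum and interior target nu (illustrative only).
    lam = np.array([-3.0, -0.4, 0.1, 0.5, 2.0, 7.0])
    nu = 0.0
    mapped = 1.0 / (lam - nu)              # eigenvalues of (G - nu*I)^{-1}
    order = np.argsort(-np.abs(mapped))    # sort by decreasing modulus
    print(lam[order])                      # 0.1, -0.4, 0.5, ... : eigenvalues closest to nu come first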

After the shift we need to compute the k smallest eigenvalues of a given symmetric indefinite matrix, and this is the problem that we actually solve. The term “indefinite” means that the matrix has at least one negative eigenvalue, and at least one positive eigenvalue. This feature is the reason for losing the monotonicity property.

The algorithms proposed in this paper are aimed at solving the following problem. Let \(G \in \mathbb {R}^{n\times n}\) be a given large sparse symmetric indefinite matrix whose eigenvalues are sorted to satisfy

$$\begin{aligned} | \uplambda _1| \le |\uplambda _2| \le \cdots \le |\uplambda _n|. \end{aligned}$$
(1)

Then it is desired to compute

$$\begin{aligned} \uplambda _j, \;\; \, j = 1, \dots , k, \end{aligned}$$
(2)

the k smallest eigenvalues of G. As before, if \(| \uplambda _1| > 0\) then applying a restarted Krylov method on \(G^{-1}\) is likely to achieve rapid convergence. However, since G is a large matrix, it is expensive to compute \(G^{-1}\), and matrix-vector products of the form \(G^{-1} \mathbf{b}\) are obtained by solving the linear system

$$\begin{aligned} G\mathbf{x}= \mathbf{b}. \end{aligned}$$
(3)

If G is factorizable then we can use a direct solution method. Otherwise the linear system is solved by some iterative method. An important advantage of the Heart iteration is that it allows the use of inexact inversions. That is, there is no need to compute an accurate solution for the linear system, e.g., [4]. Some of the linear solvers are restricted to positive definite matrices. In this case one may solve the linear system

$$\begin{aligned} G^T G\mathbf{x}= \mathbf{b}\end{aligned}$$
(4)

instead of (3). The algorithms described in this paper rely on the assumption that we have effective methods for solving these systems.
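To make the role of the inverting operator concrete, the following is a minimal sketch (Python/SciPy, with illustrative names and an illustrative iteration cap, not taken from the paper) of an “inexact inversion”: the product \(C\mathbf{b}\approx G^{-1}\mathbf{b}\) is obtained from a few steps of an iterative solver applied to the symmetric indefinite system (3).

    import numpy as np
    import scipy.sparse as sp
    import scipy.sparse.linalg as spla

    def apply_C(G, b, maxiter=50):
        # Inexact inverting operator: a few MINRES steps approximate G^{-1} b.
        # MINRES is one possible choice for a symmetric indefinite G; the cap
        # on the iteration count is illustrative.
        x, info = spla.minres(G, b, maxiter=maxiter)
        return x

    # Small random sparse symmetric (generally indefinite) example.
    n = 2000
    A = sp.random(n, n, density=5.0 / n, format='csr', random_state=0)
    G = A + A.T
    z = apply_C(G, np.ones(n))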

A straightforward way to extract Ritz vectors is by using the spectral decomposition of the related Rayleigh quotient matrix. Yet, as we shall see, this option lacks the monotonicity property. One way to retain monotonicity is by moving to compute the largest eigenvalues of \(G^{-1}\). This approach has an effective “compact” version that avoids direct computation of \(G^{-1}\) but requires accurate solutions of (3).

A second way to achieve monotonicity is by moving from eigenvalue computation to singular value computation. Here the algorithm computes the k smallest singular values of G and the related \(n\times k\) orthonormal matrix, V, whose columns are the corresponding right singular vectors. Once V is computed, the desired eigenpairs of G are retrieved from the Rayleigh quotient matrix \(V^T GV\). This method can be viewed as a modified version of the basic Heart iteration that is used in [5]. The main improvements are the introduction of a semi-compact expansion scheme and the use of classical Gram-Schmidt orthogonalization.

The plan of the paper is as follows. The SVD iteration is presented in the next section. It computes the k smallest singular values of G and the corresponding right singular vectors. Hence it is necessary to convert the singular vectors into appropriate eigenvectors. This issue is resolved in Sect. 3. Then, in Sect. 4, we establish the monotonicity property. A semi-compact spectral iteration is presented in Sect. 5. This iteration is more intuitive but lacks the monotonicity property. The compact iteration that is described in Sect. 6 gains monotonicity by moving to compute the largest eigenvalues of \(G^{-1}\). This iteration becomes attractive whenever we have an efficient algorithm for accurate solution of (3). The paper ends with numerical experiments that illustrate the behavior of the proposed methods.

A Semi-compact SVD Iteration

The algorithm described below is aimed at computing the k smallest singular values of G and the corresponding right singular vectors. The qth iteration, \(q = 0,1,2,\dots \), has two parts. It starts with a matrix \(V_q \in \mathbb {R}^{n \times k}\) that contains “old” information on the target space, a matrix \(Y_q \in \mathbb {R}^{n\times \ell }\) that contains “new” information, and a matrix \(X_q = [V_q, Y_q] \in \mathbb {R}^{n\times (k+\ell )}\) that includes all the known information. The matrix \(X_q\) has \(p = k + \ell \) orthonormal columns. That is

$$\begin{aligned} X^T_q X_q = I\in \mathbb {R}^{p\times p}. \end{aligned}$$
(5)

(A typical value for \(\ell \) is \(\ell = k\).)

Part 1: Information extraction (Subspace contraction). First compute the matrix

$$\begin{aligned} H_q = GX_q. \end{aligned}$$
(6)

Then compute the k smallest singular values of \(H_q\) and the corresponding k right singular vectors. The singular vectors are assembled into a matrix

$$\begin{aligned} U_q \in \mathbb {R}^{p \times k}, \; U^T_q U_q = I\in \mathbb {R}^{k\times k}, \end{aligned}$$
(7)

which is used to compute the related matrix of Ritz vectors,

$$\begin{aligned} V_{q+1} = X_q U_q. \end{aligned}$$
(8)

Since both \(X_q\) and \(U_q\) have orthonormal columns the matrix \(V_{q+1}\) inherits this property.
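A minimal sketch of this contraction step, written in Python/NumPy with hypothetical helper names (G_mul applies G to a block of vectors), might look as follows. It is an illustration of (6)–(8), not the authors' code.

    import numpy as np

    def contract(G_mul, X, k):
        # Part 1 sketch: X has p orthonormal columns; return the k smallest
        # Ritz singular values (ascending) and V_{q+1} = X U_q.
        H = G_mul(X)                                       # H_q = G X_q, an n-by-p matrix, see (6)
        _, s, Wt = np.linalg.svd(H, full_matrices=False)   # s is in descending order
        U = Wt[-k:][::-1].T                                # right singular vectors of the k smallest, see (7)
        return s[::-1][:k], X @ U                          # singular values and V_{q+1} = X_q U_q, see (8)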

Part 2: Information retrieval (Subspace expansion). First set the starting matrix

$$\begin{aligned} X = V_{q+1} \end{aligned}$$
(9)

and the starting vector

$$\begin{aligned} \mathbf{z}= C(V_{q+1} \mathbf{e}), \end{aligned}$$
(10)

where \(\mathbf{e}= (1, 1, \dots , 1)^T \in \mathbb {R}^k\) and C denotes the inverting operator (see below). Then for \(j = k+1, \dots , k+\ell \), the jth column of X is computed as follows:

  a) Set \(\mathbf{z}= \mathbf{z}- X(X^T\mathbf{z})\).

  b) Set \(\mathbf{z}= \mathbf{z}- X(X^T\mathbf{z})\).

  c) Normalize \(\mathbf{z}\).

  d) Expand X to be \([X, \mathbf{z}]\).

  e) Set \(\mathbf{z}= C\mathbf{z}\).
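A minimal sketch of this expansion loop is given below (Python/NumPy; C_mul is a hypothetical function that applies the inverting operator C). It illustrates steps a)–e) and is not the authors' code.

    import numpy as np

    def expand(C_mul, V, ell):
        # Part 2 sketch: V has k orthonormal columns; return X = [V, Y] with
        # k + ell orthonormal columns.
        n, k = V.shape
        X = V.copy()
        z = C_mul(V @ np.ones(k))              # starting vector z = C(V e), see (10)
        for _ in range(ell):
            z = z - X @ (X.T @ z)              # step a): classical Gram-Schmidt
            z = z - X @ (X.T @ z)              # step b): repeated for stability
            z = z / np.linalg.norm(z)          # step c)
            X = np.column_stack([X, z])        # step d)
            z = C_mul(z)                       # step e)
        return X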

Explanatory Comments.

Comment 1: The expansion process starts with the matrix \(X = V_{q+1} \in \mathbb {R}^{n\times k}\) and ends with the matrix \(X = X_{q+1} \in \mathbb {R}^{n\times p}\) that satisfies

$$\begin{aligned} X^T_{q+1} X_{q+1} = I \in \mathbb {R}^{p\times p}. \end{aligned}$$
(11)

Note that the first k columns of \(X_{q+1}\) are the columns of \(V_{q+1}\). This feature stands behind the monotonicity property. (The description of the expansion process omits indices to simplify the notation.)

Comment 2: Steps a) and b) achieve Gram-Schmidt orthogonalizations of \(\mathbf{z}\) against the former columns of X. (One orthogonalization is not sufficient, but twice is enough ...) Here we use classical G-S orthogonalization, while in earlier versions of the Heart iteration we have used modified G-S orthogonalization, e.g., [4]–[6]. The classical G-S is somewhat simpler and may save time.

Comment 3: The normalization of \(\mathbf{z}\) is carried out by replacing \(\mathbf{z}\) with \(\mathbf{z}/ \Vert \mathbf{z}\Vert _2\), where \(\Vert \cdot \Vert _2\) denotes the Euclidean vector norm.

Comment 4: The matrix C is a symbol for an inverting operator. That is, matrix-vector products of the form \(C\mathbf{z}\) are computed by solving a suitable linear system such as (3) or (4).

If the linear system is solved by some iterative method, then there is no need to compute a highly accurate solution. In other words, it is possible to use “inexact inversions”. If the linear solver requires a factorization of G (or \(G^T G\)), then the factorization is done only once, before starting the iterative process. Another alternative is to define C as a suitable polynomial filter, e.g., [13].

Comment 5: The expansion process builds a Krylov subspace whose starting vector is defined by (10). In earlier explicit methods there was no orthogonalization against the columns of \(V_{q+1}\), which causes certain difficulties, e.g., [14, 16].

Comment 6: The above iteration is a “semi-compact Heart iteration”. The name “Heart iteration” comes from the similarity to the heart’s systole-diastole cardiac cycle: Part 1 achieves subspace contraction, while in Part 2 the subspace is expanded. Former versions of the Heart iteration are proposed in [4]–[8]. It is a “semi-compact” scheme since the building of \(GX_q\) is separated from the expansion process. In compact versions the Rayleigh-quotient matrix is built during the expansion process, see Sect. 6.

Retrieving the Smallest Eigenpairs

At the end of the iterative process we obtain the k smallest singular values of G and an orthonormal matrix \(V\in \mathbb {R}^{n\times k}\) whose columns are the corresponding right singular vectors of G. Hence this information needs to be converted into the k smallest eigenvalues of G and the corresponding eigenvectors. Usually a right singular vector of G is also an eigenvector of G. Yet this rule can be violated when G has a pair of nonzero eigenvalues, say \(\uplambda _j\) and \(\uplambda _{j+1}\), such that \(\uplambda _{j+1} = - \uplambda _j\). In this case it is possible to have a right singular vector that corresponds to \(| \uplambda _j| = | \uplambda _{j+1}|\) which is not an eigenvector of G. However, as explained below, it is possible to ensure that V will have the “retrieving ability”, which means that the columns of V provide an orthonormal basis for the desired invariant subspace of G (the invariant subspace that corresponds to the k smallest eigenvalues of G). In this case the related eigenpairs are obtained as follows. First compute the Rayleigh quotient matrix

$$\begin{aligned} R = V^T GV, \end{aligned}$$
(12)

then compute a spectral decomposition of R to obtain the related Ritz values and Ritz vectors. More precisely, let \(Q \in \mathbb {R}^{k\times k}\) be an orthonormal matrix such that

$$\begin{aligned} Q^T V^T GVQ = \mathrm{diag}\{ \rho _1, \dots , \rho _k\}, \end{aligned}$$
(13)

then \(\rho _1, \dots , \rho _k\) are the smallest eigenvalues of G, and the columns of the matrix VQ are the corresponding eigenvectors of G.
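Assuming that V has the retrieving ability, the retrieval step (12)–(13) amounts to a small dense eigenvalue problem. The following is a minimal sketch (Python/NumPy; G_mul is a hypothetical helper that applies G to a block of vectors), not the authors' code.

    import numpy as np

    def retrieve_eigenpairs(G_mul, V):
        # Rayleigh quotient R = V^T G V, see (12); its spectral decomposition (13)
        # yields the Ritz values rho and the eigenvector estimates V Q.
        R = V.T @ G_mul(V)
        rho, Q = np.linalg.eigh(R)             # rho in ascending order
        return rho, V @ Q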

The question of whether V has the retrieving ability is related to the following property. Let \(\uplambda _j\) be a given eigenvalue of G. Then \(\uplambda _j\) has the “sign property” if any other eigenvalue of G, say \(\uplambda _i\), that satisfies \(|\uplambda _i |=|\uplambda _j|\) also satisfies \(\uplambda _i = \uplambda _j\). In other words, an eigenvalue \(\uplambda _j\) has the “sign property" if all the other eigenvalues that have the same modulus as \(\uplambda _j\) also have the same sign as \(\uplambda _j\). The importance of this feature comes from the following observations.
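For example (an illustration, not taken from the paper), the \(2\times 2\) matrix \(G = \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}\) has the eigenvalues \(\uplambda _1 = 1\) and \(\uplambda _2 = -1\), which share the same modulus but not the same sign, so neither eigenvalue has the sign property. Since \(G^2 = I\), every unit vector is a right singular vector of G that corresponds to the singular value 1; in particular \((1, 0)^T\) is a right singular vector of G but not an eigenvector of G.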

Theorem 1

Let \(\uplambda _j\) be an eigenvalue of G that has the sign property. Let \(\mathbb {E}_j\) denote the eigenspace of G that corresponds to \(\uplambda _j\). Let \(\mathbb {W}_j\) denote the eigenspace of \(G^2\) that corresponds to \(\uplambda ^2_j\). In other words, \(\mathbb {W}_j\) is spanned by the right singular vectors of G that correspond to \(|\uplambda _j|\). Then

$$\begin{aligned} \mathbb {W}_j = \mathbb {E}_j. \end{aligned}$$
(14)

That is, a right singular vector of G that corresponds to \(|\uplambda _j|\) is also an eigenvector of G that corresponds to \(\uplambda _j\).

Proof

Let \(\mathbf{v}_j \in \mathbb {R}^n\) be an eigenvector of G that corresponds to \(\uplambda _j\). Then the equality \(G^2\mathbf{v}_j = \uplambda ^2_j\mathbf{v}_j\) means that \(\mathbf{v}_j\) is also a right singular vector of G that corresponds to \(|\uplambda _j|\). This shows that \(\mathbb {E}_j\) is contained in \(\mathbb {W}_j\). Therefore, since both subspaces have the same dimension, \(\mathbb {W}_j = \mathbb {E}_j\). \(\square \)

Theorem 2

Let \(\uplambda _j\) be an eigenvalue of G that does not have the sign property. In this case there exist singular vectors of G that correspond to \(|\uplambda _j|\) which are not eigenvectors of G, but we have the following relations. Let \(\mathbb {W}_j\) be as above. Let \(\mathbb {E}^+_j\) and \(\mathbb {E}^-_j\) denote the eigenspaces of G that correspond to \(|\uplambda _j|\) and \(-|\uplambda _j|\), respectively. Then

$$\begin{aligned} \mathbb {W}_j = \mathbb {E}^+_j \oplus \mathbb {E}^-_j. \end{aligned}$$
(15)

That is, \(\mathbb {W}_j\) is a direct sum of the eigenspaces \(\mathbb {E}^+_j\) and \(\mathbb {E}^-_j\).

Proof

Here both \(\mathbb {E}^+_j\) and \(\mathbb {E}^-_j\) are contained in \(\mathbb {W}_j\). Consequently the direct sum \(\mathbb {E}^+_j \oplus \mathbb {E}^-_j\) is also contained in \(\mathbb {W}_j\). On the other hand, since \(\mathbb {E}^+_j\) and \(\mathbb {E}^-_j\) are mutually orthogonal,

$$\begin{aligned} \mathrm{dim}(\mathbb {E}^+_j \oplus \mathbb {E}^-_j) = \mathrm{dim}(\mathbb {E}^+_j) + \mathrm{dim}(\mathbb {E}^{-}_j). \end{aligned}$$
(16)

This shows that the direct sum has the same dimension as \(\mathbb {W}_j\), which proves (15). \(\square \)

As before it is possible to assume that the eigenvalues of G satisfy (1). Then, since G is symmetric, the singular values of G are \(|\uplambda _j|, \; j = 1,\dots , n\), and the k smallest singular values that we compute are

$$\begin{aligned} |\uplambda _1| \le | \uplambda _2| \le \dots \le | \uplambda _k|. \end{aligned}$$
(17)

This means that \(|\uplambda _k|\) is the largest singular value that we compute, while the next singular value in size, which is not computed, is \(| \uplambda _{k+1}|\). It is also possible to assume that the columns of V are ordered in a similar manner,

$$\begin{aligned} V = [\mathbf{v}_1, \dots , \mathbf{v}_k] \end{aligned}$$
(18)

where \(\mathbf{v}_j, \; j = 1, \dots , k\), is a right singular vector of G that corresponds to \(|\uplambda _j|\). With these notations and assumptions Theorems 1-2 lead to the following conclusions.

Theorem 3

The Rayleigh quotient matrix (12) is a block diagonal matrix. Eigenvalues that have the sign property result in \(1\times 1\) blocks. The larger blocks correspond to eigenvalues without this property, where each block refers to a different singular value.

Proof

The jth column of \(V^TGV\) has the form \(V^TG\mathbf{v}_j, \; \, j = 1,\dots , k\), and the entries of this column are \(\mathbf{v}^T_i G\mathbf{v}_j, \; \, i = 1, \dots , k\). If \(\uplambda _j\) has the sign property then \(G\mathbf{v}_j = \uplambda _j\mathbf{v}_j\) and \(V^TG\mathbf{v}_j = \uplambda _j\mathbf{e}_j\), where \(\mathbf{e}_j\) denotes the jth column of the \(k\times k\) identity matrix. Otherwise, when \(\uplambda _j\) does not have the sign property, there exist indices \(i_1 < i_2\) for which the set \(\uplambda _i, \; i = i_1, \dots , i_2\), includes all the eigenvalues \(\uplambda _i\) that satisfy \(|\uplambda _i| = | \uplambda _j|\). In this case (15) tells us that \(\mathbf{v}_j\) is a linear combination of eigenvectors whose eigenvalues come from this set. Consequently the vector \(G\mathbf{v}_j\) is also a linear combination of these eigenvectors, and the vector \(V^TG\mathbf{v}_j\) is a linear combination of the vectors \(\mathbf{e}_i, \; i = i_1, \dots , i_2\). \(\square \)

Corollary 4

For \(j = 1, \dots , k\), if

$$\begin{aligned} | \uplambda _j| < | \uplambda _{k+1}| \end{aligned}$$
(19)

then \(\mathbb {E}_j\) is contained in \(\mathrm{Range}(V)\). \(\square \)

The last observation indicates that the key for determining whether V has the retrieving ability lies in the properties of \(\uplambda _k\).

Corollary 5

If \(|\uplambda _k|< |\uplambda _{k+1}|\) then V has the retrieving ability. \(\square \)

Corollary 6

Assume that \(|\uplambda _k| = |\uplambda _{k+1}|\). If \(\uplambda _k\) has the sign property then V has the retrieving ability. Otherwise, when \(\uplambda _k\) does not have the sign property, the retrieving ability is not ensured. \(\square \)

If V has the retrieving ability then the absolute values of the computed Ritz values should equal the corresponding singular values (which have been computed together with V). This gives a simple practical test to verify that the retrieving ability indeed holds. Otherwise, when this test fails, the situation described in Corollary 6 holds. That is, \(|\uplambda _k| = |\uplambda _{k+1}|\) and the set of all the eigenvalues of G whose modulus equals \(|\uplambda _k|\) includes both a negative eigenvalue and a positive eigenvalue. Consequently a tiny shift of G, say by \(\tau \), is likely to split this set into two separate sets: in one set all the eigenvalues have the same positive value, in the other set all the eigenvalues have the same negative value, while the corresponding absolute values differ by \(2|\tau |\). Therefore, if the retrieving test fails then Step 1 is repeated with a slightly different shift, which ensures the retrieving ability.
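In code, this practical test is a one-line comparison. The sketch below (Python/NumPy; the tolerance is illustrative) assumes that sigma holds the k computed singular values and rho the Ritz values obtained from (13); when it returns False, the computation is repeated with a slightly different shift, as described above.

    import numpy as np

    def retrieving_test(sigma, rho, tol=1e-10):
        # True when the moduli of the Ritz values match the computed singular
        # values, i.e., when V has the retrieving ability.
        return np.allclose(np.sort(np.abs(rho)), np.sort(sigma), atol=tol)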

The Monotonicity Property

The monotonicity property of the computed singular values is based on the following extensions of the Cauchy interlace theorem and the Poincaré separation theorem. (The original theorems of Cauchy and Poincaré refer to eigenvalues of symmetric matrices. See, for example, [11] and [15].)

Theorem 7

Let G be an \(n\times n\) matrix, with singular values

$$\begin{aligned} 0 \le \sigma _1 \le \sigma _2 \le \dots \le \sigma _n. \end{aligned}$$

Let the \(n\times k\) matrix F be obtained from G by deleting \(n-k\) columns of G, and let

$$\begin{aligned} 0 \le \eta _1 \le \eta _2 \le \dots \le \eta _k \end{aligned}$$

denote the singular values of F. Then

$$\begin{aligned} \sigma _j \le \eta _j \; \hbox { for }\; j = 1, \dots , k, \end{aligned}$$
(20)

and

$$\begin{aligned} \eta _{k+1-j} \le \sigma _{n+1-j} \; \hbox { for } \; j = 1,\dots , k. \end{aligned}$$
(21)

In particular, for \(k = n-1\) we obtain the interlacing relations

$$\begin{aligned} \sigma _1\le \eta _1 \le \sigma _2\le \eta _2 \le \sigma _3 \le \dots \le \sigma _{n-1} \le \eta _{n-1} \le \sigma _n. \end{aligned}$$
(22)

Proof

The proof is concluded by applying the Cauchy interlace theorem to the matrices \(G^TG\) and \(F^TF\). \(\square \)

Corollary 8

Let G and its singular values \(\sigma _j\) be as in Theorem 7, let the \(n\times p\) matrix X have orthonormal columns, and let

$$\begin{aligned} 0 \le \xi _1 \le \xi _2 \le \dots \le \xi _p \end{aligned}$$

denote the singular values of the \(n\times p\) matrix GX. Let the \(n\times k\) matrix V be obtained from X by deleting \(\ell = p-k\) columns of X, and let

$$\begin{aligned} \eta _1 \le \eta _2 \le \dots \le \eta _k \end{aligned}$$

denote the singular values of the \(n\times k\) matrix GV. Then

$$\begin{aligned} \sigma _j\le \xi _j \le \eta _j \; \; \hbox {for } \; j = 1, \dots , k, \end{aligned}$$
(23)

and

$$\begin{aligned} \eta _{k+1-j} \le \xi _{p+1-j} \le \sigma _{n+1-j} \; \; \hbox {for} \; j = 1, \dots , k. \end{aligned}$$
(24)

Proof

Let \(Q \in \mathbb {R}^{n\times n}\) be an orthonormal matrix whose first p columns are those of X. Then the matrix GQ has the same singular values as G. Since GX is obtained from GQ by deleting \(n-p\) columns, and GV is obtained from GX by deleting \(\ell \) columns, applying Theorem 7 twice yields the inequalities (23) and (24). \(\square \)

Observe that the proofs do not use the symmetry of G. In fact, similar arguments show that the above bounds hold for any \(m\times n\) matrix with \(m \ge n\). Moreover, the upper bounds (21) and (24) remain valid when \(m < n\). Yet the lower bounds (20) and (23) are not guaranteed when \(m < n\).
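The bounds of Theorem 7 are easy to verify numerically. The following sketch (Python/NumPy; the random example is illustrative and not taken from the paper) checks (20) and (21) for a square matrix.

    import numpy as np

    # Numerical check of the bounds (20)-(21) on a random example.
    rng = np.random.default_rng(0)
    n, k = 8, 5
    G = rng.standard_normal((n, n))
    F = G[:, :k]                                           # delete n-k columns
    sigma = np.sort(np.linalg.svd(G, compute_uv=False))    # ascending
    eta = np.sort(np.linalg.svd(F, compute_uv=False))      # ascending
    assert np.all(sigma[:k] <= eta + 1e-12)                # (20): sigma_j <= eta_j
    assert np.all(eta <= sigma[n - k:] + 1e-12)            # (21): eta_{k+1-j} <= sigma_{n+1-j}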

Theorem 9

(Monotonicity) Consider the qth iteration of the new method, \(q = 1,2,\dots \), and let

$$\begin{aligned} 0 \le \sigma ^{(q)}_1 \le \sigma ^{(q)}_2 \le \dots \le \sigma ^{(q)}_p \end{aligned}$$
(25)

denote the singular values of the matrix \(H_q = GX_q\). Then the k smallest singular values of these matrices satisfy

$$\begin{aligned} \sigma _j\le \sigma ^{(q)}_j \le \sigma ^{(q-1)}_j \end{aligned}$$
(26)

for \(j = 1, \dots , k\) and \(q = 1,2,3,\dots \).

Proof

The k Ritz singular values that are computed in Part 1 are the smallest ones,

$$\begin{aligned} \sigma ^{(q)}_1 \le \sigma ^{(q)}_2 \le \dots \le \sigma ^{(q)}_k. \end{aligned}$$

Similarly, the former iteration has computed

$$\begin{aligned} \sigma ^{(q-1)}_1 \le \sigma ^{(q-1)}_2 \le \dots \le \sigma ^{(q-1)}_k, \end{aligned}$$

which are the singular values of the matrix \(GV_q\). Therefore, since the first k columns of \(X_q\) are the columns of \(V_q\), these matrices satisfy the conditions of Corollary 8, and (26) is a direct consequence of (23). \(\square \)

The proofs of Corollary 8 and Theorem 9 emphasize the importance of the orthonormality relations, and provide the motivation behind the proposed Heart iteration. The monotonicity property (26) ensures that \(\sigma ^{(q)}_j\) moves closer to \(\sigma _j\) at each iteration. That is, each iteration improves the computed estimates of the smallest singular values. (In eigenvalue terminology the computed estimates are called Ritz values.)

A Semi-compact Spectral Iteration

The traditional way to achieve extraction is by using the spectral decomposition of the related Rayleigh-quotient matrix. Former restarted Krylov methods compute this matrix via the Lanczos tridiagonalization algorithm, e.g., [3, 13, 17, 21, 22]. Yet the Heart iteration avoids the Lanczos process and builds the Rayleigh-quotient matrix directly. This brings Part 1 into the following form, while Part 2 remains unchanged.

Part 1: Spectral extraction. First compute the Rayleigh-quotient matrix

$$\begin{aligned} S_q = X^T_q GX_q. \end{aligned}$$
(27)

Then compute the k smallest eigenvalues of \(S_q\). The corresponding k eigenvectors of \(S_q\) are assembled into a matrix

$$\begin{aligned} U_q \in \mathbb {R}^{p\times k}, \; \; U^T_qU_q = I \in \mathbb {R}^{k\times k}, \end{aligned}$$
(28)

which is used to compute the related matrix of Ritz vectors

$$\begin{aligned} V_{q+1} = X_qU_q. \end{aligned}$$
(29)
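A minimal sketch of this extraction step is given below (Python/NumPy; G_mul is a hypothetical function that applies G to a block of vectors, and “smallest” is read as smallest in modulus, in accordance with (1)).

    import numpy as np

    def spectral_extraction(G_mul, X, k):
        # Part 1 sketch for the semi-compact spectral iteration.
        S = X.T @ G_mul(X)                     # Rayleigh quotient S_q, see (27)
        theta, W = np.linalg.eigh(S)           # eigenpairs of the small p-by-p matrix
        idx = np.argsort(np.abs(theta))[:k]    # the k eigenvalues of smallest modulus
        return theta[idx], X @ W[:, idx]       # Ritz values and V_{q+1} = X_q U_q, see (28)-(29)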

Indeed, this approach works very well when calculating small eigenvalues of positive definite matrices, e.g., [4]. However, when G is indefinite the monotonicity property is not guaranteed, and the loss of monotonicity may prevent the algorithm from achieving fast convergence. See Table 4.

A second weakness of this iteration is that C is restricted to (3), since solving (4) may bring inappropriate information that retards convergence. The reasons lie in the difficulties mentioned in Sect. 3.

A Compact Spectral Iteration

This iteration is aimed at computing the k largest eigenvalues of \(G^{-1}\) and the related eigenvectors, which are the actual eigenvectors that we seek. Therefore, the extraction stage is carried out by computing the largest k eigenpairs of the Rayleigh quotient matrix

$$\begin{aligned} S_q = X^T_q G^{-1} X_q. \end{aligned}$$
(30)

As before, \(G^{-1}\) is not available and matrix-vector products of the form \(G^{-1}\mathbf{b}\) are computed by solving (3). But here \(S_q\) is computed “on the fly”, during the expansion process. This can be done by noting that the columns of \(S_q\) have the form

$$\begin{aligned} X^T_q (G^{-1}\mathbf{x}_j),\; \; \, j = 1, \dots , p, \end{aligned}$$
(31)

where \(\mathbf{x}_j\) denotes the jth column of \(X_q\). Hence the vector \(\mathbf{z}= G^{-1}\mathbf{x}_j\) can be used both to construct \(S_q\) and to expand the Krylov subspace. The resulting “compact” iteration has the following form. (We omit the iteration indices for simplicity.)

Part I: Contraction

Compute the k largest eigenvalues of S and the corresponding eigenvectors. The eigenvalues form a diagonal matrix \(D \in \mathbb {R}^{k\times k}\). The eigenvectors are assembled into the matrix

$$\begin{aligned} U \in \mathbb {R}^{p\times k}, \; \; U^T U = I, \end{aligned}$$
(32)

which is used to compute the related matrix of Ritz vectors

$$\begin{aligned} V = XU \in \mathbb {R}^{n\times k}. \end{aligned}$$
(33)

Since both X and U have orthonormal columns the matrix V inherits this property.

Part II: Expansion

First set

$$\begin{aligned} X&= V,\\ S&= D, \\ \mathbf{z}&=G^{-1} (V\mathbf{e}), \end{aligned}$$

and

$$\begin{aligned} \mathbf{r}= X^T\mathbf{z}. \end{aligned}$$

Then for \(j = k+1, \dots , k+\ell \), do as follows.

  (a) Set \(\mathbf{z}= \mathbf{z}- X\mathbf{r}\).

  (b) Set \(\mathbf{z}= \mathbf{z}- X(X^T\mathbf{z})\).

  (c) Normalize \(\mathbf{z}\).

  (d) Expand X to be \([X, \mathbf{z}]\).

  (e) Set \(\mathbf{z}= G^{-1}\mathbf{z}\).

  (f) Expand S by computing the vector

      $$\begin{aligned} \mathbf{r}= (r_1,\dots , r_j)^T = X^T\mathbf{z}\end{aligned}$$

      and setting the new entries of S to be

      $$\begin{aligned} s_{ij} = s_{ji} = r_i \; \; \hbox {for } \; i = 1, \dots , j. \end{aligned}$$
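A minimal sketch of Part II in code is given below (Python/NumPy). It assumes an accurate solver, here a hypothetical function solve_G(b) that returns \(G^{-1}\mathbf{b}\) by solving (3), and it passes D as the vector of retained Ritz values; the names are illustrative.

    import numpy as np

    def compact_expand(solve_G, V, D, ell):
        # Part II sketch: V has k orthonormal columns, D holds the k retained
        # Ritz values of G^{-1}; build X = [V, Y] and S = X^T G^{-1} X on the fly.
        n, k = V.shape
        X = V.copy()
        S = np.diag(D)
        z = solve_G(V @ np.ones(k))
        r = X.T @ z
        for j in range(k, k + ell):
            z = z - X @ r                      # step (a)
            z = z - X @ (X.T @ z)              # step (b)
            z = z / np.linalg.norm(z)          # step (c)
            X = np.column_stack([X, z])        # step (d)
            z = solve_G(z)                     # step (e): z = G^{-1} times the new column
            r = X.T @ z                        # step (f): new column of S
            S = np.pad(S, ((0, 1), (0, 1)))    # grow S by one row and one column
            S[:j + 1, j] = r
            S[j, :j + 1] = r
        return X, S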

The above iteration is a variant of the compact Heart algorithm [7]. Here the compact scheme is implicitly applied on \(G^{-1}\) and computes the k largest eigenpairs of this matrix. Assume for a moment that \(\eta \) is an eigenvalue of \(G^{-1}\) that has the largest modulus, and let \(\eta ^{(q)}\) denote the related Ritz value at the qth iteration, \(q = 0,1,2, \dots \). Then \(1/\eta ^{(q)}\) is the corresponding estimate for the smallest eigenvalue of G, while the monotonicity property for large eigenvalues [7] implies the relations

$$\begin{aligned} \eta ^{(q)} \le \eta ^{(q+1)} \le \eta \; \hbox { when } \, \eta > 0, \end{aligned}$$
(34)

and

$$\begin{aligned} \eta \le \eta ^{(q+1)} \le \eta ^{(q)} \; \hbox { when } \, \eta < 0. \end{aligned}$$
(35)

Similar relations hold for the other Ritz values, forcing them to proceed monotonically toward the desired eigenvalues of \(G^{-1}\).

The compact spectral iteration becomes attractive when we have an effective method for solving (3) accurately. The accuracy is needed in order to ensure that the computed Rayleigh-quotient matrix indeed equals (30). Thus, clearly, in this approach it is not possible to use inexact solutions.

Table 1 The test matrices
Table 2 Computing the k smallest eigenvalues with a semi-compact SVD iteration, \(l = {\hbox {max}}\{ 40, k\}\)
Table 3 Computing the k smallest eigenvalues with a compact Spectral iteration, \(l = \hbox {max}\{ 40, k\}\)
Table 4 Lack of monotonicity in the semi-compact Spectral iteration: Computing the \(k=10\) smallest eigenvalues of the Harmonic Roots matrix, \(l = 50\)
Table 5 Computation times and number of iterations for a Uniform Dist matrix, \( n=500,000\)

The Initial Orthonormal Matrix

To start the algorithm we need an “initial” orthonormal matrix \(X_0 \in \mathbb {R}^{n\times p}\) where \(p = k+\ell \). In our experiments \(X_0\) is obtained by starting the expansion process with

$$\begin{aligned} \mathbf{x}_1 = C\mathbf{e}/ \Vert C\mathbf{e}\Vert _2, \end{aligned}$$
(36)
$$\begin{aligned} X = [ \mathbf{x}_1] \in \mathbb {R}^{n\times 1}, \end{aligned}$$
(37)

and

$$\begin{aligned} \mathbf{z}= C \mathbf{x}_1. \end{aligned}$$
(38)

As before, \(\mathbf{e}= (1, 1, \dots , 1)^T \in \mathbb {R}^n\) and \(C\in \mathbb {R}^{n\times n}\) symbolizes an inverting operator.

Numerical Experiments

In this section we describe some experiments with the proposed methods. The semi-compact SVD iteration is the algorithm described in Sect. 2 with C defined by solving (4). The semi-compact Spectral iteration is the algorithm of Sect. 5, while the compact Spectral iteration is the algorithm described in Sect. 6.

The test matrices are large sparse symmetric indefinite matrices that have the form

$$\begin{aligned} G = HDH^T \in \mathbb {R}^{n\times n} \end{aligned}$$
(39)

where \(D \in \mathbb {R}^{n\times n}\) is a diagonal matrix,

$$\begin{aligned} D = \mathrm{diag}\{ \uplambda _1,\dots , \uplambda _n\}, \end{aligned}$$
(40)

and \(H \in \mathbb {R}^{n\times n}\) is a sparse orthonormal matrix,

$$\begin{aligned} H = \prod ^4_{i=1} H_i. \end{aligned}$$
(41)

The matrices \(H_i, \; i = 1,\dots , 4\), are \(n\times n\) sparse random Householder matrices,

$$\begin{aligned} H_i = I - 2\mathbf{h}_i \mathbf{h}^T_i/\mathbf{h}^T_i \mathbf{h}_i, \end{aligned}$$
(42)

where \(\mathbf{h}_i \in \mathbb {R}^n\) is a sparse random vector. To generate these vectors we have used MATLAB’s command \(\mathbf{h}= \mathrm{sprand}(n, 1, \mathrm{density})\) with density \(= 200/n\). Since H is a product of Householder matrices, it is an orthonormal matrix, and the diagonal entries of D are the eigenvalues of G. The details of the test matrices are specified in Table 1.
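The following sketch shows one way to generate a test matrix of the form (39)–(42). It is written in Python/SciPy rather than MATLAB, the function name and defaults are illustrative, and the order in which the Householder factors are applied does not matter for the purpose of the construction.

    import numpy as np
    import scipy.sparse as sp

    def test_matrix(lam, nnz=200, seed=0):
        # Sketch of (39)-(42): G = H D H^T, where H is a product of four sparse
        # random Householder matrices built from sparse random vectors h_i.
        n = lam.size
        rng = np.random.default_rng(seed)
        G = sp.diags(lam).tocsr()                          # D
        for _ in range(4):
            h = np.zeros(n)
            idx = rng.choice(n, size=min(nnz, n), replace=False)
            h[idx] = rng.random(idx.size)                  # sparse random vector h_i
            hs = sp.csr_matrix(h.reshape(-1, 1))
            H_i = sp.identity(n, format='csr') - (2.0 / (h @ h)) * (hs @ hs.T)
            G = H_i @ G @ H_i.T                            # wrap with H_i on both sides
        return G

For instance, test_matrix(np.linspace(-1.0, 1.0, 10000)) returns a sparse symmetric indefinite matrix whose eigenvalues are the prescribed diagonal entries.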

The figures in Tables 2–3 provide the number of iterations that were needed to satisfy the stopping condition

$$\begin{aligned} \Psi _q = (\sum \limits ^k_{j = 1} | \uplambda _j - \uplambda ^{(q)}_j |)/(\sum \limits ^k_{j=1} |\uplambda _j|) \le 10^{-13}. \end{aligned}$$
(43)

As before, the eigenvalues of G are assumed to satisfy (1), and \(\uplambda ^{(q)}_j, \; j = 1, \dots , k\), are the Ritz values at the qth iteration, \(q = 0,1,2,\dots \). (In the SVD algorithm the Ritz values were computed from the spectral decomposition of the related Rayleigh quotient matrix (12).) If the number of iterations equals 0, the stopping condition is already satisfied at the starting point, which is computed as described in Sect. 7.
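Since the target eigenvalues are known from the construction (40), the error measure (43) is straightforward to evaluate; a direct transcription (Python/NumPy, hypothetical helper name) reads:

    import numpy as np

    def psi(lam_target, lam_ritz):
        # Relative error (43) between the k target eigenvalues and the Ritz values.
        return np.sum(np.abs(lam_target - lam_ritz)) / np.sum(np.abs(lam_target))

    # The iteration stops once psi(...) drops to 1e-13 or below, as in (43).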

The experiments in Tables 2 and 3 were carried out with \(l = \hbox {max}\{ 40, k\}\). Comparing these tables shows that the tested methods behave in a similar way regarding the number of iterations. The first three test problems are difficult to solve and require several iterations. A similar difficulty is shared by MATLAB’s “eigs” function. For example, using “eigs” to compute the \(k = 10\) smallest eigenvalues of these matrices ends with the warning “0 of the 10 requested eigenvalues converged”. Yet for the other test matrices we observe fast convergence.

The semi-compact Spectral method of Sect. 5 lacks the monotonicity property. This deficiency is illustrated in Table 4, which tracks the behavior of the eigenvalue error \(\Psi _q\) defined in (43). The results of Table 4 were obtained when computing the \(k = 10\) smallest eigenvalues of the Harmonic Roots matrix. A similar lack of monotonicity was noticed in the Normal Dist matrix and the Uniform Dist matrix. These examples illustrate the importance of having the monotonicity property.

Given k, the number of eigenvalues that we want to compute, it is desired to use an “optimal” value of l, which minimizes the overall computation time that is needed to achieve our stopping condition. Clearly, increasing l is likely to reduce the number of iterations. Yet at the same time it increases the computational effort per iteration. Hence, finding an optimal value for l is not that straightforward. The experiments in Table 5 illustrate the way k and l affect the number of iterations and the overall computation time. The results are obtained by applying the semi-compact SVD algorithm on a huge “Uniform Dist” matrix with \(n = 500,000\). To make this algorithm more effective we have used MATLAB’s function “ichol” to compute an incomplete Cholesky factorization of \(G^TG\). (This was done before starting the iterative process, and the timing results include the factorization time.)

The last column of Table 5 includes timing results for MATLAB’s “eigs” function. This enables us to compare SVD with “eigs”. We see that for \(l = k\) the computation time of the semi-compact SVD algorithm is nearly twice that of “eigs”. This is quite encouraging since the SVD algorithm was programmed (in MATLAB) exactly as described in Sect. 2. Yet, as in other restarted Krylov methods, the basic iteration can be improved in several ways, for example by more effective orthogonalization schemes, locking of converged eigenvectors, and an improved rule for the choice of l.

Concluding Remarks

The monotonicity property ensures that each iteration provides an improved set of Ritz values. This property holds when using the Heart iteration to calculate a cluster of exterior eigenvalues. Yet, as Table 4 shows, it does not hold when calculating interior eigenvalues, which raises the question of how to overcome this difficulty.

The compact spectral method of Sect. 6 attains monotonicity by replacing G with \(G^{-1}\). The replacement requires an effective algorithm for calculating accurate solutions of (3). But such an algorithm is not always available when solving large indefinite linear systems. A second way that ensures monotonicity is the SVD iteration of Sect. 2. This option has the advantage that it is able to use inexact solutions of (3) or (4). Both iterations are very simple and enjoy a fast rate of convergence. The experiments that we have done are quite encouraging.