1 Introduction

In [1], Bai et al. introduced the HSS iteration method, whose iteration matrix is a product of four terms; see also [3]. Permuting those four terms yields a matrix that is similar to the original iteration matrix and therefore has the same spectral radius. In this paper, that permuted iteration matrix is developed into a viable iteration scheme for solving a system of linear equations. The new scheme, which is related to an iteration scheme of Bruce Kellogg [4], appears in Section 3. Further properties are given in Section 4, while Section 5 contains three examples illustrating that, in terms of iteration counts, the new Kellogg-type HSS iteration method is neither better nor worse than the HSS iteration scheme. This is to be expected, since the new scheme has the same spectral radius as the original HSS scheme.

2 HSS iteration method

We consider iteration schemes to solve the system of linear equations \(Ax=b\) where \(A\in \mathbb {R}^{n\times n}\) and \(x,\ b\in \mathbb {R}^n \). Let \(A= H + S\) where \(H=\frac{A+A^*}{2}\) and \(S=\frac{A-A^*}{2}\) are the Hermitian and skew-Hermitian (HS) parts of the matrix A. The matrix A is said to be positive definite when its Hermitian part H is a positive definite matrix. Throughout this paper, A is assumed to be positive definite.

The HSS iteration method for solving \(Ax=b\) is the following two-step scheme that appears in [1]. See also [2,3,4].

HSS iteration scheme Given an initial guess \(x^{0}\), for \(k=0,1,2,\dots \), until \(\{x^{k}\}\) converges, compute

$$\begin{aligned} \begin{aligned} (\alpha I + H) \ x^{k+ \frac{1}{2}} = (\alpha I -S) x^{k} \ + \ b \\ (\alpha I + S) \ x^{k+1} = (\alpha I-H) x^{k+ \frac{1}{2}} + \ b , \end{aligned} \end{aligned}$$
(1)

where \(\alpha \) is a given positive constant. Since H is positive definite and the eigenvalues of S are purely imaginary, \(\alpha I + H \) and \(\alpha I + S\) are both invertible.
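For concreteness, here is a minimal NumPy sketch of one way to implement the two-step scheme (1); the function name hss_iteration, the residual-based stopping test, the default initial guess of all ones, and the tolerance and iteration cap are illustrative choices rather than part of the method.

```python
import numpy as np

def hss_iteration(A, b, alpha, x0=None, tol=1e-5, max_iter=500):
    """A sketch of the two-step HSS scheme (1) for Ax = b (dense, real A)."""
    n = A.shape[0]
    H = (A + A.T) / 2                     # Hermitian (symmetric) part
    S = (A - A.T) / 2                     # skew-Hermitian (skew-symmetric) part
    I = np.eye(n)
    x = np.ones(n) if x0 is None else np.asarray(x0, dtype=float).copy()
    for k in range(1, max_iter + 1):
        # (alpha I + H) x^{k+1/2} = (alpha I - S) x^k + b
        x_half = np.linalg.solve(alpha * I + H, (alpha * I - S) @ x + b)
        # (alpha I + S) x^{k+1}   = (alpha I - H) x^{k+1/2} + b
        x = np.linalg.solve(alpha * I + S, (alpha * I - H) @ x_half + b)
        if np.linalg.norm(A @ x - b) < tol:
            return x, k
    return x, max_iter
```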

In [1], it was shown that the quasi-optimal choice for \(\alpha \) is \(\alpha =\sqrt{\lambda _{\min }(H)\,\lambda _{\max }(H)}\), where \(\lambda _{\min }(H)\) and \(\lambda _{\max }(H)\) denote the smallest and largest eigenvalues of the Hermitian part H.
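A short sketch of how this quasi-optimal \(\alpha \) may be computed from the extreme eigenvalues of the Hermitian part; the function name quasi_optimal_alpha is an illustrative choice.

```python
import numpy as np

def quasi_optimal_alpha(A):
    """alpha = sqrt(lambda_min(H) * lambda_max(H)) for the Hermitian part H of A."""
    H = (A + A.T) / 2
    eig = np.linalg.eigvalsh(H)           # eigenvalues of the symmetric part, ascending
    return float(np.sqrt(eig[0] * eig[-1]))
```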

Eliminating \(x^{k+\frac{1}{2}} \) from the above iteration scheme yields the following:

\(x^{k+1}=\ \Gamma (\alpha ) x^{k}+ (\alpha I + S)^{-1}(\alpha I - H)(\alpha I + H)^{-1} b + {(\alpha I +S)^{-1}} b\), where we define \(\Gamma (\alpha ):= (\alpha I + \ S)^{-1}(\alpha I - \ H)(\alpha I+H)^{-1}(\alpha I -S).\)

Now, we consider the four terms in \(\Gamma (\alpha ) \) and permute them in order to define the new iteration matrices below.

$$\begin{aligned} \begin{aligned} \Psi (\alpha ) := (\alpha I + \ S)^{-1}(\alpha I -S)(\alpha I - \ H)(\alpha I+H)^{-1}, \\ \text { and} \\ \Theta (\alpha ) := (\alpha I+H)^{-1}(\alpha I - \ H)(\alpha I + \ S)^{-1}(\alpha I -S). \end{aligned} \end{aligned}$$
(2)

In the next lemma, we give the relationship between the spectral radii of the three matrices \(\Gamma (\alpha )\), \(\Psi (\alpha )\) and \(\Theta (\alpha )\). Let \(\rho (.)\) denote the spectral radius of a matrix.

Lemma 1

For \(\alpha >0\), \(\rho (\Gamma (\alpha )) =\rho (\Psi (\alpha ))= \rho (\Theta (\alpha )) < 1\).

Proof

\(\Gamma (\alpha )\) is similar to \(\Theta (\alpha )\) since

\((\alpha I +S)\Gamma (\alpha ) (\alpha I + S)^{-1} = (\alpha I - H)(\alpha I + H)^{-1}(\alpha I -S)(\alpha I + S)^{-1}=\Theta (\alpha )\), where the last equality holds because \(\alpha I - H\) commutes with \((\alpha I + H)^{-1}\) and \(\alpha I - S\) commutes with \((\alpha I + S)^{-1}\). Moreover, \(\Theta (\alpha )\) and \(\Psi (\alpha ) \) have the same spectral radius by the well-known fact that \(\rho (A_1A_2)=\rho (A_2A_1)\). Hence, the non-zero eigenvalues of \(\Gamma (\alpha )\), \(\Psi (\alpha )\), and \(\Theta (\alpha )\) coincide, and that their common spectral radius is less than one follows from [1]. \(\square \)
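As a quick numerical sanity check of Lemma 1, one can form \(\Gamma (\alpha )\), \(\Psi (\alpha )\) and \(\Theta (\alpha )\) for a small test matrix and compare their spectral radii; the random construction of A below (chosen so that its Hermitian part is positive definite) and the value \(\alpha =1\) are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, alpha = 6, 1.0
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n) + (B - B.T)   # Hermitian part is positive definite
H, S = (A + A.T) / 2, (A - A.T) / 2
I = np.eye(n)
inv = np.linalg.inv
rho = lambda M: np.abs(np.linalg.eigvals(M)).max()

Gamma = inv(alpha*I + S) @ (alpha*I - H) @ inv(alpha*I + H) @ (alpha*I - S)
Psi   = inv(alpha*I + S) @ (alpha*I - S) @ (alpha*I - H) @ inv(alpha*I + H)
Theta = inv(alpha*I + H) @ (alpha*I - H) @ inv(alpha*I + S) @ (alpha*I - S)

print(rho(Gamma), rho(Psi), rho(Theta))   # the three spectral radii agree and are < 1
```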

The previous lemma suggests studying different orderings of the matrices from the HSS iteration scheme.

3 Kellogg-type HSS iteration method

The following iteration scheme for solving \(Ax=b\) is a modification of the HSS iteration method patterned after Bruce Kellogg’s scheme [5] for ADI. We call this method the Kellogg-type HSS iteration method.

The Kellogg-type HSS iteration method Given an initial guess \(x^{0}\), for \(k=0,1,2,\dots \), until \(\{x^{k}\}\) converges, compute

$$\begin{aligned} \begin{aligned} (\alpha I + H) \ x^{k+ \frac{1}{2}} = (\alpha I -H) x^{k} \ + \ b_1 \\ (\alpha I + S) \ x^{k+1} = (\alpha I -S) x^{k+ \frac{1}{2}} + \ b_2 , \end{aligned} \end{aligned}$$
(3)

where \(\alpha \) is a positive constant and \(b=b_1+b_2\).

Upon eliminating \(x^{k+\frac{1}{2}} \), one has

\(x^{k+1}=\ \Psi (\alpha ) x^{k}+ (\alpha I + S)^{-1}(\alpha I - S)(\alpha I + H)^{-1} b_1 +(\alpha I + S)^{-1} b_2\).
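A minimal sketch of scheme (3) follows; the splitting \(b=b_1+b_2\) is left to the caller, and, anticipating Theorem 1 below (the iterates \(x^{k}\) and \(x^{k+\frac{1}{2}}\) converge to vectors y and z with \(y+z=A^{-1}b\)), the sketch reports the sum of the two latest half-iterates as the approximate solution. The function name, tolerance, iteration cap and stopping test are illustrative choices.

```python
import numpy as np

def kellogg_hss_iteration(A, b1, b2, alpha, x0=None, tol=1e-5, max_iter=500):
    """A sketch of the Kellogg-type HSS scheme (3); b = b1 + b2."""
    n = A.shape[0]
    H, S = (A + A.T) / 2, (A - A.T) / 2
    I = np.eye(n)
    b = b1 + b2
    x = np.ones(n) if x0 is None else np.asarray(x0, dtype=float).copy()
    for k in range(1, max_iter + 1):
        # (alpha I + H) x^{k+1/2} = (alpha I - H) x^k + b1
        x_half = np.linalg.solve(alpha * I + H, (alpha * I - H) @ x + b1)
        # (alpha I + S) x^{k+1}   = (alpha I - S) x^{k+1/2} + b2
        x = np.linalg.solve(alpha * I + S, (alpha * I - S) @ x_half + b2)
        # per Theorem 1, x^k -> y and x^{k+1/2} -> z with y + z = A^{-1} b,
        # so the sum of the two latest half-iterates is used as the approximation
        if np.linalg.norm(A @ (x + x_half) - b) < tol:
            return x + x_half, k
    return x + x_half, max_iter
```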

The following lemma will be used to show the convergence of this iteration scheme to a unique solution to the system of equations \(Ax=b\).

Lemma 2

For any \(c_1,\ c_2\ \in \mathbb {R}^n\) there exist unique \(y,\ z\ \in \mathbb {R}^n\) satisfying

$$\begin{aligned} \begin{aligned} z = (\alpha I + H)^{-1}(\alpha I- H)y + c_1 \\ y = (\alpha I + S)^{-1}(\alpha I- S)z + c_2 \end{aligned} \end{aligned}$$
(4)

Furthermore,

$$\begin{aligned} \begin{aligned} y=\Psi (\alpha )y+ (\alpha I + S)^{-1}(\alpha I- S)c_1 +c_2 \\ \text { and } \\ z=\Theta (\alpha ) z+ (\alpha I + H)^{-1}(\alpha I- H)c_2 + c_1 \end{aligned} \end{aligned}$$
(5)

Proof

The above system of equations may be represented by \( \widetilde{A} \begin{bmatrix} z \\ y \end{bmatrix} = \begin{bmatrix} c_1 \\ c_2 \end{bmatrix}\), where \( \widetilde{A} = \begin{bmatrix} I & -(\alpha I + H)^{-1}(\alpha I- H)\\ -(\alpha I + S)^{-1}(\alpha I- S) & I \end{bmatrix}\)

whose determinant may be found from the following product of matrices:

\(\begin{bmatrix} I & -(\alpha I + H)^{-1}(\alpha I- H)\\ -(\alpha I + S)^{-1}(\alpha I- S) & I \end{bmatrix} \begin{bmatrix} I & 0\\ (\alpha I + S)^{-1}(\alpha I- S) & I \end{bmatrix}\)

= \(\begin{bmatrix} I - \Theta (\alpha ) & \quad -(\alpha I + H)^{-1}(\alpha I- H)\\ 0 & I \end{bmatrix}\)

Now, since \(\rho (\Theta (\alpha )) < 1 \), the determinant of the last matrix is non-zero, and hence the determinant of \( \widetilde{A} \) is non-zero. This completes the proof of the first part; the second part follows by substituting for y into the first equation and then substituting for z in the second equation. \(\square \)
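A small numerical check of the block elimination used in the proof: with the same illustrative random construction of A as before, the product below is block upper triangular with \(I-\Theta (\alpha )\) in the (1,1) position and an identity in the (2,2) position, so \(\det \widetilde{A}=\det (I-\Theta (\alpha ))\).

```python
import numpy as np

rng = np.random.default_rng(1)
n, alpha = 5, 1.0
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n) + (B - B.T)
H, S = (A + A.T) / 2, (A - A.T) / 2
I = np.eye(n)
MH = np.linalg.solve(alpha*I + H, alpha*I - H)    # (alpha I + H)^{-1}(alpha I - H)
MS = np.linalg.solve(alpha*I + S, alpha*I - S)    # (alpha I + S)^{-1}(alpha I - S)
Theta = MH @ MS

A_tilde = np.block([[I, -MH], [-MS, I]])
E = np.block([[I, np.zeros((n, n))], [MS, I]])    # unit lower triangular, det = 1
P = A_tilde @ E
# (1,1) block is I - Theta and the (2,1) block vanishes, so det(A_tilde) = det(I - Theta) != 0
print(np.allclose(P[:n, :n], I - Theta), np.allclose(P[n:, :n], 0))
```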

It is worth noting that (5) uncouples (4). In view of our definitions for \(\Psi (\alpha )\) and \(\Theta (\alpha )\), the Kellogg-type HSS iteration may be rewritten in a recursive manner as follows:

$$\begin{aligned} \begin{aligned} x^{k+ 1} = \Psi (\alpha ) x^{k} \ + (\alpha I + S)^{-1}(\alpha I- S)(\alpha I + H)^{-1}b_1 + {(\alpha I +S)^{-1}} b_2\\ \ x^{k+\frac{1}{2}} = \Theta (\alpha ) x^{k- \frac{1}{2}} + (\alpha I + H)^{-1}(\alpha I- H) {(\alpha I +S)^{-1}} b_2 +(\alpha I + H)^{-1}b_1. \end{aligned} \end{aligned}$$
(6)


We wish to show that the sequences \(\{x^k\}\) and \(\{x^{k+\frac{1}{2}}\}\) approach y and z, respectively, and also that \(y+z\) is the solution to the system of equations \(Ax=b\), where \(\begin{bmatrix} z \\ y \end{bmatrix}\) is the solution of (4). Formally, we state this in the following theorem.

Theorem 1

Let \(x^{k}\) and \(x^{k+\frac{1}{2}}\) be defined by (6). Then \(\displaystyle \lim _{k\rightarrow \infty } x^{k} =y\), \(\displaystyle \lim _{k\rightarrow \infty } x^{k+\frac{1}{2}}=z\), and \(y+z =x = A^{-1}b\).

Proof

Let z and y be the unique solutions of (4) where \(c_1=(\alpha I + H)^{-1}b_1\) and \(c_2= {(\alpha I +S)^{-1}} b_2\) and set \(e^{n+\frac{1}{2}} = z- x^{n+\frac{1}{2}} \). Then, by Lemma 2 and the definition of \(x^{n+\frac{1}{2}}\) in the Kellogg-type HSS iteration

$$\begin{aligned} \begin{aligned} z- x^{n+\frac{1}{2}}&= \{ (\alpha I + H)^{-1}(\alpha I- H)y +c_1 \}- x^{n+\frac{1}{2}} \\&= (\alpha I + H)^{-1}(\alpha I- H)y +c_1- (\alpha I + H)^{-1}(\alpha I- H)x^n -c_1. \end{aligned} \end{aligned}$$
(7)

Canceling the \(c_1\) terms one has \(e^{n+\frac{1}{2}} = (\alpha I + H)^{-1}(\alpha I- H)(y-x^n)\).

Applying Lemma 2 and the definition of \(x^n\) from the Kellogg-type HSS iteration, \(e^{n+\frac{1}{2}}\) becomes \( (\alpha I + H)^{-1}(\alpha I- H)[ (\alpha I + S)^{-1}(\alpha I- S)z + c_2-(\alpha I + S)^{-1}(\alpha I- S)x^{n-\frac{1}{2}}-c_2 ]\). Canceling the \(c_2\) terms, one has \( e^{n+\frac{1}{2}} = \Theta (\alpha )(z-x^{n-\frac{1}{2}}) =\Theta (\alpha ) e^{n-\frac{1}{2}}\), from which convergence follows since \(\rho (\Theta (\alpha ))<1\).

In a similar fashion, set \(e^{n+1} = y - x^{n+1} \). Then, using the definition of \(x^{n+1}\) and \(x^{n+\frac{1}{2}}\), one obtains

$$\begin{aligned} \begin{aligned} e^{n+1}&=y - (\alpha I + S)^{-1}(\alpha I- S)x^{n+\frac{1}{2}} - c_2\\&= (y -c_2)- (\alpha I + S)^{-1}(\alpha I- S)((\alpha I + H)^{-1}(\alpha I- H)x^n +c_1) . \end{aligned} \end{aligned}$$
(8)

By Lemma 2, \((\alpha I + S)^{-1}(\alpha I- S)c_1 = (\alpha I + S)^{-1}(\alpha I- S)z - \Psi (\alpha ) y\) and \( y - c_2= (\alpha I + S)^{-1}(\alpha I- S)z\). This results in \( e^{n+1} =y - c_2 - \Psi (\alpha )x^n-(\alpha I + S)^{-1}(\alpha I- S)z + \Psi (\alpha )y = \Psi (\alpha )(y-x^n)= \Psi (\alpha ) e^n\) from which convergence follows since \(\rho (\Psi (\alpha )) < 1\).

It remains to show that \( y+z \) is the solution of the system of equations \(Ax=b\).

To this end, set \(c_1 = (\alpha I + H) ^{-1} b_1\) and \(c_2 = (\alpha I + S) ^{-1} b_2\). By Lemma 2, the vector \(\begin{bmatrix} z \\ y \end{bmatrix}\) satisfies

$$\begin{aligned} \begin{aligned} z = (\alpha I + H)^{-1}(\alpha I- H)y + (\alpha I + H)^{-1}b_1 \\ y = (\alpha I + S)^{-1}(\alpha I- S)z + {(\alpha I +S)^{-1}} b_2 \end{aligned} \end{aligned}$$
(9)

Multiplying the first equation by \((\alpha I + H)\) and the second by \((\alpha I + S)\) and adding the results yields

$$\begin{aligned} (\alpha I + H)z+(\alpha I + S)y=(\alpha I - H) y +(\alpha I - S)z +b_1+b_2 \end{aligned}$$
(10)

This simplifies to \(H(y+z) +S(y+z) = b_1+b_2\), that is, \(A(y+z)=b\), as promised. \(\square \)

Also, we note that in the HSS Scheme, \(\displaystyle \lim _{n\rightarrow \infty } x^{n} =y \) and \(\displaystyle \lim _{n\rightarrow \infty } x^{n+\frac{1}{2}} =z\).

4 Cyclic reduction

We develop a cyclic reduction scheme similar to that in [6, 170]. With \(\widetilde{A}\) as previously defined, one sets \(J = I-\widetilde{A} = \begin{bmatrix} 0 & (\alpha I + H)^{-1}(\alpha I- H)\\ (\alpha I + S)^{-1}(\alpha I- S) & 0 \end{bmatrix}\), and then

\(J ^2= \begin{bmatrix} \Theta (\alpha ) & 0 \\ 0 & \Psi (\alpha ) \end{bmatrix}.\)

Lemma 3

For the matrix J defined above, \(\rho (J ^2)<1 \) and \( \rho (J )<1 \).

Proof

Since \(\rho (\Theta (\alpha ))<1\) and \(\rho (\Psi (\alpha ))<1\) and \(J ^2\) is a block diagonal matrix with \(\Theta (\alpha )\) and \(\Psi (\alpha )\) along the diagonal, it follows that \(\rho (J ^2)<1\). Also, since the eigenvalues of the square of a matrix are the squares of the eigenvalues of the original matrix, it follows that \( \rho (J )<1 \). \(\square \)

Moreover, \(J ^2\) defines the following uncoupled iteration scheme:

$$\begin{aligned} \begin{aligned} z^{m+1} = \Theta (\alpha ) z^{m} + k_1 \\ y^{m+1} = \Psi (\alpha ) y^{m} + k_2 \end{aligned} \end{aligned}$$
(11)

where \(k_1= (\alpha I + H)^{-1}(\alpha I- H) {(\alpha I +S)^{-1}} b_2+ (\alpha I + H)^{-1}b_1\) and \(k_2=(\alpha I + S)^{-1}(\alpha I- S)(\alpha I + H)^{-1}b_1 + {(\alpha I +S)^{-1}} b_2\).

We now focus our attention on the solution of the reduced equation \( z^{m+1} = \Theta (\alpha ) z^{m} + k_1.\) Since \(\rho ( \Theta (\alpha ))< 1 \), this iteration converges, and we obtain the following scheme:

Cyclic reduction scheme Solving the reduced equation above yields a sequence \(z^{m }\) converging to z. Having found the vector z, we use Lemma 2 to form the vector \(y= (\alpha I + S)^{-1}(\alpha I- S)z + {(\alpha I +S)^{-1}} b_2\), and, as previously shown, \(x=y+z\) is the solution to the system of equations \(Ax=b.\)
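A minimal sketch of the cyclic reduction scheme in the notation above: iterate the reduced equation for z, then recover y from Lemma 2 and return \(x=y+z\). The function name, tolerance and iteration cap are illustrative.

```python
import numpy as np

def cyclic_reduction(A, b1, b2, alpha, tol=1e-5, max_iter=500):
    """A sketch of the cyclic reduction scheme: iterate z, recover y via Lemma 2, return x = y + z."""
    n = A.shape[0]
    H, S = (A + A.T) / 2, (A - A.T) / 2
    I = np.eye(n)
    MH = np.linalg.solve(alpha*I + H, alpha*I - H)       # (alpha I + H)^{-1}(alpha I - H)
    MS = np.linalg.solve(alpha*I + S, alpha*I - S)       # (alpha I + S)^{-1}(alpha I - S)
    Theta = MH @ MS
    k1 = MH @ np.linalg.solve(alpha*I + S, b2) + np.linalg.solve(alpha*I + H, b1)
    z = np.zeros(n)
    for m in range(max_iter):
        z_new = Theta @ z + k1                           # z^{m+1} = Theta(alpha) z^m + k1
        if np.linalg.norm(z_new - z) < tol:              # stop on consecutive iterates
            z = z_new
            break
        z = z_new
    y = MS @ z + np.linalg.solve(alpha*I + S, b2)        # y from Lemma 2
    return y + z
```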

5 Examples

In the following examples, a solution vector to the system of equations \(Ax=b\) is constructed as \(x(i)= (\frac{i}{N}) \sin (\frac{i\pi }{6})\) for \(i=1,2,\dots ,N\), with initial guess \( x^0=(1,1,\dots ,1)^T\), and the reported error in each of our schemes is measured by \(\left\| {Ay-b}\right\| _2\), where y is the resulting approximate answer. Furthermore, in the HSS iteration, the Kellogg-type HSS iteration and the cyclic reduction iteration, \(\alpha =\sqrt{\lambda _{\min }(H)\,\lambda _{\max }(H)}\), and in the Kellogg-type HSS iteration, \(b_1\) and \(b_2\) are set to b and \((0,0,\dots ,0)^T\), respectively. The stopping criterion for the HSS iteration and the Kellogg-type HSS iteration is \(\big \Vert {x-y}\big \Vert _2<10^{-5}\), where x is the constructed solution and y the current approximation, and the cyclic reduction iteration is stopped when the 2-norm of the difference of two consecutive iterates is less than \(10^{-5}\). Finally, the reported runtime is that of the main iteration loop, which includes all necessary matrix operations.
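A sketch of how the constructed solution, right-hand side and initial guess can be set up for a given coefficient matrix; the helper name test_problem is illustrative.

```python
import numpy as np

def test_problem(A):
    """Constructed solution x(i) = (i/N) sin(i*pi/6), right-hand side b = A x, initial guess of ones."""
    N = A.shape[0]
    i = np.arange(1, N + 1)
    x_exact = (i / N) * np.sin(i * np.pi / 6)
    return x_exact, A @ x_exact, np.ones(N)
```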

Example 1

Consider the system of linear equations with \(n\times n\) coefficient matrix \(A = I \otimes T +T \otimes I \), where \(T = \text {tridiag}(-1-r,2,-1+r)\) is an \(m\times m\) matrix with \(r=\frac{1}{m+1}\), so that \(n= m^2.\) For \(m=8\), the matrix is \(64\times 64\), and the following results are obtained.

The HSS iteration scheme required 38 iterations to obtain an error of \(2.3\times 10^{-6}\), and the Kellogg-type HSS iteration scheme required 40 iterations to obtain an error of \(2.2\times 10^{-6}\). Since both schemes ran in negligible time, 100 iterations of the main iteration loop were executed for both schemes, and both required a runtime of 0.126.

The cyclic reduction scheme was more expensive, requiring 53 iterations to obtain an error of \( 4.7 \times 10^{-6}\) with a runtime of 0.006.
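For reference, a sketch of how the Example 1 coefficient matrix can be assembled with Kronecker products, assuming the usual convention \(\text {tridiag}(\text {sub},\text {diag},\text {super})\); the helper name is illustrative.

```python
import numpy as np

def example1_matrix(m=8):
    """A = I (x) T + T (x) I, with T = tridiag(-1 - r, 2, -1 + r) and r = 1/(m + 1)."""
    r = 1.0 / (m + 1)
    T = (np.diag(np.full(m, 2.0))
         + np.diag(np.full(m - 1, -1.0 - r), -1)   # subdiagonal -1 - r
         + np.diag(np.full(m - 1, -1.0 + r), 1))   # superdiagonal -1 + r
    I = np.eye(m)
    return np.kron(I, T) + np.kron(T, I)           # n = m^2
```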

Example 2

Consider the system of linear equations with \(n\times n\) coefficient matrix \(A= \begin{bmatrix} B & E\\ -E' & .5 I \end{bmatrix}\), where \(B= \begin{bmatrix} I \otimes T +T \otimes I & 0\\ 0 & I \otimes T +T \otimes I \end{bmatrix}\) and \(E= \begin{bmatrix} I \otimes F \\ F \otimes I \end{bmatrix}\), with \(T = \text {tridiag}(-1,2,-1)\) and \(F= h\,\text {tridiag}(-1,1,0)\) being \(m\times m\) matrices and \(h=\frac{1}{m+1}\), so that \(n=3m^2 \).

For \(m=5\), the \(n\times n\) matrix is \(75\times 75\), and the following results are obtained.

The HSS iteration scheme required 26 iterations to obtain an error of \(3.1 \times 10^{-6}\), and the Kellogg-type HSS iteration scheme required 27 iterations to obtain an error of \(4.5 \times 10^{-6}\). Both schemes ran in a negligible amount of time. As in the previous example, 100 iterations of the main iteration loop were executed, with an identical resulting cpu time of 0.251 for both schemes.

The cyclic reduction scheme required 29 iterations to obtain an error of \(4.5\times 10^{-5}\) with a runtime of 0.063.
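A sketch of how the Example 2 coefficient matrix can be assembled; note that F is read here as the lower bidiagonal matrix \(h\,\text {tridiag}(-1,1,0)\), which is an assumption about the intended definition, and the helper name is illustrative.

```python
import numpy as np

def example2_matrix(m=5):
    """Saddle-point test matrix of Example 2; assumes F = h * tridiag(-1, 1, 0)."""
    h = 1.0 / (m + 1)
    T = (np.diag(np.full(m, 2.0))
         + np.diag(np.full(m - 1, -1.0), -1)
         + np.diag(np.full(m - 1, -1.0), 1))
    F = h * (np.eye(m) + np.diag(np.full(m - 1, -1.0), -1))   # lower bidiagonal (assumption)
    Im, mm = np.eye(m), m * m
    K = np.kron(Im, T) + np.kron(T, Im)
    B = np.block([[K, np.zeros((mm, mm))], [np.zeros((mm, mm)), K]])
    E = np.vstack([np.kron(Im, F), np.kron(F, Im)])
    return np.block([[B, E], [-E.T, 0.5 * np.eye(mm)]])       # n = 3 m^2
```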

Example 3

Consider the system of linear equations with the \(256\times 256\) coefficient matrix from [7]:

$$\begin{aligned} A= \begin{bmatrix} 1 & 1 & & & & & \\ -1 & 3 & 2 & & & & \\ & -1 & 5 & 3 & & & \\ & & -1 & 7 & 4 & & \\ & & & \ddots & \ddots & \ddots & \\ & & & -1 & 2N-5 & N-2 & \\ & & & & -1 & 2N-3 & N-1 \\ & & & & & -1 & 2N-1 \end{bmatrix} \end{aligned}$$
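A sketch assembling this tridiagonal matrix (diagonal entries \(2i-1\), superdiagonal entries i, subdiagonal entries \(-1\)) for \(N=256\); the helper name is illustrative.

```python
import numpy as np

def example3_matrix(N=256):
    """Tridiagonal matrix of Example 3: diagonal 2i - 1, superdiagonal i, subdiagonal -1."""
    i = np.arange(1, N + 1, dtype=float)
    return (np.diag(2 * i - 1)
            + np.diag(i[:-1], 1)                   # superdiagonal 1, 2, ..., N - 1
            + np.diag(np.full(N - 1, -1.0), -1))   # subdiagonal -1
```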

The HSS iteration scheme required 11 iterations to obtain an error of \(7.5\times 10^{-6}\) and a runtime of 0.047.

The Kellogg-type HSS iteration scheme required 12 iterations to obtain an error of \(6.0\times 10^{-6}\) and a runtime of 0.048.

Finally, the cyclic reduction scheme required 12 iterations to obtain an error of \(6.0\times 10^{-6}\) and a runtime of 0.062.

6 A closer look at the Kellogg-type HSS iteration scheme

In the examples above, the Kellogg-type HSS iteration scheme was run with \(b_1\) and \(b_2\) set to b and \((0,0,\dots ,0)^T\), respectively. Code was written for seven different choices of \(b_1\) and \(b_2\) in each of the above examples: \(b_1\) was taken as \(\beta b\) and \(b_2\) as \((1-\beta ) b\) for \( \beta \in \{ 0,1/4,1/2,3/4,1\}\); one can also take \(b_1\) to be the positive or the negative elements of b, with \(b_2=b-b_1\). The only real change in our data is in Example 1, where selecting \(b_1\) to be the positive elements of b increased the number of iterations needed to meet our convergence criterion from 40 to 41. In Example 3, selecting \(b_1\) to be the positive elements of b produced an error of \(6.8\times 10^{-6}\) with a runtime of 0.048, while selecting \(b_1\) to be the negative elements of b produced an error of \(5.2\times 10^{-6}\) with a runtime of 0.047. Otherwise, there were no significant changes in accuracy across the seven selections of \(b_1\) and \(b_2\), other than in the sixth digit.
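A sketch of the seven splittings described above; treating the sign-based splits as keeping the positive (respectively negative) entries of b elementwise is an assumption about the construction used, and the helper name is illustrative.

```python
import numpy as np

def rhs_splittings(b):
    """The seven choices of (b1, b2) with b1 + b2 = b described above."""
    splits = [(beta * b, (1.0 - beta) * b) for beta in (0.0, 0.25, 0.5, 0.75, 1.0)]
    b_pos = np.where(b > 0, b, 0.0)                # keep only the positive entries of b (assumption)
    splits.append((b_pos, b - b_pos))
    b_neg = np.where(b < 0, b, 0.0)                # keep only the negative entries of b (assumption)
    splits.append((b_neg, b - b_neg))
    return splits
```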

In conclusion, since the spectral radius of the iteration matrix is the same for all of the iteration schemes considered in this paper, they all require about the same number of iterations to converge to the solution of the system of linear equations, although cyclic reduction is clearly more expensive.