1 Introduction

Consider the generalized saddle-point problem

$$ \mathscr{A}z:= \begin{bmatrix} A & B^{T} \\ -B & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}= \begin{bmatrix} f \\ g \end{bmatrix}:=b, $$
(1)

where \(A\in R^{n\times n}\), \(B\in R^{m\times n}\), \(f\in R^{n}\), \(g\in R^{m}\), and \(m\leq n\).

This class of linear systems arises in many scientific and engineering applications such as a mixed finite element approximation of elliptic partial differential equations, optimization, optimal control, structural analysis and electrical networks; see [1,2,3,4,5,6,7,8,9,10,11].

Recently, Benzi et al. [12, 13] studied the linear systems of the form (1) whose coefficient matrix

$$ \mathscr{A}= \begin{bmatrix} A & B^{T} \\ -B & 0 \end{bmatrix}= \begin{bmatrix} A_{1} & 0 & B_{1}^{T} \\ 0 & A_{2} & B_{2}^{T} \\ -B_{1} & -B_{2} & 0 \end{bmatrix} $$
(2)

satisfies all of the following assumptions:

  • \(A=\left[\begin{smallmatrix} A_{1} & 0 \\ 0 & A_{2}\end{smallmatrix}\right]\), \(B=[B_{1}, B_{2}]\), \(A_{i}\in R^{n_{i}\times n_{i}}\) for \(i=1,2\) and \(n_{1}+n_{2}=n\), and \(B_{i}\in R^{m\times n_{i}}\) for \(i=1,2\);

  • \(A_{i}\) is positive definite (i.e., it has positive definite symmetric part \(H_{i}=(A_{i}+A^{T}_{i})/{2}\)) for \(i=1,2\);

  • \(\operatorname{rank}(B)=m\).

They [12] split the coefficient matrix \(\mathscr{A}\) as

$$ \mathscr{A}=\mathscr{A}_{1}+\mathscr{A}_{2}, $$
(3)

where

$$ \mathscr{A}_{1}= \begin{bmatrix} A_{1} & 0 & B_{1}^{T} \\ 0 & 0 & 0 \\ -B_{1} & 0 & 0 \end{bmatrix},\quad \text{and} \quad \mathscr{A}_{2}= \begin{bmatrix} 0 & 0 & 0 \\ 0 & A_{2} & B_{2}^{T} \\ 0 & -B_{2} & 0 \end{bmatrix}, $$
(4)

which is called the dimensional splitting of \(\mathscr{A}\), and they proposed the following alternate direction iterative method:

$$ \textstyle\begin{cases} (\alpha I+\mathscr{A}_{1})x^{(k+\frac{1}{2})}=(\alpha I-\mathscr{A} _{2})x^{(k)}+b, \\ (\alpha I+\mathscr{A}_{2})x^{(k+1)}=(\alpha I-\mathscr{A}_{1})x^{(k+ \frac{1}{2})}+b, \end{cases} $$
(5)

which was proved to converge unconditionally for any \(\alpha >0\). Meanwhile, based on the dimensional splitting of \(\mathscr{A}\), they [12, 13] proposed the dimensional splitting preconditioner for the linear system (1), applied a Krylov subspace method such as restarted GMRES to the preconditioned linear system, and obtained good results.
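To make the structure of (5) concrete, the following minimal NumPy sketch (an illustration only, not the code used for the experiments reported in Sect. 4) assembles the dimensional splitting (4) from dense blocks \(A_{1}\), \(A_{2}\), \(B_{1}\), \(B_{2}\) and performs one sweep of iteration (5); the function names and the use of dense solves are our own illustrative assumptions.

```python
import numpy as np

def dimensional_splitting(A1, A2, B1, B2):
    """Assemble the dimensional splitting (3)-(4) of the saddle-point matrix (2)."""
    n1, n2, m = A1.shape[0], A2.shape[0], B1.shape[0]
    Z = np.zeros
    cA1 = np.block([[A1,          Z((n1, n2)), B1.T],
                    [Z((n2, n1)), Z((n2, n2)), Z((n2, m))],
                    [-B1,         Z((m, n2)),  Z((m, m))]])
    cA2 = np.block([[Z((n1, n1)), Z((n1, n2)), Z((n1, m))],
                    [Z((n2, n1)), A2,          B2.T],
                    [Z((m, n1)),  -B2,         Z((m, m))]])
    return cA1, cA2

def adi_sweep(cA1, cA2, x, b, alpha):
    """One sweep of (5): two half-steps with the scalar shift alpha*I."""
    I = np.eye(cA1.shape[0])
    x_half = np.linalg.solve(alpha * I + cA1, (alpha * I - cA2) @ x + b)
    return np.linalg.solve(alpha * I + cA2, (alpha * I - cA1) @ x_half + b)
```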

In this paper, we propose two types of alternate direction iterative methods. The first starts from the dimensional splitting (3) and replaces the scalar matrix \(\alpha I\) by two nonnegative diagonal matrices \(\mathscr{D}_{1}\) and \(\mathscr{D}_{2}\) to form a new alternate direction iterative scheme. The second is based on a new splitting of \(\mathscr{A}\), i.e.,

$$ \mathscr{A}=\mathscr{B}_{1}+\mathscr{B}_{2}, $$
(6)

where

$$ \mathscr{B}_{1}= \begin{bmatrix} A_{1} & 0 & 0 \\ 0 & 0 & B_{2}^{T} \\ 0 & -B_{2} & 0 \end{bmatrix}\quad \mbox{and}\quad \mathscr{B}_{2}= \begin{bmatrix} 0 & 0 & B_{1}^{T} \\ 0 & A_{2} & 0 \\ -B_{1} & 0 & 0 \end{bmatrix}, $$
(7)

and applies the two nonnegative diagonal matrices \(\mathscr{D}_{1}\) and \(\mathscr{D}_{2}\) to this new splitting, which yields another alternate direction iterative scheme. We then establish convergence results for both schemes and give a numerical example showing that the proposed ADI methods are more effective and efficient than the existing one.

The paper is organized as follows. Two alternate direction iterative schemes are proposed in Sect. 2. The main convergence results for these two schemes are given in Sect. 3. In Sect. 4, a numerical example is presented to demonstrate that the proposed methods are effective and efficient. A conclusion is given in Sect. 5.

2 The ADI methods

In this section, two alternate direction iterative schemes are proposed based on the previous two splittings (3) and (6). Let

$$ \mathscr{D}_{1}= \begin{bmatrix} 0 & 0 & 0 \\ 0 & \alpha I_{n_{2}} & 0 \\ 0 & 0 & \frac{\alpha }{2}I_{m} \end{bmatrix} \quad \mbox{and}\quad \mathscr{D}_{2}= \begin{bmatrix} \alpha I_{n_{1}} & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & \frac{\alpha }{2}I_{m} \end{bmatrix}, $$
(8)

where \(\alpha >0\) and \(I_{m}\) is the \(m\times m\) identity matrix. Then for the two splittings (3) and (6) one has

$$ \begin{aligned}[b] \mathscr{A}&=( \mathscr{D}_{1}+\mathscr{A}_{1})-(\mathscr{D}_{1}- \mathscr{A}_{2}) \\ &=(\mathscr{D}_{2}+\mathscr{A}_{2})-( \mathscr{D}_{2}-\mathscr{A}_{1}) \\ &=(\mathscr{D}_{1}+\mathscr{B}_{1})-( \mathscr{D}_{1}-\mathscr{B}_{2}) \\ &=(\mathscr{D}_{2}+\mathscr{B}_{2})-( \mathscr{D}_{2}-\mathscr{B}_{1}), \end{aligned} $$
(9)

which lead to the following two alternate direction iterative schemes.

Given an initial guess \(x^{(0)}\), for \(k=0,1,2,\ldots\), until \(\{x^{(k)}\}\) converges, compute

$$\begin{aligned}& \textstyle\begin{cases} (\mathscr{D}_{1}+\mathscr{A}_{1})x^{(k+\frac{1}{2})}=(\mathscr{D}_{1}- \mathscr{A}_{2})x^{(k)}+b, \\ (\mathscr{D}_{2}+\mathscr{A}_{2})x^{(k+1)}=(\mathscr{D}_{2}- \mathscr{A}_{1})x^{(k+\frac{1}{2})}+b, \end{cases}\displaystyle \quad \mbox{and}\quad \end{aligned}$$
(10)
$$\begin{aligned}& \textstyle\begin{cases} (\mathscr{D}_{1}+\mathscr{B}_{1})x^{(k+\frac{1}{2})}=(\mathscr{D}_{1}- \mathscr{B}_{2})x^{(k)}+b, \\ (\mathscr{D}_{2}+\mathscr{B}_{2})x^{(k+1)}=(\mathscr{D}_{2}- \mathscr{B}_{1})x^{(k+\frac{1}{2})}+b, \end{cases}\displaystyle \end{aligned}$$
(11)

where \(\mathscr{D}_{1}\) and \(\mathscr{D}_{2}\) are defined in (8).
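As a hedged illustration (a sketch only, not the code used for the experiments in Sect. 4), the diagonal shifts (8) and one sweep of either scheme can be written in a few lines of NumPy; the names `shift_matrices` and `adi_sweep_shifted` are ours, and the same sweep function serves (10) when fed \(\mathscr{A}_{1},\mathscr{A}_{2}\) and (11) when fed \(\mathscr{B}_{1},\mathscr{B}_{2}\).

```python
import numpy as np

def shift_matrices(n1, n2, m, alpha):
    """D1 and D2 from (8): nonnegative diagonal matrices replacing the scalar shift alpha*I."""
    d1 = np.concatenate([np.zeros(n1), alpha * np.ones(n2), 0.5 * alpha * np.ones(m)])
    d2 = np.concatenate([alpha * np.ones(n1), np.zeros(n2), 0.5 * alpha * np.ones(m)])
    return np.diag(d1), np.diag(d2)

def adi_sweep_shifted(S1, S2, D1, D2, x, b):
    """One sweep of (10) (S1, S2 = script A1, A2 from (4)) or (11) (S1, S2 = script B1, B2 from (7))."""
    x_half = np.linalg.solve(D1 + S1, (D1 - S2) @ x + b)
    return np.linalg.solve(D2 + S2, (D2 - S1) @ x_half + b)
```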

Eliminating \(x^{(k+\frac{1}{2})}\) in iterations (10) and (11), we obtain the stationary schemes

$$\begin{aligned}& x^{(k+1)}=\mathscr{L}x^{(k)}+f, \quad k=1,2,\ldots , \quad \mbox{and} \end{aligned}$$
(12)
$$\begin{aligned}& x^{(k+1)}=\mathscr{T}x^{(k)}+g,\quad k=1,2, \ldots , \end{aligned}$$
(13)

where

$$ \mathscr{L}=(\mathscr{D}_{2}+\mathscr{A}_{2})^{-1}( \mathscr{D}_{2}- \mathscr{A}_{1}) (\mathscr{D}_{1}+ \mathscr{A}_{1})^{-1}(\mathscr{D} _{1}- \mathscr{A}_{2}) $$
(14)

and

$$ \mathscr{T}=(\mathscr{D}_{2}+\mathscr{B}_{2})^{-1}( \mathscr{D}_{2}- \mathscr{B}_{1}) (\mathscr{D}_{1}+ \mathscr{B}_{1})^{-1}(\mathscr{D} _{1}- \mathscr{B}_{2}) $$
(15)

are the iteration matrices of the ADI iterations (12) and (13), respectively. It is easy to see that (14) and (15), respectively, are similar to the matrices

$$ \hat{\mathscr{L}}=(\mathscr{D}_{2}- \mathscr{A}_{1}) (\mathscr{D}_{1}+ \mathscr{A}_{1})^{-1}( \mathscr{D}_{1}-\mathscr{A}_{2}) (\mathscr{D} _{2}+\mathscr{A}_{2})^{-1} $$
(16)

and

$$ \hat{\mathscr{T}}=(\mathscr{D}_{2}- \mathscr{B}_{1}) (\mathscr{D}_{1}+ \mathscr{B}_{1})^{-1}( \mathscr{D}_{1}-\mathscr{B}_{2}) (\mathscr{D} _{2}+\mathscr{B}_{2})^{-1}. $$
(17)

As is shown in [8], the iteration matrix \(\mathscr{L}\) is induced by the unique splitting \(\mathscr{A}=\mathscr{P}-\mathscr{Q}\) with \(\mathscr{P}\) nonsingular, i.e., \(\mathscr{L}=\mathscr{P}^{-1} \mathscr{Q}=I-\mathscr{P}^{-1}\mathscr{A}\). Furthermore, \(f= \mathscr{P}^{-1}b\). The matrices \(\mathscr{P}\) and \(\mathscr{Q}\) are given by

$$ \mathscr{P}=\frac{1}{\alpha }(\mathscr{D}_{1}+ \mathscr{A}_{1}) ( \mathscr{D}_{2}+\mathscr{A}_{2}), \qquad \mathscr{Q}=\frac{1}{\alpha }(\mathscr{D}_{2}- \mathscr{A}_{1}) ( \mathscr{D}_{1}-\mathscr{A}_{2}). $$
(18)

Also, the iteration matrix \(\mathscr{T}\) is induced by the unique splitting \(\mathscr{A}=\mathscr{M}-\mathscr{N}\) with

$$ \mathscr{M}=\frac{1}{\alpha }(\mathscr{D}_{1}+ \mathscr{B}_{1}) ( \mathscr{D}_{2}+\mathscr{B}_{2}) \quad \mbox{nonsingular},\qquad \mathscr{N}=\frac{1}{\alpha }( \mathscr{D}_{2}- \mathscr{B}_{1}) (\mathscr{D}_{1}- \mathscr{B}_{2}), $$
(19)

i.e., \(\mathscr{T}=\mathscr{M}^{-1}\mathscr{N}=I-\mathscr{M}^{-1} \mathscr{A}\). Furthermore, \(g=\mathscr{M}^{-1}b\). We often refer to \(\mathscr{P}\) or \(\mathscr{M}\) as the preconditioner.
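Applying \(\mathscr{P}^{-1}\) (or \(\mathscr{M}^{-1}\)) to a vector therefore amounts to two consecutive solves with the shifted factors. A minimal sketch is given below, under the assumption that dense solves are acceptable for illustration; in practice such a routine would be wrapped as a preconditioner for a Krylov subspace method.

```python
import numpy as np

def apply_preconditioner(S1, S2, D1, D2, alpha, r):
    """Apply P^{-1} (or M^{-1}) from (18)-(19) to a vector r.

    Since P = (1/alpha) * (D1 + S1) @ (D2 + S2), we have
    P^{-1} r = alpha * (D2 + S2)^{-1} (D1 + S1)^{-1} r.
    """
    y = np.linalg.solve(D1 + S1, r)
    return alpha * np.linalg.solve(D2 + S2, y)
```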

3 The convergence of the ADI methods

In this section, some convergence results for the ADI methods are established. We first state the following lemmas, which will be used throughout this section.

Lemma 1

Let \(A=M-N\in C^{n\times n}\) with \(A\) and \(M\) nonsingular and let \(T=NM^{-1}\). Then \(A-TAT^{*}=(I-T)(AA^{-*}M^{*}+N)(I-T^{*})\).

The proof is similar to the proof of Lemma 5.30 in [1].

Lemma 2

Let \(A\in R^{n\times n}\) be symmetric and positive definite. If \(A=M-N\) with \(M\) nonsingular is a splitting such that \(M+N\) has a nonnegative definite symmetric part, then \(\|T\|_{A}=\|A^{-1/2}TA^{1/2}\|_{2}\leq 1\), where \(T=NM^{-1}\).

Proof

It follows from Lemma 1 that

$$ \begin{aligned}[b] A-TAT^{T}&=(I-T) \bigl(M^{T}A^{-T}A+N\bigr) \bigl(I-T^{T}\bigr) \\ &=(I-T) \bigl(M^{T}+N\bigr) \bigl(I-T^{T}\bigr). \end{aligned} $$
(20)

Since

$$ \begin{aligned}[b] 2\bigl(M^{T}+N \bigr)&=2\bigl(M^{T}+M-A\bigr) \\ &=\bigl(M+N^{T}\bigr)+\bigl(M^{T}+N\bigr) \\ &=(M+N)+(M+N)^{T} \\ &\succeq 0, \end{aligned} $$
(21)

it follows from (20) that \(A-TAT^{T}\succeq 0\) and thus

$$ A\succeq TAT^{T}. $$
(22)

From (22), we have \(I\succeq (A^{-1/2}TA^{1/2})(A^{-1/2}TA ^{1/2})^{T}\succeq 0\). Therefore,

$$ \Vert T \Vert _{A}= \bigl\Vert A^{-1/2}TA^{1/2} \bigr\Vert _{2}=\sqrt{\rho \bigl[\bigl(A^{-1/2}TA^{1/2} \bigr) \bigl(A ^{-1/2}TA^{1/2}\bigr)^{T}\bigr]} \leq 1. $$

This completes the proof. □
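Lemma 2 can be checked numerically on a toy instance. The sketch below uses the splitting \(M=A+\beta I\), \(N=\beta I\) (an assumption of ours chosen so that \(M+N=A+2\beta I\) clearly has a nonnegative definite symmetric part) and evaluates \(\|T\|_{A}\) directly.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
X = rng.standard_normal((n, n))
A = X @ X.T + n * np.eye(n)                      # symmetric positive definite
beta = 0.7
M, N = A + beta * np.eye(n), beta * np.eye(n)    # A = M - N, with M + N = A + 2*beta*I
T = N @ np.linalg.inv(M)

# ||T||_A = ||A^{-1/2} T A^{1/2}||_2 should not exceed 1 by Lemma 2.
w, V = np.linalg.eigh(A)
A_half = V @ np.diag(np.sqrt(w)) @ V.T
A_half_inv = V @ np.diag(1.0 / np.sqrt(w)) @ V.T
print(np.linalg.norm(A_half_inv @ T @ A_half, 2))   # observed to be <= 1
```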

Lemma 3

Let \(\mathscr{A}_{i}\), \(\mathscr{B}_{i}\) and \(\mathscr{D}_{i}\) be defined in (4), (7) and (8) for \(i=1,2\). If \({A}_{i}\) has a positive definite symmetric part \(H_{i}\) and \(0<\alpha \leq 2\lambda _{\mathrm{min}}(H_{i})\) with \(\lambda _{\mathrm{min}}(H_{i})\) the smallest eigenvalue of \(H_{i}\) for \(i=1,2\), then

$$ \bigl\Vert (\mathscr{D}_{j}- \mathscr{A}_{i}) (\mathscr{D}_{i}+\mathscr{A}_{i})^{-1} \bigr\Vert _{2}\leq 1 \quad \textit{and}\quad \bigl\Vert ( \mathscr{D}_{j}-\mathscr{B}_{i}) (\mathscr{D}_{i}+ \mathscr{B}_{i})^{-1} \bigr\Vert _{2}\leq 1, $$
(23)

where \(j=2\) if \(i=1\) and \(j=1\) if \(i=2\).

Proof

We only prove the first inequality in (23); the same argument yields the second. Let \(M_{i}=\mathscr{D}_{i}+\mathscr{A}_{i}\) and \(N_{i}=-\mathscr{D}_{j}+\mathscr{A}_{i}\). Then we have

$$ C_{i}:=M_{i}-N_{i}=\mathscr{D}_{i}+ \mathscr{D}_{j}=\alpha \operatorname{diag}(I_{n _{1}},I_{n_{2}},I_{m})= \alpha I\succ 0, $$

where I is the \((n_{1}+n_{2}+m)\times (n_{1}+n_{2}+m)\) identity matrix, and

$$ M_{i}+N_{i}=2\mathscr{A}_{i}+ \mathscr{D}_{i}-\mathscr{D}_{j}. $$

When \(i=1\) and \(j=2\),

$$ M_{1}+N_{1}=2\mathscr{A}_{1}+ \mathscr{D}_{1}-\mathscr{D}_{2}= \begin{bmatrix} 2A_{1}-\alpha I_{n_{1}} & 0 & 2B_{1}^{T} \\ 0 & \alpha I_{n_{2}} & 0 \\ -2B_{1} & 0 & 0 \end{bmatrix}. $$

Noting \(0<\alpha \leq 2\lambda _{\mathrm{min}}(H_{1})\), we have \(2H_{1}-\alpha I_{n_{1}}=(A^{T}_{1}+A_{1})-\alpha I_{n_{1}}\succeq 0\). Thus

$$ \bigl[(M_{1}+N_{1})^{T}+(M_{1}+N_{1}) \bigr]/2= \begin{bmatrix} (A^{T}_{1}+A_{1})-\alpha I_{n_{1}} & 0 & 0 \\ 0 & \alpha I_{n_{2}} & 0 \\ 0 & 0 & 0 \end{bmatrix}\succeq 0, $$

which shows that \(M_{1}+N_{1}\) has a nonnegative definite symmetric part. Similarly, \(M_{2}+N_{2}\) also has a nonnegative definite symmetric part. Thus, \(M_{i}+N_{i}\) has a nonnegative definite symmetric part for \(i=1,2\). Let \(T_{i}=N_{i}M_{i}^{-1}\). Then it follows from Lemma 2 that

$$ \Vert T_{i} \Vert _{C_{i}}= \bigl\Vert C_{i}^{-1/2}T_{i}C_{i}^{1/2} \bigr\Vert _{2}= \Vert T_{i} \Vert _{2}\leq 1. $$

Consequently, \(\|T_{i}\|_{2}=\|N_{i}M_{i}^{-1}\|_{2}=\|(\mathscr{D} _{j}-\mathscr{A}_{i})(\mathscr{D}_{i}+\mathscr{A}_{i})^{-1}\|_{2} \leq 1\) for \(i=1,2\). This completes the proof. □
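The bound (23) can likewise be verified numerically for a small random instance satisfying the assumptions. In the sketch below, the helper `random_blocks` (our name) produces \(A_{i}\) with a positive definite symmetric part, and the check covers the case \(i=1\), \(j=2\) of the first inequality.

```python
import numpy as np

rng = np.random.default_rng(1)
n1, n2, m = 4, 3, 2

def random_blocks(n):
    """A_i = (SPD part) + (skew part), so its symmetric part H_i is positive definite."""
    X, S = rng.standard_normal((n, n)), rng.standard_normal((n, n))
    return X @ X.T + n * np.eye(n) + (S - S.T)

A1, A2 = random_blocks(n1), random_blocks(n2)
B1, B2 = rng.standard_normal((m, n1)), rng.standard_normal((m, n2))

H1, H2 = (A1 + A1.T) / 2, (A2 + A2.T) / 2
alpha = 2 * min(np.linalg.eigvalsh(H1).min(), np.linalg.eigvalsh(H2).min())

Z = np.zeros
cA1 = np.block([[A1,          Z((n1, n2)), B1.T],
                [Z((n2, n1)), Z((n2, n2)), Z((n2, m))],
                [-B1,         Z((m, n2)),  Z((m, m))]])
D1 = np.diag(np.concatenate([np.zeros(n1), alpha * np.ones(n2), 0.5 * alpha * np.ones(m)]))
D2 = np.diag(np.concatenate([alpha * np.ones(n1), np.zeros(n2), 0.5 * alpha * np.ones(m)]))

# Check ||(D2 - A_1)(D1 + A_1)^{-1}||_2 <= 1, i.e., the case i = 1, j = 2 of (23).
print(np.linalg.norm((D2 - cA1) @ np.linalg.inv(D1 + cA1), 2))
```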

Theorem 1

Consider problem (1) and assume that \(\mathscr{A}\) satisfies the assumptions above. Then \(\mathscr{A}\) is nonsingular. Further, if \(0<\alpha \leq 2\delta \) with \(\delta =\min \{ \lambda _{\mathrm{min}}(H_{1}),\lambda _{\mathrm{min}}(H_{2})\}\), then \(\|\hat{\mathscr{L}}\|_{2}\leq 1\) and \(\|\hat{\mathscr{T}}\|_{2}\leq 1\).

Proof

The proof of the nonsingularity of \(\mathscr{A}\) can be found in [10]. Since \(0<\alpha \leq 2\delta =2\min \{\lambda _{\mathrm{min}}(H_{1}),\lambda _{\mathrm{min}}(H_{2})\}\), Lemma 3 shows that (23) holds for \(i=1\), \(j=2\) and for \(i=2\), \(j=1\). As a result,

$$\begin{aligned}& \begin{aligned} &\begin{aligned} \Vert \hat{\mathscr{L}} \Vert _{2}&= \bigl\Vert (\mathscr{D}_{2}- \mathscr{A}_{1}) ( \mathscr{D}_{1}+\mathscr{A}_{1})^{-1}( \mathscr{D}_{1}-\mathscr{A}_{2}) ( \mathscr{D}_{2}+ \mathscr{A}_{2})^{-1} \bigr\Vert _{2} \\ &\leq \bigl\Vert (\mathscr{D}_{2}-\mathscr{A}_{1}) ( \mathscr{D}_{1}+ \mathscr{A}_{1})^{-1} \bigr\Vert _{2} \bigl\Vert (\mathscr{D}_{1}- \mathscr{A}_{2}) ( \mathscr{D}_{2}+\mathscr{A}_{2})^{-1} \bigr\Vert _{2} \\ &\leq 1, \end{aligned} \\ &\begin{aligned} \Vert \hat{\mathscr{T}} \Vert _{2}&= \bigl\Vert (\mathscr{D}_{2}-\mathscr{B}_{1}) ( \mathscr{D}_{1}+\mathscr{B}_{1})^{-1}( \mathscr{D}_{1}-\mathscr{B}_{2}) ( \mathscr{D}_{2}+ \mathscr{B}_{2})^{-1} \bigr\Vert _{2} \\ &\leq \bigl\Vert (\mathscr{D}_{2}-\mathscr{B}_{1}) ( \mathscr{D}_{1}+ \mathscr{B}_{1})^{-1} \bigr\Vert _{2} \bigl\Vert (\mathscr{D}_{1}- \mathscr{B}_{2}) ( \mathscr{D}_{2}+\mathscr{B}_{2})^{-1} \bigr\Vert _{2} \\ &\leq 1. \end{aligned} \end{aligned} \end{aligned}$$
(24)

This completes the proof. □

Theorem 2

Consider problem (1) and assume that \(\mathscr{A}\) satisfies the assumptions above. If \(0<\alpha \leq 2\delta \) with \(\delta =\min \{\lambda _{\mathrm{min}}(H_{1}),\lambda _{\mathrm{min}}(H_{2})\}\), then the iterations (10) and (11) are convergent; that is, \(\rho (\mathscr{L})<1\) and \(\rho (\mathscr{T})<1\).

Proof

Firstly, we prove \(\rho (\mathscr{L})<1\). Since \(\mathscr{L}(\alpha )\) is similar to \(\hat{\mathscr{L}}\), \(\rho (\mathscr{L})=\rho ( \hat{\mathscr{{L}}})\). Let λ be an eigenvalue of \(\hat{\mathscr{{L}}}(\alpha )\) satisfying \(|\lambda |=\rho ( \hat{\mathscr{{L}}})\) and let x be a corresponding eigenvector with \(\|x\|_{2}=1\) (note that necessarily \(x\neq 0\)). Then \(\hat{\mathscr{{L}}}x=\lambda x\) and consequently,

$$ \begin{aligned}[b] \lambda &=x^{*}\hat{ \mathscr{L}}x=x^{*}(\mathscr{D}_{2}-\mathscr{A} _{1}) (\mathscr{D}_{1}+\mathscr{A}_{1})^{-1}( \mathscr{D}_{1}- \mathscr{A}_{2}) (\mathscr{D}_{2}+ \mathscr{A}_{2})^{-1}x \\ &=u^{*}v, \end{aligned} $$
(25)

where \(u=(\mathscr{D}_{1}+\mathscr{A}^{*}_{1})^{-1}(\mathscr{D}_{2}- \mathscr{A}^{*}_{1})x\) and \(v=(\mathscr{D}_{1}-\mathscr{A}_{2})( \mathscr{D}_{2}+\mathscr{A}_{2})^{-1}x\). Using the Cauchy–Schwarz inequality,

$$ \vert \lambda \vert ^{2}\leq u^{*}u\cdot v^{*}v. $$
(26)

Equality in (26) holds if and only if \(u=kv\) for some \(k\in \mathbb {C}\). Also, Lemma 3 yields

$$\begin{aligned}& \begin{aligned} &\begin{aligned} u^{*}u&=x^{*}(\mathscr{D}_{2}- \mathscr{A}_{1}) (\mathscr{D}_{1}+ \mathscr{A}_{1})^{-1} \bigl(\mathscr{D}_{1}+\mathscr{A}^{*}_{1} \bigr)^{-1}\bigl( \mathscr{D}_{2}-\mathscr{A}^{*}_{1} \bigr)x \\ &\leq \max_{ \Vert x \Vert _{2}=1}x^{*}(\mathscr{D}_{2}- \mathscr{A}_{1}) ( \mathscr{D}_{1}+\mathscr{A}_{1})^{-1} \bigl(\mathscr{D}_{1}+\mathscr{A}^{*} _{1} \bigr)^{-1}\bigl(\mathscr{D}_{2}-\mathscr{A}^{*}_{1} \bigr)x \\ &\leq \bigl\Vert (\mathscr{D}_{2}-\mathscr{A}_{1}) ( \mathscr{D}_{1}+ \mathscr{A}_{1})^{-1} \bigr\Vert _{2}^{2} \\ &\leq 1, \end{aligned} \\ &\begin{aligned} v^{*}v&=x^{*}\bigl( \mathscr{D}_{2}+\mathscr{A}^{*}_{2} \bigr)^{-1}\bigl(\mathscr{D} _{1}-\mathscr{A}^{*}_{2} \bigr) (\mathscr{D}_{1}-\mathscr{A}_{2}) ( \mathscr{D}_{2}+\mathscr{A}_{2})^{-1}x \\ &\leq \max_{ \Vert x \Vert _{2}=1}x^{*}\bigl( \mathscr{D}_{2}+\mathscr{A}^{*} _{2} \bigr)^{-1}\bigl(\mathscr{D}_{1}-\mathscr{A}^{*}_{2} \bigr) (\mathscr{D}_{1}- \mathscr{A}_{2}) ( \mathscr{D}_{2}+\mathscr{A}_{2})^{-1}x \\ &\leq \bigl\Vert (\mathscr{D}_{1}-\mathscr{A}_{2}) ( \mathscr{D}_{2}+ \mathscr{A}_{2})^{-1} \bigr\Vert _{2}^{2} \\ &\leq 1. \end{aligned} \end{aligned} \end{aligned}$$
(27)

As a result, if \(u\neq kv\), then it follows from (26) and (27) that

$$ \rho ^{2}\bigl[\mathscr{L}(\alpha )\bigr]=\rho ^{2}\bigl[\hat{\mathscr{{L}}}(\alpha )\bigr]= \vert \lambda \vert ^{2}< u^{*}u\cdot v^{*}v\leq 1; $$
(28)

if \(u=kv\) and \(u^{*}u\cdot v^{*}v<1\), then

$$ \rho ^{2}\bigl[\mathscr{L}(\alpha )\bigr]=\rho ^{2}\bigl[\hat{\mathscr{{L}}}(\alpha )\bigr]= \vert \lambda \vert ^{2}=u^{*}u\cdot v^{*}v< 1. $$
(29)

In what follows we will prove by contradiction that \(u=kv\) and \(u^{*}u\cdot v^{*}v=1\) do not hold simultaneously.

Assume that \(u=kv\) and \(u^{*}u\cdot v^{*}v=1\). Since \(u^{*}u\leq 1\) and \(v^{*}v\leq 1\), \(|k|=u^{*}u=v^{*}v=1\). Then it follows from (27) that

$$\begin{aligned}& \begin{aligned} &\begin{aligned} u^{*}u&=x^{*}(\mathscr{D}_{2}- \mathscr{A}_{1}) (\mathscr{D}_{1}+ \mathscr{A}_{1})^{-1} \bigl(\mathscr{D}_{1}+\mathscr{A}^{*}_{1} \bigr)^{-1}\bigl( \mathscr{D}_{2}-\mathscr{A}^{*}_{1} \bigr)x \\ &=\rho \bigl[(\mathscr{D}_{2}-\mathscr{A}_{1}) ( \mathscr{D}_{1}+ \mathscr{A}_{1})^{-1}\bigl( \mathscr{D}_{1}+\mathscr{A}^{*}_{1} \bigr)^{-1}\bigl( \mathscr{D}_{2}-\mathscr{A}^{*}_{1} \bigr)\bigr] \\ &= 1, \end{aligned} \\ &\begin{aligned} v^{*}v&=x^{*}\bigl( \mathscr{D}_{2}+\mathscr{A}^{*}_{2} \bigr)^{-1}\bigl(\mathscr{D} _{1}-\mathscr{A}^{*}_{2} \bigr) (\mathscr{D}_{1}-\mathscr{A}_{2}) ( \mathscr{D}_{2}+\mathscr{A}_{2})^{-1}x \\ &=\rho \bigl[\bigl(\mathscr{D}_{2}+\mathscr{A}^{*}_{2} \bigr)^{-1}\bigl(\mathscr{D}_{1}- \mathscr{A}^{*}_{2} \bigr) (\mathscr{D}_{1}-\mathscr{A}_{2}) ( \mathscr{D}_{2}+ \mathscr{A}_{2})^{-1}\bigr] \\ &= 1. \end{aligned} \end{aligned} \end{aligned}$$
(30)

Noting \(\|x\|_{2}=1\), (30) implies that x is an eigenvector of \((\mathscr{D}_{2}-\mathscr{A}_{1})(\mathscr{D}_{1}+\mathscr{A}_{1})^{-1}( \mathscr{D}_{1}+\mathscr{A}^{*}_{1})^{-1}(\mathscr{D}_{2}-\mathscr{A} ^{*}_{1})\) and of \((\mathscr{D}_{2}+\mathscr{A}^{*}_{2})^{-1}( \mathscr{D}_{1}-\mathscr{A}^{*}_{2})(\mathscr{D}_{1}-\mathscr{A}_{2})( \mathscr{D}_{2}+\mathscr{A}_{2})^{-1}\), both corresponding to the eigenvalue 1, i.e.,

$$ \begin{aligned} &(\mathscr{D}_{2}- \mathscr{A}_{1}) (\mathscr{D}_{1}+\mathscr{A}_{1})^{-1} \bigl( \mathscr{D}_{1}+\mathscr{A}^{*}_{1} \bigr)^{-1}\bigl(\mathscr{D}_{2}-\mathscr{A} ^{*}_{1}\bigr)x=x, \\ &\bigl(\mathscr{D}_{2}+\mathscr{A}^{*}_{2} \bigr)^{-1}\bigl(\mathscr{D}_{1}- \mathscr{A}^{*}_{2} \bigr) (\mathscr{D}_{1}-\mathscr{A}_{2}) ( \mathscr{D}_{2}+ \mathscr{A}_{2})^{-1}x=x. \end{aligned} $$
(31)

Since

$$ (\mathscr{D}_{2}-\mathscr{A}_{1}) ( \mathscr{D}_{1}+\mathscr{A}_{1})^{-1}\bigl( \mathscr{D}_{1}+\mathscr{A}^{*}_{1} \bigr)^{-1}\bigl(\mathscr{D}_{2}-\mathscr{A} ^{*}_{1}\bigr) = \begin{bmatrix} E & 0 & F^{T} \\ 0 & 0 & 0 \\ F & 0 & G \end{bmatrix}, $$
(32)

where E, F and G denote nonzero matrices, the former equation in (31) can be written as

$$ \begin{bmatrix} E & 0 & F^{T} \\ 0 & 0 & 0 \\ F & 0 & G \end{bmatrix} \begin{bmatrix} x_{1} \\ x_{2} \\ x_{3} \end{bmatrix}= \begin{bmatrix} x_{1} \\ x_{2} \\ x_{3} \end{bmatrix}, $$

which indicates \(x_{2}=0\). Therefore, \(x=[x^{*}_{1},0,x_{3}^{*}]^{*}\). Let \(y=(\mathscr{D}_{2}+\mathscr{A}_{2})^{-1}x\). Then \(x=(\mathscr{D} _{2}+\mathscr{A}_{2})y\). The latter equation in (31) becomes

$$ \bigl(\mathscr{D}_{1}-\mathscr{A}^{*}_{2} \bigr) (\mathscr{D}_{1}-\mathscr{A}_{2})y=\bigl( \mathscr{D}_{2}+\mathscr{A}^{*}_{2}\bigr) ( \mathscr{D}_{2}+\mathscr{A}_{2})y, $$
(33)

and consequently

$$ \bigl[\mathscr{D}_{2}^{2}- \mathscr{D}_{1}^{2}+\alpha \bigl(\mathscr{A}^{*}_{2}+ \mathscr{A}_{2}\bigr)\bigr]y=0, $$
(34)

that is,

$$ \begin{bmatrix} \alpha ^{2}I_{n_{1}} & 0 & 0 \\ 0 & \alpha (A^{*}_{2}+{A}_{2}-\alpha I_{n_{2}}) & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} y_{1} \\ y_{2} \\ y_{3} \end{bmatrix}=0, $$
(35)

which indicates \(y_{1}=0\). Therefore, \(y=[0,y_{2}^{*},y_{3}^{*}]^{*}\). Also, \(x=(\mathscr{D}_{2}+\mathscr{A}_{2})y\). Then

$$ \begin{bmatrix} x_{1} \\ 0 \\ x_{3} \end{bmatrix}= \begin{bmatrix} \alpha I_{n_{1}} & 0 & 0 \\ 0 & {A}_{2} & B_{2}^{T} \\ 0 & -B_{2} & \frac{\alpha }{2} I_{m} \end{bmatrix} \begin{bmatrix} 0 \\ y_{2} \\ y_{3} \end{bmatrix}= \begin{bmatrix} 0 \\ {A}_{2}y_{2}+B_{2}^{T}y_{3} \\ -B_{2}y_{2}+\frac{\alpha }{2} y_{3} \end{bmatrix}, $$
(36)

which shows that

$$ x_{1}=y_{1}=0,\qquad {A}_{2}y_{2}+B_{2}^{T}y_{3}=0, \quad \mbox{and}\quad x_{3}=-B_{2}y_{2}+ \frac{\alpha }{2} y_{3}. $$
(37)

Since \(u=kv\),

$$ \bigl(\mathscr{D}_{1}+\mathscr{A}^{*}_{1} \bigr)^{-1}\bigl(\mathscr{D}_{2}- \mathscr{A}^{*}_{1} \bigr)x=k(\mathscr{D}_{1}-\mathscr{A}_{2}) (\mathscr{D} _{2}+\mathscr{A}_{2})^{-1}x, $$
(38)

which can be written as

$$ \bigl(\mathscr{D}_{2}-\mathscr{A}^{*}_{1} \bigr) (\mathscr{D}_{2}+\mathscr{A}_{2})y=k\bigl( \mathscr{D}_{1}+\mathscr{A}^{*}_{1}\bigr) ( \mathscr{D}_{1}-\mathscr{A}_{2})y $$
(39)

for \(x=(\mathscr{D}_{2}+\mathscr{A}_{2})y\). Further, (39) becomes

$$ \bigl[\bigl(\mathscr{D}_{2}^{2}-k \mathscr{D}_{1}^{2}\bigr)+(\mathscr{D}_{2}+k \mathscr{D}_{1})\mathscr{A}_{2}-\mathscr{A}^{*}_{1}( \mathscr{D}_{2}+k \mathscr{D}_{1})-(1-k) \mathscr{A}^{*}_{1}\mathscr{A}_{2} \bigr]y=0, $$
(40)

i.e.,

$$ \begin{bmatrix} \alpha ^{2}I_{n_{1}}-\alpha A_{1}^{*} & (k-1)B_{1}^{T}B_{2} & \frac{(1+k) \alpha }{2}B_{1}^{T} \\ 0 & k\alpha (A_{2}-\alpha I_{n_{2}}) & k\alpha B_{2}^{T} \\ -\alpha B_{1} & -\frac{(1+k)\alpha }{2}B_{2} & \frac{(1-k)\alpha ^{2}}{4} I_{m} \end{bmatrix} \begin{bmatrix} 0 \\ y_{2} \\ y_{3} \end{bmatrix}= \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, $$
(41)

i.e.,

$$ \textstyle\begin{cases} (k-1)B_{1}^{T}B_{2}y_{2}+\frac{(1+k)\alpha }{2}B_{1}^{T}y_{3}=0, \\ k\alpha (A_{2}-\alpha I_{n_{2}})y_{2}+k\alpha B_{2}^{T}y_{3}=0, \\ -\frac{(1+k)\alpha }{2}B_{2}y_{2}+\frac{(1-k)\alpha ^{2}}{4}y_{3}=0. \end{cases} $$
(42)

Here, we assert \(k\neq 1\). Otherwise, assume \(k=1\). Then (25) shows \(\lambda =u^{*}v=v^{*}v=1\) since \(u=kv\) and \(v^{*}v=1\). Note that λ is an eigenvalue of \(\hat{\mathscr{L}}\), which is similar to \(\mathscr{L}(\alpha )\). Thus \(\hat{\mathscr{L}}\) and \(\mathscr{L}=[( \mathscr{D}_{1}+\mathscr{A}_{1})(\mathscr{D}_{2}+\mathscr{A}_{2})]^{-1}[( \mathscr{D}_{2}-\mathscr{A}_{1})(\mathscr{D}_{1} -\mathscr{A}_{2})]\) have the same eigenvalue, 1. Let w be an eigenvector of \(\mathscr{L}\) corresponding to the eigenvalue 1 (note that necessarily \(w\neq 0\)). One has

$$ \mathscr{L}w=\bigl[(\mathscr{D}_{1}+ \mathscr{A}_{1}) (\mathscr{D}_{2}+ \mathscr{A}_{2}) \bigr]^{-1}\bigl[(\mathscr{D}_{2}-\mathscr{A}_{1}) (\mathscr{D} _{1}-\mathscr{A}_{2})\bigr]w=w $$
(43)

and consequently

$$ \bigl[(\mathscr{D}_{2}-\mathscr{A}_{1}) (\mathscr{D}_{1} -\mathscr{A}_{2})-( \mathscr{D}_{1}+\mathscr{A}_{1}) (\mathscr{D}_{2}+ \mathscr{A}_{2})\bigr]w=- \alpha \mathscr{A}w=0. $$
(44)

Since \(\mathscr{A}\) is nonsingular, (44) yields \(w=0\), which contradicts that w is an eigenvector of \(\mathscr{L}(\alpha )\). Thus, \(k\neq 1\) and \(1-k\neq 0\). From the third equation in (42), one has

$$ y_{3}=\kappa B_{2}y_{2} $$
(45)

with \(\kappa :=\frac{2(1+k)}{\alpha (1-k)}\). Then it follows from the second equation in (37) that

$$ \mathscr{J}y_{2}=\bigl(A_{2}+\kappa B_{2}^{T}B_{2}\bigr)y_{2}=0, $$
(46)

where \(\mathscr{J}=A_{2}+\kappa B_{2}^{T}B_{2}\). Note \(|k|=1\) and \(k\neq 1\). Let \(k=\cos \theta +i\sin\theta \), where \(i=\sqrt{-1}\), \(\theta \in R\) and \(\theta \neq 2t\pi \) for any integer t. Then

$$ \kappa := \frac{2(1+k)}{\alpha (1-k)}= \frac{2[(1+\cos \theta )+i\sin\theta ]}{\alpha [(1-\cos \theta )-i\sin \theta ]}= \frac{2i}{\alpha }\cot \frac{\theta }{2} $$
(47)

is either purely imaginary or zero. As a result, \(\mathscr{J}^{*}+ \mathscr{J}=A_{2}^{T}+A_{2}\succ 0\) since \(A_{2}\) is positive definite. Thus, \(\mathscr{J}\) is positive definite and hence nonsingular. Equation (46) indicates \(y_{2}=0\) and thus (45) shows that \(y_{3}=0\). Then it follows from the third equation in (37) that \(x_{3}=0\). Therefore, \(x=[0,0,0]^{*}\), which contradicts the fact that x is an eigenvector of \(\hat{\mathscr{L}}(\alpha )\) with \(\|x\|_{2}=1\). Hence \(u=kv\) and \(u^{*}u\cdot v^{*}v=1\) cannot hold simultaneously. Therefore, \(\rho [\mathscr{L}(\alpha )]=| \lambda |<1\) and consequently, the iteration (10) converges.

By the same method, we can obtain \(\rho (\mathscr{T})<1\). Therefore, iterations (10) and (11) are both convergent. This completes the proof. □

4 A numerical example

A numerical example is given in this section to show that the proposed alternate direction iterative methods are very effective.

Example 1

Consider problem (1) and assume that \(\mathscr{A}\) has the form (2), where \(A_{1}=A_{2}=\operatorname{tri}(1,1,-1)\in R^{n \times n}\), \(B_{1}=B_{2}=I_{n}\), the \(n\times n\) identity matrix, and \(b=(1,1,\ldots ,1)^{T}\in R^{3n}\).
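A minimal sketch of how this test problem can be assembled is given below, assuming \(\operatorname{tri}(1,1,-1)\) denotes the tridiagonal matrix with subdiagonal 1, diagonal 1 and superdiagonal -1; the function name `example1` is ours.

```python
import numpy as np

def example1(n):
    """Assemble the coefficient matrix (2) and right-hand side of Example 1."""
    A1 = np.eye(n) + np.diag(np.ones(n - 1), -1) - np.diag(np.ones(n - 1), 1)
    A2 = A1.copy()
    B1 = B2 = np.eye(n)
    Z = np.zeros((n, n))
    cA = np.block([[A1,  Z,   B1.T],
                   [Z,   A2,  B2.T],
                   [-B1, -B2, Z]])
    b = np.ones(3 * n)
    return cA, b
```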

We conduct numerical experiments to compare the performance of the three alternate direction iterative schemes (5), (10) and (11) for problem (1). The former scheme (5), denoted Algorithm 1 (A1), was proposed by Benzi et al. in [12, 13], while the latter schemes (10) and (11), denoted Algorithm 2 (A2) and Algorithm 3 (A3), are proposed in this paper. The three algorithms were coded in Matlab, and all computations were performed on an HP dx7408 PC (Intel Core E4500 CPU, 2.2 GHz, 1 GB RAM) with Matlab 7.9 (R2009b).

The stopping criterion is defined as

$$ \mathrm{RE}=\frac{ \Vert x^{k+1}-x^{k} \Vert _{2}}{\max \{1, \Vert x^{k} \Vert _{2}\}}< 10^{-6}. $$
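In code the criterion amounts to the following helper (a sketch; `adi_sweep` stands for one sweep of (5), (10) or (11) as in the earlier illustrations, and the loop structure is an assumption of ours):

```python
import numpy as np

def relative_error(x_new, x_old):
    """RE = ||x^{k+1} - x^k||_2 / max(1, ||x^k||_2)."""
    return np.linalg.norm(x_new - x_old) / max(1.0, np.linalg.norm(x_old))

def run(adi_sweep, x0, tol=1e-6, max_iter=10000):
    """Iterate until RE < tol or the iteration budget is exhausted."""
    x_old = x0
    for k in range(max_iter):
        x_new = adi_sweep(x_old)
        if relative_error(x_new, x_old) < tol:
            return x_new, k + 1
        x_old = x_new
    return x_old, max_iter
```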

Numerical results are presented in Table 1. In particular, Fig. 1 reports how the RE of A1, A2 and A3 changes as the iteration number increases when \(n=1000\).

Figure 1

When \(n=1000\), the change of RE of A1, A2 and A3 as the iteration number increases

Table 1 Performance of A1, A2, and A3 with different n

From Table 1, we can make the following observations: (i) A2 (i.e., Algorithm 2) generally requires far fewer iterations than A1 and A3 (Algorithm 1 and Algorithm 3) when \(n=500\), \(n=1000\) and \(n=1500\); (ii) A3 requires much less computing time than A2 and A1. Thus, both A2 and A3 are generally superior to A1 in terms of iteration number and computing time, and the proposed methods are more effective and efficient than the existing method.

Figure 1 shows that, when \(n=1000\), the RE generated by A3 converges to 0 quickly as the iteration number increases, while Table 1 shows that A2 is superior to A1 and A3 in terms of iteration number.

5 Conclusions

In this paper we have proposed two alternate direction iterative methods for generalized saddle-point systems based on two splittings of the generalized saddle-point matrix, and established convergence theorems for these two iterative methods. Finally, we presented a numerical example demonstrating that the proposed alternate direction iterative methods are superior to the existing one.