1 Introduction

Consider the following non-symmetric saddle point linear system

$$\begin{aligned} {{{\mathscr {A}}}}{{{\mathscr {U}}}}= \left[ \begin{array}{cc} A&{}B\\ -B^\mathrm{T}&{} 0 \end{array}\right] \left[ \begin{array}{c} x\\ y \end{array}\right] =\left[ \begin{array}{c} f\\ -g \end{array}\right] = b, \end{aligned}$$
(1)

where \(A\in {\mathbb {R}}^{n\times n}\) is positive definite (symmetric or non-symmetric); \(B\in {\mathbb {R}}^{n\times m}\ (m\le n)\) is a rectangular matrix of rank \(r\le m\); and \(f\in {\mathbb {R}}^{n}\) and \(g\in {\mathbb {R}}^{m}\) are given vectors.

In general, the matrices A and B in \({{\mathscr {A}}}\) are large and sparse. System (1) is important and arises in a variety of scientific and engineering applications, such as computational fluid dynamics, constrained optimization, and mixed or hybrid finite element approximations of second-order elliptic problems; see [1, 7, 15].

In recent years, many studies have focused on solving large linear systems in saddle point form. Iterative methods are used to solve the saddle point problem (1) when the blocks A and B are large and sparse. Methods such as the Uzawa method [7], the inexact Uzawa method [16] and the Hermitian and skew-Hermitian splitting method [2, 18, 21] have been presented. These methods use much less memory than Krylov subspace methods, but Krylov subspace methods are generally more efficient. Unfortunately, when applied directly to the saddle point problem (1), Krylov subspace methods converge very slowly, and they require good preconditioners to accelerate convergence.

Different preconditioners based on matrix splittings of the (1, 1)-block A have been proposed. For example, Bai and Zhang [6] proposed a regularized conjugate gradient method for symmetric positive definite systems of linear equations by shifting the coefficient matrix. A shift-splitting preconditioner was presented by Bai et al. [5] for non-Hermitian positive definite systems of linear equations to accelerate the convergence of Krylov subspace methods. Cao et al. applied the shift-splitting preconditioner and a local shift-splitting preconditioner to symmetric saddle point problems and extended them to a generalized shift-splitting preconditioner for non-symmetric saddle point problems [10, 13]. Also, Shen et al. used generalized shift-splitting preconditioners for solving nonsingular and singular generalized saddle point problems [23].

Moreover, the semi-convergence of the shift-splitting iteration method and the spectral analysis of the shift-splitting preconditioned saddle point matrix have been studied by Cao et al. [11] and Ren et al. [22], respectively. Cao et al. used the generalized shift-splitting matrix as a preconditioner and analyzed the eigenvalue distribution of the preconditioned saddle point matrix [12]. Zhou et al. [26] and Huang et al. [17], respectively, proposed the modified shift-splitting (MSS) and generalized modified shift-splitting (GMSS) preconditioners for solving non-Hermitian saddle point problems. They used the symmetric and skew-symmetric splitting of the (1, 1)-block A to construct these preconditioners. In addition, Dou et al. [14] presented the fast shift-splitting (FSS) preconditioner for non-symmetric saddle point problems. Recently, a general class of shift-splitting (GCSS) preconditioners has been proposed by Cao [9] for non-Hermitian saddle point problems arising from time-harmonic eddy current problems.

In this paper, we work on the saddle point problems (1) in which the (1, 1)-block A has a dominant positive definite part, i.e., we can split A as

$$\begin{aligned} A=P+S, \end{aligned}$$
(2)

where P is a positive definite matrix, S is a skew-symmetric matrix, and \(\Vert P\Vert \gg \Vert S\Vert \) in some matrix norm \(\Vert \cdot \Vert \); see [3]. We present new modified shift-splitting (NMSS) preconditioners for this type of saddle point problem (1). The convergence of the iterative method produced by these preconditioners is investigated. We apply these preconditioners to both singular and nonsingular saddle point problems (1). Also, we study the eigenvalue distribution of the NMSS-preconditioned matrix. Finally, practical numerical examples are presented to show the effectiveness of the NMSS preconditioners.

2 New modified shift-splitting method

Assume that \(A=P+S\) is the splitting of the (1, 1)-block A of the coefficient matrix \({{\mathscr {A}}}\) in (1), where P is a positive definite matrix and S is a skew-symmetric matrix. In this study, we choose \(P=L+D+U^\mathrm{T}\) and \(S=U-U^\mathrm{T}\) as the positive definite and skew-symmetric splitting of A, where D is the diagonal part of A, and L and U are the strictly lower and strictly upper triangular parts of A, respectively. Let

$$\begin{aligned} {{{\mathscr {A}}}=\,{{\mathscr {M}}}-{{\mathscr {N}}}=\,} \frac{1}{2} \left[ \begin{array}{cc}\alpha I+2P&{}B\\ -B^\mathrm{T}&{}\beta I\end{array}\right] - \frac{1}{2} \left[ \begin{array}{cc}\alpha I-2S&{}-B\\ B^\mathrm{T}&{}\beta I\end{array}\right] , \end{aligned}$$
(3)

where \(\alpha ,\ \beta >0\) are two constants, and I is the unit matrix with appropriate dimension. This splitting gives the following new modified shift-splitting (NMSS) iteration method for saddle point problem (1).

2.1 NMSS iteration method

Given an initial guess \({u^{(0)}}^\mathrm{T}=\left( {x^{(0)}}^\mathrm{T},\ {y^{(0)}}^\mathrm{T} \right) \),

for \(k=0,1,2,\ldots \) until convergence, compute \({u^{(k+1)}}^\mathrm{T}=\left( {x^{(k+1)}}^\mathrm{T},\ {y^{(k+1)}}^\mathrm{T} \right) \) as follows:

$$\begin{aligned} {{{\mathscr {M}}}}u^{(k+1)}={{{\mathscr {N}}}}u^{(k)}+b, \end{aligned}$$
$$\begin{aligned} \displaystyle \frac{1}{2} \left[ \begin{array}{cc}\alpha I+2P&{}B\\ -B^\mathrm{T}&{}\beta I\end{array}\right] \left[ \begin{array}{c} x^{(k+1)}\\ y^{(k+1)}\end{array} \right] = \displaystyle \frac{1}{2} \left[ \begin{array}{cc}\alpha I-2S&{}-B\\ B^\mathrm{T}&{}\beta I\end{array}\right] \left[ \begin{array}{c} x^{(k)}\\ y^{(k)}\end{array} \right] + \left[ \begin{array}{c} f\\ -g\end{array} \right] . \end{aligned}$$
(4)

Consequently, the NMSS iteration method can be expressed as

$$\begin{aligned} \left[ \begin{array}{c} x^{(k+1)}\\ y^{(k+1)}\end{array} \right] ={{{\mathscr {M}}}}^{-1}{{{\mathscr {N}}}}\left[ \begin{array}{c} x^{(k)}\\ y^{(k)}\end{array} \right] +{{{\mathscr {M}}}}^{-1} \left[ \begin{array}{c} f\\ -g\end{array} \right] ={\varGamma } \left[ \begin{array}{c} x^{(k)}\\ y^{(k)}\end{array} \right] +d. \end{aligned}$$
(5)
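
For concreteness, the following minimal sketch (in Python with NumPy/SciPy; not part of the original paper, and the function name `nmss_splitting` is ours) assembles the two factors \({{{\mathscr {M}}}}\) and \({{{\mathscr {N}}}}\) of the splitting (3) from given blocks A and B, using the triangular-part choice of P and S described above:

```python
import scipy.sparse as sp

def nmss_splitting(A, B, alpha, beta):
    """Return the factors (M, N) of the NMSS splitting (3).

    A is split as A = P + S with P = L + D + U^T and S = U - U^T, where D, L, U
    are the diagonal, strictly lower and strictly upper triangular parts of A.
    """
    A = sp.csr_matrix(A)
    B = sp.csr_matrix(B)
    n, m = B.shape

    D = sp.diags(A.diagonal())
    L = sp.tril(A, k=-1)
    U = sp.triu(A, k=1)
    P = L + D + U.T                     # positive definite part of A
    S = U - U.T                         # skew-symmetric part of A

    In, Im = sp.identity(n), sp.identity(m)
    M = 0.5 * sp.bmat([[alpha * In + 2 * P, B],
                       [-B.T, beta * Im]], format="csr")
    N = 0.5 * sp.bmat([[alpha * In - 2 * S, -B],
                       [B.T, beta * Im]], format="csr")
    return M, N
```

Since \({{{\mathscr {M}}}}-{{{\mathscr {N}}}}\) must reproduce the saddle point matrix \({{{\mathscr {A}}}}\), comparing M - N with the assembled coefficient matrix of (1) provides a quick consistency check.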

In the NMSS iteration method, or when using \({{{\mathscr {M}}}} \) as a preconditioner for Krylov subspace methods, we need to solve a system of linear equations of the form \({{{\mathscr {M}}}}z=r\). Let \(r^\mathrm{T}=(r_1^\mathrm{T},r_2^\mathrm{T})\) and \(z^\mathrm{T}=(z_1^\mathrm{T},z_2^\mathrm{T})\), where \(r_1,z_1\in {\mathbb {R}}^n\) and \(r_2,z_2\in {\mathbb {R}}^m\). Then

$$\begin{aligned} \displaystyle \frac{1}{2} \left[ \begin{array}{cc}\alpha I+2P&{}B\\ -B^\mathrm{T}&{}\beta I\end{array}\right] \left[ \begin{array}{c} z_1\\ z_2\end{array} \right] =\left[ \begin{array}{c} r_1\\ r_2\end{array} \right] . \end{aligned}$$
(6)

An easy computation shows that (6) is equivalent to the following equations:

$$\begin{aligned} \left( \alpha I+2P+\displaystyle \frac{1}{\beta }BB^\mathrm{T}\right) z_1&= 2r_1-\displaystyle \frac{2}{\beta }B r_2,\nonumber \\ z_2&= \displaystyle \frac{1}{\beta }\left( 2r_2+B^\mathrm{T} z_1\right) . \end{aligned}$$
(7)

The approximate solution of the linear system (7) can be obtained by the conjugate gradient method (for symmetric P) or by a Lanczos-type method (for non-symmetric P). Alternatively, linear system (7) can be solved by a direct method.
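
As an illustration of how (7) is used in practice, the following sketch (Python/SciPy, with an assumed helper name) applies \({{{\mathscr {M}}}}^{-1}\) to a vector r by a sparse direct solve of the reduced system; for large problems this direct solve could be replaced by CG (symmetric P) or a Lanczos-type iteration, as mentioned above.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def apply_nmss_preconditioner(P, B, alpha, beta, r):
    """Solve M z = r, with M as in (3), via the reduced system (7)."""
    n, m = B.shape
    r1, r2 = r[:n], r[n:]

    # Coefficient matrix and right-hand side of the z_1 equation in (7).
    G = sp.csc_matrix(alpha * sp.identity(n) + 2 * P + (1.0 / beta) * (B @ B.T))
    rhs = 2 * r1 - (2.0 / beta) * (B @ r2)

    z1 = spla.spsolve(G, rhs)           # sparse direct solve; CG (symmetric P) or a
    z2 = (2 * r2 + B.T @ z1) / beta     # Lanczos-type solver could be used instead
    return np.concatenate([z1, z2])
```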

3 Convergence of NMSS iteration method

In this section, we investigate the convergence behavior of the NMSS method when the saddle point system (1) is nonsingular. As is well known, the NMSS method is convergent if and only if \(\rho (\varGamma )<1\), where \(\rho (\cdot )\) denotes the spectral radius.

Let us assume that \(\lambda \) is an eigenvalue of the iteration matrix \(\varGamma \) of the NMSS method and \(u=[x^\mathrm{T},y^\mathrm{T}]^\mathrm{T}\) is the corresponding eigenvector, then we have

$$\begin{aligned} \varGamma u=\lambda u\ \Longleftrightarrow \ {{{\mathscr {M}}}}^{-1}{{{\mathscr {N}}}}u=\lambda u\ \Longleftrightarrow \ {{{\mathscr {N}}}}u=\lambda {{{\mathscr {M}}}}u, \end{aligned}$$
(8)

which is equivalent to

$$\begin{aligned} \left[ \begin{array}{cc} \alpha I-2S &{} -B\\ B^\mathrm{T}&{}\beta I \end{array} \right] \left[ \begin{array}{c}x\\ y\end{array} \right] =\lambda \left[ \begin{array}{cc} \alpha I+2P &{} B\\ -B^\mathrm{T}&{}\beta I \end{array}\right] \left[ \begin{array}{c}x\\ y\end{array} \right] . \end{aligned}$$
(9)

We can write (9) as follows:

$$\begin{aligned} (\alpha I-2S)x -By&=\lambda (\alpha I +2P)x + \lambda By, \end{aligned}$$
(10)
$$\begin{aligned} B^\mathrm{T}x+\beta y&= -\lambda B^\mathrm{T} x + \lambda \beta y. \end{aligned}$$
(11)

If \(\lambda =1\) is substituted in (8), we obtain \({{{\mathscr {A}}}}u=0\), which contradicts the nonsingularity of \({{{\mathscr {A}}}}\). Next, suppose that \(x=0\). From (11) and \(\lambda \ne 1\), we conclude that \(y=0\), which is impossible since \(u\ne 0\). Hence \(x\ne 0\), and the next lemma follows immediately.

Lemma 3.1

Let A be positive definite, B be of full column rank and \(\alpha ,\beta >0\) be given constants. If \(\lambda \) is an eigenvalue of the iteration matrix \(\varGamma \) and \(u=(x^\mathrm{T},y^\mathrm{T})^\mathrm{T}\) is the eigenvector of \(\ \varGamma \) corresponding to \(\lambda \), then \(\lambda \ne 1\) and \(x\ne 0\).

Lemma 3.2

[20] Both roots of the complex equation \(\lambda ^2-\phi \lambda +\psi =0\) are less than one in modulus if and only if \(|\phi -{\bar{\phi }}\psi |+|\psi |^2<1\), where \({\bar{\phi }}\) denotes the conjugate complex of \(\phi \) .

Theorem 3.3

Let \(A\in {\mathbb {R}}^{n\times n}\) be positive definite, \(B\in {\mathbb {R}}^{n\times m}\) be of full column rank and \(\alpha ,\ \beta >0\) be given constants. If \(\lambda \) is an eigenvalue of the iteration matrix \({\varGamma }\) and \(u=[x^\mathrm{T},y^\mathrm{T}]^\mathrm{T}\) is the eigenvector of \(\ \varGamma \) corresponding to \(\lambda \), then the NMSS iteration method converges to the unique solution of problem (1) if and only if the parameters \(\alpha \) and \(\beta \) satisfy the following conditions:

  1. If \(c=0\), then

     $$\begin{aligned} \alpha >\displaystyle \frac{b^2-|a|^2}{a_1}; \end{aligned}$$
     (12)

  2. If \(c\ne 0\), then

     $$\begin{aligned} \beta a_1(\alpha a_1+|a|^2-b^2)-c(a_2-b)^2>0, \end{aligned}$$
     (13)

where

$$\begin{aligned} a=a_1+i a_2=\frac{x^*Px}{x^*x},(a_1>0),\ ib=\frac{x^*Sx}{x^*x}\ \text {and}\ c=\frac{x^*BB^\mathrm{T}x}{x^*x} . \end{aligned}$$

Proof

By Lemma 3.1, we know that \(\lambda \ne 1\) . Moreover, we can obtain from (11) that

$$\begin{aligned} y=\frac{(1+\lambda )B^\mathrm{T}x}{\beta (\lambda -1)}. \end{aligned}$$
(14)

By substituting (14) in (10), we have

$$\begin{aligned} \alpha \beta (\lambda -1)^2 x+2\beta (\lambda -1)Sx+2\beta \lambda (\lambda -1)Px+(1+\lambda )^2BB^\mathrm{T}x=0. \end{aligned}$$
(15)

Since \(x\ne 0\), multiplying (15) on the left by \(\displaystyle \frac{x^*}{x^*x}\), we obtain

$$\begin{aligned} \alpha \beta (\lambda -1)^2 +2\beta (\lambda -1)\frac{x^*Sx}{x^*x}+2\beta \lambda (\lambda -1)\frac{x^*Px}{x^*x}+(1+\lambda )^2\frac{x^*BB^\mathrm{T}x}{x^*x}=0. \end{aligned}$$
(16)

Let

$$\begin{aligned} a=a_1+i a_2=\frac{x^*Px}{x^*x},\ ib=\frac{x^*Sx}{x^*x},\ c=\frac{x^*BB^\mathrm{T}x}{x^*x}. \end{aligned}$$

Then (16) is simplified to

$$\begin{aligned} \alpha \beta (\lambda -1)^2 +2\beta (\lambda -1)ib+2\beta \lambda (\lambda -1)a+(1+\lambda )^2c=0. \end{aligned}$$
(17)
  1. If \(c=0\) (i.e., \(B^\mathrm{T}x=0\)), then (17) is reduced to

    $$\begin{aligned} \alpha (\lambda -1) +2ib+2\lambda a=0, \end{aligned}$$

    which gives

    $$\begin{aligned} \lambda =\frac{ \alpha -2ib}{\alpha +2a}. \end{aligned}$$

    Thus, \(|\lambda |<1\) if and only if

    $$\begin{aligned} \alpha >\frac{b^2-|a|^2}{a_1}. \end{aligned}$$
    (18)
  2. If \(c\ne 0\) (i.e., \(B^\mathrm{T}x\ne 0\)), then by rearranging (17) in terms of \(\lambda \), we obtain the following quadratic equation:

    $$\begin{aligned} (\alpha \beta +2\beta a+c)\lambda ^{2} +(-2\alpha \beta +2i\beta b-2\beta a+2c)\lambda +(\alpha \beta -2i\beta b+c)=0. \end{aligned}$$
    (19)

    We divide (19) by \((\alpha \beta +2\beta a+c)\ne 0\), then

    $$\begin{aligned} \lambda ^{2} -\phi \lambda +\psi =0 , \end{aligned}$$
    (20)

    where

    $$\begin{aligned} \phi =2\frac{\beta (\alpha +a_{1} +ia_{2} )-i\beta b-c}{\beta (\alpha +2a_{1} +2ia_{2} )+c}\quad \text {and} \quad \psi =\frac{\beta (\alpha -2ib)+c}{\beta (\alpha +2a_{1} +2ia_{2} )+c} . \end{aligned}$$

By Lemma 3.2, the roots of equation (20) satisfy \(\left| \lambda \right| <1\) if and only if \(\left| \phi -\bar{\phi }\psi \right| +\left| \psi \right| ^{2} <1\). Some computations show that this condition is equivalent to

$$\begin{aligned} \beta a_{1} (\alpha a_{1} +\left| a\right| ^{2}-b^{2} )-c(a_{2} -b)^{2} >0 . \end{aligned}$$
(21)

Thus, if the condition (21) holds, then the NMSS iteration method must be convergent. \(\square \)

Remark 3.4

In Theorem 3.3, if the (1, 1)-block A has a dominant positive definite part, then \(|a|\gg |b|\), so (12) holds for all \(\alpha >0\). On the other hand, there is no restriction on \(\beta \) beyond positivity. Therefore, in this case, the iteration method converges unconditionally.
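
As a quick numerical illustration of Remark 3.4 (a small dense sketch under our own assumptions, not an experiment from the paper), one can build a matrix A whose positive definite part dominates its skew-symmetric part and verify that \(\rho (\varGamma )=\rho ({{{\mathscr {M}}}}^{-1}{{{\mathscr {N}}}})<1\):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 50, 20
alpha, beta = 0.1, 0.1

# One admissible instance of the splitting (2): an SPD matrix P plus a small
# skew-symmetric perturbation S, so that ||P|| >> ||S|| and A = P + S.
Q = rng.standard_normal((n, n))
P = Q @ Q.T + n * np.eye(n)
W = rng.standard_normal((n, n))
S = 0.01 * (W - W.T)
B = rng.standard_normal((n, m))        # full column rank (generically)

M = 0.5 * np.block([[alpha * np.eye(n) + 2 * P, B],
                    [-B.T, beta * np.eye(m)]])
N = 0.5 * np.block([[alpha * np.eye(n) - 2 * S, -B],
                    [B.T, beta * np.eye(m)]])
Gamma = np.linalg.solve(M, N)          # iteration matrix of (5)
print("rho(Gamma) =", max(abs(np.linalg.eigvals(Gamma))))   # expected to be < 1
```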

4 Semi-convergence of the NMSS iteration method for singular saddle point problems

Let B in (1) be rank deficient, i.e., \(\mathrm{rank}(B)= r\ (r< m)\). Then \( {{{\mathscr {A}}}} \) is singular, and we study the semi-convergence of the NMSS iteration method for solving the singular saddle point problem (1). According to [8], the iteration scheme (5) is semi-convergent if and only if the following two conditions are met.

  (i) Elementary divisors associated with \(\lambda = 1 \in \sigma (\varGamma )\) are linear, i.e., \(\mathrm{rank}\; (I-\varGamma )^{2} =\mathrm{rank}\; (I-\varGamma )\), or equivalently, \( \mathrm{index}(I - \varGamma ) = 1\).

  (ii) If \(\lambda \in \sigma (\varGamma )\) with \(|\lambda |=1\), then \(\lambda =1\), i.e., \(\nu (\varGamma )<1\), where \(\sigma (\varGamma )\) denotes the spectrum of \(\varGamma \) and \(\nu (\varGamma ) =\max \left\{ |\lambda |:\lambda \in \sigma (\varGamma ),\lambda \ne 1\right\} \) is the pseudo-spectral radius of \(\varGamma \).

For the first condition of semi-convergence, the following theorem is presented. It can be proved in the same way as Theorem 4.1 in [14].

Theorem 4.1

Let \(A\in {\mathbb {R}} ^{n\times n} \) be positive definite and \(B\in {\mathbb {R}} ^{n\times m} \;\) be rank deficient. Suppose that \(\alpha \, ,\ \beta \, >0\) and \(\varGamma \) is the iteration matrix of the NMSS iteration method. Then

$$\begin{aligned} \mathrm{rank}\, (I-\varGamma )^{2} =\mathrm{rank}\, (I-\varGamma ). \end{aligned}$$
(22)

In what follows, the second condition of semi-convergence is studied. Let \(B=U\; \left[ \begin{array}{cc} {B_{r} }&{0} \end{array}\right] \; V^\mathrm{T} \) be the singular value decomposition of B, where

$$\begin{aligned} B_{r} =\left[ \begin{array}{c} {\varSigma _{r} } \\ {0} \end{array}\right] \quad \text {and}\quad \varSigma _{r} =\mathrm{diag}\, (\sigma _{1} ,\sigma _{2},\ldots ,\sigma _{r} )\in {\mathbb {R}} ^{r\times r}, \end{aligned}$$

with \(U\in {\mathbb {R}} ^{n\times n} \; \text {and}\; V\in {\mathbb {R}} ^{m\times m} \) being two orthogonal matrices and \(\sigma _{i} \; \left( i=1,\ldots ,r\right) \) being the nonzero singular values of B.

We define

$$\begin{aligned} {\tilde{P}} =U^\mathrm{T} P\, U\quad \text {and}\quad {\tilde{S}} =U^\mathrm{T} S\, U \end{aligned}$$

and consider the block diagonal matrix

$$\begin{aligned} Q=\left[ \begin{array}{cc} {U} &{} {0} \\ {0} &{} {V} \end{array}\right] , \end{aligned}$$

which is an \(\left( n+m\right) \times \left( n+m\right) \) orthogonal matrix. The iteration matrix \(\varGamma \) is similar to \(\hat{\varGamma } =Q ^\mathrm{T}\varGamma Q\) and hence has the same spectrum as \({\hat{\varGamma }}\). We now write \({\hat{\varGamma }}\) in a block form that reveals the structure of its eigenvalues:

$$\begin{aligned} {\hat{\varGamma }} =Q ^\mathrm{T}\varGamma Q =Q^\mathrm{T} {{{\mathscr {M}}}}^{-1}{{{\mathscr {N}}}} Q =Q ^\mathrm{T}{{{\mathscr {M}}}}^{-1} Q\; Q ^\mathrm{T}{{{\mathscr {N}}}} Q =(Q^\mathrm{T}{{{\mathscr {M}}}} Q)^{-1} \; (Q^\mathrm{T}{{{\mathscr {N}}}} Q), \end{aligned}$$

i.e.,

$$\begin{aligned} \begin{array}{ll} {\hat{\varGamma }}&{}= \left[ \begin{array}{ccc} {\left( \begin{array}{cc} {U }^\mathrm{T} &{} {0} \\ {0} &{} {V^\mathrm{T} } \end{array}\right) } &{} {\left( \begin{array}{cc} {\alpha I+2P} &{} {B} \\ {-B^\mathrm{T} } &{} {\beta I} \end{array}\right) } &{} {\left( \begin{array}{cc}{U } &{} {0} \\ {0} &{} V \end{array}\right) } \end{array}\right] ^{-1} \left[ \begin{array}{ccc} {\left( \begin{array}{cc} {U }^\mathrm{T} &{} {0} \\ {0} &{} {V^\mathrm{T} } \end{array}\right) } &{} {\left( \begin{array}{cc} {\alpha I-2S} &{} {-B} \\ {B^\mathrm{T} } &{} {\beta I} \end{array}\right) } &{} \left( \begin{array}{cc}{U } &{} {0} \\ {0} &{} V \end{array}\right) \end{array}\right] \\ &{} =\left( \begin{array}{cc} \alpha I+2U ^\mathrm{T}PU &{} U^\mathrm{T} BV \\ -V^\mathrm{T} B^\mathrm{T} U &{} \beta I \end{array}\right) ^{-1} \left( \begin{array}{cc} \alpha I-2U^\mathrm{T} SU &{} -U^\mathrm{T} BV \\ V^\mathrm{T} B^\mathrm{T} U &{} \beta I \end{array}\right) \\ &{}=\left( \begin{array}{ccc} \alpha I+2{\tilde{P}} &{} B_{r} &{} 0 \\ -B_{r}^\mathrm{T} &{} \beta I &{} 0 \\ 0 &{} 0 &{} \beta I \end{array}\right) ^{-1} \left( \begin{array}{ccc} \alpha I-2{\tilde{S}} &{} -B_{r} &{} 0 \\ B_{r}^\mathrm{T} &{} \beta I &{} 0 \\ 0 &{} 0 &{} \beta I \end{array}\right) \\ &{}=\left[ \begin{array}{cc} {\left( \begin{array}{cc} {\alpha I+2{\tilde{P}} } &{} {B_{r} } \\ {-B_r^\mathrm{T} } &{} {\beta I} \end{array}\right) ^{-1} \left( \begin{array}{cc} {\alpha I-2{\tilde{S}} } &{} {-B_{r} } \\ {B_{r}^\mathrm{T} } &{} {\beta I} \end{array}\right) } &{} {0} \\ {0} &{} {I} \end{array}\right] . \end{array} \end{aligned}$$

Let \({\tilde{\varGamma }} =\left( \begin{array}{cc} {\alpha I+2{\tilde{P}} } &{} {B_{r} } \\ {-B_r^\mathrm{T} } &{} {\beta I} \end{array}\right) ^{-1} \left( \begin{array}{cc} {\alpha I-2{\tilde{S}} } &{} {-B_{r} } \\ {B_{r}^\mathrm{T} } &{} {\beta I} \end{array}\right) .\) Then \({\hat{\varGamma }} =\left( \begin{array}{cc} {{\tilde{\varGamma }} } &{} {0} \\ {0} &{} {I} \end{array}\right) .\) Matrix \({\tilde{\varGamma }} \) can be viewed as the iteration matrix of the NMSS iteration method applied to

$$\begin{aligned} \left( \begin{array}{cc} {{\tilde{A}}} &{} {B_{r} } \\ {-B_{r}^\mathrm{T} } &{} {0} \end{array}\right) \; \left( \begin{array}{c} {{\tilde{x}}} \\ {{\tilde{y}}} \end{array}\right) =\left( \begin{array}{c} {{\tilde{f}}} \\ -\tilde{g} \end{array}\right) . \end{aligned}$$
(23)

Because \({\tilde{A}}=U^\mathrm{T}AU\) is positive definite and \(B_r\) has full column rank, system (23) is nonsingular. Let \({\tilde{u}}=\left( {\tilde{x}}^\mathrm{T},{\tilde{y}}^\mathrm{T}\right) ^\mathrm{T}\) be an eigenvector of \({\tilde{\varGamma }}\). Then the conditions in Theorem 3.3 can be expressed for the new nonsingular system (23) with iteration matrix \({\tilde{\varGamma }}\) as follows:

when \({\tilde{c}}=0\): \(\alpha >\displaystyle \frac{{\tilde{b}}^2-|{\tilde{a}}|^2}{{\tilde{a}}_1}\); and when \({\tilde{c}}\ne 0\): \(\beta {\tilde{a}}_1(\alpha {\tilde{a}}_1+|{\tilde{a}}|^2-{\tilde{b}}^2)-{\tilde{c}}({\tilde{a}}_2-{\tilde{b}})^2>0\),

where

$$\begin{aligned} {\tilde{a}}={\tilde{a}}_1+i {\tilde{a}}_2=\frac{{\tilde{x}}^*{\tilde{P}}{\tilde{x}}}{{\tilde{x}}^*{\tilde{x}}},({\tilde{a}}_1>0),\ i{\tilde{b}}=\frac{{\tilde{x}}^*{\tilde{S}} {\tilde{x}}}{{\tilde{x}}^*{\tilde{x}}}\ \text {and}\ {\tilde{c}}=\frac{{\tilde{x}}^*B_rB_r^\mathrm{T}{\tilde{x}}}{{\tilde{x}}^*{\tilde{x}}}. \end{aligned}$$

Then, under the above conditions, \(\rho ({{\tilde{\varGamma }}})<1\) and the second condition of semi-convergence is satisfied. These results are summarized in the following theorem.

Theorem 4.2

Let \(A\in {\mathbb {R}} ^{n\times n} \) be positive definite and \(B\in {\mathbb {R}} ^{n\times m}\) be rank deficient. Assume that \(\alpha \, ,\ \beta \, >0\) and \(\varGamma \) is the iteration matrix of the NMSS iteration method. Then \(\nu (\varGamma )<1\) if and only if the following conditions are satisfied:

  1. If \({\tilde{c}}=0\), then

     $$\begin{aligned} \alpha >\displaystyle \frac{{\tilde{b}}^2-|{\tilde{a}}|^2}{{\tilde{a}}_1}. \end{aligned}$$
     (24)

  2. If \({\tilde{c}}\ne 0\), then

     $$\begin{aligned} \beta {\tilde{a}}_1(\alpha {\tilde{a}}_1+|{\tilde{a}}|^2-{\tilde{b}}^2)-{\tilde{c}}({\tilde{a}}_2-{\tilde{b}})^2>0. \end{aligned}$$
     (25)

Using Theorems 4.1 and 4.2, we conclude the semi-convergence of the NMSS iteration method for singular saddle point problem (1).
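
Both semi-convergence conditions can be checked numerically on a small singular example; the sketch below (dense, with the same illustrative choice of P and S as the snippet in Sect. 3, not an experiment from the paper) estimates the pseudo-spectral radius \(\nu (\varGamma )\) and the rank condition of Theorem 4.1:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 40, 12
alpha, beta = 0.1, 0.1

# Small singular example: B is made rank deficient by duplicating a column.
Q = rng.standard_normal((n, n))
P = Q @ Q.T + n * np.eye(n)            # dominant positive definite part
W = rng.standard_normal((n, n))
S = 0.01 * (W - W.T)                   # small skew-symmetric part
B = rng.standard_normal((n, m))
B[:, -1] = B[:, 0]                     # rank(B) = m - 1 < m

M = 0.5 * np.block([[alpha * np.eye(n) + 2 * P, B],
                    [-B.T, beta * np.eye(m)]])
N = 0.5 * np.block([[alpha * np.eye(n) - 2 * S, -B],
                    [B.T, beta * np.eye(m)]])
Gamma = np.linalg.solve(M, N)

lam = np.linalg.eigvals(Gamma)
nu = max(abs(lam[np.abs(lam - 1.0) > 1e-8]))          # pseudo-spectral radius nu(Gamma)
E = np.eye(n + m) - Gamma
index_ok = np.linalg.matrix_rank(E @ E) == np.linalg.matrix_rank(E)
print("nu(Gamma) =", nu, " rank(I-Gamma)^2 == rank(I-Gamma):", index_ok)
```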

5 Preconditioning properties

In the preceding sections, we studied the convergence and semi-convergence of the NMSS method as an iteration method. However, as with other shift-splitting methods, we do not expect fast convergence of the NMSS method in actual implementations. Therefore, we focus on the preconditioner generated by this method, i.e., \({{{\mathscr {P}}}}_\mathrm{NMSS}={{{\mathscr {M}}}}\). We use this preconditioner to accelerate the convergence of the GMRES method as a Krylov subspace method. We also study the eigenvalue distribution of the preconditioned matrix \({{{\mathscr {P}}}}_\mathrm{NMSS}^{-1}{{{\mathscr {A}}}}.\)

Lemma 5.1

Let \(A\in {\mathbb {R}}^{n\times n}\) be positive definite, \(B\in {\mathbb {R}}^{n\times m}\) be of full column rank and \(\alpha ,\beta >0\) be given constants. Assume that \(A=P+S\) is a positive definite and skew-symmetric splitting with a dominant positive definite part. Let \(a,\ b\), and c be defined as in Theorem 3.3 and \(\alpha ,\beta >0\) satisfy (12) or (13). Then all eigenvalues of the NMSS-preconditioned matrix \({{{\mathscr {P}}}}_\mathrm{NMSS}^{-1}{{\mathscr {A}}}\) are located in a circle centered at (1, 0) with radius strictly less than 1.

Proof

Suppose that \(\mu \) and \(\lambda \) are eigenvalues of the NMSS-preconditioned matrix \({{{\mathscr {P}}}}_\mathrm{NMSS}^{-1}{{\mathscr {A}}}\) and NMSS iteration matrix \(\varGamma \), respectively. With respect to the relation between \({{{\mathscr {P}}}}_\mathrm{NMSS}^{-1}{{\mathscr {A}}}\) and \(\varGamma \), i.e.,

$$\begin{aligned} {{{\mathscr {P}}}}_\mathrm{NMSS}^{-1}{{\mathscr {A}}}={{\mathscr {M}}}^{-1}{{\mathscr {A}}}={{\mathscr {M}}}^{-1}({{\mathscr {M}}}-{{\mathscr {N}}})=I-{{\mathscr {M}}}^{-1}{{\mathscr {N}}}=I-\varGamma , \end{aligned}$$

we have \( \lambda =1-\mu \). If \(\alpha \) and \(\beta \) satisfy (12) or (13) then \(|\lambda |<1\). Thus, we obtain

$$\begin{aligned} | 1-\mu |=|\lambda |<1, \end{aligned}$$

and the lemma follows. \(\square \)
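
The relation \({{{\mathscr {P}}}}_\mathrm{NMSS}^{-1}{{\mathscr {A}}}=I-\varGamma \) also gives a simple numerical check of Lemma 5.1; the short sketch below (same dense setting and assumptions as the snippet after Remark 3.4) measures how far the eigenvalues of the preconditioned matrix are from (1, 0):

```python
import numpy as np

def clustering_radius(M, N):
    """Max distance from (1, 0) of the eigenvalues of P_NMSS^{-1} A = I - M^{-1} N."""
    Gamma = np.linalg.solve(M, N)              # NMSS iteration matrix
    mu = 1.0 - np.linalg.eigvals(Gamma)        # eigenvalues of the preconditioned matrix
    return np.max(np.abs(mu - 1.0))

# With M and N assembled as in the sketch after Remark 3.4, Lemma 5.1 predicts
# clustering_radius(M, N) < 1 whenever (12) or (13) holds.
```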

Theorem 5.2

Under the hypotheses of Lemma 5.1, if \(\mu \) is an eigenvalue of the NMSS-preconditioned matrix \({{{\mathscr {P}}}}_\mathrm{NMSS}^{-1}{{\mathscr {A}}}\) and \(u=[x^\mathrm{T},y^\mathrm{T}]^\mathrm{T}\) is its associated eigenvector, then we have

  1. \(\mu \ne 0 \) and \(x\ne 0\).

  2. If \(y=0\), then \(x\in \mathrm{null}(B^\mathrm{T})\) and \(\mu \rightarrow 1\) as \(\alpha \rightarrow 0_+\), where \(a,\ b\), and c are defined as in Theorem 3.3.

  3. If \(y\ne 0\), then \(\mu \rightarrow 2\) as \(\beta \rightarrow 0_+\).

Proof

Let \(\mu \) be the eigenvalue of the preconditioned matrix \(\mathcal{P}_\mathrm{NMSS}^{-1}{{\mathscr {A}}}\) and \( \left[ \begin{array}{c} x\\ y \end{array} \right] \) be its associated eigenvector. Therefore,

$$\begin{aligned} {{{\mathscr {P}}}}_\mathrm{NMSS}^{-1}{{\mathscr {A}}} \left[ \begin{array}{c} x\\ y \end{array} \right] =\left( \frac{1}{2} \left[ \begin{array}{cc} \alpha I+2P&{}B\\ -B^\mathrm{T}&{}\beta I \end{array} \right] \right) ^{-1} \left[ \begin{array}{cc} A&{}B\\ -B^\mathrm{T}&{}0 \end{array} \right] \left[ \begin{array}{c} x\\ y \end{array} \right] =\mu \left[ \begin{array}{c} x\\ y \end{array} \right] , \end{aligned}$$

which is equivalent to

$$\begin{aligned} \left[ \begin{array}{cc} A&{}B\\ -B^\mathrm{T}&{}0 \end{array} \right] \left[ \begin{array}{c} x\\ y \end{array} \right] =\frac{\mu }{2} \left[ \begin{array}{cc} \alpha I+2P&{}B\\ -B^\mathrm{T}&{}\beta I \end{array} \right] \left[ \begin{array}{c} x\\ y \end{array} \right] . \end{aligned}$$
(26)

Using (26), the following equations are implied:

$$\begin{aligned} 2Ax+2By&=\mu (\alpha I+2P)x+\mu By, \end{aligned}$$
(27)
$$\begin{aligned} -2B^\mathrm{T}x&=-\mu B^\mathrm{T}x+\mu \beta y. \end{aligned}$$
(28)

Solving (28) for y and substituting it into (27), we have

$$\begin{aligned} (2A-\mu \alpha I-2\mu P)x=\frac{(\mu -2)^2}{\mu \beta }BB^\mathrm{T}x. \end{aligned}$$
(29)

Multiplying both sides of (29) on the left by \(\displaystyle \frac{x^*}{x^*x}\), we obtain

$$\begin{aligned} 2\displaystyle \frac{x^*Ax}{x^*x}-\mu \alpha -2\mu \displaystyle \frac{x^*Px}{x^*x}=\frac{(\mu -2)^2}{\mu \beta }\displaystyle \frac{x^*BB^\mathrm{T}x}{x^*x}. \end{aligned}$$
(30)

Using the notation of Theorem 3.3, we rewrite (30) as follows:

$$\begin{aligned} 2(a+ib)-\mu \alpha -2\mu a=\displaystyle \frac{(\mu -2)^2}{\mu \beta }c. \end{aligned}$$
(31)

By collecting terms in (31), we obtain

$$\begin{aligned} ((2a+\alpha )\beta +c)\mu ^2-(2\beta (a+ib)+4c)\mu +4c=0. \end{aligned}$$
(32)

The first part follows immediately from Lemma 3.1. For the second statement, we set \(y=0\); then (27) and (28) become

$$\begin{aligned} (2A-\mu \alpha I-2\mu P)x=0, \end{aligned}$$
(33)

and

$$\begin{aligned} (\mu -2)B^\mathrm{T}x=0. \end{aligned}$$
(34)

Equation (34) implies either \(\mu =2\) or \(B^\mathrm{T}x=0\). If \(\mu =2\), then (33) becomes

$$\begin{aligned} (\alpha I+ P-S)x=0. \end{aligned}$$
(35)

Multiplying (35) on the left by \(\displaystyle \frac{x^*}{x^*x}\), we obtain

$$\begin{aligned} \alpha + a-ib=(\alpha +a_1)+i(a_2-b)=0. \end{aligned}$$
(36)

This contradicts the positive definiteness of A and the positivity of \(\alpha \), since it would require \(\alpha +a_1=0\). Hence \(\mu \ne 2\), and therefore \(B^\mathrm{T}x=0\), i.e., \(x\in \mathrm{null}(B^\mathrm{T})\). From (31), we derive

$$\begin{aligned} \mu =\displaystyle \frac{2(a+ib)}{\alpha +2a}=1-\frac{\alpha -2ib}{\alpha +2a}. \end{aligned}$$

If \(\alpha \rightarrow 0_+\), then \( \mu \rightarrow \displaystyle 1+\frac{ib}{a}\). Moreover, since A has a dominant positive definite part, \(|a|\gg |b|\) for all \(x\in {\mathbb {C}}^n\), and thus \(\mu \) tends to 1. For the third part, since \(y\ne 0\), we conclude from (28) that \(B^\mathrm{T}x\ne 0\); hence \(c>0\). Solving the quadratic equation (32) gives the roots

$$\begin{aligned} \mu _{\pm }=\displaystyle \frac{\beta (a+ib)+2c\pm \sqrt{(\beta (a+ib)+2c)^2-4c((2a+\alpha )\beta +c)}}{(2a+\alpha )\beta +c}. \end{aligned}$$

Now, if \(\beta \rightarrow 0_+\), then \(\mu \rightarrow 2\), which completes the proof. \(\square \)

Now, we study the eigenvalue distribution of the NMSS-preconditioned matrix \({{{\mathscr {P}}}}_\mathrm{NMSS}^{-1}{{\mathscr {A}}}\) in the singular case. We state the following lemma and theorem; their proofs are similar to those of the nonsingular case and are therefore omitted.

Lemma 5.3

Let \(A\in {\mathbb {R}}^{n\times n}\) be positive definite, \(B\in {\mathbb {R}}^{n\times m}\) with \(\mathrm{rank}(B)=r<m< n\), and \(\alpha ,\beta >0\) be given constants. Assume that \(A=P+S\) is a positive definite and skew-symmetric splitting of A with a dominant positive definite part. Let \({\tilde{a}},\ {\tilde{b}}\), and \({\tilde{c}}\) be defined as in Theorem 4.2 and \(\alpha ,\beta >0\) satisfy (24) or (25). Then all eigenvalues of the NMSS-preconditioned matrix \({{{\mathscr {P}}}}_\mathrm{NMSS}^{-1}{{\mathscr {A}}}\) are located in a circle centered at \((1,\,0)\) with radius 1.

Theorem 5.4

Under the hypotheses of Lemma 5.3, if \(\mu \ne 0\) is an eigenvalue of the NMSS-preconditioned matrix \({{{\mathscr {P}}}}_\mathrm{NMSS}^{-1}{{\mathscr {A}}}\) and \(u=[x^\mathrm{T},y^\mathrm{T}]^\mathrm{T}\) is its associated eigenvector, then we have

  1. \(x\ne 0\).

  2. If \(y=0\), then \(x\in \mathrm{null}(B^\mathrm{T})\) and \(\mu \rightarrow 1\) as \(\alpha \rightarrow 0_+\), where \({\tilde{a}},\ {\tilde{b}}\), and \({\tilde{c}}\) are defined as in Theorem 4.2.

  3. If \(y\ne 0\), then \(\mu \rightarrow 2\) as \(\beta \rightarrow 0_+\).

6 Numerical results

In this section, we present two examples to illustrate the effectiveness of the NMSS preconditioner for the saddle point problem (1) arising from a model Stokes problem. We use left preconditioning with GMRES as the Krylov subspace method. We compare the elapsed CPU time in seconds (CPU) and the number of iterations (IT) of the NMSS preconditioner with GMRES without preconditioning and with GMRES using the GMSS [17], Uzawa-HSS and PU-STS preconditioners [19, 24, 25]. In these examples, all of the optimal parameters are determined experimentally, based on the least number of iterations. We choose the right-hand side vector b so that \({{\mathscr {U}}}=(1,\ldots ,1)^\mathrm{T}\) is the exact solution of (1). We run the examples with the zero vector as the initial guess and terminate the iterations once \(\mathrm{ERR}=\Vert b-{{\mathscr {A}}}{{\mathscr {U}}}^{(k)}\Vert _2/\Vert b\Vert _2\le 10^{-9}\) is satisfied. All of the examples are performed in Matlab on a computer with an Intel Core i7 CPU at 2.0 GHz and 8 GB of memory.
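
Although our experiments were run in Matlab, the setup can be sketched as follows in Python/SciPy (function and variable names are ours, not from the paper); the reduced matrix of (7) is factorized once and reused inside every preconditioner application:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def nmss_gmres(A, B, P, alpha, beta, b, tol=1e-9):
    """Solve the saddle point system (1) by GMRES with the NMSS preconditioner."""
    n, m = B.shape
    Asad = sp.bmat([[A, B], [-B.T, None]], format="csr")      # coefficient matrix of (1)

    # Factorize the reduced matrix of (7) once and reuse it in every M^{-1} solve.
    G = sp.csc_matrix(alpha * sp.identity(n) + 2 * P + (1.0 / beta) * (B @ B.T))
    solve_G = spla.factorized(G)

    def apply_Minv(r):
        r1, r2 = r[:n], r[n:]
        z1 = solve_G(2 * r1 - (2.0 / beta) * (B @ r2))
        z2 = (2 * r2 + B.T @ z1) / beta
        return np.concatenate([z1, z2])

    Minv = spla.LinearOperator((n + m, n + m), matvec=apply_Minv)
    # 'rtol' requires SciPy >= 1.12; on older versions the keyword is 'tol'.
    x, info = spla.gmres(Asad, b, x0=np.zeros(n + m), rtol=tol, M=Minv)
    return x, info
```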

Example 6.1

We consider the following nonsingular saddle point problem:

$$\begin{aligned} \begin{array}{ll} A=\left[ \begin{array}{cc} I\otimes T+T\otimes I&{} 0\\ 0&{} I\otimes T+T\otimes I \end{array}\right] \in {\mathbb {R}}^{2p^2\times 2p^2},&{} B=\left[ \begin{array}{c} I\otimes F\\ F\otimes I \end{array} \right] \in {\mathbb {R}}^{2p^2\times p^2}\end{array}, \end{aligned}$$
(37)

where

$$\begin{aligned} \begin{array}{ll} T=\displaystyle \frac{\nu }{h^2}\ .\ \text {tridiag}\left( -1,2,-1\right) +\frac{w}{2h}.\ \text {tridiag} (-1,0,1)\in {\mathbb {R}}^{p\times p},&F=\displaystyle \frac{1}{h}\ .\ \text {tridiag}\left( -1,1,0\right) \in {\mathbb {R}}^{p\times p}.\end{array} \end{aligned}$$

Here, \(\otimes \) is the Kronecker product and \(h=\displaystyle \frac{1}{p+1}\) is the discretization mesh size. We find w such that the (1, 1)-block A in (1) has a dominant positive definite part. This feature decreases the number of iterations of the GMRES method when NMSS is used as its preconditioner. Saddle point problem (1) with the matrices given in (37) has been studied in [19]. As for the matrix Q in the Uzawa-HSS and PU-STS methods, we choose \(Q = B^\mathrm{T} ({\text {diag}} (A ))^{-1} B\).
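
A sketch (Python/SciPy; the helper name is ours) of assembling the blocks in (37), following the definitions of T and F above:

```python
import scipy.sparse as sp

def example_6_1_matrices(p, nu, w):
    """Assemble the blocks A and B of (37) on a grid with mesh size h = 1/(p+1)."""
    h = 1.0 / (p + 1)
    Ip = sp.identity(p, format="csr")

    T = (nu / h**2) * sp.diags([-1, 2, -1], [-1, 0, 1], shape=(p, p)) \
        + (w / (2 * h)) * sp.diags([-1, 0, 1], [-1, 0, 1], shape=(p, p))
    F = (1.0 / h) * sp.diags([-1, 1], [-1, 0], shape=(p, p))

    K = sp.kron(Ip, T) + sp.kron(T, Ip)           # I (x) T + T (x) I
    A = sp.block_diag([K, K], format="csr")       # (1,1)-block, size 2p^2 x 2p^2
    B = sp.vstack([sp.kron(Ip, F), sp.kron(F, Ip)], format="csr")   # size 2p^2 x p^2
    return A, B
```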

For this example, we have \(n = 2p^2\) and \(m = p^2\). Hence, the total number of variables is \(m+n = 3p^2\). We test three values of \(\nu \), namely \(\nu =1,\ 0.1\ \text {and}\ 0.01\). For each \(\nu \), four different values of p are used, i.e., \(p = 16,\ 24,\ 32,\ 64\). In Tables 1, 2 and 3, we list numerical results on different uniform grids with \(\nu = 1\), \(\nu = 0.1\) and \(\nu = 0.01\), respectively. In these tables, No Pr. denotes the GMRES method without preconditioning, and \({{{\mathscr {P}}}}_\mathrm{GMSS}\), \({{{\mathscr {P}}}}_{Uzawa-HSS}\) and \({{{\mathscr {P}}}}_{PU-STS}\), respectively, denote the GMRES method with left GMSS, left Uzawa-HSS and left PU-STS preconditioning.

From Tables 1, 2 and 3, we observe that GMRES without preconditioning is very slow and that the new NMSS preconditioner is faster than the GMSS preconditioner. The numerical results also show that the number of iterations of the new method is significantly smaller than that of the Uzawa-HSS and PU-STS methods when they are used as preconditioners for the GMRES method.

Figure 1 shows the eigenvalue distribution of the matrix \({{\mathscr {A}}}\) and of the NMSS, GMSS, MSS, Uzawa-HSS and PU-STS preconditioned matrices, respectively. We can see that for the NMSS, GMSS and MSS preconditioned matrices, the eigenvalues are well clustered around (1, 0) and (2, 0); in particular, most of the eigenvalues of the NMSS-preconditioned matrix are clustered near (1, 0).

Table 1 Numerical results for Example 6.1, \(\nu =1\)
Table 2 Numerical results for Example 6.1, \(\nu =0.1\)
Table 3 Numerical results for Example 6.1, \(\nu =0.01\)
Fig. 1
figure 1

Eigenvalue distribution of the saddle point matrix and the preconditioned saddle point matrices for Example 6.1

Example 6.2

We consider the following singular saddle point problem:

$$\begin{aligned} \begin{array}{ll} A=\left[ \begin{array}{cc} I\otimes T+T\otimes I&{} 0\\ 0&{} I\otimes T+T\otimes I \end{array}\right] \in {\mathbb {R}}^{2p^2\times 2p^2},&B=[{\hat{B}}\ b_1\ b_2] \in {\mathbb {R}}^{2p^2\times (p^2+2)} , \end{array} \end{aligned}$$
(38)

where

$$\begin{aligned} \begin{array}{ll} T=\displaystyle \frac{\nu }{h^2}\ .\ \text {tridiag}\left( -1,2,-1\right) +\frac{w}{2h}.\ \text {tridiag}(-1,0,1)\in {\mathbb {R}}^{p\times p},&{} {\hat{B}}=\left[ \begin{array}{c} I\otimes F\\ F\otimes I \end{array} \right] \in {\mathbb {R}}^{2p^2\times p^2} ,\end{array} \end{aligned}$$
$$\begin{aligned} b_1={\hat{B}} \left[ \begin{array}{c} e\\ 0 \end{array} \right] ,\ b_2={\hat{B}} \left[ \begin{array}{c} 0\\ e \end{array}\ \right] ,\ e= \left[ \begin{array}{c} 1\\ \vdots \\ 1 \end{array}\ \right] \in {\mathbb {R}}^{p^2/2},\ F=\displaystyle \frac{1}{h}\ .\ \text {tridiag}\left( -1,1,0\right) \in {\mathbb {R}}^{p\times p}. \end{aligned}$$

where \(\otimes \) is the Kronecker product and \(h=\displaystyle \frac{1}{p+1}\) is the discretization mesh size. We find w such that the (1, 1)-block A in (1) has a dominant positive definite part for \(\nu =1,\ 0.1\ \text {and}\ 0.01\). This decreases the number of iterations of the GMRES method when NMSS is used as its preconditioner. Saddle point problem (1) with the matrices given in (38) has been studied in [19].
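
The rank-deficient B in (38) can be obtained from \(\hat{B}\) of Example 6.1 by appending the two dependent columns \(b_1\) and \(b_2\); a short sketch (Python/SciPy, helper name ours, assuming p is even as in the tested grids) is:

```python
import numpy as np
import scipy.sparse as sp

def example_6_2_B(B_hat, p):
    """Form the rank-deficient B = [B_hat, b1, b2] of (38); assumes p is even."""
    e = np.ones(p**2 // 2)
    z = np.zeros(p**2 // 2)
    b1 = B_hat @ np.concatenate([e, z])           # b1 = B_hat [e; 0]
    b2 = B_hat @ np.concatenate([z, e])           # b2 = B_hat [0; e]
    return sp.hstack([B_hat, sp.csr_matrix(b1).T, sp.csr_matrix(b2).T], format="csr")
```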

For this example, we have \(n = 2p^2\) and \(m = p^2+2\). As in Example 6.1, we test \(\nu =1,\ 0.1\ \text {and}\ 0.01\), and for each \(\nu \), four different values of p are used, i.e., \(p = 16,\ 24,\ 32,\ 64\). In Tables 4, 5 and 6, we list numerical results on different uniform grids with \(\nu = 1\), \(\nu = 0.1\) and \(\nu = 0.01\), respectively.

Table 4 Numerical results for Example 6.2, \(\nu =1\)
Table 5 Numerical results for Example 6.2, \(\nu =0.1\)
Table 6 Numerical results for Example 6.2, \(\nu =0.01\)

In these tables, we compare the NMSS preconditioner with the GMSS and MSS preconditioners. From Tables 4, 5 and 6, we can see that for the singular system, GMRES without preconditioning is also very slow and the new NMSS preconditioner is faster than the GMSS and MSS preconditioners.

Figure 2 gives the eigenvalue distribution of the matrix \({{{\mathscr {A}}}}\) and of the NMSS, GMSS and MSS preconditioned matrices, respectively. This figure shows that, apart from zero, the eigenvalues are clustered near (1, 0) and (2, 0), as in the nonsingular case. For the parameters chosen in this example, the eigenvalues of the NMSS-preconditioned matrix are more tightly clustered than those of the other preconditioned matrices.

Fig. 2
figure 2

Eigenvalue distribution of the saddle point matrix and the preconditioned saddle point matrices for Example 6.2

7 Conclusion

In this work, we presented a new preconditioner based on the positive definite and skew-symmetric splitting of the (1, 1)-block A of the saddle point problem (1). The convergence and semi-convergence of the NMSS method for solving nonsingular and singular saddle point problems, respectively, are investigated. The numerical results show that if the (1, 1)-block A in the saddle point problem (1) has a dominant positive definite part, then the NMSS preconditioner performs better than the GMSS, MSS, Uzawa-HSS and PU-STS preconditioners. However, if the (1, 1)-block A has no dominant positive definite part, we should not expect good performance. Moreover, this new preconditioner can be used when the (1, 1)-block A is symmetric or non-symmetric, while for GMSS and MSS, A must be non-symmetric.