1 Introduction

Consider the equality constrained quadratic program:

$$\begin{aligned} \min _{x\in {\mathbb {R}}^n}\; \frac{1}{2}x^\mathrm{T}Ax - b^\mathrm{T}x \quad \hbox {s.t.}~ Bx = c, \end{aligned}$$
(1.1)

where \(A \in {\mathbb {R}}^{n\times n}\) is symmetric and \(B \in {\mathbb {R}}^{m\times n}\) with \(m<n\). The matrix A can be indefinite, but is assumed to be positive definite in the null space of B. Without loss of generality, we assume that B has full rank m. The stationarity (first-order optimality) system of the quadratic program (1.1) is

$$\begin{aligned} Ax + B^\mathrm{T}y - b&= 0, \\ Bx - c&= 0, \end{aligned}$$

where \(x\in {\mathbb {R}}^n\) is the primal variable and \(y\in {\mathbb {R}}^m\) is the Lagrange multiplier (or dual variable). In matrix form, the \((n+m)\times (n+m)\) system is

$$\begin{aligned} \left( \begin{array}{cc} A &{} B^\mathrm{T} \\ B &{} 0 \end{array}\right) \left( \begin{array}{c} x \\ y \end{array}\right) = \left( \begin{array}{c} b \\ c \end{array}\right) , \end{aligned}$$
(1.2)

which is commonly called the augmented system or saddle point system—a problem with a wide range of applications in various areas of computational science and engineering. Numerical solutions of this problem have been extensively studied in the literature; see the survey paper [1] for a comprehensive review and a thorough list of references up to 2004.
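To make the setting concrete, here is a minimal NumPy sketch that assembles and solves system (1.2) directly for a small instance; the data below (dimensions, random matrices, seed) are hypothetical and serve only as illustration.

```python
# A minimal NumPy sketch: assemble and solve the saddle point system (1.2)
# for a small random instance (hypothetical data, for illustration only).
import numpy as np

rng = np.random.default_rng(0)
n, m = 6, 2
G = rng.standard_normal((n, n))
A = (G + G.T) / 2                        # symmetric, possibly indefinite
B = rng.standard_normal((m, n))          # full row rank with probability 1
b = rng.standard_normal(n)
c = rng.standard_normal(m)

# The (n+m) x (n+m) augmented matrix of (1.2); generically nonsingular.
K = np.block([[A, B.T],
              [B, np.zeros((m, m))]])
xy = np.linalg.solve(K, np.concatenate([b, c]))
x, y = xy[:n], xy[n:]

print(np.allclose(A @ x + B.T @ y, b))   # stationarity: Ax + B^T y = b
print(np.allclose(B @ x, c))             # feasibility:  Bx = c
```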

The augmented Lagrangian technique has been used to make the (1,1)-block of the saddle point system positive definite. In this approach, an equivalent system is solved,

$$\begin{aligned} Ax + B^\mathrm{T}y - b + \gamma B^\mathrm{T}(Bx - c)&= 0,\\ Bx - c&= 0 \end{aligned}$$

with a parameter \(\gamma > 0\), which has the matrix form

$$\begin{aligned} \left( \begin{array}{cc} A + \gamma B^\mathrm{T}B &{} B^\mathrm{T} \\ B &{} 0 \end{array}\right) \left( \begin{array}{c} x \\ y \end{array}\right) = \left( \begin{array}{c} b + \gamma B^\mathrm{T}c \\ c \end{array}\right) . \end{aligned}$$
(1.3)

The following result is a well-known fact.

Proposition 1.1

Let A be symmetric positive definite in the null space of B. If \(A \succeq 0\), then \(A + \gamma B^\mathrm{T}B \succ 0\) for \(\gamma \in (0,+\infty );\) otherwise, there exists some \({{\hat{\gamma }}} > 0\) such that

$$\begin{aligned} \gamma \in ({\hat{\gamma }},+\infty ) ~~\Longrightarrow ~~ A + \gamma B^\mathrm{T}B \succ 0. \end{aligned}$$
(1.4)
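Proposition 1.1 is easy to illustrate numerically. In the following sketch, the matrices A and B form a hypothetical toy example of ours (not from the text): A is indefinite yet positive definite on the null space of B, and \({\hat{\gamma }} = 1\) for this particular instance.

```python
# A small numerical illustration of Proposition 1.1 (hypothetical data):
# an indefinite A that is positive definite on null(B) becomes positive
# definite once gamma * B^T B is added, for gamma large enough.
import numpy as np

B = np.array([[1.0, 0.0]])               # null(B) = span{(0, 1)}
A = np.array([[-1.0, 0.0],
              [ 0.0, 1.0]])              # indefinite, but pd on null(B)

for gamma in [0.5, 1.0, 2.0]:
    H = A + gamma * B.T @ B              # = diag(gamma - 1, 1)
    print(gamma, np.linalg.eigvalsh(H).min())
# gamma = 0.5 -> min eigenvalue -0.5 (not yet positive definite)
# gamma = 2.0 -> min eigenvalue  1.0 (positive definite); gamma-hat = 1 here
```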

1.1 Notation

For matrix \(M \in {\mathbb {R}}^{n\times n}\), \(\sigma (M)\) denotes the spectrum of M and \(\rho (M)\) the spectral radius of M. For symmetric M, \(\lambda _{\max }(M)\) (\(\lambda _{\min }(M)\)) is the maximum (minimum) eigenvalue of M. By \(M \succ 0\) (\(M \succeq 0\)), we mean that M is symmetric positive definite (semi-definite). For a complex number \(z \in {\mathbb {C}}\), \(\mathfrak {R}(z)\) denotes the real part of z and \(\mathfrak {I}(z)\) the imaginary part.

2 A Class of Stationary Iterative Methods

In this section, we describe a class of stationary iterative methods for solving the saddle point problem (1.3) where the (1,1)-block has been made positive definite. For convenience, we re-parameterize the first equation and introduce another parameter into the second. The equivalent system under consideration is

$$\begin{aligned} \left( \begin{array}{cc} H(\alpha ) &{} -B^\mathrm{T} \\ \tau B &{} 0 \end{array}\right) \left( \begin{array}{c} x \\ y \end{array}\right) = \left( \begin{array}{c} \alpha b + B^\mathrm{T}c \\ \tau c \end{array}\right) , \end{aligned}$$
(2.1)

where \(\alpha > 0\), \(\tau \ne 0\) and

$$\begin{aligned} H(\alpha ) = \alpha A + B^\mathrm{T}B \succ 0. \end{aligned}$$

Comparing (1.3) to (2.1), we see that \(\alpha =1/\gamma >0\) and the multiplier y has been rescaled along with a sign change. These changes are cosmetic except that one more parameter \(\tau \) is introduced into the second equation of (2.1).

Since the equation \(Bx=c\) is equivalent to \(QBx=Qc\) for any non-singular \(Q\in {\mathbb {R}}^{m\times m}\), B and c in (2.1) can obviously be replaced by QB and Qc, respectively.

2.1 Splitting of the (1,1)-Block

In our framework, the (1,1)-block submatrix \(H(\alpha )\) in (2.1) is split into a “left part” L and a “right part” R; that is,

$$\begin{aligned} H := \alpha A + B^\mathrm{T}B = L - R. \end{aligned}$$
(2.2)

We drop the \(\alpha \)-dependence from H, as well as from L and R, since \(\alpha \) remains fixed throughout our analysis as long as \(H \succ 0\) is maintained, even though it can also be varied to improve convergence performance.

In this report, unless otherwise noted, splittings refer to those of the (1,1)-block submatrix H rather than of the entire \((2 \times 2)\)-block augmented matrix of the saddle point problem. Moreover, we will associate each splitting with a left–right pair \((L, R)\). The simplest examples of splittings include

$$\begin{aligned} L=H,\quad \;\; R=0; \end{aligned}$$

or after partitioning H into 2-by-2 blocks,

$$\begin{aligned} L = \left( \begin{array}{cc} H_{11} &{} 0 \\ 0 &{} H_{22} \end{array}\right) ,\quad \;\; R = -\left( \begin{array}{cc} 0 &{} H_{12} \\ H_{21} &{} 0 \end{array}\right) , \end{aligned}$$

which is of block Jacobi type; or

$$\begin{aligned} L = \left( \begin{array}{cc} H_{11} &{} 0 \\ H_{21} &{} H_{22} \end{array}\right) ,\quad \;\; R = -\left( \begin{array}{cc} 0 &{} H_{12} \\ 0 &{} 0 \end{array}\right) , \end{aligned}$$
(2.3)

which is of block Gauss–Seidel type. We note that when \(H \succ 0\) and \((L, R)\) is a Gauss–Seidel splitting, either element-wise or block-wise, it is known that \(\rho (L^{-1}R)<1\).

In general, one can first partition H into p-by-p blocks for any \(p \in \{1,2,\cdots ,n\}\), then perform a block splitting. In addition, splittings can be of SOR type, involving an extra relaxation parameter. To keep notation simple, however, we will not carry such a parameter in a splitting \((L, R)\), since it does not affect our analysis.
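As a concrete illustration of these splittings, the following NumPy sketch builds the block Jacobi and block Gauss–Seidel pairs \((L, R)\) for a hypothetical symmetric positive definite H and checks \(\rho (L^{-1}R)\); for the Gauss–Seidel pair the bound \(\rho (L^{-1}R) < 1\) is guaranteed, while for block Jacobi it merely happens to hold for this instance.

```python
# A sketch of block Jacobi and block Gauss-Seidel splittings H = L - R
# (2-by-2 blocks), checking rho(L^{-1}R). Data is hypothetical.
import numpy as np

rng = np.random.default_rng(1)
n, n1 = 6, 3                              # block sizes n1 and n - n1
G = rng.standard_normal((n, n))
H = G @ G.T + n * np.eye(n)               # symmetric positive definite

def spectral_radius(M):
    return np.abs(np.linalg.eigvals(M)).max()

# Block Gauss-Seidel, as in (2.3): L keeps the lower block triangle.
L_gs = H.copy()
L_gs[:n1, n1:] = 0.0                      # zero out H_{12}
R_gs = L_gs - H                           # R = -[[0, H_{12}], [0, 0]]

# Block Jacobi: L keeps the diagonal blocks only.
L_j = np.zeros_like(H)
L_j[:n1, :n1] = H[:n1, :n1]
L_j[n1:, n1:] = H[n1:, n1:]
R_j = L_j - H

print(spectral_radius(np.linalg.solve(L_gs, R_gs)))  # < 1, guaranteed for H > 0
print(spectral_radius(np.linalg.solve(L_j, R_j)))    # < 1 for this instance
```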

2.2 A Stationary Iteration Class

We consider a class of stationary iterations consisting of all possible splittings \((L, R)\) for which the spectral radius of \(L^{-1}R\) does not exceed unity (plus an additional technical condition to be specified shortly). This class of stationary iterative methods, which we call the \(\mathtt {\{L,\!R\}}\)-class for lack of a more descriptive term, iterates as follows:

$$\begin{aligned} x^{k+1}&= L^{-1}\left( Rx^k + B^\mathrm{T}(y^k + c) + \alpha b\right) , \end{aligned}$$
(2.4a)
$$\begin{aligned} y^{k+1}&= y^k - \tau \left( Bx^{k+1}-c\right) , \end{aligned}$$
(2.4b)

where \((L, R)\) is any admissible splitting and \(\tau \) is a step length for the multiplier update.
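A minimal implementation of iteration (2.4) may help fix ideas. The following NumPy sketch is ours, with hypothetical problem data and an ad hoc step length \(\tau \); Theorem 3.1 below guarantees convergence only for all sufficiently small \(\tau > 0\).

```python
# A minimal sketch of one {L,R}-class member, iteration (2.4); the data
# construction below is hypothetical and only illustrates the interface.
import numpy as np

def lr_iterate(L, R, B, b, c, alpha, tau, iters=2000, tol=1e-10):
    """Run iteration (2.4) for a splitting H = L - R of H = alpha*A + B^T B."""
    n, m = B.shape[1], B.shape[0]
    H = L - R
    x, y = np.zeros(n), np.zeros(m)
    for _ in range(iters):
        x = np.linalg.solve(L, R @ x + B.T @ (y + c) + alpha * b)   # (2.4a)
        y = y - tau * (B @ x - c)                                   # (2.4b)
        # residual of system (2.1): H x - B^T y = alpha*b + B^T c, B x = c
        if (np.linalg.norm(H @ x - B.T @ y - alpha * b - B.T @ c)
                + np.linalg.norm(B @ x - c)) < tol:
            break
    return x, y

rng = np.random.default_rng(7)
n, m = 6, 2
B = rng.standard_normal((m, n))
G = rng.standard_normal((n, n))
H = 0.1 * (G @ G.T + np.eye(n)) + B.T @ B     # H > 0, playing alpha*A + B^T B
L, R = np.tril(H), np.tril(H) - H             # element-wise Gauss-Seidel splitting
b, c = rng.standard_normal(n), rng.standard_normal(m)
x, y = lr_iterate(L, R, B, b, c, alpha=0.1, tau=0.5)  # tau chosen ad hoc
print(np.linalg.norm(B @ x - c))              # ~ 0 if converged
```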

It is easy to see that the \(\mathtt {\{L,\!R\}}\)-class iterations (2.4) correspond to the following splitting of the \((2 \times 2)\)-block augmented matrix in system (2.1):

$$\begin{aligned} \left( \begin{array}{cc} H &{} -B^\mathrm{T} \\ \tau B &{} 0 \end{array}\right) = \left( \begin{array}{cc} L &{} 0 \\ \tau B &{} I \end{array}\right) - \left( \begin{array}{cc} R &{} B^\mathrm{T} \\ 0 &{} I \end{array}\right) . \end{aligned}$$
(2.5)

Therefore, the resulting iteration matrix is

$$\begin{aligned} M(\tau ) := \left( \begin{array}{cc} L &{} 0 \\ \tau B &{} I \end{array}\right) ^{-1}\!\! \left( \begin{array}{cc} R &{} B^\mathrm{T} \\ 0 &{} I \end{array}\right) = \left( \begin{array}{cc} L^{-1}R &{} L^{-1}B^\mathrm{T} \\ -\tau BL^{-1}R &{} I-\tau BL^{-1}B^\mathrm{T} \end{array}\right) . \end{aligned}$$
(2.6)

It is worth observing that the results of the present paper still hold if, in the right-hand side of (2.5), the identity matrix in the (2,2)-blocks is replaced by any symmetric positive definite matrix.
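The splitting (2.5) and the closed form (2.6) can be verified numerically. The sketch below, with hypothetical data, checks both identities for an element-wise Gauss–Seidel splitting.

```python
# A sketch verifying the splitting (2.5) and the iteration matrix (2.6)
# numerically for hypothetical L, R, B and tau.
import numpy as np

rng = np.random.default_rng(2)
n, m, tau = 5, 2, 0.7
G = rng.standard_normal((n, n))
H = G @ G.T + n * np.eye(n)
L = np.tril(H); R = L - H                 # element-wise Gauss-Seidel splitting
B = rng.standard_normal((m, n))

P = np.block([[L, np.zeros((n, m))], [tau * B, np.eye(m)]])
N = np.block([[R, B.T], [np.zeros((m, n)), np.eye(m)]])
K = np.block([[H, -B.T], [tau * B, np.zeros((m, m))]])
print(np.allclose(P - N, K))              # the splitting (2.5)

M = np.linalg.solve(P, N)                 # iteration matrix (2.6)
M_explicit = np.block([
    [np.linalg.solve(L, R), np.linalg.solve(L, B.T)],
    [-tau * B @ np.linalg.solve(L, R),
     np.eye(m) - tau * B @ np.linalg.solve(L, B.T)],
])
print(np.allclose(M, M_explicit))         # the block form in (2.6)
```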

From the well-known theory of stationary iterative methods for linear systems, we have

Proposition 2.1

A member of the \(\mathtt {\{L,\!R\}}\)-class converges Q-linearly from any initial point if and only if the corresponding iteration matrix \(M(\tau )\) satisfies, for the value of \(\tau \) used,

$$\begin{aligned} \rho (M(\tau )) < 1. \end{aligned}$$
(2.7)

In this paper, we establish that, under two reasonable assumptions, condition (2.7) holds for the entire \(\mathtt {\{L,\!R\}}\)-class.

2.3 Classic Methods ALM and ADMM

The trivial splitting \((L,R)=(H,0)\) gives the classic augmented Lagrangian multiplier (ALM) method [2, 3], which is also equivalent to Uzawa’s method [4] applied to (1.3). In this case,

$$\begin{aligned} M(\tau ) = \left( \begin{array}{cc} 0 &{} H^{-1}B^\mathrm{T} \\ 0 &{} I-\tau BH^{-1}B^\mathrm{T} \end{array}\right) , \end{aligned}$$

and

$$\begin{aligned} \rho (M(\tau )) = \rho \left( I-\tau BH^{-1}B^\mathrm{T}\right) \end{aligned}$$
(2.8)

leading to the well-known convergence result for the multiplier method.

Proposition 2.2

The augmented Lagrangian multiplier method applied to the quadratic program (1.1) converges Q-linearly from any initial point for \(\tau \in \left( 0,2/\lambda _{\max }(BH^{-1}B^\mathrm{T})\right) \), where \(H=\alpha A+B^\mathrm{T}B \succ 0\). Moreover, when \(A \succeq 0\), \(\tau \in (0,2)\) suffices for convergence.
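The following sketch checks Proposition 2.2 numerically on a hypothetical instance; the construction \(A = A_0 - 5B^\mathrm{T}B\) with \(A_0 \succ 0\) is our device for producing an A that is (typically) indefinite yet positive definite on the null space of B.

```python
# A numerical check of Proposition 2.2 (hypothetical data): for the trivial
# splitting (L, R) = (H, 0), rho(M(tau)) = rho(I - tau*B H^{-1} B^T), which
# is below 1 exactly for tau in (0, 2/lambda_max(B H^{-1} B^T)).
import numpy as np

rng = np.random.default_rng(3)
n, m, alpha = 6, 2, 0.1
B = rng.standard_normal((m, n))
G = rng.standard_normal((n, n))
A0 = G @ G.T + np.eye(n)                  # symmetric positive definite
A = A0 - 5.0 * (B.T @ B)                  # typically indefinite, pd on null(B)
H = alpha * A + B.T @ B                   # = 0.1*A0 + 0.5*B^T B, hence H > 0
assert np.linalg.eigvalsh(H).min() > 0

S = B @ np.linalg.solve(H, B.T)           # B H^{-1} B^T
tau_max = 2.0 / np.linalg.eigvalsh(S).max()
for tau in [0.5 * tau_max, 0.99 * tau_max, 1.5 * tau_max]:
    rho = np.abs(np.linalg.eigvals(np.eye(m) - tau * S)).max()
    print(f"tau/tau_max = {tau / tau_max:4.2f}, rho = {rho:.3f}")
```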

The classic ALM method, or Uzawa's method applied to (1.3), is the unique member of the \(\mathtt {\{L,\!R\}}\)-class that requires solving systems involving the entire (1,1)-block submatrix H (with a right-hand side that changes from iteration to iteration). All other \(\mathtt {\{L,\!R\}}\)-class members only require solving systems involving the left part L, which can be much less expensive if L is chosen to exploit problem structure.

When the splitting of H is of the \((2 \times 2)\)-block Gauss–Seidel type defined in (2.3), the associated \(\mathtt {\{L,\!R\}}\)-class member reduces to the classic alternating direction method of multipliers (ADMM) [5, 6], for which convergence has been established for general convex functions, not only quadratics. However, that general theory requires the objective to be a sum of two separable functions of two block variables, each convex in the entire space. To the best of our knowledge, no convergence results are available when the objective is non-separable, is convex only in a subspace, or involves more than two block variables (unless algorithmic modifications are introduced).

3 Convergence of the Entire Class

We present a unified convergence result for the entire \(\mathtt {\{L,\!R\}}\)-class under two assumptions:

A1. \(H := \alpha A + B^\mathrm{T}B \succ 0\), where B is of rank m.

A2. \(H=L-R\) satisfies \(\rho (L^{-1}R) \leqslant 1\) and condition (3.1).

We know that Assumption A1 holds for appropriate values of \(\alpha \) whenever \(A \in {\mathbb {R}}^{n\times n}\) is positive definite in the null space of B; see Proposition 1.1. We further require that \(L^{-1}R\) have no eigenvalue of modulus one or greater, except possibly unity itself; that is,

$$\begin{aligned} \max \left\{ |\mu |: \mu \in \sigma (L^{-1}R) {\setminus } \{1\} \right\} < 1. \end{aligned}$$
(3.1)

Now we present a unified convergence theorem for the entire \(\mathtt {\{L,\!R\}}\)-class.

Theorem 3.1

Let \(\{(x^k,y^k)\}\) be generated from any initial point by a member of the \(\mathtt {\{L,\!R\}}\)-class defined by (2.4). Under Assumptions A1–A2, there exists \(\eta > 0\) such that for all \(\tau \in (0, 2\eta )\) the sequence \(\{(x^k,y^k)\}\) converges Q-linearly to the solution of (1.1).

The proof is deferred to the next section, after we develop some technical results. We note that the convergence interval \((0, 2\eta )\) is member-dependent; it can also depend on the value of the parameter \(\alpha >0\) in \(H(\alpha ) = \alpha A + B^\mathrm{T}B \succ 0\).

It is worth noting that the theorem only requires \(L^{-1}R\), as a linear mapping in \({\mathbb {R}}^n\), to be non-expansive (plus a technical condition) rather than contractive. Convergence would not necessarily occur if one kept iterating on the primal variable x alone; however, timely updates of the multiplier y help the iterates of the pair \((x, y)\) converge together.
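The following sketch traces \(\rho (M(\tau ))\) for a Gauss–Seidel member on a hypothetical instance, illustrating Theorem 3.1 and Lemma 4.1 below: the spectral radius equals one at \(\tau = 0\) and drops below one for small \(\tau > 0\).

```python
# A sketch tracing rho(M(tau)) for a Gauss-Seidel member of the {L,R}-class,
# illustrating Theorem 3.1 on hypothetical data.
import numpy as np

rng = np.random.default_rng(4)
n, m = 6, 2
B = rng.standard_normal((m, n))
G = rng.standard_normal((n, n))
A0 = G @ G.T + np.eye(n)
A = A0 - 5.0 * (B.T @ B)           # typically indefinite, yet pd on null(B)
alpha = 0.1
H = alpha * A + B.T @ B            # = 0.1*A0 + 0.5*B^T B, hence H > 0 (A1)
L = np.tril(H); R = L - H          # Gauss-Seidel: rho(L^{-1}R) < 1 (A2)

def rho_M(tau):
    LinvR = np.linalg.solve(L, R)
    LinvBt = np.linalg.solve(L, B.T)
    M = np.block([[LinvR, LinvBt],
                  [-tau * B @ LinvR, np.eye(m) - tau * B @ LinvBt]])
    return np.abs(np.linalg.eigvals(M)).max()

print(rho_M(0.0))                  # = 1 (Lemma 4.1)
for tau in [0.05, 0.2, 0.5, 1.0]:
    print(tau, rho_M(tau))         # < 1 for tau small enough; the precise
                                   # interval (0, 2*eta) is member-dependent
```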

4 Technical Results and Proof of Convergence

We first derive some useful technical lemmas. Let \(\lambda (\tau )\) be an eigenvalue of \(M(\tau )\), i.e.,

$$\begin{aligned} \lambda (\tau ) \in \sigma (M(\tau )). \end{aligned}$$
(4.1)

The eigenvalue system corresponding to \(\lambda \) is

$$\begin{aligned} \left( \begin{array}{cc} L^{-1}R &{} L^{-1}B^\mathrm{T} \\ -\tau BL^{-1}R &{} I-\tau BL^{-1}B^\mathrm{T} \end{array}\right) \left( \begin{array}{c} u(\tau ) \\ v(\tau ) \end{array}\right) = \lambda (\tau ) \left( \begin{array}{c} u(\tau ) \\ v(\tau ) \end{array}\right) , \end{aligned}$$
(4.2)

where \((u,v) \in {\mathbb {C}}^{n}\times {\mathbb {C}}^m\) is nonzero. For simplicity, we will often suppress the \(\tau \)-dependence of the eigenpair when no confusion arises.

Lemma 4.1

If \(\rho (L^{-1}R) \leqslant 1\), then

$$\begin{aligned} \rho (M(0)) = 1. \end{aligned}$$

Under condition (3.1), an eigenvalue \(\lambda (\tau )\) of \(M(\tau )\) of maximum modulus satisfies

$$\begin{aligned} \lim _{\tau \rightarrow 0} \frac{\lambda (\tau )-1}{\lambda (\tau )}=0. \end{aligned}$$
(4.3)

Proof

From the definition of \(M(\tau )\) in (2.6),

$$\begin{aligned} M(0) = \left( \begin{array}{cc} L^{-1}R &{} L^{-1}B^\mathrm{T} \\ 0 &{} I \end{array}\right) . \end{aligned}$$

Hence, by our assumption, \(\rho (M(0)) = \max (1,\rho (L^{-1}R)) = 1\) with \(1 \in \sigma (M(0))\). For the second part, the eigenvalues of \(M(\tau )\) depend continuously on \(\tau \), and under condition (3.1) the only eigenvalue of \(M(0)\) on the unit circle is unity itself; hence \(\lambda (\tau ) \rightarrow 1\) as \(\tau \rightarrow 0\), which gives (4.3).

Lemma 4.2

Let either the matrix A be positive definite in the null space of B and \(\alpha > 0\), or the sum \(H = \alpha A + B^\mathrm{T}B\) be positive definite. Then for any \(\tau >0\), \( 1 \notin \sigma (M(\tau )) \), where \(M(\tau )\) is defined in (2.6).

Proof

We examine eigensystem (4.2). Rearranging the first equation of (4.2), we have

$$\begin{aligned} (\lambda L - R) u = B^\mathrm{T}v. \end{aligned}$$
(4.4)

Multiplying the first equation of (4.2) by \(\tau B\) and adding it to the second, we obtain after rearranging

$$\begin{aligned} (1-\lambda )v = \lambda \tau Bu. \end{aligned}$$
(4.5)

Suppose that \(\lambda =1\). Then (4.5) implies \(Bu=0\). By definition (2.2), equation (4.4) reduces to

$$\begin{aligned} (L - R) u \equiv (\alpha A + B^\mathrm{T}B)u = B^\mathrm{T}v. \end{aligned}$$

Multiplying the above equation by \(u^*\) and invoking \(Bu=0\), we arrive at \(u^*Hu = \alpha u^*Au = 0\), contradicting the assumption of the lemma.

Lemma 4.3

Let \((\lambda ,(u,v))\) be an eigenpair of \(M(\tau )\) as given in (4.2), where \(\lambda \notin \{0,1\}\) and \(Bu \ne 0\). Then

$$\begin{aligned} \lambda = 1 - {\tau }\left( {\frac{u^*Hu}{u^*B^\mathrm{T}Bu} + \frac{\lambda -1}{\lambda }\frac{u^*Ru}{u^*B^\mathrm{T}Bu}}\right) ^{-1}. \end{aligned}$$
(4.6)

Proof

It follows readily from (4.5) that

$$\begin{aligned} v = \frac{\lambda \tau }{1-\lambda } Bu. \end{aligned}$$
(4.7)

Substituting the above into (4.4) and in view of (2.2), we have

$$\begin{aligned} \left( \lambda H + (\lambda -1)R\right) u = \frac{\lambda \tau }{1-\lambda } B^\mathrm{T}Bu, \end{aligned}$$

or after a rearrangement,

$$\begin{aligned} \left( H - \frac{\tau }{1-\lambda } B^\mathrm{T}B\right) u = \frac{1-\lambda }{\lambda } Ru. \end{aligned}$$
(4.8)

Multiplying both sides of (4.8) by \(u^*\), we have

$$\begin{aligned} u^*Hu - \frac{\tau }{1-\lambda } u^*B^\mathrm{T}Bu = \frac{1-\lambda }{\lambda } u^*Ru. \end{aligned}$$

Since \(Bu \ne 0\) implies \(u^*B^\mathrm{T}Bu \ne 0\), the above equation can be rewritten as

$$\begin{aligned} \frac{\tau }{1-\lambda } = \frac{u^*Hu}{u^*B^\mathrm{T}Bu} + \frac{\lambda -1}{\lambda } \frac{u^*Ru}{u^*B^\mathrm{T}Bu}. \end{aligned}$$
(4.9)

Solving (4.9) for the \(\lambda \) on the left-hand side, while keeping those on the right fixed, we obtain the desired result; note that the denominator in (4.6) is necessarily nonzero.
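Identity (4.6) can be sanity-checked numerically: for an actual eigenpair of \(M(\tau )\) with \(\lambda \notin \{0,1\}\) and \(Bu \ne 0\), the right-hand side of (4.6) should reproduce \(\lambda \). The sketch below uses hypothetical data.

```python
# A numerical sanity check of identity (4.6): for eigenpairs of M(tau)
# with lambda not in {0, 1} and Bu != 0, evaluate the right-hand side.
import numpy as np

rng = np.random.default_rng(5)
n, m, tau = 5, 2, 0.3
B = rng.standard_normal((m, n))
G = rng.standard_normal((n, n))
H = G @ G.T + n * np.eye(n)
L = np.tril(H); R = L - H                 # Gauss-Seidel splitting

LinvR = np.linalg.solve(L, R); LinvBt = np.linalg.solve(L, B.T)
M = np.block([[LinvR, LinvBt],
              [-tau * B @ LinvR, np.eye(m) - tau * B @ LinvBt]])
lams, V = np.linalg.eig(M)
for lam, w in zip(lams, V.T):             # columns of V are eigenvectors
    u = w[:n]
    if abs(lam) > 1e-8 and abs(lam - 1) > 1e-8 and np.linalg.norm(B @ u) > 1e-8:
        s = u.conj() @ (B.T @ B) @ u      # u* B^T B u
        rhs = 1 - tau / (u.conj() @ H @ u / s
                         + (lam - 1) / lam * (u.conj() @ R @ u) / s)
        print(abs(lam - rhs))             # ~ 0 up to rounding
```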

Lemma 4.4

Let \(\tau , \kappa \in {\mathbb {R}}\) and \(z = \mathfrak {R}(z) + i\mathfrak {I}(z) \in {\mathbb {C}}\) such that \(\kappa + \mathfrak {R}(z) > 0\). Then

$$\begin{aligned} \tau \in \left( 0, \, 2(\kappa +\mathfrak {R}(z))\right) ~~\Longleftrightarrow ~~ \left| 1 - \frac{\tau }{\kappa + z}\right| < 1. \end{aligned}$$
(4.10)

Moreover, \(\tau = \kappa +\mathfrak {R}(z)\) minimizes the above modulus so that

$$\begin{aligned} \min _\tau \left| 1 - \frac{\tau }{\kappa + z}\right| = \left| 1 - \frac{\kappa +\mathfrak {R}(z)}{\kappa + z}\right| = \frac{|\mathfrak {I}(z)|}{|\kappa +z|}. \end{aligned}$$
(4.11)

Proof

By direct calculation,

$$\begin{aligned} \left| 1 - \frac{\tau }{\kappa + z}\right| ^2 = 1 - \tau \frac{2(\kappa +\mathfrak {R}(z))-\tau }{|\kappa +z|^2} = \frac{(\kappa +\mathfrak {R}(z)-\tau )^2 + \mathfrak {I}(z)^2}{(\kappa +\mathfrak {R}(z))^2 + \mathfrak {I}(z)^2}, \end{aligned}$$
(4.12)

from which both (4.10) and (4.11) follow.
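Since Lemma 4.4 is a purely scalar statement, it can be checked directly; the sketch below evaluates (4.10) and (4.11) for one hypothetical pair \((\kappa , z)\).

```python
# A quick numerical check of (4.10)-(4.11) on hypothetical scalars.
import numpy as np

kappa, z = 1.5, 0.4 + 0.9j
assert kappa + z.real > 0
f = lambda t: abs(1 - t / (kappa + z))    # the modulus in (4.10)

t_star = kappa + z.real                   # minimizer from (4.11)
print(f(t_star), abs(z.imag) / abs(kappa + z))       # equal, per (4.11)
print(f(1e-9) < 1, f(2 * t_star - 1e-6) < 1)         # True inside (0, 2(kappa+Re z))
print(f(2 * t_star + 1e-3) < 1)                      # False outside the interval
```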

Now we are ready to prove Theorem 3.1.

Proof

The proof is based on Lemmas 4.1, 4.3 and 4.4, while Lemma 4.2 is implicitly used.

Let \((\lambda (\tau ),(u(\tau ),v(\tau )))\) be an eigenpair of \(M(\tau )\) corresponding to an eigenvalue of maximum modulus. By Lemma 4.2, \(\lambda (\tau ) \ne 1\); and if \(\lambda (\tau ) = 0\), then \(\rho (M(\tau )) = 0\) and there is nothing to prove, so we may assume \(\lambda (\tau ) \notin \{0,1\}\). We need to prove that \(|\lambda (\tau )| < 1\) for some values of \(\tau > 0\). In the rest of the proof, we often suppress the dependence on \(\tau \).

We consider two cases: \(Bu=0\) and \(Bu \ne 0\). If \(Bu=0\), then (4.5) implies \(v=0\) (recall \(\lambda \ne 1\)), and (4.4) implies that \((\lambda ,u)\) is an eigenpair of \(L^{-1}R\); since \(\lambda \ne 1\), condition (3.1) in Assumption A2 gives \(|\lambda |<1\). Now we assume that \(Bu \ne 0\). By Lemmas 4.3 and 4.4, \(|\lambda (\tau )| < 1\) if and only if the following inclusion holds,

$$\begin{aligned} \tau \in \left( 0,\, 2 \Theta (\tau )\right) , \end{aligned}$$
(4.13)

where

$$\begin{aligned} \Theta (\tau ) := \frac{u^*Hu}{u^*B^\mathrm{T}Bu} + \mathfrak {R}\left( \frac{\lambda -1}{\lambda }\frac{u^*Ru}{u^*B^\mathrm{T}Bu}\right) = \frac{u^*u}{u^*B^\mathrm{T}Bu}\left( \frac{u^*Hu}{u^*u} + \mathfrak {R}\left( \frac{\lambda -1}{\lambda }\frac{u^*Ru}{u^*u}\right) \right) . \end{aligned}$$
(4.14)

Under Assumption A2, we know from (4.3) in Lemma 4.1 that \(1-1/\lambda (\tau ) \rightarrow 0\) as \(\tau \rightarrow 0\). Hence, in view of the boundedness of \({u^*Ru}/{u^*u}\), for any \(\delta \in (0,1)\) there exists \(\xi _{\delta } > 0\) such that

$$\begin{aligned} \mathfrak {R}\left( \frac{\lambda -1}{\lambda }\frac{u^*Ru}{u^*u}\right) \geqslant -\left| \frac{\lambda -1}{\lambda }\right| \, \frac{|u^*Ru|}{u^*u} \geqslant -\delta \lambda _{\min }(H),\quad \;\; \forall \, \tau \in (0, 2\xi _{\delta }). \end{aligned}$$
(4.15)

We now estimate \(\Theta (\tau )\) for \(\tau \in (0, 2\xi _{\delta })\) from (4.14) and (4.15),

$$\begin{aligned} \Theta (\tau ) \geqslant \frac{\lambda _{\min }(H) -\delta \lambda _{\min }(H)}{\lambda _{\max }(B^\mathrm{T}B)} = (1-\delta )\frac{\lambda _{\min }(H)}{\lambda _{\max }(B^\mathrm{T}B)} =: \theta _{\delta } > 0, \quad \;\; \forall \, \tau \in (0, 2\xi _{\delta }). \end{aligned}$$
(4.16)

It follows from (4.16) that inclusion (4.13) indeed holds for all \(\tau \in (0, 2\eta ),\) where

$$\begin{aligned} \eta := \min (\xi _{\delta },\theta _{\delta }). \end{aligned}$$
(4.17)

This completes the proof.

In view of the second part of Lemma 4.4, if there exists \(\tau _o>0\) such that \(\tau _o = \Theta (\tau _o)\), then the optimal rate of convergence (for a given \(\alpha >0\) and a given splitting) would be

$$\begin{aligned} \frac{|\mathfrak {I}(z(\tau _o))|}{|u(\tau _o)^*Hu(\tau _o) + z(\tau _o)|} = \left( 1 + \frac{(u(\tau _o)^*Hu(\tau _o)+\mathfrak {R}(z(\tau _o)))^2}{\mathfrak {I}(z(\tau _o))^2}\right) ^{-\frac{1}{2}} < 1, \end{aligned}$$
(4.18)

where \( z(\tau ) := \frac{\lambda (\tau )-1}{\lambda (\tau )}\,(u(\tau )^*Ru(\tau )) \), whose imaginary part must be nonzero at \(\tau =\tau _o\). Of course, such an optimal rate of convergence is generally not computable in practice.

5 Remarks

The \(\mathtt {\{L,\!R\}}\)-class defined by (2.4) is built from splittings of the (1,1)-block of the saddle point system matrix; it includes, but is not limited to, all known convergent splittings for positive definite matrices, thereby offering adaptivity to problem structure with guaranteed convergence.

Those \(\mathtt {\{L,\!R\}}\)-class members associated with block Gauss–Seidel splittings are natural extensions of the classic ADMM specialized to quadratic programs. In contrast to the existing general convergence theory for ADMM, Theorem 3.1 requires neither separability nor convexity in the entire space, imposes no restriction on the number of blocks, and guarantees a Q-linear rate of convergence. Extending these properties beyond quadratic programs should be of great interest; we will address this topic in another work.

The convergence of certain members of the \(\mathtt {\{L,\!R\}}\)-class has been studied in [7] under the assumption that L is symmetric positive definite. In [8], a special case corresponding to the SOR splitting is analyzed.