Abstract
For a symmetric positive semidefinite linear system of equations \(\mathcal{Q}{{\varvec{x}}}= {{\varvec{b}}}\), where \({{\varvec{x}}}= (x_1,\ldots ,x_s)\) is partitioned into s blocks with \(s \ge 2\), we show that each cycle of the classical block symmetric Gauss–Seidel (sGS) method exactly solves the associated quadratic programming (QP) problem augmented with an extra proximal term of the form \(\frac{1}{2}\Vert {{\varvec{x}}}-{{\varvec{x}}}^k\Vert _\mathcal{T}^2\), where \(\mathcal{T}\) is a symmetric positive semidefinite matrix related to the sGS decomposition of \(\mathcal{Q}\) and \({{\varvec{x}}}^k\) is the previous iterate. By leveraging this connection to optimization, we extend the result (which we call the block sGS decomposition theorem) to convex composite QP (CCQP), which includes an additional, possibly nonsmooth, term in \(x_1\), i.e., \(\min \{ p(x_1) + \frac{1}{2}\langle {{\varvec{x}}},\,\mathcal{Q}{{\varvec{x}}}\rangle -\langle {{\varvec{b}}},\,{{\varvec{x}}}\rangle \}\), where \(p(\cdot )\) is a proper closed convex function. Based on the block sGS decomposition theorem, we extend the classical block sGS method to solve CCQP. In addition, our extended block sGS method allows for inexact computation in each step of the block sGS cycle. At the same time, the inexact block sGS method can be accelerated to achieve an iteration complexity of \(O(1/k^2)\) after performing k cycles. As a fundamental building block, the block sGS decomposition theorem has played a key role in various recently developed algorithms, such as the inexact semiproximal ALM/ADMM for linearly constrained multi-block convex composite conic programming (CCCP) and the accelerated block coordinate descent method for multi-block CCCP.
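The decomposition theorem described above can be checked numerically in the unconstrained, fully smooth case. The sketch below (an illustration under our own conventions, not the paper's notation: scalar blocks, a forward-then-backward sweep, and \(\mathcal{T} = \mathcal{U}^\top \mathcal{D}^{-1}\mathcal{U}\) with \(\mathcal{U}\) the strictly upper-triangular part and \(\mathcal{D}\) the diagonal of \(\mathcal{Q}\)) verifies that one sGS cycle from \({{\varvec{x}}}^k\) coincides with the minimizer of the QP plus the proximal term, i.e., the solution of \((\mathcal{Q}+\mathcal{T}){{\varvec{x}}}= {{\varvec{b}}}+ \mathcal{T}{{\varvec{x}}}^k\):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n))
Q = A @ A.T + n * np.eye(n)       # symmetric positive definite
b = rng.standard_normal(n)
x0 = rng.standard_normal(n)

# One symmetric Gauss-Seidel cycle: forward sweep, then backward sweep.
x = x0.copy()
for i in range(n):                # forward: solves (D + L) x = b - U x_old
    x[i] = (b[i] - Q[i, :] @ x + Q[i, i] * x[i]) / Q[i, i]
for i in reversed(range(n)):      # backward: solves (D + U) x = b - L x_mid
    x[i] = (b[i] - Q[i, :] @ x + Q[i, i] * x[i]) / Q[i, i]

# Proximal-QP characterization: the same point solves
#   min 1/2 <x, Qx> - <b, x> + 1/2 ||x - x0||_T^2,
# whose optimality condition is (Q + T) x = b + T x0.
D = np.diag(np.diag(Q))
U = np.triu(Q, 1)
T = U.T @ np.linalg.inv(D) @ U    # sGS proximal term for this sweep order
x_qp = np.linalg.solve(Q + T, b + T @ x0)

print(np.allclose(x, x_qp))      # the two points agree to machine precision
```

The identity follows because the sGS cycle applies the preconditioner \((\mathcal{D}+\mathcal{L})\mathcal{D}^{-1}(\mathcal{D}+\mathcal{U}) = \mathcal{Q} + \mathcal{T}\), so both computations solve the same linear system.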
References
Axelsson, O.: Iterative Solution Methods. Cambridge University Press, Cambridge (1994)
Bai, M.R., Zhang, X.J., Ni, G.Y., Cui, C.F.: An adaptive correction approach for tensor completion. SIAM J. Imaging Sci. 9, 1298–1323 (2016)
Bai, S., Qi, H.-D.: Tackling the flip ambiguity in wireless sensor network localization and beyond. Digital Signal Process. 55, 85–97 (2016)
Bank, R.E., Dupont, T.F., Yserentant, H.: The hierarchical basis multigrid method. Numerische Mathematik 52, 427–458 (1988)
Beck, A., Tetruashvili, L.: On the convergence of block coordinate descent type methods. SIAM J. Optim. 23, 2037–2060 (2013)
Bertsekas, D.P.: Nonlinear Programming, 2nd edn. Athena Scientific, Belmont (1995)
Bi, S., Pan, S., Sun, D. F.: Multi-stage convex relaxation approach to noisy structured low-rank matrix recovery, arXiv:1703.03898 (2017)
Ding, C., Qi, H.-D.: Convex optimization learning of faithful Euclidean distance representations in nonlinear dimensionality reduction. Math. Program. 164, 341–381 (2017)
Ding, C., Qi, H.-D.: Convex Euclidean distance embedding for collaborative position localization with NLOS mitigation. Comput. Optim. Appl. 66, 187–218 (2017)
Chen, L., Sun, D.F., Toh, K.-C.: An efficient inexact symmetric Gauss–Seidel based majorized ADMM for high-dimensional convex composite conic programming. Math. Program. 161, 237–270 (2017)
Fercoq, O., Richtárik, P.: Accelerated, parallel, and proximal coordinate descent. SIAM J. Optim. 25, 1997–2023 (2015)
Fercoq, O., Richtárik, P.: Optimization in high dimensions via accelerated, parallel, and proximal coordinate descent. SIAM Rev. 58, 739–771 (2016)
Ferreira, J. B., Khoo, Y., Singer, A.: Semidefinite programming approach for the quadratic assignment problem with a sparse graph, arXiv:1703.09339 (2017)
Freund, R.W.: Preconditioning of symmetric, but highly indefinite linear systems. In: Proceedings of the 15th IMACS World Congress on Scientific Computation, Modelling and Applied Mathematics, Berlin, Germany, pp. 551–556 (1997)
Greenbaum, A.: Iterative Methods for Solving Linear Systems. SIAM, Philadelphia (1997)
Grippo, L., Sciandrone, M.: On the convergence of the block nonlinear Gauss–Seidel method under convex constraints. Oper. Res. Lett. 26, 127–136 (2000)
Hackbusch, W.: Iterative Solutions of Large Sparse Systems of Equations. Springer, New York (1994)
Han, D., Sun, D. F., Zhang, L.: Linear rate convergence of the alternating direction method of multipliers for convex composite programming, Math. Oper. Res. (2017). https://doi.org/10.1287/moor.2017.0875
Jiang, K.F., Sun, D.F., Toh, K.-C.: An inexact accelerated proximal gradient method for large scale linearly constrained convex SDP. SIAM J. Optim. 22, 1042–1064 (2012)
Kristian, B., Sun, H.P.: Preconditioned Douglas–Rachford splitting methods for convex-concave saddle-point problems. SIAM J. Numer. Anal. 53, 421–444 (2015)
Kristian, B., Sun, H.P.: Preconditioned Douglas–Rachford algorithms for TV-and TGV-regularized variational imaging problems. J. Math. Imaging Vis. 52, 317–344 (2015)
Lam, X.Y., Marron, J.S., Sun, D.F., Toh, K.-C.: Fast algorithms for large scale extended distance weighted discrimination, arXiv:1604.05473. J. Comput. Graph. Stat. (2016, to appear)
Li, X.D., Sun, D.F., Toh, K.-C.: QSDPNAL: a two-phase augmented Lagrangian method for convex quadratic semidefinite programming, arXiv:1512.08872 (2015)
Li, X.D., Sun, D.F., Toh, K.-C.: A Schur complement based semi-proximal ADMM for convex quadratic conic programming and extensions. Math. Program. 155, 333–373 (2016)
Li, X.D.: A two-phase augmented Lagrangian method for convex composite quadratic programming, PhD thesis, Department of Mathematics, National University of Singapore (2015)
Luo, Z.-Q., Tseng, P.: On the linear convergence of descent methods for convex essentially smooth minimization. SIAM J. Control Optim. 30, 408–425 (1992)
Luo, Z.-Q., Tseng, P.: Error bounds and convergence analysis of feasible descent methods: a general approach. Ann. Oper. Res. 46, 157–178 (1993)
Nesterov, Y.: Efficiency of coordinate descent methods on huge-scale optimization problems, SIAM. J. Optim. 22, 341–362 (2012)
Nesterov, Y., Stich, S.U.: Efficiency of the accelerated coordinate descent method on structured optimization problems. SIAM J. Optim. 27, 110–123 (2017)
Ortega, J.M., Rheinboldt, W.C.: Iterative Solution of Nonlinear Equations in Several Variables. SIAM, Philadelphia (2000)
Richtárik, P., Takáč, M.: Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function. Math. Program. 144, 1–38 (2014)
Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)
Robinson, S.M.: Some continuity properties of polyhedral multifunctions. Math. Program. Study 14, 206–214 (1981)
Saad, Y.: Iterative Methods for Sparse Linear Systems. SIAM, Philadelphia (2003)
Schmidt, M., Le Roux, N., Bach, F.: Convergence rates of inexact proximal-gradient methods for convex optimization. In: Advances in Neural Information Processing Systems (NIPS), pp. 1458–1466 (2011)
Sun, D.F., Toh, K.-C., Yang, L.Q.: An efficient inexact ABCD method for least squares semidefinite programming. SIAM J. Optim. 26, 1072–1100 (2016)
Sun, D.F., Toh, K.-C., Yang, L.Q.: A convergent 3-block semi-proximal alternating direction method of multipliers for conic programming with 4-type constraints. SIAM J. Optim. 25, 882–915 (2015)
Sun, J.: On monotropic piecewise quadratic programming, PhD thesis, Department of Mathematics, University of Washington, Seattle (1986)
Tappenden, R., Richtárik, P., Gondzio, J.: Inexact coordinate descent: complexity and preconditioning. J. Optim. Theory Appl. 170, 144–176 (2016)
Tseng, P.: Convergence of a block coordinate descent method for nondifferentiable minimization. J. Optim. Theory Appl. 109, 475–494 (2001)
Tseng, P., Yun, S.: A coordinate gradient descent method for nonsmooth separable minimization. Math. Program. 125, 387–423 (2010)
Varga, R.S.: Matrix Iterative Analysis. Springer, Berlin (2009)
Wen, B., Chen, X., Pong, T.K.: Linear convergence of proximal gradient algorithm with extrapolation for a class of nonconvex nonsmooth minimization problems. SIAM J. Optim. 27, 124–145 (2017)
Xiao, L., Lu, Z.: On the complexity analysis of randomized block-coordinate descent methods. Math. Program. 152, 615–642 (2015)
Young, D.M.: On the accelerated SSOR method for solving large linear systems. Adv. Math. 23, 215–271 (1977)
Zhang, X., Xu, C., Zhang, Y., Zhu, T., Cheng, L.: Multivariate regression with grossly corrupted observations: a robust approach and its applications, arXiv:1701.02892 (2017)
Zhou, Z.R., So, A.M.-C.: A unified approach to error bounds for structured convex optimization problems. Math. Program. 165, 689–728 (2017)
Acknowledgements
The authors would like to thank the Associate Editor and anonymous referees for their helpful comments.
Additional information
Defeng Sun: On leave from Department of Mathematics, National University of Singapore.
Appendix: Proof of part (b) of Proposition 2
To begin the proof, we state the following lemma from [35].
Lemma 2
Suppose that \(\{u_k\}\) and \(\{\lambda _k\}\) are two sequences of nonnegative scalars, and \(\{ s_k\}\) is a nondecreasing sequence of scalars such that \(s_0\ge u_0^2\). Suppose that for all \(k\ge 1\), the inequality \( u_k^2 \le s_k + 2\sum _{i=1}^k \lambda _i u_i \) holds. Then for all \(k\ge 1\), \( u_k \le \bar{\lambda }_k + \sqrt{ s_k + \bar{\lambda }_k^2},\) where \(\bar{\lambda }_k = \sum _{i=1}^k \lambda _i\).
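The conclusion of Lemma 2 can be sanity-checked numerically. The sketch below (our own illustration, not part of the original proof) constructs the extremal sequence in which the hypothesis \(u_k^2 \le s_k + 2\sum_{i=1}^k \lambda_i u_i\) holds with equality, by solving the resulting quadratic in \(u_k\) at each step, and then confirms that the stated bound \(u_k \le \bar{\lambda }_k + \sqrt{s_k + \bar{\lambda }_k^2}\) is satisfied:

```python
import math
import random

random.seed(1)
K = 20
lam = [0.0] + [random.random() for _ in range(K)]  # lambda_1, ..., lambda_K >= 0
s = [1.0]
for _ in range(K):
    s.append(s[-1] + random.random())              # nondecreasing s_k
u = [math.sqrt(s[0])]                              # u_0 with s_0 >= u_0^2

# Extremal sequence: u_k^2 = s_k + 2 * sum_{i<=k} lam_i * u_i, obtained by
# taking the positive root of u_k^2 - 2*lam_k*u_k - c_k = 0.
for k in range(1, K + 1):
    c = s[k] + 2 * sum(lam[i] * u[i] for i in range(1, k))
    u.append(lam[k] + math.sqrt(c + lam[k] ** 2))

# Verify the lemma's conclusion for every k.
for k in range(1, K + 1):
    bar = sum(lam[1:k + 1])                        # bar_lambda_k
    assert u[k] <= bar + math.sqrt(s[k] + bar ** 2) + 1e-12
print("Lemma 2 bound verified")
```

For \(k=1\) the bound is attained with equality by this construction, which shows the estimate cannot be improved in general.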
Proof
In this proof, we let \(\Delta ^j = \Delta (\tilde{\varvec{\delta }}^j,\varvec{\delta }^j)\). Note that under the assumption that \(t_j=1\) for all \(j\ge 1\), \(\widetilde{{{\varvec{x}}}}^j = {{\varvec{x}}}^{j-1}\). Note also that from (30), we have that \(\Vert {\hat{\mathcal{Q}}}^{-1/2}\Delta ^j\Vert \le M \epsilon _j\), where M is given as in Proposition 2.
For \(j\ge 1\), from the optimality of \({{\varvec{x}}}^j\) in (29), one can show that for all \({{\varvec{x}}}\in \mathcal{X}\)
Let \({{\varvec{e}}}^j = {{\varvec{x}}}^j-{{\varvec{x}}}^*\). By setting \({{\varvec{x}}}= {{\varvec{x}}}^{j-1}\) and \({{\varvec{x}}}={{\varvec{x}}}^*\) in (41), we get
Multiplying (42) by \(j-1\) and combining it with (43), we get
where \(a_j = 2j [F({{\varvec{x}}}^j)-F({{\varvec{x}}}^*)]\) and \(b_j = \Vert {{\varvec{e}}}^j\Vert _{\hat{\mathcal{Q}}}\). Note that the last inequality follows from (43) with \(j=1\), the fact that the sequence \(\{\epsilon _i\}\) is non-increasing, and some simple manipulations.
To summarize, we have \(b_j^2 \le b_0^2 + 2 \sum _{i=1}^j 2M i\epsilon _i b_i \). By applying Lemma 2, we get
where \(\bar{\lambda }_j = \sum _{i=1}^j \lambda _i\) with \(\lambda _i = 2M i \epsilon _i\). Applying the above result to (44), we get
From here, the required result in Part (b) of Proposition 2 follows. \(\square \)
Li, X., Sun, D. & Toh, KC. A block symmetric Gauss–Seidel decomposition theorem for convex composite quadratic programming and its applications. Math. Program. 175, 395–418 (2019). https://doi.org/10.1007/s10107-018-1247-7
Keywords
- Convex composite quadratic programming
- Block symmetric Gauss–Seidel
- Schur complement
- Augmented Lagrangian method