A block symmetric Gauss–Seidel decomposition theorem for convex composite quadratic programming and its applications

Full Length Paper · Mathematical Programming, Series A

Abstract

For a symmetric positive semidefinite linear system of equations \(\mathcal{Q}{{\varvec{x}}}= {{\varvec{b}}}\), where \({{\varvec{x}}}= (x_1,\ldots ,x_s)\) is partitioned into s blocks with \(s \ge 2\), we show that each cycle of the classical block symmetric Gauss–Seidel (sGS) method exactly solves the associated quadratic programming (QP) problem augmented with an extra proximal term of the form \(\frac{1}{2}\Vert {{\varvec{x}}}-{{\varvec{x}}}^k\Vert _\mathcal{T}^2\), where \(\mathcal{T}\) is a symmetric positive semidefinite matrix related to the sGS decomposition of \(\mathcal{Q}\) and \({{\varvec{x}}}^k\) is the previous iterate. By leveraging this connection to optimization, we extend the result (which we call the block sGS decomposition theorem) to convex composite QP (CCQP), which has an additional, possibly nonsmooth, term in \(x_1\), i.e., \(\min \{ p(x_1) + \frac{1}{2}\langle {{\varvec{x}}},\,\mathcal{Q}{{\varvec{x}}}\rangle -\langle {{\varvec{b}}},\,{{\varvec{x}}}\rangle \}\), where \(p(\cdot )\) is a proper closed convex function. Based on the block sGS decomposition theorem, we extend the classical block sGS method to solve CCQP. Moreover, our extended block sGS method allows inexact computation in each step of the block sGS cycle. The inexact block sGS method can also be accelerated to achieve an iteration complexity of \(O(1/k^2)\) after performing k cycles. As a fundamental building block, the block sGS decomposition theorem has played a key role in various recently developed algorithms, such as the inexact semiproximal ALM/ADMM for linearly constrained multi-block convex composite conic programming (CCCP) and the accelerated block coordinate descent method for multi-block CCCP.
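
To make the stated equivalence concrete, the following is a minimal numerical sketch (ours, not the authors' code; the two-block partition, random data, and variable names are assumptions for illustration). Writing \(\mathcal{Q} = \mathcal{D} + \mathcal{U} + \mathcal{U}^{*}\) with \(\mathcal{D}\) the block diagonal part and \(\mathcal{U}\) the strictly block upper triangular part, the proximal matrix from the sGS decomposition is \(\mathcal{T} = \mathcal{U}\mathcal{D}^{-1}\mathcal{U}^{*}\), and one sGS cycle started from \({{\varvec{x}}}^k\) can be written compactly as \({{\varvec{x}}}^{k+1} = {{\varvec{x}}}^k + \big((\mathcal{D}+\mathcal{U})\mathcal{D}^{-1}(\mathcal{D}+\mathcal{U}^{*})\big)^{-1}({{\varvec{b}}} - \mathcal{Q}{{\varvec{x}}}^k)\). The script checks that this update coincides with the minimizer of the QP plus the proximal term:

```python
import numpy as np

# Minimal sketch (assumed two-block partition, random SPD data): one block sGS
# sweep on Q x = b, written via the sGS matrix M = (D+U) D^{-1} (D+U)^T, equals
# the minimizer of  0.5 <x,Qx> - <b,x> + 0.5 ||x - x_k||_T^2  with T = U D^{-1} U^T.
rng = np.random.default_rng(0)
n, blocks = 6, [slice(0, 3), slice(3, 6)]       # two blocks of size 3
A = rng.standard_normal((n, n))
Q = A @ A.T + np.eye(n)                         # symmetric positive definite
b = rng.standard_normal(n)
x_k = rng.standard_normal(n)

D = np.zeros_like(Q)                            # block diagonal part of Q
U = np.zeros_like(Q)                            # strictly block upper triangular part of Q
for i, bi in enumerate(blocks):
    D[bi, bi] = Q[bi, bi]
    for bj in blocks[i + 1:]:
        U[bi, bj] = Q[bi, bj]

M = (D + U) @ np.linalg.solve(D, (D + U).T)     # sGS matrix, equal to Q + T
x_sgs = x_k + np.linalg.solve(M, b - Q @ x_k)   # one symmetric Gauss-Seidel cycle

T = U @ np.linalg.solve(D, U.T)                 # proximal matrix from the sGS decomposition
x_prox = np.linalg.solve(Q + T, b + T @ x_k)    # minimizer of the proximally perturbed QP

print(np.allclose(x_sgs, x_prox))               # True: the two updates coincide
```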

References

  1. Axelsson, O.: Iterative Solution Methods. Cambridge University Press, Cambridge (1994)

  2. Bai, M.R., Zhang, X.J., Ni, G.Y., Cui, C.F.: An adaptive correction approach for tensor completion. SIAM J. Imaging Sci. 9, 1298–1323 (2016)

  3. Bai, S., Qi, H.-D.: Tackling the flip ambiguity in wireless sensor network localization and beyond. Digital Signal Process. 55, 85–97 (2016)

  4. Bank, R.E., Dupont, T.F., Yserentant, H.: The hierarchical basis multigrid method. Numerische Mathematik 52, 427–458 (1988)

  5. Beck, A., Tetruashvili, L.: On the convergence of block coordinate descent type methods. SIAM J. Optim. 23, 2037–2060 (2013)

  6. Bertsekas, D.P.: Nonlinear Programming, 2nd edn. Athena Scientific, Belmont (1995)

  7. Bi, S., Pan, S., Sun, D.F.: Multi-stage convex relaxation approach to noisy structured low-rank matrix recovery, arXiv:1703.03898 (2017)

  8. Ding, C., Qi, H.-D.: Convex optimization learning of faithful Euclidean distance representations in nonlinear dimensionality reduction. Math. Program. 164, 341–381 (2017)

  9. Ding, C., Qi, H.-D.: Convex Euclidean distance embedding for collaborative position localization with NLOS mitigation. Comput. Optim. Appl. 66, 187–218 (2017)

  10. Chen, L., Sun, D.F., Toh, K.-C.: An efficient inexact symmetric Gauss–Seidel based majorized ADMM for high-dimensional convex composite conic programming. Math. Program. 161, 237–270 (2017)

  11. Fercoq, O., Richtárik, P.: Accelerated, parallel, and proximal coordinate descent. SIAM J. Optim. 25, 1997–2023 (2015)

  12. Fercoq, O., Richtárik, P.: Optimization in high dimensions via accelerated, parallel, and proximal coordinate descent. SIAM Rev. 58, 739–771 (2016)

  13. Ferreira, J.B., Khoo, Y., Singer, A.: Semidefinite programming approach for the quadratic assignment problem with a sparse graph, arXiv:1703.09339 (2017)

  14. Freund, R.W.: Preconditioning of symmetric, but highly indefinite linear systems. In: Proceedings of the 15th IMACS World Congress on Scientific Computation, Modelling and Applied Mathematics, Berlin, Germany, pp. 551–556 (1997)

  15. Greenbaum, A.: Iterative Methods for Solving Linear Systems. SIAM, Philadelphia (1997)

  16. Grippo, L., Sciandrone, M.: On the convergence of the block nonlinear Gauss–Seidel method under convex constraints. Oper. Res. Lett. 26, 127–136 (2000)

  17. Hackbusch, W.: Iterative Solution of Large Sparse Systems of Equations. Springer, New York (1994)

  18. Han, D., Sun, D.F., Zhang, L.: Linear rate convergence of the alternating direction method of multipliers for convex composite programming. Math. Oper. Res. (2017). https://doi.org/10.1287/moor.2017.0875

  19. Jiang, K.F., Sun, D.F., Toh, K.-C.: An inexact accelerated proximal gradient method for large scale linearly constrained convex SDP. SIAM J. Optim. 22, 1042–1064 (2012)

  20. Bredies, K., Sun, H.P.: Preconditioned Douglas–Rachford splitting methods for convex-concave saddle-point problems. SIAM J. Numer. Anal. 53, 421–444 (2015)

  21. Bredies, K., Sun, H.P.: Preconditioned Douglas–Rachford algorithms for TV- and TGV-regularized variational imaging problems. J. Math. Imaging Vis. 52, 317–344 (2015)

  22. Lam, X.Y., Marron, J.S., Sun, D.F., Toh, K.-C.: Fast algorithms for large scale extended distance weighted discrimination, arXiv:1604.05473. Journal of Computational and Graphical Statistics (2016, to appear)

  23. Li, X.D., Sun, D.F., Toh, K.-C.: QSDPNAL: a two-phase augmented Lagrangian method for convex quadratic semidefinite programming, arXiv:1512.08872 (2015)

  24. Li, X.D., Sun, D.F., Toh, K.-C.: A Schur complement based semi-proximal ADMM for convex quadratic conic programming and extensions. Math. Program. 155, 333–373 (2016)

  25. Li, X.D.: A two-phase augmented Lagrangian method for convex composite quadratic programming, PhD thesis, Department of Mathematics, National University of Singapore (2015)

  26. Luo, Z.-Q., Tseng, P.: On the linear convergence of descent methods for convex essentially smooth minimization. SIAM J. Control Optim. 30, 408–425 (1992)

  27. Luo, Z.-Q., Tseng, P.: Error bounds and convergence analysis of feasible descent methods: a general approach. Ann. Oper. Res. 46, 157–178 (1993)

  28. Nesterov, Y.: Efficiency of coordinate descent methods on huge-scale optimization problems. SIAM J. Optim. 22, 341–362 (2012)

  29. Nesterov, Y., Stich, S.U.: Efficiency of the accelerated coordinate descent method on structured optimization problems. SIAM J. Optim. 27, 110–123 (2017)

  30. Ortega, J.M., Rheinboldt, W.C.: Iterative Solution of Nonlinear Equations in Several Variables. SIAM, Philadelphia (2000)

  31. Richtárik, P., Takáč, M.: Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function. Math. Program. 144, 1–38 (2014)

  32. Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)

  33. Robinson, S.M.: Some continuity properties of polyhedral multifunctions. Math. Program. Study 14, 206–214 (1981)

  34. Saad, Y.: Iterative Methods for Sparse Linear Systems. SIAM, Philadelphia (2003)

  35. Schmidt, M., Le Roux, N., Bach, F.: Convergence rates of inexact proximal-gradient methods for convex optimization. In: Advances in Neural Information Processing Systems (NIPS), pp. 1458–1466 (2011)

  36. Sun, D.F., Toh, K.-C., Yang, L.Q.: An efficient inexact ABCD method for least squares semidefinite programming. SIAM J. Optim. 26, 1072–1100 (2016)

  37. Sun, D.F., Toh, K.-C., Yang, L.Q.: A convergent 3-block semi-proximal alternating direction method of multipliers for conic programming with 4-type constraints. SIAM J. Optim. 25, 882–915 (2015)

  38. Sun, J.: On monotropic piecewise quadratic programming. PhD thesis, Department of Mathematics, University of Washington, Seattle (1986)

  39. Tappenden, R., Richtárik, P., Gondzio, J.: Inexact coordinate descent: complexity and preconditioning. J. Optim. Theory Appl. 170, 144–176 (2016)

  40. Tseng, P.: Convergence of a block coordinate descent method for nondifferentiable minimization. J. Optim. Theory Appl. 109, 475–494 (2001)

  41. Tseng, P., Yun, S.: A coordinate gradient descent method for nonsmooth separable minimization. Math. Program. 125, 387–423 (2010)

  42. Varga, R.S.: Matrix Iterative Analysis. Springer, Berlin (2009)

  43. Wen, B., Chen, X., Pong, T.K.: Linear convergence of proximal gradient algorithm with extrapolation for a class of nonconvex nonsmooth minimization problems. SIAM J. Optim. 27, 124–145 (2017)

  44. Xiao, L., Lu, Z.: On the complexity analysis of randomized block-coordinate descent methods. Math. Program. 152, 615–642 (2015)

  45. Young, D.M.: On the accelerated SSOR method for solving large linear systems. Adv. Math. 23, 215–271 (1977)

  46. Zhang, X., Xu, C., Zhang, Y., Zhu, T., Cheng, L.: Multivariate regression with grossly corrupted observations: a robust approach and its applications, arXiv:1701.02892 (2017)

  47. Zhou, Z.R., So, A.M.-C.: A unified approach to error bounds for structured convex optimization problems. Math. Program. 165, 689–728 (2017)

Acknowledgements

The authors would like to thank the Associate Editor and anonymous referees for their helpful comments.

Author information

Corresponding author

Correspondence to Xudong Li.

Additional information

Defeng Sun: On leave from Department of Mathematics, National University of Singapore.

Appendix: Proof of part (b) of Proposition 2

To begin the proof, we state the following lemma from [35].

Lemma 2

Suppose that \(\{u_k\}\) and \(\{\lambda _k\}\) are two sequences of nonnegative scalars, and \(\{ s_k\}\) is a nondecreasing sequence of scalars such that \(s_0\ge u_0^2\). Suppose that for all \(k\ge 1\), the inequality \( u_k^2 \le s_k + 2\sum _{i=1}^k \lambda _i u_i \) holds. Then for all \(k\ge 1\), \( u_k \le \bar{\lambda }_k + \sqrt{ s_k + \bar{\lambda }_k^2},\) where \(\bar{\lambda }_k = \sum _{i=1}^k \lambda _i\).
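
As a quick numerical sanity check (synthetic data, not part of the original text), the following script constructs sequences that satisfy the hypothesis of Lemma 2 with equality at every step and then verifies the stated bound:

```python
import numpy as np

# Sanity check of Lemma 2 on synthetic data (illustrative only): choose u_k as large
# as the hypothesis  u_k^2 <= s_k + 2*sum_{i<=k} lambda_i*u_i  allows, then verify
# that  u_k <= bar_lambda_k + sqrt(s_k + bar_lambda_k^2)  holds for all k.
rng = np.random.default_rng(1)
K = 200
lam = np.concatenate(([0.0], rng.uniform(0.0, 0.5, K)))    # lambda_1, ..., lambda_K
s = np.cumsum(rng.uniform(0.0, 1.0, K + 1))                # nondecreasing s_0, ..., s_K
u = np.zeros(K + 1)
u[0] = np.sqrt(s[0])                                       # ensures s_0 >= u_0^2
for k in range(1, K + 1):
    c = s[k] + 2.0 * np.dot(lam[1:k], u[1:k])              # part of the bound not involving u_k
    u[k] = lam[k] + np.sqrt(lam[k] ** 2 + c)               # largest u_k allowed by the hypothesis
bar_lam = np.cumsum(lam)                                   # bar_lambda_k = sum_{i<=k} lambda_i
print(np.all(u <= bar_lam + np.sqrt(s + bar_lam ** 2) + 1e-12))   # True
```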

Proof

In this proof, we let \(\Delta ^j = \Delta (\tilde{\varvec{\delta }}^j,\varvec{\delta }^j)\). Note that under the assumption that \(t_j=1\) for all \(j\ge 1\), we have \(\widetilde{{{\varvec{x}}}}^j = {{\varvec{x}}}^{j-1}\). Note also that, by (30), \(\Vert {\hat{\mathcal{Q}}}^{-1/2}\Delta ^j\Vert \le M \epsilon _j\), where M is as given in Proposition 2.

For \(j\ge 1\), from the optimality of \({{\varvec{x}}}^j\) in (29), one can show that for all \({{\varvec{x}}}\in \mathcal{X}\)

$$\begin{aligned} F({{\varvec{x}}}) - F({{\varvec{x}}}^j)\ge & {} \frac{1}{2}\Vert {{\varvec{x}}}- {{\varvec{x}}}^j\Vert ^2_{\mathcal{Q}} - \langle \mathcal{T}_{\mathcal{Q}}({{\varvec{x}}}^j - {{\varvec{x}}}^{j-1}),\,{{\varvec{x}}}- {{\varvec{x}}}^j\rangle + \langle \Delta ^j,\,{{\varvec{x}}}- {{\varvec{x}}}^j\rangle \nonumber \\\ge & {} \frac{1}{2}\Vert {{\varvec{x}}}- {{\varvec{x}}}^j\Vert ^2_{\mathcal{Q}} + \frac{1}{2}\Vert {{\varvec{x}}}- {{\varvec{x}}}^j\Vert ^2_{\mathcal{T}_{\mathcal{Q}}} - \frac{1}{2}\Vert {{\varvec{x}}}- {{\varvec{x}}}^{j-1}\Vert ^2_{\mathcal{T}_{\mathcal{Q}}} + \langle \Delta ^j,\,{{\varvec{x}}}- {{\varvec{x}}}^j\rangle \nonumber \\\ge & {} \frac{1}{2}\Vert {{\varvec{x}}}- {{\varvec{x}}}^j\Vert ^2_{{\hat{\mathcal{Q}}}} - \frac{1}{2}\Vert {{\varvec{x}}}- {{\varvec{x}}}^{j-1}\Vert ^2_{{\hat{\mathcal{Q}}}} + \langle \Delta ^j,\,{{\varvec{x}}}- {{\varvec{x}}}^j\rangle \nonumber \\= & {} \frac{1}{2}\Vert {{\varvec{x}}}^j - {{\varvec{x}}}^{j-1}\Vert _{{\hat{\mathcal{Q}}}}^2 + \langle {{\varvec{x}}}^{j-1}-{{\varvec{x}}},\,{\hat{\mathcal{Q}}}({{\varvec{x}}}^j-{{\varvec{x}}}^{j-1})\rangle + \langle \Delta ^j,\,{{\varvec{x}}}-{{\varvec{x}}}^j\rangle .\nonumber \\ \end{aligned}$$
(41)

Let \({{\varvec{e}}}^j = {{\varvec{x}}}^j-{{\varvec{x}}}^*\). By setting \({{\varvec{x}}}= {{\varvec{x}}}^{j-1}\) and \({{\varvec{x}}}={{\varvec{x}}}^*\) in (41), we get

$$\begin{aligned} F({{\varvec{x}}}^{j-1}) - F({{\varvec{x}}}^j)\ge & {} \frac{1}{2} \Vert {{\varvec{e}}}^j-{{\varvec{e}}}^{j-1}\Vert _{\hat{\mathcal{Q}}}^2 + \langle \Delta ^j,\,{{\varvec{e}}}^{j-1}-{{\varvec{e}}}^j\rangle , \end{aligned}$$
(42)
$$\begin{aligned} F({{\varvec{x}}}^*) - F({{\varvec{x}}}^j)\ge & {} \frac{1}{2} \Vert {{\varvec{e}}}^j\Vert _{\hat{\mathcal{Q}}}^2 -\frac{1}{2} \Vert {{\varvec{e}}}^{j-1}\Vert _{\hat{\mathcal{Q}}}^2 - \langle \Delta ^j,\,{{\varvec{e}}}^j\rangle . \end{aligned}$$
(43)

Multiplying (42) by \(j-1\), adding (43), and then multiplying the resulting inequality by 2, we get

$$\begin{aligned} (a_j + b_j^2)\le & {} (a_{j-1}+b_{j-1}^2) - (j-1)\Vert {{\varvec{e}}}^j-{{\varvec{e}}}^{j-1}\Vert _{\hat{\mathcal{Q}}}^2 + 2\langle \Delta ^j,\, j {{\varvec{e}}}^j-(j-1){{\varvec{e}}}^{j-1} \rangle \nonumber \\\le & {} (a_{j-1}+b_{j-1}^2) + 2 \Vert {\hat{\mathcal{Q}}}^{-1/2}\Delta ^j\Vert \Vert j {{\varvec{e}}}^j-(j-1){{\varvec{e}}}^{j-1} \Vert _{\hat{\mathcal{Q}}}\nonumber \\\le & {} (a_{j-1}+b_{j-1}^2) + 2\Vert {\hat{\mathcal{Q}}}^{-1/2}\Delta ^j\Vert (j b_j + (j-1)b_{j-1}) \nonumber \\\le & {} \cdots \nonumber \\\le & {} a_1 + b_1^2 + 2 \sum _{i=2}^j M\epsilon _i (i b_i +(i-1)b_{i-1}) \nonumber \\\le & {} { b_0^2 + 2 \sum _{i=1}^j 2Mi \epsilon _i b_i }, \end{aligned}$$
(44)

where \(a_j = 2j [F({{\varvec{x}}}^j)-F({{\varvec{x}}}^*)]\) and \(b_j = \Vert {{\varvec{e}}}^j\Vert _{\hat{\mathcal{Q}}}\). Note that the last inequality follows from (43) with \(j=1\), the fact that the sequence \(\{\epsilon _i\}\) is non-increasing, and some simple manipulations.

To summarize, we have \(b_j^2 \le b_0^2 + 2 \sum _{i=1}^j 2M i\epsilon _i b_i \). By applying Lemma 2, we get

$$\begin{aligned} b_j \;\le \; \bar{\lambda }_j + \sqrt{ b_0^2 + \bar{\lambda }_j^2} \;\le \; b_0 + 2\bar{\lambda }_j, \end{aligned}$$

where \(\bar{\lambda }_j = \sum _{i=1}^j \lambda _i\) with \(\lambda _i = 2M i \epsilon _i\). Applying the above result to (44), we get

$$\begin{aligned} a_j \;\le \; b_0^2 + 2 \sum _{i=1}^j \lambda _i (2\bar{\lambda }_i + b_0) \; \le \; (b_0 + 2\bar{\lambda }_j)^2. \end{aligned}$$

Since \(a_j = 2j[F({{\varvec{x}}}^j)-F({{\varvec{x}}}^*)]\), this yields \(F({{\varvec{x}}}^j)-F({{\varvec{x}}}^*) \le \frac{1}{2j}(b_0 + 2\bar{\lambda }_j)^2\), from which the required result in part (b) of Proposition 2 follows. \(\square \)

About this article

Cite this article

Li, X., Sun, D. & Toh, KC. A block symmetric Gauss–Seidel decomposition theorem for convex composite quadratic programming and its applications. Math. Program. 175, 395–418 (2019). https://doi.org/10.1007/s10107-018-1247-7
