Abstract
For a symmetric positive semidefinite linear system of equations \(\mathcal{Q}{{\varvec{x}}}= {{\varvec{b}}}\), where \({{\varvec{x}}}= (x_1,\ldots ,x_s)\) is partitioned into s blocks with \(s \ge 2\), we show that each cycle of the classical block symmetric Gauss–Seidel (sGS) method exactly solves the associated quadratic programming (QP) problem augmented with an extra proximal term of the form \(\frac{1}{2}\Vert {{\varvec{x}}}-{{\varvec{x}}}^k\Vert _\mathcal{T}^2\), where \(\mathcal{T}\) is a symmetric positive semidefinite matrix related to the sGS decomposition of \(\mathcal{Q}\) and \({{\varvec{x}}}^k\) is the previous iterate. By leveraging this connection to optimization, we extend the result (which we call the block sGS decomposition theorem) to convex composite QP (CCQP), which includes an additional, possibly nonsmooth, term in \(x_1\), i.e., \(\min \{ p(x_1) + \frac{1}{2}\langle {{\varvec{x}}},\,\mathcal{Q}{{\varvec{x}}}\rangle -\langle {{\varvec{b}}},\,{{\varvec{x}}}\rangle \}\), where \(p(\cdot )\) is a proper closed convex function. Based on the block sGS decomposition theorem, we extend the classical block sGS method to solve CCQP. In addition, our extended block sGS method allows for inexact computation in each step of the block sGS cycle. At the same time, the inexact block sGS method can be accelerated to achieve an iteration complexity of \(O(1/k^2)\) after performing k cycles. As a fundamental building block, the block sGS decomposition theorem has played a key role in various recently developed algorithms, such as the inexact semiproximal ALM/ADMM for linearly constrained multi-block convex composite conic programming (CCCP) and the accelerated block coordinate descent method for multi-block CCCP.
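The decomposition theorem described above can be checked numerically in the unconstrained, fully smooth case. The sketch below (an illustration under our own conventions, not the paper's notation: scalar blocks, a forward-then-backward sweep, and \(\mathcal{T} = \mathcal{U}^\top \mathcal{D}^{-1}\mathcal{U}\) with \(\mathcal{U}\) the strictly upper-triangular part and \(\mathcal{D}\) the diagonal of \(\mathcal{Q}\)) verifies that one sGS cycle from \({{\varvec{x}}}^k\) coincides with the minimizer of the QP plus the proximal term, i.e., the solution of \((\mathcal{Q}+\mathcal{T}){{\varvec{x}}}= {{\varvec{b}}}+ \mathcal{T}{{\varvec{x}}}^k\):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
A = rng.standard_normal((n, n))
Q = A @ A.T + n * np.eye(n)       # symmetric positive definite
b = rng.standard_normal(n)
x0 = rng.standard_normal(n)

# One symmetric Gauss-Seidel cycle: forward sweep, then backward sweep.
x = x0.copy()
for i in range(n):                # forward: solves (D + L) x = b - U x_old
    x[i] = (b[i] - Q[i, :] @ x + Q[i, i] * x[i]) / Q[i, i]
for i in reversed(range(n)):      # backward: solves (D + U) x = b - L x_mid
    x[i] = (b[i] - Q[i, :] @ x + Q[i, i] * x[i]) / Q[i, i]

# Proximal-QP characterization: the same point solves
#   min 1/2 <x, Qx> - <b, x> + 1/2 ||x - x0||_T^2,
# whose optimality condition is (Q + T) x = b + T x0.
D = np.diag(np.diag(Q))
U = np.triu(Q, 1)
T = U.T @ np.linalg.inv(D) @ U    # sGS proximal term for this sweep order
x_qp = np.linalg.solve(Q + T, b + T @ x0)

print(np.allclose(x, x_qp))      # the two points agree to machine precision
```

The identity follows because the sGS cycle applies the preconditioner \((\mathcal{D}+\mathcal{L})\mathcal{D}^{-1}(\mathcal{D}+\mathcal{U}) = \mathcal{Q} + \mathcal{T}\), so both computations solve the same linear system.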
References
Axelsson, O.: Iterative Solution Methods. Cambridge University Press, Cambridge (1994)
Bai, M.R., Zhang, X.J., Ni, G.Y., Cui, C.F.: An adaptive correction approach for tensor completion. SIAM J. Imaging Sci. 9, 1298–1323 (2016)
Bai, S., Qi, H.-D.: Tackling the flip ambiguity in wireless sensor network localization and beyond. Digital Signal Process. 55, 85–97 (2016)
Bank, R.E., Dupont, T.F., Yserentant, H.: The hierarchical basis multigrid method. Numerische Mathematik 52, 427–458 (1988)
Beck, A., Tetruashvili, L.: On the convergence of block coordinate descent type methods. SIAM J. Optim. 23, 2037–2060 (2013)
Bertsekas, D.P.: Nonlinear Programming, 2nd edn. Athena Scientific, Belmont (1995)
Bi, S., Pan, S., Sun, D. F.: Multi-stage convex relaxation approach to noisy structured low-rank matrix recovery, arXiv:1703.03898 (2017)
Ding, C., Qi, H.-D.: Convex optimization learning of faithful Euclidean distance representations in nonlinear dimensionality reduction. Math. Program. 164, 341–381 (2017)
Ding, C., Qi, H.-D.: Convex Euclidean distance embedding for collaborative position localization with NLOS mitigation. Comput. Optim. Appl. 66, 187–218 (2017)
Chen, L., Sun, D.F., Toh, K.-C.: An efficient inexact symmetric Gauss–Seidel based majorized ADMM for high-dimensional convex composite conic programming. Math. Program. 161, 237–270 (2017)
Fercoq, O., Richtárik, P.: Accelerated, parallel, and proximal coordinate descent. SIAM J. Optim. 25, 1997–2023 (2015)
Fercoq, O., Richtárik, P.: Optimization in high dimensions via accelerated, parallel, and proximal coordinate descent. SIAM Rev. 58, 739–771 (2016)
Ferreira, J. B., Khoo, Y., Singer, A.: Semidefinite programming approach for the quadratic assignment problem with a sparse graph, arXiv:1703.09339 (2017)
Freund, R.W.: Preconditioning of symmetric, but highly indefinite linear systems. In: Proceedings of the 15th IMACS World Congress on Scientific Computation, Modelling and Applied Mathematics, Berlin, Germany, pp. 551–556 (1997)
Greenbaum, A.: Iterative Methods for Solving Linear Systems. SIAM, Philadelphia (1997)
Grippo, L., Sciandrone, M.: On the convergence of the block nonlinear Gauss–Seidel method under convex constraints. Oper. Res. Lett. 26, 127–136 (2000)
Hackbusch, W.: Iterative Solutions of Large Sparse Systems of Equations. Springer, New York (1994)
Han, D., Sun, D. F., Zhang, L.: Linear rate convergence of the alternating direction method of multipliers for convex composite programming, Math. Oper. Res. (2017). https://doi.org/10.1287/moor.2017.0875
Jiang, K.F., Sun, D.F., Toh, K.-C.: An inexact accelerated proximal gradient method for large scale linearly constrained convex SDP. SIAM J. Optim. 22, 1042–1064 (2012)
Kristian, B., Sun, H.P.: Preconditioned Douglas–Rachford splitting methods for convex-concave saddle-point problems. SIAM J. Numer. Anal. 53, 421–444 (2015)
Kristian, B., Sun, H.P.: Preconditioned Douglas–Rachford algorithms for TV-and TGV-regularized variational imaging problems. J. Math. Imaging Vis. 52, 317–344 (2015)
Lam, X.Y., Marron, J.S., Sun, D.F., Toh, K.-C.: Fast algorithms for large scale extended distance weighted discrimination, arXiv:1604.05473. J. Comput. Graph. Stat. (2016, to appear)
Li, X.D., Sun, D.F., Toh, K.-C.: QSDPNAL: a two-phase augmented Lagrangian method for convex quadratic semidefinite programming, arXiv:1512.08872 (2015)
Li, X.D., Sun, D.F., Toh, K.-C.: A Schur complement based semi-proximal ADMM for convex quadratic conic programming and extensions. Math. Program. 155, 333–373 (2016)
Li, X.D.: A two-phase augmented Lagrangian method for convex composite quadratic programming, PhD thesis, Department of Mathematics, National University of Singapore (2015)
Luo, Z.-Q., Tseng, P.: On the linear convergence of descent methods for convex essentially smooth minimization. SIAM J. Control Optim. 30, 408–425 (1992)
Luo, Z.-Q., Tseng, P.: Error bounds and convergence analysis of feasible descent methods: a general approach. Ann. Oper. Res. 46, 157–178 (1993)
Nesterov, Y.: Efficiency of coordinate descent methods on huge-scale optimization problems, SIAM. J. Optim. 22, 341–362 (2012)
Nesterov, Y., Stich, S.U.: Efficiency of the accelerated coordinate descent method on structured optimization problems. SIAM J. Optim. 27, 110–123 (2017)
Ortega, J.M., Rheinboldt, W.C.: Iterative Solution of Nonlinear Equations in Several Variables. SIAM, Philadelphia (2000)
Richtárik, P., Takáč, M.: Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function. Math. Program. 144, 1–38 (2014)
Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)
Robinson, S.M.: Some continuity properties of polyhedral multifunctions. Math. Program. Study 14, 206–214 (1981)
Saad, Y.: Iterative Methods for Sparse Linear Systems. SIAM, Philadelphia (2003)
Schmidt, M., Le Roux, N., Bach, F.: Convergence rates of inexact proximal-gradient methods for convex optimization. In: Advances in Neural Information Processing Systems (NIPS), pp. 1458–1466 (2011)
Sun, D.F., Toh, K.-C., Yang, L.Q.: An efficient inexact ABCD method for least squares semidefinite programming. SIAM J. Optim. 26, 1072–1100 (2016)
Sun, D.F., Toh, K.-C., Yang, L.Q.: A convergent 3-block semi-proximal alternating direction method of multipliers for conic programming with 4-type constraints. SIAM J. Optim. 25, 882–915 (2015)
Sun, J.: On monotropic piecewise quadratic programming, PhD thesis, Department of Mathematics, University of Washington, Seattle (1986)
Tappenden, R., Richtárik, P., Gondzio, J.: Inexact coordinate descent: complexity and preconditioning. J. Optim. Theory Appl. 170, 144–176 (2016)
Tseng, P.: Convergence of a block coordinate descent method for nondifferentiable minimization. J. Optim. Theory Appl. 109, 475–494 (2001)
Tseng, P., Yun, S.: A coordinate gradient descent method for nonsmooth separable minimization. Math. Program. 125, 387–423 (2010)
Varga, R.S.: Matrix Iterative Analysis. Springer, Berlin (2009)
Wen, B., Chen, X., Pong, T.K.: Linear convergence of proximal gradient algorithm with extrapolation for a class of nonconvex nonsmooth minimization problems. SIAM J. Optim. 27, 124–145 (2017)
Xiao, L., Lu, Z.: On the complexity analysis of randomized block-coordinate descent methods. Math. Program. 152, 615–642 (2015)
Young, D.M.: On the accelerated SSOR method for solving large linear systems. Adv. Math. 23, 215–271 (1977)
Zhang, X., Xu, C., Zhang, Y., Zhu, T., Cheng, L.: Multivariate regression with grossly corrupted observations: a robust approach and its applications, arXiv:1701.02892 (2017)
Zhou, Z.R., So, A.M.-C.: A unified approach to error bounds for structured convex optimization problems. Math. Program. 165, 689–728 (2017)
Acknowledgements
The authors would like to thank the Associate Editor and anonymous referees for their helpful comments.
Additional information
Defeng Sun: On leave from Department of Mathematics, National University of Singapore.
Appendix: Proof of part (b) of Proposition 2
To begin the proof, we state the following lemma from [35].
Lemma 2
Suppose that \(\{u_k\}\) and \(\{\lambda _k\}\) are two sequences of nonnegative scalars, and \(\{ s_k\}\) is a nondecreasing sequence of scalars such that \(s_0\ge u_0^2\). Suppose that for all \(k\ge 1\), the inequality \( u_k^2 \le s_k + 2\sum _{i=1}^k \lambda _i u_i \) holds. Then for all \(k\ge 1\), \( u_k \le \bar{\lambda }_k + \sqrt{ s_k + \bar{\lambda }_k^2},\) where \(\bar{\lambda }_k = \sum _{i=1}^k \lambda _i\).
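The conclusion of Lemma 2 can be sanity-checked numerically. The sketch below (our own illustration, not part of the original proof) constructs the extremal sequence in which the hypothesis \(u_k^2 \le s_k + 2\sum_{i=1}^k \lambda_i u_i\) holds with equality, by solving the resulting quadratic in \(u_k\) at each step, and then confirms that the stated bound \(u_k \le \bar{\lambda }_k + \sqrt{s_k + \bar{\lambda }_k^2}\) is satisfied:

```python
import math
import random

random.seed(1)
K = 20
lam = [0.0] + [random.random() for _ in range(K)]  # lambda_1, ..., lambda_K >= 0
s = [1.0]
for _ in range(K):
    s.append(s[-1] + random.random())              # nondecreasing s_k
u = [math.sqrt(s[0])]                              # u_0 with s_0 >= u_0^2

# Extremal sequence: u_k^2 = s_k + 2 * sum_{i<=k} lam_i * u_i, obtained by
# taking the positive root of u_k^2 - 2*lam_k*u_k - c_k = 0.
for k in range(1, K + 1):
    c = s[k] + 2 * sum(lam[i] * u[i] for i in range(1, k))
    u.append(lam[k] + math.sqrt(c + lam[k] ** 2))

# Verify the lemma's conclusion for every k.
for k in range(1, K + 1):
    bar = sum(lam[1:k + 1])                        # bar_lambda_k
    assert u[k] <= bar + math.sqrt(s[k] + bar ** 2) + 1e-12
print("Lemma 2 bound verified")
```

For \(k=1\) the bound is attained with equality by this construction, which shows the estimate cannot be improved in general.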
Proof
In this proof, we let \(\Delta ^j = \Delta (\tilde{\varvec{\delta }}^j,\varvec{\delta }^j)\). Note that under the assumption that \(t_j=1\) for all \(j\ge 1\), \(\widetilde{{{\varvec{x}}}}^j = {{\varvec{x}}}^{j-1}\). Note also that from (30), we have that \(\Vert {\hat{\mathcal{Q}}}^{-1/2}\Delta ^j\Vert \le M \epsilon _j\), where M is given as in Proposition 2.
For \(j\ge 1\), from the optimality of \({{\varvec{x}}}^j\) in (29), one can show that for all \({{\varvec{x}}}\in \mathcal{X}\)
Let \({{\varvec{e}}}^j = {{\varvec{x}}}^j-{{\varvec{x}}}^*\). By setting \({{\varvec{x}}}= {{\varvec{x}}}^{j-1}\) and \({{\varvec{x}}}={{\varvec{x}}}^*\) in (41), we get
Multiplying (42) by \(j-1\) and combining it with (43), we get
where \(a_j = 2j [F({{\varvec{x}}}^j)-F({{\varvec{x}}}^*)]\) and \(b_j = \Vert {{\varvec{e}}}^j\Vert _{\hat{\mathcal{Q}}}\). Note that the last inequality follows from (43) with \(j=1\), the fact that the sequence \(\{\epsilon _i\}\) is non-increasing, and some simple manipulations.
To summarize, we have \(b_j^2 \le b_0^2 + 2 \sum _{i=1}^j 2M i\epsilon _i b_i \). By applying Lemma 2, we get
where \(\bar{\lambda }_j = \sum _{i=1}^j \lambda _i\) with \(\lambda _i = 2M i \epsilon _i\). Applying the above result to (44), we get
From here, the required result in Part (b) of Proposition 2 follows. \(\square \)
Li, X., Sun, D. & Toh, KC. A block symmetric Gauss–Seidel decomposition theorem for convex composite quadratic programming and its applications. Math. Program. 175, 395–418 (2019). https://doi.org/10.1007/s10107-018-1247-7
Keywords
- Convex composite quadratic programming
- Block symmetric Gauss–Seidel
- Schur complement
- Augmented Lagrangian method