Efficient random coordinate descent algorithms for large-scale structured nonconvex optimization

Abstract

In this paper we analyze several new methods for solving nonconvex optimization problems whose objective function is the sum of two terms: one nonconvex and smooth, the other convex but simple and with known structure. We consider both the unconstrained and the linearly constrained nonconvex case. For optimization problems with this structure, we propose random coordinate descent algorithms and analyze their convergence properties. In the general case, when the objective function is nonconvex and composite, we prove asymptotic convergence of the sequences generated by our algorithms to stationary points, as well as a sublinear rate of convergence in expectation for a suitable optimality measure. Additionally, if the objective function satisfies an error bound condition, we derive a local linear rate of convergence for the expected values of the objective function. We also present extensive numerical experiments evaluating the performance of our algorithms in comparison with state-of-the-art methods.
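To make the algorithmic setting concrete, the sketch below shows one possible random coordinate descent loop for the unconstrained composite model \(\min_x f(x) + h(x)\). It is an illustrative stand-in rather than the paper's exact method: it assumes \(h(x) = \lambda ||x||_1\) (so the coordinate prox is soft-thresholding), uniform coordinate sampling, and known coordinate-wise Lipschitz constants.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t*|.| (soft-thresholding), used here as the
    'simple' convex term; any coordinate-wise prox could be substituted."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def random_coordinate_descent(grad_f, L, x0, lam=0.1, iters=5000, seed=0):
    """Illustrative random (single-coordinate) proximal gradient loop for
    min_x f(x) + lam * ||x||_1, with f smooth and possibly nonconvex.

    grad_f : callable returning the full gradient of f at x
             (only the sampled component is used in each iteration)
    L      : array of coordinate-wise Lipschitz constants of grad f
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    n = x.size
    for _ in range(iters):
        i = rng.integers(n)                  # uniformly sampled coordinate
        g_i = grad_f(x)[i]                   # partial derivative along coordinate i
        # coordinate proximal gradient step with step size 1/L_i
        x[i] = soft_threshold(x[i] - g_i / L[i], lam / L[i])
    return x
```

In the linearly constrained case treated in the paper, an update would instead have to modify (at least) a pair of coordinates so that the coupling constraint remains satisfied; the sketch covers only the unconstrained case.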

References

  1. Auslender, A.: Optimisation Methodes Numeriques. Masson, Paris (1976)

  2. Beck, A.: The 2-Coordinate Descent Method for Solving Double-Sided Simplex Constrained Minimization Problems. Technical Report (2012)

  3. Bertsekas, D.: Nonlinear Programming. Athena Scientific (1999)

  4. Bonettini, S.: Inexact block coordinate descent methods with application to nonnegative matrix factorization. J. Numer. Anal. 22, 1431–1452 (2011)

  5. Calamai, P.H., Moré, J.J.: Projected gradient methods for linearly constrained problems. Math. Program. 39, 93–116 (1987)

  6. Chapelle, O., Sindhwani, V., Keerthi, S.: Optimization techniques for semi-supervised support vector machines. J. Mach. Learn. Res. 2, 203–233 (2008)

  7. Fainshil, L., Margaliot, M.: A maximum principle for positive bilinear control systems with applications to positive linear switched systems. SIAM J. Control Optim. 50, 2193–2215 (2012)

  8. Judice, J., Raydan, M., Rosa, S.S., Santos, S.A.: On the solution of the symmetric eigenvalue complementarity problem by the spectral projected gradient algorithm. Comput. Optim. Appl. 47, 391–407 (2008)

  9. Kocvara, M., Outrata, J.: Effective reformulations of the truss topology design problem. Optim. Eng. (2006)

  10. Lin, C.J., Lucidi, S., Palagi, L., Risi, A., Sciandrone, M.: Decomposition algorithm model for singly linearly-constrained problems subject to lower and upper bounds. J. Optim. Theory Appl. 141(1), 107–126 (2009)

  11. Lu, Z., Xiao, L.: Randomized Block Coordinate Non-monotone Gradient Method for a Class of Nonlinear Programming. Technical Report (2013)

  12. Mongeau, M., Torki, M.: Computing eigenelements of real symmetric matrices via optimization. Comput. Optim. Appl. 29, 263–287 (2004)

  13. Necoara, I., Nesterov, Y., Glineur, F.: A Random Coordinate Descent Method for Large Optimization Problems with Linear Constraints. Technical Report (2011). http://acse.pub.ro/person/ion-necoara/

  14. Necoara, I., Patrascu, A.: A random coordinate descent algorithm for optimization problems with composite objective function and linear coupled constraints. Comput. Optim. Appl. (2013)

  15. Necoara, I.: Random coordinate descent algorithms for multi-agent convex optimization over networks. IEEE Trans. Autom. Control 58(8), 1–12 (2013)

  16. Necoara, I., Clipici, D.: Efficient parallel coordinate descent algorithm for convex optimization problems with separable constraints: application to distributed MPC. J. Process Control 23(3), 243–253 (2013)

  17. Nesterov, Y.: Introductory Lectures on Convex Optimization. Kluwer, Dordrecht (2004)

  18. Nesterov, Y.: Efficiency of coordinate descent methods on huge-scale optimization problems. SIAM J. Optim. 22(2), 341–362 (2012)

  19. Nesterov, Y.: Gradient methods for minimizing composite objective function. Math. Program. 140(1), 125–161 (2013)

  20. Parlett, B.N.: The Symmetric Eigenvalue Problem. SIAM (1997)

  21. Polyak, B.T.: Introduction to Optimization. Optimization Software (1987)

  22. Powell, M.J.D.: On search directions for minimization algorithms. Math. Program. (1973)

  23. Richtarik, P., Takac, M.: Efficient Serial and Parallel Coordinate Descent Methods for Huge-Scale Truss Topology Design. Operations Research Proceedings, Springer, pp. 27–32 (2012)

  24. Richtarik, P., Takac, M.: Iteration complexity of randomized block coordinate descent methods for minimizing a composite function. Math. Program. (2012)

  25. Richtarik, P., Takac, M.: Parallel Coordinate Descent Methods for Big Data Optimization. Technical Report (2012). http://www.maths.ed.ac.uk/~richtarik/

  26. Rockafellar, R.T.: The elementary vectors of a subspace in \({\mathbb{R}}^N\). In: Bose, R.C., Dowling, T.A. (eds.) Combinatorial Mathematics and its Applications, Proceedings of the Chapel Hill Conference, pp. 104–127 (1969)

  27. Rockafellar, R.T.: Network Flows and Monotropic Optimization. Wiley-Interscience, New York (1984)

  28. Shalev-Shwartz, S., Zhang, T.: Stochastic dual coordinate ascent methods for regularized loss minimization. J. Mach. Learn. Res. 14, 567–599 (2013)

  29. Thi, H.A.L., Moeini, M., Dinh, T.P., Judice, J.: A DC programming approach for solving the symmetric eigenvalue complementarity problem. Comput. Optim. Appl. 51, 1097–1117 (2012)

  30. Tseng, P., Yun, S.: A coordinate gradient descent for nonsmooth separable minimization. Math. Program. 117, 387–423 (2009)

  31. Tseng, P., Yun, S.: A block coordinate gradient descent method for linearly constrained nonsmooth separable optimization. J. Optim. Theory Appl. 140, 513–535 (2009)

  32. Tseng, P.: Approximation accuracy, gradient methods and error bound for structured convex optimization. Math. Program. 125(2), 263–295 (2010)

  33. Vapnik, V.N.: The Nature of Statistical Learning Theory. Springer, Berlin (1995)

Author information

Corresponding author

Correspondence to Ion Necoara.

Additional information

The research leading to these results has received funding from: the European Union (FP7/2007–2013) EMBOCON under grant agreement no 248940; CNCS (project TE-231, 19/11.08.2010); ANCS (project PN II, 80EU/2010); POSDRU/89/1.5/S/62557.

Appendix

Proof of Lemma 4

We derive our proof based on the following remark (see also [5]): for given \(u, v \in \mathbb {R}^n\), if \(\langle v, u-v \rangle > 0\), then

$$\begin{aligned} \frac{||u||}{||v||} \le \frac{\langle u, u-v \rangle }{\langle v, u-v \rangle }. \end{aligned}$$
(27)
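For completeness, (27) can be verified directly (a short argument, not reproduced from [5]): since \(\langle u, u-v \rangle = \langle v, u-v \rangle + ||u-v||^2\) and \(\langle v, u-v \rangle > 0\) forces \(v \ne 0\), inequality (27) is equivalent to

$$\begin{aligned} \left( ||u|| - ||v|| \right) \langle v, u-v \rangle \le ||v|| \, ||u-v||^2, \end{aligned}$$

which follows from the Cauchy–Schwarz bound \(\langle v, u-v \rangle \le ||v|| \, ||u-v||\) together with the triangle inequality \(||u|| - ||v|| \le ||u-v||\) (the case \(||u|| \le ||v||\) being trivial).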

Let \(\alpha > \beta > 0\). Taking \(u = \text {prox}_{\alpha h}(x + \alpha d) - x \) and \(v= \text {prox}_{\beta h}(x+ \beta d) - x\), we first show that the inequality \(\langle v, u-v \rangle > 0\) holds. Given a real constant \(c>0\), from the optimality conditions of the proximal operator we have:

$$\begin{aligned} x - \text {prox}_{c h}(x) \in \partial c h(\text {prox}_{c h}(x)). \end{aligned}$$

Therefore, from the convexity of \(h\) we can derive that:

$$\begin{aligned} c h(z) \ge c h(\text {prox}_{c h}(y)) + \langle y - \text {prox}_{ch}(y), z - \text {prox}_{c h}(y) \rangle \quad \; \forall y, z \in \mathbb {R}^n. \end{aligned}$$

Taking \(c=\alpha \), \(z = \text {prox}_{\beta h}(x + \beta d)\) and \(y = x + \alpha d\) we have:

$$\begin{aligned} \langle u, u - v \rangle \le \alpha \left( \langle d, u-v \rangle + h(\text {prox}_{\beta h} (x+\beta d)) - h(\text {prox}_{\alpha h}(x+ \alpha d)) \right) . \end{aligned}$$
(28)

Also, if \(c=\beta \), \(z= \text {prox}_{\alpha h}(x+ \alpha d)\) and \(y = x+ \beta d\), then we have:

$$\begin{aligned} \langle v, u-v \rangle \ge \beta \left( \langle d, u-v \rangle + h(\text {prox}_{\beta h}(x+\beta d)) - h(\text {prox}_{\alpha h}(x+ \alpha d)) \right) . \end{aligned}$$
(29)

Subtracting (29) from (28), using that \(\langle u - v, u-v \rangle = ||u-v||^2 > 0\) (we may assume \(u \ne v\), since otherwise the claimed bound holds trivially) and taking into account that \(\alpha > \beta \), we get:

$$\begin{aligned} \langle d, u -v \rangle + h(\text {prox}_{\beta h}(x+\beta d)) - h(\text {prox}_{\alpha h}(x+ \alpha d))> 0. \end{aligned}$$

Therefore, substituting this expression into inequality (29) leads to \(\langle v, u-v \rangle > 0\). Finally, from (27), (28) and (29) we get the inequality:

$$\begin{aligned} \frac{||u||}{||v||} \le \frac{\alpha \left( \langle d, u-v \rangle + h(\text {prox}_{\beta h}(x+\beta d)) - h(\text {prox}_{\alpha h}(x+ \alpha d)) \right) }{\beta \left( \langle d, u-v \rangle + h(\text {prox}_{\beta h}(x+\beta d)) - h(\text {prox}_{\alpha h}(x+ \alpha d)) \right) } = \frac{\alpha }{\beta }, \end{aligned}$$

and then the statement of Lemma 4 can be easily derived. \(\square \)
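As a quick numerical sanity check of the bound just derived (not part of the original proof), the following script verifies \(||\text {prox}_{\alpha h}(x+\alpha d) - x|| \le \tfrac{\alpha }{\beta }\, ||\text {prox}_{\beta h}(x+\beta d) - x||\) on random instances for the particular choice \(h = ||\cdot ||_1\), whose proximal operator is soft-thresholding:

```python
import numpy as np

def prox_l1(z, c):
    """Proximal operator of c*||.||_1 (componentwise soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - c, 0.0)

rng = np.random.default_rng(1)
for _ in range(10000):
    x, d = rng.standard_normal(5), rng.standard_normal(5)
    beta = rng.uniform(0.1, 1.0)
    alpha = beta + rng.uniform(0.1, 1.0)          # ensures alpha > beta > 0
    u = prox_l1(x + alpha * d, alpha) - x
    v = prox_l1(x + beta * d, beta) - x
    if np.linalg.norm(v) > 1e-12:                 # the ratio bound needs v != 0
        assert np.linalg.norm(u) <= (alpha / beta) * np.linalg.norm(v) + 1e-10
print("||u||/||v|| <= alpha/beta held on all sampled instances")
```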

About this article

Cite this article

Patrascu, A., Necoara, I. Efficient random coordinate descent algorithms for large-scale structured nonconvex optimization. J Glob Optim 61, 19–46 (2015). https://doi.org/10.1007/s10898-014-0151-9
