Skip to main content
Log in

Linear convergence of first order methods for non-strongly convex optimization

  • Full Length Paper
  • Series A
  • Published:
Mathematical Programming Submit manuscript

Abstract

The standard assumption for proving linear convergence of first order methods for smooth convex optimization is the strong convexity of the objective function, an assumption which does not hold for many practical applications. In this paper, we derive linear convergence rates of several first order methods for solving smooth non-strongly convex constrained optimization problems, i.e. involving an objective function with a Lipschitz continuous gradient that satisfies some relaxed strong convexity condition. In particular, in the case of smooth constrained convex optimization, we provide several relaxations of the strong convexity conditions and prove that they are sufficient for getting linear convergence for several first order methods such as projected gradient, fast gradient and feasible descent methods. We also provide examples of functional classes that satisfy our proposed relaxations of strong convexity conditions. Finally, we show that the proposed relaxed strong convexity conditions cover important applications ranging from solving linear systems, Linear Programming, and dual formulations of linearly constrained convex problems.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fig. 1

Similar content being viewed by others

Notes

  1. This result can be also proved using simple algebraic arguments. More precisely, from Courant-Fischer theorem we know that \(\Vert Ax\Vert \ge \sigma _{\min }(A) \Vert x\Vert \) for all \(x \in \text {Im}(A^T)\). Since we assume that our polyhedron \(\mathcal{P}=\{x: \; Ax=b\}\) is non-empty, then \(x - [x]_\mathcal{P} \in \text {Im}(A^T)\) for all \(x \in \mathbb {R}^n\) (from KKT conditions of \(\min _{z: Az=b} \Vert x - z\Vert ^2\) we have that there exists \(\mu \) such that \(x - [x]_{\mathcal{P}} + A^T \mu =0\)). In conclusion, we get:

    $$\begin{aligned} \Vert Ax - b\Vert = \Vert Ax - A [x]_{\mathcal{P}} \Vert \ge \sigma _{\text {min}}(A) \Vert x - [x]_{\mathcal{P}}\Vert = \sigma _{\text {min}}(A) \text {dist}_2(x,\mathcal {P}) \quad \forall x \in \mathbb {R}^n. \end{aligned}$$
  2. See Remark 1 below for an example satisfying this condition.

References

  1. Leventhal, D., Lewis, A.S.: Randomized methods for linear constraints: convergence rates and conditioning. Math. Oper. Res. 35(3), 641–654 (2010)

    Article  MathSciNet  MATH  Google Scholar 

  2. Liu, J., Wright, S., Re, C., Bittorf, V., Sridhar, S.: An asynchronous parallel stochastic coordinate descent algorithm. J. Mach. Learn. Res. 16(1), 285–322 (2015)

    MathSciNet  MATH  Google Scholar 

  3. Nesterov, Y.: Introductory Lectures on Convex Optimization: A Basic Course. Kluwer, Dordrecht (2004)

    Book  MATH  Google Scholar 

  4. Nemirovski, A., Juditsky, A., Lan, G., Shapiro, A.: Robust stochastic approximation approach to stochastic programming. SIAM J. Optim. 19(4), 1574–1609 (2009)

    Article  MathSciNet  MATH  Google Scholar 

  5. Wright, S.: Coordinate descent algorithms. Math. Program. 151(1), 3–34 (2015)

    Article  MathSciNet  MATH  Google Scholar 

  6. Burke, J.V., Deng, S.: Weak sharp minima revisited Part III: error bounds for differentiable convex inclusions. Math. Program. 116(1–2), 37–56 (2009)

    Article  MathSciNet  MATH  Google Scholar 

  7. Lewis, A.S., Pang, J.S.: Error bounds for convex inequality systems. In: Chapter In: Generalized Convexity, Generalized Monotonicity–Recent Results. Springer, Berlin (1998)

  8. Yangy, T., Lin, Q.: A stochastic gradient method with linear convergence rate for a class of non-smooth non-strongly convex optimization. Tech. rep. (2015). www.arxiv.org

  9. Luo, Z.-Q., Tseng, P.: Error bounds and convergence analysis of feasible descent methods: a general approach. Ann. Oper. Res. 46(1), 157–178 (1993)

    Article  MathSciNet  MATH  Google Scholar 

  10. Necoara, I., Clipici, D.: Parallel random coordinate descent method for composite minimization: convergence analysis and error bounds. SIAM J. Optim. 26(1), 197–226 (2016)

    Article  MathSciNet  MATH  Google Scholar 

  11. Wang, P.W., Lin, C.J.: Iteration complexity of feasible descent methods for convex optimization. J. Mach. Learn. Res. 15(4), 1523–1548 (2014)

    MathSciNet  MATH  Google Scholar 

  12. Zhang, H., Cheng, L.: Restricted strong convexity and its applications to convergence analysis of gradient type methods in convex optimization. Optim. Lett. 9(5), 961–979 (2015)

    Article  MathSciNet  MATH  Google Scholar 

  13. Beck, A., Shtern, S.: Linearly convergent away-step conditional gradient for non-strongly convex functions. Math. Program. 164(1–2), 1–27 (2017)

    Article  MathSciNet  MATH  Google Scholar 

  14. Drusvyatskiy, D., Lewis, A.: Error bounds, quadratic growth, and linear convergence of proximal methods. Tech. rep., (2016). (arXiv:1602.06661)

  15. Zhou, Z., So, A.: A unified approach to error bounds for structured convex optimization problems. Math. Program. 165(2), 689–728 (2017)

    Article  MathSciNet  MATH  Google Scholar 

  16. Hoffman, A.J.: On approximate solutions of systems of linear inequalities. J. Res. Natl. Bur. Stand. 49(4), 263–265 (1952)

    Article  MathSciNet  Google Scholar 

  17. Klatte, D., Thiere, G.: Error bounds for solutions of linear equations and inequalities. Math. Methods Oper. Res. 41(2), 191–214 (1995)

    Article  MathSciNet  MATH  Google Scholar 

  18. Nesterov, Y.: Gradient methods for minimizing composite functions. Math. Program. 140(1), 125–161 (2013)

    Article  MathSciNet  MATH  Google Scholar 

  19. O’Donoghue, B., Candes, E.: Adaptive restart for accelerated gradient schemes. Found. Comput. Math. 15(3), 715–732 (2013)

    Article  MathSciNet  MATH  Google Scholar 

  20. Bubeck, S.: Convex optimization: algorithms and complexity. Found. Trends Mach. Learn. 8(3–4), 231–357 (2015)

    Article  MATH  Google Scholar 

  21. O’Donoghue, B., Chu, E., Parikh, N., Boyd, S.: Conic optimization via operator splitting and homogeneous self-dual embedding. J. Optim. Theory Appl. 169(3), 1042–1068 (2016)

    Article  MathSciNet  MATH  Google Scholar 

  22. Lin, H., Mairal, J., Harchaoui, Z.: A universal Catalyst for first-order optimization. In: Advances in neural information processing systems, 3384–3392 (2015)

Download references

Acknowledgements

The research leading to these results has received funding from the Executive Agency for Higher Education, Research and Innovation Funding (UEFISCDI), Romania: PN-III-P4-PCE-2016-0731, project ScaleFreeNet, No. 39/2017.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to I. Necoara.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Necoara, I., Nesterov, Y. & Glineur, F. Linear convergence of first order methods for non-strongly convex optimization. Math. Program. 175, 69–107 (2019). https://doi.org/10.1007/s10107-018-1232-1

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10107-018-1232-1

Mathematics Subject Classification

Navigation