
Adaptive Restart for Accelerated Gradient Schemes

Published in: Foundations of Computational Mathematics

Abstract

In this paper we introduce a simple heuristic adaptive restart technique that can dramatically improve the convergence rate of accelerated gradient schemes. The analysis of the technique relies on the observation that these schemes exhibit two modes of behavior depending on how much momentum is applied at each iteration. In what we refer to as the ‘high momentum’ regime, the iterates generated by an accelerated gradient scheme exhibit periodic behavior, where the period is proportional to the square root of the local condition number of the objective function. Separately, it is known that the optimal restart interval is proportional to this same quantity. This suggests a restart technique whereby we reset the momentum whenever we observe periodic behavior. We provide a heuristic analysis suggesting that in many cases adaptive restarting allows us to recover the optimal rate of convergence with no prior knowledge of the function parameters.
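To make the heuristic concrete, below is a minimal sketch in Python of the gradient-based form of the restart test: the momentum is reset whenever the update direction makes an acute angle with the current gradient, a sign that the iterates have entered the oscillatory high-momentum regime. This is an illustration under our own assumptions, not the paper's reference implementation; the names agd_with_restart, grad_f, and step_size are ours, and the step size is assumed to be no larger than the reciprocal of the Lipschitz constant of the gradient.

import numpy as np

def agd_with_restart(grad_f, x0, step_size, num_iters=500):
    # Nesterov-style accelerated gradient descent with gradient-based
    # adaptive restart. Illustrative sketch; names are not from the paper.
    x = np.asarray(x0, dtype=float)
    y = x.copy()    # extrapolated (momentum) point
    theta = 1.0     # momentum parameter, reset to 1 on restart
    for _ in range(num_iters):
        g = grad_f(y)
        x_next = y - step_size * g                        # gradient step from y
        theta_next = (1.0 + np.sqrt(1.0 + 4.0 * theta**2)) / 2.0
        beta = (theta - 1.0) / theta_next                 # momentum weight
        # Restart test: if the step moves against the gradient at y,
        # momentum has begun to overshoot; kill it.
        if g @ (x_next - x) > 0:
            theta_next = 1.0
            y = x_next                                    # plain gradient step next
        else:
            y = x_next + beta * (x_next - x)              # accelerated step
        x, theta = x_next, theta_next
    return x

# Usage on an ill-conditioned quadratic f(x) = 0.5 x'Ax, with step size 1/L:
A = np.diag([1.0, 100.0])
x_min = agd_with_restart(lambda v: A @ v, np.ones(2), step_size=1.0 / 100.0)

On a problem like this, the plain accelerated scheme oscillates with a period on the order of the square root of the condition number, while the restarted scheme suppresses the oscillation and converges roughly linearly.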



Acknowledgements

We are very grateful to Stephen Boyd for his help and encouragement. We would also like to thank Stephen Wright for his advice and feedback, and Stephen Becker and Michael Grant for useful discussions. E. C. would like to thank the ONR (grant N00014-09-1-0258) and the Broadcom Foundation for their support. We also thank two anonymous reviewers for their constructive feedback.

Author information

Corresponding author

Correspondence to Brendan O’Donoghue.

Additional information

Communicated by Felipe Cucker.

About this article

Cite this article

O’Donoghue, B., Candès, E. Adaptive Restart for Accelerated Gradient Schemes. Found Comput Math 15, 715–732 (2015). https://doi.org/10.1007/s10208-013-9150-3
