
Adaptive Restart for Accelerated Gradient Schemes

Published in: Foundations of Computational Mathematics

Abstract

In this paper we introduce a simple heuristic adaptive restart technique that can dramatically improve the convergence rate of accelerated gradient schemes. The analysis of the technique relies on the observation that these schemes exhibit two modes of behavior depending on how much momentum is applied at each iteration. In what we refer to as the ‘high momentum’ regime, the iterates generated by an accelerated gradient scheme exhibit periodic behavior, where the period is proportional to the square root of the local condition number of the objective function. Separately, it is known that the optimal restart interval is proportional to this same quantity. This suggests a restart technique whereby we reset the momentum whenever we observe periodic behavior. We provide a heuristic analysis suggesting that in many cases adaptive restarting allows us to recover the optimal rate of convergence with no prior knowledge of the function parameters.
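To make the heuristic concrete, below is a minimal sketch in Python of the gradient-based form of the restart test: the momentum is reset whenever the update direction makes an acute angle with the current gradient, a sign that the iterates have entered the oscillatory high-momentum regime. This is an illustration under our own assumptions, not the paper's reference implementation; the names agd_with_restart, grad_f, and step_size are ours, and the step size is assumed to be no larger than the reciprocal of the Lipschitz constant of the gradient.

import numpy as np

def agd_with_restart(grad_f, x0, step_size, num_iters=500):
    # Nesterov-style accelerated gradient descent with gradient-based
    # adaptive restart. Illustrative sketch; names are not from the paper.
    x = np.asarray(x0, dtype=float)
    y = x.copy()    # extrapolated (momentum) point
    theta = 1.0     # momentum parameter, reset to 1 on restart
    for _ in range(num_iters):
        g = grad_f(y)
        x_next = y - step_size * g                        # gradient step from y
        theta_next = (1.0 + np.sqrt(1.0 + 4.0 * theta**2)) / 2.0
        beta = (theta - 1.0) / theta_next                 # momentum weight
        # Restart test: if the step moves against the gradient at y,
        # momentum has begun to overshoot; kill it.
        if g @ (x_next - x) > 0:
            theta_next = 1.0
            y = x_next                                    # plain gradient step next
        else:
            y = x_next + beta * (x_next - x)              # accelerated step
        x, theta = x_next, theta_next
    return x

# Usage on an ill-conditioned quadratic f(x) = 0.5 x'Ax, with step size 1/L:
A = np.diag([1.0, 100.0])
x_min = agd_with_restart(lambda v: A @ v, np.ones(2), step_size=1.0 / 100.0)

On a problem like this, the plain accelerated scheme oscillates with a period on the order of the square root of the condition number, while the restarted scheme suppresses the oscillation and converges roughly linearly.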



Acknowledgements

We are very grateful to Stephen Boyd for his help and encouragement. We would also like to thank Stephen Wright for his advice and feedback, and Stephen Becker and Michael Grant for useful discussions. E. C. would like to thank the ONR (grant N00014-09-1-0258) and the Broadcom Foundation for their support. We also thank two anonymous reviewers for their constructive feedback.

Author information

Corresponding author

Correspondence to Brendan O’Donoghue.

Additional information

Communicated by Felipe Cucker.

About this article

Cite this article

O’Donoghue, B., Candès, E. Adaptive Restart for Accelerated Gradient Schemes. Found Comput Math 15, 715–732 (2015). https://doi.org/10.1007/s10208-013-9150-3
