A fast non-monotone line search for stochastic gradient descent

  • Research Article
  • Published in: Optimization and Engineering

Abstract

We present an improved non-monotone line search algorithm for stochastic gradient descent (SGD) on functions that satisfy interpolation conditions. We establish theoretical convergence guarantees for the algorithm on non-convex functions and conduct a detailed empirical evaluation that validates the theoretical results.
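To make the setting concrete, the following is a minimal sketch of one SGD step with a non-monotone, Armijo-type backtracking line search. It is not the article's Algorithm 1: the acceptance rule (comparing the mini-batch loss against the maximum of the last M accepted values, in the style of Grippo-type non-monotone searches), the constants eta_max, c, beta, M, and the helper callables stoch_loss and stoch_grad are all illustrative assumptions.

    import numpy as np

    def nonmonotone_sgd_step(w, stoch_loss, stoch_grad, history,
                             eta_max=1.0, c=0.1, beta=0.7, M=10):
        """One SGD step with a non-monotone Armijo backtracking line search.

        stoch_loss(w) and stoch_grad(w) evaluate the loss and gradient on the
        current mini-batch; history collects accepted mini-batch losses so the
        Armijo test compares against the maximum of the last M of them rather
        than the current loss alone.
        """
        f_k = stoch_loss(w)
        g_k = stoch_grad(w)
        g_norm_sq = float(np.dot(g_k, g_k))

        history.append(f_k)
        reference = max(history[-M:])   # non-monotone reference value

        eta = eta_max
        # Backtrack until the stochastic Armijo condition holds w.r.t. reference.
        while stoch_loss(w - eta * g_k) > reference - c * eta * g_norm_sq:
            eta *= beta
            if eta < 1e-8:              # safeguard against stalling
                break
        return w - eta * g_k

Called once per mini-batch with a shared history list, a rule of this kind tolerates occasional increases in the mini-batch loss as long as the step improves on the worst recent accepted value, which is what allows larger step sizes than a strictly monotone Armijo search.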


Notes

  1. https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html
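For reference, the files on that page are in the plain-text LIBSVM sparse format and can be read with scikit-learn. This loading sketch is not taken from the article, and the file name is a placeholder for any binary-classification dataset downloaded from the page.

    from sklearn.datasets import load_svmlight_file

    # Load a LIBSVM-format binary dataset; X is a SciPy sparse matrix,
    # y the vector of labels.
    X, y = load_svmlight_file("mushrooms.txt")
    print(X.shape, y.shape)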


Author information

Corresponding author

Correspondence to Sajad Fathi Hafshejani.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Fathi Hafshejani, S., Gaur, D., Hossain, S. et al. A fast non-monotone line search for stochastic gradient descent. Optim Eng (2023). https://doi.org/10.1007/s11081-023-09836-6

