Abstract
Stochastic gradient algorithms are of considerable interest for nonlinear optimization, especially in high dimensions, where the choice of the step size largely determines the convergence rate. In this paper, we propose two new stochastic gradient algorithms that use an improved Barzilai–Borwein step size formula. Convergence analysis shows that these algorithms attain linear convergence in probability for strongly convex objective functions. Our computational experiments confirm that the proposed algorithms outperform both two-point gradient algorithms and well-known stochastic gradient methods.
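To illustrate the general idea, the sketch below runs stochastic gradient descent on a strongly convex least-squares problem and recomputes the step size once per epoch with the classical Barzilai–Borwein (BB1) formula, in the spirit of SGD-BB (Tan et al., 2016). The paper's own improved BB formula is not stated in this abstract, so the update rule, the 1/n scaling, and the stability cap used here are illustrative assumptions, not the authors' method.

```python
import numpy as np

# Synthetic strongly convex least-squares problem:
#   f(x) = (1/2n) * sum_i (a_i^T x - b_i)^2
rng = np.random.default_rng(0)
n, d = 200, 10
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.01 * rng.standard_normal(n)

def grad_i(x, i):
    # Stochastic gradient of the single-sample loss f_i(x) = 0.5*(a_i^T x - b_i)^2
    return A[i] * (A[i] @ x - b[i])

def full_grad(x):
    # Full gradient, used only to form the BB step once per epoch
    return A.T @ (A @ x - b) / n

x = np.zeros(d)
eta = 0.01                       # hand-picked step size for the first epoch
x_prev, g_prev = x.copy(), full_grad(x)

for epoch in range(30):
    for i in rng.permutation(n):
        x -= eta * grad_i(x, i)  # plain SGD inner loop with a fixed step
    # BB1 step from successive points and gradients:
    #   eta = ||s||^2 / (s^T y),  s = x_k - x_{k-1},  y = g_k - g_{k-1},
    # scaled by 1/n for the stochastic setting and capped for stability
    # (both choices are illustrative assumptions).
    g = full_grad(x)
    s, y = x - x_prev, g - g_prev
    if abs(s @ y) > 1e-12:
        eta = min(abs((s @ s) / (s @ y)) / n, 0.02)
    x_prev, g_prev = x.copy(), g

print(np.linalg.norm(x - x_true))  # small residual: close to the true solution
```

The BB formula adapts the step to the local curvature seen between consecutive epochs, which is what removes the need for a hand-tuned decreasing schedule; variance-reduced variants (e.g. SVRG-BB) replace the full gradient above with a cheaper estimate.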
Funding
This study was supported in part by the Nanjing University of Aeronautics and Astronautics (project no. NG2019004), National Natural Science Foundation of China (project no. 11971231), and Russian Foundation for Basic Research (project no. 19-01-00625).
Author information
Authors and Affiliations
Corresponding authors
Additional information
Translated by Yu. Kornienko
Cite this article
Wang, L., Wu, H. & Matveev, I.A. Stochastic Gradient Method with Barzilai–Borwein Step for Unconstrained Nonlinear Optimization. J. Comput. Syst. Sci. Int. 60, 75–86 (2021). https://doi.org/10.1134/S106423072101010X