
Stochastic Gradient Method with Barzilai–Borwein Step for Unconstrained Nonlinear Optimization

  • COMPUTER METHODS
  • Published in: Journal of Computer and Systems Sciences International

Abstract

The use of stochastic gradient algorithms for nonlinear optimization is of considerable interest, especially in high-dimensional problems, where the choice of the step size is of key importance for the convergence rate. In this paper, we propose two new stochastic gradient algorithms that use an improved Barzilai–Borwein step size formula. Convergence analysis shows that these algorithms achieve linear convergence in probability for strongly convex objective functions. Our computational experiments confirm that the proposed algorithms outperform two-point gradient algorithms and well-known stochastic gradient methods.
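To make the step-size idea concrete, the sketch below shows one standard way to combine a Barzilai–Borwein (BB) step with a variance-reduced stochastic gradient loop, in the spirit of SVRG-BB (Tan, Ma, Dai, and Qian, 2016). It does not reproduce the improved BB formula proposed in this paper; the function and parameter names (svrg_bb, grad_i, inner_steps, eta0) are illustrative assumptions, not the authors' implementation.

import numpy as np

def svrg_bb(grad_i, x0, n, n_epochs=20, inner_steps=None, eta0=0.01, seed=0):
    """SVRG with a Barzilai-Borwein step size recomputed once per epoch.

    grad_i(x, i) must return the gradient of the i-th component function at x.
    """
    rng = np.random.default_rng(seed)
    m = inner_steps or 2 * n                    # inner-loop length (a common default)
    x_tilde = np.asarray(x0, dtype=float).copy()
    x_prev, g_prev, eta = None, None, eta0      # eta0 is used only in the first epoch
    for _ in range(n_epochs):
        # Full gradient at the current snapshot.
        g_full = np.mean([grad_i(x_tilde, i) for i in range(n)], axis=0)
        if g_prev is not None:
            # BB step from successive snapshots, scaled by the inner-loop length:
            # eta = ||s||^2 / (m * |s^T y|), with s = x_k - x_{k-1}, y = g_k - g_{k-1}.
            s, y = x_tilde - x_prev, g_full - g_prev
            denom = m * abs(float(s @ y))
            if denom > 1e-12:
                eta = float(s @ s) / denom
        x_prev, g_prev = x_tilde.copy(), g_full.copy()
        # Variance-reduced inner loop with the fixed BB step size.
        x = x_tilde.copy()
        for _ in range(m):
            i = rng.integers(n)
            v = grad_i(x, i) - grad_i(x_tilde, i) + g_full
            x = x - eta * v
        x_tilde = x                              # last inner iterate becomes the snapshot
    return x_tilde

if __name__ == "__main__":
    # Toy example: strongly convex ridge regression,
    # f(x) = (1/n) sum_i [ (a_i^T x - b_i)^2 / 2 + (lam/2) ||x||^2 ].
    rng = np.random.default_rng(1)
    n, d, lam = 200, 50, 0.1
    A = rng.standard_normal((n, d))
    b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)
    grad = lambda x, i: (A[i] @ x - b[i]) * A[i] + lam * x
    x_hat = svrg_bb(grad, np.zeros(d), n)
    full_grad = np.mean([grad(x_hat, i) for i in range(n)], axis=0)
    print("||grad|| at the returned point:", np.linalg.norm(full_grad))

In this variant the step size is recomputed only once per epoch from successive full-gradient snapshots and divided by the inner-loop length, which keeps each stochastic update on an appropriate scale; the algorithms proposed in the paper differ in how this BB step is formed.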





Funding

This study was supported in part by the Nanjing University of Aeronautics and Astronautics (project no. NG2019004), National Natural Science Foundation of China (project no. 11971231), and Russian Foundation for Basic Research (project no. 19-01-00625).

Author information


Correspondence to L. Wang or I. A. Matveev.

Additional information

Translated by Yu. Kornienko


About this article


Cite this article

Wang, L., Wu, H. & Matveev, I.A. Stochastic Gradient Method with Barzilai–Borwein Step for Unconstrained Nonlinear Optimization. J. Comput. Syst. Sci. Int. 60, 75–86 (2021). https://doi.org/10.1134/S106423072101010X
