
A new inexact stochastic recursive gradient descent algorithm with Barzilai–Borwein step size in machine learning

  • Original Paper
Nonlinear Dynamics

Abstract

The inexact SARAH (iSARAH) algorithm, a variance-reduced variant of the SARAH algorithm, has recently surged into prominence for solving large-scale optimization problems in machine learning. The performance of iSARAH depends significantly on the choice of step-size sequence. In this paper, we develop a new algorithm called iSARAH-BB, which employs the Barzilai–Borwein (BB) method to compute the step size automatically within the SARAH framework. By introducing this adaptive step size into the design of the new algorithm, iSARAH-BB takes better advantage of both the iSARAH and BB methods. We then analyze the convergence rate and the complexity of the new algorithm under the usual assumptions. Numerical experiments on standard datasets indicate that the proposed iSARAH-BB algorithm is robust to the choice of the initial step size and is more effective and competitive than existing algorithms.
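The full algorithm, its pseudocode, and the precise BB rule appear in the article body. The indented Python sketch below is only an illustration of the general idea, not the authors' implementation: a SARAH-style recursive-gradient inner loop whose step size is recomputed once per outer iteration from a Barzilai–Borwein quotient built on successive outer iterates and inexact (mini-batch) gradient estimates. The function name isarah_bb_sketch, the batch and inner-loop sizes, the 1/m scaling of the BB quotient, and the component-gradient interface grad_i(w, i) are all assumptions made for the example.

    import numpy as np

    def isarah_bb_sketch(grad_i, n, w0, epochs=10, inner_m=None, batch_b=None,
                         eta0=0.1, rng=None):
        # grad_i(w, i): gradient of the i-th component function at w.
        # n: number of component functions; w0: initial iterate.
        rng = np.random.default_rng() if rng is None else rng
        m = inner_m or n                 # number of inner (recursive) steps
        b = batch_b or max(1, n // 10)   # batch size of the inexact outer gradient
        eta = eta0                       # initial step size (used only in the first epoch)
        w = w0.copy()
        w_prev, g_prev = None, None
        for _ in range(epochs):
            # Inexact outer gradient: average over a sampled batch instead of all n terms.
            idx = rng.choice(n, size=b, replace=False)
            g = np.mean([grad_i(w, i) for i in idx], axis=0)
            # Barzilai-Borwein step size from successive outer iterates and gradients.
            if w_prev is not None:
                s, y = w - w_prev, g - g_prev
                denom = abs(s @ y)
                if denom > 1e-12:
                    eta = (s @ s) / (m * denom)
            w_prev, g_prev = w.copy(), g.copy()
            # SARAH inner loop with the recursive gradient estimator v.
            v = g
            w_old = w.copy()
            w = w - eta * v
            for _ in range(m - 1):
                i = rng.integers(n)
                v = grad_i(w, i) - grad_i(w_old, i) + v
                w_old = w.copy()
                w = w - eta * v
        return w

For instance, plugging in the component gradient of a regularized logistic-regression loss reproduces the kind of binary-classification setting used in the paper's experiments, with no step-size tuning beyond the initial value eta0.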



Data Availability

All data generated or analyzed during this study are included in this published article.

Notes

  1. The datasets heart, splice, mushrooms, ijcnn1, a9a, and w8a can be downloaded from https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/.
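These files are distributed in LIBSVM (svmlight) format. As a small illustrative sketch, assuming scikit-learn is available and the chosen file (here a9a) has already been downloaded to the working directory, they can be loaded as follows:

    from sklearn.datasets import load_svmlight_file

    # X is a SciPy sparse CSR feature matrix, y holds the labels (+1/-1 for a9a).
    X, y = load_svmlight_file("a9a")
    print(X.shape, y.shape)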


Acknowledgements

The authors are indebted to the editors and anonymous referees for their helpful comments and suggestions, which improved the quality of this manuscript.

Funding

This work was supported by the Research Project Foundation of the Shanxi Scholarship Council of China (No. 2017-104) and by the Basic Research Program of Shanxi Province (Free Exploration) projects (Nos. 202103021224303 and 20210302124688).

Author information

Corresponding author

Correspondence to Fu-sheng Wang.

Ethics declarations

Competing interests

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Yang, Y.-m., Wang, F.-s., Li, J.-x., et al.: A new inexact stochastic recursive gradient descent algorithm with Barzilai–Borwein step size in machine learning. Nonlinear Dyn. 111, 3575–3586 (2023). https://doi.org/10.1007/s11071-022-07987-2



  • DOI: https://doi.org/10.1007/s11071-022-07987-2
