
On R-linear convergence analysis for a class of gradient methods


Abstract

The gradient method is a simple optimization approach that uses the negative gradient of the objective function as the search direction. Its efficiency relies heavily on the choice of stepsize. In this paper, we analyze the convergence behavior of a class of gradient methods whose stepsizes possess an important property introduced in (Dai in Optimization 52:395–415, 2003). Our analysis focuses on the minimization of strictly convex quadratic functions. We establish R-linear convergence and derive an estimate of the R-factor. Specifically, if the stepsize can be expressed as a collection of Rayleigh quotients of the inverse Hessian matrix, we show that these methods converge R-linearly and that their R-factors are bounded above by \(1-\frac{1}{\varkappa }\), where \(\varkappa\) is the associated condition number. Preliminary numerical results demonstrate the tightness of our estimate of the R-factor.
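As context only (the paper's stepsize formulas (2) and (4) are not reproduced on this page), the following Python sketch illustrates the kind of behavior being quantified: a gradient method on a randomly generated strictly convex quadratic, using the Barzilai–Borwein stepsize as one example of a stepsize expressible as a Rayleigh quotient of the inverse Hessian. The problem size, spectrum, and iteration count are arbitrary choices for illustration, not the paper's experimental setup.

```python
import numpy as np

# Illustrative sketch (not the paper's experiments): a gradient method
# x_{k+1} = x_k - alpha_k * g_k on f(x) = 0.5 x^T A x - b^T x with the
# Barzilai-Borwein stepsize, which on a quadratic equals
# g_{k-1}^T g_{k-1} / g_{k-1}^T A g_{k-1}, i.e. a Rayleigh quotient of
# A^{-1} evaluated at A^{1/2} g_{k-1}.

rng = np.random.default_rng(0)
n = 100

# Strictly convex quadratic: A symmetric positive definite, condition number 100.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
spectrum = np.linspace(1.0, 100.0, n)
A = Q @ np.diag(spectrum) @ Q.T
b = rng.standard_normal(n)
kappa = spectrum[-1] / spectrum[0]

x = np.zeros(n)
g = A @ x - b
g_prev = g.copy()
grad_norms = [np.linalg.norm(g)]

for k in range(300):
    if k == 0:
        alpha = (g @ g) / (g @ (A @ g))                       # exact line search (Cauchy) step
    else:
        alpha = (g_prev @ g_prev) / (g_prev @ (A @ g_prev))   # BB stepsize
    x = x - alpha * g
    g_prev = g
    g = A @ x - b
    grad_norms.append(np.linalg.norm(g))
    if grad_norms[-1] < 1e-12 * grad_norms[0]:                # stop once converged
        break

# Average per-iteration contraction of ||g_k|| versus the bound 1 - 1/kappa.
observed = (grad_norms[-1] / grad_norms[0]) ** (1.0 / (len(grad_norms) - 1))
print(f"observed factor ~ {observed:.4f}, bound 1 - 1/kappa = {1.0 - 1.0 / kappa:.4f}")
```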


Notes

  1. The definition of \(A^p\): let \(A=U^TDU\) be an eigendecomposition of A, where \(D=\mathrm{diag}(\lambda _1,\lambda _2,\ldots ,\lambda _n)\) and \(U\in {\mathbb {R}}^{n\times n}\) is an orthogonal matrix; then \(A^{p}=U^TD^{p}U\), where \(D^{p}=\mathrm{diag}({\lambda _1^p},{\lambda _2^p},\ldots ,{\lambda _n^p})\).
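A minimal numpy sketch of this definition (assuming A is symmetric; note that numpy's eigh returns the factorization \(A = U D U^T\), i.e., the transpose of the convention \(A=U^TDU\) used above):

```python
import numpy as np

# Sketch of the matrix power A^p from Note 1: eigendecompose the symmetric
# matrix A and raise its eigenvalues to the power p. Works for any real p
# when A is symmetric positive definite.
def matrix_power_sym(A: np.ndarray, p: float) -> np.ndarray:
    lam, U = np.linalg.eigh(A)            # A = U diag(lam) U^T, U orthogonal
    return U @ np.diag(lam ** p) @ U.T

# Quick check: A^{1/2} A^{1/2} should recover A.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
half = matrix_power_sym(A, 0.5)
print(np.allclose(half @ half, A))        # True
```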

  2. Since the gradient method (2) with the stepsize (4) is invariant under orthogonal transformations of the variables, we may assume without loss of generality that the matrix A has the form \(A = \mathrm{diag}(0,\ldots ,0,\,\lambda _{\ell },\ldots , \lambda _n),\) where \(0 < \lambda _{\ell } \le \ldots \le \lambda _n\). Let \(b^{(i)}\) and \(g(x)^{(i)}\) denote the i-th components of b and g(x), respectively. Since \(Ax^*-b=0\), we have \(b^{(i)}=0\) for all \(1\le i\le \ell -1\). Hence, for any \(x\in {\mathbb {R}}^n\) and \(i\in \{1,2,\ldots ,\ell -1\}\), \(g(x)^{(i)}=(Ax-b)^{(i)}=0\); that is, g(x) lies in the eigenspace spanned by the eigenvectors associated with the positive eigenvalues of A. A small numerical check of this observation follows below.
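The check below uses an arbitrary illustrative choice of dimension and spectrum (not taken from the paper):

```python
import numpy as np

# Check the observation in Note 2: if A = diag(0, ..., 0, lam_l, ..., lam_n)
# and b = A x* for some x* (so the first l-1 components of b vanish), then
# g(x) = A x - b has zero components in the first l-1 coordinates for every x,
# i.e., g(x) stays in the eigenspace of the positive eigenvalues of A.

rng = np.random.default_rng(1)
n, ell = 8, 4                                   # first ell-1 = 3 eigenvalues are zero
eigs = np.concatenate([np.zeros(ell - 1), np.linspace(1.0, 5.0, n - ell + 1)])
A = np.diag(eigs)
x_star = rng.standard_normal(n)
b = A @ x_star                                  # guarantees b^{(i)} = 0 for i < ell

x = rng.standard_normal(n)                      # arbitrary point
g = A @ x - b
print(np.allclose(g[: ell - 1], 0.0))           # True
```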

References

  1. Akaike, H.: On a successive transformation of probability distribution and its application to the analysis of the optimum gradient method. Ann. Inst. Stat. Math. Tokyo 11, 1–17 (1959)

  2. Barzilai, J., Borwein, J.M.: Two-point step size gradient methods. IMA J. Numer. Anal. 8(1), 141–148 (1988)

  3. Burdakov, O., Dai, Y.-H., Huang, N.: Stabilized Barzilai-Borwein method. J. Comput. Math. 37(6), 916–936 (2019)

    Article  MathSciNet  Google Scholar 

  4. Cauchy, A.: Méthode générale pour la résolution des systemes d’équations simultanées. Comp. Rend. Sci. Paris 25, 536–538 (1847)

  5. Curtis, F.E., Guo, W.: \(R\)-linear convergence of limited memory steepest descent. IMA J. Numer. Anal. 38, 720–742 (2018)

  6. Dai, Y.-H.: Alternate step gradient method. Optimization 52, 395–415 (2003)

  7. Dai, Y.-H.: A new analysis on the Barzilai-Borwein gradient method. J. Oper. Res. Soc. China 1(2), 187–198 (2013)

  8. Dai, Y.-H., Al-Baali, M., Yang, X.: A positive Barzilai-Borwein-like stepsize and an extension for symmetric linear systems. In: Numerical Analysis and Optimization, pp. 59–75. Springer, New York (2015)

  9. Dai, Y.-H., Fletcher, R.: On the asymptotic behaviour of some new gradient methods. Math. Program. 103, 541–559 (2005)

  10. Dai, Y.-H., Fletcher, R.: Projected Barzilai-Borwein methods for large-scale box-constrained quadratic programming. Numer. Math. 100(1), 21–47 (2005)

  11. Dai, Y.-H., Hager, W.W., Schittkowski, K., Zhang, H.: The cyclic Barzilai-Borwein method for unconstrained optimization. IMA J. Numer. Anal. 26(3), 604–627 (2006)

  12. Dai, Y.-H., Huang, Y., Liu, X.-W.: A family of spectral gradient methods for optimization. Comput. Optim. Appl. 74, 43–65 (2019)

  13. Dai, Y.-H., Liao, L.-Z.: \(R\)-linear convergence of the Barzilai and Borwein gradient method. IMA J. Numer. Anal. 22(1), 1–10 (2002)

  14. Dai, Y.-H., Yuan, Y.: Analysis of monotone gradient methods. J. Ind. Manag. Optim. 1(2), 181–192 (2005)

  15. De Asmundis, R., di Serafino, D., Hager, W.W., Toraldo, G., Zhang, H.: An efficient gradient method using the Yuan steplength. Comput. Optim. Appl. 59(3), 541–563 (2014)

  16. De Asmundis, R., di Serafino, D., Landi, G.: On the regularizing behavior of the SDA and SDC gradient methods in the solution of linear ill-posed problems. J. Comput. Appl. Math. 302, 81–93 (2016)

  17. Fletcher, R.: On the Barzilai-Borwein method. In: Optimization and Control with Applications, pp. 235–256. Springer, New York (2005)

  18. Frassoldati, G., Zanni, L., Zanghirati, G.: New adaptive stepsize selections in gradient methods. J. Ind. Manag. Optim. 4(2), 299 (2008)

  19. Friedlander, A., Martínez, J.M., Molina, B., Raydan, M.: Gradient method with retards and generalizations. SIAM J. Numer. Anal. 36(1), 275–289 (1998)

  20. Huang, Y., Dai, Y.-H., Liu, X.-W.: Equipping Barzilai-Borwein method with two-dimensional quadratic termination property, arXiv preprint arXiv:2010.12130, (2020)

  21. Li, D.-W., Sun, R.-Y.: On a faster \(R\)-linear convergence rate of the Barzilai-Borwein method, arXiv preprint arXiv:2101.00205, (2021)

  22. Luenberger, D.G.: Optimization by Vector Space Methods. Wiley, New York (1968)

  23. Malitsky, Y., Mishchenko, K.: Adaptive gradient descent without descent, arXiv preprint arXiv:1910.09529, (2019)

  24. Nocedal, J., Wright, S.: Numerical Optimization. Springer, New York (1999)

  25. Ortega, J.M., Rheinboldt, W.C.: Iterative Solution of Nonlinear Equations in Several Variables. Academic Press, New York (1970)

  26. Raydan, M.: On the Barzilai and Borwein choice of steplength for the gradient method. IMA J. Numer. Anal. 13(3), 321–326 (1993)

  27. Raydan, M.: The Barzilai and Borwein gradient method for the large scale unconstrained minimization problem. SIAM J. Optim. 7(1), 26–33 (1997)

  28. Raydan, M., Svaiter, B.F.: Relaxed steepest descent and Cauchy-Barzilai-Borwein method. Comput. Optim. Appl. 21(2), 155–167 (2002)

  29. Sun, W.-Y., Yuan, Y.: Optimization Theory and Methods: Nonlinear Programming. Springer, New York (2006)

  30. Yuan, Y.: A short note on the Q-linear convergence of the steepest descent method. Math. Program. 123(2), 339–343 (2010)

  31. Zhigljavsky, A., Pronzato, L., Bukina, E.: An asymptotically optimal gradient algorithm for quadratic optimization with low computational cost. Optim. Lett. 7(6), 1047–1059 (2013)

  32. Zou, Q., Magoulès, F.: Fast gradient methods with alignment for symmetric linear systems without using Cauchy step. J. Comput. Appl. Math. 381, 113033 (2021)

Acknowledgements

The author is very grateful to Professor Oleg Burdakov of Linköping University and Professor Yu-Hong Dai of the Chinese Academy of Sciences for their valuable and insightful comments on this manuscript.

Author information

Correspondence to Na Huang.

Additional information

The author was supported by the National Natural Science Foundation of China (No. 12001531).

Cite this article

Huang, N. On R-linear convergence analysis for a class of gradient methods. Comput Optim Appl 81, 161–177 (2022). https://doi.org/10.1007/s10589-021-00333-z

