Abstract
The gradient method is a simple optimization approach that uses the negative gradient of the objective function as the search direction. Its efficiency relies heavily on the choice of stepsize. In this paper, we analyze the convergence behavior of a class of gradient methods whose stepsizes share an important property introduced in (Dai in Optimization 52:395–415, 2003). Our analysis focuses on the minimization of strictly convex quadratic functions. We establish R-linear convergence and derive an estimate for the R-factor. Specifically, if the stepsize can be expressed as a collection of Rayleigh quotients of the inverse Hessian matrix, we show that these methods converge R-linearly and that their R-factors are bounded above by \(1-\frac{1}{\varkappa }\), where \(\varkappa\) is the associated condition number. Preliminary numerical results demonstrate the tightness of our estimate of the R-factor.
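As a minimal illustration of the setting (not the paper's exact experiments), the sketch below runs a gradient method on a strictly convex quadratic \(f(x)=\frac{1}{2}x^TAx-b^Tx\) using the Barzilai–Borwein stepsize, which for quadratics equals a Rayleigh quotient of the inverse Hessian evaluated at the previous step and hence is one member of the class of stepsizes discussed here. The eigenvalue range, dimension, and tolerances are illustrative choices.

```python
import numpy as np

# Gradient method x_{k+1} = x_k - alpha_k * g_k on a strictly convex
# quadratic f(x) = 0.5 x^T A x - b^T x, with the BB1 stepsize
# alpha_k = s^T s / s^T y.  For quadratics y = A s, so alpha_k is a
# Rayleigh quotient of A^{-1} at s (the property exploited in the paper).
rng = np.random.default_rng(0)
lam = np.linspace(1.0, 100.0, 20)        # spectrum; condition number 100
A = np.diag(lam)
b = rng.standard_normal(20)
x_star = b / lam                         # exact minimizer: A x* = b

x = np.zeros(20)
g = A @ x - b
alpha = 1.0 / lam.max()                  # safe first stepsize
for k in range(500):
    x_new = x - alpha * g
    g_new = A @ x_new - b
    s, y = x_new - x, g_new - g
    if abs(s @ y) > 1e-16:
        alpha = (s @ s) / (s @ y)        # BB1 stepsize (> 0 since A is PD)
    x, g = x_new, g_new
    if np.linalg.norm(g) < 1e-10:
        break

print(np.linalg.norm(x - x_star) < 1e-6)
```

Plotting \(\Vert g_k\Vert\) against \(k\) on a log scale exhibits the (nonmonotone) R-linear decay whose rate the paper bounds by \(1-1/\varkappa\).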
Notes
The definition of \(A^p\): let \(A=U^TDU\) be the eigendecomposition, where \(D=\mathrm{diag}(\lambda _1,\lambda _2,\ldots ,\lambda _n)\) and \(U\in {\mathbb {R}}^{n\times n}\) is an orthogonal matrix; then \(A^{p}=U^TD^{p}U\), where \(D^{p}=\mathrm{diag}({\lambda _1^p},{\lambda _2^p},\ldots ,{\lambda _n^p})\).
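The definition above can be verified numerically: computing \(A^p\) through the eigendecomposition should, for integer \(p\), reproduce the ordinary matrix power. A small sanity check with \(p=2\) (the matrix below is an arbitrary symmetric example):

```python
import numpy as np

# A^p via the eigendecomposition A = U^T D U (here eigh returns
# A = V diag(lam) V^T, i.e. U = V^T); for p = 2 this must equal A @ A.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
lam, V = np.linalg.eigh(A)
A2 = V @ np.diag(lam**2) @ V.T
print(np.allclose(A2, A @ A))            # True
```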
Since the gradient method (2) with the stepsize (4) is invariant under orthogonal transformations of the variables, we may assume without loss of generality that the matrix A has the form \(A = \mathrm{diag}(0,\ldots ,0,\,\lambda _{\ell },\ldots , \lambda _n),\) where \(0 < \lambda _{\ell } \le \ldots \le \lambda _n\). Let \(b^{(i)}\) and \(g(x)^{(i)}\) denote the i-th components of b and g(x), respectively. It follows from \(Ax^*-b=0\) that \(b^{(i)}=0\) for all \(1\le i\le \ell -1\). Hence, for any \(x\in {\mathbb {R}}^n\) and \(i\in \{1,2,\ldots ,\ell -1\}\), \(g(x)^{(i)}=(Ax-b)^{(i)}=0\); that is, g(x) lies in the subspace spanned by the eigenvectors associated with the positive eigenvalues of A.
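The argument in this note is easy to check numerically: with a singular diagonal \(A\) and a consistent right-hand side \(b=Ax^*\), the null-space components of every gradient vanish. A small hedged example (the specific matrix and points are illustrative):

```python
import numpy as np

# If A = diag(0, 0, lam_3, ..., lam_n) and b = A x*, then b^(i) = 0 in the
# null-space coordinates, so g(x)^(i) = (A x - b)^(i) = 0 for every x.
A = np.diag([0.0, 0.0, 2.0, 5.0, 9.0])
x_star = np.array([3.0, -1.0, 0.5, 2.0, -4.0])
b = A @ x_star                           # consistent system: A x* - b = 0

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0]) # arbitrary point
g = A @ x - b
print(g[:2])                             # [0. 0.]: null-space components vanish
```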
References
Akaike, H.: On a successive transformation of probability distribution and its application to the analysis of the optimum gradient method. Ann. Inst. Stat. Math. Tokyo 11, 1–17 (1959)
Barzilai, J., Borwein, J.M.: Two-point step size gradient methods. IMA J. Numer. Anal. 8(1), 141–148 (1988)
Burdakov, O., Dai, Y.-H., Huang, N.: Stabilized Barzilai-Borwein method. J. Comp. Math. 37(6), 916–936 (2019)
Cauchy, A.: Méthode générale pour la résolution des systemes d’équations simultanées. Comp. Rend. Sci. Paris 25, 536–538 (1847)
Curtis, F.E., Guo, W.: \(R\)-linear convergence of limited memory steepest descent. IMA J. Numer. Anal. 38, 720–742 (2018)
Dai, Y.-H.: Alternate step gradient method. Optimization 52, 395–415 (2003)
Dai, Y.-H.: A new analysis on the Barzilai-Borwein gradient method. J. Oper. Res. Soc. China 1(2), 187–198 (2013)
Dai, Y.-H., Al-Baali, M., Yang, X.: A positive Barzilai-Borwein-like stepsize and an extension for symmetric linear systems. In: Numerical Analysis and Optimization, pp. 59–75. Springer, New York (2015)
Dai, Y.-H., Fletcher, R.: On the asymptotic behaviour of some new gradient methods. Math. Program. 103, 541–559 (2005)
Dai, Y.-H., Fletcher, R.: Projected Barzilai-Borwein methods for large-scale box-constrained quadratic programming. Numer. Math. 100(1), 21–47 (2005)
Dai, Y.-H., Hager, W.W., Schittkowski, K., Zhang, H.: The cyclic Barzilai-Borwein method for unconstrained optimization. IMA J. Numer. Anal. 26(3), 604–627 (2006)
Dai, Y.-H., Huang, Y., Liu, X.-W.: A family of spectral gradient methods for optimization. Comput. Optim. Appl. 74, 43–65 (2019)
Dai, Y.-H., Liao, L.-Z.: \(R\)-linear convergence of the Barzilai and Borwein gradient method. IMA J. Numer. Anal. 22(1), 1–10 (2002)
Dai, Y.-H., Yuan, Y.: Analysis of monotone gradient methods. J. Ind. Manag. Optim. 1(2), 181–192 (2005)
De Asmundis, R., di Serafino, D., Hager, W.W., Toraldo, G., Zhang, H.: An efficient gradient method using the Yuan steplength. Comput. Optim. Appl. 59(3), 541–563 (2014)
De Asmundis, R., di Serafino, D., Landi, G.: On the regularizing behavior of the SDA and SDC gradient methods in the solution of linear ill-posed problems. J. Comput. Appl. Math. 302, 81–93 (2016)
Fletcher, R.: On the Barzilai-Borwein method. In: Optimization and Control with Applications, pp. 235–256. Springer, New York (2005)
Frassoldati, G., Zanni, L., Zanghirati, G.: New adaptive stepsize selections in gradient methods. J. Ind. Manag. Optim. 4(2), 299 (2008)
Friedlander, A., Martínez, J.M., Molina, B., Raydan, M.: Gradient method with retards and generalizations. SIAM J. Numer. Anal. 36(1), 275–289 (1998)
Huang, Y., Dai, Y.-H., Liu, X.-W.: Equipping Barzilai-Borwein method with two-dimensional quadratic termination property, arXiv preprint arXiv:2010.12130, (2020)
Li, D.-W., Sun, R.-Y.: On a faster \(R\)-Linear convergence rate of the Barzilai-Borwein method, arXiv preprint arXiv:2101.00205, (2021)
Luenberger, D.G.: Optimization by Vector Space Methods. Wiley, New York (1968)
Malitsky, Y., Mishchenko, K.: Adaptive gradient descent without descent, arXiv preprint arXiv:1910.09529, (2019)
Nocedal, J., Wright, S.: Numerical Optimization. Springer, New York (1999)
Ortega, J.M., Rheinboldt, W.C.: Iterative Solution of Nonlinear Equations in Several Variables. Academic Press, New York (1970)
Raydan, M.: On the Barzilai and Borwein choice of steplength for the gradient method. IMA J. Numer. Anal. 13(3), 321–326 (1993)
Raydan, M.: The Barzilai and Borwein gradient method for the large scale unconstrained minimization problem. SIAM J. Optim. 7(1), 26–33 (1997)
Raydan, M., Svaiter, B.F.: Relaxed steepest descent and Cauchy-Barzilai-Borwein method. Comput. Optim. Appl. 21(2), 155–167 (2002)
Sun, W.-Y., Yuan, Y.: Optimization Theory and Methods: Nonlinear Programming. Springer, New York (2006)
Yuan, Y.: A short note on the Q-linear convergence of the steepest descent method. Math. Program. 123(2), 339–343 (2010)
Zhigljavsky, A., Pronzato, L., Bukina, E.: An asymptotically optimal gradient algorithm for quadratic optimization with low computational cost. Optim. Lett. 7(6), 1047–1059 (2013)
Zou, Q., Magoulès, F.: Fast gradient methods with alignment for symmetric linear systems without using Cauchy step. J. Comput. Appl. Math. 381, 113033 (2021)
Acknowledgements
The author is very grateful to Professor Oleg Burdakov in Linköping University and Professor Yu-Hong Dai in Chinese Academy of Sciences for their valuable and insightful comments on this manuscript.
This author was supported by the National Natural Science Foundation of China (No. 12001531).
Cite this article
Huang, N. On R-linear convergence analysis for a class of gradient methods. Comput Optim Appl 81, 161–177 (2022). https://doi.org/10.1007/s10589-021-00333-z