Abstract
The gradient descent method achieves an \({\mathcal {O}}(1/\sqrt{K})\) convergence rate on unconstrained nonlinear optimization problems, where K is the number of iterations performed by the method. Traditionally, this analysis assumes a smooth objective function with a Lipschitz continuous gradient. This paper considers a more general class of nonlinear programming problems in which the gradient is only Hölder continuous. More precisely, for any function f in this class, denoted by \({{\mathcal {C}}}^{1,\nu }_L\), there exist \(\nu \in (0,1]\) and \(L>0\) such that \(\Vert \nabla f(\mathbf{x})-\nabla f(\mathbf{y})\Vert \le L \Vert \mathbf{x}-\mathbf{y}\Vert ^{\nu }\) for all \(\mathbf{x,y}\in {{\mathbb {R}}}^n\). We prove that the gradient descent method converges globally to a stationary point and exhibits a convergence rate of \({\mathcal {O}}(1/K^{\frac{\nu }{\nu +1}})\) when the step size is chosen properly, namely smaller than \([\frac{\nu +1}{L}]^{\frac{1}{\nu }}\Vert \nabla f(\mathbf{x}_k)\Vert ^{\frac{1}{\nu }-1}\). Moreover, the algorithm requires \({\mathcal {O}}(1/\epsilon ^{\frac{1}{\nu }+1})\) calls to a first-order oracle to find a point \({\bar{\mathbf{x}}}\) such that \(\Vert \nabla f({{\bar{\mathbf{x}}}})\Vert <\epsilon \).
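As a concrete illustration of the step-size rule quoted above, the following is a minimal sketch of gradient descent with the Hölder step size \(h_k = c\,[\frac{\nu +1}{L}]^{\frac{1}{\nu }}\Vert \nabla f(\mathbf{x}_k)\Vert ^{\frac{1}{\nu }-1}\), where a safeguard factor \(c\in (0,1)\) keeps the step strictly below the bound. The function name `holder_gradient_descent`, the factor `c`, and the stopping test are illustrative assumptions, not the paper's implementation. Note that for \(\nu = 1\) (a Lipschitz continuous gradient) the rule reduces to the familiar constant step size bound \(h_k < 2/L\).

```python
import numpy as np

def holder_gradient_descent(grad, x0, nu, L, K, c=0.5):
    """Gradient descent for f with a Hölder continuous gradient.

    grad : callable returning the gradient of f at a point
    x0   : starting point
    nu   : Hölder exponent in (0, 1]
    L    : Hölder constant of the gradient
    K    : number of iterations
    c    : safeguard in (0, 1) so the step stays strictly below the bound
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(K):
        g = grad(x)
        gnorm = np.linalg.norm(g)
        if gnorm == 0.0:  # already at a stationary point
            break
        # Hölder step size: c * [(nu+1)/L]^(1/nu) * ||grad||^(1/nu - 1);
        # for nu = 1 this is the constant step c * 2/L.
        h = c * ((nu + 1.0) / L) ** (1.0 / nu) * gnorm ** (1.0 / nu - 1.0)
        x = x - h * g
    return x

# Example: f(x) = 0.5 * ||x||^2 has a Lipschitz (nu = 1) gradient with L = 1.
x_bar = holder_gradient_descent(lambda x: x, x0=np.ones(5), nu=1.0, L=1.0, K=200)
```

For \(\nu <1\) the exponent \(\frac{1}{\nu }-1\) is positive, so the step size adapts to the current gradient norm rather than staying constant; this is why the rule, unlike the classical \(h_k < 2/L\) choice, cannot be fixed in advance.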
Acknowledgments
The author would like to thank Guanghui (George) Lan (University of Florida) for introducing this subject.
Cite this article
Yashtini, M. On the global convergence rate of the gradient descent method for functions with Hölder continuous gradients. Optim Lett 10, 1361–1370 (2016). https://doi.org/10.1007/s11590-015-0936-x