
On the global convergence rate of the gradient descent method for functions with Hölder continuous gradients

  • Short Communication
  • Optimization Letters

Abstract

The gradient descent method minimizes an unconstrained nonlinear optimization problem at a rate of \({\mathcal {O}}(1/\sqrt{K})\), where K is the number of iterations performed by the method. Traditionally, this analysis is carried out for smooth objective functions with Lipschitz continuous gradients. This paper considers a more general class of nonlinear programming problems in which the objective functions have Hölder continuous gradients. More precisely, for every function f in this class, denoted by \({{\mathcal {C}}}^{1,\nu }_L\), there exist \(\nu \in (0,1]\) and \(L>0\) such that \(\Vert \nabla f(\mathbf{x})-\nabla f(\mathbf{y})\Vert \le L \Vert \mathbf{x}-\mathbf{y}\Vert ^{\nu }\) holds for all \(\mathbf{x,y}\in {{\mathbb {R}}}^n\). We prove that the gradient descent method converges globally to a stationary point and exhibits a convergence rate of \({\mathcal {O}}(1/K^{\frac{\nu }{\nu +1}})\) when the step-size is chosen properly, i.e., less than \([\frac{\nu +1}{L}]^{\frac{1}{\nu }}\Vert \nabla f(\mathbf{x}_k)\Vert ^{\frac{1}{\nu }-1}\). Moreover, the algorithm requires \({\mathcal {O}}(1/\epsilon ^{\frac{1}{\nu }+1})\) calls to a first-order oracle to find a point \({\bar{\mathbf{x}}}\) such that \(\Vert \nabla f({{\bar{\mathbf{x}}}})\Vert <\epsilon \).
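To make the step-size rule concrete, the following is a minimal sketch (not taken from the paper) of gradient descent with a step-size chosen strictly below the threshold \([\frac{\nu +1}{L}]^{\frac{1}{\nu }}\Vert \nabla f(\mathbf{x}_k)\Vert ^{\frac{1}{\nu }-1}\) stated above. The function name, the 0.5 safety factor, and the test problem are illustrative assumptions.

```python
import numpy as np

def gradient_descent_holder(grad, x0, L, nu, eps=1e-6, max_iter=10000):
    """Gradient descent with a step-size strictly below the threshold
    [(nu + 1)/L]^(1/nu) * ||grad f(x_k)||^(1/nu - 1) from the abstract.
    The 0.5 safety factor is an illustrative choice, not from the paper."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        g = grad(x)
        gnorm = np.linalg.norm(g)
        if gnorm < eps:          # stop once ||grad f(x)|| < eps
            return x, k
        # step-size below the theoretical threshold guarantees descent
        t = 0.5 * ((nu + 1.0) / L) ** (1.0 / nu) * gnorm ** (1.0 / nu - 1.0)
        x = x - t * g
    return x, max_iter

# Illustrative test problem (an assumption, not from the paper):
# f(x) = ||x||^(1+nu) / (1+nu) has gradient ||x||^(nu-1) x, which is
# Hölder continuous with exponent nu; L = 2 is a conservative constant.
if __name__ == "__main__":
    nu, L = 0.5, 2.0
    grad = lambda x: (np.linalg.norm(x) ** (nu - 1.0)) * x if np.linalg.norm(x) > 0 else np.zeros_like(x)
    x_bar, iters = gradient_descent_holder(grad, x0=np.ones(5), L=L, nu=nu)
    print(iters, np.linalg.norm(grad(x_bar)))
```

The rationale behind the threshold is the standard descent inequality for Hölder continuous gradients, \(f(\mathbf{x}_{k+1}) \le f(\mathbf{x}_k) - t_k\Vert \nabla f(\mathbf{x}_k)\Vert ^2 + \frac{L}{\nu +1}t_k^{\nu +1}\Vert \nabla f(\mathbf{x}_k)\Vert ^{\nu +1}\): any step-size below the threshold makes the second term strictly smaller than the first, so the objective decreases at every iteration. The paper's precise constants and argument may differ.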



Acknowledgments

The author would like to thank Guanghui (George) Lan (University of Florida) for introducing this subject.

Author information

Correspondence to Maryam Yashtini.


About this article

Cite this article

Yashtini, M. On the global convergence rate of the gradient descent method for functions with Hölder continuous gradients. Optim Lett 10, 1361–1370 (2016). https://doi.org/10.1007/s11590-015-0936-x
