Abstract
The gradient descent method achieves an \({\mathcal {O}}(1/\sqrt{K})\) convergence rate on unconstrained nonlinear optimization problems, where K is the number of iterations performed by the method. Traditionally, this analysis assumes a smooth objective function with a Lipschitz continuous gradient. This paper considers a more general class of nonlinear programming problems in which the gradient is only Hölder continuous. More precisely, for any function f in this class, denoted by \({{\mathcal {C}}}^{1,\nu }_L\), there exist \(\nu \in (0,1]\) and \(L>0\) such that \(\Vert \nabla f(\mathbf{x})-\nabla f(\mathbf{y})\Vert \le L \Vert \mathbf{x}-\mathbf{y}\Vert ^{\nu }\) for all \(\mathbf{x,y}\in {{\mathbb {R}}}^n\). We prove that the gradient descent method converges globally to a stationary point and exhibits a convergence rate of \({\mathcal {O}}(1/K^{\frac{\nu }{\nu +1}})\) when the step size is chosen properly, namely smaller than \([\frac{\nu +1}{L}]^{\frac{1}{\nu }}\Vert \nabla f(\mathbf{x}_k)\Vert ^{\frac{1}{\nu }-1}\). Moreover, the algorithm requires \({\mathcal {O}}(1/\epsilon ^{\frac{1}{\nu }+1})\) calls to a first-order oracle to find a point \({\bar{\mathbf{x}}}\) such that \(\Vert \nabla f({{\bar{\mathbf{x}}}})\Vert <\epsilon \).
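As a concrete illustration of the step-size rule quoted above, the following is a minimal sketch of gradient descent with the Hölder step size \(h_k = c\,[\frac{\nu +1}{L}]^{\frac{1}{\nu }}\Vert \nabla f(\mathbf{x}_k)\Vert ^{\frac{1}{\nu }-1}\), where a safeguard factor \(c\in (0,1)\) keeps the step strictly below the bound. The function name `holder_gradient_descent`, the factor `c`, and the stopping test are illustrative assumptions, not the paper's implementation. Note that for \(\nu = 1\) (a Lipschitz continuous gradient) the rule reduces to the familiar constant step size bound \(h_k < 2/L\).

```python
import numpy as np

def holder_gradient_descent(grad, x0, nu, L, K, c=0.5):
    """Gradient descent for f with a Hölder continuous gradient.

    grad : callable returning the gradient of f at a point
    x0   : starting point
    nu   : Hölder exponent in (0, 1]
    L    : Hölder constant of the gradient
    K    : number of iterations
    c    : safeguard in (0, 1) so the step stays strictly below the bound
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(K):
        g = grad(x)
        gnorm = np.linalg.norm(g)
        if gnorm == 0.0:  # already at a stationary point
            break
        # Hölder step size: c * [(nu+1)/L]^(1/nu) * ||grad||^(1/nu - 1);
        # for nu = 1 this is the constant step c * 2/L.
        h = c * ((nu + 1.0) / L) ** (1.0 / nu) * gnorm ** (1.0 / nu - 1.0)
        x = x - h * g
    return x

# Example: f(x) = 0.5 * ||x||^2 has a Lipschitz (nu = 1) gradient with L = 1.
x_bar = holder_gradient_descent(lambda x: x, x0=np.ones(5), nu=1.0, L=1.0, K=200)
```

For \(\nu <1\) the exponent \(\frac{1}{\nu }-1\) is positive, so the step size adapts to the current gradient norm rather than staying constant; this is why the rule, unlike the classical \(h_k < 2/L\) choice, cannot be fixed in advance.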
Acknowledgments
The author would like to thank Guanghui (George) Lan (University of Florida) for introducing this subject.
Cite this article
Yashtini, M. On the global convergence rate of the gradient descent method for functions with Hölder continuous gradients. Optim Lett 10, 1361–1370 (2016). https://doi.org/10.1007/s11590-015-0936-x