Abstract
This paper proposes and develops new linesearch methods with inexact gradient information for finding stationary points of nonconvex continuously differentiable functions on finite-dimensional spaces. Abstract convergence results are established for a broad class of linesearch methods. A general scheme for inexact reduced gradient (IRG) methods is proposed, in which the errors in the gradient approximation automatically adapt to the magnitudes of the exact gradients. The sequences of iterates are shown to have stationary accumulation points under several stepsize selections. Convergence results with constructive convergence rates for the developed IRG methods are established under the Kurdyka–Łojasiewicz property. The obtained results are confirmed by encouraging numerical experiments, which demonstrate the advantages of automatically controlled errors in IRG methods over other frequently used error selections.
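To illustrate the kind of adaptive error control described above, the following is a minimal Python sketch of an inexact gradient descent with a backtracking linesearch. The relative-error rule (shrinking the error tolerance until it is at most a fixed fraction of the approximate gradient norm), the function names, and all parameter values are assumptions made for illustration only; they are not the paper's exact IRG scheme.

```python
import numpy as np

def irg_sketch(f, inexact_grad, x0, mu=0.5, sigma=1e-4, beta=0.5,
               t0=1.0, max_iter=1000, tol=1e-6):
    """Illustrative inexact-gradient descent with a backtracking linesearch.

    inexact_grad(x, eps) is assumed to return an approximate gradient g with
    ||g - grad f(x)|| <= eps.  The relative-error rule below (tighten eps
    until it is at most mu * ||g||) is an assumption for illustration only.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        # Adapt the allowed gradient error to the current gradient magnitude.
        eps = 1.0
        g = inexact_grad(x, eps)
        while eps > mu * np.linalg.norm(g) and eps > 1e-12:
            eps *= 0.5
            g = inexact_grad(x, eps)
        if np.linalg.norm(g) <= tol:
            break
        # Armijo-type backtracking along the negative approximate gradient.
        t, fx = t0, f(x)
        while f(x - t * g) > fx - sigma * t * np.linalg.norm(g) ** 2:
            t *= beta
        x = x - t * g
    return x


if __name__ == "__main__":
    # Toy nonconvex example: f(x, y) = (x^2 - 1)^2 + y^2, with the inexact
    # gradient simulated by adding bounded noise to the true gradient.
    rng = np.random.default_rng(0)
    f = lambda z: (z[0] ** 2 - 1) ** 2 + z[1] ** 2
    def inexact_grad(z, eps):
        g = np.array([4 * z[0] * (z[0] ** 2 - 1), 2 * z[1]])
        noise = rng.standard_normal(2)
        return g + eps * noise / (2 * np.linalg.norm(noise))
    print(irg_sketch(f, inexact_grad, np.array([2.0, 1.5])))
```

With the relative error bounded by half the approximate gradient norm, the negative approximate gradient remains a descent direction, so the backtracking step terminates; this is the intuition behind letting the error adapt to the gradient magnitude rather than fixing it in advance.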
Acknowledgements
The authors are very grateful to anonymous reviewers for their helpful remarks and suggestions, which allowed us to improve the original presentation.
Additional information
Communicated by Arkadi Nemirovski.
Dedicated to the memory of Boris Polyak, a great mathematician and incredible person.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Research of this author was funded by Ho Chi Minh City University of Education Foundation for Science and Technology under Grant Number CS.2023.19.02TD.
Research of this author was partly supported by the US National Science Foundation under Grants DMS-1808978 and DMS-2204519, by the Australian Research Council under Grant DP-190100555, and by Project 111 of China under grant D21024.
Research of this author was partly supported by the US National Science Foundation under Grant DMS-1808978.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Khanh, P.D., Mordukhovich, B.S. & Tran, D.B. Inexact Reduced Gradient Methods in Nonconvex Optimization. J Optim Theory Appl (2023). https://doi.org/10.1007/s10957-023-02319-9
Received:
Accepted:
Published:
DOI: https://doi.org/10.1007/s10957-023-02319-9
Keywords
- Nonconvex optimization
- Inexact reduced gradient methods
- Linesearch methods
- Kurdyka–Łojasiewicz property
- Convergence rates