
Mathematical Programming, Volume 165, Issue 2, pp. 471–507

From error bounds to the complexity of first-order descent methods for convex functions

  • Jérôme Bolte
  • Trong Phong Nguyen
  • Juan Peypouquet
  • Bruce W. Suter
Full Length Paper, Series A

Abstract

This paper shows that error bounds can be used as effective tools for deriving complexity results for first-order descent methods in convex minimization. As a first step, this objective led us to revisit the interplay between error bounds and the Kurdyka-Łojasiewicz (KL) inequality. We show that the two concepts are equivalent for convex functions having a moderately flat profile near the set of minimizers (such as functions with Hölderian growth). A counterexample shows that the equivalence no longer holds for extremely flat functions; this fact underlines the relevance of an approach based on the KL inequality. As a second step, we show how KL inequalities can in turn be employed to compute new complexity bounds for a wealth of descent methods for convex problems. Our approach is original and makes use of a one-dimensional worst-case proximal sequence, in the spirit of Kantorovich's famous majorant method. Our result applies to a very simple abstract scheme that covers a wide class of descent methods. As a byproduct of our study, we also provide new results on the globalization of KL inequalities in the convex framework. Our main results inaugurate a simple recipe: derive an error bound, compute the desingularizing function whenever possible, identify the essential constants in the descent method, and finally compute the complexity using the one-dimensional worst-case proximal sequence. The method is illustrated through projection methods for feasibility problems and through the iterative shrinkage-thresholding algorithm (ISTA), for which we show that the complexity bound is of the form \(O(q^{k})\), where the constituents of the bound depend only on error bound constants obtained for an arbitrary least-squares objective with \(\ell^1\) regularization.
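
For the reader's convenience, the two notions compared above can be recalled in generic notation (the constants \(\gamma, c > 0\) and the exponent \(p \ge 1\) below are illustrative, not those of the paper). A convex function \(f\) has Hölderian growth when

\[
f(x) - \min f \;\ge\; \gamma \, \mathrm{dist}\bigl(x, \operatorname{argmin} f\bigr)^{p}
\]

for \(x\) near \(\operatorname{argmin} f\), while the KL inequality asks for a concave desingularizing function \(\varphi\) with \(\varphi(0) = 0\) such that

\[
\varphi'\bigl(f(x) - \min f\bigr)\, \mathrm{dist}\bigl(0, \partial f(x)\bigr) \;\ge\; 1 .
\]

In the Hölderian case one may, up to the choice of constants, take \(\varphi(s) = c\, s^{1/p}\); this is the "moderately flat" regime in which the two notions coincide.

The ISTA illustration mentioned at the end of the abstract admits the following minimal sketch of the forward-backward iteration for \(\min_x \tfrac{1}{2}\Vert Ax - b\Vert^{2} + \lambda \Vert x\Vert_{1}\). This is our own illustration, not the authors' implementation; the function names and the fixed iteration count are arbitrary choices.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (componentwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(A, b, lam, num_iters=500):
    """ISTA / forward-backward splitting for min_x 0.5*||Ax - b||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2                    # Lipschitz constant of the smooth part's gradient
    x = np.zeros(A.shape[1])
    for _ in range(num_iters):
        grad = A.T @ (A @ x - b)                     # forward (gradient) step on the least-squares term
        x = soft_threshold(x - grad / L, lam / L)    # backward (proximal) step on the l1 term
    return x

# Small illustrative run on random data.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))
x_true = np.zeros(50); x_true[:3] = 1.0
b = A @ x_true
print(ista(A, b, lam=0.1)[:5])
```

According to the abstract, the complexity of such a scheme on \(\ell^1\)-regularized least squares is of the form \(O(q^{k})\), with \(q\) determined by error bound constants of the problem.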

Keywords

Error bounds · Convex minimization · Forward-backward method · KL inequality · Complexity of first-order methods · LASSO · Compressed sensing

Mathematics Subject Classification

90C06 · 90C25 · 90C60 · 65K05

Acknowledgements

The authors would like to thank Amir Beck, Patrick Combettes, Édouard Pauwels, Marc Teboulle and the anonymous referee for very useful comments.


Copyright information

© Springer-Verlag Berlin Heidelberg and Mathematical Optimization Society 2016

Authors and Affiliations

  • Jérôme Bolte (1, 4)
  • Trong Phong Nguyen (1)
  • Juan Peypouquet (2)
  • Bruce W. Suter (3)

  1. Toulouse School of Economics, Université Toulouse Capitole, Toulouse, France
  2. Departamento de Matemática & AM2V, Universidad Técnica Federico Santa María, Valparaíso, Chile
  3. Air Force Research Laboratory/RITB, Rome, USA
  4. The Research Institute of Time Studies, Yamaguchi University, Yamaguchi, Japan
