On Linear Convergence of Non-Euclidean Gradient Methods without Strong Convexity and Lipschitz Gradient Continuity

  • Heinz H. Bauschke
  • Jérôme Bolte
  • Jiawei Chen
  • Marc Teboulle
  • Xianfu Wang


The gradient method is well known to converge linearly and globally when the objective function is strongly convex and admits a Lipschitz continuous gradient. In many applications, both assumptions are too stringent, precluding the use of gradient methods. In the early 1960s, after the breakthrough of Łojasiewicz on gradient inequalities, it was observed that uniform convexity assumptions could be relaxed and replaced by these inequalities. On the other hand, very recently, it has been shown that Lipschitz gradient continuity can be lifted and replaced by membership in a class of functions satisfying a non-Euclidean descent property expressed in terms of a Bregman distance. In this note, we combine these two ideas to introduce a class of non-Euclidean gradient-like inequalities that allow us to prove linear convergence of a Bregman gradient method for nonconvex minimization, even when neither strong convexity nor Lipschitz gradient continuity holds.
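The Bregman gradient method described above replaces the Euclidean proximity term of classical gradient descent with a Bregman distance D_h generated by a kernel h. As a minimal illustrative sketch (not the paper's algorithm verbatim), the snippet below assumes the Boltzmann-Shannon entropy kernel h(x) = Σ x_i log x_i, for which each Bregman step has a closed-form multiplicative update; the objective f and the step size used in the example are hypothetical.

```python
import numpy as np

def bregman_gradient_step(x, grad_f, step):
    """One Bregman gradient step with the Boltzmann-Shannon entropy kernel
    h(x) = sum_i x_i log x_i.  The step solves
        x+ = argmin_u <grad_f(x), u> + (1/step) * D_h(u, x),
    whose first-order condition log(u/x) = -step * grad_f(x) gives the
    closed-form multiplicative update below (positivity is preserved)."""
    return x * np.exp(-step * grad_f(x))

# Hypothetical example: minimize f(x) = <c, x> - sum_i log x_i on the
# positive orthant; the unique minimizer is x_i = 1/c_i.
c = np.array([2.0, 5.0])
grad_f = lambda x: c - 1.0 / x

x = np.ones(2)
for _ in range(200):
    x = bregman_gradient_step(x, grad_f, 0.1)
# x is now close to 1/c = [0.5, 0.2]
```

Taking h(x) = ||x||^2 / 2 instead recovers ordinary gradient descent, which illustrates why the non-Euclidean kernel, chosen to match the geometry of f, can stand in for Lipschitz gradient continuity.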


Keywords

Non-Euclidean gradient methods · Nonconvex minimization · Bregman distance · Lipschitz-like convexity condition · Descent lemma without Lipschitz gradient · Gradient dominated inequality · Łojasiewicz gradient inequality · Linear rate of convergence

Mathematics Subject Classification

65K05 · 49M10 · 90C26 · 90C30 · 65K10



Heinz Bauschke was partially supported by the Natural Sciences and Engineering Research Council of Canada. Jérôme Bolte was partially supported by Air Force Office of Scientific Research, Air Force Material Command, USAF, under Grant Number FA9550-18-1-0226. Jiawei Chen was partially supported by the Natural Science Foundation of China (Nos. 11401487, 11771058), the Basic and Advanced Research Project of Chongqing (cstc2016jcyjA0239) and National Scholarship under the China Scholarship Council. Marc Teboulle was partially supported by the Israel Science Foundation under ISF Grant 1844-16. Xianfu Wang was partially supported by the Natural Sciences and Engineering Research Council of Canada.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Heinz H. Bauschke (1)
  • Jérôme Bolte (2, corresponding author)
  • Jiawei Chen (3)
  • Marc Teboulle (4)
  • Xianfu Wang (1)

  1. Department of Mathematics, Irving K. Barber School, University of British Columbia Okanagan, Kelowna, Canada
  2. Toulouse School of Economics, Université Toulouse 1 Capitole, Toulouse, France
  3. School of Mathematics and Statistics, Southwest University, Chongqing, China
  4. School of Mathematical Sciences, Tel Aviv University, Ramat Aviv, Israel
