Convergence Rate of Inertial Proximal Algorithms with General Extrapolation and Proximal Coefficients

Abstract

In a Hilbert space setting \({\mathcal{H}}\), we analyze the convergence rate of inertial proximal algorithms designed to minimize, by fast methods, a general convex, lower semicontinuous, and proper function \({\Phi }: {\mathcal{H}} \rightarrow \mathbb {R} \cup \{+\infty \}\). These algorithms involve both extrapolation coefficients (including Nesterov's acceleration method) and proximal coefficients in a general form. They can be interpreted as discrete-time versions of inertial continuous gradient systems with general damping and time-scale coefficients. Under an appropriate setting of these parameters, we show fast convergence of the values and convergence of the iterates. In doing so, we provide an overview of this class of algorithms. Our study complements the previous Attouch–Cabot paper (SIOPT, 2018) by introducing time-scaling aspects into the algorithm, and sheds new light on Güler's seminal papers on the convergence rate of accelerated proximal methods for convex optimization.


References

  1. Alvarez, F., Attouch, H.: An inertial proximal method for maximal monotone operators via discretization of a nonlinear oscillator with damping. Set-Valued Anal. 9, 3–11 (2001)

  2. Álvarez, F., Attouch, H., Bolte, J., Redont, P.: A second-order gradient-like dissipative dynamical system with Hessian-driven damping. Application to optimization and mechanics. J. Math. Pures Appl. 81, 747–779 (2002)

  3. Apidopoulos, V., Aujol, J.-F., Dossal, Ch.: Convergence rate of inertial forward-backward algorithm beyond Nesterov's rule. Math. Program. https://doi.org/10.1007/s10107-018-1350-9. HAL-01551873 (2018)

  4. Attouch, H., Cabot, A.: Asymptotic stabilization of inertial gradient dynamics with time-dependent viscosity. J. Differ. Equ. 263, 5412–5458 (2017)

  5. Attouch, H., Cabot, A.: Convergence rates of inertial forward-backward algorithms. SIAM J. Optim. 28, 849–874 (2018)

  6. Attouch, H., Cabot, A., Chbani, Z., Riahi, H.: Inertial forward-backward algorithms with perturbations: application to Tikhonov regularization. J. Optim. Theory Appl. 179, 1–36 (2018)

  7. Attouch, H., Chbani, Z., Riahi, H.: Fast proximal methods via time scaling of damped inertial dynamics. SIAM J. Optim. 29, 2227–2256 (2019)

  8. Attouch, H., Chbani, Z., Peypouquet, J., Redont, P.: Fast convergence of inertial dynamics and algorithms with asymptotic vanishing viscosity. Math. Program. Ser. B 168, 123–175 (2018)

  9. Attouch, H., Chbani, Z., Riahi, H.: Rate of convergence of the Nesterov accelerated gradient method in the subcritical case α ≤ 3. ESAIM Control Optim. Calc. Var. 25 (2019). https://doi.org/10.1051/cocv/2017083

  10. Attouch, H., Peypouquet, J.: The rate of convergence of Nesterov's accelerated forward-backward method is actually faster than 1/k². SIAM J. Optim. 26, 1824–1834 (2016)

  11. Aujol, J.-F., Dossal, Ch.: Stability of over-relaxations for the forward-backward algorithm, application to FISTA. SIAM J. Optim. 25, 2408–2433 (2015)

  12. Aujol, J.-F., Dossal, Ch.: Optimal rate of convergence of an ODE associated to the Fast Gradient Descent schemes for b > 0. https://hal.inria.fr/hal-01547251v2 (2017)

  13. Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. CMS Books in Mathematics. Springer, Cham (2011)

  14. Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2, 183–202 (2009)

  15. Boţ, R.I., Csetnek, E.R., László, S.C.: A second-order dynamical approach with variable damping to nonconvex smooth minimization. Appl. Anal. (2018). https://doi.org/10.1080/00036811.2018.1495330

  16. Bonettini, S., Porta, F., Ruggiero, V.: A variable metric forward-backward method with extrapolation. SIAM J. Sci. Comput. 38, A2558–A2584 (2016)

  17. Burger, M., Sawatzky, A., Steidl, G.: First order algorithms in variational image processing. In: Glowinski, R., Osher, S., Yin, W. (eds.) Splitting Methods in Communication, Imaging, Science, and Engineering, pp. 345–407. Springer, Cham (2016)

  18. Calatroni, L., Chambolle, A.: Backtracking strategies for accelerated descent methods with smooth composite objectives. SIAM J. Optim. 29, 1772–1798 (2019)

  19. Chambolle, A., Dossal, Ch.: On the convergence of the iterates of the “Fast Iterative Shrinkage/Thresholding Algorithm”. J. Optim. Theory Appl. 166, 968–982 (2015)

  20. Combettes, P.L., Glaudin, L.E.: Proximal activation of smooth functions in splitting algorithms for convex image recovery. SIAM J. Imaging Sci. (2019). To appear

  21. Combettes, P.L., Wajs, V.R.: Signal recovery by proximal forward-backward splitting. Multiscale Model. Simul. 4, 1168–1200 (2005)

  22. Güler, O.: On the convergence of the proximal point algorithm for convex optimization. SIAM J. Control Optim. 29, 403–419 (1991)

  23. Güler, O.: New proximal point algorithms for convex minimization. SIAM J. Optim. 2, 649–664 (1992)

  24. Kim, D., Fessler, J.A.: Optimized first-order methods for smooth convex minimization. Math. Program. 159, 81–107 (2016)

  25. Liang, J., Fadili, J., Peyré, G.: Local linear convergence of forward-backward under partial smoothness. In: Ghahramani, Z., et al. (eds.) Advances in Neural Information Processing Systems 27, pp. 1970–1978. Curran Associates Inc. (2014)

  26. Lorenz, D.A., Pock, Th.: An inertial forward-backward algorithm for monotone inclusions. J. Math. Imaging Vis. 51, 311–325 (2015)

  27. May, R.: Asymptotic for a second-order evolution equation with convex potential and vanishing damping term. Turk. J. Math. 41, 681–685 (2017)

  28. Nesterov, Y.: A method of solving a convex programming problem with convergence rate O(1/k²). Soviet Math. Dokl. 27, 372–376 (1983)

  29. Nesterov, Y.: Introductory Lectures on Convex Optimization: A Basic Course. Applied Optimization, vol. 87. Kluwer Academic Publishers, Boston (2004)

  30. Opial, Z.: Weak convergence of the sequence of successive approximations for nonexpansive mappings. Bull. Am. Math. Soc. 73, 591–597 (1967)

  31. Parikh, N., Boyd, S.: Proximal algorithms. Foundations and Trends in Optimization 1, 127–239 (2013)

  32. Peypouquet, J.: Convex Optimization in Normed Spaces: Theory, Methods and Examples. Springer, Cham (2015)

  33. Peypouquet, J., Sorin, S.: Evolution equations for maximal monotone operators: asymptotic analysis in continuous and discrete time. J. Convex Anal. 17, 1113–1163 (2010)

  34. Polyak, B.T.: Some methods of speeding up the convergence of iteration methods. USSR Comput. Math. Math. Phys. 4, 1–17 (1964)

  35. Polyak, B.T.: Introduction to Optimization. Optimization Software, New York (1987)

  36. Scheinberg, K., Goldfarb, D., Bai, X.: Fast first-order methods for composite convex optimization with backtracking. Found. Comput. Math. 14, 389–417 (2014)

  37. Shi, B., Du, S.S., Jordan, M.I., Su, W.J.: Understanding the acceleration phenomenon via high-resolution differential equations. arXiv:1810.08907 (2018)

  38. Schmidt, M., Le Roux, N., Bach, F.: Convergence rates of inexact proximal-gradient methods for convex optimization. In: NIPS'11 – 25th Annual Conference on Neural Information Processing Systems, Granada, Spain, Dec 2011. HAL-inria-00618152v3 (2011)

  39. Su, W.J., Boyd, S., Candès, E.J.: A differential equation for modeling Nesterov's accelerated gradient method: theory and insights. In: Ghahramani, Z., et al. (eds.) Advances in Neural Information Processing Systems 27, pp. 2510–2518. Curran Associates Inc. (2014)

  40. Villa, S., Salzo, S., Baldassarre, L., Verri, A.: Accelerated and inexact forward-backward algorithms. SIAM J. Optim. 23, 1607–1633 (2013)


Author information

Correspondence to Hedy Attouch.

Additional information

This paper is dedicated to Professor Marco A. López Cerdá on the occasion of his 70th birthday.


Appendix: Some Auxiliary Results

The following lemmas are used throughout the paper. To establish the weak convergence of the iterates of (IP)\(_{\alpha _{k}, \beta _{k}}\), we apply Opial's Lemma [30], which we recall in its discrete form.

Lemma 4

Let S be a nonempty subset of \({\mathcal{H}}\), and \((x_{k})\) a sequence in \({\mathcal{H}}\). Assume that

  (i) every sequential weak cluster point of \((x_{k})\), as \(k\to +\infty \), belongs to S;

  (ii) for every \(z\in S\), \(\lim _{k\to +\infty }\|x_{k}-z\|\) exists.

Then \((x_{k})\) converges weakly, as \(k \to +\infty \), to a point in S.
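
As a guide to how this lemma is typically invoked in the present setting (a sketch, under the assumption that \(S = \operatorname{argmin} {\Phi}\) is nonempty), condition (ii) usually follows from a Lyapunov analysis of the algorithm, while condition (i) follows from the convergence of the values together with the weak lower semicontinuity of the convex function \({\Phi}\): if \(x_{k_{n}} \rightharpoonup \bar{x}\) and \({\Phi}(x_{k}) \to \min_{\mathcal{H}} {\Phi}\), then

$$ {\Phi}(\bar{x}) \leq \liminf_{n\to +\infty} {\Phi}(x_{k_{n}}) = \min_{\mathcal{H}} {\Phi}, \quad \text{so that } \bar{x} \in \operatorname{argmin} {\Phi} = S. $$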

The next lemma allows us to estimate the rate of convergence of a non-increasing sequence \((\varepsilon_{k})\) that is summable with respect to given weight coefficients; see [5, Lemma 21] for the proof.

Lemma 5

Let \((\tau_{k})\) be a nonnegative sequence such that \({\sum }_{k=1}^{+\infty } \tau _{k}=+\infty \). Assume that \((\varepsilon_{k})\) is a nonnegative and non-increasing sequence satisfying \({\sum }_{k=1}^{+\infty } \tau _{k} \varepsilon _{k}<+\infty \). Then we have

$$ \varepsilon_{k} = o\left( \frac{1}{{\sum}_{i=1}^{k} \tau_{i}}\right) \quad \text{ as }~ k\to +\infty. $$
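
For instance (our illustration of the mechanism), taking the weights \(\tau_{i}=i\) gives \({\sum}_{i=1}^{k} \tau_{i}=k(k+1)/2\), so any nonnegative non-increasing sequence \((\varepsilon_{k})\) with \({\sum}_{k=1}^{+\infty} k\varepsilon_{k}<+\infty\) satisfies

$$ \varepsilon_{k} = o\left( \frac{1}{k^{2}}\right) \quad \text{ as }~ k\to +\infty, $$

which is the typical way one passes from the summability of weighted values to an o(1/k²) convergence rate.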

The following result shows the summability of a sequence \((a_{k})\) satisfying a suitable inequality.

Lemma 6

Given a nonnegative sequence \((\alpha_{k})\) satisfying (K0), let \((t_{k})\) be the sequence defined by \(t_{k}=1+{\sum }_{i=k}^{+\infty }{\prod }_{j=k}^{i}\alpha _{j}\). Let \((a_{k})\) and \((\omega_{k})\) be two nonnegative sequences such that

$$ a_{k+1} \leq \alpha_{k}a_{k}+\omega_{k}, $$
(51)

for all k ≥ 0. If \({\sum }_{k=0}^{+\infty }t_{k+1}\omega _{k}<+\infty \), then \({\sum }_{k=0}^{+\infty }a_{k}<+\infty \).

Proof

By Lemma 1, we have \(t_{k+1}\alpha_{k} = t_{k}-1\). Multiplying inequality (51) by \(t_{k+1}\) gives

$$ t_{k+1}a_{k+1}\leq (t_{k}-1)a_{k}+t_{k+1}\omega_{k}, $$

or equivalently \(a_{k} \leq (t_{k}a_{k}-t_{k+1}a_{k+1}) + t_{k+1}\omega_{k}\). By summing from k = 0 to n, we obtain

$$ \sum\limits_{k=0}^{n}a_{k} \leq t_{0} a_{0} - t_{n+1}a_{n+1} + \sum\limits_{k=0}^{n}t_{k+1}\omega_{k} \leq t_{0}a_{0} + \sum\limits_{k=0}^{+\infty}t_{k+1}\omega_{k} < +\infty. $$

The conclusion follows by letting n tend to \(+\infty \). □
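
As a worked instance (our illustration, assuming the Nesterov-type extrapolation coefficient \(\alpha_{k}=\frac{k}{k+3}\)), a telescoping computation gives

$$ \prod\limits_{j=k}^{i}\frac{j}{j+3}=\frac{k(k+1)(k+2)}{(i+1)(i+2)(i+3)}, \qquad \sum\limits_{i=k}^{+\infty}\frac{1}{(i+1)(i+2)(i+3)}=\frac{1}{2(k+1)(k+2)}, $$

so that \(t_{k}=1+\frac{k}{2}=\frac{k+2}{2}\). One then checks directly that \(t_{k+1}\alpha_{k}=\frac{k+3}{2}\cdot\frac{k}{k+3}=t_{k}-1\), which is the identity used at the beginning of the proof, and the summability condition \({\sum }_{k=0}^{+\infty }t_{k+1}\omega _{k}<+\infty\) reduces, up to a constant, to \({\sum }_{k=0}^{+\infty } k\omega _{k}<+\infty\).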

Lemma 7

[8, Lemma 5.14] Let \((a_{k})\) be a sequence of nonnegative numbers such that, for all \(k\in \mathbb {N}\), \({a_{k}^{2}} \leq c^{2} + {\sum }_{j=1}^{k} b_{j} a_{j}\), where \((b_{j})\) is a summable sequence of nonnegative numbers, and c ≥ 0. Then, for all \(k\in \mathbb {N}\), \(a_{k} \leq c + {\sum }_{j=1}^{+\infty } b_{j}\).
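
For completeness, here is a short argument (our sketch; the full proof is in [8, Lemma 5.14]). Set \(A_{k}:=\max_{1\leq j\leq k} a_{j}\) and \(B:={\sum }_{j=1}^{+\infty } b_{j}\). Picking an index \(m\leq k\) with \(a_{m}=A_{k}\), the assumption gives \(A_{k}^{2}=a_{m}^{2}\leq c^{2}+ B A_{k}\), hence

$$ A_{k} \leq \frac{B+\sqrt{B^{2}+4c^{2}}}{2} \leq c+B, $$

which yields the stated bound for every k.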


Cite this article

Attouch, H., Chbani, Z. & Riahi, H. Convergence Rate of Inertial Proximal Algorithms with General Extrapolation and Proximal Coefficients. Vietnam J. Math. 48, 247–276 (2020). https://doi.org/10.1007/s10013-020-00399-y


Keywords

  • Inertial proximal algorithms
  • General extrapolation coefficient
  • Lyapunov analysis
  • Nesterov accelerated gradient method
  • Nonsmooth convex optimization
  • Time rescaling

Mathematics Subject Classification (2010)

  • 37N40
  • 46N10
  • 49M30
  • 65B99
  • 65K05
  • 65K10
  • 90B50
  • 90C25