Abstract
In a Hilbert space \({\mathcal H}\), we introduce a new class of first-order algorithms which arise naturally as temporal discretizations of an inertial differential inclusion jointly involving viscous friction and dry friction. The function \(f:{\mathcal H}\rightarrow {\mathbb {R}}\) to be minimized is assumed to be differentiable (not necessarily convex), and enters the algorithm via its gradient. The dry friction damping function \(\phi :{\mathcal H}\rightarrow {\mathbb {R}}_+\) is convex with a sharp minimum at the origin (typically \(\phi (x) = r \Vert x\Vert \) with \(r >0\)). It enters the algorithm via its proximal mapping, which acts as a soft-thresholding operator on the velocities. As a result, we obtain a new class of splitting algorithms in which the proximal and gradient steps are performed separately. The sequence of iterates has finite length, and therefore converges strongly towards an approximate critical point \(x_{\infty }\) of \(f\) (typically \(\Vert \nabla f(x_{\infty })\Vert \le r\)). Under a geometric property satisfied by the limit point \(x_{\infty }\), we obtain geometric and finite rates of convergence. The convergence results tolerate the presence of errors, under the sole assumption that they converge asymptotically to zero. By replacing the function \(f\) by its Moreau envelope, we extend the results to the case of nonsmooth convex functions. In this case, the algorithm involves the proximal operators of \(f\) and \(\phi \) separately. Several variants of this algorithm are considered, including the case of the Nesterov accelerated gradient method. We then consider the extension to additive composite optimization, which leads to new splitting methods. Numerical experiments are given for Lasso-type problems. The performance profiles, used as a comparison tool, demonstrate the efficiency of the Nesterov accelerated method with asymptotic vanishing damping combined with dry friction.
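To make the role of the soft-thresholding step concrete, here is a minimal Python sketch for the model case \(\phi (x) = r\Vert x\Vert \). It is obtained from a semi-implicit time discretization, with step size \(h\) and constant viscous coefficient \(\gamma \), of an inclusion assumed to have the form \(\ddot{x} + \gamma \dot{x} + \partial \phi (\dot{x}) + \nabla f(x) \ni 0\). This discretization, the function names and the parameter values are illustrative assumptions; the sketch is not claimed to be the exact scheme analyzed in the paper.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]


def prox_dry_friction(z: Vector, threshold: float) -> Vector:
    """Proximal map of threshold * ||.||: shrink z radially, or return 0
    when ||z|| <= threshold (soft thresholding of a velocity)."""
    norm = math.sqrt(sum(c * c for c in z))
    if norm <= threshold:
        return [0.0] * len(z)
    scale = 1.0 - threshold / norm
    return [scale * c for c in z]


def inertial_prox_grad_dry_friction(
    grad_f: Callable[[Vector], Vector],
    x0: Vector,
    r: float,
    gamma: float,
    h: float,
    n_iter: int,
) -> Tuple[Vector, Vector]:
    """Hypothetical discretization (an assumption of this sketch):
        (v_new - v_old)/h + gamma*v_new + d(phi)(v_new) + grad_f(x_k) contains 0,
        x_{k+1} = x_k + h*v_new,
    with phi = r*||.||. Returns the last two iterates (x_{k-1}, x_k)."""
    x_prev = list(x0)
    x = list(x0)  # zero initial velocity
    denom = 1.0 + gamma * h
    for _ in range(n_iter):
        g = grad_f(x)
        # Explicit gradient step applied to the current velocity.
        z = [((xi - xpi) / h - h * gi) / denom
             for xi, xpi, gi in zip(x, x_prev, g)]
        # Implicit dry friction step: soft thresholding of the velocity.
        v = prox_dry_friction(z, h * r / denom)
        x_prev, x = x, [xi + h * vi for xi, vi in zip(x, v)]
    return x_prev, x
```

For instance, with \(f(x) = \tfrac{1}{2}\Vert x\Vert ^2\), calling `inertial_prox_grad_dry_friction(lambda x: list(x), [5.0], r=0.5, gamma=1.0, h=0.1, n_iter=5000)` should return two identical iterates, that is, the sequence has stopped moving, at a point where \(\Vert \nabla f(x)\Vert \le r\). The velocity is set exactly to zero as soon as the thresholding test succeeds, which is the mechanism behind the finite length of the iterates.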
Notes
This interesting suggestion was made to us by one of the two anonymous reviewers.
We thank the anonymous reviewer for suggesting it.
References
Adly, S.: A Variational Approach to Nonsmooth Dynamics: Applications in Unilateral Mechanics and Electronics, Springer Briefs in Mathematics. Springer, Berlin (2017)
Adly, S., Attouch, H., Cabot, A.: Finite-time stabilization of nonlinear oscillators subject to dry friction, nonsmooth mechanics and analysis. In: Advances in Mechanics and Mathematics, vol. 12, pp. 289–304. Springer, New York (2006)
Adly, S., Brogliato, B., Le, B.K.: Well-posedness, robustness and stability analysis of a set-valued controller for Lagrangian systems. SIAM J. Control Optim. 51(2), 1592–1614 (2013)
Alvarez, F.: On the minimizing property of a second-order dissipative system in Hilbert spaces. SIAM J. Control Optim. 38(4), 1102–1119 (2000)
Amann, H., Díaz, J.I.: A note on the dynamics of an oscillator in the presence of strong friction. Nonlinear Anal. 55, 209–216 (2003)
Apidopoulos, V., Aujol, J.-F., Dossal, Ch.: Convergence Rate of Inertial Forward-Backward Algorithm Beyond Nesterov’s Rule, Mathematical Programming, Series A, pp. 1–20. Springer, Berlin (2018)
Attouch, H., Buttazzo, G., Michaille, G.: Variational analysis in Sobolev and BV spaces. Applications to PDEs and optimization, 2nd Edn, MOS/SIAM Series on Optimization, MO 17. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA (2014)
Attouch, H., Cabot, A.: Asymptotic stabilization of inertial gradient dynamics with time-dependent viscosity. J. Differ. Equ. 263, 5412–5458 (2017)
Attouch, H., Cabot, A.: Convergence rates of inertial forward–backward algorithms. SIAM J. Optim. 28(1), 849–874 (2018)
Attouch, H., Cabot, A., Chbani, Z., Riahi, H.: Rate of convergence of inertial gradient dynamics with time-dependent viscous damping coefficient. Evol. Equ. Control Theory 7(3), 353–371 (2018)
Attouch, H., Chbani, Z., Peypouquet, J., Redont, P.: Fast convergence of inertial dynamics and algorithms with asymptotic vanishing viscosity. Math. Program. Ser. B 168, 123–175 (2018)
Attouch, H., Chbani, Z., Riahi, H.: Rate of convergence of the Nesterov accelerated gradient method in the subcritical case \(\alpha \le 3\), ESAIM-COCV, 25, published electronically (2019)
Attouch, H., Peypouquet, J.: The rate of convergence of Nesterov’s accelerated forward–backward method is actually faster than \(1/k^2\). SIAM J. Optim. 26(3), 1824–1834 (2016)
Attouch, H., Chbani, Z., Fadili, J., Riahi, H.: First-order optimization algorithms via inertial systems with Hessian driven damping. Math. Program. Ser. A (2020). https://doi.org/10.1007/s10107-020-01591-1
Aujol, J.-F., Dossal, Ch.: Stability of over-relaxations for the forward–backward algorithm, application to FISTA. SIAM J. Optim. 25(4), 2408–2433 (2015)
Baji, B., Cabot, A.: An inertial proximal algorithm with dry friction: finite convergence results. Set Valued Anal. 9(1), 1–23 (2006)
Bauschke, H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. CMS Books in Mathematics. Springer, Berlin (2011)
Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009)
Bot, R. I., Csetnek, E. R., László, S. C.: A second order dynamical approach with variable damping to nonconvex smooth minimization. Appl. Anal. (2018) (to appear)
Bot, R.I., Csetnek, E.R.: Second order forward–backward dynamical systems for monotone inclusion problems. SIAM J. Control Optim. 54(3), 1423–1443 (2016)
Brézis, H.: Opérateurs maximaux monotones dans les espaces de Hilbert et équations d’évolution, Lecture Notes, vol. 5. North Holland, Amsterdam (1972)
Chambolle, A., Dossal, Ch.: On the convergence of the iterates of the fast iterative shrinkage thresholding algorithm. J. Optim. Theory Appl. 166, 968–982 (2015)
Chambolle, A., Pock, T.: An introduction to continuous optimization for imaging. Acta Numer. 25, 161–319 (2016)
Díaz, J.I., Liñán, A.: On the asymptotic behavior of a damped oscillator under a sublinear friction term. Rev. R. Acad. Cien. Serie A. Mat. 95(1), 155–160 (2001)
Dolan, E.D., Moré, J.J.: Benchmarking optimization software with performance profiles. Math. Program. 91, 201–213 (2002)
Ghadimi, E., Feyzmahdavian, H. R., Johansson, M.: Global convergence of the heavy-ball method for convex optimization. In: 2015 European Control Conference, July, pp. 310–315 (2015)
Haraux, A., Jendoubi, M.A.: Convergence of solutions of second-order gradient-like systems with analytic nonlinearities. J. Differ. Equ. 144(2), 313–320 (1998)
Haraux, A., Jendoubi, M.A.: The Convergence Problem for Dissipative Autonomous Systems, Classical Methods and Recent Advances. Springer, Berlin (2015)
Lemaréchal, C., Sagastizábal, C.: Practical aspects of the Moreau–Yosida regularization: theoretical preliminaries. SIAM J. Optim. 7(2), 367–385 (1997)
May, R.: Asymptotic for a second-order evolution equation with convex potential and vanishing damping term. Turk. J. Math. 41(3), 681–685 (2017)
Nesterov, Y.: A method of solving a convex programming problem with convergence rate \(O(1/k^2)\). Sov. Math. Dokl. 27, 372–376 (1983)
Nesterov, Y.: Introductory Lectures on Convex Optimization: A Basic Course, Applied Optimization, vol. 87. Kluwer Academic Publishers, Boston, MA (2004)
Peypouquet, J., Sorin, S.: Evolution equations for maximal monotone operators: asymptotic analysis in continuous and discrete time. J. Convex Anal. 17(3–4), 1113–1163 (2010)
Polyak, B.T.: Some methods of speeding up the convergence of iterative methods. Z. Vylist Math. Fiz. 4, 1–17 (1964)
Polyak, B.T.: Introduction to Optimization. Optimization Software, New York (1987)
Siegel, J. W.: Accelerated first-order methods: Differential equations and Lyapunov functions, arXiv:1903.05671v5 [math.OC] (2019)
Su, W., Boyd, S., Candès, E.J.: A differential equation for modeling Nesterov’s accelerated gradient method. J. Mach. Learn. Res. 17, 1–43 (2016)
Acknowledgements
The authors would like to thank the two anonymous reviewers as well as the associate editor for their careful reading and for their relevant suggestions and comments, which helped considerably to improve this paper.
Auxiliary results
1.1 Finite-time convergence of the continuous dynamic
Theorem 7
Let \(f:{\mathcal H}\rightarrow {\mathbb {R}}\) be a \(\mathcal C^1\) function whose gradient is Lipschitz continuous, and let \(\phi : {\mathcal H}\rightarrow {\mathbb {R}}\) be a convex continuous function that satisfies (DF). Suppose that the function \(\gamma : [t_0, +\infty [ \rightarrow {\mathbb {R}}_+\) belongs to \(L^1 ([t_0, T])\) for any \(T>t_0\). Then, the following properties hold:
a) For any Cauchy data \((x_0, \dot{x}_0 ) \in {\mathcal H}\times {\mathcal H}\), there exists a unique strong global solution of the Heavy Ball system with Dry Friction
\[ \mathrm{(HBDF)} \qquad \ddot{x}(t) + \gamma (t)\dot{x}(t) + \partial \phi (\dot{x}(t)) + \nabla f(x(t)) \ni 0 \]
satisfying \(x(t_0) = x_0\) and \(\dot{x}(t_0)=\dot{x}_0 \).
b) For any solution trajectory x of \(\mathrm{(HBDF)} \) we have:
(i) \(\Vert \dot{x}\Vert \in L^1([t_0,+\infty [,{\mathbb {R}})\), and therefore \(x_\infty :=\) \(\lim _{t\rightarrow +\infty } x(t)\) exists.
(ii) The limit point \(x_\infty \) is an equilibrium point of \(\mathrm{(HBDF)} \), i.e.
\[ -\nabla f(x_\infty ) \in \partial \phi (0). \]
(iii) If
\[ -\nabla f(x_\infty )\not \in \text{ boundary }(\partial \phi (0)), \]
then there exists \(t_1\ge 0\) such that \(x(t)=x_\infty \) for every \(t\ge t_1\).
Proof
An existence proof based on a regularization technique, using the Moreau-Yosida approximation of \(\phi \), was given in [2] in a finite dimensional setting. We present here an original proof of the existence and uniqueness part a) of Theorem 7, in a general Hilbert space, which is based on the study of evolution equations governed by Lipschitz perturbations of maximally monotone operators (see [21]). The argument relies in an essential way on the fact that \(\nabla f\) is Lipschitz continuous over the entire space \({\mathcal H}\).
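Write \(\mathrm{(HBDF)} \) as
\[ \ddot{x}(t) + \gamma (t)\dot{x}(t) + \partial \phi (\dot{x}(t)) + \nabla f\Big (x_0 + \int _{t_0}^t \dot{x}(\tau )d\tau \Big ) \ni 0. \]
Setting \(u(t):= \dot{x}(t)\), this amounts to solving the first-order evolution equation
\[ \dot{u}(t) + \gamma (t)u(t) + \partial \phi (u(t)) + \nabla f\Big (x_0 + \int _{t_0}^t u(\tau )d\tau \Big ) \ni 0 \]
with the Cauchy data \(u(t_0)= \dot{x}_0\). Let us introduce the (non-local) operator
\[ F(u)(t) := \nabla f\Big (x_0 + \int _{t_0}^t u(\tau )d\tau \Big ). \]
Thus, we have to solve
\[ \dot{u}(t) + \gamma (t)u(t) + \partial \phi (u(t)) + F(u)(t) \ni 0. \tag{67} \]
For any two trajectories \(u\) and \(v\), we have
\[ \Vert F(u)(t) - F(v)(t)\Vert \le L \int _{t_0}^t \Vert u(\tau ) - v(\tau )\Vert d\tau , \]
where \(L\) is the Lipschitz constant of \(\nabla f \). Following the approach developed in [21, Proposition 3.12, page 106], we consider the sequence \((u_n)\) defined recursively by
\[ \dot{u}_{n+1}(t) + \gamma (t)u_{n+1}(t) + \partial \phi (u_{n+1}(t)) + F(u_n)(t) \ni 0. \tag{68} \]
Given \(u_n\), the existence and uniqueness of the solution \(u_{n+1}\) of (68) with \(u_{n+1}(t_0) = \dot{x}_0\) is ensured by the classical results concerning evolution equations governed by subdifferentials of convex functions (see [21, Theorem 3.6, page 72], [7, Theorem 17.2.5]). Fix \(T >t_0\). According to the above Lipschitz continuity property of \(F\), the monotonicity of \(\partial \phi \), and \(\gamma (t) \ge 0\), we have for almost all \(t_0 \le t \le T\)
\[ \frac{1}{2}\frac{d}{dt}\Vert u_{n+1}(t)-u_n(t)\Vert ^2 \le L\, \Vert u_{n+1}(t)-u_n(t)\Vert \int _{t_0}^t \Vert u_n(\tau )-u_{n-1}(\tau )\Vert d\tau , \]
which gives, since \(u_{n+1}(t_0)=u_n(t_0)\),
\[ \Vert u_{n+1}(t)-u_n(t)\Vert \le L (T-t_0) \int _{t_0}^t \Vert u_n(\tau )-u_{n-1}(\tau )\Vert d\tau . \]
By induction, \(\Vert u_{n+1}(t)-u_n(t)\Vert \le \frac{(L(T-t_0)^2)^n}{n!} \sup _{[t_0,T]}\Vert u_1-u_0\Vert \), which is the general term of a convergent series. This implies that \((u_n)\) is a Cauchy sequence for the uniform convergence on \([t_0,T]\). Consequently, it converges uniformly on \([t_0,T]\) to a solution \(u\) of (67). This uniquely defines \(u=\dot{x}\), and at the same time \(x\), which is given by \(x(t)= x_0 + \int _{t_0}^t u(\tau )d\tau \).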
For part b), we refer to [2, Theorem 3.2 ].
Remark 8
With the condition \(-\nabla f(x_\infty )\not \in \text{ boundary }(\partial \phi (0)),\) the finite-time convergence of the trajectory to a stationary point of the dynamic (HBDF) is ensured, i.e. there exists \(t_1\ge 0\) such that \(x(t)=x_\infty \) for every \(t\ge t_1\). In addition, an estimate of the final time could be given. In fact, we can show, by integrating the differential inequality satisfied by \(\alpha (t)= \Vert \dot{x}(t)\Vert ^2\)
that
where \(t_0\) is the first time instant such that
We refer to [3] for Lagrangian systems.
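As an illustration of how an estimate of this type can be derived, here is a sketch under assumptions that are ours: \(\mathrm{(HBDF)}\) is taken in the form \(\ddot{x}(t) + \gamma (t)\dot{x}(t) + \partial \phi (\dot{x}(t)) + \nabla f(x(t)) \ni 0\), and we suppose that there exist \(\varepsilon >0\) and a time \(s_0\) (both hypothetical symbols, introduced only for this sketch) such that \(B(-\nabla f(x(t)), \varepsilon ) \subset \partial \phi (0)\) for all \(t \ge s_0\). For almost every \(t\ge s_0\) with \(\dot{x}(t)\ne 0\), the monotonicity of \(\partial \phi \), applied to \(-\ddot{x}(t) - \gamma (t)\dot{x}(t) - \nabla f(x(t)) \in \partial \phi (\dot{x}(t))\) and \(-\nabla f(x(t)) + \varepsilon \, \dot{x}(t)/\Vert \dot{x}(t)\Vert \in \partial \phi (0)\), gives
\[ \Big \langle -\ddot{x}(t) - \gamma (t)\dot{x}(t) - \varepsilon \frac{\dot{x}(t)}{\Vert \dot{x}(t)\Vert },\ \dot{x}(t) \Big \rangle \ge 0, \qquad \text{i.e.} \qquad \dot{\alpha }(t) + 2\gamma (t)\alpha (t) + 2\varepsilon \sqrt{\alpha (t)} \le 0. \]
Since \(\gamma (t)\ge 0\), this yields \(\frac{d}{dt}\sqrt{\alpha (t)} \le -\varepsilon \) as long as \(\alpha (t)>0\), and therefore
\[ \Vert \dot{x}(t)\Vert \le \Vert \dot{x}(s_0)\Vert - \varepsilon (t-s_0), \qquad \text{so that} \qquad t_1 \le s_0 + \frac{\Vert \dot{x}(s_0)\Vert }{\varepsilon }. \]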
Remark 9
The finite-time conclusion (iii) of Theorem 7 relies on the key assumption \(-\nabla f(x_\infty ) \not \in \text{ boundary }(\partial \phi (0))\). Since the boundary of the convex set \(\partial \phi (0)\) has an empty interior, it is reasonable to think that the circumstances leading to the relation \(-\nabla f(x_\infty )\in \text{ boundary }(\partial \phi (0))\) are “exceptional”. More precisely, we conjecture that, generically with respect to the initial data \((x_0,\dot{x}_0)\in {\mathbb {R}}^n \times {\mathbb {R}}^n\), the point \(x_\infty =\lim _{t\rightarrow +\infty }x(t)\) satisfies the condition \(-\nabla f(x_\infty ) \not \in \text{ boundary }(\partial \phi (0))\). This would give a generic finite-time stabilization result in the case of dry friction.
Let us give a counter-example to convergence in finite time when the condition \(-\nabla f(x_\infty ) \not \in \text{ boundary }(\partial \phi (0))\) is not satisfied, i.e. \(-\nabla f(x_\infty ) \in \text{ boundary }(\partial \phi (0))\). For that purpose, take \({\mathcal H}={\mathbb {R}}\), \(\phi :=|\,.\,|\) (so that \(\partial \phi (0)=[-1,1]\)), \(\gamma =2\) and \(f:=|\,.\,|^2/2\). The differential inclusion \(\mathrm{(HBDF)}\) then reads
\[ \ddot{x}(t) + 2\dot{x}(t) + \partial \phi (\dot{x}(t)) + x(t) \ni 0. \]
Let us choose as initial conditions \(x(0)=-2\) and \(\dot{x}(0)=1\). The unique solution of \(\mathrm{(HBDF)}\) is given by \(x(t)=-1-e^{-t}\), \(t\ge 0\). The trajectory tends toward the value \(x_\infty =-1\), which satisfies \(-f'(x_\infty )=1 \in \text{ boundary }(\partial \phi (0))\). However, the convergence does not take place in finite time.
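This can be checked by hand. Assuming that the inclusion for this example is \(\ddot{x}(t) + 2\dot{x}(t) + \partial \phi (\dot{x}(t)) + x(t) \ni 0\) with \(\phi = |\,.\,|\) (our reading of \(\mathrm{(HBDF)}\) for the data above), the candidate \(x(t)=-1-e^{-t}\) satisfies
\[ \dot{x}(t) = e^{-t} > 0, \qquad \ddot{x}(t) = -e^{-t}, \qquad \partial \phi (\dot{x}(t)) = \{1\}, \]
so that
\[ \ddot{x}(t) + 2\dot{x}(t) + 1 + x(t) = -e^{-t} + 2e^{-t} + 1 - 1 - e^{-t} = 0, \]
together with \(x(0)=-2\) and \(\dot{x}(0)=1\). Moreover \(x(t) - x_\infty = -e^{-t} \ne 0\) for every \(t\ge 0\), so the limit is never reached.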
Remark 10
It is natural to ask whether convergence in finite time is specific to the dry friction situation \(0\in \mathrm{int}(\partial \phi (0))\). To answer this question, Amann and Díaz [5] and Díaz and Liñán [24] considered the damped oscillator in \({\mathcal H}= {\mathbb {R}}\)
\[ \ddot{x}(t) + |\dot{x}(t)|^{\alpha -1}\dot{x}(t) + x(t) = 0, \]
where \(\alpha \in ]0,1[\). This corresponds to a sub-linear friction; the case of dry friction corresponds to the limiting case \(\alpha =0\). They showed the existence of two curves in the phase space such that, for the solution trajectories with initial data \((x_0, \dot{x}_0)\) belonging to these two curves, there is finite-time stabilization at the origin. Using both energetic and geometrical arguments, they also showed that for many other initial data, the solution tends to zero in infinite time, at the rate \(t^{-\frac{\alpha }{1-\alpha }}\).
Cite this article
Adly, S., Attouch, H. First-order inertial algorithms involving dry friction damping. Math. Program. 193, 405–445 (2022). https://doi.org/10.1007/s10107-020-01613-y
DOI: https://doi.org/10.1007/s10107-020-01613-y
Keywords
- Proximal-gradient algorithms
- Inertial methods
- Differential inclusion
- Dry friction
- Finite convergence
- Lasso problem