Fast Convergence of Inertial Gradient Dynamics with Multiscale Aspects

Ren, Haixin; Ge, Bin; Zhuge, Xiangwu

doi:10.1007/s10957-022-02124-w

Fast Convergence of Inertial Gradient Dynamics with Multiscale Aspects

Published: 29 October 2022

Volume 196, pages 461–489, (2023)
Cite this article

Journal of Optimization Theory and Applications Aims and scope Submit manuscript

330 Accesses
1 Altmetric
Explore all metrics

Abstract

In this paper, the asymptotic properties as $ t\rightarrow +\infty $ of the following second-order differential equation in a Hilbert space $ {\mathcal {H}} $ are studied,

$$\begin{aligned} \ddot{x}(t)+\gamma (t){\dot{x}}(t)+\beta (t)\Big (\nabla \Phi (x(t))+\epsilon (t)\nabla U(x(t))\Big )=0, \end{aligned}$$

where $ \Phi ,U:{\mathcal {H}}\rightarrow {\mathbb {R}} $ are convex differentiable, $ \gamma (t) $ is a positive damping coefficient, $ \beta (t) $ is a time scale coefficient and $ \epsilon (t) $ is a positive nonincreasing function, $\gamma (t)$, $ \beta (t) $ and $ \epsilon (t) $ are all continuously differentiable. This system has applications in the fields of mechanics and optimization. Based on the proper tuning of $ \gamma (t) $ and $ \beta (t) $, we obtain the convergence rates for the values, and the conclusion is that, under the different conditions, the trajectories either converge to one minimizer of $ \Phi $ weakly, or converge to one common minimizer of $ \Phi $ and U weakly. When $ \epsilon (t) $ tends to 0 as t goes to infinity, under the condition that $ \Phi $ or U is convex, the trajectories converge to the unique minimizer of $ \Phi $, or the unique minimizer of U, respectively. Finally, some particular cases are examined, and some numerical experiments are conducted to illustrate our main results.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fast convergence of inertial dynamics and algorithms with asymptotic vanishing viscosity

Article 24 March 2016

Accelerated Gradient Methods Combining Tikhonov Regularization with Geometric Damping Driven by the Hessian

Article 31 May 2023

Optimal Convergence Rates for Damped Inertial Gradient Dynamics with Flat Geometries

Article 13 March 2023

Data Availability

Data sharing is not applicable to this article as no new data were created or analyzed in this study.

References

Alvarez, F.: On the minimizing property of a second order dissipative system in Hilbert spaces. SIAM J. Control Optim. 38, 1102–1119 (2000). https://doi.org/10.1137/S0363012998335802
Article MathSciNet MATH Google Scholar
Attouch, H., Cabot, A.: Asymptotic stabilization of inertial gradient dynamics with time-dependent viscosity. J. Diff. Equ. 263, 5412–5458 (2017). https://doi.org/10.1016/j.jde.2017.06.024
Article MathSciNet MATH Google Scholar
Attouch, H., Chbani, Z., Peypouquet, J., Redont, P.: Fast convergence of inertial dynamics and algorithms with asymptotic vanishing viscosity. Math. Program. 168, 123–175 (2018). https://doi.org/10.1007/s10107-016-0992-8
Article MathSciNet MATH Google Scholar
Attouch, H., Chbani, Z., Riahi, H.: Fast convex optimization via time scaling of damped inertial gradient dynamics. HAL preprint. (2019) https://hal.archives-ouvertes.fr/hal-02138954. Accessed 24 May 2019
Attouch, H., Chbani, Z., Riahi, H.: Rate of convergence of the Nesterov accelerated gradient method in the subcritical case $ \alpha \leqslant 3 $. Control Optim. Calc. Var. ESAIM (2019). https://doi.org/10.1051/cocv/2017083
Article MATH Google Scholar
Attouch, H., Chbani, Z., Riahi, H.: Combining fast inertial dynamics for convex optimization with Tikhonov regularization. J. Math. Anal. Appl. 457, 1065–1094 (2018). https://doi.org/10.1016/j.jmaa.2016.12.017
Article MathSciNet MATH Google Scholar
Attouch, H., Cominetti, R.: A dynamical approach to convex minimization coupling approximation with the steepest descent method. J. Diff. Equ. 128, 519–540 (1996). https://doi.org/10.1006/jdeq.1996.0104
Article MathSciNet MATH Google Scholar
Attouch, H., Czarnecki, M.O.: Asymptotic control and stabilization of nonlinear oscillators with non-isolated equilibria. J. Diff. Equ. 179, 278–310 (2002). https://doi.org/10.1006/jdeq.2001.4034
Article MathSciNet MATH Google Scholar
Attouch, H., Czarnecki, M.O.: Asymptotic behavior of gradient-like dynamical systems involving inertia and multiscale aspects. J. Diff. Equ. 262, 2745–2770 (2017). https://doi.org/10.1016/j.jde.2016.11.009
Article MathSciNet MATH Google Scholar
Attouch, H., Peypouquet, J.: The rate of convergence of Nesterov’s accelerated forward-backward method is actually faster than $ 1/k^2 $. SIAM J. Optim. 26, 1824–1834 (2016). https://doi.org/10.1137/15M1046095
Article MathSciNet MATH Google Scholar
Boţ, R.I., Csetnek, E.R., László, S.C.: Tikhonov regularization of a second order dynamical system with Hessian driven damping. Math. Program. 189, 151–186 (2021). https://doi.org/10.1007/s10107-020-01528-8
Article MathSciNet MATH Google Scholar
Bolte, J.: Continuous gradient projection method in Hilbert spaces. J. Optim. Theory Appl. 119, 235–259 (2003). https://doi.org/10.1023/B:JOTA.0000005445.21095.02
Article MathSciNet MATH Google Scholar
Bruck, J., Ronald, E.: Asymptotic convergence of nonlinear contraction semigroups in Hilbert space. J. Funct. Anal. 18, 15–26 (1975). https://doi.org/10.1016/0022-1236(75)90027-0
Article MathSciNet MATH Google Scholar
Ge, B., Zhuge, X., Ren, H.: Convergence rates of damped inerial dynamics from multi-degree-of-freedom system. Optim. Lett. (2022). https://doi.org/10.1007/s11590-022-01855-z
Article MathSciNet MATH Google Scholar
Haraux, A.: Systèmes dynamiques dissipatifs et applications. Masson, Paris (1991)
MATH Google Scholar
Jendoubi, M.A., May, R.: On an asymptotically autonomous system with Tikhonov type regularizing term. Archiv der Mathematik 95, 389–399 (2010). https://doi.org/10.1007/s00013-010-0181-6
Article MathSciNet MATH Google Scholar
May, R.: Asymptotic for a second-order evolution equation with convex potential and vanishing damping term. Turkish J. Math. 41, 681–685 (2017). https://doi.org/10.3906/mat-1512-28
Article MathSciNet MATH Google Scholar
May, R.: On the strong convergence of the gradient projection algorithm with Tikhonov regularizing term. arXiv preprint. (2019) https://doi.org/10.48550/arXiv.1910.07873. Accessed 17 Oct 2019
May, R., Mnasri, C., Elloumi, M.: Asymptotic for a second order evolution equation with damping and regularizing terms. AIMS Math. 6, 4901–4914 (2021). https://doi.org/10.3934/math.2021287
Article MathSciNet MATH Google Scholar
Nesterov, Y.: A method for solving the convex programming problem with convergence rate $O(1/k^2)$. Sov. Math. Doklady 27, 372–376 (1983)
MATH Google Scholar
Nesterov, Y.: Introductory Lectures on Convex Optimization. Kluwer Academic Publishers, Boston (2004). https://doi.org/10.1007/978-1-4419-8853-9
Book MATH Google Scholar
Nesterov, Y.: Smooth minimization of non-smooth functions. Math. Program. 103, 127–152 (2005). https://doi.org/10.1007/s10107-004-0552-5
Article MathSciNet MATH Google Scholar
Nesterov, Y.: Gradient methods for minimizing composite functions. Math. Program. 140, 125–161 (2013). https://doi.org/10.1007/s10107-012-0629-5
Article MathSciNet MATH Google Scholar
Opial, Z.: Weak convergence of the sequence of successive approximations for nonexpansive mappings. Bull. Am. Math. Soc. 73, 591–597 (1967). https://doi.org/10.1090/S0002-9904-1967-11761-0
Article MathSciNet MATH Google Scholar
Polyak, B.T.: Some methods of speeding up the convergence of iteration methods. Ussr Comput. Math. Math. Phys. 4, 1–17 (1964). https://doi.org/10.1016/0041-5553(64)90137-5
Article Google Scholar
Polyak, B.T.: Introduction to Optimization. Optimization Software Inc, New York (1987)
MATH Google Scholar
Sebbouh, O., Dossal, C., Rondepierre, A.: Convergence rates of damped inertial dynamics under geometric conditions and perturbations. SIAM J. Optim. 30, 1850–1877 (2020). https://doi.org/10.1137/19M1272767
Article MathSciNet MATH Google Scholar
Su, W., Boyd, S., Candes, E.: A differential equation for modeling Nesterov’s accelerated gradient method: theory and insights. J. Mach. Learn. Res. 17:1-43. (2016) https://doi.org/10.48550/arXiv.1503.01243
Xu, B., Wen, B.: On the convergence of a class of inertial dynamical systems with Tikhonov regularization. Optim. Lett. 15, 2025–2052 (2021). https://doi.org/10.1007/s11590-020-01663-3
Article MathSciNet MATH Google Scholar

Download references

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 11201095), the Fundamental Research Funds for the Central Universities (Grant No. 3072022TS2402), the Postdoctoral Research Startup Foundation of Heilongjiang (Grant No. LBH-Q14044), and the Science Research Funds for Overseas Returned Chinese Scholars of Heilongjiang Province (Grant No. LC201502).

Author information

Authors and Affiliations

Harbin Engineering University, Harbin, People’s Republic of China
Haixin Ren, Bin Ge & Xiangwu Zhuge

Authors

Haixin Ren
View author publications
You can also search for this author in PubMed Google Scholar
Bin Ge
View author publications
You can also search for this author in PubMed Google Scholar
Xiangwu Zhuge
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Bin Ge.

Ethics declarations

Conflicts of Interest

The authors declare no conflict of interest.

Additional information

Communicated by Radu Ioan Bot.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

In what follows, we prove the existence and the uniqueness of a global solution to the Cauchy problem associated with the evolution system (1). We will use the following lemma, which have been established in [15, Prop. 6.2.1].

Lemma A.1

Let $ F:I\times {\mathcal {X}}\rightarrow {\mathcal {X}} $ where $ I=[t_0,+\infty ) $ and $ {\mathcal {X}} $ is a Banach space. Assume that

(i)
for every $ x\in {\mathcal {X}} $, $ F(\cdot ,x)\in L^1_{loc}(I,{\mathcal {X}}) $;
(ii)
for a.e. $ t\in I $, for every $ x,y\in {\mathcal {X}} $,
$$\begin{aligned} \left\| F(t,x)-F(t,y)\right\| \leqslant K(t,\left\| x\right\| +\left\| y\right\| )\left\| x-y\right\| \end{aligned}$$
where $ K(\cdot ,r)\in L^1_{loc}(I), \forall r\in {\mathbb {R}}_+$;
(iii)
for a.e. $ t\in I$, for every $ x\in {\mathcal {X}} $,
$$\begin{aligned} \left\| F(t,x)\right\| \leqslant P(t)(1+\left\| x\right\| ), \end{aligned}$$
where $ P\in L^1_{loc}(I)$.

Then, for every $ s\in I, x\in {\mathcal {X}} $, there exists a unique solution $ u_{s,x}\in W_{loc}^{1,1}(I,{\mathcal {X}}) $ of the Cauchy problem:

$$\begin{aligned} {\dot{u}}_{s,x}(t)=F(t,u_{s,x}(t)) \text { for a.e. } t\in I , \text { and } u_{s,x}(s)=x . \end{aligned}$$

The following theorem gives the existence and uniqueness of system (1).

Theorem A.1

Suppose that $ \Phi ,U:{\mathcal {H}}\rightarrow {\mathbb {R}} $ are convex, $ {\mathcal {C}}^1 $, with Lipschitz continuous gradient $ \nabla \Phi $ and $ \nabla U $. Assume that $ \beta ,\gamma ,\epsilon :[t_0,+\infty )\rightarrow {\mathbb {R}}_+^* $ are locally integrable. Then, the evolution system (1), with initial condition $ (x(t_0),{\dot{x}}(t_0))=(x_0,{\dot{x}}_0)\in {\mathcal {H}}\times {\mathcal {H}} $, admits a unique global solution $ x:[t_0\rightarrow +\infty ) \rightarrow {\mathcal {H}}$.

Proof

To prove the existence and uniqueness for (1) with initial condition $ (x(t_0),{\dot{x}}(t_0))=(x_0,{\dot{x}}_0) $, we formulate it in the phase space. Set $ I=[t_0,+\infty ) $, and define $ F:I\times {\mathcal {H}}\times {\mathcal {H}}\rightarrow {\mathcal {H}} $ by

$$\begin{aligned} F(t,x,y)=(y,-\gamma (t)y-\beta (t)(\nabla \Phi +\epsilon (t)\nabla U(x))) \end{aligned}$$

We set $ u(t)=(x(t),y(t)) $. The Cauchy problem for (1) can be equivalently formulated as:

$$\begin{aligned} \begin{aligned} {\dot{u}}(t)&=F(t,u(t))\quad \text {for a.e. } t\in I ,\\ u(t_0)&=(x_0,{\dot{x}}_0). \end{aligned} \end{aligned}$$

We are going to verify the three conditions of Lemma A.1.

(i) For each $ (x,y)\in {\mathcal {H}} \times {\mathcal {H}}$, $ F(\cdot ,x,y)\in L^1_{loc}(I,{\mathcal {H}}) $, since the functions $ \beta ,\gamma ,\epsilon $ are so.

(ii) Denote by $ L_1 $ and $ L_2 $ the Lipschitz constant of $ \nabla \Phi $ and $ \nabla U $. For every $ u=(x,y), u'=(x',y')\in {\mathcal {H}}\times {\mathcal {H}} $ and a.e. $ t\in I $,

$$\begin{aligned} \begin{aligned}&\left\| F(t,u)-F(t,u')\right\| \\&\quad ={}\left\| y-y'\right\| +\left\| \gamma (t)(y-y')+\beta (t)(\nabla \Phi (x)-\nabla \Phi (x')+\epsilon (t)(\nabla U(x)-\nabla U(x')))\right\| \\&\quad \leqslant {}(1+\gamma (t)+L_1\beta (t)+L_2\beta (t)\epsilon (t))(\left\| x-x'\right\| +\left\| y-y'\right\| )\\&\quad ={}(1+\gamma (t)+L_1\beta (t)+L_2\beta (t)\epsilon (t))\left\| (x,y)-(x',y')\right\| \end{aligned} \end{aligned}$$

Hence, the second condition is verified, since the real function $ t\mapsto 1+\gamma (t)+L_1\beta (t)+L_2\beta (t)\epsilon (t) $ belongs to $ L_{loc}^1(I,{\mathbb {R}}) $.

(iii) For every $ u=(x,y)\in {\mathcal {H}}\times {\mathcal {H}} $ and a.e. $ t\in I $,

$$\begin{aligned} \begin{aligned}&\left\| F(t,u)\right\| \\&\quad ={}\left\| y\right\| +\left\| \gamma (t)y+\beta (t)(\nabla \Phi (x)-\nabla \Phi (x_0)+\nabla \Phi (x_0) \right. \\&\qquad \left. +\epsilon (t)(\nabla U(x)-\nabla U(x_0)+\nabla U(x_0)))\right\| \\&\quad \leqslant {}(1+\gamma (t))\left\| y\right\| +(L_1\beta (t)+L_2\beta (t)\epsilon (t))\left\| x-x_0\right\| \\&\qquad +\beta (t)\left\| \nabla \Phi (x_0)\right\| +\beta (t)\epsilon (t)\left\| \nabla U(x_0)\right\| \\&\quad \leqslant {} v(t)(1+\left\| x\right\| +\left\| y\right\| ), \end{aligned} \end{aligned}$$

where

$$\begin{aligned} v(t)= & {} \max (1+\gamma (t),L_1\beta (t)+L_2\beta (t)\epsilon (t),\beta (t)((L_1+L_2\epsilon (t))\left\| x_0\right\| \\&+\nabla \Phi (x)+\epsilon (t)\nabla U(x))). \end{aligned}$$

Since $ v(\cdot )\in L_{loc}^1(I,{\mathbb {R}}) $, we conclude that all the conditions of Lemma A.1 are satisfied. Therefore, there exists a unique global solution of system (1) satisfying the initial condition $ (x(t_0),{\dot{x}}(t_0))=(x_0,{\dot{x}}_0) $. $\square $

The following lemma can be found in [26, Lemma 3.2].

Lemma A.2

Let $ \delta >0 $, and let $ w:[\delta ,+\infty )\rightarrow {\mathbb {R}} $ be a continuously differentiable function which is bounded from below. If the positive part $ [{\dot{w}}]_+ $ of $ {\dot{w}} $ belongs to $ L^1(\delta ,+\infty ) $, then $ \lim _{t\rightarrow +\infty }w(t) $ exists.

Proof

Since $ [{\dot{w}}]_+ $ belongs to $ L^1(\delta ,+\infty ) $, we have $ \int _{\delta }^{+\infty }{[{\dot{w}}(t)]_+\mathrm {d}t}<+\infty $. Let $ \theta (t)=w(t)-\int _{\delta }^{t}{[{\dot{w}}(t)]_+\mathrm {d}s} $, hence we obtain $ {\dot{\theta }}(t)=\dot{w }(t)-[{\dot{w}}(t)]_+\leqslant 0 $, which means that $ \theta (t) $ is nonincreasing. Since w(t) is bounded from below, we immediately deduce that $ \theta (t) $ is also bounded from below, it implies that $ \lim _{t\rightarrow +\infty }\theta (t) $ exists. Therefore,

$$\begin{aligned} \lim _{t\rightarrow +\infty }w(t)=\lim _{t\rightarrow +\infty }\theta (t)+\int _{\delta }^{+\infty }{[{\dot{w}}(t)]_+\mathrm {d}t} \end{aligned}$$

exists. $\square $

To establish the weak convergence of the solutions of (1), we shall use Opial’s lemma in [24], which has the continuous form. This argument was first used in [13] to obtain the convergence of nonlinear contraction semigroups.

Lemma A.3

Let S be a nonempty subset of $ {\mathcal {H}} $ and let $ x:[0,+\infty )\rightarrow {\mathcal {H}} $. Assume that

(i)
for every $ z\in S $, $ \lim _{t\rightarrow +\infty }\left\| x(t)-z\right\| $ exists;
(ii)
every weak sequential limit point of x(t) , as $ t\rightarrow +\infty $, belongs to S.

Then, x(t) converges weakly as $ t\rightarrow +\infty $ to a point in S.

Introducing three lemmas about $ \gamma (t) $ and $ \Gamma (t) $, which has been established in [2].

Lemma A.4

Let $ \Gamma $ defined by (3). If there exists $ m>0 $ such that $ \gamma (t)\leqslant m $ for every $ t\geqslant t_2 $. Then, we have $ \Gamma (t)\geqslant \frac{1}{m} $ for every $ t\geqslant t_2 $.

Lemma A.5

Let $ \Gamma $ defined by (3). If there exists $ t_1 \geqslant t_0 $ and $ a\in [0,1) $ such that $ \frac{{\dot{\gamma }}(t)}{\gamma (t)^2}\geqslant -a $ for every $ t\geqslant t_1 $. Then, we have $ \int _{t_0}^{+\infty }{\mathrm {e}^{-\int _{t_0}^{t}{\gamma (u)\mathrm {d}u}}}<+\infty $ and $ \Gamma (t)\leqslant \frac{1}{(1-a)\gamma (t)} $ for every $ t\geqslant t_1 $.

Lemma A.6

Let $ \Gamma $ defined by (3). If there exists $ c\in [0,1[$ such that $ \lim _{t\rightarrow +\infty }\frac{{\dot{\gamma }}(t)}{\gamma (t)^2}=-c $. Then we have $ \Gamma (t)\sim \frac{1}{(1-c)\gamma (t)} $ as $ t\rightarrow +\infty $.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Ren, H., Ge, B. & Zhuge, X. Fast Convergence of Inertial Gradient Dynamics with Multiscale Aspects. J Optim Theory Appl 196, 461–489 (2023). https://doi.org/10.1007/s10957-022-02124-w

Download citation

Received: 13 June 2022
Accepted: 08 October 2022
Published: 29 October 2022
Issue Date: February 2023
DOI: https://doi.org/10.1007/s10957-022-02124-w

Keywords

Mathematics Subject Classification

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fast Convergence of Inertial Gradient Dynamics with Multiscale Aspects

Abstract

Access this article

Similar content being viewed by others

Fast convergence of inertial dynamics and algorithms with asymptotic vanishing viscosity

Accelerated Gradient Methods Combining Tikhonov Regularization with Geometric Damping Driven by the Hessian

Optimal Convergence Rates for Damped Inertial Gradient Dynamics with Flat Geometries

Data Availability

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflicts of Interest

Additional information

Publisher's Note

Appendix

Lemma A.1

Theorem A.1

Proof

Lemma A.2

Proof

Lemma A.3

Lemma A.4

Lemma A.5

Lemma A.6

Rights and permissions

About this article

Cite this article

Keywords

Mathematics Subject Classification

Navigation

Fast Convergence of Inertial Gradient Dynamics with Multiscale Aspects

Abstract

Access this article

Similar content being viewed by others

Fast convergence of inertial dynamics and algorithms with asymptotic vanishing viscosity

Accelerated Gradient Methods Combining Tikhonov Regularization with Geometric Damping Driven by the Hessian

Optimal Convergence Rates for Damped Inertial Gradient Dynamics with Flat Geometries

Data Availability

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflicts of Interest

Additional information

Publisher's Note

Appendix

Appendix

Lemma A.1

Theorem A.1

Proof

Lemma A.2

Proof

Lemma A.3

Lemma A.4

Lemma A.5

Lemma A.6

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Mathematics Subject Classification

Search

Navigation