Levenberg-Marquardt Dynamics Associated to Variational Inequalities

In connection with the optimization problem
$$\underset{x\in\operatorname{argmin}\Psi}{\inf}\{\Phi(x)+\Theta(x)\},$$
where Φ is a proper, convex and lower semicontinuous function and Θ and Ψ are convex and smooth functions defined on a real Hilbert space, we investigate the asymptotic behavior of the trajectories of the nonautonomous Levenberg-Marquardt dynamical system
$$\left\{\begin{array}{l} v(t)\in \partial\Phi(x(t))\\ \lambda(t)\dot x(t) + \dot v(t) + v(t) + \nabla\Theta(x(t))+\beta(t)\nabla\Psi(x(t))=0, \end{array}\right.$$
where λ and β are functions of time controlling the velocity and the penalty term, respectively. We show weak convergence of the generated trajectory to an optimal solution as well as convergence of the objective function values along the trajectories, provided λ is monotonically decreasing, β satisfies a growth condition and a relation expressed via the Fenchel conjugate of Ψ is fulfilled. When the objective function is assumed to be strongly convex, we can even show strong convergence of the trajectories.


Introduction
Throughout this manuscript H is assumed to be a real Hilbert space endowed with inner product ⟨·, ·⟩ and associated norm ‖·‖ = √⟨·, ·⟩. When T : H → H is a C¹ operator with derivative T′, the solving of the equation

find x ∈ H such that T x = 0

can be approached by the classical Newton method, which generates an approximating sequence (x_n)_{n≥0} of a solution of the operator equation through

T(x_n) + T′(x_n)(x_{n+1} − x_n) = 0.

In order to overcome the fact that the classical Newton method assumes the solving of an equation which is in general not well-posed, one can use instead the Levenberg-Marquardt method

(λ_n Id + T′(x_n))(x_{n+1} − x_n) + t_n T(x_n) = 0,

where Id : H → H denotes the identity operator on H, λ_n a regularizing parameter and t_n > 0 the step size.
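As an illustration only, the regularized update above can be implemented for a simple smooth monotone operator. The operator T(x) = x + sin x and all parameter values below are our own illustrative choices, not taken from the paper:

```python
import numpy as np

def levenberg_marquardt(T, dT, x0, lam=1.0, step=1.0, iters=60):
    """Iterate (lam*Id + T'(x_n)) (x_{n+1} - x_n) + step * T(x_n) = 0."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    for _ in range(iters):
        # Solve the regularized linear system for the update direction.
        d = np.linalg.solve(lam * np.eye(n) + dT(x), -step * T(x))
        x = x + d
    return x

# Toy monotone operator with unique zero x = 0 (illustrative choice).
T = lambda x: x + np.sin(x)
dT = lambda x: np.eye(x.size) + np.diag(np.cos(x))

root = levenberg_marquardt(T, dT, np.array([1.0, -0.5]))
```

In contrast to the pure Newton step, the term λ_n Id keeps the linear system solvable even at points where T′(x_n) is singular.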
When T : H ⇒ H is a (set-valued) maximally monotone operator, Attouch and Svaiter showed in [10] that the above Levenberg-Marquardt algorithm can be seen as a time discretization of the dynamical system

v(t) ∈ T(x(t)), λ(t)ẋ(t) + v̇(t) + v(t) = 0 (1)

for approaching the inclusion problem

find x ∈ H such that 0 ∈ T x. (2)
This includes as a special instance the problem of minimizing a proper, convex and lower semicontinuous function, when T is taken as its convex subdifferential. Later on, this investigation has been continued in [2] in the context of minimizing the sum of a proper, convex and lower semicontinuous function with a convex and smooth one.
In the spirit of [10], we approach in this paper the optimization problem

inf_{x ∈ argmin Ψ} {Φ(x) + Θ(x)}, (3)

where Φ : H → R ∪ {+∞} is a proper, convex and lower semicontinuous function and Θ, Ψ : H → R are convex and smooth functions, via the Levenberg-Marquardt dynamical system

v(t) ∈ ∂Φ(x(t)), λ(t)ẋ(t) + v̇(t) + v(t) + ∇Θ(x(t)) + β(t)∇Ψ(x(t)) = 0, (4)

where λ and β are functions of time controlling the velocity and the penalty term, respectively. If ∂Φ + N_{argmin Ψ} is maximally monotone, then determining an optimal solution x ∈ H of (3) means nothing else than solving the subdifferential inclusion problem

find x ∈ H such that 0 ∈ ∂Φ(x) + ∇Θ(x) + N_{argmin Ψ}(x) (5)

or, equivalently, solving the variational inequality

find x ∈ argmin Ψ and v ∈ ∂Φ(x) such that ⟨v + ∇Θ(x), y − x⟩ ≥ 0 ∀y ∈ argmin Ψ. (6)

We show weak convergence of the trajectory x(·) generated by (4) to an optimal solution of (3) as well as convergence of the objective function values along the trajectory to the optimal objective value, provided the assumption

for every p ∈ ran N_{argmin Ψ}: ∫₀^{+∞} β(t) [Ψ*(p/β(t)) − σ_{argmin Ψ}(p/β(t))] dt < +∞ (7)

is fulfilled and the functions λ, β satisfy some mild conditions. If the objective function of (3) is strongly convex, the trajectory x(·) converges even strongly to the unique optimal solution of (3). The condition (7) has its origins in the paper of Attouch and Czarnecki [5], where the solving of

inf_{x ∈ argmin Ψ} Φ(x), (8)

for Φ, Ψ : H → R ∪ {+∞} proper, convex and lower semicontinuous functions, is approached through the nonautonomous first order dynamical system

0 ∈ ẋ(t) + ∂Φ(x(t)) + β(t) ∂Ψ(x(t)), (9)

by assuming that the penalizing function β : [0, +∞) → (0, +∞) tends to +∞ as t → +∞. Several ergodic and nonergodic convergence results have been reported in [5] under the key assumption (7). The paper of Attouch and Czarnecki [5] was the starting point of a remarkable number of research articles devoted to penalization techniques for solving optimization problems of type (3), but also generalizations of the latter in the form of variational inequalities expressed with maximally monotone operators (see [3, 5-9, 12, 14-17, 20, 21]).
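When Φ is smooth one has v(t) = ∇Φ(x(t)), and the governing system reduces to an explicit ODE that can be integrated numerically. The following sketch uses a toy instance entirely of our own choosing (Φ(x) = ½‖x‖², Θ(x) = ½‖x − c‖², Ψ(x) = ½x₁², λ ≡ 1, β(t) = 1 + t); it only illustrates the qualitative behavior, not the setting of the paper's analysis:

```python
import numpy as np

# Smooth toy instance (our choice): Phi(x)=||x||^2/2, Theta(x)=||x-c||^2/2,
# Psi(x)=x_0^2/2, so that argmin Psi = {x : x_0 = 0}.
c = np.array([1.0, 1.0])
grad_Phi = lambda x: x
hess_Phi = lambda x: np.eye(2)
grad_Theta = lambda x: x - c
grad_Psi = lambda x: np.array([x[0], 0.0])
lam = lambda t: 1.0          # velocity control (constant, hence non-increasing)
beta = lambda t: 1.0 + t     # growing penalty parameter

# For smooth Phi, v = grad Phi(x) and the system reduces to the explicit ODE
# (lam(t) I + Hess Phi(x)) xdot = -(grad Phi + grad Theta + beta grad Psi)(x),
# integrated here by forward Euler.
def trajectory(x0, h=0.01, T=50.0):
    x, t = np.asarray(x0, dtype=float), 0.0
    while t < T:
        rhs = -(grad_Phi(x) + grad_Theta(x) + beta(t) * grad_Psi(x))
        xdot = np.linalg.solve(lam(t) * np.eye(2) + hess_Phi(x), rhs)
        x, t = x + h * xdot, t + h
    return x

x_inf = trajectory(np.array([2.0, -1.0]))
```

For this instance the minimizer of Φ + Θ over {x : x₁ = 0} is (0, 1/2), and as β(t) grows the trajectory approaches it.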
In the literature enumerated above, the monotone inclusion problems have been approached either through continuous dynamical systems or through their discrete counterparts formulated as splitting algorithms. We speak in both cases about methods of penalty type, which means in this context that the function describing the underlying set of the variational inequality under investigation is evaluated in a penalty term. In the above-listed references one can find more general formulations of the key assumption (7), but also further examples for which these conditions are satisfied. In Remark 5 and Remark 6 we provide more insights into the relations of the dynamical system (14) to other continuous systems (and their discrete counterparts) from the literature.
The results we obtain in this paper are in the spirit of Attouch-Czarnecki [5]. However, since the dynamical system we focus on is a combination of two different types of dynamical systems, the asymptotic analysis is more involved, in the sense that one has to take into consideration the particularities of both continuous systems.

Preliminaries
In this section we present some preliminary definitions, results and tools that will be useful throughout the paper. We consider the following definition of an absolutely continuous function.
Definition 1 A function x : [0, b] → H (where b > 0) is said to be absolutely continuous if one of the following equivalent properties holds:

(i) there exists an integrable function y : [0, b] → H such that x(t) = x(0) + ∫₀ᵗ y(s) ds for every t ∈ [0, b];

(ii) x is continuous and its distributional derivative is Lebesgue integrable on [0, b];

(iii) for every ε > 0, there exists η > 0 such that for any finite family of pairwise disjoint intervals I_k = (a_k, b_k) ⊆ [0, b] with Σ_k |b_k − a_k| < η one has Σ_k ‖x(b_k) − x(a_k)‖ < ε.

Remark 1 (a) It follows from the definition that an absolutely continuous function is differentiable almost everywhere, its derivative coincides with its distributional derivative almost everywhere and one can recover the function from its derivative ẋ = y by the integration formula (i).

The following results, which can be interpreted as continuous counterparts of the quasi-Fejér monotonicity for sequences, will play an important role in the asymptotic analysis of the trajectories of the dynamical system investigated in this paper. For the proof of Lemma 2 we refer the reader to [2, Lemma 5.1]. Lemma 3 follows by using similar arguments as used in [2, Lemma 5.2].

Lemma 2 Suppose that F : [0, +∞) → R is locally absolutely continuous and bounded from below and that there exists G ∈ L¹([0, +∞)) such that for almost every t ∈ [0, +∞)

(d/dt) F(t) ≤ G(t).

Then there exists lim_{t→+∞} F(t) ∈ R.
The next result which we recall here is the continuous version of the Opial Lemma.

Lemma 4 Let S ⊆ H be a nonempty set and x : [0, +∞) → H a given map. Assume that

(i) for every z ∈ S, lim_{t→+∞} ‖x(t) − z‖ exists;

(ii) every weak sequential cluster point of the map x belongs to S.

Then there exists x_∞ ∈ S such that x(t) converges weakly to x_∞ as t → +∞.

A Levenberg-Marquardt Dynamical System: Existence and Uniqueness of the Trajectories
Consider the optimization problem

inf_{x ∈ argmin Ψ} {Φ(x) + Θ(x)}, (13)

where H is a real Hilbert space and the following conditions hold:

(H_Φ) Φ : H → R ∪ {+∞} is proper, convex and lower semicontinuous;
(H_Θ) Θ : H → R is convex and differentiable with Lipschitz continuous gradient;
(H_Ψ) Ψ : H → R is convex and differentiable with Lipschitz continuous gradient, argmin Ψ ≠ ∅ and min Ψ = 0.

Here, dom Φ = {x ∈ H : Φ(x) < +∞} denotes the effective domain of the function Φ.
In connection with (13), we investigate the nonautonomous dynamical system

{ v(t) ∈ ∂Φ(x(t))
  λ(t)ẋ(t) + v̇(t) + v(t) + ∇Θ(x(t)) + β(t)∇Ψ(x(t)) = 0
  x(0) = x₀, v(0) = v₀, (14)

where x₀, v₀ ∈ H and

∂Φ(x) := {v ∈ H : Φ(y) ≥ Φ(x) + ⟨v, y − x⟩ ∀y ∈ H} for Φ(x) ∈ R and ∂Φ(x) := ∅, otherwise,

denotes the convex subdifferential of Φ. We denote by dom ∂Φ = {x ∈ H : ∂Φ(x) ≠ ∅} the domain of the operator ∂Φ.
Furthermore, we make the following assumptions regarding the functions of time controlling the velocity and the penalty:

(H^1_λ) λ : [0, +∞) → (0, +∞) is locally absolutely continuous;
(H^1_β) β : [0, +∞) → (0, +∞) is locally absolutely continuous.

Let us mention that due to (H^1_λ), λ̇(t) exists for almost every t ≥ 0.
Remark 5 (a) In case Θ(x) = 0 for all x ∈ H, the dynamical system (14) becomes

{ v(t) ∈ ∂Φ(x(t))
  λ(t)ẋ(t) + v̇(t) + v(t) + β(t)∇Ψ(x(t)) = 0. (15)

The asymptotic convergence of the trajectories generated by (15) has been investigated in [5] under the assumption λ(t) = 1 for all t ≥ 0, for Φ and Ψ nonsmooth functions, by replacing their gradients with convex subdifferentials and, consequently, by treating the differential equation as a monotone inclusion (see (9)).
(b) In case Ψ(x) = 0 for all x ∈ H, the dynamical system (14) becomes

{ v(t) ∈ ∂Φ(x(t))
  λ(t)ẋ(t) + v̇(t) + v(t) + ∇Θ(x(t)) = 0, (16)

which has been investigated in [2] (see, also, [10], for the situation when Θ(x) = 0 for all x ∈ H). The dynamical system (17) has been considered in [1] in connection with the problem of finding the minimal norm elements among the minima of Φ, namely (see also [4] and [9, Section 3])

inf_{x ∈ argmin Φ} ‖x‖. (18)

In contrast to (14), where the function Ψ describing the constrained set of (13) is penalized, in (17) the objective function of (18) is penalized via a vanishing penalization function (see [1]).
In the following we specify what we mean by a solution of the dynamical system (14).

Definition 2
We say that the pair (x, v) is a strong global solution of (14), if the following properties are satisfied:

(i) x, v : [0, +∞) → H are locally absolutely continuous;
(ii) v(t) ∈ ∂Φ(x(t)) for every t ∈ [0, +∞);
(iii) λ(t)ẋ(t) + v̇(t) + v(t) + ∇Θ(x(t)) + β(t)∇Ψ(x(t)) = 0 for almost every t ∈ [0, +∞);
(iv) x(0) = x₀ and v(0) = v₀.

Similarly to the techniques used in [10], we will show the existence and uniqueness of the trajectories generated by (14) by converting it to an equivalent first order differential equation with respect to z(·), defined by

z(t) = x(t) + μ(t)v(t), (19)

where μ(t) := 1/λ(t). To this end we will make use of the resolvent and Yosida approximation of the convex subdifferential of Φ. For γ > 0, we denote by

J_{γ∂Φ} := (Id + γ∂Φ)^{−1}

the resolvent of γ∂Φ. Due to the maximal monotonicity of ∂Φ, the resolvent J_{γ∂Φ} : H → H is a single-valued operator with full domain, which is, furthermore, nonexpansive, that is 1-Lipschitz continuous. Let us notice that the resolvent of the convex subdifferential is nothing else than the proximal point operator and for all x ∈ H we have

J_{γ∂Φ}(x) = prox_{γΦ}(x) = argmin_{y ∈ H} { Φ(y) + (1/(2γ))‖y − x‖² }.

The Yosida regularization of ∂Φ is defined by

(∂Φ)_γ := (Id − J_{γ∂Φ})/γ

and it is γ^{−1}-Lipschitz continuous. For more properties of these operators we refer the reader to [11]. Assume now that (x, v) is a strong global solution of (14). From (19) we have for every t ∈ [0, +∞)

z(t) = x(t) + μ(t)v(t) ∈ x(t) + μ(t)∂Φ(x(t)),

thus, from the definition of the resolvent we derive that relation (ii) in Definition 2 is equivalent to

x(t) = J_{μ(t)∂Φ}(z(t)) ∀t ∈ [0, +∞). (20)

From (19), (20) and the definition of the Yosida regularization we obtain

v(t) = (z(t) − x(t))/μ(t) = (∂Φ)_{μ(t)}(z(t)) ∀t ∈ [0, +∞). (21)

Further, by differentiating (19) and taking into account (iii) in Definition 2, we get for almost every t ∈ [0, +∞)

ż(t) = ẋ(t) + μ̇(t)v(t) + μ(t)v̇(t) = (μ̇(t) − μ(t))v(t) − μ(t)[∇Θ(x(t)) + β(t)∇Ψ(x(t))]. (22)

Taking into account (20), (21) and (22) we conclude that z defined in (19) is a strong global solution of the dynamical system

{ ż(t) = (μ̇(t) − μ(t))(∂Φ)_{μ(t)}(z(t)) − μ(t)[∇Θ(J_{μ(t)∂Φ}(z(t))) + β(t)∇Ψ(J_{μ(t)∂Φ}(z(t)))]
  z(0) = x₀ + μ(0)v₀. (23)

Vice versa, if z is a strong global solution of (23), then one obtains via (20) and (21) a strong global solution of (14).
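For Φ = |·| on H = R (an illustrative choice of ours), the resolvent J_{γ∂Φ} is the classical soft-thresholding map, and the Yosida regularization is obtained from it as (Id − J_{γ∂Φ})/γ; a minimal sketch:

```python
import numpy as np

def prox_abs(x, gamma):
    """Resolvent J_{gamma dPhi} of Phi = |.|: soft-thresholding."""
    return np.sign(x) * np.maximum(np.abs(x) - gamma, 0.0)

def yosida_abs(x, gamma):
    """Yosida regularization (dPhi)_gamma = (Id - J_{gamma dPhi}) / gamma."""
    return (x - prox_abs(x, gamma)) / gamma

# Basic identity: x = J(x) + gamma * Yosida(x); moreover the Yosida output
# is a subgradient of |.| at J(x), i.e. a value in [-1, 1].
x, gamma = 2.0, 0.5
assert np.isclose(prox_abs(x, gamma) + gamma * yosida_abs(x, gamma), x)
```

One can also verify numerically the Lipschitz constants stated above: 1 for the resolvent and 1/γ for the Yosida regularization.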
Remark 6 By considering the time discretization ż(t) ≈ (z_{n+1} − z_n)/h_n of the above dynamical system and by taking μ constant, from (20) and (23) we obtain an iterative scheme, which for h_n = 1 yields a corresponding algorithm. The convergence of this algorithm has been investigated in [16] in the more general framework of monotone inclusion problems, under the use of variable step sizes (μ_n)_{n≥0} and by assuming that

Σ_{n≥0} β_n [Ψ*(p/β_n) − σ_{argmin Ψ}(p/β_n)] < +∞ for every p ∈ ran N_{argmin Ψ},

which is a condition that can be seen as a discretized version of the one stated in (7). The case Θ(x) = 0 for all x ∈ H has been treated in [8] (see also the references therein).
Next we show that, given x₀, v₀ ∈ H and by assuming (H^1_λ) and (H^1_β), there exists a unique strong global solution of the dynamical system (23). This will be done in the framework of the Cauchy-Lipschitz Theorem for absolutely continuous trajectories (see for example [19, Proposition 6.2.1], [22, Theorem 54]). To this end we will make use of the following Lipschitz property of the resolvent operator as a function of the step size, which is a consequence of classical results.
Notice that the dynamical system (23) can be written as

ż(t) = f(t, z(t)), z(0) = x₀ + μ(0)v₀,

where

f(t, w) = (μ̇(t) − μ(t))(∂Φ)_{μ(t)}(w) − μ(t)[∇Θ(J_{μ(t)∂Φ}(w)) + β(t)∇Ψ(J_{μ(t)∂Φ}(w))]. (28)

In the following we denote by L_∇Θ and L_∇Ψ the Lipschitz constants of ∇Θ and ∇Ψ, respectively.
(a) Notice that for every t ≥ 0 and every w₁, w₂ ∈ H we have

‖f(t, w₁) − f(t, w₂)‖ ≤ L(t)‖w₁ − w₂‖.

Indeed, this follows from (28), the Lipschitz properties of the operators involved and the definition of μ(t). Further, notice that due to (H^1_λ) and (H^1_β), the function L(·), which is for every t ≥ 0 equal to the Lipschitz constant of f(t, ·), is locally integrable on [0, +∞).

(b) We fix w ∈ H and b > 0. Due to (H^1_λ), there exist λ_min, λ_max > 0 such that

λ_min ≤ λ(t) ≤ λ_max ∀t ∈ [0, b].

Relying on Proposition 7 we obtain for all t ∈ [0, b] a chain of inequalities from which (30) follows, by the properties of the functions μ and β. In the light of the statements proven in (a) and (b), the existence and uniqueness of a strong global solution of the dynamical system (23) follow from [19, Proposition 6.2.1] (see also [22, Theorem 54]).
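One possible way to make the Lipschitz estimate in (a) explicit (a sketch under the assumption that f has the form (28), using the nonexpansiveness of J_{μ(t)∂Φ} and the μ(t)^{−1}-Lipschitz continuity of (∂Φ)_{μ(t)}):

```latex
\begin{align*}
\|f(t,w_1)-f(t,w_2)\|
 &\le |\dot\mu(t)-\mu(t)|\,\big\|(\partial\Phi)_{\mu(t)}(w_1)-(\partial\Phi)_{\mu(t)}(w_2)\big\| \\
 &\quad + \mu(t)\,\big\|\nabla\Theta(J_{\mu(t)\partial\Phi}w_1)-\nabla\Theta(J_{\mu(t)\partial\Phi}w_2)\big\| \\
 &\quad + \mu(t)\beta(t)\,\big\|\nabla\Psi(J_{\mu(t)\partial\Phi}w_1)-\nabla\Psi(J_{\mu(t)\partial\Phi}w_2)\big\| \\
 &\le \Big(\frac{|\dot\mu(t)-\mu(t)|}{\mu(t)}
      + \mu(t)\big(L_{\nabla\Theta}+\beta(t)L_{\nabla\Psi}\big)\Big)\,\|w_1-w_2\|
  =: L(t)\,\|w_1-w_2\|.
\end{align*}
```

Local integrability of L(·) then follows from the local absolute continuity of λ (hence of μ = 1/λ) and of β.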
Finally, similarly to the proof of [10, Theorem 2.4(ii)], one can guarantee the existence and uniqueness of the trajectories generated by (14) by relying on the properties of the dynamical system (23) and on (20) and (21). The details are left to the reader.

Convergence of the Trajectories and of the Objective Function Values
In this section we prove weak convergence of the trajectory generated by the dynamical system (14) to an optimal solution of (13), as well as convergence of the objective function values of the latter along the trajectory. Some techniques from [5] and [10] will be useful in this context.
Remark 8 (a) The condition λ̇(t) ≤ 0 for almost every t ∈ [0, +∞) has been used in [10] in the study of the asymptotic convergence of the dynamical system (1), when approaching the monotone inclusion problem (2). (b) Under (H_Ψ), due to Ψ ≤ δ_{argmin Ψ}, we have Ψ* ≥ δ*_{argmin Ψ} = σ_{argmin Ψ}. (c) When Ψ = 0 (see Remark 5(b)), it holds N_{argmin Ψ}(x) = {0} for every x ∈ argmin Ψ = H and Ψ* = σ_{argmin Ψ} = δ_{{0}}, which shows that in this case (H) trivially holds. (d) A nontrivial situation in which condition (H) is fulfilled is when Ψ(x) = ½ inf_{y∈C} ‖x − y‖², for a nonempty, convex and closed set C ⊆ H (see [5]). Then (7) holds if and only if ∫₀^{+∞} dt/β(t) < +∞. In this setting, the subdifferential sum formula ∂(Φ + δ_{argmin Ψ}) = ∂Φ + N_{argmin Ψ} holds when 0 ∈ sqri(dom Φ − argmin Ψ), a condition that is fulfilled if Φ is continuous at a point in dom Φ ∩ argmin Ψ or int(argmin Ψ) ∩ dom Φ ≠ ∅ (we invite the reader to consult also [11, 13] and [23] for other sufficient conditions for the above subdifferential sum formula). Here, for M ⊆ H a convex set, sqri M := {x ∈ M : ∪_{λ>0} λ(M − x) is a closed linear subspace of H} denotes its strong quasi-relative interior. We always have int M ⊆ sqri M (in general this inclusion may be strict). If H is finite-dimensional, then sqri M coincides with ri M, the relative interior of M, which is the interior of M with respect to its affine hull.
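The computation behind (d) is short and standard; we sketch it here, with C playing the role of argmin Ψ. Since Ψ = ½‖·‖² □ δ_C is an infimal convolution, its conjugate is the sum of the conjugates, and the bracket in (7) collapses:

```latex
\[
\Psi^* = \tfrac12\|\cdot\|^2 + \sigma_C,
\qquad
\beta(t)\Big[\Psi^*\Big(\tfrac{p}{\beta(t)}\Big)-\sigma_C\Big(\tfrac{p}{\beta(t)}\Big)\Big]
 = \beta(t)\,\frac{\|p\|^2}{2\beta(t)^2}
 = \frac{\|p\|^2}{2\beta(t)},
\]
```

using that σ_C is positively homogeneous. The integrability requirement in (7) therefore amounts to ∫₀^{+∞} dt/β(t) < +∞, satisfied for instance by β(t) = (1 + t)^α with α > 1.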
The following differentiability result for the composition of convex functions with absolutely continuous trajectories, which is due to Brézis (see [18, Lemme 4, p. 73] and also [5, Lemma 3.2]), will play an important role in our analysis: if f : H → R ∪ {+∞} is proper, convex and lower semicontinuous and x : [0, T] → H is absolutely continuous with ẋ ∈ L²([0, T], H) and x(t) ∈ dom f for almost every t ∈ [0, T], then the function t → f(x(t)) is absolutely continuous and for every t such that x(t) ∈ dom ∂f we have

(d/dt) f(x(t)) = ⟨ẋ(t), h⟩ ∀h ∈ ∂f(x(t)).

We start our convergence analysis with the following technical result (Lemma 9), stated for a strong global solution (x, v) of (14). The only condition which has to be checked is ẋ ∈ L²([0, T], H). By considering the second relation in (14) and by inner multiplying it with ẋ(t), we derive for almost every t ∈ [0, T]

Using (i) we obtain for almost every t ∈ [0, T] the required estimate, which completes the proof of Lemma 9.

Lemma 10 Assume that (H_Φ), (H_Θ), (H_Ψ), (H^1_λ) and (H^2_β) hold and let (x, v) : [0, +∞) → H × H be a strong global solution of the dynamical system (14).

Lemma 11 Let (x, v) : [0, +∞) → H × H be a strong global solution of (14). Choose arbitrary z ∈ S and p ∈ N_{argmin Ψ}(z) such that −p − ∇Θ(z) ∈ ∂Φ(z). Define g_z, h_z : [0, +∞) → [0, +∞) as in (32). The following statements are true:

Proof To begin with, we notice that from the definition of S and (H̃) we have (33). For almost every t ≥ 0, (14) yields an identity for the derivative d/dt of the corresponding energy. From (14) and the convexity of Φ, Θ and Ψ we have for every t ∈ [0, +∞) the estimate (37). From (33) and the convexity of Θ and Ψ we obtain for every t ∈ [0, +∞) the estimates (38) and (39). Further, due to Lemma 10(ii), a corresponding bound holds for almost every t ∈ [0, +∞). On the other hand, using (32) and the Young-Fenchel inequality, we obtain for every t ∈ [0, +∞) the estimate (41). Finally, we obtain for almost every t ∈ [0, +∞) the estimate (42) for d/dt g_z(t), where the first inequality follows from (41) and the second one from (38); the conclusion follows from Lemma 2, (H) and the fact that g_z(t) ≥ 0 for every t ≥ 0.

(ii) Let F : [0, +∞) → R be defined by

From (42) we have for almost every t ∈ [0, +∞) a corresponding estimate for (d/dt) F(t). By integration we obtain a bound valid for every t ∈ [0, +∞); hence F is bounded from below. Furthermore, from (41) we derive a further estimate for every t ∈ [0, +∞).

From (H) and Lemma 2 it follows that lim_{t→+∞} F(t) exists and is a real number. Further, since Ψ ≥ 0, we obtain a corresponding bound for every t ∈ [0, +∞). Similarly to (41), one can show an analogous estimate for every t ∈ [0, +∞), while from (42) we obtain a bound on the derivative d/dt for almost every t ∈ [0, +∞). By using the same arguments as in the proof of (43), the claimed limit exists. Finally, the remaining assertions follow from (43) and (44).

(viii) Combining (vi) and (vii) with g_z, h_z ≥ 0, we deduce that g_z ∈ L¹([0, +∞)) and h_z ∈ L¹([0, +∞)). Finally, notice that due to (i) there exists T > 0 such that g_z is bounded on [T, +∞). The boundedness of g_z on [0, T] follows from (38) and the continuity of x and v. Thus, g_z ∈ L^∞([0, +∞)).
In order to proceed with the asymptotic analysis of the dynamical system (14), we make the following more involved assumptions on the functions λ and β, denoted (H^3_λ) and (H^3_β), respectively.

Lemma 12 Let (x, v) : [0, +∞) → H × H be a strong global solution of the dynamical system (14). The following statements are true:

Proof Take an arbitrary z ∈ S and (according to (H̃)) p ∈ N_{argmin Ψ}(z) such that −p − ∇Θ(z) ∈ ∂Φ(z), and consider the functions g_z, h_z defined in Lemma 11.
Using now Lemma 11(iv), we get that the limit inferior in question exists and, since (Φ + Θ)(x(t)) is bounded from below, this limit inferior is a real number. Let (t_n)_{n∈N} be a sequence with lim_{n→+∞} t_n = +∞ along which it is attained. Since lim_{n→+∞} β(t_n) = +∞, it follows that lim_{n→+∞} E_1(t_n) = 0. The statement follows by taking into consideration the estimate valid for every t ∈ [0, +∞), in combination with lim_{t→+∞} β(t) = +∞.

Lemma 13 Assume that (H_Φ), (H_Θ), (H_Ψ), (H^3_λ), (H^3_β), (H) and (H̃) hold and let (x, v) : [0, +∞) → H × H be a strong global solution of the dynamical system (14). Then (48) holds.

Proof Take an arbitrary z ∈ S. From (H̃) there exists p ∈ N_{argmin Ψ}(z) such that −p − ∇Θ(z) ∈ ∂Φ(z). From Lemma 11(iii) we get (47). We claim that the limit inferior relation (48) holds. Since, according to the previous lemma, x is bounded, this limit inferior is a real number. Let (t_n)_{n∈N} be a sequence with lim_{n→+∞} t_n = +∞ along which it is attained. Using again that x is bounded, there exist x̄ ∈ H and a subsequence (x(t_{n_k}))_{k≥0} converging weakly to x̄ as k → +∞. From (49) we derive the corresponding estimate. Since Φ + Θ is weakly lower semicontinuous, from Lemma 12(ii) we get (50). From (50) and (47) we conclude that (48) is true. Moreover, due to −p − ∇Θ(z) ∈ ∂Φ(z),

Remark 14
One can notice that the condition β̇(t) ≤ k β(t) has not been used in the proofs of Lemma 12 and Lemma 13.
Taking into account the definition of E_2 and the fact that lim_{t→+∞} E_2(t) ∈ R, we conclude that lim_{t→+∞} E_2(t) = (Φ + Θ)(z).
Let (x, v) : [0, +∞) → H × H be a strong global solution of (14) and assume that x is bounded. In other words, there exists M > 0 such that ‖x(t)‖ ≤ M for every t ∈ [0, +∞). Take z ∈ dom Φ ∩ argmin Ψ and r > 0 such that r > max{‖z‖, M}.
In the following we denote by B(0, r) the closed ball centered at the origin with radius r. We will use several times in what follows the fact that the normal cone to a set at a point belonging to the interior of this set reduces to {0}. Due to (58), we consequently have

∂(Φ + δ_{B(0,r)})(x(t)) = ∂Φ(x(t)) + N_{B(0,r)}(x(t)) = ∂Φ(x(t)) ∀t ≥ 0.
The latter implies that the set of optimal solutions of (13) is nonempty, which contradicts the assumption we made. In this way, our claim is proved.
In the last result we show that if the objective function of (13) is strongly convex, then the trajectory x(·) generated by (14) converges strongly to the unique optimal solution of (13).

Theorem Let (x, v) : [0, +∞) → H × H be a strong global solution of (14). If Φ + Θ is strongly convex, then x(t) converges strongly to the unique optimal solution of (13) as t → +∞.
Proof Let γ > 0 be such that Φ + Θ is γ-strongly convex. It is a well-known fact that in this case the optimization problem (13) has a unique optimal solution, which we denote by z.
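The mechanism behind the strong convergence is the standard strong-convexity inequality, stated here for orientation (the full proof also has to control the penalty term): for every w ∈ ∂(Φ + Θ)(z),

```latex
\[
(\Phi+\Theta)(x(t)) \;\ge\; (\Phi+\Theta)(z) + \langle w,\, x(t)-z\rangle
  + \frac{\gamma}{2}\,\|x(t)-z\|^2 ,
\]
```

so that once (Φ + Θ)(x(t)) → (Φ + Θ)(z) and the linear term is shown to be asymptotically nonnegative (via the variational inequality (6) and the penalty term), it follows that ‖x(t) − z‖ → 0.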