On the proximal point algorithm and its Halpern-type variant for generalized monotone operators in Hilbert space

In a recent paper, Bauschke et al. study $\rho$-comonotonicity as a generalized notion of monotonicity of set-valued operators $A$ in Hilbert space and characterize this condition on $A$ in terms of the averagedness of its resolvent $J_A$. In this note we show that this result makes it possible to adapt many proofs of properties of the proximal point algorithm PPA and its strongly convergent Halpern-type variant HPPA to this more general class of operators. This also applies to quantitative results on the rates of convergence or metastability (in the sense of T. Tao). For example, using this approach we get a simple proof for the convergence of the PPA in the boundedly compact case for $\rho$-comonotone operators and obtain an effective rate of metastability. If $A$ has a modulus of regularity w.r.t. $\mathrm{zer}\,A$, we also get a rate of convergence to some zero of $A$ even without any compactness assumption.
We also study a Halpern-type variant HPPA of the PPA for $\rho$-comonotone operators, prove its strong convergence (without any compactness or regularity assumption) and give a rate of metastability.

$H$. This stems from the fact that for $A$ being the subdifferential $\partial f$ of a proper, convex and lower semi-continuous function $f : H \to (-\infty, \infty]$, $\mathrm{zer}\,A$ coincides with the set of minimizers of $f$. An important algorithm for the approximation of zeros of $A$ is the Proximal Point Algorithm PPA [17,20]

$$x_{n+1} := J_{\gamma_n A}\, x_n, \qquad (\gamma_n) \subset (0, \infty),$$

where $J_{\gamma_n A} := (I + \gamma_n A)^{-1} : R(I + \gamma_n A) \to D(A)$ is the single valued resolvent of $\gamma_n A$ and $A$ is assumed to satisfy some range condition such as $D(A) \subseteq R(I + \lambda A)$ for all $\lambda > 0$, so that the iteration is defined for $x_0 \in D(A) \subseteq R(I + \gamma_0 A)$. Here $D(A)$ and $R(A)$ denote the domain and range of $A$, respectively, as defined for set-valued mappings (see e.g. [4,22]).
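The iteration above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the text: the toy function $f(x) = |x - c|$ on the real line, the helper names `prox_abs_shift` and `ppa`, and the constant step sizes are all hypothetical choices. For this $f$, the resolvent of $\gamma\,\partial f$ is the familiar soft-thresholding map centred at $c$.

```python
def prox_abs_shift(x, gamma, c=3.0):
    """Resolvent J_{gamma A} x for A = subdifferential of f(x) = |x - c|.

    Hypothetical toy example: the resolvent is soft-thresholding around c.
    """
    d = x - c
    if d > gamma:
        return x - gamma
    if d < -gamma:
        return x + gamma
    return c


def ppa(x0, gammas, resolvent):
    """Proximal point iteration x_{n+1} := J_{gamma_n A} x_n."""
    xs = [x0]
    for gamma in gammas:
        xs.append(resolvent(xs[-1], gamma))
    return xs


# Ten steps with constant step size gamma_n = 1, starting at x_0 = 10.
iterates = ppa(10.0, [1.0] * 10, prox_abs_shift)
```

In this toy case the iterates move towards the unique minimizer $c$ of $f$, which is also the unique zero of $\partial f$, and then stay there, since zeros of $A$ are fixed points of the resolvent.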
The range condition trivially holds for maximally monotone operators $A$ such as $\partial f$, since then $R(I + \lambda A) = H$. The crucial relation between $A$ and $J_{\lambda A}$ is that the set of zeros of $A$ coincides with the fixed point set of $J_{\lambda A}$ (which, therefore, in particular does not depend on the choice of $\lambda > 0$). If $A$ is monotone, then $J_{\lambda A}$ is firmly nonexpansive so that many results from metric fixed point theory apply (see e.g. [4] for all this).

In order to be able to treat functions $f$ which are not necessarily convex, one needs to weaken the requirement of $A$ to be monotone from

$$(+)\quad \langle x - y, u - v \rangle \ge 0 \quad \text{for all } (x, u), (y, v) \in A$$

to e.g. stipulating

$$(++)\quad \langle x - y, u - v \rangle \ge \rho\, \|u - v\|^2 \quad \text{for all } (x, u), (y, v) \in A,$$

where now $\rho$ may also be negative (see e.g. [7,8]). In the recent paper [5], this condition, called $\rho$-comonotonicity, is thoroughly investigated and related to properties of $J_A$. One key result is that $J_A$ is an averaged mapping whenever (++) holds with $\rho > -\frac{1}{2}$. The averaged mappings form a larger class of mappings than the firmly nonexpansive ones but still have nice properties, e.g. they are strongly nonexpansive.

In the recent papers [12,13], we studied from a quantitative point of view the PPA as well as a strongly convergent so-called Halpern-type variant HPPA (in Banach spaces), making use essentially only of the fact that all firmly nonexpansive mappings have a common so-called modulus for being strongly nonexpansive (see [10]). This also holds true for the class of averaged mappings if we have some control on the averaging constant (see [21]). Putting all this together, it is rather straightforward to see that the main results on the PPA and HPPA established in [12,13] generalize (in the case of Hilbert spaces) to $\rho$-comonotone operators, which is the content of this short note.
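To make the link between comonotonicity and averagedness concrete, the following sketch checks both on a hypothetical toy operator, $A(x) = -4x$ on the real line. The operator, the sample size and all function names are assumptions made for illustration only. The comonotonicity inequality is taken in its usual form $\langle x - y, u - v\rangle \ge \rho\|u - v\|^2$, and the averaging constant $\alpha = 1/(2(\rho+1))$ is the case $\gamma = 1$ of the constant used later for the HPPA.

```python
import random

A_SLOPE = -4.0                      # hypothetical toy operator A(x) = -4x
RHO = 1.0 / A_SLOPE                 # = -0.25, so rho > -1/2
ALPHA = 1.0 / (2.0 * (RHO + 1.0))   # averaging constant 1/(2(rho+1)) = 2/3


def A(x):
    return A_SLOPE * x


def resolvent(x):
    """J_A x = (I + A)^{-1} x, i.e. the solution y of y + A(y) = x."""
    return x / (1.0 + A_SLOPE)


def is_comonotone(pairs, rho, tol=1e-9):
    """Check <x - y, Ax - Ay> >= rho * |Ax - Ay|^2 on the sampled pairs."""
    return all(
        (x - y) * (A(x) - A(y)) >= rho * (A(x) - A(y)) ** 2 - tol
        for x, y in pairs
    )


def averaged_remainder(x):
    """The map N in the decomposition J_A = (1 - ALPHA) * I + ALPHA * N."""
    return (resolvent(x) - (1.0 - ALPHA) * x) / ALPHA


rng = random.Random(0)
pairs = [(rng.uniform(-10, 10), rng.uniform(-10, 10)) for _ in range(1000)]

comonotone = is_comonotone(pairs, RHO)   # inequality holds with rho = -1/4
monotone = is_comonotone(pairs, 0.0)     # rho = 0 is plain monotonicity
n_nonexpansive = all(
    abs(averaged_remainder(x) - averaged_remainder(y)) <= abs(x - y) + 1e-9
    for x, y in pairs
)
```

Here $A$ is not monotone, yet $J_A x = -x/3 = \tfrac{1}{3}x + \tfrac{2}{3}(-x)$ is a convex combination of the identity and the nonexpansive map $x \mapsto -x$, so the resolvent is averaged even though monotonicity fails.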
While the PPA has been considered for $\rho$-comonotone operators before (even for sequences of operators, error terms and relaxations, see [7]), our note shows that by the connection between the comonotonicity of $A$ and the averagedness of $J_A$ as established in [5], many proofs for properties of the PPA and the HPPA for monotone operators can be easily adapted to cover the $\rho$-comonotone case. We also provide new quantitative results on the convergence. For the HPPA, to the best of our knowledge, our note provides the first results in the absence of monotonicity.

Preparatory results
Throughout this paper $H$ is a real Hilbert space and $A \subseteq H \times H$ a set-valued operator with the usual definitions of $D(A)$ and $\mathrm{zer}\,A$. $\overline{D(A)}$ denotes the topological closure of $D(A)$. We always assume that $D(A) \ne \emptyset$.
In the case where $\rho < 0$, which we are interested in, $\rho$-comonotonicity has been studied before in [7] under the name of $|\rho|$-cohypomonotonicity in the context of proximal methods as discussed in the introduction (see also Remark 3.4 below). Let $J_A := (I + A)^{-1}$ be the resolvent of $A$.
(2) follows as in [3, p. 105] using Proposition 2.2 which is applicable since-by (1) and (2) we get

Definition 2.5 [6]. Let $C \subseteq H$ be a nonempty subset of $H$ and $T : C \to H$ be a mapping.
$T$ is called strongly nonexpansive (SNE) if $T$ is nonexpansive and for all sequences $(x_n)$, $(y_n)$ in $C$ the following implication is true:

$$(x_n - y_n) \text{ bounded and } \|x_n - y_n\| - \|T x_n - T y_n\| \to 0 \ \Longrightarrow\ (x_n - y_n) - (T x_n - T y_n) \to 0.$$

Lemma 2.6 [10, Lemma 2.2] $T : C \to H$ is strongly nonexpansive iff $T$ has an SNE-modulus
The proof of [21, Proposition 2.7] establishes:

Proposition 2.7 [21]. Let $C \subseteq H$ be some subset of $H$ and $T : C \to H$ be an $\alpha$-averaged mapping for some $\alpha \in (0, 1)$. Then $T$ is strongly nonexpansive with SNE-modulus

Then for each $n \in \mathbb{N}$, $J_{\gamma_n A}$ : is a strongly nonexpansive sequence of mappings $C \to C$ in the sense of the papers [1,2].
Proof. By the assumptions and Lemma 2.3, $\gamma_n A$ is $(\rho/\gamma_n)$-comonotone and so, since The claim now follows from Proposition 2.7.
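As a plausibility check for Proposition 2.7, one can test numerically that an $\alpha$-averaged map in a Hilbert space satisfies a modulus-type inequality. The sketch below is an assumption-laden illustration and not the statement of [21]: it uses the candidate modulus $\omega(b, \varepsilon) = (1-\alpha)\varepsilon^2/(2\alpha b)$, which follows from the standard estimate $\|Tx - Ty\|^2 \le \|x - y\|^2 - \frac{1-\alpha}{\alpha}\|(x - y) - (Tx - Ty)\|^2$ for averaged maps and may differ from the modulus $\omega_\alpha$ actually used in the paper. The averaged map $T = (1-\alpha)I + \alpha R_\theta$ with a plane rotation $R_\theta$ and all function names are hypothetical.

```python
import math
import random


def rotate(v, theta):
    """Rotation of the plane by the angle theta (a nonexpansive map)."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])


def averaged_map(v, alpha, theta):
    """T = (1 - alpha) * I + alpha * R_theta, an alpha-averaged map."""
    r = rotate(v, theta)
    return ((1 - alpha) * v[0] + alpha * r[0],
            (1 - alpha) * v[1] + alpha * r[1])


def modulus_inequality_holds(alpha, theta, b, trials=2000, seed=0, tol=1e-12):
    """Check gap >= (1 - alpha) * defect^2 / (2 * alpha * b) on samples.

    gap    = |x - y| - |Tx - Ty|
    defect = |(x - y) - (Tx - Ty)|
    Points are drawn so that |x - y| <= b.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        x = (rng.uniform(-b / 4, b / 4), rng.uniform(-b / 4, b / 4))
        y = (rng.uniform(-b / 4, b / 4), rng.uniform(-b / 4, b / 4))
        tx, ty = averaged_map(x, alpha, theta), averaged_map(y, alpha, theta)
        d = (x[0] - y[0], x[1] - y[1])
        td = (tx[0] - ty[0], tx[1] - ty[1])
        gap = math.hypot(*d) - math.hypot(*td)
        defect = math.hypot(d[0] - td[0], d[1] - td[1])
        if gap < (1 - alpha) * defect ** 2 / (2 * alpha * b) - tol:
            return False
    return True


ok_half = modulus_inequality_holds(alpha=0.5, theta=math.pi / 2, b=4.0)
ok_large = modulus_inequality_holds(alpha=0.9, theta=math.pi, b=4.0)
```

The checked inequality implies the modulus property: if the gap is smaller than $\omega(b, \varepsilon)$, then the defect must be smaller than $\varepsilon$. The bound degrades as $\alpha \to 1$, which is why some control on the averaging constant is needed.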

Define
Then $u_n \in A x_{n+1}$, $\lim_{n\to\infty} u_n = 0$ and

Proof. 1) Let $p \in \mathrm{zer}\,A$. Then by Propositions 2.2 and 2.4 (using that $\gamma_n A$ is $(\rho/\gamma_n)$-comonotone with $\rho/\gamma_n > -\frac{1}{2} > -1$) Hence by Proposition 2.8 By Proposition 2.4(3) (which is applicable in the nontrivial case where $n \ge 1$ due to $x_n \in D(A)$ and the range condition) and so also $\lim_{n\to\infty} \|x_n - J_{\gamma_0 A} x_n\| = 0$.

The lim inf-bound is proved as in [13, Proposition 2.1] using Proposition 2.8. We include the proof here for completeness: Let $L \in \mathbb{N}$ and $\delta > 0$. Then there exists an $n \in \mathbb{N}$ with $L \le n \le L + \lceil b/\delta \rceil + 1$ such that Now fix $\delta := \omega_\alpha(b, \varepsilon)$. Then Proposition 2.8 implies the existence of an $n$ with $L \le n \le (\varepsilon, L, b)$ such that

2) is immediate from 1).
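The behaviour described in the proposition can be observed on a small example. The sketch below makes several assumptions that are not in the text: it uses the hypothetical toy operator $A(x) = ax$ with $a = -4$ on the real line (which is $\rho$-comonotone with $\rho = -1/4$), and it takes $u_n := (x_n - x_{n+1})/\gamma_n$ as the residual, which lies in $A x_{n+1}$ because $x_n \in x_{n+1} + \gamma_n A x_{n+1}$. The function name is likewise hypothetical.

```python
def ppa_with_residuals(x0, gammas, a=-4.0):
    """Run the PPA for the toy operator A(x) = a * x.

    Returns the iterates x_n and the residuals
    u_n = (x_n - x_{n+1}) / gamma_n, which satisfy u_n = A(x_{n+1}).
    """
    xs, us = [x0], []
    for gamma in gammas:
        x_next = xs[-1] / (1.0 + gamma * a)   # resolvent of gamma * A
        us.append((xs[-1] - x_next) / gamma)
        xs.append(x_next)
    return xs, us


# Step size above the threshold -2 * rho = 0.5: residuals tend to zero.
xs_good, us_good = ppa_with_residuals(1.0, [1.0] * 30)

# Step size below the threshold: the resolvent is expansive here.
xs_bad, us_bad = ppa_with_residuals(1.0, [0.3] * 30)
```

With $\gamma_n = 1$ the resolvent is $x \mapsto -x/3$ and both $x_n$ and $u_n$ tend to $0$. With $\gamma_n = 0.3 < -2\rho$ the resolvent is $x \mapsto -5x$ and the iteration diverges, which illustrates why a lower bound on the step sizes in terms of $\rho$ is required.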
The PPA for maximally monotone operators, while being weakly convergent, in general fails to be strongly convergent as shown in [9]. In the boundedly compact (i.e. finite dimensional) case there is in general no computable rate of convergence unless some strong metric regularity assumption is made (see [19] and [11]). However, in the boundedly compact case, one can get effective rates of metastability in the sense of T. Tao [24,25] for the Cauchy property of $(x_n)$, i.e.

$$\forall \varepsilon > 0\ \forall g : \mathbb{N} \to \mathbb{N}\ \exists n \le (\varepsilon, g)\ \forall i, j \in [n, n + g(n)]\ \big(\|x_i - x_j\| < \varepsilon\big).$$
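The metastability property can be made tangible by searching for a suitable $n$ directly. The following sketch is purely illustrative: the function name, the real-valued test sequences and the cut-off `n_max` are assumptions, and a brute-force search of this kind says nothing about the effective bounds extracted in the paper.

```python
def metastable_index(x, eps, g, n_max):
    """Return the least n <= n_max such that |x(i) - x(j)| < eps
    for all i, j in [n, n + g(n)], or None if no such n exists.

    x maps an index to a real number and g is the counterfunction.
    """
    for n in range(n_max + 1):
        window = [x(i) for i in range(n, n + g(n) + 1)]
        # For real numbers, all pairwise distances are below eps
        # exactly when the spread max - min is below eps.
        if max(window) - min(window) < eps:
            return n
    return None


# A convergent sequence: a stable window is found quickly.
n_conv = metastable_index(lambda n: 1.0 / (n + 1), 0.1, lambda n: n + 5, 100)

# An oscillating sequence: no window of length >= 1 is ever stable.
n_osc = metastable_index(lambda n: (-1.0) ** n, 1.0, lambda n: 1, 50)
```

The point of the counterfunction $g$ is that the window $[n, n + g(n)]$ may grow with $n$; a bound on $n$ that depends only on $\varepsilon$ and $g$ can exist even when no computable rate of convergence does.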
Note that, noneffectively, this property implies the Cauchy property of $(x_n)$ and hence the existence of a limit $x$, but it does not allow one to convert the rate of metastability into an effective rate of convergence. One can additionally ensure that for $i \in [n, n + g(n)]$, $x_i$ is an approximate zero of $A$, which guarantees that $x$ is a zero of $A$. We now extend our rate of metastability for the PPA from [13] to the $\rho$-comonotone case:

Theorem 3.2 Let $A$ be as above and assume additionally that $D(A) \subseteq \bigcap_{n=0}^{\infty} R(I + \gamma_n A)$ is boundedly compact and $x_0 \in D(A)$. Then $(x_n)$ strongly converges to a zero of $A$. Moreover, the rate of metastability from [13, Theorem 2.12] also holds in our current situation, with the corresponding bound from there being replaced by our definition in Proposition 3.1(1), i.e.
$(*)$ $\forall k \in \mathbb{N}\ \forall g \in \mathbb{N}^{\mathbb{N}}\ \exists n \le (k, g, \beta)\ \forall i, j \in [n, n + g(n)]$

Since $(x_n)$ is metastable (the first part of $(*)$), it is a Cauchy sequence and hence convergent with $x := \lim_n x_n \in D(A)$. By the extra clause '$x_i \in F_k$' in $(*)$, which strengthens the usual formulation of a rate of metastability, we can conclude that $x \in \mathrm{zer}\,A$. Indeed, choosing in $(*)$ for given $N \in \mathbb{N}$ the function $g(n) := N$ we get an

For the weak convergence in the noncompact case we reason as follows: let $w$ be a weak sequential cluster point of $(x_n)$. Then there is a subsequence $(x_{n_k})$ which weakly converges to $w$. By Proposition 3.1(1), $(x_{n_k})$ is an approximate fixed point sequence of $J_{\gamma_0 A}$. Hence by Browder's demiclosedness principle ([4, Corollary 4.28]) applied to $J_{\gamma_0 A}$ and $C$ it follows that $w \in \mathrm{Fix}(J_{\gamma_0 A})$. Hence, using again the fact that $(x_n)$ is Fejér monotone w.r.t. $\mathrm{Fix}(J_{\gamma_0 A})$, we can conclude that $(x_n)$ weakly converges to $w \in \mathrm{Fix}(J_{\gamma_0 A}) = \mathrm{zer}\,A$ by [4, Theorem 5.5].
An easy calculation shows that this implies that Also the converse holds: let $\delta > 0$ be such that $\gamma > -2\rho + \delta$. Then the condition $\gamma + \delta$ .

Error terms $u_n$ subject to the condition that $\sum_{n} \|u_n\| < \infty$ (implied by condition (vi) in [7, Theorem 3.1]) can be incorporated even in the quantitative part of our theorem (similar to [15, Theorem 4.5]). Our approach makes explicit the relevance of the averagedness of $J_{\gamma_n A}$, which occurs only implicitly in the proof of [7, Theorem 3.1].

Definition 3.5 [16] Let
As [13, Lemma 2.6] (but reasoning in the proof of $\mathrm{zer}\,F \subseteq \mathrm{zer}\,A$ with, say, $\gamma_0 A$ and $J_{\gamma_0 A}$ instead of $A$, $J_A$) one shows that

Lemma 3.6 With $F$ as defined in the previous definition, $\mathrm{zer}\,F = \mathrm{zer}\,A$ and so $(x_n)$ as defined by the PPA for $A$ is Fejér monotone w.r.t. $\mathrm{zer}\,F = \mathrm{zer}\,A$, i.e.
As in the case of [13, Theorem 2.8] one now gets

Proof. The proof is largely identical to that of [13, Theorem 2.8]. We only have to observe that in that latter proof it suffices to have the existence of an $n \le \rho(\varepsilon, b, \gamma)$ with $|F(x_{n+1})| \le \|u_n\| \le \varepsilon$ (rather than that this holds for all $n \ge \rho(\varepsilon, b, \gamma)$) and that this follows from Proposition 3.1(2).

The Halpern-type proximal point algorithm HPPA for comonotone operators
Whereas the PPA even for monotone operators $A$ in general is not strongly convergent ([9]), a Halpern-type variant strongly converges also for $\rho$-comonotone operators, as we show in this section. Again we assume that $(\gamma_n) \subset (0, \infty)$ with $\gamma_n \ge \gamma > 0$ for all $n \in \mathbb{N}$ and that $A$ is $\rho$-comonotone with $\rho \in (-\frac{\gamma}{2}, 0]$ and $\mathrm{zer}\,A \ne \emptyset$. Let $C \subseteq H$ be a nonempty closed and convex subset such that $D(A) \subseteq C \subseteq \bigcap_{n=0}^{\infty} R(I + \gamma_n A)$.

Then $(x_n)$ strongly converges to the zero of $A$ which is closest to $u$. Moreover, the rate of metastability from [12, Theorem 4.1] also holds for our current situation if $\omega_\eta$ is replaced by $\omega_\alpha$ from Proposition 2.7 above with $\alpha := \frac{1}{2((\rho/\gamma)+1)}$ and $\omega_J(b, \varepsilon) := \varepsilon$.

Proof. The strong convergence follows from [2, Theorem 3.1], whose assumptions are satisfied by Propositions 2.4(1), 2.8 and 4.2, using also that $H$ has the fixed point property for nonexpansive mappings. The strong convergence also follows using [12, Theorem 4.1] which, moreover, gives the rate of metastability stated in the theorem. For this we only have to observe that the proof of [12, Theorem 4.1] only uses properties of $J_{\gamma_n A}$ which, by the results stated above, also hold true for $\rho$-comonotone operators $A$, where now we use $\omega_\alpha$ and Proposition 2.8 instead of $\omega_\eta$ and [12, Lemma 2.4]. Finally, we note that we can take $\omega_J(b, \varepsilon) := \varepsilon$ as modulus of uniform continuity for the normalized duality map on $B(0, b)$ since we are in a Hilbert space.
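A small experiment illustrates the selection of the zero closest to the anchor $u$. The sketch rests on assumptions that are not taken from the text: it uses the standard Halpern-type scheme $x_{n+1} := \alpha_n u + (1 - \alpha_n) J_{\gamma_n A} x_n$ with the hypothetical choice $\alpha_n = 1/(n+2)$, and the toy operator $A(x_1, x_2) = (-4x_1, 0)$ on the plane, which is $\rho$-comonotone with $\rho = -1/4$ and has the whole $x_2$-axis as its zero set. The conditions on $(\alpha_n)$ actually required in [12] may differ.

```python
def resolvent(x, gamma):
    """J_{gamma A} x for the toy operator A(x1, x2) = (-4 * x1, 0).

    Solving y + gamma * A(y) = x gives y = (x1 / (1 - 4 * gamma), x2).
    """
    return (x[0] / (1.0 - 4.0 * gamma), x[1])


def hppa(x0, u, gammas):
    """Halpern-type PPA with anchor u and weights alpha_n = 1 / (n + 2)."""
    x = x0
    for n, gamma in enumerate(gammas):
        alpha = 1.0 / (n + 2)
        j = resolvent(x, gamma)
        x = (alpha * u[0] + (1.0 - alpha) * j[0],
             alpha * u[1] + (1.0 - alpha) * j[1])
    return x


# Anchor u = (2, 5); the zero of A closest to u is its projection (0, 5).
limit_estimate = hppa(x0=(1.0, 0.0), u=(2.0, 5.0), gammas=[1.0] * 5000)
```

In this example the plain PPA started at $(1, 0)$ would converge to $(0, 0)$, the zero determined by the starting point, whereas the Halpern-type iteration drifts along the zero set towards the projection of $u$. Convergence is slow because the weights $\alpha_n$ decay only like $1/n$.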

Remark 4.4 Remark 3.3 applies here as well: if $A$ is maximally $\rho$-comonotone, then the range condition is satisfied for any closed and convex subset $C \subseteq H$ satisfying $D(A) \subseteq C$.