Abstract
We use techniques originating from the subdiscipline of mathematical logic called ‘proof mining’ to provide rates of metastability and—under a metric regularity assumption—rates of convergence for a subgradient-type algorithm solving the equilibrium problem in convex optimization over fixed-point sets of firmly nonexpansive mappings. The algorithm is due to H. Iiduka and I. Yamada who in 2009 gave a noneffective proof of its convergence. This case study illustrates the applicability of the logic-based abstract quantitative analysis of general forms of Fejér monotonicity as given by the second author in previous papers.
1 Introduction
In [11], a general logic-based analysis of abstract forms of convergence theorems based on general forms of Fejér monotonicity is given. That paper uses methods from the subdiscipline of mathematical logic called ‘proof mining’, which aims at the extraction of effective bounds from prima facie nonconstructive proofs by logical transformations (see [9] for a book treatment and [10] for a recent survey).
Even in the simplest cases of ordinary Fejér monotonicity on the real line, with all the data involved trivially computable, there are in general no computable rates of convergence. This can be shown using methods from computability theory (see the discussion in [11] and, in particular, [14]), which sharpen the ‘arbitrarily slow convergence’ phenomena known in optimization to noncomputability results.
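Logically speaking, this is because the formulation of the Cauchy property of a sequence \((x_{n})_{n\in \mathbb {N}}\) (say in a metric space (X,d)), that is
$$ \forall k\in\mathbb{N}\,\exists n\in\mathbb{N}\,\forall i,j\geq n\left(d(x_{i},x_{j})<\frac{1}{k+1}\right), $$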
is of the form ∀∃∀ which, in general, is of too high logical complexity (and thus not covered by the general logical metatheorems used in proof mining to extract bounds from noneffective proofs).
What can be achieved by the aforementioned logical metatheorems, however, are effective rates of so-called metastability which, moreover, are highly uniform. Metastability is based on a (noneffectively equivalent but constructively weakened) reformulation of the convergence or Cauchy statements into what is known in logic as Herbrand normal form. In the context of the above example, this reformulation is given by (here [n;n + g(n)] := {n,n + 1,n + 2,…,n + g(n)})
$$ \forall k\in\mathbb{N}\,\forall g\in\mathbb{N}^{\mathbb{N}}\,\exists n\in\mathbb{N}\,\forall i,j\in[n;n+g(n)]\left(d(x_{i},x_{j})<\frac{1}{k+1}\right). $$
This statement is of the general form ∀∃ (considering the leading two universal quantifiers as one and disregarding the last universal quantifier as it is bounded) and for statements of the above form, the logical metatheorems of proof mining guarantee the extractability of highly uniform effective bounds on ‘\(\exists n\in \mathbb {N}\)’ (see [9]). Such bounds are by now well-known in the literature under the name of rates of metastability (after Tao, see, e.g. [16, 17]).
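To make the metastable reformulation concrete, the following minimal sketch searches a finite prefix of a real sequence for an n whose window [n;n + g(n)] has all pairwise distances below 1/(k + 1). The function name `metastability_witness`, the sample sequence and the counterfunction g are illustrative assumptions and not part of the analysis in this paper.

```python
from typing import Callable, Optional, Sequence


def metastability_witness(
    x: Sequence[float],
    k: int,
    g: Callable[[int], int],
) -> Optional[int]:
    """Return the least n with |x[i] - x[j]| < 1/(k+1) for all i, j in
    [n; n + g(n)], or None if no such n is found before a window leaves
    the finite prefix x."""
    eps = 1.0 / (k + 1)
    n = 0
    while n + g(n) < len(x):
        window = x[n : n + g(n) + 1]
        # On the real line the largest pairwise distance is max - min.
        if max(window) - min(window) < eps:
            return n
        n += 1
    return None


# Hypothetical data: x_n = 1/(n+1) and counterfunction g(n) = n + 5.
xs = [1.0 / (n + 1) for n in range(200)]
witness = metastability_witness(xs, k=9, g=lambda n: n + 5)
```

Such a search only certifies a witness for one particular sequence; a rate of metastability, in contrast, bounds the witness uniformly in terms of k, g and a few parameters of the problem.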
One important consequence of the Fejér monotonicity (in the very general sense of [11]) of an iterative sequence \((x_{n})_{n\in \mathbb {N}}\) is that effective rates of convergence can be established whenever some general form of regularity is given quantitatively by a so-called modulus of regularity, a notion which generalizes many concepts of regularity used in optimization (see [13]). The existence of such regularity usually requires one to be in a rather special (‘tame’) context where the sets in question are, e.g. semialgebraic, so that tools from the model theory of o-minimal structures can be utilized (see, e.g. [3, 6] and, for a concrete example, [4]).
Our paper is intended as a case study to illustrate how the abstract approach from [11, 13] can be used in a very concrete situation to give a perspicuous quantitative analysis of the algorithm being considered. The logic-based notions used in these papers now have a concrete mathematical meaning so that the whole treatment can be given without any reference to logic. We will indicate in the next subsection that we expect that many other algorithms can be analyzed in a similar way.
In [5], the following equilibrium problem over the fixed point set of a firmly nonexpansive mapping is studied: let \(f:\mathbb {R}^{N}\times \mathbb {R}^{N}\to \mathbb {R}\) be a function such that:
- (1) f(x,x) = 0 for any \(x\in \mathbb {R}^{N}\);
- (2) f(⋅,y) is continuous for any y;
- (3) f(x,⋅) is convex for any x.
Now, we can state the equilibrium problem for f (a so-called equilibrium function) over the fixed point set Fix(T) of T, where \(T:\mathbb {R}^{N}\to \mathbb {R}^{N}\) is a firmly nonexpansive mapping with a nonempty fixed point set.
Problem 1 (Equilibrium problem of f over Fix(T))
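Find a point in \(\text{EP}(\text{Fix}(T),f):=\{u\in \text{Fix}(T)\mid f(u,y)\geq 0\text{ for all }y\in \text{Fix}(T)\}\).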
Iiduka and Yamada proposed a subgradient-type iterative algorithm \((x_{n})_{n\in \mathbb {N}}\) (see also [7] for a related algorithm also discussed in [5]) and showed that it converges to a point in EP(Fix(T),f). However, the theorem provides no quantitative information. In this paper, we provide explicit quantitative versions of this result as well as of several intermediate convergence results such as
$$ \underset{n\to\infty}{\lim}f(y_{n},x_{n})=0\text{ and }\underset{n\to\infty}{\lim}\left\|x_{n}-Tx_{n}\right\|=0. $$
As mentioned already, even for N = 1 and f ≡ 0 one can construct a simple computable T such that \((x_{n})_{n\in \mathbb {N}}\) has no computable rate of convergence, by adapting a counterexample from [14].
Nevertheless, we present a fully effective and highly uniform rate of metastability for the algorithm providing a complete finitary account of the main result in [5].
Moreover, in Section 3, we even give a rate of convergence, modulo an additional metric regularity assumption.
1.1 Analytical preliminaries
Throughout, we consider \(\mathbb {R}^{N}\) (N ≥ 1) as the Euclidean space with the usual inner product 〈⋅,⋅〉 and the induced Euclidean norm \(\left \|\cdot \right \|\). With \(B_{r}(x)\) and \(\overline {B_{r}(x)}\), we denote the open and closed ball with radius r > 0 and center \(x\in \mathbb {R}^{N}\) with respect to \(\left \|{\cdot }\right \|\), respectively.
Throughout, if not stated otherwise, let \(T:\mathbb {R}^{N}\to \mathbb {R}^{N}\) be a firmly nonexpansive mapping, that is for all \(x,y\in \mathbb {R}^{N}\):
$$ \left\|Tx-Ty\right\|^{2}\leq\langle x-y,Tx-Ty\rangle. $$
In particular, T is also nonexpansive.
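As a numerical sanity check of this notion, the sketch below tests the standard inequality \(\left \|Tx-Ty\right \|^{2}\leq \langle x-y,Tx-Ty\rangle \) on random sample pairs. The helper names, the choice of the projection onto the closed unit ball and the sample points are assumptions made for illustration only. A sampled test of this kind can refute the property but never prove it.

```python
import math
import random
from typing import Callable, List

Vector = List[float]


def project_onto_ball(x: Vector, radius: float = 1.0) -> Vector:
    """Metric projection onto the closed Euclidean ball of given radius around 0."""
    norm = math.sqrt(sum(c * c for c in x))
    if norm <= radius:
        return list(x)
    return [radius * c / norm for c in x]


def is_firmly_nonexpansive_on(
    T: Callable[[Vector], Vector], samples: List[Vector], tol: float = 1e-12
) -> bool:
    """Check ||Tx - Ty||^2 <= <x - y, Tx - Ty> on all pairs of sample points."""
    for x in samples:
        for y in samples:
            tx, ty = T(x), T(y)
            d = [a - b for a, b in zip(tx, ty)]
            lhs = sum(c * c for c in d)
            rhs = sum((a - b) * c for a, b, c in zip(x, y, d))
            if lhs > rhs + tol:
                return False
    return True


random.seed(0)
points = [[random.uniform(-3, 3) for _ in range(3)] for _ in range(30)]
```

On these samples the projection passes the check, whereas the mapping x ↦ −x, which is nonexpansive but not firmly nonexpansive, fails it.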
1.2 The subgradient method of Iiduka and Yamada
In [5], Iiduka and Yamada propose the following algorithm for the equilibrium problem, which utilizes subgradients of the equilibrium function f:
Algorithm 2 (Subgradient-type method for Problem 1)
Choose \(\varepsilon_{0}\geq 0\), \(\lambda_{0}>0\) and \(x_{0}\in \mathbb {R}^{N}\) arbitrarily, define \(\rho _{0}:=\left \|x_{0}\right \|\) and set n = 0. Then, repeat:
- Given \(x_{n}\in \mathbb {R}^{N}\) and \(\rho_{n}\geq 0\), choose \(\varepsilon_{n}\geq 0\) and \(\lambda_{n}>0\).
- Find a point \(y_{n}\in K_{n}:=\overline {B_{\rho _{n}+1}(0)}\) such that
$$ f(y_{n},x_{n})\geq 0\text{ and }\max_{y\in K_{n}}f(y,x_{n})\leq f(y_{n},x_{n})+\varepsilon_{n}. $$
- Choose \(\xi_{n}\in \partial f(y_{n},\cdot)(x_{n})\) arbitrarily, define
$$ x_{n+1}:=T(x_{n}-\lambda_{n}f(y_{n},x_{n})\xi_{n})\text{ and }\rho_{n+1}:=\max\{\rho_{n},\left\|x_{n+1}\right\|\} $$
and update n → n + 1.
As discussed in [5], this algorithm is based on the combination of ideas from two well-known algorithms, namely the hybrid steepest descent method of Yamada [19] and the scheme of Iusem and Sosa from [7]. Note that the approximate maximum point yn can be computed effectively due to the error εn whenever the latter is strictly positive and f(⋅,xn) comes with a modulus of uniform continuity on Kn while for εn = 0 there in general would be no computable point yn (see [8] for a discussion of this point in terms of complexity theory).
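The steps of Algorithm 2 can be sketched in one dimension as follows. Everything concrete here is a hypothetical toy instance and not taken from [5]: the function f(x,y) = y² − x², the mapping T (projection onto [−1,2]), the constant step size and the closed-form maximiser. In this toy case the exact maximiser is available even for \(\varepsilon_{n}=0\), which in general it would not be.

```python
from typing import Callable, List


def subgradient_type_method(
    f: Callable[[float, float], float],
    subgrad_second_arg: Callable[[float, float], float],
    T: Callable[[float], float],
    approx_argmax: Callable[[float, float, float], float],
    x0: float,
    step: Callable[[int], float],
    err: Callable[[int], float],
    iterations: int,
) -> List[float]:
    """One-dimensional sketch of Algorithm 2; returns x_0, ..., x_iterations."""
    xs = [x0]
    rho = abs(x0)
    for n in range(iterations):
        x = xs[-1]
        # y_n in K_n = [-(rho_n + 1), rho_n + 1], eps_n-maximal for f(., x_n).
        y = approx_argmax(x, rho + 1.0, err(n))
        # xi_n: a subgradient of f(y_n, .) at x_n.
        xi = subgrad_second_arg(y, x)
        x_next = T(x - step(n) * f(y, x) * xi)
        rho = max(rho, abs(x_next))
        xs.append(x_next)
    return xs


# Toy instance (all choices are assumptions): f(x, y) = y^2 - x^2.
f = lambda x, y: y * y - x * x
xs = subgradient_type_method(
    f=f,
    subgrad_second_arg=lambda y, x: 2.0 * x,   # derivative of u -> f(y, u) at u = x
    T=lambda x: min(max(x, -1.0), 2.0),        # projection onto [-1, 2]
    approx_argmax=lambda x, radius, eps: 0.0,  # exact maximiser of y -> f(y, x) on K_n
    x0=5.0,
    step=lambda n: 0.01,                       # lies in (0, 2/M^2) for M = 10
    err=lambda n: 0.0,
    iterations=2000,
)
```

In this instance the iterates decrease monotonically towards 0, the only point of Fix(T) with f(0,u) ≤ 0, but only at a rate of order \(1/\sqrt{n}\).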
In [5], the following theorem on the correctness of the algorithm is established:
Theorem 3 (Iiduka and Yamada [5])
Let Fix(T)≠∅. Assume that there is an M > 0 with \(\left \|\xi _{n}\right \|\leq M\) for all \(n\in \mathbb {N}\). Then, the sequences \((x_{n})_{n\in \mathbb {N}},(y_{n})_{n\in \mathbb {N}}\) generated by the algorithm satisfy:
- (a) For all \(u\in {\Omega }_{n}:=\{u^{\prime }\in \text {Fix}(T)\mid f(y_{n},u^{\prime })\leq 0\}\):
$$ \left\|x_{n+1}-u\right\|^{2}\leq\left\|x_{n}-u\right\|^{2}+\lambda_{n}(M^{2}\lambda_{n}-2)(f(y_{n},x_{n}))^{2}. $$
In particular, if \(\lambda_{n}\in [0,2/M^{2}]\):
$$ \left\|x_{n+1}-u\right\|\leq\left\|x_{n}-u\right\|. $$
- (b) If \({\Omega }:=\bigcap _{n=1}^{\infty }{\Omega }_{n}\neq \emptyset \) and \(\lambda _{n}\in [a,b]\subseteq (0,2/M^{2})\) for some a,b ≥ 0 and all \(n\in \mathbb {N}\), then the sequences \((x_{n})_{n\in \mathbb {N}},(y_{n})_{n\in \mathbb {N}}\) are bounded and
$$ \underset{n\to\infty}{\lim}f(y_{n},x_{n})=0\text{ as well as }\underset{n\to\infty}{\lim}\left\|x_{n}-Tx_{n}\right\|=0. $$
- (c) If \(\varepsilon_{n}\geq 0\) for all n with \(\lim _{n\to \infty }\varepsilon _{n}=0\), in addition to the requirements for (b), then \((x_{n})_{n\in \mathbb {N}}\) converges to a point in EP(Fix(T),f).
In this paper, we establish quantitative versions of the claims in this theorem.
1.3 The range of the results
First, let us stress that considering only fixed point sets Fix(T) of firmly nonexpansive mappings T in Problem 1 is not a limitation: it still allows arbitrary closed and convex sets C in place of Fix(T), since for any such C the metric projection \(P_{C}\) is a firmly nonexpansive mapping (see, e.g. [1]) with \(\text {Fix}(P_{C})=C\). See [5] for further considerations on this version of Problem 1 over C.
From that perspective, Problem 1 can be seen to indeed encompass many general notions and problems from convex optimization as special cases, including in particular the famous Nash-equilibrium problem (as treated in [5]) as well as the convex minimization problem, the variational inequality problem and the vector minimization problem, next to others (see [7]).
Moreover, allowing arbitrary firmly nonexpansive mappings T in place of plain projections \(P_{C}\) can be beneficial in the concrete practical formulation of particular equilibrium problems, as Iiduka and Yamada show in [5] for the previously mentioned Nash-equilibrium problem. There, one deals with sets C for which \(P_{C}\) may be computationally intractable, while C is given as (the intersection of) simple closed convex sets \(C_{i}\) whose projections \(P_{C_{i}}\) are tractable. In this situation, a firmly nonexpansive mapping T can be defined from the tractable projections \(P_{C_{i}}\) which is not a projection itself but fulfills Fix(T) = C and inherits tractability from the \(P_{C_{i}}\).
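One standard way to build such a mapping, which is not necessarily the construction used in [5], is to average the projections: a convex combination of firmly nonexpansive mappings is again firmly nonexpansive, and if the sets intersect, its fixed point set is their intersection. The following sketch illustrates this for two hypothetical intervals on the real line.

```python
def project_onto_interval(x: float, lo: float, hi: float) -> float:
    """Metric projection of x onto the closed interval [lo, hi]."""
    return min(max(x, lo), hi)


def averaged_projection(x: float) -> float:
    """T := (P_{C_1} + P_{C_2}) / 2 for the assumed sets C_1 = [0, 2], C_2 = [1, 3]."""
    return 0.5 * (project_onto_interval(x, 0.0, 2.0) + project_onto_interval(x, 1.0, 3.0))


# Iterating T from a point outside both sets approaches C_1 ∩ C_2 = [1, 2].
x = 5.0
for _ in range(60):
    x = averaged_projection(x)
```

Points of the intersection such as 1.5 are fixed by T, while a point such as 0.5, which lies in only one of the sets, is moved.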
Further, many practical choices of such sets C from convex optimization already lend themselves to representations as fixed point sets of firmly nonexpansive mappings, a prime example being the set of zeros zer A of a monotone (or accretive) operator A. These zero sets can be expressed as the set of fixed points of the resolvent \(J_{A}\) corresponding to A which is, in particular, firmly nonexpansive (see [1] for a comprehensive reference on monotone operators).
We expect that various other algorithms for equilibrium problems over suitable sets C can be treated by an analysis similar to the one provided in this paper (using [11, 13]).
2 A first quantitative analysis
A first consequence of Theorem 3 is the following reformulation of (parts of) part (a).
Lemma 4
Let u ∈ Ω ≠ ∅ and M > 0 with \(M\geq \left \|\xi _{n}\right \|\) for all \(n\in \mathbb {N}\). Suppose \(\lambda_{n}\in [0,2/M^{2}]\) for all \(n\in \mathbb {N}.\) Then,
$$ \left\|x_{n+1}-u\right\|^{2}\leq\left\|x_{n}-u\right\|^{2} $$
for all \(n\in \mathbb {N}\). In particular, \(\left \|x_{n}-u\right \|^{2}\) converges.
As the required sequence is monotone, we can obtain a direct rate of metastability for the sequence \(\left (\left \|x_{n}-u\right \|^{2}\right )_{n\in \mathbb {N}}\) from the next lemma which follows immediately from [9], Proposition 2.27 and Remark 2.29.
Lemma 5 (Quantitative version of Lemma 4)
Let u ∈ Ω ≠ ∅ with \(c_{u}\geq \left \|x_{0}-u\right \|^{2}\) and let M > 0 with \(M\geq \left \|\xi _{n}\right \|\) for all \(n\in \mathbb {N}\). Furthermore, let \(\lambda_{n}\in [0,2/M^{2}]\) for all \(n\in \mathbb {N}\). Then, for all \(k,K\in \mathbb {N}\) and all \(g\in \mathbb {N}^{\mathbb {N}}\):
where
Here, \(\tilde {g}^{(n)}(K)\) denotes the n-th iteration of \(\tilde {g}\) starting from K. For the special case of K = 0, we simply write \({\Phi }_{1}(k,g,c_{u}):={\Phi }^{\prime }_{1}(k,g,c_{u},0)\).
Proof
The proof given in [9], Proposition 2.27 and Remark 2.29, only provides the case for K = 0. It is, however, immediately apparent from the proof given there that the argument of \(\tilde {g}^{(\left \lceil {c_{u}(k+1)}\right \rceil )}\) can be chosen to be an arbitrary \(K\in \mathbb {N}\). Similarly, it follows from said proof that the resulting n is then of the form \(n=\tilde {g}^{(i)}(K)\) for some \(i\leq \left \lceil {c_{u}(k+1)}\right \rceil \). Therefore, in particular also n ≥ K by construction of \(\tilde {g}\). □
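The iteration described in this proof can be sketched as follows. We assume here, as is usual for results of this type, that \(\tilde {g}(n):=n+g(n)\); this definition, the function names and the sample sequence are assumptions for illustration. For a nonincreasing sequence in [0,c], the sketch runs through the candidates \(n=\tilde {g}^{(i)}(K)\) with \(i\leq \left \lceil c(k+1)\right \rceil \) and returns the first one for which the sequence drops by at most 1/(k + 1) on [n;n + g(n)].

```python
import math
from typing import Callable, Optional, Sequence, Tuple


def iterate_g_tilde(g: Callable[[int], int], times: int, start: int) -> int:
    """Apply g~(n) := n + g(n) `times` times, starting from `start`."""
    n = start
    for _ in range(times):
        n = n + g(n)
    return n


def monotone_witness(
    a: Sequence[float], k: int, g: Callable[[int], int], c: float, K: int = 0
) -> Tuple[Optional[int], int]:
    """Search n = g~^(i)(K), i <= ceil(c (k+1)), with a[n] - a[n + g(n)] <= 1/(k+1).

    `a` is assumed to be nonincreasing with values in [0, c] and long enough
    to contain all inspected indices. Returns (witness, bound), where
    bound = g~^(ceil(c (k+1)))(K).
    """
    m = math.ceil(c * (k + 1))
    bound = iterate_g_tilde(g, m, K)
    n = K
    for _ in range(m + 1):
        # For a nonincreasing sequence this difference bounds |a_i - a_j|
        # for all i, j in [n; n + g(n)].
        if a[n] - a[n + g(n)] <= 1.0 / (k + 1):
            return n, bound
        n = n + g(n)
    return None, bound
```

A witness must exist among these candidates: if each of the first \(\left \lceil c(k+1)\right \rceil \) windows saw a drop of more than 1/(k + 1), the sequence would fall by more than c in total.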
A lemma used in the proof (in [5]) of Theorem 3, part (b) and (c), is the following which is a direct corollary of part (a).
Lemma 6 (Iiduka and Yamada [5, p. 257])
Let u ∈Ω≠∅ and M > 0 with \(M\geq \left \|\xi _{n}\right \|\) for all \(n\in \mathbb {N}\). Let further \(\lambda _{n}\in [a,b]\subseteq (0,2/M^{2})\) for all \(n\in \mathbb {N}\). Then,
for all \(n\in \mathbb {N}\).
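A bound of this kind can be derived from part (a) of Theorem 3 as sketched below. The explicit constant \(a(2-M^{2}b)\) is our own reading of the corollary under the assumption \(\lambda _{n}\in [a,b]\subseteq (0,2/M^{2})\) and is not quoted from [5].

```latex
\begin{align*}
\|x_{n+1}-u\|^{2}
  &\leq \|x_{n}-u\|^{2}+\lambda_{n}(M^{2}\lambda_{n}-2)\,(f(y_{n},x_{n}))^{2}
  && \text{(Theorem 3 (a))}\\
\lambda_{n}(2-M^{2}\lambda_{n})
  &\geq a\,(2-M^{2}b) > 0
  && \text{(as } a\leq\lambda_{n}\leq b<2/M^{2}\text{)}\\
(f(y_{n},x_{n}))^{2}
  &\leq \frac{\|x_{n}-u\|^{2}-\|x_{n+1}-u\|^{2}}{a\,(2-M^{2}b)}
\end{align*}
```

The last line follows by moving the negative term of the first line to the left-hand side and bounding its coefficient from below by the second line.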
Using this lemma, we obtain the following quantitative analysis of the convergence of f(yn,xn) towards 0.
Proposition 7 (Quantitative version of Theorem 3, part (b), I)
Let u ∈ Ω ≠ ∅, \(c_{u}\geq \left \|x_{0}-u\right \|^{2}\), and let M > 0 with \(M\geq \left \|\xi _{n}\right \|\) for all \(n\in \mathbb {N}\). Furthermore, let \(\lambda _{n}\in [a,b]\subseteq (0,2/M^{2})\) for all \(n\in \mathbb {N}\). Then, for all \(k\in \mathbb {N}\) and all \(g\in \mathbb {N}^{\mathbb {N}}\):
where
with Φ1 as in Lemma 5.
Proof
Let \(k\in \mathbb {N}\) and \(g\in \mathbb {N}^{\mathbb {N}}\) be arbitrary.
By Lemma 5
such that for all i ∈ [n;n + g(n)] (as then i + 1 ∈ [n;n + g(n) + 1]):
where the first inequality follows from Lemma 6. From this the claim is immediate. □
Lemma 8 (Iiduka and Yamada [5, p. 257])
Let u ∈Ω≠∅ be arbitrary and M > 0 with \(M\geq \left \|\xi _{n}\right \|\) for all \(n\in \mathbb {N}\). Let further \(\lambda _{n}\in [a,b]\subseteq (0,2/M^{2})\) for all \(n\in \mathbb {N}\).
- (i) For all \(n\in \mathbb {N}\):
$$ \left\|x_{n}-x_{n+1}\right\|^{2}\leq\left\|x_{n}-u\right\|^{2}-\left\|x_{n+1}-u\right\|^{2}+2Mbf(y_{n},x_{n})\left\|x_{n}-x_{n+1}\right\|. $$
In particular, if \(L\geq \text {diam}\{x_{n}\mid n\in \mathbb {N}\}\):
$$ \left\|x_{n}-x_{n+1}\right\|^{2}\leq\left\|x_{n}-u\right\|^{2}-\left\|x_{n+1}-u\right\|^{2}+2MbLf(y_{n},x_{n}). $$
- (ii) For all \(n\in \mathbb {N}\):
$$ \left\|x_{n+1}-Tx_{n+1}\right\|\leq\left\|x_{n}-x_{n+1}\right\|+Mbf(y_{n},x_{n}). $$
Proposition 9 (Quantitative version of Theorem 3, part (b), II)
Let u ∈ Ω ≠ ∅, \(c_{u}\geq \left \|x_{0}-u\right \|^{2}\), \(L\geq \text {diam}\{x_{n}\mid n\in \mathbb {N}\}\) as well as M > 0 with \(M\geq \left \|\xi _{n}\right \|\) for all \(n\in \mathbb {N}\). Furthermore, let \(\lambda _{n}\in [a,b]\subseteq (0,2/M^{2})\) for all \(n\in \mathbb {N}\). Then, for all \(k\in \mathbb {N}\) and all \(g\in \mathbb {N}^{\mathbb {N}}\):
where we have
with \(g^{\prime }(n):=g(n+1)+1\) as well as
with Φ1 as in Lemma 5.
Proof
Let \(k\in \mathbb {N}\) and \(g\in \mathbb {N}^{\mathbb {N}}\) be arbitrary. As an abbreviation, we write
Using Lemma 6, we at first have
Furthermore, we have (using Lemma 8) for any u ∈Ω:
and therefore (using (∗1)):
for all \(n\in \mathbb {N}\).
By Lemma 5, we have that
such that (using (∗2)) for all i ∈ [m;m + g(m + 1)]:
If we define n = m + 1, then n ≤Φ3(k,g,cu) and
and thus by the above we have
for all i ∈ [n;n + g(n)]. □
Remark 10
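Note that a bound L on the diameter of \((x_{n})_{n\in \mathbb {N}}\) as used in Proposition 9 as an input can actually be obtained in terms of \(c_{u}\) by setting \(L:=2\sqrt {c_{u}}\) as we have (using Lemma 4) for all \(n,m\in \mathbb {N}\):
$$ \left\|x_{n}-x_{m}\right\|\leq\left\|x_{n}-u\right\|+\left\|x_{m}-u\right\|\leq 2\left\|x_{0}-u\right\|\leq 2\sqrt{c_{u}}. $$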
To obtain a rate of metastability for the sequence \((x_{n})_{n\in \mathbb {N}}\), we apply recent results of Kohlenbach, Leuştean and Nicolae [11] on Fejér-monotone sequences. Another application of these results is the derivation of a quantitative version of the asymptotic regularity of compositions of two mappings (see [12]). We recall the definition of Fejér monotonicity.
Definition 11
Let (X,d) be a metric space, \(F\subseteq X\) nonempty and \((x_{n})_{n\in \mathbb {N}}\) be a sequence in X. \((x_{n})_{n\in \mathbb {N}}\) is called Fejér-monotone with respect to F, if for all \(n\in \mathbb {N}\) and all p ∈ F:
$$ d(x_{n+1},p)\leq d(x_{n},p). $$
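For a finite prefix of a sequence in \(\mathbb {R}^{N}\) and finitely many sample points of F, the defining inequality \(d(x_{n+1},p)\leq d(x_{n},p)\) can be checked directly. The sketch below does this for the Euclidean metric; the function names and the tolerance are assumptions, and passing the check on samples is of course only a necessary condition.

```python
import math
from typing import Iterable, List, Sequence

Point = Sequence[float]


def dist(p: Point, q: Point) -> float:
    """Euclidean distance between two points of the same dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def is_fejer_monotone(
    xs: List[Point], F_sample: Iterable[Point], tol: float = 1e-12
) -> bool:
    """Check d(x_{n+1}, p) <= d(x_n, p) for all consecutive iterates and sampled p."""
    F_points = list(F_sample)
    return all(
        dist(xs[n + 1], p) <= dist(xs[n], p) + tol
        for n in range(len(xs) - 1)
        for p in F_points
    )
```

For instance, the points (4,0), (2,0), (1,0), (0.5,0) pass the check with respect to the sample points (0,0) and (−1,0), while a sequence moving from (1,0) to (2,0) fails it for (0,0).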
The authors in [11] actually introduce a generalized form of Fejér monotonicity, but for the purpose of this work, the above is enough. However, we pass to the notion of uniform Fejér monotonicity, as introduced in [11], to formulate the (following) quantitative results.
For this, one considers approximations of the approached set F in the form of a descending sequence of sets
$$ AF_{k}\subseteq X,\quad AF_{k}\supseteq AF_{k+1} $$
for \(k\in \mathbb {N}\) with
$$ F=\bigcap_{k\in\mathbb{N}}AF_{k}. $$
Definition 12
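\((x_{n})_{n\in \mathbb {N}}\) is called uniformly Fejér monotone with respect to F and \((AF_{k})_{k\in \mathbb {N}}\) if for all \(r,n,m\in \mathbb {N}\):
$$ \exists k\in\mathbb{N}\,\forall p\in AF_{k}\,\forall l\leq m\left(d(x_{n+l},p)<d(x_{n},p)+\frac{1}{r+1}\right). $$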
Any function χ(n,m,r) producing such a \(k\in \mathbb {N}\) is called a modulus of \((x_{n})_{n\in \mathbb {N}}\) being uniformly Fejér monotone.
Under the assumption of (X,d) being boundedly compact, the authors of [11] obtain (in a slightly generalized setting) an explicit effective rate of metastability for the sequence \((x_{n})_{n\in \mathbb {N}}.\) This rate only depends on the particular uniform quantitative reformulations of the assumptions of the setting such as a modulus of uniform Fejér monotonicity and some further quantitative information on the space (X,d) and on how the sequence \((x_{n})_{n\in \mathbb {N}}\) approaches the set F.
With quantitative information on the space (X,d), we here mean explicitly a modulus of total boundedness (as defined in [11]) or a ‘modulus of bounded compactness’. This will be discussed in the proof of Theorem 17.
With quantitative information on how the sequence \((x_{n})_{n\in \mathbb {N}}\) approaches the set F, we mean a bound on \((x_{n})_{n\in \mathbb {N}}\) having approximate F-points. For this, recall the following definition from [11]:
Definition 13
\((x_{n})_{n\in \mathbb {N}}\) has approximate F-points if \(\forall k\in \mathbb {N} \exists N\in \mathbb {N}\left (x_{N}\in AF_{k}\right )\). A bound Φ(k) on ‘\(\exists N\in \mathbb {N}\)’ is called an approximate F-point bound.
As a feasibility check on whether these results can be applied and whether the setup of [5] fits into the above framework, note that part (a) of Theorem 3 can be seen (modulo some refinement of the approximations \(\Omega_{n}\)) as hinting at the uniform Fejér monotonicity of \((x_{n})_{n\in \mathbb {N}}\) (as the sequence from Algorithm 2) with respect to the set Ω being taken as F.
We at first focus on whether (quantitative versions of) these properties of uniform Fejér monotonicity and approximate F-/Ω-points can be obtained by suitably modifying the approximations Ωn.
For this, we need to weaken the conditions of \(\Omega_{n}\) to allow \((x_{n})_{n\in \mathbb {N}}\) to lie in them further along the approximation. As none of these \(x_{n}\) is expected to be a fixed point of T or to satisfy \(f(y_{m},x_{n})\leq 0\), we weaken these properties to that of being an approximate fixed point and to \(f(y_{m},u)\leq \frac {1}{k+1}\), respectively. Part (b) of Theorem 3 gives, as a feasibility check, that in the long run \(f(y_{n},x_{n})\) is expected to decrease and that the sequence \((x_{n})_{n\in \mathbb {N}}\) contains better and better approximate fixed points of T.
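Using this motivation, we define
$$ {\Omega}^{\prime}_{k}:=\left\{u\in\mathbb{R}^{N}\,\middle|\,\left\|u-Tu\right\|\leq\frac{1}{k+1}\text{ and }f(y_{j},u)\leq\frac{1}{k+1}\text{ for all }j\leq k\right\}, $$
which plays the role of \(AF_{k}\). By construction, we naturally have that \(({\Omega }^{\prime }_{k})_{k\in \mathbb {N}}\) is descending and
$$ {\Omega}=\bigcap_{k\in\mathbb{N}}{\Omega}^{\prime}_{k}. $$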
Furthermore, we obtain the following lemma giving a quantitative version of \((x_{n})_{n\in \mathbb {N}}\) having approximate Ω-points with respect to \(\left ({\Omega }^{\prime }_{k}\right )_{k\in \mathbb {N}}\) (modulo some quantitative reformulations of the parameters of Algorithm 2).
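To illustrate what an approximate Ω-point is, the following one-dimensional sketch tests membership in \({\Omega }^{\prime }_{k}\) and searches a finite prefix of iterates for the first such point. We assume the reading suggested by the proof of Lemma 28, namely that \(u\in {\Omega }^{\prime }_{k}\) means \(\left \|u-Tu\right \|\leq \frac {1}{k+1}\) and \(f(y_{j},u)\leq \frac {1}{k+1}\) for all j ≤ k; the function names are hypothetical.

```python
from typing import Callable, Optional, Sequence


def in_omega_prime(
    u: float,
    k: int,
    T: Callable[[float], float],
    f: Callable[[float, float], float],
    ys: Sequence[float],
) -> bool:
    """u in Omega'_k under the assumed reading:
    |u - T u| <= 1/(k+1) and f(y_j, u) <= 1/(k+1) for all j <= k.
    `ys` must contain at least y_0, ..., y_k."""
    eps = 1.0 / (k + 1)
    if abs(u - T(u)) > eps:
        return False
    return all(f(ys[j], u) <= eps for j in range(k + 1))


def first_approximate_point(
    xs: Sequence[float],
    k: int,
    T: Callable[[float], float],
    f: Callable[[float, float], float],
    ys: Sequence[float],
) -> Optional[int]:
    """Least index n with x_n in Omega'_k, or None on this finite prefix."""
    for n, x in enumerate(xs):
        if in_omega_prime(x, k, T, f, ys):
            return n
    return None
```

An approximate Ω-point bound in the sense of Definition 13 bounds the index returned by `first_approximate_point` in advance, without inspecting the iterates.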
Lemma 14
Let u ∈ Ω ≠ ∅ with \(c_{u}\geq \left \|x_{0}-u\right \|^{2}\) and let M > 0 with \(M\geq \left \|\xi _{n}\right \|\) for all \(n\in \mathbb {N}\). Let \(\lambda _{n}\in [a,b]\subseteq (0,2/M^{2})\) for all \(n\in \mathbb {N}\) and let \(L\geq \text {diam}\{x_{n}\mid n\in \mathbb {N}\}\). Furthermore, let \(\varepsilon_{n}\geq 0\) for all \(n\in \mathbb {N}\) and let \(\varepsilon_{n}\to 0\) (\(n\to \infty \)) where τ is a nondecreasing rate of convergence for \(\varepsilon_{n}\to 0\) (\(n\to \infty \)), that is τ(k + 1) ≥ τ(k) and
$$ \forall k\in\mathbb{N}\,\forall n\geq\tau(k)\left(\varepsilon_{n}\leq\frac{1}{k+1}\right). $$
Then,
where
with
Proof
Let \(k\in \mathbb {N}\). We again write
Again by Lemma 6, we have
and as in the proof of Proposition 9, we obtain
Note that for \(\overline {2}:n\mapsto 2\), we have
for any \(K\in \mathbb {N}\) and so
Hence, by Lemma 5 (applied to j := i + 1 and \(K:=\max \limits \{ k,\tau (2k+1)\}\)), we have that ∃n ∈ [K;Φ(k,a,b,M,L,τ,cu) − 1]∀i ∈ [n;n + 1]
By \(n+1\geq n\geq \max \limits \{k,\tau (2k+1)\}\) we get \(\varepsilon _{n+1}\leq \frac {1}{2(k+1)}\). Therefore, using (∗1) and (∗2):
as σ(a,b,M,L) ≥ 1.
By the above, we have
separately as f(yn+ 1,xn+ 1) ≥ 0 by definition from Algorithm 2. Also by the definition of Algorithm 2, we have
By definition of the Kj, as \(K_{j}\subseteq K_{j+1}\) and yj ∈ Kj, we have \(y_{0},\dots ,y_{n+1}\in K_{n+1}\). Therefore, we have especially
for all j ≤ n + 1 and as \(n+1\geq n\geq \max \limits \{k,\tau (2k+1)\}\geq k\) by definition of n, we have \(x_{n+1}\in {\Omega }^{\prime }_{k}\) as well as n + 1 ≤Φ(k,a,b,M,L,τ,cu). □
The next two lemmas now give the quantitative version of the uniform Fejér monotonicity of \((x_{n})_{n\in \mathbb {N}}\) with respect to Ω and \(\left ({\Omega }^{\prime }_{k}\right )_{k\in \mathbb {N}}\).
Lemma 15
Let M > 0 with \(M\geq \left \|\xi _{n}\right \|\) for all \(n\in \mathbb {N}\) and let \(\lambda _{n}\in [a,b]\subseteq (0,2/M^{2})\) for all \(n\in \mathbb {N}\). Now, let \(n\in \mathbb {N}\) be fixed. For any k ≥ n and any \(u\in {\Omega }^{\prime }_{k}\):
In particular, we have
for all \(u\in {\Omega }^{\prime }_{k}\) and for all \(l\in \mathbb {N}\) with n + l ≤ k + 1 (where for l = 0 the sum is 0).
Proof
We give a quantitative analysis of the proof of (3.6) in [5]. At first, note that ξn ∈ ∂f(yn,⋅)(xn) by the definition of Algorithm 2. Thus, by the definition of the subgradient, we have especially
and thus
for all \(u\in {\Omega }^{\prime }_{k}\).
Therefore, we have:
From this, it naturally follows that
The claim
follows from this by induction on l ≥ 1 with n + l ≤ k + 1 (the case of l = 0 is trivial). □
Lemma 16
Let M > 0 with \(M\geq \left \|\xi _{n}\right \|\) for all \(n\in \mathbb {N}\) and let \(\lambda _{n}\in [a,b]\subseteq (0,2/M^{2})\) for all \(n\in \mathbb {N}\). Furthermore, let \(e\geq f(y_{n},x_{n})\) for all \(n\in \mathbb {N}\). Then, \((x_{n})_{n\in \mathbb {N}}\) is uniformly Fejér monotone with modulus χ(n,m,r), that is for all \(r,n,m\in \mathbb {N}\):
where
Proof
Fix \(r,n,m\in \mathbb {N}\) and assume m ≥ 1 without loss of generality. Let k = χ(n,m,r), \(u\in {\Omega }^{\prime }_{k}\) and l ≤ m. By Lemma 15, we have (as k + 1 ≥ n + m ≥ n + l by definition)
and by using f(yn,xn) ≤ e, we obtain
from (∗) as l ≤ m. As
we obtain
from this. □
Applying Theorem 5.1 from [11], we now obtain a rate of metastability for \((x_{n})_{n\in \mathbb {N}}\).
Theorem 17 (Quantitative version of Theorem 3, part (c), I)
Let e ≥ f(yn,xn) and \(M\geq \left \|\xi _{n}\right \|\) for all \(n\in \mathbb {N}\). Also, let \(\lambda _{n}\in [a,b]\subseteq (0,2/M^{2})\) for all \(n\in \mathbb {N}\) as well as \(L\geq \text {diam}\{x_{n}\mid n\in \mathbb {N}\}\). Furthermore, let εn ≥ 0, εn → 0 \((n\to \infty )\) and τ be a nondecreasing rate of convergence for εn → 0 (\(n\to \infty \)). Let u ∈Ω≠∅ with \(c_{u}\geq \left \|{x_{0}-u}\right \|^{2}\).
Then, for all \(k\in \mathbb {N}\) and all \(g\in \mathbb {N}^{\mathbb {N}}\):
for Σ(k,g) := Σ0(P(k),k,g,χ,Φ) with
and
where we define
as well as
Proof
The proof is an application of Theorem 5.1 from [11] with \(X:=\overline {B_{L}(x_{0})}\) and \(F:={\Omega }\cap X, AF_{k}:={\Omega }^{\prime }_{k}\cap X\) (and G := H := Id). Here, we use that we have
as we assume \(L\geq \text {diam}\{x_{n}\mid n\in \mathbb {N}\}\) and we have \(\left \|x_{n}-x_{0}\right \|\leq \text {diam}\{x_{n}\mid n\in \mathbb {N}\}\) by definition of the diameter. By Example 2.8 of [11], the function \(\gamma (k):=\left \lceil {2(k+1)\sqrt {N}L}\right \rceil ^{N}\) is a modulus of total boundedness of \(\overline {B_{L}(0)}\) and, considering the definition of (II)-moduli of total boundedness from [11], it is straightforward to see that these moduli are ‘translation-invariant’ in the case of normed vector spaces, i.e. any (II)-modulus of total boundedness for a set \(A\subseteq \mathbb {R}^{N}\) is also a (II)-modulus of total boundedness for A + v := {a + v∣a ∈ A} with \(v\in \mathbb {R}^{N}\). In our situation, we thus have that the particular γ is also a (II-)modulus of total boundedness for \(\overline {B_{L}(x_{0})}=\overline {B_{L}(0)}+x_{0}\). By Lemma 14, Φ is an approximate F-point bound and by Lemma 16, χ is a modulus for \((x_{n})_{n\in \mathbb {N}}\) being uniformly Fejér monotone w.r.t. F (and AFk). Applying Theorem 5.1 with γ,Φ,χ gives the result. □
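The modulus of total boundedness \(\gamma (k):=\left \lceil {2(k+1)\sqrt {N}L}\right \rceil ^{N}\) used in this proof is easy to evaluate. The following sketch (the function name is an assumption) makes its exponential dependence on the dimension N visible.

```python
import math


def total_boundedness_modulus(k: int, N: int, L: float) -> int:
    """gamma(k) := ceil(2 (k+1) sqrt(N) L)^N, as stated in the proof of Theorem 17."""
    return math.ceil(2 * (k + 1) * math.sqrt(N) * L) ** N
```

Since γ enters the rate Σ through the number of iterations, the resulting bounds grow quickly with N even for moderate k and L.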
Remark 18
In the above theorem, we can obtain a bound e on f(yn,xn) in terms of cu,a,b and M by setting
as we have (using Lemma 6):
Remark 19
The complexity of our rate of metastability is mainly given by the fact that the function Φ ∘ χg and so, in particular, the ‘counterfunction’ g gets iterated in the definition of Σ0. Some iteration of this sort, however, is unavoidable as the counterexample given in [15] to the computability of the rate of convergence (already for N = 1 and f = 0) shows that an extremely special case of the algorithm studied computes the limit of a decreasing sequence in [0,1] whose rate of metastability necessarily needs this iteration process (see the discussion on p.4 of [11]). In the next section we show that a low-complexity rate of full convergence results under an additional metric regularity assumption.
3 Adding further assumptions
In this section, we investigate two sets of assumptions to strengthen Theorem 17.
3.1 Uniform closedness
In [11], the authors introduce the notion of uniform closedness, an additional assumption on the way the sets AFk approach the set F (using the notation of the previous general setting of Definition 13).
We recall the corresponding definition.
Definition 20
Let (X,d) be a metric space and \(F\subseteq X\) be nonempty. Let \(AF_{k}\subseteq X\) be closed with \(AF_{k}\supseteq AF_{k+1}\) and \(F=\bigcap _{k\in \mathbb {N}}AF_{k}\). F is called uniformly closed for \((AF_{k})_{k\in \mathbb {N}}\) with moduli \(\delta _{F},\omega _{F}:\mathbb {N}\to \mathbb {N}\) if
Under the assumption of uniform closedness, the authors obtain Theorem 5.3 in [11] as a strengthening of Theorem 5.1. In the following, we will observe that, under further quantitative assumptions on the equilibrium function f, Ω is uniformly closed with the previously defined approximations \(\left ({\Omega }^{\prime }_{k}\right )_{k\in \mathbb {N}}\) and compute the corresponding moduli of uniform closedness.
Definition 21
Let \(g:D\subseteq \mathbb {R}^{N}\to \mathbb {R}\) be uniformly continuous. A modulus of uniform continuity for g on D is a function \(\sigma :\mathbb {N}\to \mathbb {N}\) such that
This notion of modulus of uniform continuity differs from the commonly known modulus of continuity in (numerical) analysis but is commonly used in computable and constructive analysis as well as in proof mining (see, e.g. [2, 9, 18]).
We then obtain the following result giving corresponding moduli of uniform closedness in terms of moduli of uniform continuity.
Lemma 22
Let \((y_{j})_{j\in \mathbb {N}}\) be a sequence in \(\mathbb {R}^{N}\) (defining \({\Omega }^{\prime }_{k}\)) and let σj be moduli of uniform continuity for f(yj,⋅) for all \(j\in \mathbb {N}\) on some subset \(D\subseteq \mathbb {R}^{N}\). Then,
where
Proof
Let \(k\in \mathbb {N}\) and let \(u,u^{\prime }\in D\) with \(u^{\prime }\in {\Omega }^{\prime }_{2k+1}\) and
\(u^{\prime }\in {\Omega }^{\prime }_{2k+1}\) is by definition equivalent to
As T is especially nonexpansive, we have \(\left \|Tu-Tu^{\prime }\right \|\leq \left \|u-u^{\prime }\right \|\). As ωΩ(k) ≥ 4k + 3, we have further
and thus
As \(\omega _{\Omega }(k)\geq \sigma ^{max}_{k}(2k+1)\geq \sigma _{i}(2k+1)\) for all i ≤ k by assumption, we have
for all i ≤ k and as σi is a modulus of uniform continuity for f(yi,⋅), we have
for all i ≤ k. Thus, by definition we have \(u\in {\Omega }_{k}^{\prime }\). □
Using these moduli, we obtain the following strengthening of Theorem 17 in correspondence to Theorem 5.3 instead of Theorem 5.1 (of [11]).
Theorem 23 (Quantitative version of Theorem 3, part (c), II)
In addition to the assumptions of Theorem 17, let σj be a modulus of uniform continuity for f(yj,⋅) for any \(j\in \mathbb {N}\) on \(\overline {B_{L}(x_{0})}\).
Then, for all \(k\in \mathbb {N}\) and all \(g\in \mathbb {N}^{\mathbb {N}}\):
for \(\tilde {\Sigma }(k,g):={\Sigma }_{0}(P(k_{0}),k_{0},g,\chi _{k},{\Phi })\) with P,χ,Φ and Σ0 as in Theorem 17 as well as
where we define
Proof
Apply Theorem 5.3 of [11] under the same considerations as in the proof of Theorem 17, using Lemma 22 with \(D:=\overline {B_{L}(x_{0})}.\) The Lemmas 14 and 16 apply as before. □
Remark 24
This theorem is a finitization of Theorem 3, part (c) as it (ineffectively, but elementarily) implies back the statement of (c): the metastability trivially implies the Cauchy statement (and thus convergence) of \((x_{n})_{n\in \mathbb {N}}\). Furthermore, for \(M\in \mathbb {N}\) and g : n↦M, Theorem 23 gives \(\exists i\geq M\left (x_{i}\in {\Omega }_{k}^{\prime }\right )\). Thus, as \({\Omega }_{k}^{\prime }\) is closed, we have \(x:=\lim _{n\to \infty }x_{n}\in {\Omega }_{k}^{\prime }\) and as k was arbitrary, we have x ∈ Ω by \({\Omega }=\bigcap _{k\in \mathbb {N}}{\Omega }^{\prime }_{k}\). As in [5], p. 258, it follows by elementary reasoning that x ∈ EP(Fix(T),f).
3.2 Regularity conditions
Using the recent quantitative treatment [13] of very general scenarios of regularity conditions in the context of Fejér monotone sequences, we can give an improvement of Theorems 17 and 23 by adding assumptions on (a quantitative version of) a regularity condition for Ω and obtain (under this assumption) even rates of convergence for the sequence approximating an equilibrium point.
Central for the further results is the following quantitative version of regularity, defined as modulus of regularity in [13]. For this, given a function \(F:\mathbb {R}^{N}\to \mathbb {R}\), we write zerF for the set of zeros of F.
Definition 25 ([13])
Let \(F:\mathbb {R}^{N}\to \mathbb {R}\) be a function with zer F ≠ ∅ and fix z ∈ zer F and r > 0. A function \(\phi :(0,\infty )\to (0,\infty )\) is a modulus of regularity for F w.r.t. zer F and \(\overline {B_{r}(z)}\) if for all ε > 0 and all \(x\in \overline {B_{r}(z)}\):
$$ |F(x)|<\phi(\varepsilon)\;\Rightarrow\;\text{dist}(x,\text{zer}F)<\varepsilon. $$
The setting for these regularity conditions in [13] is far more general, e.g. being in the context of abstract metric spaces. We will, however, only need the above version for functions over \(\mathbb {R}^{N}\).
Under the assumption of a modulus of regularity for F together with the Fejér monotonicity of a sequence \((x_{n})_{n\in \mathbb {N}}\) w.r.t. zerF and some further assumptions on quantitative information on how the sequence \((x_{n})_{n\in \mathbb {N}}\) interacts with F, the authors obtain effective rates of convergence for the sequence \((x_{n})_{n\in \mathbb {N}}\).
By quantitative information on the interaction of \((x_{n})_{n\in \mathbb {N}}\) with F, we mean precisely that \((x_{n})_{n\in \mathbb {N}}\) has approximate F zeros. For this, recall the following (modification of the) definition from [13].
Definition 26
Let F be as above. We say that a sequence \((x_{n})_{n\in \mathbb {N}}\) has approximate F zeros if
A bound on ‘\(\exists n\in \mathbb {N}\)’ is called an approximate zero bound.
Notice the similarity of approximate zeros with approximate F-points from Definition 13 (although there F has a different meaning). Guided by this similarity, the fact that the particular sequence \((x_{n})_{n\in \mathbb {N}}\) from Algorithm 2 has approximate Ω-points (relative to the representation \(({\Omega }^{\prime }_{k})_{k\in \mathbb {N}}\)) and the fact that \((x_{n})_{n\in \mathbb {N}}\) is Fejér monotone w.r.t. Ω, we are particularly interested in a function F where (1) zerF = Ω and (2) |F(x)| < 1/(k + 1) relates to \(x \in {\Omega }^{\prime }_{k}.\) Towards a particular choice, we first define the set-valued function \(\gamma :\mathbb {R}^{N}\to \mathcal {P}(\mathbb {N})\) through
and the corresponding function \(G:\mathbb {R}^{N}\to \mathbb {R}_{\ge 0}\) defined through
Given a mapping \(T:\mathbb {R}^{N}\to \mathbb {R}^{N}\), we may further define the function \(F:\mathbb {R}^{N}\to \mathbb {R}_{\ge 0}\)
This function is now an adequate choice which fulfills the previously desired requirements (1) and (2). For this, note first that γ(x) has the following property whose proof is immediate:
Lemma 27
For any \(x\in \mathbb {R}^{N}\) and any \(k\in \mathbb {N}\): k ∈ γ(x) implies j ∈ γ(x) for all j ≥ k.
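For orientation, the definitions of γ, G and F can be read off from the way they are used in the proof of Lemma 28. The following is a reconstruction under that assumption and not necessarily the authors' verbatim display. The sequence \((y_{j})_{j\in \mathbb {N}}\) and the bifunction f are those appearing in the description of \({\Omega }^{\prime }_{k}\) used in that proof:

```latex
% Reconstruction inferred from the proof of Lemma 28 (assumption, see lead-in).
\begin{align*}
\gamma(x) &:= \Big\{ k\in\mathbb{N} \;\Big|\; \exists j\le k \,\Big( f(y_j,x) > \tfrac{1}{k+1} \Big) \Big\},\\[4pt]
G(x) &:= \begin{cases}
  0 & \text{if } \gamma(x)=\emptyset,\\
  2 & \text{if } \gamma(x)=\mathbb{N},\\
  1/\inf\gamma(x) & \text{otherwise},
\end{cases}\\[4pt]
F(x) &:= \max\{\, \|x-Tx\|,\; G(x) \,\}.
\end{align*}
% Lemma 27 under this reading: if i <= k witnesses k \in \gamma(x) and k' >= k, then
% i <= k' and f(y_i,x) > 1/(k+1) >= 1/(k'+1), so k' \in \gamma(x) as well.
```

Under this reading, γ(x) is upward closed, so \(\gamma (x)=\mathbb {N}\) holds exactly when 0 ∈ γ(x), and the case distinction in G only serves to avoid dividing by \(\inf \gamma (x)=0\).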
In the following, we verify properties (1) and (2) for F:
Lemma 28
For all \(x\in \mathbb {R}^{N}\) and all \(k\in \mathbb {N}\): \(x\in {\Omega }^{\prime }_{k}\) iff \(F(x)\leq \frac {1}{k+1}\). In particular, Ω = zerF.
Proof
Let \(x\in \mathbb {R}^{N}\) and let \(k\in \mathbb {N}\) and suppose first that \(x\in {\Omega }^{\prime }_{k}\). Then, we have
The latter gives k ∉ γ(x) and thus j ∉ γ(x) for all j ≤ k by the contraposition of Lemma 27. Now, either γ(x) = ∅ or γ(x) ≠ ∅. The former gives \(G(x)=0\leq \frac {1}{k+1}\) by definition, the latter gives \(\inf \gamma (x)\geq k+1\) and thus \(G(x)=1/(\inf \gamma (x))\leq 1/(k+1)\). In either case, F(x) ≤ 1/(k + 1).
Now, suppose \(x\not \in {\Omega }^{\prime }_{k}\). Then, \(\left \|x-Tx\right \|>\frac {1}{k+1}\) or \(f(y_{j},x)>\frac {1}{k+1}\) for some j ≤ k. The former gives \(F(x)>\frac {1}{k+1}\) immediately. The latter gives k ∈ γ(x). Thus,
and therefore, we have either \(\gamma (x)=\mathbb {N}\) where \(G(x)=2>\frac {1}{k+1}\) by definition, or \(\inf \gamma (x)\geq 1\) where then
In either case, F(x) > 1/(k + 1). □
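The equivalence of Lemma 28 can also be checked numerically on a toy instance. The following Python sketch is only an illustration under stated assumptions: γ, G and F are implemented as they are used in the proof above, γ(x) is only inspected up to a cutoff `k_max` (finding no index is treated as γ(x) = ∅), and the data `toy_residual`, `toy_f` and `toy_y` are hypothetical choices that do not come from the paper. Exact rational arithmetic is used so that the comparisons with 1/(k + 1) are not affected by rounding.

```python
from fractions import Fraction


def in_omega_k(x, k, residual, f, y):
    """x in Omega'_k: ||x - Tx|| <= 1/(k+1) and f(y_j, x) <= 1/(k+1) for all j <= k."""
    bound = Fraction(1, k + 1)
    return residual(x) <= bound and all(f(y(j), x) <= bound for j in range(k + 1))


def inf_gamma(x, f, y, k_max):
    """Least k <= k_max with f(y_j, x) > 1/(k+1) for some j <= k, or None if there is none."""
    for k in range(k_max + 1):
        bound = Fraction(1, k + 1)
        if any(f(y(j), x) > bound for j in range(k + 1)):
            return k
    return None


def G(x, f, y, k_max):
    """G(x) = 0 if gamma(x) is empty (up to k_max), 2 if 0 is in gamma(x), else 1/inf gamma(x)."""
    m = inf_gamma(x, f, y, k_max)
    if m is None:
        return Fraction(0)
    if m == 0:  # gamma(x) is upward closed, so 0 in gamma(x) means gamma(x) = N
        return Fraction(2)
    return Fraction(1, m)


def F(x, residual, f, y, k_max):
    """F(x) = max(||x - Tx||, G(x))."""
    return max(residual(x), G(x, f, y, k_max))


# Hypothetical toy data on the real line (assumptions, not from the paper):
def toy_residual(x):
    return abs(x - x / 2)  # |x - Tx| for T(x) = x/2


def toy_f(yj, x):
    return abs(x) - yj  # an arbitrary bifunction used only for testing


def toy_y(j):
    return Fraction(1, j + 1)


K_MAX = 20


def lemma28_holds(x, k):
    """Compare 'x in Omega'_k' with 'F(x) <= 1/(k+1)' for one pair (x, k), k <= K_MAX."""
    lhs = in_omega_k(x, k, toy_residual, toy_f, toy_y)
    rhs = F(x, toy_residual, toy_f, toy_y, K_MAX) <= Fraction(1, k + 1)
    return lhs == rhs
```

Calling `lemma28_holds(Fraction(n, 8), k)` for a grid of points and all k up to `K_MAX` exercises both directions of the proof. The cutoff does not affect the comparison for k ≤ `K_MAX`, since by Lemma 27 only indices up to k matter for deciding whether k ∈ γ(x).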
Together with the previous results on approximate Ω-points from Lemma 14, we obtain the following result regarding approximate F zeros.
Lemma 29
Let u ∈Ω≠∅ with \(c_{u}\geq \left \|x_{0}-u\right \|^{2}\) and let M > 0 with \(M\geq \left |\xi _{n}\right |\) for all \(n\in \mathbb {N}\). Let \(\lambda _{n}\in [a,b]\subseteq (0,2/M^{2})\) for all \(n\in \mathbb {N}\) as well as \(L\geq \text {diam}\{x_{n}\mid n\in \mathbb {N}\}\). Furthermore, let \(\varepsilon _{n}\geq 0\) for all \(n\in \mathbb {N}\) and let \(\varepsilon _{n}\to 0\) (\(n\to \infty \)) with a nondecreasing rate of convergence τ. Let Φ be as in Lemma 14. Then, for any \(k\in \mathbb {N}\),
Proof
By Lemma 14, we have
By Lemma 28, this implies \(F(x_{n})\leq \frac {1}{k+1}\). □
As the function F may be perceived to be quite artificial, it is of interest to see equivalent characterizations for the existence of a modulus of regularity for F. The following easy consequence of Lemma 28 gives a result in this vein.
Lemma 30
Let u ∈Ω and let r > 0. Then:
(1) If \(\phi :(0,\infty )\to (0,\infty )\) is a modulus of regularity for F w.r.t. Ω and \(\overline {B_{r}(u)}\), then
$$ \forall\varepsilon>0 \forall x\in\overline{B_{r}(u)}\left( x\in{\Omega}_{\left\lceil{\frac{1}{\phi(\varepsilon)}}\right\rceil}^{\prime}\Rightarrow\text{dist}(x,{\Omega})<\varepsilon\right). $$
(2) If \(\psi :(0,\infty )\to \mathbb {N}\) is such that
$$ \forall\varepsilon>0 \forall x\in\overline{B_{r}(u)}\left( x\in{\Omega}_{\psi(\varepsilon)}^{\prime}\Rightarrow\text{dist}(x,{\Omega})<\varepsilon\right), $$
then ε ↦ 1/(ψ(ε) + 1) is a modulus of regularity for F w.r.t. Ω and \(\overline {B_{r}(u)}\).
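Both parts reduce to Lemma 28 by a short computation. The following sketch spells this out, assuming the defining condition of a modulus of regularity in the form \(|F(x)|<\phi(\varepsilon)\Rightarrow\text{dist}(x,\text{zer}F)<\varepsilon\) from [13] and using that F ≥ 0 and zerF = Ω. The abbreviation m is introduced here only for readability:

```latex
% (1) Let \varepsilon > 0, m := \lceil 1/\phi(\varepsilon) \rceil \ge 1 and x \in \overline{B_r(u)} \cap \Omega'_m.
\begin{align*}
|F(x)| = F(x) \;\le\; \frac{1}{m+1} \;<\; \frac{1}{m} \;\le\; \phi(\varepsilon)
\quad\Longrightarrow\quad \mathrm{dist}(x,\Omega) < \varepsilon .
\end{align*}
% The first inequality is Lemma 28, the last one holds since m >= 1/\phi(\varepsilon),
% and the implication is the modulus property of \phi.

% (2) Let \varepsilon > 0 and x \in \overline{B_r(u)} with |F(x)| < 1/(\psi(\varepsilon)+1).
\begin{align*}
F(x) \;\le\; \frac{1}{\psi(\varepsilon)+1}
\quad\overset{\text{Lemma 28}}{\Longrightarrow}\quad x \in \Omega'_{\psi(\varepsilon)}
\quad\Longrightarrow\quad \mathrm{dist}(x,\Omega) < \varepsilon ,
\end{align*}
% where the last implication is the assumption on \psi.
```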
Using this lemma, we obtain the following rate of convergence for the sequence \((x_{n})_{n\in \mathbb {N}}\) generated by Algorithm 2 under the assumption of a modulus of regularity for F.
Theorem 31
Let u ∈Ω≠∅ with \(c_{u}\geq \left \|x_{0}-u\right \|^{2}\) and let M > 0 with \(M\geq \left |\xi _{n}\right |\) for all \(n\in \mathbb {N}\). Let \(\lambda _{n}\in [a,b]\subseteq (0,2/M^{2})\) for all \(n\in \mathbb {N}\) as well as \(L\geq \text {diam}\{x_{n}\mid n\in \mathbb {N}\}\). Furthermore, let \(\varepsilon _{n}\geq 0\) for all \(n\in \mathbb {N}\) and suppose \(\varepsilon _{n}\to 0\) (\(n\to \infty \)) with a nondecreasing rate of convergence τ.
If \(\psi :(0,\infty )\to \mathbb {N}\) is such that
then \((x_{n})_{n\in \mathbb {N}}\) is convergent with \(x=\lim _{n\to \infty }x_{n}\in \text {EP}(\text {Fix}(T),f)\) and
where we define
and
as well as
Proof
The proof is an application of Theorem 4.1 (i) of [13]. Fejér monotonicity of \((x_{n})_{n\in \mathbb {N}}\) w.r.t. Ω is already contained in (a) of Theorem 3. By Lemma 30 (2), ϕ is a modulus of regularity for F.
Now, let ε > 0. Then, by Lemma 29, we obtain
As ε was arbitrary, Theorem 4.1 of [13] applies as Ω is closed and we obtain \(x=\lim _{n\to \infty }x_{n}\in {\Omega }\) with the desired rate of convergence. We obtain x ∈EP(Fix(T),f) as in [5], p. 258. □
Change history
04 October 2021
A Correction to this paper has been published: https://doi.org/10.1007/s11075-021-01202-w
References
Bauschke, H., Combettes, P.: Convex analysis and monotone operator theory in Hilbert spaces. CMS Books in Mathematics, 2nd edn. Springer, Cham (2017)
Bishop, E.: Foundations of Constructive Analysis. McGraw-Hill, New York (1967)
Bolte, J., Daniilidis, A., Ley, O., Mazet, L.: Characterizations of Łojasiewicz inequalities: subgradient flows, talweg, convexity. Trans. Am. Math. Soc. 362(6), 3319–3363 (2010)
Borwein, J.M., Li, G., Tam, M.K.: Convergence rate analysis for averaged fixed point iterations in common fixed point problems. SIAM J. Optim. 27, 1–33 (2017)
Iiduka, H., Yamada, I.: A subgradient-type method for the equilibrium problem over the fixed point set and its applications. Optimization 58(2), 251–261 (2009)
Ioffe, A.D.: An invitation to tame optimization. SIAM J. Optim. 19(4), 1894–1917 (2009)
Iusem, A.N., Sosa, W.: Iterative algorithms for equilibrium problems. Optimization 52, 301–316 (2003)
Ko, K.-I.: Complexity theory of real functions. Birkhäuser, Boston-Basel-Berlin (1991). x + 309 pp.
Kohlenbach, U.: Applied proof theory: Proof Interpretations and Their Use in Mathematics. Springer Monographs in Mathematics. Springer, Berlin (2008)
Kohlenbach, U.: Proof-theoretic methods in nonlinear analysis. In: Sirakov, B., Ney de Souza, P., Viana, M. (eds.) Proc. ICM, vol. 2, pp 61–82. World Scientific (2019)
Kohlenbach, U., Leuştean, L., Nicolae, A.: Quantitative results on Fejér monotone sequences. Commun. Contemp. Math. 20(2) (2018)
Kohlenbach, U., López-Acedo, G., Nicolae, A.: Quantitative asymptotic regularity results for the composition of two mappings. Optimization 66, 1291–1299 (2017)
Kohlenbach, U., López-Acedo, G., Nicolae, A.: Moduli of regularity and rates of convergence for Fejér monotone sequences. Isr. J. Math. 232, 261–297 (2019)
Neumann, E.: Computational problems in metric fixed point theory and their Weihrauch degrees. Log. Methods Comput. Sci. 11(4) (2015)
Pischke, N.: Quantitative proof-theoretic analysis of a subgradient-type method for equilibrium problems. Bachelor Thesis, TU Darmstadt. 41 pp. (2020)
Tao, T.: Norm convergence of multiple ergodic averages for commuting transformations. Ergod. Theory Dyn. Syst. 28, 657–688 (2008)
Tao, T.: Soft analysis, hard analysis, and the finite convergence principle. In: Structure and Randomness: Pages from Year One of a Mathematical Blog. American Mathematical Society, Providence (2008)
Weihrauch, K.: Computable Analysis. Springer, Berlin (2000)
Yamada, I.: The hybrid steepest descent method for the variational inequality problem over the intersection of fixed point sets of nonexpansive mappings. In: Inherently Parallel Algorithms for Feasibility and Optimization and Their Applications, pp 473–504. Elsevier, New York (2001)
Funding
Open Access funding enabled and organized by Projekt DEAL. The second author was supported by the German Science Foundation (DFG KO 1737/6-1).
Additional information
This paper is a condensed version of the Bachelor thesis [15] of the first author, written under the supervision of the second author.
The original online version of this article was revised: on page 17, line 5, a “T” had to be added in the final term ∥u−Tu′∥ to obtain ∥Tu−Tu′∥.
About this article
Cite this article
Pischke, N., Kohlenbach, U. Quantitative analysis of a subgradient-type method for equilibrium problems. Numer Algor 90, 197–219 (2022). https://doi.org/10.1007/s11075-021-01184-9