Abstract
This paper establishes a strong convergence result for a Douglas–Rachford splitting method with an inertial extrapolation step for finding a zero of the sum of two set-valued maximal monotone operators, without assuming uniform monotonicity of either of the involved operators. Furthermore, the proposed method is easy to implement, and its inertial factor is a natural choice. Our method of proof is of independent interest. Finally, some numerical implementations are given to confirm the theoretical analysis.
1 Introduction
Let H be a real Hilbert space with scalar product \(\langle \cdot , \cdot \rangle\) and induced norm \(\Vert \cdot \Vert\). An operator \(A:H\rightarrow 2^H\) with domain D(A) is said to be monotone if
A is maximal monotone if its graph
is not properly contained in the graph of any other monotone operator.
Let us consider the inclusion problem of the form
where A and B are set-valued maximal monotone operators in H. Throughout this paper, we assume that the solution set of (1), denoted by S, is nonempty.
The proximal point algorithm (PPA) is the best-known method for solving the inclusion problem (1) (see Lions and Mercier 1979; Martinet 1970; Moreau 1965; Rockafellar 1976). The PPA for solving (1) is expressed as
where \(\lambda >0\) is the proximal parameter. Implementing PPA (2) to solve (1) requires computing the resolvent operator of the sum \(A+B\) exactly. This is very difficult and could be as hard as the original inclusion problem (1). This difficulty has led many authors to consider operator splitting approaches to solve (1). The aim of an operator splitting method is to circumvent the computation of \(J^\lambda _{A+B}\) in (2) and instead work with \(J^\lambda _A\) and \(J^\lambda _B\) separately (Eckstein and Bertsekas 1992; Glowinski and Le Tallec 1989; Lions and Mercier 1979).
When both A and B are single-valued linear operators in (1), Douglas and Rachford (1956) proposed the following method for solving heat conduction problems:
We can eliminate \(u_{k+\frac{1}{2}}\) in (3) above and obtain
Define \(z_k:=\Big (J^\lambda _B\Big )^{-1}u_k\Leftrightarrow u_k=J^\lambda _B(z_k)\). Then, (4) reduces to the following splitting method (known as Douglas–Rachford splitting method)
Lions and Mercier (1979) extended the Douglas–Rachford splitting method (5) to the generic case where both A and B are set-valued nonlinear operators, as in our problem (1). In this generic case, the method proceeds as follows (Lions and Mercier 1979): starting from an arbitrary iterate \(u_1\) in the domain of B, choose \(b_1 \in B(u_1)\) and set \(z_1 = u_1 +\lambda b_1\); then \(u_1 = J^\lambda _B(z_1)\) (the pair \((u_1,z_1)\) exists and is unique by the Representation Lemma, see Eckstein and Bertsekas 1992, cor. 2.3). Thus a sequence \(\{z_k\}\) is generated by the Douglas–Rachford scheme (5), and consequently a sequence \(\{u_k := J^\lambda _B(z_k)\}\) converging to a solution of (1) can be generated (see Eckstein 1989, Thm. 3.15). We refer to Combettes (2004) for the precise connection between (5) and the original Douglas–Rachford scheme of Douglas and Rachford (1956) for heat conduction problems. More details on the Douglas–Rachford splitting method (5) can be found in Fukushima (1996), Gabay and Mercier (1976) and Glowinski and Marrocco (1975).
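The generic scheme above can be sketched numerically. The following minimal one-dimensional illustration uses \(\lambda =1\) and the concrete choices \(A=\partial |\cdot |\) (whose resolvent is soft-thresholding) and \(B=N_{[1,2]}\) (whose resolvent is the projection onto [1, 2]); these operators are our own illustrative choices, not from the paper. The sequence \(u_k = J^\lambda _B(z_k)\) approaches the solution \(x^*=1\) of \(0\in Ax+Bx\).

```python
# Minimal 1-D sketch of the Douglas-Rachford splitting (5) for 0 in A(x)+B(x),
# with illustrative choices: A = subdifferential of |x|, B = normal cone of C = [1,2].
# The unique zero of A + B is x* = 1, since 1 in d|1| and -1 in N_C(1).

lam = 1.0

def J_A(z):          # resolvent of A: soft-thresholding with parameter lam
    return max(abs(z) - lam, 0.0) * (1.0 if z > 0 else -1.0)

def J_B(z):          # resolvent of B = N_[1,2]: projection onto [1,2]
    return min(max(z, 1.0), 2.0)

z = 5.0              # arbitrary starting point
for _ in range(50):
    u = J_B(z)
    z = z + J_A(2.0 * u - z) - u   # z_{k+1} = z_k + J_A(2 J_B z_k - z_k) - J_B z_k

print(abs(J_B(z) - 1.0))  # → 0.0
```

Here the governing sequence \(\{z_k\}\) reaches the fixed point \(z^*=0\) in a few steps, and the shadow sequence \(u_k=J^\lambda _B(z_k)\) delivers the solution.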
1.1 Motivations and contributions
Boţ et al. (2015) gave the following method for solving (1): \(z_0=z_1\);
where \(\{\alpha _k\}\) is a non-decreasing sequence with \(0\le \alpha _k \le \alpha <1, \forall k \ge 1\) and \(\lambda , \sigma , \delta >0\) such that
- (a) \(\delta >\frac{\alpha ^2(1+\alpha )+\alpha \sigma }{1-\alpha ^2}\); and
- (b) \(0 <\lambda \le \beta _k \le \theta :=2\frac{\delta -\alpha [\alpha (1+\alpha )+\alpha \delta +\sigma ]}{\delta [1+\alpha (1+\alpha )+\alpha \delta +\sigma ]}\).
Boţ et al. (2015) obtained a weak convergence analysis of algorithm (6) for finding zeros of the sum of two maximal monotone operators and illustrated their results through some numerical experiments. The same conditions (a) and (b) above have been used in recent works such as Dong et al. (2018), Shehu (2018) and other associated papers. When \(\alpha _k=0\), it was proved in Bauschke and Combettes (2011, Thm. 25.6(vii)) that \(\{z_k\}\) in (6) converges strongly to a solution of (1) if either A or B is uniformly monotone (A is uniformly monotone if \(\langle x-y,u-v\rangle \ge \phi (\Vert x-y\Vert ), \forall u \in Ax, v\in Ay\), where \(\phi :[0,\infty )\rightarrow [0,\infty )\) is increasing and vanishes only at zero) on every nonempty bounded subset of its domain.
When \(\beta _k=1\) and \(B\equiv 0\), (6) reduces to the inertial proximal point method proposed by Alvarez and Attouch (2001). In this case, Alvarez and Attouch (2001) assumed in their convergence result that the inertial factor \(\alpha _k\) satisfies \(0\le \alpha _k\le \alpha _{k+1}\le \alpha <\frac{1}{3}\). However, the assumptions on the inertial factor \(\alpha _k\) imposed in (6) are not as simple as the condition \(0\le \alpha _k\le \alpha _{k+1}\le \alpha <\frac{1}{3}\) assumed by Alvarez and Attouch (2001).
Problems in many disciplines, such as economics, image recovery, electromagnetics, quantum physics, and control theory, arise in infinite-dimensional spaces. For such problems, strong convergence of the sequence of iterates \(z_k\) of a proposed iterative procedure is often much more desirable than weak convergence, because strong convergence reflects the physically tangible property that the energy \(\Vert z_k-z\Vert\) of the error between the iterate \(z_k\) and a solution z eventually becomes arbitrarily small. The importance of strong convergence is also underlined in the work of Güler (1991), where a convex function f is minimized through the proximal point algorithm: Güler (1991) showed that the rate of convergence of the value sequence \(\{f(z_k)\}\) is better when \(\{z_k\}\) converges strongly than when it converges only weakly. For more details on the importance of strong convergence, see Bauschke and Combettes (2001).
Strong convergence methods for solving problem (1) when B is a set-valued maximal monotone operator and A is a single-valued \(\kappa\)-inverse strongly monotone operator (i.e., \(\langle Ax-Ay,x-y\rangle \ge \kappa \Vert Ax-Ay\Vert ^2,~~\forall x, y\in H\)) have been studied extensively in the literature (see, for example, Boikanyo 2016; Chang et al. 2019; Cholamjiak 2016; Cholamjiak et al. 2018; Dong et al. 2017; Gibali and Thong 2018; López et al. 2012; Riahi et al. 2018; Shehu 2016, 2019; Shehu and Cai 2018; Thong and Cholamjiak 2019; Wang and Wang 2018). However, there are still few strong convergence results for the more general case of problem (1) in which both A and B are set-valued maximal monotone operators. This is the gap that this paper aims to fill.
Our aim in this paper is to give a strong convergence analysis of the inertial Douglas–Rachford splitting method under conditions different from conditions (a) and (b) assumed in Boţ et al. (2015), without assuming uniform monotonicity of either maximal monotone operator A or B. Furthermore, our assumptions on the inertial factor \(\theta _k\) are the same as those in Alvarez and Attouch (2001) (whose result is a special case of ours). In summary,
- We prove strong convergence of the inertial Douglas–Rachford splitting method without using conditions (a) and (b) assumed in Boţ et al. (2015). Our inertial conditions are the same as the ones assumed in Alvarez and Attouch (2001) for finding a zero of a set-valued maximal monotone operator via the inertial proximal method.
- We obtain strong convergence without assuming that either of the involved maximal monotone operators is uniformly monotone on every nonempty bounded subset. Our strong convergence results are therefore more general than the current ones in Bauschke and Combettes (2011) and other associated works where strong convergence is obtained.
- Some numerical examples are given to confirm the importance of the inertial term in our method.
The paper is therefore organized as follows: We first recall some basic explanations of Douglas–Rachford splitting method and introduce our inertial Douglas–Rachford splitting method alongside some results in Sect. 2. The analysis of strong convergence of our proposed method is then investigated in Sect. 3. We give numerical implementations in Sect. 4 and conclude with some final remarks in Sect. 5.
2 Preliminaries
Let us first recall some basics that are required to derive and analyze the Douglas–Rachford splitting method; for the corresponding details, we refer to Eckstein and Bertsekas (1992), He and Yuan (2015), Svaiter (2011) and Zhang and Cheng (2013).
Let \(\lambda > 0\) be a fixed parameter, and let us denote by
the resolvents of A and B, respectively, which are known to be firmly nonexpansive (an operator T is firmly nonexpansive if \(\langle x-y,Tx-Ty\rangle \ge \Vert Tx-Ty\Vert ^2, ~~\forall x,y\in H\)). Furthermore, let us write
for the corresponding reflections (also called Cayley operators), and note that the reflections are nonexpansive operators (T is nonexpansive if \(\Vert Tx-Ty\Vert \le \Vert x-y\Vert ,~~\forall x,y\in H\)).
In Eckstein and Bertsekas (1992) and He and Yuan (2015), the maximal monotone operator \(S_{\lambda ,A,B}\) is defined as
It was shown in Eckstein and Bertsekas (1992) that the Douglas–Rachford splitting method (5) can be converted to
By Eckstein and Bertsekas (1992, Thm. 5), for any given zero \(z^*\) of \(S_{\lambda ,A,B}\), \(J^\lambda _B(z^*)\) is a zero of \(A+B\). Therefore, \(J^\lambda _B(z^*)\) is a solution of (1) whenever \(z^*\) satisfies
Consequently, the Douglas–Rachford splitting method (5) can be rewritten as
where \(e(z_k,\lambda ):=\frac{1}{2}(z_k-R^\lambda _A \circ R^\lambda _B(z_k))\).
In this paper, our convergence analysis will be conducted for an inertial generalized version of Douglas–Rachford splitting method (8): \(z_0, z_1 \in H\),
with \(\alpha _k \in [0,1), \beta _k \in (0,1]\) and \(\theta _k \in [0,1)\). We get the original Douglas–Rachford method (8) when \(\beta _k=1, \theta _k=0=\alpha _k\) in (9).
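The displayed update rule of (9) is not reproduced in this excerpt. The sketch below is a minimal one-dimensional implementation under the assumption, consistent with the relation \(\beta _k e(y_k,\lambda )=y_k-z_{k+1}\) used in Lemma 3.1 and with the anchor point \(z_0\) appearing in Theorem 3.6, that \(y_k = z_k + \theta _k(z_k-z_{k-1}) + \alpha _k(z_0-z_k)\) and \(z_{k+1}=y_k-\beta _k e(y_k,\lambda )\); the operators \(A=\partial |\cdot |\) and \(B=N_{[1,2]}\) and all parameter values are our own illustrative choices.

```python
# Hedged 1-D sketch of the inertial scheme (9) with A = d|.|, B = N_[1,2], lambda = 1.
# ASSUMED update (not stated verbatim in this excerpt):
#   y_k     = z_k + theta_k (z_k - z_{k-1}) + alpha_k (z_0 - z_k)
#   z_{k+1} = y_k - beta_k e(y_k, lambda)

def J_A(z): return max(abs(z) - 1.0, 0.0) * (1.0 if z > 0 else -1.0)
def J_B(z): return min(max(z, 1.0), 2.0)
def R_A(z): return 2.0 * J_A(z) - z           # reflection (Cayley operator)
def R_B(z): return 2.0 * J_B(z) - z
def e(z):   return 0.5 * (z - R_A(R_B(z)))    # residual e(z, lambda)

z0 = 5.0
z_prev, z = z0, z0
for k in range(1, 5001):
    theta, alpha, beta = 0.3, 1.0 / (k + 1), 0.6  # theta < 1/3; alpha_k -> 0, sum = inf
    y = z + theta * (z - z_prev) + alpha * (z0 - z)
    z_prev, z = z, y - beta * e(y)

print(abs(J_B(z) - 1.0))   # u_k = J_B(z_k) approaches the solution x* = 1
```

Because \(\alpha _k\) decays like 1/k, the anchoring term vanishes slowly, which is why the sketch runs many iterations before \(z_k\) settles near the fixed point.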
We next recall some properties of the projection. For any point \(u \in H\), there exists a unique point \(P_C u \in C\) such that
\(P_C\) is called the metric projection of H onto C. We know that \(P_C\) is a nonexpansive mapping of H onto C. It is also known that \(P_C\) satisfies
In particular, we get from (10) that
Furthermore, \(P_C x\) is characterized by the properties
This characterization implies that
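The variational characterization of the projection can be checked numerically. The sketch below verifies, for the illustrative choice C = closed unit disc in \(\mathbb {R}^2\) (our choice, not from the paper), that \(z=P_C x\) satisfies \(\langle x-z, y-z\rangle \le 0\) for sampled points \(y\in C\).

```python
import random

# Numerical check of the projection characterization:
# for z = P_C(x), <x - z, y - z> <= 0 for every y in C.
# Illustrative choice: C = closed unit disc in R^2.

def proj_disc(x):
    n = (x[0] ** 2 + x[1] ** 2) ** 0.5
    return x if n <= 1.0 else (x[0] / n, x[1] / n)

random.seed(0)
x = (3.0, -4.0)                       # a point outside C
z = proj_disc(x)                      # z = P_C x
for _ in range(1000):                 # sample y in C and test the inequality
    y = (random.uniform(-1, 1), random.uniform(-1, 1))
    if y[0] ** 2 + y[1] ** 2 <= 1.0:
        ip = (x[0] - z[0]) * (y[0] - z[0]) + (x[1] - z[1]) * (y[1] - z[1])
        assert ip <= 1e-12
print(z)  # → (0.6, -0.8)
```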
The following result was obtained in Shehu et al. (2020), but we give the proof for the sake of completeness.
Lemma 2.1
Let \(S \subseteq H\) be a nonempty, closed, and convex subset of a real Hilbert space H. Let \(u \in H\) be arbitrarily given, \(z := P_S u\), and \(\Omega := \{ x \in H : \langle x - u, x - z \rangle \le 0 \}\). Then \(\Omega \cap S = \{ z \}\).
Proof
By definition, it follows immediately that \(z \in \Omega \cap S\). Conversely, take an arbitrary \(y \in \Omega \cap S\). Then, in particular, we have \(y \in \Omega\), and it therefore follows that
Using \(z = P_S u\) together with the characterization (12), we also have
In particular, since \(y \in S\), we therefore have \(\langle u-z, z-y \rangle \ge 0\). Hence (14) implies \(\Vert y-z \Vert ^2 \le 0\), so that \(y = z\). This completes the proof. \(\square\)
Finally, we state some basic properties that will be used in our convergence theorems.
Lemma 2.2
The following statements hold in H:
- (a) \(\Vert x+y\Vert ^2=\Vert x\Vert ^2+2\langle x,y\rangle +\Vert y\Vert ^2\) for all \(x, y \in H\).
- (b) \(2 \langle x-y, x-z \rangle = \Vert x-y \Vert ^2 + \Vert x-z \Vert ^2 - \Vert y-z \Vert ^2\) for all \(x,y,z \in H\).
- (c) \(\Vert tx+sy\Vert ^2=t(t+s)\Vert x\Vert ^2+s(t+s)\Vert y\Vert ^2-st\Vert x-y\Vert ^2, \quad \forall x, y \in H, \forall s, t \in \mathbb {R}.\)
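The identities of Lemma 2.2 hold in any real Hilbert space; the following spot-check in \(\mathbb {R}^3\) (our illustrative choice) verifies them on random vectors.

```python
import random

# Numerical spot-check of the three identities in Lemma 2.2 in R^3.

def dot(a, b): return sum(ai * bi for ai, bi in zip(a, b))
def nrm2(a):   return dot(a, a)
def add(a, b): return [ai + bi for ai, bi in zip(a, b)]
def sub(a, b): return [ai - bi for ai, bi in zip(a, b)]
def scl(t, a): return [t * ai for ai in a]

random.seed(1)
x, y, z = ([random.gauss(0, 1) for _ in range(3)] for _ in range(3))
t, s = 0.7, -1.3

# (a) ||x + y||^2 = ||x||^2 + 2<x, y> + ||y||^2
assert abs(nrm2(add(x, y)) - (nrm2(x) + 2 * dot(x, y) + nrm2(y))) < 1e-12
# (b) 2<x - y, x - z> = ||x - y||^2 + ||x - z||^2 - ||y - z||^2
assert abs(2 * dot(sub(x, y), sub(x, z))
           - (nrm2(sub(x, y)) + nrm2(sub(x, z)) - nrm2(sub(y, z)))) < 1e-12
# (c) ||tx + sy||^2 = t(t+s)||x||^2 + s(t+s)||y||^2 - st||x - y||^2
assert abs(nrm2(add(scl(t, x), scl(s, y)))
           - (t * (t + s) * nrm2(x) + s * (t + s) * nrm2(y)
              - s * t * nrm2(sub(x, y)))) < 1e-12
print("identities verified")
```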
Lemma 2.3
(Maingé 2008) Assume that \(\varphi _{k}\in [0,\infty )\) and \(\delta _{k}\in [0,\infty )\) satisfy:
- (1) \(\varphi _{k+1}-\varphi _{k}\le \theta _{k}(\varphi _{k}-\varphi _{k-1})+\delta _{k},\)
- (2) \(\sum _{k=1}^{\infty }\delta _{k}<\infty ,\)
- (3) \(\{\theta _{k}\}\subset [0,\theta ],\) where \(\theta \in (0,1).\)
Then the sequence \(\{\varphi _{k}\}\) is convergent with \(\sum _{k=1}^{\infty }[\varphi _{k+1}-\varphi _{k}]_{+}<\infty ,\) where \([t]_{+}:=\max \{t,0\}\) (for any \(t\in \mathbb {R})\).
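Lemma 2.3 can be illustrated numerically in the equality case of condition (1). The sketch below uses our illustrative choices \(\theta =0.5\) and \(\delta _k=2^{-k}\) (summable); the increments then obey \(d_k=\theta d_{k-1}+\delta _k\), so they vanish and \(\{\varphi _k\}\) converges (here to the value 4, which one can also compute by summing \(d_k=(k+1)2^{-k}\)).

```python
# Numerical illustration of Lemma 2.3 (Mainge 2008) in the equality case:
# phi_{k+1} - phi_k = theta * (phi_k - phi_{k-1}) + delta_k,
# with theta = 0.5 in (0,1) and delta_k = 2^{-k} summable (our choices).

theta = 0.5
phi_prev, phi = 0.0, 1.0          # phi_0 = 0, phi_1 = 1
for k in range(1, 200):
    delta = 2.0 ** (-k)
    phi_prev, phi = phi, phi + theta * (phi - phi_prev) + delta

print(phi)  # the sequence converges (limit 4 for these choices)
```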
3 Analysis of the convergence
For the rest of this paper, we assume that \(S\ne \emptyset\), that \(\{\alpha _k\}\subset (0,1)\) is non-increasing with \(\lim _{k\rightarrow \infty } \alpha _k=0\) and \(\sum _{k=1}^\infty \alpha _k=\infty\), that \(0<\beta \le \beta _k\le 1\), and that \(0\le \theta _k\le \theta _{k+1}\le \theta <\frac{1}{3}\).
Lemma 3.1
Let \(\{z_k\}\) be the sequence generated by (9). For any z satisfying (7), we have
Proof
By (9), we get
We know that \(e(y_k,\lambda )=\frac{1}{2}(y_k-R^\lambda _A\circ R^\lambda _B(y_k))\), where \(\lambda >0\) is the proximal parameter, is firmly nonexpansive (see He and Yuan 2015, lem. 2.2). Thus,
In particular, for \(z=R^\lambda _A\circ R^\lambda _B(z)\), we obtain
Putting (17) into (16), we have
Recall that \(\beta _k e(y_k,\lambda )=y_k-z_{k+1}\) implies that
Using (19) in (18) and the condition that \(0<\beta \le \beta _k\le 1\), we have
\(\square\)
Lemma 3.2
Let \(\{z_k\}\) be the sequence generated by (9). For any z satisfying (7), we have
Proof
Moreover, from the definition of \(y_k\), we obtain using Lemma 2.2 (a) that
and, similarly, with z replaced by \(z_{k+1}\) in the previous formula,
Substituting (21) and (22) into (15) and eliminating identical terms, we get
Therefore, we obtain
where the last identity exploits Lemma 2.2 (a) twice. We therefore have
Using the fact that \(\{\theta _k\}\) is non-decreasing and \(\{ \alpha _k \}\) is non-increasing, we then obtain
which is the desired inequality. \(\square\)
Our first central result below shows that the sequence \(\{ z_k \}\) generated by (9) is bounded.
Lemma 3.3
The sequence \(\{z_k\}\) generated by (9) is bounded.
Proof
A simple re-ordering of (20) implies that
where the equality uses once again Lemma 2.2 (a). Hence, by cancellation, re-ordering, and neglecting a non-positive term on the right-hand side, we obtain
Let \(\mu _j:=e^{\sum _{i=1}^{j}\alpha _{i}}, j\ge 1\). Using \(1-x \le e^{-x}\) for all \(x \in \mathbb R\) (or equivalently, \(1-e^{-x} \le x, x \in \mathbb {R}\)), we obtain
Then (29) consequently implies that
Since \(\{\alpha _k\}\) is non-increasing in (0,1), this implies
It then follows from (28) and (30) that
Since \(\mu _{k}\le \mu _{k+1}\), \(\mu _{k+1}=\mu _{k}e^{\alpha _{k+1}}\) and \(\{\alpha _k\}\) is non-increasing in (0,1), we therefore get
which can be rewritten as (since \(\{\alpha _k\}\) is non-increasing in (0,1))
Since the sequence \(\{\theta _k\}\) belongs to the interval \([0,\theta ]\), we have
Using \(\lim _{k \rightarrow \infty } \alpha _k = 0\) and \(\theta \in [0, 1/3)\), it follows that the right-hand side is eventually bounded from below by a positive number, i.e., there is a constant \(\gamma > 0\) such that \(1-\theta _{k+1} \big ( 3+2(e^{\alpha _{k+1}}-1) \big )-\alpha _k\ge \gamma\) for all \(k \in \mathbb N\) sufficiently large, say, for all \(k \ge k_0\). Hence, we have
This implies that for \(k\ge k_{0}\),
Thus, dividing by \(\mu _{k+1}\) and omitting a non-positive term, we get
where \(t_{k}:=\sum _{i=1}^{k}\alpha _{i}\). Since \(\alpha _k \in (0,1)\) for all \(k \in \mathbb N\), it is easy to see that \(\alpha _{k}e^{t_{k+1}}\le e^{2}(e^{t_{k}}-e^{t_{k-1}})\) for all \(k\ge 2\), so that
which, by (32), \(e^{-t_{k+1}} \le 1\), and the fact that \(\{\theta _k\}\) belongs to the interval \([0, \theta ] \subset [0,\frac{1}{3})\), yields
Using (33), \(\theta \in [0,1)\), and the convergence of the geometric series, a simple calculation gives
Using once again that \(\theta < 1\), this shows that \(\{z_k\}\) is bounded. \(\square\)
Next, we formulate a simple lemma that turns out to be useful for proving the strong convergence result.
Lemma 3.4
Let \(\{z_k\}\) be the sequence generated by (9). Define
for all \(k \in \mathbb N\). Then \(u_k \ge 0\) for all \(k \in \mathbb N\).
Proof
Since \(\{\theta _k\}\) is non-decreasing with \(0\le \theta _k< \frac{1}{3}\), and by Lemma 2.2 (a), we have
and this completes the proof. \(\square\)
Before we prove our main strong convergence result, we state another preliminary result which provides sufficient conditions for the strong convergence of the sequence \(\{z_k\}\) generated by our method (9). In our strong convergence result, we will then show that these sufficient conditions automatically hold.
Lemma 3.5
Let \(\{z_k\}\) be the sequence generated by (9). Assume that
and
Then the entire sequence \(\{z_k\}\) converges strongly to the solution z.
Proof
By assumption, we have
We claim that this already implies
from which the strong convergence of the entire sequence \(\{z_k\}\) to z follows immediately. Assume this limit does not hold. Then there is a subset \(K\subseteq \mathbb {N}\) and a constant \(\rho > 0\) such that
Since \(\lim _{k\rightarrow \infty }\Vert z_{k+1}-z_k\Vert =0\) by assumption and \(0\le \theta <1\), we obtain (recall that if \(\{a_k\}\) and \(\{b_k\}\) are bounded sequences in \(\mathbb {R}\) and one of them converges, then \(\limsup _{k\rightarrow \infty } (a_k+b_k)=\limsup _{k\rightarrow \infty } a_k +\limsup _{k\rightarrow \infty } b_k\))
Using (34) and \(\theta _k\le \theta <1\), we get
Consequently, we have \(\limsup _{k \in K}\Vert z_k-z\Vert \le 0.\) Since \(\liminf _{k \in K}\Vert z_k-z\Vert \ge 0\) obviously holds, it follows that \(\lim _{k \in K}\Vert z_k-z\Vert = 0.\) This implies [by (35)]
for all \(k\in K\) sufficiently large, a contradiction to the assumption that \(\lim _{k\rightarrow \infty }\Vert z_{k+1}-z_k\Vert =0.\) This completes the proof. \(\square\)
We are now ready to obtain strong convergence of the sequence \(\{ z_k \}\) generated by (9) to an element of S.
Theorem 3.6
The sequence \(\{z_k\}\) generated by (9) strongly converges to z, where \(z=P_Sz_0\).
Proof
Let \(u_{k}\) denote the nonnegative number defined in Lemma 3.4, and let us apply Lemma 3.2. We obtain from (20) that
We now consider two cases.
Case 1 Suppose \(\{u_{k}\}\) is eventually monotonically decreasing, i.e., for some \(k_{0} \in \mathbb N\) large enough, we have \(u_{k+1} \le u_k\) for all \(k \ge k_0\). Then, since \(u_k\) is nonnegative for all \(k \in \mathbb N\) by Lemma 3.4, \(\{u_{k}\}\) is a convergent sequence. Consequently, \(\lim _{k\rightarrow \infty }u_{k}=\lim _{k\rightarrow \infty }u_{k+1}\). Since \(\{z_k\}\) is bounded by Lemma 3.3, there exists \(M>0\) such that \(2|\langle z_k-z,z_k-z_0\rangle |\le M.\) Moreover, there exist \(N\in \mathbb {N}\) and \(\gamma _{1}>0\) such that \(1-3\theta _{k+1}-\alpha _k \ge \gamma _{1}\) for all \(k\ge N\). Therefore, for \(k\ge N\), we obtain from (36) that
Hence
Together with \(\alpha _k \rightarrow 0\), the boundedness of \(\{ z_k \}\), and the convergence of \(\{ u_k \}\), we therefore obtain from the definition of \(u_k\) that the limit
exists and is equal to \(\lim _{k \rightarrow \infty } u_{k+1}\). In particular, Lemma 3.4 therefore implies that \(\lambda \ge 0\). We will show that \(\lambda = 0\) holds; then (37) together with the fact that \(\theta _k \le \theta < 1\) for all \(k \in \mathbb N\) yields the strong convergence of the sequence \(\{ z_k \}\) to the solution z.
By contradiction, assume that \(\lambda > 0\). Since \(\{ z_k \}\) is bounded by Lemma 3.3, we can choose a subsequence \(\{z_{k_{j}}\}\) which converges weakly to an element \(p\in H\) and such that
We show that \(p \in S\). Observe that the updating rule for \(y_k\) implies
This yields
Let \(Ty:=\frac{1}{2}y+\frac{1}{2}R^\lambda _A\circ R^\lambda _B(y),~~y \in H\). Then it is clear that T is nonexpansive, and \(z \in F(T):=\{x\in H:x=Tx\}\) if and only if \(z=R^\lambda _A\circ R^\lambda _B(z)\). Similarly, it is easy to see that \(e(y_k,\lambda )= \frac{1}{2}(y_k-R^\lambda _A\circ R^\lambda _B(y_k))=y_k-Ty_k\). Therefore,
Demiclosedness Principle of T implies that \(p \in F(T)\). Hence, \(p \in S\). This implies that
where the inequality follows from the characterization (12) of a projection applied to \(z = P_S z_0\) and \(p \in S\). Since (37) yields
and since \(\lambda > 0\) by assumption, we have
for some sufficiently large \(k_1 \in \mathbb N\). Using the identity
we therefore get
from (38). Using once again the assumption that \(\lambda > 0\), this implies
for some sufficiently large \(k_2 \in \mathbb N, k_2 \ge k_1\). From (36), we therefore obtain
This implies
where the second inequality follows from Lemma 3.4. Since \(\lambda > 0\), this gives the summability of the sequence \(\{ \alpha _k \}\), a contradiction to our assumption. Hence we must have \(\lambda = 0\), and this yields the strong convergence of the sequence \(\{ z_k \}\) to z.
Case 2 Assume that \(\{u_k\}\) is not eventually monotonically decreasing. Then let \(\tau :\mathbb {N}\rightarrow \mathbb {N}\) be the map defined for all \(k\ge k_{0}\) (for some \(k_{0} \in \mathbb N\) large enough) by
Clearly, \(\tau (k)\) is non-decreasing with \(\tau (k) \rightarrow \infty\) as \(k\rightarrow \infty\), and \(u_{\tau (k)}\le u_{\tau (k)+1}\) for all \(k\ge k_{0}\). Hence, similarly to the proof of Case 1, we obtain from (36) that
for some constant \(M > 0\). Thus,
Using the same technique of the proof as in Case 1, one can also derive the limits
Again observe that for \(j\ge 0\) by (36), we have \(u_{j+1}<u_{j}\) when \(z_{j}\not \in \Omega :=\{x\in H: \langle x-z_0,x-z\rangle \le 0\}\) (note that this \(\Omega\) is the same set as in Lemma 2.1). Hence \(z_{\tau (k)}\in \Omega\) for all \(k\ge k_{0}\) since \(u_{\tau (k)} \le u_{\tau (k)+1}\). Since \(\{z_{\tau (k)}\}\) is bounded, we may choose a subsequence (which we again denote by \(\{z_{\tau (k)}\}\)) which converges weakly to some \(x^{*}\in H\). As \(\Omega\) is closed and convex, it is weakly closed, and so \(x^{*} \in \Omega\). Using (43), one can see as in Case 1 that \(x^* \in S\). Consequently, we have \(x^{*}\in \Omega \cap S\). In view of Lemma 2.1, however, the intersection \(\Omega \cap S\) contains z as its only element. We therefore get \(x^* = z\). Furthermore, we have
since \(z_{\tau (k)}\in \Omega\). Taking lim sup in this last inequality gives
Hence
We claim that this implies \(\lim _{k \rightarrow \infty } u_{\tau (k)+1} = 0\). By definition, \(u_{\tau (k)+1}\) is equal to
Adding and subtracting \(z_{\tau (k)}\) inside the norm of the first term, and using (41) and (44), we see that the first term goes to zero. The second term also converges to zero in view of (44), taking into account the boundedness of \(\{ \theta _k \}\). The third term vanishes in the limit because of (41), noting once again that \(\{ \theta _k \}\) is a bounded sequence. Finally, the last term goes to zero since \(\{ \alpha _k \}\) converges to zero and the sequence \(\{ z_k \}\) is bounded by Lemma 3.3.
We next show that we actually have \(\lim _{k \rightarrow \infty } u_k = 0\). To this end, first observe that, for \(k\ge k_{0},\) one has \(u_{k}\le u_{\tau (k)+1}\) if \(k\ne \tau (k)\) (that is, if \(\tau (k)<k\)) because we necessarily have \(u_{j}>u_{j+1}\) for \(\tau (k)+1\le j\le k-1\). It follows that for all \(k\ge k_{0}\), we have \(u_{k}\le \max \{u_{\tau (k)}, u_{\tau (k)+1}\}=u_{\tau (k)+1} \rightarrow 0\), hence \(\limsup _{k\rightarrow \infty }u_{k}\le 0\). On the other hand, Lemma 3.4 implies that \(\liminf _{k \rightarrow \infty } u_k \ge 0\). Together we obtain \(\lim _{k \rightarrow \infty } u_k = 0\).
Consequently, the boundedness of \(\{ z_k \}\), assumptions on our iterative parameters and (36) show that
Hence the definition of \(u_k\) yields
Using our assumption, it is not difficult to see that this implies the strong convergence of the entire sequence \(\{ z_k \}\) to the particular solution z. The statement therefore follows from Lemma 3.5. \(\square\)
In the special case when B is a set-valued maximal monotone operator and A is a single-valued \(\kappa\)-inverse strongly monotone operator in problem (1), iterative procedure (9) reduces to the following: \(z_0, z_1 \in H\),
with \(0<\lambda <2\kappa\). Moreover, we obtain strong convergence for this special case of the monotone inclusion; its proof can be obtained by following the line of argument of the previous lemmas and Theorem 3.6.
Corollary 3.7
Suppose B is a set-valued maximal monotone operator and A is a single-valued \(\kappa\)-inverse strongly monotone operator. Assume that \(S:=\{x\in H: 0\in Ax+Bx\}\ne \emptyset\). Let \(\{z_k\}\) be the sequence generated by (45) with \(0<\beta \le \beta _k\le \frac{1}{2}\), \(0<\lambda <2\kappa\) and \(0\le \theta _k\le \theta _{k+1}\le \theta <\frac{1}{3}\). Then \(\{z_k\}\) strongly converges to z, where \(z=P_Sz_0\).
We next relate our results to some existing results from the literature.
Remark 3.8
- (a) In the results of Thong and Vinh (2019, Thm. 3.5), strong convergence for a monotone inclusion was obtained under some assumptions on the iterative sequence. The monotone inclusion studied in Thong and Vinh (2019) involves the sum of a set-valued maximal monotone operator and a single-valued inverse strongly monotone operator. In this paper, our method is designed so that no assumption is imposed on the iterative sequence, even for the more general problem considered here.
- (b) Algorithm (45) can be regarded as an inertial, strongly convergent version of some recent results in Attouch and Cabot (2019), Boţ and Csetnek (2016), Lorenz and Pock (2015) and Villa et al. (2013). \(\Diamond\)
4 Numerical experiments
In all the examples in this section, we compare our proposed method (9) with its non-inertial version (\(\theta _k=0\)), the method of Thong and Vinh (2019, Thm. 3.5) and that of Shehu (2016). Our aim is to compare our method with other relevant strong convergence methods in the literature.
Example 4.1
Let \(H=L^2([0,1])\). Let \(A:=\partial \Vert .\Vert\) and \(B=N_C\) in (1), where \(N_C\) is the normal cone of a nonempty closed and convex subset C of H (\(N_C(x):=\{x^* \in H:\langle y-x,x^*\rangle \le 0, \forall y \in C \}\)). Then problem (1) reduces to the following minimization problem: find \(x^{*}\in L^2([0,1])\) such that
Note that \(S\ne \emptyset\) since \(0\in S.\) Furthermore, the resolvent \(J^{\lambda }_B=(I+\lambda N_C)^{-1}=P_C\), and \(J^{\lambda }_A\) is given by the Moreau decomposition
where \(\text {Prox}_{\lambda \Vert .\Vert }(x) := \text {argmin}_y \big \{ \lambda \Vert y\Vert + \frac{1}{2} \Vert y - x \Vert ^2 \big \}\), \(P_{B_{\Vert .\Vert _*}}\) is the projection operator and \(B_{\Vert .\Vert _*}\) is the unit ball of the dual norm. Note that in this case, \(L^2([0,1])\) is self-dual. Moreover, the projection \(P_{B_{\Vert .\Vert _*}}\) (see Bauschke and Combettes 2011; Cegielski 2012) is given by:
Therefore,
If we take C to be the ball \(C:=\{x \in H:\Vert x-z\Vert \le r\}\), then
In particular, \(C=\{x\in L^2([0,1]):\int _0^1|x(t)-\sin (\frac{t}{2\pi })|^2 dt \le 16 \}\).
Set \(\lambda =0.02\), \(\beta _k=0.6\) and \(\alpha _k=100/k\). Take \(\Vert z_k-z_{k-1}\Vert \le 10^{-3}\) as the stopping criterion (Fig. 1).
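The ingredients of Example 4.1 can be sketched on a discretization of \(L^2([0,1])\). The code below (our illustrative discretization: a uniform grid of n = 200 midpoints with quadrature weight h = 1/n; the test function \(x(t)=10\cos (3t)\) is also our choice) checks that \(\text {Prox}_{\lambda \Vert .\Vert }\) computed directly agrees with the Moreau decomposition \(\text {Prox}_{\lambda \Vert .\Vert }(x)=x-\lambda P_{B_{\Vert .\Vert _*}}(x/\lambda )\), and evaluates the projection onto \(C=\{x:\Vert x-z\Vert \le 4\}\) with \(z(t)=\sin (\frac{t}{2\pi })\) and \(\lambda =0.02\) as in the example.

```python
import math

# Discretized sketch of Example 4.1 on a uniform midpoint grid of [0,1].
n, h, lam = 200, 1.0 / 200, 0.02
t = [(i + 0.5) * h for i in range(n)]
z = [math.sin(ti / (2.0 * math.pi)) for ti in t]      # center of the ball C

def norm(x):                       # discretized L2([0,1]) norm
    return math.sqrt(sum(xi * xi for xi in x) * h)

def prox_norm(x, lam):             # Prox of lam*||.||: shrink toward the origin
    nx = norm(x)
    c = max(1.0 - lam / nx, 0.0) if nx > 0 else 0.0
    return [c * xi for xi in x]

def proj_unit_ball(x):             # projection onto the (dual) norm unit ball
    nx = norm(x)
    return x if nx <= 1.0 else [xi / nx for xi in x]

def proj_C(x):                     # projection onto C = {x : ||x - z|| <= 4}
    d = norm([xi - zi for xi, zi in zip(x, z)])
    if d <= 4.0:
        return list(x)
    return [zi + 4.0 * (xi - zi) / d for xi, zi in zip(x, z)]

x = [10.0 * math.cos(3.0 * ti) for ti in t]           # a test function outside C
p1 = prox_norm(x, lam)                                # direct formula
p2 = [xi - lam * bi                                   # Moreau decomposition
      for xi, bi in zip(x, proj_unit_ball([xi / lam for xi in x]))]
print(max(abs(a - b) for a, b in zip(p1, p2)))        # agreement of both routes
print(norm([a - b for a, b in zip(proj_C(x), z)]))    # distance to z after projection
```

The first printed value is at the level of floating-point error, and the projected point sits on the boundary of C (distance 4 from z).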
Example 4.2
Suppose \(A:\mathbb {R}^3\rightarrow \mathbb {R}^3\) and \(B:\mathbb {R}^3 \rightarrow \mathbb {R}^3\) are given by
It can be shown that \(S=\{(0,0,0)\}\).
Let \(z_0\) be randomly selected. In Algorithm 3 of Thong and Vinh (2019), we chose \(\lambda =0.1\), \(\beta _k=1/(k+1)\) and \(\tau _k=1/(k+1)^2\). In Algorithm (26) of Shehu (2016), we chose \(\alpha _k=1/k\), \(\beta _k=k/(2k+1)\) and \(r_k=0.1\). Take \(\Vert z_k\Vert \le 0.005\) as the stopping criterion.
For Examples 4.2 and 4.3, we take \(\lambda =0.2\), \(\beta _k=0.5\), \(\alpha _k=\frac{1}{25k}\) in Algorithm (9) and \(\beta =0.2\) in Algorithm 3 of Thong and Vinh (2019).
We compared Algorithm (9), Algorithm 3 in Thong and Vinh (2019) and algorithm (26) in Shehu (2016). From Fig. 2, we see that Algorithm (9) performs better than the other two algorithms.
Example 4.3
Let us consider the well-known \(\ell _1\)-regularized least squares problem, which consists of finding a sparse solution to an underdetermined linear system. Specifically, we solve the following problem:
where \(D \in \mathbb {R}^{m \times n}\) and \(b\in \mathbb {R}^m\). In this case,
while
We remark that there exist software packages for solving problem (47), for example SPGL1, based on a projected gradient method (van den Berg and Friedlander 2007; Lorenz 2013), and FISTA (Beck and Teboulle 2009), but a comparison with these is beyond the scope of this paper. Our interest here is to demonstrate the efficiency of our proposed method (9) using problem (47).
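The two resolvents needed for problem (47) can be sketched directly. The plain (non-inertial) Douglas–Rachford iteration below is a hedged illustration on a small random instance; the sizes m = 5, n = 15, the regularization weight rho = 0.1 and the proximal parameter lam = 1.0 are our own choices, not the paper's experimental setup. The resolvent of \(\rho \Vert \cdot \Vert _1\) is componentwise soft-thresholding, and the resolvent of \(\frac{1}{2}\Vert Dx-b\Vert ^2\) solves the regularized normal equations \((I+\lambda D^TD)x=v+\lambda D^Tb\).

```python
import random

# Hedged sketch: plain Douglas-Rachford for min (1/2)||Dx - b||^2 + rho*||x||_1
# on a small random instance (m = 5, n = 15, rho = 0.1, lam = 1.0: our choices).
random.seed(42)
m, n, rho, lam = 5, 15, 0.1, 1.0
D = [[random.gauss(0, 1) for _ in range(n)] for _ in range(m)]
b = [random.gauss(0, 1) for _ in range(m)]

def soft(v, t):                      # resolvent of rho*||.||_1: soft-thresholding
    return [max(abs(vi) - t, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]

def prox_ls(v):                      # solve (I + lam*D^T D) x = v + lam*D^T b
    M = [[lam * sum(D[k][i] * D[k][j] for k in range(m)) + (1.0 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    rhs = [v[i] + lam * sum(D[k][i] * b[k] for k in range(m)) for i in range(n)]
    for i in range(n):               # Gaussian elimination; M is SPD, no pivoting needed
        for j in range(i + 1, n):
            f = M[j][i] / M[i][i]
            M[j] = [a - f * c for a, c in zip(M[j], M[i])]
            rhs[j] -= f * rhs[i]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):   # back-substitution
        x[i] = (rhs[i] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def F(x):                            # objective of (47)
    r = [sum(D[k][j] * x[j] for j in range(n)) - b[k] for k in range(m)]
    return 0.5 * sum(rk * rk for rk in r) + rho * sum(abs(xj) for xj in x)

z = [0.0] * n
for _ in range(100):                 # z <- z + prox_f(2 prox_g(z) - z) - prox_g(z)
    x = prox_ls(z)
    w = soft([2 * xi - zi for xi, zi in zip(x, z)], rho * lam)
    z = [zi + wi - xi for zi, wi, xi in zip(z, w, x)]

print(F(prox_ls(z)), F([0.0] * n))   # objective at the DR point vs. at zero
```

After a moderate number of iterations, the objective at the shadow point \(x=\text {prox}_g(z)\) lies well below its value at the zero vector, reflecting convergence toward a minimizer of (47).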
We generate random problems using different choices of \(\lambda\) for \(m=100\) and \(n=1000\). In Algorithm 3 of Thong and Vinh (2019) and the algorithm (26) of Shehu (2016), we chose \(\rho =1\) and \(\lambda =1.9/(\max (eig(D^TD)))\), and in the algorithm (9), we chose \(\rho =0.5\). In addition, select \(r_k=0.2\) in algorithm (26) of Shehu (2016).
Table 3 shows that Algorithm (9) performs best when \(\theta _k=0.33\). The numerical results are shown in Fig. 3, which illustrates that Algorithm (9) outperforms the other two algorithms.
Remark 4.4
- (a) It can be seen from the numerical examples that Algorithm (9) outperforms the methods in Shehu (2016) and Thong and Vinh (2019) (see Figs. 2, 3) for strong convergence for sums of maximal monotone operators. Furthermore, the addition of the inertial term accelerates the proposed method: Algorithm (9) converges faster than the non-inertial case \(\theta _k=0\) (see Tables 1, 2, 3). Our examples also suggest that the optimal choice of \(\theta _k\) is close to the upper bound \(\frac{1}{3}\).
- (b) Algorithm (9) is sensitive to the choice of the initial point \(z_0\), as can be seen from our examples in Tables 1, 2 and 3.
Remark 4.5
- (a) We point out that there are different strategies in the current literature to enforce strong convergence of proximal-like algorithms (in particular, DR splitting); see, e.g., Solodov and Svaiter (2000) and Hirstoaga (2006). In this regard, the results of Hirstoaga (2006) are concerned with "anchor-point" algorithms such as the one employed in our proposed method (9). As one can see, Algorithm 2.1 of Hirstoaga (2006) contains no inertial extrapolation term \(\theta _k(z_k-z_{k-1})\), which has been shown in the literature to speed up the non-inertial counterparts of most optimization methods. Thus, when \(\theta _k\ne 0\) (in this paper, we assume that \(0 \le \theta _k \le \theta <\frac{1}{3}\)), our method (9) does not reduce to Algorithm 2.1 of Hirstoaga (2006) applied to the splitting operator of Eckstein and Bertsekas (1992). As confirmed by our numerical examples in Sect. 4, our method (9) outperforms Algorithm 2.1 of Hirstoaga (2006) applied to that splitting operator. Moreover, our method of proof differs from that of Hirstoaga (2006).
- (b) The purpose of our numerical examples in Sect. 4 is to demonstrate the implementation and effectiveness of our proposed method (9). As discussed in MacNamara and Strang (2016) and other related chapters of that book, applications of our method (9) to problems arising from wireless communications, imaging, networking, finance, hemodynamics, free-surface flows, and other science and engineering problems in infinite-dimensional Hilbert spaces will be treated separately as a future project. \(\Diamond\)
5 Final remarks
In this paper we propose a Douglas–Rachford splitting method with inertial extrapolation step and give a strong convergence analysis of the method. The method applies to a general class of maximal monotone operators: no uniform monotonicity is assumed on any of the involved operators. Furthermore, the analysis is carried out under the natural condition that the inertial factor \(\theta _k\) is monotone non-decreasing and bounded away from 1/3. Some numerical illustrations are given to test the efficiency and implementation of the proposed scheme. The results obtained in this paper can serve as the strong convergence counterpart of existing weak convergence results for inertial Douglas–Rachford splitting methods (Bauschke and Combettes 2011; Beck and Teboulle 2009; Boţ et al. 2015; Lorenz and Pock 2015; Thong and Vinh 2019).
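The inertial extrapolation idea can be illustrated with a minimal numerical sketch. The snippet below is not a transcription of method (9); it is a generic inertial Douglas–Rachford iteration in the spirit of Boţ et al. (2015), applied to an assumed toy problem \(0 \in \partial |x| + (x-3)\) with \(A=\partial f\), \(f(x)=|x|\), and \(B=\nabla g\), \(g(x)=\tfrac{1}{2}(x-3)^2\), whose unique zero is \(x^*=2\). All parameter choices (\(\lambda = 1\), \(\theta _k \equiv 0.1 < 1/3\)) are illustrative assumptions.

```python
# Hypothetical sketch of an inertial Douglas-Rachford iteration for 0 in A(x)+B(x),
# with A, B given through their resolvents (proximal maps). Illustrative only.

def prox_abs(v, lam):
    """Resolvent of A = d|.|: proximal map of lam*|.| (soft-thresholding)."""
    return max(abs(v) - lam, 0.0) * (1.0 if v >= 0 else -1.0)

def prox_quad(v, lam, b):
    """Resolvent of B = grad of 0.5*(.-b)^2: proximal map of lam*0.5*(.-b)^2."""
    return (v + lam * b) / (1.0 + lam)

def inertial_dr(b=3.0, lam=1.0, theta=0.1, iters=100):
    z_prev, z = 0.0, 0.0
    x = 0.0
    for _ in range(iters):
        y = z + theta * (z - z_prev)        # inertial extrapolation step
        x = prox_abs(y, lam)                # resolvent of A at y
        v = prox_quad(2.0 * x - y, lam, b)  # resolvent of B at the reflected point
        z_prev, z = z, y + v - x            # Douglas-Rachford update
    return x

# The zero of d|x| + (x - 3) is x* = 2 (soft-threshold of 3 at level 1).
print(inertial_dr())  # approx. 2.0
```

With `theta = 0` the same loop reduces to the non-inertial Douglas–Rachford iteration, so the sketch can also reproduce the kind of inertial versus non-inertial comparison reported in Tables 1, 2, 3.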
Our future projects include the following:
References
Alvarez F, Attouch H (2001) An inertial proximal method for maximal monotone operators via discretization of a nonlinear oscillator with damping. Set-Valued Anal 9:3–11
Attouch H, Cabot A (2019) Convergence of a relaxed inertial forward–backward algorithm for structured monotone inclusions. Appl Math Optim. https://doi.org/10.1007/s00245-019-09584-z
Bauschke HH, Combettes PL (2001) A weak-to-strong convergence principle for Fejér-monotone methods in Hilbert spaces. Math Oper Res 26:248–264
Bauschke HH, Combettes PL (2011) Convex analysis and monotone operator theory in Hilbert spaces. CMS books in mathematics. Springer, New York
Beck A, Teboulle M (2009) A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J Imaging Sci 2(1):183–202
Boikanyo OA (2016) The viscosity approximation forward–backward splitting method for zeros of the sum of monotone operators. Abstr Appl Anal. Article ID 2371857
Boţ RI, Csetnek ER (2016) An inertial forward-backward-forward primal-dual splitting algorithm for solving monotone inclusion problems. Numer Algorithms 71:519–540
Boţ RI, Csetnek ER, Hendrich C (2015) Inertial Douglas–Rachford splitting for monotone inclusion problems. Appl Math Comput 256:472–487
Cegielski A (2012) Iterative methods for fixed point problems in Hilbert spaces. Lecture notes in mathematics 2057. Springer, Berlin
Chang S-S, Wen C-F, Yao J-C (2019) A generalized forward-backward splitting method for solving a system of quasi variational inclusions in Banach spaces. RACSAM 113:729–747
Cholamjiak P (2016) A generalized forward-backward splitting method for solving quasi inclusion problems in Banach spaces. Numer Algorithms 71:915–932
Cholamjiak W, Cholamjiak P, Suantai S (2018) An inertial forward-backward splitting method for solving inclusion problems in Hilbert spaces. J Fixed Point Theory Appl 20:42
Combettes PL (2004) Solving monotone inclusions via compositions of nonexpansive averaged operators. Optimization 53:475–504
Dong Q, Jiang D, Cholamjiak P, Shehu Y (2017) A strong convergence result involving an inertial forward–backward algorithm for monotone inclusions. J Fixed Point Theory Appl 19:3097–3118
Dong QL, Cho YJ, Zhong LL, Rassias TM (2018) Inertial projection and contraction algorithms for variational inequalities. J Glob Optim 70(3):687–704
Douglas J, Rachford HH (1956) On the numerical solution of heat conduction problems in two or three space variables. Trans Am Math Soc 82:421–439
Eckstein J (1989) Splitting methods for monotone operators with applications to parallel optimization. Doctoral dissertation, Department of Civil Engineering, Massachusetts Institute of Technology. Available as Report LIDS-TH-1877, Laboratory for Information and Decision Sciences, MIT, Cambridge
Eckstein J, Bertsekas DP (1992) On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math Program 55(3):293–318
Fukushima M (1996) The primal Douglas–Rachford splitting algorithm for a class of monotone mappings with application to the traffic equilibrium problem. Math Program 72:1–15
Gabay D, Mercier B (1976) A dual algorithm for the solution of nonlinear variational problems via finite element approximations. Comput Math Appl 2:17–40
Gibali A, Thong DV (2018) Tseng type methods for solving inclusion problems and its applications. Calcolo 55:49
Glowinski R, Le Tallec P (1989) Augmented Lagrangian and operator-splitting methods in nonlinear mechanics. SIAM Studies in Applied Mathematics, vol 9, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA. ISBN: 0-89871-230-0
Glowinski R, Marrocco A (1975) Approximation par éléments finis d'ordre un et résolution par pénalisation-dualité d'une classe de problèmes non linéaires. RAIRO R2:41–76
Güler O (1991) On the convergence of the proximal point algorithm for convex minimization. SIAM J Control Optim 29:403–419
He B, Yuan X (2015) On the convergence rate of Douglas–Rachford operator splitting method. Math Program 153:715–722
Hirstoaga SA (2006) Iterative selection methods for common fixed point problems. J Math Anal Appl 324(2):1020–1035
Lions PL, Mercier B (1979) Splitting algorithms for the sum of two nonlinear operators. SIAM J Numer Anal 16:964–979
López G, Martín-Márquez V, Wang F, Xu H-K (2012) Forward–backward splitting methods for accretive operators in Banach spaces. Abstr Appl Anal. Article ID 109236
Lorenz DA (2013) Constructing test instances for Basis Pursuit Denoising. IEEE Trans Signal Process 61:1210–1214
Lorenz DA, Pock T (2015) An inertial forward–backward algorithm for monotone inclusions. J Math Imaging Vis 51:311–325
MacNamara S, Strang G (2016) Operator splitting. In: Glowinski R, Osher S, Yin W (eds) Splitting methods in communication, imaging, science, and engineering. Springer, Berlin, pp 95–114
Maingé PE (2008) Convergence theorem for inertial KM-type algorithms. J Comput Appl Math 219(1):223–236
Martinet B (1970) Régularisation d'inéquations variationnelles par approximations successives. Rev Française Informat Recherche Opérationnelle 4:154–158
Moreau JJ (1965) Proximité et dualité dans un espace Hilbertien. Bull Soc Math Fr 93:273–299
Riahi H, Chbani Z, Loumi M-T (2018) Weak and strong convergences of the generalized penalty Forward–Forward and Forward–Backward splitting algorithms for solving bilevel hierarchical pseudomonotone equilibrium problems. Optimization 67:1745–1767
Rockafellar RT (1976) Monotone operators and the proximal point algorithm. SIAM J Control Optim 14:877–898
Shehu Y (2016) Iterative approximations for zeros of sum of accretive operators in Banach spaces. J Funct Spaces. Article ID 5973468
Shehu Y (2018) Convergence rate analysis of inertial Krasnoselskii–Mann-type iteration with applications. Numer Funct Anal Optim 39:1077–1091
Shehu Y (2019) Convergence results of forward–backward algorithms for sum of monotone operators in Banach spaces. Results Math 74:138
Shehu Y, Cai G (2018) Strong convergence result of forward–backward splitting methods for accretive operators in Banach spaces with applications. RACSAM 112:71–87
Shehu Y, Li X-H, Dong Q-L (2020) An efficient projection-type method for monotone variational inequalities in Hilbert spaces. Numer Algorithms 84:365–388
Solodov MV, Svaiter BF (2000) Forcing strong convergence of proximal point iterations in a Hilbert space. Math Program Ser A 87:189–202
Svaiter BF (2011) On weak convergence of the Douglas–Rachford method. SIAM J Control Optim 49:280–287
Thong DV, Cholamjiak P (2019) Strong convergence of a forward–backward splitting method with a new step size for solving monotone inclusions. Comput Appl Math 38:94
Thong DV, Vinh NT (2019) Inertial methods for fixed point problems and zero point problems of the sum of two monotone mappings. Optimization 68:1037–1072
van den Berg E, Friedlander MP (2007) SPGL1: a solver for large-scale sparse reconstruction. http://www.cs.ubc.ca/labs/scl/spgl1. Version 1.9. Accessed 2015
Villa S, Salzo S, Baldassarre L, Verri A (2013) Accelerated and inexact forward–backward algorithms. SIAM J Optim 23:1607–1633
Wang Y, Wang F (2018) Strong convergence of the forward–backward splitting method with multiple parameters in Hilbert spaces. Optimization 67:493–505
Zhang H, Cheng L (2013) Projective splitting methods for sums of maximal monotone operators with applications. J Math Anal Appl 406:323–334
Acknowledgements
Open access funding provided by Institute of Science and Technology (IST Austria). The project of Yekini Shehu has received funding from the European Research Council (ERC) under the European Union’s Seventh Framework Program (FP7—2007–2013) (Grant Agreement No. 616160). The authors are grateful to the anonymous referees and the handling Editor for their comments and suggestions which have improved the earlier version of the manuscript greatly.
Shehu, Y., Dong, QL., Liu, LL. et al. New strong convergence method for the sum of two maximal monotone operators. Optim Eng 22, 2627–2653 (2021). https://doi.org/10.1007/s11081-020-09544-5