Abstract
This paper establishes a strong convergence result for a Douglas–Rachford splitting method with an inertial extrapolation step for finding a zero of the sum of two set-valued maximal monotone operators, without assuming uniform monotonicity of either of the involved operators. Furthermore, the proposed method is easy to implement, and its inertial factor is a natural choice. Our method of proof is of independent interest. Finally, some numerical implementations are given to confirm the theoretical analysis.
1 Introduction
Let H be a real Hilbert space with scalar product \(\langle \cdot , \cdot \rangle\) and induced norm \(\Vert \cdot \Vert\). An operator \(A:H\rightarrow 2^H\) with domain D(A) is said to be monotone if
A is maximal monotone if its graph
is not properly contained in the graph of any other monotone operator.
Let us consider the inclusion problem of the form
where A and B are set-valued maximal monotone operators in H. Throughout this paper, we assume that the solution set of (1), denoted by S, is nonempty.
The proximal point algorithm (PPA) is the best-known method for solving the inclusion problem (1) (see Lions and Mercier 1979; Martinet 1970; Moreau 1965; Rockafellar 1976). The PPA for solving (1) is expressed as
where \(\lambda >0\) is the proximal parameter. Implementing PPA (2) to solve (1) requires computing the resolvent operator of the sum \(A+B\) exactly. This is very difficult and could be as hard as the original inclusion problem (1). This difficulty has led many authors to consider operator splitting approaches to solve (1). The aim of an operator splitting method is to circumvent the computation of \(J^\lambda _{A+B}\) in (2) and instead work with \(J^\lambda _A\) and \(J^\lambda _B\) separately (Eckstein and Bertsekas 1992; Glowinski and Le Tallec 1989; Lions and Mercier 1979).
When both A and B are single-valued linear operators in (1), Douglas and Rachford (1956) proposed the following method for solving heat conduction problems:
We can eliminate \(u_{k+\frac{1}{2}}\) in (3) above and obtain
Define \(z_k:=\Big (J^\lambda _B\Big )^{-1}u_k\Leftrightarrow u_k=J^\lambda _B(z_k)\). Then, (4) reduces to the following splitting method (known as Douglas–Rachford splitting method)
Lions and Mercier (1979) extended the Douglas–Rachford splitting method (5) to the generic case where both A and B are set-valued nonlinear operators, as in our problem (1). In this generic case, the method proceeds as follows (Lions and Mercier 1979): starting from an arbitrary iterate \(u_1\) in the domain of B, choose \(b_1 \in B(u_1)\) and set \(z_1 = u_1 +\lambda b_1\); then \(u_1 = J^\lambda _B(z_1)\) (the pair \((u_1,z_1)\) exists and is unique by the Representation Lemma, see Eckstein and Bertsekas 1992, cor. 2.3). Thus a sequence \(\{z_k\}\) is generated by the Douglas–Rachford scheme (5), and consequently a sequence \(\{u_k := J^\lambda _B(z_k)\}\) converging to a solution of (1) can be generated (see Eckstein 1989, Thm. 3.15). We refer to Combettes (2004) for the precise connection between (5) and the original Douglas–Rachford scheme of Douglas and Rachford (1956) for heat conduction problems. More details on the Douglas–Rachford splitting method (5) can be found in Fukushima (1996), Gabay and Mercier (1976) and Glowinski and Marrocco (1975).
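The generic scheme above can be sketched numerically. The following minimal one-dimensional illustration uses \(\lambda =1\) and the concrete choices \(A=\partial |\cdot |\) (whose resolvent is soft-thresholding) and \(B=N_{[1,2]}\) (whose resolvent is the projection onto [1, 2]); these operators are our own illustrative choices, not from the paper. The sequence \(u_k = J^\lambda _B(z_k)\) approaches the solution \(x^*=1\) of \(0\in Ax+Bx\).

```python
# Minimal 1-D sketch of the Douglas-Rachford splitting (5) for 0 in A(x)+B(x),
# with illustrative choices: A = subdifferential of |x|, B = normal cone of C = [1,2].
# The unique zero of A + B is x* = 1, since 1 in d|1| and -1 in N_C(1).

lam = 1.0

def J_A(z):          # resolvent of A: soft-thresholding with parameter lam
    return max(abs(z) - lam, 0.0) * (1.0 if z > 0 else -1.0)

def J_B(z):          # resolvent of B = N_[1,2]: projection onto [1,2]
    return min(max(z, 1.0), 2.0)

z = 5.0              # arbitrary starting point
for _ in range(50):
    u = J_B(z)
    z = z + J_A(2.0 * u - z) - u   # z_{k+1} = z_k + J_A(2 J_B z_k - z_k) - J_B z_k

print(abs(J_B(z) - 1.0))  # → 0.0
```

Here the governing sequence \(\{z_k\}\) reaches the fixed point \(z^*=0\) in a few steps, and the shadow sequence \(u_k=J^\lambda _B(z_k)\) delivers the solution.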
1.1 Motivations and contributions
Boţ et al. (2015) gave the following method for solving (1): \(z_0=z_1\);
where \(\{\alpha _k\}\) is a non-decreasing sequence with \(0\le \alpha _k \le \alpha <1, \forall k \ge 1\) and \(\lambda , \sigma , \delta >0\) such that
- (a) \(\delta >\frac{\alpha ^2(1+\alpha )+\alpha \sigma }{1-\alpha ^2}\); and
- (b) \(0 <\lambda \le \beta _k \le \theta :=2\frac{\delta -\alpha [\alpha (1+\alpha )+\alpha \delta +\sigma ]}{\delta [1+\alpha (1+\alpha )+\alpha \delta +\sigma ]}\).
Boţ et al. (2015) obtained a weak convergence analysis of algorithm (6) for finding zeros of the sum of two maximal monotone operators and illustrated their results through some numerical experiments. The same conditions (a) and (b) above have been used in recent works such as Dong et al. (2018), Shehu (2018) and other associated papers. When \(\alpha _k=0\), it was proved in Bauschke and Combettes (2011, Thm. 25.6(vii)) that \(\{z_k\}\) in (6) converges strongly to a solution of (1) if either A or B is uniformly monotone (A is uniformly monotone if \(\langle x-y,u-v\rangle \ge \phi (\Vert x-y\Vert ), \forall u \in Ax, v\in Ay\), where \(\phi :[0,\infty )\rightarrow [0,\infty )\) is increasing and vanishes only at zero) on every nonempty bounded subset of its domain.
When \(\beta _k=1\) and \(B\equiv 0\), (6) reduces to the inertial proximal point method proposed by Alvarez and Attouch (2001). In this case, Alvarez and Attouch (2001) assumed in their convergence result that the inertial factor \(\alpha _k\) satisfies \(0\le \alpha _k\le \alpha _{k+1}\le \alpha <\frac{1}{3}\). However, the assumptions on the inertial factor \(\alpha _k\) imposed in (6) are not as simple as the condition \(0\le \alpha _k\le \alpha _{k+1}\le \alpha <\frac{1}{3}\) assumed by Alvarez and Attouch (2001).
Problems in many disciplines, such as economics, image recovery, electromagnetics, quantum physics, and control theory, arise in infinite-dimensional spaces. For such problems, strong convergence of the sequence of iterates \(z_k\) of a proposed iterative procedure is often much more desirable than weak convergence, because strong convergence reflects the physically tangible property that the energy \(\Vert z_k-z\Vert\) of the error between the iterate \(z_k\) and a solution z eventually becomes arbitrarily small. The importance of strong convergence is also underlined in the work of Güler (1991), where a convex function f is minimized through the proximal point algorithm: Güler (1991) showed that the rate of convergence of the value sequence \(\{f(z_k)\}\) is better when \(\{z_k\}\) converges strongly than when it converges only weakly. For more details on the importance of strong convergence, see Bauschke and Combettes (2001).
Strong convergence methods for solving problem (1) when B is a set-valued maximal monotone operator and A is a single-valued \(\kappa\)-inverse strongly monotone operator (i.e., \(\langle Ax-Ay,x-y\rangle \ge \kappa \Vert Ax-Ay\Vert ^2,~~\forall x, y\in H\)) have been studied extensively in the literature (see, for example, Boikanyo 2016; Chang et al. 2019; Cholamjiak 2016; Cholamjiak et al. 2018; Dong et al. 2017; Gibali and Thong 2018; López et al. 2012; Riahi et al. 2018; Shehu 2016, 2019; Shehu and Cai 2018; Thong and Cholamjiak 2019; Wang and Wang 2018). However, there are still few strong convergence results for the more general case of problem (1) in which both A and B are set-valued maximal monotone operators. This is the gap that this paper aims to fill.
Our aim in this paper is to give a strong convergence analysis of the inertial Douglas–Rachford splitting method under conditions different from conditions (a) and (b) assumed in Boţ et al. (2015), without assuming uniform monotonicity of either maximal monotone operator A or B. Furthermore, our assumptions on the inertial factor \(\theta _k\) are the same as those in Alvarez and Attouch (2001) (whose result is a special case of ours). In summary,
- We prove strong convergence of the inertial Douglas–Rachford splitting method without using conditions (a) and (b) assumed in Boţ et al. (2015). Our inertial conditions are the same as the ones assumed in Alvarez and Attouch (2001) for finding a zero of a set-valued maximal monotone operator via the inertial proximal method.
- We obtain strong convergence without assuming that either of the involved maximal monotone operators is uniformly monotone on every nonempty bounded subset. Our strong convergence results are therefore more general than the current ones in Bauschke and Combettes (2011) and other associated works where strong convergence is obtained.
- Some numerical examples are given to confirm the importance of the inertial term in our method.
The paper is therefore organized as follows: We first recall some basic explanations of Douglas–Rachford splitting method and introduce our inertial Douglas–Rachford splitting method alongside some results in Sect. 2. The analysis of strong convergence of our proposed method is then investigated in Sect. 3. We give numerical implementations in Sect. 4 and conclude with some final remarks in Sect. 5.
2 Preliminaries
Let us first recall some basics that are required to derive and analyze the Douglas–Rachford splitting method; for the corresponding details, we refer to Eckstein and Bertsekas (1992), He and Yuan (2015), Svaiter (2011) and Zhang and Cheng (2013).
Let \(\lambda > 0\) be a fixed parameter, and let us denote by
the resolvents of A and B, respectively, which are known to be firmly nonexpansive (an operator T is firmly nonexpansive if \(\langle x-y,Tx-Ty\rangle \ge \Vert Tx-Ty\Vert ^2, ~~\forall x,y\in H\)). Furthermore, let us write
for the corresponding reflections (also called Cayley operators), and note that the reflections are nonexpansive operators (T is nonexpansive if \(\Vert Tx-Ty\Vert \le \Vert x-y\Vert ,~~\forall x,y\in H\)).
In Eckstein and Bertsekas (1992) and He and Yuan (2015), the maximal monotone operator \(S_{\lambda ,A,B}\) is defined as
It was shown in Eckstein and Bertsekas (1992) that the Douglas–Rachford splitting method (5) can be converted to
By Eckstein and Bertsekas (1992, Thm. 5), for any given zero \(z^*\) of \(S_{\lambda ,A,B}\), \(J^\lambda _B(z^*)\) is a zero of \(A+B\). Therefore, \(J^\lambda _B(z^*)\) is a solution of (1) whenever \(z^*\) satisfies
Consequently, the Douglas–Rachford splitting method (5) can be rewritten as
where \(e(z_k,\lambda ):=\frac{1}{2}(z_k-R^\lambda _A \circ R^\lambda _B(z_k))\).
In this paper, our convergence analysis will be conducted for an inertial generalized version of Douglas–Rachford splitting method (8): \(z_0, z_1 \in H\),
with \(\alpha _k \in [0,1), \beta _k \in (0,1]\) and \(\theta _k \in [0,1)\). We get the original Douglas–Rachford method (8) when \(\beta _k=1, \theta _k=0=\alpha _k\) in (9).
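The displayed update rule of (9) is not reproduced in this excerpt. The sketch below is a minimal one-dimensional implementation under the assumption, consistent with the relation \(\beta _k e(y_k,\lambda )=y_k-z_{k+1}\) used in Lemma 3.1 and with the anchor point \(z_0\) appearing in Theorem 3.6, that \(y_k = z_k + \theta _k(z_k-z_{k-1}) + \alpha _k(z_0-z_k)\) and \(z_{k+1}=y_k-\beta _k e(y_k,\lambda )\); the operators \(A=\partial |\cdot |\) and \(B=N_{[1,2]}\) and all parameter values are our own illustrative choices.

```python
# Hedged 1-D sketch of the inertial scheme (9) with A = d|.|, B = N_[1,2], lambda = 1.
# ASSUMED update (not stated verbatim in this excerpt):
#   y_k     = z_k + theta_k (z_k - z_{k-1}) + alpha_k (z_0 - z_k)
#   z_{k+1} = y_k - beta_k e(y_k, lambda)

def J_A(z): return max(abs(z) - 1.0, 0.0) * (1.0 if z > 0 else -1.0)
def J_B(z): return min(max(z, 1.0), 2.0)
def R_A(z): return 2.0 * J_A(z) - z           # reflection (Cayley operator)
def R_B(z): return 2.0 * J_B(z) - z
def e(z):   return 0.5 * (z - R_A(R_B(z)))    # residual e(z, lambda)

z0 = 5.0
z_prev, z = z0, z0
for k in range(1, 5001):
    theta, alpha, beta = 0.3, 1.0 / (k + 1), 0.6  # theta < 1/3; alpha_k -> 0, sum = inf
    y = z + theta * (z - z_prev) + alpha * (z0 - z)
    z_prev, z = z, y - beta * e(y)

print(abs(J_B(z) - 1.0))   # u_k = J_B(z_k) approaches the solution x* = 1
```

Because \(\alpha _k\) decays like 1/k, the anchoring term vanishes slowly, which is why the sketch runs many iterations before \(z_k\) settles near the fixed point.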
We next recall some properties of the projection. For any point \(u \in H\), there exists a unique point \(P_C u \in C\) such that
\(P_C\) is called the metric projection of H onto C. We know that \(P_C\) is a nonexpansive mapping of H onto C. It is also known that \(P_C\) satisfies
In particular, we get from (10) that
Furthermore, \(P_C x\) is characterized by the properties
This characterization implies that
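The variational characterization of the projection can be checked numerically. The sketch below verifies, for the illustrative choice C = closed unit disc in \(\mathbb {R}^2\) (our choice, not from the paper), that \(z=P_C x\) satisfies \(\langle x-z, y-z\rangle \le 0\) for sampled points \(y\in C\).

```python
import random

# Numerical check of the projection characterization:
# for z = P_C(x), <x - z, y - z> <= 0 for every y in C.
# Illustrative choice: C = closed unit disc in R^2.

def proj_disc(x):
    n = (x[0] ** 2 + x[1] ** 2) ** 0.5
    return x if n <= 1.0 else (x[0] / n, x[1] / n)

random.seed(0)
x = (3.0, -4.0)                       # a point outside C
z = proj_disc(x)                      # z = P_C x
for _ in range(1000):                 # sample y in C and test the inequality
    y = (random.uniform(-1, 1), random.uniform(-1, 1))
    if y[0] ** 2 + y[1] ** 2 <= 1.0:
        ip = (x[0] - z[0]) * (y[0] - z[0]) + (x[1] - z[1]) * (y[1] - z[1])
        assert ip <= 1e-12
print(z)  # → (0.6, -0.8)
```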
The following result was obtained in Shehu et al. (2020), but we give the proof for the sake of completeness.
Lemma 2.1
Let \(S \subseteq H\) be a nonempty, closed, and convex subset of a real Hilbert space H. Let \(u \in H\) be arbitrarily given, \(z := P_S u\), and \(\Omega := \{ x \in H : \langle x - u, x - z \rangle \le 0 \}\). Then \(\Omega \cap S = \{ z \}\).
Proof
By definition, it follows immediately that \(z \in \Omega \cap S\). Conversely, take an arbitrary \(y \in \Omega \cap S\). Then, in particular, we have \(y \in \Omega\), and it therefore follows that
Using \(z = P_S u\) together with the characterization (12), we also have
In particular, since \(y \in S\), we therefore have \(\langle u-z, z-y \rangle \ge 0\). Hence (14) implies \(\Vert y-z \Vert ^2 \le 0\), so that \(y = z\). This completes the proof. \(\square\)
Finally, we state some basic properties that will be used in our convergence theorems.
Lemma 2.2
The following statements hold in H:
- (a) \(\Vert x+y\Vert ^2=\Vert x\Vert ^2+2\langle x,y\rangle +\Vert y\Vert ^2\) for all \(x, y \in H\).
- (b) \(2 \langle x-y, x-z \rangle = \Vert x-y \Vert ^2 + \Vert x-z \Vert ^2 - \Vert y-z \Vert ^2\) for all \(x,y,z \in H\).
- (c) \(\Vert tx+sy\Vert ^2=t(t+s)\Vert x\Vert ^2+s(t+s)\Vert y\Vert ^2-st\Vert x-y\Vert ^2, \quad \forall x, y \in H, \forall s, t \in \mathbb {R}.\)
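The identities of Lemma 2.2 hold in any real Hilbert space; the following spot-check in \(\mathbb {R}^3\) (our illustrative choice) verifies them on random vectors.

```python
import random

# Numerical spot-check of the three identities in Lemma 2.2 in R^3.

def dot(a, b): return sum(ai * bi for ai, bi in zip(a, b))
def nrm2(a):   return dot(a, a)
def add(a, b): return [ai + bi for ai, bi in zip(a, b)]
def sub(a, b): return [ai - bi for ai, bi in zip(a, b)]
def scl(t, a): return [t * ai for ai in a]

random.seed(1)
x, y, z = ([random.gauss(0, 1) for _ in range(3)] for _ in range(3))
t, s = 0.7, -1.3

# (a) ||x + y||^2 = ||x||^2 + 2<x, y> + ||y||^2
assert abs(nrm2(add(x, y)) - (nrm2(x) + 2 * dot(x, y) + nrm2(y))) < 1e-12
# (b) 2<x - y, x - z> = ||x - y||^2 + ||x - z||^2 - ||y - z||^2
assert abs(2 * dot(sub(x, y), sub(x, z))
           - (nrm2(sub(x, y)) + nrm2(sub(x, z)) - nrm2(sub(y, z)))) < 1e-12
# (c) ||tx + sy||^2 = t(t+s)||x||^2 + s(t+s)||y||^2 - st||x - y||^2
assert abs(nrm2(add(scl(t, x), scl(s, y)))
           - (t * (t + s) * nrm2(x) + s * (t + s) * nrm2(y)
              - s * t * nrm2(sub(x, y)))) < 1e-12
print("identities verified")
```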
Lemma 2.3
(Maingé 2008) Assume that \(\varphi _{k}\in [0,\infty )\) and \(\delta _{k}\in [0,\infty )\) satisfy:
- (1) \(\varphi _{k+1}-\varphi _{k}\le \theta _{k}(\varphi _{k}-\varphi _{k-1})+\delta _{k},\)
- (2) \(\sum _{k=1}^{\infty }\delta _{k}<\infty ,\)
- (3) \(\{\theta _{k}\}\subset [0,\theta ],\) where \(\theta \in (0,1).\)
Then the sequence \(\{\varphi _{k}\}\) is convergent with \(\sum _{k=1}^{\infty }[\varphi _{k+1}-\varphi _{k}]_{+}<\infty ,\) where \([t]_{+}:=\max \{t,0\}\) (for any \(t\in \mathbb {R})\).
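Lemma 2.3 can be illustrated numerically in the equality case of condition (1). The sketch below uses our illustrative choices \(\theta =0.5\) and \(\delta _k=2^{-k}\) (summable); the increments then obey \(d_k=\theta d_{k-1}+\delta _k\), so they vanish and \(\{\varphi _k\}\) converges (here to the value 4, which one can also compute by summing \(d_k=(k+1)2^{-k}\)).

```python
# Numerical illustration of Lemma 2.3 (Mainge 2008) in the equality case:
# phi_{k+1} - phi_k = theta * (phi_k - phi_{k-1}) + delta_k,
# with theta = 0.5 in (0,1) and delta_k = 2^{-k} summable (our choices).

theta = 0.5
phi_prev, phi = 0.0, 1.0          # phi_0 = 0, phi_1 = 1
for k in range(1, 200):
    delta = 2.0 ** (-k)
    phi_prev, phi = phi, phi + theta * (phi - phi_prev) + delta

print(phi)  # the sequence converges (limit 4 for these choices)
```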
3 Analysis of the convergence
For the rest of this paper, we assume that \(S\ne \emptyset\), that \(\{\alpha _k\}\subset (0,1)\) is non-increasing with \(\lim _{k\rightarrow \infty } \alpha _k=0\) and \(\sum _{k=1}^\infty \alpha _k=\infty\), that \(0<\beta \le \beta _k\le 1\), and that \(0\le \theta _k\le \theta _{k+1}\le \theta <\frac{1}{3}\).
Lemma 3.1
Let \(\{z_k\}\) be the sequence generated by (9). For any z satisfying (7), we have
Proof
By (9), we get
We know that \(e(y_k,\lambda )=\frac{1}{2}(y_k-R^\lambda _A\circ R^\lambda _B(y_k))\), where \(\lambda >0\) is the proximal parameter, is firmly nonexpansive (see He and Yuan 2015, lem. 2.2). Thus,
In particular, for \(z=R^\lambda _A\circ R^\lambda _B(z)\), we obtain
Putting (17) into (16), we have
Recall that \(\beta _k e(y_k,\lambda )=y_k-z_{k+1}\) implies that
Using (19) in (18) and the condition that \(0<\beta \le \beta _k\le 1\), we have
\(\square\)
Lemma 3.2
Let \(\{z_k\}\) be the sequence generated by (9). For any z satisfying (7), we have
Proof
Moreover, from the definition of \(y_k\), we obtain using Lemma 2.2 (a) that
and, similarly, with z replaced by \(z_{k+1}\) in the previous formula,
Substituting (21) and (22) into (15) and eliminating identical terms, we get
Therefore, we obtain
where the last identity exploits Lemma 2.2 (a) twice. We therefore have
Using the fact that \(\{\theta _k\}\) is non-decreasing and \(\{ \alpha _k \}\) is non-increasing, we then obtain
which is the desired inequality. \(\square\)
Our first central result below shows that the sequence \(\{ z_k \}\) generated by (9) is bounded.
Lemma 3.3
The sequence \(\{z_k\}\) generated by (9) is bounded.
Proof
A simple re-ordering of (20) implies that
where the equality uses once again Lemma 2.2 (a). Hence, by cancellation, re-ordering, and neglecting a non-positive term on the right-hand side, we obtain
Let \(\mu _j:=e^{\sum _{i=1}^{j}\alpha _{i}}, j\ge 1\). Using \(1-x \le e^{-x}\) for all \(x \in \mathbb R\) (or equivalently, \(1-e^{-x} \le x, x \in \mathbb {R}\)), we obtain
Then (29) consequently implies that
Since \(\{\alpha _k\}\) is non-increasing in (0,1), this implies
It then follows from (28) and (30) that
Since \(\mu _{k}\le \mu _{k+1}\), \(\mu _{k+1}=\mu _{k}e^{\alpha _{k+1}}\) and \(\{\alpha _k\}\) is non-increasing in (0,1), we therefore get
which can be rewritten as (since \(\{\alpha _k\}\) is non-increasing in (0,1))
Since the sequence \(\{\theta _k\}\) belongs to the interval \([0,\theta ]\), we have
Using \(\lim _{k \rightarrow \infty } \alpha _k = 0\) and \(\theta \in [0, 1/3)\), it follows that the right-hand side is eventually bounded from below by a positive number, i.e., there is a constant \(\gamma > 0\) such that \(1-\theta _{k+1} \big ( 3+2(e^{\alpha _{k+1}}-1) \big )-\alpha _k\ge \gamma\) for all \(k \in \mathbb N\) sufficiently large, say, for all \(k \ge k_0\). Hence, we have
This implies that for \(k\ge k_{0}\),
Thus, dividing by \(\mu _{k+1}\) and omitting a non-positive term, we get
where \(t_{k}:=\sum _{i=1}^{k}\alpha _{i}\). Since \(\alpha _k \in (0,1)\) for all \(k \in \mathbb N\), it is easy to see that \(\alpha _{k}e^{t_{k+1}}\le e^{2}(e^{t_{k}}-e^{t_{k-1}})\) for all \(k\ge 2\), so that
which, by (32), \(e^{-t_{k+1}} \le 1\), and the fact that \(\{\theta _k\}\) belongs to the interval \([0, \theta ] \subset [0,\frac{1}{3})\), yields
Using (33), \(\theta \in [0,1)\), and the convergence of the geometric series, a simple calculation gives
Using once again that \(\theta < 1\), this shows that \(\{z_k\}\) is bounded. \(\square\)
Next, we formulate a simple lemma that turns out to be useful for proving the strong convergence result.
Lemma 3.4
Let \(\{z_k\}\) be the sequence generated by (9). Define
for all \(k \in \mathbb N\). Then \(u_k \ge 0\) for all \(k \in \mathbb N\).
Proof
Since \(\{\theta _k\}\) is non-decreasing with \(0\le \theta _k< \frac{1}{3}\), and by Lemma 2.2 (a), we have
and this completes the proof. \(\square\)
Before we prove our main strong convergence result, we state another preliminary result which provides sufficient conditions for the strong convergence of the sequence \(\{z_k\}\) generated by our method (9). In our strong convergence result, we will then show that these sufficient conditions automatically hold.
Lemma 3.5
Let \(\{z_k\}\) be the sequence generated by (9). Assume that
and
Then the entire sequence \(\{z_k\}\) converges strongly to the solution z.
Proof
By assumption, we have
We claim that this already implies
from which the strong convergence of the entire sequence \(\{z_k\}\) to z follows immediately. Assume this limit does not hold. Then there is a subset \(K\subseteq \mathbb {N}\) and a constant \(\rho > 0\) such that
Since \(\lim _{k\rightarrow \infty }\Vert z_{k+1}-z_k\Vert =0\) by assumption and \(0\le \theta <1\), we obtain (recall that if \(\{a_k\}\) and \(\{b_k\}\) are bounded sequences in \(\mathbb {R}\) and one of them converges, then \(\limsup _{k\rightarrow \infty } (a_k+b_k)=\limsup _{k\rightarrow \infty } a_k +\limsup _{k\rightarrow \infty } b_k\))
Using (34) and \(\theta _k\le \theta <1\), we get
Consequently, we have \(\limsup _{k \in K}\Vert z_k-z\Vert \le 0.\) Since \(\liminf _{k \in K}\Vert z_k-z\Vert \ge 0\) obviously holds, it follows that \(\lim _{k \in K}\Vert z_k-z\Vert = 0.\) This implies [by (35)]
for all \(k\in K\) sufficiently large, a contradiction to the assumption that \(\lim _{k\rightarrow \infty }\Vert z_{k+1}-z_k\Vert =0.\) This completes the proof. \(\square\)
We are now ready to obtain strong convergence of the sequence \(\{ z_k \}\) generated by (9) to an element of S.
Theorem 3.6
The sequence \(\{z_k\}\) generated by (9) strongly converges to z, where \(z=P_Sz_0\).
Proof
Let \(u_{k}\) denote the nonnegative number defined in Lemma 3.4, and let us apply Lemma 3.2. We obtain from (20) that
We now consider two cases.
Case 1 Suppose \(\{u_{k}\}\) is eventually monotonically decreasing, i.e., for some \(k_{0} \in \mathbb N\) large enough, we have \(u_{k+1} \le u_k\) for all \(k \ge k_0\). Then, since \(u_k\) is nonnegative for all \(k \in \mathbb N\) by Lemma 3.4, \(\{u_{k}\}\) is a convergent sequence. Consequently, \(\lim _{k\rightarrow \infty }u_{k}=\lim _{k\rightarrow \infty }u_{k+1}\). Since \(\{z_k\}\) is bounded by Lemma 3.3, there exists \(M>0\) such that \(2|\langle z_k-z,z_k-z_0\rangle |\le M.\) Moreover, there exist \(N\in \mathbb {N}\) and \(\gamma _{1}>0\) such that \(1-3\theta _{k+1}-\alpha _k \ge \gamma _{1}\) for all \(k\ge N\). Therefore, for \(k\ge N\), we obtain from (36) that
Hence
Together with \(\alpha _k \rightarrow 0\), the boundedness of \(\{ z_k \}\), and the convergence of \(\{ u_k \}\), we therefore obtain from the definition of \(u_k\) that the limit
exists and is equal to \(\lim _{k \rightarrow \infty } u_{k+1}\). In particular, Lemma 3.4 therefore implies that \(\lambda \ge 0\). We will show that \(\lambda = 0\) holds; then (37) together with the fact that \(\theta _k \le \theta < 1\) for all \(k \in \mathbb N\) yields the strong convergence of the sequence \(\{ z_k \}\) to the solution z.
By contradiction, assume that \(\lambda > 0\). Since \(\{ z_k \}\) is bounded by Lemma 3.3, we can choose a subsequence \(\{z_{k_{j}}\}\) which converges weakly to an element \(p\in H\) and such that
We show that \(p \in S\). Observe that the updating rule for \(y_k\) implies
This yields
Let \(Ty:=\frac{1}{2}y+\frac{1}{2}R^\lambda _A\circ R^\lambda _B(y),~~y \in H\). Then it is clear that T is nonexpansive, and \(z \in F(T):=\{x\in H:x=Tx\}\) if and only if \(z=R^\lambda _A\circ R^\lambda _B(z)\). Similarly, it is easy to see that \(e(y_k,\lambda )= \frac{1}{2}(y_k-R^\lambda _A\circ R^\lambda _B(y_k))=y_k-Ty_k\). Therefore,
Demiclosedness Principle of T implies that \(p \in F(T)\). Hence, \(p \in S\). This implies that
where the inequality follows from the characterization (12) of a projection applied to \(z = P_S z_0\) and \(p \in S\). Since (37) yields
and since \(\lambda > 0\) by assumption, we have
for some sufficiently large \(k_1 \in \mathbb N\). Using the identity
we therefore get
from (38). Using once again the assumption that \(\lambda > 0\), this implies
for some sufficiently large \(k_2 \in \mathbb N, k_2 \ge k_1\). From (36), we therefore obtain
This implies
where the second inequality follows from Lemma 3.4. Since \(\lambda > 0\), this gives the summability of the sequence \(\{ \alpha _k \}\), a contradiction to our assumption. Hence we must have \(\lambda = 0\), and this yields the strong convergence of the sequence \(\{ z_k \}\) to z.
Case 2 Assume that \(\{u_k\}\) is not eventually monotonically decreasing. Then let \(\tau :\mathbb {N}\rightarrow \mathbb {N}\) be the map defined for all \(k\ge k_{0}\) (for some \(k_{0} \in \mathbb N\) large enough) by
Clearly, \(\tau (k)\) is non-decreasing with \(\tau (k) \rightarrow \infty\) as \(k\rightarrow \infty\), and \(u_{\tau (k)}\le u_{\tau (k)+1}\) for all \(k\ge k_{0}\). Hence, similarly to the proof of Case 1, we obtain from (36) that
for some constant \(M > 0\). Thus,
Using the same technique of the proof as in Case 1, one can also derive the limits
Again observe that for \(j\ge 0\) by (36), we have \(u_{j+1}<u_{j}\) when \(z_{j}\not \in \Omega :=\{x\in H: \langle x-z_0,x-z\rangle \le 0\}\) (note that this \(\Omega\) is the same set as in Lemma 2.1). Hence \(z_{\tau (k)}\in \Omega\) for all \(k\ge k_{0}\) since \(u_{\tau (k)} \le u_{\tau (k)+1}\). Since \(\{z_{\tau (k)}\}\) is bounded, we may choose a subsequence (which we again denote by \(\{z_{\tau (k)}\}\)) which converges weakly to some \(x^{*}\in H\). As \(\Omega\) is closed and convex, it is weakly closed, and so \(x^{*} \in \Omega\). Using (43), one can see as in Case 1 that \(x^* \in S\). Consequently, we have \(x^{*}\in \Omega \cap S\). In view of Lemma 2.1, however, the intersection \(\Omega \cap S\) contains z as its only element. We therefore get \(x^* = z\). Furthermore, we have
since \(z_{\tau (k)}\in \Omega\). Taking lim sup in this last inequality gives
Hence
We claim that this implies \(\lim _{k \rightarrow \infty } u_{\tau (k)+1} = 0\). By definition, \(u_{\tau (k)+1}\) is equal to
Adding and subtracting \(z_{\tau (k)}\) inside the norm of the first term, and using (41) and (44), we see that the first term goes to zero. The second term also converges to zero in view of (44), taking into account the boundedness of \(\{ \theta _k \}\). The third term vanishes in the limit because of (41), noting once again that \(\{ \theta _k \}\) is a bounded sequence. Finally, the last term goes to zero since \(\{ \alpha _k \}\) converges to zero and the sequence \(\{ z_k \}\) is bounded by Lemma 3.3.
We next show that we actually have \(\lim _{k \rightarrow \infty } u_k = 0\). To this end, first observe that, for \(k\ge k_{0},\) one has \(u_{k}\le u_{\tau (k)+1}\) if \(k\ne \tau (k)\) (that is, if \(\tau (k)<k\)) because we necessarily have \(u_{j}>u_{j+1}\) for \(\tau (k)+1\le j\le k-1\). It follows that for all \(k\ge k_{0}\), we have \(u_{k}\le \max \{u_{\tau (k)}, u_{\tau (k)+1}\}=u_{\tau (k)+1} \rightarrow 0\), hence \(\limsup _{k\rightarrow \infty }u_{k}\le 0\). On the other hand, Lemma 3.4 implies that \(\liminf _{k \rightarrow \infty } u_k \ge 0\). Together we obtain \(\lim _{k \rightarrow \infty } u_k = 0\).
Consequently, the boundedness of \(\{ z_k \}\), assumptions on our iterative parameters and (36) show that
Hence the definition of \(u_k\) yields
Using our assumption, it is not difficult to see that this implies the strong convergence of the entire sequence \(\{ z_k \}\) to the particular solution z. The statement therefore follows from Lemma 3.5. \(\square\)
In the special case when B is a set-valued maximal monotone operator and A is a single-valued \(\kappa\)-inverse strongly monotone operator in problem (1), iterative procedure (9) reduces to the following: \(z_0, z_1 \in H\),
with \(0<\lambda <2\kappa\). Moreover, we obtain strong convergence for this special case of the monotone inclusion; its proof can be obtained by following the line of argument of the previous lemmas and Theorem 3.6.
Corollary 3.7
Suppose B is a set-valued maximal monotone operator and A is a single-valued \(\kappa\)-inverse strongly monotone operator. Assume that \(S:=\{x\in H: 0\in Ax+Bx\}\ne \emptyset\). Let \(\{z_k\}\) be the sequence generated by (45) with \(0<\beta \le \beta _k\le \frac{1}{2}\), \(0<\lambda <2\kappa\) and \(0\le \theta _k\le \theta _{k+1}\le \theta <\frac{1}{3}\). Then \(\{z_k\}\) strongly converges to z, where \(z=P_Sz_0\).
We next relate our results to some existing results from the literature.
Remark 3.8
- (a) In the results of Thong and Vinh (2019, Thm. 3.5), strong convergence for a monotone inclusion was obtained under some assumptions on the iterative sequence. The monotone inclusion studied in Thong and Vinh (2019) involves the sum of a set-valued maximal monotone operator and a single-valued inverse strongly monotone operator. In this paper, our method is designed so that no assumption is imposed on the iterative sequence, even for the more general problem considered here.
- (b) Algorithm (45) can be regarded as an inertial, strongly convergent version of some recent results in Attouch and Cabot (2019), Boţ and Csetnek (2016), Lorenz and Pock (2015) and Villa et al. (2013). \(\Diamond\)
4 Numerical experiments
In all the examples in this section, we compare our proposed method (9) with its non-inertial version (\(\theta _k=0\)), the method of Thong and Vinh (2019, Thm. 3.5) and that of Shehu (2016). Our aim is to compare our method with other relevant strong convergence methods in the literature.
Example 4.1
Let \(H=L^2([0,1])\). Let \(A:=\partial \Vert .\Vert\) and \(B=N_C\) in (1), where \(N_C\) is the normal cone of a nonempty closed and convex subset C of H (\(N_C(x):=\{x^* \in H:\langle y-x,x^*\rangle \le 0, \forall y \in C \}\)). Then problem (1) reduces to the following minimization problem: find \(x^{*}\in L^2([0,1])\) such that
Note that \(S\ne \emptyset\) since \(0\in S.\) Furthermore, the resolvent \(J^{\lambda }_B=(I+\lambda N_C)^{-1}=P_C\), and \(J^{\lambda }_A\) is given by the Moreau decomposition
where \(\text {Prox}_{\lambda \Vert .\Vert }(x) := \text {argmin}_y \big \{ \lambda \Vert y\Vert + \frac{1}{2} \Vert y - x \Vert ^2 \big \}\), \(P_{B_{\Vert .\Vert _*}}\) is the projection operator and \(B_{\Vert .\Vert _*}\) is the unit ball of the dual norm. Note that in this case, \(L^2([0,1])\) is self-dual. Moreover, the projection \(P_{B_{\Vert .\Vert _*}}\) (see Bauschke and Combettes 2011; Cegielski 2012) is given by:
Therefore,
If we take C to be the ball \(C:=\{x \in H:\Vert x-z\Vert \le r\}\), then
In particular, \(C=\{x\in L^2([0,1]):\int _0^1|x(t)-\sin (\frac{t}{2\pi })|^2 dt \le 16 \}\).
Set \(\lambda =0.02\), \(\beta _k=0.6\) and \(\alpha _k=100/k\). Take \(\Vert z_k-z_{k-1}\Vert \le 10^{-3}\) as the stopping criterion (Fig. 1).
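The ingredients of Example 4.1 can be sketched on a discretization of \(L^2([0,1])\). The code below (our illustrative discretization: a uniform grid of n = 200 midpoints with quadrature weight h = 1/n; the test function \(x(t)=10\cos (3t)\) is also our choice) checks that \(\text {Prox}_{\lambda \Vert .\Vert }\) computed directly agrees with the Moreau decomposition \(\text {Prox}_{\lambda \Vert .\Vert }(x)=x-\lambda P_{B_{\Vert .\Vert _*}}(x/\lambda )\), and evaluates the projection onto \(C=\{x:\Vert x-z\Vert \le 4\}\) with \(z(t)=\sin (\frac{t}{2\pi })\) and \(\lambda =0.02\) as in the example.

```python
import math

# Discretized sketch of Example 4.1 on a uniform midpoint grid of [0,1].
n, h, lam = 200, 1.0 / 200, 0.02
t = [(i + 0.5) * h for i in range(n)]
z = [math.sin(ti / (2.0 * math.pi)) for ti in t]      # center of the ball C

def norm(x):                       # discretized L2([0,1]) norm
    return math.sqrt(sum(xi * xi for xi in x) * h)

def prox_norm(x, lam):             # Prox of lam*||.||: shrink toward the origin
    nx = norm(x)
    c = max(1.0 - lam / nx, 0.0) if nx > 0 else 0.0
    return [c * xi for xi in x]

def proj_unit_ball(x):             # projection onto the (dual) norm unit ball
    nx = norm(x)
    return x if nx <= 1.0 else [xi / nx for xi in x]

def proj_C(x):                     # projection onto C = {x : ||x - z|| <= 4}
    d = norm([xi - zi for xi, zi in zip(x, z)])
    if d <= 4.0:
        return list(x)
    return [zi + 4.0 * (xi - zi) / d for xi, zi in zip(x, z)]

x = [10.0 * math.cos(3.0 * ti) for ti in t]           # a test function outside C
p1 = prox_norm(x, lam)                                # direct formula
p2 = [xi - lam * bi                                   # Moreau decomposition
      for xi, bi in zip(x, proj_unit_ball([xi / lam for xi in x]))]
print(max(abs(a - b) for a, b in zip(p1, p2)))        # agreement of both routes
print(norm([a - b for a, b in zip(proj_C(x), z)]))    # distance to z after projection
```

The first printed value is at the level of floating-point error, and the projected point sits on the boundary of C (distance 4 from z).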
Example 4.2
Suppose \(A:\mathbb {R}^3\rightarrow \mathbb {R}^3\) and \(B:\mathbb {R}^3 \rightarrow \mathbb {R}^3\) are given by
It can be shown that \(S=\{(0,0,0)\}\).
Let \(z_0\) be randomly selected. In Algorithm 3 of Thong and Vinh (2019), we chose \(\lambda =0.1\), \(\beta _k=1/(k+1)\) and \(\tau _k=1/(k+1)^2\). In Algorithm (26) of Shehu (2016), we chose \(\alpha _k=1/k\), \(\beta _k=k/(2k+1)\) and \(r_k=0.1\). Take \(\Vert z_k\Vert \le 0.005\) as the stopping criterion.
For Examples 4.2 and 4.3, we take \(\lambda =0.2\), \(\beta _k=0.5\), \(\alpha _k=\frac{1}{25k}\) in Algorithm (9) and \(\beta =0.2\) in Algorithm 3 of Thong and Vinh (2019).
We compared Algorithm (9), Algorithm 3 in Thong and Vinh (2019) and algorithm (26) in Shehu (2016). From Fig. 2, we see that Algorithm (9) performs better than the other two algorithms.
Example 4.3
Let us consider the well-known \(\ell _1\)-regularized least squares problem, which consists of finding a sparse solution to an underdetermined linear system. Specifically, we solve the following problem:
where \(D \in \mathbb {R}^{m \times n}\) and \(b\in \mathbb {R}^m\). In this case,
while
We remark that there exist software packages for solving problem (47), for example SPGL1, based on a projected gradient method (van den Berg and Friedlander 2007; Lorenz 2013), and FISTA (Beck and Teboulle 2009), but a comparison with these is beyond the scope of this paper. Our interest here is to demonstrate the efficiency of our proposed method (9) using problem (47).
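The two resolvents needed for problem (47) can be sketched directly. The plain (non-inertial) Douglas–Rachford iteration below is a hedged illustration on a small random instance; the sizes m = 5, n = 15, the regularization weight rho = 0.1 and the proximal parameter lam = 1.0 are our own choices, not the paper's experimental setup. The resolvent of \(\rho \Vert \cdot \Vert _1\) is componentwise soft-thresholding, and the resolvent of \(\frac{1}{2}\Vert Dx-b\Vert ^2\) solves the regularized normal equations \((I+\lambda D^TD)x=v+\lambda D^Tb\).

```python
import random

# Hedged sketch: plain Douglas-Rachford for min (1/2)||Dx - b||^2 + rho*||x||_1
# on a small random instance (m = 5, n = 15, rho = 0.1, lam = 1.0: our choices).
random.seed(42)
m, n, rho, lam = 5, 15, 0.1, 1.0
D = [[random.gauss(0, 1) for _ in range(n)] for _ in range(m)]
b = [random.gauss(0, 1) for _ in range(m)]

def soft(v, t):                      # resolvent of rho*||.||_1: soft-thresholding
    return [max(abs(vi) - t, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]

def prox_ls(v):                      # solve (I + lam*D^T D) x = v + lam*D^T b
    M = [[lam * sum(D[k][i] * D[k][j] for k in range(m)) + (1.0 if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    rhs = [v[i] + lam * sum(D[k][i] * b[k] for k in range(m)) for i in range(n)]
    for i in range(n):               # Gaussian elimination; M is SPD, no pivoting needed
        for j in range(i + 1, n):
            f = M[j][i] / M[i][i]
            M[j] = [a - f * c for a, c in zip(M[j], M[i])]
            rhs[j] -= f * rhs[i]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):   # back-substitution
        x[i] = (rhs[i] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def F(x):                            # objective of (47)
    r = [sum(D[k][j] * x[j] for j in range(n)) - b[k] for k in range(m)]
    return 0.5 * sum(rk * rk for rk in r) + rho * sum(abs(xj) for xj in x)

z = [0.0] * n
for _ in range(100):                 # z <- z + prox_f(2 prox_g(z) - z) - prox_g(z)
    x = prox_ls(z)
    w = soft([2 * xi - zi for xi, zi in zip(x, z)], rho * lam)
    z = [zi + wi - xi for zi, wi, xi in zip(z, w, x)]

print(F(prox_ls(z)), F([0.0] * n))   # objective at the DR point vs. at zero
```

After a moderate number of iterations, the objective at the shadow point \(x=\text {prox}_g(z)\) lies well below its value at the zero vector, reflecting convergence toward a minimizer of (47).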
We generate random problems using different choices of \(\lambda\) for \(m=100\) and \(n=1000\). In Algorithm 3 of Thong and Vinh (2019) and the algorithm (26) of Shehu (2016), we chose \(\rho =1\) and \(\lambda =1.9/(\max (eig(D^TD)))\), and in the algorithm (9), we chose \(\rho =0.5\). In addition, select \(r_k=0.2\) in algorithm (26) of Shehu (2016).
Table 3 shows that Algorithm (9) performs best when \(\theta _k=0.33\). The numerical results are shown in Fig. 3, which illustrates that Algorithm (9) outperforms the other two algorithms.
Remark 4.4
- (a) It can be seen from the numerical examples that Algorithm (9) outperforms the methods in Shehu (2016) and Thong and Vinh (2019) (see Figs. 2, 3) for strong convergence for sums of maximal monotone operators. Furthermore, the addition of the inertial term accelerates the proposed method: Algorithm (9) converges faster than the non-inertial case \(\theta _k=0\) (see Tables 1, 2, 3). Our examples also suggest that the optimal choice of \(\theta _k\) is close to the upper bound \(\frac{1}{3}\).
- (b) Algorithm (9) is sensitive to the choice of the initial point \(z_0\), as can be seen from our examples in Tables 1, 2 and 3.
Remark 4.5
- (a) We point out that there are different strategies in the current literature to enforce strong convergence of proximal-like algorithms (in particular, DR splitting); see, e.g., Solodov and Svaiter (2000) and Hirstoaga (2006). In this regard, the results of Hirstoaga (2006) are concerned with "anchor-point" algorithms such as the one employed in our proposed method (9). As one can see, Algorithm 2.1 of Hirstoaga (2006) contains no inertial extrapolation term \(\theta _k(z_k-z_{k-1})\), which has been shown in the literature to speed up the non-inertial counterparts of most optimization methods. Thus, when \(\theta _k\ne 0\) (in this paper, we assume that \(0 \le \theta _k \le \theta <\frac{1}{3}\)), our method (9) does not reduce to Algorithm 2.1 of Hirstoaga (2006) applied to the splitting operator of Eckstein and Bertsekas (1992). As confirmed by our numerical examples in Sect. 4, our method (9) outperforms Algorithm 2.1 of Hirstoaga (2006) applied to that splitting operator. Moreover, our method of proof differs from that of Hirstoaga (2006).
- (b) The purpose of our numerical examples in Sect. 4 is to demonstrate the implementation and effectiveness of our proposed method (9). As discussed in MacNamara and Strang (2016) and other related chapters of that book, applications of our method (9) to problems arising from wireless communications, imaging, networking, finance, hemodynamics, free-surface flows, and other science and engineering problems in infinite-dimensional Hilbert spaces will be treated separately as a future project. \(\Diamond\)
5 Final remarks
In this paper we propose a Douglas–Rachford splitting method with inertial extrapolation step and give a strong convergence analysis of the method. The method applies to a general class of maximal monotone operators: no uniform monotonicity is assumed on any of the involved operators. Furthermore, the analysis is carried out under the natural condition that the inertial factor \(\theta _k\) is monotone non-decreasing and bounded away from 1/3. Some numerical illustrations are given to test the efficiency and implementation of the proposed scheme. The results obtained in this paper can serve as the strong convergence counterpart of existing weak convergence results for inertial Douglas–Rachford splitting methods (Bauschke and Combettes 2011; Beck and Teboulle 2009; Boţ et al. 2015; Lorenz and Pock 2015; Thong and Vinh 2019).
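The inertial extrapolation idea can be illustrated with a minimal numerical sketch. The snippet below is not a transcription of method (9); it is a generic inertial Douglas–Rachford iteration in the spirit of Boţ et al. (2015), applied to an assumed toy problem \(0 \in \partial |x| + (x-3)\) with \(A=\partial f\), \(f(x)=|x|\), and \(B=\nabla g\), \(g(x)=\tfrac{1}{2}(x-3)^2\), whose unique zero is \(x^*=2\). All parameter choices (\(\lambda = 1\), \(\theta _k \equiv 0.1 < 1/3\)) are illustrative assumptions.

```python
# Hypothetical sketch of an inertial Douglas-Rachford iteration for 0 in A(x)+B(x),
# with A, B given through their resolvents (proximal maps). Illustrative only.

def prox_abs(v, lam):
    """Resolvent of A = d|.|: proximal map of lam*|.| (soft-thresholding)."""
    return max(abs(v) - lam, 0.0) * (1.0 if v >= 0 else -1.0)

def prox_quad(v, lam, b):
    """Resolvent of B = grad of 0.5*(.-b)^2: proximal map of lam*0.5*(.-b)^2."""
    return (v + lam * b) / (1.0 + lam)

def inertial_dr(b=3.0, lam=1.0, theta=0.1, iters=100):
    z_prev, z = 0.0, 0.0
    x = 0.0
    for _ in range(iters):
        y = z + theta * (z - z_prev)        # inertial extrapolation step
        x = prox_abs(y, lam)                # resolvent of A at y
        v = prox_quad(2.0 * x - y, lam, b)  # resolvent of B at the reflected point
        z_prev, z = z, y + v - x            # Douglas-Rachford update
    return x

# The zero of d|x| + (x - 3) is x* = 2 (soft-threshold of 3 at level 1).
print(inertial_dr())  # approx. 2.0
```

With `theta = 0` the same loop reduces to the non-inertial Douglas–Rachford iteration, so the sketch can also reproduce the kind of inertial versus non-inertial comparison reported in Tables 1, 2, 3.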
Our future projects include the following:
References
Alvarez F, Attouch H (2001) An inertial proximal method for maximal monotone operators via discretization of a nonlinear oscillator with damping. Set-Valued Anal 9:3–11
Attouch H, Cabot A (2019) Convergence of a relaxed inertial forward–backward algorithm for structured monotone inclusions. Appl Math Optim. https://doi.org/10.1007/s00245-019-09584-z
Bauschke HH, Combettes PL (2001) A weak-to-strong convergence principle for Fejér-monotone methods in Hilbert spaces. Math Oper Res 26:248–264
Bauschke HH, Combettes PL (2011) Convex analysis and monotone operator theory in Hilbert spaces. CMS books in mathematics. Springer, New York
Beck A, Teboulle M (2009) A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J Imaging Sci 2(1):183–202
Boikanyo OA (2016) The viscosity approximation forward–backward splitting method for zeros of the sum of monotone operators. Abstr Appl Anal. Article ID 2371857
Boţ RI, Csetnek ER (2016) An inertial forward-backward-forward primal-dual splitting algorithm for solving monotone inclusion problems. Numer Algorithms 71:519–540
Boţ RI, Csetnek ER, Hendrich C (2015) Inertial Douglas–Rachford splitting for monotone inclusion problems. Appl Math Comput 256:472–487
Cegielski A (2012) Iterative methods for fixed point problems in Hilbert spaces. Lecture notes in mathematics 2057. Springer, Berlin
Chang S-S, Wen C-F, Yao J-C (2019) A generalized forward-backward splitting method for solving a system of quasi variational inclusions in Banach spaces. RACSAM 113:729–747
Cholamjiak P (2016) A generalized forward-backward splitting method for solving quasi inclusion problems in Banach spaces. Numer Algorithms 71:915–932
Cholamjiak W, Cholamjiak P, Suantai S (2018) An inertial forward-backward splitting method for solving inclusion problems in Hilbert spaces. J Fixed Point Theory Appl 20:42
Combettes PL (2004) Solving monotone inclusions via compositions of nonexpansive averaged operators. Optimization 53:475–504
Dong Q, Jiang D, Cholamjiak P, Shehu Y (2017) A strong convergence result involving an inertial forward–backward algorithm for monotone inclusions. J Fixed Point Theory Appl 19:3097–3118
Dong QL, Cho YJ, Zhong LL, Rassias TM (2018) Inertial projection and contraction algorithms for variational inequalities. J Glob Optim 70(3):687–704
Douglas J, Rachford HH (1956) On the numerical solution of heat conduction problems in two or three space variables. Trans Am Math Soc 82:421–439
Eckstein J (1989) Splitting methods for monotone operators with applications to parallel optimization. Doctoral dissertation, Department of Civil Engineering, Massachusetts Institute of Technology. Available as Report LIDS-TH-1877, Laboratory for Information and Decision Sciences, MIT, Cambridge
Eckstein J, Bertsekas DP (1992) On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math Program 55(3):293–318
Fukushima M (1996) The primal Douglas–Rachford splitting algorithm for a class of monotone mappings with application to the traffic equilibrium problem. Math Program 72:1–15
Gabay D, Mercier B (1976) A dual algorithm for the solution of nonlinear variational problems via finite element approximations. Comput Math Appl 2:17–40
Gibali A, Thong DV (2018) Tseng type methods for solving inclusion problems and its applications. Calcolo 55:49
Glowinski R, Le Tallec P (1989) Augmented Lagrangian and operator-splitting methods in nonlinear mechanics. SIAM Studies in Applied Mathematics, vol 9, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA. ISBN: 0-89871-230-0
Glowinski R, Marrocco A (1975) Approximation par éléments finis d'ordre un et résolution par pénalisation-dualité d'une classe de problèmes non linéaires. RAIRO R2:41–76
Güler O (1991) On the convergence of the proximal point algorithm for convex minimization. SIAM J Control Optim 29:403–419
He B, Yuan X (2015) On the convergence rate of Douglas–Rachford operator splitting method. Math Program 153:715–722
Hirstoaga SA (2006) Iterative selection methods for common fixed point problems. J Math Anal Appl 324(2):1020–1035
Lions PL, Mercier B (1979) Splitting algorithms for the sum of two nonlinear operators. SIAM J Numer Anal 16:964–979
López G, Martín-Márquez V, Wang F, Xu H-K (2012) Forward–backward splitting methods for accretive operators in Banach spaces. Abstr Appl Anal. Article ID 109236
Lorenz DA (2013) Constructing test instances for Basis Pursuit Denoising. IEEE Trans Signal Process 61:1210–1214
Lorenz DA, Pock T (2015) An inertial forward–backward algorithm for monotone inclusions. J Math Imaging Vis 51:311–325
MacNamara S, Strang G (2016) Operator splitting. In: Glowinski R, Osher S, Yin W (eds) Splitting methods in communication, imaging, science, and engineering. Springer, Berlin, pp 95–114
Maingé PE (2008) Convergence theorem for inertial KM-type algorithms. J Comput Appl Math 219(1):223–236
Martinet B (1970) Régularisation d'inéquations variationnelles par approximations successives. Rev Française Informat Recherche Opérationnelle 4:154–158
Moreau JJ (1965) Proximité et dualité dans un espace Hilbertien. Bull Soc Math Fr 93:273–299
Riahi H, Chbani Z, Loumi M-T (2018) Weak and strong convergences of the generalized penalty Forward–Forward and Forward–Backward splitting algorithms for solving bilevel hierarchical pseudomonotone equilibrium problems. Optimization 67:1745–1767
Rockafellar RT (1976) Monotone operators and the proximal point algorithm. SIAM J Control Optim 14:877–898
Shehu Y (2016) Iterative approximations for zeros of sum of accretive operators in Banach spaces. J Funct Spaces. Article ID 5973468
Shehu Y (2018) Convergence rate analysis of inertial Krasnoselskii–Mann-type iteration with applications. Numer Funct Anal Optim 39:1077–1091
Shehu Y (2019) Convergence results of forward–backward algorithms for sum of monotone operators in Banach spaces. Results Math 74:138
Shehu Y, Cai G (2018) Strong convergence result of forward–backward splitting methods for accretive operators in Banach spaces with applications. RACSAM 112:71–87
Shehu Y, Li X-H, Dong Q-L (2020) An efficient projection-type method for monotone variational inequalities in Hilbert spaces. Numer Algorithms 84:365–388
Solodov MV, Svaiter BF (2000) Forcing strong convergence of proximal point iterations in a Hilbert space. Math Program Ser A 87:189–202
Svaiter BF (2011) On weak convergence of the Douglas–Rachford method. SIAM J Control Optim 49:280–287
Thong DV, Cholamjiak P (2019) Strong convergence of a forward–backward splitting method with a new step size for solving monotone inclusions. Comput Appl Math 38:94
Thong DV, Vinh NT (2019) Inertial methods for fixed point problems and zero point problems of the sum of two monotone mappings. Optimization 68:1037–1072
van den Berg E, Friedlander MP (2007) SPGL1: a solver for large-scale sparse reconstruction. http://www.cs.ubc.ca/labs/scl/spgl1. Version 1.9. Accessed 2015
Villa S, Salzo S, Baldassarre L, Verri A (2013) Accelerated and inexact forward–backward algorithms. SIAM J Optim 23:1607–1633
Wang Y, Wang F (2018) Strong convergence of the forward–backward splitting method with multiple parameters in Hilbert spaces. Optimization 67:493–505
Zhang H, Cheng L (2013) Projective splitting methods for sums of maximal monotone operators with applications. J Math Anal Appl 406:323–334
Acknowledgements
Open access funding provided by Institute of Science and Technology (IST Austria). The project of Yekini Shehu has received funding from the European Research Council (ERC) under the European Union’s Seventh Framework Program (FP7—2007–2013) (Grant Agreement No. 616160). The authors are grateful to the anonymous referees and the handling Editor for their comments and suggestions which have improved the earlier version of the manuscript greatly.
Shehu, Y., Dong, QL., Liu, LL. et al. New strong convergence method for the sum of two maximal monotone operators. Optim Eng 22, 2627–2653 (2021). https://doi.org/10.1007/s11081-020-09544-5