1 Introduction

Let H be a real Hilbert space with inner product \(\langle \cdot , \cdot \rangle\) and induced norm \(\Vert \cdot \Vert\). An operator \(A:H\rightarrow 2^H\) with domain D(A) is said to be monotone if

$$\begin{aligned} \langle u-v,x-y\rangle \ge 0~~\forall x,y \in D(A),~~u \in Ax,v\in Ay. \end{aligned}$$

A is maximal monotone if its graph

$$\begin{aligned} G(A):=\{(x,y):x \in D(A),y \in Ax\} \end{aligned}$$

is not properly contained in the graph of any other monotone operator.

Let us consider the inclusion problem of the form

$$\begin{aligned} 0\in A(u)+B(u), \end{aligned}$$
(1)

where A and B are set-valued maximal monotone operators in H. Throughout this paper, we assume that the solution set of (1), denoted by S, is nonempty.

The proximal point algorithm (PPA) is a well-known method for solving the inclusion problem (1) (see Lions and Mercier 1979; Martinet 1970; Moreau 1965; Rockafellar 1976). The PPA for solving (1) is expressed as

$$\begin{aligned} 0\in A(u_{k+1})+B(u_{k+1})+\frac{1}{\lambda }(u_{k+1}-u_k), \end{aligned}$$
(2)

where \(\lambda >0\) is the proximal parameter. Implementing PPA (2) to solve (1) requires computing the resolvent operator of the sum \(A+B\) exactly. This is often very difficult and can be as hard as the original inclusion problem (1). This difficulty has led many authors to consider operator splitting approaches to solve (1). The aim of an operator splitting method is to circumvent the computation of \(J^\lambda _{A+B}\) when implementing (2) and instead compute \(J^\lambda _A\) and \(J^\lambda _B\) separately (Eckstein and Bertsekas 1992; Glowinski and Le Tallec 1989; Lions and Mercier 1979).
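The practical appeal of splitting is that \(J^\lambda _A\) and \(J^\lambda _B\) individually often admit closed forms even when \(J^\lambda _{A+B}\) does not. The following Python sketch illustrates two such closed-form resolvents; the choice of operators, parameter values, and helper names are our own illustrations, not taken from the cited references:

```python
import numpy as np

def resolvent_abs(z, lam):
    # Resolvent of A = the subdifferential of |.|:
    # J_A^lam(z) = argmin_u |u| + (1/(2*lam)) (u - z)^2, i.e. soft-thresholding.
    return np.sign(z) * max(abs(z) - lam, 0.0)

def resolvent_linear(M, z, lam):
    # Resolvent of the monotone linear operator A(u) = M u (M positive
    # semidefinite): J_A^lam(z) = (I + lam*M)^{-1} z, a single linear solve.
    return np.linalg.solve(np.eye(len(z)) + lam * M, z)

# Firm nonexpansiveness of a resolvent: <x - y, Jx - Jy> >= ||Jx - Jy||^2.
x, y, lam = 3.0, -1.5, 0.7
Jx, Jy = resolvent_abs(x, lam), resolvent_abs(y, lam)
assert (x - y) * (Jx - Jy) >= (Jx - Jy) ** 2 - 1e-12
```

In contrast, no comparable closed form is available for \(J^\lambda _{A+B}\) of, say, the sum of these two operators, which is exactly what splitting avoids.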

When both A and B are single-valued linear operators in (1), Douglas and Rachford (1956) proposed the following method for solving heat conduction problems:

$$\begin{aligned} \left\{ \begin{array}{l} \frac{1}{\lambda }\Big (u_{k+\frac{1}{2}}-u_k\Big )+A\Big (u_{k+\frac{1}{2}}\Big )+B(u_k) =0,\\ \frac{1}{\lambda }\Big (u_{k+1}-u_{k+\frac{1}{2}}\Big )+B\Big (u_{k+1}\Big )-B(u_k) =0. \end{array} \right. \end{aligned}$$
(3)

We can eliminate \(u_{k+\frac{1}{2}}\) in (3) above and obtain

$$\begin{aligned} \Big (J^\lambda _B\Big )^{-1}u_{k+1}=\Big (J^\lambda _A(2J^\lambda _B-I)+(I-J^\lambda _B)\Big )\Big (J^\lambda _B\Big )^{-1}u_k. \end{aligned}$$
(4)

Define \(z_k:=\Big (J^\lambda _B\Big )^{-1}u_k\Leftrightarrow u_k=J^\lambda _B(z_k)\). Then, (4) reduces to the following splitting method (known as Douglas–Rachford splitting method)

$$\begin{aligned} z_{k+1}=J^\lambda _A(2J^\lambda _B-I)z_k+(I-J^\lambda _B)z_k. \end{aligned}$$
(5)

Lions and Mercier (1979) extended the Douglas–Rachford splitting method (5) to the generic case where both A and B are set-valued nonlinear operators, as in our problem (1). In this generic setting, the scheme proceeds as follows: starting from an arbitrary iterate \(u_1\) in the domain of B, choose \(b_1 \in B(u_1)\) and set \(z_1 = u_1 +\lambda b_1\); then \(u_1 = J^\lambda _B(z_1)\) (the pair \((u_1,z_1)\) exists and is unique by the Representation Lemma; see Eckstein and Bertsekas 1992, Cor. 2.3). A sequence \(\{z_k\}\) is then generated by the Douglas–Rachford scheme (5), and consequently a sequence \(\{u_k := J^\lambda _B(z_k)\}\) converging to a solution of (1) can be generated (see Eckstein 1989, Thm. 3.15). We refer to Combettes (2004) for the precise connection between (5) and the original Douglas–Rachford scheme in Douglas and Rachford (1956) for heat conduction problems. More details on the Douglas–Rachford splitting method (5) can also be found in Fukushima (1996), Gabay and Mercier (1976) and Glowinski and Marrocco (1975).
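As a toy illustration of scheme (5), take A and B to be the normal-cone operators of two lines in \(\mathbb {R}^2\); their resolvents are then the metric projections onto the lines, and \(u_k = J^\lambda _B(z_k)\) converges to a point of the intersection, i.e., a zero of \(A+B\). The setup and all names in this Python sketch are our own illustration:

```python
import numpy as np

def proj_line(x, a, b):
    # Projection onto the line {x : <a, x> = b}; this equals the resolvent
    # J^lam of the normal-cone operator of the line, for every lam > 0.
    return x - (a @ x - b) / (a @ a) * a

def douglas_rachford(JA, JB, z, iters=100):
    # Scheme (5): z_{k+1} = J_A(2 J_B - I) z_k + (I - J_B) z_k.
    for _ in range(iters):
        u = JB(z)
        z = JA(2.0 * u - z) + (z - u)
    return JB(z)  # u_k := J_B(z_k) converges to a zero of A + B

a1, b1 = np.array([1.0, 1.0]), 2.0   # line x1 + x2 = 2
a2, b2 = np.array([1.0, -1.0]), 0.0  # line x1 = x2
u = douglas_rachford(lambda x: proj_line(x, a1, b1),
                     lambda x: proj_line(x, a2, b2),
                     z=np.array([5.0, -3.0]))
# u is (numerically) the intersection point (1, 1)
```

Note that only the individual projections (resolvents) are evaluated; the resolvent of the sum never appears.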

1.1 Motivations and contributions

Boţ et al. (2015) gave the following method for solving (1): \(z_0=z_1\);

$$\begin{aligned} \left\{ \begin{array}{l} u_k=J^\lambda _B(z_k+\alpha _k(z_k-z_{k-1})),\\ w_k=J^\lambda _A(2u_k-z_k-\alpha _k(z_k-z_{k-1})),\\ z_{k+1}:=z_k +\alpha _k(z_k-z_{k-1})+ \beta _k(w_k - u_k ), \end{array} \right. \end{aligned}$$
(6)

where \(\{\alpha _k\}\) is a non-decreasing sequence with \(0\le \alpha _k \le \alpha <1, \forall k \ge 1\) and \(\lambda , \sigma , \delta >0\) such that

  1. (a)

    \(\delta >\frac{\alpha ^2(1+\alpha )+\alpha \sigma }{1-\alpha ^2}\); and

  2. (b)

    \(0 <\lambda \le \beta _k \le \theta :=2\frac{\delta -\alpha [\alpha (1+\alpha )+\alpha \delta +\sigma ]}{\delta [1+\alpha (1+\alpha )+\alpha \delta +\sigma ]}\).

Boţ et al. (2015) established weak convergence of algorithm (6) for finding zeros of the sum of two maximal monotone operators and illustrated their results through some numerical experiments. The same conditions (a) and (b) above have been used in recent works such as Dong et al. (2018), Shehu (2018) and other associated papers. When \(\alpha _k=0\), it was proved in Bauschke and Combettes (2011, Thm. 25.6(vii)) that \(\{z_k\}\) in (6) converges strongly to a solution of (1) if either A or B is uniformly monotone (A is uniformly monotone if \(\langle x-y,u-v\rangle \ge \phi (\Vert x-y\Vert ), \forall u \in Ax, v\in Ay\), where \(\phi :[0,\infty )\rightarrow [0,\infty )\) is increasing and vanishes only at zero) on every nonempty bounded subset of its domain.
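Conditions (a) and (b) are compatible in the sense that, for admissible parameters, the resulting interval for \(\beta _k\) is nonempty; this is easy to check numerically. In the Python sketch below, the values of \(\alpha\) and \(\sigma\) are illustrative choices of ours, not values from Boţ et al. (2015):

```python
# Feasibility check for conditions (a) and (b) of scheme (6).
alpha, sigma = 0.2, 1.0  # illustrative choices

# condition (a): delta must strictly exceed this bound
delta_min = (alpha**2 * (1 + alpha) + alpha * sigma) / (1 - alpha**2)
delta = delta_min + 0.5  # any value above the bound works

# condition (b): upper end theta of the admissible interval for beta_k
inner = alpha * (1 + alpha) + alpha * delta + sigma
theta = 2 * (delta - alpha * inner) / (delta * (1 + inner))

# theta > 0, so the interval for the relaxation parameters beta_k is nonempty
assert delta > delta_min and theta > 0
```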

When \(\beta _k=1\) and \(B\equiv 0\), (6) reduces to the inertial proximal point method proposed by Alvarez and Attouch (2001). In this case, Alvarez and Attouch (2001) assumed that the inertial factor \(\alpha _k\) satisfies \(0\le \alpha _k\le \alpha _{k+1}\le \alpha <\frac{1}{3}\) in their convergence result. The assumption on the inertial factor \(\alpha _k\) imposed in (6), however, is not as simple as the condition \(0\le \alpha _k\le \alpha _{k+1}\le \alpha <\frac{1}{3}\) assumed by Alvarez and Attouch (2001).

Problems in many disciplines, such as economics, image recovery, electromagnetics, quantum physics, and control theory, arise in infinite-dimensional spaces. For such problems, strong convergence of the sequence of iterates \(z_k\) of a proposed iterative procedure is often much more desirable than weak convergence, because strong convergence reflects the physically tangible property that the energy \(\Vert z_k-z\Vert\) of the error between the iterate \(z_k\) and a solution z eventually becomes arbitrarily small. The importance of strong convergence is also underlined in the work of Güler (1991), where a convex function f is minimized through the proximal point algorithm: Güler (1991) showed that the rate of convergence of the value sequence \(\{f(z_k)\}\) is better when \(\{z_k\}\) converges strongly than when it converges weakly. For more details on the importance of strong convergence, see Bauschke and Combettes (2001).

Strong convergence methods for solving problem (1) when B is a set-valued maximal monotone operator and A is a single-valued \(\kappa\)-inverse strongly monotone operator (i.e., \(\langle Ax-Ay,x-y\rangle \ge \kappa \Vert Ax-Ay\Vert ^2,~~\forall x, y\in H\)) have been studied extensively in the literature (see, for example, Boikanyo 2016; Chang et al. 2019; Cholamjiak 2016; Cholamjiak et al. 2018; Dong et al. 2017; Gibali and Thong 2018; López et al. 2012; Riahi et al. 2018; Shehu 2016, 2019; Shehu and Cai 2018; Thong and Cholamjiak 2019; Wang and Wang 2018). However, there are still few strong convergence results for the more general case of problem (1) in which both A and B are set-valued maximal monotone operators. This is the gap that this paper aims to fill.

Our aim in this paper is to establish strong convergence of an inertial Douglas–Rachford splitting method under conditions different from conditions (a) and (b) assumed in Boţ et al. (2015), and without assuming uniform monotonicity of either maximal monotone operator A or B. Furthermore, our assumptions on the inertial factor \(\theta _k\) are the same as those in the results of Alvarez and Attouch (2001) (which is a special case of our result). In summary,

  • We prove strong convergence analysis of inertial Douglas–Rachford splitting method without using the conditions (a) and (b) assumed in Boţ et al. (2015). Our inertial conditions are the same as the ones assumed in Alvarez and Attouch (2001) for finding zero of a set-valued maximal monotone operator using inertial proximal method.

  • We obtain strong convergence results without assuming that any of the involved maximal monotone operators is uniformly monotone on every nonempty bounded subset. Our strong convergence results are much more general than the current ones in Bauschke and Combettes (2011) and other associated works where strong convergence is obtained.

  • Some numerical examples are given to confirm the importance of the presence of inertial term in our method.

The paper is organized as follows: we first recall some basic explanations of the Douglas–Rachford splitting method and introduce our inertial Douglas–Rachford splitting method alongside some results in Sect. 2. The analysis of strong convergence of our proposed method is then investigated in Sect. 3. We give numerical implementations in Sect. 4 and conclude with some final remarks in Sect. 5.

2 Preliminaries

Let us first recall some basics that are required to derive and analyze the Douglas–Rachford splitting method; for the corresponding details, we refer to Eckstein and Bertsekas (1992), He and Yuan (2015), Svaiter (2011) and Zhang and Cheng (2013).

Let \(\lambda > 0\) be a fixed parameter, and let us denote by

$$\begin{aligned} J^\lambda _A := ( I + \lambda A )^{-1} \quad \text {and} \quad J^\lambda _B := ( I + \lambda B )^{-1} \end{aligned}$$

the resolvents of A and B, respectively, which are known to be firmly nonexpansive (operator T is firmly-nonexpansive if \(\langle x-y,Tx-Ty\rangle \ge \Vert Tx-Ty\Vert ^2, ~~\forall x,y\in H\)). Furthermore, let us write

$$\begin{aligned} R^\lambda _A := 2 J^\lambda _A - I \quad \text {and} \quad R^\lambda _B := 2 J^\lambda _B - I \end{aligned}$$

for the corresponding reflections (also called Cayley operators), and note that the reflections are nonexpansive operators (T is nonexpansive if \(\Vert Tx-Ty\Vert \le \Vert x-y\Vert ,~~\forall x,y\in H\)).

In Eckstein and Bertsekas (1992) and He and Yuan (2015), the maximal monotone operator \(S_{\lambda ,A,B}\) is defined as

$$\begin{aligned} S_{\lambda ,A,B}:=\{(v+\lambda b,u-v):(u,b) \in B, (v,a)\in A, v+\lambda a=u-\lambda b\}. \end{aligned}$$

It was shown in Eckstein and Bertsekas (1992) that the Douglas–Rachford splitting method (5) can be converted to

$$\begin{aligned} z_{k+1}=\Big (J^\lambda _A(2J^\lambda _B-I)+(I-J^\lambda _B)\Big )z_k=(I+S_{\lambda ,A,B})^{-1}z_k=J_{S_{\lambda ,A,B}}(z_k). \end{aligned}$$

By Eckstein and Bertsekas (1992, Thm. 5), for any given zero \(z^*\) of \(S_{\lambda ,A,B}\), \(J^\lambda _B(z^*)\) is a zero of \(A+B\). Therefore, \(J^\lambda _B(z^*)\) is a solution of (1) whenever \(z^*\) satisfies

$$\begin{aligned} z^* =R^\lambda _A \circ R^\lambda _B(z^*). \end{aligned}$$
(7)

Consequently, the Douglas–Rachford splitting method (5) can be rewritten as

$$\begin{aligned} z_{k+1}&=J^\lambda _A(2J^\lambda _B-I)z_k+(I-J^\lambda _B)z_k \nonumber \\&=z_k+\frac{1}{2}\big (2J^\lambda _A(2J^\lambda _B(z_k)-z_k)-(2J^\lambda _B(z_k)-z_k)-z_k\big ) \nonumber \\&=z_k+\frac{1}{2}\big (R^\lambda _A \circ R^\lambda _B(z_k)-z_k\big ) \nonumber \\&=z_k-e(z_k,\lambda ), \end{aligned}$$
(8)

where \(e(z_k,\lambda ):=\frac{1}{2}(z_k-R^\lambda _A \circ R^\lambda _B(z_k))\).

In this paper, our convergence analysis will be conducted for an inertial generalized version of Douglas–Rachford splitting method (8): \(z_0, z_1 \in H\),

$$\begin{aligned} \left\{ \begin{array}{l} y_k=\alpha _kz_0+(1-\alpha _k)z_k+\theta _k(z_k-z_{k-1}),\\ z_{k+1}=y_k-\beta _k e(y_k,\lambda ), \end{array} \right. \end{aligned}$$
(9)

with \(\alpha _k \in [0,1), \beta _k \in (0,1]\) and \(\theta _k \in [0,1)\). We get the original Douglas–Rachford method (8) when \(\beta _k=1, \theta _k=0=\alpha _k\) in (9).
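A minimal Python sketch of method (9) on a toy problem may help fix ideas. Here A and B are taken, as our own illustration, to be the normal-cone operators of two lines in \(\mathbb {R}^2\) meeting at (1, 1), so that the resolvents are projections; the parameter choices \(\alpha _k = 1/(k+1)\), \(\beta _k = 1\) and \(\theta _k = 0.3\) satisfy the standing assumptions of Sect. 3:

```python
import numpy as np

def proj_line(x, a, b):  # projection onto {x : <a, x> = b}
    return x - (a @ x - b) / (a @ a) * a

# A, B: normal-cone operators of two lines meeting at (1, 1) (our toy choice)
a1, b1 = np.array([1.0, 1.0]), 2.0
a2, b2 = np.array([1.0, -1.0]), 0.0
JA = lambda x: proj_line(x, a1, b1)
JB = lambda x: proj_line(x, a2, b2)
RA = lambda x: 2.0 * JA(x) - x       # reflections (Cayley operators)
RB = lambda x: 2.0 * JB(x) - x
e = lambda z: 0.5 * (z - RA(RB(z)))  # e(z, lam); lam is implicit here

z0 = np.array([5.0, -3.0])
z_prev, z = z0.copy(), z0.copy()
for k in range(1, 300):
    alpha_k, beta_k, theta_k = 1.0 / (k + 1), 1.0, 0.3  # meet Sect. 3 rules
    y = alpha_k * z0 + (1.0 - alpha_k) * z + theta_k * (z - z_prev)
    z_prev, z = z, y - beta_k * e(y)
# z converges strongly to the point z* = (1, 1) satisfying (7),
# and J_B(z*) then solves 0 in A(u) + B(u)
```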

We next recall some properties of the metric projection. Let C be a nonempty, closed, and convex subset of H. For any point \(u \in H\), there exists a unique point \(P_C u \in C\) such that

$$\begin{aligned} \Vert u-P_C u\Vert \le \Vert u-y\Vert ,~~\forall y \in C. \end{aligned}$$

\(P_C\) is called the metric projection of H onto C. We know that \(P_C\) is a nonexpansive mapping of H onto C. It is also known that \(P_C\) satisfies

$$\begin{aligned} \langle x-y, P_C x-P_C y \rangle \ge \Vert P_C x-P_C y\Vert ^2~~\forall x, y \in H. \end{aligned}$$
(10)

In particular, we get from (10) that

$$\begin{aligned} \langle x-y, x-P_C y \rangle \ge \Vert x-P_C y\Vert ^2,~~\forall x \in C, y \in H. \end{aligned}$$
(11)

Furthermore, \(P_C x\) is characterized by the properties

$$\begin{aligned} P_Cx\in C\quad \text {and} \quad \langle x-P_C x,P_C x-y\rangle \ge 0,~\forall y\in C. \end{aligned}$$
(12)

This characterization implies that

$$\begin{aligned} \Vert x-y\Vert ^2\ge \Vert x-P_Cx\Vert ^2+\Vert y-P_Cx\Vert ^2~~\forall x \in H, \forall y \in C. \end{aligned}$$
(13)

The following result was obtained in Shehu et al. (2020), but we give the proof for the sake of completeness.

Lemma 2.1

Let \(S \subseteq H\) be a nonempty, closed, and convex subset of a real Hilbert space H. Let \(u \in H\) be arbitrarily given, \(z := P_S u\), and \(\Omega := \{ x \in H : \langle x - u, x - z \rangle \le 0 \}\). Then \(\Omega \cap S = \{ z \}\).

Proof

By definition, it follows immediately that \(z \in \Omega \cap S\). Conversely, take an arbitrary \(y \in \Omega \cap S\). Then, in particular, we have \(y \in \Omega\), and it therefore follows that

$$\begin{aligned} \Vert y-z \Vert ^2&=\langle y-z, y-z \rangle \nonumber \\&=\langle y-z, y-u \rangle + \langle y-z, u-z \rangle \\&\le \langle y-z, u-z \rangle . \nonumber \end{aligned}$$
(14)

Using \(z = P_S u\) together with the characterization (12), we also have

$$\begin{aligned} \langle u-z, z-x \rangle \ge 0 \quad \forall x \in S. \end{aligned}$$

In particular, since \(y \in S\), we therefore have \(\langle u-z, z-y \rangle \ge 0\). Hence (14) implies \(\Vert y-z \Vert ^2 \le 0\), so that \(y = z\). This completes the proof. \(\square\)

Finally, we state some basic properties that will be used in our convergence theorems.

Lemma 2.2

The following statements hold in H:

  1. (a)

    \(\Vert x+y\Vert ^2=\Vert x\Vert ^2+2\langle x,y\rangle +\Vert y\Vert ^2\) for all \(x, y \in H\).

  2. (b)

    \(2 \langle x-y, x-z \rangle = \Vert x-y \Vert ^2 + \Vert x-z \Vert ^2 - \Vert y-z \Vert ^2\) for all \(x,y,z \in H\).

  3. (c)

    \(\Vert tx+sy\Vert ^2=t(t+s)\Vert x\Vert ^2+s(t+s)\Vert y\Vert ^2-st\Vert x-y\Vert ^2, \quad \forall x, y \in H, \forall s, t \in \mathbb {R}.\)
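These identities hold for all vectors and scalars, as a quick numerical sanity check (our own illustration) confirms:

```python
import numpy as np

rng = np.random.default_rng(1)
x, y, z = rng.normal(size=(3, 4))
t, s = 0.7, -1.3
n2 = lambda v: float(v @ v)  # squared norm

# (a) expansion of the squared norm of a sum
assert np.isclose(n2(x + y), n2(x) + 2.0 * (x @ y) + n2(y))
# (b) three-point identity
assert np.isclose(2.0 * ((x - y) @ (x - z)), n2(x - y) + n2(x - z) - n2(y - z))
# (c) identity for linear combinations (valid for all real s, t)
assert np.isclose(n2(t * x + s * y),
                  t * (t + s) * n2(x) + s * (t + s) * n2(y) - s * t * n2(x - y))
```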

Lemma 2.3

(Maingé 2008) Assume that \(\varphi _{k}\in [0,\infty )\) and \(\delta _{k}\in [0,\infty )\) satisfy:

  1. (1)

\(\varphi _{k+1}-\varphi _{k}\le \theta _{k}(\varphi _{k}-\varphi _{k-1})+\delta _{k},\)

  2. (2)

    \(\sum _{k=1}^{\infty }\delta _{k}<\infty ,\)

  3. (3)

    \(\{\theta _{k}\}\subset [0,\theta ],\) where \(\theta \in (0,1).\)

Then the sequence \(\{\varphi _{k}\}\) is convergent with \(\sum _{k=1}^{\infty }[\varphi _{k+1}-\varphi _{k}]_{+}<\infty ,\) where \([t]_{+}:=\max \{t,0\}\) (for any \(t\in \mathbb {R})\).

3 Analysis of the convergence

For the rest of this paper, we assume that \(S\ne \emptyset\), \(\alpha _k \in (0,1)\) with \(\lim _{k\rightarrow \infty } \alpha _k=0\) and \(\sum _{k=1}^\infty \alpha _k=\infty\), \(0<\beta \le \beta _k\le 1\) and \(0\le \theta _k\le \theta _{k+1}\le \theta <\frac{1}{3}\).

Lemma 3.1

Let \(\{z_k\}\) be the sequence generated by (9). For any z satisfying (7), we have

$$\begin{aligned} \Vert z_{k+1}-z\Vert ^2 \le \Vert y_k-z\Vert ^2-\Vert z_{k+1}-y_k\Vert ^2. \end{aligned}$$
(15)

Proof

By (9), we get

$$\begin{aligned} \Vert z_{k+1}-z\Vert ^2&=\Vert y_k-z-\beta _ke(y_k,\lambda )\Vert ^2 \nonumber \\&=\Vert y_k-z\Vert ^2-2\beta _k \langle y_k-z, e(y_k,\lambda )\rangle +\beta _k^2\Vert e(y_k,\lambda )\Vert ^2. \end{aligned}$$
(16)

We know that \(e(y_k,\lambda )=\frac{1}{2}(y_k-R^\lambda _A \circ R^\lambda _B(y_k))\), where \(\lambda >0\) is the proximal parameter, is firmly nonexpansive (see He and Yuan 2015, Lem. 2.2). Thus,

$$\begin{aligned} \langle x-y,e(x,\lambda )-e(y,\lambda )\rangle \ge \Vert e(x,\lambda )-e(y,\lambda )\Vert ^2,~~\forall x,y \in H, \lambda >0. \end{aligned}$$

In particular, for \(z=R^\lambda _A \circ R^\lambda _B(z)\), we obtain

$$\begin{aligned} \langle y_k-z, e(y_k,\lambda )\rangle \ge \Vert e(y_k,\lambda )\Vert ^2. \end{aligned}$$
(17)

Putting (17) into (16), we have

$$\begin{aligned} \Vert z_{k+1}-z\Vert ^2&\le \Vert y_k-z\Vert ^2-2\beta _k \Vert e(y_k,\lambda )\Vert ^2 +\beta _k^2\Vert e(y_k,\lambda )\Vert ^2 \nonumber \\&=\Vert y_k-z\Vert ^2-\beta _k(2-\beta _k) \Vert e(y_k,\lambda )\Vert ^2. \end{aligned}$$
(18)

Recall from (9) that \(\beta _k e(y_k,\lambda )=y_k-z_{k+1}\), which implies that

$$\begin{aligned} e(y_k,\lambda )=\frac{1}{\beta _k}(y_k-z_{k+1}). \end{aligned}$$
(19)

Using (19) in (18) and the condition that \(0<\beta \le \beta _k\le 1\), we have

$$\begin{aligned} \Vert z_{k+1}-z\Vert ^2&\le \Vert y_k-z\Vert ^2-\beta _k(2-\beta _k) \frac{1}{\beta _k^2}\Vert z_{k+1}-y_k\Vert ^2\\&=\Vert y_k-z\Vert ^2-\frac{2-\beta _k}{\beta _k} \Vert z_{k+1}-y_k\Vert ^2\\&\le \Vert y_k-z\Vert ^2- \Vert z_{k+1}-y_k\Vert ^2. \end{aligned}$$

\(\square\)

Lemma 3.2

Let \(\{z_k\}\) be the sequence generated by (9). For any z satisfying (7), we have

$$\begin{aligned}&{-\,2\alpha _k\langle z_k-z,z_k-z_0\rangle } \nonumber \\&\quad \ge \Vert z_{k+1}-z\Vert ^{2}-\Vert z_k-z\Vert ^{2}+2\theta _{k+1}\Vert z_{k+1}-z_k\Vert ^{2}- 2\theta _k\Vert z_k-z_{k-1}\Vert ^{2}\nonumber \\&\qquad +\alpha _{k+1}\Vert z_0-z_{k+1}\Vert ^{2}-\alpha _k\Vert z_k-z_0\Vert ^{2}- \theta _k\Vert z_k-z\Vert ^{2}+\theta _{k-1}\Vert z_{k-1}-z\Vert ^{2}\nonumber \\&\qquad +(1-3\theta _{k+1}-\alpha _k)\Vert z_k-z_{k+1}\Vert ^{2}. \end{aligned}$$
(20)

Proof

From the definition of \(y_k\), we obtain, using Lemma 2.2 (a), that

$$\begin{aligned} \Vert y_k-z\Vert ^2&=\Vert (z_k-z)+\theta _k(z_k-z_{k-1})- \alpha _k(z_k-z_0)\Vert ^2\nonumber \\&=\Vert z_k-z\Vert ^{2}+\Vert \theta _k(z_k-z_{k-1})- \alpha _k(z_k-z_0)\Vert ^{2}\nonumber \\&\quad +2 \big \langle z_k-z ,\theta _k(z_k-z_{k-1})- \alpha _k(z_k-z_0) \big \rangle \nonumber \\&=\Vert z_k-z\Vert ^{2}+2\theta _k\langle z_k-z,z_k-z_{k-1}\rangle - 2\alpha _k\langle z_k-z,z_k-z_0\rangle \nonumber \\&\quad +\Vert \theta _k(z_k-z_{k-1})-\alpha _k(z_k-z_0)\Vert ^{2}, \end{aligned}$$
(21)

and, similarly, with z replaced by \(z_{k+1}\) in the previous formula,

$$\begin{aligned} \Vert y_k-z_{k+1}\Vert ^{2}&=\Vert z_k-z_{k+1}\Vert ^{2}+2\theta _k\langle z_k-z_{k+1},z_k-z_{k-1} \rangle \nonumber \\&\quad -2\alpha _k\langle z_k-z_{k+1}, z_k-z_0\rangle + \Vert \theta _k(z_k-z_{k-1})-\alpha _k(z_k-z_0)\Vert ^{2}. \end{aligned}$$
(22)

Substituting (21) and (22) into (15) and eliminating identical terms, we get

$$\begin{aligned} \Vert z_{k+1}-z\Vert ^{2}&\le \Vert z_k-z\Vert ^{2}+2\theta _k\langle z_k-z, z_k-z_{k-1} \rangle \nonumber \\&\quad -2\alpha _k\langle z_k-z,z_k-z_0\rangle -\Vert z_k-z_{k+1}\Vert ^{2}\nonumber \\&\quad -2\theta _k\langle z_k-z_{k+1},z_k-z_{k-1}\rangle +2\alpha _k\langle z_k-z_{k+1}, z_k-z_0\rangle \nonumber \\&=\Vert z_k-z\Vert ^{2}+2\theta _k\langle z_k-z, z_k-z_{k-1} \rangle \nonumber \\&\quad -2\alpha _k\langle z_k-z,z_k-z_0\rangle -\Vert z_k-z_{k+1}\Vert ^{2} +\theta _k\Vert z_k-z_{k+1}\Vert ^{2}+\theta _k\Vert z_k-z_{k-1}\Vert ^{2}\nonumber \\&\quad -\theta _k\Vert (z_k-z_{k+1})+(z_k-z_{k-1})\Vert ^{2} +2\alpha _k\langle z_k-z_{k+1},z_k-z_0\rangle . \end{aligned}$$
(23)

Therefore, we obtain

$$\begin{aligned}&{\Vert z_{k+1}-z\Vert ^{2}-\Vert z_k-z\Vert ^{2}-\theta _k\Vert z_k-z_{k-1}\Vert ^{2}+ (1-\theta _k)\Vert z_k-z_{k+1}\Vert ^{2}}\nonumber \\&\quad \le -2\alpha _k\langle z_k-z,z_k-z_0\rangle +2\theta _k \langle z_k-z,z_k-z_{k-1}\rangle +2\alpha _k \langle z_k-z_{k+1},z_k-z_0\rangle \nonumber \\&\quad =-2\alpha _k\langle z_k-z,z_k-z_0\rangle - \theta _k\Vert z_{k-1}-z\Vert ^{2}+\theta _k\Vert z_k-z\Vert ^{2} + \theta _k\Vert z_k-z_{k-1}\Vert ^{2}\nonumber \\&\qquad -\alpha _k\Vert z_0-z_{k+1}\Vert ^{2}+\alpha _k\Vert z_{k+1}-z_k\Vert ^{2}+ \alpha _k\Vert z_k-z_0\Vert ^{2}, \end{aligned}$$
(24)

where the last identity exploits Lemma 2.2 (a) twice. We therefore have

$$\begin{aligned}&{-2\alpha _k\langle z_k-z,z_k-z_0\rangle } \end{aligned}$$
(25)
$$\begin{aligned}&\quad \ge \Vert z_{k+1}-z\Vert ^{2} -\Vert z_k-z\Vert ^{2}+ 2\theta _{k+1}\Vert z_{k+1}-z_k\Vert ^{2} -2\theta _k\Vert z_k-z_{k-1}\Vert ^{2}\nonumber \\&\qquad +\theta _k \big ( \Vert z_{k-1}-z\Vert ^{2}-\Vert z_k-z\Vert ^{2} \big ) + \alpha _k \big ( \Vert z_0-z_{k+1}\Vert ^{2}-\Vert z_k-z_0\Vert ^{2} \big )\nonumber \\&\qquad +(1-\theta _k-2\theta _{k+1}-\alpha _k)\Vert z_{k+1}-z_k\Vert ^2. \end{aligned}$$
(26)

Using the fact that \(\{\theta _k\}\) is non-decreasing and \(\{ \alpha _k \}\) is non-increasing, we then obtain

$$\begin{aligned}&{-\,2\alpha _k\langle z_k-z,z_k-z_0\rangle } \\&\quad \ge \Vert z_{k+1}-z\Vert ^{2}-\Vert z_k-z\Vert ^{2}+2\theta _{k+1}\Vert z_{k+1}-z_k\Vert ^{2}- 2\theta _k\Vert z_k-z_{k-1}\Vert ^{2} \\&\qquad +\,\alpha _{k+1}\Vert z_0-z_{k+1}\Vert ^{2}-\alpha _k\Vert z_k-z_0\Vert ^{2}- \theta _k\Vert z_k-z\Vert ^{2}+\theta _{k-1}\Vert z_{k-1}-z\Vert ^{2} \\&\qquad +\,(1-3\theta _{k+1}-\alpha _k)\Vert z_k-z_{k+1}\Vert ^{2}, \end{aligned}$$

which is the desired inequality. \(\square\)

Our first central result below shows that the sequence \(\{ z_k \}\) generated by (9) is bounded.

Lemma 3.3

The sequence \(\{z_k\}\) generated by (9) is bounded.

Proof

A simple re-ordering of (20) implies that

$$\begin{aligned}&{\Vert z_{k+1}-z\Vert ^{2}-\Vert z_k-z\Vert ^{2}} \nonumber \\&\quad \le \theta _k\Vert z_k-z\Vert ^{2}-\theta _{k-1}\Vert z_{k-1}-z\Vert ^{2} - (1-3\theta _{k+1}-\alpha _k)\Vert z_k-z_{k+1}\Vert ^{2}\nonumber \\&\qquad -\,2\theta _{k+1}\Vert z_{k+1}-z_k\Vert ^{2}+2\theta _k\Vert z_k-z_{k-1}\Vert ^{2}- \alpha _{k+1}\Vert z_0-z_{k+1}\Vert ^{2}\nonumber \\&\qquad +\,\alpha _k\Vert z_k-z_0\Vert ^{2}-2\alpha _k\langle z_k-z_0,z_k-z \rangle \nonumber \\&\quad = \theta _k\Vert z_k-z\Vert ^{2}-\theta _{k-1}\Vert z_{k-1}-z\Vert ^{2} - (1-3\theta _{k+1}-\alpha _k)\Vert z_k-z_{k+1}\Vert ^{2}\nonumber \\&\qquad -\,2\theta _{k+1}\Vert z_{k+1}-z_k\Vert ^{2}+2\theta _k\Vert z_k-z_{k-1}\Vert ^{2}- \alpha _{k+1}\Vert z_0-z_{k+1}\Vert ^{2}\nonumber \\&\qquad +\,\alpha _k\Vert z_k-z_0\Vert ^{2}+\alpha _k\Vert z_0-z\Vert ^{2} - \alpha _k\Vert z_k-z_0\Vert ^{2}-\alpha _k\Vert z_k-z\Vert ^{2}, \end{aligned}$$
(27)

where the equality uses once again Lemma 2.2 (a). Hence, by cancellation, re-ordering, and neglecting a non-positive term on the right-hand side, we obtain

$$\begin{aligned}&{\Vert z_{k+1}-z\Vert ^{2}-\Vert z_k-z\Vert ^{2}+ \alpha _k\Vert z_k-z\Vert ^{2}} \nonumber \\&\quad \le \theta _k\Vert z_k-z\Vert ^{2}-\theta _{k-1}\Vert z_{k-1}-z\Vert ^{2} -(1-3\theta _{k+1}-\alpha _k)\Vert z_k-z_{k+1}\Vert ^{2}\nonumber \\&\qquad -\,2\theta _{k+1}\Vert z_{k+1}-z_k\Vert ^{2}+2\theta _k\Vert z_k-z_{k-1}\Vert ^{2} +\alpha _k\Vert z_0-z\Vert ^{2}. \end{aligned}$$
(28)

Let \(\mu _j:=e^{\sum _{i=1}^{j}\alpha _{i}}, j\ge 1\). Using \(1-x \le e^{-x}\) for all \(x \in \mathbb R\) (or equivalently, \(1-e^{-x} \le x, x \in \mathbb {R}\)), we obtain

$$\begin{aligned} \frac{1}{\mu _{k+1}}(\mu _{k+1}-\mu _{k})&=1-\frac{\mu _{k}}{\mu _{k+1}} \nonumber \\&=1-e^{\sum _{i=1}^{k}\alpha _{i}-\sum _{i=1}^{k+1}\alpha _{i}} \nonumber \\&=1-e^{-\alpha _{k+1}} \le \alpha _{k+1}. \end{aligned}$$
(29)

Then (29) consequently implies that

$$\begin{aligned}&{\frac{1}{\mu _{k+1}} \big ( \mu _{k+1}\Vert z_{k+1}-z\Vert ^{2}- \mu _{k}\Vert z_k-z\Vert ^{2} \big )} \\&\quad =\Vert z_{k+1}-z\Vert ^{2}-\Vert z_k-z\Vert ^{2} +\frac{1}{\mu _{k+1}}(\mu _{k+1}-\mu _{k})\Vert z_k-z\Vert ^{2} \\&\quad \le \Vert z_{k+1}-z\Vert ^{2}-\Vert z_k-z\Vert ^{2}+\alpha _{k+1}\Vert z_k-z\Vert ^{2}. \end{aligned}$$

Since \(\{\alpha _k\}\) is non-increasing in (0,1), this implies

$$\begin{aligned}&{\frac{1}{\mu _{k+1}} \big ( \mu _{k+1}\Vert z_{k+1}-z\Vert ^{2}- \mu _{k}\Vert z_k-z\Vert ^{2} \big )} \nonumber \\&\quad \le \Vert z_{k+1}-z\Vert ^{2}-\Vert z_k-z\Vert ^{2} +\alpha _k\Vert z_k-z\Vert ^{2}. \end{aligned}$$
(30)

It then follows from (28) and (30) that

$$\begin{aligned}&{\frac{1}{\mu _{k+1}} \big ( \mu _{k+1}\Vert z_{k+1}-z\Vert ^{2}- \mu _{k}\Vert z_k-z\Vert ^{2} \big )} \\&\quad \le \theta _k\Vert z_k-z\Vert ^{2}-\theta _{k-1}\Vert z_{k-1}-z\Vert ^{2}\\&\qquad -\,(1-3\theta _{k+1}-\alpha _k)\Vert z_k-z_{k+1}\Vert ^{2}- 2\theta _{k+1}\Vert z_{k+1}-z_k\Vert ^{2}\\&\qquad +\,2\theta _k\Vert z_k-z_{k-1}\Vert ^{2}+\alpha _k\Vert z_0-z\Vert ^{2}. \end{aligned}$$

Since \(\mu _{k}\le \mu _{k+1}\), \(\mu _{k+1}=\mu _{k}e^{\alpha _{k+1}}\) and \(\{\alpha _k\}\) is non-increasing in (0,1), we therefore get

$$\begin{aligned}&{\mu _{k+1}\Vert z_{k+1}-z\Vert ^{2}-\mu _{k}\Vert z_k-z\Vert ^{2}}\\&\quad \le \mu _{k+1}\theta _k\Vert z_k-z\Vert ^{2}- \mu _{k}\theta _{k-1}\Vert z_{k-1}-z\Vert ^{2} -\mu _{k+1}(1-3\theta _{k+1}-\alpha _k)\Vert z_{k+1}-z_k\Vert ^{2}\\&\qquad -\,2\mu _{k+1}\theta _{k+1}\Vert z_{k+1}-z_k\Vert ^{2}+ 2\mu _{k}\theta _ke^{\alpha _{k+1}}\Vert z_k-z_{k-1}\Vert ^{2} +\mu _{k+1}\alpha _k\Vert z_0-z\Vert ^{2}, \end{aligned}$$

which can be rewritten as (since \(\{\alpha _k\}\) is non-increasing in (0,1))

$$\begin{aligned}&{\mu _{k+1}\Vert z_{k+1}-z\Vert ^{2}-\mu _{k}\Vert z_k-z\Vert ^{2}}\\&\quad \le \mu _{k+1}\theta _k\Vert z_k-z\Vert ^{2}- \mu _{k}\theta _{k-1}\Vert z_{k-1}-z\Vert ^{2} \\&\qquad -\,\mu _{k+1} \big [ 1-\theta _{k+1} \big ( 3+2(e^{\alpha _{k+1}}-1) \big )- \alpha _k \big ] \Vert z_{k+1}-z_k\Vert ^{2}\\&\qquad -\,2\mu _{k+1}\theta _{k+1}e^{\alpha _{k+1}}\Vert z_{k+1}-z_k\Vert ^{2}+ 2\mu _{k}\theta _ke^{\alpha _k}\Vert z_k-z_{k-1}\Vert ^{2} +\mu _{k+1}\alpha _k\Vert z_0-z\Vert ^{2}. \end{aligned}$$

Since the sequence \(\{\theta _k\}\) belongs to the interval \([0,\theta ]\), we have

$$\begin{aligned} 1-\theta _{k+1} \big ( 3+2(e^{\alpha _{k+1}}-1) \big )-\alpha _k\ge 1- \theta \big ( 3+2(e^{\alpha _{k+1}}-1) \big )-\alpha _k, \quad \forall k \in \mathbb N. \end{aligned}$$

Using \(\lim _{k \rightarrow \infty } \alpha _k = 0\) and \(\theta \in [0, 1/3)\), it follows that the right-hand side is eventually bounded from below by a positive number, i.e., there is a constant \(\gamma > 0\) such that \(1-\theta _{k+1} \big ( 3+2(e^{\alpha _{k+1}}-1) \big )-\alpha _k\ge \gamma\) for all \(k \in \mathbb N\) sufficiently large, say, for all \(k \ge k_0\). Hence, we have

$$\begin{aligned}&{\mu _{k+1}\Vert z_{k+1}-z\Vert ^{2}-\mu _{k}\Vert z_k-z\Vert ^{2}}\\&\quad \le \mu _{k+1}\theta _k\Vert z_k-z\Vert ^{2}- \mu _{k}\theta _{k-1}\Vert z_{k-1}-z\Vert ^{2} -2\mu _{k+1}\theta _{k+1}e^{\alpha _{k+1}}\Vert z_{k+1}-z_k\Vert ^{2} \\&\qquad -\, \gamma \mu _{k+1}\Vert z_{k+1}-z_k\Vert ^{2} +2\mu _{k}\theta _ke^{\alpha _k}\Vert z_k-z_{k-1}\Vert ^{2}+ \mu _{k+1}\alpha _k\Vert z_0-z\Vert ^{2}. \end{aligned}$$

This implies that for \(k\ge k_{0}\),

$$\begin{aligned}&{\Vert z_0-z\Vert ^{2}\sum _{j=k_{0}+1}^{k}\mu _{j+1}\alpha _{j}} \nonumber \\&\quad \ge \mu _{k+1}\Vert z_{k+1}-z\Vert ^{2}+2\mu _{k+1} \theta _{k+1} e^{\alpha _{k+1}} \Vert z_{k+1}-z_k\Vert ^{2}-\mu _{k+1}\theta _k\Vert z_k-z\Vert ^{2}\nonumber \\&\qquad -\,\mu _{k_{0}+1}\Vert z_{k_{0}+1}-z\Vert ^{2}-2\mu _{k_{0}+1}\theta _{k_{0}+1} e^{\alpha _{k_{0}+1}}\Vert z_{k_{0}+1}-z_{k_{0}}\Vert ^{2}\nonumber \\&\qquad +\, \mu _{k_{0}+1}\theta _{k_{0}}\Vert z_{k_{0}}-z\Vert ^{2}. \end{aligned}$$
(31)

Thus, dividing by \(\mu _{k+1}\) and omitting a non-positive term, we get

$$\begin{aligned}&{\Vert z_{k+1}-z\Vert ^{2}-\theta _k\Vert z_k-z\Vert ^{2}} \nonumber \\&\quad \le e^{-t_{k+1}} \big [ \mu _{k_{0}+1}\Vert z_{k_{0}+1}-z\Vert ^{2} +2\mu _{k_{0}+1}\theta _{k_{0}+1}e^{\alpha _{k_{0}+1}} \Vert z_{k_{0}+1}-z_{k_{0}}\Vert ^{2}\nonumber \\&\qquad -\mu _{k_{0}+1}\theta _{k_{0}}\Vert z_{k_{0}}-z\Vert ^{2} \big ] +\Vert z_0-z\Vert ^{2} e^{-t_{k+1}}\sum _{j=k_{0}+1}^{k}\alpha _{j}e^{t_{j+1}}, \end{aligned}$$
(32)

where \(t_{k}:=\sum _{i=1}^{k}\alpha _{i}\). Since \(\alpha _k \in (0,1)\) for all \(k \in \mathbb N\), it is easy to see that \(\alpha _{k}e^{t_{k+1}}\le e^{2}(e^{t_{k}}-e^{t_{k-1}})\) for all \(k\ge 2\), so that

$$\begin{aligned} \sum _{j=k_{0}+1}^k \mu _{j+1}\alpha _{j} = \sum _{j=k_0+1}^k \alpha _j e^{t_{j+1}} \le e^2 \big ( e^{t_k} - e^{t_{k_0}} \big ) \le e^{2}e^{t_{k}}, \end{aligned}$$

which, by (32), \(e^{-t_{k+1}} \le 1\), and the fact that \(\{\theta _k\}\) belongs to the interval \([0, \theta ] \subset [0,\frac{1}{3})\), yields

$$\begin{aligned} \Vert z_{k+1}-z\Vert ^{2}&\le \theta \Vert z_k-z\Vert ^{2}+ \mu _{k_{0}+1}\Vert z_{k_{0}+1}-z\Vert ^{2}+2\mu _{k_{0}+1}\theta _{k_{0}+1} e^{\alpha _{k_{0}+1}}\Vert z_{k_{0}+1}-z_{k_{0}}\Vert ^{2}\nonumber \\&\quad +e^{2}\Vert z_0-z\Vert ^{2}. \end{aligned}$$
(33)

Using (33), \(\theta \in [0,1)\), and the convergence of the geometric series, a simple calculation gives

$$\begin{aligned} \Vert z_{k+1}-z\Vert ^{2}&\le \theta ^{k-k_{0}}\Vert z_{k_0+1}-z\Vert ^{2}+ \frac{1}{1-\theta } \big [ \mu _{k_{0}+1}\Vert z_{k_{0}+1}-z\Vert ^{2}\\&\quad +2\mu _{k_{0}+1}\theta _{k_{0}+1}e^{\alpha _{k_{0}+1}} \Vert z_{k_{0}+1}-z_{k_{0}}\Vert ^{2} + e^{2}\Vert z_0-z\Vert ^{2} \big ]. \end{aligned}$$

Using once again that \(\theta < 1\), this shows that \(\{z_k\}\) is bounded. \(\square\)

Next, we formulate a simple lemma that turns out to be useful for proving the strong convergence result.

Lemma 3.4

Let \(\{z_k\}\) be the sequence generated by (9). Define

$$\begin{aligned} u_{k}:=\Vert z_k-z\Vert ^{2}-\theta _{k-1}\Vert z_{k-1}-z\Vert ^{2}+ 2\theta _k\Vert z_k-z_{k-1}\Vert ^{2}+\alpha _k\Vert z_k-z_0\Vert ^{2} \end{aligned}$$

for all \(k \in \mathbb N\). Then \(u_k \ge 0\) for all \(k \in \mathbb N\).

Proof

Since \(\{\theta _k\}\) is non-decreasing with \(0\le \theta _k< \frac{1}{3}\), and by Lemma 2.2 (a), we have

$$\begin{aligned} u_{k}&=\Vert z_k-z\Vert ^{2}-\theta _{k-1}\Vert z_{k-1}-z_k+z_k-z\Vert ^{2}+ 2\theta _k\Vert z_k-z_{k-1}\Vert ^{2}+\alpha _k\Vert z_k-z_0\Vert ^{2}\\&=\Vert z_k-z\Vert ^{2}-\theta _{k-1} \big [ \Vert z_{k-1}-z_k\Vert ^{2}+ \Vert z_k-z\Vert ^{2}+2\langle z_{k-1}-z_k,z_k-z \rangle \big ]\\&\quad +2\theta _k\Vert z_k-z_{k-1}\Vert ^{2}+\alpha _k\Vert z_k-z_0\Vert ^{2}\\&=\Vert z_k-z\Vert ^{2}-\theta _{k-1} \big [ 2\Vert z_{k-1}-z_k\Vert ^{2}+ 2\Vert z_k-z\Vert ^{2}-\Vert z_{k-1}-2z_k+z\Vert ^{2} \big ]\\&\quad +2\theta _k\Vert z_k-z_{k-1}\Vert ^{2}+\alpha _k\Vert z_k-z_0\Vert ^{2}\\&=\Vert z_k-z\Vert ^{2}-2\theta _{k-1}\Vert z_{k-1}-z_k\Vert ^{2} - 2\theta _{k-1}\Vert z_k-z\Vert ^{2}+\theta _{k-1}\Vert z_{k-1}-2z_k+z\Vert ^{2}\\&\quad +2\theta _k\Vert z_k-z_{k-1}\Vert ^{2}+\alpha _k\Vert z_k-z_0\Vert ^{2}\\&\ge \Vert z_k-z\Vert ^{2}- 2\theta _k\Vert z_{k-1}-z_k\Vert ^{2} - \frac{2}{3}\Vert z_k-z\Vert ^{2}+\theta _{k-1}\Vert z_{k-1}-2z_k+z\Vert ^{2}\\&\quad +2\theta _k\Vert z_k-z_{k-1}\Vert ^{2}+\alpha _k\Vert z_k-z_0\Vert ^{2}\\&\ge \frac{1}{3}\Vert z_k-z\Vert ^{2}+\alpha _k\Vert z_k-z_0\Vert ^{2}\\&\ge 0, \end{aligned}$$

and this completes the proof. \(\square\)

Before we prove our main strong convergence result, we state another preliminary result which provides sufficient conditions for the strong convergence of the sequence \(\{z_k\}\) generated by our method (9). In our strong convergence result, we will then show that these sufficient conditions automatically hold.

Lemma 3.5

Let \(\{z_k\}\) be the sequence generated by (9). Assume that

$$\begin{aligned} \lim _{k\rightarrow \infty }\Vert z_{k+1}-z_k\Vert =0 \end{aligned}$$

and

$$\begin{aligned} \lim _{k\rightarrow \infty }(\Vert z_{k+1}-z\Vert ^2-\theta _k\Vert z_k-z\Vert ^2)=0. \end{aligned}$$

Then the entire sequence \(\{z_k\}\) converges strongly to the solution z.

Proof

By assumption, we have

$$\begin{aligned} 0= & {} \lim _{k\rightarrow \infty }(\Vert z_{k+1}-z\Vert ^2-\theta _k\Vert z_k-z\Vert ^2) \nonumber \\= & {} \lim _{k\rightarrow \infty }\Big [(\Vert z_{k+1}-z\Vert +\sqrt{\theta _k}\Vert z_k-z\Vert )(\Vert z_{k+1}-z\Vert -\sqrt{\theta _k}\Vert z_k-z\Vert )\Big ]. \end{aligned}$$
(34)

We claim that this already implies

$$\begin{aligned} \lim _{k\rightarrow \infty }(\Vert z_{k+1}-z\Vert +\sqrt{\theta _k}\Vert z_k-z\Vert )=0, \end{aligned}$$

from which the strong convergence of the entire sequence \(\{z_k\}\) to z follows immediately. Assume this limit does not hold. Then there exist an infinite subset \(K\subseteq \mathbb {N}\) and a constant \(\rho > 0\) such that

$$\begin{aligned} \Vert z_{k+1}-z\Vert +\sqrt{\theta _k}\Vert z_k-z\Vert \ge \rho , \forall k \in K. \end{aligned}$$
(35)

Since \(\lim _{k\rightarrow \infty }\Vert z_{k+1}-z_k\Vert =0\) by assumption and \(0\le \theta <1\), we then have (recall that if \(\{a_k\}\) and \(\{b_k\}\) are bounded sequences in \(\mathbb {R}\) and one of them converges, then \(\limsup _{k\rightarrow \infty } (a_k+b_k)=\limsup _{k\rightarrow \infty } a_k +\limsup _{k\rightarrow \infty } b_k\))

$$\begin{aligned} \limsup _{k \in K}((1-\sqrt{\theta })\Vert z_k-z\Vert -\Vert z_{k+1}-z_k\Vert )= & {} \limsup _{k \in K}(1-\sqrt{\theta })\Vert z_k-z\Vert -\lim _{k \in K}\Vert z_{k+1}-z_k\Vert \\= & {} (1-\sqrt{\theta })\limsup _{k \in K}\Vert z_k-z\Vert -\lim _{k \in K}\Vert z_{k+1}-z_k\Vert . \end{aligned}$$

Using (34), the lower bound (35) for the first factor, and \(\theta _k\le \theta <1\), we get

$$\begin{aligned} 0= & {} \lim _{k \in K}(\Vert z_{k+1}-z\Vert -\sqrt{\theta _k}\Vert z_k-z\Vert )\\= & {} \limsup _{k \in K}(\Vert z_{k+1}-z_k+z_k-z\Vert -\sqrt{\theta _k}\Vert z_k-z\Vert )\\\ge & {} \limsup _{k \in K}(\Vert z_k-z\Vert -\Vert z_{k+1}-z_k\Vert -\sqrt{\theta _k}\Vert z_k-z\Vert )\\\ge & {} \limsup _{k \in K}((1-\sqrt{\theta })\Vert z_k-z\Vert -\Vert z_{k+1}-z_k\Vert )\\= & \, (1-\sqrt{\theta })\limsup _{k \in K}\Vert z_k-z\Vert -\lim _{k \in K}\Vert z_{k+1}-z_k\Vert \\= & \, (1-\sqrt{\theta })\limsup _{k \in K}\Vert z_k-z\Vert . \end{aligned}$$

Consequently, we have \(\limsup _{k \in K}\Vert z_k-z\Vert \le 0.\) Since \(\liminf _{k \in K}\Vert z_k-z\Vert \ge 0\) obviously holds, it follows that \(\lim _{k \in K}\Vert z_k-z\Vert = 0.\) This implies [by (35)]

$$\begin{aligned} \Vert z_{k+1}-z_k\Vert\ge &\, \Vert z_{k+1}-z\Vert -\Vert z_k-z\Vert \\= & \, \Vert z_{k+1}-z\Vert +\sqrt{\theta _k}\Vert z_k-z\Vert -(1+\sqrt{\theta _k})\Vert z_k-z\Vert \\\ge &\, \rho -(1+\sqrt{\theta _k})\Vert z_k-z\Vert \\\ge &\, \frac{\rho }{2} \end{aligned}$$

for all \(k\in K\) sufficiently large, a contradiction to the assumption that \(\lim _{k\rightarrow \infty }\Vert z_{k+1}-z_k\Vert =0.\) This completes the proof. \(\square\)

We are now ready to obtain strong convergence of the sequence \(\{ z_k \}\) generated by (9) to an element of S.

Theorem 3.6

The sequence \(\{z_k\}\) generated by (9) strongly converges to z, where \(z=P_Sz_0\).

Proof

Let \(u_{k}\) denote the nonnegative number defined in Lemma 3.4, and let us apply Lemma 3.2. We obtain from (20) that

$$\begin{aligned}&u_{k+1}-u_{k}+(1-3\theta _{k+1}-\alpha _k) \Vert z_k-z_{k+1}\Vert ^{2} \nonumber \\&\quad \le -2\alpha _k\langle z_k-z,z_k-z_0\rangle . \end{aligned}$$
(36)

We now consider two cases.

Case 1 Suppose \(\{u_{k}\}\) is eventually a monotonically decreasing sequence, i.e., for some \(k_{0} \in \mathbb N\) large enough, we have \(u_{k+1} \le u_k\) for all \(k \ge k_0\). Then, since \(u_k\) is nonnegative for all \(k \in \mathbb N\) by Lemma 3.4, the sequence \(\{u_{k}\}\) is convergent. Consequently, \(\lim _{k\rightarrow \infty }u_{k}=\lim _{k\rightarrow \infty }u_{k+1}\). Since \(\{z_k\}\) is bounded by Theorem 3.3, there exists \(M>0\) such that \(2|\langle z_k-z,z_k-z_0\rangle |\le M.\) Moreover, since \(\theta _{k+1}\le \theta <\frac{1}{3}\) and \(\alpha _k\rightarrow 0\), there exist \(N\in \mathbb {N}\) and \(\gamma _{1}>0\) such that \(1-3\theta _{k+1}-\alpha _k \ge \gamma _{1}\) for all \(k\ge N\). Therefore, for \(k\ge N\), we obtain from (36) that

$$\begin{aligned} \gamma _{1} \Vert z_{k+1}-z_k\Vert ^{2}\le & {} \alpha _kM + u_{k}-u_{k+1}\\\rightarrow & {} 0 \quad \text {for } k\rightarrow \infty . \end{aligned}$$

Hence

$$\begin{aligned} \lim _{k\rightarrow \infty }\Vert z_{k+1}-z_k\Vert = 0. \end{aligned}$$

Together with \(\alpha _k \rightarrow 0\), the boundedness of \(\{ z_k \}\), and the convergence of \(\{ u_k \}\), we therefore obtain from the definition of \(u_k\) that the limit

$$\begin{aligned} \lambda := \lim _{k\rightarrow \infty } \big ( \Vert z_{k+1}-z\Vert ^{2}-\theta _k\Vert z_k-z\Vert ^{2} \big ) \end{aligned}$$
(37)

exists and is equal to \(\lim _{k \rightarrow \infty } u_{k+1}\). In particular, Lemma 3.4 therefore implies that \(\lambda \ge 0\). We will show that \(\lambda = 0\) holds; then (37) together with the fact that \(\theta _k \le \theta < 1\) for all \(k \in \mathbb N\) yields the strong convergence of the sequence \(\{ z_k \}\) to the solution z.

By contradiction, assume that \(\lambda > 0\). Since \(\{ z_k \}\) is bounded by Theorem 3.3, it is easy to see that we can choose a subsequence \(\{z_{k_{j}}\}\) which converges weakly to an element \(p\in H\) and such that

$$\begin{aligned} \underset{k\rightarrow \infty }{\liminf }\langle z_k-z,z-z_0\rangle = \underset{j\rightarrow \infty }{\lim }\langle z_{k_j}-z,z-z_0\rangle = \langle p-z,z-z_0\rangle . \end{aligned}$$

We show that \(p \in S\). Observe that the updating rule for \(y_k\) implies

$$\begin{aligned} \Vert y_{k}-z_k\Vert= &\, \Vert \alpha _k(z_0-z_k)+\theta _k(z_k-z_{k-1})\Vert \\\le & \, \alpha _k\Vert z_0-z_k\Vert +\theta _k\Vert z_k-z_{k-1}\Vert \rightarrow 0,~~k\rightarrow \infty . \end{aligned}$$

This yields

$$\begin{aligned} \Vert e(y_k,\lambda )\Vert \le \frac{1}{\beta }\Vert z_{k+1}-y_k\Vert \le \frac{1}{\beta }\big (\Vert z_k-y_k\Vert +\Vert z_{k+1}-z_k\Vert \big )\rightarrow 0,~~k \rightarrow \infty . \end{aligned}$$

Let \(Ty:=\frac{1}{2}y+\frac{1}{2}R^\lambda _A\circ R^\lambda _B(y),~~y \in H\). Then it is clear that T is nonexpansive, and \(z \in F(T):=\{x\in H:x=Tx\}\) if and only if \(z=R^\lambda _A\circ R^\lambda _B(z)\). Similarly, it is easy to see that \(e(y_k,\lambda )= \frac{1}{2}\big (y_k-R^\lambda _A\circ R^\lambda _B(y_k)\big )=y_k-Ty_k\). Therefore,

$$\begin{aligned} \lim _{k\rightarrow \infty } \Vert y_k-Ty_k\Vert =\lim _{k\rightarrow \infty }\Vert e(y_k,\lambda )\Vert =0. \end{aligned}$$

The demiclosedness principle applied to the nonexpansive mapping T implies that \(p \in F(T)\). Hence, \(p \in S\). This implies that

$$\begin{aligned} \underset{k\rightarrow \infty }{\liminf }\langle z_k-z,z-z_0\rangle = \langle p-z,z-z_0\rangle \ge 0, \end{aligned}$$
(38)

where the inequality follows from the characterization (12) of a projection applied to \(z = P_S z_0\) and \(p \in S\). Since (37) yields

$$\begin{aligned} \liminf _{k \rightarrow \infty } \Vert z_{k+1} - z \Vert ^2 \ge \lim _{k \rightarrow \infty } \big ( \Vert z_{k+1} - z \Vert ^2 - \theta _k \Vert z_k - z \Vert ^2 \big ) = \lambda , \end{aligned}$$

and since \(\lambda > 0\) by assumption, we have

$$\begin{aligned} \Vert z_{k+1} - z \Vert ^2 \ge \frac{1}{2} \lambda \quad \forall k \ge k_1 \end{aligned}$$

for some sufficiently large \(k_1 \in \mathbb N\). Using the identity

$$\begin{aligned} \langle z_k - z, z_k - z_0 \rangle = \Vert z_k - z \Vert ^2 + \langle z_k - z, z - z_0 \rangle , \end{aligned}$$

we therefore get

$$\begin{aligned} \liminf _{k \rightarrow \infty } \langle z_k - z, z_k - z_0 \rangle= & \, \liminf _{k \rightarrow \infty } \big ( \Vert z_k - z \Vert ^2 + \langle z_k - z, z - z_0 \rangle \big ) \\\ge & \, \liminf _{k \rightarrow \infty } \Big ( \frac{1}{2} \lambda + \langle z_k - z, z - z_0 \rangle \Big ) \\= & \, \frac{1}{2} \lambda + \liminf _{k \rightarrow \infty } \langle z_k - z, z - z_0 \rangle \\\ge & \, \frac{1}{2} \lambda \end{aligned}$$

from (38). Using once again the assumption that \(\lambda > 0\), this implies

$$\begin{aligned} \langle z_k - z, z_k - z_0 \rangle \ge \frac{1}{4} \lambda \quad \forall k \ge k_2 \end{aligned}$$

for some sufficiently large \(k_2 \in \mathbb N, k_2 \ge k_1\). From (36), we therefore obtain

$$\begin{aligned} u_{k+1} - u_k \le - \frac{1}{2} \alpha _k \lambda \quad \forall k \ge k_2. \end{aligned}$$

This implies

$$\begin{aligned} \frac{1}{2} \lambda \sum _{j=k_2}^k \alpha _j \le u_{k_2} - u_{k+1} \le u_{k_2} \quad \forall k \ge k_2, \end{aligned}$$

where the second inequality follows from Lemma 3.4. Since \(\lambda > 0\), this gives the summability of the sequence \(\{ \alpha _k \}\), a contradiction to our assumption. Hence we must have \(\lambda = 0\), and this yields the strong convergence of the sequence \(\{ z_k \}\) to z.

Case 2 Assume that \(\{u_k\}\) is not eventually monotonically decreasing. Then let \(\tau :\mathbb {N}\rightarrow \mathbb {N}\) be the map defined for all \(k\ge k_{0}\) (for some \(k_{0} \in \mathbb N\) large enough) by

$$\begin{aligned} \tau (k):=\max \{j\in \mathbb {N}: j\le k, u_{j}\le u_{j+1}\}. \end{aligned}$$
(39)
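For intuition, the index map (39) can be evaluated directly; the short Python sketch below (the sequence `u` is a hypothetical illustration, not data from the paper) returns the largest index \(j\le k\) at which the sequence fails to decrease:

```python
def tau(u, k):
    """tau(k) = max{ j <= k : u[j] <= u[j+1] }, as in (39).

    Returns None when no index j <= k satisfies u[j] <= u[j+1]."""
    candidates = [j for j in range(k + 1) if u[j] <= u[j + 1]]
    return max(candidates) if candidates else None

# A hypothetical non-monotone sequence u_0, ..., u_6.
u = [5.0, 4.0, 4.5, 3.0, 2.0, 2.5, 1.0]
# The only non-decreasing steps are u_1 <= u_2 and u_4 <= u_5,
# so tau(3) = 1 and tau(5) = 4.
```

Note that, as in the proof, `tau` is well defined (not `None`) for all sufficiently large k precisely when the sequence is not eventually monotonically decreasing.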

Clearly, \(\tau (k)\) is a non-decreasing sequence such that \(\tau (k) \rightarrow \infty\) for \(k\rightarrow \infty\) and \(u_{\tau (k)}\le u_{\tau (k)+1}\) for all \(k\ge k_{0}\). Hence, similar to the proof of Case 1, we therefore obtain from (36) that

$$\begin{aligned} \gamma _1\Vert z_{\tau (k)+1}-z_{\tau (k)}\Vert ^{2} \le \alpha _{\tau (k)}M\rightarrow 0 \end{aligned}$$
(40)

for some constant \(M > 0\). Thus,

$$\begin{aligned} \Vert z_{\tau (k)+1}-z_{\tau (k)}\Vert \rightarrow 0,~~ k\rightarrow \infty . \end{aligned}$$
(41)

Using the same technique of proof as in Case 1, one can also derive the limits

$$\begin{aligned} \Vert z_{\tau (k)+1} - y_{\tau (k)} \Vert\rightarrow & {} 0,~~ k\rightarrow \infty , \nonumber \\ \Vert y_{\tau (k)} - z_{\tau (k)} \Vert\rightarrow & {} 0,~~ k \rightarrow \infty , \end{aligned}$$
(42)
$$\begin{aligned} \Vert e(y_{\tau (k)},\lambda ) \Vert\rightarrow & {} 0,~~ k \rightarrow \infty . \end{aligned}$$
(43)

Again observe that for \(j\ge 0\) by (36), we have \(u_{j+1}<u_{j}\) whenever \(z_{j}\not \in \Omega :=\{x\in H: \langle x-z_0,x-z\rangle \le 0\}\) (note that this \(\Omega\) is the same set as in Lemma 2.1). Hence \(z_{\tau (k)}\in \Omega\) for all \(k\ge k_{0}\) since \(u_{\tau (k)} \le u_{\tau (k)+1}\). Since \(\{z_{\tau (k)}\}\) is bounded, we may choose a subsequence (which we again denote by \(\{z_{\tau (k)}\}\)) which converges weakly to some \(x^{*}\in H\). As \(\Omega\) is a closed and convex set, it is weakly closed, and so \(x^{*} \in \Omega\). Using (42) and (43), one can argue as in Case 1 (via the residual \(e(\cdot ,\lambda )\) and the demiclosedness principle) that \(x^* \in S\). Consequently, we have \(x^{*}\in \Omega \cap S\). In view of Lemma 2.1, however, the intersection \(\Omega \cap S\) contains z as its only element. We therefore get \(x^* = z\). Furthermore, we have

$$\begin{aligned} \Vert z_{\tau (k)}-z\Vert ^{2}= & \, \langle z_{\tau (k)}-z_0,z_{\tau (k)}-z\rangle - \langle z-z_0,z_{\tau (k)}-z\rangle \\\le &\, -\langle z-z_0,z_{\tau (k)}-z\rangle \end{aligned}$$

since \(z_{\tau (k)}\in \Omega\). Taking the lim sup in this last inequality, and noting that every weak cluster point of \(\{z_{\tau (k)}\}\) equals z (so that \(\langle z-z_0,z_{\tau (k)}-z\rangle \rightarrow 0\)), gives

$$\begin{aligned} \limsup _{k \rightarrow \infty }\Vert z_{\tau (k)}-z\Vert ^{2} \le 0. \end{aligned}$$

Hence

$$\begin{aligned} \Vert z_{\tau (k)}-z\Vert \rightarrow 0,~~ k\rightarrow \infty . \end{aligned}$$
(44)

We claim that this implies \(\lim _{k \rightarrow \infty } u_{\tau (k)+1} = 0\). By definition, \(u_{\tau (k)+1}\) is equal to

$$\begin{aligned} \Vert z_{\tau (k)+1} - z \Vert ^2 - \theta _{\tau (k)} \Vert z_{\tau (k)} - z \Vert ^2 + 2 \theta _{\tau (k)+1} \Vert z_{\tau (k)+1} - z_{\tau (k)} \Vert ^2 + \alpha _{\tau (k)+1} \Vert z_{\tau (k)+1} - z_0 \Vert ^2. \end{aligned}$$

Adding and subtracting \(z_{\tau (k)}\) inside the norm of the first term, and using (41) and (44), we see that the first term tends to zero. The second term also converges to zero in view of (44), taking into account the boundedness of \(\{ \theta _k \}\). The third term vanishes in the limit because of (41), noting once again that \(\{ \theta _k \}\) is a bounded sequence. Finally, the last term tends to zero since \(\{ \alpha _k \}\) converges to zero and the sequence \(\{ z_k \}\) is bounded by Theorem 3.3.

We next show that we actually have \(\lim _{k \rightarrow \infty } u_k = 0\). To this end, first observe that, for \(k\ge k_{0},\) one has \(u_{k}\le u_{\tau (k)+1}\) if \(k\ne \tau (k)\) (that is, if \(\tau (k)<k\)) because we necessarily have \(u_{j}>u_{j+1}\) for \(\tau (k)+1\le j\le k-1\). It follows that for all \(k\ge k_{0}\), we have \(u_{k}\le \max \{u_{\tau (k)}, u_{\tau (k)+1}\}=u_{\tau (k)+1} \rightarrow 0\), hence \(\limsup _{k\rightarrow \infty }u_{k}\le 0\). On the other hand, Lemma 3.4 implies that \(\liminf _{k \rightarrow \infty } u_k \ge 0\). Together we obtain \(\lim _{k \rightarrow \infty } u_k = 0\).

Consequently, the boundedness of \(\{ z_k \}\), the assumptions on our iterative parameters, and (36) show that

$$\begin{aligned} \Vert z_k - z_{k+1} \Vert \rightarrow 0,~~ k \rightarrow \infty . \end{aligned}$$

Hence the definition of \(u_k\) yields

$$\begin{aligned} \lim _{k \rightarrow \infty } \big ( \Vert z_{k+1} - z \Vert ^2 - \theta _k \Vert z_k - z \Vert ^2 \big )=0. \end{aligned}$$

Together with \(\lim _{k \rightarrow \infty }\Vert z_{k+1}-z_k\Vert =0\) established above, the strong convergence of the entire sequence \(\{ z_k \}\) to the particular solution z therefore follows from Lemma 3.5. \(\square\)

In the special case when B is a set-valued maximal monotone operator and A is a single-valued \(\kappa\)-inverse strongly monotone operator in problem (1), iterative procedure (9) reduces to the following: \(z_0, z_1 \in H\),

$$\begin{aligned} \left\{ \begin{array}{ll} y_k=\alpha _kz_0+(1-\alpha _k)z_k+\theta _k(z_k-z_{k-1}),\\ z_{k+1}=(1-\beta _k)y_k+\beta _k(I+\lambda B)^{-1}(I-\lambda A)y_k, \end{array} \right. \end{aligned}$$
(45)

with \(0<\lambda <2\kappa\). Moreover, we obtain strong convergence for this special case of the monotone inclusion; the proof follows along the same lines of argument as the previous lemmas and Theorem 3.6.
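As an illustration of how iteration (45) can be implemented, the following Python sketch applies it to a toy composite problem. The concrete choices here are illustrative assumptions, not the experimental setup of Sect. 4: A is the gradient of a quadratic (hence 1-inverse strongly monotone, so any \(0<\lambda <2\) is admissible) and B is the subdifferential of the \(\ell _1\)-norm, whose resolvent is componentwise soft thresholding.

```python
import numpy as np

def soft_threshold(x, t):
    # Resolvent of t * d||.||_1: componentwise soft thresholding.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def algorithm_45(grad_A, lam, z0, z1, n_iter=200):
    # Sketch of iteration (45) with B = d||.||_1, so that
    # (I + lam B)^{-1} is soft thresholding with threshold lam.
    z_prev, z = z0.copy(), z1.copy()
    for k in range(1, n_iter + 1):
        alpha_k = 1.0 / (k + 1)      # anchor weight, alpha_k -> 0
        theta_k = 0.3                # inertial factor, below the bound 1/3
        beta_k = 0.5                 # relaxation parameter in (0, 1/2]
        y = alpha_k * z0 + (1 - alpha_k) * z + theta_k * (z - z_prev)
        forward = y - lam * grad_A(y)            # (I - lam A) y
        z_new = (1 - beta_k) * y + beta_k * soft_threshold(forward, lam)
        z_prev, z = z, z_new
    return z

# Illustrative A: gradient of 0.5 * ||x - c||^2, which is
# 1-inverse strongly monotone.
c = np.array([2.0, -0.5, 0.0])
z = algorithm_45(lambda x: x - c, lam=1.0, z0=np.zeros(3), z1=np.zeros(3))
# z approximates the minimizer of 0.5||x - c||^2 + ||x||_1,
# i.e. soft_threshold(c, 1).
```

The anchor term \(\alpha _k z_0\) decays only slowly, so the iterate approaches the solution up to an error of order \(\alpha _k\); this is the price paid for strong convergence to the specific point \(P_S z_0\).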

Corollary 3.7

Suppose B is a set-valued maximal monotone operator and A is a single-valued \(\kappa\)-inverse strongly monotone operator. Assume that \(S:=\{x\in H: 0\in Ax+Bx\}\ne \emptyset\). Let \(\{z_k\}\) be the sequence generated by (45) with \(0<\beta \le \beta _k\le \frac{1}{2}\), \(0<\lambda <2\kappa\) and \(0\le \theta _k\le \theta _{k+1}\le \theta <\frac{1}{3}\). Then \(\{z_k\}\) strongly converges to z, where \(z=P_Sz_0\).

We next relate our results to some existing results from the literature.

Remark 3.8

  1. (a)

In the results of Thong and Vinh (see Thong and Vinh 2019, Thm. 3.5), strong convergence for a monotone inclusion was obtained under some assumptions on the iterative sequence. The monotone inclusion studied in Thong and Vinh (2019) involves the sum of a set-valued maximal monotone operator and a single-valued inverse strongly monotone operator. In this paper, our method requires no assumption on the iterative sequence, even for the more general problem considered here.

  2. (b)

Algorithm (45) can be regarded as an inertial, strongly convergent version of some recent results in Attouch and Cabot (2019), Boţ and Csetnek (2016), Lorenz and Pock (2015) and Villa et al. (2013). \(\Diamond\)

4 Numerical experiments

In all the examples in this section, we compare our proposed method (9) with its non-inertial version (\(\theta _k=0\)), with the results of Thong and Vinh (see Thong and Vinh 2019, Thm. 3.5) and with Shehu (2016). Our aim is to compare our method with other relevant strongly convergent methods in the literature.

Example 4.1

Let \(H=L^2([0,1])\). Let \(A:=\partial \Vert .\Vert\) and \(B=N_C\) in (1), where \(N_C\) is the normal cone of a nonempty closed and convex subset C of H, i.e., \(N_C(x):=\{x^* \in H:\langle y-x,x^*\rangle \le 0,~ \forall y \in C \}\). Then problem (1) reduces to the following minimization problem: find \(x^{*}\in L^2([0,1])\) such that

$$\begin{aligned} 0\in \partial \Vert x^{*}\Vert +N_C(x^{*}). \end{aligned}$$
(46)

Note that \(S\ne \emptyset\) since \(0\in S.\) Furthermore, the resolvent \(J^{\lambda }_B=(I+\lambda N_C)^{-1}=P_C\), and \(J^{\lambda }_A\) is given by the Moreau decomposition

$$\begin{aligned} J^{\lambda }_A(x)= & \, (I+\lambda \partial \Vert .\Vert )^{-1}(x)\\= & \, \text {Prox}_{\lambda \Vert .\Vert }(x) =x-\lambda P_{B_{\Vert .\Vert _*}}(\frac{x}{\lambda }), \end{aligned}$$

where \(\text {Prox}_{\lambda \Vert .\Vert }(x) := \text {argmin}_y \big \{ \lambda \Vert y\Vert + \frac{1}{2} \Vert y - x \Vert ^2 \big \}\), \(P_{B_{\Vert .\Vert _*}}\) is the projection operator and \(B_{\Vert .\Vert _*}\) is the unit ball of the dual norm. Note that in this case \(L^2([0,1])\) is self-dual. Moreover, the projection \(P_{B_{\Vert .\Vert _*}}\) (see Bauschke and Combettes 2011; Cegielski 2012) is given by:

$$\begin{aligned} P_{B_{\Vert .\Vert _*}}(x)= \left\{ \begin{array}{ll} \frac{x}{\Vert x\Vert }, &{} \quad \Vert x\Vert >1,\\ x, &{} \quad \Vert x\Vert \le 1. \end{array} \right. \end{aligned}$$

Therefore,

$$\begin{aligned} J^{\lambda }_A(x)=x-\lambda P_{B_{\Vert .\Vert }}\Big (\frac{x}{\lambda }\Big ) = \left\{ \begin{array}{ll} x-\lambda \frac{x}{\Vert x\Vert }, &{} \quad \Vert \frac{x}{\lambda }\Vert >1,\\ 0, &{} \quad \Vert \frac{x}{\lambda }\Vert \le 1. \end{array} \right. \end{aligned}$$

We take C as the ball \(C:=\{x \in H:\Vert x-z\Vert \le r\}\), then

$$\begin{aligned} P_C(x)= \left\{ \begin{array}{ll} x, &{} \quad \Vert x-z\Vert \le r,\\ z+\frac{r(x-z)}{\Vert x-z\Vert }, &{} \quad \Vert x-z\Vert > r. \end{array} \right. \end{aligned}$$

In particular, we take \(C=\{x\in L^2([0,1]):\int _0^1|x(t)-\sin (\frac{t}{2\pi })|^2 \,dt \le 16 \}\), i.e., the center is \(z(t)=\sin (\frac{t}{2\pi })\) and the radius is \(r=4\).

Set \(\lambda =0.02\), \(\beta _k=0.6\) and \(\alpha _k=100/k\). Take \(\Vert z_k-z_{k-1}\Vert \le 10^{-3}\) as the stopping criterion (Fig. 1).
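On a uniform discretization of \(L^2([0,1])\), the resolvent \(J^{\lambda }_A\) and the projection \(P_C\) above reduce to a few lines of code; the grid size and test point in the sketch below are illustrative assumptions, using the closed-form expressions just derived:

```python
import numpy as np

# Uniform grid discretization of L2([0,1]); n is an illustrative choice.
n = 1000
t = np.linspace(0.0, 1.0, n)
h = t[1] - t[0]                      # grid spacing for the L2 inner product

def l2_norm(x):
    # Discretized L2([0,1]) norm: ||x||^2 ~ sum_i x(t_i)^2 * h.
    return np.sqrt(np.sum(x ** 2) * h)

def J_A(x, lam):
    # J^lam_A(x) = x - lam * x / ||x||  if ||x/lam|| > 1 (i.e. ||x|| > lam),
    # and 0 otherwise.
    nx = l2_norm(x)
    return x - lam * x / nx if nx > lam else np.zeros_like(x)

def P_C(x, center, r):
    # Projection onto the ball C = {x : ||x - center|| <= r}.
    d = l2_norm(x - center)
    return x if d <= r else center + r * (x - center) / d

center = np.sin(t / (2.0 * np.pi))   # ball center from Example 4.1
r = 4.0                              # radius (r^2 = 16)

# A point outside C projects onto the sphere ||. - center|| = r.
proj = P_C(10.0 * np.ones(n), center, r)
```

Note that \(J^{\lambda }_A\) shrinks the norm by exactly \(\lambda\) whenever \(\Vert x\Vert >\lambda\), which is the infinite-dimensional analogue of soft thresholding.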

Table 1 Comparing the change of \(\theta _k\) under the same initial value for Example 4.1
Fig. 1

Error attenuation trend of Algorithm (9)

Example 4.2

Suppose \(A:\mathbb {R}^3\rightarrow \mathbb {R}^3\) and \(B:\mathbb {R}^3 \rightarrow \mathbb {R}^3\) are given by

$$\begin{aligned} A \left( \begin{array}{c} x \\ y \\ z \end{array} \right) = \left( \begin{array}{ccc} 8 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 10 \end{array} \right) \left( \begin{array}{c} x \\ y \\ z \end{array} \right) ,\qquad B \left( \begin{array}{c} x \\ y \\ z \end{array} \right) = \left( \begin{array}{ccc} 7 & 0 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & 4 \end{array} \right) \left( \begin{array}{c} x \\ y \\ z \end{array} \right) . \end{aligned}$$

It can be shown that the solution set is \(S=\{(0,0,0)\}\).

Let \(z_0\) be randomly selected. In Algorithm 3 of Thong and Vinh (2019), we chose \(\lambda =0.1\), \(\beta _k=1/(k+1)\) and \(\tau _k=1/(k+1)^2\). In Algorithm (26) of Shehu (2016), we chose \(\alpha _k=1/k\), \(\beta _k=k/(2k+1)\) and \(r_k=0.1\). Take \(\Vert z_k\Vert \le 0.005\) as the stopping criterion.

For Examples 4.2 and 4.3, we take \(\lambda =0.2\), \(\beta _k=0.5\), \(\alpha _k=\frac{1}{25k}\) in Algorithm (9) and \(\beta =0.2\) in Algorithm 3 of Thong and Vinh (2019).
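Since A and B in this example are linear and positive definite, their resolvents are explicit matrix inverses. The sketch below runs an iteration of the form used in method (9), with \(z_{k+1}=(1-\beta _k)y_k+\beta _k Ty_k\) and \(T=\frac{1}{2}I+\frac{1}{2}R^\lambda _A\circ R^\lambda _B\) as in the proof of Theorem 3.6; the inertial factor and iteration count are illustrative choices, while \(\lambda\), \(\beta _k\) and \(\alpha _k\) follow the values stated above:

```python
import numpy as np

A = np.diag([8.0, 5.0, 10.0])
B = np.diag([7.0, 6.0, 4.0])
lam = 0.2
I = np.eye(3)

J_A = np.linalg.inv(I + lam * A)     # resolvent (I + lam A)^{-1}
J_B = np.linalg.inv(I + lam * B)
R_A = 2.0 * J_A - I                  # reflected resolvents
R_B = 2.0 * J_B - I

def T(y):
    # Relaxed Douglas-Rachford map T = (1/2) I + (1/2) R_A o R_B.
    return 0.5 * y + 0.5 * R_A @ (R_B @ y)

rng = np.random.default_rng(0)
z0 = rng.standard_normal(3)          # random starting point
z_prev, z = z0.copy(), z0.copy()
for k in range(1, 301):
    alpha_k = 1.0 / (25.0 * k)       # anchor weight from the experiments
    theta_k, beta_k = 0.3, 0.5       # illustrative inertia; relaxation
    y = alpha_k * z0 + (1 - alpha_k) * z + theta_k * (z - z_prev)
    z_prev, z = z, (1 - beta_k) * y + beta_k * T(y)
# z approaches the unique zero (0, 0, 0) of A + B.
```

Here the fixed points of \(R^\lambda _A\circ R^\lambda _B\) correspond, via \(J^\lambda _B\), to the zeros of A + B, so the iterate is driven toward the origin.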

Table 2 Comparing the change of \(\theta _k\) under the same initial value for Example 4.2
Fig. 2

Comparison of three algorithms

We compared Algorithm (9), Algorithm 3 in Thong and Vinh (2019) and Algorithm (26) in Shehu (2016). Figure 2 shows that the performance of Algorithm (9) is better than that of the other two algorithms.

Example 4.3

Let us consider the well-known \(\ell _1\)-regularized least squares problem, which consists of finding a sparse solution to an underdetermined linear system:

$$\begin{aligned} \min \frac{1}{2}\Vert Dx-b\Vert ^2_2+\rho \Vert x\Vert _1, \end{aligned}$$
(47)

where \(D \in \mathbb {R}^{m \times n}\) and \(b\in \mathbb {R}^m\). In this case,

$$\begin{aligned} J^{\lambda }_A(x)=(D^TD+ \lambda ^{-1}I)^{-1}(D^Tb+ \lambda ^{-1}x) \end{aligned}$$

while

$$\begin{aligned} J^{\lambda }_B(x)=\big (\mathrm {sign}(x_i)\cdot \max \{0,|x_i|-\lambda \rho \}\big )_{i=1}^{n}. \end{aligned}$$
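Both resolvents admit a direct implementation. The sketch below uses random data of illustrative (smaller) size than in the experiments, and checks the defining identity of \(J^{\lambda }_A\):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 20, 50                        # illustrative sizes (the paper uses 100 x 1000)
lam, rho = 0.5, 0.5                  # illustrative parameter choices
D = rng.standard_normal((m, n))
b = rng.standard_normal(m)

def J_A(x):
    # J^lam_A(x) = (D^T D + lam^{-1} I)^{-1} (D^T b + lam^{-1} x)
    return np.linalg.solve(D.T @ D + np.eye(n) / lam, D.T @ b + x / lam)

def J_B(x):
    # Componentwise soft thresholding: sign(x_i) * max{0, |x_i| - lam * rho}.
    return np.sign(x) * np.maximum(np.abs(x) - lam * rho, 0.0)

# Sanity check: u = J_A(x) satisfies the resolvent identity
# u + lam * D^T (D u - b) = x.
x = rng.standard_normal(n)
u = J_A(x)
residual = u + lam * D.T @ (D @ u - b) - x
```

In practice one would factor \(D^TD+\lambda ^{-1}I\) once (e.g. by Cholesky) rather than solving a fresh linear system at every iteration.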

We remark that there is well-developed software for solving problem (47), for example SPGL1 (van den Berg and Friedlander 2007; Lorenz 2013), which is based on a projected gradient method, and FISTA (Beck and Teboulle 2009), but a comparison with such solvers is beyond the scope of this paper. Our interest here is to demonstrate the efficiency of our proposed method (9) using problem (47).

We generate random problems using different choices of \(\lambda\) for \(m=100\) and \(n=1000\). In Algorithm 3 of Thong and Vinh (2019) and Algorithm (26) of Shehu (2016), we chose \(\rho =1\) and \(\lambda =1.9/\max (\mathrm {eig}(D^TD))\), and in Algorithm (9), we chose \(\rho =0.5\). In addition, we set \(r_k=0.2\) in Algorithm (26) of Shehu (2016).

Table 3 Comparing the change of \(\theta _k\) under the same initial value for Example 4.3
Fig. 3

Comparison of three algorithms

Table 3 shows that Algorithm (9) performs best when \(\theta _k=0.33\). The numerical results are depicted in Fig. 3, which illustrates that the performance of Algorithm (9) is better than that of the other two algorithms.

Remark 4.4

  1. (a)

It can be seen from the numerical examples that Algorithm (9) outperforms the methods in Shehu (2016) and Thong and Vinh (2019) (see Figs. 2, 3) for strong convergence for sums of maximal monotone operators. Furthermore, the addition of the inertial term accelerates the proposed method: Algorithm (9) converges faster than the non-inertial case \(\theta _k=0\) (see Tables 1, 2, 3). Also, in our examples the optimal choice of \(\theta _k\) is close to the upper bound \(\frac{1}{3}\).

  2. (b)

Algorithm (9) is sensitive to the choice of the initial point \(z_0\), as can be seen in our examples in Tables 1, 2 and 3.

Remark 4.5

  1. (a)

We point out that there are different strategies in the current literature to enforce strong convergence of proximal-like algorithms (in particular, DR splitting); see, e.g., Solodov and Svaiter (2000) and Hirstoaga (2006). In this regard, the results of Hirstoaga (2006) are concerned with "anchor-point" algorithms of the kind employed in our proposed method (9). However, Algorithm 2.1 of Hirstoaga (2006) contains no inertial extrapolation term \(\theta _k(z_k-z_{k-1})\), which has been shown in the literature to speed up the convergence of the non-inertial counterparts of most optimization methods. Hence, whenever \(\theta _k\ne 0\) (in this paper, we assume \(0 \le \theta _k \le \theta <\frac{1}{3}\)), our method (9) does not reduce to Algorithm 2.1 of Hirstoaga (2006) applied to the splitting operator of Eckstein and Bertsekas (1992). As confirmed by our numerical examples in Sect. 4, our method (9) outperforms Algorithm 2.1 of Hirstoaga (2006) applied to this splitting operator. Also, our method of proof differs from that given in Hirstoaga (2006).

  2. (b)

The purpose of our numerical examples in Sect. 4 is to demonstrate the implementation and effectiveness of our proposed method (9). As discussed in MacNamara and Strang (2016) and related chapters of that book, applications of our method (9) to problems arising from wireless communications, imaging, networking, finance, hemodynamics, free-surface flows, and other science and engineering problems in infinite-dimensional Hilbert spaces will be discussed separately as a future project. \(\Diamond\)

5 Final remarks

In this paper we propose a Douglas–Rachford splitting method with an inertial extrapolation step and give a strong convergence analysis of the method. The method is applicable to a general class of maximal monotone operators, and no uniform monotonicity is assumed for any of the involved operators. Furthermore, the analysis of the algorithm is carried out under the natural condition that the inertial factor \(\theta _k\) is monotone non-decreasing and bounded away from 1/3. Some numerical illustrations are given to test the efficiency and implementation of the proposed scheme. The results obtained in this paper can serve as the strong convergence counterpart of the weak convergence results already obtained for inertial Douglas–Rachford splitting methods (Bauschke and Combettes 2011; Beck and Teboulle 2009; Boţ et al. 2015; Lorenz and Pock 2015; Thong and Vinh 2019) in the literature.

Our future projects include the following:

  • to modify the proposed method (9) in this paper so that the bound on the inertial factor \(\theta _k\) can exceed 1/3, possibly leading to faster convergence; and

  • to obtain the rate of convergence of method (9). As far as we know, this has not been obtained before in the literature.