1 Introduction

Let E be a real Banach space with norm \(\Vert \cdot \Vert \). We denote by \(E^*\) the dual of E and by \(\langle f,x\rangle \) the value of \(f\in E^*\) at \(x\in E\). Let \(B:E\rightarrow 2^{E^*}\) be a maximal monotone operator and \(A:E\rightarrow E^*\) a Lipschitz continuous monotone operator. We consider the following inclusion problem: find \(x\in E\) such that

$$\begin{aligned} 0 \in (A+B)x. \end{aligned}$$
(1)

Throughout this paper, we denote the solution set of the inclusion problem (1) by \((A+B)^{-1}(0)\).

The inclusion problem (1) contains, as special cases, the convexly constrained linear inverse problem, the split feasibility problem, the convexly constrained minimization problem, fixed point problems, variational inequalities, the Nash equilibrium problem in noncooperative games, and many more. See, for instance, [11, 15, 28, 33, 35, 36] and the references therein.

A popular method for solving problem (1) in real Hilbert spaces is the well-known forward-backward splitting method introduced by Passty [35] and Lions and Mercier [28]. The method is formulated as

$$\begin{aligned} x_{n+1}=(I+\lambda _n B)^{-1}(I-\lambda _n A)x_n, ~~\lambda _n>0, \end{aligned}$$
(2)

under the condition that \(Dom(B) \subset Dom(A)\). It was shown (see, for example, [11]) that weak convergence of (2) requires quite restrictive assumptions on A and B, such as that the inverse of A is strongly monotone, or that B is Lipschitz continuous and monotone and the operator \(A + B\) is strongly monotone on Dom(B). Tseng [48] weakened these assumptions by adding an extra forward step to each iteration of (2) (the resulting scheme is called Tseng's splitting algorithm) and obtained a weak convergence result in real Hilbert spaces. Quite recently, Gibali and Thong [18] obtained a strong convergence result by modifying Tseng's splitting algorithm in real Hilbert spaces.
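For orientation, the following is a minimal numerical sketch of iteration (2) in the Hilbert space \(\mathbb {R}^n\), where the resolvent \((I+\lambda B)^{-1}\) is a proximal map. We take \(B=\partial \Vert \cdot \Vert _1\) (whose resolvent is soft-thresholding) and \(A=\nabla g\) for a smooth convex quadratic g, so that A is cocoercive and (2) is known to converge; the matrix M, the vector b and all parameter values below are illustrative assumptions, not data from this paper.

```python
import numpy as np

# Sketch of the forward-backward iteration (2) in R^n (a Hilbert space):
# B = subdifferential of ||.||_1, so (I + lam*B)^{-1} is soft-thresholding;
# A = gradient of g(x) = 0.5*||Mx - b||^2, hence Lipschitz and cocoercive.
def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

rng = np.random.default_rng(0)
M = rng.standard_normal((20, 10))
b = rng.standard_normal(20)
A = lambda x: M.T @ (M @ x - b)            # monotone, Lipschitz with L = ||M^T M||
lam = 1.0 / np.linalg.norm(M.T @ M, 2)     # fixed step-size below 1/L

x = np.zeros(10)
for _ in range(500):
    x = soft_threshold(x - lam * A(x), lam)    # x_{n+1} = (I + lam B)^{-1}(I - lam A)x_n
print("fixed-point residual:", np.linalg.norm(x - soft_threshold(x - lam * A(x), lam)))
```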

In this paper, we extend Tseng's result [48] to Banach spaces. We first prove weak convergence of the sequence generated by our proposed method, assuming that the duality mapping is weakly sequentially continuous. This weak convergence result generalizes Theorem 3.4 of [48]. We next prove a strong convergence result for problem (1) under some mild assumptions, which extends Theorems 1 and 2 of [18] to Banach spaces. Finally, we apply our convergence results to the composite convex minimization problem in Banach spaces.

2 Preliminaries

In this section, we define some concepts and state a few basic results that we will use in our subsequent analysis. Let \(S_E\) be the unit sphere of E and \(B_E\) the closed unit ball of E.

Let \(\rho _E:[0,\infty )\rightarrow [0,\infty )\) be the modulus of smoothness of E defined by

$$\begin{aligned} \rho _E(t):=\sup \left\{ \frac{1}{2}(\Vert x+y\Vert +\Vert x-y\Vert )-1:\,x\in S_E,\,\Vert y\Vert \le t\right\} . \end{aligned}$$

A Banach space E is said to be 2-uniformly smooth if there exists a constant \(c>0\) such that \(\rho _E(t)\le ct^2\) for all \(t>0\). The space E is said to be smooth if

$$\begin{aligned} \lim _{t\rightarrow 0} \frac{\Vert x+ty\Vert -\Vert x\Vert }{t} \end{aligned}$$
(3)

exists for all \(x,y\in S_E\). The space E is said to be uniformly smooth if the limit (3) is attained uniformly in \(x,y\in S_E\). It is well known that if E is 2-uniformly smooth, then E is uniformly smooth. The space E is said to be strictly convex if \(\Vert (x + y)/2\Vert <1\) whenever \(x,y\in S_E\) and \(x\ne y\). It is said to be uniformly convex if \(\delta _E(\epsilon ) >0\) for all \(\epsilon \in (0,2]\), where \( \delta _E\) is the modulus of convexity of E defined by

$$\begin{aligned} \delta _E(\epsilon ):=\inf \left\{ 1-\Big |\Big |\frac{x+y}{2}\Big |\Big |\mid x,y\in B_E, \Vert x-y\Vert \ge \epsilon \right\} \end{aligned}$$
(4)

for all \(\epsilon \in [0,2]\). The space E is said to be 2-uniformly convex if there exists \(c >0\) such that \(\delta _E(\epsilon )\ge c\epsilon ^2\) for all \(\epsilon \in [0,2]\). Clearly, every 2-uniformly convex Banach space is uniformly convex. It is known that all Hilbert spaces are uniformly smooth and 2-uniformly convex, and that the Lebesgue spaces \(L_p\) are uniformly smooth and 2-uniformly convex whenever \(1<p\le 2\) (see [7]).

The normalized duality mapping of E into \(E^*\) is defined by

$$\begin{aligned} Jx:=\{x^* \in E^*\mid \langle x^*,x\rangle =\Vert x^*\Vert ^2=\Vert x\Vert ^2\} \end{aligned}$$

for all \(x\in E\). The normalized duality mapping J has the following properties (see, e.g., [47]):

  • if E is reflexive and strictly convex with a strictly convex dual space \(E^*\), then J is a single-valued, one-to-one and onto mapping. In this case, we can define the single-valued mapping \(J^{-1}:E^*\rightarrow E\), and we have \(J^{-1}=J_{*}\), where \(J_{*}\) is the normalized duality mapping on \(E^*\);

  • if E is uniformly smooth, then J is uniformly norm-to-norm continuous on each bounded subset of E.

Let us recall from [1, 13] some examples of the normalized duality mapping J in the uniformly convex and uniformly smooth Banach spaces \(\ell _p\) and \(L_p\), \(1< p <\infty \) (a small numerical check of the \(\ell _p\) formula follows the list).

  • For \(\ell _p: Jx=\Vert x\Vert _{\ell _p}^{2-p}y \in \ell _q\), where \( x=(x_j)_{j\ge 1}\) and \(y=(x_j|x_j|^{p-2})_{j\ge 1}\), \(\frac{1}{p}+\frac{1}{q}=1\).

  • For \(L_p: Jx=\Vert x\Vert _{L_p}^{2-p}|x|^{p-2}x \in L_q\), \(\frac{1}{p}+\frac{1}{q}=1\).
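The \(\ell _p\) formula can be checked numerically on a finite-dimensional truncation. The following sketch, with illustrative data, verifies the defining identities \(\langle Jx,x\rangle =\Vert x\Vert _p^2\) and \(\Vert Jx\Vert _q=\Vert x\Vert _p\).

```python
import numpy as np

# Numerical check of the l_p duality-map formula Jx = ||x||^{2-p} (x_j |x_j|^{p-2})_j
# on a finite-dimensional vector (zero entries need separate care when p < 2).
def duality_map_lp(x, p):
    norm = np.linalg.norm(x, ord=p)
    if norm == 0.0:
        return np.zeros_like(x)
    return norm ** (2 - p) * x * np.abs(x) ** (p - 2)

x, p = np.array([1.0, -2.0, 0.5]), 1.5
q = p / (p - 1)
Jx = duality_map_lp(x, p)
print(np.dot(Jx, x), np.linalg.norm(x, p) ** 2)     # <Jx, x> = ||x||_p^2
print(np.linalg.norm(Jx, q), np.linalg.norm(x, p))  # ||Jx||_q = ||x||_p
```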

Now, we recall some fundamental and useful results.

Lemma 2.1

The space E is 2-uniformly convex if and only if there exists \(\mu _E \ge 1\) such that

$$\begin{aligned} \frac{\Vert x+y\Vert ^2+\Vert x-y\Vert ^2}{2}\ge \Vert x\Vert ^2+\Vert \mu ^{-1}_E y\Vert ^2 \end{aligned}$$
(5)

for all \(x,y\in E\).

The minimum value of the set of all \(\mu _E \ge 1\) satisfying (5) for all \(x,y\in E\) is denoted by \(\mu \) and is called the 2-uniform convexity constant of E; see [5]. It is obvious that \(\mu =1\) whenever E is a Hilbert space.

Lemma 2.2

([4]). Let \(\displaystyle \frac{1}{p}+\frac{1}{q}=1,\ p,q>1\). The space E is q-uniformly smooth if and only if its dual \(E^*\) is p-uniformly convex.

Lemma 2.3

([51]). Let E be a real Banach space. The following are equivalent:

  (1) E is 2-uniformly smooth;

  (2) there exists a constant \(\kappa >0\) such that, for all \(x,y\in E\),

    $$\begin{aligned} \Vert x+y\Vert ^2\le \Vert x\Vert ^2+2\langle y,J(x)\rangle +2\kappa ^2\Vert y\Vert ^2, \end{aligned}$$

    where \(\kappa \) is the 2-uniform smoothness constant. In Hilbert spaces, \(\kappa =\frac{1}{\sqrt{2}}\).

Definition 2.4

Let \( X \subseteq E \) be a nonempty subset. Then a mapping \( A: X \rightarrow E^* \) is called

  (a) strongly monotone with modulus \(\gamma >0\) on X if

    $$\begin{aligned} \langle Ax-Ay, x-y\rangle \ge \gamma \Vert x-y\Vert ^2, \quad \forall x,y \in X. \end{aligned}$$

    In this case, we say that A is \(\gamma \)-strongly monotone;

  (b) monotone on X if

    $$\begin{aligned} \langle Ax-Ay, x-y\rangle \ge 0, \quad \forall x,y \in X; \end{aligned}$$

  (c) Lipschitz continuous on X if there exists a constant \( L > 0 \) such that \( \Vert Ax - Ay \Vert \le L \Vert x-y \Vert \) for all \( x, y \in X \).

We give some examples of monotone operators in Banach spaces, taken from [2].

Example 2.5

Let \(G\subset \mathbb {R}^n\) be a bounded measurable domain. Define the operator \(A:L^p(G)\rightarrow L^q(G)\), \(\frac{1}{p}+\frac{1}{q} = 1\), \(p>1\), by the formula

$$\begin{aligned} Ay(x):=\varphi (x,|y(x)|^{p-1})|y(x)|^{p-2}y(x),~~x \in G, \end{aligned}$$

where the function \(\varphi (x,s)\) is measurable in x for every \(s\in [0,\infty )\), continuous in s for almost all \(x\in G\), and satisfies \(|\varphi (x,s)|\le M\) for all \(s\in [0,\infty )\) and almost all \(x\in G\). Observe that the operator A indeed maps \(L^p(G)\) into \(L^q(G)\) because of the inequality \(|Ay|\le M|y|^{p-1}\). It can then be shown that A is a monotone map on \(L^p(G)\).

Let us consider another example from quantum mechanics.

Example 2.6

Define the operator

$$\begin{aligned} Au:=-a^2\triangle u+(g(x)+b)u(x)+u(x)\int _{\mathbb {R}^3} \frac{u^2(y)}{|x-y|}dy, \end{aligned}$$

where \(\triangle :=\sum _{i=1}^3 \frac{\partial ^2}{\partial x_i^2}\) is the Laplacian in \(\mathbb {R}^3\), a and b are constants, and \(g(x)=g_0(x)+g_1(x)\) with \(g_0 \in L^{\infty }(\mathbb {R}^3)\) and \(g_1 \in L^2(\mathbb {R}^3)\). Write \(A:= L+B\), where the operator L is the linear part of A (it is the Schrödinger operator) and B is defined by the last term. It is known that B is a monotone operator on \(L^2(\mathbb {R}^3)\) (see p. 23 of [2]), and this implies that \(A:L^2(\mathbb {R}^3)\rightarrow L^2(\mathbb {R}^3)\) is also a monotone operator.

Example 2.7

This example concerns perhaps the most famous monotone operator, namely the p-Laplacian \(-\mathrm{div}(|\nabla u|^{p-2}\nabla u): W^1_0(L_p(\Omega ))\rightarrow \Big (W^1_0(L_p(\Omega ))\Big )^*\), where \(u:\Omega \rightarrow \mathbb {R}\) is a real function defined on a domain \(\Omega \subset \mathbb {R}^n\). The p-Laplacian is monotone for \(1<p<\infty \) (in fact, strongly monotone for \(p \ge 2\) and strictly monotone for \(1< p < 2\)). It is an extremely important model in many topical applications and certainly played an important role in the development of the theory of monotone operators.

Definition 2.8

A multi-valued operator \(B:E \rightarrow 2^{E^*}\) with graph \(G(B)= \{(x,x^*): x^* \in Bx\}\) is said to be monotone if, for any \(x,y \in D(B)\), \(x^* \in Bx\) and \(y^* \in By\),

$$\begin{aligned} \langle x-y,x^*-y^*\rangle \ge 0. \end{aligned}$$

A monotone operator B is said to be maximal if \(B =S\) whenever \(S:E \rightarrow 2^{E^*}\) is monotone and \(G(B)\subset G(S)\).

Let E be a reflexive, strictly convex and smooth Banach space and let \(B:E \rightarrow 2^{E^*}\) be a maximal monotone operator. Then for each \(r > 0\) and \(x \in E\), there corresponds a unique element \(x_r \in E\) such that

$$\begin{aligned} Jx\in Jx_r+rBx_r. \end{aligned}$$

We call this unique element \(x_r\) the resolvent of B and denote it by \(J^B_rx\); in other words, \(J_r^B=(J +rB)^{-1}J\) for all \(r > 0\). It is easy to show that \(B^{-1}0 = F(J^B_r)\) for all \(r > 0\), where \(F(J^B_r)\) denotes the set of all fixed points of \(J^B_r\). For each \(r > 0\), we can also define the Yosida approximation of B by \(A_r=\frac{J-JJ^B_r}{r}\). For more details, see, for instance, [6].
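In a Hilbert space \(J=I\) and the resolvent reduces to the classical \((I+rB)^{-1}\). As a minimal sketch with illustrative data: for \(B=\partial \Vert \cdot \Vert _1\) the resolvent is componentwise soft-thresholding (a standard fact), its unique fixed point is 0, consistent with \(B^{-1}0 = F(J^B_r)\), and the Yosida approximation clips \(z/r\) to \([-1,1]\).

```python
import numpy as np

# Hilbert-space sketch (J = I): resolvent of B = subdifferential of ||.||_1
# is soft-thresholding; its fixed-point set is {0} = B^{-1}(0).
def resolvent_l1(z, r):
    return np.sign(z) * np.maximum(np.abs(z) - r, 0.0)

z, r = np.array([3.0, -0.2, 0.0]), 0.5
print(resolvent_l1(z, r))             # [ 2.5  0.   0. ]
print(resolvent_l1(np.zeros(3), r))   # 0 is a fixed point: [0. 0. 0.]
print((z - resolvent_l1(z, r)) / r)   # Yosida approximation: clips z/r to [-1, 1]
```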

Suppose E is a smooth Banach space. We introduce the Lyapunov functional \(\phi :E \times E\rightarrow \mathbb {R}\) studied in [1, 25, 38], defined by:

$$\begin{aligned} \phi (x,y):=\Vert x\Vert ^2-2\langle x,Jy\rangle +\Vert y\Vert ^2. \end{aligned}$$
(6)

Clearly,

$$\begin{aligned} \phi (x,y)\ge (\Vert x\Vert -\Vert y\Vert )^2 \ge 0. \end{aligned}$$

The following lemma gives some identities for the functional \(\phi \) defined in (6).

Lemma 2.9

(See [1, 3]). Let E be a real uniformly convex, smooth Banach space. Then, the following identities hold:

  (i)
    $$\begin{aligned} \phi (x,y)= \phi (x,z)+ \phi (z,y)+2\langle x-z, Jz-Jy\rangle , \quad \forall x,y,z\in E. \end{aligned}$$
  (ii)
    $$\begin{aligned} \phi (x,y)+\phi (y,x) = 2\langle x-y, Jx-Jy\rangle , \quad \forall x,y\in E. \end{aligned}$$

Let \(C \subseteq E \) be a nonempty, closed and convex subset of a real, uniformly convex and smooth Banach space E. Following [1], the generalized projection \(\Pi _C:E\rightarrow C\) is defined by \(\Pi _C(x):=\mathop {\mathrm{argmin}}\nolimits _{y \in C}\phi (y,x)\), \(x\in E\); this mapping will be used throughout. Let us also introduce the functional \(V:E \times E^{*}\rightarrow \mathbb {R}\) defined by the formula:

$$\begin{aligned} V(x,y):=\Vert x\Vert ^2_E-2\langle x,y\rangle +\Vert y\Vert ^2_{E^*}. \end{aligned}$$
(7)

Then, it is easy to see that

$$\begin{aligned} V(y,x)=\phi (y,J^{-1}x),\quad \forall x \in E^*,y \in E. \end{aligned}$$

In the next lemma, we state a property of the functional \(V(\cdot ,\cdot )\) defined in (7).

Lemma 2.10

([1]).

$$\begin{aligned} V(x,x^*)+2\langle J^{-1}x^*-x,y^*\rangle \le V(x,x^*+y^*),\quad \forall x \in E,\ x^*,y^*\in E^*. \end{aligned}$$

The lemma that follows is stated and proven in [3, Lemma 2.2].

Lemma 2.11

Suppose that E is a 2-uniformly convex Banach space. Then there exists \(\mu \ge 1\) such that

$$\begin{aligned} \frac{1}{\mu }\Vert x-y\Vert ^2\le \phi (x,y)\quad \forall x,y\in E. \end{aligned}$$

The following lemma was given in [21].

Lemma 2.12

Let S be a nonempty, closed convex subset of a uniformly convex, smooth Banach space E. Let \(\{x_n\}\) be a sequence in E. Suppose that, for all \(u\in S\),

$$\begin{aligned} \phi (u,x_{n+1}) \le \phi (u,x_n),\quad \forall n \ge 1. \end{aligned}$$

Then \(\{\Pi _S(x_n)\}\) is a Cauchy sequence.

The following property of \(\phi (.,.)\) was given in [1, Theorem 7.5] (see also [16, 17]).

Lemma 2.13

Let E be a uniformly smooth Banach space which is also uniformly convex. If \(\Vert x\Vert \le c, \Vert y\Vert \le c\), then

$$\begin{aligned} 2L_1^{-1}c^2\delta _E\Big (\frac{\Vert x-y\Vert }{4c}\Big )\le \phi (y,x)\le 4L_1^{-1}c^2\rho _E\Big (\frac{4\Vert x-y\Vert }{c}\Big ), \end{aligned}$$

where \(L_1\) (\(1< L_1 < 3.18\)) is Figiel's constant.

We next recall some existing results from the literature to facilitate our proof of strong convergence. The first is taken from [31].

Lemma 2.14

Let \(\{a_n\}\) be a sequence of real numbers such that there exists a subsequence \(\{n_i\}\) of \(\{n\}\) with \(a_{n_i} < a_{{n_i}+1}\) for all \(i\in \mathbb {N}\). Then there exists a nondecreasing sequence \(\{m_k\}\subset \mathbb {N}\) such that \(m_k\rightarrow \infty \) and the following properties are satisfied by all (sufficiently large) numbers \(k\in \mathbb {N}\):

$$\begin{aligned} a_{m_k} \le a_{{m_k}+1}\quad \mathrm{and}\quad a_k\le a_{{m_k}+1}. \end{aligned}$$

In fact, \(m_k=\max \{j\le k : a_j<a_{j+1}\}\).
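A toy illustration of the index \(m_k=\max \{j\le k : a_j<a_{j+1}\}\) on an arbitrary finite sequence, purely to make the construction concrete; the sequence below is an illustrative assumption.

```python
# Toy illustration of Lemma 2.14: m_k = max{ j <= k : a_j < a_{j+1} } (1-indexed).
a = [5, 3, 4, 2, 6, 1, 0, 7]   # a_1, ..., a_8

def m(k):
    cand = [j for j in range(1, k + 1) if j < len(a) and a[j - 1] < a[j]]
    return max(cand) if cand else None

for k in range(2, 8):
    mk = m(k)
    # Check a_{m_k} <= a_{m_k + 1} and a_k <= a_{m_k + 1} (both True here).
    print(k, mk, a[mk - 1] <= a[mk], a[k - 1] <= a[mk])
```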

Lemma 2.15

([52]). Let \(\{a_n\}\) be a sequence of nonnegative real numbers satisfying the following relation:

$$\begin{aligned} a_{n+1}\le (1-\alpha _n)a_n+\alpha _n\sigma _n+\gamma _n,~~n \ge 1, \end{aligned}$$

where

  (a) \(\{\alpha _n\}\subset [0,1],\) \( \sum _{n=1}^{\infty } \alpha _n=\infty ;\)

  (b) \(\limsup _{n\rightarrow \infty } \sigma _n \le 0\);

  (c) \(\gamma _n \ge 0 \ (n \ge 1),\) \( \sum _{n=1}^{\infty } \gamma _n <\infty .\)

Then, \(a_n\rightarrow 0\) as \(n\rightarrow \infty \).

The following lemma is needed in our proof to show that the weak limit point is a solution to the inclusion problem (1).

Lemma 2.16

([6]). Let \(B:E \rightarrow 2^{E^*} \) be a maximal monotone mapping and \( A: E \rightarrow E^* \) be a Lipschitz continuous and monotone mapping. Then the mapping \(A+B\) is a maximal monotone mapping.

The following result gives an equivalence of fixed point problem and problem (1).

Lemma 2.17

Let \(B:E \rightarrow 2^{E^*} \) be a maximal monotone mapping and \( A: E \rightarrow E^* \) be a mapping. Define a mapping

$$\begin{aligned} T_\lambda x:= J_{\lambda }^B\big (J^{-1}(Jx - \lambda Ax)\big ),\quad x\in E,\ \lambda >0. \end{aligned}$$

Then \(F(T_\lambda )=(A+B)^{-1}(0),\) where \(F(T_\lambda )\) denotes the set of all fixed points of \(T_\lambda \).

Proof

For \(\lambda >0\), observe that

$$\begin{aligned} x \in F(T_\lambda )\Leftrightarrow & {} x=T_\lambda x=J_{\lambda }^B\big (J^{-1}(Jx - \lambda Ax)\big ) \\\Leftrightarrow & {} x = (J+\lambda B)^{-1} (Jx - \lambda Ax) \\\Leftrightarrow & {} Jx - \lambda Ax \in Jx+\lambda Bx \\\Leftrightarrow & {} 0 \in \lambda (Ax+Bx)\\\Leftrightarrow & {} 0 \in Ax+Bx \\\Leftrightarrow & {} x \in (A+B)^{-1}(0). \end{aligned}$$

\(\square \)

We shall adopt the following notation in this paper:

  • \(x_n\rightarrow x\) means that \(\{x_n\}\) converges strongly to x.

  • \(x_n\rightharpoonup x\) means that \(\{x_n\}\) converges weakly to x.

3 Approximation Method

In this section, we propose our method and state the conditions under which we obtain the desired convergence results. First, we give the conditions governing the operators and the sequence of parameters below.

Assumption 3.1

  (a) E is a real 2-uniformly convex Banach space which is also uniformly smooth.

  (b) \(B:E \rightarrow 2^{E^*} \) is a maximal monotone operator and \( A: E \rightarrow E^* \) is a monotone and L-Lipschitz continuous operator.

  (c) The solution set \( (A+B)^{-1}(0) \) of the inclusion problem (1) is nonempty.

Throughout this paper, we assume that the duality mapping J and the resolvent \(J_{\lambda _n}^B:=(J+\lambda _nB)^{-1} J\) of the maximal monotone operator B are easy to compute.

Assumption 3.2

Suppose the sequence \( \{ \lambda _n \}_{n=1}^\infty \) of step-sizes satisfies the following condition:

$$\begin{aligned} 0<a\le \lambda _n\le b<\displaystyle \frac{1}{\sqrt{2\mu }\kappa L} \end{aligned}$$

where

  • \(\mu \) is the 2-uniform convexity constant of E;

  • \(\kappa \) is the 2-uniform smoothness constant of \(E^*\);

  • L is the Lipschitz constant of A.

Assumption 3.2 is satisfied, e.g., by \( \lambda _n = a+\frac{n}{n+1}(b-a)\) for all \( n \ge 1\), where \(0<a\le b<\frac{1}{\sqrt{2\mu }\kappa L}\) are fixed.

We now give our proposed method below.

Algorithm 3.3

Step 0 Let Assumptions 3.1 and 3.2 hold. Let \( x_1 \in E \) be a given starting point. Set \( n := 1 \).

Step 1 Compute \(y_n:=J_{\lambda _n}^B\big (J^{-1}(Jx_n - \lambda _nAx_n)\big )\). If \(x_n-y_n=0\): STOP.

Step 2 Compute

$$\begin{aligned} x_{n+1} = J^{-1}[Jy_n - \lambda _n(Ay_n-Ax_n)]. \end{aligned}$$
(8)

Step 3 Set \( n \leftarrow n+1 \), and go to Step 1.

We observe that in real Hilbert spaces, the duality mapping J becomes the identity mapping and our Algorithm 3.3 reduces to the algorithm proposed by Tseng in [48].
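To make this concrete, here is a minimal Hilbert-space sketch of Algorithm 3.3 (so \(J=I\)), applied to a variational inequality: \(B=N_C\) with C a box, so the resolvent is the metric projection onto C. The affine monotone operator, the box and the step size are illustrative assumptions; note that with \(\mu =1\) and \(\kappa =1/\sqrt{2}\) the step-size bound of Assumption 3.2 is exactly \(\lambda _n<1/L\).

```python
import numpy as np

# Hilbert-space specialization of Algorithm 3.3 (J = I): Tseng's splitting for
# 0 in (A + N_C)x, a variational inequality over the box C = [-1, 1]^10.
rng = np.random.default_rng(1)
S = rng.standard_normal((10, 10))
M = S - S.T + 0.1 * np.eye(10)             # monotone: x^T M x = 0.1 ||x||^2 >= 0
q = rng.standard_normal(10)
A = lambda x: M @ x + q                    # Lipschitz monotone, L = ||M||_2
L = np.linalg.norm(M, 2)
proj_C = lambda z: np.clip(z, -1.0, 1.0)   # resolvent of N_C = metric projection

lam = 0.9 / L                              # lambda_n < 1/L (mu = 1, kappa = 1/sqrt(2))
x = np.zeros(10)
for n in range(2000):
    y = proj_C(x - lam * A(x))             # Step 1
    if np.linalg.norm(x - y) < 1e-10:      # stopping rule: x_n = y_n
        break
    x = y - lam * (A(y) - A(x))            # Step 2, forward correction (8)
print(n, np.linalg.norm(x - proj_C(x - lam * A(x))))
```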

Note that both sequences \(\{y_n\}\) and \(\{x_n\}\) are in E. Furthermore, by Lemma 2.17, we have that if \(x_n=y_n\), then \(x_n\) is a solution of problem (1).

To the best of our knowledge, the proposed Algorithm 3.3 is the only known algorithm which can solve the monotone inclusion problem (1) in this setting without inverse-strong monotonicity of A. We now consider various special cases of Algorithm 3.3.

  • When \(A=0\) in Algorithm 3.3, then Algorithm 3.3 reduces to the methods proposed in [6, 20, 24, 26,27,28, 32, 35, 38, 39, 43]. In this case, the assumption that E is a 2-uniformly convex and uniformly smooth Banach space is not needed; in fact, convergence can then be obtained in reflexive Banach spaces. However, we do not know whether the convergence of Algorithm 3.3 for problem (1) can be obtained in a more general reflexive Banach space.

  • When \(B=N_C\), the normal cone to a closed and convex subset C of E (\(N_C(x):=\{x^* \in E^*:\langle y-x,x^*\rangle \le 0,\ \forall y \in C \}\)), the inclusion problem (1) reduces to the variational inequality problem: find \(x \in C\) such that \( \langle Ax,y-x\rangle \ge 0\) for all \(y \in C\). It is well known that \(N_C=\partial \delta _C\), where \(\delta _C\) is the indicator function of C, defined by \(\delta _C(x)=0\) if \(x \in C\) and \(\delta _C(x)=+\infty \) if \(x \notin C\), and \(\partial (\cdot )\) denotes the subdifferential, defined by \(\partial f(x):=\{x^* \in E^*: f(y)\ge f(x)+\langle x^*, y-x\rangle ~~\forall y \in E\}\) for a proper, lower semicontinuous convex functional f on E. By the theorem of Rockafellar [40, 41], \(N_C=\partial \delta _C\) is maximal monotone. Hence, with \(B=\partial \delta _C\),

    $$\begin{aligned} Jz \in J (J_{\lambda _n}^B z)+\lambda _n \partial \delta _C(J_{\lambda _n}^B z),\quad \forall z \in E. \end{aligned}$$

    This implies that

    $$\begin{aligned} 0\in \partial \delta _C(J_{\lambda _n}^B z)+\frac{1}{\lambda _n}J (J_{\lambda _n}^B z)-\frac{1}{\lambda _n}Jz =\partial \left( \delta _C+\frac{1}{2\lambda _n}\Vert \cdot \Vert ^2-\frac{1}{\lambda _n}\langle \cdot ,Jz\rangle \right) (J_{\lambda _n}^B z). \end{aligned}$$

    Therefore,

    $$\begin{aligned} J_{\lambda _n}^B(z)=\mathop {\mathrm{argmin}}\limits _{{y \in E}}\left\{ \delta _C(y)+\frac{1}{2\lambda _n}\Vert y\Vert ^2-\frac{1}{\lambda _n}\langle y,Jz\rangle \right\} \end{aligned}$$

    and \(y_n\) in Algorithm 3.3 reduces to the following (in Hilbert spaces this argmin is simply the metric projection onto C; see the sketch after this list):

    $$\begin{aligned} y_n= \mathop {\mathrm{argmin}}\limits _{{y \in E}}\left\{ \delta _C(y)+\frac{1}{2\lambda _n}\Vert y\Vert ^2-\frac{1}{\lambda _n}\langle y,Jx_n - \lambda _nAx_n\rangle \right\} . \end{aligned}$$
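In a Hilbert space (\(J=I\)) the argmin above is precisely the metric projection \(P_C\); the brute-force check below, on a two-dimensional box with illustrative data, confirms this.

```python
import numpy as np

# In Hilbert spaces the resolvent formula above is the metric projection:
# argmin_{y in C} { ||y||^2/(2*lam) - <y, z>/lam } = P_C(z). Grid check, C = [-1,1]^2.
lam, z = 0.7, np.array([2.0, -0.3])
g = np.linspace(-1.0, 1.0, 201)
Y = np.array([[u, v] for u in g for v in g])          # grid over the box C
vals = (Y ** 2).sum(axis=1) / (2 * lam) - (Y @ z) / lam
print(Y[vals.argmin()], np.clip(z, -1.0, 1.0))        # both ~ [1.0, -0.3]
```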

However, in implementing our proposed Algorithm 3.3, we assume that the resolvent \((J+\lambda _nB)^{-1} J\) and the duality mapping J are easy to compute. On the other hand, one has to know the Lipschitz constant L of the monotone mapping A, or at least an estimate of it. When the Lipschitz constant cannot be estimated accurately, or is overestimated, the step-sizes \(\lambda _n\) may become too small. This is a drawback of our proposed Algorithm 3.3. One way to overcome this obstacle is to introduce a linesearch; this is done in Algorithm 3.8.

3.1 Convergence Analysis

In this section, we give the convergence analysis of the proposed Algorithm 3.3. First, we establish the boundedness of the sequence of iterates generated by Algorithm 3.3.

Lemma 3.4

Let Assumptions 3.1 and 3.2 hold. Assume that \(x^*\in (A+B)^{-1}(0)\) and let the sequence \( \{x_n\}_{n=1}^\infty \) be generated by Algorithm 3.3. Then \(\{x_n\}\) is bounded.

Proof

By the definition of the Lyapunov functional \(\phi \), we have

$$\begin{aligned} \phi (x^*,x_{n+1})=\,&\phi (x^*,J^{-1}(Jy_n - \lambda _n(Ay_n-Ax_n))) \nonumber \\ =\,&\Vert x^*\Vert ^2-2\langle x^*,JJ^{-1}(Jy_n - \lambda _n(Ay_n-Ax_n))\rangle \nonumber \\&+\Vert J^{-1}(Jy_n - \lambda _n(Ay_n-Ax_n))\Vert ^2\nonumber \\ =\,&\Vert x^*\Vert ^2-2\langle x^*,Jy_n - \lambda _n(Ay_n-Ax_n)\rangle \nonumber \\&+\Vert (Jy_n - \lambda _n(Ay_n-Ax_n))\Vert ^2\nonumber \\ =\,&\Vert x^*\Vert ^2-2\langle x^*,Jy_n\rangle +2\lambda _n\langle x^*, Ay_n-Ax_n\rangle \nonumber \\&+\Vert Jy_n - \lambda _n(Ay_n-Ax_n)\Vert ^2. \end{aligned}$$
(9)

By Lemma 2.2, \(E^*\) is 2-uniformly smooth, and so by Lemma 2.3 (applied in \(E^*\)) we get

$$\begin{aligned} \Vert Jy_n - \lambda _n(Ay_n-Ax_n)\Vert ^2\le \,&\Vert Jy_n\Vert ^2-2\lambda _n\langle Ay_n-Ax_n,y_n\rangle \nonumber \\&+2\kappa ^2\Vert \lambda _n(Ay_n-Ax_n)\Vert ^2. \end{aligned}$$
(10)

Substituting (10) into (9), we get

$$\begin{aligned} \phi (x^*,x_{n+1})\le \,&\Vert Jy_n\Vert ^2 -2 \lambda _n\langle Ay_n-Ax_n,y_n\rangle +2\kappa ^2\Vert \lambda _n(Ay_n-Ax_n)\Vert ^2\nonumber \\&+\Vert x^*\Vert ^2-2\langle x^*,Jy_n\rangle + 2 \lambda _n\langle x^*, Ay_n-Ax_n\rangle \nonumber \\ =\,&\Vert x^*\Vert ^2-2\langle x^*, Jy_n\rangle + \Vert y_n\Vert ^2-2 \lambda _n\langle Ay_n-Ax_n,y_n-x^*\rangle \nonumber \\&+ 2\kappa ^2\Vert \lambda _n(Ay_n-Ax_n)\Vert ^2 \nonumber \\ =\,&\phi (x^*,y_n) -2 \lambda _n\langle Ay_n-Ax_n,y_n-x^*\rangle + 2\kappa ^2\Vert \lambda _n(Ay_n-Ax_n)\Vert ^2. \end{aligned}$$
(11)

Using Lemma 2.9 (i), we get

$$\begin{aligned} \phi (x^*,y_n)=\,&\phi (x^*,x_n) +\phi (x_n,y_n) + 2\langle x^*-x_n, Jx_n-Jy_n\rangle \nonumber \\ =\,&\phi (x^*,x_n) +\phi (x_n,y_n) + 2\langle x_n-x^*, Jy_n-Jx_n\rangle . \end{aligned}$$
(12)

Putting (12) into (11), we get

$$\begin{aligned} \phi (x^*,x_{n+1})\le \,&\phi (x^*,x_n) + \phi (x_n,y_n) + 2\langle x_n-x^*, Jy_n-Jx_n\rangle \nonumber \\&- 2 \lambda _n\langle Ay_n-Ax_n,y_n-x^*\rangle + 2\kappa ^2\Vert \lambda _n(Ay_n-Ax_n)\Vert ^2 \nonumber \\ =\,&\phi (x^*,x_n) + \phi (x_n,y_n) -2\langle y_n-x_n, Jy_n-Jx_n\rangle \nonumber \\&+ 2\langle y_n-x^*, Jy_n-Jx_n\rangle \nonumber \\&- 2 \lambda _n\langle Ay_n-Ax_n,y_n-x^*\rangle + 2\kappa ^2\Vert \lambda _n(Ay_n-Ax_n)\Vert ^2. \end{aligned}$$
(13)

Using Lemma 2.9 (ii), we get

$$\begin{aligned} - \phi (y_n,x_n) + 2\langle y_n-x_n, Jy_n-Jx_n\rangle = \phi (x_n,y_n). \end{aligned}$$
(14)

Substituting (14) into (13), we have

$$\begin{aligned} \phi (x^*,x_{n+1})\le \,&\phi (x^*,x_n) - \phi (y_n,x_n) + 2\langle y_n-x^*, Jy_n-Jx_n\rangle \nonumber \\&- 2 \lambda _n\langle Ay_n-Ax_n,y_n-x^*\rangle \nonumber \\&+ 2\kappa ^2\Vert \lambda _n(Ay_n-Ax_n)\Vert ^2\nonumber \\ =\,&\phi (x^*,x_n) - \phi (y_n,x_n)+ 2\kappa ^2\Vert \lambda _n(Ay_n-Ax_n)\Vert ^2\nonumber \\&-2\langle Jx_n-Jy_n-\lambda _n(Ax_n-Ay_n),y_n-x^*\rangle . \end{aligned}$$
(15)

Since \(y_n= (J+\lambda _nB)^{-1} (Jx_n - \lambda _nAx_n)\), we have \(Jx_n - \lambda _nAx_n \in (J+\lambda _nB)y_n\). Hence there exists \(v_n \in By_n\) such that \(Jx_n - \lambda _nAx_n=Jy_n+\lambda _n v_n\). Therefore

$$\begin{aligned} v_n=\frac{1}{\lambda _n}(Jx_n-Jy_n-\lambda _n Ax_n). \end{aligned}$$
(16)

On the other hand, we know that \(0 \in Ax^*+Bx^*\) and \(Ay_n +v_n \in (A+B)y_n\). Since \(A+B\) is monotone, we obtain

$$\begin{aligned} \langle Ay_n +v_n,y_n-x^*\rangle \ge 0. \end{aligned}$$
(17)

Putting (16) into (17), we get

$$\begin{aligned} \langle Jx_n-Jy_n-\lambda _n(Ax_n-Ay_n),y_n-x^*\rangle \ge 0. \end{aligned}$$
(18)

Now, using (18) in (15), we get

$$\begin{aligned} \phi (x^*,x_{n+1})\le \,&\phi (x^*,x_n) - \phi (y_n,x_n)+ 2\kappa ^2\Vert \lambda _n(Ay_n-Ax_n)\Vert ^2\nonumber \\ \le \,&\phi (x^*,x_n) - \phi (y_n,x_n)+ 2\kappa ^2\lambda _n^2L^2\mu \phi (y_n,x_n)\nonumber \\ =\,&\phi (x^*,x_n) -(1-2\kappa ^2\lambda _n^2L^2\mu )\phi (y_n,x_n). \end{aligned}$$
(19)

Using Assumption 3.2, we get

$$\begin{aligned} \phi (x^*,x_{n+1}) \le \phi (x^*, x_n), \end{aligned}$$
(20)

which shows that \(\lim \phi (x^*, x_n)\) exists and hence, \(\{\phi (x^*, x_n)\}\) is bounded. Therefore \(\{x_n\}\) is bounded. \(\square \)

Definition 3.5

The duality mapping J is weakly sequentially continuous if, for any sequence \(\{x_n\}\subset E\) such that \(x_n\rightharpoonup x\) as \(n\rightarrow \infty \), we have \(Jx_n\rightharpoonup ^* Jx\) as \(n\rightarrow \infty \). It is known that the normalized duality mapping on \(\ell _p\) spaces, \(1< p <\infty \), is weakly sequentially continuous.

We now obtain the weak convergence result of Algorithm 3.3 in the next theorem.

Theorem 3.6

Let Assumptions 3.1 and 3.2 hold. Assume that J is weakly sequentially continuous on E and let the sequence \( \{x_n\}_{n=1}^\infty \) be generated by Algorithm 3.3. Then \(\{x_n\}\) converges weakly to \(z\in (A+B)^{-1}(0)\). Moreover, \(z:=\underset{n\rightarrow \infty }{\lim }\Pi _{(A+B)^{-1}(0)}(x_n)\).

Proof

Let \(x^*\in (A+B)^{-1}(0)\). From (19), we have

$$\begin{aligned} 0\le & {} [1-2\kappa ^2b^2L^2\mu ]\phi (y_n, x_n)\le [1-2\kappa ^2\lambda _n^2L^2\mu ]\phi (y_n, x_n)\nonumber \\\le & {} \phi (x^*, x_n)-\phi (x^*,x_{n+1}). \end{aligned}$$
(21)

Since \(\lim _{n\rightarrow \infty }\phi (x^*, x_n)\) exists, we obtain from (21) that

$$\begin{aligned} \underset{n\rightarrow \infty }{\lim }\phi (y_n, x_n)=0. \end{aligned}$$

Applying Lemma 2.11, we get

$$\begin{aligned} \underset{n\rightarrow \infty }{\lim }\Vert x_n-y_n\Vert =0. \end{aligned}$$

Since E is uniformly smooth, the duality mapping J is uniformly norm-to-norm continuous on each bounded subset of E. Hence, we have

$$\begin{aligned} \underset{n\rightarrow \infty }{\lim }\Vert Jx_n-Jy_n\Vert =0. \end{aligned}$$

Since \(\{x_n\}\) is bounded by Lemma 3.4, there exist a subsequence \(\{x_{n_i}\}\) of \(\{x_n\}\) and \(z\in E\) such that \(x_{n_i}\rightharpoonup z\). Since \(\lim _{n\rightarrow \infty } \Vert x_n-y_n\Vert =0\), it follows that \(y_{n_i}\rightharpoonup z\) as well. We now show that \(z\in (A+B)^{-1}(0)\).

Suppose \((v,u^*) \in \text {Graph}(A+B)\), that is, \(u^* \in (A+B)v\). This implies that \(u^*-Av \in Bv\). Furthermore, we obtain from \(y_{n_i}= (J+\lambda _{n_i}B)^{-1} (Jx_{n_i} - \lambda _{n_i}Ax_{n_i})\) that

$$\begin{aligned} (J-\lambda _{n_i}A)x_{n_i} \in (J+\lambda _{n_i} B)y_{n_i}, \end{aligned}$$

and thus

$$\begin{aligned} \frac{1}{\lambda _{n_i}}(Jx_{n_i}-Jy_{n_i}-\lambda _{n_i} Ax_{n_i}) \in By_{n_i}. \end{aligned}$$

Using the monotonicity of B, we obtain

$$\begin{aligned} \left\langle v-y_{n_i},u^*-Av-\frac{1}{\lambda _{n_i}}(Jx_{n_i}-Jy_{n_i}-\lambda _{n_i} Ax_{n_i})\right\rangle \ge 0. \end{aligned}$$

Therefore,

$$\begin{aligned} \langle v-y_{n_i},u^*\rangle\ge & {} \left\langle v-y_{n_i},Av+\frac{1}{\lambda _{n_i}}(Jx_{n_i}-Jy_{n_i}-\lambda _{n_i} Ax_{n_i})\right\rangle \nonumber \\= & {} \langle v-y_{n_i}, Av-Ax_{n_i}\rangle +\left\langle v-y_{n_i}, \frac{1}{\lambda _{n_i}}(Jx_{n_i}-Jy_{n_i})\right\rangle \nonumber \\= & {} \langle v-y_{n_i}, Av-Ay_{n_i}\rangle + \langle v-y_{n_i}, Ay_{n_i}-Ax_{n_i}\rangle \nonumber \\&+\left\langle v-y_{n_i}, \frac{1}{\lambda _{n_i}}(Jx_{n_i}-Jy_{n_i})\right\rangle \nonumber \\\ge & {} \langle v-y_{n_i}, Ay_{n_i}-Ax_{n_i}\rangle +\left\langle v-y_{n_i}, \frac{1}{\lambda _{n_i}}(Jx_{n_i}-Jy_{n_i})\right\rangle . \end{aligned}$$

Using \(\lim _{n\rightarrow \infty } \Vert x_n-y_n\Vert =0\) and the Lipschitz continuity of A, we obtain \(\lim _{n\rightarrow \infty } \Vert Ax_n-Ay_n\Vert =0\). Since also \(\Vert Jx_{n_i}-Jy_{n_i}\Vert \rightarrow 0\), \(\lambda _{n_i}\ge a>0\) and \(y_{n_i}\rightharpoonup z\), letting \(i\rightarrow \infty \) we obtain

$$\begin{aligned} \langle v-z,u^*\rangle \ge 0. \end{aligned}$$

Since this holds for every \((v,u^*)\in \text {Graph}(A+B)\), the maximal monotonicity of \(A+B\) yields \(0\in (A+B)z\). Hence, \(z \in (A+B)^{-1}(0)\).

Let \(u_n:=\Pi _{(A+B)^{-1}(0)} (x_n)\). By (20) and Lemma 2.12, \(\{u_n\}\) is a Cauchy sequence. Since \((A+B)^{-1}(0)\) is closed, \(\{u_n\}\) converges strongly to some \( w \in (A+B)^{-1}(0)\). By the uniform smoothness of E, we also have \(\lim _{n\rightarrow \infty } \Vert Ju_n-Jw\Vert =0\). We next show that \(z=w\). Using the characterization of the generalized projection \(\Pi \) (see [1]), \(u_n=\Pi _{(A+B)^{-1}(0)} (x_n)\) and \( z \in (A+B)^{-1}(0)\), we have

$$\begin{aligned} \langle z-u_n,Ju_n-Jx_n\rangle \ge 0,\quad \forall n \ge 1. \end{aligned}$$

Therefore,

$$\begin{aligned} \langle z-w,Jx_n-Ju_n\rangle= & {} \langle z-u_n,Jx_n-Ju_n\rangle +\langle u_n-w,Jx_n-Ju_n\rangle \\\le & {} \Vert u_n-w\Vert \Vert Jx_n-Ju_n\Vert \le M\Vert u_n-w\Vert ,\quad \forall n \ge 1, \end{aligned}$$

where \(M:=\sup _{n \ge 1}\Vert Jx_n-Ju_n\Vert \). Taking \(n=n_{i}\) above and using \(\lim _{n\rightarrow \infty }\Vert u_n-w\Vert =0\), \(\lim _{n\rightarrow \infty }\Vert Ju_n-Jw\Vert =0\) and the weak sequential continuity of J, we obtain

$$\begin{aligned} \langle z-w,Jz-Jw\rangle \le 0 \end{aligned}$$

as \(i\rightarrow \infty \). Since J is monotone, we also have \( \langle z-w,Jz-Jw\rangle \ge 0\), and therefore \( \langle z-w,Jz-Jw\rangle = 0\). Since E is strictly convex, it follows that \(z = w\). Therefore, the sequence \(\{x_n\}\) converges weakly to \(z=\lim _{n\rightarrow \infty } \Pi _{(A+B)^{-1}(0)} (x_n)\). This completes the proof. \(\square \)

It is easy to see from Algorithm 3.3 and Lemma 2.17 that \(x_n=y_n\) if and only if \(x_n \in (A+B)^{-1}(0)\). Also, we have already established that \(\Vert x_n-y_n\Vert \rightarrow 0\) holds when \((A+B)^{-1}(0)\ne \emptyset \). Therefore, using \(\Vert x_n-y_n\Vert \) as a measure of the convergence rate, we obtain the following non-asymptotic rate of convergence for our proposed Algorithm 3.3.

Theorem 3.7

Let Assumptions 3.1 and 3.2 hold. Let the sequence \( \{x_n\}_{n=1}^\infty \) be generated by Algorithm 3.3. Then \(\min _{1\le k\le n}\Vert x_k-y_k\Vert =O(1/\sqrt{n})\).

Proof

We obtain from (19) that

$$\begin{aligned} \phi (x^*,x_{n+1})\le \phi (x^*,x_n)-(1-2\kappa ^2\lambda _n^2L^2\mu )\phi (y_n,x_n). \end{aligned}$$

Hence, we have from Lemma 2.11 that

$$\begin{aligned} \frac{1}{\mu }(1-2\kappa ^2\lambda _n^2L^2\mu )\Vert x_n-y_n\Vert ^2\le & {} (1-2\kappa ^2\lambda _n^2L^2\mu )\phi (y_n,x_n)\\\le & {} \phi (x^*,x_n)-\phi (x^*,x_{n+1}). \end{aligned}$$

By Assumption 3.2, we get

$$\begin{aligned} \sum _{k=1}^n\Vert x_k-y_k\Vert ^2\le \frac{\mu }{1-2\kappa ^2b^2L^2\mu }\phi (x^*,x_1). \end{aligned}$$

Therefore,

$$\begin{aligned} \min _{1\le k\le n}\Vert x_k-y_k\Vert ^2\le \frac{\mu }{n(1-2\kappa ^2b^2L^2\mu )}\phi (x^*,x_1). \end{aligned}$$

This implies that

$$\begin{aligned} \min _{1\le k\le n}\Vert x_k-y_k\Vert =O(1/\sqrt{n}). \end{aligned}$$

\(\square \)

Next, we propose another iterative method, in which the step-sizes do not depend on the Lipschitz constant of the monotone operator A in problem (1).

Algorithm 3.8

Step 0 Let Assumption 3.1 hold. Choose \(\gamma >0\), \(l \in (0,1)\) and \(\theta \in (0,\frac{1}{\sqrt{2\mu }\kappa })\). Let \( x_1 \in E \) be a given starting point. Set \( n := 1 \).

Step 1 Compute \(y_n:= J_{\lambda _n}^BJ^{-1}(Jx_n - \lambda _nAx_n)\), where \(\lambda _n\) is chosen to be the largest

$$\begin{aligned} \lambda \in \{\gamma ,\gamma l,\gamma l^2,\ldots \} \end{aligned}$$

satisfying

$$\begin{aligned} \lambda \Vert Ax_n-Ay_n\Vert \le \theta \Vert x_n-y_n\Vert . \end{aligned}$$
(22)

If \(x_n-y_n=0\): STOP.

Step 2 Compute

$$\begin{aligned} x_{n+1} = J^{-1}[Jy_n - \lambda _n(Ay_n-Ax_n)]. \end{aligned}$$
(23)

Step 3 Set \( n \leftarrow n+1 \), and go to Step 1.
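Here is a minimal Hilbert-space sketch (\(J=I\)) of the backtracking rule (22) inside Algorithm 3.8, again on an illustrative box-constrained variational inequality; \(\gamma \), l and \(\theta \) are illustrative choices (in this setting \(\mu =1\) and \(\kappa =1/\sqrt{2}\), so the bound on \(\theta \) is 1).

```python
import numpy as np

# Hilbert-space sketch of Algorithm 3.8 (J = I) with the backtracking rule (22):
# lambda_n is the largest gamma * l^m with lam*||Ax - Ay|| <= theta*||x - y||.
rng = np.random.default_rng(2)
S = rng.standard_normal((8, 8))
M = S - S.T + 0.1 * np.eye(8)
q = rng.standard_normal(8)
A = lambda x: M @ x + q
proj_C = lambda z: np.clip(z, -1.0, 1.0)

gamma, l, theta = 1.0, 0.5, 0.7            # theta < 1/(sqrt(2*mu)*kappa) = 1 here
x = np.zeros(8)
for n in range(1000):
    lam = gamma
    y = proj_C(x - lam * A(x))
    while lam * np.linalg.norm(A(x) - A(y)) > theta * np.linalg.norm(x - y):
        lam *= l                           # backtrack: try gamma, gamma*l, gamma*l^2, ...
        y = proj_C(x - lam * A(x))         # y_n depends on the trial lambda
    if np.linalg.norm(x - y) < 1e-10:
        break
    x = y - lam * (A(y) - A(x))            # Tseng correction (23)
print(n, np.linalg.norm(x - y))
```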

Before establishing the weak convergence of Algorithm 3.8, we first show, in the following lemma, that the linesearch rule (22) is well-defined.

Lemma 3.9

The line search rule (22) in Algorithm 3.8 is well-defined and

$$\begin{aligned} \min \Big \{\gamma ,\frac{\theta l}{L} \Big \}\le \lambda _n \le \gamma . \end{aligned}$$

Proof

Using the Lipschitz continuity of A on E, we obtain, for any \(\lambda >0\),

$$\begin{aligned} \Vert Ax_n-A(J_{\lambda }^BJ^{-1}(Jx_n - \lambda Ax_n))\Vert \le L\Vert x_n-J_{\lambda }^BJ^{-1}(Jx_n - \lambda Ax_n)\Vert . \end{aligned}$$

This implies that

$$\begin{aligned} \frac{\theta }{L}\Vert Ax_n-A(J_{\lambda }^BJ^{-1}(Jx_n - \lambda Ax_n))\Vert \le \theta \Vert x_n-J_{\lambda }^BJ^{-1}(Jx_n - \lambda Ax_n)\Vert . \end{aligned}$$

Therefore, (22) holds whenever \(\lambda \le \frac{\theta }{L}\). Since \(\gamma l^m\rightarrow 0\) as \(m\rightarrow \infty \), the backtracking terminates after finitely many trials, and \( \lambda _n\) is well-defined.

From the way \( \lambda _n\) is chosen, we clearly have \( \lambda _n \le \gamma \). If \( \lambda _n=\gamma \), then (22) is satisfied and the lemma is proved. Suppose \( \lambda _n<\gamma \). Then \(\frac{ \lambda _n}{l}\) violates (22); that is, writing \(\bar{y}_n:=J_{\lambda _n/l}^BJ^{-1}\big (Jx_n - \frac{\lambda _n}{l}Ax_n\big )\),

$$\begin{aligned} L\Vert x_n-\bar{y}_n\Vert \ge \Vert Ax_n-A\bar{y}_n\Vert > \frac{\theta }{\frac{\lambda _n}{l}} \Vert x_n-\bar{y}_n\Vert . \end{aligned}$$

This implies that \(\lambda _n > \frac{\theta l}{L}\). This completes the proof. \(\square \)

We now give a weak convergence result using Algorithm 3.8 in the next theorem.

Theorem 3.10

Let Assumption 3.1 hold. Assume that J is weakly sequentially continuous on E and let the sequence \( \{x_n\}_{n=1}^\infty \) be generated by Algorithm 3.8. Then \(\{x_n\}\) converges weakly to \(z\in (A+B)^{-1}(0)\). Moreover, \(z:=\lim _{n\rightarrow \infty } \Pi _{(A+B)^{-1}(0)}(x_n)\).

Proof

Using the same line of arguments as in the proof of Lemma 3.4, we can obtain from (19) that

$$\begin{aligned} \phi (x^*,x_{n+1})\le & {} \phi (x^*,x_n) - \phi (y_n,x_n)+ 2\kappa ^2\Vert \lambda _n(Ay_n-Ax_n)\Vert ^2\nonumber \\\le & {} \phi (x^*, x_n) -\phi (y_n, x_n) +2\kappa ^2\theta ^2\Vert y_n-x_n\Vert ^2 \nonumber \\\le & {} \phi (x^*,x_n) - \phi (y_n,x_n)+ 2\kappa ^2\theta ^2\mu \phi (y_n,x_n)\nonumber \\= & {} \phi (x^*,x_n) -(1-2\kappa ^2\theta ^2\mu )\phi (y_n,x_n). \end{aligned}$$
(24)

Since \(\theta ^2 <\frac{1}{2\kappa ^2\mu }\), we get

$$\begin{aligned} \phi (x^*,x_{n+1}) \le \phi (x^*, x_n), \end{aligned}$$
(25)

which shows that \(\lim \phi (x^*, x_n)\) exists and hence \(\{\phi (x^*, x_n)\}\) is bounded. Therefore \(\{x_n\}\) is bounded. The rest of the proof follows by the same arguments as in the proof of Theorem 3.6. This completes the proof. \(\square \)

Finally, we give a modification of Algorithm 3.3 for which we obtain strong convergence, analyzed below.

Algorithm 3.11

Step 0 Let Assumptions 3.1 and 3.2 hold. Suppose that \(\{\alpha _n\}\) is a real sequence in (0,1) and let \( x_1 \in E \) be a given starting point. Set \( n := 1 \).

Step 1 Compute \(y_n:= J_{\lambda _n}^BJ^{-1}(Jx_n - \lambda _nAx_n)\). If \(x_n-y_n=0\): STOP.

Step 2 Compute

$$\begin{aligned} w_n = J^{-1}[Jy_n - \lambda _n(Ay_n-Ax_n)] \end{aligned}$$
(26)

and

$$\begin{aligned} x_{n+1} = J^{-1}[\alpha _nJx_1+(1-\alpha _n)Jw_n]. \end{aligned}$$
(27)

Step 3 Set \( n \leftarrow n+1 \), and go to Step 1.
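A Hilbert-space sketch of Algorithm 3.11 (\(J=I\)): a Tseng step (26) followed by the anchored averaging (27). The choice \(\alpha _n=1/(n+1)\) satisfies \(\alpha _n\rightarrow 0\) and \(\sum \alpha _n=\infty \); the problem data are illustrative assumptions.

```python
import numpy as np

# Hilbert-space sketch of Algorithm 3.11 (J = I): Tseng step + anchoring to x_1.
rng = np.random.default_rng(3)
S = rng.standard_normal((8, 8))
M = S - S.T + 0.1 * np.eye(8)
q = rng.standard_normal(8)
A = lambda x: M @ x + q
L = np.linalg.norm(M, 2)
proj_C = lambda z: np.clip(z, -1.0, 1.0)

lam = 0.9 / L
x1 = np.zeros(8)                           # anchor point
x = x1.copy()
for n in range(1, 5000):
    alpha = 1.0 / (n + 1)                  # alpha_n -> 0, sum alpha_n = infinity
    y = proj_C(x - lam * A(x))             # Step 1
    if np.linalg.norm(x - y) < 1e-10:
        break
    w = y - lam * (A(y) - A(x))            # Step 2, (26)
    x = alpha * x1 + (1 - alpha) * w       # anchored averaging (27)
print(n, np.linalg.norm(x - proj_C(x - lam * A(x))))
```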

Theorem 3.12

Let Assumptions 3.1 and 3.2 hold. Suppose that \(\lim _{n \rightarrow \infty } \alpha _n = 0 \) and \( \sum _{n = 1}^{\infty } \alpha _n = \infty \). Let the sequence \( \{x_n\}_{n=1}^\infty \) be generated by Algorithm 3.11. Then \(\{x_n\}\) converges strongly to \(z=\Pi _{(A+B)^{-1}(0)}(x_1)\).

Proof

By Lemma 3.4, we have that \(\{x_n\}\) is bounded. Furthermore, using Lemma 2.10 with (26) and (27), we have

$$\begin{aligned} \phi (z,x_{n+1})= & {} \phi (z,J^{-1}(\alpha _n Jx_1+(1-\alpha _n)Jw_n))\nonumber \\= & {} V(z,\alpha _n Jx_1+(1-\alpha _n)Jw_n)\nonumber \\\le & {} V(z,\alpha _n Jx_1+(1-\alpha _n)Jw_n-\alpha _n(Jx_1-Jz))\nonumber \\&+2\alpha _n\langle Jx_1-Jz,x_{n+1}-z\rangle \nonumber \\= & {} V(z,\alpha _n Jz+(1-\alpha _n)Jw_n)+2\alpha _n\langle Jx_1-Jz,x_{n+1}-z\rangle \nonumber \\\le & {} \alpha _n V(z,Jz)+(1-\alpha _n)V(z,Jw_n)+2\alpha _n\langle Jx_1-Jz,x_{n+1}-z\rangle \nonumber \\= & {} (1-\alpha _n)V(z,Jw_n)+2\alpha _n\langle Jx_1-Jz,x_{n+1}-z\rangle \nonumber \\\le & {} (1-\alpha _n)V(z,Jx_n)+2\alpha _n\langle Jx_1-Jz,x_{n+1}-z\rangle \nonumber \\= & {} (1-\alpha _n)\phi (z,x_n)+2\alpha _n\langle Jx_1-Jz,x_{n+1}-z\rangle . \end{aligned}$$
(28)

Set \(a_n:=\phi (z,x_n)\) and divide the rest of the proof into two cases as follows.

Case 1 Suppose that there exists \(n_0 \in \mathbb {N}\) such that \(\{\phi (z,x_n)\}_{n=n_0}^{\infty }\) is non-increasing. Then \(\{\phi (z,x_n)\}_{n=1}^{\infty }\) converges, and we therefore obtain

$$\begin{aligned} a_n-a_{n+1}\rightarrow 0,\quad n\rightarrow \infty . \end{aligned}$$
(29)

Using the convexity of \(V(z,\cdot )\) and (19) (applied with \(w_n\) in place of \(x_{n+1}\)) in (27), we have

$$\begin{aligned} \phi (z,x_{n+1})\le & {} \alpha _n \phi (z,x_1)+(1-\alpha _n)\phi (z,w_n) \nonumber \\\le & {} \alpha _n \phi (z,x_1)+(1-\alpha _n)\phi (z,x_n) \nonumber \\&-(1-\alpha _n)[1-2\kappa ^2\lambda _n^2L^2\mu ]\phi (y_n,x_n). \end{aligned}$$
(30)

It follows from (30) that

$$\begin{aligned} (1-\alpha _n)[1-2\kappa ^2\lambda _n^2L^2\mu ]\phi (y_n,x_n) \le \phi (z,x_n)-\phi (z,x_{n+1})+ \alpha _nM_1, \end{aligned}$$

for some \(M_1>0\). Thus,

$$\begin{aligned} (1-\alpha _n)[1-2\kappa ^2\lambda _n^2L^2\mu ]\phi (y_n,x_n) \rightarrow 0,\quad n\rightarrow \infty . \end{aligned}$$

Hence, since \(1-2\kappa ^2\lambda _n^2L^2\mu \ge 1-2\kappa ^2b^2L^2\mu >0\) by Assumption 3.2,

$$\begin{aligned} \phi (y_n,x_n)\rightarrow 0,\quad n\rightarrow \infty . \end{aligned}$$

Consequently, \( \Vert x_n-y_n\Vert \rightarrow 0,\ n\rightarrow \infty .\) By (26), we get

$$\begin{aligned} \Vert Jw_n-Jy_n\Vert= & {} \lambda _n\Vert Ay_n-Ax_n\Vert \\\le & {} b\Vert Ay_n-Ax_n\Vert \rightarrow 0,\quad n\rightarrow \infty . \end{aligned}$$

Therefore, \(\Vert w_n-y_n\Vert \rightarrow 0,\ n\rightarrow \infty .\) Moreover, we obtain from (27) that

$$\begin{aligned} \Vert Jx_{n+1}-Jw_n\Vert= & {} \alpha _n\Vert Jx_1-Jw_n\Vert \le \alpha _nM_2\rightarrow 0,\quad n\rightarrow \infty , \end{aligned}$$
(31)

for some \(M_2>0\). Since \(J^{-1}\) is norm-to-norm uniformly continuous on bounded subsets of \(E^*\), we have that

$$\begin{aligned} \Vert x_{n+1}-w_n\Vert \rightarrow 0,\quad n\rightarrow \infty . \end{aligned}$$

Now,

$$\begin{aligned} \Vert x_{n+1}-x_n\Vert \le \Vert x_{n+1}-w_n\Vert +\Vert w_n-y_n\Vert +\Vert y_n-x_n\Vert \rightarrow 0,\quad n\rightarrow \infty . \end{aligned}$$

Since \(\{x_n\}\) is a bounded subset of E, we can choose a subsequence \(\{x_{n_k}\}\) of \(\{x_n\}\) such that \(x_{n_k}\rightharpoonup p \in E\) and

$$\begin{aligned} \limsup _{n\rightarrow \infty } \langle Jx_1-Jz,x_n-z\rangle = \lim _{k\rightarrow \infty } \langle Jx_1-Jz,x_{n_k}-z\rangle . \end{aligned}$$

Arguing as in the proof of Theorem 3.6, we have \(p\in (A+B)^{-1}(0)\). Since \(z=\Pi _{(A+B)^{-1}(0)}(x_1)\), the characterization of the generalized projection gives

$$\begin{aligned} \limsup _{n\rightarrow \infty } \langle Jx_1-Jz,x_n-z\rangle= & {} \lim _{k\rightarrow \infty } \langle Jx_1-Jz,x_{n_k}-z\rangle \nonumber \\= & {} \langle Jx_1-Jz,p-z\rangle \le 0. \end{aligned}$$
(32)

This implies that

$$\begin{aligned} \limsup _{n\rightarrow \infty } \langle Jx_1-Jz,x_n-z\rangle \le 0. \end{aligned}$$

Using Lemma 2.15 and (32) in (28), we obtain \(\lim _{n \rightarrow \infty }\phi (z,x_n)=0.\) Thus, \(x_n\rightarrow z\), \(n \rightarrow \infty \).

Case 2 Suppose that there exists a subsequence \(\{x_{m_j}\}\) of \(\{x_n\}\) such that

$$\begin{aligned} \phi (z,x_{m_j}) <\phi (z,x_{{m_j}+1}),\quad \forall j\in \mathbb {N}. \end{aligned}$$

From Lemma 2.14, there exists a nondecreasing sequence \(\{n_k\}\subset \mathbb {N}\) such that \(\lim _{k\rightarrow \infty } n_k=\infty \) and the following inequalities hold for all (sufficiently large) \(k \in \mathbb {N}\):

$$\begin{aligned} \phi (z,x_{n_k}) \le \phi (z,x_{{n_k}+1})\quad \mathrm{and}\quad \phi (z,x_k) \le \phi (z,x_{{n_k}+1}). \end{aligned}$$
(33)

Observe that

$$\begin{aligned} \phi (z,x_{n_k})\le & {} \phi (z,x_{{n_k}+1}) \le \alpha _{n_k}\phi (z,x_1)+(1-\alpha _{n_k})\phi (z,w_{n_k})\\\le & {} \alpha _{n_k}\phi (z,x_1)+(1-\alpha _{n_k})\phi (z,x_{n_k}). \end{aligned}$$

Since \(\lim _{n\rightarrow \infty }\alpha _n=0\), we get

$$\begin{aligned} \phi (z,x_{{n_k}+1})-\phi (z,x_{n_k})\rightarrow 0,\quad k\rightarrow \infty . \end{aligned}$$

Since \(\{x_{n_k}\}\) is bounded, there exists a subsequence of \(\{x_{n_k}\}\) still denoted by \(\{x_{n_k}\}\) which converges weakly to \( p\in E\). Repeating the same arguments as in Case 1 above, we can show that

$$\begin{aligned} \Vert x_{n_k}-y_{n_k}\Vert \rightarrow 0,\quad \Vert y_{n_k}-w_{n_k}\Vert \rightarrow 0\quad \mathrm{and}\quad \Vert x_{{n_k}+1}-x_{n_k}\Vert \rightarrow 0,\quad k\rightarrow \infty . \end{aligned}$$

Similarly, we can conclude that

$$\begin{aligned} \limsup _{k\rightarrow \infty } \langle x_{{n_k}+1}-z, Jx_1-Jz\rangle= & {} \limsup _{k\rightarrow \infty } \langle x_{n_k}-z, Jx_1-Jz\rangle \le 0. \end{aligned}$$
(34)

It then follows from (28) and (33) that

$$\begin{aligned} \phi (z,x_{{n_k}+1})\le & {} (1-\alpha _{n_k})\phi (z,x_{n_k})+2\alpha _{n_k}\langle x_{{n_k}+1}-z, Jx_1-Jz\rangle \\\le & {} (1-\alpha _{n_k})\phi (z,x_{{n_k}+1})+2\alpha _{n_k}\langle x_{{n_k}+1}-z, Jx_1-Jz\rangle . \end{aligned}$$

Since \(\alpha _{n_k}>0\), we get

$$\begin{aligned} \phi (z,x_{n_k}) \le \phi (z,x_{{n_k}+1}) \le 2\langle x_{{n_k}+1}-z, Jx_1-Jz\rangle . \end{aligned}$$

By (34), we have that

$$\begin{aligned} \limsup _{k\rightarrow \infty } \phi (z,x_{n_k}) \le 2\limsup _{k\rightarrow \infty } \langle x_{{n_k}+1}-z, Jx_1-Jz\rangle \le 0. \end{aligned}$$

Hence \(\phi (z,x_{n_k})\rightarrow 0\) and, by (33), \(\phi (z,x_k)\rightarrow 0\) as well. Therefore \(x_k\rightarrow z\) as \(k\rightarrow \infty \). This concludes the proof. \(\square \)

Remark 3.13

Our proposed Algorithms 3.3 and 3.11 are more widely applicable than the methods proposed in [10, 12, 23, 29, 30, 42, 44,45,46, 49], even in Hilbert spaces. The methods of [12, 23, 29, 30, 42, 44,45,46, 49] apply to problem (1) only when B is maximal monotone and A is an inverse-strongly monotone (cocoercive) operator in real Hilbert spaces. Our Algorithms 3.3 and 3.11 apply when B is maximal monotone and A is merely monotone and Lipschitz continuous, even in 2-uniformly convex and uniformly smooth Banach spaces (e.g., \(L_p\), \(1<p\le 2\)). Our results in this paper also complement the results of [14, 22].

4 Application

In this section, we apply our results to the minimization of composite objective function of the type

$$\begin{aligned} \min _{x \in E} f(x)+g(x), \end{aligned}$$
(35)

where \(f:E\rightarrow \mathbb {R}\cup \{ +\infty \}\) is a proper, convex and lower semi-continuous functional and \(g:E\rightarrow \mathbb {R}\) is a convex functional.

Many optimization problems from image processing [9], statistical regression and machine learning (see, e.g., [50] and the references contained therein) can be cast in the form (35). In this setting, we assume that g represents the “smooth part” of the functional, while f is allowed to be non-smooth. Specifically, we assume that g is Gâteaux differentiable and that its derivative \(\nabla g\) is Lipschitz continuous with constant L. Then, by [37, Theorem 3.13], we have

$$\begin{aligned} \langle \nabla g (x)-\nabla g(y), x-y\rangle \ge \frac{1}{L}\Vert \nabla g(x)-\nabla g(y)\Vert ^2, \quad \forall x, y \in E. \end{aligned}$$

Therefore, \(\nabla g\) is monotone and Lipschitz continuous with Lipschitz constant L. Observe that problem (35) is equivalent to finding \(x \in E\) such that

$$\begin{aligned} 0\in \partial f(x)+ \nabla g(x). \end{aligned}$$
(36)

Then problem (36) is a special case of inclusion problem (1) with \(A:=\nabla g\) and \(B:=\partial f\).

Next, we obtain the resolvent of \(\partial f\). Let us fix \(r>0\) and \(z \in E\). Suppose \(J_r^{\partial f}\) is the resolvent of \(\partial f\). Then

$$\begin{aligned} Jz \in J (J_r^{\partial f}z)+r \partial f(J_r^{\partial f}z). \end{aligned}$$

Hence we obtain

$$\begin{aligned} 0\in \partial f(J_r^{\partial f}z)+\frac{1}{r}J (J_r^{\partial f}z)-\frac{1}{r}Jz =\partial \left( f+\frac{1}{2r}\Vert \cdot \Vert ^2-\frac{1}{r}\langle \cdot ,Jz\rangle \right) (J_r^{\partial f}z). \end{aligned}$$

Therefore,

$$\begin{aligned} J_r^{\partial f}(z)=\mathop {\mathrm{argmin}}\limits _{{y \in E}}\left\{ f(y)+\frac{1}{2r}\Vert y\Vert ^2-\frac{1}{r}\langle y,Jz\rangle \right\} . \end{aligned}$$

We can then write \(y_n\) in Algorithm 3.3 as

$$\begin{aligned} y_n=\mathop {\mathrm{argmin}}\limits _{{y \in E}}\left\{ f(y)+\frac{1}{2\lambda _n}\Vert y\Vert ^2-\frac{1}{\lambda _n}\langle y,Jx_n - \lambda _n \nabla g(x_n)\rangle \right\} . \end{aligned}$$
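A Hilbert-space sketch of this iteration (\(J=I\)): with \(f=\Vert \cdot \Vert _1\), the argmin defining \(y_n\) is soft-thresholding of \(x_n-\lambda _n\nabla g(x_n)\), and we take \(g(x)=\frac{1}{2}\Vert Mx-b\Vert ^2\). The data M, b and the step size are illustrative assumptions.

```python
import numpy as np

# Hilbert-space sketch of the composite scheme: f = ||.||_1 (argmin step =
# soft-thresholding), g(x) = 0.5*||Mx - b||^2, so grad g is monotone, L-Lipschitz.
def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

rng = np.random.default_rng(4)
M = rng.standard_normal((30, 15))
b = rng.standard_normal(30)
grad_g = lambda x: M.T @ (M @ x - b)
L = np.linalg.norm(M.T @ M, 2)

lam = 0.9 / L                                        # lambda_n < 1/L in Hilbert spaces
x = np.zeros(15)
for n in range(3000):
    y = soft_threshold(x - lam * grad_g(x), lam)     # y_n step
    if np.linalg.norm(x - y) < 1e-10:
        break
    x = y - lam * (grad_g(y) - grad_g(x))            # Tseng correction
print(n, np.linalg.norm(x - soft_threshold(x - lam * grad_g(x), lam)))
```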

We obtain the following weak and strong convergence results for problem (35).

Theorem 4.1

Let E be a real 2-uniformly convex Banach space which is also uniformly smooth and the solution set S of problem (35) be nonempty. Suppose \( \{ \lambda _n \}_{n=1}^\infty \) satisfies the condition \(0<a\le \lambda _n\le b<\displaystyle \frac{1}{\sqrt{2\mu }\kappa L}\). Assume that J is weakly sequentially continuous on E and let the sequence \( \{x_n\}_{n=1}^\infty \) be generated by

$$\begin{aligned} \left\{ \begin{array}{llll} &{} x_1 \in E,\\ &{} y_n=\mathop {\mathrm{argmin}}\limits _{{y \in E}}\Big \{f(y)+\frac{1}{2\lambda _n}\Vert y\Vert ^2-\frac{1}{\lambda _n}\langle y,Jx_n - \lambda _n \nabla g(x_n)\rangle \Big \}\\ &{} x_{n+1} = J^{-1}[Jy_n - \lambda _n(\nabla g(y_n)-\nabla g(x_n))],~~n \ge 1. \end{array} \right. \end{aligned}$$
(37)

Then \(\{x_n\}\) converges weakly to \(z\in S\). Moreover, \(z:=\lim _{n\rightarrow \infty } \Pi _S(x_n)\).

Theorem 4.2

Let E be a real 2-uniformly convex Banach space which is also uniformly smooth and the solution set S of problem (35) be nonempty. Suppose \( \{ \lambda _n \}_{n=1}^\infty \) satisfies the condition \(0<a\le \lambda _n\le b<\displaystyle \frac{1}{\sqrt{2\mu }\kappa L}\). Suppose that \(\{\alpha _n\}\) is a real sequence in (0, 1) with \(\lim _{n \rightarrow \infty } \alpha _n = 0 \) and \( \sum _{n = 1}^{\infty } \alpha _n = \infty \). Let the sequence \( \{x_n\}_{n=1}^\infty \) be generated by

$$\begin{aligned} \left\{ \begin{array}{llll} &{} x_1 \in E,\\ &{} y_n=\mathop {\mathrm{argmin}}\limits _{{y \in E}}\Big \{f(y)+\frac{1}{2\lambda _n}\Vert y\Vert ^2-\frac{1}{\lambda _n}\langle y,Jx_n - \lambda _n \nabla g(x_n)\rangle \Big \}\\ &{} w_n = J^{-1}[Jy_n - \lambda _n(\nabla g(y_n)-\nabla g(x_n))],\\ &{} x_{n+1} = J^{-1}[\alpha _nJx_1+(1-\alpha _n)Jw_n],~~n \ge 1. \end{array} \right. \end{aligned}$$
(38)

Then \(\{x_n\}\) converges strongly to \(z=\Pi _S(x_1)\).

Remark 4.3

  • Our results in Theorems 4.1 and 4.2 complement the results of [9, 19]. Consequently, our results in Sect. 3.1 extend the results of [9, 19] to the inclusion problem (1). In particular, we do not assume boundedness of \(\{x_n\}\), which was imposed in [9, 19]; in this respect our results improve on those of [9, 19].

  • The minimization problem (35) studied in this section extends the problems studied in [8, 15, 34, 50] and other related papers from Hilbert spaces to Banach spaces.

5 Conclusion

We studied a Tseng-type algorithm for finding a solution of the monotone inclusion problem involving the sum of a maximal monotone operator and a Lipschitz continuous monotone operator in a 2-uniformly convex Banach space which is also uniformly smooth. We proved both weak and strong convergence of the sequences of iterates to a solution of the inclusion problem under appropriate conditions. Many results on monotone inclusion problems with a single maximal monotone operator can be considered as special cases of the problem studied in this paper. As far as we know, this is the first time an inclusion problem involving the sum of a maximal monotone operator and a Lipschitz continuous monotone operator has been studied in Banach spaces. The results of this paper therefore open up many directions for future work on this inclusion problem. Our next projects include the following.

  • The results in this paper exclude \(L_p\) spaces with \(p > 2\). An extension of the results of this paper to more general reflexive Banach spaces is therefore desirable.

  • How to compute the duality mapping J and the resolvent of the maximal monotone mapping B efficiently in implementations of our proposed algorithms will be considered further.

  • Numerical implementations of problem (1) arising from signal processing, image reconstruction, etc. will be studied.

  • Other choices of the step-sizes \(\lambda _n\), giving faster convergence of the proposed methods in this paper, will be investigated.