1 Introduction

Let H be a real Hilbert space with inner product \(\langle \cdot ,\cdot \rangle\) and induced norm \(||\cdot ||.\) Let C be a nonempty, closed and convex subset of H,  and let \(A:H\rightarrow H\) be a mapping. The variational inequality problem (VIP) is formulated as finding a point \(p\in C\) such that

$$\begin{aligned} \langle x-p, Ap \rangle \ge 0,\quad \forall ~x\in C. \end{aligned}$$
(1.1)

We denote the solution set of the VIP (1.1) by VI(C, A). In recent years, the VIP has received great research attention due to its wide range of applications, for instance in structural analysis, economics and optimization theory (Alakoya and Mewomo 2022; Aubin and Ekeland 1984; Ogwo et al. 2022), as well as in operations research, the sciences and engineering (see Baiocchi and Capelo 1984; Censor et al. 2012a; Godwin et al. 2023; Kinderlehrer and Stampacchia 2000 and the references therein). The VIP is an important mathematical model that has been widely utilized to formulate and investigate a plethora of competitive equilibrium problems in various disciplines, such as traffic network equilibrium problems, spatial price equilibrium problems, oligopolistic market equilibrium problems, financial equilibrium problems, migration equilibrium problems, environmental network and ecology problems, knowledge network problems, supply chain network equilibrium problems and internet problems; see, e.g., Geunes and Pardalos (2003), Nagurney (1999), Nagurney and Dong (2002) for further examples and details. The study of VIPs in finite dimensional spaces was initiated independently by Smith (1979) and Dafermos (1980), who formulated the traffic assignment problem as a finite dimensional VIP. Subsequently, Lawphongpanich and Hearn (1984), and Panicucci et al. (2007) studied traffic assignment problems based on the Wardrop user equilibrium principle via a variational inequality model. Since then, several other economics-related problems, such as Nash equilibrium problems, spatial price equilibrium problems, internet problems, dynamic financial equilibrium problems and environmental network and ecology problems, have been investigated via variational inequality models (see Aussel et al. 2016; Ciarciá and Daniele 2016; Nagurney et al. 2007; Scrimali and Mirabella 2018).

There are two common approaches to solving the VIP, namely: the regularised methods and the projection methods. In this study, our interest is in the projection methods. The earliest and simplest projection method for solving the VIP is the projected gradient method (GM), which is presented as follows:

Algorithm 1.1 (Gradient Method (GM))

$$\begin{aligned} x_{n+1}= P_C(x_n- \lambda Ax_n), \end{aligned}$$
(1.2)

for each \(n\ge 1,\) where \(P_C\) denotes the metric projection map. Observe that the GM requires calculating only one projection onto the feasible set C per iteration. However, the method converges only when the cost operator A is \(\alpha\)-strongly monotone and L-Lipschitz continuous, with \(\lambda \in \Big (0,\frac{2\alpha }{L^2}\Big ).\) These stringent conditions greatly limit the scope of applications of the GM (1.2).
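For illustration, the following is a minimal Python sketch of the GM (1.2) on a toy problem where the projection has a closed form; the affine operator, the ball radius and the stopping rule below are illustrative assumptions, not data from this paper.

```python
import numpy as np

# Illustrative VIP (assumed data): A(x) = M x + q with M positive definite,
# so A is strongly monotone and Lipschitz; C is the closed Euclidean ball of radius r.
M = np.array([[4.0, 1.0], [1.0, 3.0]])
q = np.array([-1.0, 2.0])
A = lambda x: M @ x + q
r = 5.0

def proj_C(x):
    """Projection onto the closed ball of radius r (closed form)."""
    nrm = np.linalg.norm(x)
    return x if nrm <= r else (r / nrm) * x

alpha = np.linalg.eigvalsh((M + M.T) / 2).min()   # strong monotonicity constant
L = np.linalg.norm(M, 2)                          # Lipschitz constant of A
lam = alpha / L**2                                # lambda in (0, 2*alpha/L**2)

x = np.zeros(2)
for _ in range(1000):
    x_new = proj_C(x - lam * A(x))                # one projection per iteration
    if np.linalg.norm(x_new - x) < 1e-10:
        break
    x = x_new
print("GM approximate solution:", x)
```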

In order to relax the conditions for the convergence of the GM to a solution of the VIP, Korpelevich (1976) and Antipin (1976) independently proposed the following extragradient method (EGM) in finite-dimensional Euclidean space:

Algorithm 1.2 (Extragradient Method (EGM))

$$\begin{aligned} {\left\{ \begin{array}{ll} x_1 \in C\\ y_n= P_C (x_n- \lambda Ax_n)\\ x_{n+1}= P_C (x_n- \lambda Ay_n),\\ \end{array}\right. } \end{aligned}$$
(1.3)

where \(\lambda \in {(0, \frac{1}{L})}\) and \(A:\mathbb {R}^n\rightarrow \mathbb {R}^n\) is monotone and L-Lipschitz continuous. If the solution set VI(C, A) is nonempty, the EGM (1.3) generates a sequence that converges to a solution of the VIP.
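A minimal Python sketch of the EGM (1.3) on the same illustrative problem (an assumed monotone affine operator on a closed ball, not data from this paper); note the two projections and two evaluations of A per iteration.

```python
import numpy as np

# Illustrative data (assumed): monotone affine operator A and the closed ball C of radius r.
M = np.array([[4.0, 1.0], [1.0, 3.0]])
q = np.array([-1.0, 2.0])
A = lambda x: M @ x + q
r = 5.0
proj_C = lambda x: x if np.linalg.norm(x) <= r else (r / np.linalg.norm(x)) * x

L = np.linalg.norm(M, 2)   # Lipschitz constant of A
lam = 0.9 / L              # lambda in (0, 1/L)

x = np.zeros(2)
for _ in range(1000):
    y = proj_C(x - lam * A(x))        # first projection onto C
    x_new = proj_C(x - lam * A(y))    # second projection onto C, using A(y)
    if np.linalg.norm(x_new - x) < 1e-10:
        break
    x = x_new
print("EGM approximate solution:", x)
```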

Observe that the EGM needs to compute two projections onto the feasible set C and two evaluations of the operator A per iteration. In general, computing the projection onto an arbitrary closed and convex set C is a difficult task in its own right, and this limitation can affect the efficiency of the EGM. In recent years, the EGM has attracted the attention of researchers, who have improved it in various ways (see, e.g., Ceng et al. 2021; Duong and Gibali 2019; Godwin et al. 2022; He et al. 2019 and the references therein). One of the major directions of improvement is to minimize the number of projections onto the feasible set C per iteration (Thong et al. 2020; Uzor et al. 2022). Censor et al. (2011) initiated an attempt in this direction by modifying the EGM and replacing the second projection with a projection onto a half-space. The resulting method requires only one projection onto the feasible set C and is known as the subgradient extragradient method (SEGM). The SEGM is presented as follows:

Algorithm 1.3 (Subgradient Extragradient Method (SEGM))

$$\begin{aligned} {\left\{ \begin{array}{ll} x_1\in H,\\ y_n = P_C(x_n - \lambda Ax_n),\\ T_n = \{z\in H: \langle x_n - \lambda Ax_n - y_n, z-y_n \rangle \ \le 0\},\\ x_{n+1} = P_{T_n}(x_n - \lambda Ay_n). \end{array}\right. } \end{aligned}$$
(1.4)

Censor et al. (2011) obtained a weak convergence result for the SEGM (1.4) under the same assumptions as the EGM (1.3). Since there is an explicit formula for calculating the projection onto a half-space, the SEGM can be considered an improvement over the EGM. However, we observe that the SEGM still requires computing one projection onto the closed convex set C per iteration, which can still be a great barrier to its implementation.
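A minimal Python sketch of the SEGM (1.4) on the same illustrative problem (assumed data, not from this paper); the second projection is onto the half-space \(T_n\), for which a closed-form formula is available (see property 4 of the metric projection in Section 2).

```python
import numpy as np

# Illustrative data (assumed): monotone affine operator A and the closed ball C of radius r.
M = np.array([[4.0, 1.0], [1.0, 3.0]])
q = np.array([-1.0, 2.0])
A = lambda x: M @ x + q
r = 5.0
proj_C = lambda x: x if np.linalg.norm(x) <= r else (r / np.linalg.norm(x)) * x

L = np.linalg.norm(M, 2)
lam = 0.9 / L

x = np.zeros(2)
for _ in range(1000):
    y = proj_C(x - lam * A(x))       # the only projection onto C
    a = x - lam * A(x) - y           # normal vector of T_n = {z : <a, z - y> <= 0}
    u = x - lam * A(y)
    if a @ a == 0:                   # T_n is the whole space
        x_new = u
    else:                            # closed-form projection onto the half-space T_n
        x_new = u - max(0.0, (a @ (u - y)) / (a @ a)) * a
    if np.linalg.norm(x_new - x) < 1e-10:
        break
    x = x_new
print("SEGM approximate solution:", x)
```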

In order to address this limitation, Censor et al. (2012b) also proposed the following method, called the two-subgradient extragradient method (TSEGM):

Algorithm 1.4 (Two-Subgradient Extragradient Method (TSEGM))

$$\begin{aligned} {\left\{ \begin{array}{ll} x_1\in H,\\ C_n:=\{x\in H: c(x_n)+\langle \zeta _n, x-x_n\rangle \le 0\},\\ y_n = P_{C_n}(x_n - \lambda Ax_n),\\ x_{n+1} = P_{C_n}(x_n - \lambda Ay_n), \end{array}\right. } \end{aligned}$$
(1.5)

where \(\zeta _n\in \partial c(x_n).\) Here, \(\partial c(x)\) denotes the sub-differential of the convex function \(c(\cdot )\) at x defined in (2.2).

The idea behind the TSEGM is that any nonempty, closed and convex set C can be expressed as

$$\begin{aligned} C=\{x\in H~|~c(x)\le 0\}, \end{aligned}$$
(1.6)

where \(c:H\rightarrow \mathbb {R}\) is a convex function. For instance, we can let \(c(x):=\text {dist}(x,C),\) where “\(\text {dist}\)” is the distance function. We observe that the two projections in the TSEGM (1.5) are made onto a half-space, which makes the method easier to implement. However, the convergence of the TSEGM (1.5) remained unclear and was therefore posed as an open question by Censor et al. (2012b).

At this point, we briefly discuss the inertial technique. The inertial algorithm is based on a discrete version of a second-order dissipative dynamical system, and was first proposed by Polyak (1964). The main feature of an inertial algorithm is that it uses the previous two iterates to generate the next iterate. It is worth mentioning that this small change can greatly improve the convergence rate of an iterative method. In recent years, many researchers have constructed very fast iterative methods by employing the inertial technique, see, e.g., Alakoya and Mewomo (2022), Alakoya et al. (2022), Gibali et al. (2020), Godwin et al. (2023), Wickramasinghe et al. (2023) and the references therein.

In 2019, Cao and Guo (2020) partially answered the open question posed by Censor et al. (2012b) by combining the inertial technique with the TSEGM. They obtained a weak convergence result for the proposed algorithm (Algorithm 7.1) under the assumptions that the cost operator A is monotone and Lipschitz continuous, the convex function \(c:H\rightarrow \mathbb {R}\) in (1.6) is continuously differentiable, and the Gâteaux differential \(c'(\cdot )\) is Lipschitz continuous.

We point out that none of the above methods is applicable when the Lipschitz constant of the cost operator is unknown, because their step sizes depend on prior knowledge of this constant. We also note that the inertial TSEGM (ITSEGM) proposed by Cao and Guo (2020) requires prior knowledge of the Lipschitz constant of the Gâteaux differential \(c'(\cdot )\) of \(c(\cdot ).\) In most cases, the Lipschitz constants of these operators are unknown or difficult to estimate. All of these drawbacks may hinder the implementation of these algorithms. Moreover, all the above methods only yield weak convergence results under these stringent conditions.

Bauschke and Combettes (2001) pointed out that, in solving optimization problems, strongly convergent iterative methods are more applicable, and hence more desirable, than their weakly convergent counterparts. Thus, it is important to develop algorithms that generate strongly convergent sequences when solving optimization problems.

Very recently, Ma and Wang (2022) tried to improve on the results of Cao and Guo (2020) by proposing a new TSEGM (Algorithm 7.2), which uses a self-adaptive step size so that the implementation of their algorithm does not require prior knowledge of the Lipschitz constant of the cost operator. However, we note that the implementation of their result still requires knowledge of the Lipschitz constant of the Gâteaux differential \(c'(\cdot )\) of \(c(\cdot ).\) Moreover, the authors were only able to obtain a weak convergence result for their proposed algorithm.

Considering the above review, it is pertinent to ask the following research questions:

Can we construct a new inertial two-subgradient extragradient method which is applicable when the Lipschitz constant of the cost operator A and/or the Lipschitz constant of the Gâteaux differential \(c'(\cdot )\) of \(c(\cdot )\) is unknown? Can we obtain a strong convergence result for this method?

In this paper, we provide affirmative answers to the above questions. More precisely, we introduce a new inertial two-subgradient extragradient method which requires knowledge of neither the Lipschitz constant of the cost operator nor the Lipschitz constant of the Gâteaux differential \(c'(\cdot )\) of \(c(\cdot ).\) This makes our results applicable to a larger class of problems. Moreover, we prove that the sequence generated by our proposed algorithm converges strongly to a minimum-norm solution of the VIP. In many practical problems, finding the minimum-norm solution is very important and useful. All of the above highlighted properties are some of the improvements of our proposed method over the results of Ma and Wang (2022), and Cao and Guo (2020). In addition, the proof of our strong convergence theorem does not rely on the usual “two cases approach” widely employed by authors to prove strong convergence results. We also point out that, unlike several of the existing results in the literature, our method does not involve any linesearch technique, which could be computationally expensive to implement (e.g., see Cai et al. 2022; Peeyada et al. 2020; Suantai et al. 2020), nor does it require evaluating inner product terms that, unlike the norm, are not easily computed (see Muangchoo et al. 2021, Corollary 4.4). Rather, we employ a simple but very efficient self-adaptive step size technique, which generates a non-monotonic sequence of step sizes with reduced dependence on the initial step size. This makes our method more efficient and less expensive to implement. Moreover, we present several numerical experiments to demonstrate the computational advantage of our proposed method over existing methods in the literature. Finally, we apply our result to an image restoration problem. The results of the numerical experiments show that our method is more efficient than several of the existing methods in the literature. Clearly, our proposed method is computationally viable, and our results improve and generalize several of the existing results in the literature in this direction.

The rest of the paper is organized as follows: In Section 2, we recall some definitions and lemmas employed in the paper. In Section 3, we present our proposed algorithm and highlight some of its features. Convergence analysis of the proposed method is discussed in Section 4. In Section 5, we present some numerical experiments and apply our result to an image restoration problem. Finally, in Section 6 we give some concluding remarks.

2 Preliminaries

In what follows, we assume that C is a nonempty, closed and convex subset of a real Hilbert space H. We denote the weak and strong convergence of a sequence \(\{x_n\}\) to a point \(x \in H\) by \(x_n \rightharpoonup x\) and \(x_n \rightarrow x\), respectively, and \(w_\omega (x_n)\) denotes the set of weak limits of \(\{x_n\},\) that is,

$$\begin{aligned} w_\omega (x_n):= \{x\in H: x_{n_j}\rightharpoonup x~ \text {for some subsequence}~ \{x_{n_j}\}~ \text {of} ~\{x_{n}\}\}. \end{aligned}$$

For a nonempty, closed and convex subset C of a real Hilbert space H, the metric projection \(P_C: H\rightarrow C\) (Taiwo et al. 2021) is defined, for each \(x\in H,\) as the unique element \(P_Cx\in C\) such that

$$\begin{aligned} ||x - P_Cx|| = \inf \{||x-z||: z\in C\}. \end{aligned}$$

It is known that \(P_C\) is nonexpansive and has the following properties (Alakoya and Mewomo 2022; Uzor et al. 2022):

  1. 1.

    \(||P_Cx - P_Cy||^2 \le \langle P_Cx - P_Cy, x -y\rangle \;\;\; \text {for all}\,\, x, y\in C;\)

  2. 2.

    for any \(x\in H\) and \(z\in C, z = P_Cx\) if and only if

    $$\begin{aligned} \langle x - z, z - y\rangle \ge 0\;\;\; \text {for all}\,\, y\in C; \end{aligned}$$
    (2.1)
  3. 3.

    for any \(x\in H\) and \(y\in C,\)

    $$\begin{aligned} ||P_Cx - y||^2 + ||x - P_Cx||^2 \le ||x - y||^2; \end{aligned}$$
  4. 4.

    for any \(x,y\in H\) with \(y\ne 0,\) let \(Q = \{z\in H: \langle y, z-x \rangle \le 0\}.\) Then, for all \(u\in H,\) \(P_Q(u)\) is given by

    $$\begin{aligned} P_Q(u) = u - \max \Big \{0, \frac{\langle y, u-x \rangle }{||y||^2}\Big \}y, \end{aligned}$$

    which gives an explicit formula for calculating the projection of any given point onto a half-space.
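Property 4 translates directly into code. The following short Python sketch implements the formula, using arbitrary sample vectors (assumed for illustration) and checking that the projected point indeed lies in the half-space.

```python
import numpy as np

def proj_halfspace(u, y, x):
    """Projection of u onto Q = {z : <y, z - x> <= 0}, following property 4 (requires y != 0)."""
    coeff = max(0.0, (y @ (u - x)) / (y @ y))
    return u - coeff * y

# Arbitrary sample data for a sanity check.
y = np.array([1.0, -2.0, 0.5])
x = np.array([0.0, 1.0, 0.0])
u = np.array([3.0, 2.0, -1.0])

p = proj_halfspace(u, y, x)
print("P_Q(u) =", p)
print("<y, P_Q(u) - x> =", y @ (p - x))   # nonpositive, so P_Q(u) lies in Q
```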

Lemma 2.1

Let H be a real Hilbert space. Then the following results hold for all \(x,y\in H\) and \(\delta \in \mathbb {R}:\)

  1. (i)

    \(||x + y||^2 \le ||x||^2 + 2\langle y, x + y \rangle ;\)

  2. (ii)

    \(||x + y||^2 = ||x||^2 + 2\langle x, y \rangle + ||y||^2;\)

  3. (iii)

    \(||\delta x + (1-\delta ) y||^2 = \delta ||x||^2 + (1-\delta )||y||^2 -\delta (1-\delta )||x-y||^2.\)

Definition 2.2

An operator \(A:H\rightarrow H\) is said to be

  1. (i)

    \(\alpha\)-strongly monotone, if there exists \(\alpha >0\) such that

    $$\begin{aligned} \langle x-y, Ax-Ay\rangle \ge \alpha \Vert x-y\Vert ^2,~~ \forall ~x,y \in H; \end{aligned}$$
  2. (ii)

    \(\alpha\)-inverse strongly monotone (\(\alpha\)-cocoercive), if there exists a positive real number \(\alpha\) such that

    $$\begin{aligned} \langle x-y, Ax-Ay \rangle \ge \alpha ||Ax-Ay||^2,\quad \forall ~ x,y\in H; \end{aligned}$$
  3. (iii)

    monotone, if

    $$\begin{aligned} \langle x-y, Ax-Ay \rangle \ge 0,\quad \forall ~ x,y\in H; \end{aligned}$$
  4. (iv)

    L-Lipschitz continuous, if there exists a constant \(L>0\) such that

$$\begin{aligned} ||Ax-Ay||\le L||x-y|| ,\quad \forall ~ x,y\in H. \end{aligned}$$

It is known that if A is \(\alpha\)-strongly monotone and L-Lipschitz continuous, then A is \(\frac{\alpha }{L^2}\)-inverse strongly monotone. Furthermore, \(\alpha\)-inverse strongly monotone operators are \(\frac{1}{\alpha }\)-Lipschitz continuous and monotone but the converse is not true.

Definition 2.3

Bauschke and Combettes (2017) A function \(c:H\rightarrow \mathbb {R}\) is said to be Gâteaux differentiable at \(x\in H,\) if there exists an element, denoted by \(c^{\prime }(x) \in H,\) such that

$$\begin{aligned} \lim _{h\rightarrow 0}\frac{c(x+hv)-c(x)}{h}=\langle v, c^{\prime }(x)\rangle ,\quad \forall ~ v\in H, \end{aligned}$$

where \(c^{\prime }(x)\) is called the Gâteaux differential of c at x. If c is Gâteaux differentiable at each \(x\in H\), then c is said to be Gâteaux differentiable on H.

Definition 2.4

Bauschke and Combettes (2017) A convex function \(c:H\rightarrow \mathbb {R}\) is said to be subdifferentiable at a point \(x\in H\) if the set

$$\begin{aligned} \partial {c}(x)=\{\zeta \in H~|~c(y)\ge c(x)+\langle \zeta , y-x\rangle ,\quad ~~\forall y\in H\} \end{aligned}$$
(2.2)

is nonempty. Each element in \(\partial {c}(x)\) is called a subgradient of c at x. We note that if c is subdifferentiable at each \(x\in H\), then c is subdifferentiable on H. It is also known that if c is Gâteaux differentiable at x, then c is subdifferentiable at x and \(\partial {c}(x)=\{c^{\prime }(x)\}\).

Definition 2.5

Let H be a real Hilbert space. A function \(c:H\rightarrow \mathbb {R}\cup \{+\infty \}\) is said to be weakly lower semi-continuous (w-lsc) at \(x\in H,\) if

$$\begin{aligned} c(x)\le \liminf _{n\rightarrow \infty } c(x_n) \end{aligned}$$

holds for every sequence \(\{x_n\}\) in H satisfying \(x_n\rightharpoonup x.\)

Lemma 2.6

Bauschke and Combettes (2017) Let \(c:H\rightarrow \mathbb {R}\cup \{+\infty \}\) be convex. Then the following are equivalent:

  1. (i)

c is weakly sequentially lower semi-continuous;

  2. (ii)

    c is lower semi-continuous.

Lemma 2.7

He and Xu (2013) Assume that the solution set \(\text{ VI }(C,A)\) of the VIP (1.1) is nonempty, and C is defined as \(C:=\{x\in H~|~c(x)\le 0\}\), where \(c:H\rightarrow \mathbb {R}\) is a continuously differentiable convex function. Given \(p\in C\), we have \(p\in \text{ VI }(C,A)\) if and only if either

  1. 1.

    \(Ap=0,\) or

  2. 2.

    \(p\in \partial {C}\) and there exists \(\eta _p>0\) such that \(Ap=-\eta _p c^{\prime }(p),\) where \(\partial {C}\) denotes the boundary of C.

Lemma 2.8

Dong et al. (2018) Let C be a nonempty, closed and convex subset of H, let \(A:C \rightarrow H\) be a continuous, monotone mapping and let \(z\in C\). Then

$$\begin{aligned} z \in VI(C,A) \iff \langle Ax, x-z \rangle \ge 0,\quad \forall ~x\in C. \end{aligned}$$

Lemma 2.9

Tan and Xu (1993) Suppose \(\{\lambda _n\}\) and \(\{\phi _n\}\) are two nonnegative real sequences such that

$$\begin{aligned} \lambda _{n+1}\le \lambda _n + \phi _n,\quad \forall n\ge 1. \end{aligned}$$

If \(\sum _{n=1}^{\infty }\phi _n<+\infty ,\) then \(\lim \limits _{n\rightarrow \infty }\lambda _n\) exists.

Lemma 2.10

Saejung and Yotkaew (2012) Let \(\{a_n\}\) be a sequence of nonnegative real numbers, \(\{\alpha _{n}\}\) be a sequence in (0, 1) with the condition: \(\sum _{n=1}^{\infty }\alpha _{n}=\infty\) and \(\{b_n\}\) be a sequence of real numbers. Assume that

$$\begin{aligned} a_{n+1}\le (1-\alpha _{n})a_n + \alpha _{n}b_n,~~\forall n\ge 1. \end{aligned}$$

If \(\limsup _{k\rightarrow \infty }b_{n_k}\le 0\) for every subsequence \(\{a_{n_k}\}\) of \(\{a_n\}\) satisfying the condition:

$$\begin{aligned} \liminf _{k\rightarrow \infty }(a_{n_k+1}-a_{n_k})\ge 0, \end{aligned}$$

then \(\lim _{n\rightarrow \infty }a_n=0.\)

3 Proposed Method

In this section, we present our proposed method and discuss some of its important features. We begin with the following assumptions under which our strong convergence result is obtained.

Assumption 3.1

Suppose that the following conditions hold:

  1. 1.

    The set C is defined by

    $$\begin{aligned} C=\{x\in H~|~c(x)\le 0\}; \end{aligned}$$
    (3.1)

    where \(c:H\rightarrow \mathbb {R}\) is a continuously differentiable convex function such that \(c^{\prime }(\cdot )\) is \(L_1\)-Lipschitz continuous (however, prior knowledge of the Lipschitz constant is not required).

  2. 2.
    1. (a)

      \(A:H \rightarrow H\) is monotone and \(L_2\)-Lipschitz continuous (however, prior knowledge of the Lipschitz constant is not needed).

    2. (b)

      There exists \(K>0\) such that \(\Vert Ax\Vert \le K\Vert c'(x)\Vert\) for all \(x\in \partial C.\)

    3. (c)

The solution set VI(C, A) is nonempty.

  3. 3.

    \(\{\alpha _n\}^{\infty }_{n=1},~~\{\beta _n\}^{\infty }_{n=1}\) and \(\{\xi _n\}^{\infty }_{n=1}\) are non-negative sequences satisfying the following conditions:

    1. (a)

      \(\alpha _n \in (0,1), ~~\) \(\lim \limits _{n\rightarrow \infty } \alpha _n=0,~~\sum _{n=1}^{\infty }\alpha _n= \infty , \lim \limits _{n\rightarrow \infty }\dfrac{\xi _n}{\alpha _n}=0, \theta>0,\lambda _1>0.\)

    2. (b)

      \(\{\beta _n\}\subset [a,b]\subset (0,1-\alpha _n),\delta \in \Big (0,-K+\sqrt{1+K^2}\Big ).\)

    3. (c)

      Let \(\{\phi _n\}\) be a nonnegative sequence such that \(\sum _{n=1}^\infty \phi _n<+\infty .\)

Algorithm 3.2

Step 0.:

Select two arbitrary initial points \(x_0, x_1\in H\) and set \(n=1.\)

Step 1.:

Given the \((n-1)th\) and nth iterates, choose \(\theta _n\) such that \(0\le \theta _n\le \hat{\theta }_n\) with \(\hat{\theta }_n\) defined by

$$\begin{aligned} \hat{\theta }_n = {\left\{ \begin{array}{ll} \min \Big \{\theta ,~ \frac{\xi _n}{\Vert x_n - x_{n-1}\Vert }\Big \}, \quad \text {if}~ x_n \ne x_{n-1},\\ \theta , \hspace{95pt} \text {otherwise.} \end{array}\right. } \end{aligned}$$
(3.2)
Step 2.:

Compute

$$\begin{aligned} w_n=x_n + \theta _n(x_n-x_{n-1}), \end{aligned}$$
Step 3.:

Construct the half-space

$$\begin{aligned} C_n=\{x\in H: c(w_n)+\langle c'(w_n), x-w_n\rangle \le 0\}, \end{aligned}$$

and compute

$$\begin{aligned} y_n= P_{C_n}(w_n-\lambda _nAw_n) \end{aligned}$$
$$\begin{aligned} z_n= P_{C_n}(w_n-\lambda _nAy_n) \end{aligned}$$

If \(c(y_n)\le 0\) and either \(w_n-y_n=0\) or \(Ay_n=0,\) then stop and \(y_n\) is a solution of the VIP. Otherwise, go to Step 4.

Step 4.:

Compute

$$\begin{aligned} x_{n+1}= (1-\alpha _n-\beta _n)w_n+\beta _nz_n. \end{aligned}$$
Step 5.:

Compute

$$\begin{aligned} \lambda _{n+1}={\left\{ \begin{array}{ll} \min \left\{ \frac{\delta \Vert w_n-y_n\Vert }{\Vert Aw_n-Ay_n\Vert +\Vert c'(w_n)-c'(y_n)\Vert },~\lambda _n+\phi _n\right\} ,&{} \text{ if }~\Vert Aw_n-Ay_n\Vert +\Vert c'(w_n)-c'(y_n)\Vert \ne 0,\\ \lambda _n+\phi _n,&{} \text{ otherwise }. \end{array}\right. } \end{aligned}$$
(3.3)

Set \(n=n+1\) and go back to Step 1.
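To make the steps above concrete, here is a minimal Python sketch of Algorithm 3.2 on a toy problem in \(\mathbb {R}^5\) patterned after Example 5.2 (the operator \(A(x)=2x\), the constraint \(c(x)=\Vert x\Vert ^2-1\) and all parameter choices below are illustrative assumptions); the stopping cap on the number of iterations is not part of the algorithm.

```python
import numpy as np

# Toy data (assumed): C = {x : c(x) = ||x||^2 - 1 <= 0}, c'(x) = 2x, A(x) = 2x, K = 1.
m = 5
A = lambda x: 2.0 * x
c = lambda x: x @ x - 1.0
dc = lambda x: 2.0 * x                       # Gateaux differential c'(x)

theta, lam, delta = 0.87, 0.93, 0.4          # theta, lambda_1, delta in (0, -K + sqrt(1 + K**2))
alpha = lambda n: 2.0 / (3 * n + 2)
beta = lambda n: (1.0 - alpha(n)) / 2.0
xi = lambda n: alpha(n) ** 2
phi = lambda n: 20.0 / (2 * n + 5) ** 2

def proj_Cn(u, w):
    """Projection onto the half-space C_n = {x : c(w) + <c'(w), x - w> <= 0}."""
    g = dc(w)
    val = c(w) + g @ (u - w)
    if val <= 0 or g @ g == 0:
        return u
    return u - (val / (g @ g)) * g

rng = np.random.default_rng(0)
x_prev, x = rng.standard_normal(m), rng.standard_normal(m)

for n in range(1, 2001):
    # Steps 1-2: inertial extrapolation
    diff = np.linalg.norm(x - x_prev)
    theta_n = theta if diff == 0 else min(theta, xi(n) / diff)
    w = x + theta_n * (x - x_prev)
    # Step 3: two projections onto the half-space C_n
    y = proj_Cn(w - lam * A(w), w)
    z = proj_Cn(w - lam * A(y), w)
    # Step 4: convex combination (the remaining weight alpha_n implicitly multiplies 0)
    x_prev, x = x, (1 - alpha(n) - beta(n)) * w + beta(n) * z
    # Step 5: self-adaptive step size (3.3)
    denom = np.linalg.norm(A(w) - A(y)) + np.linalg.norm(dc(w) - dc(y))
    lam = lam + phi(n) if denom == 0 else min(delta * np.linalg.norm(w - y) / denom, lam + phi(n))

print("approximate minimum-norm solution:", x)   # expected to approach the zero vector
```

In this toy instance the unique solution of the VIP is the origin, which is also the minimum-norm solution, so the printed iterate should be close to the zero vector.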

Remark 3.3

  • Observe that, unlike the results of Cao and Guo (2020) and Ma and Wang (2022), knowledge of the Lipschitz constant of the cost operator A and knowledge of the Lipschitz constant of the Gâteaux differential \(c'(\cdot )\) of \(c(\cdot )\) are not required to implement our algorithm.

  • Moreover, we point out that our algorithm does not require any linesearch technique; rather, we employ a more efficient step size rule in (3.3), which generates a non-monotonic sequence of step sizes. The step size is constructed so as to reduce the dependence of the algorithm on the initial step size \(\lambda _1.\)

  • We also remark that our proposed algorithm generates a sequence that converges strongly to a minimum-norm solution of the VIP.

Remark 3.4

  1. (i)

    Observe that by the definition of C in (3.1) and the construction of \(C_n,\) we have \(C\subset C_n.\)

  2. (ii)

    By Assumption 3.1 3(a), it can easily be verified from (3.2) that

    $$\begin{aligned} \lim _{n\rightarrow \infty }\theta _n||x_n - x_{n-1}|| = 0 ~~ \text {and}~~ \lim _{n\rightarrow \infty }\frac{\theta _n}{\alpha _n}||x_n - x_{n-1}|| = 0. \end{aligned}$$

Remark 3.5

Observe that by (3.1) and Lemma 4.5 together with the formulation of the variational inequality problem, it is clear that if \(c(y_n)\le 0\) and either \(w_n-y_n=0\) or \(Ay_n=0,\) then \(y_n\) is a solution of the VIP.

4 Convergence Analysis

First, we establish some lemmas which will be needed to prove our strong convergence theorem for the proposed algorithm.

Lemma 4.1

Let \(\{\lambda _n\}\) be a sequence generated by Algorithm 3.2. Then, we have \(\lim \limits _{n\rightarrow \infty }\lambda _n=\lambda ,\) where \(\lambda \in \Big [\min \{\frac{\delta }{L_2+L_1},\lambda _1\},\lambda _1+\Phi \Big ]\) and \(\Phi =\sum \limits _{n=1}^{\infty }\phi _n.\)

Proof

Since A is \(L_2\)-Lipschitz continuous and \(c'(\cdot )\) is \(L_1\)-Lipschitz continuous, then for the case \(\Vert Aw_n-Ay_n\Vert +\Vert c'(w_n)-c'(y_n)\Vert \ne 0\) for all \(n\ge 1\) we have

$$\begin{aligned} \begin{aligned} \frac{\delta \Vert w_n-y_n\Vert }{\Vert Aw_n-Ay_n\Vert +\Vert c'(w_n)-c'(y_n)\Vert }&\ge \frac{\delta \Vert w_n-y_n\Vert }{L_2\Vert w_n-y_n\Vert + L_1\Vert w_n-y_n\Vert }\\&=\frac{\delta }{L_2+L_1}. \end{aligned} \end{aligned}$$

Thus, by the definition of \(\lambda _{n+1},\) the sequence \(\{\lambda _n\}\) has lower bound \(\min \{\frac{\delta }{L_2+L_1},\lambda _1\}\) and upper bound \(\lambda _1 + \Phi .\) By Lemma 2.9, \(\lim \limits _{n\rightarrow \infty }\lambda _n\) exists, and we denote \(\lambda =\lim \limits _{n\rightarrow \infty }\lambda _n.\) It is clear that \(\lambda \in \big [\min \{\frac{\delta }{L_2+L_1},\lambda _1\},\lambda _1+\Phi \big ]\).\(\square\)

Lemma 4.2

Let \(\{x_n\}\) be a sequence generated by Algorithm 3.2 under Assumption 3.1. Then, the following inequality holds for all \(p\in VI(C,A):\)

$$\begin{aligned} \Vert z_n-p\Vert ^2\le \Vert w_n-p\Vert ^2-\Big [1-\delta ^2\frac{\lambda _n^2}{\lambda _{n+1}^2}- 2\delta K\frac{\lambda _n}{\lambda _{n+1}}\Big ]\Vert w_n-y_n\Vert ^2. \end{aligned}$$
(4.1)

Proof

From (3.3), we have

$$\begin{aligned} \lambda _{n+1}&=\min \Big \{\frac{\delta \Vert w_n-y_n\Vert }{\Vert Aw_n-Ay_n\Vert +\Vert c'(w_n)-c'(y_n)\Vert },\lambda _n+\phi _n\Big \} \\& \le \frac{\delta \Vert w_n-y_n\Vert }{\Vert Aw_n-Ay_n\Vert +\Vert c'(w_n)-c'(y_n)\Vert }, \end{aligned}$$

which implies that

$$\begin{aligned} \Vert Aw_n-Ay_n\Vert +\Vert c'(w_n)-c'(y_n)\Vert \le \frac{\delta }{\lambda _{n+1}}\Vert w_n-y_n\Vert ,\quad \forall n\ge 1. \end{aligned}$$
(4.2)

Let \(p\in VI(C,A).\) For convenience, we set \(v_n=w_n-\lambda _nAy_n\) and we have \(z_n=P_{C_n}(v_n).\) Since \(p\in C\subset C_n,\) then by applying (2.1) we obtain

$$\begin{aligned} \Vert z_n-p\Vert ^2 = \Vert P_{C_n}(v_n)-p\Vert ^2\le \Vert v_n-p\Vert ^2 - \Vert v_n-P_{C_n}(v_n)\Vert ^2. \end{aligned}$$
(4.3)

Observe that

$$\begin{aligned} & \Vert v_n-p\Vert ^2 - \Vert v_n-P_{C_n}(v_n)\Vert ^2\\& \quad =\Vert (w_n-p)-\lambda _nAy_n\Vert ^2-\Vert (w_n-z_n)-\lambda _nAy_n\Vert ^2\\& \quad = \Vert w_n-p\Vert ^2-2\lambda _n\langle w_n-p, Ay_n \rangle \\&\qquad -\, \Vert w_n-z_n\Vert ^2 + 2\lambda _n\langle w_n-z_n,Ay_n \rangle \\& \quad = \Vert w_n-p\Vert ^2- \Vert w_n-z_n\Vert ^2 + 2\lambda _n\langle p-z_n,Ay_n \rangle \\& \quad = \Vert w_n-p\Vert ^2- \big [\Vert w_n-y_n\Vert ^2+ \Vert y_n-z_n\Vert ^2 \\&\qquad +\, 2\langle w_n-y_n, y_n-z_n\rangle \big ] + 2\lambda _n\langle p-y_n,Ay_n \rangle \\&\qquad +\, 2\lambda _n\langle y_n-z_n,Ay_n \rangle \\& \quad = \Vert w_n-p\Vert ^2- \Vert w_n-y_n\Vert ^2- \Vert y_n-z_n\Vert ^2 \\&\qquad +\, 2\lambda _n\langle p-y_n,Ay_n \rangle + 2\langle z_n-y_n, w_n-\lambda _nAy_n-y_n\rangle \\& \quad = \Vert w_n-p\Vert ^2- \Vert w_n-y_n\Vert ^2- \Vert y_n-z_n\Vert ^2 \\&\qquad +\, 2\lambda _n\langle p-y_n,Ay_n \rangle + 2\langle z_n-y_n, w_n-\lambda _nAw_n-y_n\rangle \\&\qquad +\, 2\lambda _n \langle z_n-y_n, Aw_n-Ay_n\rangle . \end{aligned}$$
(4.4)

Since \(y_n=P_{C_n}(w_n-\lambda _nAw_n)\) and \(z_n\in C_n,\) it follows from (2.1) that

$$\begin{aligned} \langle w_n-\lambda _nAw_n-y_n,z_n-y_n \rangle \le 0. \end{aligned}$$
(4.5)

Applying Young’s inequality and (4.2), we get

$$\begin{aligned} \begin{aligned} 2\lambda _n\langle z_n-y_n, Aw_n-Ay_n \rangle&\le \Vert z_n-y_n\Vert ^2 + \lambda _n^2\Vert Aw_n-Ay_n\Vert ^2\\&\le \Vert z_n-y_n\Vert ^2 + \delta ^2\frac{\lambda _n^2}{\lambda _{n+1}^2} \Vert w_n-y_n\Vert ^2. \end{aligned} \end{aligned}$$
(4.6)

Furthermore, by the monotonicity of A,  we get

$$\begin{aligned} \langle p-y_n,Ay_n \rangle \le \langle p-y_n,Ap \rangle \end{aligned}$$
(4.7)

Now, applying (4.4)-(4.7) in (4.3) we get

$$\begin{aligned} \Vert z_n-p\Vert ^2\le \Vert w_n-p\Vert ^2 -\Big (1-\delta ^2\frac{\lambda _n^2}{\lambda _{n+1}^2}\Big )\Vert w_n-y_n\Vert ^2 + 2\lambda _n\langle p-y_n,Ap \rangle . \end{aligned}$$
(4.8)

Now, we consider the following two cases:

Case 1: \(Ap=0.\) In this case, the desired inequality (4.1) follows immediately from (4.8).

Case 2: \(Ap\ne 0.\) By Lemma 2.7, \(p\in \partial C\) and there exists \(\eta _p >0\) such that \(Ap=-\eta _p c'(p).\) Since \(p\in \partial C,\) we have \(c(p)=0.\) By the sub-differential inequality (2.2), we have

$$\begin{aligned} \begin{aligned} c(y_n)&\ge c(p)+\langle c'(p), y_n-p\rangle \\&=\frac{-1}{\eta _p}\langle Ap, y_n-p\rangle . \end{aligned} \end{aligned}$$

From the last inequality we obtain

$$\begin{aligned} \langle p-y_n, Ap\rangle \le \eta _p c(y_n). \end{aligned}$$
(4.9)

Since \(y_n\in C_n,\) we obtain

$$\begin{aligned} c(w_n)+\langle c'(w_n), y_n-w_n\rangle \le 0. \end{aligned}$$
(4.10)

Again, by the sub-differential inequality (2.2), we get

$$\begin{aligned} c(y_n)+\langle c'(y_n), w_n-y_n\rangle \le c(w_n). \end{aligned}$$
(4.11)

Adding (4.10) and (4.11), we have

$$\begin{aligned} c(y_n)\le \langle c'(y_n)-c'(w_n), y_n-w_n\rangle . \end{aligned}$$
(4.12)

From (4.9) and (4.12), we get

$$\begin{aligned} \langle p-y_n, Ap\rangle \le \eta _p\langle c'(y_n)-c'(w_n), y_n-w_n\rangle . \end{aligned}$$
(4.13)

Observe that by Assumption 3.1 2(b)

$$\begin{aligned} \eta _p\le K. \end{aligned}$$
(4.14)

Hence, we have

$$\begin{aligned} & 2\lambda _n\eta _p\langle c'(y_n)-c'(w_n), y_n-w_n\rangle \\&\quad \le 2\lambda _n\eta _p\Vert c'(y_n)-c'(w_n)\Vert \Vert y_n-w_n\Vert \\&\quad \le 2\lambda _nK\Vert c'(y_n)-c'(w_n)\Vert \Vert y_n-w_n\Vert . \end{aligned}$$
(4.15)

Applying (4.2), (4.13) and (4.15) in (4.8), we obtain

$$\begin{aligned} \Vert z_n-p\Vert ^2&\le \Vert w_n-p\Vert ^2 -\Big (1-\delta ^2\frac{\lambda _n^2}{\lambda _{n+1}^2}\Big )\Vert w_n-y_n\Vert ^2 \\&\quad\; +\, 2\lambda _nK\Vert c'(y_n)-c'(w_n)\Vert \Vert y_n-w_n\Vert \\&\le \Vert w_n-p\Vert ^2 -\Big (1-\delta ^2\frac{\lambda _n^2}{\lambda _{n+1}^2}\Big )\Vert w_n-y_n\Vert ^2 \\&\quad\; +\, 2\delta K\frac{\lambda _n}{\lambda _{n+1}}\Vert w_n-y_n\Vert ^2\\&= \Vert w_n-p\Vert ^2 -\Big [1-\delta ^2\frac{\lambda _n^2}{\lambda _{n+1}^2}- 2\delta K\frac{\lambda _n}{\lambda _{n+1}}\Big ]\\&\quad\,\,\Vert w_n-y_n\Vert ^2, \end{aligned}$$
(4.16)

which is the required inequality (4.1).\(\square\)

Since the limit of \(\{\lambda _n\}\) exists, \(\lim \limits _{n\rightarrow \infty }\lambda _n=\lim \limits _{n\rightarrow \infty }\lambda _{n+1}.\) Moreover, since \(\delta \in \big (0,-K+\sqrt{1+K^2}\big )\) by Assumption 3.1 3(b), we have \((\delta +K)^2<1+K^2,\) that is, \(1-\delta ^2-2\delta K>0.\) Hence, by the conditions on the control parameters we have

$$\begin{aligned} \lim \limits _{n\rightarrow \infty }\Big [1-\delta ^2\frac{\lambda _n^2}{\lambda _{n+1}^2}- 2\delta K\frac{\lambda _n}{\lambda _{n+1}}\Big ] = \big [1-\delta ^2-2\delta K \big ]>0. \end{aligned}$$
(4.17)

Therefore, there exists \(n_0\ge 1\) such that for all \(n\ge n_0\) we have

$$\begin{aligned} 1-\delta ^2\frac{\lambda _n^2}{\lambda _{n+1}^2}- 2\delta K\frac{\lambda _n}{\lambda _{n+1}}>0. \end{aligned}$$

Thus, from (4.1) we have that for all \(n\ge n_0,\)

$$\begin{aligned} \Vert z_n-p\Vert \le \Vert w_n-p\Vert . \end{aligned}$$
(4.18)

Lemma 4.3

Let \(\{x_n\}\) be a sequence generated by Algorithm 3.2 under Assumption 3.1. Then, \(\{x_n\}\) is bounded.

Proof

Let \(p\in VI(C,A).\) Then, by the definition of \(w_n\) we have

$$\begin{aligned} \begin{aligned} \Vert w_n-p\Vert&=\Vert x_n+ \theta _n(x_n-x_{n-1})-p\Vert \\&\le \Vert x_n-p\Vert +\theta _n\Vert x_n-x_{n-1}\Vert \\&=\Vert x_n-p\Vert + \alpha _n\dfrac{\theta _n}{\alpha _n}\Vert x_n-x_{n-1}\Vert . \end{aligned} \end{aligned}$$
(4.19)

By Remark 3.4 (ii.), there exists \(M_1 > 0\) such that

$$\begin{aligned} \dfrac{\theta _n}{\alpha _n}\Vert x_n-x_{n-1}\Vert \le M_1, ~~~\forall ~ n\ge 1. \end{aligned}$$

Thus, it follows from (4.19) that

$$\begin{aligned} \Vert w_n-p\Vert \le \Vert x_n-p\Vert + \alpha _nM_1,~~\forall n\ge 1. \end{aligned}$$
(4.20)

By the definition of \(x_{n+1},\) we have

$$\begin{aligned} \begin{aligned} \Vert x_{n+1}-p\Vert&= \Vert (1-\alpha _n-\beta _n)(w_n-p) + \beta _n(z_n-p)-\alpha _np\Vert \\&\le \Vert (1-\alpha _n-\beta _n)(w_n-p) + \beta _n(z_n-p)\Vert + \alpha _n\Vert p\Vert . \end{aligned} \end{aligned}$$
(4.21)

Applying Lemma 2.1(ii) and using (4.18) we have

$$\begin{aligned} & \Vert (1-\alpha _n-\beta _n)(w_n-p)+ \beta _n(z_n-p)\Vert ^2\\& \quad = (1-\alpha _n-\beta _n)^2\Vert w_n-p\Vert ^2 \\&\qquad +\; 2(1-\alpha _n-\beta _n)\beta _n\langle w_n-p, z_n-p \rangle \\&\qquad +\, \beta _n^2\Vert z_n-p\Vert ^2\\& \quad \le (1-\alpha _n-\beta _n)^2\Vert w_n-p\Vert ^2 \\&\qquad +\, 2(1-\alpha _n-\beta _n)\beta _n\Vert w_n-p\Vert \Vert z_n-p\Vert \\&\qquad +\, \beta _n^2\Vert z_n-p\Vert ^2\\& \quad \le (1-\alpha _n-\beta _n)^2\Vert w_n-p\Vert ^2 \\&\qquad +\, (1-\alpha _n-\beta _n)\beta _n\big [\Vert w_n-p\Vert ^2 \\&\qquad +\, \Vert z_n-p\Vert ^2\big ] + \beta _n^2\Vert z_n-p\Vert ^2\\& \quad =(1-\alpha _n-\beta _n)(1-\alpha _n)\Vert w_n-p\Vert ^2 \\&\qquad +\, \beta _n(1-\alpha _n)\Vert z_n-p\Vert ^2\\& \quad \le (1-\alpha _n-\beta _n)(1-\alpha _n)\Vert w_n-p\Vert ^2 \\&\qquad +\, \beta _n(1-\alpha _n)\Vert w_n-p\Vert ^2\\& \quad =(1-\alpha _n)^2\Vert w_n-p\Vert ^2, \end{aligned}$$

which implies that

$$\begin{aligned} \Vert (1-\alpha _n-\beta _n)(w_n-p) + \beta _n(z_n-p)\Vert \le (1-\alpha _n)\Vert w_n-p\Vert . \end{aligned}$$
(4.22)

Now, applying (4.20) and (4.22) in (4.21), we have for all \(n\ge n_0\)

$$\begin{aligned} \begin{aligned} \Vert x_{n+1}-p\Vert&\le (1-\alpha _n)\Vert w_n-p\Vert + \alpha _n\Vert p\Vert \\&\le (1-\alpha _n)\big [\Vert x_n-p\Vert + \alpha _nM_1\big ] + \alpha _n\Vert p\Vert \\&\le (1-\alpha _n)\Vert x_n-p\Vert + \alpha _n\big [\Vert p\Vert + M_1 \big ]\\&\le \max \big \{\Vert x_n-p\Vert , \Vert p\Vert + M_1 \big \}\\&~~\vdots \\&\le \max \big \{\Vert x_{n_0}-p\Vert , \Vert p\Vert + M_1 \big \}. \end{aligned} \end{aligned}$$

Hence, \(\{x_n\}\) is bounded. Consequently, \(\{w_n\}, \{y_n\}\) and \(\{z_n\}\) are all bounded.\(\square\)

Lemma 4.4

Suppose \(\{x_n\}\) is a sequence generated by Algorithm 3.2 such that Assumption 3.1 holds. Then, the following inequality holds for all \(p\in VI(C,A):\)

$$\begin{aligned} \begin{aligned} \Vert x_{n+1}-p\Vert ^2&\le (1-\alpha _n)^2\Vert x_n-p\Vert ^2 + 3M_2\alpha _n(1-\alpha _n)^2\frac{\theta _n}{\alpha _n}\Vert x_n-x_{n-1}\Vert \\&-\beta _n(1-\alpha _n)\Big [1-\delta ^2\frac{\lambda _n^2}{\lambda _{n+1}^2}- 2\delta K\frac{\lambda _n}{\lambda _{n+1}}\Big ]\Vert w_n-y_n\Vert ^2 + 2\alpha _n\langle p,p-x_{n+1} \rangle . \end{aligned} \end{aligned}$$

Proof

Let \(p\in VI(C,A).\) Then, by applying Lemma 2.1(ii) together with the Cauchy–Schwarz inequality we have

$$\begin{aligned} \begin{aligned} \Vert w_n - p\Vert ^2&= \Vert x_n + \theta _n(x_n - x_{n-1}) - p\Vert ^2\\&= \Vert x_n - p\Vert ^2 + \theta _n^2\Vert x_n - x_{n-1}\Vert ^2 + 2\theta _n\langle x_n - p, x_n - x_{n-1} \rangle \\&\le \Vert x_n - p\Vert ^2 + \theta _n^2\Vert x_n - x_{n-1}\Vert ^2 + 2\theta _n\Vert x_n - x_{n-1}\Vert \Vert x_n - p\Vert \\&= \Vert x_n - p\Vert ^2 + \theta _n\Vert x_n - x_{n-1}\Vert (\theta _n\Vert x_n - x_{n-1}\Vert + 2\Vert x_n - p\Vert )\\&\le \Vert x_n - p\Vert ^2 + 3M_2\theta _n\Vert x_n - x_{n-1}\Vert \\&= \Vert x_n - p\Vert ^2 + 3M_2\alpha _n\frac{\theta _n}{\alpha _n}\Vert x_n - x_{n-1}\Vert , \end{aligned} \end{aligned}$$
(4.23)

where \(M_2:= \sup _{n\in \mathbb {N}}\{\Vert x_n - p\Vert , \theta _n\Vert x_n - x_{n-1}\Vert \}>0.\)

Next, by applying (4.2) together with the nonexpansiveness of \(P_{C_n}\) we have

$$\begin{aligned} \begin{aligned} \Vert z_n-w_n\Vert&\le \Vert z_n-y_n\Vert + \Vert w_n-y_n\Vert \\&= \Vert P_{C_n}(w_n-\lambda _nAy_n) - P_{C_n}(w_n-\lambda _nAw_n)\Vert + \Vert w_n-y_n\Vert \\&\le \lambda _n\Vert Aw_n-Ay_n\Vert + \Vert w_n-y_n\Vert \\&\le \delta \frac{\lambda _n}{\lambda _{n+1}}\Vert w_n-y_n\Vert + \Vert w_n-y_n\Vert \\&= \Big (1+\delta \frac{\lambda _n}{\lambda _{n+1}}\Big )\Vert w_n-y_n\Vert . \end{aligned} \end{aligned}$$
(4.24)

Using (4.23) and applying Lemmas 2.1 and 4.2, we have

$$\begin{aligned} \begin{aligned} \Vert x_{n+1}-p\Vert ^2&= \Vert (1-\alpha _n-\beta _n)(w_n-p) + \beta _n(z_n-p)-\alpha _np\Vert ^2\\&\le \Vert (1-\alpha _n-\beta _n)(w_n-p) + \beta _n(z_n-p)\Vert ^2 \\&\quad\; -\, 2\alpha _n\langle p,x_{n+1}-p \rangle \\&=(1-\alpha _n-\beta _n)^2\Vert w_n-p\Vert ^2 + \beta _n^2\Vert z_n-p\Vert ^2 \\&\quad\; +\, 2\beta _n(1-\alpha _n-\beta _n)\langle w_n-p,z_n-p \rangle \\&\quad\; +\, 2\alpha _n\langle p,p-x_{n+1} \rangle \\&\le (1-\alpha _n-\beta _n)^2\Vert w_n-p\Vert ^2 + \beta _n^2\Vert z_n-p\Vert ^2 \\&\quad\; +\, 2\beta _n(1-\alpha _n-\beta _n)\Vert w_n-p\Vert \Vert z_n-p\Vert \\&\quad\; +\, 2\alpha _n\langle p,p-x_{n+1} \rangle \\&\le (1-\alpha _n-\beta _n)^2\Vert w_n-p\Vert ^2 + \beta _n^2\Vert z_n-p\Vert ^2 \\&\quad\; +\, \beta _n(1-\alpha _n-\beta _n)\big [\Vert w_n-p\Vert ^2 +\Vert z_n-p\Vert ^2\big ] \\&\quad\; +\, 2\alpha _n\langle p,p-x_{n+1} \rangle \\&\le (1-\alpha _n-\beta _n)(1-\alpha _n)\Vert w_n-p\Vert ^2 \\&\quad\; +\, \beta _n(1-\alpha _n)\Vert z_n-p\Vert ^2 + 2\alpha _n\langle p,p-x_{n+1} \rangle \\&\le (1-\alpha _n-\beta _n)(1-\alpha _n)\Vert w_n-p\Vert ^2 \\&\quad\; +\, \beta _n(1-\alpha _n)\Vert w_n-p\Vert ^2\\&-\beta _n(1-\alpha _n) \Big [1-\delta ^2\frac{\lambda _n^2}{\lambda _{n+1}^2}- 2\delta K\frac{\lambda _n}{\lambda _{n+1}}\Big ]\\&\quad\;\Vert w_n-y_n\Vert ^2+ 2\alpha _n\langle p,p-x_{n+1} \rangle \\&=(1-\alpha _n)^2\Vert w_n-p\Vert ^2 \\&\quad\; -\,\beta _n(1-\alpha _n) \Big [1-\delta ^2\frac{\lambda _n^2}{\lambda _{n+1}^2}- 2\delta K\frac{\lambda _n}{\lambda _{n+1}}\Big ]\\&\quad\;\Vert w_n-y_n\Vert ^2 + 2\alpha _n\langle p,p-x_{n+1} \rangle \\&\le (1-\alpha _n)^2||x_n - p||^2 \\&\quad\; +\, 3M_2\alpha _n(1-\alpha _n)^2\frac{\theta _n}{\alpha _n}\Vert x_n - x_{n-1}\Vert \\&-\beta _n(1-\alpha _n) \Big [1-\delta ^2\frac{\lambda _n^2}{\lambda _{n+1}^2}- 2\delta K\frac{\lambda _n}{\lambda _{n+1}}\Big ]\Vert w_n-y_n\Vert ^2 \\&\quad\; +\, 2\alpha _n\langle p,p-x_{n+1} \rangle , \end{aligned} \end{aligned}$$

which is the required inequality.\(\square\)

Lemma 4.5

Let \(\{w_n\}\) and \(\{y_n\}\) be two sequences generated by Algorithm 3.2 under Assumption 3.1. If there exists a subsequence \(\{w_{n_k}\}\) of \(\{w_n\},\) which converges weakly to \(x^*\in H\) and \(\lim \limits _{k\rightarrow \infty }\Vert w_{n_k}-y_{n_k}\Vert =0,\) then \(x^*\in VI(C,A).\)

Proof

Suppose \(\{w_n\}\) and \(\{y_n\}\) are two sequences generated by Algorithm 3.2 with subsequences \(\{w_{n_k}\}\) and \(\{y_{n_k}\},\) respectively, such that \(w_{n_k}\rightharpoonup x^*.\) Then by the hypothesis of the lemma we have \(y_{n_k}\rightharpoonup x^*.\) Also, since \(y_{n_k} \in C_{n_k},\) then by the definition of \(C_n\) we have

$$\begin{aligned} c(w_{n_k}) + \langle c'(w_{n_k}), y_{n_k} - w_{n_k} \rangle \le 0. \end{aligned}$$

Applying the Cauchy–Schwarz inequality, we obtain from the last inequality

$$\begin{aligned} c(w_{n_k})\le \Vert c'(w_{n_k})\Vert \Vert w_{n_k} - y_{n_k}\Vert . \end{aligned}$$
(4.25)

Since \(c'(\cdot )\) is Lipschitz continuous and \(\{w_{n_k}\}\) is bounded, \(\{c'(w_{n_k})\}\) is bounded, that is, there exists a constant \(M>0\) such that \(\Vert c'(w_{n_k})\Vert \le M \; \text {for all} \; k\ge 0\). Then, from (4.25) we obtain

$$\begin{aligned} c(w_{n_k})\le M\Vert y_{n_k} - w_{n_k}\Vert . \end{aligned}$$
(4.26)

Since \(c(\cdot )\) is continuous, it is lower semi-continuous. Also, since \(c(\cdot )\) is convex, by Lemma 2.6 \(c(\cdot )\) is weakly lower semi-continuous. Hence, it follows from (4.26) and the weak lower semi-continuity of \(c(\cdot )\) that

$$\begin{aligned} c(x^*)\le \liminf _{k\rightarrow \infty }c(w_{n_k})\le \lim \limits _{k\rightarrow \infty } M\Vert y_{n_k} - w_{n_k}\Vert =0, \end{aligned}$$
(4.27)

which implies that \(x^* \in C.\) By property (2.1) of \(P_{C_n}\), we obtain

$$\begin{aligned} \langle y_{n_k} - w_{n_k} + \lambda _{n_k}Aw_{n_k},~ z - y_{n_k} \rangle \ge 0, \quad \forall ~ z\in C \subseteq C_{n_k}. \end{aligned}$$

Since A is monotone, we have

$$\begin{aligned} 0&\le \langle y_{n_k} - w_{n_k},~ z - y_{n_k} \rangle + \lambda _{n_k}\langle Aw_{n_k},~ z - y_{n_k} \rangle \\&= \langle y_{n_k} - w_{n_k},~ z - y_{n_k} \rangle + \lambda _{n_k}\langle Aw_{n_k},~ z - w_{n_k} \rangle + \lambda _{n_k}\langle Aw_{n_k},~ w_{n_k} - y_{n_k} \rangle \\&\le \langle y_{n_k} - w_{n_k},~ z - y_{n_k} \rangle + \lambda _{n_k}\langle Az,~ z - w_{n_k} \rangle + \lambda _{n_k}\langle Aw_{n_k},~ w_{n_k} - y_{n_k} \rangle . \end{aligned}$$

Letting \(k\rightarrow \infty\) in the last inequality, and applying \(\lim \limits _{k\rightarrow \infty }||y_{n_k} - w_{n_k}|| = 0\) and \(\lim \limits _{k\rightarrow \infty }\lambda _{n_k}=\lambda >0,\) we have

$$\begin{aligned} \langle Az,~ z - x^* \rangle \ge 0,~~ \forall ~ z\in C. \end{aligned}$$

Applying Lemma 2.8, we obtain \(x^*\in VI(C, A)\).\(\square\)

At this point, we state and prove the strong convergence theorem for our proposed algorithm.

Theorem 4.6

Let \(\{x_n\}\) be a sequence generated by Algorithm 3.2 under Assumption 3.1. Then \(\{x_n\}\) converges strongly to \(\hat{x}\in VI(C,A),\) where \(\Vert \hat{x}\Vert = \min \{\Vert p\Vert :p\in VI(C,A)\}.\)

Proof

Since \(\Vert \hat{x}\Vert = \min \{\Vert p\Vert :p\in VI(C,A)\},\) we have \(\hat{x}=P_{VI(C,A)}(0).\) From Lemma 4.4, we have

$$\begin{aligned} \begin{aligned} \Vert x_{n+1} - \hat{x}\Vert ^2&\le {(1-\alpha _n)\Vert x_n - \hat{x}\Vert ^2 + \alpha _n\Big [3M_2 (1-\alpha _n)^2\frac{\theta _n}{\alpha _n}\Vert x_n - x_{n-1}\Vert + 2\langle \hat{x},\hat{x}-x_{n+1}\rangle \Big ]} \\&= (1-\alpha _n)\Vert x_n - \hat{x}\Vert ^2 + \alpha _nb_n, \end{aligned} \end{aligned}$$
(4.28)

where \(b_n= {3M_2 (1-\alpha _n)^2\frac{\theta _n}{\alpha _n}\Vert x_n - x_{n-1}\Vert + 2\langle \hat{x},\hat{x}-x_{n+1}\rangle .}\) Now, we claim that the sequence \(\{\Vert x_n-\hat{x}\Vert \}\) converges to zero. To establish this, by Lemma 2.10 it suffices to show that \(\limsup \limits _{k\rightarrow \infty }b_{n_k}\le 0\) for every subsequence \(\{\Vert x_{n_k}-\hat{x}\Vert \}\) of \(\{\Vert x_n-\hat{x}\Vert \}\) satisfying

$$\begin{aligned} \liminf \limits _{k \rightarrow \infty }\left( \Vert x_{{n_k}+1}-\hat{x}\Vert -\Vert x_{n_k}-\hat{x}\Vert \right) \ge 0. \end{aligned}$$
(4.29)

Suppose that \(\{\Vert x_{n_k}-\hat{x}\Vert \}\) is a subsequence of \(\{\Vert x_n-\hat{x}\Vert \}\) such that (4.29) holds. Again, from Lemma 4.4 we obtain

$$\begin{aligned} \beta _{n_k}(1-\alpha _{n_k})\Big [1-&\delta ^2\frac{\lambda _{n_k}^2}{\lambda _{n_k+1}^2}- 2\delta K\frac{\lambda _{n_k}}{\lambda _{n_k+1}}\Big ]\Vert w_{n_k}-y_{n_k}\Vert ^2\\&\le (1-\alpha _{n_k})^2\Vert x_{n_k}-\hat{x}\Vert ^2 - \Vert x_{{n_k}+1}-\hat{x}\Vert ^2 + 3M_2\alpha _{n_k}(1-\alpha _{n_k})^2\frac{\theta _{n_k}}{\alpha _{n_k}}\Vert x_{n_k}-x_{{n_k}-1}\Vert \\&~~ {+ 2\alpha _{n_k}\langle \hat{x},\hat{x}-x_{n_k+1}\rangle .} \end{aligned}$$

Applying (4.29), Remark 3.4 and the fact that \(\lim \limits _{k\rightarrow \infty }\alpha _{n_k}=0,\) we get

$$\begin{aligned} \beta _{n_k}(1-\alpha _{n_k})\Big [1-\delta ^2\frac{\lambda _{n_k}^2}{\lambda _{n_k+1}^2}- 2\delta K\frac{\lambda _{n_k}}{\lambda _{n_k+1}}\Big ]\Vert w_{n_k}-y_{n_k}\Vert ^2\rightarrow 0,\quad k\rightarrow \infty . \end{aligned}$$

By the conditions on \(\alpha _{n_k}, \beta _{n_k}\) and (4.17), we have

$$\begin{aligned} \Vert w_{n_k}-y_{n_k}\Vert \rightarrow 0,\quad k\rightarrow \infty . \end{aligned}$$
(4.30)

Consequently, from (4.24) we get

$$\begin{aligned} \Vert z_{n_k}-w_{n_k}\Vert \rightarrow 0,\quad k\rightarrow \infty . \end{aligned}$$
(4.31)

By Remark 3.4 (ii.), we have

$$\begin{aligned} \Vert x_{n_k}-w_{n_k}\Vert =\theta _{n_k}\Vert x_{n_k}-x_{n_k-1}\Vert \rightarrow 0,\quad k\rightarrow \infty . \end{aligned}$$
(4.32)

Next, applying (4.31) and (4.32) we have

$$\begin{aligned} \Vert x_{n_k}-z_{n_k}\Vert \le \Vert x_{n_k}-w_{n_k}\Vert + \Vert w_{n_k}-z_{n_k}\Vert \rightarrow 0,\quad k\rightarrow \infty . \end{aligned}$$
(4.33)

Now, using (4.32), (4.33) and the fact that \(\lim \limits _{k\rightarrow \infty }\alpha _{n_k}=0\) we obtain

$$\begin{aligned} \begin{aligned} \Vert x_{{n_k}+1}-x_{n_k}\Vert&=\Vert (1-\alpha _{n_k}-\beta _{n_k})(w_{n_k}-x_{n_k}) + \beta _{n_k}(z_{n_k}-x_{n_k})-\alpha _{n_k}x_{n_k}\Vert \\&\le (1-\alpha _{n_k}-\beta _{n_k})\Vert w_{n_k}-x_{n_k}\Vert + \beta _{n_k}\Vert z_{n_k}-x_{n_k}\Vert + \alpha _{n_k}\Vert x_{n_k}\Vert \rightarrow 0,\quad k\rightarrow \infty . \end{aligned} \end{aligned}$$
(4.34)

Since \(\{x_n\}\) is bounded, then \(w_\omega (x_n)\) is nonempty. Let \(x^*\in w_\omega (x_n)\) be an arbitrary element. Then, there exists a subsequence \(\{x_{n_k}\}\) of \(\{x_n\}\) such that \(x_{n_k}\rightharpoonup x^*\) as \(k\rightarrow \infty .\) It follows from (4.32) that \(w_{n_k}\rightharpoonup x^*\) as \(k\rightarrow \infty .\) Moreover, by Lemma 4.5 and (4.30) we have \(x^*\in VI(C,A).\) Consequently, we have \(w_\omega (x_n)\subset VI(C,A).\)

Since \(\{x_{n_k}\}\) is bounded, there exists a subsequence \(\{x_{n_{k_j}}\}\) of \(\{x_{n_k}\}\) such that \(x_{n_{k_j}}\rightharpoonup q\) and

$$\begin{aligned} \limsup \limits _{k\rightarrow \infty }\langle \hat{x},\hat{x}-x_{n_k}\rangle =\lim \limits _{j\rightarrow \infty }\langle \hat{x},\hat{x}-x_{n_{k_j}}\rangle . \end{aligned}$$

Since \(\hat{x}=P_{VI(C,A)}(0)\) and \(q\in w_\omega (x_n)\subset VI(C,A),\) it follows from (2.1) that

$$\begin{aligned} \limsup \limits _{k\rightarrow \infty }\langle \hat{x},\hat{x}-x_{n_k}\rangle =\lim \limits _{j\rightarrow \infty }\langle \hat{x},\hat{x}-x_{n_{k_j}}\rangle =\langle \hat{x},\hat{x}-q\rangle \le 0. \end{aligned}$$

Hence, it follows from the last inequality and (4.34) that

$$\begin{aligned} \limsup \limits _{k\rightarrow \infty }\langle \hat{x},\hat{x}-x_{n_k+1}\rangle \le 0. \end{aligned}$$
(4.35)

Next, by Remark 3.4 (ii) and (4.35) we have \(\limsup \limits _{k\rightarrow \infty }b_{n_k}\le 0.\) Consequently, by invoking Lemma 2.10 it follows from (4.28) that \(\{\Vert x_n-\hat{x}\Vert \}\) converges to zero as required.\(\square\)

5 Numerical Examples

In this section, we present some numerical experiments to illustrate the performance of our method, Algorithm 3.2, in comparison with Algorithms 7.1, 7.2, 7.3, and 7.4. All numerical computations were carried out using MATLAB R2019b.

In our computations, we choose \(\alpha _n = \frac{2}{3n+2}\), \(\beta _n=\frac{1-\alpha _n}{2}\), \(\xi _n = (\frac{2}{3n+2})^2\), \(\phi _n=\frac{20}{(2n+5)^2}\), \(\theta =0.87\), \(\lambda _1=0.93\) in our Algorithm 3.2; we choose \(\tau =0.0018,\rho _n=\frac{n}{4n+1}\) in Algorithm 7.1, \(\lambda _{-1}=0.0018,\varphi =0.6,\mu =0.8\) in Algorithm 7.2, \(l= 0.018\) in Algorithm 7.3, and \(f(x)=\frac{1}{3}x\) in Algorithms 7.3 and 7.4.

Example 5.1

Let the feasible set \(C=\{x\in \mathbb {R}^2: c(x):=x_1^2+x_2-2\le 0\}\) and define the operator \(A:\mathbb {R}^2\rightarrow \mathbb {R}^2\) by \(A(x)=(6h(x_1), 4x_1+2x_2),\) where \(x=(x_1,x_2)\in \mathbb {R}^2\) and

$$\begin{aligned} h(s):={\left\{ \begin{array}{ll} e(s-1)+e,&{} \text {if~~} s>1,\\ e^s, &{} \text {if~~} -1\le s\le 1,\\ e^{-1}(s+1)+e^{-1}, &{} \text {if~~} s<-1. \end{array}\right. } \end{aligned}$$

Then, it can easily be verified that A is monotone and \(2\sqrt{9e^2+5}\)-Lipschitz continuous. Also, c is a continuously differentiable convex function and \(c'\) is 2-Lipschitz continuous. Moreover, we have that \(K=6\sqrt{e^2+1}\) (see He et al. 2018). Hence, we choose \(\delta =0.025.\)
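For reproducibility, the problem data of Example 5.1 can be transcribed into Python as follows (a direct transcription of the definitions above; the printed quantities are only a sanity check of the quoted constants and of the admissibility of \(\delta =0.025\)).

```python
import numpy as np

def h(s):
    """Piecewise function h from Example 5.1."""
    if s > 1:
        return np.e * (s - 1) + np.e
    if s >= -1:
        return np.exp(s)
    return (s + 1) / np.e + 1 / np.e

def A(x):
    """Cost operator A(x) = (6 h(x_1), 4 x_1 + 2 x_2)."""
    return np.array([6 * h(x[0]), 4 * x[0] + 2 * x[1]])

def c(x):
    """Constraint function defining C = {x : c(x) <= 0}."""
    return x[0] ** 2 + x[1] - 2

def dc(x):
    """Gradient c'(x) = (2 x_1, 1), which is 2-Lipschitz continuous."""
    return np.array([2 * x[0], 1.0])

L2 = 2 * np.sqrt(9 * np.e ** 2 + 5)      # Lipschitz constant of A quoted above
K = 6 * np.sqrt(np.e ** 2 + 1)           # constant K quoted above
delta = 0.025
print("L2 =", L2, " K =", K, " upper bound for delta =", -K + np.sqrt(1 + K ** 2))
```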

We test the algorithms for four different initial points as follows:

Case 1: \(x_0 = (0.5,1), x_1 = (1,0.7);\)

Case 2: \(x_0 = (1.3, 0.2), x_1 = (0.3, 1.5);\)

Case 3: \(x_0 = (0.7,0.9), x_1= (0.4,0.8);\)

Case 4: \(x_0 = (1.2, 0.3), x_1 = (0.9, 1.1).\)

The stopping criterion used for this example is \(\Vert x_{n+1}-x_{n}\Vert < 10^{-2}\). We plot the graphs of errors against the number of iterations in each case. The numerical results are reported in Figs. 1, 2, 3, and 4 and Table 1.

Table 1 Numerical Results for Example 5.1
Fig. 1 Example 5.1 Case 1

Fig. 2 Example 5.1 Case 2

Fig. 3 Example 5.1 Case 3

Fig. 4 Example 5.1 Case 4

Example 5.2

Let \(H=(\ell _2(\mathbb {R}), \Vert \cdot \Vert _2),\) where \(\ell _2(\mathbb {R}):=\{x=(x_1,x_2,\ldots ,x_j,\ldots ),~x_j\in \mathbb {R}:\sum _{j=1}^{\infty }|x_j|^2<+\infty \},\) \(||x||_2=(\sum _{j=1}^{\infty }|x_j|^2)^{\frac{1}{2}}\) and \(\langle x,y \rangle = \sum _{j=1}^\infty x_jy_j\) for all \(x, y\in \ell _2(\mathbb {R}).\) Let \(C=\{x \in H: c(x):=\Vert x\Vert ^2-1 \le 0\},\) and define the operator \(A:H\rightarrow H\) by \(A(x)=2x,~\forall x \in H.\) Then A is monotone and 2-Lipschitz continuous. Moreover, \(K=1\) and we choose \(\delta =0.4.\)

We choose different initial values as follows:

Case 1: \(x_0 = (2, 1, \frac{1}{2}, \cdots ),\) \(x_1 = (-3, 1, -\frac{1}{3},\cdots );\)

Case 2: \(x_0 = (-2, 1, -\frac{1}{2},\cdots ),\) \(x_1 = (-4, 1, -\frac{1}{4}, \cdots );\)

Case 3: \(x_0 = (2, 1, \frac{1}{2}, \cdots ),\) \(x_1 = (-5, 1, -\frac{1}{5}, \cdots );\)

Case 4: \(x_0 = (-2, 1, -\frac{1}{2},\cdots ),\) \(x_1 = (-3, 1, -\frac{1}{3}, \cdots ).\)

The stopping criterion used for this example is \(\Vert x_{n+1}-x_{n}\Vert < 10^{-2}\). We plot the graphs of errors against the number of iterations in each case. The numerical results are reported in Figs. 5, 6, 7, and 8 and Table 2.

Table 2 Numerical Results for Example 5.2
Fig. 5 Example 5.2 Case 1

Fig. 6 Example 5.2 Case 2

Fig. 7 Example 5.2 Case 3

Fig. 8 Example 5.2 Case 4

Example 5.3

(Application to Image Restoration Problem)

In this last example, we apply our result to image restoration problem. We compare the efficiency of our Algorithm 3.2 with Algorithms 7.1, 7.3, and 7.4.

We recall that the image restoration problem can be formulated as the following linear inverse problem:

$$\begin{aligned} v = Dx + e \end{aligned}$$
(5.1)

where \(x\in \mathbb {R}^{N}\) is the original image, \(D\in \mathbb {R}^{M\times N}\) is the blurring matrix, \(v\in \mathbb {R}^{M}\) is the observed blurred image, and e is the Gaussian noise. It is known that solving Problem (5.1) is equivalent to solving the convex minimization problem

$$\begin{aligned} \min _{x\in \mathbb {R}^N} ~\Big \{\frac{1}{2}||Dx-v||^2_2 +\lambda ||x||_1\Big \}, \end{aligned}$$
(5.2)

where \(\lambda >0\) is the regularization parameter, \(\Vert \cdot \Vert _2\) denotes the Euclidean norm and \(\Vert \cdot \Vert _1\) is the \(\ell _1\)-norm. Our task here is to restore the original image x given the data of the blurred image v. The minimization problem (5.2) can be expressed as a variational inequality problem by setting \(A(x):=D^T(Dx-v).\) It is known in this case that the operator A is monotone and \(\Vert D^TD\Vert\)-Lipschitz continuous. We consider the \(291 \times 240\) Pout, \(159 \times 191\) Cell, \(223 \times 298\) Shadow, and \(256 \times 256\) Cameraman images from the MATLAB Image Processing Toolbox. Moreover, we use a Gaussian blur of size \(7\times 7\) and standard deviation \(\sigma =4\) to create the blurred and noisy image (observed image), and we use the algorithms to recover the original image from the blurred image. Also, we measure the quality of the restored image using the signal-to-noise ratio defined by

$$\begin{aligned} \text{ SNR } = 20 \times \log _{10}\left( \frac{\Vert x\Vert _2}{\Vert x-x^*\Vert _2}\right) , \end{aligned}$$

where x is the original image and \(x^*\) is the restored image. Note that the larger the SNR, the better the quality of the restored image. We choose the initial values as \(x_0 = {\textbf {0}} \in \mathbb {R}^{N}\) and \(x_1 = {\textbf {1}} \in \mathbb {R}^{N}.\) The results are reported in Table 3, which shows the SNR values for each algorithm, and in Figs. 9, 10, 11, 12, 13, 14, 15, and 16, which show the original, blurred and restored images. The major advantage of our proposed Algorithm 3.2 over the other algorithms compared with is the higher SNR values of the recovered images.
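The following Python sketch shows the two ingredients used in this example, namely the VIP operator \(A(x)=D^T(Dx-v)\) and the SNR measure; the small random blurring matrix, the synthetic signal and the plain gradient iteration below are placeholder assumptions used only to illustrate these ingredients, not the actual test images or Algorithm 3.2.

```python
import numpy as np

def snr(x_true, x_restored):
    """Signal-to-noise ratio in dB, as defined above."""
    return 20 * np.log10(np.linalg.norm(x_true) / np.linalg.norm(x_true - x_restored))

rng = np.random.default_rng(1)
N, M = 64, 64                                   # placeholder sizes, not the real image sizes
x_true = rng.random(N)                          # stand-in for the (flattened) original image
D = rng.standard_normal((M, N)) / np.sqrt(M)    # stand-in blurring matrix
e = 0.01 * rng.standard_normal(M)               # Gaussian noise
v = D @ x_true + e                              # observed data, cf. (5.1)

A = lambda x: D.T @ (D @ x - v)                 # monotone, ||D^T D||-Lipschitz operator
Lip = np.linalg.norm(D.T @ D, 2)                # its Lipschitz constant

# A few plain gradient steps on the smooth part, only to exercise A and the SNR;
# the restorations reported in Table 3 are produced by Algorithm 3.2 and Algorithms 7.1-7.4.
x = np.zeros(N)
for _ in range(200):
    x = x - (1.0 / Lip) * A(x)
print("SNR of this crude reconstruction (dB):", snr(x_true, x))
```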

Table 3 Numerical Results for Example 5.3
Fig. 9 Example 5.3 Pout Figure

Fig. 10 Example 5.3 Pout Image

Fig. 11 Example 5.3 Cell Figure

Fig. 12 Example 5.3 Cell Image

Fig. 13 Example 5.3 Shadow Figure

Fig. 14 Example 5.3 Shadow Image

Fig. 15 Example 5.3 Cameraman Figure

Fig. 16 Example 5.3 Cameraman Image

6 Conclusion

In this paper, we studied the monotone VIP. We introduced a new inertial two-subgradient extragradient method for approximating the solution of the problem in Hilbert spaces. Unlike several of the existing results in the literature, our method does not require any linesearch technique, which could be time-consuming to implement. Rather, we employed a more efficient self-adaptive step size technique which generates a non-monotonic sequence of step sizes. Moreover, under mild conditions we proved that the sequence generated by our proposed algorithm converges strongly to a minimum-norm solution of the VIP. Finally, we presented several numerical experiments and applied our result to an image restoration problem. Our results complement the existing results in the literature in this direction.