1 Introduction

The theory of variational inequalities has been of great interest due to its wide applications in several branches of pure and applied sciences. There are several methods for solving variational inequalities, most of which are based on projection methods. The simplest projection method is due to Goldstein [25]; it is a natural extension of the projected gradient technique for solving optimization problems. This method requires strong assumptions, such as strong monotonicity, before its convergence is guaranteed. Moreover, in general, it converges only weakly to a solution of the variational inequality. In finite dimensional spaces, Korpelevich [40] introduced a double projection method, called the extragradient method, for solving variational inequalities with a monotone and Lipschitz continuous operator. This method was later extended to infinite dimensional Hilbert spaces by several researchers; see, for instance, [9, 12–14, 17, 22, 37, 56].

It is important to note that the extragradient method is not efficient when the feasible set does not have a closed form expression, since this makes the projection onto it very difficult to compute. This has led some researchers to introduce modifications of the extragradient method; see [15, 16, 18, 27, 39, 64]. In particular, Tseng [64] introduced a single projection extragradient method (also called the forward-backward algorithm) for variational inequalities in real Hilbert spaces. A typical disadvantage of Tseng's algorithm and many other algorithms (such as [10, 11, 23, 24, 62] and the references therein) is the assumption that the Lipschitz constant of the monotone operator is known or can be estimated. In many practical problems, the Lipschitz constant is very difficult to estimate, and the cost operator might even be pseudo-monotone. Recently, Thong and Vuong [63] introduced a modified Tseng extragradient method in which the operator is pseudo-monotone and no prior estimate of the Lipschitz constant of the cost operator is required. The stepsize of their algorithm is determined by a line search process, and they proved weak and strong convergence results for variational inequalities in real Hilbert spaces.

In recent years, the study of iterative methods for finding a common solution of variational inequalities and fixed point problems has attracted considerable interest from many scientists. This topic develops mathematical tools for solving a wide range of problems arising in game theory, equilibrium theory, optimization theory, operations research, and so on; see, for instance, [30, 31, 44]. Despite its importance, there are very few results on finding a common solution of variational inequalities and fixed point problems in the literature; see [13, 30, 31, 36, 41, 44, 67]. Several results on solving variational inequalities in the literature are set in real Hilbert spaces or 2-uniformly convex real Banach spaces. Moreover, it is very interesting to study variational inequalities in general real Banach spaces, since several physical models and applications can be modeled as variational inequalities in real Banach spaces that are not Hilbert spaces; see, for instance, [1, Example 4.4.4]. In view of this, Cai et al. [8] introduced a double projection algorithm for solving monotone variational inequalities in 2-uniformly convex real Banach spaces. This method requires a prior estimate of the Lipschitz constant of the cost operator before its convergence is guaranteed. Moreover, Shehu [55] introduced a single projection method, which also requires a prior estimate of the Lipschitz constant of the cost operator, for solving variational inequalities in 2-uniformly convex real Banach spaces. Apart from the fact that the Lipschitz constant is very difficult to estimate, the methods of Cai et al. [8], Shehu [55], and some other related methods (e.g. [14, 20]) are rather restrictive since they are confined to the setting where E is a 2-uniformly convex real Banach space.

Recently, Jolaoso et al. [35] introduced a projection algorithm using Bregman distance techniques for solving variational inequalities and a fixed point problem in a real reflexive Banach space. This method requires computing more than one projection onto the feasible set per iteration. Moreover, the stepsize is determined by a line search process, which is computationally expensive. Furthermore, Jolaoso and Aphane [33] introduced a Bregman subgradient extragradient method with a line search technique for solving variational inequalities in a real reflexive Banach space. Very recently, Jolaoso and Shehu [34] introduced a single Bregman projection method with a self-adaptive stepsize selection technique for solving variational inequalities in a real reflexive Banach space. The authors proved that the sequence generated by their algorithm converges weakly to a solution of the variational inequality in a real reflexive Banach space.

In this paper, we study the problem of finding a common solution of variational inequalities and fixed point problems in a real reflexive Banach space. Using the Bregman distance technique, we introduce a new self-adaptive Tseng extragradient method for finding a common solution of these problems in a real reflexive Banach space. The Bregman distance is a key substitute for, and generalization of, the Euclidean distance, and it is induced by a chosen convex function. It has found numerous applications in optimization theory, nonlinear analysis, inverse problems, and, recently, machine learning; see, for instance, [19, 21, 49]. In addition, the use of the Bregman distance allows the consideration of a more general feasible set structure for the variational inequalities. In particular, we can choose the Kullback–Leibler divergence (the Bregman distance induced by the negative entropy) and obtain an explicitly computable projection operator onto the simplex. We prove a strong convergence theorem for finding a common element of the solution set of variational inequalities with a pseudo-monotone and Lipschitz continuous operator and the set of fixed points of a finite family of Bregman quasi-nonexpansive mappings in a reflexive Banach space. Moreover, the stepsize of our algorithm is determined by a self-adaptive process, which is more efficient than the line search technique. We also present some applications of our algorithm to the generalized Nash equilibrium problem and the utility-based bandwidth allocation problem. We give some numerical examples to illustrate the performance of our algorithm for various Bregman functions and also compare it with some existing methods in the literature.
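To make the last point concrete, consider the probability simplex \(\Delta = \{x \in \mathbb{R}^{m}_{+}: \sum_{i=1}^{m} x_{i} = 1\}\). Under the Kullback–Leibler divergence, the Bregman projection of a positive vector onto Δ is simply its normalization, whereas the Euclidean projection requires a sorting-based routine. The following Python sketch (our own illustration; the function names are not part of the algorithms developed below) contrasts the two.

```python
import numpy as np

def kl_projection_simplex(z):
    """Bregman (KL) projection of a positive vector z onto the unit simplex.

    With f(x) = sum_i x_i*log(x_i), minimizing D_f(x, z) over the simplex
    has the closed form x = z / sum(z)."""
    z = np.asarray(z, dtype=float)
    return z / z.sum()

def euclidean_projection_simplex(z):
    """Euclidean projection onto the unit simplex (standard sorting-based
    routine), shown only for contrast: no comparably simple formula exists."""
    z = np.asarray(z, dtype=float)
    u = np.sort(z)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, z.size + 1) > (css - 1.0))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(z - theta, 0.0)

if __name__ == "__main__":
    z = np.array([0.3, 2.0, 0.7])
    print(kl_projection_simplex(z))         # explicit normalization
    print(euclidean_projection_simplex(z))  # sorting-based projection
```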

The rest of the paper is organized as follows: In Sect. 2, we present some preliminary results and definitions needed for obtaining our result. In Sect. 3, we present our algorithm and its convergence analysis. In Sect. 4, we give the applications of our result to generalized Nash equilibrium problem and utility-based bandwidth allocation problem. In Sect. 5, we give some numerical experiments and compare our algorithm with some existing methods in the literature. We finally give some concluding remarks in Sect. 6.

2 Preliminaries

In this section, we introduce some definitions and basic results that will be needed in this paper.

Let E be a real Banach space with dual \(E^{*}\), and \(\langle \cdot , \cdot \rangle \) denotes the duality pairing between E and \(E^{*}\); \(x_{n} \to x\) denotes the strong convergence of the sequence \(\{x_{n}\} \subset E\) to \(x \in E\) and \(x_{n} \rightharpoonup x\) denotes the weak convergence of \(\{x_{n}\}\) to x. Let \(S_{E}\) be the unit sphere of E, C be a nonempty closed convex subset of E, and \(A:E \to E^{*}\) be a mapping. We consider the variational inequality problem (shortly, \(VIP(C,A)\)) which consists of finding a point \(x \in C\) such that

$$ \langle Ax, y -x \rangle \geq 0 \quad \forall y \in C. $$
(2.1)

We denote the solution set of (2.1) by \(VI(C,A)\). Given a mapping \(T:C \to C\), a point \(x \in C\) is called a fixed point of T if \(Tx = x\). The set of fixed points of T is denoted by \(F(T)\).

Definition 2.1

An operator \(A:C \to E^{*}\) is said to be

  1. (a)

    strongly monotone on C with parameter \(\tau >0\) if and only if

    $$ \langle Au - Av, u -v \rangle \geq \tau \Vert u - v \Vert ^{2} , \quad \forall u ,v \in C; $$
  2. (b)

    monotone on C if and only if

    $$ \langle Au -Av, u-v \rangle \geq 0, \quad \forall u ,v \in C; $$
  3. (c)

    strongly pseudo-monotone on C with parameter \(\tau >0\) if

    $$ \langle Au, v - u \rangle \geq 0\quad \Rightarrow\quad \langle Av, v- u \rangle \geq \tau \Vert u -v \Vert ^{2}, \quad \forall u,v \in C; $$
  4. (d)

    pseudo-monotone on C if

    $$ \langle Au, v - u \rangle \geq 0 \quad \Rightarrow\quad \langle Av, v - u \rangle \geq 0, \quad \forall u,v \in C; $$
  5. (e)

    Lipschitz continuous if there exists a constant \(L >0\) such that

    $$ \Vert Au - Av \Vert \leq L \Vert u - v \Vert \quad \forall u , v \in C; $$
  6. (f)

    weakly sequentially continuous if for any sequence \(\{x_{n}\} \subset E\), \(x_{n} \rightharpoonup x\) implies \(Ax_{n} \rightharpoonup Ax\).

From Definition 2.1, it is easy to see that

$$ (a) \Rightarrow (b) \Rightarrow (d) \quad \text{and} \quad (a) \Rightarrow (c) \Rightarrow (d); $$

however, the converse implications are not always true; see [26, 29, 36].

Definition 2.2

A function \(f:E \to \mathbb{R}\) is said to be proper if the domain of f, \(dom f = \{x \in E: f(x)< + \infty \}\) is nonempty. The Fenchel conjugate of f is the function \(f^{*}:E^{*} \to \mathbb{R}\) defined by

$$ f^{*}\bigl(x^{*}\bigr) = \sup \bigl\{ \bigl\langle x,x^{*} \bigr\rangle - f(x): x\in E\bigr\} $$

for any \(x^{*} \in E^{*}\). The function f is said to be Gâteaux differentiable at \(x \in {\mathrm{int}}(dom f)\) if the limit

$$ \lim_{t\to 0} \frac{f(x+ty) - f(x)}{t} = f^{\prime }(x,y) $$
(2.2)

exists for any \(y \in E\). f is said to be Gâteaux differentiable if it is Gâteaux differentiable at every \(x \in {\mathrm{int}}(dom f)\). Moreover, when the limit in (2.2) holds uniformly for any \(y \in S_{E}\) and \(x \in {\mathrm{int}}(dom f)\), we say that f is Fréchet differentiable. The gradient of f at \(x \in E\) is the linear functional \(\nabla f(x) \in E^{*}\) such that \(\langle y, \nabla f(x) \rangle = f^{\prime }(x,y)\) for all \(y \in E\). f is called a Legendre function if and only if it satisfies

  1. (i)

    \({\mathrm{int}}(dom f) \neq \emptyset \), \(dom \nabla f = {\mathrm{int}}(dom f)\) and f is Gâteaux differentiable;

  2. (ii)

    \({\mathrm{int}}(dom f^{*}) \neq \emptyset \), \(dom \nabla f^{*} = {\mathrm{int}}(dom f^{*})\) and \(f^{*}\) is Gâteaux differentiable.

For examples and more information on Legendre functions, see [3, 4, 6]. Also, f is said to be strongly coercive if

$$ \lim_{ \Vert x \Vert \to \infty } \frac{f(x)}{ \Vert x \Vert } = \infty , $$

and strongly convex with strong convexity parameter \(\beta >0\) if

$$ f(y) \geq f(x) + \bigl\langle \nabla f(x), y -x \bigr\rangle + \frac{\beta }{2} \Vert x-y \Vert ^{2} \quad \forall x,y \in E. $$

Definition 2.3

([5])

Let \(f:E \to \mathbb{R}\cup \{+\infty \}\) be a Gâteaux differentiable function. The Bregman distance \(D_{f}:dom f\times {\mathrm{int}}(dom f) \to \mathbb{R}\) is defined by

$$ D_{f}(x,y) = f(x) - f(y) - \bigl\langle x - y, \nabla f(y) \bigr\rangle \quad \forall x \in dom f, y \in {\mathrm{int}}(dom f). $$
(2.3)

Note that \(D_{f}\) is not a metric since it satisfies neither the symmetry nor the triangle inequality property; however, it has the following important properties: for any \(x,w \in dom f\) and \(y,z \in {\mathrm{int}}(dom f)\),

$$ D_{f}(x,y) + D_{f}(y,z) -D_{f}(x,z) = \bigl\langle x- y, \nabla f(z) - \nabla f(y) \bigr\rangle , $$
(2.4)

and

$$ D_{f}(x,y) - D_{f}(x,z) - D_{f}(w,y) + D_{f}(w,z) = \bigl\langle x - w, \nabla f(z) - \nabla f(y) \bigr\rangle . $$
(2.5)
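Both identities follow by direct computation from (2.3). For instance, expanding each Bregman distance in (2.4) gives

$$\begin{aligned} D_{f}(x,y) + D_{f}(y,z) -D_{f}(x,z) =& \bigl[f(x) - f(y) - \bigl\langle x - y, \nabla f(y) \bigr\rangle \bigr] + \bigl[f(y) - f(z) - \bigl\langle y - z, \nabla f(z) \bigr\rangle \bigr] \\ &{} - \bigl[f(x) - f(z) - \bigl\langle x - z, \nabla f(z) \bigr\rangle \bigr] \\ =& - \bigl\langle x - y, \nabla f(y) \bigr\rangle + \bigl\langle x - y, \nabla f(z) \bigr\rangle \\ =& \bigl\langle x- y, \nabla f(z) - \nabla f(y) \bigr\rangle , \end{aligned}$$

and (2.5) is obtained in the same way by applying (2.3) to each of its four terms.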

Next, we give some examples of convex functions with their corresponding Bregman distance (see also [28]).

Example 2.4

Let \(E = \mathbb{R}^{m}\), then

  1. (i)

    when \(f^{KL}(x) = \sum_{i=1}^{m} x_{i}\log (x_{i})\) (the negative entropy), \(\nabla f(x) = (1+\log (x_{1}),\dots ,1+\log (x_{m}) )^{T}\), \(\nabla f^{*}(x) = (\exp (x_{1} -1),\dots ,\exp (x_{m}-1) )^{T}\) and

    $$ D_{f^{KL}}(x,y) = \sum_{i=1}^{m} \biggl( x_{i}\log \biggl( \frac{x_{i}}{y_{i}} \biggr)+y_{i} - x_{i} \biggr) $$

    which is called Kullback–Leibler distance;

  2. (ii)

    when \(f^{SE}(x) = \frac{1}{2}\|x\|^{2}\), \(\nabla f(x) = x\), \(\nabla f^{*}(x) = x\), and

    $$ D_{f^{SE}}(x,y) = \frac{1}{2} \Vert x-y \Vert ^{2} $$

    which is the squared Euclidean distance;

  3. (iii)

    when \(f^{IS}(x) = -\sum_{i=1}^{m} \log (x_{i})\) (called the Burg entropy), \(\nabla f(x) = - (\frac{1}{x_{1}},\dots ,\frac{1}{x_{m}} )^{T}\), \(\nabla f^{*}(x) = - (\frac{1}{x_{1}},\dots ,\frac{1}{x_{m}} )^{T}\) and

    $$ D_{f^{IS}}(x,y) = \sum_{i=1}^{m} \biggl( \frac{x_{i}}{y_{i}} - \log \biggl( \frac{x_{i}}{y_{i}} \biggr) -1 \biggr) $$

    which is called Itakura–Saito distance;

  4. (iv)

    when \(f^{SM}(x) = \frac{1}{2}x^{T}Qx\), where \(x^{T}\) stands for the transpose of \(x \in \mathbb{R}^{m}\) and \(Q = diag(1,2,\dots ,m) \in \mathbb{R}^{m\times m}\), \(\nabla f(x) = Qx\), \(\nabla f^{*}(x) = Q^{-1}x\) and

    $$ D_{f^{SM}}(x,y) = \frac{1}{2}(x-y)^{T}Q(x-y), $$

    which is called the squared Mahalanobis distance.
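For concreteness, the following Python sketch (added here purely for illustration; the function names are ours) evaluates the four Bregman distances of Example 2.4 on positive vectors.

```python
import numpy as np

def kullback_leibler(x, y):
    # D_{f^KL}(x, y) = sum_i ( x_i*log(x_i/y_i) + y_i - x_i ),  x, y > 0
    x, y = np.asarray(x, float), np.asarray(y, float)
    return np.sum(x * np.log(x / y) + y - x)

def squared_euclidean(x, y):
    # D_{f^SE}(x, y) = (1/2)*||x - y||^2
    x, y = np.asarray(x, float), np.asarray(y, float)
    return 0.5 * np.sum((x - y) ** 2)

def itakura_saito(x, y):
    # D_{f^IS}(x, y) = sum_i ( x_i/y_i - log(x_i/y_i) - 1 ),  x, y > 0
    x, y = np.asarray(x, float), np.asarray(y, float)
    return np.sum(x / y - np.log(x / y) - 1.0)

def squared_mahalanobis(x, y, Q=None):
    # D_{f^SM}(x, y) = (1/2)*(x - y)^T Q (x - y) with Q = diag(1, 2, ..., m)
    x, y = np.asarray(x, float), np.asarray(y, float)
    if Q is None:
        Q = np.diag(np.arange(1, x.size + 1, dtype=float))
    d = x - y
    return 0.5 * d @ Q @ d

if __name__ == "__main__":
    x, y = np.array([0.2, 0.5, 0.3]), np.array([0.4, 0.4, 0.2])
    for D in (kullback_leibler, squared_euclidean, itakura_saito, squared_mahalanobis):
        print(D.__name__, D(x, y))
```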

When f is strongly convex with strong convexity constant \(\beta >0\), the Bregman distance \(D_{f}\) is related to the norm \(\|\cdot \|\) by

$$ D_{f}(x,y) \geq \frac{\beta }{2} \Vert x-y \Vert ^{2} \quad \forall x \in dom f, y \in {\mathrm{int}}(dom f), $$
(2.6)

(see [65, Lemma 7]). The necessarily unique vector \(Proj_{C}^{f}(x)\) which satisfies

$$ D_{f}\bigl(Proj_{C}^{f}(x),x\bigr) = \inf \bigl\{ D_{f}(y,x): y \in C\bigr\} $$

is called the Bregman projection onto the convex set C. It is characterized by the following result.

Lemma 2.5

([7, 52])

Suppose that \(f:E \to \mathbb{R}\) is Gâteaux differentiable and \(C \subset {\mathrm{int}}(dom f)\) is a nonempty closed and convex set. Then the Bregman projection \(Proj_{C}^{f} :E \to C\) satisfies the following properties:

  1. (i)

    \(w = Proj_{C}^{f}(x)\) if and only if \(\langle \nabla f(x) - \nabla f(w), y - w \rangle \leq 0\) for all \(y \in C\);

  2. (ii)

    \(D_{f}(y, Proj_{C}^{f}(x)) + D_{f}(Proj_{C}^{f}(x),x) \leq D_{f}(y,x)\) for all \(y \in C\) and \(x \in E\).

Let \(f:E \to \mathbb{R}\) be a Legendre function. We define the function \(V_{f}:E \times E^{*} \to [0,\infty )\) associated with f by

$$ V_{f}\bigl(x,x^{*}\bigr) = f(x) - \bigl\langle x,x^{*} \bigr\rangle + f^{*}\bigl(x^{*}\bigr), \quad \forall x \in E, x^{*} \in E^{*}. $$
(2.7)

It is easy to see from (2.7) that \(V_{f}\) is nonnegative and \(V_{f}(x,x^{*}) = D_{f}(x, \nabla f^{*}(x^{*}))\). In addition, \(V_{f}\) satisfies the following inequality (see [54]):

$$ V_{f}\bigl(x,x^{*}\bigr) + \bigl\langle y^{*}, \nabla f^{*}\bigl(x^{*}\bigr) - x \bigr\rangle \leq V_{f}\bigl(x, x^{*} + y^{*}\bigr) \quad \forall x \in E, x^{*}, y^{*} \in E^{*}. $$
(2.8)

Definition 2.6

([38])

Let \(T:C \to C\) be a mapping. A point \(x \in E\) is called an asymptotic fixed point of T if there exists a sequence \(\{x_{n}\}\subset C\) such that \(x_{n} \rightharpoonup x\) and \(\lim_{n\to \infty }\|x_{n} - Tx_{n}\| = 0\). We denote the set of asymptotic fixed points of T by \(\hat{F}(T)\).

Definition 2.7

([38, 50])

The mapping \(T:C \to C\) is called

  1. (i)

    Bregman firmly nonexpansive (BFNE) if

    $$ \bigl\langle \nabla f(Tx) - \nabla f(Ty), Tx- Ty \bigr\rangle \leq \bigl\langle \nabla f(x) - \nabla f(y), Tx - Ty \bigr\rangle \quad \forall x,y \in C; $$
  2. (ii)

    Bregman strongly nonexpansive (BSNE) with respect to \(\hat{F}(T)\) if \(D_{f}(z,Tx) \leq D_{f}(z,x)\) for all \(z \in \hat{F}(T)\) and \(x \in C\), and if whenever \(\{x_{n}\}\subset C\) is bounded and

    $$ \lim_{n\to \infty }\bigl(D_{f}(z,x_{n}) - D_{f}(z,Tx_{n})\bigr) = 0, $$

    it follows that \(\lim_{n\to \infty }D_{f}(x_{n},Tx_{n}) = 0\);

  3. (iii)

    Bregman quasi-nonexpansive (BQNE) if \(F(T)\neq \emptyset \) and

    $$ D_{f}(z,Tx) \leq D_{f}(z,x) \quad \forall z \in F(T), x \in C. $$

In the case when \(F(T) = \hat{F}(T)\), it is easy to see that the following inclusions hold:

$$ BFNE \Rightarrow BSNE \Rightarrow BQNE $$

(see [38]). The following lemmas will be used in the sequel.

Lemma 2.8

If \(f:E \to \mathbb{R}\) is a strongly coercive and Legendre function, then

  1. (i)

    \(\nabla f: E \to E^{*}\) is one-to-one, onto, and norm-to-weak* continuous;

  2. (ii)

    \(\{x \in E: D_{f}(x,y) \leq \rho \}\) is bounded for all \(y \in E\) and \(\rho >0\);

  3. (iii)

    \(dom f^{*} = E^{*}\), \(f^{*}\) is Gâteaux differentiable and \(\nabla f^{*} = (\nabla f)^{-1}\).

Lemma 2.9

([48])

If \(f:E \to (-\infty ,+\infty ]\) is a proper, lower semi-continuous, and convex function, then \(f^{*}:E^{*} \to (-\infty ,+\infty ]\) is a weak* lower semi-continuous and convex function. Thus, for all \(w \in E\), we have

$$ D_{f} \Biggl(w, \nabla f^{*} \Biggl(\sum _{i=1}^{N} \delta _{i} \nabla f(x_{i}) \Biggr) \Biggr) \leq \sum_{i=1}^{N} \delta _{i} D_{f}(w,x_{i}), $$

where \(\{x_{i}\}\subset E\) and \(\{\delta _{i}\} \subseteq (0,1)\) satisfying \(\sum_{i=1}^{N} \delta _{i} = 1\).

Lemma 2.10

([47])

Let \(f: E \rightarrow \mathbb{R}\) be a continuous uniformly convex function on bounded subsets of E and \(r >0\) be a constant. Then

$$\begin{aligned} f \Biggl(\sum_{k=0}^{n} \alpha _{k} x_{k} \Biggr) \leq \sum_{k=0}^{n} \alpha _{k}f(x_{k}) - \alpha _{i} \alpha _{j} \rho _{r} \bigl( \Vert x_{i} - x_{j} \Vert \bigr) \end{aligned}$$
(2.9)

for all \(i,j \in \{0,1,\dots ,n\}\), \(x_{k} \in B_{r}\), and \(\alpha _{k} \in (0,1)\) for \(k \in \{0,1,\dots ,n\}\) with \(\sum_{k=0}^{n} \alpha _{k} = 1\), where \(\rho _{r}\) is the gauge of uniform convexity of f.

Lemma 2.11

([50])

If \(f:E \to \mathbb{R}\) is uniformly Fréchet differentiable and bounded on bounded subsets of E, then ∇f is norm-to-norm uniformly continuous on bounded subsets of E and thus both f and ∇f are bounded on bounded subsets of E.

Definition 2.12

([45])

The Minty variational inequality problem (MVIP) is defined as finding a point \(\bar{x} \in C\) such that

$$\begin{aligned} \langle Ay, y - \bar{x} \rangle \geq 0, \quad \forall y \in C. \end{aligned}$$
(2.10)

We denote by \(M(C,A)\) the set of solutions of (2.10). Some existence results for the MVIP have been presented in [42]. Also, the assumption that \(M(C,A) \neq \emptyset \) has already been used for solving \(VI(C,A)\) in finite dimensional spaces (see e.g. [57]). It is not difficult to prove that pseudo-monotonicity implies the property \(M(C,A) \neq \emptyset \), but the converse is not true. Indeed, let \(A:\mathbb{R} \rightarrow \mathbb{R}\) be defined by \(A(x) = \cos (x)\) with \(C = [0, \frac{\pi }{2}]\). We have that \(VI(C,A) = \{0, \frac{\pi }{2}\}\) and \(M(C,A) = \{0\}\). But if we take \(u = \frac{\pi }{2}\) and \(v = 0\) in Definition 2.1(d), then \(\langle Au, v - u \rangle = 0 \geq 0\) while \(\langle Av, v - u \rangle = -\frac{\pi }{2} < 0\), so A is not pseudo-monotone.
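The claims in this example are easy to check numerically; the short Python sketch below (our own verification on a coarse grid) confirms that only \(\bar{x} = 0\) satisfies the Minty inequality and that the pseudo-monotonicity implication fails for the pair \(u = \frac{\pi }{2}\), \(v = 0\).

```python
import numpy as np

# A(x) = cos(x) on C = [0, pi/2]; see the discussion above.
A = np.cos
grid = np.linspace(0.0, np.pi / 2, 2001)

# M(C, A): points xbar with <A(y), y - xbar> >= 0 for all y in C.
minty = [x for x in grid if np.all(A(grid) * (grid - x) >= -1e-12)]
print("M(C,A) (grid approximation):", minty)          # only xbar = 0 survives

# Pseudo-monotonicity fails: with u = pi/2 and v = 0,
# <A(u), v - u> = 0 >= 0 (up to rounding), but <A(v), v - u> = -pi/2 < 0.
u, v = np.pi / 2, 0.0
print(A(u) * (v - u) >= -1e-12, A(v) * (v - u))       # True  -1.5707...
```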

Lemma 2.13

([45])

Consider VIP (2.1). If the mapping \(h:[0,1] \rightarrow E^{*}\) defined as \(h(t) = A(tx + (1-t)y)\) is continuous for all \(x,y \in C\) (i.e. h is hemicontinuous), then \(M(C,A) \subset VI(C,A)\). Moreover, if A is pseudo-monotone, then \(VI(C,A)\) is closed, convex and \(VI(C,A) = M(C,A)\).

Lemma 2.14

([66])

Let \(\{a_{n}\}\) be a sequence of nonnegative real numbers satisfying the following inequality:

$$ a_{n+1} \leq (1-\alpha _{n})a_{n} + \alpha _{n} \delta _{n}, \quad n \geq 0, $$

where \(\{\alpha _{n}\}\subset (0,1)\) and \(\{\delta _{n}\}\subset \mathbb{R}\) such that \(\sum_{n=0}^{\infty }\alpha _{n} = \infty \) and \(\limsup_{n\to \infty }\delta _{n} \leq 0\) or \(\sum_{n=0}^{\infty }| \alpha _{n}\delta _{n}| < \infty \). Then \(\lim_{n\to \infty }a_{n} =0\).

Lemma 2.15

([43])

Let \(\{a_{n}\}\) be a sequence of real numbers such that there exists a subsequence \(\{a_{n_{i}}\}\) of \(\{a_{n}\}\) with \(a_{n_{i}} < a_{n_{i}+1}\) for all \(i \in \mathbb{N}\). Consider the integer sequence \(\{m_{k}\}\) defined by

$$\begin{aligned} m_{k} = \max \{j \leq k: a_{j} < a_{j+1}\}. \end{aligned}$$

Then \(\{m_{k}\}\) is a nondecreasing sequence satisfying \(\lim_{k \rightarrow \infty }m_{k} = \infty \), and for all \(k \in \mathbb{N}\), the following estimates hold:

$$\begin{aligned} a_{m_{k}} \leq a_{m_{k}+1} \quad \textit{and} \quad a_{k} \leq a_{m_{k}+1}. \end{aligned}$$

3 Main results

In this section, we introduce a new iterative algorithm for solving pseudo-monotone variational inequality and common fixed point problems in a reflexive Banach space. In order to present our method and its convergence analysis, we make the following assumptions.

Assumption 3.1

  1. (a)

    The feasible set C is a nonempty closed convex subset of a real reflexive Banach space E;

  2. (b)

    The operator \(A:E \to E^{*}\) is pseudo-monotone, L-Lipschitz continuous, and weakly sequentially continuous on E;

  3. (c)

    For \(i=1,2,\dots ,N\), \(\{T_{i}\}\) is a family of Bregman quasi-nonexpansive mappings on E such that \(F(T_{i}) = \hat{F}(T_{i})\) for all \(i =1,2,\dots ,N\);

  4. (d)

    The solution set \(\Gamma = VI(C,A)\cap \bigcap_{i=1}^{N} F(T_{i})\) is nonempty.

Assumption 3.2

The function \(f:E \to \mathbb{R}\) satisfies the following:

  1. (a)

    f is proper, convex, and lower semicontinuous;

  2. (b)

    f is uniformly Fréchet differentiable;

  3. (c)

    f is strongly convex on E with strong convexity constant \(\beta >0\);

  4. (d)

    f is a strongly coercive and Legendre function which is bounded on bounded subsets of E.

Assumption 3.3

Also, we assume that the control sequences satisfy:

  1. (a)

    \(\{\beta _{n,i}\}\subset (0,1)\), \(\sum_{i=0}^{N} \beta _{n,i} =1\), and \(\liminf_{n\to \infty }\beta _{n,0}\beta _{n,i} >0\) for all \(i = 1,2,\dots ,N\) and \(n \in \mathbb{N}\);

  2. (b)

    \(\{\delta _{n}\}\subset (0,1)\), \(\lim_{n \rightarrow \infty }\delta _{n} = 0\), and \(\sum_{n=0}^{\infty }\delta _{n} = \infty \);

  3. (c)

    \(\{u_{n}\} \subset E\), \(\lim_{n \rightarrow \infty } u_{n} = u^{*}\) for some \(u^{*} \in E\).

We first highlight some novelties of Algorithm 1 with respect to some methods in the literature.

  1. (i)

    In [32, 35], the authors introduced extragradient-type methods for solving VIP (2.1) in reflexive Banach spaces. It should be observed that these methods use more than one projection onto the feasible set per iteration, whereas Algorithm 1 performs only one projection onto the feasible set per iteration.

    Algorithm 1 (Alg. 3.4): A self-adaptive Tseng extragradient method with Bregman distance (a reconstructed sketch is given after this list).

  2. (ii)

    In [32, 35, 55, 63], the authors employed a line search technique which uses inner loops and might consume additional computation time for determining the stepsize. In Algorithm 1 we use a self-adaptive stepsize rule which is very simple and does not require any inner loop.

  3. (iii)

    Our work also improves and extends the results of [55, 60–63] on finding a common solution of \(VIP(C,A)\) and a fixed point problem from real Hilbert spaces and 2-uniformly convex Banach spaces to real reflexive Banach spaces.
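Since Algorithm 1 is displayed above only as a figure, we also include a minimal Python sketch of the iteration, reconstructed from the update formulas used in Lemma 3.5, Lemma 3.6, and Remark 3.4. It is a sketch under assumptions rather than a definitive implementation: in particular, the stepsize update below is the self-adaptive rule we read off from (3.1), Remark 3.4, and inequality (3.9), and the arguments proj_C_f, grad_f, grad_f_star, T_list, delta, beta, and u_seq are user-supplied placeholders for \(Proj_{C}^{f}\), \(\nabla f\), \(\nabla f^{*}\), the mappings \(T_{i}\), and the control sequences of Assumption 3.3.

```python
import numpy as np

def tseng_bregman(x0, A, proj_C_f, grad_f, grad_f_star, T_list,
                  alpha0, mu, delta, beta, u_seq, max_iter=500, tol=1e-8):
    """Sketch of the iteration analyzed in Sect. 3 (a reconstruction, see above).

    A          : cost operator
    proj_C_f   : Bregman projection onto C with respect to f
    grad_f, grad_f_star : gradients of f and of its conjugate f*
    T_list     : the Bregman quasi-nonexpansive mappings T_1, ..., T_N
    delta(n), beta(n), u_seq(n) : control sequences of Assumption 3.3,
        where beta(n) returns (beta_{n,0}, ..., beta_{n,N})."""
    x, alpha = np.asarray(x0, dtype=float), float(alpha0)
    for n in range(max_iter):
        Ax = A(x)
        y = proj_C_f(grad_f_star(grad_f(x) - alpha * Ax))            # y_n
        if np.linalg.norm(x - y) <= tol:                             # cf. Remark 3.4
            return x
        Ay = A(y)
        z = grad_f_star(grad_f(y) - alpha * (Ay - Ax))               # z_n
        # assumed self-adaptive stepsize (3.1); alpha stays unchanged if Ax = Ay
        denom = np.linalg.norm(Ax - Ay)
        if denom > 0:
            alpha = min(mu * np.linalg.norm(x - y) / denom, alpha)
        b = beta(n)
        w = grad_f_star(b[0] * grad_f(z)
                        + sum(bi * grad_f(T(z)) for bi, T in zip(b[1:], T_list)))  # w_n
        d = delta(n)
        x = grad_f_star(d * grad_f(u_seq(n)) + (1.0 - d) * grad_f(w))              # x_{n+1}
    return x
```

With \(f(x) = \frac{1}{2}\|x\|^{2}\), grad_f and grad_f_star are the identity map and proj_C_f is the metric projection onto C, so the sketch reduces to a Hilbert-space variant of the method.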

Remark 3.4

Note that in the case where \(x_{n} = y_{n} = w_{n}\), we have arrived at a common solution of the VIP and a fixed point of \(T_{i}\) (\(i=1,2,\dots ,N\)). In our convergence analysis, we implicitly assume that this does not occur after finitely many iterations, so that Algorithm 1 generates infinitely many iterates. Moreover, it is easy to see that the sequence \(\{\alpha _{n}\}\) generated by (3.1) is monotonically nonincreasing and bounded below by \(\min \{ \frac{\mu }{L},\alpha _{0} \} \). Hence, \(\lim_{n\to \infty }\alpha _{n}\) exists.

Lemma 3.5

Let \(\{x_{n}\}\), \(\{y_{n}\}\), \(\{z_{n}\}\) be sequences generated by Algorithm 1 and \(u \in \Gamma \). Then

$$ D_{f}(u,z_{n}) \leq D_{f}(u,x_{n}) - \biggl( 1 - \frac{\alpha _{n}\mu }{\alpha _{n+1}\beta } \biggr)D_{f}(z_{n},y_{n}) - \biggl( 1 - \frac{\alpha _{n}\mu }{\alpha _{n+1}\beta } \biggr)D_{f}(y_{n},x_{n}) $$

for all \(n \geq 0\).

Proof

Since \(u \in \Gamma \), then

$$\begin{aligned} D_{f}(u,z_{n}) =& D_{f} \bigl(u, \nabla f^{*}\bigl( \nabla f(y_{n}) - \alpha _{n} (Ay_{n} -Ax_{n}) \bigr) \bigr) \\ =& f(u) - \bigl\langle u - z_{n}, \nabla f(y_{n}) - \alpha _{n} (Ay_{n} - Ax_{n}) \bigr\rangle - f(z_{n}) \\ =& f(u) + \bigl\langle z_{n} - u, \nabla f(y_{n}) \bigr\rangle + \bigl\langle u -z_{n}, \alpha _{n}(Ay_{n} - Ax_{n}) \bigr\rangle - f(z_{n}) \\ =& f(u) - \bigl\langle u - y_{n}, \nabla f(y_{n}) \bigr\rangle - f(y_{n}) + \bigl\langle u - y_{n}, \nabla f(y_{n}) \bigr\rangle + f(y_{n}) \\ &{} + \bigl\langle z_{n} - u, \nabla f(y_{n}) \bigr\rangle + \bigl\langle u -z_{n}, \alpha _{n}(Ay_{n} - Ax_{n}) \bigr\rangle - f(z_{n}) \\ =& D_{f}(u,y_{n}) - f(z_{n}) + f(y_{n}) + \bigl\langle z_{n} -y_{n}, \nabla f(y_{n}) \bigr\rangle + \bigl\langle u -z_{n}, \alpha _{n}(Ay_{n} - Ax_{n}) \bigr\rangle \\ =& D_{f}(u,y_{n}) - D_{f}(z_{n},y_{n}) + \bigl\langle u -z_{n}, \alpha _{n}(Ay_{n} - Ax_{n}) \bigr\rangle . \end{aligned}$$
(3.2)

Note that from (2.5) we have

$$\begin{aligned} D_{f}(u,y_{n}) - D_{f}(z_{n},y_{n}) = D_{f}(u,x_{n}) - D_{f}(z_{n},x_{n}) + \bigl\langle u -z_{n} , \nabla f(x_{n}) - \nabla f(y_{n})\bigr\rangle . \end{aligned}$$
(3.3)

Thus it follows from (3.2) and (3.3) that

$$\begin{aligned} D_{f}(u,z_{n}) =& D_{f}(u,x_{n}) - D_{f}(z_{n},x_{n}) + \bigl\langle u -z_{n} , \nabla f(x_{n}) - \nabla f(y_{n})\bigr\rangle \\ &{} + \bigl\langle u -z_{n}, \alpha _{n}(Ay_{n} - Ax_{n}) \bigr\rangle . \end{aligned}$$
(3.4)

Also from (2.4) we have

$$\begin{aligned} D_{f}(z_{n},x_{n}) = D_{f}(z_{n},y_{n}) + D_{f}(y_{n},x_{n}) - \bigl\langle \nabla f(x_{n}) - \nabla f(y_{n}), z_{n} - y_{n} \bigr\rangle . \end{aligned}$$
(3.5)

Then from (3.4) and (3.5) we obtain

$$\begin{aligned} D_{f}(u,z_{n}) =& D_{f}(u,x_{n}) - D_{f}(z_{n},y_{n}) - D_{f}(y_{n},x_{n}) + \bigl\langle \nabla f(x_{n}) - \nabla f(y_{n}), z_{n} - y_{n} \bigr\rangle \\ &{} + \bigl\langle u -z_{n} , \nabla f(x_{n}) - \nabla f(y_{n})\bigr\rangle + \bigl\langle u -z_{n}, \alpha _{n}(Ay_{n} - Ax_{n}) \bigr\rangle \\ =& D_{f}(u,x_{n}) - D_{f}(z_{n},y_{n}) - D_{f}(y_{n},x_{n}) + \bigl\langle \nabla f(x_{n}) - \nabla f(y_{n}), u - y_{n} \bigr\rangle \\ &{} + \bigl\langle u -z_{n}, \alpha _{n} (Ay_{n} -Ax_{n}) \bigr\rangle \\ =& D_{f}(u,x_{n}) - D_{f}(z_{n},y_{n}) - D_{f}(y_{n},x_{n}) + \bigl\langle \nabla f(x_{n}) - \nabla f(y_{n}), u - y_{n} \bigr\rangle \\ &{} - \bigl\langle z_{n} -y_{n} + y_{n} - u , \alpha _{n} (Ay_{n} -Ax_{n}) \bigr\rangle \\ =& D_{f}(u,x_{n}) - D_{f}(z_{n},y_{n}) - D_{f}(y_{n},x_{n}) + \bigl\langle \nabla f(x_{n}) - \nabla f(y_{n}), u - y_{n} \bigr\rangle \\ &{} - \bigl\langle z_{n} - y_{n}, \alpha _{n} (Ay_{n} - Ax_{n}) \bigr\rangle - \bigl\langle y_{n} -u, \alpha _{n}(Ay_{n} -Ax_{n}) \bigr\rangle \\ =& D_{f}(u,x_{n}) - D_{f}(z_{n},y_{n}) - D_{f}(y_{n},x_{n}) - \bigl\langle z_{n} - y_{n}, \alpha _{n} (Ay_{n} -Ax_{n}) \bigr\rangle \\ &{} - \bigl\langle y_{n} - u, \alpha _{n} (Ay_{n} - Ax_{n}) - \bigl(\nabla f(y_{n}) - \nabla f(x_{n}) \bigr) \bigr\rangle . \end{aligned}$$
(3.6)

Moreover, by the definition of \(y_{n}\) and using Lemma 2.5 (i), we obtain

$$ \bigl\langle \nabla f(x_{n}) - \alpha _{n} Ax_{n} - \nabla f(y_{n}), u - y_{n} \bigr\rangle \leq 0. $$
(3.7)

Also, since \(u \in VI(C,A)\) and A is pseudo-monotone, it follows that

$$ \langle Ay_{n}, y_{n} - u \rangle \geq 0. $$
(3.8)

Combining (3.7) and (3.8), we get

$$\begin{aligned} \bigl\langle \alpha _{n} (Ay_{n} - Ax_{n}) - \bigl(\nabla f(y_{n}) - \nabla f(x_{n})\bigr), y_{n} - u \bigr\rangle \geq 0 . \end{aligned}$$

Therefore it follows from (3.6) that

$$ D_{f}(u,z_{n}) \leq D_{f}(u,x_{n}) - D_{f}(z_{n},y_{n}) - D_{f}(y_{n},x_{n}) - \bigl\langle z_{n} - y_{n}, \alpha _{n} (Ay_{n} -Ax_{n}) \bigr\rangle . $$

Using the Cauchy–Schwarz inequality and (3.1) with (2.6), we obtain

$$\begin{aligned} D_{f}(u,z_{n}) \leq & D_{f}(u,x_{n}) - D_{f}(z_{n},y_{n}) - D_{f}(y_{n},x_{n}) \\ &{} + \bigl\langle y_{n} -z_{n}, \alpha _{n}(Ay_{n} - Ax_{n}) \bigr\rangle \\ \leq & D_{f}(u,x_{n}) - D_{f}(z_{n},y_{n}) - D_{f}(y_{n},x_{n}) \\ &{} + \frac{\alpha _{n}}{\alpha _{n+1}}\alpha _{n+1} \Vert y_{n} -z_{n} \Vert \cdot \Vert Ay_{n} -Ax_{n} \Vert \\ \leq & D_{f}(u,x_{n}) - D_{f}(z_{n},y_{n}) - D_{f}(y_{n},x_{n}) \\ &{} + \frac{\alpha _{n}}{\alpha _{n+1}}\mu \Vert y_{n} - z_{n} \Vert \cdot \Vert y_{n} - x_{n} \Vert \\ \leq & D_{f}(u,x_{n}) - D_{f}(z_{n},y_{n}) - D_{f}(y_{n},x_{n}) \\ &{} + \frac{\alpha _{n}}{\alpha _{n+1}} \times \frac{\mu }{2} \bigl( \Vert y_{n} -z_{n} \Vert ^{2} + \Vert y_{n} -x_{n} \Vert ^{2}\bigr) \\ \leq & D_{f}(u,x_{n}) - D_{f}(z_{n},y_{n}) - D_{f}(y_{n},x_{n}) \\ &{} + \frac{\alpha _{n}}{\alpha _{n+1}} \times \frac{\mu }{\beta } \bigl(D_{f}(z_{n},y_{n}) + D_{f}(y_{n},x_{n})\bigr) \\ =& D_{f}(u,x_{n}) - \biggl( 1 - \frac{\alpha _{n}\mu }{\alpha _{n+1}\beta } \biggr)D_{f}(z_{n},y_{n}) - \biggl( 1 - \frac{\alpha _{n}\mu }{\alpha _{n+1}\beta } \biggr)D_{f}(y_{n},x_{n}). \end{aligned}$$
(3.9)

 □

Next, we show that the sequences generated by Algorithm 1 are bounded.

Lemma 3.6

Let \(\{x_{n}\}\) be the sequence generated by Algorithm 1. Then \(\{x_{n}\}\) is bounded.

Proof

Let \(u \in \Gamma \); from Lemma 3.5 we obtain

$$\begin{aligned} D_{f}(u,z_{n}) \leq D_{f}(u,x_{n}) - \biggl( 1 - \frac{\alpha _{n}\mu }{\alpha _{n+1}\beta } \biggr)D_{f}(z_{n},y_{n}) - \biggl( 1 - \frac{\alpha _{n}\mu }{\alpha _{n+1}\beta } \biggr)D_{f}(y_{n},x_{n}). \end{aligned}$$
(3.10)

Since \(\lim_{n \rightarrow \infty }\alpha _{n}\) exists and \(\mu \in (0,\beta )\), then

$$ \lim_{n \rightarrow \infty } 1 - \frac{\alpha _{n}\mu }{\alpha _{n+1}\beta } = 1 - \frac{\mu }{\beta } > 0. $$

This implies that there exists \(N>0\) such that

$$ 1 - \frac{\alpha _{n}\mu }{\alpha _{n+1}\beta } > 0 \quad \forall n \geq N. $$

Hence, from (3.10), for all \(n \geq N\) we get

$$\begin{aligned} D_{f}(u,z_{n}) \leq D_{f}(u,x_{n}). \end{aligned}$$

Also from Lemma 2.9 we obtain

$$\begin{aligned} D_{f}(u,w_{n}) =& D_{f} \Biggl(u, \nabla f^{*} \Biggl(\beta _{n,0} \nabla f(z_{n}) + \sum _{i=1}^{N} \beta _{n,i} \nabla f(T_{i}z_{n}) \Biggr) \Biggr) \\ \leq & \beta _{n,0} D_{f}(u,z_{n}) + \sum _{i=1}^{N}\beta_{n,i} D_{f}(u,T_{i}z_{n}) \\ \leq & \beta _{n,0} D_{f}(u,z_{n}) + \sum _{i=1}^{N}\beta_{n,i} D_{f}(u,z_{n}) \\ =& D_{f}(u,z_{n}). \end{aligned}$$

Therefore

$$\begin{aligned} D_{f}(u,x_{n+1}) =& D_{f} \bigl( u, \nabla f^{*}\bigl( \delta _{n} \nabla f(u_{n}) + (1-\delta _{n}) \nabla f(w_{n}) \bigr) \bigr) \\ \leq & \delta _{n} D_{f}(u,u_{n}) + (1-\delta _{n}) D_{f}(u,w_{n}) \\ \leq & \delta _{n} D_{f}(u,u_{n}) + (1-\delta _{n}) D_{f}(u,x_{n}) \\ \leq & \max \bigl\{ D_{f}(u,u_{n}), D_{f}(u,x_{n}) \bigr\} . \end{aligned}$$

Since the sequence \(\{u_{n}\}\) is bounded and ∇f is bounded on bounded subsets of E, there exists a real number \(\rho >0\) such that \(D_{f}(u,u_{n}) \leq \rho \) for all \(n \in \mathbb{N}\). Hence, by induction, we obtain

$$ D_{f}(u,x_{n+1}) \leq \max \bigl\{ \rho , D_{f}(u,x_{0})\bigr\} . $$

Thus \(\{D_{f}(u,x_{n})\}\) is bounded, and this implies that \(\{x_{n}\}\) is bounded. Consequently, \(\{y_{n}\}\), \(\{z_{n}\}\), \(\{w_{n}\}\) are bounded too. □

Lemma 3.7

Let \(\{x_{n}\}\) be the sequence generated by Algorithm 1. Then \(\{x_{n}\}\) satisfies the following inequalities:

  1. (i)

    \(a_{n+1} \leq (1-\delta _{n})a_{n} + \delta _{n} b_{n}\),

  2. (ii)

    \(-1 \leq \limsup_{n\to \infty } b_{n} < +\infty \),

where \(a_{n} = D_{f}(u,x_{n})\), \(b_{n} = \langle \nabla f(u_{n}) - \nabla f(u), x_{n+1} - u \rangle \), and \(u \in \Gamma \) for all \(n \in \mathbb{N}\).

Proof

(i) Let \(u \in \Gamma \), in view of (2.7) and Lemma 2.10, we have

$$\begin{aligned} D_{f}(u,w_{n}) =& D_{f} \Biggl(u, \nabla f^{*} \Biggl(\beta _{n,0} \nabla f(z_{n}) + \sum _{i=1}^{N} \beta _{n,i}\nabla f(T_{i} z_{n}) \Biggr) \Biggr) \\ =& V_{f} \Biggl(u , \beta _{n,0} \nabla f(z_{n}) + \sum_{i=1}^{N} \beta _{n,i} \nabla f(T_{i} z_{n}) \Biggr) \\ =& f(u) - \Biggl\langle u, \beta _{n,0} \nabla f(z_{n}) + \sum_{i=1}^{N} \beta _{n,i}\nabla f(T_{i} z_{n}) \Biggr\rangle + f^{*} \Biggl(\beta _{n,0} \nabla f(z_{n}) + \sum_{i=1}^{N} \beta _{n,i}\nabla f(T_{i} z_{n}) \Biggr) \\ \leq & f(u) - \beta _{n,0} \bigl\langle u, \nabla f(z_{n}) \bigr\rangle - \sum_{i=1}^{N} \beta _{n,i}\bigl\langle u, \nabla f(T_{i}z_{n}) \bigr\rangle + \beta _{n,0} f^{*}\bigl( \nabla f(z_{n}) \bigr) \\ &{} + \sum_{i=1}^{N} \beta _{n,i} f^{*}\bigl(\nabla f(T_{i}z_{n}) \bigr) - \beta _{n,0} \beta _{n,i}\rho ^{*}_{r} \bigl( \bigl\Vert \nabla f(z_{n}) - \nabla f(T_{i}z_{n}) \bigr\Vert \bigr) \\ =& \beta _{n,0} V_{f}\bigl(u, \nabla f(z_{n}) \bigr) + \sum_{i=1}^{N} \beta _{n,i} V_{f}\bigl(u, \nabla f(T_{i}z_{n}) \bigr) - \beta _{n,0}\beta _{n,i}\rho ^{*}_{r} \bigl( \bigl\Vert \nabla f(z_{n}) - \nabla f(T_{i}z_{n}) \bigr\Vert \bigr) \\ =& \beta _{n,0} D_{f}(u,z_{n})+ \sum _{i=1}^{N} \beta _{n,i} D_{f}(u,T_{i}z_{n}) - \beta _{n,0}\beta _{n,i}\rho ^{*}_{r} \bigl( \bigl\Vert \nabla f(z_{n}) - \nabla f(T_{i}z_{n}) \bigr\Vert \bigr) \\ \leq & D_{f}(u,z_{n}) - \beta _{n,0}\beta _{n,i}\rho ^{*}_{r} \bigl( \bigl\Vert \nabla f(z_{n}) - \nabla f(T_{i}z_{n}) \bigr\Vert \bigr). \end{aligned}$$
(3.11)

Furthermore, using (2.8), we have

$$\begin{aligned} D_{f}(u,x_{n+1}) =&D_{f} \bigl(u, \nabla f^{*}\bigl(\delta _{n} \nabla f(u_{n}) + (1-\delta _{n})\nabla f(w_{n})\bigr) \bigr) \\ =& V_{f}\bigl(u, \delta _{n} \nabla f(u_{n}) + (1-\delta _{n})\nabla f(w_{n})\bigr) \\ \leq & V_{f} \bigl(u, \delta _{n} \nabla f(u_{n}) + (1-\delta _{n}) \nabla f(w_{n}) - \delta _{n} \bigl(\nabla f(u_{n}) - \nabla f(u)\bigr) \bigr) \\ &{} - \delta _{n} \bigl\langle -\bigl(\nabla f(u_{n}) - \nabla f(u)\bigr), x_{n+1} - u \bigr\rangle \\ =& V_{f} \bigl(u, \delta _{n} \nabla f(u) + (1-\delta _{n}) \nabla f(w_{n}) \bigr) + \delta _{n} \bigl\langle \nabla f(u_{n})- \nabla f(u), x_{n+1} - u \bigr\rangle \\ \leq & \delta _{n} D_{f}(u,u) + (1-\delta _{n}) D_{f}(u,w_{n}) + \delta _{n} \bigl\langle \nabla f(u_{n})- \nabla f(u), x_{n+1} - u \bigr\rangle \\ \leq & (1-\delta _{n}) D_{f}(u,x_{n}) + \delta _{n} \bigl\langle \nabla f(u_{n})- \nabla f(u), x_{n+1} - u \bigr\rangle . \end{aligned}$$

Thus, we established (i).

(ii) Since \(\{x_{n}\}\) and \(\{u_{n}\}\) are bounded, then

$$ \sup_{n \geq 0} b_{n} \leq \sup_{n \geq 0} \bigl\Vert \nabla f(u_{n}) - \nabla f(u) \bigr\Vert \cdot \Vert x_{n+1} - u \Vert < \infty . $$

This implies that \(\limsup_{n\to \infty }b_{n} < \infty \). Next we show that \(\limsup_{n\to \infty }b_{n} \geq -1\). On the contrary, suppose \(\limsup_{n\to \infty }b_{n} < -1\). Then we can choose \(n_{0} \in \mathbb{N}\) such that \(b_{n} < -1\) for all \(n \geq n_{0}\). Then, for all \(n \geq n_{0}\), we get from (i)

$$\begin{aligned} a_{n+1} \leq & (1-\delta _{n})a_{n} + \delta _{n} b_{n} \\ < & (1-\delta _{n})a_{n} - \delta _{n} \\ =& a_{n} - \delta _{n} (a_{n} +1) \\ \leq & a_{n} - \delta _{n}. \end{aligned}$$

Taking the limit superior in the last inequality, we obtain

$$ \limsup_{n\to \infty } a_{n+1} \leq a_{n_{0}} - \lim _{n\to \infty } \sum_{i = n_{0}}^{n} \delta _{i} = - \infty . $$

This contradicts the fact that \(\{a_{n}\}\) is a nonnegative sequence. Hence \(\limsup_{n\to \infty }b_{n} \geq -1\). □

We now prove the convergence of Algorithm 1.

Theorem 3.8

Suppose that \(\{x_{n}\}\) is generated by Algorithm 1. Then \(\{x_{n}\}\) converges strongly to a point \(\bar{x}\), where \(\bar{x} = Proj_{\Gamma }^{f}(\bar{x})\).

Proof

Let \(u \in \Gamma \) and \(a_{n} = D_{f}(u,x_{n})\). We consider the following two possible cases.

Case I: Suppose that \(\{a_{n}\}\) is monotonically nonincreasing. Since \(\{a_{n}\}\) is bounded and monotone, it converges, and hence \(a_{n} - a_{n+1} \to 0\). From (3.11), we have

$$\begin{aligned} D_{f}(u,x_{n+1}) \leq & \delta _{n} D_{f}(u,u_{n}) + (1-\delta _{n})D_{f}(u,w_{n}) \\ \leq & \delta _{n} D_{f}(u,u_{n}) + (1-\delta _{n}) \bigl[ D_{f}(u,z_{n}) - \beta _{n,0}\beta _{n,i}\rho ^{*}_{r} \bigl( \bigl\Vert \nabla f(z_{n}) - \nabla f(T_{i}z_{n}) \bigr\Vert \bigr) \bigr] \\ \leq &\delta _{n} D_{f}(u,u_{n}) + (1-\delta _{n}) \bigl[ D_{f}(u,x_{n})- \beta _{n,0} \beta _{n,i}\rho ^{*}_{r} \bigl( \bigl\Vert \nabla f(z_{n}) - \nabla f(T_{i}z_{n}) \bigr\Vert \bigr) \bigr]. \end{aligned}$$

This implies that

$$\begin{aligned}& (1-\delta _{n}) \beta _{n,0}\beta _{n,i}\rho ^{*}_{r} \bigl( \bigl\Vert \nabla f(z_{n}) - \nabla f(T_{i}z_{n}) \bigr\Vert \bigr) \\& \quad \leq \delta _{n} D_{f}(u,u_{n}) + (1- \delta _{n})D_{f}(u,x_{n}) - D_{f}(u,x_{n+1}) \\& \quad = \delta _{n} \bigl(D_{f}(u,u_{n}) - a_{n}\bigr) + a_{n} - a_{n+1}. \end{aligned}$$
(3.12)

Since \(\delta _{n} \to 0\) as \(n \to \infty \), it follows from (3.12) that

$$ \lim_{n \rightarrow \infty } \beta _{n,0}\beta _{n,i}\rho ^{*}_{r} \bigl( \bigl\Vert \nabla f(z_{n}) - \nabla f(T_{i}z_{n}) \bigr\Vert \bigr) = 0. $$

Also, since \(\liminf_{n\to \infty }\beta _{n,0}\beta _{n,i} >0\) and by the property of \(\rho _{r}^{*}\), we obtain

$$ \lim_{n \rightarrow \infty } \bigl\Vert \nabla f(z_{n}) - \nabla f(T_{i}z_{n}) \bigr\Vert = 0. $$

Moreover, since f is strongly convex and uniformly Fréchet differentiable, \(\nabla f^{*}\) is norm-to-norm uniformly continuous on bounded subsets of \(E^{*}\), and thus

$$ \lim_{n \rightarrow \infty } \Vert z_{n} - T_{i}z_{n} \Vert = 0. $$
(3.13)

Furthermore, from Lemma 3.5, we have

$$\begin{aligned}& D_{f}(u,x_{n+1}) \\& \quad \leq \delta _{n} D_{f}(u,u_{n}) + (1-\delta _{n}) D_{f}(u,w_{n}) \\& \quad \leq \delta _{n} D_{f}(u,u_{n}) + (1-\delta _{n}) D_{f}(u,z_{n}) \\& \quad \leq \delta _{n} D_{f}(u,u_{n}) \\& \qquad {}+ (1-\delta _{n}) \biggl[D_{f}(u,x_{n}) - \biggl( 1 - \frac{\alpha _{n}\mu }{\alpha _{n+1}\beta } \biggr)D_{f}(z_{n},y_{n}) - \biggl( 1 - \frac{\alpha _{n}\mu }{\alpha _{n+1}\beta } \biggr)D_{f}(y_{n},x_{n}) \biggr] . \end{aligned}$$

This implies that

$$\begin{aligned} &(1-\delta _{n}) \biggl[ \biggl( 1 - \frac{\alpha _{n}\mu }{\alpha _{n+1}\beta } \biggr)D_{f}(z_{n},y_{n}) + \biggl( 1 - \frac{\alpha _{n}\mu }{\alpha _{n+1}\beta } \biggr)D_{f}(y_{n},x_{n}) \biggr] \\ &\quad \leq \delta _{n} D_{f}(u,u_{n}) + (1-\delta _{n})D_{f}(u,x_{n}) - D_{f}(u,x_{n+1}) \\ &\quad = \delta _{n} \bigl(D_{f}(u,u_{n}) - a_{n}\bigr) + a_{n} - a_{n+1}. \end{aligned}$$

This implies that

$$ \lim_{n \rightarrow \infty } \biggl[ \biggl( 1 - \frac{\alpha _{n}\mu }{\alpha _{n+1}\beta } \biggr)D_{f}(z_{n},y_{n}) + \biggl( 1 - \frac{\alpha _{n}\mu }{\alpha _{n+1}\beta } \biggr)D_{f}(y_{n},x_{n}) \biggr] = 0, $$

and hence

$$ \lim_{n \rightarrow \infty } \biggl[ \biggl(1 - \frac{\mu }{\beta } \biggr)D_{f}(z_{n},y_{n}) + \biggl(1 - \frac{\mu }{\beta } \biggr)D_{f}(y_{n},x_{n}) \biggr] = 0. $$

Since \(\mu \in (0,\beta )\), we get

$$ \lim_{n \rightarrow \infty }\bigl(D_{f}(z_{n},y_{n})+D_{f}(y_{n},x_{n}) \bigr) = 0. $$

This implies that

$$ \lim_{n \rightarrow \infty }D_{f}(z_{n},y_{n}) = \lim_{n \rightarrow \infty }D_{f}(y_{n},x_{n}) = 0. $$

Consequently,

$$ \lim_{n \rightarrow \infty } \Vert z_{n} -y_{n} \Vert = \lim_{n \rightarrow \infty } \Vert y_{n} -x_{n} \Vert = 0. $$

Then

$$ \lim_{n \rightarrow \infty } \Vert z_{n} -x_{n} \Vert = 0. $$
(3.14)

It is clear that

$$\begin{aligned} \bigl\Vert \nabla f(w_{n}) - \nabla f(z_{n}) \bigr\Vert \leq \sum_{i=1}^{N}\beta_{n,i} \bigl\Vert \nabla f(z_{n}) - \nabla f(T_{i}z_{n}) \bigr\Vert \to 0. \end{aligned}$$

Hence, since \(\nabla f^{*}\) is norm-to-norm continuous on bounded subsets of \(E^{*}\), we obtain

$$ \lim_{n \rightarrow \infty } \Vert w_{n} - z_{n} \Vert = 0, $$

therefore

$$ \lim_{n \rightarrow \infty } \Vert w_{n} - x_{n} \Vert = 0. $$

Since \(\{x_{n}\}\) is bounded, there exists a subsequence \(\{x_{n_{j}}\}\) of \(\{x_{n}\}\) such that \(x_{n_{j}} \rightharpoonup x^{*} \in C\). We now show that \(x^{*} \in \Gamma \). Since \(y_{n_{j}} = Proj_{C}^{f}(\nabla f^{*}(\nabla f(x_{n_{j}})-\alpha _{n_{j}}Ax_{n_{j}}))\), it follows from Lemma 2.5 (i) that

$$\begin{aligned} \bigl\langle \nabla f(x_{n_{j}}) - \alpha _{n_{j}}Ax_{n_{j}} - \nabla f(y_{n_{j}}), x - y_{n_{j}} \bigr\rangle \leq 0 \quad \forall x \in C. \end{aligned}$$

Then

$$\begin{aligned} \bigl\langle \nabla f(x_{n_{j}}) - \nabla f(y_{n_{j}}), x - y_{n_{j}} \bigr\rangle \leq & \alpha _{n_{j}} \langle Ax_{n_{j}}, x - y _{n_{j}} \rangle \\ =& \alpha _{n_{j}} \langle Ax_{n_{j}}, x_{n_{j}} - y_{n_{j}} \rangle + \alpha _{n_{j}}\langle Ax_{n_{j}}, x - x_{n_{j}} \rangle . \end{aligned}$$

This implies that

$$\begin{aligned} \bigl\langle \nabla f(x_{n_{j}}) - \nabla f(y_{n_{j}}), x - y_{n_{j}} \bigr\rangle + \alpha _{n_{j}} \langle Ax_{n_{j}}, y_{n_{j}} - x_{n_{j}} \rangle \leq \alpha _{n_{j}} \langle Ax_{n_{j}}, x- x_{n_{j}} \rangle . \end{aligned}$$
(3.15)

Since \(\|x_{n_{j}}- y_{n_{j}}\|\to 0\) and \(\nabla f\) is norm-to-norm uniformly continuous on bounded subsets of E (Lemma 2.11), we have

$$ \lim_{j\to \infty } \bigl\Vert \nabla f(x_{n_{j}}) - \nabla f(y_{n_{j}}) \bigr\Vert = 0. $$
(3.16)

Fix \(x \in C\); it follows from (3.15), (3.16), and the fact that \(\liminf_{j\to \infty }\alpha _{n_{j}}>0\) that

$$ 0 \leq \liminf_{j\to \infty }\langle Ax_{n_{j}}, x - x_{n_{j}} \rangle \quad \forall x \in C. $$
(3.17)

Now let \(\{\epsilon _{j}\}\) be a sequence of decreasing nonnegative numbers such that \(\epsilon _{j} \to 0\) as \(j \to \infty \). For each \(\epsilon _{j}\), we denote by \(n_{j}\) the smallest positive integer such that

$$ \langle Ax_{n_{k}}, x - x_{n_{k}} \rangle + \epsilon _{j} \geq 0 \quad \forall k \geq n_{j}, $$

where the existence of \(n_{j}\) follows from (3.17). Since \(\lbrace \epsilon _{j} \rbrace \) is decreasing, then \(\lbrace n_{j} \rbrace \) is increasing. Also, for each j, \(Ax_{n_{j}} \neq 0\) and, setting

$$ t_{n_{j}}=\frac{Ax_{n_{j}}}{ \Vert Ax_{n_{j}} \Vert ^{2}}, $$

one gets \(\langle Ax_{n_{j}},t_{n_{j}} \rangle =1\) for each j. Therefore,

$$ \langle Ax_{n_{j}}, x+ \epsilon _{j}t_{n_{j}} - x_{n_{j}} \rangle \geq 0. $$

Since A is pseudo-monotone, it follows from the last inequality that

$$\begin{aligned} \bigl\langle A(x+\epsilon _{j}t_{n_{j}}), x+ \epsilon _{j}t_{n_{j}} - x_{n_{j}} \bigr\rangle \geq 0. \end{aligned}$$
(3.18)

Since \(\lbrace x_{n_{j}} \rbrace \) converges weakly to \(x^{*}\) as \(j \to \infty \) and A is weakly sequentially continuous, we have that \(\lbrace Ax_{n_{j}} \rbrace \) converges weakly to \(Ax^{*}\). Suppose \(Ax^{*} \neq 0\) (otherwise, \(x^{*} \in VI(C,A)\)). Then, by the weak sequential lower semicontinuity of the norm, we get

$$ 0< \bigl\Vert Ax^{*} \bigr\Vert = \liminf_{j \to \infty } \Vert Ax_{n_{j}} \Vert . $$

Since \(\lbrace x_{n_{j}} \rbrace \subset \lbrace x_{n} \rbrace \) and \(\epsilon _{j} \to 0\) as \(j \to \infty \), we get

$$\begin{aligned} 0 \leq & \limsup_{j \to \infty } \Vert \epsilon _{j} t_{n_{j}} \Vert = \limsup_{j \to \infty } \biggl( \frac{\epsilon _{j}}{ \Vert Ax_{n_{j}} \Vert } \biggr) \\ \leq & \frac{\limsup_{j \to \infty }\epsilon _{j}}{\liminf_{j \to \infty } \Vert Ax_{n_{j}} \Vert } \leq \frac{0}{ \Vert Ax^{*} \Vert }=0, \end{aligned}$$

and this means \(\lim_{j \to \infty } \|\epsilon _{j} t_{n_{j}}\|=0\). Passing to the limit as \(j \to \infty \) in (3.18), we get

$$ \bigl\langle Ax,x-x^{*} \bigr\rangle \geq 0 \quad \forall x \in C. $$

This shows that \(x^{*} \in M(C,A)\); therefore, from Lemma 2.13, we have \(x^{*} \in VI(C,A)\). Furthermore, it follows from (3.13) and (3.14) that \(x^{*} \in \hat{F}(T_{i}) = F(T_{i})\) for all \(i =1,2,\dots ,N\); hence \(x^{*} \in \bigcap_{i=1}^{N} F(T_{i})\). Therefore \(x^{*} \in \Gamma \).

We now show that \(\{x_{n}\}\) converges strongly to a point \(\bar{x} = Proj_{\Gamma }^{f}(\bar{x})\). To do this, it suffices to show that \(\limsup_{n\to \infty }\langle \nabla f(u_{n})- \nabla f(\bar{x}), x_{n+1} - \bar{x} \rangle \leq 0\). Choose a subsequence \(\{x_{n_{j}}\}\) of \(\{x_{n}\}\) such that

$$\begin{aligned} \limsup_{n\to \infty } \bigl\langle \nabla f(u_{n})- \nabla f(\bar{x}), x_{n+1} - \bar{x} \bigr\rangle = \lim_{j \to \infty } \bigl\langle \nabla f(u_{n_{j}})- \nabla f(\bar{x}), x_{n_{j}+1} - \bar{x} \bigr\rangle . \end{aligned}$$

Since \(u_{n_{j}} \to u^{*}\) and \(x_{n_{j}} \rightharpoonup x^{*}\), it follows from Lemma 2.5 (i) that

$$\begin{aligned} \limsup_{n\to \infty } \bigl\langle \nabla f(u_{n})- \nabla f(\bar{x}), x_{n+1} - \bar{x} \bigr\rangle =& \lim_{j \to \infty } \bigl\langle \nabla f(u_{n_{j}})- \nabla f(\bar{x}), x_{n_{j}+1} - \bar{x} \bigr\rangle \\ =& \bigl\langle \nabla f\bigl(u^{*}\bigr) - \nabla f(\bar{x}), x^{*} - \bar{x} \bigr\rangle \\ \leq & 0. \end{aligned}$$

Hence

$$ \limsup_{n\to \infty } \bigl\langle \nabla f(u_{n})- \nabla f(\bar{x}), x_{n+1} - \bar{x} \bigr\rangle \leq 0. $$
(3.19)

Putting \(u = \bar{x}\) in Lemma 3.7, it follows from Lemma 2.14 and (3.19) that \(\lim_{n \rightarrow \infty }a_{n} = 0\). Consequently, \(\lim_{n \rightarrow \infty }\|x_{n} - \bar{x}\| = 0\). This implies that \(\{x_{n}\}\) converges strongly to a point \(\bar{x} = Proj_{\Gamma }^{f}(\bar{x})\).

Case II: Suppose that \(\{a_{n}\}\) is not monotonically decreasing, that is, there is a subsequence \(\{a_{n_{j}}\}\) of \(\{a_{n}\}\) such that \(a_{n_{j}} < a_{n_{j}+1}\) for all \(j \in \mathbb{N}\). Then, by Lemma 2.15, we can define an integer sequence \(\{\tau (n)\}\) for all \(n \geq n_{0}\) by

$$ \tau (n) = \max \{k \leq n: a_{k}< a_{k+1}\}. $$

Moreover, \(\{\tau (n)\}\) is a nondecreasing sequence such that \(\tau (n) \to \infty \) as \(n \to \infty \) and \(a_{\tau (n)}\leq a_{\tau (n)+1}\) for all \(n \geq n_{0}\). Following a similar argument as in Case I, we obtain

$$ \lim_{n \rightarrow \infty } \Vert y_{\tau (n)} - x_{\tau (n)} \Vert = 0, \qquad \lim_{n \rightarrow \infty } \Vert z_{\tau (n)} - x_{\tau (n)} \Vert =0 \quad \text{and}\quad \lim_{n \rightarrow \infty } \Vert w_{\tau (n)} - x_{ \tau (n)} \Vert =0. $$

By a similar argument as in Case I, we also obtain

$$ \limsup_{n\rightarrow \infty }\bigl\langle \nabla f(u_{\tau (n)}) - \nabla f(u), x_{\tau (n)+1} - u\bigr\rangle \leq 0. $$
(3.20)

Also, by Lemma 3.7 (i), we get

$$\begin{aligned} 0 \leq & a_{\tau (n)+1} - a_{\tau (n)} \\ \leq & (1-\delta _{\tau (n)}) D_{f}(u,x_{\tau (n)}) + \delta _{\tau (n)} \bigl\langle \nabla f(u_{\tau (n)}) - \nabla f(u), x_{\tau (n)+1} - u \bigr\rangle - D_{f}(u,x_{\tau (n)}). \end{aligned}$$

This implies that

$$\begin{aligned} D_{f}(u,x_{\tau (n)}) \leq \bigl\langle \nabla f(u_{\tau (n)}) - \nabla f(u), x_{\tau (n)+1} - u\bigr\rangle . \end{aligned}$$

Hence from (3.20) we obtain \(\limsup_{n\rightarrow \infty }D_{f}(u,x_{\tau (n)}) \leq 0\), which implies that \(\lim_{n \rightarrow \infty }D_{f}(u, x_{\tau (n)}) = 0\). Moreover, since \(a_{\tau (n)} \to 0\), \(\delta _{\tau (n)} \to 0\), and \(\{b_{n}\}\) is bounded above, Lemma 3.7 (i) also gives \(a_{\tau (n)+1} \to 0\). Consequently, we have

$$ 0 \leq a_{n} \leq \max \{a_{\tau (n)},a_{\tau (n)+1}\} \leq a_{\tau (n)+1} \to 0 . $$

Hence \(D_{f}(u,x_{n}) \to 0\). This implies that \(\|x_{n} - u\| \to 0\), and thus \(x_{n} \to u = Proj_{\Gamma }^{f}(u)\). This completes the proof. □

The following can be obtained directly as consequences of our main result.

(i) If \(A:E \to E^{*}\) is monotone and Lipschitz continuous on E, then we obtain the following result.

Corollary 3.9

Let E be a real reflexive Banach space, \(A:E \to E^{*}\) be a monotone and Lipschitz continuous operator, and \(\{T_{i}\}\) (\(i =1,2,\dots ,N\)) be a finite family of Bregman quasi-nonexpansive mappings on E such that \(F(T_{i}) = \hat{F}(T_{i})\) for \(i=1,2,\dots ,N\). Let \(f:E \to \mathbb{R}\cup \{+\infty \}\) be a function satisfying Assumption 3.2, and let \(\{\beta _{n,i}\}\), \(\{\delta _{n}\}\), and \(\{u_{n}\}\) be sequences satisfying Assumption 3.3. Suppose \(\Gamma = VI(C,A) \cap \bigcap_{i=1}^{N} F(T_{i}) \neq \emptyset \). Then the sequence \(\{x_{n}\}\) generated by Algorithm 1 converges strongly to a solution \(\bar{x}\), where \(\bar{x} = Proj_{\Gamma }^{f}(\bar{x})\).

Note that in this case the weak sequential continuity of A in Assumption 3.1 (b) can be dropped, since it follows from the monotonicity of A and (3.15) that

$$\begin{aligned} \bigl\langle \nabla f(x_{n_{j}}) - \nabla f(y_{n_{j}}), x - y_{n_{j}} \bigr\rangle + \alpha _{n_{j}} \langle Ax_{n_{j}}, y_{n_{j}} - x_{n_{j}} \rangle \leq & \alpha _{n_{j}} \langle Ax_{n_{j}}, x- x_{n_{j}} \rangle \\ \leq & \alpha _{n_{j}} \langle A x, x- x_{n_{j}} \rangle . \end{aligned}$$

Passing to the limit in the above inequality and using the facts that \(\|x_{n_{j}} - y_{n_{j}}\| \to 0\) and \(\|\nabla f(x_{n_{j}}) - \nabla f(y_{n_{j}}) \| \to 0\), we obtain

$$ \bigl\langle Ax, x - x^{*} \bigr\rangle \geq 0 \quad \forall x \in C. $$
(3.21)

(ii) In addition, if we take \(\{T_{i}\}\) (\(i=1,2,\dots ,N\)) as a finite family of Bregman nonexpansive mappings on E, then we obtain the following result.

Corollary 3.10

Let E be a real reflexive Banach space and \(A:E \to E^{*}\) and \(f:E \to \mathbb{R}\cup \{+\infty \}\) satisfy Assumptions 3.1 and 3.2, respectively, where \(\{T_{i}\}\) (\(i=1,2,\dots ,N\)) is a finite family of Bregman nonexpansive mappings. Suppose that \(\{\beta _{n,i}\}\subset (0,1)\), \(\{\delta _{n}\} \subset (0,1)\), and \(\{u_{n}\} \subset E\) satisfy Assumption 3.3. Then the sequence generated by Algorithm 1 converges strongly to a point \(\bar{x}\), where \(\bar{x} = Proj_{\Gamma }^{f}(\bar{x})\).

4 Applications

In this section, we give some applications of our result to a generalized Nash equilibrium problem and bandwidth allocation problem.

1. Generalized Nash equilibrium problem (GNEP)

Let \(I=\{1,2,\dots ,N\}\) be the set of players, with each player \(i \in I\) controlling the variable \(x^{i} \in \mathcal{C}_{i} \subset \mathbb{R}^{m_{i}}\), and let \(\mathcal{C} = \prod_{i \in I} \mathcal{C}_{i} \subset \prod_{i \in I} \mathbb{R}^{m_{i}}\), where the \(m_{i}\) (\(i\in I\)) satisfy \(m = \sum_{i=1}^{N} m_{i}\). The point \(x^{i}\) is called the strategy of the ith player. We denote by \(x \in \mathbb{R}^{m}\) the vector of strategies \(x = (x^{1},\dots ,x^{N})\), and \(x^{-i}\) denotes the vector formed by the decision variables \(x^{j}\) of all players except player i. Thus, we can write \(x = (x^{i}, x^{-i})\), which is shorthand for the vector \(x = (x^{1}, \dots , x^{i-1},x^{i},x^{i+1},\dots ,x^{N} )\). The set \(\mathcal{C}_{i}(x^{-i}) = \{x^{i}\in \mathbb{R}^{m_{i}}: (x^{i},x^{-i}) \in \mathcal{C}\}\) denotes the strategy set of the ith player when the remaining players choose the strategies \(x^{-i}\) (see e.g. [53]). The aim of the ith player, given the strategies \(x^{-i}\), is to choose a strategy \(x^{i}\) that solves the following minimization problem:

$$ \min \theta _{i}\bigl(x^{i},x^{-i}\bigr) \quad \text{such that}\quad x^{i}\in \mathcal{C}_{i} \bigl(x^{-i}\bigr). $$
(4.1)

For any given \(x^{-i}\), we denote the solution set of (4.1) by \(Sol_{i}(x^{-i})\). Using the above notation, we give the precise definition of the GNEP as follows (see e.g. [32]).

Definition 4.1

A GNEP is defined as finding \(\bar{x} \in \mathcal{C}\) such that \(\bar{x}^{i} \in Sol_{i}(\bar{x}^{-i})\) for every \(i \in I\).

The following result follows from the first order optimality condition for the solution of problem (4.1) for each \(i\in I\).

Theorem 4.2

Consider the GNEP such that

  1. (a)

    \(\mathcal{C}\) is closed and convex,

  2. (b)

    \(\theta _{i}\) is continuously differentiable for every \(i \in I\),

  3. (c)

    \(\theta _{i}(\cdot ,x^{-i}):\mathbb{R}^{m_{i}} \to \mathbb{R}\) is convex for every \(i \in I\) and every \(x \in \mathcal{C}\).

Define an operator \(F:\mathbb{R}^{m} \to \mathbb{R}^{m}\) as

$$ F(x) = \bigl(\nabla _{x^{1}}\theta _{1}(x),\dots ,\nabla _{x^{N}} \theta _{N}(x) \bigr), $$
(4.2)

where \(\nabla _{x^{i}}\theta _{i}\) denotes the gradient of \(\theta _{i}\) with respect to its first argument. Then every solution of \(VIP(\mathcal{C},F)\) is a solution of the GNEP.

In addition, the sets \(\mathcal{C}_{i}\) can be defined as the intersection of simpler closed convex sets \(\mathcal{C}_{i}^{(j)}\) (\(j=1,2,\dots ,l_{i}\)) i.e.

$$ \mathcal{C}_{i} = \bigcap_{j=1}^{l_{i}} \mathcal{C}_{i}^{(j)}. $$

Let \(T:\mathbb{R}^{m} \to \mathbb{R}^{m}\) be defined by

$$\begin{aligned} x&:= \bigl(x^{1},x^{2},\dots ,x^{N}\bigr) \mapsto \\ &\quad Tx = \Biggl(\sum_{j=1}^{l_{1}}\omega _{1}^{(j)}\Pi _{\mathcal{C}_{1}^{(j)}}\bigl(x^{1}\bigr), \sum _{j=1}^{l_{2}}\omega _{2}^{(j)} \Pi _{\mathcal{C}_{2}^{(j)}}\bigl(x^{2}\bigr), \dots , \sum_{j=1}^{l_{N}} \omega _{N}^{(j)}\Pi _{\mathcal{C}_{N}^{(j)}}\bigl(x^{N}\bigr) \Biggr), \end{aligned}$$

with the projections \(\Pi _{\mathcal{C}_{i}^{(j)}}: \mathbb{R}^{m_{i}} \to \mathcal{C}_{i}^{(j)} \) (\(i \in I\), \(j = 1,2,\dots ,l_{i}\)) and weights \(\omega _{i}^{(j)} >0\) (\(i \in I\), \(j=1,2,\dots ,l_{i}\)) such that \(\sum_{j=1}^{l_{i}}\omega _{i}^{(j)}=1\). Then T is nonexpansive (see [2]) and satisfies \(F(T) = \prod_{i\in I}\mathcal{C}_{i} = \mathcal{C}\). Hence, the GNEP can be expressed as the following variational inequality:

$$ \text{Find} \quad x \in \mathcal{C} \quad \text{such that} \quad x \in VI( \mathcal{C}, F). $$
(4.3)

It is worth mentioning that formulation (4.3) offers a simple approach to dealing with the GNEP since the sets \(\mathcal{C}_{i}^{(j)}\) can have closed form expressions. This applies to more general cases, where the projection onto \(\mathcal{C}\) is not necessarily easy to compute, while the projections \(\Pi _{\mathcal{C}_{i}^{(j)}}\) (\(i\in I\), \(j =1,2,\dots ,l_{i}\)) are in fact tractable.
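To illustrate the construction of T above, the following Python sketch (a toy example with constraint sets of our own choosing, not data from the paper) builds the averaged-projection mapping for a single player whose strategy set is the intersection of a box and a half-space, and iterates it toward a fixed point.

```python
import numpy as np

def proj_box(x, lo=0.0, hi=1.0):
    # projection onto the box [lo, hi]^2 (our toy choice of C_1^{(1)})
    return np.clip(x, lo, hi)

def proj_halfspace(x, a=np.array([1.0, 1.0]), b=1.5):
    # projection onto {x : <a, x> <= b} (our toy choice of C_1^{(2)})
    viol = a @ x - b
    return x if viol <= 0 else x - (viol / (a @ a)) * a

def T(x, w=(0.5, 0.5)):
    # averaged projections: nonexpansive, with fixed point set equal to the
    # intersection of the two sets when that intersection is nonempty
    return w[0] * proj_box(x) + w[1] * proj_halfspace(x)

x = np.array([2.0, -1.0])
for _ in range(200):               # simple fixed-point iteration toward Fix(T)
    x = T(x)
print(x, np.all(x >= 0), np.all(x <= 1), x.sum() <= 1.5 + 1e-9)
```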

By choosing \(f(x) = \|x\|^{2}\) and \(N=1\) in Algorithm 1, we can then apply our result to solving the GNEP as follows.

Corollary 4.3

Suppose that the GNEP is consistent (i.e. has a solution), and let \(\mathcal{C}\), F, and T be as defined above such thatF is monotone and Lipschitz continuous. Then the sequence \(\{x_{n}\}\) generated by Algorithm 1 converges strongly to a solution of the GNEP.

2. Utility-based bandwidth allocation problem

Efficient network distribution is very important for making communication networks reliable and stable. Network resources such as power, channels, and bandwidth are shared among many sources. In a utility-based bandwidth network, the objective is to share the available bandwidth among different traffic sources so as to maximize the overall utility under capacity constraints [46, 58].

The utility-based allocation problem can be modeled via a function \(\mathcal{U}\) of the transmission rates allocated to the traffic sources, which represents the efficiency and fairness of the bandwidth sharing [58]. \(\mathcal{U}\) is typically assumed to be concave and continuously differentiable. A well-known utility function is the weighted proportionally fair function defined by \(\mathcal{U}_{pf}(x) = \sum_{i\in I}\omega _{i} \log (x_{i})\) for all \(x:=(x_{1},x_{2},\dots ,x_{m})^{T} \in \mathbb{R}^{m}_{+}\backslash \{0\}\), where \(x_{i} >0\) denotes the transmission rate of source \(i \in I=\{1,2,\dots ,m\}\), \(\omega _{i} >0\) is the weight parameter for source i, and \(\mathbb{R}^{m}_{+}:=\{(x_{1},x_{2},\dots ,x_{m})^{T}\in \mathbb{R}^{m}: x_{i} \geq 0\ (i \in I)\}\). The optimal bandwidth allocation corresponding to \(\mathcal{U}_{pf}\) is said to be weighted proportionally fair.

The capacity constraint for each link can be expressed as an inequality constraint in which the total sum of the transmission rates of all sources sharing the link is less than or equal to the capacity of the link. For each link \(l \in \mathcal{L} = \{1,2,\dots ,L\}\), the capacity constraint is expressed as \(x \in \mathbb{R}_{+}^{m} \cap C_{l}\), where

$$ C_{l} := \biggl\{ x:=(x_{1},x_{2},\dots ,x_{m})^{T}\in \mathbb{R}^{m} : \sum _{i\in I}x_{i}I_{i,l}\leq c_{l} \biggr\} , $$

\(c_{l}>0\) stands for the capacity of link l, and \(I_{i,l}\) takes the value 1 if l is a link used by source i, and 0 otherwise. Then the objective in bandwidth allocation is to solve the following utility-based bandwidth allocation problem [58, Chap. 2], maximizing the utility function subject to the capacity constraints:

$$ \text{Maximize}\quad \mathcal{U}_{pf}(x) \quad \text{subject to}\quad x \in C, $$
(4.4)

where \(C \subset \mathbb{R}^{m}\) is the capacity constraint set defined by

$$ C:= \mathbb{R}^{m}_{+} \cap \bigcap _{l\in \mathcal{L}}C_{l} = \mathbb{R}^{m}_{+} \cap \bigcap_{l\in \mathcal{L}} \biggl\{ (x_{1},x_{2}, \dots ,x_{m})^{T} \in \mathbb{R}^{m} : \sum _{i\in I}x_{i} I_{i,l} \leq c_{l} \biggr\} . $$
(4.5)

Note that the set C in (4.5) can be expressed as the fixed point set of a mapping composed of the projections onto the sets \(C_{l}\). Let us define a mapping \(T_{proj}:\mathbb{R}^{m} \to \mathbb{R}^{m}\) composed of the projections onto \(\mathbb{R}^{m}_{+}\) and the sets \(C_{l}\) as follows:

$$ T_{proj}:= P_{\mathbb{R}^{m}_{+}} \prod_{l \in \mathcal{L}} P_{C_{l}} = P_{\mathbb{R}^{m}_{+}}P_{C_{1}}P_{C_{2}}\dots P_{C_{L}}, $$

where \(P_{D}\) stands for the projection onto a nonempty, closed, and convex set \(D \subset \mathbb{R}^{m}\). Then \(T_{proj}\) is nonexpansive because \(P_{\mathbb{R}^{m}_{+}}\) and the projections \(P_{C_{l}}\) are nonexpansive. Moreover, C coincides with the fixed point set of \(T_{proj}\) [2, Proposition 2.10], i.e.

$$ C = F(T_{proj}):= \bigl\{ x \in \mathbb{R}^{m}:T_{proj}x = x\bigr\} . $$
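As a sketch (with hypothetical routing data A, hypothetical capacities c, and a hypothetical feasible point), \(T_{proj}\) can be implemented by composing the closed-form projections onto \(\mathbb{R}^{m}_{+}\) and the half-spaces \(C_{l}\); any point of C is then left unchanged, which illustrates the inclusion \(C \subseteq F(T_{proj})\).

```python
import numpy as np

def proj_orthant(x):
    """Projection onto the nonnegative orthant R^m_+."""
    return np.maximum(x, 0.0)

def proj_halfspace(x, a, c):
    """Projection onto the half-space {z : <a, z> <= c}, assuming a != 0."""
    viol = a @ x - c
    if viol <= 0.0:
        return x
    return x - (viol / (a @ a)) * a

def T_proj(x, A, c):
    """Composition P_{R^m_+} P_{C_1} ... P_{C_L}: apply P_{C_L} first, then
    P_{C_{L-1}}, ..., P_{C_1}, and finally P_{R^m_+}. Rows of A are the
    routing indicators (I_{i,l})_i and c holds the link capacities."""
    y = x.copy()
    for a_l, c_l in zip(A[::-1], c[::-1]):
        y = proj_halfspace(y, a_l, c_l)
    return proj_orthant(y)

# Hypothetical network with 3 sources and 2 links.
A = np.array([[1.0, 1.0, 0.0],   # link 1 is shared by sources 1 and 2
              [0.0, 1.0, 1.0]])  # link 2 is shared by sources 2 and 3
c = np.array([1.0, 1.5])
x = np.array([0.4, 0.5, 0.6])    # feasible: 0.9 <= 1.0 and 1.1 <= 1.5
print(np.allclose(T_proj(x, A, c), x))  # True: a point of C is a fixed point
```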

Also, \(-\nabla \mathcal{U}_{pf}\) is strongly monotone and Lipschitz continuous on a certain set. Then the bandwidth allocation problem (4.4) can be expressed as a common variational inequality and fixed point problem as follows:

$$ \text{Find} \quad x \in VI(C,-\nabla \mathcal{U}_{pf}) \cap F(T_{proj}). $$
(4.6)

Choosing \(f(x) = \|x\|^{2}\) and \(N=1\) in Algorithm 1, we apply Algorithm 1 to the utility-based bandwidth allocation problem as follows.

Corollary 4.4

Suppose that the utility-based bandwidth allocation problem (4.4) is consistent, and let \(\mathcal{U}_{pf}\), \(T_{proj}\), and the capacity constraint set C be defined as above. Then the sequence \(\{x_{n}\}\) generated by Algorithm 1 converges strongly to a solution of problem (4.4).

5 Numerical experiments

In this section, we present some numerical experiments for our proposed algorithm. We compare the performance of Algorithm 1 for the different types of convex functions listed below and with some other algorithms in the literature. All codes were written in MATLAB version 9.5 (R2019b) and run on a Lenovo PC with the following specifications: Intel(R) Core i7-5600 CPU, 2.48 GHz, 8.0 GB RAM.

Example 5.1

Let \(E = \mathbb{R}^{m}\) and \(A: \mathbb{R}^{m} \to \mathbb{R}^{m}\) be defined by \(A(x) = Mx +q\), where

$$ M = BB^{T} + S + D $$

such that B is an \(m \times m\) matrix, S is an \(m \times m\) skew symmetric matrix, D is an \(m \times m\) diagonal matrix whose diagonal is nonnegative (so M is positive definite) and q is a vector in \(\mathbb{R}^{m}\). The feasible set C is defined by \(C = \{x = (x_{1},\dots ,x_{m})^{T}:\|x\| \leq 1 \textit{ and } x_{j} \geq a, j=1,\dots ,m \}\), where \(a <1/\sqrt{m}\). It is clear that A is monotone and Lipschitz continuous with Lipschitz constant \(L = \|M\|\). For \(q =0\), the unique solution of the corresponding VI is \(\{0\}\). We define the mapping \(T_{i}: \mathbb{R}^{m} \to \mathbb{R}^{m}\) as the projection onto C for \(i =1,2,\dots ,5\), which is Bregman nonexpansive (see [51]). The starting point \(x_{0} \in [0,1]^{m}\) and the entries of matrices B, S, D are generated randomly for \(m = 5, 20,50,100\). We choose \(\alpha _{0} = 0.23\), \(\mu = 0.36\), \(\delta _{n} = \frac{1}{n+1}\), \(u_{n} = \frac{1}{n^{5}}\), \(\beta _{n,i} = \frac{1}{6}\) for \(i = 0,1,2,\dots ,5\), and \(n \in \mathbb{N}\). We test Algorithm 1 using \(f^{KL}\), \(f^{SE}\), \(f^{IS}\), \(f^{SM}\) given in Example 2.4

We also compare the performance of Algorithm 1 with Algorithm 3.1 of Thong and Hieu [60] and Algorithm 3.1 of Thong and Hieu [61]. For the algorithm in [60], we choose \(\gamma = 2\), \(l = 0.2\), \(\mu = 0.36\), \(\beta _{n} = \frac{2n}{3n+4}\), \(\alpha _{n} = \frac{1}{n+1}\), and \(f(x) = \frac{x}{2}\). For the algorithm in [61], we choose \(\mu = 0.36\), \(\tau _{0} = 0.23\), and \(\beta _{n} = \frac{n}{5n+5}\) for all \(n \in \mathbb{N}\). The projection onto C is calculated explicitly, and we stop the iterations when \(\|x_{n+1} - x_{n}\| < 10^{-4}\). The computational results are shown in Table 1 and Fig. 1.

Figure 1. Experiment 1, top left: \(m=5\); top right: \(m=20\); bottom left: \(m=50\); bottom right: \(m=100\)

Table 1 Computational results for Example 5.1

Next, we consider an example in an infinite-dimensional space where A is pseudo-monotone and Lipschitz continuous but not monotone. In this example, we choose \(f(x) = \|x\|^{2}\) and compare our Algorithm 1 with Algorithm 2 of Thong and Vuong [63].

Example 5.2

Let \(E=L_{2}([0,1])\) be endowed with the inner product \(\langle x,y \rangle = \int _{0}^{1}x(t)y(t)\,dt\) for all \(x,y \in L_{2}([0,1])\) and norm \(\|x\| = (\int _{0}^{1}|x(t)|^{2}\,dt )^{\frac{1}{2}}\) for all \(x \in L_{2}([0,1])\). Let \(C = \{x \in L_{2}([0,1]):\langle y,x\rangle \leq a\}\), where \(y= 3t^{2}+9\) and \(a =1\). Define \(g:C \rightarrow \mathbb{R}\) by \(g(u) = \frac{1}{1+\|u\|^{2}}\) and \(F: L^{2}([0,1]) \rightarrow L^{2}([0,1])\) as the Volterra integral operator given by \(F(u)(t) = \int _{0}^{t} u (s)\,ds\) for all \(u \in L^{2}([0,1])\) and \(t\in [0,1]\). F is bounded, linear, and monotone, and Lipschitz continuous with constant \(L = \frac{\pi }{2}\). Let \(A: L^{2}([0,1])\rightarrow L^{2}([0,1])\) be defined by \(A(u)(t) = (g(u)F(u))(t)\). It is easy to show that A is L-Lipschitz continuous and pseudo-monotone but not monotone. We choose \(\alpha _{0} = 0.09\), \(\mu = 0.34\), \(T_{i} x = x\), \(N=1\), \(\beta _{n,i} = \frac{n}{2n+2}\), and \(\delta _{n} = \frac{1}{\sqrt{n+1}}\) for all \(n \in \mathbb{N}\). For Algorithm 2 of [63], we take \(\gamma = 2\), \(l = 0.02\), \(\mu = 0.34\), \(\alpha _{n} = \frac{1}{\sqrt{n+1}}\), and \(\beta _{n} = \frac{1 - \alpha _{n}}{2}\) for all \(n \in \mathbb{N}\). We test the algorithms with the following initial points:

Case I: \(x_{0} = \frac{\cos (2t)}{6}\);

Case II: \(x_{0} = \exp (2t) + \sin (3t)\);

Case III: \(x_{0} = \frac{\exp (-3t)}{7}\);

Case IV: \(x_{0} = t^{3} + 2t +3\).

We stop the algorithms when \(\|x_{n+1} -x_{n}\|<10^{-4}\) and plot the error \(\|x_{n+1}-x_{n}\|\) against the number of iterations. The numerical results are shown in Table 2 and Fig. 2.
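Since the problem is posed in \(L^{2}([0,1])\), any implementation works with a discretization; the following sketch uses our own uniform-grid quadrature (the grid size is an assumption) to evaluate F, A, and the stopping quantity \(\|x_{n+1}-x_{n}\|\).

```python
import numpy as np

n_grid = 1001
t = np.linspace(0.0, 1.0, n_grid)
dt = t[1] - t[0]

def l2_norm(u):
    """Riemann-sum approximation of the L^2([0,1]) norm."""
    return np.sqrt(dt * np.sum(u**2))

def F(u):
    """Volterra operator F(u)(t) = int_0^t u(s) ds (cumulative trapezoidal rule)."""
    return np.concatenate(([0.0], np.cumsum(0.5 * dt * (u[1:] + u[:-1]))))

def A(u):
    """A(u) = g(u) F(u) with g(u) = 1/(1 + ||u||^2)."""
    return F(u) / (1.0 + l2_norm(u)**2)

# Case I initial point from the example.
x0 = np.cos(2.0 * t) / 6.0
print("||A(x0)|| =", l2_norm(A(x0)))

# Stopping rule used in the experiments, for two successive iterates u and v:
tol = 1e-4
def stop(u, v):
    return l2_norm(u - v) < tol
```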

Figure 2. Experiment 2, top left: Case I; top right: Case II; bottom left: Case III; bottom right: Case IV

Table 2 Computational results for Example 5.2

Example 5.3

Let \(E = \ell _{3}(\mathbb{R})\), where \(\ell _{3}(\mathbb{R}) = \{x = (x_{1},x_{2},\dots ,x_{i},\dots ), x_{i} \in \mathbb{R}: \sum_{i=1}^{\infty }|x_{i}|^{3} < \infty \}\), with norm \(\|x\|_{\ell _{3}} = (\sum_{i=1}^{\infty }|x_{i}|^{3} )^{\frac{1}{3}}\) for \(x \in E\). Let \(C = \{x \in E: \|x\|_{\ell _{3}} \leq 1\}\). For all \(x \in E\), we define the operator \(A:E \to E\) by

$$ Ax = \frac{x}{2} + (1,1,0,0,\dots ). $$

Then A satisfies Assumption 3.1 (b). Let \(f:E \to \mathbb{R}\) be defined by \(f(x) = \frac{1}{3}\|x\|_{\ell _{3}}^{3}\) for all \(x \in \ell _{3}(\mathbb{R})\). Also, let \(\{e_{n}\}\) be the standard basis of \(\ell _{3}\) defined by \(e_{n} = (\delta _{n,1},\delta _{n,2},\dots )\) for each \(n \in \mathbb{N}\), where

$$ \delta _{n,i} = \textstyle\begin{cases} 1 & \text{if } n = i, \\ 0 & \text{if } n \neq i, \end{cases} $$

and \(T:E \to E\) be defined by

$$ Tx = \textstyle\begin{cases} \frac{x}{n+1} & \text{if } x = e_{n}, \\ \frac{x}{2} & \text{if } x \neq e_{n}. \end{cases} $$

Clearly, \(F(T) = \{0\}\), and T is Bregman quasi-nonexpansive (see [59]). We choose the following parameters for our computation: \(\beta _{n} = \frac{1}{100(n+1)}\), \(\delta _{n} = \frac{50n}{70n+3}\), \(\mu =0.26\), \(\alpha _{0} = 0.0025\), \(u_{n} = \frac{x_{0}}{n+1}\). We compare the performance of Algorithm 1 with Algorithm 3.6 of [33] (namely BSEM) and Algorithm 3.3 of [34] using the following initial values:

Case I: \(x_{0} = (1, \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{3}},\dots )\);

Case II: \(x_{0} = (25, 5, 1, \dots )\);

Case III: \(x_{0} = (2, 2, 1,1, \dots )\);

Case IV: \(x_{0} = (1, 0, 1, 0, \dots )\).

We plot the error \(\|x_{n+1} - x_{n}\|_{\ell _{3}}\) against the number of iterations using \(\|x_{n+1} - x_{n}\|_{\ell _{3}} < 10^{-4}\) as the stopping criterion. The computational results are shown in Table 3 and Fig. 3.
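Working in \(\ell_{3}\) requires truncating sequences in practice; the sketch below (the truncation length is our own choice) evaluates the \(\ell_{3}\) norm appearing in the stopping criterion, the operator A, and the mapping T on truncated vectors.

```python
import numpy as np

def l3_norm(x):
    """Truncated l_3 norm ||x||_{l_3} = (sum_i |x_i|^3)^(1/3)."""
    return float(np.sum(np.abs(x)**3) ** (1.0 / 3.0))

def A(x):
    """A x = x/2 + (1, 1, 0, 0, ...), truncated to len(x) coordinates."""
    e = np.zeros_like(x)
    e[:2] = 1.0
    return x / 2.0 + e

def T(x):
    """T x = x/(n+1) if x = e_n (a standard basis vector), and x/2 otherwise."""
    idx = np.nonzero(x)[0]
    if idx.size == 1 and np.isclose(x[idx[0]], 1.0):  # x = e_n with n = idx[0] + 1
        return x / (idx[0] + 2)                        # n + 1 = idx[0] + 2
    return x / 2.0

# Case I initial point, truncated to the first 1000 coordinates.
k = np.arange(1, 1001)
x0 = 1.0 / np.sqrt(k)
print(l3_norm(x0), l3_norm(A(x0)), l3_norm(T(x0) - x0))
```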

Figure 3. Example 5.3, top left: Case I; top right: Case II; bottom left: Case III; bottom right: Case IV

Table 3 Computational results for Example 5.3

6 Conclusion

This paper introduced a single projection method based on Bregman distance techniques for finding a common element of the set of solutions of variational inequalities and the set of common fixed points of a finite family of Bregman quasi-nonexpansive mappings in the framework of a reflexive Banach space. The stepsize of the algorithm is selected self-adaptively, and a strong convergence theorem is proved without requiring the Lipschitz constant of the cost operator as an input parameter. Applications to generalized Nash equilibrium problems and utility-based bandwidth allocation problems were given, and numerical examples were presented to illustrate the performance of the algorithm. Our results improve and extend several results in the literature, such as [35, 45, 55, 60–63].