1 Introduction

The split feasibility problem (SFP) was first introduced by Censor and Elfving [1] in 1994 and has received much attention since then. This is due to its applications in signal processing and image reconstruction, with particular progress in intensity-modulated radiation therapy; see [2–6].

Since the SFP is a special case of the convex feasibility problem (CFP), which is to find a point in the nonempty intersection of finitely many closed and convex sets, we next briefly review some historical approaches related to the CFP. The CFP is an important problem because many real-world inversion or estimation problems in engineering as well as in mathematics can be cast into this framework; see, e.g., Combettes [7], Bauschke and Borwein [8] and Kiwiel [9]. Traditionally, iterative projection methods for solving the CFP employ orthogonal projections onto convex sets (i.e., nearest point projections with respect to the Euclidean distance function); see, e.g., [10–14]. Much work has also been done with generalized distance functions and the associated generalized projections, as suggested by Bregman [15].

In 1994, Censor and Elfving [1] investigated the use of different kinds of generalized projections in a single iterative process for solving the SFP. Their proposal is an iterative algorithm that requires computing the inverse of a matrix, which is known to be a difficult task. For this reason, Byrne [16, 17] proposed the so-called CQ algorithm, which generates a sequence by a recursive procedure with a suitable step-size. The CQ algorithm only involves the computation of the projections onto the sets C and Q, and is therefore implementable whenever these projections have closed-form expressions (e.g., when C and Q are closed balls or half-spaces). There are many references on the CQ method in the literature; see, for instance, [18–34]. However, the determination of the step-size depends on the operator (matrix) norm (or the dominant eigenvalue of a matrix product). This means that in order to implement the CQ algorithm, one first has to compute (or, at least, estimate) the norm of an operator, which is in general not an easy task in practice.

To overcome this difficulty, the so-called self-adaptive method, which allows the step-size to be selected self-adaptively, was developed. This method is an application of the projection method of Goldstein [35] and Levitin and Polyak [36] to a suitable variational inequality problem; it is among the simplest numerical methods for solving variational inequality problems. Nevertheless, the efficiency of this projection method depends strongly on the choice of the step-size parameter. A step-size small enough to guarantee convergence of the iterative sequence leads to slow convergence. On the other hand, a large step-size chosen to speed up convergence may produce a sequence that does not converge. In real applications of variational inequality problems, the Lipschitz constant may be difficult to estimate, even if the underlying mapping is linear, as is the case for the SFP. Some self-adaptive methods for solving variational inequality problems have been developed from the original Goldstein-Levitin-Polyak method [35, 36]; see, e.g., [37–45].

Motivated by the self-adaptive strategy, Zhang et al. [45] proposed a method that uses variable step-sizes instead of the fixed step-sizes used in Censor et al. [46]. A self-adaptive projection method based on Armijo-like searches was also introduced by Zhao and Yang [29]. The advantage of these algorithms is that convergence is guaranteed without prior information about the norm of the matrix A and without any other conditions on Q and A.

In this paper, we further develop and improve the self-adaptive methods for solving the SFP by introducing an improved self-adaptive method. As a special case, the minimum norm solution of the SFP can be approached iteratively.

2 Framework and preliminary results

Let ${H}_{1}$ and ${H}_{2}$ be two Hilbert spaces, and let C and Q be two closed and convex subsets of ${H}_{1}$ and ${H}_{2}$, respectively. Let $A:{H}_{1}\to {H}_{2}$ be a bounded linear operator. The split feasibility problem (SFP) is to find a point ${x}^{\ast }$ such that

${x}^{\ast }\in C\phantom{\rule{1em}{0ex}}\text{and}\phantom{\rule{1em}{0ex}}A{x}^{\ast }\in Q.$
(1)

Next, we use Γ to denote the solution set of the SFP, i.e., $\mathrm{\Gamma }=\left\{x\in C:Ax\in Q\right\}$.

In 1994, Censor and Elfving [1] investigated the use of different kinds of generalized projections in a single iterative process for solving the SFP. They were the first to propose the following algorithm which involved the computation of the inverse ${A}^{-1}$:

${x}_{k+1}={A}^{-1}{P}_{Q}\left({P}_{A\left(C\right)}\left(A{x}_{k}\right)\right),\phantom{\rule{1em}{0ex}}k\ge 0,$

where C and Q are closed and convex sets in ${\mathbb{R}}^{n}$, while A is a full rank $n\times n$ matrix and $A\left(C\right)=\left\{y\in {\mathbb{R}}^{n}\mid y=Ax,x\in C\right\}$. Note that ${A}^{-1}$ is not easily computed. Consequently, Byrne [16, 17] proposed the so-called CQ algorithm which generates a sequence $\left\{{x}_{n}\right\}$ by the recursive procedure

${x}_{n+1}={P}_{C}\left({x}_{n}-{\tau }_{n}{A}^{\ast }\left(I-{P}_{Q}\right)A{x}_{n}\right),$
(2)

where the step-size ${\tau }_{n}$ is chosen in the interval $\left(0,2/{\parallel A\parallel }^{2}\right)$. It is remarkable that the CQ algorithm only involves the computations of the projections ${P}_{C}$ and ${P}_{Q}$ onto the sets C and Q, respectively, and is therefore implementable in the case where ${P}_{C}$ and ${P}_{Q}$ have closed-form expressions (e.g., C and Q are the closed balls or half-spaces). However, we observe that the determination of the step-size ${\tau }_{n}$ depends on the operator (matrix) norm $\parallel A\parallel$ (or the largest eigenvalue of ${A}^{\ast }A$). This means that for practical implementation of the CQ algorithm, one has first to compute (or, at least, to estimate) the matrix norm of A, which is in general not an easy task in practice.
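The recursion (2) can be illustrated with a minimal sketch in pure Python. Everything specific here is an assumption made for illustration and is not taken from the paper: the spaces are $\mathbb{R}^2$, C is the closed unit ball, Q is the box $[0.5,1]^2$, A is a diagonal matrix with $\parallel A\parallel^2 = 4$, and the helper names (`matvec`, `proj_ball`, `proj_box`, `cq_algorithm`) are hypothetical.

```python
import math

def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * t for m, t in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def proj_ball(x, radius=1.0):
    """P_C for C = closed ball of the given radius centred at the origin (assumed C)."""
    norm = math.sqrt(sum(t * t for t in x))
    return list(x) if norm <= radius else [radius * t / norm for t in x]

def proj_box(y, lo=0.5, hi=1.0):
    """P_Q for Q = [lo, hi]^m, i.e. componentwise clipping (assumed Q)."""
    return [min(max(t, lo), hi) for t in y]

def cq_algorithm(A, x0, tau, proj_C, proj_Q, iters=200):
    """Iterate x_{n+1} = P_C(x_n - tau * A^T (I - P_Q) A x_n) with a fixed step tau."""
    At = transpose(A)
    x = list(x0)
    for _ in range(iters):
        Ax = matvec(A, x)
        residual = [a - p for a, p in zip(Ax, proj_Q(Ax))]  # (I - P_Q) A x
        grad = matvec(At, residual)                          # A^T (I - P_Q) A x
        x = proj_C([xi - tau * gi for xi, gi in zip(x, grad)])
    return x

A = [[2.0, 0.0], [0.0, 1.0]]   # ||A||^2 = 4, so tau must lie in (0, 0.5)
x = cq_algorithm(A, [1.0, -1.0], tau=0.4, proj_C=proj_ball, proj_Q=proj_box)
Ax = matvec(A, x)
print("x =", x, " Ax =", Ax)
```

Note that the step `tau=0.4` could only be chosen because $\parallel A\parallel$ is known for this toy matrix, which is exactly the practical limitation of the CQ algorithm.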

To overcome the above difficulty, the so-called self-adaptive method, which allows the step-size ${\tau }_{n}$ to be selected self-adaptively, was developed. If we set

$f\left(x\right):=\frac{1}{2}{\parallel Ax-{P}_{Q}Ax\parallel }^{2},$

then the convex objective f is differentiable and has a Lipschitz gradient given by

$\mathrm{\nabla }f\left(x\right)={A}^{\ast }\left(I-{P}_{Q}\right)Ax.$

Thus, the CQ algorithm (2) can be viewed as a method for solving the following convex minimization problem

$\underset{x\in C}{min}f\left(x\right).$
(3)
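As a sanity check on the objective in (3), the gradient $A^{\ast}(I-P_Q)Ax$ can be compared with a central finite-difference approximation of f. The sketch below is only an illustration under assumed data: Q is taken to be the box $[0,1]^2$, A is a small 2×2 matrix, and the function names are hypothetical.

```python
def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * t for m, t in zip(row, v)) for row in M]

def proj_box(y, lo=0.0, hi=1.0):
    """P_Q for the assumed set Q = [lo, hi]^m."""
    return [min(max(t, lo), hi) for t in y]

def f(A, x):
    """f(x) = 0.5 * ||Ax - P_Q(Ax)||^2."""
    Ax = matvec(A, x)
    return 0.5 * sum((a - p) ** 2 for a, p in zip(Ax, proj_box(Ax)))

def grad_f(A, x):
    """Closed-form gradient A^T (I - P_Q) A x."""
    Ax = matvec(A, x)
    residual = [a - p for a, p in zip(Ax, proj_box(Ax))]
    At = [list(col) for col in zip(*A)]
    return matvec(At, residual)

def numeric_grad(A, x, h=1e-6):
    """Central finite differences, used only to cross-check grad_f."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(A, xp) - f(A, xm)) / (2 * h))
    return g

A = [[2.0, 1.0], [0.0, 1.0]]
x = [0.3, -2.0]
print(grad_f(A, x))
print(numeric_grad(A, x))
```

The two printed vectors should agree up to finite-difference error at points where Ax is not on the boundary of Q.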

We know that a point ${x}^{\ast }\in C$ is a stationary point of problem (3) if it satisfies

$〈\mathrm{\nabla }f\left({x}^{\ast }\right),x-{x}^{\ast }〉\ge 0,\phantom{\rule{1em}{0ex}}\mathrm{\forall }x\in C.$
(4)

Thus, we can use the following gradient projection algorithm to solve the SFP:

${x}_{n+1}={P}_{C}\left({x}_{n}-{\tau }_{n}\mathrm{\nabla }f\left({x}_{n}\right)\right),$
(5)

where ${\tau }_{n}$, the step-size at iteration n, is chosen in the interval $\left(0,2/L\right)$, where L is the Lipschitz constant of ∇f.

The above method (5) can be viewed as an application of the projection method of Goldstein [35] and Levitin and Polyak [36] to the variational inequality problem (4); it is among the simplest numerical methods for solving variational inequality problems. Nevertheless, the efficiency of this projection method depends greatly on the choice of the parameter ${\tau }_{n}$. A small ${\tau }_{n}$ guarantees the convergence of the iterative sequence, but leads to slow convergence. On the other hand, a large step-size improves the speed of convergence, but the generated sequence may not converge. In real applications of variational inequality problems, the Lipschitz constant may be difficult to estimate, even if the underlying mapping is linear, as is the case for the SFP.

The methods in Zhang et al. [45] and Censor et al. [46] were proposed for solving the multiple-sets split feasibility problem.

Algorithm 2.1 S1. Given a nonnegative sequence ${\tau }_{n}$ such that ${\sum }_{n=0}^{\mathrm{\infty }}{\tau }_{n}<\mathrm{\infty }$, $\delta \in \left(0,1\right)$, $\mu \in \left(0,1\right)$, $\rho \in \left(0,1\right)$, $ϵ>0$, ${\beta }_{0}>0$, and arbitrary initial point ${x}_{0}$. Set ${\gamma }_{0}={\beta }_{0}$ and $n=0$.

S2. Find the smallest nonnegative integer ${l}_{n}$ such that ${\beta }_{n+1}={\mu }^{{l}_{n}}{\gamma }_{n}$ and

${x}_{n+1}={P}_{C}\left({x}_{n}-{\beta }_{n+1}\mathrm{\nabla }f\left({x}_{n}\right)\right),$

which satisfies

${\beta }_{n+1}{\parallel \mathrm{\nabla }f\left({x}_{n}\right)-\mathrm{\nabla }f\left({x}_{n+1}\right)\parallel }^{2}\le \left(2-\delta \right)〈{x}_{n}-{x}_{n+1},\mathrm{\nabla }f\left({x}_{n}\right)-\mathrm{\nabla }f\left({x}_{n+1}\right)〉.$

S3. If

${\beta }_{n+1}{\parallel \mathrm{\nabla }f\left({x}_{n}\right)-\mathrm{\nabla }f\left({x}_{n+1}\right)\parallel }^{2}\le \rho 〈{x}_{n}-{x}_{n+1},\mathrm{\nabla }f\left({x}_{n}\right)-\mathrm{\nabla }f\left({x}_{n+1}\right)〉,$

then set ${\gamma }_{n+1}=\left(1+{\tau }_{n+1}\right){\beta }_{n+1}$; otherwise, set ${\gamma }_{n+1}={\beta }_{n+1}$.

S4. If $\parallel e\left({x}_{n},{\beta }_{n}\right)\parallel \le ϵ$, stop; otherwise, set $n:=n+1$ and go to S2.

The following self-adaptive projection method, which uses Armijo-like searches, was introduced by Zhao and Yang [29].

Algorithm 2.2 Given constants $\beta >0$, $\sigma \in \left(0,1\right)$ and $\gamma \in \left(0,1\right)$. Let ${x}_{0}$ be arbitrary. For $n=0,1,\dots$ , calculate

${x}_{n+1}={P}_{C}\left({x}_{n}-{\tau }_{n}\mathrm{\nabla }f\left({x}_{n}\right)\right),$

where ${\tau }_{n}=\beta {\gamma }^{{l}_{n}}$ and ${l}_{n}$ is the smallest nonnegative integer l such that

$f\left({P}_{C}\left({x}_{n}-\beta {\gamma }^{l}\mathrm{\nabla }f\left({x}_{n}\right)\right)\right)\le f\left({x}_{n}\right)-\sigma 〈\mathrm{\nabla }f\left({x}_{n}\right),{x}_{n}-{P}_{C}\left({x}_{n}-\beta {\gamma }^{l}\mathrm{\nabla }f\left({x}_{n}\right)\right)〉.$
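The Armijo-like search of Algorithm 2.2 can be sketched as follows. This is a toy illustration, not the authors' implementation: C is assumed to be the closed unit ball, Q the box $[0.5,1]^2$, the starting point is taken inside C, and the cap `max_l` on the backtracking index is a safeguard added here that does not appear in the algorithm as stated.

```python
import math

def matvec(M, v):
    return [sum(m * t for m, t in zip(row, v)) for row in M]

def proj_ball(x, radius=1.0):
    """P_C for the assumed set C = closed ball centred at the origin."""
    norm = math.sqrt(sum(t * t for t in x))
    return list(x) if norm <= radius else [radius * t / norm for t in x]

def proj_box(y, lo=0.5, hi=1.0):
    """P_Q for the assumed set Q = [lo, hi]^m."""
    return [min(max(t, lo), hi) for t in y]

def f(A, x):
    Ax = matvec(A, x)
    return 0.5 * sum((a - p) ** 2 for a, p in zip(Ax, proj_box(Ax)))

def grad_f(A, x):
    Ax = matvec(A, x)
    residual = [a - p for a, p in zip(Ax, proj_box(Ax))]
    return matvec([list(col) for col in zip(*A)], residual)

def armijo_projection(A, x0, beta=1.0, sigma=0.3, gamma=0.5, iters=50, max_l=60):
    """Projected gradient steps with tau_n = beta * gamma**l_n chosen by backtracking."""
    x = list(x0)
    for _ in range(iters):
        g, fx = grad_f(A, x), f(A, x)
        for l in range(max_l):  # max_l is a safeguard cap (an assumption of this sketch)
            tau = beta * gamma ** l
            trial = proj_ball([xi - tau * gi for xi, gi in zip(x, g)])
            decrease = sigma * sum(gi * (xi - ti) for gi, xi, ti in zip(g, x, trial))
            if f(A, trial) <= fx - decrease:  # Armijo-like acceptance test
                break
        x = trial
    return x

A = [[2.0, 0.0], [0.0, 1.0]]
x = armijo_projection(A, [0.0, 0.0])
print("x =", x, " f(x) =", f(A, x))
```

No estimate of $\parallel A\parallel$ enters the iteration; the step-size is found from function values alone, at the price of extra evaluations of f and ${P}_{C}$ per step.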

The advantage of Algorithm 2.1 and Algorithm 2.2 is that convergence is guaranteed without prior information about the matrix norm $\parallel A\parallel$ and without any other conditions on Q and A.

We shall now introduce our improved self-adaptive method for solving the SFP. To this end, we first recall some definitions and preliminary results.

Let C be a nonempty closed convex subset of a real Hilbert space H. A mapping $T:C\to C$ is called nonexpansive if

$\parallel Tx-Ty\parallel \le \parallel x-y\parallel ,\phantom{\rule{1em}{0ex}}\mathrm{\forall }x,y\in C.$

A mapping $\psi :C\to C$ is said to be δ-contractive if there exists a constant $\delta \in \left[0,1\right)$ such that

$\parallel \psi \left(x\right)-\psi \left(y\right)\parallel \le \delta \parallel x-y\parallel ,\phantom{\rule{1em}{0ex}}\mathrm{\forall }x,y\in C.$

Recall that the (nearest point or metric) projection from H onto C, denoted by ${P}_{C}$, assigns to each $x\in H$ the unique point ${P}_{C}\left(x\right)\in C$ with the property

$\parallel x-{P}_{C}\left(x\right)\parallel =inf\left\{\parallel x-y\parallel :y\in C\right\}.$

It is well known that the metric projection ${P}_{C}$ of H onto C has the following basic properties:

(a) $\parallel {P}_{C}\left(x\right)-{P}_{C}\left(y\right)\parallel \le \parallel x-y\parallel$ for all $x,y\in H$;

(b) $〈x-y,{P}_{C}\left(x\right)-{P}_{C}\left(y\right)〉\ge {\parallel {P}_{C}\left(x\right)-{P}_{C}\left(y\right)\parallel }^{2}$ for every $x,y\in H$;

(c) $〈x-{P}_{C}\left(x\right),y-{P}_{C}\left(x\right)〉\le 0$ for all $x\in H$ and $y\in C$.
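The three projection properties can be checked numerically on random points. The sketch below assumes, purely for illustration, that C is the closed unit ball in $\mathbb{R}^3$, for which ${P}_{C}$ has a closed form; the function names and the tolerance are hypothetical choices.

```python
import math
import random

def proj_ball(x, radius=1.0):
    """P_C for the assumed set C = closed ball centred at the origin."""
    norm = math.sqrt(sum(t * t for t in x))
    return list(x) if norm <= radius else [radius * t / norm for t in x]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def check_projection_properties(trials=1000, dim=3, seed=0, tol=1e-12):
    """Return True if (a), (b), (c) hold (up to tol) on random sample points."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-3, 3) for _ in range(dim)]
        y = [rng.uniform(-3, 3) for _ in range(dim)]
        px, py = proj_ball(x), proj_ball(y)
        d = sub(px, py)
        xy = sub(x, y)
        if math.sqrt(dot(d, d)) > math.sqrt(dot(xy, xy)) + tol:  # (a) nonexpansive
            return False
        if dot(xy, d) < dot(d, d) - tol:                          # (b) firmly nonexpansive
            return False
        if dot(sub(x, px), sub(py, px)) > tol:                    # (c) with P_C(y) as the point of C
            return False
    return True

print(check_projection_properties())
```

A random test of this kind is of course not a proof; it only illustrates what the inequalities say for one concrete set.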

Next we adopt the following notation:

• ${x}_{n}\to x$ means that ${x}_{n}$ converges strongly to x;

• ${x}_{n}⇀x$ means that ${x}_{n}$ converges weakly to x;

• ${\omega }_{w}\left({x}_{n}\right):=\left\{x:\mathrm{\exists }{x}_{{n}_{j}}⇀x\right\}$ is the weak ω-limit set of the sequence $\left\{{x}_{n}\right\}$.

Recall that a function $f:H\to \mathbb{R}$ is called convex if

$f\left(\lambda x+\left(1-\lambda \right)y\right)\le \lambda f\left(x\right)+\left(1-\lambda \right)f\left(y\right),\phantom{\rule{1em}{0ex}}\mathrm{\forall }\lambda \in \left(0,1\right),\mathrm{\forall }x,y\in H.$

It is known that a differentiable function f is convex if and only if the following relation holds:

$f\left(z\right)\ge f\left(x\right)+〈\mathrm{\nabla }f\left(x\right),z-x〉,\phantom{\rule{1em}{0ex}}\mathrm{\forall }x,z\in H.$

Recall that an element $g\in H$ is said to be a subgradient of $f:H\to \mathbb{R}$ at x if

$f\left(z\right)\ge f\left(x\right)+〈g,z-x〉,\phantom{\rule{1em}{0ex}}\mathrm{\forall }z\in H.$

If the function $f:H\to \mathbb{R}$ has at least one subgradient at x, it is said to be subdifferentiable at x. The set of subgradients of f at the point x is called the subdifferential of f at x, and is denoted by $\partial f\left(x\right)$. A function f is called subdifferentiable if it is subdifferentiable at all $x\in H$. If f is convex and differentiable, then its gradient and subgradient coincide. A function $f:H\to \mathbb{R}$ is said to be weakly lower semi-continuous (w-lsc) at x if ${x}_{n}⇀x$ implies

$f\left(x\right)\le \underset{n\to \mathrm{\infty }}{lim inf}f\left({x}_{n}\right).$

f is said to be w-lsc on H if it is w-lsc at every point $x\in H$.

The first lemma is easy to prove.

Lemma 2.1 [14]

Let $f\left(x\right):=\frac{1}{2}{\parallel Ax-{P}_{Q}Ax\parallel }^{2}$. Then

(i) f is convex and differentiable;

(ii) f is w-lsc on C.

Lemma 2.2 [47]

Given ${x}^{\ast }\in {H}_{1}$. Then ${x}^{\ast }$ solves the SFP if and only if ${x}^{\ast }$ solves the fixed point equation

${x}^{\ast }={P}_{C}\left({x}^{\ast }-\gamma {A}^{\ast }\left(I-{P}_{Q}\right)A{x}^{\ast }\right),$

where $\gamma >0$.

Lemma 2.3 [48]

Assume that $\left\{{a}_{n}\right\}$ is a sequence of nonnegative real numbers such that

${a}_{n+1}\le \left(1-{\gamma }_{n}\right){a}_{n}+{\delta }_{n},$

where $\left\{{\gamma }_{n}\right\}$ is a sequence in $\left(0,1\right)$ and $\left\{{\delta }_{n}\right\}$ is a sequence such that

(1) ${\sum }_{n=1}^{\mathrm{\infty }}{\gamma }_{n}=\mathrm{\infty }$;

(2) ${lim sup}_{n\to \mathrm{\infty }}\frac{{\delta }_{n}}{{\gamma }_{n}}\le 0$ or ${\sum }_{n=1}^{\mathrm{\infty }}|{\delta }_{n}|<\mathrm{\infty }$.

Then ${lim}_{n\to \mathrm{\infty }}{a}_{n}=0$.
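A small numerical illustration of Lemma 2.3, under assumed sequences chosen only for this example: ${\gamma }_{n}=1/(n+1)$, ${\delta }_{n}={\gamma }_{n}/n$ (so that ${\delta }_{n}/{\gamma }_{n}\to 0$ and $\sum {\gamma }_{n}=\infty$), with the recursion taken with equality and ${a}_{1}=1$.

```python
def run_recursion(steps=100000):
    """Iterate a_{n+1} = (1 - gamma_n) a_n + delta_n for n = 1, ..., steps."""
    a = 1.0
    for n in range(1, steps + 1):
        gamma = 1.0 / (n + 1)   # in (0, 1), not summable
        delta = gamma / n       # delta_n / gamma_n = 1/n -> 0
        a = (1.0 - gamma) * a + delta
    return a

for steps in (10, 1000, 100000):
    print(steps, run_recursion(steps))
```

The printed values decrease toward zero, but slowly; the lemma gives convergence and no rate.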

Lemma 2.4 [49]

Let $\left\{{s}_{n}\right\}$ be a sequence of real numbers that does not decrease at infinity, in the sense that there exists a subsequence $\left\{{s}_{{n}_{i}}\right\}$ of $\left\{{s}_{n}\right\}$ such that ${s}_{{n}_{i}}\le {s}_{{n}_{i}+1}$ for all $i\ge 0$. For every $n\ge {n}_{0}$, define an integer sequence $\left\{\tau \left(n\right)\right\}$ as

$\tau \left(n\right)=max\left\{k\le n:{s}_{k}<{s}_{k+1}\right\}.$

Then $\tau \left(n\right)\to \mathrm{\infty }$ as $n\to \mathrm{\infty }$ and for all $n\ge {n}_{0}$

$max\left\{{s}_{\tau \left(n\right)},{s}_{n}\right\}\le {s}_{\tau \left(n\right)+1}.$

3 Main results

In this section we state and prove our main results.

Let C and Q be nonempty closed convex subsets of real Hilbert spaces ${H}_{1}$ and ${H}_{2}$, respectively. Let $\psi :C\to {H}_{1}$ be a δ-contraction with $\delta \in \left[0,\frac{\sqrt{2}}{2}\right)$. Let $A:{H}_{1}\to {H}_{2}$ be a bounded linear operator.

Algorithm 3.1 For given ${x}_{0}\in C$, assume that $\left\{{x}_{n}\right\}$ has been constructed. If $\mathrm{\nabla }f\left({x}_{n}\right)=0$, then stop and ${x}_{n}$ is a solution of SFP (1). Otherwise, continue and compute ${x}_{n+1}$ by the recursion

${x}_{n+1}={P}_{C}\left[{\alpha }_{n}\psi \left({x}_{n}\right)+\left(1-{\alpha }_{n}\right)\left({x}_{n}-\frac{{\rho }_{n}f\left({x}_{n}\right)}{{\parallel \mathrm{\nabla }f\left({x}_{n}\right)\parallel }^{2}}\mathrm{\nabla }f\left({x}_{n}\right)\right)\right],\phantom{\rule{1em}{0ex}}n\ge 0,$
(6)

where $\left\{{\alpha }_{n}\right\}\subset \left(0,1\right)$ and $\left\{{\rho }_{n}\right\}\subset \left(0,2\right)$.
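The recursion (6) can be sketched on a toy instance. All concrete choices below are assumptions for illustration and are not prescribed by the paper: C is the closed unit ball in $\mathbb{R}^2$, Q is the box $[0.5,1]^2$, A is diagonal, $\psi(x)=x/2$ (a contraction with $\delta = 1/2 < \sqrt{2}/2$), ${\alpha }_{n}=1/(n+2)$ and ${\rho }_{n}=1$. For this ψ the inequality (7) reduces to $〈z,z-x〉\le 0$ on Γ, so the limit is expected to be the minimum norm point of Γ, which is $(0.25, 0.5)$ here.

```python
import math

def matvec(M, v):
    return [sum(m * t for m, t in zip(row, v)) for row in M]

def proj_ball(x, radius=1.0):
    """P_C for the assumed set C = closed unit ball."""
    norm = math.sqrt(sum(t * t for t in x))
    return list(x) if norm <= radius else [radius * t / norm for t in x]

def proj_box(y, lo=0.5, hi=1.0):
    """P_Q for the assumed set Q = [lo, hi]^m."""
    return [min(max(t, lo), hi) for t in y]

def f_and_grad(A, x):
    """Return f(x) = 0.5 * ||(I - P_Q) A x||^2 and its gradient A^T (I - P_Q) A x."""
    Ax = matvec(A, x)
    residual = [a - p for a, p in zip(Ax, proj_box(Ax))]
    grad = matvec([list(col) for col in zip(*A)], residual)
    return 0.5 * sum(t * t for t in residual), grad

def algorithm_3_1(A, x0, psi, iters=5000, rho=1.0):
    x = list(x0)
    for n in range(iters):
        fx, g = f_and_grad(A, x)
        g2 = sum(t * t for t in g)
        if g2 == 0.0:               # grad f(x_n) = 0: stop, x_n solves the SFP
            return x
        alpha = 1.0 / (n + 2)       # alpha_n -> 0 and sum alpha_n = infinity (assumed choice)
        step = rho * fx / g2        # self-adaptive step: no norm of A is needed
        y = [xi - step * gi for xi, gi in zip(x, g)]
        x = proj_ball([alpha * pi + (1.0 - alpha) * yi for pi, yi in zip(psi(x), y)])
    return x

A = [[2.0, 0.0], [0.0, 1.0]]
x = algorithm_3_1(A, [0.0, 0.0], psi=lambda v: [0.5 * t for t in v])
print("x =", x)
```

The step-size ${\rho }_{n}f({x}_{n})/{\parallel \nabla f({x}_{n})\parallel }^{2}$ is computed from the current iterate alone, which is the point of the method; convergence is slow because ${\alpha }_{n}$ decays only like $1/n$.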

Theorem 3.1 Suppose that the SFP is consistent, that is, $\mathrm{\Gamma }\ne \mathrm{\varnothing }$. Assume that the following conditions hold:

(a) ${lim}_{n\to \mathrm{\infty }}{\alpha }_{n}=0$ and ${\sum }_{n=1}^{\mathrm{\infty }}{\alpha }_{n}=\mathrm{\infty }$;

(b) ${inf}_{n}{\rho }_{n}\left(2-{\rho }_{n}\right)>0$.

Then $\left\{{x}_{n}\right\}$ defined by (6) converges strongly to z, which solves the following variational inequality:

$z\in \mathrm{\Gamma }\phantom{\rule{1em}{0ex}}\mathit{\text{such that}}\phantom{\rule{1em}{0ex}}〈z-\psi \left(z\right),z-x〉\le 0\phantom{\rule{1em}{0ex}}\mathit{\text{for all}}\phantom{\rule{0.25em}{0ex}}x\in \mathrm{\Gamma }.$
(7)

Proof First, the solution of the variational inequality (7) is unique, since $I-\psi$ is strongly monotone; we denote it by z. Then $z={P}_{\mathrm{\Gamma }}\left(\psi \left(z\right)\right)$. We may assume that the sequence $\left\{{x}_{n}\right\}$ is infinite, that is, Algorithm 3.1 does not terminate in a finite number of iterations. Thus, $\mathrm{\nabla }f\left({x}_{n}\right)\ne 0$ for all n. From (6), we have

$\begin{array}{rcl}{\parallel {x}_{n+1}-z\parallel }^{2}& =& {\parallel {P}_{C}\left[{\alpha }_{n}\psi \left({x}_{n}\right)+\left(1-{\alpha }_{n}\right)\left({x}_{n}-\frac{{\rho }_{n}f\left({x}_{n}\right)}{{\parallel \mathrm{\nabla }f\left({x}_{n}\right)\parallel }^{2}}\mathrm{\nabla }f\left({x}_{n}\right)\right)\right]-z\parallel }^{2}\\ \le & {\parallel {\alpha }_{n}\left(\psi \left({x}_{n}\right)-z\right)+\left(1-{\alpha }_{n}\right)\left({x}_{n}-\frac{{\rho }_{n}f\left({x}_{n}\right)}{{\parallel \mathrm{\nabla }f\left({x}_{n}\right)\parallel }^{2}}\mathrm{\nabla }f\left({x}_{n}\right)-z\right)\parallel }^{2}\\ \le & {\alpha }_{n}{\parallel \psi \left({x}_{n}\right)-z\parallel }^{2}+\left(1-{\alpha }_{n}\right){\parallel {x}_{n}-\frac{{\rho }_{n}f\left({x}_{n}\right)}{{\parallel \mathrm{\nabla }f\left({x}_{n}\right)\parallel }^{2}}\mathrm{\nabla }f\left({x}_{n}\right)-z\parallel }^{2}\\ \le & \left(1-{\alpha }_{n}\right)\left[{\parallel {x}_{n}-z\parallel }^{2}+\frac{{\rho }_{n}^{2}{f}^{2}\left({x}_{n}\right)}{{\parallel \mathrm{\nabla }f\left({x}_{n}\right)\parallel }^{2}}-\frac{2{\rho }_{n}f\left({x}_{n}\right)}{{\parallel \mathrm{\nabla }f\left({x}_{n}\right)\parallel }^{2}}〈\mathrm{\nabla }f\left({x}_{n}\right),{x}_{n}-z〉\right]\\ +{\alpha }_{n}{\left(\parallel \psi \left({x}_{n}\right)-\psi \left(z\right)\parallel +\parallel \psi \left(z\right)-z\parallel \right)}^{2}.\end{array}$
(8)

By the convexity of f (Lemma 2.1) and the fact that $f\left(z\right)=0$ for $z\in \mathrm{\Gamma }$, we deduce that

$f\left({x}_{n}\right)=f\left({x}_{n}\right)-f\left(z\right)\le 〈\mathrm{\nabla }f\left({x}_{n}\right),{x}_{n}-z〉.$
(9)

Using the inequality ${\left(a+b\right)}^{2}\le 2\left({a}^{2}+{b}^{2}\right)$ for all $a,b\in \mathbb{R}$, we have

$\begin{array}{rcl}{\left(\parallel \psi \left({x}_{n}\right)-\psi \left(z\right)\parallel +\parallel \psi \left(z\right)-z\parallel \right)}^{2}& \le & 2{\parallel \psi \left({x}_{n}\right)-\psi \left(z\right)\parallel }^{2}+2{\parallel \psi \left(z\right)-z\parallel }^{2}\\ \le & 2{\delta }^{2}{\parallel {x}_{n}-z\parallel }^{2}+2{\parallel \psi \left(z\right)-z\parallel }^{2}.\end{array}$
(10)

From (8)-(10), we get

$\begin{array}{rcl}{\parallel {x}_{n+1}-z\parallel }^{2}& \le & \left(1-{\alpha }_{n}\right)\left[{\parallel {x}_{n}-z\parallel }^{2}-{\rho }_{n}\left(2-{\rho }_{n}\right)\frac{{f}^{2}\left({x}_{n}\right)}{{\parallel \mathrm{\nabla }f\left({x}_{n}\right)\parallel }^{2}}\right]\\ +2{\delta }^{2}{\alpha }_{n}{\parallel {x}_{n}-z\parallel }^{2}+2{\alpha }_{n}{\parallel \psi \left(z\right)-z\parallel }^{2}\\ \le & \left[1-\left(1-2{\delta }^{2}\right){\alpha }_{n}\right]{\parallel {x}_{n}-z\parallel }^{2}+\left(1-2{\delta }^{2}\right){\alpha }_{n}\frac{2{\parallel \psi \left(z\right)-z\parallel }^{2}}{1-2{\delta }^{2}}\\ \le & max\left\{{\parallel {x}_{n}-z\parallel }^{2},\frac{2{\parallel \psi \left(z\right)-z\parallel }^{2}}{1-2{\delta }^{2}}\right\}.\end{array}$

By induction, we deduce

${\parallel {x}_{n+1}-z\parallel }^{2}\le max\left\{{\parallel {x}_{0}-z\parallel }^{2},\frac{2{\parallel \psi \left(z\right)-z\parallel }^{2}}{1-2{\delta }^{2}}\right\}.$

Hence, $\left\{{x}_{n}\right\}$ is bounded.

By using the firm nonexpansivity of ${P}_{C}$, we derive that

$\begin{array}{rcl}{\parallel {x}_{n+1}-z\parallel }^{2}& =& {\parallel {P}_{C}\left[{\alpha }_{n}\psi \left({x}_{n}\right)+\left(1-{\alpha }_{n}\right)\left({x}_{n}-\frac{{\rho }_{n}f\left({x}_{n}\right)}{{\parallel \mathrm{\nabla }f\left({x}_{n}\right)\parallel }^{2}}\mathrm{\nabla }f\left({x}_{n}\right)\right)\right]-{P}_{C}z\parallel }^{2}\\ \le & {\alpha }_{n}〈\psi \left({x}_{n}\right)-z,{x}_{n+1}-z〉+\left(1-{\alpha }_{n}\right)〈{x}_{n}-\frac{{\rho }_{n}f\left({x}_{n}\right)}{{\parallel \mathrm{\nabla }f\left({x}_{n}\right)\parallel }^{2}}\mathrm{\nabla }f\left({x}_{n}\right)-z,{x}_{n+1}-z〉\\ =& {\alpha }_{n}〈\psi \left({x}_{n}\right)-\psi \left(z\right),{x}_{n+1}-z〉+{\alpha }_{n}〈\psi \left(z\right)-z,{x}_{n+1}-z〉\\ +\left(1-{\alpha }_{n}\right)〈{x}_{n}-\frac{{\rho }_{n}f\left({x}_{n}\right)}{{\parallel \mathrm{\nabla }f\left({x}_{n}\right)\parallel }^{2}}\mathrm{\nabla }f\left({x}_{n}\right)-z,{x}_{n+1}-z〉\\ \le & {\alpha }_{n}\delta \parallel {x}_{n}-z\parallel \parallel {x}_{n+1}-z\parallel +{\alpha }_{n}〈\psi \left(z\right)-z,{x}_{n+1}-z〉\\ +\left(1-{\alpha }_{n}\right)\parallel {x}_{n}-\frac{{\rho }_{n}f\left({x}_{n}\right)}{{\parallel \mathrm{\nabla }f\left({x}_{n}\right)\parallel }^{2}}\mathrm{\nabla }f\left({x}_{n}\right)-z\parallel \parallel {x}_{n+1}-z\parallel \\ =& \left({\alpha }_{n}\delta \parallel {x}_{n}-z\parallel +\left(1-{\alpha }_{n}\right)\parallel {x}_{n}-\frac{{\rho }_{n}f\left({x}_{n}\right)}{{\parallel \mathrm{\nabla }f\left({x}_{n}\right)\parallel }^{2}}\mathrm{\nabla }f\left({x}_{n}\right)-z\parallel \right)\parallel {x}_{n+1}-z\parallel \\ +{\alpha }_{n}〈\psi \left(z\right)-z,{x}_{n+1}-z〉\\ \le & \frac{1}{2}{\left({\alpha }_{n}\delta \parallel {x}_{n}-z\parallel +\left(1-{\alpha }_{n}\right)\parallel {x}_{n}-\frac{{\rho }_{n}f\left({x}_{n}\right)}{{\parallel \mathrm{\nabla }f\left({x}_{n}\right)\parallel }^{2}}\mathrm{\nabla }f\left({x}_{n}\right)-z\parallel \right)}^{2}\\ +\frac{1}{2}{\parallel {x}_{n+1}-z\parallel }^{2}+{\alpha }_{n}〈\psi \left(z\right)-z,{x}_{n+1}-z〉.\end{array}$

It follows that

$\begin{array}{rcl}{\parallel {x}_{n+1}-z\parallel }^{2}& \le & {\left({\alpha }_{n}\delta \parallel {x}_{n}-z\parallel +\left(1-{\alpha }_{n}\right)\parallel {x}_{n}-\frac{{\rho }_{n}f\left({x}_{n}\right)}{{\parallel \mathrm{\nabla }f\left({x}_{n}\right)\parallel }^{2}}\mathrm{\nabla }f\left({x}_{n}\right)-z\parallel \right)}^{2}\\ +2{\alpha }_{n}〈\psi \left(z\right)-z,{x}_{n+1}-z〉\\ \le & {\alpha }_{n}{\delta }^{2}{\parallel {x}_{n}-z\parallel }^{2}+\left(1-{\alpha }_{n}\right){\parallel {x}_{n}-\frac{{\rho }_{n}f\left({x}_{n}\right)}{{\parallel \mathrm{\nabla }f\left({x}_{n}\right)\parallel }^{2}}\mathrm{\nabla }f\left({x}_{n}\right)-z\parallel }^{2}\\ +2{\alpha }_{n}〈\psi \left(z\right)-z,{x}_{n+1}-z〉\\ \le & {\alpha }_{n}{\delta }^{2}{\parallel {x}_{n}-z\parallel }^{2}+\left(1-{\alpha }_{n}\right)\left[{\parallel {x}_{n}-z\parallel }^{2}-{\rho }_{n}\left(2-{\rho }_{n}\right)\frac{{f}^{2}\left({x}_{n}\right)}{{\parallel \mathrm{\nabla }f\left({x}_{n}\right)\parallel }^{2}}\right]\\ +2{\alpha }_{n}〈\psi \left(z\right)-z,{x}_{n+1}-z〉\\ =& \left[1-\left(1-{\delta }^{2}\right){\alpha }_{n}\right]{\parallel {x}_{n}-z\parallel }^{2}+2{\alpha }_{n}〈\psi \left(z\right)-z,{x}_{n+1}-z〉\\ -\left(1-{\alpha }_{n}\right){\rho }_{n}\left(2-{\rho }_{n}\right)\frac{{f}^{2}\left({x}_{n}\right)}{{\parallel \mathrm{\nabla }f\left({x}_{n}\right)\parallel }^{2}}.\end{array}$
(11)

Next, we will prove that ${x}_{n}\to z$ following the ideas in [49]. Set ${s}_{n}={\parallel {x}_{n}-z\parallel }^{2}$ for all $n\ge 0$. Since ${\alpha }_{n}\to 0$ and ${inf}_{n}{\rho }_{n}\left(2-{\rho }_{n}\right)>0$, we may assume, without loss of generality, that $\left(1-{\alpha }_{n}\right){\rho }_{n}\left(2-{\rho }_{n}\right)\ge \sigma$ for some $\sigma >0$. Thus, we can rewrite (11) as

${s}_{n+1}-{s}_{n}+\left(1-{\delta }^{2}\right){\alpha }_{n}{s}_{n}+\frac{\sigma {f}^{2}\left({x}_{n}\right)}{{\parallel \mathrm{\nabla }f\left({x}_{n}\right)\parallel }^{2}}\le 2{\alpha }_{n}〈\psi \left(z\right)-z,{x}_{n+1}-z〉.$
(12)

Now, we consider two possible cases.

Case 1. Assume that $\left\{{s}_{n}\right\}$ is eventually decreasing, i.e., there exists $N>0$ such that $\left\{{s}_{n}\right\}$ is decreasing for $n\ge N$. In this case, $\left\{{s}_{n}\right\}$ must be convergent, and from (12) it follows that

$\begin{array}{rcl}0\le \frac{\sigma {f}^{2}\left({x}_{n}\right)}{{\parallel \mathrm{\nabla }f\left({x}_{n}\right)\parallel }^{2}}& \le & {s}_{n}-{s}_{n+1}-\left(1-{\delta }^{2}\right){\alpha }_{n}{s}_{n}+2{\alpha }_{n}\parallel \psi \left(z\right)-z\parallel \parallel {x}_{n+1}-z\parallel \\ \le & {s}_{n}-{s}_{n+1}+M{\alpha }_{n},\end{array}$
(13)

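where $M>0$ is a constant such that ${sup}_{n}\left\{2\parallel \psi \left(z\right)-z\parallel \parallel {x}_{n+1}-z\parallel \right\}\le M$. Since $\left\{{s}_{n}\right\}$ converges and ${\alpha }_{n}\to 0$, the right-hand side of (13) tends to zero. Moreover, $\left\{\mathrm{\nabla }f\left({x}_{n}\right)\right\}$ is bounded, because ∇f is Lipschitz continuous and $\left\{{x}_{n}\right\}$ is bounded. Letting $n\to \mathrm{\infty }$ in (13), we therefore get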

$\underset{n\to \mathrm{\infty }}{lim}f\left({x}_{n}\right)=0.$

Since $\left\{{x}_{n}\right\}$ is bounded, there exists a subsequence $\left\{{x}_{{n}_{k}}\right\}$ of $\left\{{x}_{n}\right\}$ converging weakly to $\stackrel{˜}{x}\in C$.

From the weak lower semicontinuity of f, we have

$0\le f\left(\stackrel{˜}{x}\right)\le \underset{k\to \mathrm{\infty }}{lim inf}f\left({x}_{{n}_{k}}\right)=\underset{n\to \mathrm{\infty }}{lim}f\left({x}_{n}\right)=0.$

Hence, $f\left(\stackrel{˜}{x}\right)=0$, i.e., $A\stackrel{˜}{x}\in Q$. This indicates that

${\omega }_{w}\left({x}_{n}\right)\subset \mathrm{\Gamma }.$

Furthermore, due to the property of the projection (c),

$\underset{n\to \mathrm{\infty }}{lim sup}〈\psi \left(z\right)-z,{x}_{n+1}-z〉=\underset{\omega \in {\omega }_{w}\left({x}_{n}\right)}{max}〈\psi \left(z\right)-{P}_{\mathrm{\Gamma }}\left(\psi \left(z\right)\right),\omega -{P}_{\mathrm{\Gamma }}\left(\psi \left(z\right)\right)〉\le 0.$

From (12), we obtain

${s}_{n+1}\le \left[1-\left(1-{\delta }^{2}\right){\alpha }_{n}\right]{s}_{n}+2{\alpha }_{n}〈\psi \left(z\right)-z,{x}_{n+1}-z〉.$
(14)

Applying Lemma 2.3 to (14), we get ${s}_{n}\to 0$.

Case 2. Assume that $\left\{{s}_{n}\right\}$ is not eventually decreasing. Then there exists an integer ${n}_{0}$ such that ${s}_{{n}_{0}}\le {s}_{{n}_{0}+1}$. Thus, we can define an integer sequence $\left\{\tau \left(n\right)\right\}$ for all $n\ge {n}_{0}$ as follows:

$\tau \left(n\right)=max\left\{k\in \mathbb{N}\mid {n}_{0}\le k\le n,{s}_{k}\le {s}_{k+1}\right\}.$

Clearly, $\tau \left(n\right)$ is a non-decreasing sequence such that $\tau \left(n\right)\to +\mathrm{\infty }$ as $n\to \mathrm{\infty }$ and

${s}_{\tau \left(n\right)}\le {s}_{\tau \left(n\right)+1}$

for all $n\ge {n}_{0}$. In this case, we derive from (13) that

$\frac{\sigma {f}^{2}\left({x}_{\tau \left(n\right)}\right)}{{\parallel \mathrm{\nabla }f\left({x}_{\tau \left(n\right)}\right)\parallel }^{2}}\le M{\alpha }_{\tau \left(n\right)}\to 0.$

It follows that

$\underset{n\to \mathrm{\infty }}{lim}f\left({x}_{\tau \left(n\right)}\right)=0.$

This implies that every weak cluster point of $\left\{{x}_{\tau \left(n\right)}\right\}$ is in the solution set Γ; i.e., ${\omega }_{w}\left({x}_{\tau \left(n\right)}\right)\subset \mathrm{\Gamma }$.

On the other hand, we note that

$\parallel {x}_{\tau \left(n\right)+1}-{x}_{\tau \left(n\right)}\parallel \le {\alpha }_{\tau \left(n\right)}\parallel \psi \left({x}_{\tau \left(n\right)}\right)-{x}_{\tau \left(n\right)}\parallel +\left(1-{\alpha }_{\tau \left(n\right)}\right)\frac{{\rho }_{\tau \left(n\right)}f\left({x}_{\tau \left(n\right)}\right)}{\parallel \mathrm{\nabla }f\left({x}_{\tau \left(n\right)}\right)\parallel }\to 0,$

from which we can deduce that

$\begin{array}{rcl}\underset{n\to \mathrm{\infty }}{lim sup}〈\psi \left(z\right)-z,{x}_{\tau \left(n\right)+1}-z〉& =& \underset{n\to \mathrm{\infty }}{lim sup}〈\psi \left(z\right)-z,{x}_{\tau \left(n\right)}-z〉\\ =& \underset{\omega \in {\omega }_{w}\left({x}_{\tau \left(n\right)}\right)}{max}〈\psi \left(z\right)-{P}_{\mathrm{\Gamma }}\left(\psi \left(z\right)\right),\omega -{P}_{\mathrm{\Gamma }}\left(\psi \left(z\right)\right)〉\\ \le & 0.\end{array}$
(15)

Since ${s}_{\tau \left(n\right)}\le {s}_{\tau \left(n\right)+1}$, we have from (12) that

${s}_{\tau \left(n\right)}\le \frac{2}{1-{\delta }^{2}}〈\psi \left(z\right)-z,{x}_{\tau \left(n\right)+1}-z〉.$
(16)

Combining (15) and (16) yields

$\underset{n\to \mathrm{\infty }}{lim sup}{s}_{\tau \left(n\right)}\le 0,$

and hence

$\underset{n\to \mathrm{\infty }}{lim}{s}_{\tau \left(n\right)}=0.$

From (14), we have

$\underset{n\to \mathrm{\infty }}{lim sup}{s}_{\tau \left(n\right)+1}\le \underset{n\to \mathrm{\infty }}{lim sup}{s}_{\tau \left(n\right)}.$

Thus,

$\underset{n\to \mathrm{\infty }}{lim}{s}_{\tau \left(n\right)+1}=0.$

From Lemma 2.4, we have

$0\le {s}_{n}\le max\left\{{s}_{\tau \left(n\right)},{s}_{\tau \left(n\right)+1}\right\}.$

Therefore, ${s}_{n}\to 0$. That is, ${x}_{n}\to z$. This completes the proof. □

From Theorem 3.1, we can deduce easily the following algorithm and corollary.

Algorithm 3.2 For given ${x}_{0}\in C$, assume that $\left\{{x}_{n}\right\}$ has been constructed. If $\mathrm{\nabla }f\left({x}_{n}\right)=0$, then stop and ${x}_{n}$ is a solution of SFP (1). Otherwise, continue and compute ${x}_{n+1}$ by the recursion

${x}_{n+1}={P}_{C}\left[\left(1-{\alpha }_{n}\right)\left({x}_{n}-\frac{{\rho }_{n}f\left({x}_{n}\right)}{{\parallel \mathrm{\nabla }f\left({x}_{n}\right)\parallel }^{2}}\mathrm{\nabla }f\left({x}_{n}\right)\right)\right],\phantom{\rule{1em}{0ex}}n\ge 0,$
(17)

where $\left\{{\alpha }_{n}\right\}\subset \left(0,1\right)$ and $\left\{{\rho }_{n}\right\}\subset \left(0,2\right)$.

Theorem 3.2 Suppose that the SFP is consistent, that is, $\mathrm{\Gamma }\ne \mathrm{\varnothing }$. Assume that the following conditions hold:

(a) ${lim}_{n\to \mathrm{\infty }}{\alpha }_{n}=0$ and ${\sum }_{n=1}^{\mathrm{\infty }}{\alpha }_{n}=\mathrm{\infty }$;

(b) ${inf}_{n}{\rho }_{n}\left(2-{\rho }_{n}\right)>0$.

Then $\left\{{x}_{n}\right\}$ defined by (17) converges strongly to the minimum norm solution of the SFP.
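A brief sketch of why the limit is the minimum norm solution, assuming (as the form of (17) suggests) that Algorithm 3.2 is Algorithm 3.1 with $\psi \equiv 0$: the variational inequality (7) then reads

$〈z,z-x〉\le 0\phantom{\rule{1em}{0ex}}\text{for all}\phantom{\rule{0.25em}{0ex}}x\in \mathrm{\Gamma },$

and hence, by the Cauchy-Schwarz inequality,

${\parallel z\parallel }^{2}\le 〈z,x〉\le \parallel z\parallel \parallel x\parallel \phantom{\rule{1em}{0ex}}\text{for all}\phantom{\rule{0.25em}{0ex}}x\in \mathrm{\Gamma }.$

Thus $\parallel z\parallel \le \parallel x\parallel$ for every $x\in \mathrm{\Gamma }$, that is, $z={P}_{\mathrm{\Gamma }}\left(0\right)$.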

4 Concluding remarks

In this work, we have developed an improved self-adaptive method for solving the split feasibility problem. As a special case, the minimum norm solution of the split feasibility problem can be approached iteratively. This study is motivated by applications to real-world problems whose mathematical models fall within the scope of variational inequality problems.