1 Introduction

In this paper, we consider the problem of finding a solution of the nonlinear system of equations

$$\begin{aligned} F(x)=0, \end{aligned}$$
(1.1)

where $F:R^n\rightarrow R^n$ is continuous and monotone. By monotonicity, we mean

$$\begin{aligned} \bigl\langle F(x)-F(y),x-y\bigr\rangle\geq0,\quad\forall x,y\in R^n. \end{aligned}$$

Nonlinear monotone equations arise in many practical settings, such as the first-order necessary condition of an unconstrained convex optimization problem and the subproblems in the generalized proximal algorithms with Bregman distances [9]. Some monotone variational inequality problems can also be converted into the form (1.1) by means of fixed point maps or normal maps if the underlying function satisfies some coercive conditions [12].

Many methods for solving (1.1) fall into the class of quasi-Newton methods, since these converge rapidly from a sufficiently good initial guess. Since Broyden [1] proposed the first quasi-Newton method for solving nonlinear equations, there has been significant progress in the theoretical study of quasi-Newton methods, especially in local convergence analysis [2, 3]. To ensure global convergence, line search strategies based on some merit function are used. Recently, Solodov and Svaiter [10] presented a Newton-type algorithm for solving systems of monotone equations. By using a hybrid projection method, they showed that their method converges globally. For nonlinear equations, Griewank [4] obtained a global convergence result for Broyden's rank one method. By introducing a new line search process, Li and Fukushima [7] developed a globally convergent Broyden-like method for solving nonlinear equations, and in [6] they presented a globally convergent Gauss-Newton-based BFGS method for solving symmetric nonlinear equations. The methods in [6, 7] are not norm descent. Gu, Li, Qi and Zhou [5] generalized the method in [6] and proposed a globally convergent and norm descent BFGS method for solving symmetric equations. Quite recently, Zhou and Li [13] proposed a globally convergent BFGS method for systems of monotone equations without the use of merit functions. We refer to the papers [8, 11] for a review of recent advances in this area.

In this paper, based on the hyperplane projection method [10], we propose a quasi-Newton method for solving systems of monotone equations without the use of merit functions. The method is a combination of the Broyden method and the hyperplane projection method [10]. Under appropriate conditions, we show that the proposed method is globally convergent. Preliminary numerical results show that the method is promising.

The paper is organized as follows. In Sect. 2, after briefly recalling the hyperplane projection method, we present the algorithm. In Sect. 3, we establish the global convergence of the algorithm. We report some numerical results in the last section.

2 Algorithm

In this section, we describe the method in detail. Let us first recall the hyperplane projection method in [10]. Note that by the monotonicity of F, for any \(\bar{x}\) such that \(F(\bar{x})=0\), we have

$$\begin{aligned} \bigl\langle F(z_k), \bar{x}-z_k \bigr\rangle\leq0. \end{aligned}$$

Let $x_k$ be the current iterate. By performing some kind of line search procedure along a direction \(\bar{d_{k}}\), a point \(z_{k}=x_{k}+\alpha_{k}\bar{d_{k}}\) can be computed such that

$$\begin{aligned} \bigl\langle F(z_k), x_k-z_k \bigr \rangle> 0. \end{aligned}$$

Thus the hyperplane

$$\begin{aligned} {\mathcal{H}}_k=\bigl\{x \in R^n \bigm| \bigl\langle F(z_k), x-z_k \bigr\rangle=0\bigr\} \end{aligned}$$

strictly separates the current iterate $x_k$ from the zeros of (1.1). Therefore, it is reasonable to let the next iterate $x_{k+1}$ be the projection of $x_k$ onto the hyperplane.
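Since ${\mathcal{H}}_k$ has normal vector $F(z_k)$, this projection is available in closed form. Recall that the projection of a point $x$ onto the hyperplane $\{u\in R^n \mid \langle a,u-z\rangle=0\}$ with normal $a\neq0$ is

$$\begin{aligned} P(x)=x-\frac{\langle a,x-z\rangle}{\|a\|^2}a. \end{aligned}$$

Taking $a=F(z_k)$ and $z=z_k$ gives the explicit update (2.2) used in Step 4 of the algorithm below.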

Now, we state the steps of the algorithm as follows.

Algorithm 2.1

(Broyden method)

  • Step 1. Given an initial point $x_0\in R^n$ and constants $\beta\in(0,1)$, $\eta\in(0,1)$, $\xi\in(0,1)$ and $0<\sigma_{\min}<\sigma_{\max}$. Given the initial steplength $\sigma_0=1$, $B_0=I$ (the identity matrix) and $d_0=-F(x_0)$. Let $k:=0$.

  • Step 2. Stop if $\|F(x_k)\|=0$.

  • Step 3. Determine steplength \(\alpha_{k}=\sigma_{k}\beta^{m_{k}}\) such that m k is the smallest nonnegative integer m satisfying

    $$\begin{aligned} -\bigl\langle F\bigl(x_k+\sigma_k \beta^md_k\bigr),d_k \bigr\rangle\geq\eta \sigma_k\beta^m\bigl\|F\bigl(x_k+ \sigma_k\beta^md_k\bigr)\bigr\|\|d_k \|^2. \end{aligned}$$
    (2.1)

    Let $z_k=x_k+\alpha_kd_k$. Stop if $\|F(z_k)\|=0$.

  • Step 4. Compute the projection of $x_k$ onto \({\mathcal{H}}_{k}\) by

    $$\begin{aligned} x_{k+1}=x_k-\frac{\langle F(z_k),x_k-z_k\rangle}{\|F(z_k)\|^2}F(z_k). \end{aligned}$$
    (2.2)

    Stop if $\|F(x_{k+1})\|=0$.

  • Step 5. Compute $B_{k+1}$ by the following Broyden update formula:

    $$\begin{aligned} B_{k+1}=B_k+\frac{(y_k-B_ks_k)s_k^T}{\|s_k\|^2} \end{aligned}$$
    (2.3)

    where $s_k=x_{k+1}-x_k$ and $y_k=F(x_{k+1})-F(x_k)$.

  • Step 6. Compute $d_{k+1}$ by solving the linear equation

    $$\begin{aligned} B_{k+1} d_{k+1}=-F(x_{k+1}). \end{aligned}$$
    (2.4)

    If the system (2.4) is not solvable or the condition

    $$\begin{aligned} d_{k+1}^{T}F(x_{k+1})<-\xi\bigl\|F(x_{k+1}) \bigr\|^2 \end{aligned}$$

    is not satisfied, set $d_{k+1}=-F(x_{k+1})$ and \(\sigma_{k+1}=\max\{\sigma_{\min},\min\{\frac{\|s_{k}\|^{2}}{s_{k}^{T}y_{k}},\sigma_{\max}\}\}\); otherwise set $\sigma_{k+1}=1$.

  • Step 7. Let k:=k+1. Go to Step 3.
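The following is a minimal NumPy sketch of Algorithm 2.1 for illustration (the experiments in Sect. 4 were run with a FORTRAN 90 code, so this is not the authors' implementation); the cap on the number of backtrackings and the guard against $s_k^Ty_k\leq0$ are added safeguards, not part of the stated algorithm.

```python
import numpy as np

def broyden_projection(F, x0, beta=0.6, eta=1e-4, xi=1e-8,
                       sigma_min=1e-10, sigma_max=1e10,
                       tol=1e-4, max_iter=500):
    """Sketch of Algorithm 2.1: Broyden directions + hyperplane projection."""
    x = np.asarray(x0, dtype=float)
    B = np.eye(x.size)                    # B_0 = I
    Fx = F(x)
    d = -Fx                               # d_0 = -F(x_0)
    sigma = 1.0                           # sigma_0 = 1
    for _ in range(max_iter):
        if np.linalg.norm(Fx) <= tol:     # Step 2
            return x
        # Step 3: find alpha_k = sigma_k * beta**m_k satisfying (2.1)
        alpha = sigma
        for _ in range(30):               # cap on m is an added safeguard
            z = x + alpha * d
            Fz = F(z)
            if -(Fz @ d) >= eta * alpha * np.linalg.norm(Fz) * (d @ d):
                break
            alpha *= beta
        if np.linalg.norm(Fz) <= tol:
            return z
        # Step 4: project x_k onto the hyperplane H_k, formula (2.2)
        x_new = x - ((Fz @ (x - z)) / (Fz @ Fz)) * Fz
        F_new = F(x_new)
        # Step 5: Broyden rank-one update (2.3)
        s, y = x_new - x, F_new - Fx
        B += np.outer(y - B @ s, s) / (s @ s)
        # Step 6: try d_{k+1} from (2.4); otherwise fall back to -F
        accept = False
        try:
            d = np.linalg.solve(B, -F_new)
            accept = np.all(np.isfinite(d)) and (d @ F_new < -xi * (F_new @ F_new))
        except np.linalg.LinAlgError:
            pass
        if accept:
            sigma = 1.0
        else:
            d = -F_new
            sty = s @ y                   # nonnegative when F is monotone
            bb = (s @ s) / sty if sty > 0 else sigma_max
            sigma = max(sigma_min, min(bb, sigma_max))
        x, Fx = x_new, F_new              # Step 7
    return x
```

The fallback $d_{k+1}=-F(x_{k+1})$ in Step 6 is what guarantees (2.5) below, which in turn makes the line search in Step 3 well-defined.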

Remark

It is easy to see from Step 6 of Algorithm 2.1 that, writing $F_k=F(x_k)$,

$$\begin{aligned} -F_{k}^{T}d_k\geq\xi \|F_k\|^2. \end{aligned}$$
(2.5)

Indeed, as $m\rightarrow\infty$, the left-hand side of (2.1) tends to $-\langle F(x_k),d_k\rangle\geq\xi\|F_k\|^2>0$ while the right-hand side tends to zero. Therefore, after a finite number of reductions of $\alpha_k$, the line search condition (2.1) necessarily holds. Consequently, Algorithm 2.1 is well-defined.

3 Convergence property

This section is devoted to the global convergence of Algorithm 2.1. To establish global convergence of Algorithm 2.1, we need the following assumption.

Assumption 3.1

(1) ∇F is Lipschitz continuous on R n, i.e., there is a constant L>0 such that

$$\begin{aligned} \bigl\|\nabla F(x)-\nabla F(y)\bigr\|\leq L \|x-y\|, \quad \forall x, y \in R^{n}, \end{aligned}$$

where ∇F denotes the Jacobian of F.

(2) $\nabla F(x)$ is nonsingular for every $x\in R^n$.

Before proving global convergence of Algorithm 2.1, we first give three preliminary lemmas. The following lemma is from [10].

Lemma 3.1

Let F be monotone and $x,y\in R^n$ satisfy $\langle F(y),x-y\rangle>0$. Let

$$\begin{aligned} x^+=x-\frac{\langle F(y), x-y\rangle}{\|F(y)\|^2}F(y). \end{aligned}$$

Then for any \(\bar{x}\in R^{n}\) such that \(F(\bar{x})=0\), it holds that

$$\begin{aligned} \|x^+-\bar{x}\|^2\leq\|x-\bar{x}\|^2-\|x^+-x \|^2. \end{aligned}$$
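The proof in [10] exploits the fact that $x^+$ is the projection of $x$ onto the hyperplane $\{u\in R^n \mid \langle F(y),u-y\rangle=0\}$, while any solution $\bar{x}$ satisfies $\langle F(y),\bar{x}-y\rangle\leq0$ by monotonicity. These two facts give $\langle x-x^+,x^+-\bar{x}\rangle\geq0$, so expanding

$$\begin{aligned} \|x-\bar{x}\|^2=\|x-x^+\|^2+\|x^+-\bar{x}\|^2+2\bigl\langle x-x^+,x^+-\bar{x}\bigr\rangle \end{aligned}$$

yields the stated inequality.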

Define

$$\begin{aligned} \delta_k=\frac{\|y_k-B_ks_k\|}{\|s_k\|} \quad \mbox{and} \quad A_{k+1}=\int _{0}^{1}\nabla F(x_k+\tau s_k)d\tau. \end{aligned}$$

Then by the mean-value theorem, we have $y_k=A_{k+1}s_k$ and hence

$$\begin{aligned} \delta_k=\frac{\|(A_{k+1}-B_k)s_k\|}{\|s_k\|}. \end{aligned}$$

Moreover, by the update formula (2.3), we have

$$\begin{aligned} B_{k+1}=B_k+ \frac{(A_{k+1}-B_k)s_ks_k^T}{\|s_k\|^2}. \end{aligned}$$

In a similar way to Lemma 2.6 in [7], it is not difficult to prove the following useful lemma.

Lemma 3.2

Suppose that Assumption 3.1 holds and the sequence $\{x_k\}$ generated by Algorithm 2.1 is bounded. If

$$\begin{aligned} \sum_{k=0}^{\infty} \|s_k\|^2< \infty, \end{aligned}$$
(3.1)

then there is a subsequence of $\{\delta_k\}$ tending to zero.

Lemma 3.3

Suppose that Assumption 3.1 holds and the sequence $\{x_k\}$ generated by Algorithm 2.1 is bounded. If (3.1) holds, then there exist an infinite index set $K_1$ and a constant $C_1>0$ such that

$$\begin{aligned} \xi \bigl\|F(x_k)\bigr\| \leq\|d_k\| \leq C_1 \bigl\|F(x_k)\bigr\| \end{aligned}$$
(3.2)

for all $k\in K_1$ large enough.

Proof

By Lemma 3.2, there is a subsequence $\{\delta_k\}_{k\in K}$ of $\{\delta_k\}$ converging to zero. Since $\{x_k\}_{k\in K}$ is bounded, there exists an infinite set $K_1\subseteq K$ such that \(\lim_{k \in K_{1}}x_{k}=\bar{x}\). By (3.1) and the definition of $A_{k+1}$, it is clear that \(\{A_{k+1}\}_{k\in K_{1}}\) tends to \(\nabla F(\bar{x})\). By the nonsingularity of \(\nabla F(\bar{x})\), there exists a constant $M_1>0$ such that \(\|A_{k+1}^{-1}\| \leq M_{1}\) for all $k\in K_1$ sufficiently large. Thus by Step 6 of Algorithm 2.1 and the definition of $\delta_k$, we have

$$\begin{aligned} \|d_k\| \leq& \max\bigl\{\bigl\|F(x_k)\bigr\|, \bigl\|A_{k+1}^{-1}\bigl((A_{k+1}-B_k)d_k-F(x_k)\bigr)\bigr\|\bigr\} \\ \leq& \max\bigl\{\bigl\|F(x_k)\bigr\|, M_1\bigl(\delta_k\|d_k\|+\bigl\|F(x_k)\bigr\|\bigr)\bigr\}. \end{aligned}$$

Since \(\lim_{k\in K_{1}}\delta_{k} =0\), the last inequality implies that there is a constant C 1>0 such that for all kK 1 sufficiently large

$$\begin{aligned} \|d_k\| \leq C_1 \bigl\|F(x_k)\bigr\|. \end{aligned}$$
(3.3)

On the other hand, applying the Cauchy-Schwarz inequality to (2.5), we obtain

$$\begin{aligned} \| d_k \| \geq\xi\|F_k\|. \end{aligned}$$

The last inequality together with (3.3) implies (3.2). □

Now we establish a global convergence theorem for Algorithm 2.1.

Theorem 3.1

Suppose that Assumption 3.1 holds. Let $\{x_k\}$ be generated by Algorithm 2.1. Suppose that F is monotone and that the solution set of (1.1) is not empty. Then for any \(\bar{x}\) such that \(F(\bar{x})=0\), it holds that

$$\begin{aligned} \|x_{k+1}-\bar{x}\|^2\leq\|x_k-\bar{x} \|^2-\|x_{k+1}-x_k\|^2. \end{aligned}$$

In particular, $\{x_k\}$ is bounded. Furthermore, either $\{x_k\}$ is finite and the last iterate is a solution, or the sequence is infinite and $\lim_{k\rightarrow\infty}\|x_{k+1}-x_k\|=0$. Moreover, $\{x_k\}$ converges to some solution of (1.1).

Proof

We first note that if the algorithm terminates at some iteration k, then $\|F(z_k)\|=0$ or $\|F(x_k)\|=0$. This means that $z_k$ or $x_k$ is a solution of (1.1).

Suppose that $\|F(z_k)\|\neq0$ and $\|F(x_k)\|\neq0$ for all k. Then an infinite sequence $\{x_k\}$ is generated. It follows from (2.1) that

$$\begin{aligned} \bigl\langle F(z_k),x_k-z_k \bigr\rangle=-\alpha_k \bigl\langle F(z_k),d_k \bigr\rangle \geq\eta \bigl\|F(z_k)\bigr\|\alpha_{k}^2 \|d_k\|^2>0. \end{aligned}$$
(3.4)

Let \(\bar{x}\) be any solution such that \(F(\bar{x})=0\). By (2.2), (3.4) and Lemma 3.1, we obtain

$$\begin{aligned} \|x_{k+1}-\bar{x}\|^2\leq\|x_k- \bar{x}\|^2-\|x_{k+1}-x_k\|^2. \end{aligned}$$
(3.5)

Hence the sequence \(\{\|x_{k}-\bar{x}\|^{2}\}\) is nonincreasing and convergent. In particular, the sequence \(\{\|x_{k}-\bar{x}\|\}\) is convergent and the sequence $\{x_k\}$ is bounded. Again by (3.5), we have

$$\begin{aligned} \|x_{k+1}-x_k\|^2 \leq \|x_k-\bar{x}\|^2-\|x_{k+1}-\bar{x} \|^2. \end{aligned}$$
(3.6)

Summing both sides of (3.6) and using the convergence of the sequence \(\{\|x_{k}-\bar{x}\|^{2}\}\), we obtain (3.1). In particular, we have

$$\begin{aligned} \lim_{k\rightarrow\infty}\|x_{k+1}-x_k \|=0. \end{aligned}$$
(3.7)

By (2.2) and (3.4), we obtain

$$\begin{aligned} \|x_{k+1}-x_k\|=\frac{\langle F(z_k),x_k-z_k\rangle}{\|F(z_k)\|}\geq\eta \alpha_k^2\|d_k\|^2. \end{aligned}$$

The last inequality together with (3.7) implies

$$\begin{aligned} \lim_{k\rightarrow\infty}\alpha_k\|d_k \|=0. \end{aligned}$$
(3.8)

Now we consider the following two possible cases:

  (i) $\liminf_{k\rightarrow\infty}\|F(x_k)\|=0$;

  (ii) $\liminf_{k\rightarrow\infty}\|F(x_k)\|=\epsilon>0$.

If (i) holds, by the continuity of F and the boundedness of $\{x_k\}$, the sequence $\{x_k\}$ has some accumulation point \(\widehat{x}\) such that \(F(\widehat{x})=0\). From (3.5), the sequence \(\{\|x_{k}-\widehat{x}\|\}\) is convergent, and since \(\widehat{x}\) is an accumulation point of $\{x_k\}$, the whole sequence $\{x_k\}$ converges to \(\widehat{x}\).

Suppose now that (ii) holds. In this case, by the boundedness of $\{x_k\}$ and the continuity of F, there exists a positive constant C such that, for all k sufficiently large,

$$\begin{aligned} \frac{\epsilon}{2} \leq \bigl\|F(x_k)\bigr\| \leq C. \end{aligned}$$
(3.9)

On the other hand, by Assumption 3.1, the boundedness of $\{x_k\}$ and (3.1), Lemma 3.3 applies. Then (3.2) and (3.9) imply that $\{\|d_k\|\}_{k\in K_1}$ is bounded away from zero. Hence, from (3.8) we get that

$$\begin{aligned} \lim_{k \in K_1,\; k \rightarrow\infty} \alpha_k=0. \end{aligned}$$

By the line search rule, for all $k\in K_1$ sufficiently large, the steplength \(\sigma_{k}\beta^{m_{k}-1}\) does not satisfy (2.1). This means

$$\begin{aligned} -\bigl\langle F\bigl(x_k+\sigma_k \beta^{m_k-1}d_k\bigr),d_k \bigr\rangle<\eta \sigma_k \beta^{m_k-1}\bigl\|F\bigl(x_k+ \sigma_k\beta^{m_k-1}d_k\bigr)\bigr\| \|d_k\|^2. \end{aligned}$$
(3.10)

The boundedness of \(\{x_{k}\}_{k\in K_{1}} \) implies that there exist a point \(\hat{x}\) and an infinite index set $K_2\subseteq K_1$ such that \(\lim_{k \in K_{2}} x_{k} = \hat{x}\). Since the sequence \(\{d_{k}\}_{k\in K_{2}}\) is also bounded, there exist an infinite index set $K_3\subseteq K_2$ and a point \(\hat{d}\) such that \(\lim_{k \in K_{3}} d_{k}=\hat{d}\). Taking the limit in (3.10) for $k\in K_3$, we obtain

$$\begin{aligned} -\bigl\langle F(\hat{x}),\hat{d} \bigr\rangle\leq0. \end{aligned}$$

However, taking the limit in (2.5) for $k\in K_3$ and using (3.9), we see that

$$\begin{aligned} -\bigl\langle F(\hat{x}),\hat{d} \bigr\rangle\geq\xi\bigl\|F(\hat{x})\bigr\|^2\geq\xi\biggl(\frac{\epsilon}{2}\biggr)^2>0. \end{aligned}$$

This yields a contradiction. Consequently, case (ii) is impossible. The proof is complete. □

4 Numerical results

In this section, we tested Algorithm 2.1 and compared it with the BFGS method in [13] and the INM method in [10]. We implemented Algorithm 2.1 with the following parameters: $\beta=0.6$, $\xi=10^{-8}$ and $\eta=10^{-4}$. If $\sigma_k\notin[\sigma_{\min},\sigma_{\max}]$, we replace it by

$$\begin{aligned} \sigma_k=\begin{cases} 1 &\text{if } \|F(x_k)\|>1,\\ \|F(x_k)\|^{-1} &\text{if } 10^{-5}\leq\|F(x_k)\|\leq1,\\ 10^5 &\text{if } \|F(x_k)\|<10^{-5}, \end{cases} \end{aligned}$$

where $\sigma_{\min}=10^{-10}$ and $\sigma_{\max}=10^{10}$. We stop the iteration if the iteration number exceeds 500 or the inequality

$$\begin{aligned} \bigl\|F(x_k)\bigr\|\leq10^{-4} \quad \mbox{or} \quad \bigl\|F(z_k)\bigr\|\leq 10^{-4} \end{aligned}$$

is satisfied. The BFGS method in [13] was implemented with the following parameters: $\beta=0.6$, $\sigma=10^{-5}$, $h=10^{-4}$ and $r=0$. For the INM method in [10], we set $\mu_k=\|F(x_k)\|$, $\rho_k=0$, $\beta=0.4$, $\lambda=0.0001$. The stopping criterion is $\|F(x_k)\|\leq10^{-4}$ or the iteration number exceeding 500. The codes were written in FORTRAN 90 with double precision arithmetic and run on a PC (CPU 3.0 GHz, 512M memory) with the Windows operating system.
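For concreteness, the safeguard on $\sigma_k$ described above can be written as the small function below (a Python sketch for illustration; the reported experiments used the FORTRAN 90 code mentioned above):

```python
def safeguard_sigma(sigma, norm_F, sigma_min=1e-10, sigma_max=1e10):
    """Replace sigma_k when it leaves [sigma_min, sigma_max] (Sect. 4)."""
    if sigma_min <= sigma <= sigma_max:
        return sigma                 # keep the BB-type value from Step 6
    if norm_F > 1.0:
        return 1.0
    if norm_F >= 1e-5:
        return 1.0 / norm_F
    return 1e5
```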

The efficiency of the proposed method was tested on the following two problems with various dimensions and different initial points.

Problem 1

Function F is given by

$$\begin{aligned} F(x)=Ax+g(x), \end{aligned}$$

where \(g(x)=(2e^{x_{1}}-1,3e^{x_{2}}-1,\ldots,3e^{x_{n-1}}-1, 2e^{x_{n}}-1)^{T}\) and

$$\begin{aligned} A=\left ( \begin{array}{c@{\quad}c@{\quad}c@{\quad}c@{\quad}c} 2&-1\\ -1&2&-1\\ &\ddots&\ddots&\ddots\\ &&\ddots&\ddots&-1\\ &&&-1&2 \end{array} \right ). \end{aligned}$$

Problem 2

Function F is given by

$$\begin{aligned} F_1(x) =&2x_1+\sin(x_1)-1, \\ F_i(x) =&-2x_{i-1}+2x_i+\sin(x_i)-1, \quad i=2,\ldots,n-1, \\ F_n(x) =&2x_n+\sin(x_n)-1. \end{aligned}$$
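For reproducibility, both test functions can be coded in a few lines. The following NumPy sketch (function names are ours, for illustration) evaluates F for each problem:

```python
import numpy as np

def F_problem1(x):
    """Problem 1: F(x) = A x + g(x), A tridiagonal with 2 on the
    diagonal and -1 on the off-diagonals."""
    Ax = 2.0 * x
    Ax[:-1] -= x[1:]                  # superdiagonal contribution
    Ax[1:] -= x[:-1]                  # subdiagonal contribution
    g = 3.0 * np.exp(x) - 1.0
    g[0] = 2.0 * np.exp(x[0]) - 1.0   # first and last components use 2e^x - 1
    g[-1] = 2.0 * np.exp(x[-1]) - 1.0
    return Ax + g

def F_problem2(x):
    """Problem 2: componentwise F_i(x) as defined above (nonsymmetric)."""
    Fx = 2.0 * x + np.sin(x) - 1.0
    Fx[1:-1] -= 2.0 * x[:-2]          # coupling term -2 x_{i-1}, i = 2..n-1
    return Fx
```

Either function can be passed directly to the `broyden_projection` sketch of Sect. 2, e.g. `broyden_projection(F_problem1, 0.1 * np.ones(1000))` for the initial point $x_1$ with $n=1000$.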

We note that Problem 1 is symmetric while Problem 2 is nonsymmetric. The results are listed in Tables 1–2, where $x_1=(0.1,0.1,\ldots,0.1)^T$, $x_2=(1,1,\ldots,1)^T$, \(x_{3}=(\frac{1}{n},\frac{2}{n},\ldots,1)^{T}\), $x_4=(-10,-10,\ldots,-10)^T$, $x_5=(-0.1,-0.1,\ldots,-0.1)^T$, $x_6=(-1,-1,\ldots,-1)^T$, \(x_{7}=(1-\frac{1}{n},1-\frac{2}{n},\ldots,0)^{T}\), $x_8=(-0.01,-0.01,\ldots,-0.01)^T$, \(x_{9}=(\frac{n}{n-1},\frac{n}{n-1},\ldots,\frac{n}{n-1})^{T}\), \(x_{10}=(\frac{1}{n},\frac{1}{n},\ldots,\frac{1}{n})^{T}\), \(x_{11}=(\frac{n}{n+1},\frac{n}{n+1},\ldots,\frac{n}{n+1})^{T}\) and \(x_{12}=(\frac{1}{3},\frac{1}{3},\ldots,\frac{1}{3})^{T}\). In Tables 1–2, we report the problem number along with the initial point number (Pro(initial)), the dimension of each test problem (dim), the number of iterations (iter), the number of function evaluations (fun) and the CPU time in seconds (time). We declare that a method fails, and use the symbol 'F', when one of the following holds:

  (a) the number of iterations is greater than or equal to 500; or

  (b) the number of backtrackings required by the line search at some step is greater than or equal to 20.

Table 1 Test results for Problem 1
Table 2 Test results for Problem 2

We ran each problem 100 times with the same initial point. The CPU time reported in Tables 1–2 is the average value. In the tables, "method 1" and "method 2" represent Algorithm 2.1 and the BFGS method in [13], respectively.

From Tables 1–2, we observe that method 1 performed much better than method 2, while in most cases the INM method performed best. Moreover, during the numerical experiments, it is interesting to note that the fallback step $d_k=-F(x_k)$ never appeared for $k>0$ in method 1. In other words, the condition \(d_{k}^{T}F_{k} \leq-\xi\|F_{k}\|^{2}\) was always satisfied.

5 Conclusion

In this paper, we propose an algorithm for solving nonlinear monotone equations, which combines the Broyden method and the hyperplane projection method. Under appropriate conditions, we prove that the proposed method is globally convergent. We also report some numerical results to show the efficiency of the proposed method.