1 Introduction

In this paper, we consider the following nonlinear system of monotone equations:

$$ F(x) = 0, $$
(1)

where \(F:\mathbb{R}^{n} \rightarrow\mathbb{R}^{n}\) is continuous and monotone, that is, F satisfies \((F(x) - F(y))^{T} (x - y) \geq0 \) for all \(x,y\in\mathbb{R}^{n}\). Nonlinear systems of monotone equations arise widely in economics, finance, engineering, industry and many other fields, so numerous iterative algorithms have been developed for solving (1).
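As a simple illustration (ours, not from the original text), any affine map with a positive semidefinite matrix is monotone in this sense:

$$ F(x) = Ax + b, \quad A \succeq 0 \quad\Longrightarrow\quad \bigl(F(x) - F(y)\bigr)^{T}(x - y) = (x - y)^{T} A (x - y) \geq 0. $$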

Recently, La Cruz [3] presented a spectral method that uses the residual vector as the search direction for solving large-scale systems of nonlinear monotone equations. Solodov and Svaiter [12] proposed a method that combines projection, proximal point and Newton techniques. Building on the work of Solodov and Svaiter [12], Zhang and Zhou [17] developed a spectral gradient projection method for solving nonlinear monotone equations.

In particular, conjugate gradient methods are widely used for solving large-scale nonlinear equations because of their low memory requirements and simplicity [8, 10]. In the last few years, several authors have proposed methods for solving nonlinear monotone equations that combine conjugate gradient directions with the projection method of [12]. For instance, Cheng [2] extended the PRP conjugate gradient method to monotone equations. Yan et al. [13] proposed two modified HS conjugate gradient methods. Li and Li [7] designed a class of derivative-free methods based on a line search technique. Ahookhosh et al. [1] developed two derivative-free conjugate gradient procedures. Papp and Rapajić [9] described some new FR-type methods. Yuan et al. [14, 15] proposed new three-term conjugate gradient methods. Dai et al. [4] gave a modified Perry conjugate gradient method. Zhou et al. [20, 21] developed a class of related methods. Zhang [16] developed a residual method based on a secant condition.

For unconstrained optimization problems, Rivaie et al. [11] designed the RMIL conjugate gradient method. Fang and Ni [6] extended the RMIL method to large-scale nonlinear systems of equations using a nonmonotone line search. Numerical experiments show that the RMIL method is effective in practice.

For systems of monotone equations, we describe a class of new derivative-free gradient-type methods, inspired by the efficiency of the RMIL method [11] and the projection strategy of [12].

This paper is organized as follows. In Sect. 2, we propose the algorithm. In Sect. 3, we establish its global convergence. In Sect. 4, numerical results demonstrate the efficiency of the proposed methods. In Sect. 5, we give some conclusions. Throughout, \(\|\cdot\|\) denotes the Euclidean norm.

2 Algorithm

In this section, we first consider the conjugate gradient method for the following unconstrained optimization problem:

$$ \min f(x), $$
(2)

where \(f:\mathbb{R}^{n} \rightarrow\mathbb{R}\) is smooth.

Quite recently, Rivaie et al. [11] developed the RMIL conjugate gradient method, whose search direction \(d_{k}\) is given by

$$ d_{k} = \left \{ \textstyle\begin{array}{l@{\quad}l} -g_{k} & \textrm{if } k = 0,\\ -g_{k} + \frac{g_{k}^{T} (g_{k} - g_{k-1})}{\|d_{k-1}\|^{2}} d_{k-1} & \textrm {if } k \geq1, \end{array}\displaystyle \right . $$
(3)

where \(g_{k}\) is the gradient of f.
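As a rough illustration (our own sketch, not from [11]), the RMIL direction (3) can be computed as follows; the NumPy setting and function name are assumptions.

```python
import numpy as np

def rmil_direction(g_k, g_prev=None, d_prev=None):
    """RMIL direction (3): d_k = -g_k for k = 0, otherwise
    d_k = -g_k + beta_k * d_{k-1} with beta_k = g_k^T (g_k - g_{k-1}) / ||d_{k-1}||^2."""
    if g_prev is None or d_prev is None:          # k = 0: steepest descent step
        return -g_k
    beta_k = g_k @ (g_k - g_prev) / np.dot(d_prev, d_prev)
    return -g_k + beta_k * d_prev
```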

Numerical results show that the RMIL conjugate gradient method is superior to and more efficient than other conjugate gradient methods.

We now turn to methods for solving the monotone equations (1). In the projection procedure of [12], a line search along the direction \(d_{k}\) is performed to find a point

$$ z_{k} = x_{k} + \alpha_{k} d_{k}, $$
(4)

such that

$$ \bigl(F(z_{k}), x_{k} - z_{k}\bigr) > 0. $$
(5)

On the other hand, for any \(x^{*}\) such that \(F(x^{*}) = 0\), by the monotonicity of F, we obtain

$$ \bigl(F(z_{k}), x^{*} - z_{k}\bigr) = - \bigl(F\bigl(x^{*}\bigr)-F(z_{k}), x^{*} - z_{k}\bigr) \leq0. $$
(6)

Equations (5) and (6) imply that the hyperplane

$$ H_{k} = \bigl\{ x \in R^{n} | \bigl(F(z_{k}), x - z_{k}\bigr)=0\bigr\} $$
(7)

strictly separates the zeros of the system of monotone equations (1) from \(x_{k}\). Therefore, Solodov and Svaiter [12] compute the next iterate \(x_{k+1}\) by projecting \(x_{k}\) onto the hyperplane \(H_{k}\). Specifically, \(x_{k+1}\) is obtained by

$$ x_{k+1} = x_{k} - \frac{F(z_{k})^{T} (x_{k} - z_{k})}{ \Vert F(z_{k}) \Vert ^{2}}F(z_{k}). $$
(8)
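A minimal sketch of the projection step (8) (our own illustration, with hypothetical names), assuming \(F(z_{k})\) has already been evaluated:

```python
import numpy as np

def project_step(x_k, z_k, F_zk):
    """Projection step (8): project x_k onto the hyperplane H_k of (7)."""
    return x_k - (F_zk @ (x_k - z_k)) / np.dot(F_zk, F_zk) * F_zk
```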

The steplength \(\alpha_{k}\) in (4) is determined by a suitable line search technique. Recently, Zhang and Zhou [17] and Zhou [18] presented the following derivative-free line search rule:

$$ -F(x_{k} + \alpha_{k} d_{k})^{T} d_{k} \geq\sigma\alpha_{k} \bigl\Vert F(x_{k} + \alpha_{k} d_{k}) \bigr\Vert \Vert d_{k} \Vert ^{2}, $$
(9)

where \(\alpha_{k}\) is the largest element of \(\{s,\rho s, {\rho}^{2} s ,\ldots\}\) satisfying (9), \(d_{k}\) is the search direction, and σ, s, ρ are constants with \(\sigma>0\), \(s>0\), \(1 > \rho> 0 \).
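A minimal sketch of the backtracking rule (9), assuming a callable F and illustrative default parameters of our own choosing; the accepted trial point later plays the role of \(z_{k}\) in Algorithm 1.

```python
import numpy as np

def derivative_free_line_search(F, x_k, d_k, s=1.0, rho=0.7, sigma=0.3, max_backtracks=100):
    """Return the largest alpha in {s, rho*s, rho^2*s, ...} satisfying rule (9)."""
    alpha = s
    d_norm_sq = np.dot(d_k, d_k)
    for _ in range(max_backtracks):
        F_trial = F(x_k + alpha * d_k)
        if -F_trial @ d_k >= sigma * alpha * np.linalg.norm(F_trial) * d_norm_sq:
            break                       # rule (9) holds for this alpha
        alpha *= rho
    return alpha, F_trial               # F_trial = F(x_k + alpha d_k)
```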

Now, we extend the RMIL conjugate gradient method [11] to nonlinear systems of monotone equations by combining it with the projection method [12] and the derivative-free line search technique [17, 18]. The steps of our algorithm are as follows.

Algorithm 1

(MRMIL)

Step 0: Choose an initial point \(x_{0}\in\mathbb{R}^{n}\). Let \(\delta> 0\), \(\sigma >0\), \(s >0\), \(\epsilon>0\), \(1 > \rho> 0\), \(k_{\mathrm{max}} > 0\). Set \(k = 0\).

Step 1: Choose a search direction \(d_{k}\) that satisfies the following sufficient descent condition, where \(F_{k} = F(x_{k})\):

$$ F_{k}^{T} d_{k} \leq- \delta \Vert F_{k} \Vert ^{2}, $$
(10)

and set the initial steplength \(\alpha= s\).

Step 2: If

$$ -F(x_{k} + \alpha d_{k})^{T} d_{k} \geq\sigma\alpha \bigl\Vert F(x_{k} + \alpha d_{k}) \bigr\Vert \Vert d_{k} \Vert ^{2}, $$
(11)

then set \(\alpha_{k} = \alpha\), \(z_{k} = x_{k} + \alpha_{k} d_{k}\) and go to step 3.

Otherwise, set \(\alpha = \rho\alpha\) and go to step 2.

Step 3: If \(\|F(z_{k})\| > \epsilon\), then compute

$$ x_{k+1} = x_{k} - \frac{F(z_{k})^{T} (x_{k} - z_{k})}{ \Vert F(z_{k}) \Vert ^{2}}F(z_{k}) $$
(12)

and go to step 4; otherwise stop.

Step 4: If \(\|F(x_{k+1})\| > \epsilon\) and \(k < k_{\mathrm{max}}\), then set \(k = k + 1\) and go to step 1; otherwise stop.
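For concreteness, the following self-contained sketch assembles Steps 0–4 into a single routine. It is our own illustration under the stated rules, with the direction update supplied as a callable and simple default parameters; it is not the authors' reference implementation.

```python
import numpy as np

def mrmil_solver(F, x0, direction_fn, delta=0.75, sigma=0.3, s=1.0,
                 rho=0.7, eps=1e-4, k_max=10**5):
    """Sketch of Algorithm 1 (MRMIL).  `direction_fn(F_k, state)` must return
    a direction d_k satisfying the sufficient descent condition (10)."""
    x_k = np.asarray(x0, dtype=float)
    F_k = F(x_k)
    state = {"F_prev": None, "d_prev": None}
    for _ in range(k_max):
        if np.linalg.norm(F_k) <= eps:                                  # step 4 test
            break
        d_k = direction_fn(F_k, state)                                  # step 1
        assert F_k @ d_k <= -delta * np.dot(F_k, F_k) + 1e-12           # condition (10)
        alpha = s
        for _ in range(100):                                            # step 2: backtrack on (11)
            z_k = x_k + alpha * d_k
            F_z = F(z_k)
            if -F_z @ d_k >= sigma * alpha * np.linalg.norm(F_z) * np.dot(d_k, d_k):
                break
            alpha *= rho
        if np.linalg.norm(F_z) <= eps:                                  # step 3 stopping test
            return z_k
        # projection step (12): project x_k onto the hyperplane H_k of (7)
        x_next = x_k - (F_z @ (x_k - z_k)) / np.dot(F_z, F_z) * F_z
        state = {"F_prev": F_k, "d_prev": d_k}
        x_k, F_k = x_next, F(x_next)
    return x_k
```

Here the default `delta=0.75` corresponds to \(\delta = 1 - \gamma\) with \(\gamma = \frac{1}{4}\), the value used in Sect. 4; the MRMIL direction updates introduced next can be passed as `direction_fn` (a sketch of them is given at the end of this section).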

Let \(y_{k-1} = F_{k} - F_{k-1}\), \(\beta_{k} = \frac{F_{k}^{T} y_{k-1}}{\|d_{k-1}\| ^{2}}\), \(0 < \gamma< 1\). Now, based on the direction \(d_{k}\) of the RMIL conjugate gradient algorithm for unconstrained optimization, we construct three directions \(d_{k}\) that satisfy the sufficient descent condition (10).

MRMIL1 direction:

$$ d_{k} = \left \{ \textstyle\begin{array}{l@{\quad}l} -F_{k} & \textrm{if } k = 0,\\ -\theta_{k} F_{k} + \beta_{k} d_{k-1} & \textrm{if } k \geq1, \end{array}\displaystyle \right . $$
(13)

where \(\theta_{k} = \frac{(F_{k}^{T} y_{k-1})^{2}}{4\gamma\|F_{k}\|^{2} \|d_{k-1}\| ^{2}} + 1 \) and \(0 < \gamma< 1\). Setting \(u = \sqrt{2\gamma}\|d_{k-1}\|^{2} F_{k}\) and \(v = \frac{1}{\sqrt {2\gamma}}(F_{k}^{T} y_{k-1}) d_{k-1}\), and using the inequality \(u^{T} v \leq\frac{1}{2}(\|u\|^{2} + \|v\|^{2})\), we have, for \(k \in \mathbb{N}\),

$$ \begin{aligned}[b] F_{k}^{T} d_{k}& = -\theta_{k} \Vert F_{k} \Vert ^{2} + \frac{F_{k}^{T} y_{k-1}}{ \Vert d_{k-1} \Vert ^{2}}F_{k}^{T} d_{k-1} \\ & = - \Vert F_{k} \Vert ^{2} + \frac{F_{k}^{T} y_{k-1} \Vert d_{k-1} \Vert ^{2} F_{k}^{T} d_{k-1} - \frac{1}{4\gamma} (F_{k}^{T} y_{k-1})^{2} \Vert d_{k-1} \Vert ^{2}}{ \Vert d_{k-1} \Vert ^{4}} \\ & \leq(\gamma- 1) \Vert F_{k} \Vert ^{2}. \end{aligned} $$
(14)

The MRMIL1 method is Algorithm 1 with the MRMIL1 direction defined by (13).

MRMIL2 direction:

$$ d_{k} = \left \{ \textstyle\begin{array}{l@{\quad}l} -F_{k} & \textrm{if } k = 0,\\ -\theta_{k} F_{k} + \beta_{k} d_{k-1} & \textrm{if } k \geq1, \end{array}\displaystyle \right . $$
(15)

where \(\theta_{k} = \frac{(F_{k}^{T} d_{k-1})^{2} \|y_{k-1}\|^{2}}{4\gamma\|F_{k}\| ^{2} \|d_{k-1}\|^{4}} + 1 \) and \(0 < \gamma< 1\). Setting \(u = \sqrt{2\gamma}\|d_{k-1}\|^{2} F_{k}\) and \(v = \frac{1}{\sqrt {2\gamma}}(F_{k}^{T} d_{k-1}) y_{k-1}\), and using the inequality \(u^{T} v \leq\frac{1}{2}(\|u\|^{2} + \|v\|^{2})\), we have, for \(k \in \mathbb{N}\),

$$ \begin{aligned}[b] F_{k}^{T} d_{k}& = -\theta_{k} \Vert F_{k} \Vert ^{2} + \frac{F_{k}^{T} y_{k-1}}{ \Vert d_{k-1} \Vert ^{2}}F_{k}^{T} d_{k-1} \\ & = - \Vert F_{k} \Vert ^{2} + \frac{F_{k}^{T} y_{k-1} \Vert d_{k-1} \Vert ^{2} F_{k}^{T} d_{k-1} - \frac{1}{4\gamma}(F_{k}^{T} d_{k-1})^{2} \Vert y_{k-1} \Vert ^{2}}{ \Vert d_{k-1} \Vert ^{4}} \\ & \leq(\gamma- 1) \Vert F_{k} \Vert ^{2}. \end{aligned} $$
(16)

The MRMIL2 method is Algorithm 1 with the MRMIL2 direction defined by (15).

MRMIL3 direction:

$$ d_{k} = \left \{ \textstyle\begin{array}{l@{\quad}l} -F_{k} & \textrm{if } k = 0,\\ - F_{k} + \beta_{k} d_{k-1} - \theta_{k} y_{k-1} & \textrm{if } k \geq1, \end{array}\displaystyle \right . $$
(17)

where \(\theta_{k} = \frac{F_{k}^{T} y_{k-1}}{4\gamma\|d_{k-1}\|^{2}} \) and \(0 < \gamma< 1\). Setting \(u = \sqrt{2\gamma}\|d_{k-1}\|^{2} F_{k}\) and \(v = \frac{1}{\sqrt {2\gamma}}(F_{k}^{T} y_{k-1}) d_{k-1}\), and using the inequality \(u^{T} v \leq\frac{1}{2}(\|u\|^{2} + \|v\|^{2})\), we have, for \(k \in \mathbb{N}\),

$$ \begin{aligned}[b] F_{k}^{T} d_{k}& = - \Vert F_{k} \Vert ^{2} + \frac{F_{k}^{T} y_{k-1}}{ \Vert d_{k-1} \Vert ^{2}}F_{k}^{T} d_{k-1} - \theta_{k} F_{k}^{T} y_{k-1} \\ & = - \Vert F_{k} \Vert ^{2} + \frac{F_{k}^{T} y_{k-1} \Vert d_{k-1} \Vert ^{2} F_{k}^{T} d_{k-1} - \frac{1}{4\gamma}(F_{k}^{T} y_{k-1})^{2} \Vert d_{k-1} \Vert ^{2}}{ \Vert d_{k-1} \Vert ^{4}} \\ & \leq(\gamma- 1) \Vert F_{k} \Vert ^{2}. \end{aligned} $$
(18)

The MRMIL3 method is Algorithm 1 with the MRMIL3 direction defined by (17).

Using (13), (15) and (17), we get

$$ F_{0}^{T} d_{0} = - \Vert F_{0} \Vert ^{2}. $$
(19)

From (14), (16), (18) and (19), it is not difficult to show that the directions \(d_{k}\) defined by the MRMIL1, MRMIL2 and MRMIL3 directions satisfy the sufficient descent condition

$$ F_{k}^{T} d_{k} \leq-\delta \Vert F_{k} \Vert ^{2}, \quad \forall k \in \mathbb{N} \cup\{0\}, $$
(20)

if we let \(\delta= 1 - \gamma\) and \(0< \gamma<1\).
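For completeness, here is a sketch of the three direction updates (13), (15) and (17), written to plug into the solver sketch given after Algorithm 1; the function name, the `state` dictionary and the default \(\gamma = \frac{1}{4}\) are our own choices.

```python
import numpy as np

def mrmil_direction(F_k, state, variant=3, gamma=0.25):
    """MRMIL1/2/3 directions of (13), (15), (17); `state` holds F_{k-1} and d_{k-1}."""
    F_prev, d_prev = state.get("F_prev"), state.get("d_prev")
    if F_prev is None or d_prev is None:                 # k = 0: d_0 = -F_0
        return -F_k
    y = F_k - F_prev                                     # y_{k-1}
    dd = np.dot(d_prev, d_prev)                          # ||d_{k-1}||^2
    beta = (F_k @ y) / dd                                # beta_k
    if variant == 1:                                     # MRMIL1, (13)
        theta = (F_k @ y) ** 2 / (4.0 * gamma * np.dot(F_k, F_k) * dd) + 1.0
        return -theta * F_k + beta * d_prev
    if variant == 2:                                     # MRMIL2, (15)
        theta = (F_k @ d_prev) ** 2 * np.dot(y, y) / (4.0 * gamma * np.dot(F_k, F_k) * dd ** 2) + 1.0
        return -theta * F_k + beta * d_prev
    theta = (F_k @ y) / (4.0 * gamma * dd)               # MRMIL3, (17)
    return -F_k + beta * d_prev - theta * y
```

For example, `lambda Fk, st: mrmil_direction(Fk, st, variant=1)` would supply the MRMIL1 update to the solver sketch above.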

3 Convergence analysis

In this section, in order to establish the global convergence of the MRMIL1, MRMIL2 and MRMIL3 methods, we make the following assumptions.

Assumption 3.1

(1) The solution set of the system of monotone equations \(F(x)=0\) is nonempty.

(2) \(F(x)\) is Lipschitz continuous on \(\mathbb{R}^{n}\), namely

    $$ \bigl\Vert F(x) - F(y) \bigr\Vert \leq L \Vert x - y \Vert , \quad\forall x,y \in\mathbb{R}^{n}, $$
    (21)

    where L is a positive constant.

Assumption 3.1, together with Lemma 3.1 below (which ensures that the sequence \(\{x_{k}\}\) remains in a bounded set), implies that

$$ \bigl\Vert F(x_{k}) \bigr\Vert \leq\kappa,\quad\forall k \in \mathbb{N} \cup\{0\}, $$
(22)

where κ is a positive constant.

We now state Lemma 3.1, whose proof is similar to that in [12] and is therefore omitted.

Lemma 3.1

Suppose Assumption 3.1 is satisfied and the sequence \(\{x_{k}\}\) is generated by Algorithm 1. For any \(x^{*}\) such that \(F({x^{*}}) = 0\), we have

$$\bigl\Vert x_{k+1} - x^{*} \bigr\Vert ^{2} + \Vert x_{k+1} - x_{k} \Vert ^{2}\leq \bigl\Vert x_{k} - x^{*} \bigr\Vert ^{2}. $$

In addition, the sequence \(\{ x_{k} \}\) satisfies

$$ \lim_{k\rightarrow\infty} \Vert x_{k+1} - x_{k} \Vert = 0. $$
(23)

Lemma 3.2

Suppose Assumption 3.1 is satisfied and the sequences \(\{x_{k}\}\) and \(\{d_{k}\}\) are generated by Algorithm 1 with the MRMIL1, MRMIL2 or MRMIL3 direction. Then we have

$$ \Vert d_{k} \Vert \leq\kappa\biggl(1 + Ls + \frac{(Ls)^{2}}{4\gamma} \biggr), $$
(24)

where κ, γ, s, L are constants, and \(\kappa>0\), \(1 > \gamma > 0\), \(s>0\), \(L>0\).

Proof

From (4) and step 3 of Algorithm 1, we have

$$ \Vert x_{k+1} - x_{k} \Vert = \frac{ \Vert F(z_{k})^{T} (x_{k} - z_{k})F(z_{k}) \Vert }{ \Vert F(z_{k}) \Vert ^{2}} \leq \Vert x_{k} - z_{k} \Vert = \alpha_{k} \Vert d_{k} \Vert . $$
(25)

By step 1 and step 2 of Algorithm 1, we get

$$ \alpha_{k} \leq s, \quad\forall k \in \mathbb{N} \cup\{0\}. $$
(26)

For \(k \in\mathbb{N}\), the proof of the boundedness of \(d_{k}\) proceeds in three cases.

Case 1 (MRMIL1 direction): The MRMIL1 direction is defined by (13). Using (21), (22), (25) and (26), we have

$$ \begin{aligned}[b] \Vert d_{k} \Vert & = \biggl\Vert -\biggl(\frac{(F_{k}^{T} y_{k-1})^{2}}{4\gamma \Vert F_{k} \Vert ^{2} \Vert d_{k-1} \Vert ^{2}} + 1\biggr) F_{k} + \frac{F_{k}^{T} y_{k-1}}{ \Vert d_{k-1} \Vert ^{2}}d_{k-1} \biggr\Vert \\ & \leq \Vert F_{k} \Vert \biggl(1 + \frac{ \Vert y_{k-1} \Vert ^{2}}{4\gamma \Vert d_{k-1} \Vert ^{2}} + \frac { \Vert y_{k-1} \Vert }{ \Vert d_{k-1} \Vert }\biggr) \\ & \leq \Vert F_{k} \Vert \biggl(1 + \frac{(L\alpha_{k-1})^{2}}{4\gamma} + L \alpha_{k-1}\biggr) \\ & \leq\kappa\biggl(1 + Ls + \frac{(Ls)^{2}}{4\gamma} \biggr). \end{aligned} $$
(27)

Case 2 (MRMIL2 direction): Analogously, the MRMIL2 direction is defined by (15). By (21), (22), (25) and (26), we get

$$ \begin{aligned}[b] \Vert d_{k} \Vert & = \biggl\Vert -\biggl(\frac{(F_{k}^{T} d_{k-1})^{2} \Vert y_{k-1} \Vert ^{2}}{4\gamma \Vert F_{k} \Vert ^{2} \Vert d_{k-1} \Vert ^{4}} + 1\biggr) F_{k} + \frac{F_{k}^{T} y_{k-1}}{ \Vert d_{k-1} \Vert ^{2}}d_{k-1} \biggr\Vert \\ & \leq \Vert F_{k} \Vert \biggl(1 + \frac{ \Vert y_{k-1} \Vert ^{2}}{4\gamma \Vert d_{k-1} \Vert ^{2}} + \frac { \Vert y_{k-1} \Vert }{ \Vert d_{k-1} \Vert }\biggr) \\ & \leq\kappa\biggl(1 + Ls + \frac{(Ls)^{2}}{4\gamma} \biggr). \end{aligned} $$
(28)

Case 3 (MRMIL3 direction): The MRMIL3 direction is defined by (17). Combining (21), (22), (25) and (26), we obtain

$$ \begin{aligned} [b]\Vert d_{k} \Vert & = \biggl\Vert - F_{k} + \frac{F_{k}^{T} y_{k-1}}{ \Vert d_{k-1} \Vert ^{2}}d_{k-1} - \frac{F_{k}^{T} y_{k-1}}{4\gamma \Vert d_{k-1} \Vert ^{2}} y_{k-1} \biggr\Vert \\ & \leq \Vert F_{k} \Vert \biggl(1 + \frac{ \Vert y_{k-1} \Vert }{ \Vert d_{k-1} \Vert } + \frac{ \Vert y_{k-1} \Vert ^{2}}{4\gamma \Vert d_{k-1} \Vert ^{2}} \biggr) \\ & \leq\kappa\biggl(1 + Ls + \frac{(Ls)^{2}}{4\gamma} \biggr). \end{aligned} $$
(29)

From (13), (15), (17) and (22), we get

$$ \Vert d_{0} \Vert = \Vert -F_{0} \Vert \leq\kappa. $$
(30)

Combining (30) with (27), (28) and (29), we obtain (24). □

Lemma 3.3

Suppose Assumption 3.1 is satisfied and the sequences \(\{x_{k}\}\), \(\{\alpha_{k}\}\), \(\{d_{k}\}\), \(\{F_{k}\}\) are generated by Algorithm 1 with the MRMIL1, MRMIL2 or MRMIL3 direction. If there exists a constant \(\epsilon> 0\) such that \(\|F_{k}\| \geq \epsilon\) for all \(k \in \mathbb{N} \cup\{0\}\), then we have

$$ \alpha_{k} \geq\min \biggl\{ s, \frac{(1 - \gamma)\epsilon^{2}}{\rho ^{-1}\kappa^{2} (L + \sigma\kappa\rho^{-1} (\rho+Ls+(Ls)^{2} + \frac {(Ls)^{3}}{4\gamma} ) ) (1 + Ls + \frac{(Ls)^{2}}{4\gamma} )^{2}} \biggr\} . $$
(31)

Proof

If \(\alpha_{k} \neq s\), then by step 2 of Algorithm 1 the steplength \(\rho^{-1}\alpha_{k}\) does not satisfy (11), that is,

$$ -F\bigl(x_{k} + \rho^{-1} \alpha_{k} d_{k}\bigr)^{T} d_{k} < \sigma\rho^{-1}\alpha _{k} \bigl\Vert F\bigl(x_{k} + \rho^{-1}\alpha_{k} d_{k}\bigr) \bigr\Vert \Vert d_{k} \Vert ^{2}. $$
(32)

Combining with Assumption 3.1, (22) and (24), we have

$$ \begin{aligned}[b] \bigl\Vert F\bigl(x_{k} + \rho^{-1}\alpha_{k}d_{k}\bigr) \bigr\Vert & \leq \bigl\Vert F\bigl(x_{k} + \rho^{-1}\alpha _{k}d_{k}\bigr) - F(x_{k}) \bigr\Vert + \bigl\Vert F(x_{k}) \bigr\Vert \\ & \leq L \rho^{-1}\alpha_{k} \Vert d_{k} \Vert + \kappa \\ & \leq\kappa\rho^{-1} \biggl(\rho+ Ls + (Ls)^{2} + \frac{(Ls)^{3}}{4\gamma} \biggr). \end{aligned} $$
(33)

It follows from (14), (16), (18), (21), (24), (32) and (33) that

$$ \begin{aligned}[b] (1 - \gamma) \Vert F_{k} \Vert ^{2} & \leq-F_{k}^{T}d_{k} \\ & = \bigl[F\bigl(x_{k} + \rho^{-1}\alpha_{k}d_{k} \bigr) - F(x_{k})\bigr]^{T} d_{k} - \bigl[F \bigl(x_{k} + \rho^{-1}\alpha_{k}d_{k} \bigr)\bigr]^{T} d_{k} \\ & < \bigl[L + \sigma \bigl\Vert F\bigl(x_{k} + \rho^{-1}\alpha_{k}d_{k}\bigr) \bigr\Vert \bigr]\rho ^{-1}\alpha_{k} \Vert d_{k} \Vert ^{2} \\ & \leq\rho^{-1}\alpha_{k} \kappa^{2} \biggl(L + \sigma\kappa\rho^{-1} \biggl(\rho+ Ls + (Ls)^{2} + \frac{(Ls)^{3}}{4\gamma} \biggr) \biggr) \biggl(1 + Ls + \frac{(Ls)^{2}}{4\gamma} \biggr)^{2}. \end{aligned} $$
(34)

Then we have

$$ \begin{aligned}[b] \alpha_{k}& > \frac{(1 - \gamma) \Vert F_{k} \Vert ^{2}}{\rho^{-1}\kappa^{2} (L + \sigma\kappa\rho^{-1} (\rho+Ls+(Ls)^{2} + \frac{(Ls)^{3}}{4\gamma} ) ) (1 + Ls + \frac{(Ls)^{2}}{4\gamma} )^{2}} \\ & \geq \frac{(1 - \gamma)\epsilon^{2}}{\rho^{-1}\kappa^{2} (L + \sigma \kappa\rho^{-1} (\rho+Ls+(Ls)^{2} + \frac{(Ls)^{3}}{4\gamma} ) ) (1 + Ls + \frac{(Ls)^{2}}{4\gamma} )^{2}}. \end{aligned} $$
(35)

This implies (31). □

Theorem 3.4

Suppose Assumption 3.1 is satisfied and the sequences \(\{x_{k}\}\), \(\{\alpha_{k}\}\), \(\{d_{k}\}\), \(\{F_{k}\}\) are generated by Algorithm 1 with the MRMIL1, MRMIL2 or MRMIL3 direction. Then we have

$$ \liminf_{k\rightarrow\infty} \Vert F_{k} \Vert = 0. $$
(36)

In particular, the sequence \(\{ x_{k} \}\) converges to some \(x^{*}\) with \(F(x^{*})=0\).

Proof

If (36) does not hold, then there exists a constant \(\epsilon> 0\) such that

$$ \Vert F_{k} \Vert \geq\epsilon, \quad\forall k \geq0. $$
(37)

From (14), (16) and (18), we get

$$ \Vert F_{k} \Vert \Vert d_{k} \Vert \geq \bigl\vert F_{k}^{T} d_{k} \bigr\vert \geq(1-\gamma) \Vert F_{k} \Vert ^{2}. $$
(38)

Using (37) and (38), we have

$$ \Vert d_{k} \Vert \geq(1 - \gamma) \Vert F_{k} \Vert \geq(1 - \gamma)\epsilon. $$
(39)

Suppose \(\tilde{x}\) is an arbitrary accumulation point of \(\{x_{k}\}\) and \(K_{1}\) is an infinite index set such that

$$ \lim_{k \in K_{1}, k \rightarrow\infty} x_{k} = \tilde{x}. $$
(40)

From (23), (25) and (40), we get

$$ \lim_{k \in K_{1}, k \rightarrow\infty} \alpha_{k} \Vert d_{k} \Vert = 0. $$
(41)

On the other hand, combining the conclusion of Lemma 3.3 with (39), we obtain

$$\begin{aligned} \alpha_{k} \Vert d_{k} \Vert \geq& \min \biggl\{ (1-\gamma)\epsilon s, \frac{(1 - \gamma)^{2}\epsilon^{3}}{\rho^{-1}\kappa^{2} (L + \sigma\kappa\rho ^{-1} (\rho+Ls+(Ls)^{2} + \frac{(Ls)^{3}}{4\gamma} ) ) (1 + Ls + \frac{(Ls)^{2}}{4\gamma} )^{2}} \biggr\} \\ >& 0. \end{aligned}$$
(42)

Equations (41) and (42) contradict each other, so conclusion (36) holds. From Assumption 3.1, Lemma 3.1 and (36), we see that the sequence \(\{x_{k}\}\) converges to some accumulation point \(x^{*}\) such that \(F(x^{*})=0\). □

4 Numerical experiments

In this section, we report some numerical test results for the MRMIL1, MRMIL2 and MRMIL3 methods and compare them with the DFPB1 method in [1] and the M3TFR2 method in [9]. Our tests are implemented in Matlab R2011a and run on a personal computer with 8 GB of RAM and an Intel i5-3470 CPU.

In order to compare all methods, we employ the performance profiles [5], which are defined by the following fraction:

$$\rho_{v}(\tau)=\frac{1}{ \vert P \vert } \biggl\vert \biggl\{ p\in P : \textrm{log}_{2} \biggl(\frac {t_{p,v}}{\min\{ t_{p,v}:v\in V\}} \biggr) \leq\tau \biggr\} \biggr\vert , $$

where P is the test set, \(|P|\) is the number of problems in P, V is the set of optimization solvers, and \(t_{p,v}\) is the CPU time (or the number of function evaluations, or the number of iterations) required by solver \(v \in V\) on problem \(p \in P\).
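For reference, a small sketch (ours, NumPy-based; the array layout is an assumption) of how \(\rho_{v}(\tau)\) can be evaluated from a table of measurements:

```python
import numpy as np

def performance_profile(T, taus):
    """T[p, v]: measure (CPU time, iterations, ...) of solver v on problem p;
    failures can be encoded as np.inf.  Returns rho[v, j], the fraction of
    problems with log2(T[p, v] / min_v T[p, v]) <= taus[j]."""
    T = np.asarray(T, dtype=float)
    ratios = np.log2(T / T.min(axis=1, keepdims=True))   # log2 performance ratios
    taus = np.asarray(taus, dtype=float)
    return (ratios[:, :, None] <= taus[None, None, :]).mean(axis=0)

# example with 3 problems and 2 solvers:
# performance_profile([[1.0, 2.0], [3.0, 3.0], [5.0, 4.0]], taus=[0.0, 0.5, 1.0])
```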

We test the following problems with different starting points and various sizes: (1) problems 1–7 from [2, 7, 13, 19] with sizes 1000, 5000, 10,000, 50,000; (2) problem 8 from [7] with sizes 10,000, 20,164, 40,000.

All problems are initialized with the following eight starting points: \(x_{0}^{1}= 10\cdot(1,1,\ldots,1)^{T}\), \(x_{0}^{2}= -10\cdot(1,1,\ldots,1)^{T}\), \(x_{0}^{3}= (1,1,\ldots,1)^{T}\), \(x_{0}^{4}= -(1,1,\ldots,1)^{T}\), \(x_{0}^{5}= 0.1\cdot(1,1,\ldots,1)^{T}\), \(x_{0}^{6}= (1,\frac{1}{2},\ldots,\frac{1}{n})^{T}\), \(x_{0}^{7}= (\frac{1}{n},\frac{2}{n},\ldots,1)^{T}\), \(x_{0}^{8}= (\frac{n-1}{n},\frac{n-2}{n},\ldots,0)^{T}\).

For all methods, the stopping criteria are (1) \(\|F(x_{k})\|\leq\epsilon\), (2) \(\|F(z_{k})\|\leq\epsilon\), or (3) the number of iterations exceeds \(k_{\mathrm{max}}\), where \(\epsilon= 10^{-4}\) and \(k_{\mathrm{max}}=10^{5}\). Similar to [1, 9], we use the same parameters for all five methods: initial steplength \(s= \vert \frac{F_{k}^{T} d_{k}}{ (F(x_{k} + 10^{-8} d_{k}) - F_{k})^{T} d_{k} / 10^{-8}} \vert \), \(\rho= 0.7\), \(\sigma= 0.3\), \(\gamma= \frac{1}{4}\).
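The initial steplength above is a finite-difference quotient; a minimal sketch (our own, with hypothetical names) of its computation:

```python
import numpy as np

def initial_steplength(F, x_k, F_k, d_k, h=1e-8):
    """s = | F_k^T d_k / ((F(x_k + h*d_k) - F_k)^T d_k / h) |, as used in Sect. 4."""
    denom = (F(x_k + h * d_k) - F_k) @ d_k / h
    return abs(F_k @ d_k / denom)
```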

Figure 1 presents the performance profiles with respect to the number of iterations for the five methods. As we can see, MRMIL3 gives better results than M3TFR2, DFPB1, MRMIL1 and MRMIL2, as it solves a higher percentage of the problems when \(\tau\geq0.2\). It can also be seen that DFPB1, MRMIL1 and MRMIL2 have similar performances, especially when \(\tau\geq0.3\). Furthermore, M3TFR2 gives better results than DFPB1, MRMIL1 and MRMIL2, although the difference becomes small as the performance ratio τ increases.

Figure 1 Performance profiles for the number of iterations in a \(\log_{2}\) scale

The performance profiles with respect to the number of function evaluations are reported in Figure 2. We note that MRMIL3 performs better than the other four methods when \(\tau\geq0.2\). In addition, we observe that MRMIL1 and MRMIL2 are more efficient than DFPB1 when \(\tau\geq0.3\), but less efficient than M3TFR2.

Figure 2 Performance profiles for the number of function evaluations in a \(\log_{2}\) scale

Figure 3 shows the CPU time performance profiles. When \(\tau\leq0.15\), M3TFR2 and DFPB1 use the least CPU time, but MRMIL3 gives the best results when \(\tau\geq0.15\).

Figure 3 Performance profiles for the CPU time in a \(\log_{2}\) scale

5 Conclusion

In this paper, we present a class of new derivative-free gradient-type methods for large-scale nonlinear systems of monotone equations. Under mild assumptions, we prove that the methods possess global convergence properties. Numerical experiments show that the proposed methods are promising, with the MRMIL3 method being the most efficient among them.