Abstract
We present a derivative-free method for solving systems of nonlinear equations that belongs to the class of spectral residual methods. We show that, by endowing a previous version of the algorithm with a suitable new linesearch strategy, standard global convergence results can be attained under mild general assumptions. As the reported numerical experiments illustrate, the new method is therefore potentially more robust than the previous version.
1 Introduction
In this work we propose a variant of the derivative-free spectral residual method Pand-SR presented in [16], for solving nonlinear systems of equations of the form:
$$\begin{aligned} F(x)=0, \quad x \in {\mathbb {R}}^n, \end{aligned}$$(1)
with the aim of obtaining stronger global convergence results when \(F: {\mathbb {R}}^n \rightarrow {\mathbb {R}}^n\) is a continuously differentiable mapping. Indeed, the sequence generated by Pand-SR was proved to be convergent under mild standard assumptions, but it was shown in [16] that the limit point is also a solution of (1) only in a more specific setting.
Inspired by [11], we adopt here a different linesearch strategy, which allows us to obtain a more general and nontrivial result for methods that make no use of the derivatives of f; such a result was not established in [16]. Namely, we prove that at every limit point \(x^*\) of the sequence \(\{x_k\}\) generated by the new algorithm, either \(F(x^*)=0\) or the gradient of the merit function
$$\begin{aligned} f(x)=\frac{1}{2}\Vert F(x)\Vert ^2 \end{aligned}$$(2)
is orthogonal to the residual F:
$$\begin{aligned} \big \langle \nabla f(x^*), F(x^*) \big \rangle = \big \langle J(x^*)^T F(x^*), F(x^*) \big \rangle = 0, \end{aligned}$$(3)
J being the Jacobian of F.Footnote 1 Clearly the orthogonality condition (3) does not generally imply \(F(x^*)=0\); however, this result can be recovered under additional conditions, e.g. when \(J(x^*)\) is positive (negative) definite. We further remark that the improvement with respect to Pand-SR is not only theoretical: as discussed in Sect. 4, the numerical experiments show that the new linesearch also has a positive impact on the practical behaviour of the method.
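The gap between the orthogonality condition (3) and \(F(x^*)=0\) can be made concrete on a toy example: for an affine map whose Jacobian is skew-symmetric, the quadratic form in (3) vanishes identically, so the condition holds everywhere even though F does not vanish. The following snippet is our illustration under that assumption, not part of the paper's algorithm.

```python
import numpy as np

# Toy affine map F(x) = A x + b with skew-symmetric Jacobian A (A.T == -A):
# grad f(x) = A.T @ F(x), so <grad f(x), F(x)> = -F(x).T @ A @ F(x) = 0
# at every x, although F(x) itself need not vanish.
A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])
b = np.array([1.0, 2.0])

def F(x):
    return A @ x + b

x = np.array([3.0, -1.0])
grad_f = A.T @ F(x)            # gradient of f(x) = 0.5 * ||F(x)||^2
print(grad_f @ F(x))           # 0.0: the orthogonality condition holds at x
print(np.linalg.norm(F(x)))    # 1.0: yet F(x) != 0
```

This is exactly the situation in which condition (3) gives no information about solutions of the system.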
Given the current iterate \(x_k\), spectral residual methods are linesearch-type methods which produce a new iterate \(x_{k+1}\) of the form:
$$\begin{aligned} x_{k+1}=x_k \pm \lambda _k \beta _k F(x_k), \end{aligned}$$
where:
-
both the residual vectors \(\pm F(x_k)\) are used as search directions;
-
the spectral coefficient \(\beta _k\ne 0\) is generally the reciprocal of an appropriate Rayleigh quotient, approximating some eigenvalue of (suitable secant approximations of) the Jacobian [11, 15];
-
the steplength parameter \(\lambda _k>0\) is determined by suitable—typically nonmonotone—linesearch strategies to reduce the norm of F (or a smooth merit function as (2)).
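A minimal sketch of one such iteration may help fix ideas. The acceptance test below is an approximate-norm-descent condition with a \(\lambda ^2\) sufficient-decrease term, in the spirit of the strategies recalled in Sect. 2; it is our simplification, not the paper's Algorithm 2.1, and the parameter names are ours.

```python
import numpy as np

def spectral_residual_step(F, x, beta, alpha=1e-4, sigma=0.5, eta=0.0,
                           max_backtracks=40):
    """One linesearch step trying both directions +/- beta * F(x).

    Sketch only: the acceptance test is an assumed approximate-norm-descent
    form, not a verbatim transcript of the paper's conditions."""
    Fx = F(x)
    norm_Fx = np.linalg.norm(Fx)
    lam = 1.0
    for _ in range(max_backtracks):
        for s in (1.0, -1.0):                 # try both residual directions
            x_trial = x + s * lam * beta * Fx
            if np.linalg.norm(F(x_trial)) <= (1 + eta) * norm_Fx - alpha * lam**2:
                return x_trial, lam
        lam *= sigma                          # backtrack
    return x, 0.0                             # linesearch failed

# toy usage on the scalar system F(x) = x^3 - 1
F = lambda x: x**3 - 1.0
x_new, lam = spectral_residual_step(F, np.array([2.0]), beta=0.1)
```

Note that no Jacobian and no linear system appear anywhere in the step, which is the matrix-free character discussed next.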
Spectral residual methods have received considerable attention because of the low cost of their iterations and their low memory requirements, being matrix-free; see e.g. [7, 9,10,11, 16]. They are particularly attractive when the Jacobian matrix of F is not available analytically or its computation is burdensome. Indeed, distinguishing features of these methods are that the computation of the search directions does not involve the solution of linear systems, and that effective derivative-free linesearch conditions can be defined [6, 7, 11, 12, 15].
The paper is organized as follows. Our algorithm is presented in Sect. 2, where we describe the new linesearch strategy and recall the main features of the spectral residual method Pand-SR . Convergence analysis is developed in Sect. 3 and numerical experiments are discussed in Sect. 4. Some conclusions and perspectives are drawn in Sect. 5.
1.1 Notations
The symbol \(\Vert \cdot \Vert \) denotes the Euclidean norm, J denotes the Jacobian matrix of F. Given a sequence of vectors \(\{x_k\}\), we occasionally denote \(F(x_k)\) by \(F_k\).
2 The Srand2 algorithm
We present a spectral residual method that is a modification of the Projected Approximate Norm Descent algorithm with Spectral Residual step (Pand-SR ) proposed in [16]. Pand-SR was developed for solving convexly constrained nonlinear systems; here it is applied in an unconstrained setting. A brief discussion on the constrained case is postponed to Sect. 5.
The new algorithm is denoted as Srand2 (Spectral Residual Approximate Norm Descent) and differs from Pand-SR in the definition of the linesearch conditions and in the choice of the spectral stepsize \(\beta _k\).
Both Pand-SR and Srand2 employ a nonmonotone linesearch strategy based on the so-called approximate norm descent property [12]. This means that the generated sequence of iterates \(\{x_k\}\) satisfies
$$\begin{aligned} \Vert F(x_{k+1})\Vert \le (1+\eta _k) \Vert F(x_k)\Vert \end{aligned}$$(4)
for all k, where \(\{\eta _k\}\) is a positive sequence of scalars such that
$$\begin{aligned} \sum _{k=0}^{\infty } \eta _k = \eta < \infty . \end{aligned}$$(5)
The idea behind such a condition is to allow a highly nonmonotone behaviour of \(\Vert F_k\Vert \) for the (initial) large values of \(\eta _k\), while promoting a decrease of \(\Vert F \Vert \) for the small (final) values of \(\eta _k\). A nonmonotone behaviour of the norm of F is crucial to avoid practical stagnation of methods based on spectral stepsizes (see e.g. [5, 11, 17]); at the same time, condition (4) ensures that the sequence \(\{ \Vert F_k \Vert \}\) is bounded (see Theorem 1 in Sect. 3).
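Condition (4), applied recursively, bounds \(\Vert F_k\Vert \) by \(\prod _{j<k}(1+\eta _j)\,\Vert F_0\Vert \), and summability of \(\{\eta _k\}\) keeps the product finite since \(\prod _j (1+\eta _j)\le e^{\sum _j \eta _j}\). A quick numerical sanity check of this product bound, with the illustrative choice \(\eta _k=2^{-k}\) (ours, not the paper's):

```python
import math

# Any sequence satisfying (4) obeys ||F_k|| <= prod_{j<k}(1+eta_j) * ||F_0||,
# and the product stays finite whenever sum(eta_j) does, because
# prod(1+eta_j) <= exp(sum(eta_j)).  Check with eta_k = 2**-k:
etas = [0.5**k for k in range(60)]
growth = math.prod(1.0 + e for e in etas)   # worst-case growth factor
bound = math.exp(sum(etas))                 # its exponential upper bound
print(growth, bound)                        # growth stays below the bound
```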
In detail, given the current iterate \(x_k\) and the initial stepsize \(\beta _k\), in [16] a new iterate of the form
$$\begin{aligned} x_{k+1}=x_k \pm \lambda _k \beta _k F_k \end{aligned}$$(6)
is computed.
is computed. The scalar \(\lambda _k \in (0,1]\) is fixed by using a backtracking strategy; starting from \(\lambda _k=1\), it is progressively reduced by a factor \(\sigma \in (0,1)\) (e.g. halved) until one of the following conditions is satisfied:
or
where \(\alpha \in (0,1)\).
In Srand2 conditions (7) and (8) are respectively replaced by
and
All these conditions are derivative-free. If F is continuously differentiable, as long as \(F_k^T J(x_k)F_k\ne 0\), either \(+\beta _k F_k\) or \(-\beta _k F_k\) is a descent direction for the merit function f in (2) and for \(\Vert F\Vert \) at \(x_k\); hence the first condition (9) (similarly (7)) promotes a sufficient decrease in \(\Vert F\Vert \) and is crucial for establishing results on the convergence of \(\{\Vert F_k\Vert \}\) to zero. On the other hand, the second condition (10) (similarly (8)) allows for an increase of \(\Vert F\Vert \) depending on the magnitude of \(\eta _k\). Trivially, (9) implies (10) and both imply the approximate norm descent condition (4); the same holds for conditions (7) and (8).
We observe that the difference between conditions (9) and (10) and conditions (7) and (8) derives only from the \(\lambda _k^2\) term in the right-hand side of (9) and (10). This squared term is common to other linesearch strategies, such as those in [11, 12]. This small change in the linesearch conditions has a great impact on the global convergence result of the overall algorithm, as shown in the forthcoming section.
As concerns the choice of the spectral coefficient \(\beta _k\) in (6), both Pand-SR and Srand2 use formulas closely related to the Barzilai–Borwein steplength employed in spectral gradient methods for optimization problems, see e.g. [2, 5]. However, differently from the optimization case, in spectral residual methods \(\beta _k\) may be positive or negative since both directions \(\pm F_k\) are attempted. Also, its absolute value is constrained to belong to a given interval \([\beta _{\min }, \beta _{\max }]\) to get a bounded sequence of stepsizes. As an example, \(\beta _k\) can be chosen by computing
$$\begin{aligned} \beta _{k,1}= \frac{p_{k-1}^T p_{k-1}}{p_{k-1}^T y_{k-1}} \end{aligned}$$(11)
or
$$\begin{aligned} \beta _{k,2}= \frac{p_{k-1}^T y_{k-1}}{y_{k-1}^T y_{k-1}}, \end{aligned}$$(12)
with \(p_{k-1} =x_{k}-x_{k-1}\) and \( y_{k-1}= F_k- F_{k-1},\) and then ensuring that \(\beta _{k,1}\) or \(\beta _{k,2}\) is such that \(|\beta _{k}| \in [\beta _{\min }, \beta _{\max }]\) by some thresholding rule. Alternative choices of \(\beta _k\) that suitably combine \(\beta _{k,1}\) and \(\beta _{k,2}\) can be found in [15], where a systematic analysis of the stepsize selection for spectral residual methods is carried out, also in combination with an approximate norm descent linesearch. In Algorithm 2.1 we formally describe Srand2 for a general \(\beta _k\) such that \(|\beta _{k}| \in [\beta _{\min }, \beta _{\max }]\).
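In code, the two spectral coefficients and the thresholding step might be sketched as follows; the sign-preserving behaviour of `threshold` is our assumption, since the text only constrains the modulus \(|\beta _k|\).

```python
import math
import numpy as np

def spectral_betas(x_prev, x_curr, F_prev, F_curr):
    # p_{k-1} and y_{k-1} as in the text
    p = x_curr - x_prev
    y = F_curr - F_prev
    beta1 = (p @ p) / (p @ y)   # reciprocal Rayleigh-quotient (BB1-type)
    beta2 = (p @ y) / (y @ y)   # BB2-type coefficient
    return beta1, beta2

def threshold(beta, beta_min=1e-10, beta_max=1e10):
    """Force |beta| into [beta_min, beta_max]; keeping the sign is our choice."""
    return math.copysign(min(max(abs(beta), beta_min), beta_max), beta)

# toy usage: one secant pair
b1, b2 = spectral_betas(np.zeros(2), np.ones(2),
                        np.array([1.0, 0.0]), np.array([3.0, 1.0]))
```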
We observe that the Repeat loop at Step 2 terminates in a finite number of steps: indeed, from the continuity of F and the positivity of \(\eta _k\), there exists \(\bar{\lambda }>0\) such that
for all \(\lambda \in (0, \bar{\lambda }]\) and \(i=1, \dots ,n\); therefore, inequality (10) holds for small enough values of \(\lambda _k\).
3 Global convergence analysis
We now provide the convergence analysis of the Srand2 algorithm. Theorems 1 and 2 analyze the behaviour of the sequences \(\{\lambda _k\}\) and \(\{\Vert F_k\Vert \}\); they state general results which derive from the linesearch strategy and hold for Pand-SR as well. Their proofs follow the lines of [16, Theorem 4.2] and therefore are not reported in this work. Theorem 2 in particular identifies situations where \(\{\Vert F_k\Vert \}\) may or may not converge to zero. Theorem 3 constitutes the main contribution of this work. It is related both to the linesearch strategy and to the choice of the spectral residual steps, and it does not rely on the specific choice of \(\beta _k\).
Theorem 1
Let \(F: {\mathbb {R}}^n \rightarrow {\mathbb {R}}^n\) be a continuous map, and let \(\{x_k\}\) and \(\{\lambda _k\}\) be the sequences of iterates and of linesearch stepsizes generated by the Srand2 algorithm. Then the sequence \(\{\Vert F_k\Vert \}\) is convergent and bounded by
$$\begin{aligned} \Vert F_k\Vert \le e^{\eta }\Vert F_0\Vert , \quad k\ge 0, \end{aligned}$$
where \(\eta >0\) is given in (5). Moreover
$$\begin{aligned} \lim _{k\rightarrow \infty } \lambda _k^2\Vert F_k\Vert =0. \end{aligned}$$
Theorem 2
Let \(F: {\mathbb {R}}^n \rightarrow {\mathbb {R}}^n\) be a continuous map, and let \(\{x_k\}\) and \(\{\lambda _k\}\) be the sequences of iterates and of linesearch stepsizes generated by the Srand2 algorithm. Then
-
(i)
\(\mathop {\mathrm{liminf }}_{k \rightarrow \infty } \lambda _k^2 > 0 \hbox { implies that } \lim _{k\rightarrow \infty } \Vert F_k\Vert =0. \)
-
(ii)
If \((9)\) is satisfied for infinitely many k, then \(\lim _{k \rightarrow \infty } \Vert F_k\Vert =0\).
-
(iii)
If \(\Vert F_k\Vert \le \Vert F_{k+1}\Vert \) for infinitely many iterations, then \(\mathop {\mathrm{liminf }}_{k \rightarrow \infty } \lambda _k^2=0\).
-
(iv)
If \(\Vert F_k\Vert \le \Vert F_{k+1}\Vert \) for all k sufficiently large, then \(\{ \Vert F_k\Vert \}\) does not converge to 0.
We now provide the main convergence result: at every limit point \(x^*\) of the sequence \(\{x_k\}\) generated by the Srand2 algorithm, the gradient of the merit function f in (2) is orthogonal to the residual \(F(x^*)\).
Theorem 3
Let F be continuously differentiable. Let \(\{x_k\}\) be the sequence generated by the Srand2 algorithm and let \(x^*\) be a limit point of \(\{x_k\}\). Then either
$$\begin{aligned} F(x^*)=0 \end{aligned}$$(13)
or
$$\begin{aligned} \big \langle \nabla f(x^*), F(x^*) \big \rangle = 0. \end{aligned}$$(14)
Proof
Let \(\textit{K}\) be an infinite subset of indices such that \(\lim _{k \in \textit{K}} x_k=x^*.\) By Theorem 1 we know that \(\lim _{k \in \textit{K}} \lambda _k^2\Vert F_k\Vert =0\). Hence there are two possibilities:
$$\begin{aligned} \mathop {\mathrm{liminf }}_{k \in \textit{K}} \Vert F_k\Vert =0 \quad \hbox {or} \quad \mathop {\mathrm{liminf }}_{k \in \textit{K}} \lambda _k^2 =0. \end{aligned}$$
The first one implies \(\lim _{k \in \textit{K}} \Vert F_k\Vert =0\), since \(\{\Vert F_k\Vert \}\) is convergent by Theorem 1. Then, using the continuity of F, it follows easily that \(F(x^*)=0\).
In the second case we have \(\mathop {\mathrm{liminf }}_{k \in \textit{K}} \lambda _k^2 =\mathop {\mathrm{liminf }}_{k \in \textit{K}} \lambda _k = 0\). Let \(\underline{\lambda }_k=\lambda _k/\sigma \) denote the last attempted value for the linesearch parameter before \(\lambda _k \) is accepted during the backtracking phase. Hence for sufficiently large values of \(k\in \textit{K}\) we have
Since \(\eta _k > 0\), and by virtue of (12), there is a positive constant \(c_1\) such that
and multiplying both sides of (15) by \(\Vert F(x_k\pm \underline{\lambda }_k \beta _k F_k)\Vert +\Vert F(x_k)\Vert ,\) we obtain
Now we observe that \(x_k \pm \lambda _k \beta _k F_k\) is bounded \(\forall k \in \textit{K}\); indeed, by hypothesis \(\lambda _k\in (0,1]\), \(|\beta _k|\le \beta _{\max }\), the subsequence \(\{x_k\}_{k\in K}\) is convergent to \(x^*\) and hence bounded, and \(\Vert F_k\Vert \) is bounded by Theorem 1. Then recalling the definition of \(\underline{\lambda }_k=\lambda _k/ \sigma \) and the continuity of F, we have
for some positive constant \(c_2\). Consequently, from (16) and (17), there exists a constant \(c>0\) such that
for sufficiently large values of \( k \in \textit{K}\).
Now, we suppose that \(\beta _k >0\) for infinitely many indices \(k \in \textit{K}_1 \subseteq \textit{K}\), and we consider the two steps \( -\lambda _k \beta _k F_k\) and \(+\lambda _k \beta _k F_k\) separately.
-
Firstly, we consider the step \(-\underline{\lambda }_k \beta _k F_k.\) By virtue of the Mean Value Theorem and (18), there exists \(\xi _k \in [0,1]\) such that
$$\begin{aligned} \big \langle \nabla f (x_k-\xi _k \underline{\lambda }_k\beta _kF_k), -\underline{\lambda }_k \beta _k F_k\big \rangle > - c \alpha \underline{\lambda }_k^2, \end{aligned}$$for sufficiently large \( k \in \textit{K}\). Hence, for all large \( k \in \textit{K}_1\) we have that:
$$\begin{aligned} \big \langle \nabla f(x_k-\xi _k \underline{\lambda }_k\beta _kF_k), F_k \big \rangle < c \alpha \frac{\underline{\lambda }_k}{\beta _k} \le c \alpha \frac{\underline{\lambda }_k}{\beta _{\text {min}}}. \end{aligned}$$(19) -
Now we consider the step \( +\underline{\lambda }_k \beta _k F_k.\) Similarly, there exists \(\xi '_k \in [0,1]\) such that for all large \( k \in \textit{K}_1\)
$$\begin{aligned} \big \langle \nabla f(x_k+\xi '_k \underline{\lambda }_k\beta _kF_k), F_k \big \rangle > -c \alpha \frac{\underline{\lambda }_k}{\beta _k} \ge -c \alpha \frac{\underline{\lambda }_k}{\beta _{\text {min}}}. \end{aligned}$$(20)
Since \(\mathop {\mathrm{liminf }}_{k \in \textit{K}} \lambda _k= 0,\) taking limits in (19) and (20) we get
$$\begin{aligned} \big \langle \nabla f(x^*), F(x^*) \big \rangle = 0. \end{aligned}$$
We proceed in a quite similar way if \(\beta _k < 0\) for infinitely many indices. \(\square \)
Corollary 1
The orthogonality condition (14) implies \(F(x^*)=0\) in the following cases:
-
(a)
\(J(x^*)\) is positive (negative) definite;
-
(b)
\(v^T J(x^*)v \ne 0,\) for all \(v \in {\mathbb {R}}^n,\) \(v \ne 0.\)
Case (a) in Corollary 1 includes the class of nonlinear monotone systems of equations of the form (1) with F continuously differentiable and strictly monotone, that is \((F(x)-F(y))^T(x-y)> 0\) for any \(x, y\in {\mathbb {R}}^n\) with \(x\ne y\) [4]. Nonlinear monotone systems of equations arise in several applications and tailored spectral type methods have been recently proposed, see e.g. [18].
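Case (a) can be checked numerically: for a symmetric positive definite Jacobian, the Rayleigh bound \(F^T J F \ge \lambda _{\min }(J)\Vert F\Vert ^2\) shows that the quadratic form in (14) can vanish only where F does. A small sanity check on an affine strictly monotone map (our example, not from the paper):

```python
import numpy as np

# F(x) = A x + b with A symmetric positive definite: F is strictly monotone
# and F(x).T @ A @ F(x) >= lam_min * ||F(x)||^2 > 0 unless F(x) = 0.
A = np.array([[2.0, 0.5],
              [0.5, 1.0]])
b = np.array([-1.0, 3.0])
F = lambda x: A @ x + b

lam_min = np.linalg.eigvalsh(A).min()   # smallest eigenvalue, here > 0
rng = np.random.default_rng(0)
for x in rng.standard_normal((100, 2)):
    Fx = F(x)
    # Rayleigh bound: the quadratic form dominates lam_min * ||F||^2
    assert Fx @ (A @ Fx) >= lam_min * (Fx @ Fx) - 1e-12
```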
Remark 1
A result as general as Theorem 3 was not proved for Pand-SR , although that method is known to be convergent. Moreover, if \(x^*\) is the limit point and \(x_0\) the starting guess, the following bound
$$\begin{aligned} \Vert F(x^*)\Vert \le e^{\eta } \Vert F(x_0)\Vert \end{aligned}$$
was provided in [16]. However, it cannot be proved in general that \(F(x^*)=0\). Such a result was obtained in [16] by basing the choice of \(\beta _k\) on (11), assuming the Jacobian J to be Lipschitz continuous, and focusing on specific classes of problems. For example, [16, Theorem 5.2] considers the case of \(J(x^*)\) with positive (negative) definite symmetric part and suitably bounded condition number. In [16, Theorem 5.2] instead, \(J(x^*)\) is assumed to be strongly diagonally dominant, with diagonal entries of constant sign.
We show in the forthcoming section that the stronger convergence properties of Srand2 correspond in practice to an algorithm potentially more robust than Pand-SR . Of course, we cannot expect a marked difference in the performance of the two methods, given the small change between them. Nevertheless, the new linesearch is able to recover a few cases, encountered with the previous one, in which \(\Vert F_k\Vert \) does not converge to zero.
4 Numerical illustration
We compare the performance of the Srand2 and Pand-SR algorithms on two problem sets. The first set (named set-Luksan) contains 17 nonlinear systems from the Lukšan test collection described in [13] that are commonly used as benchmarks for optimization algorithms. The second set (named set-contact) consists of nonlinear systems arising in the solution of rail-wheel contact models via the classical CONTACT algorithm [8]. These tests were described in detail and used in [15, Section 5.2]. We selected here the 153 problems generated with a train speed of magnitude \(v = 16\ \mathrm{m/s}\), yielding systems whose dimensions vary from \(n=156\) to \(n=1394\).
Pand-SR and Srand2 algorithms were implemented as described in Sect. 2 with parameters
see [16]. A maximum number of \(10^5\) iterations and F-evaluations was imposed, and a maximum number of backtracks equal to 40 was allowed at each iteration. The procedure was declared successful when
Failure was declared when either the assigned maximum number of iterations or F-evaluations or backtracks was reached, or \(\Vert F\Vert \) was not reduced for 500 consecutive iterations. Such occurrences are denoted below as \(\mathtt{F_{it}}\), \(\mathtt{F_{fe}}\), \(\mathtt{F_{bt}}\), \(\mathtt{F_{in}}\), respectively.
Regarding the choice of \(\beta _k\), we used three classical rules based on \(\beta _{k,1}\), \(\beta _{k,2}\) and their alternation, respectively named BB1, BB2 and ALT in what follows. Given a scalar \(\beta \), let \(T(\beta )\) be the projection of \(|\beta |\) onto \({{I_{\beta }}}{\mathop {=}\limits ^\mathrm{def}}[ \beta _{\min }, \beta _{\max }]\), that is
$$\begin{aligned} T(\beta )=\min \left\{ \beta _{\max }, \max \left\{ \beta _{\min }, |\beta | \right\} \right\} . \end{aligned}$$
We recall below the definition of BB1, BB2 and ALT as given in [15].
- BB1 rule:
-
Following [7, 9, 10, 16], at each iteration set
$$\begin{aligned} \beta _k= {\left\{ \begin{array}{ll} \beta _{k,1}&{} \text {if } \ |\beta _{k,1}|\in {{I_{\beta }}}\\ T(\beta _{k,1}) &{} \text {otherwise} \end{array}\right. } \end{aligned}$$(24) - BB2 rule:
-
At each iteration set
$$\begin{aligned} \beta _k= {\left\{ \begin{array}{ll} \beta _{k,2}&{} \text {if } \ |\beta _{k,2}|\in {{I_{\beta }}}\\ T(\beta _{k,2}) &{} \text {otherwise} \end{array}\right. } \end{aligned}$$(25) - ALT rule:
-
Following [1, 7], at each iteration alternate between \(\beta _{k,1}\) and \(\beta _{k,2}\), setting:
$$\begin{aligned}&\beta ^{{\small \mathrm{ALT}}}_k= {\left\{ \begin{array}{ll} \beta _{k,1}&{} \text {for } k \hbox { odd} \\ \beta _{k,2}&{} \text {otherwise} \end{array}\right. } \end{aligned}$$(26)$$\begin{aligned}&\beta _k= {\left\{ \begin{array}{ll} \beta ^{{\small \mathrm{ALT}}}_k &{} \quad \text {if} \ |\beta ^{{\small \mathrm{ALT}}}_k| \in {{I_{\beta }}}\\ \beta _{k,1}&{} \quad \text {if}\ k\ \text {even},\ |\beta _{k,1}| \in {{I_{\beta }}},\ |\beta _{k,2}| \notin {{I_{\beta }}}\\ \beta _{k,2}&{} \quad \text {if}\ k\ \text {odd,} \ |\beta _{k,2}| \in {{I_{\beta }}},\ |\beta _{k,1}| \notin {{I_{\beta }}} \\ T(\beta ^{{\small \mathrm{ALT}}}_k) &{} \quad \text {otherwise}. \end{array}\right. } \end{aligned}$$(27)
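Operationally, rules (26)-(27) can be transcribed as below; this is our reading, and in particular the sign applied after thresholding is an assumption, since \(T\) acts on \(|\beta |\).

```python
import math

BETA_MIN, BETA_MAX = 1e-10, 1e10

def T(beta):
    # projection of |beta| onto [BETA_MIN, BETA_MAX]
    return min(max(abs(beta), BETA_MIN), BETA_MAX)

def in_interval(beta):
    return BETA_MIN <= abs(beta) <= BETA_MAX

def choose_beta_alt(k, beta1, beta2):
    """ALT rule: alternate beta1/beta2, fall back to the admissible one,
    otherwise threshold (sign handling is our choice)."""
    alt = beta1 if k % 2 == 1 else beta2
    if in_interval(alt):
        return alt
    if k % 2 == 0 and in_interval(beta1) and not in_interval(beta2):
        return beta1
    if k % 2 == 1 and in_interval(beta2) and not in_interval(beta1):
        return beta2
    return math.copysign(T(alt), alt)
```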
We also experimented with Pand-SR and Srand2 using more elaborate, adaptive rules for \(\beta _k\), see e.g. [2, 15], but the qualitative behaviour of the two methods did not change; therefore we do not report the corresponding results.
Problems in set-Luksan were solved setting \(n=500\) and starting from the initial guess \(x_0\) suggested in [13]. Problem lu5 requires an odd value of n, and therefore we set \(n=501\). For 16 out of 17 problems, Pand-SR and Srand2 give the same results: Table 1 reports the number of F-evaluations for each updating rule for \(\beta _k\). More interesting is the case of Problem lu16, reported in Table 2. Although it performs a large number of F-evaluations, Srand2 is able to successfully solve Problem lu16 using BB2 and ALT, whereas Pand-SR returns a failure with all the attempted \(\beta _k\) rules.
In Fig. 1 we give an insight into the convergence behaviour of both methods with BB2 on Problem lu16. We display \(\Vert F_k\Vert \) versus the iterations and the number of F-evaluations (top part), the number of backtracks performed by both algorithms (central part), and the values of \(\Vert F_k\Vert \) and \(\lambda _k\) versus the iterations for both algorithms (bottom part). All plots are obtained by disabling the stopping criterion on the number of consecutive increases of \(\Vert F\Vert \). In this setting Pand-SR fails, since the maximum number of backtracks is reached after 3278 iterations and 56883 F-evaluations, while Srand2 converges after 8456 iterations and 45624 F-evaluations. We observe that the sequence \(\{\Vert F_k\Vert \}\) generated by Pand-SR does not satisfy the stopping criterion (22), and the increasing number of backtracks along the iterations corresponds to the fact that \(\{\lambda _k\}\) goes to zero. By contrast, the sequence \(\{\Vert F_k\Vert \}\) generated by Srand2 converges to zero and \(\lambda _k\) does not decrease with the iterations. Both situations are in accordance with the theory: at least one of the sequences \(\{\Vert F_k\Vert \}\) and \(\{\lambda _k\}\) converges to zero, but the linesearch adopted in Srand2 is more likely to generate a sequence \(\{\Vert F_k\Vert \}\) that goes to zero.
This behaviour is also confirmed by the experiments performed on the set-contact problems. Results obtained for these problems are summarized in the F-evaluation performance profiles [3] of Fig. 2, where Pand-SR and Srand2 , combined with the BB2 (top plot) and ALT (bottom plot) rules, are compared. Results with BB1 are not reported, since the two algorithms give exactly the same numbers of F-evaluations. The plots clearly show that the two algorithms perform similarly and that Srand2 is slightly more robust. In detail, Pand-SR and Srand2 with BB2 solve 132 and 135 problems, respectively. Also in combination with the ALT rule, Srand2 solves three more problems than Pand-SR .
In the 6 cases recovered by Srand2 , the behaviour of the two methods was similar to that observed with Problem lu16. As an illustration, the graphs reported in Fig. 3 refer to one of the cases where the BB2 rule was in use. Observations analogous to those for Fig. 1 can be drawn regarding the convergence to zero of the sequences \(\{ \lambda _k \}\) and \(\{ \Vert F_k\Vert \}\).
5 Conclusions and outlook
In this work we have shown how to modify the algorithm proposed in [16] in order to establish mild general conditions guaranteeing the convergence of the sequence \(\{\Vert F_k\Vert \}\) to zero, and we have illustrated the corresponding practical benefits in terms of robustness.
The Pand-SR algorithm in [16] was developed for solving constrained nonlinear systems of the form
$$\begin{aligned} F(x)=0, \quad x \in \varOmega , \end{aligned}$$(28)
where \(\varOmega \subset {\mathbb {R}}^n\) is a convex set whose relative interior is non-empty. Srand2 can also be adapted to the solution of constrained problems of the form (28) by relying on a suitable projection operator onto the feasible set \(\varOmega \), as follows. Proceeding as in [16], feasible iterates \(\{x_k\}\) can be defined by starting from a feasible \(x_0\) and by setting, for \(k>0\),
where P denotes a projection operator onto the considered domain. As an example, if \(\varOmega \) is an n-dimensional box \(\{x\in {\mathbb {R}}^n \,\,\, \hbox { s.t. } \,\,\, l\le x\le u\}\), where \(l\in ({\mathbb {R}}\cup \{-\infty \})^n\), \(u\in ({\mathbb {R}}\cup \{+\infty \})^n\), and the inequalities are meant component-wise, a projection map may be given by \(P(x)=\max \left\{ l, \min \left\{ x,u\right\} \right\} \).
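For the box case, P is just componentwise clipping, with \(\pm \infty \) encoding absent bounds; a minimal sketch:

```python
import numpy as np

def project_box(x, l, u):
    # componentwise P(x) = max(l, min(x, u)); +/-inf encode absent bounds
    return np.maximum(l, np.minimum(x, u))

x = np.array([-2.0, 0.5, 7.0])
l = np.array([0.0, -np.inf, -np.inf])
u = np.array([np.inf, np.inf, 5.0])
proj = project_box(x, l, u)
print(proj)   # [0.  0.5 5. ]
```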
Such a modification of the Srand2 algorithm to handle constrained problems trivially enjoys the theoretical properties presented in Theorems 1 and 2. Remarkably, the new global convergence result of Theorem 3 can also be easily extended to problem (28) for limit points lying in the interior of \(\varOmega \). Convergence to solutions on the boundary of \(\varOmega \) is currently under investigation.
Notes
The symbol \(\langle x,y \rangle \) denotes the scalar product between vectors x and y.
References
Dai, Y.H., Fletcher, R.: Projected Barzilai–Borwein methods for large-scale box-constrained quadratic programming. Numer. Math. 100, 21–47 (2005)
di Serafino, D., Ruggiero, V., Toraldo, G., Zanni, L.: On the steplength selection in gradient methods for unconstrained optimization. Appl. Math. Comput. 318, 176–195 (2018)
Dolan, E.D., Moré, J.J.: Benchmarking optimization software with performance profiles. Math. Program. 91, 201–213 (2002)
Facchinei, F., Pang, J.S.: Finite-Dimensional Variational Inequalities and Complementarity Problems. Springer Series in Operations Research, vol. I. Springer, New York (2003)
Fletcher, R.: On the Barzilai–Borwein method. In: Optimization and Control with Applications. Applied Optimization, vol. 96, pp. 235–256. Springer, New York (2005)
Gonçalves, M.L.N., Oliveira, F.R.: On the global convergence of an inexact quasi-Newton conditional gradient method for constrained nonlinear systems. Numer. Algor. 84, 609–631 (2020)
Grippo, L., Sciandrone, M.: Nonmonotone derivative-free methods for nonlinear equations. Comput. Optim. Appl. 37, 297–328 (2007)
Kalker, J., Jacobson, B.: Rolling Contact Phenomena. Springer, Wien (2000)
La Cruz, W.: A projected derivative-free algorithm for nonlinear equations with convex constraints. Optim. Method Softw. 29, 24–41 (2014)
La Cruz, W., Raydan, M.: Nonmonotone spectral methods for large-scale nonlinear systems. Optim. Method Softw. 18, 583–599 (2003)
La Cruz, W., Martinez, J.M., Raydan, M.: Spectral residual method without gradient information for solving large-scale nonlinear systems of equations. Math. Comput. 75, 1429–1448 (2006)
Li, D.H., Fukushima, M.: A derivative-free line search and global convergence of Broyden-like method for nonlinear equations. Optim. Method Softw. 13(3), 181–201 (2000)
Lukšan, L.: Inexact trust region method for large sparse systems of nonlinear equations. J. Optim. Theory Appl. 81(3), 569–590 (1994)
Marini, L., Morini, B., Porcelli, M.: Quasi-Newton methods for constrained nonlinear systems: complexity analysis and applications. Comput. Optim. Appl. 71, 147–170 (2018)
Meli, E., Morini, B., Porcelli, M., Sgattoni, C.: Solving nonlinear systems of equations via spectral residual methods: stepsize selection and applications, pp. 1–28 (2020). arXiv:2005.05851
Morini, B., Porcelli, M., Toint, P.: Approximate norm descent methods for constrained nonlinear systems. Math. Comput. 87, 1327–1351 (2018)
Raydan, M.: The Barzilai and Borwein gradient method for the large scale unconstrained minimization problem. SIAM J. Optim. 7, 26–33 (1997)
Zhang, L., Zhou, W.: Spectral gradient projection method for solving nonlinear monotone equations. J. Comput. Appl. Math. 196, 478–484 (2006)
Acknowledgements
The authors are indebted to Benedetta Morini for valuable discussions on spectral residual methods, and to the referee, whose careful reading and suggestions led to a significant improvement of the manuscript. The authors are members of the Gruppo Nazionale per il Calcolo Scientifico (GNCS) of the Istituto Nazionale di Alta Matematica (INdAM), and this work was partially supported by INdAM-GNCS under Progetti di Ricerca 2019 and 2020.
Funding
Open access funding provided by Alma Mater Studiorum - Università di Bologna within the CRUI-CARE Agreement.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Papini, A., Porcelli, M. & Sgattoni, C. On the global convergence of a new spectral residual algorithm for nonlinear systems of equations. Boll Unione Mat Ital 14, 367–378 (2021). https://doi.org/10.1007/s40574-020-00270-5