Abstract
In this paper, we propose a Jacobian smoothing inexact Newton-type algorithm for solving the nonlinear complementarity problem by reformulating it as a system of nonlinear equations. We show that the algorithm converges up to q-quadratically and present numerical experiments that demonstrate its good local performance. In order to compare our algorithm with other known algorithms in terms of time, we introduce a new normalized measure that we call the time index.
1 Introduction
We consider the nonlinear complementarity problem (NCP), which consists of finding a vector \(\varvec{x}\in \mathbb {R}^{n}\) that solves the following system of inequalities and equalities: \(\varvec{x}\ge \varvec{0},\; F(\varvec{x})\ge \varvec{0},\; \varvec{x}^{T}F(\varvec{x})=0,\)
with \(F:\mathbb {R}^{n}\rightarrow \mathbb {R}^{n}, \,F(\varvec{x})=\left( F_{1}(\varvec{x}),\ldots ,F_{n}(\varvec{x})\right) \) continuously differentiable.
There are numerous and diverse applications of the NCP in areas such as Physics, Engineering and Economics (Anitescu et al. 1997; Kostreva 1984; Chen et al. 2010; Ferris and Pang 1997), where the concept of complementarity is synonymous with system in equilibrium.
To solve the NCP, we will use its reformulation as the following system of nonlinear equations,
where \(\Phi :\mathbb {R}^{n}\rightarrow \mathbb {R}^{n}\) and \( \varphi :\mathbb {R}^{2}\rightarrow \mathbb {R}. \) The latter is called a complementarity function and satisfies the equivalence \(\, \varphi (a,b)=0\Longleftrightarrow a\ge 0,\, b\ge 0,\, ab=0, \,\) which allows one to show that a vector \(\varvec{x}_*\) is a solution of the NCP if and only if \(\varvec{x}_*\) is a solution of (1).
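For instance, the Fischer function (Fischer 1992), recovered later as a special case of the family (3), satisfies this equivalence; a few lines of Python make the defining property concrete:

```python
import numpy as np

def fischer(a, b):
    # Fischer complementarity function: phi(a, b) = sqrt(a^2 + b^2) - a - b.
    # It vanishes exactly when a >= 0, b >= 0 and a*b = 0.
    return np.sqrt(a**2 + b**2) - a - b

print(fischer(0.0, 3.0))   # complementary pair -> 0.0
print(fischer(2.0, 0.0))   # complementary pair -> 0.0
print(fischer(1.0, 1.0))   # a*b != 0 -> nonzero
print(fischer(-1.0, 2.0))  # a < 0    -> nonzero
```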
The lack of smoothness of \( \varphi \) (Arenas et al. 2014) leads to the nonsmooth system of equations (1). Among the numerical methods frequently used to solve this system are nonsmooth methods (Qi 1993; Sherman 1978; Broyden et al. 1973; Li and Fukushima 2001; Lopes et al. 1999), Jacobian smoothing methods (Kanzow and Pieper 1999; Arenas et al. 2020) and smoothing methods (Krejić and Rapajić 2008; Zhu and Hao 2011).
The last two classes of methods have attracted the interest of many researchers in complementarity because they avoid working with matrices in the generalized Jacobian (Clarke 1975), which are difficult to compute.
In particular, the strategy used in a Jacobian smoothing method (Kanzow and Pieper 1999) is to approximate the function \( \Phi \) by a sequence of smooth functions \(\Phi _{\mu },\) defined by \(\Phi _\mu ( \varvec{x}) = (\varphi _\mu (x_{1},F_{1}(\varvec{x})), \ldots , \varphi _\mu (x_{n},F_{n}(\varvec{x})))^T, \) where \(\varphi _{\mu }\) is a smoothing of the complementarity function \(\varphi \) and \(\mu > 0\,\) is the smoothing parameter. The basic idea of a Jacobian smoothing method is then to solve at each iteration the mixed Newton equation \(\Phi _{\mu }'( \varvec{x}_k) \varvec{d} =-\Phi ( \varvec{x}_k), \) where \(\Phi _{\mu }'( \varvec{x}_k) \) is the Jacobian matrix of \(\Phi _{\mu } \) at \(\varvec{x}_k.\)
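As an illustration, the sketch below performs such mixed Newton steps on a small linear complementarity problem \(F(\varvec{x})=M\varvec{x}+\varvec{q}\). It uses the smoothed Fischer function \(\sqrt{a^2+b^2+2\mu }-a-b\) as a stand-in for \(\varphi _\mu \) (an assumption for illustration; the smoothing actually used in this paper is introduced in (4) below), with \(\mu \) driven to zero across iterations:

```python
import numpy as np

def phi(a, b, mu=0.0):
    # Smoothed Fischer function (a common smoothing, used here only as a
    # stand-in for the paper's phi_{lambda mu}): sqrt(a^2 + b^2 + 2*mu) - a - b.
    return np.sqrt(a * a + b * b + 2.0 * mu) - a - b

def mixed_newton_step(x, F, Jf, mu):
    """One Jacobian smoothing step: solve Phi'_mu(x) d = -Phi(x), return x + d."""
    a, b = x, F(x)
    r = np.sqrt(a * a + b * b + 2.0 * mu)
    # Jacobian of the smoothed map: diag(a/r - 1) + diag(b/r - 1) @ F'(x)
    Jmu = np.diag(a / r - 1.0) + np.diag(b / r - 1.0) @ Jf(x)
    d = np.linalg.solve(Jmu, -phi(a, b))   # right-hand side uses the nonsmooth Phi
    return x + d

# Small monotone linear complementarity problem: F(x) = M x + q (toy example).
M = np.array([[4.0, -1.0], [-1.0, 4.0]])
q = np.array([-1.0, 2.0])
F = lambda x: M @ x + q
Jf = lambda x: M

x = np.array([1.0, 1.0])
for k in range(20):                         # mu_k -> 0 as the iteration proceeds
    x = mixed_newton_step(x, F, Jf, mu=2.0 ** -(k + 2))
# x approaches the solution x* = (1/4, 0): x* >= 0, F(x*) >= 0, x*^T F(x*) = 0
```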
The above methods have good convergence properties (Chen et al. 2010; Kanzow and Kleinmichel 1998; Kanzow 1996; Burke and Xu 1998; Huang et al. 2001; Chen and Mangasarian 1996; Kanzow and Pieper 1999); however, they require solving a system of linear equations, which may be computationally expensive as the problem size increases (Birgin et al. 2003; Dembo et al. 1982), or which may not have a solution. This has motivated the development of inexact Newton methods for complementarity (Dembo et al. 1982; Wan et al. 2015; Kanzow 2004). Their basic idea for solving the smooth system of equations \(G(\varvec{x})=\varvec{0} \) is to find an approximation \(\varvec{d}_k\) of the Newton direction that satisfies the following inequality \(\left\| G(\varvec{x}_k)+G'(\varvec{x}_k)\varvec{d}_k\right\| \le \theta _k\left\| G(\varvec{x}_k)\right\| ,\) with forcing term \(\theta _k\in [0,1),\)
where \(G'(\varvec{x}_k)\) is the Jacobian matrix of G at \(\varvec{x}_k. \)
In this paper, we propose a Jacobian smoothing inexact Newton method to solve (1) (indirectly, to solve NCP) using the uniparametric family \(\varphi _{\lambda }\) defined by Kanzow and Kleinmichel (1998),
and a smoothing of \(\varphi _{\lambda }\) defined by Arenas et al. (2020),
Using (3) and (4), we denote the reformulation (1), by
and its smoothing by
Two particular cases, perhaps the most popular ones, of (3) arise when \(\lambda =2\) and when \(\lambda \rightarrow 0\). In the first case, \(\varphi _\lambda \) reduces to the so-called Fischer function (Fischer 1992), and in the second one, \(\varphi _\lambda \) converges to a multiple of the Minimum function (Pang and Qi 1993).
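These limiting cases can be checked numerically. The sketch below assumes the Kanzow–Kleinmichel form \(\varphi _\lambda (a,b)=\sqrt{(a-b)^2+\lambda ab}-a-b,\) \(\lambda \in (0,4),\) for (3), which is consistent with the two limits just described:

```python
import numpy as np

def phi_lam(a, b, lam):
    # Kanzow-Kleinmichel family (assumed form of (3)):
    # phi_lambda(a, b) = sqrt((a - b)^2 + lambda*a*b) - a - b, lambda in (0, 4).
    return np.sqrt((a - b) ** 2 + lam * a * b) - a - b

a, b = 1.3, -0.7
fischer = np.sqrt(a ** 2 + b ** 2) - a - b   # lambda = 2 recovers the Fischer function
minimum = -2.0 * min(a, b)                   # lambda -> 0 gives -2 * min(a, b)
print(phi_lam(a, b, 2.0) - fischer)          # ~0
print(phi_lam(a, b, 1e-12) - minimum)        # ~0
```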
Recently, smoothing inexact Newton-type algorithms using the Fischer and Minimum complementarity functions have been proposed (Wan et al. 2015; Rui and Xu 2010), with good numerical results. As far as we know, Jacobian smoothing inexact methods have not been used to solve the nonlinear complementarity problem. This motivated us to propose an algorithm of this type using the family of complementarity functions (3) and its smoothing (4). Moreover, the \(\varphi _\lambda \) family has not been used in connection with inexact Newton methods to solve the NCP.
This paper is organized as follows: in Sect. 2, we present some preliminaries that will be used for the development of convergence results of our algorithmic proposal. In Sect. 3, we present a Jacobian smoothing inexact Newton algorithm to solve the nonlinear complementarity problem, and we develop its convergence theory. In Sect. 4, we analyze the numerical performance of the proposed algorithm and introduce a new index, which measures the speed (in terms of time) of an algorithm. Finally, in Sect. 5, we present some concluding remarks and possibilities for future works.
A few words about notation. For \(\varvec{x}\in \mathbb {R}^n\) and \(A\in \mathbb {R}^{n\times n},\) we denote by \(\Vert \varvec{x} \Vert \) the Euclidean norm of \(\varvec{x}\) and by \(\Vert A \Vert \) the matrix norm induced by the Euclidean vector norm. The distance from a matrix \(A\in \mathbb {R}^{ n\times n}\) to a nonempty set of matrices \(\Lambda \) is defined by \(dist(A,\Lambda ){:}{=}\text {inf}_{B\in \Lambda }\left\| A-B\right\| . \) Let \(\left\{ \alpha _k\right\} \) and \(\left\{ \beta _k\right\} \) be two sequences of positive numbers with \(\beta _k\rightarrow 0\). We say that \(\alpha _k=o(\beta _k)\) if \(\frac{\alpha _k}{\beta _k}\rightarrow 0\), and \(\alpha _k=O(\beta _k)\) if there exists a constant \(c>0\) such that \(\alpha _k\le c\beta _k, \) for all \(k\in \mathbb {N}\). Given \(G:\mathbb {R}^n\rightarrow \mathbb {R}^n\) continuously differentiable, \(G'(\varvec{x};\varvec{d})\) denotes its directional derivative at \(\varvec{x}\) in the direction \(\varvec{d}\).
2 Preliminaries
In this section, we present some definitions and lemmas that will be useful for the development of the convergence theory of the new algorithmic proposal. We begin with the concepts of generalized Jacobian (Clarke 1975) and C-subdifferential (Qi 1996).
Definition 1
Let \(\,G:\mathbb {R}^{n}\rightarrow \mathbb {R}^{n} \) be Lipschitz continuous.
1. The generalized Jacobian of G at \(\varvec{x}\) is defined by
$$\begin{aligned} \partial G(\varvec{x})\,=\,conv\;\partial _B G(\varvec{x}), \end{aligned}$$where \( \partial _B G(\varvec{x})= \{ \lim \nolimits _{k\rightarrow \infty } G^{\prime }(\varvec{x}_{k}) \in \mathbb {R}^{n\times n}: \varvec{x}_{k}\rightarrow \varvec{x},\,\varvec{x}_{k}\in D_G \}\) is called the B-Jacobian of G at \(\varvec{x}\), \(D_{G}\) is the set of all points of \(\mathbb {R}^{n}\) where \(G\) is differentiable, and conv denotes the convex hull.
2. The C-subdifferential of G at \(\varvec{x},\) denoted \(\partial _C G(\varvec{x}), \) is defined by
$$\begin{aligned} \partial _C G(\varvec{x})^T{:}{=}\partial G_{1}(\varvec{x})\times \dots \times \partial G_{n}(\varvec{x}), \end{aligned}$$(7)where the right-hand side denotes the set of matrices whose ith column is the generalized gradient (Clarke 1975) of the ith component function \(G_i.\)
Given the difficulty of calculating the generalized Jacobian (especially for \(n>1\)), a practical alternative is to use the set (7), taking into account that (Clarke 1990) \( \partial G(\varvec{x})^T \subseteq \partial _C G(\varvec{x})^T\).
Frequently, convergence results of nonsmooth Newton-type methods for solving systems of nonlinear equations are proved under an assumption of semismoothness or strong semismoothness, concepts that we define below.
Definition 2
(De Luca et al. 1996) A locally Lipschitz function \(G:\mathbb {R}^n\rightarrow \mathbb {R}^n\) is semismooth at \(\varvec{x}\) if the limit \(\lim \nolimits _{H\in \partial G(\varvec{x}+t\varvec{v}'),\,\varvec{v}'\rightarrow \varvec{v},\, t\downarrow 0} H\varvec{v}'\)
exists, for all \(\varvec{v}\in \mathbb {R}^n.\)
Definition 3
(De Luca et al. 1996) A function \(G:\mathbb {R}^n\rightarrow \mathbb {R}^n\) semismooth at \(\varvec{x}\) is strongly semismooth at \(\varvec{x}\) if, for any \(\varvec{d}\rightarrow \varvec{0}\) and \(H \in \partial G(\varvec{x}+\varvec{d}),\) \(\left\| H\varvec{d}-G'(\varvec{x};\varvec{d})\right\| =O( \Vert \varvec{d} \Vert ^{2}).\)
A classical assumption for the convergence of Newton-type methods to solve (1), and consequently the NCP, is that the matrices in \(\partial \Phi (\varvec{x}_*)\) or \(\partial _C \Phi (\varvec{x}_*)\) are nonsingular at a solution \(\varvec{x}_*\) of the problem. Related to this assumption are the concepts of BD-regularity and C-regularity, which we introduce below.
Definition 4
(De Luca et al. 1996) Let \(\varvec{x}_*\) be a solution of NCP. If all matrices in \(\partial _B \Phi (\varvec{x}_*)\) are nonsingular, \(\varvec{x}_*\) is called a BD-Regular solution.
Definition 5
Let \(\varvec{x}_*\) be a solution of NCP. If all matrices in \(\partial _C \Phi (\varvec{x}_*)\) are nonsingular, \(\varvec{x}_*\) is called a C-Regular solution.
The following three results provide characterizations of semismoothness and strong semismoothness.
Theorem 1
(Pang and Qi 1993) Let \(G:\mathbb {R}^n\rightarrow \mathbb {R}^n\) be a semismooth function at \(\varvec{x}.\) Then
Lemma 1
(Qi and Sun 1993) Let \(G:\mathbb {R}^n\rightarrow \mathbb {R}^n\) be a strongly semismooth function at \(\varvec{x}.\) Then
Theorem 2
Let \(G:\mathbb {R}^n\rightarrow \mathbb {R}^n\) be a strongly semismooth function at \(\varvec{x}.\) Then, as \(\varvec{d}\rightarrow \varvec{0},\)
for any \(H\in \partial G(\varvec{x}+\varvec{d}).\)
Proof
By the strong semismoothness of G at \(\varvec{x},\)
for any \(H\in \partial G(\varvec{x}+\varvec{d}),\) \(\varvec{d}\rightarrow \varvec{0}\) and some positive constant \(M_1.\) Moreover, by Lemma 1, it follows that
for some positive constant \(M_2.\) Now, from (8) and (9), we have
where \(M=M_1+M_2. \) The above concludes the proof. \(\square \)
Finally, we present results that establish the semismoothness of \(\Phi _{\lambda }\) and a sufficient condition for its strong semismoothness, for matrices \(H\in \partial \Phi _\lambda (\varvec{x}+\varvec{d}) \) (Kanzow and Kleinmichel 1998) and \(H\in \partial _C \Phi _\lambda (\varvec{x}+\varvec{d}) \) (Kanzow and Pieper 1999), respectively.
Lemma 2
(Kanzow and Kleinmichel 1998) The function \(\Phi _{\lambda }\) is semismooth.
Lemma 3
(Kanzow and Kleinmichel 1998) If the Jacobian matrix of F is locally Lipschitz continuous, the function \(\Phi _{\lambda }\) is strongly semismooth.
The following result guarantees that the Jacobian matrix \(\Phi '_{\lambda \mu }(\varvec{x})\) is sufficiently close to \(\partial _C \Phi _{\lambda }(\varvec{x}),\) if the smoothing parameter \(\mu \) tends to zero. This makes it meaningful to consider methods that use \(\Phi '_{\lambda \mu }(\varvec{x})\) instead of matrices in \(\partial _C \Phi _{\lambda }(\varvec{x})\).
Lemma 4
(Arenas et al. 2020) Let \(\varvec{x}\in \mathbb {R}^n\) be arbitrary but fixed and \(\mu >0\). Then
The following is a technical lemma that provides useful bounds to demonstrate linear, superlinear, and even quadratic convergence of the proposed algorithm. Additionally, it guarantees the nonsingularity of \(\Phi '_{\lambda \mu } (\varvec{x}).\)
Lemma 5
(Arenas et al. 2020) If \(\varvec{x}_{*}\) is a C-Regular solution of (5), there exists a constant \(\epsilon >0\) such that, if \(\left\| \varvec{x}-\varvec{x}_{*}\right\| <\epsilon \) then the matrix \(\Phi '_{\lambda \mu }(\varvec{x})\) is nonsingular and
where c is a positive constant that satisfies
for any \(H_*\in \partial _C\Phi _{\lambda }(\varvec{x}_{*}).\) Moreover, for any \(\delta >0\) there exists \(\hat{\mu }>0\) such that
for all \(H_*\in \partial _C \Phi _{\lambda }(\varvec{x}_{*})\) and \(\mu <\hat{\mu }\). If the Jacobian matrix of F is locally Lipschitz continuous, then there exists a positive constant \(\eta ,\) such that
for all \(H\in \partial _C \Phi _{\lambda }(\varvec{x})\).
3 Algorithm and convergence results
In this section, we propose a Jacobian smoothing inexact Newton algorithm to solve (6) and thus, to solve the NCP. In addition, we develop its convergence theory.
We present below the new algorithm that we will call JSINA (Jacobian Smoothing Inexact Newton Algorithm).
Remark 1
If \(\theta = 0\), the JSINA reduces to an exact method like the one proposed in Arenas et al. (2014). Therefore, our algorithm can be seen as a generalization of this class of methods for solving the NCP.
Remark 2
If \(\lambda = 2\) and the sequence \(\{\mu _k\}\) is chosen conveniently, the JSINA reduces to the one presented in Rui and Xu (2010), and it can also be seen as a generalization of that method.
Remark 3
To find a vector \(\varvec{d}_k\) satisfying (13), iterative methods based on Krylov subspaces can be used. In particular, we use GMRES (Generalized Minimal RESidual method).
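The stopping test of such a Krylov solver is precisely condition (13): the iteration runs until the residual of the mixed Newton equation falls below \(\theta \) times the norm of the right-hand side. Below is a minimal, self-contained full-GMRES sketch of this idea (for illustration only; in practice a library GMRES with restarts is preferable):

```python
import numpy as np

def gmres_inexact(A, b, theta):
    """Full GMRES (Arnoldi + least squares), stopped as soon as the inexact
    Newton test ||b - A d|| <= theta * ||b|| holds. Here A plays the role of
    Phi'_{lambda mu}(x_k) and b = -Phi_lambda(x_k), so d is the direction d_k."""
    n = len(b)
    beta = np.linalg.norm(b)
    Q = np.zeros((n, n + 1))
    H = np.zeros((n + 1, n))
    Q[:, 0] = b / beta
    d = np.zeros(n)
    for j in range(n):
        w = A @ Q[:, j]
        for i in range(j + 1):                    # Arnoldi, modified Gram-Schmidt
            H[i, j] = Q[:, i] @ w
            w -= H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        # d minimizes ||b - A d|| over the current (j+1)-dimensional Krylov subspace
        e1 = np.zeros(j + 2); e1[0] = beta
        y = np.linalg.lstsq(H[:j + 2, :j + 1], e1, rcond=None)[0]
        d = Q[:, :j + 1] @ y
        if np.linalg.norm(b - A @ d) <= theta * beta:   # inexactness test, cf. (13)
            break
        if H[j + 1, j] < 1e-14:                   # lucky breakdown: exact solution
            break
        Q[:, j + 1] = w / H[j + 1, j]
    return d

# Toy data standing in for the Jacobian and the residual at some iterate.
A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
d = gmres_inexact(A, b, theta=0.1)
```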
The following result gives a sufficient condition that guarantees the existence of a direction that satisfies (13).
Lemma 6
Let \(\varvec{x}\in \mathbb {R}^n\) such that \(\Phi _{\lambda }(\varvec{x})\ne \varvec{0}.\) If there is a nonzero vector \(\overline{\varvec{d}}\in \mathbb {R}^n\) such that
then there exists \(\theta _{\text {min}}\in [0,1)\) such that, for any \(\theta \in [\theta _{\text {min}},1),\) there exists \(\varvec{d}\in \mathbb {R}^n\) such that
In particular, if
then for any \(\theta \in [0,1)\), there exists \(\varvec{d}\in \mathbb {R}^n\) such that (14) is satisfied.
Proof
Let us assume \(\overline{\varvec{d}}\ne \varvec{0}\). Let
with \(\theta \in [\overline{\theta },1).\) From (15), after some algebraic calculations, we have that
If we define \(\theta _\textit{min}=\overline{\theta }\in [0,1)\) then the direction \(\varvec{d}\) given by (15) satisfies (14), for any \(\theta \in [\overline{\theta },1).\)
Finally, if
then (14) holds trivially for any \(\theta \in [0,1)\) and \(\varvec{d}=\overline{\varvec{d}}.\) Note that if the matrix \(\Phi '_{\lambda \mu }(\varvec{x})\) is nonsingular, then choosing \(\overline{\varvec{d}}=-\Phi '_{\lambda \mu }(\varvec{x})^{-1}\Phi _{\lambda }(\varvec{x})\) guarantees (16); therefore, there exists a direction \(\varvec{d}\) that satisfies (14) for any \(\theta \in [0,1).\) \(\square \)
Under the following assumptions, we will show that the JSINA is well defined and converges to a solution of (1).
A1. The nonlinear system of equations \(\Phi _{\lambda }(\varvec{x})=\varvec{0}\) has a solution.

A2. Every solution of (5) is C-Regular.

A3. The Jacobian matrix of F is Lipschitz continuous.
The following theorem guarantees that if the starting point is sufficiently close to a solution of (5) then the sequence generated by the JSINA remains in a neighborhood of that solution.
Theorem 3
Suppose that Assumptions A1 and A2 hold. Let \(\tau \in (0,1) \) and \(\theta _{\text {max}}\in [0,1)\) be such that \( \theta _{\text {max}}<\tau ,\) and let \(\varvec{x}_*\) be a solution of (5). Then for all \(\theta \in [0,\theta _{\text {max}}],\) there exist constants \(\hat{\epsilon }>0\) and \(\overline{\mu }>0\) such that if \(\left\| \varvec{x}-\varvec{x}_{*}\right\| <\beta ^2\hat{\epsilon }\) and \(\mu <\overline{\mu }\) then
for any \(\varvec{d}\in \mathbb {R}^n\) such that
where \( \left\| \varvec{y}\right\| _*=\Vert H_*\varvec{y} \Vert ,\) \(H_*\in \partial _C \Phi _{\lambda }(\varvec{x}_{*})\) and \(\beta =\max \{\left\| H_*\right\| , \left\| H_*^{-1}\right\| \}.\)
Proof
From Assumption A2, \(\varvec{x}_{*}\) is a C-Regular solution; then, from Lemma 5, there exists \(\epsilon _1>0\) such that if \(\left\| \varvec{x}-\varvec{x}_*\right\| <\epsilon _1\) then \(\Phi '_{\lambda \mu }(\varvec{x})\) is nonsingular and
Let \(H_*\in \partial _C\Phi _{\lambda }(\varvec{x}_{*});\) then \(H_*\) is nonsingular. Define
Given that \(\theta _{\textit{max}}<\tau ,\) there exists a sufficiently small \(\gamma >0,\) such that
In fact, since the function \(g(\gamma )=\left[ 1+\beta \gamma \right] \left[ \theta _\textit{max} \left[ 1+\gamma \beta \right] +2\gamma \beta \right] \) is continuous and \(g(0)=\theta _{\textit{max}}<\tau ,\) we have that \(g(\gamma )<\tau \) for sufficiently small \(\gamma ,\) giving (19).
On the other hand,
From (12), we have that there exists \(\hat{\mu _1}>0\) such that
By (12) and (20), there exists \(\hat{\mu _2}>0\) such that
Moreover, from Lemma 2, we have that for \(\gamma >0\), there exists \(\epsilon _2>0,\) such that if \(\left\| \varvec{x}-\varvec{x}_{*}\right\| <\epsilon _2\) then
Now, if \({\varvec{h}}= H_*(\varvec{x}+\varvec{d}-\varvec{x}_{*})\) and \(\varvec{r}=\Phi _{\lambda }(\varvec{x})+\Phi _{\lambda \mu }'(\varvec{x})\varvec{d},\) we have that
Let \(\hat{\epsilon }>0\) be sufficiently small such that \(\beta ^2\hat{\epsilon }<\min \{\epsilon _1,\epsilon _2\}\) and let \(\overline{\mu }=\min \{\hat{\mu _1},\hat{\mu _2}\};\) then (18) to (22) are satisfied for \(\beta ^2\hat{\epsilon }\) and \(\overline{\mu }\). Taking norms in (23),
Since
Then, using the Euclidean norm in the above equality
Replacing (25) in (24), it follows that
From the above, the definition of \(\Vert \cdot \Vert _*\) and (19), it follows that
which concludes the proof. \(\square \)
The norm \(\Vert \cdot \Vert _*\) is related to the Euclidean norm \(\Vert \cdot \Vert ,\) as follows.
Remark 4
For all \(\varvec{y}\in \mathbb {R}^n\) it is verified that
where \(\beta =\max \{\left\| H_*\right\| , \left\| H_*^{-1}\right\| \}.\)
The following theorem guarantees that the proposed algorithm is well-defined and converges linearly to a solution of (5).
Theorem 4
Suppose that Assumptions A1 and A3 hold. Let \(\tau \in (0,1) \) and \(\theta _{\text {max}}\in [0,1)\) be such that \( \theta _{\text {max}}<\tau ,\) and let \(\varvec{x}_*\) be a solution of (5). Then for all \(\theta \in [0,\theta _{\text {max}}],\) there exists a constant \(\epsilon _0>0\) such that if \(\left\| \varvec{x}_0-\varvec{x}_*\right\| <\epsilon _0,\) the sequence \(\left\{ \varvec{x}_k\right\} \) generated by the JSINA is well-defined and converges to \(\varvec{x}_*.\) Moreover,
Proof
Let \(\tau \in (0,1)\) and \(\epsilon _0\in (0,\epsilon ),\) where \(\epsilon =\min \left\{ \hat{\epsilon },\;\hat{\epsilon }\beta ^2\right\} ,\) with \(\hat{\epsilon }\) and \(\beta \) the constants of Theorem 3. For the proof, we will use induction on k.
For \(k=0\), if \(\left\| \varvec{x}_{0}-\varvec{x}_*\right\| \le \epsilon _0<\epsilon \) then from Lemma 6, there exists \(\varvec{d}_0\) such that (14) is satisfied and therefore, \(\varvec{x}_1=\varvec{x}_0+\varvec{d}_0\) is well-defined. Now, as \(\left\| \varvec{x}_{0}-\varvec{x}_*\right\| <\epsilon _0 \le \hat{\epsilon }\beta ^2,\) then from Theorem 3, we have
$$\begin{aligned}\left\| \varvec{x}_{0}+\varvec{d}_0-\varvec{x}_*\right\| _*=\left\| \varvec{x}_{1}-\varvec{x}_*\right\| _*\le \tau \left\| \varvec{x}_0-\varvec{x}_*\right\| _*.\end{aligned}$$
Suppose now that the result holds for all \(0<k\le m-1;\) we prove that it is true for \(k=m.\) Assume that \(\Vert \varvec{x}_m-\varvec{x}_* \Vert <\epsilon _0.\) From Lemma 6 and Theorem 3, there exists \(\varvec{d}_{m}\) such that (14) is satisfied, so \(\varvec{x}_{m+1}=\varvec{x}_{m}+\varvec{d}_{m}\) is well defined. Moreover,
$$\begin{aligned} \left\| \varvec{x}_{m}+\varvec{d}_{m}-\varvec{x}_*\right\| _*=\Vert \varvec{x}_{m+1}-\varvec{x}_* \Vert _*\le \tau \Vert \varvec{x}_{m}-\varvec{x}_* \Vert _*. \end{aligned}$$(29)From (29), applying the inductive hypothesis recursively, we have that
$$\begin{aligned} \left\| \varvec{x}_{m+1}-\varvec{x}_*\right\| _*\nonumber\le & {} \tau \left\| \varvec{x}_{m}-\varvec{x}_*\right\| _*\\\le & {} \cdots \le \tau ^m \left\| \varvec{x}_{0}-\varvec{x}_*\right\| _*. \end{aligned}$$Thus, the inequality (28) holds for \(k=m,\) and since \(0<\tau <1\), the sequence \(\left\{ \varvec{x}_k\right\} \) converges to \(\varvec{x}_{*}\).
\(\square \)
Remark 5
From (27) and (28), we have that \(\frac{\left\| \varvec{x}_{k+1}-\varvec{x}_*\right\| }{\left\| \varvec{x}_k-\varvec{x}_*\right\| }\le \tau \beta ^2,\) so the JSINA converges linearly if \(\tau \beta ^2<1\) and sublinearly, if \(\tau \beta ^2\ge 1.\)
The following theorem states the conditions under which the new algorithm converges superlinearly.
Theorem 5
Suppose that the assumptions of Theorem 4 hold. If, in addition, \(\theta _k\rightarrow 0,\) then the sequence \(\left\{ \varvec{x}_k\right\} \) generated by the JSINA converges to \(\varvec{x}_*\) q-superlinearly.
Proof
From Lemma 5, for k sufficiently large the matrix \(\Phi '_{\lambda \mu _k}(\varvec{x}_k)\) is nonsingular and satisfies \( \left\| \Phi '_{\lambda \mu _k}(\varvec{x}_k)^{-1}\right\| \le 2c,\) consequently,
Then, by (17) we have that
where \(H_k\in \partial _C \Phi _{\lambda }(\varvec{x}_k)\). Therefore,
From (30) in Theorem 3, it follows that,
On the other hand, from Lemma 2, there exists a sequence \(\left\{ \alpha _k\right\} \) such that
where \(H_k\in \partial _C \Phi _{\lambda }(\varvec{x}_k)\) and \(\alpha _k\rightarrow 0\) when \(k\rightarrow \infty .\) Thus, from (31) and (32), it follows that
Since \(\theta _k\rightarrow 0,\) \(\alpha _k\rightarrow 0\) and, from Lemma 4, \( \Vert \Phi '_{\lambda \mu _k}(\varvec{x}_k)-H_k \Vert \rightarrow 0 \) as \(k\rightarrow \infty \), the sequence \( \{\varvec{x}_k \} \) converges q-superlinearly to \(\varvec{x}_{*}\). \(\square \)
One of the most desirable properties for iterative algorithms is the quadratic convergence rate. The following result guarantees that the JSINA can achieve such convergence rate.
Theorem 6
Suppose that the assumptions of Theorem 4 hold. If \(\theta _k=O\left( \left\| \Phi _\lambda (\varvec{x}_k)\right\| \right) ,\) then the sequence \(\left\{ \varvec{x}_k\right\} \) generated by the JSINA converges q-quadratically to \(\varvec{x}_*\).
Proof
From Lemma 3, \(\Phi _{\lambda }\) is strongly semismooth, therefore there exists a positive constant C, such that
where \(H_k\in \partial _C \Phi _{\lambda }(\varvec{x}_k).\) On the other hand, given that \(\theta _k=O\left( \left\| \Phi _\lambda (\varvec{x}_k)\right\| \right) \) there exists a positive constant B, such that
Given that \(\Phi _{\lambda }\) is locally Lipschitz, we have that
Finally, from the Assumption A3, the Jacobian matrix of F is locally Lipschitz, then from Lemma 5 there exists \(\eta >0,\) such that
Thus, from (32), (33), (34) and (35)
Therefore, the sequence \( \{\varvec{x}_k\}\) converges q-quadratically to \(\varvec{x}_*. \) \(\square \)
4 Numerical tests
In this section, we analyze numerically the performance of the JSINA in terms of the number of iterations and CPU time. For this, we compare it with two other algorithms. The first one, proposed in Arenas et al. (2020), is a Jacobian smoothing-type algorithm whose directional search is performed exactly (we will call it JSENA). The second is a local version of the inexact Newton-type algorithm proposed in Kanzow (2004), which uses the generalized Jacobian of \(\Phi _{\lambda }\) in the directional search (we will call it GINA).
The comparison between these algorithms will allow us to evaluate the use of inexact strategies for the solution of Newton’s equation and the use of smoothing techniques to approximate the Jacobian matrix of \(\Phi _{\lambda }\) instead of constructing matrices of its generalized Jacobian.
For clarity in the reading of this section, we include below the JSENA and GINA algorithms.
For the implementation of the inexact algorithms (JSINA and GINA), we chose two sequences for the forcing parameter: \(\{2^{-(k+1)}\}\) and \(\{10^{-(k+1)}\}. \) For the smoothing algorithms (JSINA and JSENA), we chose three sequences for the smoothing parameter: \(\{2^{-(k+1)}\},\) \(\{10^{-(k+1)}\}\) and \(\{\overline{\mu }_k\}, \) the latter defined in Sánchez et al. (2021). Thus, we have eleven methods to analyze and compare, which we will identify as indicated in Table 1.
For methods M1 to M6, we use the dynamic procedure for choosing the parameter \(\lambda \) proposed in Kanzow and Kleinmichel (1998). In addition, for methods M1 to M6, M10 and M11, we use GMRES to find a vector satisfying (13) and (36), respectively.
The methods were implemented in MATLAB and run on a computer with an Intel(R) Core(TM) i5-8300H CPU at 2.30 GHz.
For the numerical tests, we consider four problems of varying size, which are described below. In each case, we present the function (\(F:\mathbb {R}^n\rightarrow \mathbb {R}^n\)) that defines the problem and its solution(s). The starting points considered were \(\varvec{x}_1=(-1, \ldots ,-1)^T,\,\) \(\varvec{x}_2=(0, \ldots ,0)^T\) and \(\varvec{x}_3=(1, \ldots ,1)^T.\)
P1. Ahn (Byong-Hun 1983). \(F(\varvec{x})=M\varvec{x}+\varvec{q},\) where \(\,M= tridiag(-2,4, 1)\in \mathbb {R}^{n\times n}\,\) and \(\,{\varvec{q}} =(-1,\ldots ,-1)^T \in \mathbb {R}^n.\,\) The solution of this problem is given by
In problems P2 to P4, \(F(\varvec{x})=(f_1(\varvec{x}), \dots ,f_n(\varvec{x}))^T,\) where (Lopes et al. 1999),
Below, we define the function \(h:\mathbb {R}^n\rightarrow \mathbb {R}^n, \) \(h(\varvec{x})=(h_1(\varvec{x}), \ldots ,h_n(\varvec{x}))^T \) in each case. For all problems, the solution is \(\varvec{x}_*=(1,0,1,\ldots ,0, 1).\)
P2. Chained Rosenbrock (Luksan and Vlcek 1999):
P3. Generalized tridiagonal Broyden (Luksan and Vlcek 1999):
P4. Structured Jacobian (Luksan and Vlcek 1999):
and for \(i=2,3,\ldots ,n-1,\,\,\)
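For reproducibility, problem P1 above can be generated in a few lines; we assume \(tridiag(-2,4,1)\) lists the subdiagonal, diagonal and superdiagonal entries, in that order:

```python
import numpy as np

def ahn_problem(n):
    # P1 (Ahn): F(x) = M x + q, with M = tridiag(-2, 4, 1) and q = (-1, ..., -1)^T.
    # Assumed ordering of tridiag: subdiagonal, main diagonal, superdiagonal.
    M = (np.diag(np.full(n, 4.0))
         + np.diag(np.full(n - 1, -2.0), k=-1)    # subdiagonal
         + np.diag(np.full(n - 1, 1.0), k=1))     # superdiagonal
    q = -np.ones(n)
    return M, q

M, q = ahn_problem(5)
F = lambda x: M @ x + q
```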
4.1 Experiment 1.
In the first experiment, for each problem, we analyzed the performance of variants of JSINA (M1 to M6), JSENA (M7 to M9) and GINA (M10 and M11), in terms of number of iterations, inner iterations and CPU time.
The results obtained are shown in Tables 2, 3, 4 and 5, which contain the following information: problem (P), dimension (n), starting point (\(\varvec{x}_0\)), number of iterations of the algorithm (k), number of inner iterations of the inexact method (In) and CPU time used in the execution of each algorithm (CPU).
In Tables 2 and 3, we observe that if the forcing parameter (\(\theta _k\)) tends to zero quickly (M4 to M6), the number of inner iterations increases, as does the CPU time. An alternative is to choose a more demanding smoothing parameter whose computation does not imply a higher computational effort.
We also observe that, in general, the methods with the smoothing parameter \(\overline{\mu }_k\) require a smaller number of external iterations to find the solution. However, they diverged more often than their counterparts.
The results in Table 4 show that the number of iterations and the CPU time were similar for the three JSENA variants. That is, the algorithm is not very sensitive to the variation of the smoothing parameter.
Comparing the results of Tables 2, 3 and 4, we can infer that although the number of iterations required by the exact methods (M7 to M9) was smaller than that of the inexact ones (M1 to M6), the CPU time of the latter was considerably lower. Moreover, in the cases where divergence occurred, the inexact methods reported it faster than the others. On the other hand, the number of successes of the exact methods was lower because they exceeded the maximum time allowed.
Table 5 shows that, in general, the number of inner iterations of M11 is higher than that required by M10. This is because the sequence of forcing parameters in M11 was more demanding than in M10, since it converges to zero more rapidly. On the other hand, the CPU times of the two methods did not show significant differences.
Finally, the results of Tables 2, 3 and 5 show that although the numbers of internal and external iterations of GINA are significantly lower than those performed by JSINA, the CPU time of the latter is, in general, considerably lower than that required by GINA. This shows that the use of matrices of the generalized Jacobian increases the computational effort compared with that of smoothing methods.
4.2 Experiment 2.
In this experiment, we compare the efficiency and robustness of the eleven methods described in Table 1. For this, we use the robustness (\(R_j\)), efficiency (\(E_j\)) and combined robustness-efficiency (\(E_j \times R_j\)) indices (Buhmiler and Krejić 2008).
Robustness index (\(R_j\)). It measures the percentage of success of the method,
Efficiency index (\(E_j\)). It measures the speed of the method in terms of number of iterations,
Combined index (\(E_j\times R_j\)). It measures the balance of methods in terms of successes and average speed in number of iterations,
where \( r_{ij} \) is the number of iterations required to solve problem i with method j, \( r_{ib}=\min _{j} r_{ij} \) is the best result for problem i among the m methods, \(t_j\) is the number of successes of method j and \(n_j\) is the number of problems attempted by method j. For all these indices, the closer the value is to 1, the better the method.
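As a sketch, the three indices can be computed from a table of results as follows; the elementwise averaging of the ratios \(r_{ib}/r_{ij}\) used here is one common choice and may differ slightly from the exact formulas in Buhmiler and Krejić (2008):

```python
import numpy as np

def indices(r):
    """r[i, j] = iterations of method j on problem i, with np.nan marking a failure.
    Returns the robustness R_j, efficiency E_j and combined E_j * R_j indices."""
    n_probs = r.shape[0]                          # problems attempted by each method
    solved = ~np.isnan(r)
    R = solved.sum(axis=0) / n_probs              # R_j = t_j / n_j
    r_best = np.nanmin(r, axis=1)                 # r_ib: best result on problem i
    with np.errstate(invalid="ignore"):
        ratios = np.where(solved, r_best[:, None] / r, 0.0)
    E = ratios.sum(axis=0) / solved.sum(axis=0)   # mean of r_ib / r_ij over solved
    return R, E, E * R

# Toy table: 2 problems, 3 methods; method 3 fails on problem 1.
r = np.array([[10.0, 5.0, np.nan],
              [ 8.0, 4.0, 16.0]])
R, E, ER = indices(r)
```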
The previous indices provide standardized measures of some characteristics of the algorithms, but they do not show the speed (in time) with which the methods give an answer (positive or negative) to the user. For this purpose, we introduce in this paper a new index that we call the time index (\(T_j\)), defined by
with
where \(T_{ij} \) is the CPU time spent to answer problem i by method j.
From Tables 2, 3, 4 and 5, we calculate the indices mentioned above for methods M1 to M11. The results obtained are shown in Table 6.
Table 6 shows that M1 has the highest robustness index; this means that the JSINA with parameters \(\mu _k=2^{-(k+1)}\) and \(\theta _k=2^{-(k+1)}\) converged in all cases. Moreover, analyzing the robustness index, it is observed that the inexact methods have the highest robustness indices, since they converged in more than \(91\%\) of the experiments.
Regarding the efficiency index, we observe that M7 to M9 present the best results, which is to be expected since they correspond to the JSENA and this algorithm solves a system of equations exactly to find the Newton direction. This means that when these methods converge, they require fewer iterations than the inexact methods. However, these methods also have a low robustness index.
The methods with the highest combined index are M5 (JSINA, with \(\theta _k=\mu _k=10^{-(k+1)}\)) and M6 (JSINA, with \(\theta _k=10^{-(k+1)} \) and \(\overline{\mu }_k\)). This indicates that these methods are the most balanced in terms of robustness and efficiency; i.e., they have a high probability of convergence in relatively few iterations.
On the other hand, it is observed that, in general, the JSINA (smoothing algorithm) has the best time indices, followed by the GINA (nonsmooth algorithm) and the JSENA (exact algorithm), respectively. This confirms the theoretical expectations for this class of algorithms, i.e., that the inexact ones have a lower computational cost than the exact ones.
Finally, we highlight that M2 (JSINA, with \(\theta _k=2^{-(k+1)} \) and \(\mu _k=10^{-(k+1)}\)) presents the highest time index, together with efficiency, robustness and combined indices close to the best values in each case. Thus, M2 is the most balanced method in terms of robustness, efficiency and time.
5 Final remarks
In this paper, we propose a new inexact Newton-type algorithm to solve large nonlinear complementarity problems via their reformulation as a nonsmooth system of nonlinear equations. We show that the algorithm converges up to q-quadratically.
We define a new index that allows algorithms to be compared in terms of time, complementing the known indices that compare them in terms of successes and iterations.
The numerical experiments show the good performance of our algorithm on large problems in comparison with recently proposed inexact and exact methods (Wan et al. 2015; Kanzow 2004). Moreover, they show the importance of having a normalized time index that, together with other known indices, allows a more objective analysis of the algorithms.
We believe that further numerical experimentation is needed to establish other options for choosing the smoothing and inexactness parameters. Moreover, a globalization of the proposed algorithm and its global convergence analysis remain to be done.
Code availability
This work introduces an algorithm that is fully available to readers.
References
Anitescu M, Cremer J, Potra F (1997) On the existence of solutions to complementarity formulations of contact problems with friction. Complement Var Probl State Art 92:12
Arenas F, Martínez HJ, Pérez R (2014) Redefining the Kanzow complementarity function. Rev Cienc 18(2):111–122
Arenas F, Martínez H, Pérez R (2020) A local Jacobian smoothing method for solving nonlinear complementarity problems. Univ Sci 25:149–174. https://doi.org/10.11144/Javeriana.SC25-1.aljs
Birgin E, Krejić N, Martínez JM (2003) Globally convergent inexact quasi-Newton methods for solving nonlinear systems. Numer Algorithms 32:249–260. https://doi.org/10.1023/A:1024013824524
Broyden C, Dennis J Jr, Moré J (1973) On the local and superlinear convergence of quasi-Newton methods. IMA J Appl Math 12:223–245. https://doi.org/10.1093/imamat/12.3.223
Buhmiler S, Krejić N (2008) A new smoothing quasi-Newton method for nonlinear complementarity problems. J Comput Appl Math 211:141–155. https://doi.org/10.1016/j.cam.2006.11.007
Burke J, Xu S (1998) The global linear convergence of a noninterior path-following algorithm for linear complementarity problems. Math Oper Res 23:719–734. https://doi.org/10.1287/moor.23.3.719
Byong-Hun A (1983) Iterative methods for linear complementarity problems with upperbounds on primary variables. Math Program 26(3):295–315. https://doi.org/10.1007/BF02591868
Chen C, Mangasarian O (1996) A class of smoothing functions for nonlinear and mixed complementarity problems. Comput Optim Appl 5:97–138. https://doi.org/10.1007/BF00249052
Chen A, Oh J-S, Park D, Recker W (2010) Solving the bicriteria traffic equilibrium problem with variable demand and nonlinear path costs. Appl Math Comput 217:3020–3031. https://doi.org/10.1016/j.amc.2010.08.035
Clarke F (1975) Generalized gradients and applications. Trans Am Math Soc 205:247–262. https://doi.org/10.1090/S0002-9947-1975-0367131-6
Clarke F (1990) Optimization and Nonsmooth Analysis. SIAM, New York
De Luca T, Facchinei F, Kanzow C (1996) A semismooth equation approach to the solution of nonlinear complementarity problems. Math Program 75(3):407–439. https://doi.org/10.1007/BF02592192
Dembo R, Eisenstat S, Steihaug T (1982) Inexact Newton methods. SIAM J Numer Anal 19:400–408. https://doi.org/10.1137/0719025
Ferris M, Pang J-S (1997) Engineering and economic applications of complementarity problems. Siam Rev 39:669–713. https://doi.org/10.1137/S0036144595285963
Fischer A (1992) A special Newton-type optimization method. Optimization 24:269–284. https://doi.org/10.1080/02331939208843795
Huang Z, Han J, Xu D, Zhang L (2001) The non-interior continuation methods for solving the \(p_0\) function nonlinear complementarity problem. Sci China Ser A-Math 44:1107–1114. https://doi.org/10.1007/BF02877427
Kanzow C (1996) Some noninterior continuation methods for linear complementarity problems. SIAM J Matrix Anal Appl 17:851–868. https://doi.org/10.1137/S0895479894273134
Kanzow C (2004) Inexact semismooth Newton methods for large-scale complementarity problems. Optim Methods Softw 19(3–4):309–325. https://doi.org/10.1080/10556780310001636369
Kanzow C, Kleinmichel H (1998) A new class of semismooth Newton-type methods for nonlinear complementarity problems. Comput Optim Appl 11:227–251. https://doi.org/10.1023/A:1026424918464
Kanzow C, Pieper H (1999) Jacobian smoothing methods for nonlinear complementarity problems. SIAM J Optim 9:342–373. https://doi.org/10.1137/S1052623497328781
Kostreva M (1984) Elasto-hydrodynamic lubrication: a non-linear complementarity problem. Int J Numer Methods Fluids 4:377–397. https://doi.org/10.1002/fld.1650040407
Krejić N, Rapajić S (2008) Globally convergent Jacobian smoothing inexact Newton methods for NCP. Comput Optim Appl 41(2):243–261
Li DH, Fukushima M (2001) Globally convergent Broyden-like methods for semismooth equations and applications to VIP, NCP and MCP. Ann Oper Res 103:71–97. https://doi.org/10.1023/A:1012996232707
Lopes VLR, Martínez JM, Pérez R (1999) On the local convergence of quasi-Newton methods for nonlinear complementarity problems. Appl Numer Math 30:3–22. https://doi.org/10.1016/S0168-9274(98)00080-4
Luksan L, Vlcek J (1999) Sparse and partially separable test problems for unconstrained and equality constrained optimization. Technical Report 767, Institute of Computer Science, Academy of Sciences of the Czech Republic
Pang J-S, Qi L (1993) Nonsmooth equations: motivation and algorithms. SIAM J Optim 3:443–465. https://doi.org/10.1137/0803021
Qi L (1993) Convergence analysis of some algorithms for solving nonsmooth equations. Math Oper Res 18:227–244. https://doi.org/10.1287/moor.18.1.227
Qi L (1996) \(C\)-Differential Operators, \(C\)-Differentiability and Generalized Newton Methods. School of Mathematics. University of New South Wales, Sydney
Qi L, Sun J (1993) A nonsmooth version of Newton’s method. Math Program 58:353–367. https://doi.org/10.1007/BF01581275
Rui S-P, Xu C-X (2010) A smoothing inexact Newton method for nonlinear complementarity problems. J Comput Appl Math 233:2332–2338. https://doi.org/10.1016/j.cam.2009.10.018
Sánchez W, Pérez R, Martínez H (2021) A global Jacobian smoothing algorithm for nonlinear complementarity problems. Rev Integr 39:191–215. https://doi.org/10.18273/revint.v39n2-20210004
Sherman A (1978) On Newton-iterative methods for the solution of systems of nonlinear equations. SIAM J Numer Anal 15:755–771. https://doi.org/10.1137/0715050
Wan Z, Li H, Huang S (2015) A smoothing inexact Newton method for nonlinear complementarity problems. Abstr Appl Anal 2015:1085–3375. https://doi.org/10.1155/2015/731026
Zhu J, Hao B (2011) A non-monotone inexact regularized smoothing Newton method for solving nonlinear complementarity problems. Int J Comput Math 88:3483–3495. https://doi.org/10.1080/00207160.2011.599380
Funding
Open Access funding provided by Colombia Consortium.
Sánchez, W., Arias, C.A. & Perez, R. A Jacobian smoothing inexact Newton method for solving the nonlinear complementary problem. Comp. Appl. Math. 43, 279 (2024). https://doi.org/10.1007/s40314-024-02775-7
Keywords
- Nonlinear complementarity
- Inexact Newton method
- Smoothing methods
- Generalized Jacobian
- Quadratic convergence