1 Introduction

We consider the finite-dimensional quasi-variational inequality QVI(K, F): Find a vector \(x^{*}\in K(x^{*})\) such that

$$ F\bigl(x^{*}\bigr)^{T} \bigl(y-x^{*}\bigr)\ge 0,\quad \forall y\in K\bigl(x^{*}\bigr), $$

where \(F:R^{n}\rightarrow R^{n}\) is a point-to-point mapping and \(K:R^{n}\rightrightarrows R^{n}\) is a point-to-set mapping with closed and convex images. Throughout the paper, we assume that F belongs to \(C^{1}\) and that, for each \(x\in R^{n}\), the feasible set mapping K is given by

$$ K(x)\triangleq \bigl\{ y\in R^{n}\mid g(y,x)\le 0, h(y,x)=0\bigr\} , $$

where \(g:R^{n}\times R^{n}\rightarrow R^{m_{1}}\) belongs to \(C^{2}\) with \(g_{i}(\cdot ,x)\) convex on \(R^{n}\) for each \(i=1,2,\ldots ,m_{1}\) and all \(x\in R^{n}\), and \(h:R^{n}\times R^{n}\rightarrow R^{m_{2}}\) belongs to \(C^{2}\) with \(h_{j}(\cdot ,x)\) affine on \(R^{n}\) for each \(j=1,2,\ldots ,m_{2}\) and all \(x\in R^{n}\). When the set \(K(x)\) is independent of x, (1.1) reduces to the classical variational inequality (VI). For the VI, we refer the reader to [13] and the references therein.

QVI (1.1), first introduced by Bensoussan and Lions [2, 3], has important applications in many fields such as generalized Nash games, mechanics, economics, statistics, transportation and biology; see for example [1, 6, 10, 12] and the references therein. One interesting topic on QVIs is the development of efficient algorithms for their solution. Since the QVI is nonsmooth and nonconvex, it is difficult to design effective methods for it, and, compared with the VI, numerical methods are still scarce. In this paper, we mainly focus on numerical methods based on the KKT conditions of the QVI. This area has attracted much attention and much progress has been made. In [12] an interior point approach was proposed to solve QVIs and convergence was established for several classes of interesting QVIs. References [8, 9] developed a so-called LP-Newton method, which can be successfully applied to nonsmooth systems of equations with non-isolated solutions. Reference [21] developed an efficient regularized smoothing Newton-type algorithm for QVIs; the proposed algorithm takes advantage of newly introduced smoothing functions and a non-monotone line search strategy. Reference [10] proposed a semismooth Newton method for QVIs and obtained global convergence and locally superlinear/quadratic convergence results for some important classes of quasi-variational inequality problems; the numerical results show that the method performs well.

There are many ways to compute a numerical solution of the nonlinear complementarity problem (NCP), such as linearized projected relaxation methods [13], the modulus-based matrix splitting method [24] and the penalty method [7, 23, 25]. In the past two decades, the nonsmooth-equation-based method has been thoroughly studied for solving the NCP; see for example [5, 14–19] and the references therein. A common way to reformulate the complementarity system is to use a so-called NCP-function. A function \(\phi :R^{2}\rightarrow R\) is called an NCP-function if it satisfies

$$ \phi (a,b)=0\quad \Leftrightarrow\quad a\ge 0,\quad b\ge 0,\quad ab=0. $$

For example, the famous Fischer–Burmeister (FB) function takes the form

$$ \phi (a,b)=\sqrt{a^{2}+b^{2}}-a-b. $$

By the use of an NCP-function, the nonlinear complementarity problem can be easily converted into a system of nonlinear equations. Most existing NCP-functions are generally nondifferentiable in the sense of Fréchet but semismooth in the sense of Mifflin [20] and Qi and Sun [22]. In [17], the authors proposed a nonsmooth equation reformulation of the NCP; their reformulation enjoys the nice property that it is continuously differentiable everywhere except at the solution. In this paper, we present a semismooth equation reformulation of the KKT system of a quasi-variational inequality and propose a semismooth Newton method to solve the resulting equations.
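As a quick illustration (a minimal Python sketch of ours, not part of the original development; the name `phi_fb` is our own), the defining property of the FB function can be checked numerically:

```python
import math

def phi_fb(a: float, b: float) -> float:
    """Fischer-Burmeister NCP-function: sqrt(a^2 + b^2) - a - b."""
    return math.sqrt(a * a + b * b) - a - b

# phi(a, b) = 0 exactly when a >= 0, b >= 0 and a*b = 0.
print(phi_fb(0.0, 3.0))   # complementary pair -> 0.0
print(phi_fb(2.0, 0.0))   # complementary pair -> 0.0
print(phi_fb(1.0, 1.0))   # a*b > 0 -> sqrt(2) - 2, nonzero
print(phi_fb(-1.0, 0.0))  # a < 0  -> 2.0, nonzero
```

Zeros of \(\phi\) thus encode the complementarity conditions componentwise, which is what turns the complementarity system into a square system of equations.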

The paper is organized as follows. In the next section, we describe a semismooth equation reformulation of the KKT system of a quasi-variational inequality, present the semismooth Newton method and establish global convergence for the method. In Sect. 3, we compare the proposed method with some other methods on problems listed in [11].

In the following, we introduce some notation that will be used in this paper. For a continuously differentiable function \(F:R^{n}\rightarrow R^{n}\), we write \(\mathit{JF}(x)\) for the Jacobian of F at a point \(x\in R^{n}\), whereas \(\nabla F(x)\) denotes the transposed Jacobian of F. Given a smooth mapping \(g:R^{n}\times R^{n}\rightarrow R^{m}\), \((y,x)\mapsto g(y,x)\), \(\nabla _{y}g(y,x)\) denotes the transpose of the partial Jacobian of g with respect to the y-variables. If F is locally Lipschitz continuous around x, then \(\partial F(x)\) denotes Clarke’s generalized Jacobian of F at x. For a vector \(x\in R^{n}\) and a subset \(I\subset \{1,2,\ldots ,n\}\), we write \(x_{I}\) for the subvector consisting of the elements \(x_{i}\), \(i\in I\). For a matrix \(A\in R^{n\times n}\) and two subsets \(I, J\subset \{1,2,\ldots ,n\}\), the symbol \(A_{IJ}\) stands for the submatrix with entries \(a_{ij}\) for \(i\in I\), \(j\in J\). The symbol \(\operatorname{diag}(a_{11},a_{22},\ldots ,a_{nn})\) stands for the diagonal matrix with diagonal elements \(a_{11},a_{22},\ldots ,a_{nn}\).

2 Semismooth equation reformulation and semismooth Newton method

First, we give the following definition, which will be used throughout.

Definition 2.1


A function \(F:R^{n}\rightarrow R^{n}\) is semismooth at a point \(x\in R^{n}\) if it is locally Lipschitzian at x and

$$ \lim_{V\in \partial F(x+td'),d'\rightarrow d, t\downarrow 0}Vd' $$

exists for any \(d\in R^{n}\), where \(\partial F(x)\) is the generalized Jacobian of F at x. F is strongly semismooth at \(x\in R^{n}\) if for any \(d\rightarrow \mathbf{0} \) and any \(V\in \partial F(x+d)\),

$$ Vd-F'(x;d)=O\bigl( \Vert d \Vert ^{2}\bigr), $$

where \(F'(x;d)\) denotes the directional derivative of F at x along the direction d.

A point x is called a KKT point of QVI (1.1) if there exist Lagrange multipliers \(\lambda \in R^{m_{1}}\) and \(\nu \in R^{m_{2}}\) such that

$$ \textstyle\begin{cases} F(x)+\nabla _{y} g(x,x)\lambda +\nabla _{y} h(x,x)\nu =0, \\ h(x,x)=0, \\ \lambda \ge 0,\qquad g(x,x)\le 0,\qquad \lambda ^{T}g(x,x)=0. \end{cases} $$

Similar to Theorem 1 of [12], we find that \(x^{*}\in K(x^{*})\) is a solution of (1.1) if there exist \(\lambda ^{*}\in R^{m_{1}}\) and \(\nu ^{*}\in R^{m_{2}}\) such that \((x^{*},\lambda ^{*},\nu ^{*})\) satisfies the KKT conditions (2.1). Moreover, if \(x^{*}\in K(x^{*})\) is a solution of (1.1) and some suitable constraint qualification holds at \(x^{*}\), then there exist \(\lambda ^{*}\in R^{m_{1}}\) and \(\nu ^{*}\in R^{m_{2}}\) such that \((x^{*},\lambda ^{*},\nu ^{*})\) satisfies the KKT conditions (2.1). Based on the above relationship, our aim is to develop a numerical method for solving the KKT conditions (2.1). For convenience, let

$$\begin{aligned}& L(x,\lambda ,\nu ):=F(x)+\nabla _{y} g(x,x)\lambda +\nabla _{y} h(x,x) \nu , \\& p(x):=g(x,x),\qquad q(x):=h(x,x), \end{aligned}$$

and then (2.1) can be rewritten as

$$ \textstyle\begin{cases} L(x,\lambda ,\nu )=0, \\ q(x)=0, \\ p(x)+w=0, \\ \lambda \ge 0,\qquad w\ge 0, \qquad \lambda ^{T}w=0, \end{cases} $$

where \(w\in R^{m_{1}}\) is a vector of slack variables.

It is not easy to solve (2.2) directly since the fourth formula is a complementarity system. We replace the complementarity system by an NCP-function [17], which is called the smoothed form of the FB function:

$$ \phi (u,v,\varepsilon )=\sqrt{u^{2}+v^{2}+\varepsilon ^{2}}-(u+v). $$

It is clear that, for each \(\varepsilon \ne 0\), \(\phi (u,v,\varepsilon )\) is continuously differentiable. We use it to construct an almost smooth equation reformulation of the fourth formula.
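The smoothing effect can be seen directly (a hypothetical Python sketch of ours; the name `phi_smooth` and the sample values are not from the original). For \(\varepsilon \ne 0\), the square root argument is bounded away from zero, so the kink of the FB function at the origin disappears, and the zero set of \(\phi (\cdot ,\cdot ,\varepsilon )\) is the perturbed complementarity condition \(u>0\), \(v>0\), \(uv=\varepsilon ^{2}/2\):

```python
import math

def phi_smooth(u: float, v: float, eps: float) -> float:
    """Smoothed Fischer-Burmeister function: sqrt(u^2 + v^2 + eps^2) - (u + v)."""
    return math.sqrt(u * u + v * v + eps * eps) - (u + v)

eps = 0.2
u = 1.0
v = eps ** 2 / (2 * u)              # chosen so that u*v = eps^2 / 2
print(phi_smooth(u, v, eps))        # -> 0.0 (up to rounding)
print(phi_smooth(0.0, 0.0, eps))    # -> 0.2, smooth at the origin for eps != 0
```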

Let \(\Phi _{FB}(\lambda ,w)=(\phi _{1}^{FB}(\lambda _{1},w_{1}),\ldots , \phi _{m_{1}}^{FB}(\lambda _{m_{1}},w_{m_{1}}))^{T}\) and \(S(\lambda ,w)=(S_{1}(\lambda ,w),\ldots ,S_{m_{1}}(\lambda ,w))^{T}\), where for each \(i=1,2,\ldots ,m_{1}\), the elements \(\phi _{i}^{FB}(\lambda _{i},w_{i})\) and \(S_{i}(\lambda ,w)\) are given by

$$ \phi _{i}^{FB}(\lambda _{i},w_{i})= \sqrt{\lambda _{i}^{2}+w_{i}^{2}}- \lambda _{i}-w_{i} $$
and


$$ S_{i}(\lambda ,w)=\phi \bigl(\lambda _{i},w_{i},\mu ^{\frac{1}{2}} \bigl\Vert \Phi _{FB}( \lambda ,w) \bigr\Vert \bigr)= \sqrt{\lambda _{i}^{2}+w_{i}^{2}+2\mu \theta (\lambda ,w)}-\lambda _{i}-w_{i}, $$

respectively, where \(0<\mu <\frac{(\sqrt{2}+1)^{2}}{m_{1}}\) is a parameter, \(\|\cdot \|\) is the Euclidean norm, and

$$ \theta (\lambda ,w)=\frac{1}{2} \bigl\Vert \Phi _{FB}( \lambda ,w) \bigr\Vert ^{2}. $$
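To make the construction concrete, the following NumPy sketch (our illustration; the names `phi_fb_vec` and `S` are our own) evaluates \(S(\lambda ,w)\) at a complementary pair, where \(\Phi _{FB}(\lambda ,w)=0\), hence \(\theta (\lambda ,w)=0\) and \(S\) reduces to \(\Phi _{FB}\):

```python
import numpy as np

def phi_fb_vec(lam: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Componentwise Fischer-Burmeister function Phi_FB(lambda, w)."""
    return np.sqrt(lam**2 + w**2) - lam - w

def S(lam: np.ndarray, w: np.ndarray, mu: float) -> np.ndarray:
    """S_i = sqrt(lam_i^2 + w_i^2 + 2*mu*theta) - lam_i - w_i,
    with theta = 0.5 * ||Phi_FB(lambda, w)||^2."""
    theta = 0.5 * np.linalg.norm(phi_fb_vec(lam, w)) ** 2
    return np.sqrt(lam**2 + w**2 + 2.0 * mu * theta) - lam - w

mu = 1e-5                        # parameter, 0 < mu < (sqrt(2)+1)^2 / m1
lam = np.array([0.0, 2.0, 0.5])  # complementary pair: lam >= 0, w >= 0,
w = np.array([3.0, 0.0, 0.0])    # lam^T w = 0, so theta = 0
print(S(lam, w, mu))             # -> [0. 0. 0.]
```

At a non-complementary point \(\theta >0\), so S differs from \(\Phi _{FB}\) by the smoothing term \(2\mu \theta\) under the square root.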

It is obvious that, for each \(i=1,2,\ldots ,m_{1}\), \(S_{i}(\lambda ,w)\) is differentiable everywhere except at degenerate points \((\lambda ,w)\) satisfying \(\theta (\lambda ,w)=0\) and \(\lambda _{i}=w_{i}=0\) for some \(i=1,2,\ldots ,m_{1}\). Moreover, we see from Theorem 2.3 of [17] that \(S(\lambda ,w)=0\) is equivalent to \(\lambda \ge 0\), \(w\ge 0\), \(\lambda ^{T} w=0\). This means that \((x^{*},\lambda ^{*},\nu ^{*})\) is a KKT point of the QVI if and only if \((x^{*},\lambda ^{*},\nu ^{*},w^{*})\) with \(w^{*}=-p(x^{*})\) is a solution of the nonsmooth system of equations

$$ H(x,\lambda ,\nu ,w)=0,\quad \mbox{with } H(x,\lambda ,\nu ,w):= \begin{pmatrix} L(x,\lambda ,\nu ) \\ q(x) \\ p(x)+w \\ S(\lambda ,w) \end{pmatrix}. $$

Associated with the system of \(H(x,\lambda ,\nu ,w)=0\), we consider its natural merit function

$$ \Psi (z):=\frac{1}{2} \bigl\Vert H(z) \bigr\Vert ^{2}, $$

where we set \(z:=(x,\lambda ,\nu ,w)\).

By a direct calculation, we find that the gradient \(\nabla \theta (\lambda ,w)\) of \(\theta (\cdot ,\cdot )\) at \((\lambda ,w)\) can be expressed as follows:

$$ \nabla \theta (\lambda ,w)=\bigl(\partial \theta (\lambda ,w)/\partial \lambda _{1},\ldots ,\partial \theta (\lambda ,w)/\partial \lambda _{m_{1}}, \partial \theta (\lambda ,w)/\partial w_{1},\ldots ,\partial \theta ( \lambda ,w)/\partial w_{m_{1}}\bigr)^{T}, $$
where


$$ \partial \theta (\lambda ,w)/\partial \lambda _{i}=\phi _{i}^{FB}( \lambda _{i},w_{i})v_{\lambda _{i}}, \quad \mbox{with } v_{\lambda _{i}} \in \partial _{\lambda _{i}} \phi _{i}^{FB}(\lambda _{i},w_{i}), $$
and


$$ \partial \theta (\lambda ,w)/\partial w_{i}=\phi _{i}^{FB}(\lambda _{i},w_{i})v_{w_{i}}, \quad \mbox{with } v_{w_{i}}\in \partial _{w_{i}} \phi _{i}^{FB}( \lambda _{i},w_{i}), $$

which means that

$$ \nabla \theta (\lambda ,w)=\bigl[ \underbrace{\operatorname{diag}(v_{\lambda _{1}},v_{\lambda _{2}}, \ldots ,v_{\lambda _{m_{1}}})}_{ V_{\lambda }} \underbrace{\operatorname{diag}(v_{w_{1}},v_{w_{2}}, \ldots ,v_{w_{m_{1}}})}_{V_{w}}\bigr]^{T} \Phi _{FB}(\lambda ,w). $$

Here, \(\partial _{\lambda _{i}} \phi _{i}^{FB}(\lambda _{i},w_{i})\) denotes the partial generalized gradient of \(\phi _{i}^{FB}(\cdot ,w_{i})\) at \(\lambda _{i}\) and \(\partial _{w_{i}} \phi _{i}^{FB}(\lambda _{i},w_{i})\) denotes the partial generalized gradient of \(\phi _{i}^{FB}(\lambda _{i},\cdot )\) at \(w_{i}\), respectively. In particular, if \(\theta (\lambda ,w)=0\), then \(\nabla \theta (\lambda ,w)=0\).

If \(\theta (\lambda ,w)\ne 0\), we can get by a direct calculation

$$ \begin{aligned} \nabla S_{i}(\lambda ,w)&= \biggl( \frac{\partial S_{i}}{\partial \lambda _{1}},\ldots ,\frac{\partial S_{i}}{\partial \lambda _{m_{1}}},\frac{\partial S_{i}}{\partial w_{1}},\ldots , \frac{\partial S_{i}}{\partial w_{m_{1}}} \biggr)^{T} \\ &= \biggl[ \biggl( \frac{\lambda _{i}}{\sqrt{\lambda _{i}^{2}+w_{i}^{2}+2\mu \theta }}-1 \biggr)e_{i}^{T}, \biggl( \frac{w_{i}}{\sqrt{\lambda _{i}^{2}+w_{i}^{2}+2\mu \theta }}-1 \biggr)e_{i}^{T} \biggr]^{T} \\ &\quad {}+ \frac{\mu \nabla \theta (\lambda ,w)}{\sqrt{\lambda _{i}^{2}+w_{i}^{2}+2\mu \theta }} \\ &= \biggl[ \biggl( \underbrace{\frac{\lambda _{i}}{\sqrt{\lambda _{i}^{2}+w_{i}^{2}+2\mu \theta }}}_{a_{i}( \lambda ,w)}-1 \biggr)e_{i}^{T}, \biggl( \underbrace{ \frac{w_{i}}{\sqrt{\lambda _{i}^{2}+w_{i}^{2}+2\mu \theta }}}_{b_{i}( \lambda ,w)}-1 \biggr)e_{i}^{T} \biggr]^{T} \\ &\quad {}+ \underbrace{\frac{\sqrt{2\mu \theta }}{\sqrt{\lambda _{i}^{2}+w_{i}^{2}+2\mu \theta }}}_{c_{i}( \lambda ,w)} \sqrt{\mu } \begin{bmatrix} V_{\lambda }\\ V_{w}\end{bmatrix}\frac{\Phi _{FB}(\lambda ,w)}{ \Vert \Phi _{FB}(\lambda ,w) \Vert },\end{aligned} $$
and


$$ a_{i}^{2}(\lambda ,w)+b_{i}^{2}( \lambda ,w)+c_{i}^{2}(\lambda ,w)=1. $$
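The identity above holds because the three squared numerators \(\lambda _{i}^{2}\), \(w_{i}^{2}\) and \(2\mu \theta\) sum exactly to the common denominator squared. A quick numerical sanity check (a hypothetical instance of ours with made-up values of \(\mu\), \(\theta\), \(\lambda _{i}\), \(w_{i}\)):

```python
import math

# Check a_i^2 + b_i^2 + c_i^2 = 1 at a nondegenerate point (theta != 0).
mu, theta = 1e-5, 0.3            # hypothetical values, not from the paper
lam_i, w_i = 0.7, 0.4
d = math.sqrt(lam_i**2 + w_i**2 + 2.0 * mu * theta)  # common denominator
a = lam_i / d
b = w_i / d
c = math.sqrt(2.0 * mu * theta) / d
print(a**2 + b**2 + c**2)        # -> 1.0 (up to rounding)
```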

Otherwise, i.e., if \(\theta (\lambda ,w)=0\), we have \(\lambda _{i}\geq 0\), \(w_{i}\geq 0\) and \(\lambda _{i}w_{i}=0\) for every i, which means that, if \(\lambda _{i}^{2}+w_{i}^{2}\neq 0\), then

$$ \begin{aligned} \nabla S_{i}(\lambda ,w)&= \biggl( \frac{\partial S_{i}}{\partial \lambda _{1}},\ldots , \frac{\partial S_{i}}{\partial \lambda _{m_{1}}}, \frac{\partial S_{i}}{\partial w_{1}},\ldots , \frac{\partial S_{i}}{\partial w_{m_{1}}} \biggr)^{T} \\ &= \biggl[ \biggl( \frac{\lambda _{i}}{\sqrt{\lambda _{i}^{2}+w_{i}^{2}}}-1 \biggr)e_{i}^{T}, \biggl(\frac{w_{i}}{\sqrt{\lambda _{i}^{2}+w_{i}^{2}}}-1 \biggr)e_{i}^{T} \biggr]^{T}, \end{aligned} $$

and, if \(\lambda _{i}^{2}+w_{i}^{2}=0\), then the element in \(\partial _{C}S_{i}(\lambda ,w)\) takes the form

$$ \bigl[ (a_{i}-1 )e_{i}^{T}, (b_{i}-1 )e_{i}^{T} \bigr]^{T}+c_{i} \sqrt{\mu }\bigl[\operatorname{diag}(\bar{v}_{ \lambda _{1}}, \bar{v}_{\lambda _{2}},\ldots ,\bar{v}_{\lambda _{m_{1}}}) \operatorname{diag}( \bar{v}_{w_{1}},\bar{v}_{w_{2}},\ldots ,\bar{v}_{w_{m_{1}}}) \bigr]^{T}u,$$

where \(\bar{v}_{\lambda _{i}}\in \partial _{\lambda _{i}} \phi _{i}^{FB}( \lambda _{i},w_{i})\), \(\bar{v}_{w_{i}}\in \partial _{w_{i}} \phi _{i}^{FB}(\lambda _{i},w_{i})\), \(\|u\|=1\), and

$$ a_{i}^{2}+b_{i}^{2}+c_{i}^{2} \le 1. $$

Therefore, the partial generalized derivatives of \(S(\lambda ,w)\) with respect to λ and w can be expressed as

$$ U_{\lambda }=\operatorname{diag}(a_{1}-1,a_{2}-1, \ldots ,a_{m_{1}}-1)+\sqrt{\mu }\operatorname{diag}(c_{1},c_{2}, \ldots ,c_{m_{1}})EV_{\lambda }\operatorname{diag}(u) $$
and


$$ U_{w}=\operatorname{diag}(b_{1}-1,b_{2}-1, \ldots ,b_{m_{1}}-1)+\sqrt{\mu }\operatorname{diag}(c_{1},c_{2}, \ldots ,c_{m_{1}})EV_{w}\operatorname{diag}(u), $$

where \(a_{i}^{2}+b_{i}^{2}+c_{i}^{2}\le 1\), \(u\in R^{m_{1}}\) satisfies \(\|u\|=1\), E is the matrix all of whose entries are one, and \(V_{\lambda }\) and \(V_{w}\) are diagonal matrices whose ith diagonal elements belong to \(\partial _{\lambda _{i}} \phi _{i}^{FB}(\lambda _{i},w_{i})\) and \(\partial _{w_{i}} \phi _{i}^{FB}(\lambda _{i},w_{i})\), respectively.

On the basis of the above calculations, we have the following proposition.

Proposition 2.2

Let the mapping H be defined by (2.4). Then the following statements hold:

  (a)

    If F is continuously differentiable and g, h are twice continuously differentiable, then H is semismooth and

    $$ \partial H(x,\lambda ,\nu ,w)\subseteq \left \{ \begin{pmatrix} J_{x}L(x,\lambda ,\nu )& \nabla _{y} g(x,x) & \nabla _{y} h(x,x) & 0 \\ J q(x)&0&0&0 \\ J p(x)&0&0&I \\ 0&U_{\lambda }&0& U_{w} \end{pmatrix} \right \} , $$

    where \(U_{\lambda }\) and \(U_{w}\) are defined by (2.7) and (2.8), respectively.

  (b)

    If, in addition, JF, \(\nabla ^{2}g_{i}\) (\(i=1,\ldots ,m_{1}\)) and \(\nabla ^{2}h_{j}\) (\(j=1,\ldots ,m_{2}\)) are locally Lipschitz, then H is strongly semismooth.

  (c)

    Let the merit function Ψ be defined by (2.5). If F is continuously differentiable and g, h are twice continuously differentiable, then Ψ is continuously differentiable, and its gradient is given by

    $$ \nabla \Psi (z)=V^{T}H(z) $$

    for an arbitrary element \(V\in \partial H(z)\).

Remark 2.1

Consider \(\operatorname{QVI}(\tilde{K}, F)\), where

$$ \tilde{K}(x)\triangleq \bigl\{ y\in R^{n}\mid g(y,x) \le 0\bigr\} . $$

That is, there are no equality constraints in QVI (1.1). Similarly, we can formulate the above problem in terms of the nonsmooth system of equations

$$ \tilde{H}(x,\lambda ,w)=0,\quad \mbox{with } \tilde{H}(x,\lambda ,w):= \begin{pmatrix} \tilde{L}(x,\lambda ) \\ p(x)+w \\ S(\lambda ,w) \end{pmatrix}, $$

where \(\tilde{L}(x,\lambda ):=F(x)+\nabla _{y} g(x,x)\lambda \). Similar to Proposition 2.2, if F is continuously differentiable and g is twice continuously differentiable, then \(\tilde{H}\) is semismooth and

$$ \partial \tilde{H}(x,\lambda ,w)\subseteq \left \{ \begin{pmatrix} J_{x}\tilde{L}(x,\lambda )& \nabla _{y} g(x,x) & 0 \\ J p(x)&0&I \\ 0&U_{\lambda }& U_{w} \end{pmatrix} \right \} , $$

where \(U_{\lambda }\) and \(U_{w}\) are the same as in Proposition 2.2.

Now, we present the semismooth Newton method for (1.1).

Algorithm 1

(Semismooth Newton Method)

  • Step 0. Choose \(z^{0}=(x^{0},\lambda ^{0},\nu ^{0},w^{0})\in R^{n}\times R^{m_{1}} \times R^{m_{2}}\times R^{m_{1}}\), \(\rho >0\), \(\beta \in (0,1)\), \(\sigma \in (0,\frac{1}{2})\), \(p>2\), \(\varepsilon \ge 0\), and set \(k:=0\).

  • Step 1. If \(\|\nabla \Psi (z^{k})\|\le \varepsilon \), stop.

  • Step 2. Choose an arbitrary element \(V_{k}\in \partial H(z^{k})\), and compute \(d^{k}\) as a solution of the linear system of equations

    $$ V_{k}d=-H\bigl(z^{k}\bigr). $$

    If either this system is not solvable or the sufficient decrease condition

    $$ \nabla \Psi \bigl(z^{k}\bigr)^{T}d^{k} \le -\rho \bigl\Vert d^{k} \bigr\Vert ^{p}$$

    is not satisfied, then take \(d^{k}:=-\nabla \Psi (z^{k})\).

  • Step 3. Compute a stepsize \(t_{k}\) as the maximum of the numbers \(\beta ^{l_{k}}\), \(l_{k}=0,1,2,\ldots \) , such that the following Armijo condition holds:

    $$ \Psi \bigl(z^{k}+t_{k}d^{k} \bigr)\le \Psi \bigl(z^{k}\bigr)+\sigma t_{k}\nabla \Psi \bigl(z^{k}\bigr)^{T} d^{k}.$$
  • Step 4. Set \(z^{k+1}:=z^{k}+t_{k}d^{k}\), \(k\leftarrow k+1\), and go to Step 1.
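The steps above can be sketched in a few lines of generic Python (our illustration, not the authors' implementation; the callback names `H` and `V_of`, which supply \(H(z)\) and one element of \(\partial H(z)\), are our own, and the demonstration uses a toy linear system rather than an actual QVI):

```python
import numpy as np

def semismooth_newton(H, V_of, z0, rho=1e-10, beta=0.5, sigma=0.01,
                      p=2.1, eps=1e-8, max_iter=500):
    """Sketch of Algorithm 1: semismooth Newton with gradient fallback."""
    z = np.asarray(z0, dtype=float)
    for _ in range(max_iter):
        Hz = H(z)
        V = V_of(z)
        grad = V.T @ Hz                        # grad Psi(z) = V^T H(z)
        if np.linalg.norm(grad) <= eps:        # Step 1: stationarity test
            return z
        # Step 2: Newton direction V_k d = -H(z^k), fallback to -grad Psi
        try:
            d = np.linalg.solve(V, -Hz)
            if (not np.all(np.isfinite(d))
                    or grad @ d > -rho * np.linalg.norm(d) ** p):
                d = -grad
        except np.linalg.LinAlgError:
            d = -grad
        # Step 3: Armijo backtracking line search on Psi = 0.5 ||H||^2
        Psi = 0.5 * (Hz @ Hz)
        t = 1.0
        while (0.5 * (H(z + t * d) @ H(z + t * d))
               > Psi + sigma * t * (grad @ d) and t > 1e-12):
            t *= beta
        z = z + t * d                          # Step 4
    return z

# Toy demonstration on a smooth linear system with solution (2, 1).
H = lambda z: np.array([z[0] + z[1] - 3.0, z[0] - z[1] - 1.0])
V_of = lambda z: np.array([[1.0, 1.0], [1.0, -1.0]])
print(semismooth_newton(H, V_of, [0.0, 0.0]))  # -> [2. 1.]
```

On this linear example the full Newton step is always accepted; for a genuine QVI, `H` would be the map (2.4) and `V_of` would return a matrix of the form given in Proposition 2.2(a).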


We now establish the global convergence of Algorithm 1.

Theorem 2.3

Let \(\{z^{k}\}=\{(x^{k},\lambda ^{k},\nu ^{k},w^{k})\}\) be a sequence of iterates generated by Algorithm 1. Then every accumulation point of the sequence \(\{z^{k}\}\) is a stationary point of the merit function Ψ.


Proof

We prove the theorem by contradiction. First, if \(d^{k}=-\nabla \Psi (z^{k})\) for all k in an infinite index set N, then, by Proposition 1.16 of [4], any limit point \(z^{*}\) of \(\{z^{k}\}_{N}\) satisfies \(\nabla \Psi (z^{*})=0\).

In the following, we suppose that the direction is always given by (2.11). Suppose that \(\{z^{k}\}\rightarrow z^{*}\) with \(\nabla \Psi (z^{*})\ne 0\). By (2.11), we have

$$ \bigl\Vert H\bigl(z^{k}\bigr) \bigr\Vert = \bigl\Vert V_{k}d^{k} \bigr\Vert \le \Vert V_{k} \Vert \times \bigl\Vert d^{k} \bigr\Vert . $$

Note that \(\|V_{k}\|\) cannot be 0, since otherwise \(H(z^{k})=0\) and \(z^{k}\) would be a stationary point. Hence, we have

$$ \bigl\Vert d^{k} \bigr\Vert \ge \frac{ \Vert H(z^{k}) \Vert }{ \Vert V_{k} \Vert }. $$

If \(\{d^{k}\}_{N}\rightarrow 0\) for some subsequence N, then, by (2.14), \(\{H(z^{k})\}_{N}\rightarrow 0\), and \(z^{*}\) is a solution of QVI (1.1). Hence, there exists an \(m>0\) such that \(\|d^{k}\|\ge m\). Noting that \(\{\nabla \Psi (z^{k})\}\) is bounded and \(p>2\), there exists \(M>0\) such that \(\|d^{k}\|\le M\); otherwise (2.12) would be violated.

Since \(\{\Psi (z^{k})\}\) is nonincreasing by (2.13) and bounded from below, it converges, so \(\{\Psi (z^{k+1})-\Psi (z^{k})\}\rightarrow 0\), which implies

$$ \bigl\{ \beta ^{l_{k}}\nabla \Psi \bigl(z^{k} \bigr)^{T}d^{k}\bigr\} \rightarrow 0. $$

Suppose, taking a subsequence if necessary, that \(\{\beta ^{l_{k}}\}\rightarrow 0\). By (2.13), we have

$$ \frac{\Psi (z^{k}+\beta ^{l_{k}-1}d^{k})-\Psi (z^{k})}{\beta ^{l_{k}-1}}> \sigma \nabla \Psi \bigl(z^{k} \bigr)^{T}d^{k}. $$

By \(m\le \|d^{k}\|\le M\), we can assume, taking a further subsequence if necessary, that \(\{d^{k}\}\rightarrow \bar{d}\ne 0\). Passing to the limit in (2.16), we get

$$ \nabla \Psi \bigl(z^{*}\bigr)^{T} \bar{d}\ge \sigma \nabla \Psi \bigl(z^{*}\bigr)^{T} \bar{d}. $$

On the other hand, by (2.12), we have \(\nabla \Psi (z^{*})^{T}\bar{d}\le -\rho \|\bar{d}\|^{p}<0\), which contradicts (2.17). Hence \(\beta ^{l_{k}}\) is bounded away from 0. Then (2.15) and (2.12) imply that \(\{d^{k}\}\rightarrow 0\), contradicting \(0< m\le \|d^{k}\|\). Therefore \(\nabla \Psi (z^{*})=0\). This completes the proof. □

Remark 2.2

The method proposed in [10] only considers the case of inequality constraints, while our method can solve QVIs with both equality and inequality constraints. Besides, as we will see in the next section, our method can solve some problems in QVILIB [11] which cannot be solved by the method proposed in [10].

3 Numerical experiments

In this section, we report the results obtained by Algorithm 1 on problems listed in QVILIB. All the computations in this paper were done using Matlab 2014a on a computer with 8.00 GB RAM and a 2.5 GHz CPU. We solved all 55 test problems whose detailed descriptions can be found in [11]. For each problem we list

  • the x-part of the starting point (the number reported is the value of all components of the x-part of the starting point);

  • the number of iterations;

  • the number of evaluations of Ψ;

  • the value of \(Y(x,\lambda ,\nu )\) at the termination.

In order to perform the linear algebra involved, we used Matlab’s linear system solver mldivide. If any entry of the solution given by mldivide is NaN or equal to ±∞, or the sufficient decrease condition is not satisfied, then the antigradient direction is used. We take \(\mu =10^{-5}\), \(\beta =0.5\), \(\rho =10^{-10}\), \(\sigma =0.01\) and \(p=2.1\). We choose \(\lambda ^{0}=0\), \(\nu ^{0}=0\) and \(w^{0}=0\) for all problems. For (2.6), we choose \(a_{i}=b_{i}=c_{i}=0\) when \((\lambda _{i},w_{i})= (0,0)\) and \(\theta =0\). Our aim is mainly to verify the reliability of the method and to compare the iteration numbers with the results presented in [10]. In order to perform a fair comparison with the results in [10], we choose the same stopping criterion, i.e., let

$$ Y(x,\lambda ,\nu )=\left \Vert \begin{pmatrix} L(x,\lambda ,\nu ) \\ S(\lambda ,-p(x))\end{pmatrix} \right \Vert _{\infty }, $$

and choose the termination criterion to be \(Y(x^{k},\lambda ^{k},\nu ^{k})\le 10^{-4}\). The iteration is also stopped if the number of iterations exceeds 500 or the stepsize \(t_{k}\) computed at Step 3 is less than \(10^{-6}\).

We denote Algorithm 2.2 proposed in [10] by SSN and compare our method with it. The results are listed in Table 1. From Table 1, the problems that can be solved by SSN can also be solved by our method with almost the same iteration numbers, except for the problems Box2B, Box3A, KunR11, KunR12, KunR21 and KunR22. Moreover, our method can solve the problems RHS1A1, RHS1B1, RHS2A1, RHS2B1 and Wal3, which cannot be solved by SSN.

Table 1 Test results for Algorithm 1 and SSN

We also compare our method with the interior point method (denoted by IP) proposed in [12] in terms of iteration numbers and CPU time. For IP, we use the same parameters as presented in [12] and the results are listed in Table 2. From Table 2, we can see that our method is much more effective than IP for most problems.

Table 2 Test results for Algorithm 1 and IP

We also consider the other problems in QVILIB which are not tested in Tables 1 and 2, including the QVIs with equality constraints, that is, Problems LunSS1 to Scrim12 in Table 3. As we can see from the table, Algorithm 1 can solve over half of those problems effectively.

Table 3 Test results for rest QVIs in QVILIB

We tried some modifications of the algorithm for those problems that cannot be solved. Specifically, when calculating the Jacobian of \(\tilde{H}(\tilde{z})\), we use \(\mathit{JF}(x)\) to approximate JL̃. For now, we cannot prove the convergence of the modified algorithm. However, it is interesting that the modified algorithm can find a solution for some problems, such as MoveSet3A1, MoveSet3A2, MoveSet3B1 and MoveSet3B2. The results are presented in Table 4.

Table 4 Test results for modified algorithm

Figures 1–4 display the performance of our method on the problems BiLin1A, Movset4A1, OutZ40 and Wal2. The vertical axis in these figures represents the value of Y and the horizontal axis the iteration number. As we can see from the figures, the value of Y decreases as the iteration number increases.

Figure 1

Algorithm 1 for BiLin1A

Figure 2

Algorithm 1 for Moveset4A1

Figure 3

Algorithm 1 for OutZ40

Figure 4

Algorithm 1 for Wal2

4 Concluding remarks

In this paper, we have studied the numerical solution of QVIs. We derived the KKT system of a QVI, presented a semismooth Newton method to solve the resulting equations, and established its global convergence. Numerical results show that the performance of the proposed algorithm is promising.