# Convergence of inexact Newton methods for generalized equations

## Authors

DOI: 10.1007/s10107-013-0664-x

Cite this article as: Dontchev, A.L., Rockafellar, R.T.: Math. Program. (2013) 139: 115. doi:10.1007/s10107-013-0664-x


## Abstract

### Keywords

Inexact Newton method · Generalized equations · Metric regularity · Metric subregularity · Variational inequality · Nonlinear programming

### Mathematics Subject Classification (2000)

49J53 · 49K40 · 49M37 · 65J15 · 90C31

## 1 Introduction

Generalized equations, that is, relations of the form (1), \(0 \in f(x) + F(x)\),^{1} have been used to describe in a unified way various problems such as equations (\(F\equiv 0\)), inequalities (\(Y = \mathbb{R }^m\) and \(F \equiv \mathbb{R }^m_{\scriptscriptstyle +}\)), variational inequalities (\(F\) the normal cone mapping \(N_C\) of a convex set \(C\) in \(Y\) or, more broadly, the subdifferential mapping \(\partial g\) of a convex function \(g\) on \(Y\)), and in particular, optimality conditions, complementarity problems and multi-agent equilibrium problems.

*Throughout*, \(X,\,Y\) *and* \(P\) *are (real) Banach spaces, unless stated otherwise*. *For the generalized equation* (1) *we assume that the function* \(f\) *is continuously Fréchet differentiable everywhere with derivative mapping* \(Df\) *and the mapping* \(F\) *has closed nonempty graph*.^{2}

Two issues are essential to assessing the performance of any iterative method: convergence of a sequence it generates, but even more fundamentally, its ability to produce an infinite sequence at all. With iteration (5) in particular there is the potential difficulty that a stage might be reached in which, given \(x_k\), there is no \(x_{k+1}\) satisfying the condition in question, and the calculations come to a halt. When that is guaranteed not to happen, we can speak of the method as being *surely executable*.

In this paper, we give conditions under which the method (5) is surely executable and every sequence generated by it converges with either q-linear, q-superlinear, or q-quadratic rate, provided that the starting point is sufficiently close to the reference solution. We recover, through specialization to (4), convergence results given in [5] and [14]. The utilization of *metric regularity* properties of set-valued mappings is the key to our being able to handle generalized equations as well as ordinary equations. Much about metric regularity is laid out in our book [9], but the definitions will be reviewed in Sect. 2.
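To make the classical inexact-equation setting that (5) generalizes concrete, here is a minimal Python sketch of a Dembo-style inexact Newton iteration on a hypothetical 2×2 system: any step whose linear residual is at most \(\eta \Vert f(x_k)\Vert \) is accepted. The test function `f` and the residual-perturbation rule are our own illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Hypothetical 2x2 test system (not from the paper): roots at (1, 1) and (-1, -1).
def f(x):
    return np.array([x[0]**2 + x[1]**2 - 2.0, x[0] - x[1]])

def Df(x):
    return np.array([[2.0 * x[0], 2.0 * x[1]], [1.0, -1.0]])

def inexact_newton(x0, eta=0.5, tol=1e-10, max_iter=100):
    """Inexact Newton in the spirit of (4): accept any step s_k whose linear
    residual satisfies ||f(x_k) + Df(x_k) s_k|| <= eta * ||f(x_k)||."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        fx = f(x)
        if np.linalg.norm(fx) <= tol:
            break
        # Mimic an inner linear solver stopped early: perturb the right-hand
        # side by a residual r with ||r|| = 0.9 * eta * ||f(x_k)||.
        r = 0.9 * eta * np.linalg.norm(fx) * np.array([1.0, 0.0])
        s = np.linalg.solve(Df(x), -fx + r)
        x = x + s
    return x

x_star = inexact_newton([2.0, 1.5])
```

With the constant forcing term \(\eta = 0.5\) the iterates converge q-linearly rather than quadratically, which is exactly the trade-off quantified by the convergence results below.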

The extension of the *exact* Newton iteration to generalized equations goes back to the PhD thesis of Josephy [13], who proved existence and uniqueness of a quadratically convergent sequence generated by (2) under the condition of strong metric regularity of the mapping \(f+F\). We extend this here to inexact Newton methods of the form (5) and also explore the effects of weaker regularity assumptions.

In [12], the solution \(\bar{x}\) is assumed to be *semistable*, a property introduced in [4] which is related to but different from the regularity properties considered in the present paper. Most importantly, it is assumed in [12, Theorem 2.1] that the mapping \(R_k\) in (5) does not depend on \(k\) and that the following conditions hold:

- (a)
For every \(u\) near \(\bar{x}\) there exists \(x(u)\) solving \((f(u)+Df(u)(x-u) +F(x))\cap R(u, x)\ne \emptyset \) such that \(x(u) \rightarrow \bar{x}\) as \(u \rightarrow \bar{x}\);

- (b)
Every \(\omega \in (f(u)+Df(u)(x-u) +F(x))\cap R(u, x)\) satisfies \(\Vert \omega \Vert = o(\Vert x-u\Vert + \Vert u-\bar{x}\Vert )\) uniformly in \(u \in X \) and \(x\) near \(\bar{x}\).

There are natural cases, however, in which condition (b) *never* holds. Under conditions (a) and (b) above it is demonstrated in [12, Theorem 2.1] that there exists \(\delta > 0\) such that, for any starting point close enough to \(\bar{x}\), there exists a sequence \(\{x_k\}\) satisfying (5) and the bound \(\Vert x_{k+1}-x_k\Vert \le \delta \); moreover, each such sequence is superlinearly convergent to \(\bar{x}\). It is not specified in [12], however, how to find such a constant \(\delta \) in order to identify a convergent sequence.

In contrast to Izmailov and Solodov [12], we show here that under strong metric subregularity only for the mapping \(f +F\) plus certain conditions for the sequence of mappings \(R_k\), *all sequences* generated by the method (5) and staying sufficiently close to a solution \(\bar{x}\), converge to \(\bar{x}\) at a rate determined by a bound on \(R_k\). In particular, we recover the results in [5] and [14]. Strong subregularity of \(f+F\) alone is however not sufficient to guarantee that there exist infinite sequences generated by the method (5) for any starting point close to \(\bar{x}\).

To be more specific about the pattern of assumptions on which we rely, we focus on a particular solution \(\bar{x}\) of the generalized equation (1), so that the graph of \(f+F\) contains \((\bar{x},0)\), and invoke properties of *metric regularity, strong metric subregularity* and *strong metric regularity* of \(f+F\) at \(\bar{x}\) for \(0\) as quantified by a constant \(\lambda \). Metric regularity of \(f+F\) at \(\bar{x}\) for \(0\) is equivalent to a property we call *Aubin continuity* of \((f+F)^{-1}\) at \(0\) for \(\bar{x}\). However, we get involved with Aubin continuity in another way, more directly. Namely, we assume that the mapping \((u,x)\mapsto R_k(u,x)\) has the *partial* Aubin continuity property in the \(x\) argument at \(\bar{x}\) for \(0\), uniformly in \(k\) and \(u\) near \(\bar{x}\), as quantified by a constant \(\mu \) such that \(\lambda \mu <1\).

In that setting in the case of (plain) metric regularity and under a bound for the inner distance \(d(0,R_k(u, \bar{x}))\), we show that for any starting point close enough to \(\bar{x}\) the method (5) is surely executable and moreover generates at least one sequence which is linearly convergent. In this situation however, the method might also generate, through nonuniqueness, a sequence which is not convergent at all. This kind of result for the *exact* Newton method (2) was first obtained in [6]; for extensions see e.g. [11] and [3].

We further take up the case when the mapping \(f+F\) is strongly metrically subregular, making the stronger assumption on \(R_k\) that the outer distance \(d^{\scriptscriptstyle +}(0, R_k(u,x))\) goes to zero as \((u,x) \rightarrow (\bar{x}, \bar{x})\) for each \(k\), entailing \(R_k(\bar{x}, \bar{x}) = \{0\}\), and also that, for a sequence of scalars \(\gamma _k\) and \(u\) close to \(\bar{x}\), we have \(d^{\scriptscriptstyle +}(0,R_k(u,\bar{x})) \le \gamma _k\Vert u-\bar{x}\Vert ^p\) for \(p=1\), or instead \(p=2\). Under these conditions, we prove that every sequence generated by the iteration (5) and staying close to the solution \(\bar{x}\), converges to \(\bar{x}\) q-linearly \((\gamma _k \) bounded and \(p=1)\), q-superlinearly \((\gamma _k \rightarrow 0\) and \(p=1)\) or q-quadratically \((\gamma _k \) bounded and \(p=2)\). The strong metric subregularity, however, does not prevent the method (5) from perhaps getting “stuck” at some iteration and thereby failing to produce an infinite sequence.
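The three q-rates invoked above can be told apart by the behavior of successive error ratios. The short sketch below uses made-up error sequences (not produced by an actual Newton run) to record the diagnostic quantities \(e_{k+1}/e_k\) and \(e_{k+1}/e_k^2\):

```python
def ratios(e, p=1):
    """Successive ratios e_{k+1} / e_k**p: for p = 1 they stay below 1 under
    q-linear convergence and tend to 0 under q-superlinear convergence; for
    p = 2 they stay bounded exactly when convergence is q-quadratic."""
    return [e[k + 1] / e[k] ** p for k in range(len(e) - 1)]

# Illustrative error sequences:
lin = [0.5 ** k for k in range(10)]                    # e_{k+1} = 0.5 * e_k
sup = [2.0 ** (-k * (k + 1) // 2) for k in range(10)]  # ratio 2^{-(k+1)} -> 0
quad = [0.5 ** (2 ** k) for k in range(6)]             # e_{k+1} = e_k ** 2
```

In the theorems below, `lin` corresponds to \(\gamma _k\) bounded with \(p=1\), `sup` to \(\gamma _k \rightarrow 0\) with \(p=1\), and `quad` to \(\gamma _k\) bounded with \(p=2\).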

Finally, in the case of strong metric regularity, we can combine the results for metric regularity and strong metric subregularity to conclude that there exists a neighborhood of \(\bar{x}\) such that, from any starting point in this neighborhood, the method (5) is surely executable and, although the sequence it generates may be not unique, every such sequence is convergent to \(\bar{x}\) either q-linearly, q-superlinearly or q-quadratically, depending on the bound for \(d^{\scriptscriptstyle +}(0, R_k(u,\bar{x}))\) indicated in the preceding paragraph.

For the case of an equation \(f=0\) with a smooth \(f:\mathbb{R }^n \rightarrow \mathbb{R }^n\) near a solution \(\bar{x}\), each of the three metric regularity properties we employ is equivalent to the nonsingularity of the Jacobian of \(f\) at \(\bar{x}\), as assumed in Dembo et al. [5]. Even in this case, however, our convergence results extend those in [5] by passing to Banach spaces and by allowing broader representations of inexactness.

In the recent paper [1], a model of an inexact Newton method was analyzed in which the sequence of mappings \(R_k\) in (5) is just a sequence of elements \(r_k \in Y\) that stand for error in computations. It is shown under metric regularity of the mapping \(f+F\) that if the iterations can be continued without getting stuck, and \(r_k\) converges to zero at certain rate, there exists a sequence of iterates \(x_k\) which is convergent to \(\bar{x}\) with the same r-rate as \(r_k\). This result does not follow from ours. On the other hand, the model in [1] does not cover the basic case in [5] whose extension has been the main inspiration of the current paper.

There is a vast literature on inexact Newton-type methods for solving equations which employ representations of inexactness other than that in Dembo et al. [5]; see, e.g., [2] and the references therein.

In the following section we present background material and some technical results used in the proofs. Section 3 is devoted to our main convergence results. In Sect. 4 we present applications. First, we recover there the result in [5] about linear convergence of the iteration (4). Then we deduce convergence of the exact Newton method (2), slightly improving previous results. We then discuss an inexact Newton method for a variational inequality which extends the model in [5]. Finally, we establish quadratic convergence of the sequential quadratically constrained quadratic programming method.

## 2 Background on metric regularity

Let us first fix the notation. We denote by \(d(x,C)\) the inner distance from a point \(x \in X\) to a subset \(C \subset X\); that is, \(d(x,C) = \inf \,\{ \Vert x-x^{\prime }\Vert \,\big |\,x^{\prime }\in C\}\) whenever \(C \ne \emptyset \) and \(d(x, \emptyset ) = \infty \), while \(d^{\scriptscriptstyle +}(x,C)\) is the outer distance, \(d^{\scriptscriptstyle +}(x,C) = \sup \,\{ \Vert x-x^{\prime }\Vert \,\big |\,x^{\prime }\in C\}\). The excess from a set \(C\) to a set \(D\) is \(e(C,D) = \sup _{x \in C}d(x, D)\) under the convention \(e(\emptyset ,D)=0\) for \(D \ne \emptyset \) and \(e(D,\emptyset )=+\infty \) for any \(D\). A set-valued mapping \(F\) from \(X\) to \(Y\), indicated by \(F: X \rightrightarrows Y\), is identified with its graph \({\mathop {\text{ gph }}\nolimits } F =\{(x,y) \in X\times Y \,|\, y \in F(x)\} \). It has effective domain \({\mathop {\text{ dom }}\nolimits } F=\big \{\,x\in X{\,\big |\,} F(x)\ne \emptyset {\big \}}\) and effective range \({\mathop {\text{ rge }}\nolimits } F= {\big \{\,} y\in Y {\,\big |\,} \exists \, x\,\text{ with }\, F(x)\ni y{\big \}}\). The inverse of a mapping is obtained by reversing all pairs in the graph; then \({\mathop {\text{ dom }}\nolimits } F^{-1}= {\mathop {\text{ rge }}\nolimits } F\).
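On finite subsets of \(\mathbb{R }\) these distance notions are easy to compute. The following small Python sketch (the function names are ours) mirrors the conventions just stated, including the values assigned on empty sets:

```python
import math

def d(x, C):
    """Inner distance d(x, C); by convention d(x, emptyset) = +inf."""
    return min((abs(x - c) for c in C), default=math.inf)

def d_plus(x, C):
    """Outer distance d+(x, C) = sup of |x - x'| over x' in C (C nonempty)."""
    return max(abs(x - c) for c in C)

def excess(C, D):
    """Excess e(C, D) = sup_{x in C} d(x, D), with e(emptyset, D) = 0 for
    nonempty D and e(D, emptyset) = +inf."""
    if not C:
        return 0.0
    return max(d(x, D) for x in C)
```

For instance, with \(C=\{0,4\}\) and \(D=\{3,5\}\) one gets \(e(C,D)=3\), attained at the point \(0 \in C\).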

We start with the definitions of three regularity properties which play the main roles in this paper. The reader can find much more in the book [9], most of which is devoted to these properties.

**Definition 1**

(*metric regularity*) Consider a mapping \(H: X \rightrightarrows Y\) and a point \((\bar{x}, \bar{y}) \in X\times Y\). Then \(H\) is said to be

*metrically regular* at \(\bar{x}\) for \(\bar{y}\) when \(\bar{y}\in H(\bar{x})\) and there is a constant \(\lambda > 0\) together with neighborhoods \(U\) of \(\bar{x}\) and \(V\) of \(\bar{y}\) such that$$\begin{aligned} d(x, H^{-1}(y)) \le \lambda \, d(y, H(x)) \quad \quad \text{ for } \text{ all }\quad (x,y) \in U\times V. \end{aligned}$$(6)

Metric regularity of \(H\) at \(\bar{x}\) for \(\bar{y}\) is known to be equivalent to *linear openness* of \(H\) and to *Aubin continuity* of the inverse \(H^{-1}\), both with the same constant \(\lambda \) but perhaps with different neighborhoods \(U\) and \(V\). Recall that a mapping \(S: Y \rightrightarrows X\) is said to be *Aubin continuous* (or to have the Aubin property) at \(\bar{y}\) for \(\bar{x}\) if \(\bar{x}\in S(\bar{y})\) and there exists \(\lambda >0\) together with neighborhoods \(U\) of \(\bar{x}\) and \(V\) of \(\bar{y}\) such that$$\begin{aligned} e(S(y^{\prime })\cap U, S(y)) \le \lambda \Vert y^{\prime }-y\Vert \quad \quad \text{ for } \text{ all }\quad y^{\prime }, y \in V. \end{aligned}$$

Further, a mapping \(T: Y\times P \rightrightarrows X\) is said to be *partially Aubin continuous* at \(\bar{y}\) for \(\bar{x}\) uniformly in \(p\) around \(\bar{p}\) if \(\bar{x}\in T(\bar{y},\bar{p})\) and there exist \(\lambda >0\) and neighborhoods \(U\) of \(\bar{x},\,V\) of \(\bar{y}\) and \(Q\) of \(\bar{p}\) such that$$\begin{aligned} e(T(y^{\prime },p)\cap U, T(y,p)) \le \lambda \Vert y^{\prime }-y\Vert \quad \quad \text{ for } \text{ all }\quad y^{\prime }, y \in V \ \text{ and }\ p \in Q. \end{aligned}$$

**Definition 2**

(*strong metric regularity*) Consider a mapping \(H: X \rightrightarrows Y\) and a point \((\bar{x}, \bar{y}) \in X\times Y\). Then \(H\) is said to be *strongly metrically regular* at \(\bar{x}\) for \(\bar{y}\) when \(\bar{y}\in H(\bar{x})\) and there is a constant \(\lambda > 0\) together with neighborhoods \(U\) of \(\bar{x}\) and \(V\) of \(\bar{y}\) such that (6) holds together with the property that the mapping \(V\ni y \mapsto H^{-1}(y)\cap U\) is single-valued.

When a mapping \(y \mapsto S(y)\cap U^{\prime }\) is single-valued and Lipschitz continuous on \(V^{\prime }\), for some neighborhoods \(U^{\prime }\) and \(V^{\prime }\) of \(\bar{x}\) and \(\bar{y}\), respectively, then \(S\) is said to have a *Lipschitz localization* around \(\bar{y}\) for \(\bar{x}\). Strong metric regularity of a mapping \(H\) at \(\bar{x}\) for \(\bar{y}\) is then equivalent to the existence of a Lipschitz localization of \(H^{-1}\) around \(\bar{y}\) for \(\bar{x}\). A mapping \(S\) is Aubin continuous at \(\bar{y}\) for \(\bar{x}\) with constant \(\lambda \) and has a single-valued localization around \(\bar{y}\) for \(\bar{x}\) if and only if \(S\) has a Lipschitz localization around \(\bar{y}\) for \(\bar{x}\) with Lipschitz constant \(\lambda \).

Strong metric regularity is the property which appears in the classical inverse function theorem: when \(f:X\rightarrow Y\) is smooth around \(\bar{x}\) then \(f\) is strongly metrically regular if and only if \(Df(\bar{x})\) is invertible.^{3} In Sect. 4 we will give a sufficient condition for strong metric regularity of the variational inequality representing the first-order optimality condition for the standard nonlinear programming problem.
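For a smooth single-valued \(f\), the inverse-function-theorem criterion just stated can be checked numerically: strong metric regularity at \(\bar{x}\) amounts to invertibility of the Jacobian, and \(\lambda = \Vert Df(\bar{x})^{-1}\Vert \) then serves as a regularity constant. A minimal sketch (the helper function is our own; the Jacobian is assumed to be supplied):

```python
import numpy as np

def regularity_constant(J):
    """For a smooth single-valued f with Jacobian J = Df(x_bar), strong metric
    regularity at x_bar holds iff J is invertible, in which case
    lambda = ||J^{-1}|| serves as a regularity constant.  Returns None in the
    (numerically) singular case."""
    if abs(np.linalg.det(J)) < 1e-12:
        return None
    return np.linalg.norm(np.linalg.inv(J), 2)

lam = regularity_constant(np.array([[2.0, 1.0], [1.0, -1.0]]))  # invertible
bad = regularity_constant(np.array([[1.0, 2.0], [2.0, 4.0]]))   # singular
```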

Our next definition is a weaker form of strong metric regularity.

**Definition 3**

(*strong metric subregularity*) Consider a mapping \(H: X \rightrightarrows Y\) and a point \((\bar{x}, \bar{y}) \in X\times Y\). Then \(H\) is said to be

*strongly metrically subregular* at \(\bar{x}\) for \(\bar{y}\) when \(\bar{y}\in H(\bar{x})\) and there is a constant \(\lambda > 0\) together with a neighborhood \(U\) of \(\bar{x}\) such that$$\begin{aligned} \Vert x-\bar{x}\Vert \le \lambda \, d(\bar{y}, H(x)) \quad \quad \text{ for } \text{ all }\quad x \in U. \end{aligned}$$

In the proofs of convergence of the inexact Newton method (5) given in Sect. 3 we use some technical results. The first is the following coincidence theorem from [7] (with a minor adjustment communicated to the authors by A. Ioffe):

**Theorem 1**

- (a)
\(d(\bar{y}, \varPhi (\bar{x})) < c(1 - \kappa \mu )/(2\mu )\);

- (b)
\(d(\bar{x}, \varUpsilon (\bar{y})) < c(1 - \kappa \mu )/2\);

- (c)
\(e(\varPhi (u)\cap {I\!\!B}_{c/\mu }(\bar{y}), \varPhi (v)) \le \kappa \, \rho (u,v)\) for all \(u, v \in {I\!\!B}_c(\bar{x})\) such that \(\rho (u,v) \le c(1-\kappa \mu )/\mu \);

- (d)
\(e(\varUpsilon (u)\cap {I\!\!B}_c(\bar{x}), \varUpsilon (v)) \le \mu \, \rho (u,v)\) for all \(u, v \in {I\!\!B}_{c/\mu }(\bar{y})\) such that \(\rho (u,v) \le c(1-\kappa \mu )\).

To prove the next technical result given below as Corollary 1, we apply the following extension of [1, Theorem 2.1], where the case of strong metric regularity was not included but its proof is straightforward. This is actually a “parametric” version of the Lyusternik-Graves theorem; for a basic statement see [9, Theorem 5E.1].

**Theorem 2**

From this theorem we obtain the following extended version of Corollary 3.1 in [1], the main difference being that here we assume that \(f\) is merely continuously differentiable near \(\bar{x}\), not necessarily with Lipschitz continuous derivative. Here we also suppress the dependence on a parameter, which is not needed, present the result in the form of Aubin continuity, and include the case of strong metric regularity; all this requires certain modifications in the proof, which is therefore presented in full.

**Corollary 1**

*Proof*

If \(f+F\) is strongly metrically regular, then we repeat the above argument but now by applying the strong regularity version of Theorem 2, obtaining constants \(a^{\prime }\) and \(b^{\prime }\) that might be different from \(a\) and \(b\) for metric regularity. \(\square \)

The following theorem is a “parametric” version of [9, Theorem 3I.6]:

**Theorem 3**

*Proof*

We will use the following corollary of Theorem 3.

**Corollary 2**

*Proof*

## 3 Convergence of the inexact Newton method

**Theorem 4**

*Proof*

The induction step repeats the argument used in the first step. Having iterates \(x_i \in {I\!\!B}_{a^{\prime }}(\bar{x})\) from (5) for \(i=0,1,\dots , k-1\) with \(x_0 = u\), we apply Theorem 1 with \(c := t\Vert x_k-\bar{x}\Vert \), obtaining the existence of \(x_{k+1}\) satisfying (5) which is in \( {I\!\!B}_{c}(\bar{x})\subset {I\!\!B}_{a^{\prime }}(\bar{x})\) and \(\Vert x_{k+1}-\bar{x}\Vert \le t\Vert x_k-\bar{x}\Vert \) for all \(k\). \(\square \)

If we assume in addition that \(Df\) is Lipschitz continuous near \(\bar{x}\) and also that \(0 \in R_k(u,x)\) for any \((u,x)\) near \((\bar{x}, \bar{x})\), the above theorem would follow from [9, Theorem 6C.6], where the existence of a quadratically convergent sequence generated by the *exact* Newton method (2) is shown. Indeed, in this case any sequence that satisfies (2) will also satisfy (5).

Under metric regularity of the mapping \(f+F\), even the exact Newton method (2) may generate a sequence which is not convergent. The simplest example of such a case is the inequality \(x \le 0\) in \(\mathbb{R }\), which can be cast as the generalized equation \(0 \in x+\mathbb{R }_{+}\) with the solution \(\bar{x}= 0\). Clearly the mapping \(x \mapsto x + \mathbb{R }_{+}\) is metrically regular at \(0\) for \(0\) but not strongly metrically subregular there. The (exact) Newton method takes the form \(0 \in x_{k+1} +\mathbb{R }_{+}\), and it generates both convergent and non-convergent sequences from any starting point.
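The example just described is easy to simulate: every nonpositive point solves the Newton subproblem \(0 \in x_{k+1} + \mathbb{R }_{+}\), so the behavior of the method depends entirely on how the iterates are selected. A small sketch (the two selection rules are our own illustrations):

```python
def solves_subproblem(x_next):
    """For the generalized equation 0 in x + R_+, the exact Newton subproblem
    at any x_k reduces to 0 in x_{k+1} + R_+, i.e. x_{k+1} <= 0,
    independently of x_k."""
    return x_next <= 0.0

# Selection rule 1: halve the iterate at each step -> converges to x_bar = 0.
conv = [-1.0]
for _ in range(60):
    conv.append(conv[-1] / 2.0)

# Selection rule 2: oscillate between two admissible points -> every iterate
# solves the subproblem, yet the sequence does not converge.
osc = [-1.0 if k % 2 == 0 else -2.0 for k in range(60)]
```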

The following result shows that strong metric subregularity of \(f+F\), together with assumptions for the mappings \(R_k\) that are stronger than in Theorem 4, implies convergence of any sequence generated by the method (5) which starts close to \(\bar{x}\), but cannot guarantee that the method is surely executable.

**Theorem 5**

- (i)
Let \(t\in (0,1)\) and let there exist positive \(\gamma < t(1-\lambda \mu )/\lambda \) and \(\beta \) such that$$\begin{aligned} d^{\scriptscriptstyle +}(0, R_k(u, \bar{x}))\le \gamma \Vert u-\bar{x}\Vert \quad \quad \text{ for } \text{ all }\quad u \in {I\!\!B}_\beta (\bar{x}),\ k=0,1,\dots . \end{aligned}$$(21)

- (ii)
Let there exist a sequence of positive scalars \(\gamma _k {\searrow } 0\), with \(\gamma _0<(1-\lambda \mu )/\lambda \), and \(\beta >0\) such that$$\begin{aligned} d^{\scriptscriptstyle +}(0, R_k(u, \bar{x}))\le \gamma _k \Vert u-\bar{x}\Vert \quad \quad \text{ for } \text{ all }\quad u \in {I\!\!B}_{\beta }(\bar{x}),\ k=0,1,\dots . \end{aligned}$$(23)

- (iii)
Suppose that the derivative mapping \(Df\) is Lipschitz continuous near \(\bar{x}\) with Lipschitz constant \(L\) and let there exist positive scalars \(\gamma \) and \(\beta \) such that$$\begin{aligned} d^{\scriptscriptstyle +}(0, R_k(u, \bar{x}))\le \gamma \Vert u-\bar{x}\Vert ^2 \quad \quad \text{ for } \text{ all }\quad u \in {I\!\!B}_\beta (\bar{x}),\ k=0,1,\dots . \end{aligned}$$(25)

*Proof of (i)*

*Proof of (ii)*

Choose a sequence \(\gamma _k {\searrow } 0\) with \(\gamma _0 < (1-\lambda \mu )/\lambda \) and \(\beta > 0 \) such that (23) holds, and then pick \(\kappa >\lambda \) such that \(\kappa \mu <1\) and \(\gamma _0 < (1-\kappa \mu )/\kappa .\) As in the proof of (i), choose \(a\le \beta \) and \(b\) such that (15) and (17) are satisfied and, since \(R_k(\bar{x}, \bar{x}) = \{0\}\) from (23), adjust \(a\) so that \(R_k(u,x) \subset {I\!\!B}_b(0)\) whenever \(u,x \in {I\!\!B}_a(\bar{x})\).

*Proof of (iii)*

We come to the central result of this paper, whose proof is a combination of the two preceding theorems.

**Theorem 6**

- (i)
Let \(t\in (0,1)\) and let there exist positive \(\gamma < t(1-\lambda \mu )\min \{1/\kappa , 1/\mu \}\) and \(\beta \) such that condition (21) in Theorem 5 holds. Then there exists a neighborhood \(O\) of \(\bar{x}\) such that for any starting point \(x_0 \in O\) the inexact Newton method (5) is sure to generate a sequence which stays in \(O\); such a sequence may not be unique, but every such sequence converges to \(\bar{x}\) q-linearly in the way described in (22);

- (ii)
Let there exist a sequence of positive scalars \(\gamma _k {\searrow } 0\), with \(\gamma _0<(1-\lambda \mu )/\lambda \), and \(\beta > 0\) such that condition (23) in Theorem 5 is satisfied. Then there exists a neighborhood \(O\) of \(\bar{x}\) such that for any starting point \(x_0 \in O\) the inexact Newton method (5) is sure to generate a sequence which stays in \(O\); such a sequence may not be unique, but every such sequence converges to \(\bar{x}\) q-superlinearly;

- (iii)
Suppose that the derivative mapping \(Df\) is Lipschitz continuous near \(\bar{x}\) with Lipschitz constant \(L\) and let there exist positive scalars \(\gamma \) and \(\beta \) such that (25) in Theorem 5 holds. Then for every constant \(C\) satisfying (26) there exists a neighborhood \(O\) of \(\bar{x}\) such that for any starting point \(x_0 \in O\) the inexact Newton method (5) is sure to generate a sequence which stays in \(O\); such a sequence may not be unique, but every such sequence converges q-quadratically to \(\bar{x}\) in the way described in (27).

*Proof*

The statements in (i), (ii) and (iii) follow immediately by combining Theorem 5 with Theorem 4. Now let \(R_k\) have a single-valued localization at \((\bar{x}, \bar{x})\) for \(0\). Choose \(a\) and \(b\) as above and adjust them so that \(R_k(u,x)\cap {I\!\!B}_b(0)\) is a singleton for all \(u, x \in {I\!\!B}_a(\bar{x})\). Recall that in this case the mapping \(x \mapsto R_0(u,x)\cap {I\!\!B}_b(0)\) is Lipschitz continuous on \({I\!\!B}_a(\bar{x})\) with constant \(\mu \), uniformly in \(u \in {I\!\!B}_a(\bar{x})\). Then, observing that \(x_1 = G_{u}^{-1}(-R_0(u,x_1)\cap {I\!\!B}_b(0))\cap {I\!\!B}_a(\bar{x})\) and that the mapping \(x \mapsto G_{u}^{-1}(-R_0(u,x)\cap {I\!\!B}_b(0))\cap {I\!\!B}_a(\bar{x})\) is Lipschitz continuous on \({I\!\!B}_a(\bar{x})\) with Lipschitz constant \(\kappa \mu < 1\), hence a contraction, we conclude that there is only one Newton iterate \(x_1\) from \(x_0\) lying in \({I\!\!B}_a(\bar{x})\). By induction, the same argument applies to each iterate \(x_k\). \(\square \)

## 4 Applications

For the inexact method (4) with \(f\) having Lipschitz continuous derivative near \(\bar{x}\), it is proved in [14, Theorem 6.1.4] that when \(\eta _k {\searrow } 0\) with \(\eta _0 <\bar{\eta }<1\), any sequence of iterates \(\{x_k\}\) starting close enough to \(\bar{x}\) is q-superlinearly convergent to \(\bar{x}\). By choosing \(\gamma _0,\,\beta \) and \(\lambda \) as \(\gamma ,\,\beta \) and \(\lambda \) in the preceding paragraph, and then applying (32) with \(\gamma \) replaced by \(\gamma _k\), this now follows from Theorem 6(ii) without assuming Lipschitz continuity of \(Df\).

If we take \(R_k(u,x) = {I\!\!B}_{\eta \Vert f(u)\Vert ^2}(0)\), we obtain from Theorem 6(iii) q-quadratic convergence, as claimed in [14, Theorem 6.1.4]. We note that Dembo et al. [5] gave results characterizing the rate of convergence in terms of the convergence of relative residuals.

When \(R_k \equiv 0\) in (5), we obtain from the theorems in Sect. 3 convergence results for the exact Newton iteration (2), as shown in Theorem 7 below. The first part of this theorem is a new result which establishes superlinear convergence of any sequence generated by the method under strong metric subregularity of \(f+F\). Under the additional assumption that the derivative mapping \(Df\) is Lipschitz continuous around \(\bar{x}\), we obtain q-quadratic convergence; this is essentially a known result, for weaker versions of which see, e.g., [1, 6] and [9, Theorem 6D.1].

**Theorem 7**

- (i)
There exists a neighborhood \(O\) of \(\bar{x}\) such that for any starting point \(x_0 \in O\) every sequence \(\{x_k\}\) generated by (2) starting from \(x_0\) and staying in \(O\) is convergent q-superlinearly to \(\bar{x}\).

- (ii)
Suppose that the derivative mapping \(Df\) is Lipschitz continuous near \(\bar{x}\). There exists a neighborhood \(O\) of \(\bar{x}\) such that for any starting point \(x_0 \in O\) every sequence \(\{x_k\}\) generated by (2) and staying in \(O\) is q-quadratically convergent to \(\bar{x}\).
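For a one-dimensional variational inequality \(0 \in f(x) + N_C(x)\) with \(C=[0,\infty )\), the exact Newton (Josephy-Newton) subproblem in (2) has a closed-form solution, which makes the behavior described in Theorem 7 easy to observe numerically. The sketch below uses our own test functions: a q-quadratically convergent run at an interior solution, and a boundary solution that the iteration identifies in one step.

```python
def josephy_newton(f, df, x0, n_iter=25):
    """Exact (Josephy-)Newton iteration (2) for the 1-D variational inequality
    0 in f(x) + N_C(x) with C = [0, +inf).  When df(x_k) > 0, the linearized
    subproblem 0 in f(x_k) + df(x_k)(x - x_k) + N_C(x) has the closed-form
    solution x_{k+1} = max(0, x_k - f(x_k)/df(x_k))."""
    x = float(x0)
    for _ in range(n_iter):
        x = max(0.0, x - f(x) / df(x))
    return x

# Interior solution (our test function): f(x) = x^3 - 8, x_bar = 2; the
# constraint is inactive and the iteration is plain Newton (q-quadratic).
x_interior = josephy_newton(lambda x: x**3 - 8.0, lambda x: 3.0 * x**2, x0=3.0)

# Boundary solution (our test function): f(x) = x + 1, x_bar = 0; here
# -f(0) = -1 lies in N_C(0) and the iteration lands on 0 in one step.
x_boundary = josephy_newton(lambda x: x + 1.0, lambda x: 1.0, x0=5.0)
```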

*polyhedral* set \(C\subset \mathbb{R }^n\):

- (a)
the gradients \(\nabla _x g_i(\bar{x})\) for \(i\in I\) are linearly independent,

- (b)
\(\langle w,\nabla ^2_{xx} L(\bar{x},\bar{y})w\rangle >0\) for every nonzero \(w\in M^{{\scriptscriptstyle +}}\) with \(\nabla _{xx}^2L(\bar{x},\bar{y})w {\perp } M^{{\scriptscriptstyle -}}\).

*sequential quadratic programming* (SQP) method.

*sequential quadratically constrained quadratic programming* method. This method has recently attracted the interest of researchers in numerical optimization, mainly because at each iteration it solves a second-order cone programming problem, to which efficient interior-point methods can be applied. The main idea of the method is to use second-order expansions of the constraint functions, so that at each iteration one solves the following optimization problem with a quadratic objective function and quadratic constraints:

In this final section we have presented applications of the theoretical results developed in the preceding sections to standard yet basic problems: solving equations, variational inequalities and nonlinear programming problems. However, there are a number of important variational problems that go beyond these standard models, such as problems in semidefinite programming and copositive programming, not to mention optimal control and PDE-constrained optimization, for which inexact strategies might be numerically very attractive and still wait to be explored. Finally, we did not consider in this paper ways of globalizing inexact Newton methods, which is another avenue for further research.

^{2} Since our analysis is local, one could localize these assumptions around a solution \(\bar{x}\) of (1). Also, in some of the presented results, in particular those involving strong metric subregularity, it is sufficient to assume continuity of \(Df\) only at \(\bar{x}\). Since the paper is already quite involved technically, we will not go into these refinements, in order to simplify the presentation as much as possible.

^{3} The classical inverse function theorem actually gives us more: it shows that the single-valued localization of the inverse is smooth and provides also the form of its derivative.

## Acknowledgments

The authors wish to thank the referees for their valuable comments on the original submission.