JOHANNES KEPLER UNIVERSITY LINZ
Institute of Computational Mathematics

An SQP Method for Mathematical Programs with Vanishing Constraints with Strong Convergence Properties

We propose an SQP algorithm for mathematical programs with vanishing constraints which solves at each iteration a quadratic program with linear vanishing constraints. The algorithm is based on the newly developed concept of Q-stationarity [5]. We demonstrate how Q_M-stationary solutions of the quadratic program can be obtained. We show that all limit points of the sequence of iterates generated by the basic SQP method are at least M-stationary, and by some extension of the method we also guarantee the stronger property of Q_M-stationarity of the limit points.


Introduction
Consider the following mathematical program with vanishing constraints (MPVC)

    min f(x)
    s.t. h_i(x) = 0, i ∈ E,
         g_i(x) ≤ 0, i ∈ I,
         H_i(x) ≥ 0, G_i(x)H_i(x) ≤ 0, i ∈ V,    (1)

with continuously differentiable functions f, h_i, i ∈ E, g_i, i ∈ I, G_i, H_i, i ∈ V and finite index sets E, I and V.
Theoretically, MPVCs can be viewed as standard nonlinear optimization problems, but due to the vanishing constraints, many of the standard constraint qualifications of nonlinear programming are violated at any feasible point x̄ with H_i(x̄) = G_i(x̄) = 0 for some i ∈ V. On the other hand, by introducing slack variables, MPVCs may be reformulated as so-called mathematical programs with complementarity constraints (MPCCs), see [7]. However, this approach is also not satisfactory, as it has turned out that MPCCs are in fact even more difficult to handle than MPVCs. This makes it necessary, both from a theoretical and a numerical point of view, to consider specially tailored algorithms for solving MPVCs. Recent numerical methods follow different directions. A smoothing-continuation method and a regularization approach for MPCCs are considered in [6,10], and a combination of these techniques, a smoothing-regularization approach for MPVCs, is investigated in [2]. In [8,3] a relaxation method has been suggested in order to deal with the inherent difficulties of MPVCs.
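To make the constraint structure concrete, the following small sketch checks feasibility of a point for an MPVC and extracts the bi-active index set at which the standard constraint qualifications typically fail. All function names and the tolerance are illustrative assumptions, not part of the method described in this paper.

```python
# Hypothetical sketch: checking MPVC feasibility, i.e. the constraints
#   h_i(x) = 0 (i in E),  g_i(x) <= 0 (i in I),
#   H_i(x) >= 0,  G_i(x) * H_i(x) <= 0  (i in V),
# given precomputed constraint values. Tolerances are an assumption.

def mpvc_feasible(h_vals, g_vals, G_vals, H_vals, tol=1e-8):
    """Return True if the given constraint values are MPVC-feasible."""
    if any(abs(hv) > tol for hv in h_vals):
        return False
    if any(gv > tol for gv in g_vals):
        return False
    for Gv, Hv in zip(G_vals, H_vals):
        if Hv < -tol or Gv * Hv > tol:
            return False
    return True

def biactive_set(G_vals, H_vals, tol=1e-8):
    """Indices i in V with H_i = G_i = 0 (the set where CQs typically fail)."""
    return [i for i, (Gv, Hv) in enumerate(zip(G_vals, H_vals))
            if abs(Gv) <= tol and abs(Hv) <= tol]

# Example: the first vanishing constraint is bi-active, the point is feasible.
print(mpvc_feasible([0.0], [-1.0], [0.0, 2.0], [0.0, 0.0]))  # True
print(biactive_set([0.0, 2.0], [0.0, 0.0]))                  # [0]
```

The second call isolates exactly the indices at which the combinatorial difficulty of the problem is concentrated.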
In this paper, we carry over a well-known SQP method from nonlinear programming to MPVCs. We proceed in a similar manner as in [4], where an SQP method for MPCCs was introduced by Benko and Gfrerer. The main task of our method is to solve in each iteration step a quadratic program with linear vanishing constraints, the so-called auxiliary problem. Then we compute the next iterate by reducing a certain merit function along a polygonal line connecting the solutions obtained while solving the auxiliary problem.
5. Q_M-stationary, if it is Q-stationary and at least one of the multipliers λ̄ and λ fulfills the M-stationarity condition (5).
The first implication follows from the fact that the multiplier corresponding to S-stationarity fulfills the requirements for both λ̄ and λ. The third implication holds because for (β_1, β_2) = (∅, I_00(x̄)) the multiplier λ̄ fulfills (5), since λ̄^G_i = 0 for i ∈ I_00(x̄). Note that the S-stationarity conditions are nothing else than the Karush-Kuhn-Tucker conditions for the problem (1). As we will demonstrate in the next theorems, a local minimizer is S-stationary only under some comparatively stronger constraint qualification, while it is Q_M-stationary under very weak constraint qualifications.

Before stating the theorems we recall some common definitions. Denoting ℱ(x) := (h(x)^T, g(x)^T, F(x)^T)^T, we see that problem (1) can be rewritten as min f(x) subject to ℱ(x) ∈ D with D := {0}^|E| × R^|I|_− × P^|V|. Recall that the contingent (also tangent) cone to a closed set Ω ⊂ R^m at u ∈ Ω is defined by

    T_Ω(u) := {d ∈ R^m | ∃ t_k ↓ 0, d_k → d : u + t_k d_k ∈ Ω}.

The linearized cone to Ω_V at x̄ ∈ Ω_V is then defined as T^lin_{Ω_V}(x̄) := {d ∈ R^n | ∇ℱ(x̄)d ∈ T_D(ℱ(x̄))}. Further recall that x̄ ∈ Ω_V is called B-stationary if

    ∇f(x̄)d ≥ 0 for all d ∈ T_{Ω_V}(x̄).

Every local minimizer is known to be B-stationary.
Definition 2.2. Let x̄ be feasible for (1), i.e. x̄ ∈ Ω_V. We say that the generalized Guignard constraint qualification (GGCQ) holds at x̄ if the polar cone of T_{Ω_V}(x̄) equals the polar cone of T^lin_{Ω_V}(x̄).
Note that these two theorems together also imply that a local minimizer x̄ ∈ Ω_V is S-stationary provided GGCQ is fulfilled at x̄ and there exists a partition (β_1, β_2) ∈ P(I_00(x̄)) such that for every j ∈ β_1 there exists z^j fulfilling (9) and z̄ fulfilling (10).
Moreover, note that (9) and (10) are fulfilled for every partition (β_1, β_2) ∈ P(I_00(x̄)), e.g. if the gradients of the active constraints are linearly independent. On the other hand, in the special case of the partition (∅, I_00(x̄)) ∈ P(I_00(x̄)), these conditions read as the requirement that a certain system has a solution, which resembles the well-known Mangasarian-Fromovitz constraint qualification (MFCQ) of nonlinear programming and seems to be a rather weak and often fulfilled assumption.
Finally, we recall the definitions of normal cones. The regular normal cone to a closed set Ω ⊂ R^m at u ∈ Ω can be defined as the polar cone to the tangent cone,

    N̂_Ω(u) := (T_Ω(u))°.

The limiting normal cone to a closed set Ω ⊂ R^m at u ∈ Ω is given by

    N_Ω(u) := {λ ∈ R^m | ∃ u_k → u, λ_k → λ : u_k ∈ Ω, λ_k ∈ N̂_Ω(u_k)}.

In case Ω is a convex set, the regular and the limiting normal cone coincide with the classical normal cone of convex analysis, i.e.

    N̂_Ω(u) = N_Ω(u) = {λ ∈ R^m | λ^T(v − u) ≤ 0 for all v ∈ Ω}.
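For convex sets the normal cone can be checked computationally via the projection characterization λ ∈ N_Ω(u) iff the Euclidean projection of u + λ onto Ω is u. The following sketch (function names are my own) verifies this for the nonpositive orthant R²₋, one of the convex pieces used later for the vanishing constraints.

```python
# Sketch under stated assumptions: for convex Ω, lambda is in N_Omega(u)
# iff proj_Omega(u + lambda) == u. We test this for Omega = R^2_-.

def proj_P2(v):
    # Projection onto the nonpositive orthant: clip positive parts to 0.
    return (min(v[0], 0.0), min(v[1], 0.0))

def in_normal_cone_P2(u, lam, tol=1e-12):
    p = proj_P2((u[0] + lam[0], u[1] + lam[1]))
    return abs(p[0] - u[0]) <= tol and abs(p[1] - u[1]) <= tol

u = (0.0, -1.0)                           # boundary point: first component active
print(in_normal_cone_P2(u, (2.0, 0.0)))   # True: normals point along +e1
print(in_normal_cone_P2(u, (0.0, 1.0)))   # False: second component is inactive
```

At an interior point only the zero vector is normal, which the same check confirms.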
Well known is also the following description of the limiting normal cone. We conclude this section with a characterization of M- and Q-stationarity via the limiting normal cone: straightforward calculations show that the M-stationarity conditions (4) and (5), as well as the Q-stationarity conditions (4) and (6), can be restated in terms of limiting normal cones, where for (β_1, β_2) ∈ P(I_00(x̄)) we define suitable sets. Note also the corresponding relation for every i ∈ V.

Solving the auxiliary problem

In this section, we describe an algorithm for solving quadratic problems with vanishing constraints of the type (18). Here the vector θ = (θ^g, θ^G, θ^H) ∈ {0, 1}^{|I|+2|V|} =: B is chosen at the beginning of the algorithm such that some feasible point is known in advance, e.g. (s, δ) = (0, 1). The parameter ρ has to be chosen sufficiently large and acts like a penalty parameter forcing δ to be near zero at the solution. B is a symmetric positive definite n × n matrix, ∇f, ∇h_i, ∇g_i, ∇G_i, ∇H_i denote row vectors in R^n and h_i, g_i, G_i, H_i are real numbers. Note that this problem is a special case of problem (1) and consequently the definition of Q- and Q_M-stationarity as well as the definition of the index sets (2) remain valid.
It turns out to be much more convenient to operate with a more general notation. Let us denote by F_i := (−H_i, G_i)^T a vector in R^2, by ∇F_i := (−∇H_i^T, ∇G_i^T)^T a 2 × n matrix and by P_1 := {0} × R and P_2 := R^2_− two subsets of R^2. Note that for P given by (7) it holds that P = P_1 ∪ P_2. The problem (18) can now be equivalently rewritten in the form (19). For a given feasible point (s, δ) of the problem QPVC(ρ) we define further index sets, in particular I_1(s, δ), in terms of the index sets I_0+(s, δ), I_+0(s, δ), I_+−(s, δ), I_0−(s, δ), I_00(s, δ) given by (2).
Further, consider the distance function d defined by

    d(u, A) := min_{v ∈ A} ‖u − v‖ for u ∈ R^2 and closed A ⊂ R^2.

The following proposition summarizes some well-known properties of d.
In particular, d(·, A) : R^2 → R_+ is Lipschitz continuous with Lipschitz modulus L = 1. Due to the disjunctive structure of the auxiliary problem we can subdivide it into several QP-pieces. For every partition (V_1, V_2) ∈ P(V) we define the convex quadratic problem QP(ρ, V_1). At the solution (s, δ) of QP(ρ, V_1) there is a corresponding multiplier λ(ρ, V_1) = (λ^h, λ^g, λ^H, λ^G) and a number λ^δ ≥ 0 with λ^δ δ = 0 fulfilling the KKT conditions. Since P_1 and P_2 are convex sets, the above normal cones are given by (12).
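The distances to the two pieces P_1 = {0} × R and P_2 = R²₋ admit closed-form projections, so d(·, P) for the union is simply the minimum of the two. A small illustrative sketch (function names are my own):

```python
import math

# Sketch: Euclidean distances to P1 = {0} x R, P2 = R^2_- and to
# P = P1 ∪ P2, the sets used for the reformulated vanishing
# constraints F_i(x) = (-H_i(x), G_i(x)).

def dist_P1(u):
    # Projection onto {0} x R keeps the second component, zeroes the first.
    return abs(u[0])

def dist_P2(u):
    # Projection onto the nonpositive orthant clips positive parts.
    return math.hypot(max(u[0], 0.0), max(u[1], 0.0))

def dist_P(u):
    # Distance to a union of closed sets is the minimum of the distances.
    return min(dist_P1(u), dist_P2(u))

print(dist_P((1.0, 1.0)))   # 1.0: P1 is closer than P2 (sqrt(2))
print(dist_P((-1.0, 2.0)))  # 1.0: via P1; the P2-distance is 2.0
print(dist_P((0.5, -3.0)))  # 0.5: both pieces give 0.5 here
```

Each of the three functions is Lipschitz with modulus 1, matching the property stated above.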
The definition of the problem QP(ρ, V_1) allows the following interpretation of Q-stationarity, which is a direct consequence of (15) and (16).
Finally, let us denote by δ̄(V_1) the objective value at a solution of the problem

    min_{(s,δ) ∈ R^{n+1}} δ subject to the constraints of (23).    (29)

An outline of the algorithm for solving QPVC(ρ) is as follows.
Compute (s^{t+1}, δ^{t+1}) as the solution and λ^{t+1} as the corresponding multiplier of the first of these problems whose solution differs from (s^t, δ^t), set V_1^{t+1} to the corresponding index set and increase the counter t of pieces by 1. If δ^t > δ^{t−1}, perform a restart: set ρ := ρ̄ρ and go to step 1.
3: Check for degeneracy: If δ^t < ζ, set N := t, stop the algorithm and return. Else, if the non-degeneracy condition

    min{δ̄(I_1(s^t, δ^t)), δ̄(I_1(s^t, δ^t) ∪ I_00(s^t, δ^t))} < ζ    (33)

is fulfilled, perform a restart: set ρ := ρ̄ρ and go to step 1. Else stop the algorithm because of degeneracy.
We first summarize some consequences of the Initialization step. 1. The vector θ is chosen in such a way that for all i ∈ V the corresponding feasibility relation holds. 2. The partition (V_1^1, V_2^1) is chosen in such a way that for j = 1, 2 conditions (21) and (20) hold.
Proof. 1. The algorithm is obviously finite unless we perform a restart and hence increase ρ. Thus we can assume that ρ is sufficiently large, say ρ ≥ C_ρ(V_1) for every V_1, with C_ρ(V_1) given by the previous lemma. However, this means, taking into account also the preceding proposition, that we do not perform a restart in step 1 or step 2. On the other hand, since we enter step 3 with δ^t = δ̄(I_1(s^t, δ^t)) = δ̄(I_1(s^t, δ^t) ∪ I_00(s^t, δ^t)), we either terminate the algorithm with δ^t < ζ, if the non-degeneracy condition (33) is fulfilled, or we terminate the algorithm because of degeneracy. This finishes the proof.
2. The statement regarding stationarity follows easily from the fact that we enter step 3 of the algorithm only when (s, δ) is a solution of the problems (32), which means that it is also Q-stationary with respect to (∅, I_00(s^N, δ^N)) by Lemma 3.1. Thus, (s, δ) is also Q_M-stationary for problem (19). The claim about δ follows from the assumption that Algorithm 3.1 is not terminated because of degeneracy.
We conclude this section with the following proposition that brings together the basic properties of Algorithm 3.1. 1. For all t = 1, . . . , N the points (s^{t−1}, δ^{t−1}) and (s^t, δ^t) are feasible for the problem QP(ρ, V_1^t), and the point (s^t, δ^t) is also the solution of the convex problem QP(ρ, V_1^t).

2. For all t = 1, . . . , N it holds that 1 = δ^0 ≥ δ^{t−1} ≥ δ^t ≥ 0.
3. There exists a constant C_t, dependent only on the number of constraints, such that the stated bound holds.

Proof. 1. By the definitions of the problems QPVC(ρ) and QP(ρ, V_1) it follows that a point (s, δ) feasible for QP(ρ, V_1) is also feasible for QPVC(ρ). The point (s^0, δ^0) is clearly feasible for QP(ρ, V_1^1), and hence as an induction hypothesis we assume that (s^{t−1}, δ^{t−1}) is feasible for QP(ρ, V_1^t). But then (s^t, δ^t) is also feasible for QP(ρ, V_1^t) and consequently also for QPVC(ρ) by its definition. Thus we obtain that (41) holds true with (s, δ) = (s^t, δ^t) and V_1 = V_1^t, and it remains to show that (41) also holds true with (s, δ) = (s^t, δ^t) and V_1 = V_1^{t+1}. Since V_1^{t+1} is defined by one of the index sets of (31)-(32), in the case V_1^{t+1} := V_1^t ∩ (I_1(s, δ) ∪ I_00(s, δ)) we use (41) with (s, δ) = (s^t, δ^t) and V_1 = V_1^t to conclude this, while in the three remaining cases it follows directly. The induction now completes the argument. Obviously (s^t, δ^t) is the solution of QP(ρ, V_1^t) by definition. 2. The statement follows from δ^0 = 1, from the fact that we perform a restart whenever δ^t > δ^{t−1} occurs, and from the constraint −δ ≤ 0.
3. Since, whenever the parameter ρ is increased, the algorithm goes to step 1 and thus the counter t of the pieces is reset to 0, it follows that after the last time the algorithm enters step 1 we keep ρ constant. It is obvious that all the index sets V_1^t are pairwise different, implying that the maximum number of switches to a new piece is 2^|V|.
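The bound 2^|V| comes from the fact that a partition (V_1, V_2) of V is determined by the subset V_1 alone. A quick enumeration for a small V confirms the count (illustrative only):

```python
from itertools import combinations

# A partition (V1, V2) of V is determined by the subset V1, so there
# are exactly 2**|V| partitions. Enumerate them for a small V.

def partitions(V):
    V = list(V)
    for r in range(len(V) + 1):
        for V1 in combinations(V, r):
            V2 = tuple(i for i in V if i not in V1)
            yield V1, V2

V = {0, 1, 2}
parts = list(partitions(V))
print(len(parts))  # 8 == 2**3
```

Since the algorithm never revisits a piece between restarts, this count is exactly the worst-case number of QP-pieces it can touch.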

The basic SQP algorithm for MPVC
An outline of the basic algorithm is as follows.
1: Initialization: Select a starting point x^0 ∈ R^n together with a positive definite n × n matrix B^0, a parameter ρ^0 > 0 and constants ζ ∈ (0, 1) and ρ̄ > 1. Select positive penalty parameters. Set the iteration counter k := 0.
2: Solve the auxiliary problem: Run Algorithm 3.1 to obtain the points s^0, s^1, . . . , s^N.
3: Next iterate: Compute new penalty parameters σ^k. Set x^{k+1} := x^k + s^k, where s^k is a point on the polygonal line connecting the points s^0, s^1, . . . , s^N such that an appropriate merit function depending on σ^k is decreased. Set ρ^{k+1} := ρ, the final value of ρ in Algorithm 3.1. Update B^k to get a positive definite matrix B^{k+1}. Set k := k + 1 and go to step 2.
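The control flow of the outer loop can be sketched schematically. All subroutines below are hypothetical stand-ins (the real method solves a QP with vanishing constraints and searches a polygonal line of QP-piece solutions); the sketch only mirrors the loop structure above.

```python
# Schematic sketch of the outer SQP loop; solve_auxiliary, line_search
# and update_B are placeholder callables, not the paper's implementation.

def sqp_mpvc(x0, solve_auxiliary, line_search, update_B, max_iter=50):
    x, B, rho = x0, None, 1.0
    for k in range(max_iter):
        # Step 2: solve the auxiliary problem at the current iterate.
        pieces, rho = solve_auxiliary(x, B, rho)   # points s^0, ..., s^N
        if not pieces:          # degenerate / no progress: stop
            return x
        # Step 3: reduce a merit function along the polygonal line.
        s = line_search(x, pieces)
        if s is None:           # expected decrease negligible: stop
            return x
        x = tuple(xi + si for xi, si in zip(x, s))
        B = update_B(B)
    return x

# Toy run: the "auxiliary problem" just proposes a step halfway to the origin.
aux = lambda x, B, rho: ([tuple(-0.5 * xi for xi in x)], rho)
ls = lambda x, pieces: pieces[-1] if max(abs(v) for v in pieces[-1]) > 1e-6 else None
result = sqp_mpvc((4.0, -2.0), aux, ls, lambda B: B)
print(result)  # a point very close to (0.0, 0.0)
```

The toy run only exercises the termination logic; it carries none of the vanishing-constraint structure.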
Remark 4.1. We terminate Algorithm 4.1 only in the following two cases. In the first case no sufficient reduction of the violation of the constraints can be achieved. The second case occurs only by chance, when the current iterate is a Q_M-stationary solution. Normally, this algorithm produces an infinite sequence of iterates and we must include a stopping criterion for convergence. Such a criterion could be that the violation of the constraints at some iterate is sufficiently small, where F_i is given by (7), and the expected decrease of our merit function is sufficiently small, see Proposition 4.1 below.

The next iterate
Denote the outcome of Algorithm 3.1 at the k-th iterate by (s_k^t, δ_k^t), t = 0, . . . , N_k, with corresponding multipliers. The new penalty parameters are computed by (42), with the maximum being taken over t ∈ {1, . . . , N_k}.

The merit function
We are looking for the next iterate on the polygonal line connecting the points s_k^0, s_k^1, . . . , s_k^{N_k}. For brevity we abbreviate the constraint data at x^k by h_i := h_i(x^k), i ∈ E, etc., and we further denote by φ_k^t the merit function and by φ̂_k^t its first order approximation. Lemma 4.1.
2. For every t ∈ {1, . . . , N_k} the function φ̂_k^t is a first order approximation of φ_k^t. Proof. 1. By convexity of P_1 and P_2, φ̂_k^t is convex, because it is a sum of convex functions. 2. By Lipschitz continuity of the distance function with Lipschitz modulus L = 1 we conclude the stated estimate, and hence the assertion follows.
We now state the main result of this subsection. For the sake of simplicity we omit the iteration index k in this part.
Proof. Fix t ∈ {1, . . . , N_k} and note that s^0 = 0. For j = 0, 1 consider r^{t+j}_{1−j} defined by (45). Using that s^τ is the solution of QP(ρ, V_1^τ), multiplying the first order optimality condition (24) by (s^τ − s^{τ−1})^T, summing up the resulting expression on the left hand side from τ = 1 to t, subtracting it from the right hand side of (48) and taking into account the corresponding identity, we obtain an estimate for j = 0, 1. First, we claim that (51) holds. Consider i ∈ V and τ ∈ {1, . . . , t} with i ∈ V_1^τ. By the feasibility of (s^τ, δ^τ) and (s^{τ−1}, δ^{τ−1}) for (27), and by (12), we conclude (52). Analogous argumentation yields (52) also for i, τ with i ∈ V_2^τ, and since V_1^τ, V_2^τ form a partition of V, the claimed inequality (51) follows.
Further, we claim that (53) holds for j = 0, 1. This follows from the feasibility of (s^t, δ^t) for the respective problem QP(ρ, V_1^{t+j}), using (34) and (22). Finally, (55) holds due to the fact that V_1^1, V_2^1 form a partition of V, together with (35). Similar arguments as above show the analogous estimate. Taking this into account and putting together (50), (51), (53) and (55), we obtain the stated bound for j = 0, 1, and hence (46) and (47) follow by monotonicity of δ and (44). This completes the proof.

Searching for the next iterate
We choose the next iterate as a point on the polygonal line connecting the points s_k^0, . . . , s_k^{N_k}. First we parametrize this line by its length as a curve ŝ_k : [0, 1] → R^n in the following way. We define t_k(1) := N_k and, for every γ ∈ [0, 1), we denote by t_k(γ) the smallest number t such that S_k^t > γ S_k^{N_k}, and we set α_k(1) := 1. Now consider some sequence of positive numbers γ_1^k = 1, γ_2^k, γ_3^k, . . . with 1 > γ̄ ≥ γ_{j+1}^k/γ_j^k ≥ γ̲ > 0 for all j ∈ N. Consider the smallest j, denoted by j(k), such that condition (57) holds for some given constant. Then the new iterate is given by x^{k+1} := x^k + ŝ_k(γ_{j(k)}^k). The following relations (58) are direct consequences of the properties of φ_k^t and φ̂_k^t. The last property holds due to Proposition 4.1 and the relation between Z_k(γ) − Z_k(0) and φ̂_k^1(0). We recall that r_{k,0}^t and r_{k,1}^t are defined by (45). Lemma 4.2. The new iterate x^{k+1} is well defined.
Proof. In order to show that the new iterate is well defined, we have to prove the existence of some j such that (57) is fulfilled. Note that t_k(γ) = t_k(0) whenever γ S_k^{N_k} ≤ δ_k. Since lim_{j→∞} γ_j^k = 0, we can choose j sufficiently large to fulfill γ_j^k S_k^{N_k} < min{δ_k, S_k^{t_k(0)}}, and then t_k(γ_j^k) = t_k(0). Then, by the second property of (58) and by (59), (57) is fulfilled for this j and the lemma is proved.
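The search over the geometrically shrinking sequence γ_j resembles a backtracking line search: shrink the trial parameter by a fixed factor until a sufficient-decrease test holds. The sketch below is a generic stand-in (the Armijo-type test and all names are my own, not condition (57) itself):

```python
# Generic backtracking sketch: try gamma = 1 and shrink geometrically
# (gamma_{j+1} = q * gamma_j, q in (0,1)) until a sufficient-decrease
# test on a scalar merit model holds.

def select_gamma(merit, predicted_decrease, q=0.5, c=0.1, max_tries=60):
    gamma = 1.0
    m0 = merit(0.0)
    for _ in range(max_tries):
        if merit(gamma) <= m0 + c * gamma * predicted_decrease:
            return gamma
        gamma *= q
    return None  # no acceptable step found

# Toy model: the full step overshoots the minimizer, so one halving is needed.
phi = lambda g: (1.0 - 3.0 * g) ** 2
gamma = select_gamma(phi, predicted_decrease=-2.0)
print(gamma)  # 0.5
```

The geometric shrinking guarantees termination of the loop whenever the predicted decrease is genuinely negative, mirroring the well-definedness argument of Lemma 4.2.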

Convergence of the basic algorithm
We consider the behavior of Algorithm 4.1 when it does not prematurely stop and generates an infinite sequence of iterates x^k. Note that δ_k^{N_k} < ζ. We discuss the convergence behavior under the following assumption. Assumption 1.
1. There exist constants C_x, C_s, C_λ bounding the iterates, the steps and the multipliers. For our convergence analysis we need one more merit function. Proof. The first claim follows from the definitions of Φ_k and Y_k and an estimate which holds by (20). The second claim follows from (35).
A simple consequence of the way we define the penalty parameters in (42) is the following lemma.
Proof. Take k̄ from Lemma 4.4. Then for k ≥ k̄ the sequence Φ_k(x^k) is monotonically decreasing and therefore convergent, because it is bounded below by Assumption 1, and the assertion follows.

Proof. We prove (63) by contradiction. Assuming on the contrary that (63) does not hold, and taking into account Ŷ_k(1) − Ŷ_k(0) ≤ 0 by Proposition 4.1, there exists a subsequence K = {k_1, k_2, . . .} such that Ŷ_k(1) − Ŷ_k(0) ≤ r̄ < 0 for all k ∈ K. By passing to a subsequence we can assume that for all k ∈ K we have k ≥ k̄, with k̄ given by Lemma 4.4, and N_k = N̄, where we have taken into account (40). By passing to a subsequence once more we can also assume that r_{k,1}^t and r_{k,0}^t, defined by (45), converge to limits r̄_1^t, r̄_0^t. Note that r̄_1^N̄ ≤ r̄ < 0.

Let us first consider the case S̄^N̄ = 0. There exists δ > 0 such that for the next iterate we have j(k) = 1 and hence γ_{j(k)}^k = 1, contradicting (62). Now consider the case S̄^N̄ ≠ 0 and let us define the number τ̄ := max{t | S̄^t = 0} + 1. Note that Proposition 4.1 yields r̄^t := max{r̄_0^t, r̄_1^t} < 0 for t > τ̄, and therefore r̄ := max_{t>τ̄} r̄^t < 0. By passing to a subsequence we can assume that for every t > τ̄ and every k ∈ K we have r_{k,0}^t, r_{k,1}^t ≤ r̄^t/2. Now assume that for infinitely many k ∈ K we have γ_{j(k)}^k S_k^N̄ ≥ S_k^τ̄, i.e. t_k(γ_{j(k)}^k) > τ̄. Then we obtain a contradiction with (62). Hence for all but finitely many k ∈ K, without loss of generality for all k ∈ K, we have γ_{j(k)}^k S_k^N̄ < S_k^τ̄. There exists δ > 0 such that t_k(γ) = τ̄ whenever γ S_k^N̄ ≤ δ. By eventually choosing δ smaller we can assume δ ≤ S̄^τ̄/2 and, by passing to a subsequence if necessary, we can also assume that (67) holds for all k ∈ K. Now let for each k the index j̄(k) denote the smallest j with γ_j^k S_k^N̄ ≤ δ. It obviously holds that γ_{j̄(k)−1}^k S_k^N̄ > δ, and by (67) we obtain the corresponding bound. Taking this into account together with (66) and γ_{j̄(k)}^k S_k^N̄ ≤ δ, we can proceed as in the proof of Lemma 4.2 to show that j̄(k) fulfills (57).
However, this yields j̄(k) ≥ j(k) by the definition of j(k), and hence t_k(γ_{j(k)}^k) = t_k(γ_{j̄(k)}^k) = τ̄. But then we also have α_k(γ_{j(k)}^k) ≥ α_k(γ_{j̄(k)}^k) ≥ γ̲δ/(4S̄^τ̄), and from (57) we obtain a contradiction with (62), and so (63) is proved. Condition (64) now follows from (63) by the corresponding estimate.

Now we are ready to state the main result of this section.

Proof. Let x̄ denote a limit point of the sequence x^k and let K denote a subsequence such that lim_{k∈K} x^k = x̄. Further, let λ be a limit point of the bounded sequence λ_k^{N_k} and assume without loss of generality that lim_{k∈K} λ_k^{N_k} = λ. First we show feasibility of x̄ for the problem (1) together with λ_i^g ≥ 0 = λ_i^g g_i(x̄), i ∈ I, and (λ^H, λ^G) ∈ N_{P^{|V|}}(F(x̄)).
Consider i ∈ I. For all k the corresponding inequalities hold, hence λ_i^g ≥ 0 = λ_i^g g_i(x̄). Similar arguments show the analogous relations for every i ∈ E. Finally, consider i ∈ V. Taking into account (22), (34) and δ_k^{N_k} ≤ ζ, we obtain that the constraint residuals vanish in the limit. Hence, ∇F_i(x^k) s_k^{N_k} → 0 by Proposition 4.2 implies F_i(x̄) ∈ P, showing the feasibility of x̄. Moreover, the previous arguments also imply the corresponding convergence of the multipliers. Taking into account (14) and the fact that λ_k^{N_k} fulfills the M-stationarity conditions at (s_k^{N_k}, δ_k^{N_k}), this together with (λ_k^{H,N_k}, λ_k^{G,N_k}) → (λ^H, λ^G) for k ∈ K, (69), and (13) yields (λ^H, λ^G) ∈ N_{P^{|V|}}(F(x̄)), and consequently (68) follows.
Moreover, by the first order optimality condition we have the stationarity equation for each k, and by passing to the limit, taking into account that B_k s_k^{N_k} → 0 by Proposition 4.2, we obtain the limiting stationarity equation. Hence, invoking (14) again, this together with the feasibility of x̄ and (68) implies M-stationarity of x̄, and the proof is complete.

The extended SQP algorithm for MPVC
In this section we investigate what can be done in order to secure Q_M-stationarity of the limit points. First, note that in the proof of M-stationarity of the limit points in Theorem 4.1 we exploited only the M-stationarity of the solutions of the auxiliary problems. Further, recalling the comments after Lemma 3.1, the solution (s, δ) of QP(ρ, I_1(s, δ) ∪ I_00(s, δ)) is M-stationary for the auxiliary problem. Thus, in Algorithm 3.1 for solving the auxiliary problem, it is sufficient to consider only the last of the four problems (31),(32). Moreover, the definition of the limiting normal cone (11) reveals that, in general, the limiting process abolishes any stationarity stronger than M-stationarity, even S-stationarity.
Nevertheless, in practical situations it is likely that some assumption securing that a stronger stationarity is preserved in the limiting process may be fulfilled. E.g., let x̄ be a limit point of x^k. If we assume that for all k sufficiently large it holds that I_00(x̄) = I_00(s_k^{N_k}, δ_k^{N_k}), then x̄ is at least Q_M-stationary for (1). This follows easily, since now the required sign conditions hold for all i ∈ I_00(x̄). This observation suggests that, to obtain stronger stationarity of a limit point, the key is to correctly identify the bi-active index set at the limit point, and it serves as a motivation for the extended version of our SQP method. Before we can discuss the extended version, we summarize some preliminary results.

Preliminary results
Let a : R^n → R^p and b : R^n → R^q be continuously differentiable. Given a vector x ∈ R^n we define the linear problem LP(x). Note that d = 0 is always feasible for this problem. Next we define the set A := {x ∈ R^n | a(x) = 0, b(x) ≤ 0}. Let x̄ ∈ A and recall that the Mangasarian-Fromovitz constraint qualification (MFCQ) holds at x̄ if the matrix ∇a(x̄) has full row rank and there exists a vector d ∈ R^n such that ∇a(x̄)d = 0 and ∇b_i(x̄)d < 0 for all active indices i. Moreover, for a matrix M we denote by ‖M‖_p the induced norm given by ‖M‖_p := max{‖Md‖_p | ‖d‖_p ≤ 1}, and we omit the index p in case p = 2.
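Assuming ‖M‖_p denotes the induced operator norm as above, the cases p = 1 and p = ∞ have well-known closed forms (maximum absolute column sum and maximum absolute row sum, respectively), which the following small sketch computes:

```python
# Induced matrix norms: ||M||_inf is the max absolute row sum,
# ||M||_1 is the max absolute column sum.

def norm_inf(M):
    return max(sum(abs(v) for v in row) for row in M)

def norm_1(M):
    return max(sum(abs(row[j]) for row in M) for j in range(len(M[0])))

M = [[1.0, -2.0],
     [3.0,  4.0]]
print(norm_inf(M))  # 7.0  (row [3, 4])
print(norm_1(M))    # 6.0  (column [-2, 4])
```

For p = 2 no such formula exists; the norm is the largest singular value, which is why that case is usually computed numerically.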
Lemma 5.1. Let x̄ ∈ A, assume that MFCQ holds at x̄ and let d̄ denote the solution of LP(x̄). Then for every ε > 0 there exists δ > 0 such that for all x with ‖x − x̄‖ ≤ δ it holds that ∇f(x)d ≤ ∇f(x̄)d̄ + ε, where d denotes the solution of LP(x).
Proof. The classical Robinson result (cf. [9, Corollary 1, Theorem 3]), together with MFCQ at x̄, yields the existence of κ > 0 and δ̄ > 0 such that for every x with ‖x − x̄‖ ≤ δ̄ there exists d̃, feasible for LP(x) and fulfilling a corresponding error bound with constant κ. Thus, taking into account ∇a(x̄)d̄ = 0, (b(x̄))_− + ∇b(x̄)d̄ ≤ 0 and ‖d̄‖_∞ ≤ 1, we obtain that ‖d̃ − d̄‖ is small for x close to x̄. Hence, given ε > 0, by continuity of the objective and constraint functions as well as of their derivatives at x̄, we can define δ ≤ δ̄ such that for all x with ‖x − x̄‖ ≤ δ it holds that ∇f(x)d̃ ≤ ∇f(x̄)d̄ + ε. Consequently, since ∇f(x)d ≤ ∇f(x)d̃ by feasibility of d̃ for LP(x), the claim is proved.
Lemma 5.2. Let ν ∈ (0, 1) be a given constant and for a vector of positive parameters ω = (ω^E, ω^I) define the function ϕ. Further assume that there exist ε > 0 and a compact set C such that for all x ∈ C it holds that ∇f(x)d ≤ −ε, where d denotes the solution of LP(x). Then there exists ᾱ > 0 such that (75) holds for all x ∈ C and every α ∈ [0, ᾱ].
Proof. The definition of ϕ, together with elementary estimates, yields the starting inequality. By uniform continuity of the derivatives of the constraint functions and of the objective function on compact sets, it follows that there exists ᾱ > 0 such that for all x ∈ C and every h with ‖h‖_∞ ≤ ᾱ the corresponding remainder terms are small. Hence, for all x ∈ C and every α ∈ [0, ᾱ] we obtain the first estimate. On the other hand, taking into account ∇a(x)d = 0, ‖d‖_∞ ≤ 1 and (77), we similarly obtain the second estimate for all x ∈ C and every α ∈ [0, ᾱ]. Consequently, (75) follows from (76) and the proof is complete.

The extended version of Algorithm 4.1
For every vector x ∈ R^n and every partition (W_1, W_2) ∈ P(V) we define the linear problem LP(x, W_1). Note that d = 0 is always feasible for this problem and that the problem LP(x, W_1) coincides with the problem LP(x) with a, b given by (79). The following proposition provides the motivation for introducing the problem LP(x, W_1).
On the other hand, if (80) is fulfilled, it follows that min{∇f(x̄)d̄^1, ∇f(x̄)d̄^2} = 0 as well. Thus, d = 0 is an optimal solution of LP^1 and LP^2, and the duality theory of linear programming yields that the solutions λ^1 and λ^2 of the dual problems exist and their objective values are both zero. However, this implies that for j = 1, 2 the complementarity relations hold, and consequently λ^1 fulfills the conditions of λ̄ and λ^2 fulfills the conditions of λ, showing that x̄ is indeed Q-stationary with respect to (β_1, β_2).

Now for each k consider two partitions (W_{1,k}^1, W_{2,k}^1), (W_{1,k}^2, W_{2,k}^2) ∈ P(V), let d_k^1, d_k^2 denote the solutions of the corresponding linear problems and let (W_{1,k}, W_{2,k}) ∈ {(W_{1,k}^1, W_{2,k}^1), (W_{1,k}^2, W_{2,k}^2)} denote the partition corresponding to d_k. Next, we define the function ϕ_k by (83). Note that the function ϕ_k coincides with ϕ for a, b given by (79) with (W_1, W_2) := (W_{1,k}, W_{2,k}) and ω = (ω^E, ω^I) given accordingly.

Proposition 5.2. For all x ∈ R^n it holds that (84). Proof. Non-negativity of the distance function, together with (20), yields the pointwise bound for every i ∈ V, j = 1, 2. Hence (84) follows by summation over j = 1, 2 and i ∈ W_{j,k}.

An outline of the extended algorithm is as follows.
Naturally, Remark 4.1 regarding the stopping criteria for Algorithm 4.1 applies to this algorithm as well.
Proof. In order to show that j(k) is well defined, we have to prove the existence of some j such that either (85) or (86) is fulfilled. By (84) we know that Φ_k(x^k) − ϕ_k(x^k) ≤ 0. In case Φ_k(x^k) − ϕ_k(x^k) < 0, every j sufficiently large clearly fulfills (86). On the other hand, if Φ_k(x^k) − ϕ_k(x^k) = 0, taking into account (84) we obtain the corresponding equality. However, Lemma 5.2 for ν := μ and C := {x^k} yields that if ∇f(x^k)d_k < 0, then there exists some ᾱ such that the sufficient decrease estimate holds for all α ∈ [0, ᾱ], and thus (85) is fulfilled for every j sufficiently large. This finishes the proof.

Convergence of the extended algorithm
We consider the behavior of Algorithm 5.1 when it does not prematurely stop and generates an infinite sequence of iterates x^k. We discuss the convergence behavior under the following assumption.
2. The Mangasarian-Fromovitz constraint qualification (MFCQ) holds at every limit point x̄ of the sequence of iterates x^k.
3. For every limit point x̄ of the sequence of iterates x^k there exists a subsequence K(x̄) such that lim_{k∈K(x̄)} x^k = x̄ and the partitions (W_{1,k}^1, W_{2,k}^1) satisfy I_0+(x̄) ⊂ W_{1,k}^1 ⊂ I_0(x̄) for all k ∈ K(x̄).

Note that the Next iterate step of Algorithm 5.1 remains almost unchanged compared to the Next iterate step of Algorithm 4.1; we just consider the point x̃^k instead of x^k. Consequently, most of the results from subsections 4.1 and 4.2 remain valid, possibly after replacing x^k by x̃^k where needed, e.g. in Lemma 4.3. The only exception is the proof of Lemma 4.5, where we have to show that the sequence Φ_k(x^k) is monotonically decreasing. This now follows from (85), and hence Lemma 4.5 remains valid as well.
We state now the main result of this section.
Theorem 5.1. Let Assumption 2 be fulfilled. Then every limit point of the sequence of iterates x k is at least Q M -stationary for problem (1).
Proof. Let x̄ denote a limit point of the sequence x^k and let K(x̄) denote a subsequence from Assumption 2 (3.). Since lim_{k∈K(x̄)} x̃^{k−1} = x̄, by applying Theorem 4.1 to the sequence x̃^{k−1} we obtain the feasibility of x̄ for problem (1).
Next we consider d̄^1, d̄^2 as in Proposition 5.1 with β_1 := ∅, and without loss of generality we only consider k ∈ K(x̄), k ≥ k̄, where k̄ is given by Lemma 4.4. We show by contradiction that the case min{∇f(x̄)d̄^1, ∇f(x̄)d̄^2} < 0 cannot occur. Let us assume on the contrary that, say, ∇f(x̄)d̄^1 < 0. Assumption 2 (3.) and the feasibility of x̄ for (1), together with I_0+(x̄) ⊂ W_{1,k}^1 ⊂ I_0(x̄), imply x̄ ∈ A for A given by (71) and a, b given by (79) with (W_1, W_2) := (W_{1,k}^1, W_{2,k}^1). Taking into account Assumption 2 (2.), Lemma 5.1 then yields that for ε := −∇f(x̄)d̄^1/2 > 0 there exists δ such that for all ‖x^k − x̄‖ ≤ δ we have ∇f(x^k)d_k ≤ ∇f(x^k)d_k^1 ≤ ∇f(x̄)d̄^1/2 = −ε, with d_k given by (82). Next, we choose k̃ such that for k ≥ k̃ it holds that ‖x^k − x̄‖ ≤ δ, and we set ν := (1+μ)/2, C := {x | ‖x − x̄‖ ≤ δ}. From Lemma 5.2 we obtain that the sufficient decrease estimate holds for all α ∈ [0, ᾱ]. Moreover, by choosing k̃ larger if necessary, we can assume that the corresponding sign conditions hold for all i ∈ V. For the partition (W_{1,k}, W_{2,k}) ∈ {(W_{1,k}^1, W_{2,k}^1), (W_{1,k}^2, W_{2,k}^2)} corresponding to d_k it holds that I_0+(x̄) ⊂ W_{1,k} ⊂ I_0(x̄), and this, together with the feasibility of x̄ for (1), implies F_i(x̄) ∈ P_j, i ∈ W_{j,k}, for j = 1, 2. Therefore, taking into account (22), we obtain a bound on max{max_{i∈W_{1,k}} d(F_i(x^k), P_1), max_{i∈W_{2,k}} d(F_i(x^k), P_2)}.

Numerical results

In the Ten-bar Truss example we consider the ground structure depicted in Figure 1(a), consisting of N = 10 potential bars and 6 nodal points. We consider a load which applies at the bottom right-hand node, pulling vertically to the ground with force f = 1. The two left-hand nodes are fixed, and hence the structure has d = 8 degrees of freedom for displacements.
We set c := 10, ā := 100 and σ̄ := 1 as in [7]; the resulting structure, consisting of 5 bars, is shown in Figure 1(b) and is the same as the one in [7]. For comparison, in the following table we show the full data, containing also the stress values. We can see that although our final structure and optimal volume are the same as in [7], the solution (a*, u*) is different. For instance, since f^T u* = 8 < 10 = c, our solution does not reach the maximal compliance. Similarly as in [7], we observe the effect of vanishing constraints, since the stress values from the table show that σ*_max := max_{1≤i≤N} |σ_i(a*, u*)| = 1.4882 > σ̄* := max_{1≤i≤N: a*_i>0} |σ_i(a*, u*)| = 1 = σ̄.
In the Cantilever Arm example we consider the ground structure depicted in Figure 2(a), consisting of N = 224 potential bars and 27 nodal points. Again, we consider a load acting at the bottom right-hand node, pulling vertically to the ground with force f = 1. Now the three left-hand nodes are fixed, and hence d = 48.