A structured modified Newton approach for solving systems of nonlinear equations arising in interior-point methods for quadratic programming

The focus in this work is on interior-point methods for inequality-constrained quadratic programs, and particularly on the system of nonlinear equations to be solved for each value of the barrier parameter. Newton iterations give high quality solutions, but we are interested in modified Newton systems that are computationally less expensive at the expense of lower quality solutions. We propose a structured modified Newton approach where each modified Jacobian is composed of a previous Jacobian, plus one low-rank update matrix per succeeding iteration. Each update matrix is, for a given rank, chosen such that the distance to the Jacobian at the current iterate is minimized, in both 2-norm and Frobenius norm. The approach is structured in the sense that it preserves the nonzero pattern of the Jacobian. The choice of update matrix is supported by results in an ideal theoretical setting. We also produce numerical results with a basic interior-point implementation to investigate the practical performance within and beyond the theoretical framework. In order to improve performance beyond the theoretical framework, we also motivate and construct two heuristics to be added to the method.


Introduction
This work concerns inequality-constrained quadratic programs of the form

minimize (1/2) x^T H x + c^T x subject to Ax ≥ b, (IQP)

where H ∈ R^{n×n}, A ∈ R^{m×n}, x ∈ R^n, c ∈ R^n and b ∈ R^m. We consider primal-dual interior-point methods, which means solving, or approximately solving, a sequence of systems of nonlinear equations. Application of Newton iterations to each system of nonlinear equations gives an unreduced unsymmetric block 3-by-3 system of linear equations, of dimension n + 2m, to be solved at each iteration. This system can be put on an equivalent form which contains a reduced symmetric block 2-by-2 system of dimension n + m, or a condensed system of dimension n, see Section 2.
The manuscript is organized as follows: Section 2 contains notation and background for this work; in Section 3 we propose a modified Newton approach and discuss how it relates to some previous work on interior-point methods; Section 4 contains a description of the implementation along with two heuristics and a refactorization strategy; in Section 5 we give numerical results on convex quadratic programs; finally, in Section 6 we give some concluding remarks.

Notation
Throughout, ρ(M) denotes the spectral radius of a matrix M and |S| denotes the cardinality of a set S. The symbol "•" denotes the component-wise multiplication operator, "≻ 0" positive definiteness, and "∧" the logical and. Quantities associated with Newton iterations will throughout be labeled with "ˆ". Vector subscript and superscript denote component index and iteration index, respectively. The only exception is e_i, which denotes the ith unit vector of appropriate dimension. All norms are 2-norms unless otherwise stated.

Background
The theoretical setting is analogous to the setting in a previous work of ours on bound-constrained nonlinear problems [8]. For completeness, we review the background adapted to inequality-constrained quadratic programs in this section. Our interest is focused on the situation as primal-dual interior-point methods converge to a local minimizer x* ∈ R^n with its corresponding multipliers λ* ∈ R^m and slack variables s* ∈ R^m. Specifically, we assume that the iterates of the method converge to a vector (x*^T, λ*^T, s*^T)^T, denoted by (x*, λ*, s*), that satisfies

Hx* + c − A^T λ* = 0, (stationarity) (2.1a)
Ax* − s* − b = 0, (feasibility) (2.1b)
s* ≥ 0, (non-negativity of slack variables) (2.1c)
λ* ≥ 0, (non-negativity of multipliers) (2.1d)
λ* • s* = 0, (complementarity) (2.1e)
Z(x*)^T H Z(x*) ≻ 0, (second-order condition) (2.1f)
s* + λ* > 0, (strict complementarity) (2.1g)
A_A(x*) of full row rank, (regularity) (2.1h)

where A_A(x*) denotes the Jacobian corresponding to the active constraints and Z(x*) denotes a matrix whose columns span the nullspace of A_A(x*). First-order necessary conditions for a local minimizer of (IQP) are given by (2.1a)-(2.1e). The first-order conditions together with (2.1f) and (2.1g) constitute second-order sufficient conditions for a local minimizer of (IQP) [20]. In the theoretical results, we also assume that (x*, λ*, s*) satisfies (2.1h).
To simplify the notation, we let z denote the triplet (x, λ, s). For a given barrier parameter µ ∈ R, we are interested in the function F_µ : R^{n+2m} → R^{n+2m} given by

F_µ(z) = ( Hx + c − A^T λ, Ax − s − b, ΛSe − µe ), (2.2)

where S = diag(s), Λ = diag(λ) and e is a vector of ones of appropriate size. First-order necessary conditions for a local minimizer of (IQP), (2.1a)-(2.1e), are satisfied by a vector z, with s ≥ 0 and λ ≥ 0, that fulfills F_µ(z) = 0 for µ = 0. In interior-point methods, F_µ(z) = 0 is solved or approximately solved for a sequence of µ > 0 that approaches zero, while preserving s > 0 and λ > 0. Application of Newton iterations means solving a sequence of systems of linear equations of the form

F'(z) ∆ẑ = −F_µ(z), (2.3)

where ∆ẑ = (∆x̂, ∆λ̂, ∆ŝ) and F' : R^{n+2m} → R^{(n+2m)×(n+2m)} is the Jacobian of F_µ(z), defined by

F'(z) = [ H  −A^T  0 ;  A  0  −I ;  0  S  Λ ]. (2.4)

In consequence, it follows that the Jacobian at z + ∆z can be written as

F'(z + ∆z) = F'(z) + ∆F'(∆z), where ∆F'(∆z) = [ 0  0  0 ;  0  0  0 ;  0  ∆S  ∆Λ ], (2.5)

with ∆Λ = diag(∆λ) and ∆S = diag(∆s). The subscript µ has been omitted since F' is independent of the barrier parameter. Under the assumption that Λ is nonsingular, the unreduced block 3-by-3 system (2.3) can be reformulated as the reduced system

[ H  −A^T ;  −A  −Λ^{-1}S ] (∆x̂, ∆λ̂) = ( −(Hx + c − A^T λ),  Ax − b − µΛ^{-1}e ), (2.6)

together with ∆ŝ = −(s − µΛ^{-1}e) − Λ^{-1}S ∆λ̂. If in addition S is nonsingular, then a Schur complement reduction of Λ^{-1}S in (2.6) gives the condensed system

( H + A^T S^{-1}Λ A ) ∆x̂ = −(Hx + c − A^T λ) − A^T S^{-1}( Λ(Ax − b) − µe ), (2.7)

together with ∆λ̂ = −S^{-1}( Λ(Ax − b) − µe ) − S^{-1}ΛA ∆x̂. The focus in the manuscript will mainly be on the unreduced block 3-by-3 system (2.3). However, analogous reductions of the modified Newton system, similar to those of (2.6) and (2.7), will also be discussed.
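To make the setting concrete, the residual F_µ and the unreduced block 3-by-3 Jacobian can be assembled and one Newton step (2.3) taken numerically. The following is our minimal illustration in Python with randomly generated data, not the paper's implementation:

```python
import numpy as np

def F_mu(z, H, A, b, c, mu):
    # F_mu(z) = (Hx + c - A^T lam, Ax - s - b, lam*s - mu*e), z = (x, lam, s)
    n, m = H.shape[0], A.shape[0]
    x, lam, s = z[:n], z[n:n+m], z[n+m:]
    return np.concatenate([H @ x + c - A.T @ lam,
                           A @ x - s - b,
                           lam * s - mu])

def jacobian(z, H, A):
    # Unreduced block 3-by-3 Jacobian F'(z) of (2.4); it is independent of mu
    # and depends on z only through (lam, s).
    n, m = H.shape[0], A.shape[0]
    lam, s = z[n:n+m], z[n+m:]
    J = np.zeros((n + 2*m, n + 2*m))
    J[:n, :n] = H
    J[:n, n:n+m] = -A.T
    J[n:n+m, :n] = A
    J[n:n+m, n+m:] = -np.eye(m)
    J[n+m:, n:n+m] = np.diag(s)    # S block: d(lam_i s_i)/d lam_i = s_i
    J[n+m:, n+m:] = np.diag(lam)   # Lambda block: d(lam_i s_i)/d s_i = lam_i
    return J

# One Newton step (2.3): F'(z) dz = -F_mu(z), on random data with lam, s > 0
np.random.seed(0)
n, m = 3, 2
H = np.eye(n)
A = np.random.randn(m, n)
c, b = np.random.randn(n), np.random.randn(m)
z = np.concatenate([np.random.randn(n), 0.5 + np.random.rand(2*m)])
dz = np.linalg.solve(jacobian(z, H, A), -F_mu(z, H, A, b, c, mu=0.1))
```

Since the first two block equations are linear, a full Newton step makes them exactly zero; only the complementarity block retains the second-order term ∆λ̂ • ∆ŝ.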
To improve efficiency, many methods seek approximate solutions of F_µ(z) = 0 for each µ. There are different strategies for updating µ, e.g., dynamically every iteration, or keeping µ fixed until sufficient decrease of a merit function is achieved. Herein, our model method uses the latter. In particular, our model method is similar to the basic interior-point method of Algorithm 19.1 in Nocedal and Wright [25, Ch. 19, p. 567]. However, termination and reduction of µ are based on the merit function φ_µ(z) = ‖F_µ(z)‖. Similarly, in the theoretical framework we consider the basic condition ‖F_µ(z)‖ ≤ Cµ, for some constant C > 0, see e.g., [25, Ch. 17, p. 572]. The additional assumption that all vectors z satisfy s > 0 and λ > 0 is made throughout.
In the remaining part of this section we give some definitions and provide the details of the theoretical framework.

Definition 2.1. (Order notation) Let α, γ ∈ R be two positive related quantities. If there exists a constant C_1 > 0 such that γ ≥ C_1 α for sufficiently small α, then γ = Ω(α). Similarly, if there exists a constant C_2 > 0 such that γ ≤ C_2 α for sufficiently small α, then γ = O(α).

The first of the following two lemmas provides the existence of a neighborhood where the Jacobian is nonsingular. The second lemma gives the existence of a Lipschitz continuous barrier trajectory z_µ in the neighborhood where the Jacobian is nonsingular. The results are well known and can be found in, e.g., Ortega and Rheinboldt [26]. See also Byrd, Liu and Nocedal [4] for the corresponding results in a setting similar to the one considered here.

Lemma 2.1. Under Assumption 2.1 there exist δ > 0 and a constant M > 0 such that F'(z) is continuous and nonsingular with ‖F'(z)^{-1}‖ ≤ M for z ∈ B(z*, δ).
The following lemma provides a relation between the distance of vectors z to the barrier trajectory and the quantity ‖F_µ(z)‖, when the distance is sufficiently small. A corresponding result is also given by Byrd, Liu and Nocedal [4].
The next lemma provides a bound on the Newton direction, ∆ẑ, for z sufficiently close to the barrier trajectory.

A structured modified Newton approach
In order to describe the approach and its ideas, we first consider a simple setting with one iteration. For a given µ > 0, consider the interior-point iterate z^+ ∈ B(z*, δ) defined by z^+ = z + ∆ẑ, where z ∈ B(z*, δ) and ∆ẑ satisfies (2.3) with µ^+ = σµ, 0 < σ < 1. Since ∆ẑ has been computed from (2.3), we assume that a factorization of F'(z) is known. Instead of performing another Newton step ∆ẑ^+ at z^+ for some µ^{++} = σ^+ µ^+, 0 < σ^+ ≤ 1, which requires the solution of (2.3) with µ^{++} and z^+, we would like to compute an approximate solution, which is computationally less expensive, from

B^+ ∆z^+ = −F_{µ^{++}}(z^+), where B^+ = F'(z) + U, (3.1)

and U is some low-rank update matrix. A natural question is then how to choose the update matrix U. Gondzio and Sobral [17] consider rank-1 update matrices such that the distance, in Frobenius norm, between B^+ and the previous Jacobian approximation is minimized, when B^+ in addition satisfies the secant condition. They show that the sparsity pattern of the first two block rows is maintained, but that the sparsity pattern of the third block row is typically lost. In contrast, our strategy is, for a given rank restriction r on U, to choose U such that the distance, in both 2-norm and Frobenius norm, between B^+ and the actual Jacobian F'(z^+) is minimized. The sparsity of the Jacobian is then maintained, but there is no requirement on B^+ to fulfill the secant condition.
To further support the choice of update matrix we give some additional theoretical results. First, we show that there is a region where the modified Newton approach produces small errors with respect to the Newton direction; in particular, a region that depends on µ in which the modified Newton direction approaches the Newton direction as µ → 0. Later, we also discuss general errors, descent directions with respect to our merit function φ_µ(z), and conditions for local convergence.
The error of using the modified Jacobian B^+ of (3.1) is

E^+ = F'(z^+) − B^+ = ∆F'(∆ẑ) − U. (3.2)

Given a rank restriction r, 0 ≤ r ≤ m, on U, the Eckart-Young-Mirsky theorem gives the update matrix U that minimizes the Jacobian error E^+, in terms of both ‖·‖_2 and ‖·‖_F. In Proposition 3.1 below, we give an expression for U and show that the resulting modified Jacobian may be viewed as a Jacobian evaluated at a point z̃^+ = (x^+, λ̃^+, s̃^+).
Proposition 3.1. For z = (x, λ, s) and ∆z = (∆x, ∆λ, ∆s), let F'(z) and ∆F'(∆z) be defined by (2.4) and (2.5) respectively, and let z^+ = z + ∆z. For a given rank r, 0 ≤ r ≤ m, let U_r be the set of indices corresponding to the r largest quantities of (∆λ_i)^2 + (∆s_i)^2, i = 1, ..., m. The optimal solution of

minimize_{U ∈ R^{(n+2m)×(n+2m)}} ‖F'(z^+) − (F'(z) + U)‖ subject to rank(U) ≤ r,

where ‖·‖ is either of type ‖·‖_2 or ‖·‖_F, is

U* = Σ_{i ∈ U_r} e_{n+m+i} ( ∆s_i e_{n+i}^T + ∆λ_i e_{n+m+i}^T ).

In consequence, it holds that B^+ = F'(z) + U* = F'(z̃^+), where z̃^+ = (x^+, λ̃^+, s̃^+) with λ̃_i^+ = λ_i^+, s̃_i^+ = s_i^+ for i ∈ U_r, and λ̃_i^+ = λ_i, s̃_i^+ = s_i otherwise.

Proof. By (2.5), F'(z^+) − (F'(z) + U) = ∆F'(∆z) − U, where the nonzero rows of ∆F'(∆z) are mutually orthogonal with norms ((∆λ_i)^2 + (∆s_i)^2)^{1/2}. The result then follows from the Eckart-Young-Mirsky theorem, stated in Theorem A.1, together with Lemma A.1. The last part of the proposition follows directly from performing the update.
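As an illustration of Proposition 3.1, adding the optimal rank-r matrix U* amounts to overwriting, in the S and Λ blocks, the r component-pairs with the largest (∆λ_i)^2 + (∆s_i)^2. The sketch below assumes the variable ordering (x, λ, s) and uses a dense stand-in for the Jacobian whose only z-dependent entries are those two blocks; it is illustrative code, not the paper's implementation:

```python
import numpy as np

def optimal_rank_r_update(B, n, m, lam, s, lam_new, s_new, r):
    # Adding U* of Proposition 3.1 to B = F'(z) is equivalent to overwriting
    # the pairs (lam_i, s_i) by (lam_new_i, s_new_i) for the r indices with
    # largest (dlam_i)^2 + (ds_i)^2 (the set U_r).
    score = (lam_new - lam)**2 + (s_new - s)**2
    idx = np.argsort(score)[::-1][:r]
    Bp = B.copy()
    for i in idx:
        Bp[n+m+i, n+i] = s_new[i]      # S-block entry (coefficient of dlam_i)
        Bp[n+m+i, n+m+i] = lam_new[i]  # Lambda-block entry (coefficient of ds_i)
    return Bp, idx

np.random.seed(1)
n, m = 2, 4
lam, s = 1 + np.random.rand(m), 1 + np.random.rand(m)
lam_p, s_p = 1 + np.random.rand(m), 1 + np.random.rand(m)

def third_block(Mlam, Ms):
    # Dense stand-in for F'(z): only the (S, Lambda) blocks vary with z,
    # so the constant H, A blocks cancel in all error differences below.
    J = np.zeros((n + 2*m, n + 2*m))
    J[n+m:, n:n+m] = np.diag(Ms)
    J[n+m:, n+m:] = np.diag(Mlam)
    return J

B = third_block(lam, s)
Bp, idx = optimal_rank_r_update(B, n, m, lam, s, lam_p, s_p, r=2)
```

With r = m the update reproduces the Jacobian at the new point exactly, and for r < m the remaining 2-norm error equals the (r+1)-th largest ((∆λ_i)^2 + (∆s_i)^2)^{1/2}, as the Eckart-Young-Mirsky theorem predicts.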
Proposition 3.1 shows that each rank-1 term of the sum in U* added to F'(z) is equivalent to the update of one component-pair, (λ_i, s_i), in the Λ and S blocks of the Jacobian. The essence is that adding the rank-r update matrix U* to F'(z) is equivalent to updating the pairs (λ_i, s_i) to (λ_i^+, s_i^+), i ∈ U_r, and that the modified Jacobian at z^+ may be viewed as a Jacobian evaluated at z̃^+. In particular, r = m gives z̃^+ = z^+ and B^+ = F'(z^+). Before we give the analogous result of Proposition 3.1 in a more general framework, we show that there exists a region where the modified Newton approach may be started without causing large errors in the search direction. In particular, we give a bound on the search direction error ‖∆ẑ^+ − ∆z^+‖, where ∆z^+ satisfies (3.1) with update matrix U* of rank r as given in Proposition 3.1. In the derivation, the inverse of B^+ is expressed as a Neumann series, which requires ρ(F'(z^+)^{-1} E^+) < 1. We first show that, among U such that rank(U) ≤ r, U* is sound in regard to the reduction of an upper bound of ρ(F'(z^+)^{-1} E^+). Thereafter, in Lemma 3.1 we show that, for iterates z sufficiently close to the barrier trajectory, the quantity ‖F'(z^+)^{-1} E^+‖, and consequently also ρ(F'(z^+)^{-1} E^+), is bounded above by a constant times µ. This gives the existence of a region, depending on the barrier parameter, where ρ(F'(z^+)^{-1} E^+) < 1. By assumption z^+ ∈ B(z*, δ), hence by Lemma 2.1 there exists a constant M > 0 such that ‖F'(z^+)^{-1}‖ ≤ M, and it holds that

ρ(F'(z^+)^{-1} E^+) ≤ ‖F'(z^+)^{-1} E^+‖ ≤ M ‖E^+‖. (3.3)

Lemma A.1 shows that the singular values of ∆F'(∆ẑ) are given by ((∆λ̂_i)^2 + (∆ŝ_i)^2)^{1/2}, i = 1, ..., m. The largest reduction of the upper bound in (3.3), among U such that rank(U) ≤ r, is achieved with the rank-r update matrix U* of Proposition 3.1, which gives

‖E^+‖ = ((∆λ̂_{r+1})^2 + (∆ŝ_{r+1})^2)^{1/2}, (3.4)

where the indices i = 1, ..., m are ordered such that (∆λ̂_i)^2 + (∆ŝ_i)^2 are in descending order. This supports the choice of update matrix in regard to the reduction of the upper bound of the spectral radius, and 2-norm, of F'(z^+)^{-1} E^+.
Proof. See Appendix.
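The singular-value claim of Lemma A.1 used above is easy to verify numerically: the nonzero rows of ∆F'(∆z) occupy distinct columns and are therefore mutually orthogonal, so its singular values are exactly the row norms ((∆λ_i)^2 + (∆s_i)^2)^{1/2}. A small check on illustrative data:

```python
import numpy as np

np.random.seed(1)
n, m = 3, 4
dlam, ds = np.random.randn(m), np.random.randn(m)

# dF'(dz) of (2.5): only the third block row is nonzero, [0, dS, dLam]
dF = np.zeros((n + 2*m, n + 2*m))
dF[n+m:, n:n+m] = np.diag(ds)    # Delta S block (lam-columns)
dF[n+m:, n+m:] = np.diag(dlam)   # Delta Lambda block (s-columns)

sv = np.linalg.svd(dF, compute_uv=False)        # descending order
expected = np.sort(np.sqrt(dlam**2 + ds**2))[::-1]
```

The m leading singular values match the row norms and the remaining ones vanish, confirming that rank(∆F'(∆z)) ≤ m.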
As for Lemma 3.1, the result of Theorem 3.1 is also valid for U as a zero matrix. The essence is again that, among update matrices U such that rank(U) ≤ r, the rank-r update matrix U* of Proposition 3.1 is the matrix that provides the tightest bound on (3.7) with our analysis. As mentioned, U* is also the matrix that gives the upper bound in (3.5), and in addition, C_{(r+1)} decreases with increasing r. Consequently, the bound in (3.7) decreases with increasing r, and larger values of r may thus also increase the region where the result of Theorem 3.1 is valid, i.e., the region where the proposed modified Newton approach may be a viable alternative.

At a general iteration k
In this section we give a result analogous to Proposition 3.1, at iteration k, k ≥ 1, in a damped modified Newton setting. Consider the sequence {z^i}, i = 0, ..., k, generated by z^{i+1} = z^i + α_i ∆z^i, i = 0, ..., k − 1, where α_i is the step size. Suppose that each ∆z^i satisfies

B^i ∆z^i = −F_{µ_i}(z^i), where B^i = B^{i−1} + U^i and B^0 = F'(z^0), (3.8)

for some µ_i > 0 and update matrix U^i of rank r_i. If at k = 1 the update matrix is chosen, for a given rank r_k, as the optimal solution of the optimization problem in Proposition 3.1, then B^1 = F'(z̃^1) for some z̃^1. Inductively, at an iteration k, k ≥ 1, for a given z̃^{k−1} and rank r_k, 0 ≤ r_k ≤ m, we wish to choose U^k as the optimal solution of

minimize_{U ∈ R^{(n+2m)×(n+2m)}} ‖F'(z^k) − (F'(z̃^{k−1}) + U)‖ subject to rank(U) ≤ r_k, (3.9)

where ‖·‖ is either of type ‖·‖_2 or ‖·‖_F. The optimal solution of (3.9), the update of z̃^k from z̃^{k−1}, and the resulting optimal B^k are shown in Proposition 3.2. This is analogous to the update from z^0 to z̃^1 given in Proposition 3.1. The essence is that the rank-r_k update matrix defined by the solution of (3.9) is equivalent to updating the information corresponding to the r_k largest quantities (λ_i^k − λ̃_i^{k−1})^2 + (s_i^k − s̃_i^{k−1})^2, i = 1, ..., m. In essence, the r_k largest deviations from the Newton step are corrected.

Proposition 3.2. For a given z̃^{k−1} = (x^{k−1}, λ̃^{k−1}, s̃^{k−1}) and iterate z^k = (x^k, λ^k, s^k), let U_{r_k} be the set of indices corresponding to the r_k largest quantities of (λ_i^k − λ̃_i^{k−1})^2 + (s_i^k − s̃_i^{k−1})^2, i = 1, ..., m. The optimal solution of (3.9) is

U^k_* = Σ_{i ∈ U_{r_k}} e_{n+m+i} ( (s_i^k − s̃_i^{k−1}) e_{n+i}^T + (λ_i^k − λ̃_i^{k−1}) e_{n+m+i}^T ).

In consequence, it holds that B^k = F'(z̃^{k−1}) + U^k_* = F'(z̃^k), where z̃^k = (x^k, λ̃^k, s̃^k) with λ̃_i^k = λ_i^k, s̃_i^k = s_i^k for i ∈ U_{r_k}, and λ̃_i^k = λ̃_i^{k−1}, s̃_i^k = s̃_i^{k−1} otherwise.

Proof. The proof is analogous to that of Proposition 3.1 with z = z̃^{k−1}, z^+ = z^k and ∆z = z^k − z̃^{k−1}.
In the described approach, U^k_* of Proposition 3.2 gives B^k = F'(z̃^k) for some z̃^k. A direct consequence is that iterates which become primal-dual feasible, i.e., satisfy the first two block equations of (2.3), will remain so. Moreover, at iteration k, the error of using the modified Jacobian is

E^k = F'(z^k) − B^k = F'(z^k) − F'(z̃^k), (3.10)

and it holds that ‖E^k‖ ≤ ‖z^k − z̃^k‖. In fact, for any z and z̄ it holds that

‖F'(z) − F'(z̄)‖ ≤ ‖z − z̄‖, (3.11)

which implies that the Lipschitz constant of F' may be chosen as one. Recall that, among the update matrices U of rank at most r_k, U^k_* is the one that minimizes ‖E^k‖. Next we show that the update matrix U^k_* of Proposition 3.2 is sound with respect to reducing an upper bound of the relative error of the search direction ∆z^k. At iteration k, k ≥ 1, ∆z^k satisfies (3.8), which equivalently can be written as

F'(z^k) ∆z^k = −F_{µ_k}(z^k) + E^k ∆z^k, (3.12)

so that the residual relative to the search direction satisfies ‖F'(z^k) ∆z^k + F_{µ_k}(z^k)‖ / ‖∆z^k‖ ≤ ‖E^k‖. Given the restrictions on the update, U^k_* of Proposition 3.2 is the update that minimizes ‖E^k‖, and in consequence gives the largest reduction in the upper bound on the relative error (3.12).

Convergence
In this section we discuss convergence towards the barrier trajectory, i.e., convergence of the inner loop of Algorithm 4.1. We first give a condition for a descent direction with respect to our merit function, as our setting is compatible with linesearch strategies. Thereafter, results are given in a setting where unit steps are assumed to be tractable. We give conditions on B^k, and thus r_k, so that the modified Newton approach converges locally. The theoretical setting is here widened slightly from the previous sections, in that we wish to quantify the effects of the modified Newton approach also for larger values of µ. For a given µ > 0, iterate z and vector ∆z, consider the merit function φ_µ(z) = ‖F_µ(z)‖ and the univariate function ψ(α) = φ_µ(z + α∆z). The directional derivative is then

ψ'(0) = ( F_µ(z)^T F'(z) ∆z ) / ‖F_µ(z)‖. (3.13)

At iteration k, k ≥ 0, z^k and ∆z^k in the modified Newton approach satisfy B^k ∆z^k = −F_µ(z^k), with B^k = F'(z^k) − E^k and E^k as in (3.10). From (3.13) it follows that ∆z^k is a descent direction with respect to φ_µ if

‖E^k‖ ‖∆z^k‖ < ‖F_µ(z^k)‖, (3.14)

since then ψ'(0) ≤ −‖F_µ(z^k)‖ + ‖E^k‖ ‖∆z^k‖ < 0. Under the restrictions of the update, the rank-r_k matrix U^k_* of Proposition 3.2 is chosen such that ‖E^k‖ is minimized. In addition, E^k = 0 for r_k = m. A descent direction can hence always be ensured for r_k sufficiently large. Moreover, in our theoretical setting, Lemma 3.1 gives the existence of a region, depending on µ, where the modified Newton approach may be initiated so that ‖E^k‖ is sufficiently small for (3.14) to hold, even when r_k = 0. The essence is again that ‖E^k‖ decreases with increasing r_k. Next we give a result on local convergence and discuss the modified Newton approach in the framework of inexact Newton methods. Note that the local convergence result in the proposition shows a balance between quadratic rate of convergence, which would follow if B^k = F'(z^k), i.e., C_6 = 0, and ‖F'(z^k)^{-1}‖ ≤ C_7 for all k, and linear rate of convergence, which follows when B^k differs from F'(z^k).

Proposition 3.3. For a given µ > 0, assume that z_µ exists. At an iteration k_0, consider the sequence of iterates generated by z^{k+1} = z^k + ∆z^k, k = k_0, k_0 + 1, ..., where each ∆z^k satisfies (3.8), with µ_k = µ and update matrix of rank r_k, 0 ≤ r_k ≤ m, given by U^k_* of Proposition 3.2. Assume that at each iteration k, (B^k)^{-1} exists and that r_k is chosen such that, for all k, ‖E^k‖ ≤ C_6 for some constant C_6 ≥ 0. If in addition there is a C_7 such that, for all k, ‖(B^k)^{-1}‖ ≤ C_7, then it holds that

‖z^{k+1} − z_µ‖ ≤ C_7 ( (1/2) ‖z^k − z_µ‖ + C_6 ) ‖z^k − z_µ‖, (3.15)

so that z^k converges to z_µ if

C_7 ( (1/2) ‖z^{k_0} − z_µ‖ + C_6 ) < 1. (3.16)

Proof.
Under the conditions of the proposition, ∆z^k and z^k satisfy B^k ∆z^k = −F_µ(z^k). At iteration k + 1 the error may be written as

z^{k+1} − z_µ = z^k − z_µ − (B^k)^{-1} F_µ(z^k)
             = (B^k)^{-1} [ B^k(z^k − z_µ) − F'(z^k)(z^k − z_µ) + F'(z^k)(z^k − z_µ) − F_µ(z^k) + F_µ(z_µ) ],

where F_µ(z_µ) = 0 and F'(z^k)(z^k − z_µ) have been added and subtracted in the second equality. Taking 2-norms while considering the Lipschitz continuity of F', see the end of Section 3.1, and norm inequalities gives

‖z^{k+1} − z_µ‖ ≤ ‖(B^k)^{-1}‖ ( (1/2) ‖z^k − z_µ‖ + ‖E^k‖ ) ‖z^k − z_µ‖. (3.17)

Insertion of the assumed C_6 and C_7 into (3.17) gives (3.15). Finally, if (3.16) holds, the contraction may be applied repeatedly, which completes the proof.

Conditions for local convergence may also be obtained when the modified Newton approach is interpreted in the context of inexact Newton methods [6]. For a given µ > 0, such steps may be viewed on the form

F'(z^k) ∆z^k = −F_µ(z^k) + r^k, with ‖r^k‖ ≤ η_k ‖F_µ(z^k)‖. (3.18)

The sequence of iterates z^k + ∆z^k converges to z_µ, with at least linear rate, for z^0 sufficiently close to z_µ if η_k < 1 uniformly. Given that the iterates converge, the convergence is superlinear if and only if ‖r^k‖ = o(‖F_µ(z^k)‖). The modified Newton approach can be put onto the form of (3.18) under the assumption that I − E^k F'(z^k)^{-1} is nonsingular, where E^k is given by (3.10). Nonsingularity may be ensured with an update matrix U^k_* of Proposition 3.2 of sufficiently large rank r_k, 0 ≤ r_k ≤ m, or by starting the modified Newton approach when µ is sufficiently small, as shown by Lemma 3.1. A straightforward calculation shows that (3.8) at iteration k, with µ_k = µ, can be written as

F'(z^k) ∆z^k = −F_µ(z^k) + E^k ∆z^k. (3.19)

Identification of terms in (3.18) and (3.19) gives r^k = E^k ∆z^k = ( I − (I − E^k F'(z^k)^{-1})^{-1} ) F_µ(z^k). Norm inequalities and standard geometric series results give

‖r^k‖ ≤ ( ‖E^k F'(z^k)^{-1}‖ / (1 − ‖E^k F'(z^k)^{-1}‖) ) ‖F_µ(z^k)‖.

Local convergence towards the barrier trajectory follows if, at each iteration k, ‖E^k F'(z^k)^{-1}‖ < 1/2. Similarly, the modified Newton approach may also be viewed in the framework of inexact interior-point methods in order to study conditions for global convergence, see e.g., [2,3,16]. However, our analysis has only given further technical conditions, which are outside the scope of this initial work. Instead, we have chosen to limit our focus to basic supporting results and to the study of practical performance with update matrices of low rank at larger values of µ.
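The identification r^k = E^k ∆z^k and the geometric-series bound on η_k can be checked on random data. The matrices below are generic well-conditioned stand-ins for F'(z^k) and E^k (not derived from any particular QP), purely to verify the algebra:

```python
import numpy as np

np.random.seed(2)
N = 8
Fp = 10*np.eye(N) + np.random.randn(N, N)   # stand-in for F'(z^k), well conditioned
E = 0.1*np.random.randn(N, N)               # stand-in for E^k = F'(z^k) - B^k
B = Fp - E                                  # modified Jacobian B^k
F = np.random.randn(N)                      # stand-in for F_mu(z^k)

dz = np.linalg.solve(B, -F)                 # modified Newton step (3.8)
resid = Fp @ dz + F                         # inexact-Newton residual r^k of (3.18)

X = E @ np.linalg.inv(Fp)                   # E^k F'(z^k)^{-1}
eta = np.linalg.norm(resid) / np.linalg.norm(F)
bound = np.linalg.norm(X, 2) / (1 - np.linalg.norm(X, 2))
```

Since ‖X‖ < 1 here, the Neumann-series argument applies and eta stays below the bound ‖X‖/(1 − ‖X‖), illustrating why ‖E^k F'(z^k)^{-1}‖ < 1/2 suffices for η_k < 1.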

Reduced systems
The ideas presented so far have concerned the unreduced unsymmetric block 3-by-3 system (2.3). In this section we describe the corresponding reduced and condensed systems, which are similar to those of (2.6) and (2.7).
In essence, the system of linear equations to be solved for each iterate z in the modified Newton approach takes the form

F'(z̃) ∆z = −F_µ(z), (3.20)

for some z̃ = (x, λ̃, s̃). Let S̃ = diag(s̃), Λ̃ = diag(λ̃), and denote the residuals r_1 = Hx + c − A^T λ, r_2 = Ax − s − b and r_3 = ΛSe − µe. System (3.20) may be reformulated on the reduced form

[ H  −A^T ;  −A  −Λ̃^{-1}S̃ ] (∆x, ∆λ) = ( −r_1,  r_2 + Λ̃^{-1} r_3 ), (3.21)

together with ∆s = −Λ̃^{-1}( r_3 + S̃ ∆λ ). An analogous condensed form,

( H + A^T S̃^{-1}Λ̃ A ) ∆x = −r_1 − A^T S̃^{-1}( Λ̃ r_2 + r_3 ), (3.22)

together with ∆λ = −S̃^{-1}( Λ̃ r_2 + r_3 ) − S̃^{-1}Λ̃A ∆x, may also be considered. For z̃ = z, (3.21) and (3.22) reduce to (2.6) and (2.7).
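The equivalence between the modified 3-by-3 system and its condensed form can be verified numerically by block elimination. The data below are arbitrary, with positive tilded pairs so the elimination is well defined; this is an illustrative check, not the paper's code:

```python
import numpy as np

np.random.seed(3)
n, m = 4, 3
H = np.eye(n)
A = np.random.randn(m, n)
x = np.random.randn(n)
lam, s = 0.5 + np.random.rand(m), 0.5 + np.random.rand(m)
lam_t, s_t = 0.5 + np.random.rand(m), 0.5 + np.random.rand(m)  # tilded pairs in B
c, b = np.random.randn(n), np.random.randn(m)
mu = 0.1

r1 = H @ x + c - A.T @ lam
r2 = A @ x - s - b
r3 = lam * s - mu

# Unreduced modified system (3.20): F'(z~) dz = -F_mu(z)
B = np.zeros((n + 2*m, n + 2*m))
B[:n, :n] = H;      B[:n, n:n+m] = -A.T
B[n:n+m, :n] = A;   B[n:n+m, n+m:] = -np.eye(m)
B[n+m:, n:n+m] = np.diag(s_t)
B[n+m:, n+m:] = np.diag(lam_t)
dz = np.linalg.solve(B, -np.concatenate([r1, r2, r3]))

# Condensed system (3.22): (H + A^T St^{-1} Lt A) dx = -r1 - A^T St^{-1}(Lt r2 + r3)
D = lam_t / s_t
K = H + A.T @ (D[:, None] * A)
dx = np.linalg.solve(K, -r1 - A.T @ ((lam_t * r2 + r3) / s_t))
```

Both routes give the same ∆x, which is the property exploited when only the condensed system is factorized.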

Compatibility with previous work on interior-point methods
In order to keep the notation simple, we have chosen to formulate our problem on the form (IQP), with inequality constraints only. Analogous results hold for quadratic programs on standard form, as considered in [1,12,17,22,23]. However, when working with reduced systems, the update will then be on the diagonal of the H-matrix in the symmetric block 2-by-2 indefinite system. Moreover, the proposed approach is also compatible with regularized methods for quadratic programming, e.g., [1,12,27], as long as the scaling of the regularization is not changed at iterations where the modified Jacobian is updated by a low-rank matrix. The scaling of the regularization may be changed at a refactorization step, e.g., on the form suggested in (4.5) of Section 4.
As each modified Jacobian may be viewed as a Jacobian evaluated at a different point, the modified Newton approach may also be interpreted in the framework of previous work on stability, effects of finite-precision arithmetic and spectral properties of the arising systems, e.g., [5,9,10,19,22-24,29-32].

Implementation
All numerical simulations were performed in MATLAB on benchmark problems from the repository of convex quadratic programming problems by Maros and Mészáros [21]. Many of these problems contain both linear equality and linear inequality constraints. However, in order not to complicate the description of the implementation with further technical details, we give the description for problems on the same form as in previous sections. Note, however, that some of the parameters will depend on quantities related to the format of the benchmark problems.

Basic algorithm
The aim is to study the fundamental behavior of the modified Newton approach as primal-dual interior-point methods converge; in particular, when each search direction is generated with a modified Newton equation on the form (3.8), with update matrix U^k_* of Proposition 3.2, relative to a Newton equation on the form (2.3). In order not to risk combining effects of the proposed approach with effects from other features in more advanced methods, we chose to implement the modified Newton approach in a simple interior-point framework. Moreover, all systems of linear equations were solved with MATLAB's built-in direct solver. No low-rank update of factorizations was used or implemented for the numerical tests. In consequence, the results are not dependent on the particular factorization or the procedure used to update the factors. For low-rank updates of matrix factorizations, see e.g., [7,13,14,28]. Our basic interior-point algorithm is similar to Algorithm 19.1 in Nocedal and Wright [25, Ch. 19, p. 567]; however, termination and the update of µ are based on the merit function φ_µ(z) = ‖F_µ(z)‖.

Algorithm 4.1 Basic interior-point method (IQP).
Our reference method, which in all experiments is denoted by Newton, is defined by Algorithm 4.1, where the search direction at iteration k, ∆z^k = (∆x^k, ∆λ^k, ∆s^k), satisfies (2.3). The method whose behavior we aim to study is defined by Algorithm 4.1 where the search direction satisfies (3.8), with an update matrix of rank r_k = r given by U^k_* of Proposition 3.2. This method is denoted by mN-r(r) in the numerical experiments. Although the rank of the update matrices can be varied between iterations, this initial study is limited to update matrices of constant rank in order to keep the comparisons clean.
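For concreteness, a minimal Algorithm 4.1-style loop for the Newton reference method might look as follows. This is our illustrative Python sketch, not the paper's MATLAB code; the µ-reduction test ‖F_µ(z)‖ ≤ µ and the fraction-to-boundary factor 0.995 are our assumptions:

```python
import numpy as np

def solve_iqp(H, c, A, b, mu0=1.0, sigma=0.1, tol=1e-8, max_it=500):
    # Minimal sketch of a basic primal-dual interior-point method in the
    # spirit of Algorithm 4.1 (illustrative parameters, not the paper's).
    n, m = H.shape[0], A.shape[0]
    x, lam, s = np.zeros(n), np.ones(m), np.ones(m)
    mu = mu0

    def F(x, lam, s, mu):
        return np.concatenate([H @ x + c - A.T @ lam, A @ x - s - b, lam * s - mu])

    for _ in range(max_it):
        if np.linalg.norm(F(x, lam, s, 0.0)) <= tol:  # terminate on phi_0(z)
            break
        # Newton system (2.3) on the unreduced block 3-by-3 form
        J = np.zeros((n + 2*m, n + 2*m))
        J[:n, :n] = H;      J[:n, n:n+m] = -A.T
        J[n:n+m, :n] = A;   J[n:n+m, n+m:] = -np.eye(m)
        J[n+m:, n:n+m] = np.diag(s)
        J[n+m:, n+m:] = np.diag(lam)
        d = np.linalg.solve(J, -F(x, lam, s, mu))
        dx, dlam, ds = d[:n], d[n:n+m], d[n+m:]

        def max_step(v, dv):
            # fraction-to-boundary rule keeping v + a*dv > 0
            neg = dv < 0
            return min(1.0, 0.995 * np.min(-v[neg] / dv[neg])) if neg.any() else 1.0

        a = min(max_step(s, ds), max_step(lam, dlam))
        x, lam, s = x + a*dx, lam + a*dlam, s + a*ds
        if np.linalg.norm(F(x, lam, s, mu)) <= mu:  # sufficient merit decrease
            mu *= sigma
    return x, lam, s

# Toy problem: minimize (1/2)||x||^2 subject to x >= b, with b = (-1, 2)
x, lam, s = solve_iqp(np.eye(2), np.zeros(2), np.eye(2), np.array([-1.0, 2.0]))
```

On this toy problem the method recovers the minimizer x* = (0, 2), with the first constraint inactive and the second active.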

Benchmark problems
Each problem was pre-processed and put on an equivalent form with n x-variables, m_in inequality constraints and m_eq equality constraints. The total number of variables in the primal-dual formulation is thus N = n + m_eq + 2m_in; see the Appendix for a description and formulation of the systems that arise. A trivial equality constraint that fixed a variable at any of its bounds was removed from the problem along with the variable. A problem was accepted if m_in ≥ 4 and, in addition, if Newton converged from a given initial solution. Due to the simplicity of the implementation, convergence of Newton was not achieved for some problems, for reasons such as nontrivial equality constraints fixing variables at their bounds, singular Jacobians caused by linearly dependent equality constraints, etc. Moreover, we were not able to run CONT-300, BOYD1 and BOYD2 due to memory restrictions. These conditions reduced the benchmark set, P, to 90 problems (out of 138). The problems were divided into three subsets: small, S, medium, M, and large, L. The sets were defined as follows: S = {p ∈ P : N < 500}, M = {p ∈ P : 500 ≤ N < 10000} and L = {p ∈ P : N ≥ 10000}. Consequently, |S| = 25, |M| = 37 and |L| = 28. The specific problems of each group and details on their individual sizes can be found in the Appendix.

Heuristics
The theoretical results in Section 3 concern iterates sufficiently close to the trajectory for sufficiently small µ, or when the rank of each update matrix is sufficiently large. However, we are also interested in studying the behavior of the modified Newton approach beyond this setting, in particular for larger values of µ and when each update matrix is of low rank. In this section, we discuss the behavior in these cases. In order to improve performance, we also suggest two heuristics and a refactorization strategy. In essence, the heuristics allow for changes of indices in the set U_{r_k} of Proposition 3.2, and the refactorization strategy limits the total rank change on an initial Jacobian.
Numerical experiments with mN-r(r), for small r, have shown that convergence may slow down due to small step sizes α_P^k and α_D^k. Small step sizes can be caused by a few components of the modified Newton direction which differ considerably from those of the Newton direction. We first show some numerical evidence of this behavior and suggest a partial explanation on which we base two heuristics. The effectiveness of each heuristic is then illustrated, and finally a refactorization strategy is included in the modified Newton approach.

Step sizes and convergence, in terms of the measure ‖F_µ‖ with µ = 0, for Newton and mN-r(r) with r = 0, 2, 4, are shown in the left-hand side of Fig. 1. The results are for benchmark problem QAFIRO with parameters µ_0 = 1, σ = 0.1 and ε_tol = 10^{-6}. The right-hand side of the figure shows the inverse of the limiting step sizes and the relative error in the search direction at the iteration marked by the red circle of mN-r(2); hence large spikes imply small step sizes. Moreover, the figure only contains negative components of the modified Newton direction. The result for r = 0, i.e., simplified Newton, is given to illustrate that low-rank updates can indeed make a difference compared to a simplified Newton approach, for which some of our theoretical results are still valid, although in a smaller region. The results in the left-hand side of Fig. 1 indicate that convergence may slow down with the low-rank modified Newton approach due to small step sizes. The right-hand side of Fig. 1 suggests that small steps may be caused by large relative errors in certain components of the search direction. The results are similar to those shown by Gondzio and Sobral [17] for quasi-Newton approaches, hence indicating that the proposed modified Newton approach suffers from the same phenomenon. In theory, zero steps are not harmful for the modified Newton approach, as long as the Newton step makes progress from this point, since after m/r iterations with zero steps the modified Jacobian will indeed be the Jacobian at that point. In practice, however, close-to-zero steps have negative effects on the convergence. In consequence, we would like to understand what causes these steps and how to avoid them.

The partial solution ∆x of (3.20) satisfies (3.22). Let A and I be the sets of indices corresponding to active and inactive constraints at the solution z*, respectively, i.e., A = {i : s_i^* = 0, i = 1, ..., m} and I = {i : λ_i^* = 0, i = 1, ..., m}. For z sufficiently close to z*, the contribution of the inactive constraints is negligible, so that (3.22) may be approximated by

( H + A_A^T S̃_A^{-1}Λ̃_A A_A ) ∆x ≈ −r_1 − A_A^T S̃_A^{-1}( Λ̃_A r_{2,A} + r_{3,A} ). (4.1)

If the modified Newton approach is initiated for small µ, or if the rank of each update matrix is sufficiently large, then z̃ ≈ z. If in addition ‖A_A ∆x‖ is sufficiently large, i.e., ∆x is not in, or almost in, the null-space of A_A (if it is, then the search direction will not cause limiting steps), then the dominating terms of (4.1) are

A_A^T S̃_A^{-1}Λ̃_A A_A ∆x ≈ −A_A^T S̃_A^{-1}( Λ̃_A r_{2,A} + r_{3,A} ). (4.2)

In essence, (4.2) is an approximation of (4.1) in this particular case. By our assumptions, A_A has full row rank. Consequently, component-wise, (4.2) gives

( λ̃_i / s̃_i ) A_i ∆x ≈ −( λ̃_i r_{2,i} + r_{3,i} ) / s̃_i, i ∈ A, (4.3)

where A_i denotes the ith row of A.
Equation (4.3) gives an approximate description of how each pair (λ̃_i, s̃_i), i ∈ A, affects A_i ∆x, the inner product of the search direction ∆x and the corresponding constraint row A_i. This means that each pair (λ̃_i, s̃_i), i ∈ A, affects the angle between ∆x and the constraint A_i, and/or ‖∆x‖. Both errors in angle and large ‖∆x‖ may cause small step sizes. In the proposed modified Newton approach, depending on the rank of the update matrix, some of the factors in the modified Jacobian, i.e., some ratios λ̃_i/s̃_i, i ∈ A, may contain information from previous iterates. The analysis above suggests that it may be important to update certain component-pairs (λ, s) in order to avoid limiting steps. Such pairs may not be updated with the matrix of Proposition 3.1 or Proposition 3.2. In light of the discussion above and the results of Fig. 1, we construct two heuristics in an attempt to decrease the negative effects on convergence caused by small step sizes. Both heuristics have an update matrix U^k analogous to the one given by Proposition 3.2, with

U^k = Σ_{i ∈ Ū^k} e_{n+m+i} ( (s_i^k − s̃_i^{k−1}) e_{n+i}^T + (λ_i^k − λ̃_i^{k−1}) e_{n+m+i}^T ), (4.4)

where Ū^k is an index set of cardinality r. However, not all indices in Ū^k are chosen according to the criteria of Proposition 3.2, so that Ū^k of the heuristics may differ from U_r of Proposition 3.2. (Note that r_k = r, for all k, in the numerical tests.) The first heuristic can have at most two indices that differ between Ū^k and U_r, whereas the second is more flexible and can potentially change all r indices. We choose to replace indices instead of adding indices in order to obtain a fair comparison in the study of the heuristics.

Heuristic H1
The idea of the first heuristic is to ensure that information corresponding to a component-pair (λ, s) is updated if either component limited the step size in the previous iteration. At iteration k, k ≥ 1, Ū^k is based on U_r of Proposition 3.2, but the last one or two indices are replaced by

î_1 = argmin_{i : ∆λ_i^{k−1} < 0} −λ_i^{k−1}/∆λ_i^{k−1} and î_2 = argmin_{i : ∆s_i^{k−1} < 0} −s_i^{k−1}/∆s_i^{k−1},

i.e., the indices that limited the dual and primal step sizes, respectively.
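The step-limiting indices used by H1 can be computed with the standard max-step ratio test. The sketch below is our reading of that rule (the paper does not specify tie-breaking), returning None when no component limits the step:

```python
import numpy as np

def limiting_index(v, dv):
    # Index attaining min over {i : dv_i < 0} of -v_i / dv_i, i.e., the
    # component that limited the step keeping v + a*dv > 0; None if no dv_i < 0.
    neg = np.where(dv < 0)[0]
    if neg.size == 0:
        return None
    return int(neg[np.argmin(-v[neg] / dv[neg])])

# H1 would replace the last indices of U_r by limiting_index(lam, dlam)
# and limiting_index(s, ds) from the previous iteration.
i1 = limiting_index(np.array([1.0, 1.0, 1.0]), np.array([-0.5, -2.0, 0.1]))
i2 = limiting_index(np.array([2.0, 4.0]), np.array([1.0, 2.0]))
```

In the first call the ratios over the negative components are 2 and 0.5, so component 1 limits the step; in the second call no component does.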

Heuristic H2
The principle of the second heuristic is based on the observation in the analysis above. As in H1, the idea is to ensure that certain component-pairs (λ, s) are updated; in particular, the components with the largest relative error in the ratio on the left-hand side of (4.3). However, the set of active constraints at the solution is unknown; instead, all components which could have limited the step size in the previous iteration are considered in the selection. At iteration k, k ≥ 1, Ū^k is based on U_r of Proposition 3.2, but at most r indices are replaced by the indices corresponding to the, at most, r largest relative errors of the ratios λ̃_i^{k−1}/s̃_i^{k−1} with respect to λ_i^k/s_i^k, among the components that could have limited the step size in the previous iteration.

Heuristic test
To demonstrate the impact of heuristic H1 and H2 we show results in Fig. 2 which are analogous to those in the left-hand side of Fig. 1.The methods mN-r(r)-H1 and mN-r(r)-H2 denote mN-r(r), r = [2,4], combined with heuristic H1 and H2 respectively.In addition, Table 1 shows the average of the sum (α k P + α k D )/2 for a subset of the benchmark problems.The subset contains problems from each of the sets S, M and L.  The results of Fig. 2 show that mN-r(r)-H1 and mN-r(r)-H2, r = [2,4], use larger step sizes, and converges in fewer iterations, compared to mN-r(r) in Fig. 1.Hence showing that the heuristics H1 and H2 have the intended effect on benchmark problem qafiro.The results in Table 1 indicate that H1 and H2 have the intended effect on more benchmark problems but also that they are not effective on all problems.Part of the reason is that the rank of each update matrix is restricted to 2 and 4 respectively, but for some problems there are many components which limit the step size.For instance, H1 can at most take two of these and H2 can, at most, take as many as the maximum rank of the update.The results for problem yao is an example where H2 does worse than without heuristic.The heuristic replaced indices in U r which caused low quality in the search direction, indicating that it may be beneficial to also update the information that is suggested in Proposition 3.2.Numerical experiments have further shown small step sizes can be avoided by allowing update matrices of varying rank.In particular, update matrices where the rank is determined by the components which potentially limit the step size.However, numerical experiments have also shown that avoiding small step sizes is not sufficient to obtain increased convergence speed.Similarly as in the results of Gondizo and Sobral for quasi-Newton approaches [17], numerical experiments have shown that it is occasionally important to use the Jacobian at z k instead of zk to improve convergence.In light of this, we will 
limit the number of allowed steps before the modified Jacobian is refactorized. In consequence, the total rank change of an actual Jacobian will be limited. In the following numerical simulations, the modified Newton approaches mN-r(r), mN-r(r)-H1 and mN-r(r)-H2 include a refactorization strategy of the form (4.5), where U_k is given by (4.4) for index sets U_k corresponding to H1 and H2 for mN-r(r)-H1 and mN-r(r)-H2, and to U_r of Proposition 3.2 for mN-r(r), respectively. In general, other refactorization strategies or dynamic procedures may be considered, e.g., instead of accepting all steps, refactorize or increase the rank of the update matrix if a particular step is deemed bad for some reason.
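As an illustration, the counting logic behind such a refactorization strategy can be sketched as follows. This is only a schematic of the bookkeeping, under the assumption that at most l low-rank steps (each of rank r) are taken between factorizations; the class and attribute names are ours, not the paper's, and the actual strategy (4.5) also involves the update matrices U_k.

```python
# Hypothetical sketch: allow at most `l` low-rank updates between
# factorizations, so the total rank change stays bounded by l * r.
class RefactorizationCounter:
    def __init__(self, l):
        self.l = l                  # max low-rank steps between factorizations
        self.steps = 0              # low-rank steps since last factorization
        self.factorizations = 1     # the initial factorization of B0 counts as one

    def next_step(self):
        """Return the action for the next iteration."""
        if self.steps >= self.l:
            self.factorizations += 1
            self.steps = 0
            return "refactorize"
        self.steps += 1
        return "low-rank update"
```

With l = 3, a run of eight iterations performs three rank-r updates, refactorizes, and repeats, giving three factorizations in total (including the initial one).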

Numerical results
In this section we give results in the form of numbers of iterations and factorizations. The results are meant to give an initial indication of the performance of the proposed modified Newton approach in a basic interior-point framework. The results are for the methods Newton, mN-r(r), mN-r(r)-H1 and mN-r(r)-H2, with r = [2,16], described in Section 4. In essence, the methods differ in how the search direction is computed. The direction at iteration k satisfies (2.3) in Newton and (3.8) in the mN-methods. In contrast to Section 4, here the mN-methods also include the refactorization strategy described in (4.5). Due to the large variety in the number of inequality constraints and the number of variables in each benchmark problem, the parameter l of (4.5) was defined as the closest integer to l_S, l_M and l_L for p ∈ S, p ∈ M and p ∈ L respectively; see Table 2 for the specific values. The computational cost of a refactorization of the unreduced, reduced and condensed systems all depend on the sparsity structure given by the specific problem. We therefore chose l_S, l_M and l_L such that they relate to the full rank change that corresponds to a new factorization. The values of Table 2 were chosen such that a low-rank update is performed as long as the total rank change of an actual Jacobian is not larger than a factor of 1/2, 1/10 and 1/100 for the small, medium and large problems respectively. Moreover, the parameters of Algorithm 4.1 were chosen as follows: σ = 0.1, termination tolerance ε_tol = 10^{-6} for the small and medium sized problems and ε_tol = 10^{-5} for the large problems. In each run, the initial (x_0, λ_0, s_0) was found with Newton, with stopping criteria corresponding to the requirement on the initial solution in Algorithm 4.1.
Table 2: Refactorization parameters for the different problem sizes; m_in is the number of inequality constraints.
l_S = m_in/(2r)    l_M = m_in/(10r)    l_L = m_in/(100r)

Results are first shown for problems in the set S (Tables 3-4), thereafter for problems in M (Tables 5-7), and finally for problems in L (Tables 8-10). The results are for three different regions depending on µ_0, namely µ_0 = [1, 10^{-3}, 10^{-6}]. The intention is to illustrate the performance of the modified Newton approach both close to a solution and in a larger region where the theoretical results are not expected to hold. The results corresponding to r = 16 for problems in S are omitted due to the similarity of the performance caused by the refactorization strategy. In all tables the initial factorization of B_0 = F′(z_0) is counted as one factorization. In essence, "1" in the factorization column, F, means that no refactorization was performed. Moreover, "-" denotes that the method failed to converge within a maximum number of iterations. For each problem the maximum number of iterations was set to 10N, where N depends on the number of variables; see the Appendix for the value of N associated with each problem.
Table 3: Number of factorizations and iterations for problems in S with µ_0 = 1. (Columns: Newton, mN-r(2), mN-r(2)-H1, mN-r(2)-H2.)
Table 7: Number of factorizations and iterations for problems in M with µ_0 = 10^{-6}.
(Columns: Newton, mN-r(2), mN-r(2)-H1, mN-r(2)-H2, mN-r(16), mN-r(16)-H1, mN-r(16)-H2.) The results in Tables 3-10 indicate that the number of factorizations, compared to those done by Newton, may be reduced by instead performing low-rank updates, as with mN-r(r), r = [2,16]. The reduced number of factorizations is, however, often at the expense of performing additional iterations with low-rank updates. The total number of iterations and/or factorizations is for many problems, but not for all, further reduced with heuristics H1 and H2, as shown by the results corresponding to mN-r(r)-H1 and mN-r(r)-H2, r = [2,16]. This behavior is most significant in the simulations with larger values of µ, as shown in Table 3, Table 5 and Table 8. Comparing the number of iterations in the mN-methods gives an indication of whether the heuristics have been active on a specific problem. The results in Tables 3-10 show that the performance of the heuristics varies with each problem and, in addition, with µ_0. In particular, the results show that the heuristics are most effective, and hence more likely to be active, at larger values of µ. For smaller values of µ, the mN-methods show similar performance; see Table 7, Table 10 and the right-hand side of Table 4. This indicates that the heuristics are less likely to have been active for smaller values of µ. Consequently, the mN-methods are less likely to produce limiting steps at small values of µ on the benchmark problems. This observation is in line with the results of the theoretical sections. Overall, mN-r(2) fails to converge for two problems within the maximum number of iterations due to small step sizes. This is overcome with both H1 and H2, as shown by the corresponding results in Table 5.
Numerical experiments have further shown that decreasing the refactorization parameters of Table 2 decreases the number of iterations and increases the number of factorizations done by the mN-methods. In general, an increased rank of each update matrix reduces the overall number of iterations, but due to the refactorization strategy the methods are then required to refactorize more often.
Table 7, Table 10 and the right-hand side of Table 4 show that low-rank updates give convergence for small values of µ in many of the benchmark problems, even for update matrices of rank two on large-scale problems. Finally, we want to mention simplified Newton, i.e., mN-r(0) without a refactorization strategy. Our simulations showed that this approach was significantly less robust. It is not clear to us how to deduce a refactorization strategy for mN-r(0) that gives a fair comparison to the results of Tables 3-10. Therefore, we have omitted specific results.

Conclusion
In this work we have proposed and motivated a structured modified Newton approach for solving systems of nonlinear equations that arise in interior-point methods for quadratic programming. In essence, the Jacobian of each Newton system is modified to a previous Jacobian plus one low-rank update matrix per succeeding iteration. The modified Jacobian maintains the sparsity pattern of the Jacobian and may thus be viewed as a Jacobian evaluated at a different point. The approach may in consequence be interpreted in the framework of previous works on primal-dual interior-point methods, e.g., regarding effects of finite-precision arithmetic, stability, convergence and solution techniques.
Numerical simulations have shown that small step sizes can have negative effects on convergence with the modified Newton approach, especially at larger values of µ. In order to reduce these negative effects, we have constructed and motivated two heuristics. Further numerical simulations have shown that the two heuristics often increase the step sizes, but also that this is not always sufficient to improve convergence. We have therefore also suggested a refactorization strategy. The heuristics and refactorization strategy that we have proposed are merely options; the framework allows for different versions of these as well as other heuristics and/or strategies.
In addition, we have performed numerical simulations on a set of convex quadratic benchmark problems. The results indicated that the number of factorizations, compared to those of the Newton-based method, can be reduced, often at the expense of performing more iterations with low-rank updates. The total number of iterations and/or factorizations was for many problems, but not for all, further reduced with the two heuristics. Although the theoretical results hold in the asymptotic region as µ → 0, or when the rank of each update matrix is sufficiently large, we still obtain interesting numerical results for larger values of µ and update matrices of low rank.
Our work is meant to contribute to the theoretical and numerical understanding of modified Newton approaches for solving systems of nonlinear equations that arise in interior-point methods. We have laid a foundation that may be adapted and included in more sophisticated interior-point solvers, as well as contribute to the development of preconditioners. We have limited ourselves to a numerical study of the accuracy of the approaches in a high-level language. To get a full understanding of the practical performance, precise ways of solving the updated modified Newton systems would have to be investigated further.

A. Appendix
For completeness we state the Eckart-Young-Mirsky theorem.
Theorem A.1 (Eckart-Young-Mirsky theorem). Let A ∈ R^{m×n} be of rank r and denote its singular value decomposition by UΣV^T, where U ∈ R^{m×m}, V ∈ R^{n×n} are orthogonal matrices, and Σ = diag(σ) ∈ R^{m×n}, with σ_1 ≥ ⋯ ≥ σ_p ≥ 0, where p = min{m, n}. For a given q, 0 < q ≤ r, an optimal solution of

    minimize_{B : rank(B) = q} ‖A − B‖,

where ‖·‖ is either the 2-norm or the Frobenius norm, is B = Σ_{i=1}^{q} σ_i u_i v_i^T, with u_i and v_i as the ith columns of U and V respectively.
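The theorem can be checked numerically: truncating the SVD after q singular triplets leaves a 2-norm error of σ_{q+1} and a squared Frobenius-norm error equal to the sum of the squared discarded singular values. The sketch below uses an arbitrary random matrix, not data from the paper.

```python
# Numerical illustration of the Eckart-Young-Mirsky theorem (Theorem A.1).
import numpy as np

def best_rank_q(A, q):
    """Rank-q truncated SVD of A: sum of sigma_i * u_i * v_i^T for i <= q."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :q] @ np.diag(s[:q]) @ Vt[:q, :]

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))
q = 2
B = best_rank_q(A, q)
sigma = np.linalg.svd(A, compute_uv=False)

# 2-norm error equals the (q+1)th singular value; Frobenius-norm error
# equals the root of the sum of squares of the discarded singular values.
assert np.isclose(np.linalg.norm(A - B, 2), sigma[q])
assert np.isclose(np.linalg.norm(A - B, 'fro'), np.sqrt(np.sum(sigma[q:] ** 2)))
```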
The following lemma contains the singular value decomposition of ∆F′(∆z) given in (2.5).
Similarly, the right singular vectors are the set of orthonormal eigenvectors of (∆F′(∆z))^T ∆F′(∆z).

Algorithm 4.1, lines 10-11: µ_{k+1} ← σµ_k; end while.
In Algorithm 4.1 at iteration k, α_P^{max,k} and α_D^{max,k} are the maximum feasible step sizes for s_k along ∆s_k and for λ_k along ∆λ_k, respectively.
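Such maximum feasible step sizes are given by the standard ratio test: the largest α with s_k + α ∆s_k ≥ 0 is limited only by the components with ∆s_k < 0. A minimal sketch, with illustrative names not taken from the paper:

```python
# Ratio test for the maximum feasible step size along a direction,
# keeping the (elementwise positive) iterate nonnegative.
import numpy as np

def max_feasible_step(s, ds):
    """Largest alpha >= 0 with s + alpha * ds >= 0, for s > 0 elementwise."""
    neg = ds < 0
    if not np.any(neg):
        return np.inf          # no component decreases; any step is feasible
    return np.min(-s[neg] / ds[neg])

s = np.array([1.0, 2.0, 0.5])
ds = np.array([-2.0, 1.0, -0.25])
alpha = max_feasible_step(s, ds)   # limited by the first component: 0.5
```

The same test applied to λ_k along ∆λ_k gives α_D^{max,k}.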

Figure 1 :
Figure 1: The left-hand side shows step sizes and convergence on benchmark problem qafiro. The right-hand side shows the inverse of the limiting step sizes and the relative error in the search direction for negative components of the modified Newton direction in mN-r(2), at the iteration marked by the red circle.

Table 1 :
Average of the sum (α_P^k + α_D^k)/2 for a subset of the benchmark problems.

Table 4 :
Number of factorizations and iterations for problems in S with µ_0 = 10^{-3} to the left and µ_0 = 10^{-6} to the right.

Table 5 :
Number of factorizations and iterations for problems in M with µ_0 = 1.

Table 6 :
Number of factorizations and iterations for problems in M with µ_0 = 10^{-3}.

Table 8 :
Number of factorizations and iterations for problems in L with µ_0 = 1.

Table 9 :
Number of factorizations and iterations for problems in L with µ_0 = 10^{-3}.

Table 10 :
Number of factorizations and iterations for problems in L with µ_0 = 10^{-6}.