Implementation of Interior-point Methods for LP based on Krylov Subspace Iterative Solvers with Inner-iteration Preconditioning

We apply novel inner-iteration preconditioned Krylov subspace methods to the interior-point algorithm for linear programming (LP). The inner-iteration preconditioners recently proposed by Morikuni and Hayami enable us to overcome the severe ill-conditioning of the linear equations solved in the final phase of interior-point iterations. The Krylov subspace methods do not suffer from rank-deficiency, and therefore no preprocessing is necessary even if rows of the constraint matrix are not linearly independent. Based on these methods, a new recurrence for the interior-point iterates is proposed that omits one matrix-vector product at each step. Extensive numerical experiments are conducted over diverse instances of 138 LP problems including the Netlib, QAPLIB, Mittelmann, and Atomizer Basis Pursuit collections. The largest problem has 434,580 unknowns. Our implementation turns out to be more robust than the standard public domain solvers SeDuMi (Self-Dual Minimization), SDPT3 (Semidefinite Programming Toh-Todd-Tütüncü), and the LSMR iterative solver in PDCO (Primal-Dual Barrier Method for Convex Objectives), without increasing CPU time. The proposed interior-point method based on iterative solvers succeeds in solving a fairly large number of LP instances from benchmark libraries under the standard stopping criteria. The work also presents a fairly extensive benchmark test of several renowned solvers, both direct and iterative.


Introduction
Consider the linear programming (LP) problem in the standard primal-dual formulation

min_x c^T x subject to Ax = b, x ≥ 0,
max_{y,s} b^T y subject to A^T y + s = c, s ≥ 0,    (1)

where A ∈ R^{m×n}, m ≤ n, and we assume the existence of an optimal solution. In this paper, we describe an implementation of the interior-point method for LP based on iterative solvers.
The main computational task in one iteration of the interior-point method is the solution of a system of linear equations to compute the search direction. For this task, direct solvers are usually used, but some implementations employ iterative solvers. Iterative solvers are advantageous when the systems are large and sparse, or even when they are large and dense but the product of the coefficient matrix and a vector can be approximated cheaply, as in [11,64]. The difficulty with iterative solvers is that the linear system becomes notoriously ill-conditioned towards the end of the interior-point iterations. One approach is to precondition the mathematically equivalent indefinite augmented system (as in equation (5)), as in HOPDM (Higher Order Primal-Dual Method) [28] and also [12,25,26,7,57,6,60,3,2,32]. The other approach is to precondition the equivalent normal equations (as in equation (6)) [27,39,44,9,43,47,59,41,69,14].
In this paper, we treat the normal equations and apply novel inner-iteration preconditioned Krylov subspace methods to them. The inner-iteration preconditioners recently proposed by Morikuni and Hayami [53,54] enable us to deal with the severe ill-conditioning of the normal equations. Furthermore, the proposed Krylov subspace methods do not suffer from singularity and therefore no preprocessing is necessary even if A is rank-deficient.
The main contribution of the present paper is to show that the use of inner-iteration preconditioners enables the efficient interior-point solution of wide-ranging LP problems. We further propose combining a row-scaling scheme with the inner-outer iteration methods, where the row norms appear in the successive overrelaxation (SOR) inner iterations, to improve the conditioning of the system at each interior-point step. The linear systems are solved with a gradually tightened stopping tolerance. We also propose a new recurrence that omits one matrix-vector product at each interior-point step. These techniques reduce the CPU time.
Extensive numerical experiments were conducted over diverse instances of 127 LP problems taken from the standard benchmark libraries: the Netlib, QAPLIB, and Mittelmann collections. The largest problem has 434,580 unknowns. The proposed interior-point method is entirely based on iterative solvers and yet succeeds in solving a fairly large number of standard LP instances from the benchmark libraries under the standard stopping criteria. We are not aware of any analogous result in which LP instances of this scale were solved by relying solely on iterative solvers.
SeDuMi and SDPT3 are solvers for conic linear programming, including semidefinite programming (SDP) and second-order cone programming (SOCP). PDCO is for LP and convex quadratic programming (QP) and has an option to solve the system of linear equations with the Krylov subspace iterative method LSMR in addition to the direct method. MOSEK is considered one of the state-of-the-art solvers for LP.
As summarized in Table 1, our implementation was able to solve most instances, which is clearly superior to SeDuMi, SDPT3, PDCO-Direct, and PDCO-LSMR with comparable computation time, though it is still slower than MOSEK.
We also tested our solvers on different problems which arise in basis pursuit [11] where the coefficient matrix is much denser than the aforementioned standard benchmark problems.
We emphasize that there are many interesting topics to be further worked out based on this paper. There is still room for improvement regarding the iterative solvers as well as using more sophisticated methods for the interior-point iterations.
In the following, we introduce the interior-point method and review the iterative solvers previously used. We employ an infeasible primal-dual predictor-corrector interior-point method, one of the methods that evolved from the original primal-dual interior-point method [66,40,48,70] incorporating several innovative ideas, e.g., [72,44].
The following system is obtained by relaxing the complementarity condition (2c) to XSe = µe with µ > 0:

A^T y + s = c,  Ax = b,  XSe = µe,  (x, s) > 0.    (3)

The interior-point method solves the problem (1) by generating solutions to (3), with µ decreasing towards zero, so that (2) is satisfied within some tolerance level at the solution point.
The search direction at each infeasible interior-point step is obtained by solving the Newton equations

[ 0  A^T  I ] [∆x]   [r_d]
[ A   0   0 ] [∆y] = [r_p]    (4)
[ S   0   X ] [∆s]   [r_c]

where r_d := c − A^T y − s ∈ R^n is the residual of the dual problem, r_p := b − Ax ∈ R^m is the residual of the primal problem, r_c := −XSe + σµe, µ := x^T s/n is the duality measure, and σ ∈ [0, 1) is the centering parameter, which is dynamically chosen to govern the progress of the interior-point method. Once the kth iterate (x^(k), y^(k), s^(k)) is given and (4) is solved, we define the next iterate as (x^(k+1), y^(k+1), s^(k+1)) := (x^(k), y^(k), s^(k)) + α(∆x, ∆y, ∆s), where α ∈ (0, 1] is a step length to ensure the positivity of x and s, and then reduce µ to σµ before solving (4) again.
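As a sanity check, the block structure of (4) and its reduction to the condensed normal equations can be illustrated on a tiny instance. This is a hedged numpy sketch; the random data merely stands in for an actual interior-point iterate.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 6
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
c = rng.standard_normal(n)
x = rng.uniform(0.5, 2.0, n)      # current primal iterate, x > 0
s = rng.uniform(0.5, 2.0, n)      # current dual slack, s > 0
y = rng.standard_normal(m)
sigma, mu = 0.1, x @ s / n

# Residuals as defined in the text
r_d = c - A.T @ y - s
r_p = b - A @ x
r_c = -x * s + sigma * mu * np.ones(n)

# Newton system (4):  [0 A^T I; A 0 0; S 0 X] [dx; dy; ds] = [r_d; r_p; r_c]
X, S, I = np.diag(x), np.diag(s), np.eye(n)
K = np.block([
    [np.zeros((n, n)), A.T,              I],
    [A,                np.zeros((m, m)), np.zeros((m, n))],
    [S,                np.zeros((n, m)), X],
])
sol = np.linalg.solve(K, np.concatenate([r_d, r_p, r_c]))
dx, dy, ds = sol[:n], sol[n:n + m], sol[n + m:]

# Block elimination condenses (4) to the m x m normal equations (6):
#   A X S^{-1} A^T dy = r_p + A S^{-1} (X r_d - r_c)
M = A @ np.diag(x / s) @ A.T
rhs_ne = np.asarray(r_p + A @ ((x * r_d - r_c) / s))
```

Checking `M @ dy` against `rhs_ne` confirms that the ∆y component of the full Newton step solves the condensed system.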
At each iteration, the solution of (4) dominates the total CPU time. The choice of linear solver depends on how the matrix of (4) is arranged. Aside from solving the (m + 2n) × (m + 2n) system (4), one can solve its reduced equivalent form of size (m + n) × (m + n), or a more condensed equivalent form of size m × m, both of which are obtained by performing block Gaussian eliminations on (4). We are concerned in this paper with solving the third equivalent form (6). It is known that the matrix of (6) can become singular, or positive semidefinite but not definite, in the following cases. First, when A is rank-deficient, system (6) is singular. There exist presolving techniques that address this problem, see, e.g., [4,30]; however, they are not guaranteed to detect all dependent rows of A. Second, in late interior-point iterations, the diagonal matrix XS^{-1} has both very tiny and very large diagonal values as a result of convergence, so the matrix AXS^{-1}A^T may become numerically positive semidefinite. In particular, the situation becomes severe when primal degeneracy occurs at an optimal solution. One can refer to [33,73] for more detailed explanations.
Thus, when direct methods such as the Cholesky decomposition are applied to (6), some diagonal pivots encountered during the decomposition can be zero or negative, causing the algorithm to break down. Many direct methods adopt the strategy of replacing the problematic pivot with a very large number. See, e.g., [73] for the Cholesky-Infinity factorization, which is specially designed to solve (6) when it is positive semidefinite but not definite. Numerical experience [1,42,24,43,5,71,16] indicates that direct methods provide sufficiently accurate solutions for interior-point methods to converge regardless of the ill-conditioning of the matrix. However, as the LP problems become larger, the significant fill-in in the decompositions makes direct methods prohibitively expensive. It is stated in [31] that fill-in is observed even for very sparse matrices. Moreover, the matrix can be dense, as in QP in support vector machine training [23] or linear programming in basis pursuit [11], and even when A is sparse, AXS^{-1}A^T can be dense or have a pattern of nonzero elements that renders the system difficult for direct methods. The expensive solution of KKT systems is a common disadvantage of second-order methods, including interior-point methods.
These drawbacks of direct methods and the progress in preconditioning techniques have motivated researchers to develop stable iterative methods for solving (6) or, alternatively, (5). The major problem is that, as the interior-point iterations proceed, the condition number of the matrix XS^{-1} increases, making the system of linear equations intractable. One way to deal with this is to employ suitable preconditioners. Since our main focus is on solving (6), we explain preconditioners for (6) in detail in the following. We mention [12,25,26,7,57,6,60,3,2] as literature related to preconditioners for (5).
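The blow-up of the condition number is easy to reproduce. The sketch below uses synthetic iterates (the entries 10^{±8} and the choice of 3 "basic" variables are illustrative assumptions, not data from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 5, 10
A = rng.standard_normal((m, n))

def cond_normal_matrix(x, s):
    """Condition number of the normal-equations matrix A X S^{-1} A^T."""
    return np.linalg.cond(A @ np.diag(x / s) @ A.T)

# Early phase: all ratios x_i / s_i are of comparable size.
kappa_early = cond_normal_matrix(np.ones(n), np.ones(n))

# Late phase near a primal-degenerate vertex: only 3 (< m) variables stay
# away from their bound, so X S^{-1} mixes entries of order 1e8 and 1e-8
# and A X S^{-1} A^T becomes nearly singular.
basic = np.arange(n) < 3
x_late = np.where(basic, 1.0, 1e-8)
s_late = np.where(basic, 1e-8, 1.0)
kappa_late = cond_normal_matrix(x_late, s_late)
```

With fewer large ratios than rows, the dominant rank-3 part leaves directions of size ~1e-8, so `kappa_late` exceeds `kappa_early` by many orders of magnitude.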
For the iterative solution of (6), the conjugate gradient (CG) method [37] has been applied with diagonal scaling preconditioners [9,59,41] or incomplete Cholesky preconditioners [44,39,12,47]. LSQR with a preconditioner was used in [27]. A matrix-free method using CG for least squares (CGLS) preconditioned by a partial Cholesky decomposition was proposed in [32]. In [14], a preconditioner based on Greville's method [15] was applied to the generalized minimal residual (GMRES) method. Suitable preconditioners were also introduced for particular fields such as the minimum-cost network flow problem in [61,38,49,50]. One may refer to [17] for a review of the application of numerical linear algebra algorithms to the solution of KKT systems in the optimization context.
In this paper, we propose to solve (6) using Krylov subspace methods preconditioned by stationary inner iterations recently proposed for least squares problems in [36,53,54]. In Section 2, we briefly describe the framework of Mehrotra's predictor-corrector interior-point algorithm we implemented and the normal equations arising from this algorithm. In Section 3, we specify the application of our method to the normal equations. In Section 4, we present numerical results comparing our method with a modified sparse Cholesky method, three direct solvers in CVX, a major public package for specifying and solving convex programs [34,35], and the direct and iterative solvers in PDCO [64]. The test problems include the typical LP problems from the Netlib, QAPLIB, and Mittelmann collections in [19] and basis pursuit problems from the package Atomizer [10]. In Section 5, we conclude the paper. Throughout, we use bold lower case letters for column vectors. We denote quantities related to the kth interior-point iteration by a superscript with round brackets, e.g., x^(k), the kth iterate of Krylov subspace methods by a subscript without brackets, e.g., x_k, and the kth inner iteration by a superscript with angle brackets, e.g., x^⟨k⟩. R(A) denotes the range space of a matrix A. κ(A) denotes the condition number κ(A) = σ_1(A)/σ_r(A), where σ_1(A) and σ_r(A) denote the maximum and minimum nonzero singular values of A, respectively. K_k(A, b) = span{b, Ab, ..., A^{k−1}b} denotes the Krylov subspace of order k.

Interior-point algorithm and the normal equations
We implement an infeasible version of Mehrotra's predictor-corrector method [45], which has been established as a standard in this area [42,43,70,46]. Note that our method can be applied to other interior-point methods (see, e.g., [70] for more interior-point methods) whose directions are computed via the normal equations (6).

Mehrotra's predictor-corrector algorithm
In this method, the centering parameter σ is determined by dividing each step into two stages.
In the first stage, we solve for the affine direction (∆x af , ∆y af , ∆s af ) and measure its progress in reducing µ. If the affine direction makes large enough progress without violating the nonnegative boundary (2d), then σ is assigned a small value. Otherwise, σ is assigned a larger value to steer the iterate to be more centered in the strictly positive region.
In the second stage, we solve for the corrector direction (∆x_cc, ∆y_cc, ∆s_cc) from (8), where ∆X_af := diag(∆x_af), ∆S_af := diag(∆s_af), and σ is determined according to the solution of the first stage. Finally, we update the current iterate along the linear combination of the two directions. In our implementation of the interior-point method, we adopt Mehrotra's predictor-corrector algorithm as follows.
In line 13, we first compute trial step lengths α_p, α_d using equations (9) with (∆x, ∆s) = (∆x^(k), ∆s^(k)). Then, we gradually reduce α_p, α_d to find the largest step lengths that ensure the centrality of the updated iterates, i.e., to find the maximum ᾱ_p, ᾱ_d that satisfy the centrality condition, where φ is typically chosen as 10^{−5}.
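The ratio test behind the trial step lengths can be sketched as follows. This is an illustrative numpy helper; the back-off factor `eta` is an assumption of the sketch, and the additional centrality safeguard involving φ is omitted:

```python
import numpy as np

def max_step(v, dv, eta=0.99995):
    """Largest alpha <= 1 keeping v + alpha * dv componentwise positive.

    eta < 1 backs the step off the boundary; its value here is illustrative.
    """
    neg = dv < 0
    if not np.any(neg):
        return 1.0
    return min(1.0, eta * float(np.min(-v[neg] / dv[neg])))

x = np.array([1.0, 2.0, 0.5])
dx = np.array([-2.0, 1.0, -0.1])
alpha_p = max_step(x, dx)   # binding ratio: min(1.0/2.0, 0.5/0.1) = 0.5
```

The same helper applies to (s, ∆s) for the dual step length.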

The normal equations in the interior-point algorithm
We consider modifying Algorithm 1 so that it is not necessary to update y^(k). Since we assume the existence of an optimal solution to problem (1), we have b ∈ R(A). Let D := S^{−1/2}X^{1/2} and Ā := AD. Problem (6) with ∆w = Ā^T∆y (the normal equations of the second kind) is equivalent to

Ā∆w = f,  ∆w ∈ R(Ā^T).    (11)

In the predictor stage, the problem (7) is equivalent to first solving (11) for ∆w_af with ∆w = ∆w_af, f = f_af := b + AS^{−1}Xr_d, and then updating the remaining unknowns by

∆s_af := r_d − D^{−1}∆w_af,  ∆x_af := −x − S^{−1}X∆s_af.    (12)

In the corrector stage, the problem (8) is equivalent to first solving (11) for ∆w_cc with ∆w = ∆w_cc, f = f_cc := AS^{−1}∆X_af∆S_af e − σµAS^{−1}e, and then updating the remaining unknowns by

∆s_cc := −D^{−1}∆w_cc,  ∆x_cc := S^{−1}(σµe − ∆X_af∆S_af e) − S^{−1}X∆s_cc.    (13)

By solving (11) for ∆w instead of solving (6) for ∆y, we can compute ∆s_af, ∆x_af, ∆s_cc, and ∆x_cc, saving 1MV in (12a) and another in (13a) per interior-point iteration. Here, MV denotes the computational cost required for one matrix-vector multiplication.

Remark 2.1. For solving an interior-point step from the condensed step equation (6) using a suited Krylov subspace method, updating (x, w, s) rather than (x, y, s) can save 1MV per interior-point iteration.
Note that in the predictor and corrector stages, problem (11) has the same matrix but different right-hand sides. We introduce methods for solving it in the next section.
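The substitution ∆w = Ā^T∆y can be checked numerically: ∆w is exactly the minimum-norm solution of Ā∆w = f, which is what solvers working directly with ∆w produce, with no final multiplication by Ā^T. A small numpy check, with random data standing in for the scaled matrix AD and assuming f ∈ R(Ā):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 4, 9
A_bar = rng.standard_normal((m, n))   # stands in for the scaled matrix AD
f = A_bar @ rng.standard_normal(n)    # consistent right-hand side, f in R(A_bar)

# Normal equations of the second kind: solve A_bar A_bar^T dy = f ...
dy = np.linalg.solve(A_bar @ A_bar.T, f)
# ... then dw = A_bar^T dy is the minimum-norm solution of A_bar dw = f,
# which methods such as CGNE produce directly as a by-product.
dw = A_bar.T @ dy
dw_minnorm = np.linalg.lstsq(A_bar, f, rcond=None)[0]
```

`np.linalg.lstsq` returns the minimum-norm solution for the underdetermined system, so the two vectors agree.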

Application of inner-iteration preconditioned Krylov subspace methods
In lines 4 and 10 of Algorithm 1, the linear system (11) needs to be solved, with its matrix becoming increasingly ill-conditioned as the interior-point iterations proceed. In this section, we focus on applying inner-iteration preconditioned Krylov subspace methods to (11), because they are advantageous in dealing with ill-conditioned sparse matrices. The methods to be discussed are the preconditioned CG and MINRES methods [37,58] applied to the normal equations of the second kind ((P)CGNE and (P)MRNE, respectively) [13,54], and the right-preconditioned generalized minimal residual method (AB-GMRES) [36,54]. Consider solving the linear system Ax = b, where A ∈ R^{n×n}. First, the conjugate gradient (CG) method [37] is an iterative method for such problems when A is a symmetric and positive (semi)definite matrix and b ∈ R(A). CG starts with an initial approximate solution x_0 ∈ R^n and determines the kth iterate x_k ∈ R^n by minimizing the error ‖x_* − x‖_A over x_0 + K_k(A, r_0), where r_0 := b − Ax_0 and x_* is a solution. Second, the minimal residual (MINRES) method [58] is another iterative method for solving such problems but only requires A to be symmetric. MINRES with x_0 determines the kth iterate x_k by minimizing ‖b − Ax‖_2 over the same space as CG.
Third, the generalized minimal residual (GMRES) method [63] only requires A to be square. GMRES with x_0 determines the kth iterate x_k by minimizing ‖b − Ax‖_2 over x_0 + K_k(A, r_0).
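For reference, the GMRES iteration just described can be written compactly. This is a plain textbook sketch (dense numpy, no preconditioning, no restarts), not the implementation used in the experiments:

```python
import numpy as np

def gmres(A, b, max_iter=50, tol=1e-10):
    """Minimal full (non-restarted) GMRES with modified Gram-Schmidt.

    Minimizes ||b - A x||_2 over x0 + K_k(A, r0), here with x0 = 0.
    """
    n = b.size
    beta = np.linalg.norm(b)
    if beta == 0:
        return np.zeros(n)
    V = np.zeros((n, max_iter + 1))           # Arnoldi basis vectors
    H = np.zeros((max_iter + 1, max_iter))    # Hessenberg matrix
    V[:, 0] = b / beta
    x = np.zeros(n)
    for k in range(max_iter):
        u = A @ V[:, k]
        for i in range(k + 1):                # modified Gram-Schmidt
            H[i, k] = u @ V[:, i]
            u -= H[i, k] * V[:, i]
        H[k + 1, k] = np.linalg.norm(u)
        # Small least squares problem min ||beta * e1 - H y||_2
        e1 = np.zeros(k + 2); e1[0] = beta
        y, *_ = np.linalg.lstsq(H[:k + 2, :k + 1], e1, rcond=None)
        x = V[:, :k + 1] @ y
        res = np.linalg.norm(e1 - H[:k + 2, :k + 1] @ y)
        if res <= tol * beta or H[k + 1, k] <= tol * beta:
            break                             # converged or happy breakdown
        V[:, k + 1] = u / H[k + 1, k]
    return x
```

Storing V and H is what makes GMRES memory-hungry as k grows, a point revisited in Section 4.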

Application of inner-iteration preconditioned CGNE and MRNE methods
We first introduce CGNE and MRNE. In the predictor stage, let the generic system Ax = b stand for ĀĀ^T∆y_af = f_af with ∆w_af = Ā^T∆y_af; in the corrector stage, let it stand for ĀĀ^T∆y_cc = f_cc with ∆w_cc = Ā^T∆y_cc. CG and MINRES applied to these systems are CGNE and MRNE, respectively. With these settings, let the initial solution satisfy ∆w_0 ∈ R(Ā^T) in both stages, and denote the initial residual by g_0 := f − Ā∆w_0. CGNE and MRNE can solve (11) without forming ĀĀ^T explicitly.
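A bare (unpreconditioned) CGNE sketch shows how (11) can be solved without ever forming ĀĀ^T: each iteration touches the matrix only through one product with Ā and one with Ā^T. This is an illustrative dense implementation, not the paper's code:

```python
import numpy as np

def cgne(A, b, max_iter=100, tol=1e-10):
    """CGNE: CG on A A^T z = b with x = A^T z, without forming A A^T.

    For a consistent system, returns the minimum-norm solution of A x = b.
    """
    x = np.zeros(A.shape[1])
    r = b.copy()                    # residual b - A x
    p = A.T @ r                     # search direction, kept in R(A^T)
    rho = r @ r
    for _ in range(max_iter):
        if np.sqrt(rho) <= tol * np.linalg.norm(b):
            break
        alpha = rho / (p @ p)
        x += alpha * p
        r -= alpha * (A @ p)
        rho_new = r @ r
        p = A.T @ r + (rho_new / rho) * p
        rho = rho_new
    return x
```

Because x_0 = 0 and every direction p lies in R(A^T), the converged iterate is the minimum-norm solution, matching the requirement ∆w ∈ R(Ā^T) in (11).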
Concretely, CGNE gives the kth iterate ∆w_k ∈ ∆w_0 + Ā^T K_k(ĀĀ^T, g_0) that minimizes the error ‖∆w_* − ∆w_k‖_2, and MRNE gives the kth iterate in the same space that minimizes the residual ‖f − Ā∆w_k‖_2. We use inner-iteration preconditioning for the CGNE and MRNE methods. The following is a brief summary of the part of [54] where the inner-outer iteration method is analyzed. We give the expressions for the inner-iteration preconditioning and preconditioned matrices to state the conditions under which the former is SPD. Let M be a symmetric nonsingular splitting matrix of ĀĀ^T such that ĀĀ^T = M − N. Denote the inner-iteration matrix by H = M^{−1}N. The inner-iteration preconditioning and preconditioned matrices are C^(ℓ) := Σ_{i=0}^{ℓ−1} H^i M^{−1} and C^(ℓ)ĀĀ^T, respectively, where ℓ is the number of inner iterations.

Algorithm 2 CGNE method preconditioned by inner iterations.
1: Let ∆w_0 be the initial approximate solution, and g_0 := f − Ā∆w_0.
2: Apply ℓ steps of a stationary iterative method to ĀĀ^T z = g_0, u = Ā^T z to obtain z_0 := C^(ℓ)g_0 and u_0 := Ā^T z_0, and let p_0 := u_0.
3: for k = 0, 1, 2, ... until convergence do
4:   α_k := (z_k, g_k)/(p_k, p_k)
5:   ∆w_{k+1} := ∆w_k + α_k p_k, g_{k+1} := g_k − α_k Āp_k
6:   Apply ℓ steps of a stationary iterative method to ĀĀ^T z = g_{k+1} to obtain z_{k+1} := C^(ℓ)g_{k+1} and u_{k+1} := Ā^T z_{k+1}.
7:   β_k := (z_{k+1}, g_{k+1})/(z_k, g_k), p_{k+1} := u_{k+1} + β_k p_k
8: end for

Application of inner-iteration preconditioned AB-GMRES method
Next, we introduce AB-GMRES. GMRES can solve a square linear system transformed from the rectangular system Ā∆w_af = f_af in the predictor stage and Ā∆w_cc = f_cc in the corrector stage by using a rectangular right-preconditioning matrix, which does not necessarily have to be Ā^T. Let B ∈ R^{n×m} be a preconditioning matrix for Ā. Then, AB-GMRES corresponds to GMRES [63] applied to

min_u ‖f − ĀB u‖_2,  ∆w = B u,

which is equivalent to the minimum-norm solution of problem (11). Specifically, we apply AB-GMRES preconditioned by inner iterations [53,54] to (11). This method was shown to outperform previous methods on ill-conditioned and rank-deficient problems. We give expressions for the inner-iteration preconditioning and preconditioned matrices. Let M be a nonsingular splitting matrix such that ĀĀ^T = M − N, and let H := M^{−1}N; then ℓ inner iterations implicitly apply the preconditioning matrix B = Ā^T C^(ℓ) with C^(ℓ) := Σ_{i=0}^{ℓ−1} H^i M^{−1}, and the preconditioned matrix is ĀB = ĀĀ^T C^(ℓ).

Algorithm 4 AB-GMRES method preconditioned by inner iterations.
1: Let ∆w_0 ∈ R^n be the initial approximate solution, and g_0 := f − Ā∆w_0.
2: β := ‖g_0‖_2, v_1 := g_0/β
3: for k = 1, 2, ... until convergence do
4:   Apply ℓ steps of a stationary iterative method to ĀĀ^T p = v_k, z = Ā^T p to obtain z_k := Bv_k.
5:   u_k := Āz_k
6:   for i = 1, 2, ..., k do
7:     h_{i,k} := (u_k, v_i), u_k := u_k − h_{i,k}v_i
8:   end for
9:   h_{k+1,k} := ‖u_k‖_2, v_{k+1} := u_k/h_{k+1,k}
10: end for
11: y_k := arg min_{y∈R^k} ‖βe_1 − H̄_k y‖_2, where e_1 is the first column of the identity matrix and H̄_k := {h_{i,j}} ∈ R^{(k+1)×k}.
12: Apply ℓ steps of a stationary iterative method to ĀĀ^T p = V_k y_k, z = Ā^T p, where V_k := [v_1, v_2, ..., v_k], to obtain ∆w_k := ∆w_0 + BV_k y_k.

Note that the left-preconditioned generalized minimal residual method (BA-GMRES) [36,53,54] can be applied to solve the corrector stage problem, which can be written as the normal equations of the first kind ĀĀ^T∆y_cc = Ā(SX)^{−1/2}(∆X_af∆S_af e − σµe), or equivalently min_{∆y∈R^m} ‖(SX)^{−1/2}(∆X_af∆S_af e − σµe) − Ā^T∆y‖_2. In fact, this formulation was adopted in [31] and solved by the CGLS method preconditioned by a partial Cholesky decomposition that works in m-dimensional space. BA-GMRES also works in m-dimensional space.
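The outer scheme can be condensed into a short sketch in which the preconditioner is any callable, e.g. multiplication by Ā^T or a few NE-SOR sweeps. This is a hedged dense-numpy illustration; for simplicity the vectors z_k are stored, rather than recomputed at the end by a final inner-iteration application as in the memory-saving variant:

```python
import numpy as np

def ab_gmres(A, B_apply, f, max_iter=50, tol=1e-10):
    """AB-GMRES: GMRES on (A B) u = f with dw = B u, for rectangular A.

    B_apply(v) applies a right-preconditioner B in R^{n x m}; it may be
    any map, e.g. v -> A.T @ v, or a few inner-iteration sweeps.
    """
    m = f.size
    beta = np.linalg.norm(f)
    if beta == 0:
        return np.zeros(A.shape[1])
    V = np.zeros((m, max_iter + 1))
    Z = np.zeros((A.shape[1], max_iter))   # z_k = B v_k, stored for the update
    H = np.zeros((max_iter + 1, max_iter))
    V[:, 0] = f / beta
    dw = np.zeros(A.shape[1])
    for k in range(max_iter):
        Z[:, k] = B_apply(V[:, k])
        u = A @ Z[:, k]
        for i in range(k + 1):             # modified Gram-Schmidt
            H[i, k] = u @ V[:, i]
            u -= H[i, k] * V[:, i]
        H[k + 1, k] = np.linalg.norm(u)
        e1 = np.zeros(k + 2); e1[0] = beta
        y, *_ = np.linalg.lstsq(H[:k + 2, :k + 1], e1, rcond=None)
        dw = Z[:, :k + 1] @ y
        if np.linalg.norm(e1 - H[:k + 2, :k + 1] @ y) <= tol * beta or H[k + 1, k] == 0:
            break
        V[:, k + 1] = u / H[k + 1, k]
    return dw
```

With B_apply(v) = Ā^T v the iterates stay in R(Ā^T), so a consistent system is solved to its minimum-norm solution.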
The advantage of the inner-iteration preconditioning methods is that we can avoid explicitly computing and storing the preconditioning matrices for A in (11). We present efficient algorithms for specific inner iterations in the next section.

SSOR inner iterations for preconditioning the CGNE and MRNE methods
The inner-iteration preconditioned CGNE and MRNE methods require a symmetric preconditioning matrix. This is achieved by SSOR inner-iteration preconditioning, which works on the normal equations of the second kind ĀĀ^T z = g, u = Ā^T z, and whose preconditioning matrix C^(ℓ) is SPD for ℓ odd and ω ∈ (0, 2) [51,52, Theorem 2.8]. This method realizes a symmetric splitting matrix by combining the forward updates i = 1, 2, ..., m (as in lines 3-6 of Algorithm 6) with the reverse updates i = m, m−1, ..., 1, and can be efficiently implemented as the NE-SSOR method [62], [54, Algorithm D.8]. See [8], where SSOR preconditioning for CGNE with ℓ = 1 is proposed. Let α_i^T be the ith row vector of Ā. Algorithm 5 shows the NE-SSOR method.
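One NE-SSOR application (a forward sweep followed by a reverse sweep) can be sketched as below; the dense loops and the default ω = 1 are purely illustrative:

```python
import numpy as np

def ne_ssor(A, g, n_sweeps=1, omega=1.0):
    """Approximate u = A^T z with A A^T z = g by SSOR row sweeps (NE-SSOR).

    Each relaxation touches only one row alpha_i of A, so A A^T is never
    formed; a forward sweep (i = 1..m) is followed by a reverse sweep.
    """
    m, n = A.shape
    row_sq = np.einsum('ij,ij->i', A, A)       # ||alpha_i||^2 per row
    u = np.zeros(n)
    z = np.zeros(m)
    for _ in range(n_sweeps):
        for i in list(range(m)) + list(range(m - 1, -1, -1)):
            d = omega * (g[i] - A[i] @ u) / row_sq[i]
            z[i] += d
            u += d * A[i]
    return u, z
```

With many sweeps this converges to the exact solution; as a preconditioner, only a small number of sweeps ℓ is applied per outer iteration.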

SOR inner iterations for preconditioning the AB-GMRES method
Next, we introduce the SOR method applied to the normal equations of the second kind ĀĀ^T p = g, z = Ā^T p, with g = v_k or q_k as used in Algorithm 4. If the relaxation parameter ω satisfies ω ∈ (0, 2), then the iteration matrix H of this method is semiconvergent, i.e., lim_{i→∞} H^i exists [20]. An efficient algorithm for this method is called NE-SOR and is given as follows [62], [54, Algorithm D.7].
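One NE-SOR application (ℓ forward sweeps) can be sketched as follows; note that it runs unchanged when A has dependent rows, in line with the semiconvergence property. The dense loops are for illustration only:

```python
import numpy as np

def ne_sor(A, g, n_sweeps=1, omega=1.0):
    """Forward SOR sweeps on A A^T p = g, z = A^T p (NE-SOR).

    Row i is relaxed as z += omega * (g_i - alpha_i^T z) / ||alpha_i||^2 * alpha_i,
    so A A^T is never formed explicitly.  Rows of A must be nonzero.
    """
    m, n = A.shape
    row_sq = np.einsum('ij,ij->i', A, A)       # ||alpha_i||^2 per row
    z = np.zeros(n)
    for _ in range(n_sweeps):
        for i in range(m):
            z += (omega * (g[i] - A[i] @ z) / row_sq[i]) * A[i]
    return z
```

For a consistent system the sweeps converge even when A is rank-deficient, mirroring the fact that no preprocessing of dependent rows is needed.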
When Algorithm 6 is applied to lines 4 and 12 of Algorithm 4, the normal equations of the second kind are solved approximately.
Since the rows of A are required in the NE-(S)SOR iterations, it would be more efficient if A is stored row-wise.

Numerical experiments
In this section, we compare the performance of the interior-point method based on the iterative solvers with the standard interior-point programs. We also developed an efficient direct solver coded in C to compare with the iterative solvers. For the sake of completeness, we briefly describe our direct solver first.

Direct solver for the normal equations
To deal with rank-deficiency, we used a strategy similar to the Cholesky-Infinity modification scheme introduced in the LIPSOL solver [73]. However, instead of penalizing the pivot elements that are close to zero, we removed them and solved the reduced system. We implemented this modification via an LDL^T decomposition. We used the Matlab built-in function chol to detect whether the matrix is symmetric positive definite. We used ldlchol from the CSparse package version 3.1.0 [18] when the matrix was symmetric positive definite, and we turned to the Matlab built-in solver ldl, which uses MA57 [22], for the semidefinite cases.
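The pivot-dropping idea can be sketched in a few lines. This toy version greedily keeps only the rows/columns whose factorization pivot stays above a tolerance and solves the reduced SPD system; it is a hypothetical dense illustration, far from the sparse ldlchol/MA57 path actually used:

```python
import numpy as np

def solve_dropping_pivots(M, b, tol=1e-12):
    """Solve M x = b for symmetric positive semidefinite M by dropping the
    rows/columns whose pivot falls below tol; dropped entries of x are 0.

    For a consistent right-hand side (b in R(M)), the returned x satisfies
    M x = b, because the dropped rows are linear combinations of kept ones.
    """
    n = M.shape[0]
    keep = []
    for j in range(n):
        trial = keep + [j]
        try:
            L = np.linalg.cholesky(M[np.ix_(trial, trial)])
            if L[-1, -1] ** 2 > tol:    # pivot d_jj of the LDL^T factorization
                keep = trial
        except np.linalg.LinAlgError:
            pass                         # not positive definite: drop column j
    x = np.zeros(n)
    x[keep] = np.linalg.solve(M[np.ix_(keep, keep)], b[keep])
    return x
```

The incremental re-factorization makes this O(n^4) and is only meant to show the dropping logic, not an efficient implementation.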

Implementation specifications
In this section, we describe our numerical experiments. The initial solution for the interior-point method was set using the method described in LIPSOL solver [73]. The initial solution for the Krylov subspace iterations and the inner iterations was set to zero.
We set the maximum number of interior-point iterations to 99 and the stopping criterion regarding the error measure to Γ^(k) ≤ ǫ_out = 10^{−8}, where Γ^(k) is defined by (10). For the iterative solvers for the linear system (11), we set the maximum number of iterations for CGNE, MRNE, and AB-GMRES to m, and relaxed it to 40,000 for some difficult problems for CGNE and MRNE. We set a stopping criterion on the scaled residual with tolerance ǫ_in, where ǫ_in is initially 10^{−6} and is kept in the range [10^{−14}, 10^{−4}] during the process. We adjusted ǫ_in according to the progress of the interior-point iterations, measured by the error measure Γ^(k): we truncated the iterative solves prematurely in the early interior-point iterations, and pursued a more precise direction as the LP solution was approached. For steps where the iterative solvers failed to converge within the maximum number of iterations, we adopted the iterative solution with the minimum residual norm and slightly increased the value of ǫ_in, multiplying it by 1.5 for use in the next interior-point step.
Note that preliminary experiments were conducted with the tolerance fixed for all the problems. However, further experiments showed that adjusting the parameter ǫ_in with the progress towards an optimal solution worked better. This is another advantage of using iterative solvers rather than direct solvers.
We adopted the implementation of AB-GMRES preconditioned by NE-SOR inner iterations [55] with the additional row-scaling scheme (Section 3.5). No restarts were used for the AB-GMRES method. The non-breakdown conditions discussed in Sections 3.1 and 3.2 are satisfied.
For the direct solver, the tolerance for dropping pivot elements close to zero was 10^{−16} for most of the problems; for some problems this tolerance had to be increased to 10^{−6} to overcome breakdown.
The experiment was conducted on a MacBook Pro with a 2.6 GHz Intel Core i5 processor with 8 GB of random-access memory, OS X El Capitan version 10.11.2. The interior-point method was coded in Matlab R2014b and the iterative solvers including AB-GMRES (NE-SOR), CGNE (NE-SSOR), and MRNE (NE-SSOR), were coded in C and compiled as Matlab Executable (MEX) files accelerated with Basic Linear Algebra Subprograms (BLAS).
We compared our implementation with PDCO version 2013 [64] and three solvers available in CVX [34,35]: SDPT3 version 4.0 [67,68], SeDuMi version 1.34 [67], and MOSEK version 7.1.0.12 [56], with the default interior-point stopping criterion (18). Note that SDPT3, SeDuMi, and PDCO are non-commercial public domain solvers, whereas MOSEK is a commercial solver known as one of the state-of-the-art solvers. PDCO provides several choices of solver for the interior-point steps, among which we chose the direct (Cholesky) method and the LSMR method. Although the MINRES solver is another iterative solver available in PDCO, its homepage [64] suggests that LSMR performs better in general; thus, we tested with LSMR. For the PDCO parameters, we chose to suppress scaling of the original problem. The other solvers were run through the CVX Matlab interface, and we recorded the CPU time reported in the screen output of each solver. However, it usually took longer for CVX to finish the whole process, and the larger the problem, the more apparent this extra CPU time became. For example, for problem ken_18, the screen output of SeDuMi reported 765.3 seconds while the total processing time was 7,615.2 seconds.
We tested on two classes of LP problems: 127 typical problems from the benchmark libraries and 13 problems arising from basis pursuit. The results are described in Section 4.3 and Section 4.4, respectively.

Typical LP problems: sparse and ill-conditioned problems
We tested 127 typical LP problems from the Netlib, QAPLIB, and Mittelmann collections in [19]. Most of the problems have a sparse and full-rank constraint matrix A (the exceptions are bore3d and cycle). Problems with bound constraints l ≤ x ≤ u with l ≠ 0 or u ≠ ∞ were transformed using the approach in LIPSOL [73].
The overall summary of the numerical experiments on the 127 typical problems is given in Table 1. The counts in column "Failed" include the cases where a problem was solved only at a relaxed tolerance (phrased as "inaccurately solved" in CVX). Column "Expensive" refers to the cases where the interior-point iterations exceeded a time limit of 20 hours.
MOSEK was the most stable in the sense that it solved all 127 problems, and MRNE (NE-SSOR) came next with only two failures, on the Netlib problems greenbea and greenbeb. The CGNE (NE-SSOR) method solved almost all the problems that MRNE (NE-SSOR) solved, except for the largest QAPLIB problem, which was solved to a slightly larger tolerance level of 10^{−7}. AB-GMRES (NE-SOR) was also very stable and solved the problems accurately enough. However, it took longer than 20 hours for two problems that have 105,127 and 16,675 equations, respectively, although it succeeded in solving larger problems such as pds-80. The other solvers were less stable. The modified Cholesky solver and PDCO (Direct) solved 92% and 87% of the problems, respectively, although they were faster than the other solvers on the problems that they could solve successfully. PDCO (LSMR) solved 69% of the problems and was slower than the proposed solvers; the reason could be that it does not use preconditioners. SDPT3 solved 60% and SeDuMi 82% of the problems. Here we should mention that SeDuMi and SDPT3 are designed for LP, SDP, and SOCP, while our code is (currently) tuned solely for LP.
Note that MOSEK solver uses a multi-corrector interior-point method [29] while our implementation is a single corrector (i.e., predictor-corrector) method. This led to different numbers of interior-point iterations as shown in the tables. Thus, there is still room for improvement in the efficiency of our solver based on iterative solvers if a more elaborately tuned interior-point framework such as the one in MOSEK is adopted.
In order to show the trends of performance, we use the Dolan-Moré performance profiles [21] in Figures 1 and 2, with π(τ) := P(log_2 r_{ps} ≤ τ) the proportion of problems for which the log_2-scaled performance ratio is at most τ, where r_{ps} := t_{ps}/t_p^*, t_{ps} is the CPU time for solver s to solve problem p, and t_p^* is the minimal CPU time for problem p. Figure 1 includes the commercial solver MOSEK while Figure 2 does not. Note that Figure 2 is not generated by simply removing the curve of MOSEK from Figure 1, but rather by removing the profile of MOSEK from the comparison dataset, which changes the minimum CPU time for each problem. The comparison indicates that the iterative solvers, although slower than the commercial solver MOSEK in some cases, were often able to solve the problems to the designated accuracy.
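The profile curves π(τ) are straightforward to compute from a table of timings; a hypothetical numpy sketch, with failures encoded as np.inf:

```python
import numpy as np

def performance_profile(T, tau):
    """Dolan-More performance profile.

    T[p, s] is the CPU time of solver s on problem p (np.inf on failure).
    Returns, for each solver s, the fraction pi_s(tau) of problems with
    log2(T[p, s] / min_s' T[p, s']) <= tau.
    """
    T = np.asarray(T, dtype=float)
    best = T.min(axis=1, keepdims=True)        # t_p^* per problem
    ratio = np.log2(T / best)                  # log2-scaled performance ratio
    return (ratio <= tau).mean(axis=0)

# Three problems, two solvers; solver 0 fails on the last problem.
T = np.array([[1.0, 2.0],
              [4.0, 1.0],
              [np.inf, 3.0]])
profile_at_1 = performance_profile(T, 1.0)  # fraction solved within 2x of best
```

At τ = 0 the profile gives the fraction of problems on which each solver is fastest; as τ grows it approaches the fraction of problems solved at all.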
In Tables 2, 3, and 4, we give the following information: 1. the name of the problem and the size (m, n) of the constraint matrix, 2. the number of interior-point iterations required for convergence, 3. the CPU time for the entire computation in seconds. For the cases shorter than 3,000 seconds, the CPU time is taken as an average over 10 measurements. In each row, we indicate in red boldface and blue underline the fastest and second fastest solvers in CPU time, respectively.
Besides these statistics, we also use the following notation: †: inaccurately solved, i.e., the value of ǫ_out was relaxed to a larger level; in the column "Iter", we provide extra information †a at the stopping point: for our solvers, a = ⌊log_10 Γ^(k)⌋, where ⌊·⌋ is the floor function; for the CVX solvers, a = ⌊log_10 µ⌋ as provided in the CVX output; the PDCO solvers do not provide this information, so no value is given; f: the interior-point iterations diverged; t: the iterations took longer than 20 hours.
Note that all zero rows and columns of the constraint matrix A were removed beforehand. The problems marked with # have a rank-deficient A even after this preprocessing. For these problems we put rank(A) in brackets after m, computed using the Matlab function sprank.
In order to give an idea of the typical differences between the methods, we present the interior-point convergence curves for problem ken_13. The problem has a constraint matrix A ∈ R^{28,632×42,659} with full row rank and 97,246 nonzero elements.
Different aspects of the performance of the four solvers are displayed in Figure 3. The red dotted line with diamond markers represents the quantity related to AB-GMRES (NE-SOR), the blue with downward-pointing triangles CGNE (NE-SSOR), the yellow with asterisks MRNE (NE-SSOR), and the dark green with plus signs the modified Cholesky solver. Note that for this problem ken_13, the modified Cholesky solver became numerically inaccurate at the last step and broke down if the default dropping tolerance was used; thus, we increased the tolerance to 10^{−6}. Figure 3a shows κ(ĀĀ^T) in log_10 scale. It verifies the claim that the least squares problem becomes increasingly ill-conditioned in the final steps of the interior-point process: κ(ĀĀ^T) started from around 10^{20} and increased to 10^{80} in the last 3-5 steps. Figure 3b shows the convergence curve of the duality measure µ in log_10 scale. The measure µ drops below the tolerance and the stopping criterion is satisfied. Although it is not shown in the figure, we found that the interior-point method with modified Cholesky with the default dropping tolerance of 10^{−16} stagnated at µ ≃ 10^{−4}. Comparing with Figure 3a, it is observed that the solvers started to behave differently as κ(ĀĀ^T) increased sharply. Figures 3c and 3d show the relative residual norms ‖f_af − ĀĀ^T∆y_af‖_2/‖f_af‖_2 in the predictor stage and ‖f_cc − ĀĀ^T∆y_cc‖_2/‖f_cc‖_2 in the corrector stage, respectively, in log_10 scale. The relative residual norm for modified Cholesky tended to increase with the interior-point iterations and increased sharply in the final phase, when it lost accuracy in solving the normal equations for the steps. We observed similar trends for other test problems and, in the worst cases, the inaccuracy in the solutions prevented interior-point convergence.
Among the iterative solvers, AB-GMRES (NE-SOR) and MRNE (NE-SSOR) were the most stable in maintaining the accuracy of the solutions to the normal equations; CGNE (NE-SSOR) performed similarly but lost numerical accuracy in the last few interior-point steps. Figures 3e and 3f show the CPU time and the number of iterations of the Krylov subspace methods at each interior-point step, respectively. The CPU time of the modified Cholesky solver was more evenly distributed over the whole process, while that of the iterative solvers tended to be smaller in the beginning and ending phases. In the final stage, AB-GMRES (NE-SOR) required the fewest iterations but cost much more CPU time than the other two iterative solvers. This can be explained as follows: AB-GMRES (NE-SOR) requires increasingly more CPU time and memory as the number of iterations grows, because it has to store the orthonormal basis vectors of the modified Gram-Schmidt process as well as the Hessenberg matrix. In contrast, the CGNE (NE-SSOR)- and MRNE (NE-SSOR)-based methods require constant memory. CGNE (NE-SSOR) took more iterations and CPU time than MRNE (NE-SSOR). Apart from A and the preconditioner, the memory required for k iterations of AB-GMRES is O(k^2 + km + n), whereas that for CGNE and MRNE is O(m + n) [36,54]. This explains why AB-GMRES (NE-SOR), although requiring fewer iterations, usually takes longer to obtain the solution at each interior-point step. We also experimented with restarting AB-GMRES on a few problems, but the performance was not competitive with the non-restarted version.
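The memory comparison above can be made concrete with a small accounting sketch. This is illustrative only: the asymptotic terms follow the O(k^2 + km + n) and O(m + n) bounds quoted above, but the constant factors and function names are our own assumptions, not the paper's implementation.

```python
def ab_gmres_memory_floats(k, m, n):
    # AB-GMRES after k iterations: k + 1 orthonormal basis vectors of length m
    # (kept for modified Gram-Schmidt), a (k+1) x k upper Hessenberg matrix,
    # and an n-vector for the preconditioned direction -> O(k^2 + km + n).
    return (k + 1) * m + (k + 1) * k + n

def cg_type_memory_floats(m, n):
    # CGNE/MRNE keep a fixed handful of work vectors regardless of the
    # iteration count k -> O(m + n); the factor 6 is an assumed constant.
    return 6 * (m + n)
```

For a problem of the size of ken_13 (m = 28,632, n = 42,659), the AB-GMRES footprint grows with every iteration while the CG-type footprint is fixed, which is consistent with the observed growth of CPU time and memory with the iteration count.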
On the other hand, the motivation for using AB-GMRES (NE-SOR) is that GMRES is more robust for ill-conditioned problems than the symmetric solvers CG and MINRES. This is because GMRES uses the modified Gram-Schmidt process to orthogonalize the basis vectors explicitly, whereas CG and MINRES rely on short recurrences, in which the orthogonality of the vectors may be lost due to rounding errors. Moreover, GMRES allows non-symmetric preconditioning, while the symmetric solvers require symmetric preconditioning. For example, one SOR inner iteration is cheaper than one SSOR inner iteration because the latter performs both a forward and a backward sweep: SOR requires 2MV + 3m operations per inner iteration, while SSOR requires 4MV + 6m. In this sense, GMRES offers more freedom in the choice of preconditioners.
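As a minimal sketch of where the 2MV cost of one SOR sweep comes from, the following dense NumPy implementation of a single forward NE-SOR sweep on AA^T y = f keeps the invariant u = A^T y up to date so that AA^T is never formed: the row dot products a_i^T u together amount to one matrix-vector product, and the rank-one updates u += d a_i to another. The function name and the dense data layout are our own; the paper's implementation works on sparse A. An SSOR sweep would repeat the loop in reverse row order, doubling the cost.

```python
import numpy as np

def ne_sor_sweep(A, f, y, omega=1.0):
    """One forward SOR sweep on A A^T y = f, without forming A A^T."""
    u = A.T @ y                        # invariant: u = A^T y
    for i in range(A.shape[0]):
        a_i = A[i]
        # The SOR diagonal entry is (A A^T)_ii = ||a_i||^2.
        d = omega * (f[i] - a_i @ u) / (a_i @ a_i)
        y[i] += d
        u += d * a_i                   # restore the invariant u = A^T y
    return y
```

Repeating such sweeps from a zero initial guess yields the inner-iteration preconditioner; for A with full row rank and 0 < omega < 2, this is a convergent SOR iteration on the symmetric positive definite matrix AA^T.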
From Figure 3, we may draw a few conclusions. For most problems, the direct solver gave the most efficient result in terms of CPU time. However, for some problems, the direct solver tended to lose accuracy as the interior-point iterations proceeded and, in the worst cases, this inhibited convergence. For problems where the direct method broke down, the proposed inner-iteration preconditioned Krylov subspace methods remained robust and enabled the interior-point method to converge.

Conclusions
We proposed a new way of preconditioning the normal equations of the second kind arising within interior-point methods for LP problems (11). The resulting interior-point solver is composed of three nested iteration schemes. The outermost layer is the predictor-corrector interior-point method. The middle layer is a Krylov subspace method for least squares problems, for which we may use AB-GMRES, CGNE or MRNE; on top of it, we apply a row-scaling scheme that incurs no extra CPU time but helps to improve the conditioning of the system at each interior-point step. The innermost layer, serving as a preconditioner for the middle layer, consists of stationary inner iterations. Among the three layers, only the outermost one runs to the required accuracy; the other two are terminated prematurely, and the linear systems are solved with a gradually tightened stopping tolerance. We also proposed a new recurrence in terms of ∆w in place of ∆y to omit one matrix-vector product at each interior-point step.
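The three-layer structure can be sketched as follows. This is a toy simplification with hypothetical names: a stationary Richardson iteration stands in for the Krylov subspace middle layer (AB-GMRES/CGNE/MRNE), a few NE-SOR sweeps form the innermost preconditioner, and only the middle-layer tolerance is driven from outside, mirroring the gradually tightened stopping tolerance described above.

```python
import numpy as np

def inner_ne_sor(A, r, sweeps=3, omega=1.0):
    # Innermost layer: a few NE-SOR sweeps from a zero guess, an inexact
    # application of (A A^T)^{-1} to the residual r (terminated prematurely).
    z = np.zeros(A.shape[0])
    u = np.zeros(A.shape[1])           # invariant: u = A^T z
    for _ in range(sweeps):
        for i in range(A.shape[0]):
            a_i = A[i]
            d = omega * (r[i] - a_i @ u) / (a_i @ a_i)
            z[i] += d
            u += d * a_i
    return z

def middle_layer_solve(A, f, tol, max_it=1000):
    # Middle layer: an iterative solver for A A^T y = f, preconditioned by
    # the inner iterations and stopped early at the tolerance `tol`.
    # (A Richardson iteration stands in here for a Krylov subspace method.)
    y = np.zeros(A.shape[0])
    for _ in range(max_it):
        r = f - A @ (A.T @ y)
        if np.linalg.norm(r) <= tol * np.linalg.norm(f):
            break
        y += inner_ne_sor(A, r)
    return y

# Outermost layer (schematic): as the duality measure mu decreases over the
# predictor-corrector steps, the middle-layer tolerance is tightened with it,
# e.g.  tol = max(0.1 * mu, 1e-12);  y = middle_layer_solve(A, f, tol).
```

Only the outermost loop would run to the required accuracy; both inner schemes deliberately stop short, which is what makes the preconditioner cheap.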
We showed that the use of inner-iteration preconditioners in combination with these techniques enables the efficient interior-point solution of wide-ranging LP problems. We also presented a fairly extensive benchmark test for several renowned solvers including direct and iterative solvers.
The advantage of our method is that it does not break down, even when the matrices become ill-conditioned or (nearly) singular. The method is competitive for large and sparse problems and may also be well-suited to problems in which the matrices are too large and dense for direct approaches to work. Extensive numerical experiments showed that our method outperforms the open-source solvers SDPT3, SeDuMi, and PDCO in terms of stability and efficiency.
There are several aspects of our method that could be improved. The current implementation of the interior-point method does not use a preprocessing step other than eliminating empty rows and columns. Its efficiency may be improved by adopting an existing preprocessing procedure such as presolve to detect and remove linear dependencies among the rows and columns of the constraint matrix. The proposed method could also be used in conjunction with more advanced interior-point frameworks such as the multi-corrector interior-point method. In terms of the linear solver, future work includes trying reorthogonalization for CG and MINRES and Householder orthogonalization for GMRES. It is also important to develop preconditioners that only require the action of the operator on a vector, as in huge basis pursuit problems.
It would also be worthwhile to extend our method to problems such as convex QP and SDP.