A new long-step interior point algorithm for linear programming based on the algebraic equivalent transformation

In this paper, we investigate a new primal-dual long-step interior point algorithm for linear optimization. Based on the step size, interior point algorithms can be divided into two main groups: short-step and long-step methods. In practice, long-step variants perform better, but usually a better theoretical complexity can be achieved for the short-step methods. One of the exceptions is the large-update algorithm of Ai and Zhang. The new wide neighborhood and the main characteristics of the presented algorithm are based on their approach. In addition, we use the algebraic equivalent transformation technique of Darvay to determine new modified search directions for our method. We show that the new long-step algorithm is convergent and achieves the best known iteration complexity of short-step variants. We present our numerical results and compare the performance of our algorithm with two previously introduced Ai-Zhang type interior point algorithms on a set of linear programming test problems from the Netlib library.


Introduction
In this paper, we propose a new long-step interior point algorithm (IPA) for linear optimization. We consider the primal-dual linear programming (LP) problem pair in the following standard form:

min c^T x, s.t. Ax = b, x ≥ 0;    max b^T y, s.t. A^T y + s = c, s ≥ 0,    (1)

where A ∈ R^{m×n} with full row rank, b ∈ R^m and c ∈ R^n are given. The simplex method for solving linear optimization problems was developed by Dantzig (1951). Although there were different attempts to propose new methods, this was the only numerically efficient algorithm to solve LP problems for many years. Khachian (1979) proved that the ellipsoid method can solve the linear optimization problem in polynomial time. This result received much attention because of its theoretical importance, but it turned out that in practice, its performance is significantly worse than that of the simplex algorithm. Karmarkar (1984) proposed a new polynomial algorithm for LP, and this result started a new era in operations research. This algorithm generates a sequence of points in the interior of the feasible polyhedron (i.e., it is an IPA) and therefore follows an entirely different approach from the simplex method, which gives a sequence of vertices of the feasible set. Since then, this approach has received much attention, and numerous new IPAs have been introduced not just for linear optimization but also for many other problem classes, such as linear complementarity problems (LCPs), convex optimization, symmetric optimization, second-order cone optimization, etc.
Based on the step length, IPAs can be divided into two main groups: short-step and long-step methods. Long-step methods perform better in practice, but in general, short-step variants have better theoretical complexity, O(√n L). Here, n denotes the dimension of the problem and L = log((x^0)^T s^0 / ε), where (x^0, y^0, s^0) is the given starting point and ε is the required precision. This discrepancy was pointed out by Renegar (2001) as the "irony of IPAs". In the last twenty years, different attempts have been made to overcome this issue (e.g., Bai et al. 2008; Peng et al. 2002; Potra 2004).
The wide neighbourhood N_∞^− (to be defined in Sect. 3) was proposed by Kojima et al. (1989). Their algorithm turned out to be efficient in practice, and its complexity was O(nL). Ai and Zhang (2005) introduced an IPA that works in a new wide neighbourhood of the central path. They proved that the method has the same theoretical complexity as the short-step variants.
Using the wide neighborhood applied by Ai and Zhang, several authors proposed new long-step methods with the best known theoretical complexity. There are related results for linear programming (Darvay and Takács 2018;Liu et al. 2011;Yang et al. 2016), for horizontal linear complementarity problems (Potra 2014), and also for semidefinite optimization (Feng and Fang 2014;Li and Terlaky 2010;Pirhaji et al. 2017).
To be able to determine new search directions in IPAs, Darvay (2003) introduced the method of algebraic equivalent transformation. His main idea was to apply a strictly increasing, continuously differentiable function ϕ to the centering equation of the central path system, then apply Newton's method to determine the new search directions. In his paper, Darvay applied the function ϕ(t) = √t and introduced a new, short-step algorithm for linear optimization. Most algorithms in the literature can be considered as a special case of this technique, where ϕ(t) = t, i.e., the identity map. The function ϕ(t) = t − √t was introduced by Darvay et al. (2016), also in the context of linear optimization, and has recently been investigated in several papers of Darvay and his coauthors. They presented a corrector-predictor IPA for linear optimization (Darvay et al. 2020a), proposed another corrector-predictor IPA for sufficient LCPs (Darvay et al. 2020b), and also introduced a short-step IPA for sufficient LCPs (Darvay et al. 2021). Furthermore, the function ϕ(t) = √t / (2(1 + √t)) has been proposed by Kheirfam and Haghighi (2016) to solve P_*(κ) linear complementarity problems. In this paper, we investigate a new long-step IPA for linear optimization, based on the function ϕ(t) = t − √t. Most of the algorithms based on the algebraic equivalent transformation technique are short-step variants, except for the method of Darvay and Takács (2018), which is based on the function ϕ(t) = √t and applies an Ai-Zhang type wide neighborhood.
Throughout this paper, we use the following notations. Scalars and indices are denoted by lowercase Latin letters. Vectors are denoted by bold lowercase Latin letters, and we use uppercase Latin letters to denote matrices. Sets are denoted by capital calligraphic letters. Let x, s ∈ R^n be two vectors; then xs denotes their componentwise (Hadamard) product. x_+ and x_− represent the positive and negative parts of the vector x, i.e., x_+ = max{x, 0} ∈ R^n and x_− = min{x, 0} ∈ R^n, where the maximum and minimum are taken componentwise.
If α ∈ R, then x^α = [x_1^α, x_2^α, ..., x_n^α]^T. If s_i ≠ 0 holds for all i ∈ {1, ..., n}, then the fraction of x and s is the vector x/s = [x_1/s_1, x_2/s_2, ..., x_n/s_n]^T. The vector of ones is denoted by e. ‖x‖ is the Euclidean norm of x, and diag(x) is the diagonal matrix with the elements of the vector x in its diagonal. Finally, I denotes the index set I = {1, ..., n}.
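Although the paper's experiments were run in Matlab, the componentwise conventions above translate directly into a few lines of NumPy; the sample vectors below are our own arbitrary illustrative data:

```python
import numpy as np

# Sample vectors (arbitrary illustrative data, not from the paper).
x = np.array([3.0, -1.0, 2.0])
s = np.array([0.5, 4.0, -2.0])

xs = x * s                    # Hadamard (componentwise) product
x_plus = np.maximum(x, 0.0)   # positive part x_+ = max{x, 0}
x_minus = np.minimum(x, 0.0)  # negative part x_- = min{x, 0}
x_over_s = x / s              # componentwise fraction (all s_i nonzero)
x_sq = x ** 2                 # componentwise power x^alpha with alpha = 2
e = np.ones_like(x)           # vector of ones
X = np.diag(x)                # diagonal matrix built from x

# The positive and negative parts always recombine to the original vector.
assert np.allclose(x_plus + x_minus, x)
```

Note that the decomposition x = x_+ + x_− is exact because each coordinate contributes to exactly one of the two parts.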
The paper is organized as follows. In Sect. 2 we give an overview of Darvay's algebraic equivalent transformation technique. In Sect. 3 we define a new wide neighborhood and introduce a large-update IPA. In Sect. 4 we examine the correctness of the algorithm, and in Sect. 5 we prove that the complexity of the new method is O(√n L). In Sect. 6 we present our preliminary numerical results. Section 7 summarizes our conclusions.

The algebraic equivalent transformation technique
The optimality criteria of the primal-dual pair (1) can be formulated as:

Ax = b, x ≥ 0,
A^T y + s = c, s ≥ 0,
xs = 0.

In the case of IPAs, instead of the third equation of the optimality criteria (the complementarity condition), we consider a perturbed version:

Ax = b, x ≥ 0,
A^T y + s = c, s ≥ 0,    (2)
xs = νe,

where ν is a given positive parameter.
Let F = {(x, y, s) : Ax = b, A^T y + s = c, x ≥ 0, s ≥ 0} denote the set of primal-dual feasible solutions and F^+ = {(x, y, s) ∈ F : x > 0, s > 0} the set of strictly feasible solutions.
If F^+ ≠ ∅, then for each ν > 0, system (2) has a unique solution (Sonnevend 1986), the ν-center. The set of ν-centers forms a path that is called the central path, and system (2) is called the central path problem. Furthermore, as ν tends to 0, the ν-centers converge to a solution of the linear programming problem (1).
To be able to find new search directions, Darvay (2003) introduced the algebraic equivalent transformation technique (AET). His main idea was to transform the central path problem (2) into an equivalent form:

Ax = b, x ≥ 0,
A^T y + s = c, s ≥ 0,    (3)
ϕ(xs/ν) = ϕ(e),

where ϕ : (ξ, ∞) → R is a continuously differentiable function, applied componentwise, with ϕ'(t) > 0 for all t ∈ (ξ, ∞), ξ ∈ [0, 1). It is important to note that the transformed system (3) does not modify the central path; it determines different search directions depending on the function ϕ. More precisely, if we are at the point (x, y, s) ∈ F^+ ⊂ R^{n+m+n} and take a step toward the ν = τμ-center, where μ = x^T s/n and τ ∈ (0, 1) is a given update parameter, then applying Newton's method to (3), the search direction (Δx, Δy, Δs) is the solution of the following system:

AΔx = 0,
A^T Δy + Δs = 0,    (4)
sΔx + xΔs = τμ (ϕ(e) − ϕ(xs/(τμ))) / ϕ'(xs/(τμ)).

Traditionally, in the analysis of Ai-Zhang type methods, the value of the update parameter τ is included in the formulation of the Newton system; this is the main reason why we chose the value of ν as τμ. The value of τ does not depend on the dimension of the problem; i.e., we propose a large-update IPA.
Since we assumed that A has full row rank and x and s are strictly positive vectors, the Newton-directions are uniquely determined by the system (4).
To facilitate the analysis of IPAs, we consider a scaled version of (4). Let

v = √(xs/(τμ)),  d_x = vΔx/x,  d_s = vΔs/s,  Ā = A diag(x/v).

With these notations, the scaled Newton system can be written as:

Ā d_x = 0,
Ā^T (Δy/(τμ)) + d_s = 0,
d_x + d_s = p_ϕ,

where p_ϕ = (ϕ(e) − ϕ(v²)) / (v ϕ'(v²)). In this paper, we investigate the function ϕ(t) = t − √t, t > 1/2 (i.e., ξ = 1/2), introduced by Darvay et al. (2016). Since we fixed the function ϕ, from now on, we omit the subscript ϕ and simply write p. Our goal is to introduce a new long-step IPA based on this function. To be able to prove the correctness of this method, we need to ensure that p is well-defined; therefore, we assume that v_i > 1/2 is satisfied for all i ∈ I. Let p be the function for which p(v_i) = p_i holds for all v_i ∈ (1/2, ∞), i.e.,

p(t) = 2t(1 − t) / (2t − 1).

Throughout the analysis, we will also investigate different estimations of the function p(t).
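To make the transformed Newton step concrete, the following NumPy sketch solves the system AΔx = 0, A^T Δy + Δs = 0, sΔx + xΔs = τμ v p(v) for ϕ(t) = t − √t, using the right-hand-side formula p(t) = 2t(1 − t)/(2t − 1) reconstructed above. The helper names, the dense block-matrix solve, and the randomly generated strictly feasible instance are our own illustration, not the paper's implementation:

```python
import numpy as np

def p_fun(t):
    """p(t) = 2t(1 - t)/(2t - 1), defined for t > 1/2; note p(1) = 0."""
    return 2.0 * t * (1.0 - t) / (2.0 * t - 1.0)

def newton_step(A, x, s, tau):
    """Solve the AET Newton system for phi(t) = t - sqrt(t):
    A dx = 0,  A^T dy + ds = 0,  s dx + x ds = tau*mu * v * p(v),
    where mu = x^T s / n and v = sqrt(xs / (tau*mu))."""
    m, n = A.shape
    mu = x @ s / n
    v = np.sqrt(x * s / (tau * mu))
    rhs = tau * mu * v * p_fun(v)
    K = np.block([
        [A,                np.zeros((m, m)), np.zeros((m, n))],
        [np.zeros((n, n)), A.T,              np.eye(n)],
        [np.diag(s),       np.zeros((n, m)), np.diag(x)],
    ])
    sol = np.linalg.solve(K, np.concatenate([np.zeros(m + n), rhs]))
    return sol[:n], sol[n:n + m], sol[n + m:]

# Illustrative strictly feasible instance: pick x, y, s first, then
# define b and c so that (x, y, s) is feasible by construction.
rng = np.random.default_rng(1)
m, n = 2, 4
A = rng.standard_normal((m, n))
x, y, s = np.ones(n), np.zeros(m), np.ones(n)
b, c = A @ x, A.T @ y + s

dx, dy, ds = newton_step(A, x, s, tau=0.125)
```

With x = s = e, every coordinate of v equals 1/√τ > 1, so p(v) < 0 and the step pushes the products x_i s_i down toward the τμ-center, as expected.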

The new algorithm
The main idea of Ai and Zhang (2005) was to decompose the Newton directions into positive and negative parts and use different step lengths for the two components. If we apply this approach to system (4), we get the following two systems:

AΔx_− = 0,  A^T Δy_− + Δs_− = 0,  sΔx_− + xΔs_− = τμ v p_−,
AΔx_+ = 0,  A^T Δy_+ + Δs_+ = 0,  sΔx_+ + xΔs_+ = τμ v p_+,    (5)

and the new point with step length α = (α_1, α_2) will be x(α) = x + α_1 Δx_− + α_2 Δx_+, y(α) = y + α_1 Δy_− + α_2 Δy_+ and s(α) = s + α_1 Δs_− + α_2 Δs_+. For both systems, the coefficient matrix is exactly the same as in system (4); therefore, using the same reasoning, it is easy to see that both systems have unique solutions.
It is important to notice that Δx_+ is not the positive part of Δx (here the sign + is a subscript instead of a superscript); it is the solution of the system with p_+ on its right-hand side. The notation is similar for the other solutions of these systems.
We introduce the index sets I_+ = {i ∈ I : x_i s_i ≤ τμ} = {i ∈ I : v_i ≤ 1} and I_− = I \ I_+. Under the technical assumption v_i > 1/2, the nonnegativity of a coordinate p_i is equivalent to i ∈ I_+.
To facilitate the analysis of the algorithm, we introduce the scaled search directions

d_x± = vΔx±/x  and  d_s± = vΔs±/s.

The systems (5) then transform to the following systems:

Ā d_x− = 0,  Ā^T (Δy_−/(τμ)) + d_s− = 0,  d_x− + d_s− = p_−,
Ā d_x+ = 0,  Ā^T (Δy_+/(τμ)) + d_s+ = 0,  d_x+ + d_s+ = p_+.    (6)

The wide neighborhood N_∞^− has been introduced by Kojima et al. (1989). It is defined as follows:

N_∞^−(1 − τ) = {(x, y, s) ∈ F^+ : xs ≥ τμe}.

Notice that this means that a point is in the neighborhood N_∞^−(1 − τ) if and only if the corresponding index set I_+ is empty, namely p_+ = 0. In the analysis, we are going to use a new neighborhood that depends only on the positive part of the vector p:

W(τ, β) = {(x, y, s) ∈ F^+ : v > e/2 and ‖p_+‖ ≤ β},

where 0 < β < 1/2 is a given parameter value. The role of the technical condition v > e/2 has been discussed at the end of Sect. 2. This neighborhood is a modification of the one introduced by Ai and Zhang (2005) (since they require ‖v p_+‖ ≤ β), and it is equivalent to the one used by Darvay and Takács (2018) for the function ϕ(t) = √t. Following the idea of Ai and Zhang (2005), the next lemma verifies that W(τ, β) is indeed a wide neighborhood:

Lemma 1 Let 0 < β < 1/2 and 0 < τ < 1 be given parameters, and let γ = τ ((1 − β + √(1 + β²))/2)². Then N_∞^−(1 − τ) ⊆ W(τ, β) ⊆ N_∞^−(1 − γ).

Proof If (x, y, s) ∈ N_∞^−(1 − τ), then I_+ = ∅ and p_+ = 0, so the first inclusion holds. For the second inclusion, let (x, y, s) ∈ W(τ, β) and assume indirectly that v_i < √(γ/τ) holds for some i ∈ I. Since √(γ/τ) < 1, we have i ∈ I_+. Since p(t) is a strictly decreasing function, p_i = p(v_i) > p(√(γ/τ)) = β ≥ ‖p_+‖ ≥ p_i, which is a contradiction. Therefore x_i s_i = τμ v_i² ≥ γμ holds for all i ∈ I.
The following lower and upper bounds on the coordinates of the vector v will be useful for different estimations during the analysis.
From now on, we assume that a point (x, y, s) ∈ W(τ, β) is given, and in the next section, we prove the correctness of the algorithm.
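Membership in W(τ, β) is easy to test numerically. The sketch below is our own helper, written under the reconstructed definition of the neighborhood (strict feasibility, v > e/2, and ‖p_+‖ ≤ β); the vectors are illustrative sample data:

```python
import numpy as np

def p_fun(t):
    # p(t) = 2t(1 - t)/(2t - 1), for t > 1/2
    return 2.0 * t * (1.0 - t) / (2.0 * t - 1.0)

def in_neighborhood(x, s, tau, beta):
    """Check (x, s) against the W(tau, beta) conditions:
    positivity, v > e/2, and ||p_+|| <= beta."""
    n = x.size
    mu = x @ s / n
    v = np.sqrt(x * s / (tau * mu))
    p_plus = np.maximum(p_fun(v), 0.0)
    return bool(np.all(x > 0) and np.all(s > 0)
                and np.all(v > 0.5)
                and np.linalg.norm(p_plus) <= beta)

tau, beta = 0.125, 0.125
x = np.ones(4)
s = np.ones(4)

# With x = s = e, all products equal mu, so v = e/sqrt(tau) > e,
# the set I_+ is empty, p_+ = 0, and the point lies in W(tau, beta).
assert in_neighborhood(x, s, tau, beta)
```

A point with one very small product x_i s_i violates the technical condition v > e/2 and is correctly rejected by the same check.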

Analysis of the algorithm
Let us introduce the following notations:

Δx(α) = α_1 Δx_− + α_2 Δx_+,  Δs(α) = α_1 Δs_− + α_2 Δs_+,
d_x(α) = α_1 d_x− + α_2 d_x+,  d_s(α) = α_1 d_s− + α_2 d_s+,
h(α) = xs + τμ v (α_1 p_− + α_2 p_+),

where α_1, α_2 ∈ [0, 1] are given step lengths, whose values will be specified later. With these notations, the vector of products of the new point can be written as

x(α)s(α) = h(α) + τμ d_x(α) d_s(α).

It is important to note that the search directions are orthogonal, as is usual in the case of LP problems, since

Δx(α)^T Δs(α) = τμ d_x(α)^T d_s(α) = τμ (α_1² d_x−^T d_s− + α_1 α_2 d_x−^T d_s+ + α_1 α_2 d_x+^T d_s− + α_2² d_x+^T d_s+).

Furthermore, d_x+ and d_x− are in the kernel of the matrix Ā, while d_s+ and d_s− are in the row space of Ā (see system (6)). Therefore, all four scalar products in the previous expression are 0. The next two lemmas give lower bounds on the value of h(α).
Lemma 2 Let (x, y, s) ∈ W(τ, β) and α_1, α_2 ∈ [0, 1]. Then h_i(α) ≥ τμ for all i ∈ I_−.

Proof For i ∈ I_−, we have p_i < 0, and h_i(α) = τμ (v_i² + α_1 v_i p_i) holds. Since v_i p_i < 0, the inequality h_i(α) ≥ τμ holds if and only if α_1 ≤ (1 − v_i²)/(v_i p_i). Let us examine the expression (1 − t²)/(t p(t)) over the interval (1, ∞):

(1 − t²)/(t p(t)) = ((1 − t)(1 + t)(2t − 1))/(2t²(1 − t)) = (2t² + t − 1)/(2t²) ≥ 1.

On the other hand, α_1 ≤ 1 by definition. Thus, h_i(α) ≥ τμ holds for all i ∈ I_−.
We show that h(α) is a componentwise strictly positive vector.

Lemma 3 Let (x, y, s) ∈ W(τ, β) and α_1, α_2 ∈ [0, 1]. Then h(α) ≥ γμe > 0.

Proof By Lemma 1, τμ v_i² = x_i s_i ≥ γμ holds for all i ∈ I. In the case of i ∈ I_+, we have p_i ≥ 0; therefore h_i(α) = τμ (v_i² + α_2 v_i p_i) ≥ τμ v_i² ≥ γμ.

In the case of i ∈ I_−, the statement is a consequence of Lemma 2, since h_i(α) ≥ τμ ≥ γμ.
To be able to prove the feasibility of the new iterates and to ensure that they stay in the neighborhood W(τ, β), we need the following technical lemma:

Lemma 4 Let (x, y, s) ∈ W(τ, β), α_1 = βτ/√n and α_2 = 1. Then

Proof According to Lemma 3.5 of Ai and Zhang (2005) and using the orthogonality of d_x(α) and d_s(α), we have

‖d_x(α) d_s(α)‖ ≤ (√2/4) ‖d_x(α) + d_s(α)‖² = (√2/4) (α_1² ‖p_−‖² + α_2² ‖p_+‖²).

By the definition of W(τ, β), we have ‖p_+‖ ≤ β. We need to estimate the term ‖p_−‖². According to (7), we have

Using these two estimations and substituting the values of α_1 and α_2, we can write

The next lemma gives a positive lower bound on the vector x(α)s(α), which is the first step to prove the strict feasibility of the new point.

Lemma 5
Let (x, y, s) ∈ W(τ, β), α_1 = βτ/√n and α_2 = 1. Then

Proof By Lemma 3, we have h(α) ≥ γμe. Using Lemma 4 and substituting the value of γ, we get

The following statement is the linear programming analogue of Proposition 3.2 of Ai and Zhang (2005) (they proposed it for monotone linear complementarity problems). The proof remains the same.
The following two statements give bounds on the duality gap of the new point, μ(α) = x(α)^T s(α)/n.

Lemma 7
Let α_1 = βτ/√n and α_2 = 1. Then μ(α) ≥ (1 − α_1) μ.

Proof The statement follows from (7).

The following lemma guarantees the proper reduction of the duality gap after an iteration:

Lemma 8 Assume that (x, y, s) ∈ W(τ, β), α_1 = βτ/√n and α_2 = 1. Then the upper bound (8) on μ(α) holds.

Proof First, let us estimate the term v^T p_+:

v^T p_+ = ∑_{i∈I_+} v_i p_i ≤ ‖v_{I_+}‖ · ‖p_+‖ ≤ √n β.    (9)

The first equality holds since v is positive, and we consider only the positive part of p. By applying the Cauchy-Schwarz inequality, we get the first estimation. Using the property v_i ≤ 1 when i ∈ I_+ and the definition of the neighborhood W(τ, β), the last inequality can also be verified.
To obtain an upper bound on the expression v^T p_−, consider the inequalities 2v − e > 0 and v_i > 1 for all i ∈ I_−:

Using (9) and (10), we obtain

Notice that the upper bound on μ(α) in (8) is positive for all β, τ ∈ (0, 1). Indeed,

With a suitable parameter setting, we can ensure that the duality gap decreases strictly monotonically, i.e., μ(α) < μ.
Since 1/2 < v_i ≤ 1 for all i ∈ I_+, we have

Using Corollary 2 and (12), we obtain that

where in the last estimation, we used the first statement of Corollary 1. Using the definition of W(τ, β), we obtain the required bound, which concludes the proof.
Now we are ready to prove that if p(α) denotes the right-hand side of the third equation in the Newton system (6) after an iteration, then ‖p(α)_+‖ ≤ β holds. Together with Lemma 9, this means that the new iterates after the Newton step remain in the neighborhood W(τ, β).
Proof By the definition of W(τ, β) and Lemma 9, we need to prove

Let q : (1/2, ∞) → R be defined by q(t) = 2t / (2t² + t − 1). This function is strictly decreasing on its domain; therefore, using (11), the first term in (13) can be estimated as

where the expression (1 − 2β + √(1 − 2β))/2 is strictly decreasing in β, implying that the upper bound is strictly increasing in β.
To give an upper bound on ‖(e − v²(α))_+‖, we use Lemmas 10, 4 and then 7:

where the last term is strictly increasing in both β and τ. Using the just proved inequalities (13), (14) and (15), we obtain

To prove that this expression is less than or equal to β, we need to ensure that the value of the term in square brackets is at most 1. Notice that by the monotonicity of the estimations (14) and (15), their product is also strictly increasing in both β and τ. Moreover, substituting β = τ = 1/8, the coefficient of β on the right-hand side of (16) is less than 0.77, which concludes the proof.

Iteration bound of the algorithm
Theorem 1 Let β = τ = 1/8, α_1 = βτ/√n, α_2 = 1, and suppose that a starting point (x^0, y^0, s^0) ∈ W(τ, β) is given. The algorithm then provides an ε-optimal solution of the primal-dual pair of LPs in at most

O(√n log((x^0)^T s^0 / ε))

iterations.

Proof Let (x_k, y_k, s_k) denote the point given by the algorithm in the k-th iteration. According to Lemma 8, the following inequality holds for the duality gap in the k-th iteration:

x_k^T s_k ≤ (1 − βτ/√n)^k (x^0)^T s^0.

From the above inequality, we get that x_k^T s_k ≤ ε holds if

(1 − βτ/√n)^k (x^0)^T s^0 ≤ ε

is satisfied. Taking the logarithm of both sides, we obtain

k log(1 − βτ/√n) ≤ log ε − log((x^0)^T s^0).

Using the inequality −log(1 − ϑ) ≥ ϑ, we can require the fulfillment of the stronger inequality

−k βτ/√n ≤ log ε − log((x^0)^T s^0).

The last inequality is satisfied when

k ≥ (√n/(βτ)) log((x^0)^T s^0 / ε),

and this proves the statement.
In the analysis, we applied the fixed step lengths α_1 = βτ/√n and α_2 = 1. When describing the IPA in Algorithm 1, we chose α_1 as the largest value for which the new iterate remains in the neighborhood W(τ, β). Since the duality gap is strictly decreasing in α_1 and βτ/√n is a lower bound on the value of α_1 in Algorithm 1, its complexity is at least as good as in the analyzed case, i.e., the derived complexity result holds for Algorithm 1 as well.
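The analyzed fixed-step variant can be sketched end to end in NumPy. Everything below is our own illustrative reconstruction on a randomly generated feasible instance (it is not the authors' Matlab code): the right-hand sides τμ v p_∓ follow the formulas above, the step lengths are the fixed α_1 = βτ/√n and α_2 = 1, and the dense block solve stands in for whatever linear algebra a production solver would use:

```python
import numpy as np

def p_fun(t):
    # p(t) = 2t(1 - t)/(2t - 1), for t > 1/2
    return 2.0 * t * (1.0 - t) / (2.0 * t - 1.0)

def solve_directions(A, x, s, rhs):
    """Solve A dx = 0, A^T dy + ds = 0, s dx + x ds = rhs."""
    m, n = A.shape
    K = np.block([
        [A,                np.zeros((m, m)), np.zeros((m, n))],
        [np.zeros((n, n)), A.T,              np.eye(n)],
        [np.diag(s),       np.zeros((n, m)), np.diag(x)],
    ])
    sol = np.linalg.solve(K, np.concatenate([np.zeros(m + n), rhs]))
    return sol[:n], sol[n:n + m], sol[n + m:]

def ai_zhang_type_ipa(A, x, y, s, tau=0.125, beta=0.125,
                      eps=1e-6, max_iter=20000):
    """Fixed-step Ai-Zhang type IPA sketch for phi(t) = t - sqrt(t)."""
    n = x.size
    a1, a2 = beta * tau / np.sqrt(n), 1.0   # fixed step lengths
    mu = x @ s / n
    while mu > eps and max_iter > 0:
        v = np.sqrt(x * s / (tau * mu))
        p = p_fun(v)
        # Decomposed Newton directions: systems with p_- and p_+.
        dxm, dym, dsm = solve_directions(A, x, s, tau * mu * v * np.minimum(p, 0.0))
        dxp, dyp, dsp = solve_directions(A, x, s, tau * mu * v * np.maximum(p, 0.0))
        x = x + a1 * dxm + a2 * dxp
        y = y + a1 * dym + a2 * dyp
        s = s + a1 * dsm + a2 * dsp
        mu = x @ s / n
        max_iter -= 1
    return x, y, s, mu

# Illustrative feasible instance with x = s = e as the starting point,
# which lies in W(tau, beta) since all products x_i s_i equal mu.
rng = np.random.default_rng(2)
m, n = 2, 4
A = rng.standard_normal((m, n))
x0, y0, s0 = np.ones(n), np.zeros(m), np.ones(n)
b, c = A @ x0, A.T @ y0 + s0

x, y, s, mu = ai_zhang_type_ipa(A, x0, y0, s0)
```

Because the fixed step α_1 is tiny, even this toy instance needs a few thousand iterations; the greedy step-length rule of Algorithm 1 is what makes the method practical, while the fixed-step version mirrors the complexity analysis.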
As can be seen from Theorem 1, the investigated method can produce an ε-optimal solution to LP problems in polynomial time. There are different results in the literature on rounding the solutions provided by an IPA to an exact solution in polynomial time, see, e.g., Mehrotra and Ye (1993), Roos et al. (1997).

Numerical results
To illustrate that the method can be applied to solve LP problems in practice, we implemented it in Matlab and solved selected linear programming problem instances from the Netlib library (Gay 1985). The numerical experiments were carried out on a Dell laptop with an Intel i7 processor and 16 GB RAM.
First, we transformed the problems to the standard form, then eliminated the redundant constraints using the procedure eliminateRedundantRows.m by Ploskas and Samaras (2017). After these reformulations, we applied a method similar to procedure CLEAN from Adler et al. (1989) to eliminate fixed-value variables from the linear programming problems.
To be able to give strictly feasible initial points in the neighborhood W(τ, β), we first transformed the problems into symmetric form and then applied the self-dual embedding technique (Ye et al. 1994). To avoid doubling the number of constraints in the first case, we carried out this reformulation according to the last Remark of Jansen et al. (1994, p. 232).
The numbers of rows and columns of the original LP problems (in standard form) are denoted by m 0 and n 0 , respectively, while the sizes after the reformulations and the embedding procedure are denoted by m and n. These are shown in the second to fifth columns of Table 1. We note that m 0 and n 0 differ from the number of rows and columns given on the Netlib site, since the original formulation of the Netlib LP problems possibly contains lower and upper bounds on the variables. In these cases, the sizes were modified when we reformulated the problems in standard form. The times required to clean and embed the problem (preprocessing) and retrieve the solution of the original optimization problem (postsolve) are also shown in Table 1, in the columns "Prep. (s)" and "Posts. (s)".
For the embedded problem, we may choose x = e and s = e as proper initial points, since they are strictly feasible and are included in the neighborhood W(τ, β). The step lengths α 1 and α 2 were calculated in the following greedy way. We fixed the value of α 2 as 1 and determined the largest value α 1 so that the new point (x(α), y(α), s(α)) remains in the neighborhood W(τ, β).
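The greedy rule just described can be sketched as a bisection on α_1 with α_2 = 1 fixed: compute the decomposed directions once, then search for the largest step for which the candidate point still satisfies the neighborhood conditions. The helpers and the toy instance below are our own illustration under the reconstructed definition of W(τ, β), and the bisection treats membership as monotone along the step, which is a heuristic simplification:

```python
import numpy as np

def p_fun(t):
    # p(t) = 2t(1 - t)/(2t - 1), for t > 1/2
    return 2.0 * t * (1.0 - t) / (2.0 * t - 1.0)

def solve_directions(A, x, s, rhs):
    """Solve A dx = 0, A^T dy + ds = 0, s dx + x ds = rhs; return (dx, ds)."""
    m, n = A.shape
    K = np.block([
        [A,                np.zeros((m, m)), np.zeros((m, n))],
        [np.zeros((n, n)), A.T,              np.eye(n)],
        [np.diag(s),       np.zeros((n, m)), np.diag(x)],
    ])
    sol = np.linalg.solve(K, np.concatenate([np.zeros(m + n), rhs]))
    return sol[:n], sol[n + m:]

def in_W(x, s, tau, beta):
    """Neighborhood test: positivity, v > e/2, and ||p_+|| <= beta."""
    if np.any(x <= 0) or np.any(s <= 0):
        return False
    mu = x @ s / x.size
    v = np.sqrt(x * s / (tau * mu))
    if np.any(v <= 0.5):
        return False
    return bool(np.linalg.norm(np.maximum(p_fun(v), 0.0)) <= beta)

def greedy_alpha1(A, x, s, tau, beta, iters=50):
    """Largest alpha_1 (with alpha_2 = 1) keeping the iterate in W."""
    n = x.size
    mu = x @ s / n
    v = np.sqrt(x * s / (tau * mu))
    p = p_fun(v)
    dxm, dsm = solve_directions(A, x, s, tau * mu * v * np.minimum(p, 0.0))
    dxp, dsp = solve_directions(A, x, s, tau * mu * v * np.maximum(p, 0.0))
    ok = lambda a1: in_W(x + a1 * dxm + dxp, s + a1 * dsm + dsp, tau, beta)
    lo = beta * tau / np.sqrt(n)   # value used in the analysis, known safe
    if ok(1.0):
        return 1.0
    hi = 1.0
    for _ in range(iters):          # bisect toward the membership boundary
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if ok(mid) else (lo, mid)
    return lo

rng = np.random.default_rng(3)
m, n = 2, 4
A = rng.standard_normal((m, n))
x, s = np.ones(n), np.ones(n)
a1 = greedy_alpha1(A, x, s, tau=0.125, beta=0.125)
```

In practice the greedy α_1 is usually far larger than the conservative value βτ/√n from the analysis, which is why the implemented variants converge in comparatively few iterations.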
We compared three variants of Algorithm 1, based on the functions ϕ(t) = t, ϕ(t) = √ t and ϕ(t) = t − √ t. The first IPA is a moderately modified version of the original method of Ai and Zhang (2005) (we used a slightly different neighborhood definition W(τ, β)). The second case is the IPA proposed by Darvay and Takács (2018). The third IPA is the algorithm introduced in this paper.
The value of the precision parameter ε was 10 −6 . The number of iterations and the running time (in seconds) required to achieve this precision (i.e., to find a point for which the duality gap is less than ε) for the different algorithm variants are shown in Table 2.
According to our numerical results, there is no significant difference in the performance of the three algorithms for linear programming problems; however, the second variant is moderately better on this test problem set, both in terms of average number of iterations and average running time. It can also be observed that the new variant performs slightly better than the algorithm based on the function ϕ(t) = t.

Conclusion
We investigated a new long-step IPA based on the algebraic equivalent transformation technique, using the function ϕ(t) = t − √ t and a new Ai-Zhang-type wide neighborhood W(τ, β).
We proved that the algorithm is well-defined and provides an ε-optimal solution in at most O(√n log((x^0)^T s^0 / ε)) steps; therefore, it has the same theoretical complexity as the best short-step variants. According to our preliminary numerical results, the new algorithm performs well in practice.
To extend our results, we would like to propose a similar long-step algorithm for P_*(κ) linear complementarity problems, based on the function ϕ(t) = t − √t. We expect that the choice of the function ϕ will cause a significant difference in the performance of the different variants. Another interesting question for further research is investigating an infeasible variant of the proposed IPA to avoid applying the self-dual embedding technique when determining the starting points.