Polynomial worst-case iteration complexity of quasi-Newton primal-dual interior point algorithms for linear programming

Gondzio, Jacek; Sobral, Francisco N. C.

doi:10.1007/s10589-024-00584-6

Open access
Published: 07 June 2024

Computational Optimization and Applications

Abstract

Quasi-Newton methods are well-known techniques for large-scale numerical optimization. They use an approximation of the Hessian in optimization problems or of the Jacobian in systems of nonlinear equations. In the interior point context, quasi-Newton algorithms compute low-rank updates of the matrix associated with the Newton systems, instead of computing it from scratch at every iteration. In this work, we show that a simplified quasi-Newton primal-dual interior point algorithm for linear programming, which alternates between Newton and quasi-Newton iterations, enjoys polynomial worst-case iteration complexity. Feasible and infeasible cases of the algorithm are considered and the most common neighborhoods of the central path are analyzed. To the best of our knowledge, this is the first attempt to deliver polynomial worst-case iteration complexity bounds for these methods. Unsurprisingly, the worst-case complexity results obtained when quasi-Newton directions are used are worse than their counterparts when Newton directions are employed. However, quasi-Newton updates are very attractive for large-scale optimization problems where the cost of factorizing the matrices is much higher than the cost of solving linear systems.

1 Introduction

Let us consider the following general linear program

$$\begin{aligned} \min \quad c^T x, \quad {\text{ s.t. } } \quad A x = b, \ x \ge 0, \end{aligned}$$

(1)

where $x, c \in \mathbb {R}^n$, $b \in \mathbb {R}^m$ and $A \in \mathbb {R}^{m \times n}$. We assume that (1) is feasible and that the rows of $A$ are linearly independent. Define the function $F: \mathbb {R}^{2n + m} \rightarrow \mathbb {R}^{2n + m}$ by

$$\begin{aligned} F(x, \lambda , z) = (A^T \lambda + z - c, A x - b, X Z e), \end{aligned}$$

(2)

where $X, Z \in \mathbb {R}^{n \times n}$ are diagonal matrices defined by $X = {{\,\textrm{diag}\,}}(x)$ and $Z = {{\,\textrm{diag}\,}}(z)$, respectively, and e is the vector of ones of appropriate size. First order necessary optimality conditions for (1) state that, if $x^* \ge 0$ is a minimizer, then there exist $z^* \in \mathbb {R}^n$, $z^* \ge 0$, and $\lambda ^* \in \mathbb {R}^m$ such that $F(x^{*}, {\lambda }^{*}, z^{*}) = 0$ holds.

Interior point methods (IPMs) follow the so-called central path of problem (1), defined by the solutions of the perturbed KKT system $F(x, \lambda , z) = (0, 0, \mu e)$ as $\mu \rightarrow 0$. This path is well defined when $A$ has full rank and the set $\mathcal {F} \doteq \{ (x, \lambda , z) \mid A x = b, A^T \lambda + z = c, x > 0, z > 0\}$ is nonempty. Instead of solving such a system exactly, primal-dual IPMs apply one iteration of Newton's method for a given value of $\mu _k$ at iteration $k$. To calculate this step, the Jacobian of $F$ is needed:

$$\begin{aligned} J(x, \lambda , z) = \left[ \begin{matrix} 0 &{} A^T &{} I \\ A &{} 0 &{} 0 \\ Z &{} 0 &{} X \end{matrix}\right] . \end{aligned}$$

(3)

With an iterate $(x^{k}, {\lambda }^{k}, z^{k})$ at iteration k, the classical Newton direction is calculated by solving the following system

$$\begin{aligned} J(x^{k}, {\lambda }^{k}, z^{k}) \cdot (\Delta {x^k}, \Delta {\lambda ^k}, \Delta {z^k}) = (0, 0, \sigma _k \mu _k e) - F(x^{k}, {\lambda }^{k}, z^{k}), \end{aligned}$$

(4)

where $\mu _k$ is set to be the average complementarity gap ${x^k}^T z^k / n$ and $\sigma _k \in (0,1)$ determines its target reduction. Under our hypotheses, system (4) always has a solution and when the iterates belong to $\mathcal {F}$ (i.e. the feasible case), the average complementarity gap going to zero implies convergence to the solution of (1).
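The Newton direction of system (4) can be illustrated on a very small instance. The following is a minimal sketch, assuming NumPy and a dense solver; the toy data and the helper names `residual`, `jacobian` and `newton_direction` are hypothetical and are not part of the paper, and a practical IPM would use a sparse factorization or an iterative method instead.

```python
import numpy as np

def residual(A, b, c, x, lam, z):
    """F(x, lambda, z) from Eq. (2), stacked as one vector."""
    return np.concatenate([A.T @ lam + z - c, A @ x - b, x * z])

def jacobian(A, x, z):
    """Unreduced Jacobian J(x, lambda, z) from Eq. (3)."""
    m, n = A.shape
    return np.block([
        [np.zeros((n, n)), A.T,              np.eye(n)],
        [A,                np.zeros((m, m)), np.zeros((m, n))],
        [np.diag(z),       np.zeros((n, m)), np.diag(x)],
    ])

def newton_direction(A, b, c, x, lam, z, sigma):
    """Solve system (4) with a dense solver (illustration only)."""
    m, n = A.shape
    mu = x @ z / n  # average complementarity gap
    target = np.concatenate([np.zeros(n + m), sigma * mu * np.ones(n)])
    d = np.linalg.solve(jacobian(A, x, z), target - residual(A, b, c, x, lam, z))
    return d[:n], d[n:n + m], d[n + m:]

# Hypothetical toy data: min x1 + 2*x2  s.t.  x1 + x2 = 1, x >= 0.
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
c = np.array([1.0, 2.0])
x = np.array([0.5, 0.5])   # primal feasible, x > 0
lam = np.array([0.0])
z = c - A.T @ lam          # dual feasible, z > 0
sigma = 0.5
dx, dlam, dz = newton_direction(A, b, c, x, lam, z, sigma)
```

Because the starting point in this sketch lies in $\mathcal {F}$, the computed direction keeps the linear constraints satisfied: $A \Delta x = 0$ and $A^T \Delta \lambda + \Delta z = 0$.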

While the coefficient matrix in (4) can be efficiently evaluated and stored, it changes at each iteration. The solution of (4) is usually accomplished by direct methods, using suitable matrix factorizations [1, 2], or by iterative methods [3], computing preconditioners to improve their convergence properties. In this paper we are concerned with classes of problems for which it is advantageous to approximate $J(x^{k}, {\lambda }^{k}, z^{k})$ in order to reduce the cost of solving (4).

Usually, IPMs do not deal explicitly with the unreduced system (4), but rather consider its reduced form as augmented system (which is symmetric) or as normal equations (whose coefficient matrix is positive definite) [4]. Working directly with unreduced systems has attracted more attention in recent years, since they have good sparsity structure and also interesting spectral properties [5]. In [6], numerical experiments comparing preconditioners for unreduced and augmented systems were made. The appeal of using preconditioners for unreduced systems is their good conditioning close to the solution. (Let us mention here that there have been successful attempts to use quasi-Newton strategies to update the preconditioners [7,8,9].) Gondzio and Sobral [10] considered unreduced systems in a way similar to [6]. They studied the Jacobian of $F$, asking whether it is possible to approximate it by classical quasi-Newton approaches for nonlinear systems. Broyden low-rank updates were used and the numerical experiments showed that this approach is effective for IPMs when the cost of solving linear systems is considerably lower than the cost of computing the factorization of the Jacobian (or its associated reduced form).

Recently, Ek and Forsgren [11] presented a theoretical background and numerical experiments regarding a different kind of low-rank update. The proposed update is based on the Eckart–Young–Mirsky theorem, rather than on the secant equation satisfied by the Broyden update, and affects only the “third row” of matrix $J(x^{k}, {\lambda }^{k}, z^{k})$, related to the nonlinear part of $F$. Convex quadratic optimization problems were considered and local convergence was established for a simplified primal-dual interior point algorithm, but no complexity bound was provided. It is worth mentioning that the iteration worst-case complexity of $O(\sqrt{n})$ was shown for a short-step primal algorithm by Gonzaga [12], where low-rank updates were used to compute the projection matrix needed by this type of algorithm. Secant equations were also used in [13] for the same purpose, but without complexity results.

Polynomial worst-case iteration complexity is a key feature of IPMs for linear and convex-quadratic problems [14]. It is achieved by taking steps in the Newton direction (4) such that the new iterate belongs to some neighborhood of the central path. In the case of linear programming, the iteration worst-case complexity involves polynomials of orders between $\sqrt{n}$ and $n^2$, depending on the type of neighborhood of the central path used and whether feasible or infeasible iterates are allowed [15]. Those results have also been generalized to symmetric cone optimization problems [16].

This work is intended to provide the first steps towards the study of iteration worst-case complexity of quasi-Newton primal-dual interior point algorithms. We define a simplified algorithm that alternates between Newton and quasi-Newton iterations and present non-trivial extensions of well-known complexity results from [15] and properties that arise when Broyden “bad” quasi-Newton updates are used. Worst-case complexity is proven for both feasible and infeasible cases in the most commonly used neighborhoods. The theoretical study is motivated by the very promising results from [10] in quadratic programming problems. As expected, the degrees of the polynomials in the complexity results are higher than those obtained when steps in Newton directions are made.

The paper is organized as follows. In Sect. 2 we review basic quasi-Newton concepts and the properties of quasi-Newton algorithms presented in [10]. Then, in Sect. 3 we analyze the worst-case complexity for the feasible case, considering two popular neighborhoods of the central path: $\mathcal {N}_2$ and $\mathcal {N}_s$, which are formally defined in the section. Section 4 is devoted to the infeasible case when the iterates are confined to the $\mathcal {N}_s$ neighborhood. Final comments, observations and possible directions of future work are discussed in Sect. 5.

Notation

We define $\Vert \cdot \Vert $ as the Euclidean norm for vectors and the induced $\ell _2$-norm for matrices. We will use the short versions $F_k$ and $J_k$ to describe $F(x^{k}, {\lambda }^{k}, z^{k})$ and $J(x^{k}, {\lambda }^{k}, z^{k})$, respectively. In addition, we will use both inline $(x^{k}, {\lambda }^{k}, z^{k})$ and matrix $\left[ {\begin{matrix} x^k \\ \lambda ^k \\ z^k \end{matrix}} \right] $ notations to address vectors in this work.

2 Background

Given a function $G:\mathbb {R}^N \rightarrow \mathbb {R}^N$, suppose that we want to solve the nonlinear system $G(w) = 0$. Secant methods iteratively construct a linear model $M_k(w)$ of $G$ which interpolates the last two computed iterates of the method. At each iteration, they need to compute an approximation $B$ to the Jacobian of $G$, which has to satisfy the secant equation $ B s_{k-1} = y_{k - 1} $, where $s_{k - 1} = w^k - w^{k - 1}$ and $y_{k - 1} = G(w^{k}) - G(w^{k - 1})$. There are infinitely many solutions to the secant equation for $N \ge 2$ and different approaches generate different secant methods [17]. Among them, the Broyden “bad” approach uses the already computed approximation to the inverse of the Jacobian of $G$ at $w^{k - 1}$, called $H_{k - 1}$, to compute the current approximation $H_k$ as

$$\begin{aligned} H_k = H_{k - 1} + \frac{(s_{k - 1} - H_{k - 1} y_{k - 1}) y_{k - 1}^T}{y_{k - 1}^T y_{k - 1}} = H_{k - 1} V_{k - 1} + \frac{s_{k - 1} y_{k - 1}^T}{\rho _{k - 1}}, \end{aligned}$$

(5)

where $V_{k - 1} = \left( I - \frac{y_{k - 1} y_{k - 1}^T}{\rho _{k - 1}} \right) $ and $\rho _{k - 1} = y_{k - 1}^T y_{k - 1}$. The Broyden “bad” update is a rank-1 update where $H_k$ is the matrix closest to $H_{k - 1}$ in the Frobenius norm which satisfies the secant equation.
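The rank-one structure of update (5) is easy to check numerically. The sketch below, assuming NumPy and purely hypothetical random data, builds $H_k$ in both forms given in (5) and lets one verify that the inverse form of the secant equation, $H_k y_{k-1} = s_{k-1}$, holds; the function name `broyden_bad_update` is an illustrative choice.

```python
import numpy as np

def broyden_bad_update(H_prev, s, y):
    """Broyden "bad" rank-one update of an inverse-Jacobian approximation, Eq. (5)."""
    rho = y @ y
    return H_prev + np.outer(s - H_prev @ y, y) / rho

rng = np.random.default_rng(0)  # hypothetical random test data
N = 5
H_prev = rng.standard_normal((N, N))
s = rng.standard_normal(N)
y = rng.standard_normal(N)

H = broyden_bad_update(H_prev, s, y)

# Equivalent product form H_{k-1} V_{k-1} + s y^T / rho from Eq. (5).
rho = y @ y
V = np.eye(N) - np.outer(y, y) / rho
H_alt = H_prev @ V + np.outer(s, y) / rho

secant_residual = np.linalg.norm(H @ y - s)         # should be at rounding level
update_rank = np.linalg.matrix_rank(H - H_prev)     # at most 1
```

In an actual implementation the matrix $H_k$ would not be formed explicitly; the product form is what allows $H_k v$ to be applied using only a solve with the previous factorization.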

For the specific case of this work, where $G$ is given by $F$ defined in (2), we have that $N = 2n + m$, $w = (x, \lambda , z)$ and the vectors $s_{k - 1}$ and $y_{k - 1}$ from the secant equation assume a more specific description

$$\begin{aligned} s_{k - 1} = \alpha _{k - 1} \left[ \begin{matrix} \Delta {x}^{k - 1} \\ \Delta {\lambda }^{k - 1} \\ \Delta {z}^{k - 1} \end{matrix} \right] \quad \text {and}\quad y_{k - 1} = \left[ \begin{matrix} \alpha _{k - 1} (A^T \Delta {\lambda }^{k - 1} + \Delta {z}^{k - 1}) \\ \alpha _{k - 1} A \Delta {x}^{k - 1} \\ X^{k} Z^{k} e - X^{k - 1} Z^{k - 1} e \end{matrix} \right] , \end{aligned}$$

(6)

where $\alpha _{k - 1} \in (0, 1]$ is the step-size taken at iteration $k - 1$ towards the solution of (4). Broyden “bad” updates were shown to be numerically efficient and theoretically sound for IPMs in [10], where the authors described and implemented an interior point method based on low rank quasi-Newton approximations to the Jacobian of F.

Since we are interested in finding an approximate solution of the linear system given by the Newton method (4), in the Broyden “bad” approach the following direction is computed

$$\begin{aligned} \left[ \begin{matrix} \Delta x^k \\ \Delta \lambda ^k \\ \Delta z^k \end{matrix} \right] = H_k \left[ \begin{matrix} c - A^T \lambda ^k - z^k \\ b - A x^k \\ \sigma \mu _k e - X^k Z^k e \end{matrix} \right] . \end{aligned}$$

(7)

If $H_k = J_k^{-1}$, system (7) can be converted to (4) by simply multiplying both sides of (7) by $J_k$. Therefore, in the same way as discussed in [10], when the quasi-Newton approximation is used, we assume that the initial approximation $H_{k - 1}$ in (5) is given by the exact $J_{k - 1}^{-1}$. The quasi-Newton procedure strongly uses the fact that the factorization of $J_{k - 1}$ (or a good preconditioner) has already been computed. In addition, it is not hard to show that, with this choice of initial approximation, the computation of $r = H_k v$, where $H_k$ is the Broyden “bad” approximation of $J_k^{-1}$, is equivalent to the solution of

$$\begin{aligned} J_{k - 1} r = v + \eta , \end{aligned}$$

(8)

where $\eta = \big (0, 0, \gamma _1 \left[ \alpha _{k - 1} \left( Z^{k - 1} \Delta {x}^{k - 1} + X^{k - 1} \Delta {z}^{k - 1} \right) - \left( X^{k} Z^{k} - X^{k - 1} Z^{k - 1}\right) e \right] \big )$ and $\gamma _1 = y_{k - 1}^T v / y_{k - 1}^T y_{k - 1}$. This result can be generalized for an arbitrary number of updates (see [10, Lemma 2]). However, delivering the worst-case complexity analysis for an IPM with such updates is not straightforward, even for a single update.
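The equivalence between applying $H_k$ and solving (8) with the old Jacobian is purely algebraic, so it can be checked on random data. In the following sketch (assuming NumPy) a hypothetical random direction stands in for the step taken at iteration $k - 1$, and $J_{k-1}^{-1}$ is formed explicitly only to make the comparison possible; none of the data or names come from the paper.

```python
import numpy as np

def jacobian(A, x, z):
    """Unreduced Jacobian of F, Eq. (3)."""
    m, n = A.shape
    return np.block([
        [np.zeros((n, n)), A.T,              np.eye(n)],
        [A,                np.zeros((m, m)), np.zeros((m, n))],
        [np.diag(z),       np.zeros((n, m)), np.diag(x)],
    ])

rng = np.random.default_rng(1)  # hypothetical random test data
m, n = 2, 4
A = rng.standard_normal((m, n))
x_prev = rng.uniform(0.5, 1.5, n)
z_prev = rng.uniform(0.5, 1.5, n)
d = rng.standard_normal(2 * n + m)      # stand-in for the previous direction
dx, dlam, dz = d[:n], d[n:n + m], d[n + m:]
alpha = 0.7
x, z = x_prev + alpha * dx, z_prev + alpha * dz

# Secant pair (6).
s = alpha * d
y = np.concatenate([alpha * (A.T @ dlam + dz),
                    alpha * (A @ dx),
                    x * z - x_prev * z_prev])

# Broyden "bad" update (5) started from the exact inverse of J_{k-1}.
J_prev = jacobian(A, x_prev, z_prev)
H_prev = np.linalg.inv(J_prev)
H = H_prev + np.outer(s - H_prev @ y, y) / (y @ y)

# Right-hand side correction eta of Eq. (8) for an arbitrary vector v.
v = rng.standard_normal(2 * n + m)
gamma1 = (y @ v) / (y @ y)
eta = np.concatenate([
    np.zeros(n + m),
    gamma1 * (alpha * (z_prev * dx + x_prev * dz) - (x * z - x_prev * z_prev)),
])

r_direct = H @ v                               # r = H_k v
r_via_8 = np.linalg.solve(J_prev, v + eta)     # J_{k-1} r = v + eta
```

The two vectors `r_direct` and `r_via_8` should agree up to rounding, which reflects the fact that $J_{k-1} s_{k-1}$ and $y_{k-1}$ differ only in their third block.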

Equation (8) is the basis of the analysis developed in this work. It states that we can study quasi-Newton steps using the Jacobian of the Newton step. The only difference is the right-hand side. Using this property of the Broyden “bad” update we are able to extend the well-known complexity results described in [15] to the case where one quasi-Newton update is performed. The difficulty in the analysis is mostly caused by the extra term added to the usual right-hand side of (4). It is important to note that (8) does not assume that the iterates are feasible, hence it is useful in both feasible and infeasible cases. Although, by (5), the sparsity structure of the third row of $B_k$ (the inverse of $H_k$) is lost, we can see that the structural sparsity of $J_{k - 1}$ can still be used to solve the linear systems.

Let us define a skeleton primal-dual quasi-Newton interior point algorithm. It is given by Algorithm 1 and generates a sequence of alternating Newton and quasi-Newton steps. Clearly, by the nature of update (5), the first step needs to be a Newton step. We also observe that the inverse of $J_k$ is never computed.

Let us analyze what happens when a sequence of two steps is performed: at iteration $k$ the Newton step is made (with step-size $\alpha _k$) and then at iteration $k+1$ the quasi-Newton step is taken (with step-size $\alpha _{k + 1}$). In the remainder of this work, to make equations shorter, we will remove the index of iterates. The Newton iterates will be denoted by $(x, \lambda , z)$, $(\Delta {x}, \Delta {\lambda }, \Delta {z})$ and $\alpha $, while quasi-Newton iterates will be denoted by $(\bar{x}, \bar{\lambda }, \bar{z})$, $(\Delta {\bar{x}}, \Delta {\bar{\lambda }}, \Delta {\bar{z}})$ and $\bar{\alpha }$. The same convention applies to $F$, $J$ and $\bar{F}$, $\bar{J}$, respectively, and to all other iteration-dependent quantities. When necessary, the iterate after a quasi-Newton step (iteration $k + 2$) will be denoted by double bars: $(\bar{\bar{x}}, \bar{\bar{\lambda }}, \bar{\bar{z}})$.

For the iteration after a Newton step, we first observe that

$$\begin{aligned} \begin{aligned} \bar{X} \bar{Z} e&= \left( X + \alpha \Delta {X}\right) \left( Z + \alpha \Delta {Z}\right) e = (1 - \alpha ) X Z e + \alpha \sigma \mu e + {\alpha }^2 \Delta {X}\Delta {Z}e, \end{aligned} \end{aligned}$$

(9)

where matrices $\Delta {Z}$ and $\Delta {X}$ are given, respectively, by ${{\,\textrm{diag}\,}}(\Delta {z})$ and ${{\,\textrm{diag}\,}}(\Delta {x})$. Later, in the proofs of several technical results, we will need to analyze the error produced when the quasi-Newton direction $(\Delta {\bar{x}}, \Delta {\bar{\lambda }}, \Delta {\bar{z}})$ is multiplied by $\bar{J}$:

$$\begin{aligned} \bar{J} \begin{bmatrix} \Delta {\bar{x}}\\ \Delta {\bar{\lambda }}\\ \Delta {\bar{z}}\end{bmatrix} = \left( \begin{bmatrix} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 \\ \bar{Z} - Z &{} 0 &{} \bar{X} - X \\ \end{bmatrix} + J \right) \begin{bmatrix} \Delta {\bar{x}}\\ \Delta {\bar{\lambda }}\\ \Delta {\bar{z}}\end{bmatrix}. \end{aligned}$$

Applying (8) for the quasi-Newton iteration and then noting that $(\Delta {x}, \Delta {\lambda }, \Delta {z})$ solves the Newton system (4) and using (9), the third block equation in system (8) gives

$$\begin{aligned} \begin{aligned} Z \Delta {\bar{x}}+ X \Delta {\bar{z}}&= \bar{\sigma }\bar{\mu }e - \bar{X} \bar{Z} e + \gamma _1 \left[ \alpha \left( Z \Delta {x}+ X \Delta {z}\right) - \left( \bar{X} \bar{Z} - X Z\right) e \right] \\&= \bar{\sigma }\bar{\mu }e - \bar{X} \bar{Z} e + \gamma _1 (\alpha \sigma \mu e + (1 - \alpha ) X Z e - \bar{X} \bar{Z} e) \\&= \bar{\sigma }\bar{\mu }e - \bar{X} \bar{Z} e - \gamma _1 \alpha ^2 \Delta {X}\Delta {Z}e. \end{aligned} \end{aligned}$$

(10)

Hence, using (6) and (10)

$$\begin{aligned} \begin{aligned} \bar{Z} \Delta {\bar{x}}+ \bar{X} \Delta {\bar{z}}&= (\bar{Z} - Z) \Delta {\bar{x}}+ (\bar{X} - X) \Delta {\bar{z}}+ Z \Delta {\bar{x}}+ X \Delta {\bar{z}}\\&= \ \alpha \left( \Delta {Z}\Delta {\bar{X}}e + \Delta {X}\Delta {\bar{Z}}e\right) + \bar{\sigma }\bar{\mu }e - \bar{X} \bar{Z} e - \gamma _1 \alpha ^2 \Delta {X}\Delta {Z}e, \end{aligned} \end{aligned}$$

(11)

where matrices $\Delta {\bar{Z}}$ and $\Delta {\bar{X}}$ are given, respectively, by ${{\,\textrm{diag}\,}}(\Delta {\bar{z}})$ and ${{\,\textrm{diag}\,}}(\Delta {\bar{x}})$.

Next we compute the new complementarity products obtained after a sequence of two steps, apply (11), and add and subtract the term $\alpha ^2 \Delta {X}\Delta {Z}e$ to derive

$$\begin{aligned} \begin{aligned} \bar{\bar{X}}&\bar{\bar{Z}} e = (\bar{X} + \bar{\alpha }\Delta {\bar{X}}) (\bar{Z} + \bar{\alpha }\Delta {\bar{Z}}) e \\&= \bar{X} \bar{Z} e + \bar{\alpha }(\bar{Z} \Delta {\bar{X}}+ \bar{X} \Delta {\bar{Z}}) e + \bar{\alpha }^2 \Delta {\bar{X}}\Delta {\bar{Z}}e \\&= \bar{X} \bar{Z} e + \bar{\alpha }\alpha ( \Delta {Z}\Delta {\bar{X}}e + \Delta {X}\Delta {\bar{Z}}e ) + \bar{\alpha }^2 \Delta {\bar{X}}\Delta {\bar{Z}}e + \alpha ^2 \Delta {X}\Delta {Z}e \\&\quad - \alpha ^2 \Delta {X}\Delta {Z}e + \bar{\alpha }\bar{\sigma }\bar{\mu }e - \bar{\alpha }\bar{X} \bar{Z} e - \gamma _1 \bar{\alpha }\alpha ^2 \Delta {X}\Delta {Z}e \\&= (1 - \bar{\alpha }) \bar{X} \bar{Z} e + (\alpha \Delta {X}+ \bar{\alpha }\Delta {\bar{X}}) (\alpha \Delta {Z}+ \bar{\alpha }\Delta {\bar{Z}}) e \\&\quad + \bar{\alpha }\bar{\sigma }\bar{\mu }e - (1 + \bar{\alpha }\gamma _1) \alpha ^2 \Delta {X}\Delta {Z}e \\&= (1 - \bar{\alpha }) \bar{X} \bar{Z} e + PQe + \bar{\alpha }\bar{\sigma }\bar{\mu }e - (1 + \bar{\alpha }\gamma _1) \alpha ^2 \Delta {X}\Delta {Z}e, \end{aligned} \end{aligned}$$

(12)

where $p = \alpha \Delta {x}+ \bar{\alpha }\Delta {\bar{x}}$ and $q = \alpha \Delta {z}+ \bar{\alpha }\Delta {\bar{z}}$, denote, respectively, the primal and dual composite directions, and $P= {{\,\textrm{diag}\,}}(p)$ and $Q= {{\,\textrm{diag}\,}}(q)$ their associated matrix forms. By multiplying both sides of Eq. (12) with $e^T$ we get the complementarity product at iteration $k+2$:

$$\begin{aligned} \begin{aligned} \bar{\bar{x}}^T \bar{\bar{z}} = (1 - \bar{\alpha }(1 - \bar{\sigma })) \bar{x}^T \bar{z} + p^T q- (1 + \gamma _1 \bar{\alpha }) \alpha ^2 {\Delta {x}}^T \!\Delta {z}. \end{aligned} \end{aligned}$$

(13)

It is worth noting that the final expression in (12) involves a composite direction $(p, q) = (\alpha \Delta {x}+ \bar{\alpha }\Delta {\bar{x}}, \alpha \Delta {z}+ \bar{\alpha }\Delta {\bar{z}})$ which corresponds to an aggregate of two consecutive steps: in Newton direction at iteration k and in quasi-Newton direction at iteration $k+1$. Much of the effort of the analysis presented in this paper is focused on this composite direction. Let us mention that we will also use the component-wise versions of Eqs. (9), (11) and (12). For example, in case of (12) this gives

$$\begin{aligned} \begin{aligned} \left[ \bar{x} \right. + \left. \bar{\alpha }\Delta {\bar{x}}\right] _i \left[ \bar{z} + \bar{\alpha }\Delta {\bar{z}}\right] _i&= (1 - \bar{\alpha }) \left( \bar{x}_i \bar{z}_i \right) + p_i q_i + \bar{\alpha }\bar{\sigma }\bar{\mu }\\ {}&\quad - (1 + \bar{\alpha }\gamma _1) {\alpha }^2 [\Delta {x}]_i [\Delta {z}]_i. \end{aligned} \end{aligned}$$

(14)

Observe that Eqs. (9)–(14) are valid regardless of whether the iterates belong to $\mathcal {F}$. However, the analysis in Sects. 3 and 4 will distinguish between these two cases because in the feasible one we are able to take advantage of the orthogonality of primal and dual directions and exploit it to deliver better final worst-case complexity results.
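As an illustration that identity (12) does not rely on feasibility, the following sketch (assuming NumPy) performs one Newton step and one quasi-Newton step, the latter computed through (8), from a hypothetical infeasible point and evaluates both sides of (12). The data, step-sizes and centering parameters are arbitrary illustrative choices and are not selected by any neighborhood rule.

```python
import numpy as np

def jacobian(A, x, z):
    """Unreduced Jacobian of F, Eq. (3)."""
    m, n = A.shape
    return np.block([
        [np.zeros((n, n)), A.T,              np.eye(n)],
        [A,                np.zeros((m, m)), np.zeros((m, n))],
        [np.diag(z),       np.zeros((n, m)), np.diag(x)],
    ])

def rhs(A, b, c, x, lam, z, sigma):
    """Right-hand side of (4): (0, 0, sigma*mu*e) - F(x, lambda, z)."""
    mu = x @ z / x.size
    return np.concatenate([c - A.T @ lam - z, b - A @ x, sigma * mu - x * z])

# Hypothetical infeasible point: A x != b and A^T lam + z != c.
A = np.array([[1.0, 1.0, 1.0], [1.0, 2.0, 3.0]])
b = np.array([1.0, 2.0])
c = np.array([1.0, 2.0, 3.0])
x, lam, z = np.ones(3), np.zeros(2), np.array([1.0, 2.0, 0.5])
m, n = A.shape
alpha, sigma, alpha_bar, sigma_bar = 0.5, 0.5, 0.25, 0.5

# Newton step at iteration k.
J = jacobian(A, x, z)
d = np.linalg.solve(J, rhs(A, b, c, x, lam, z, sigma))
dx, dlam, dz = d[:n], d[n:n + m], d[n + m:]
x_bar, lam_bar, z_bar = x + alpha * dx, lam + alpha * dlam, z + alpha * dz

# Quasi-Newton step at iteration k+1 via Eq. (8): same matrix J, shifted right-hand side.
v = rhs(A, b, c, x_bar, lam_bar, z_bar, sigma_bar)
y = np.concatenate([alpha * (A.T @ dlam + dz), alpha * (A @ dx), x_bar * z_bar - x * z])
gamma1 = (y @ v) / (y @ y)
eta = np.concatenate([
    np.zeros(n + m),
    gamma1 * (alpha * (z * dx + x * dz) - (x_bar * z_bar - x * z)),
])
d_bar = np.linalg.solve(J, v + eta)
dx_bar, dz_bar = d_bar[:n], d_bar[n + m:]

# Both sides of Eq. (12), component-wise.
mu_bar = x_bar @ z_bar / n
p = alpha * dx + alpha_bar * dx_bar
q = alpha * dz + alpha_bar * dz_bar
lhs12 = (x_bar + alpha_bar * dx_bar) * (z_bar + alpha_bar * dz_bar)
rhs12 = ((1 - alpha_bar) * x_bar * z_bar + p * q + alpha_bar * sigma_bar * mu_bar
         - (1 + alpha_bar * gamma1) * alpha**2 * dx * dz)
```

The arrays `lhs12` and `rhs12` should coincide up to rounding; summing their entries reproduces the scalar identity (13).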

Before we conclude the brief background section and take the reader through a detailed analysis of different versions of the primal-dual quasi-Newton interior point algorithm, let us observe that Eq. (12) involves an important term $\gamma _1$. Recalling that $k + 1$ is the quasi-Newton iteration, by Eq. (8) and the Cauchy-Schwarz inequality, it is easy to observe that

$$\begin{aligned} \Vert \gamma _1 \bar{y} \Vert = | \bar{y}^T v | \Vert \bar{y} \Vert / \Vert \bar{y}^T \bar{y} \Vert \le \Vert v \Vert \iff |\gamma _1| \le \Vert v\Vert / \Vert \bar{y}\Vert , \end{aligned}$$

(15)

where $\bar{y} = \bar{F} - F$ in our notation (see Eqs. (2) and (4)). In the next lemma, a lower bound for $\Vert \bar{y}\Vert $ is derived. It states that the denominator of (15) can be bounded away from zero if a sufficient decrease of $\bar{\mu }= \bar{x}^T \bar{z} / n$ is ensured and non-null step-sizes are taken. The bound for $\Vert v\Vert $ involves the right-hand side in (7) and therefore depends on the feasibility of iterates and on the choice of the centering parameter $\sigma $.

Lemma 1

Let $\bar{y}$ be the quasi-Newton vector defined by (6) to construct $\bar{H}$, the approximation of $\bar{J}^{-1}$ by the Broyden “bad” update (5). Suppose that $\bar{\mu }\le (1 - \rho \alpha ) \mu $ holds, for $\alpha , \rho \in [0, 1]$. Then $ \Vert \bar{y}\Vert \ge \rho \alpha \mu / 2$.

Proof

If $\alpha = 0$ or $\rho = 0$ the result trivially holds, so we can assume that $\alpha , \rho \in (0, 1]$. Suppose, by contradiction, that $\Vert \bar{y} \Vert < \rho \alpha \mu / 2$. Therefore, by definition of $\bar{y}$,

$$\begin{aligned} \begin{aligned} \Vert \bar{X} \bar{Z} e - X Z e \Vert \le \Vert \bar{y} \Vert< \frac{\rho \alpha }{2} \mu&\!\Rightarrow \! |\bar{x}_i \bar{z}_i - x_i z_i| < \frac{\rho \alpha }{2} \mu , \forall i. \end{aligned} \end{aligned}$$

Dropping the absolute values, summing the $n$ resulting lower bounds $\bar{x}_i \bar{z}_i - x_i z_i > - \frac{\rho \alpha }{2} \mu $ and dividing by $n$, we obtain $\bar{\mu }- \mu > - \frac{\rho \alpha }{2} \mu $. By hypothesis we have that $\bar{\mu }\le (1 - \rho \alpha ) \mu $ and, therefore, $ - \rho \alpha \mu / 2 < \bar{\mu }- \mu \le - \rho \alpha \mu $, which, since $\alpha \mu > 0$, implies $\rho /2 > \rho $. This is a contradiction because $\rho > 0$. $\square $

Note that Lemma 1 puts restrictions on $\alpha $ and $\bar{\mu }$. Due to properties obtained from the step in the Newton iteration, we show later in Lemmas 2 and 14 that the sufficient decrease condition of $\mu $ can be ensured if the step sizes are not zero.

3 Worst-case complexity in the feasible case

For all the results in this section, we suppose that $(x, \lambda , z)$, the iterate in the Newton step, is primal and dual feasible, as per Assumption 1 below.

Assumption 1

$(x, \lambda , z) \in \mathcal {F}$.

Our analysis follows closely the theory in [15]. Under Assumption 1, the primal and dual directions in the Newton step are orthogonal to each other and can decrease the barrier parameter $\mu = (x^T z) / n$, as stated in Lemma 2.

Lemma 2

If Assumption 1 holds, then $\Delta {x}^T \Delta {z}= 0$ and $\bar{\mu }= (\bar{x}^T \bar{z}) / n = (1 - \alpha (1 - \sigma )) \mu $, for $\alpha , \sigma \in [0, 1]$.

Proof

See [15, Lemma 5.1]. $\square $

In the next two lemmas, we show that the same holds for quasi-Newton steps.

Lemma 3

If Assumption 1 holds, then $ {\Delta {\bar{x}}}^T \Delta {\bar{z}}= 0$ and $ p^T q = {\Delta {x}}^T \Delta {z}= {\Delta {x}}^T \Delta {\bar{z}}= {\Delta {\bar{x}}}^T {\Delta {z}} = 0.$

Proof

Using (8) with $r$ and $v$ given by the corresponding terms in (7), together with the primal and dual feasibility of $(x, \lambda , z)$, we observe that the first two block rows of system (7) (at the quasi-Newton iteration), given by $A^T \Delta {\bar{\lambda }}+ \Delta {\bar{z}}= 0$ and $A \Delta {\bar{x}}= 0$, respectively, are the same as in the system solved by the Newton step. Therefore, ${\Delta {\bar{x}}}^T \Delta {\bar{z}}= - {\Delta {\bar{x}}}^T A^T \Delta {\bar{\lambda }}= - \left( A \Delta {\bar{x}}\right) ^T \Delta {\bar{\lambda }}= 0$. The remaining equalities follow by the same argument and Lemma 2. $\square $

Lemma 4

For any step-size $\bar{\alpha }\in [0, 1]$ and $\bar{\sigma }\in [0, 1]$

$$\begin{aligned} \bar{\bar{\mu }}&= (1 - \bar{\alpha }(1 - \bar{\sigma })) \bar{\mu } \end{aligned}$$

(16)

$$\begin{aligned} \Vert (\bar{\bar{\mu }} - \mu ) e \Vert ^2&= [1 - (1 - \bar{\alpha }(1 - \bar{\sigma })) (1 - \alpha (1 - \sigma ))]^2 \mu ^2 n. \end{aligned}$$

(17)

Proof

By Eq. (13) and using Lemma 3, we obtain

$$\begin{aligned} \begin{aligned} n \bar{\bar{\mu }}&= \bar{\bar{x}}^T \bar{\bar{z}} = (\bar{x} + \bar{\alpha }\Delta {\bar{x}})^T (\bar{z} + \bar{\alpha }\Delta {\bar{z}}) \\&= (1 - \bar{\alpha }) {\bar{x}}^T \bar{z} + p^T q +\! \bar{\alpha }\bar{\sigma }n \bar{\mu }\!-\! (1 \! + \! \bar{\alpha }\gamma _1) {\alpha }^2 {\Delta {x}}^T \Delta {z}= (1 - \bar{\alpha }(1 - \bar{\sigma })) n \bar{\mu }. \end{aligned} \end{aligned}$$

By dividing both sides of the last equation by n, we obtain (16). Then, using (16) and Lemma 2 we get

$$\begin{aligned} \begin{aligned} \Vert (\bar{\bar{\mu }} - \mu ) e \Vert ^2&= (\bar{\bar{\mu }} - \mu )^2 n \ = \ [(1 - \bar{\alpha }(1 - \bar{\sigma })) \bar{\mu }- \mu ]^2 n \\&= [(1 - \bar{\alpha }(1 - \bar{\sigma })) (1 - \alpha (1 - \sigma )) \mu - \mu ]^2 n \\&= [1 - (1 - \bar{\alpha }(1 - \bar{\sigma })) (1 - \alpha (1 - \sigma ))]^2 \mu ^2 n. \end{aligned} \end{aligned}$$

$\square $

It is worth noting that (in the feasible case) the term $\gamma _1$ originating from system (8) does not have any influence on the value of $\bar{\bar{\mu }}$.
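Lemmas 2 to 4 can be observed numerically on a small feasible instance. The sketch below, assuming NumPy, uses hypothetical toy data and arbitrary step-sizes, computes the quasi-Newton direction through (8), and evaluates the orthogonality relations of Lemma 3 together with the reduction formulas of Lemma 2 and Eq. (16).

```python
import numpy as np

def jacobian(A, x, z):
    """Unreduced Jacobian of F, Eq. (3)."""
    m, n = A.shape
    return np.block([
        [np.zeros((n, n)), A.T,              np.eye(n)],
        [A,                np.zeros((m, m)), np.zeros((m, n))],
        [np.diag(z),       np.zeros((n, m)), np.diag(x)],
    ])

# Hypothetical strictly feasible point: A x = b, A^T lam + z = c, x > 0, z > 0.
A = np.array([[1.0, 1.0, 1.0]])
b = np.array([1.0])
c = np.array([1.0, 2.0, 3.0])
x, lam = np.array([0.5, 0.3, 0.2]), np.array([0.0])
z = c - A.T @ lam
m, n = A.shape
alpha, sigma, alpha_bar, sigma_bar = 0.5, 0.4, 0.3, 0.6
mu = x @ z / n

# Newton step; the first two residual blocks are zero by feasibility.
J = jacobian(A, x, z)
d = np.linalg.solve(J, np.concatenate([np.zeros(n + m), sigma * mu - x * z]))
dx, dlam, dz = d[:n], d[n:n + m], d[n + m:]
x_bar, z_bar = x + alpha * dx, z + alpha * dz
mu_bar = x_bar @ z_bar / n

# Quasi-Newton step through Eq. (8).
v = np.concatenate([np.zeros(n + m), sigma_bar * mu_bar - x_bar * z_bar])
y = np.concatenate([alpha * (A.T @ dlam + dz), alpha * (A @ dx), x_bar * z_bar - x * z])
gamma1 = (y @ v) / (y @ y)
eta = np.concatenate([
    np.zeros(n + m),
    gamma1 * (alpha * (z * dx + x * dz) - (x_bar * z_bar - x * z)),
])
d_bar = np.linalg.solve(J, v + eta)
dx_bar, dz_bar = d_bar[:n], d_bar[n + m:]
x_bb, z_bb = x_bar + alpha_bar * dx_bar, z_bar + alpha_bar * dz_bar
mu_bb = x_bb @ z_bb / n

# Quantities predicted by Lemmas 2-4.
p = alpha * dx + alpha_bar * dx_bar
q = alpha * dz + alpha_bar * dz_bar
mu_bar_pred = (1 - alpha * (1 - sigma)) * mu
mu_bb_pred = (1 - alpha_bar * (1 - sigma_bar)) * mu_bar
```

Here `dx_bar @ dz_bar` and `p @ q` should vanish up to rounding, and `mu_bb` should match `mu_bb_pred` whatever value `gamma1` takes.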

3.1 The $\mathcal {N}_2$ neighborhood

In this section we will consider a short-step interior point method and employ the notion of $\mathcal {N}_2$ neighborhood of the central path

$$\begin{aligned} \mathcal {N}_2(\theta ) = \{(x, \lambda , z) \in \mathcal {F} \ | \ \Vert XZe - \mu e\Vert \le \theta \mu \}, \end{aligned}$$

where $\theta \in (0, 1)$ and $\mathcal {F}$ is the set of primal and dual feasible points such that $x, z > 0$, see Assumption 1. For all considerations in this subsection we add the following assumption, which is known to hold for suitable choices of $\theta $, $\bar{\theta }$, $\sigma $ and $\bar{\sigma }$ since we start from a Newton iteration (see [15]), as we state in Theorem 1.

Assumption 2

$(x, \lambda , z) \in \mathcal {N}_2(\theta )$ and $(\bar{x}, \bar{\lambda }, \bar{z}) \in \mathcal {N}_2(\bar{\theta })$, for $\theta , \bar{\theta }\in (0, 1)$.
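Membership in $\mathcal {N}_2(\theta )$ amounts to a feasibility check plus one norm inequality. The following is a minimal sketch, assuming NumPy; the function name `in_N2` and the feasibility tolerance `tol` are hypothetical additions needed for floating-point arithmetic and do not appear in the definition above.

```python
import numpy as np

def in_N2(A, b, c, x, lam, z, theta, tol=1e-10):
    """Check (x, lam, z) in N_2(theta); `tol` is a hypothetical feasibility tolerance."""
    mu = x @ z / x.size
    feasible = (np.linalg.norm(A @ x - b) <= tol
                and np.linalg.norm(A.T @ lam + z - c) <= tol
                and np.all(x > 0) and np.all(z > 0))
    # Proximity measure ||XZe - mu e|| <= theta * mu.
    return bool(feasible and np.linalg.norm(x * z - mu) <= theta * mu)

# Hypothetical toy data: min x1 + 2*x2  s.t.  x1 + x2 = 1, x >= 0.
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
c = np.array([1.0, 2.0])
lam = np.array([0.0])
z = c - A.T @ lam

centered = in_N2(A, b, c, np.array([2 / 3, 1 / 3]), lam, z, theta=0.1)
off_center = in_N2(A, b, c, np.array([0.9, 0.1]), lam, z, theta=0.1)
```

In this example the first point has equal complementarity products $x_i z_i$ and therefore lies on the central path, while the second is feasible but too far from the path for $\theta = 0.1$.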

Our main goal is to show that the new iterate

$$\begin{aligned} (\bar{\bar{x}}, \bar{\bar{\lambda }}, \bar{\bar{z}}) = (\bar{x}, \bar{\lambda }, \bar{z}) + \bar{\alpha }(\Delta {\bar{x}}, \Delta {\bar{\lambda }}, \Delta {\bar{z}})\end{aligned}$$

also belongs to $\mathcal {N}_2(\bar{\bar{\theta }})$, for suitable choices of $\bar{\bar{\theta }} \in (0, 1)$, $\bar{\alpha }$ and $\alpha $. Therefore, we are interested in the analysis of the Euclidean norm of the vector $(\bar{X} + \bar{\alpha }\Delta {\bar{X}}) (\bar{Z} + \bar{\alpha }\Delta {\bar{Z}})e - \bar{\bar{\mu }} e$ and to deliver it we will exploit several useful results stated earlier in Eqs. (12), (14) and Lemma 4. Combining (14) and Lemma 4 we get

$$\begin{aligned} \begin{aligned} \left[ \bar{x} \right. + \left. \bar{\alpha }\Delta {\bar{x}}\right] _i \left[ \bar{z} + \bar{\alpha }\Delta {\bar{z}}\right] _i -&\bar{\bar{\mu }}\,\,{=}\,\,(1 - \bar{\alpha }) \left( \bar{x}_i \bar{z}_i {-} \bar{\mu }\right) + p_i q_i \\ {}&\quad -(1 + \bar{\alpha }\gamma _1) {\alpha }^2 [\Delta {x}]_i [\Delta {z}]_i. \end{aligned}\nonumber \\ \end{aligned}$$

(18)

For the term $[\Delta {x}]_i [\Delta {z}]_i$ on the right-hand side of Eq. (18), a well-known bound is given in Lemma 5.

Lemma 5

Under Assumptions 1 and 2, $ \Vert \Delta {X}\Delta {Z}e\Vert \le \frac{\theta ^2 + n (1 - \sigma )^2}{2^{3/2} (1 - \theta )} \mu . $

Proof

See [15, Lemma 5.4]. $\square $

By (18), Lemma 5 and Assumption 2, we deliver the following bound on the proximity measure of the $\mathcal {N}_2$ neighborhood of the iterate after the quasi-Newton step

$$\begin{aligned} \begin{aligned} \Vert (\bar{X}&+ \bar{\alpha }\Delta {\bar{X}}) ( \bar{Z} + \bar{\alpha }\Delta {\bar{Z}}) e - \bar{\bar{\mu }} e \Vert = \\&= \big \Vert \left\{ (1 - \bar{\alpha }) \left( \bar{x}_i \bar{z}_i - \bar{\mu }\right) + p_i q_i - (1 + \bar{\alpha }\gamma _1) {\alpha }^2 [\Delta {x}]_i [\Delta {z}]_i \right\} _{i = 1}^n \big \Vert \\&\le (1 - \bar{\alpha }) \left\| \bar{X} \bar{Z} e - \bar{\mu }e \right\| + \left\| P Q e \right\| + | 1 + \bar{\alpha }\gamma _1 | {\alpha }^2 \Vert \Delta {X}\Delta {Z}e\Vert \\&\le (1 - \bar{\alpha }) \bar{\theta }\bar{\mu }+ | 1 + \bar{\alpha }\gamma _1 | {\alpha }^2 \frac{\theta ^2 + n (1 - \sigma )^2}{2^{3/2} (1 - \theta )} \mu + \left\| P Q e \right\| . \end{aligned}\nonumber \\ \end{aligned}$$

(19)

To further exploit (19) we need bounds on two terms which appear in it: $1 + \bar{\alpha }\gamma _1$ and the second-order error contributed by the Newton and quasi-Newton composite direction $\left\| P Q e \right\| $. The following technical result delivers a bound for $|\gamma _1|$. (Observe that $\gamma _1$ is evaluated only when the quasi-Newton iteration is performed, see (8).)

Lemma 6

Consider a quasi-Newton iteration of Algorithm 1. Suppose that v in system (8) is given by the right-hand side of (4) and Assumption 2 holds. If $\alpha \in (0, 1]$ and $\sigma \in [0, 1)$, then

$$\begin{aligned} |\gamma _1| \le \frac{2 (1 - \alpha (1 - \sigma )) \sqrt{\bar{\theta }^2 + (1 - \bar{\sigma })^2 n}}{\alpha (1 - \sigma )}. \end{aligned}$$

Proof

The bound for $\gamma _1$ is obtained by Eq. (15) and Lemma 1, so an upper bound for $\Vert v \Vert $ is necessary. By the assumptions of the lemma, we have the property $e^T (\bar{\mu }e - \bar{X} \bar{Z} e) = 0$, which, together with Lemma 2 is used to derive the result

$$\begin{aligned} \begin{aligned} \Vert v\Vert&= \Vert \bar{\sigma }\bar{\mu }e - \bar{X} \bar{Z} e\Vert = \Vert (\bar{\mu }e - \bar{X} \bar{Z} e) - (1 - \bar{\sigma }) \bar{\mu }e \Vert \\&= \sqrt{\Vert \bar{\mu }e - \bar{X} \bar{Z} e \Vert ^2 - 2 (1 - \bar{\sigma }) \bar{\mu }e^T (\bar{\mu }e - \bar{X} \bar{Z} e) + \Vert (1 - \bar{\sigma }) \bar{\mu }e \Vert ^2} \\&\le \sqrt{\bar{\theta }^2 \bar{\mu }^2 + (1 - \bar{\sigma })^2 \bar{\mu }^2 n} \ = (1 - \alpha (1 - \sigma )) \mu \sqrt{\bar{\theta }^2 + (1 - \bar{\sigma })^2 n}. \end{aligned} \end{aligned}$$

By defining $\rho = 1 - \sigma $ we can see that $\bar{\mu }= (1 - (1 - \sigma ) \alpha ) \mu = (1 - \rho \alpha ) \mu $ which ensures the sufficient decrease condition of Lemma 1. Since $\alpha > 0$ and $\sigma < 1$, by the assumptions of the lemma, we have that $\Vert \bar{y} \Vert > 0$. The result follows by simple substitution of the previous equations and Lemma 1 in Eq. (15). $\square $
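To get a feeling for how the bound of Lemma 6 behaves, it can be evaluated directly. The helper below is a hypothetical illustration using only the Python standard library; it simply codes the right-hand side of the inequality in Lemma 6 under the lemma's conditions on $\alpha $ and $\sigma $.

```python
import math

def gamma1_bound(alpha, sigma, theta_bar, sigma_bar, n):
    """Right-hand side of the bound on |gamma_1| in Lemma 6."""
    if not (0 < alpha <= 1 and 0 <= sigma < 1):
        raise ValueError("Lemma 6 requires alpha in (0, 1] and sigma in [0, 1)")
    numerator = 2 * (1 - alpha * (1 - sigma)) * math.sqrt(
        theta_bar**2 + (1 - sigma_bar)**2 * n)
    return numerator / (alpha * (1 - sigma))

# Hypothetical parameter values, chosen only for illustration.
full_step = gamma1_bound(alpha=1.0, sigma=0.5, theta_bar=0.0, sigma_bar=0.0, n=4)
short_step = gamma1_bound(alpha=0.1, sigma=0.5, theta_bar=0.0, sigma_bar=0.0, n=4)
```

The bound deteriorates as the Newton step-size $\alpha $ shrinks or as $\sigma $ approaches 1, which is consistent with the role of the sufficient decrease condition in Lemma 1.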

Next we turn our attention to obtaining a bound for the error in the composite direction $\Vert P Q e\Vert $ and start from two technical results.

Lemma 7

Under Assumptions 1 and 2,

$$\begin{aligned} \begin{aligned} Z&P e + X Q e = (\bar{\alpha }+ \alpha (1 \! - \! \bar{\alpha })) \! \left( \mu e \! - \! X Z e \right) \!+\! (\bar{\bar{\mu }} \!-\! \mu ) e \!-\! (1 \!+\! \gamma _1) \bar{\alpha }\alpha ^2 \Delta {X}\Delta {Z}e. \end{aligned} \nonumber \\ \end{aligned}$$

(20)

Proof

We use Eqs. (4), (10) and (9) and some simple manipulations to obtain

$$\begin{aligned} \begin{aligned} Z&P e + X Q e \!=\! \alpha \left( Z \Delta {x}\! + \! X \Delta {z}\right) \!+\! \bar{\alpha }\left( Z \Delta {\bar{x}}\! + \! X \Delta {\bar{z}}\right) \\&= \alpha (\sigma \mu e \! - \! X Z e) \!+\! \bar{\alpha }\left( \bar{\sigma }\bar{\mu }e - \bar{X} \bar{Z} e - \gamma _1 {\alpha }^2 \Delta {X}\Delta {Z}e \right) \\&= \alpha (\sigma \mu e \! - \! X Z e) \!+\! \bar{\alpha }\left( \bar{\sigma }\bar{\mu }e \! - \! (1 \! - \! \alpha ) X Z e \! - \! \alpha \sigma \mu e \! - \! {\alpha }^2 \Delta {X}\Delta {Z}e \! - \! \gamma _1 {\alpha }^2 \Delta {X}\Delta {Z}e \right) \\&\!=\! (1 \!-\! \bar{\alpha }) \alpha \sigma \mu e - (\bar{\alpha }\!+\! \alpha (1 \!-\! \bar{\alpha })) X Z e \!+\! \bar{\alpha }\bar{\sigma }\bar{\mu }e \!-\! (1 \!+\!\gamma _1) \bar{\alpha }\alpha ^2 \Delta {X}\Delta {Z}e. \end{aligned}\nonumber \\ \end{aligned}$$

(21)

After adding and subtracting the term $(\bar{\alpha }+ \alpha (1 - \bar{\alpha })) \mu e$, we rearrange the previous equation as

$$\begin{aligned} \begin{aligned} Z&\left( \alpha \Delta {X}+ \bar{\alpha }\Delta {\bar{X}}\right) e + X \left( \alpha \Delta {Z}+ \bar{\alpha }\Delta {\bar{Z}}\right) e = (\bar{\alpha }+ \alpha (1 - \bar{\alpha })) \left( \mu e - X Z e \right) \\&\quad - (1 + \gamma _1) \bar{\alpha }\alpha ^2 \Delta {X}\Delta {Z}e + ((1 - \bar{\alpha }) \alpha \sigma - (\bar{\alpha }+ \alpha (1 - \bar{\alpha }))) \mu e + \bar{\alpha }\bar{\sigma }\bar{\mu }e. \end{aligned}\nonumber \\ \end{aligned}$$

(22)

Then, using $\bar{\mu }= (1 \! - \! \alpha (1 \! - \! \sigma )) \mu $ (which clearly holds for a step in Newton direction) and Lemma 4, which delivers a similar result for a step in quasi-Newton direction, we get:

$$\begin{aligned} \begin{aligned} ((1 -&\bar{\alpha }) \alpha \sigma - (\bar{\alpha }+ \alpha (1 - \bar{\alpha }))) \mu e + \bar{\alpha }\bar{\sigma }\bar{\mu }e = \\&= (\alpha \sigma -1- \bar{\alpha }\alpha \sigma - \bar{\alpha }- \alpha + 1 + \bar{\alpha }\alpha ) \mu e + \bar{\alpha }\bar{\sigma }\bar{\mu }e = \bar{\bar{\mu }} e - \mu e. \end{aligned}\nonumber \\ \end{aligned}$$

(23)

By substituting (23) in (22), the desired result is obtained. $\square $
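The scalar identity behind (23) can be checked numerically. The sketch below is only an illustration: it assumes that Lemma 4 gives $\bar{\bar{\mu }} = (1 - \bar{\alpha }(1 - \bar{\sigma })) \bar{\mu }$, which is the form used later in (32), and the function name is hypothetical.

```python
def identity_23_sides(alpha, alpha_bar, sigma, sigma_bar, mu):
    """Return the coefficients of e on both sides of (23)."""
    # Duality measure after the Newton step.
    mu_bar = (1.0 - alpha * (1.0 - sigma)) * mu
    # Assumed form of Lemma 4 for the quasi-Newton step.
    mu_bar_bar = (1.0 - alpha_bar * (1.0 - sigma_bar)) * mu_bar
    lhs = ((1.0 - alpha_bar) * alpha * sigma
           - (alpha_bar + alpha * (1.0 - alpha_bar))) * mu
    lhs += alpha_bar * sigma_bar * mu_bar
    rhs = mu_bar_bar - mu
    return lhs, rhs
```

Both returned values should coincide up to rounding for any choice of step-sizes and centering parameters.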

Lemma 8

Let u, v be vectors such that $v^T u = 0$, and let $U = {{\,\textrm{diag}\,}}(u)$ and $V = {{\,\textrm{diag}\,}}(v)$ be their respective diagonal matrices. Then $\Vert U V e \Vert \le 2^{- 3/2} \Vert u + v\Vert ^2$.

Proof

See [15, Lemma 5.3]. $\square $
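The inequality of Lemma 8, in the squared form in which it is applied in (24), can be probed numerically. The following Python sketch is an illustration under stated assumptions: the helper names are hypothetical, and orthogonal pairs are produced by a single Gram-Schmidt step.

```python
import math
import random


def lemma8_sides(u, v):
    """Return (||U V e||, 2^(-3/2) * ||u + v||^2) for equal-length vectors."""
    lhs = math.sqrt(sum((ui * vi) ** 2 for ui, vi in zip(u, v)))
    rhs = 2.0 ** (-1.5) * sum((ui + vi) ** 2 for ui, vi in zip(u, v))
    return lhs, rhs


def random_orthogonal_pair(n, rng):
    """Hypothetical helper: random u and a vector v with u^T v = 0 (up to rounding)."""
    u = [rng.gauss(0.0, 1.0) for _ in range(n)]
    w = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Remove from w its component along u.
    coef = sum(ui * wi for ui, wi in zip(u, w)) / sum(ui * ui for ui in u)
    v = [wi - coef * ui for ui, wi in zip(u, w)]
    return u, v
```

The pair $u = (1, 1)$, $v = (1, -1)$ attains the bound with equality, so comparisons should allow a small rounding tolerance.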

Lemma 9

Under Assumptions 1 and 2,

$$\begin{aligned} \begin{aligned} \Vert P Q e \Vert \le \frac{\mu }{2^{3/2} (1 \! - \! \theta )}&\Bigg [ [1 - (1 \! - \! \bar{\alpha }(1 \! - \! \bar{\sigma })) (1 \! - \! \alpha (1 \! - \! \sigma ))]^2 n + \\&\left( (\bar{\alpha }+ \alpha (1 - \bar{\alpha })) \theta + |1 \! + \! \gamma _1| \bar{\alpha }\alpha ^2 \frac{\theta ^2 + n (1 \! - \! \sigma )^2}{2^{3/2} (1 - \theta )} \right) ^2 \Bigg ]. \end{aligned} \end{aligned}$$

Proof

Let us first define $D = (X)^{1/2} (Z)^{-1/2}$ and the scaled vectors $u = {D}^{-1} P e = {D}^{-1} p$ and $v = D Q e = D q$. Using Lemma 3 and observing that all the involved matrices are diagonal, we get $u^T v = 0$. Hence, using Lemma 8, we write

$$\begin{aligned} \Vert P Q e \Vert = \Vert D^{-1} P D Q e \Vert = \Vert U V e \Vert \ \le \ 2^{-3/2} \Vert u + v \Vert ^2 = \frac{ \Vert D^{-1} P e + D Q e \Vert ^2}{2^{3/2}}.\nonumber \\ \end{aligned}$$

(24)

After multiplying both sides of (20) by $(X Z)^{-1/2}$ and substituting the result into (24), we obtain

$$\begin{aligned} \begin{aligned} \Vert&P Q e \Vert \le 2^{-3/2} \bigg \Vert (X Z)^{-1/2} \big \{ (\bar{\alpha }+ \alpha (1 - \bar{\alpha })) \left( \mu e - X Z e \right) + \\&\hspace{2.5cm} (\bar{\bar{\mu }} - \mu ) e - (1 + \gamma _1) \bar{\alpha }\alpha ^2 \Delta {X}\Delta {Z}e \big \} \bigg \Vert ^2 \\&= \sum _{i = 1}^n \frac{ \left\{ (\bar{\alpha }+ \alpha (1 - \bar{\alpha })) \left( \mu - x_i z_i \right) + \bar{\bar{\mu }} - \mu - (1 + \gamma _1) \bar{\alpha }\alpha ^2 [\Delta {x}]_i [\Delta {z}]_i \right\} ^2}{x_i z_i \left( 2^{3/2}\right) } \\&\le \ \frac{\Vert (\bar{\alpha }+ \alpha (1 - \bar{\alpha })) \left( \mu e - X Z e \right) - (1 + \gamma _1) \bar{\alpha }\alpha ^2 \Delta {X}\Delta {Z}e + (\bar{\bar{\mu }} - \mu ) e \Vert ^2}{2^{3/2}(1 - \theta ) \mu }, \end{aligned}\nonumber \\ \end{aligned}$$

(25)

where the last inequality comes from the fact that $(x, \lambda , z)$ belongs to $\mathcal {N}_2(\theta )$, hence $(1 - \theta ) \mu \le x_i z_i \le (1 + \theta ) \mu $ for all $i = 1, \dots , n$. Now we use Lemma 3 again and the definition of $\mu $ to observe that $ e^T \left( (\bar{\alpha }+ \alpha (1 - \bar{\alpha })) \left( \mu e - X Z e \right) - (1 + \gamma _1) \bar{\alpha }\alpha ^2 \Delta {X}\Delta {Z}e \right) = 0 $ and further rearrange (25):

$$\begin{aligned} \begin{aligned} \Vert P Q e \Vert&\! \le \! \frac{ \left\| (\bar{\alpha }\!+\! \alpha (1 \!-\! \bar{\alpha })) \left( \mu e \!-\! X Z e \right) \!-\! (1 \!+\! \gamma _1) \bar{\alpha }\alpha ^2 \Delta {X}\Delta {Z}e \right\| ^2 \!+\! \Vert (\bar{\bar{\mu }} \!-\! \mu ) e \Vert ^2}{2^{3/2}(1 - \theta ) \mu }. \end{aligned}\nonumber \\ \end{aligned}$$

(26)

The second norm on the right-hand side of (26) is given by (17). Using the triangle inequality, the definition of $\mathcal {N}_2(\theta )$ neighborhood and Lemma 5, for the first norm we get

$$\begin{aligned} \begin{aligned} \big \Vert (\bar{\alpha }&+ \alpha (1 - \bar{\alpha })) \left( \mu e - X Z e \right) - (1 + \gamma _1) \bar{\alpha }\alpha ^2 \Delta {X}\Delta {Z}e \big \Vert ^2 \le \\&\le \left( (\bar{\alpha }+ \alpha (1 - \bar{\alpha })) \theta + |1 + \gamma _1| \bar{\alpha }\alpha ^2 \frac{\theta ^2 + n (1 - \sigma )^2}{2^{3/2} (1 - \theta )} \right) ^2 \mu ^2.\\ \end{aligned}\nonumber \\ \end{aligned}$$

(27)

Finally, by substituting (17) and (27) in (26) we obtain the required result. $\square $

We are ready to state the main result of this subsection. In Theorem 1 we show that it is possible to choose sufficiently large values for the step-sizes $\alpha $ and $\bar{\alpha }$ such that $(\bar{\bar{x}}, \bar{\bar{\lambda }}, \bar{\bar{z}}) \in \mathcal {N}_2(\theta )$, where $\theta $ defines the size of the neighborhood used in the Newton step. Therefore, we ensure that the quasi-Newton step taken after a Newton step also remains in the $\mathcal {N}_2$ neighborhood. This implies that all the iterates generated by the algorithm belong to $\mathcal {N}_2$. We recall from [15, Theorem 5.6] that it is possible to choose parameters $\theta , \bar{\theta }\in (0, 1)$ and $\sigma , \bar{\sigma }\in (0, 1)$ so that

$$\begin{aligned} \frac{\theta ^2 + n (1 - \sigma )^2}{2^{3/2} (1 - \theta )} \le \theta \sigma \quad \text {and} \quad \frac{\bar{\theta }^2 + n (1 - \bar{\sigma })^2}{2^{3/2} (1 - \bar{\theta })} \le \bar{\theta }\bar{\sigma }. \end{aligned}$$

(28)

Theorem 1

Suppose that Assumptions 1 and 2 hold and that $\bar{\theta }= \theta $ and $\bar{\sigma }= \sigma $. If the step-sizes in Newton and quasi-Newton iterations $\alpha $ and $\bar{\alpha }$, respectively, satisfy

$$\begin{aligned} \alpha \! \in \!\left( \!0, \min \left\{ \frac{1 - \sigma }{4 \sigma }, \frac{\sigma (1 - \sigma )}{10 (1 - \sigma ) + 4}\right\} \right] \, \text {and} \ \, \bar{\alpha }\!\in \! \left[ \alpha , \frac{\sigma (1 - \sigma )}{10 (1 - \sigma ) + 4} \right] \end{aligned}$$

(29)

for $\sigma \in [0, 1)$ and $\theta \in (0, 16/25)$, then

$$\begin{aligned} (\bar{\bar{x}}, \bar{\bar{\lambda }}, \bar{\bar{z}}) \doteq (\bar{x}, \bar{\lambda }, \bar{z}) + \bar{\alpha }(\Delta {\bar{x}}, \Delta {\bar{\lambda }}, \Delta {\bar{z}})\in \mathcal {N}_2(\theta ). \end{aligned}$$

Proof

We will first show that for all $\alpha $ and $\bar{\alpha }$ satisfying the conditions of the theorem,

$$\begin{aligned} \Vert \left( \bar{X} + \right. \left. \bar{\alpha }\Delta {\bar{X}}\right) \left( \bar{Z} + \bar{\alpha }\Delta {\bar{Z}}\right) e - \bar{\bar{\mu }} e \Vert - \theta _{k} \bar{\bar{\mu }} \le 0 \end{aligned}$$

(30)

holds.

By inequality (19), condition (30) is satisfied if

$$\begin{aligned} \begin{aligned} \left[ (1 - \bar{\alpha }) \bar{\theta }\bar{\mu }- \theta _{k} \bar{\bar{\mu }}\right] + | 1 + \bar{\alpha }\gamma _1 | {\alpha }^2 \frac{\theta ^2 + n (1 - \sigma )^2}{2^{3/2} (1 - \theta )} \mu + \left\| P Q e \right\| \le 0. \end{aligned}\nonumber \\ \end{aligned}$$

(31)

We will derive a bound for each term on the left-hand side of this inequality in order to obtain an expression of the form $K_1 \bar{\alpha }^2 - K_2 \bar{\alpha }$, with $K_1, K_2 > 0$, which is nonpositive for sufficiently small values of $\bar{\alpha }$.

For the first term, we use the fact that $\bar{\theta }= \theta $, $\bar{\sigma }= \sigma $ and $\alpha \in (0, 1)$. In addition, we use Lemma 4 to expand $\bar{\bar{\mu }}$ and the fact that $\bar{\mu }$ was calculated in the Newton step to obtain

$$\begin{aligned} (1 - \bar{\alpha }) \bar{\theta }\bar{\mu }- \theta \bar{\bar{\mu }} = [ (1 - \bar{\alpha }) (\bar{\theta }- \theta ) - \bar{\alpha }\bar{\sigma }\theta ] (1 - \alpha (1 - \sigma )) \mu \le - \bar{\alpha }\sigma ^2 \theta \mu .\nonumber \\ \end{aligned}$$

(32)

For the second and third terms, we first apply (28) in Lemma 6 to simplify the bound of $\gamma _1$:

$$\begin{aligned} \begin{aligned} |\gamma _1|&\le \frac{2 (1 - \alpha (1 - \sigma )) \sqrt{2^{3/2} (1 - \bar{\theta })} \sqrt{\frac{\bar{\theta }^2 + (1 - \bar{\sigma })^2 n}{2^{3/2} (1 - \bar{\theta })}}}{\alpha (1 - \sigma )} \\&\le \frac{2 (1 - \alpha (1 - \sigma )) \sqrt{2^{3/2} (1 - \bar{\theta })} \sqrt{\bar{\theta }\bar{\sigma }}}{\alpha (1 - \sigma )} \le \frac{4}{\alpha (1 - \sigma )} - 4. \end{aligned}\nonumber \\ \end{aligned}$$

(33)

Therefore, using (28) again, we derive the following bound for the second term

$$\begin{aligned} \begin{aligned} |1&+ \bar{\alpha }\gamma _1| \alpha ^2 \frac{\theta ^2 + n (1 - \sigma )^2}{2^{3/2} (1 - \theta )} \mu \le \left[ 1 + \bar{\alpha }\left( \frac{4}{\alpha (1 - \sigma )} - 4 \right) \right] \alpha ^2 \theta \sigma \mu \\&= \alpha ^2 \theta \sigma \mu + 4 \bar{\alpha }\alpha \! \left( \frac{1 \!-\! \alpha (1 \!-\! \sigma )}{1 \!-\! \sigma } \right) \theta \sigma \mu \!\le \! \bar{\alpha }^2\! \left[ 1\! +\! \frac{4}{1 \!-\! \sigma } \right] \theta \sigma \mu , \end{aligned}\nonumber \\ \end{aligned}$$

(34)

where in the last inequality we used the condition $\alpha \le \bar{\alpha }\le 1$ from (29). For the third term in (31), we use the bound obtained in Lemma 9 and analyze each part of it independently. Since $\alpha \le \bar{\alpha }$ and $\bar{\sigma }= \sigma $, we observe that

$$\begin{aligned} \begin{aligned}&[1 \!-\! (1 \!-\! \bar{\alpha }(1 - \bar{\sigma })) (1 - \alpha (1 - \sigma ))]^2 n = [\alpha (1 - \sigma ) + \bar{\alpha }(1 - \alpha (1 - \sigma )) (1 - \sigma )]^2 n\\&\quad \le \bar{\alpha }^2 (1 - \sigma )^2 [1 + (1 - \alpha (1 - \sigma ))]^2 n \le 4 \bar{\alpha }^2 (1 - \sigma )^2 n. \end{aligned}\nonumber \\ \end{aligned}$$

(35)

Using bound (33), assumption $\alpha \le (1 - \sigma ) / (4 \sigma )$ in (29), and (28) again, we also obtain

$$\begin{aligned} \begin{aligned} \Bigg [ (\bar{\alpha }+ \alpha&(1 - \bar{\alpha })) \theta + |1 + \gamma _1| \bar{\alpha }\alpha ^2 \frac{\theta ^2 + n (1 - \sigma )^2}{2^{3/2} (1 - \theta )} \Bigg ]^2 \le \\&\le \left[ \bar{\alpha }+ \alpha (1 - \bar{\alpha }) + \left( \frac{4}{\alpha (1 - \sigma )} - 3 \right) \bar{\alpha }\alpha ^2 \sigma \right] ^2 \theta ^2 \\&\le \left[ \bar{\alpha }+ \alpha (1 - \bar{\alpha }) + \frac{4}{\alpha (1 - \sigma )} \bar{\alpha }\alpha ^2 \sigma \right] ^2 \theta ^2 \\&\le \bar{\alpha }^2 \left[ 1 + (1 - \bar{\alpha }) + \frac{4 \alpha \sigma }{(1 - \sigma )} \right] ^2 \theta ^2 \le 9 \bar{\alpha }^2 \theta ^2. \end{aligned}\nonumber \\ \end{aligned}$$

(36)

By combining (35) and (36) in the statement of Lemma 9 and applying (28) once more, we derive a bound for the third term of (31)

$$\begin{aligned} \begin{aligned} \Vert P Q e \Vert \le \frac{\mu }{2^{3/2} (1 - \theta )} \left( 4 \bar{\alpha }^2 (1 - \sigma )^2 n + 9 \bar{\alpha }^2 \theta ^2 \right) \le 9 \bar{\alpha }^2 \theta \sigma \mu . \end{aligned} \end{aligned}$$

(37)

By (32), (34) and (37), the left expression in (31) can be bounded by

$$\begin{aligned} \bar{\alpha }\left[ 10 \bar{\alpha }+ \frac{4 \bar{\alpha }}{1 - \sigma } - \sigma \right] \theta \sigma \mu , \end{aligned}$$

which is nonpositive whenever $\bar{\alpha }\le \sigma (1 - \sigma ) / (10 (1 - \sigma ) + 4)$, as required by (29). This bound on $\bar{\alpha }$ implies a bound on $\alpha $, due to the condition $\alpha \le \bar{\alpha }$. Therefore, we arrive at the step-size conditions (29) of the theorem.

We now show that the new iterate belongs to $\mathcal {F}$. By Assumption 1 and Eq. (8) we know that all iterates remain primal and dual feasible. It remains to be shown that $\bar{\bar{x}}$ and $\bar{\bar{z}}$ are strictly positive. We follow the same arguments as [18] and adapt them to our case. Suppose by contradiction that $\bar{\bar{x}}_i \le 0$ or $\bar{\bar{z}}_i \le 0$ hold for some i. By inequality (30) we have that $ \left( \bar{x}_i + \bar{\alpha }\Delta {\bar{x}}_i \right) \left( \bar{z}_i + \bar{\alpha }\Delta {\bar{z}}_i \right) \ge (1 - \theta ) \bar{\bar{\mu }} > 0 $. Hence, $\bar{\bar{x}}_i < 0$ and $\bar{\bar{z}}_i < 0$ and that implies $x_i z_i < p_i q_i \le \Vert P Q e\Vert \le 9 \bar{\alpha }^2 \theta \sigma \mu $ by inequality (37). Since $(x, \lambda , z) \in \mathcal {N}_2(\theta )$ and $\bar{\alpha }\le 1/4$, by (29), we conclude that $(1 - \theta ) \mu < (9/16) \theta \mu $ and, therefore, $\theta > 16/25$, which contradicts the choice of $\theta $. Hence, $(\bar{\bar{x}}, \bar{\bar{\lambda }}, \bar{\bar{z}})$ belongs to $\mathcal {N}_2(\theta )$.

$\square $
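The step-size limits in (29) and the sign of the bracketed factor obtained at the end of the proof can be evaluated directly. The sketch below is a minimal illustration with hypothetical function names; it assumes $0< \sigma < 1$ so that both expressions in (29) are well defined.

```python
def theorem1_step_caps(sigma):
    """Upper limits for alpha and alpha_bar in (29); assumes 0 < sigma < 1."""
    if not 0.0 < sigma < 1.0:
        raise ValueError("sigma must lie strictly between 0 and 1")
    alpha_bar_cap = sigma * (1.0 - sigma) / (10.0 * (1.0 - sigma) + 4.0)
    alpha_cap = min((1.0 - sigma) / (4.0 * sigma), alpha_bar_cap)
    return alpha_cap, alpha_bar_cap


def bracket(alpha_bar, sigma):
    """Bracketed factor 10*alpha_bar + 4*alpha_bar/(1 - sigma) - sigma from the proof."""
    return 10.0 * alpha_bar + 4.0 * alpha_bar / (1.0 - sigma) - sigma
```

At `alpha_bar_cap` the bracket vanishes, below it the bracket is negative, and the cap never exceeds $1/4$, which is the property used in the positivity argument.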

The upper bounds for the step-sizes as stated in Theorem 1 are used to determine the worst-case complexity of Algorithm 1 operating in $\mathcal {N}_2$, given by Theorem 2.

Theorem 2

Suppose that $(x^{0}, {\lambda }^{0}, z^{0}) \in \mathcal {N}_2(0.4)$ and $\theta _k = 0.4$ for all iterations k of Algorithm 1. If $\mu _0 = \epsilon ^\kappa $, for a given $\epsilon > 0$ and a constant $\kappa $, then there exists $K > 0$, $K = O(n)$, such that $\mu _k \le \epsilon $ for all $k \ge K$.

Proof

From Lemmas 2 and 4 we have that $\bar{\bar{\mu }} \le \bar{\mu }\le \mu $. So it is enough to look just at the Newton steps, which are easier to analyze. Let us define $\sigma _k = 1 - 0.4 / \sqrt{n}$ and $\theta _k = 0.4$ for all k. This choice satisfies condition (28) (see [4, 15]) and maintains all the previous results. For the remainder of this proof, let k be a Newton iteration. Using (29) it is not hard to see that

$$\begin{aligned} \alpha _k \ge \min \left\{ \frac{1 - \sigma }{4 \sigma }, \frac{\sigma (1 - \sigma )}{10 (1 - \sigma ) + 4}\right\} \ge \min \left\{ \frac{0.1}{\sqrt{n}}, \frac{0.03}{\sqrt{n}} \right\} = \frac{0.03}{\sqrt{n}}. \end{aligned}$$

Therefore,

$$\begin{aligned} \mu _k \le \left( 1 - \frac{0.03}{\sqrt{n}} \frac{0.4}{\sqrt{n}} \right) \mu _{k - 2} = \left( 1 - \frac{0.012}{n} \right) \mu _{k - 2} \le \left( 1 - \frac{0.012}{n} \right) ^{k/2} \mu _0. \end{aligned}$$

By applying $\log $ to both sides of this inequality, using the fact that $\log (1 + \beta ) \le \beta $, for $\beta > -1$, and condition $\mu _0 = \epsilon ^\kappa $, we obtain the desired result. $\square $
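The constants appearing in this proof can be verified numerically. The following Python sketch is an illustration with hypothetical function names: it checks condition (28) for $\theta = 0.4$ and $\sigma = 1 - 0.4/\sqrt{n}$, evaluates the minimum in (29), and counts how many iterations the recursion $\mu _k \le (1 - 0.012/n)^{k/2} \mu _0$ needs to reach a given tolerance.

```python
import math


def newton_step_lower_bound(n):
    """Value of the minimum in (29) for sigma = 1 - 0.4/sqrt(n)."""
    sigma = 1.0 - 0.4 / math.sqrt(n)
    return min((1.0 - sigma) / (4.0 * sigma),
               sigma * (1.0 - sigma) / (10.0 * (1.0 - sigma) + 4.0))


def condition_28_holds(n, theta=0.4):
    """Check the first inequality in (28) for sigma = 1 - 0.4/sqrt(n)."""
    sigma = 1.0 - 0.4 / math.sqrt(n)
    lhs = (theta ** 2 + n * (1.0 - sigma) ** 2) / (2.0 ** 1.5 * (1.0 - theta))
    return lhs <= theta * sigma


def iterations_to_reach(eps, mu0, n):
    """Smallest even k with (1 - 0.012/n)**(k/2) * mu0 <= eps."""
    if mu0 <= eps:
        return 0
    rate = 1.0 - 0.012 / n
    pairs = math.ceil(math.log(eps / mu0) / math.log(rate))
    while rate ** pairs * mu0 > eps:  # guard against rounding in the logarithms
        pairs += 1
    return 2 * pairs
```

The returned count grows roughly linearly in $n$ for fixed `eps` and `mu0`, in line with the $O(n)$ statement of Theorem 2.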

By taking a closer look at Eqs. (32) and (35), we conclude that the same results can be obtained if we allow $\bar{\theta }\le \theta $ and $\bar{\sigma }\ge \sigma $. Our choice was intended to simplify the arguments during the proof.

3.2 The $\mathcal {N}_s$ neighborhood

Colombo and Gondzio [19] used the symmetric neighborhood $\mathcal {N}_s(\gamma )$, defined by

$$\begin{aligned} \mathcal {N}_s(\gamma ) \doteq \left\{ (x, \lambda , z) \in \mathcal {F}\ |\ \gamma \mu \le x_i z_i \le \frac{1}{\gamma } \mu , i = 1,\dots , n \right\} , \end{aligned}$$

for $\gamma \in (0, 1)$, which is related to the $\mathcal {N}_{-\infty }(\gamma )$ neighborhood used in long-step primal-dual interior point algorithms. The idea of the symmetric neighborhood is to add an upper bound on the complementarity pairs, so that their products do not become too large with respect to the average. The authors showed that the worst-case iteration complexity for linear feasible primal-dual interior point methods remains O(n) and that the new neighborhood has a better practical interpretation. As HOPDM [20] implements the $\mathcal {N}_s$ neighborhood and it was used in the numerical experiments in [10] for quasi-Newton IPM, it is natural to ask about the iteration complexity of Algorithm 1 operating in the $\mathcal {N}_s$ neighborhood. The analysis presented below closely follows that of Sect. 3.1. We start from an assumption, but it is worth observing that, from [15, 19], this assumption holds if the step-size in the Newton direction is sufficiently small: $\alpha \in \left[ 0, \min \left\{ 2^{3/2} \frac{1 - \gamma }{1 + \gamma } \frac{\sigma }{n}, 2^{3/2} \gamma \frac{1 - \gamma }{1 + \gamma } \frac{\sigma }{n} \right\} \right] $.

Assumption 3

Let $\gamma \in (0, 1)$ and $(x, \lambda , z) \in \mathcal {N}_s(\gamma )$. Let the iterate after a step $\alpha $ in Newton direction also satisfy $(\bar{x}, \bar{\lambda }, \bar{z}) \in \mathcal {N}_s(\gamma )$.
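The complementarity part of the definition of $\mathcal {N}_s(\gamma )$ is easy to test for a given pair $(x, z)$. The sketch below is an illustration only: the function name is hypothetical, and membership in $\mathcal {F}$ (primal and dual feasibility) is not checked, only positivity and the two-sided bound on the products.

```python
def products_in_symmetric_neighborhood(x, z, gamma):
    """Check x, z > 0 and gamma*mu <= x_i*z_i <= mu/gamma for all i."""
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    if any(xi <= 0.0 for xi in x) or any(zi <= 0.0 for zi in z):
        return False
    products = [xi * zi for xi, zi in zip(x, z)]
    mu = sum(products) / len(products)  # duality measure x^T z / n
    return all(gamma * mu <= p <= mu / gamma for p in products)
```

A point with one very large product can satisfy the lower bound and still fail this test, which is the difference with respect to a one-sided neighborhood.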

Our main goal is to show that the next iterate obtained after a step in the quasi-Newton direction $(\bar{\bar{x}}, \bar{\bar{\lambda }}, \bar{\bar{z}})$ also belongs to $\mathcal {N}_s(\gamma )$ if suitable step-sizes $\bar{\alpha }$ (quasi-Newton) and $\alpha $ (Newton) are chosen. To demonstrate this, we will consider lower and upper bounds on the complementarity products in the $\mathcal {N}_s(\gamma )$ neighborhood using two possible values of $\zeta \in \{\gamma , \frac{1}{\gamma }\}$. Using (14) and Lemma 4, we rewrite the complementarity product

$$\begin{aligned} \begin{aligned} \big [ \bar{x} + \bar{\alpha }\Delta {\bar{x}}\big ]_i&\left[ \bar{z} + \bar{\alpha }\Delta {\bar{z}}\right] _i - \zeta \bar{\bar{\mu }} = \\&= (1 \! - \! \bar{\alpha }) (\bar{x}_i \bar{z}_i \! - \! \zeta \bar{\mu }) + (1 \! - \! \zeta ) \bar{\alpha }\bar{\sigma }\bar{\mu }+ p_i q_i \!-\! (1 + \gamma _1 \bar{\alpha }) \alpha ^2 [\Delta {x}]_i [\Delta {z}]_i. \end{aligned}\nonumber \\ \end{aligned}$$

(38)

To deliver the main result of this section we will need a bound for the quasi-Newton term $\gamma _1$ defined in (8) when the algorithm operates in the $\mathcal {N}_s(\gamma )$ neighborhood.

Lemma 10

Suppose that v in (8) is given by the right-hand side of (4) and Assumptions 1 and 3 hold. If $\alpha \in (0, 1]$ and $\sigma \in [0, 1)$, then

$$\begin{aligned} |\gamma _1| \le \frac{2 \sqrt{n}}{(1 - \sigma ) \alpha \gamma }. \end{aligned}$$

Proof

Using the conditions of the lemma, the definition of $\mu $ and the fact that $\bar{\sigma }\in [0, 1]$, we have that

$$\begin{aligned} \begin{aligned} \Vert v \Vert&= \Vert \bar{\sigma }\bar{\mu }e - \bar{X} \bar{Z} e \Vert = \sqrt{\bar{\sigma }^2 \bar{\mu }^2 n - 2 \bar{\sigma }\bar{\mu }^2 n + \textstyle \sum _{i = 1}^n (\bar{x}_i \bar{z}_i)^2} \\&\le \sqrt{(1 / \gamma ^2) \bar{\mu }^2 n - (2 - \bar{\sigma }) \bar{\sigma }\bar{\mu }^2 n} \le \sqrt{(1/\gamma ^2) - \bar{\sigma }} \sqrt{n} \bar{\mu }\le \frac{\sqrt{n}}{\gamma } \bar{\mu }\le \frac{\sqrt{n}}{\gamma } \mu , \end{aligned}\nonumber \\ \end{aligned}$$

(39)

where the last inequality comes from Lemma 2. Since $\alpha \in (0, 1]$ and $\sigma \in [0, 1)$, by defining $\rho = 1 - \sigma > 0$ we can use Lemmas 1 and 2 to ensure that $\Vert \bar{y} \Vert > 0$ and, by (15) and the previous inequality, we conclude the proof. $\square $
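The chain of inequalities (39) can be probed on random data. In the sketch below, which uses hypothetical names, arbitrary positive products are generated and the largest $\gamma $ for which they satisfy the bounds of $\mathcal {N}_s(\gamma )$ is computed; the estimate $\Vert v \Vert \le \sqrt{n}\, \bar{\mu }/ \gamma $ is then evaluated for that $\gamma $. Feasibility is not modelled.

```python
import math
import random


def tightest_gamma(products):
    """Largest gamma with gamma*mu <= p_i <= mu/gamma for all products p_i."""
    mu = sum(products) / len(products)
    return min(min(products) / mu, mu / max(products))


def lemma10_norm_and_bound(products, sigma_bar):
    """Return (||sigma_bar*mu_bar*e - XZe||, sqrt(n)*mu_bar/gamma)."""
    n = len(products)
    mu_bar = sum(products) / n
    gamma = tightest_gamma(products)
    norm_v = math.sqrt(sum((sigma_bar * mu_bar - p) ** 2 for p in products))
    return norm_v, math.sqrt(n) * mu_bar / gamma
```

Since the bound is checked for the largest admissible $\gamma $, it also holds for every smaller value of $\gamma $.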

The next lemma is a technical result needed to deliver a bound for $p_i q_i$ under Assumption 3. Then, the bound is given in Lemma 12, but most of the calculations have already been made in the proof of Lemma 9. Let us mention that Lemmas 9 and 12 can be viewed as the quasi-Newton versions of Lemmas 5 and 11, respectively, taken from [15].

Lemma 11

Under Assumptions 1 and 3 , $ \left\| \Delta {X}\Delta {Z}e \right\| \le \left( \frac{1 + \gamma }{\gamma }\right) 2^{-3/2} \mu n. $

Proof

See [15, Lemma 5.10] and observe that this bound also holds for the $\mathcal {N}_s$ neighborhood [19]. $\square $

Lemma 12

Under Assumptions 1 and 3,

$$\begin{aligned} \begin{aligned} \Vert P Q e \Vert \le \frac{n \mu }{2^{3/2} \gamma } ( K_1 n\ + K_2 ), \end{aligned} \end{aligned}$$

where $K_1 \! = \! \left[ (\bar{\alpha }\! + \! \alpha (1 \! - \! \bar{\alpha })) \! + \! \frac{|1 \! + \! \gamma _1| \bar{\alpha }\alpha ^2}{2^{3/2}} \right] ^2 \!\! \left( \frac{1 \! + \! \gamma }{\gamma } \right) ^2$ and $K_2 \! = \! [1 \! - \! (1 \! - \! \bar{\alpha }(1 \! - \! \bar{\sigma })) (1 \! - \! \alpha (1 \! - \! \sigma ))]^2$.

Proof

We observe that the arguments used at the beginning of the proof of Lemma 9 remain valid for the $\mathcal {N}_s$ neighborhood. We can still define vectors $u = {D}^{-1} P e = {D}^{-1} p$ and $v = D Q e = D q$, where $D = (X)^{1/2} (Z)^{-1/2}$, and conclude that $u^T v = 0$. Then, the bound for $\Vert P Q e\Vert $ can be obtained in a similar fashion, with the only difference that the bound $x_i z_i \ge (1 - \theta ) \mu $ needs to be replaced with $x_i z_i \ge \gamma \mu $.

Inequalities (25) and (26) are then replaced with the following

$$\begin{aligned} \begin{aligned} \Vert P Q e \Vert&\!\le \! \frac{\Vert (\bar{\alpha }\!+\! \alpha (1 \!-\! \bar{\alpha })) \left( \mu e \! - \! X Z e \right) \!-\! (1 \!+\! \gamma _1) \bar{\alpha }\alpha ^2 \Delta {X}\Delta {Z}e \Vert ^2 \!+\! \Vert (\bar{\bar{\mu }} \! - \! \mu ) e \Vert ^2}{2^{3/2} \gamma \mu }. \end{aligned}\nonumber \\ \end{aligned}$$

(40)

We already have the expression for $\Vert (\bar{\bar{\mu }} - \mu ) e \Vert ^2$ (see (17)), but we need a bound for the first norm in (40). By similar arguments to those used to deliver (39), we observe that $ \Vert \mu e - X Z e \Vert \le \frac{1 + \gamma }{\gamma } \sqrt{n} \mu . $ Therefore, by this inequality, the triangle inequality, and Lemma 11, we obtain

$$\begin{aligned} \begin{aligned} \big \Vert (\bar{\alpha }&+ \alpha (1 - \bar{\alpha })) \left( \mu e - X Z e \right) - (1 + \gamma _1) \bar{\alpha }\alpha ^2 \Delta {X}\Delta {Z}e \big \Vert ^2 \le \\&\le \left[ (\bar{\alpha }+ \alpha (1 - \bar{\alpha })) \left( \frac{1 + \gamma }{\gamma } \right) \sqrt{n} \mu + \frac{|1 + \gamma _1| \bar{\alpha }\alpha ^2}{2^{3/2}} \left( \frac{1 + \gamma }{\gamma } \right) \mu n \right] ^2 \\&\le \left[ (\bar{\alpha }+ \alpha (1 - \bar{\alpha })) + \frac{|1 + \gamma _1| \bar{\alpha }\alpha ^2}{2^{3/2}} \right] ^2 \left( \frac{1 + \gamma }{\gamma } \right) ^2 n^2 \mu ^2, \end{aligned}\nonumber \\ \end{aligned}$$

(41)

since $n \ge 1$. Using (17) and (41) in (40), we obtain the desired result. $\square $

Using (38) and the bounds obtained so far, we now show that, for sufficiently small step-sizes $\alpha $ and $\bar{\alpha }$, in the Newton and quasi-Newton iterations, respectively, the point $(\bar{\bar{x}}, \bar{\bar{\lambda }}, \bar{\bar{z}})$ also belongs to $\mathcal {N}_s(\gamma )$.

Theorem 3

Let Assumptions 1, 3 hold. Let $0< \sigma _{min} \le \sigma _{max} < 1$ be user-defined constants such that $\sigma , \bar{\sigma }\!\in \! [\sigma _{min}, \sigma _{max}]$ and define $ l \!=\! \frac{\sigma _{min}}{2} \!\! \left[ \frac{3}{2^{3/2} \gamma } \!\! \left( 2 \! + \! \frac{1}{\gamma (1 \! - \! \sigma _{max})} \right) ^2 \!\! \left( \frac{1 \! + \! \gamma }{\gamma } \right) ^2 \right] ^{-1}\!\! $.

If

$$\begin{aligned} \alpha \in \left( 0, \frac{(1 - \gamma ) l}{2 n^3} \right] \quad \text {and}\quad \bar{\alpha }\in \left[ 2 \alpha , \frac{(1 - \gamma ) l}{n^3} \right] \end{aligned}$$

(42)

then $\gamma \bar{\bar{\mu }} \le \bar{\bar{x}}_i \bar{\bar{z}}_i \le (1 / \gamma ) \bar{\bar{\mu }}, \, {i = 1, \dots , n}$. If, in addition, $\gamma \ge \sigma _{min} / 2$, then $(\bar{\bar{x}}, \bar{\bar{\lambda }}, \bar{\bar{z}}) \in \mathcal {N}_s(\gamma )$.

Proof

By construction we guarantee that $\bar{\alpha }\ge 2 \alpha $ as needed in (42). We start by setting $\zeta = \gamma $ in (38) and showing that $\left[ \bar{x} + \bar{\alpha }\Delta {\bar{x}}\right] _i \left[ \bar{z} + \bar{\alpha }\Delta {\bar{z}}\right] _i - \gamma \bar{\bar{\mu }} \ge 0$ for sufficiently small step-sizes. By defining $K_3 = \bar{\sigma }(1 - \alpha (1 - \sigma ))$ and $K_4 = |1 + \gamma _1 \bar{\alpha }| \alpha ^2 \left( \frac{1 + \gamma }{\gamma } \right) $, and using Eq. (38), Assumption 3 and Lemmas 11 and 12, we obtain

$$\begin{aligned} \begin{aligned} \big [ \bar{x} + \bar{\alpha }\Delta {\bar{x}}\big ]_i&\left[ \bar{z} + \bar{\alpha }\Delta {\bar{z}}\right] _i - \gamma \bar{\bar{\mu }} \ge \\&\ge (1 - \gamma ) \bar{\alpha }\bar{\sigma }(1 - \alpha (1 - \sigma )) \mu - |p_i q_i| - (1 + \gamma _1 \bar{\alpha }) \alpha ^2 [\Delta {x}]_i [\Delta {z}]_i \\&\ge \mu \left( (1 - \gamma ) \bar{\alpha }K_3 - K_4 \frac{n}{2^{3/2}} - K_1 \frac{n^2}{2^{3/2} \gamma } - K_2 \frac{n}{2^{3/2} \gamma } \right) . \end{aligned}\nonumber \\ \end{aligned}$$

(43)

Next, we show that the right-hand side of (43) is non-negative for sufficiently small values of $\alpha $ and $\bar{\alpha }$. To get the result, every term inside the parentheses in (43) will be bounded. Since $\alpha \le \frac{1}{2} \bar{\alpha }\le \frac{1}{2} $, we have $1 - \alpha (1 \! - \! \sigma ) \ge 1 - \frac{1}{2} (1 \! - \! \sigma ) \ge \frac{1}{2}$; hence, for the first term in (43) we obtain

$$\begin{aligned} (1 - \gamma ) \bar{\alpha }K_3 = \, (1 - \gamma ) \bar{\alpha }\bar{\sigma }(1 - \alpha (1 - \sigma )) \, \ge \, \tfrac{1}{2} [(1 - \gamma ) \sigma _{min}] \bar{\alpha }. \end{aligned}$$

(44)

Using $\bar{\alpha }\ge 2 \alpha $ (guaranteed by (42)), $n \ge 1$ and Lemma 10, the second term becomes

$$\begin{aligned} \begin{aligned} K_4 \frac{n}{2^{3/2}}&\le \left( \!\alpha ^2 \!+\! \frac{2 \sqrt{n}}{\gamma (1 - \sigma )} \bar{\alpha }\alpha \! \right) \!\! \left( \frac{1 + \gamma }{\gamma } \right) \! \frac{n}{2^{3/2}} \le \! \left( \frac{1}{4} + \frac{\sqrt{n}}{\gamma (1 - \sigma )} \right) \!\! \left( \! \frac{1 + \gamma }{\gamma }\! \right) \! \frac{n}{2^{3/2}} \bar{\alpha }^2 \\&\le \left( 2 + \frac{1}{\gamma (1 - \sigma _{max})} \right) \left( \frac{1 + \gamma }{\gamma } \right) \frac{n \sqrt{n} }{2^{3/2} \gamma } \bar{\alpha }^2. \end{aligned}\nonumber \\ \end{aligned}$$

(45)

By applying $\bar{\alpha }\ge 2 \alpha $ again, we observe that $ \bar{\alpha }+ \alpha (1 - \bar{\alpha }) = \bar{\alpha }+ \alpha - \bar{\alpha }\alpha \le \bar{\alpha }+ \frac{1}{2} \bar{\alpha }- \bar{\alpha }\alpha \le \frac{3}{2} \bar{\alpha }$ and, using the same arguments as before, for the third term in (43) we obtain

$$\begin{aligned} \begin{aligned} K_1 \frac{n^2}{2^{3/2} \gamma }&\le \left[ \frac{3}{2} \bar{\alpha }+ \left( 1 + \frac{2 \sqrt{n}}{\gamma (1 - \sigma ) \alpha } \right) \frac{\bar{\alpha }\alpha ^2}{2^{3/2}} \right] ^2 \left( \frac{1 + \gamma }{\gamma } \right) ^2 \frac{n^2}{2^{3/2} \gamma } \\&\le \left( \frac{3}{2} + \frac{1}{2} + \frac{\sqrt{n}}{\gamma (1 - \sigma )} \right) ^2 \left( \frac{1 + \gamma }{\gamma } \right) ^2 \frac{n^2}{2^{3/2} \gamma } \bar{\alpha }^2 \\&\le \left( 2 + \frac{1}{\gamma (1 - \sigma _{max})} \right) ^2 \left( \frac{1 + \gamma }{\gamma } \right) ^2 \frac{n^3}{2^{3/2} \gamma } \bar{\alpha }^2. \end{aligned}\nonumber \\ \end{aligned}$$

(46)

In a similar fashion, the last term of (43) can be bounded as

$$\begin{aligned} \begin{aligned} K_2 \frac{n}{2^{3/2} \gamma }&= [\alpha (1 - \sigma ) + \bar{\alpha }(1 - \bar{\sigma }) (1 - \alpha (1 - \sigma )) ]^2 \frac{n}{2^{3/2} \gamma } \\&\le \left[ \frac{(1 - \sigma )}{2} + (1 - \bar{\sigma }) (1 - \alpha (1 - \sigma )) \right] ^2 \frac{n}{2^{3/2} \gamma } \bar{\alpha }^2 \ \le \ 4 \frac{n}{2^{3/2} \gamma } \bar{\alpha }^2. \end{aligned}\nonumber \\ \end{aligned}$$

(47)

Since $n \ge 1$ and assuming without loss of generality that $\left( 2 + \frac{1}{\gamma (1 - \sigma _{max})} \right) \left( \frac{1 + \gamma }{\gamma } \right) \ge 4$, using (44), (45), (46) and (47) in (43) we have that

$$\begin{aligned} \begin{aligned} \Big [ \bar{x}&+ \bar{\alpha }\Delta {\bar{x}}\Big ]_i \Big [ \bar{z} + \bar{\alpha }\Delta {\bar{z}}\Big ]_i - \gamma \bar{\bar{\mu }} \ge \\&\ge \Bigg \{ \frac{1}{2} [(1 - \gamma ) \sigma _{min}] \bar{\alpha }- 3 \left( 2 + \frac{1}{\gamma (1 - \sigma _{max})} \right) ^2 \left( \frac{1 + \gamma }{\gamma } \right) ^2 \frac{n^3}{2^{3/2} \gamma } \bar{\alpha }^2 \Bigg \} \mu . \end{aligned} \end{aligned}$$

Therefore, in order to guarantee that $\left[ \bar{x} + \bar{\alpha }\Delta {\bar{x}}\right] _i \left[ \bar{z} + \bar{\alpha }\Delta {\bar{z}}\right] _i \ge \gamma \bar{\bar{\mu }}$ for $i = 1, \dots , n$, it is sufficient that the quasi-Newton step-size $\bar{\alpha }$ satisfies

$$\begin{aligned} \bar{\alpha }\le \frac{\frac{1}{2} (1 - \gamma ) \sigma _{min}}{\frac{3}{2^{3/2} \gamma } \left( 2 + \frac{1}{\gamma (1 - \sigma _{max})} \right) ^2 \left( \frac{1 + \gamma }{\gamma } \right) ^2 n^3} = \frac{(1 - \gamma ) l}{n^3}. \end{aligned}$$

(48)

We now set $\zeta = 1/\gamma $ in (38). In order to show that the resulting expression is non-positive, we use the same arguments as before, that is, Lemmas 11 and 12 and Eqs. (44)–(47) to obtain

$$\begin{aligned} \begin{aligned} \big [&\bar{x} + \bar{\alpha }\Delta {\bar{x}}\big ]_i \big [ \bar{z} + \bar{\alpha }\Delta {\bar{z}}\big ]_i - (1/\gamma ) \bar{\bar{\mu }} \le \\&\le (1 - (1/\gamma )) \bar{\alpha }\bar{\sigma }\bar{\mu }+ | p_i q_i | + |1 + \gamma _1 \bar{\alpha }| \alpha ^2 | [\Delta {x}]_i [\Delta {z}]_i | \\&\le \mu \left( K_4 \frac{n}{2^{3/2}} + K_1 \frac{n^2}{2^{3/2} \gamma } + K_2 \frac{n}{2^{3/2} \gamma } - \left( \frac{1 - \gamma }{\gamma } \right) \bar{\alpha }K_3 \right) \\&\le \mu \left\{ 3 \left( 2 + \frac{1}{\gamma (1 - \sigma _{max})} \right) ^2 \left( \frac{1 + \gamma }{\gamma } \right) ^2 \frac{n^3}{2^{3/2} \gamma } \bar{\alpha }^2 - \frac{1}{2} \sigma _{min} \left( \frac{1 - \gamma }{\gamma } \right) \bar{\alpha }\right\} . \end{aligned}\nonumber \\ \end{aligned}$$

(49)

Therefore, if

$$\begin{aligned} \bar{\alpha }\le \frac{\left( \frac{1 - \gamma }{\gamma } \right) l}{n^3}, \end{aligned}$$

(50)

then by (49) we have that $\left[ \bar{x} + \bar{\alpha }\Delta {\bar{x}}\right] _i \left[ \bar{z} + \bar{\alpha }\Delta {\bar{z}}\right] _i \le (1/\gamma ) \bar{\bar{\mu }}$ for all $i = 1, \dots , n$. Observe that bound (48) is tighter than (50) because $\gamma \in (0,1)$; hence, (48) appears as the upper bound on $\bar{\alpha }$ in (42). The remaining bounds in (42) are consistent with the need to satisfy $\bar{\alpha }\ge 2 \alpha $.

It remains to show that $\bar{\bar{x}}$ and $\bar{\bar{z}}$ are strictly positive. Similarly to Theorem 1, if, by contradiction, $\bar{\bar{x}}_i \le 0$ or $\bar{\bar{z}}_i \le 0$ for some i, then we must have $\bar{\bar{x}}_i < 0$ and $\bar{\bar{z}}_i < 0$. We already know that $x_i z_i \ge \gamma \mu $, by Assumption 3. Hence, by Lemma 12, (46), (47) and arguments similar to those used earlier in this proof,

$$\begin{aligned} \begin{aligned} x_i z_i < p_i q_i&\le \Vert P Q e \Vert \le \frac{3}{2^{3/2} \gamma } \left( 2 + \frac{1}{\gamma (1 - \sigma _{max})} \right) ^2 \left( \frac{1 + \gamma }{\gamma } \right) ^2 \bar{\alpha }^2 \mu n^3 \\ {}&= \frac{\sigma _{min} \bar{\alpha }^2 \mu n^3}{2 l}. \end{aligned} \end{aligned}$$

Hence we conclude that $\gamma < \sigma _{min} n^3 \bar{\alpha }^2 / (2 l)$ which, together with condition $\gamma \ge \sigma _{min} / 2$ and (42), implies the following contradiction

$$\begin{aligned} \frac{\sigma _{min}}{2} \le \gamma < \frac{\sigma _{min} n^3 \bar{\alpha }^2}{2 l} \le \frac{\sigma _{min} n^3 \bar{\alpha }}{2 l} \le \frac{\sigma _{min} (1 - \gamma )}{2} \le \frac{\sigma _{min}}{2}. \end{aligned}$$

By Assumption 1, we have that $(\bar{\bar{x}}, \bar{\bar{\lambda }}, \bar{\bar{z}})$ belongs to $\mathcal {F}$, hence we conclude that it belongs to $\mathcal {N}_s(\gamma )$. $\square $

The upper bounds for the step-sizes delivered by Theorem 3 are used to determine the $O(n^3)$ worst-case iteration complexity of Algorithm 1 operating in $\mathcal {N}_s(\gamma )$.

Theorem 4

Suppose that $(x^{0}, {\lambda }^{0}, z^{0}) \in \mathcal {N}_s(\gamma )$ and $\gamma \ge \sigma _{min} / 2$, where $0< \sigma _{min} \le \sigma _{max} < 1$ are user-defined constants such that $\sigma _k \in [\sigma _{min}, \sigma _{max}]$ for all iterations k of Algorithm 1. If $\mu _0 = \epsilon ^\kappa $, for a given $\epsilon > 0$ and a constant $\kappa $, then there exists $K > 0$, $K = O(n^3)$, such that $\mu _k \le \epsilon $ for all $k \ge K$.

Proof

We follow the same arguments as those presented in Theorem 2 and consider only Newton iterations, which are simpler to analyze. If $k \ge 2$ is a Newton iteration, then so is $k - 2$. Then, by (42), we have that $ \alpha _{k - 2} \ge \frac{(1 - \gamma ) l}{2 n^3} $ and, by Lemma 2 and a recursive argument, we obtain $ \mu _k \le \left( 1 - \frac{(1 - \gamma ) l}{2 n^3} \right) ^{k / 2} \mu _0, $ from which convergence with worst-case iteration complexity $O(n^3)$ follows easily. $\square $
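The passage from the contraction bound to an iteration count can be made explicit with a small numerical sketch. It is illustrative only: the function names and the values used for $\gamma $, $l$, $\mu _0$ and $\epsilon $ are assumptions, not values taken from the analysis. With $c = (1 - \gamma ) l / (2 n^3)$, the inequality $\log (1 - c) \le -c$ shows that any $k \ge (2/c) \log (\mu _0 / \epsilon )$ gives $(1 - c)^{k/2} \mu _0 \le \epsilon $, and this count grows like $n^3$.

```python
import math


def contraction_rate(n, gamma, l_const):
    """Decrease factor c = (1 - gamma) * l / (2 n^3) used in the proof."""
    return (1.0 - gamma) * l_const / (2.0 * n ** 3)


def sufficient_iterations(mu0, eps, c):
    """Smallest integer k with k >= (2 / c) * log(mu0 / eps).

    Since log(1 - c) <= -c, any such k gives (1 - c)^(k/2) * mu0 <= eps.
    """
    if mu0 <= eps:
        return 0
    return math.ceil(2.0 * math.log(mu0 / eps) / c)


# Hypothetical values, chosen only for illustration.
gamma, l_const, mu0, eps = 0.5, 0.1, 1.0, 1e-8
for n in (10, 20, 40):
    c = contraction_rate(n, gamma, l_const)
    k = sufficient_iterations(mu0, eps, c)
    print(n, k, (1.0 - c) ** (k / 2.0) * mu0 <= eps)
```

Doubling $n$ multiplies the sufficient count by roughly $8$, which is the $O(n^3)$ behavior stated in Theorem 4.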

Because of the close relation between the symmetric neighborhood and the $\mathcal {N}_{-\infty }(\gamma )$ neighborhood, one should expect complexity results similar to those of Theorem 3. However, if the $\mathcal {N}_s(\gamma )$ neighborhood is not applied in Lemma 10, then a naive approach is to bound $\sum _{i = 1}^n \left( \bar{x}_i \bar{z}_i \right) ^2$ by $\bar{\mu }^2 n^2$. Such an approach would increase the worst-case iteration complexity to $O(n^4)$, as the degree would be increased in Eq. (46). A different approach to reduce the worst-case polynomial degree when working with the $\mathcal {N}_{-\infty }(\gamma )$ neighborhood is left for future work.

4 Worst-case complexity in the infeasible case

In this section, we analyze worst-case iteration complexity for linear programming problems without assuming feasibility of the starting point (Assumption 1). The analysis uses some ideas developed in Sect. 3 and follows [15, Chapter 6], by adding extra requirements on the computation of step-size $\alpha $. The proofs are modified to consider the symmetric neighborhood and admit the quasi-Newton steps.

We start by defining $r_b^k$ and $r_c^k$ to be the primal and dual infeasibility vectors at point $(x^{k}, {\lambda }^{k}, z^{k})$ which appear in the right-hand side of Eq. (4): $ r_b^k = A x^k - b \ \text {and} \ r_c^k = A^T \lambda ^k + z^k - c $. Then, given an initial guess $(x^{0}, {\lambda }^{0}, z^{0})$ and parameters $\beta \ge 1$ and $\gamma \in (0, 1)$, the infeasible version of the symmetric neighborhood is defined as follows

$$\begin{aligned} \mathcal {N}_s(\gamma , \beta ) \!\doteq \! \left\{ (x, \lambda , z) \ \Big |\ \Vert (r_b, r_c)\Vert \!\le \! \frac{\Vert (r_b^0, r_c^0)\Vert }{\mu _0} \beta \mu , (x, z) \!>\! 0, \gamma \mu \le x_i z_i \le \frac{1}{\gamma } \mu \ \forall i \right\} . \end{aligned}$$
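As an illustration of this definition, the following sketch tests membership in $\mathcal {N}_s(\gamma , \beta )$ for given vectors. The function name and argument layout are hypothetical, and the norm of the stacked residual $(r_b, r_c)$ is assumed to be the Euclidean norm. The dual variable $\lambda $ enters only through $r_c$, so it is not passed explicitly.

```python
import math


def in_symmetric_neighborhood(x, z, r_b, r_c, gamma, beta, mu0, res0_norm):
    """Return True if the point lies in N_s(gamma, beta).

    res0_norm is ||(r_b^0, r_c^0)|| and mu0 is the initial complementarity gap.
    """
    n = len(x)
    # Strict positivity of x and z.
    if any(xi <= 0.0 for xi in x) or any(zi <= 0.0 for zi in z):
        return False
    mu = sum(xi * zi for xi, zi in zip(x, z)) / n
    # Infeasibility must shrink at least proportionally to mu.
    res_norm = math.sqrt(sum(r * r for r in list(r_b) + list(r_c)))
    if res_norm > (res0_norm / mu0) * beta * mu:
        return False
    # Two-sided bound on every complementarity product.
    return all(gamma * mu <= xi * zi <= mu / gamma for xi, zi in zip(x, z))
```

Dropping the upper bound `xi * zi <= mu / gamma` from the last line would give the corresponding test for the one-sided neighborhood $\mathcal {N}_{-\infty }(\gamma , \beta )$.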

In order to obtain complexity results, $\alpha _k$ needs to be computed in such a way that the new iterate remains in $\mathcal {N}_s(\gamma , \beta )$ and a sufficient decrease condition for $\mu _{k + 1}$ is satisfied. More precisely, given $\alpha _\textrm{dec}\in (0, 1)$, $\alpha _k$ is the largest value in [0, 1] (or a fixed fraction of it) such that

$$\begin{aligned} (x^{k}, {\lambda }^{k}, z^{k}) + \alpha _k (\Delta x^k, \Delta \lambda ^k, \Delta z^k) \in \mathcal {N}_s(\gamma , \beta )\ \text {and}\ \mu _{k + 1} \le (1 - \alpha _\textrm{dec}\alpha _k) \mu _k. \end{aligned}$$

(51)

The goal of this section is to show that, if one quasi-Newton step is performed by Algorithm 1 from a point in $\mathcal {N}_s(\gamma , \beta )$, then conditions (51) are satisfied by the new point. When only Newton steps are taken, it is shown in [15, Lemma 6.7] that there is an interval $[0, \hat{\alpha }]$ such that (51) holds for the $\mathcal {N}_{-\infty }(\gamma , \beta )$ neighborhood. This is the case if a special starting point is used and $\hat{\alpha }\ge \delta _1 / n^2$, where $\delta _1$ is a constant independent of n. In Lemma 14, we extend those results in order to ensure that the iterates belong to $\mathcal {N}_s(\gamma , \beta )$ when only Newton steps are taken.

First, let us recall some notation from [15] that will be used frequently. We define the scalar $\displaystyle \nu _k = \prod _{i = 0}^{k - 1} (1 - \alpha _i)$, which allows us to write vectors $r_b^k$ and $r_c^k$ as $r_b^k = \nu _k r_b^0$ and $r_c^k = \nu _k r_c^0$, respectively. The parameter $\sigma _k$ used in (7) satisfies $0< \sigma _{min} \le \sigma _k \le \sigma _{max} < 1$ for all k. We also assume that a special initial point is used in Algorithm 1, given by

$$\begin{aligned} (x^{0}, {\lambda }^{0}, z^{0}) = \begin{bmatrix} \xi e&0&\xi e \end{bmatrix}^T, \end{aligned}$$

(52)

where $\xi $ is such that $\Vert (x^*, z^*) \Vert _\infty \le \xi $, for some primal-dual solution $(x^{*}, {\lambda }^{*}, z^{*})$.
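The two ingredients just introduced can be checked with a few lines of code. The sketch below is only illustrative, with hypothetical helper names: it builds the starting point (52), confirms that its complementarity gap is $\mu _0 = \xi ^2$, and evaluates the product $\nu _k$ for a given list of step-sizes.

```python
def initial_point(n, m, xi):
    """Starting point (52): x0 = xi * e, lambda0 = 0, z0 = xi * e."""
    return [xi] * n, [0.0] * m, [xi] * n


def nu(alphas):
    """nu_k = prod_{i < k} (1 - alpha_i); the empty product is 1."""
    prod = 1.0
    for a in alphas:
        prod *= 1.0 - a
    return prod


# Hypothetical sizes and value of xi, for illustration only.
x0, lam0, z0 = initial_point(4, 2, 3.0)
mu0 = sum(a * b for a, b in zip(x0, z0)) / len(x0)
print(mu0, nu([0.5, 0.5, 0.2]))
```

Since $r_b^k = \nu _k r_b^0$ and $r_c^k = \nu _k r_c^0$, the value returned by `nu` is the factor by which both infeasibility vectors have been reduced after the given steps.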

From now on we will again drop the index k of the iterates to simplify notation, including for the quantities $\nu $, $r_b$ and $r_c$. For a Newton iteration, let $D = X^{1/2} Z^{-1/2}$, and let $\omega = 9 \beta / \gamma ^{1/2}$, a constant independent of n. Lemma 13 gives bounds for vectors that appear frequently in the proofs.

Lemma 13

When (52) is used as the starting point, if $(x, \lambda , z) \in \mathcal {N}_s(\gamma , \beta )$ the following bounds hold:

$$\begin{aligned} \nu \Vert (x, z) \Vert _1&\le (4 \beta / \xi ) n \mu \end{aligned}$$

(53)

$$\begin{aligned} \Vert D^{-1} \Delta {x}\Vert&\le \omega n \mu ^{1/2} \quad \text {and} \quad \Vert D \Delta {z}\Vert \le \omega n \mu ^{1/2}. \end{aligned}$$

(54)

Proof

The proofs for these bounds follow from Lemmas 6.4 and 6.6 of [15] and the fact that $\mathcal {N}_s(\gamma , \beta ) \subseteq \mathcal {N}_{-\infty }(\gamma , \beta )$. $\square $

We start the analysis by looking at a Newton step in the $\mathcal {N}_s(\gamma , \beta )$ neighborhood.

Lemma 14

If $(x, \lambda , z) \in \mathcal {N}_s(\gamma , \beta )$ then there exists a constant $\delta _2$ independent of n and a value $\hat{\alpha }\ge \delta _2 / n^2$, such that for all $\alpha \in [0, \hat{\alpha }]$

$$\begin{aligned} (1 - \alpha ) {x}^T z&\le (x + \alpha \Delta {x})^T(z + \alpha \Delta {z}) \le (1 - \alpha _\textrm{dec}\alpha ) {x}^T z \end{aligned}$$

(55)

$$\begin{aligned} \gamma \bar{\mu }&\le [x + \alpha \Delta {x}]_i [z + \alpha \Delta {z}]_i \le (1/\gamma ) \bar{\mu }. \end{aligned}$$

(56)

Proof

We only need to show the right inequality of (56), since all the other results were detailed in [15, Lemma 6.7]. Since we consider a Newton iteration, by (54), we obtain

$$\begin{aligned} |[\Delta {x}]_i [\Delta {z}]_i |\le |[D^{-1}]_{ii} [\Delta {x}]_i ||[D]_{ii} [\Delta {z}]_i |\le \Vert D^{-1} \Delta {x}\Vert \Vert D \Delta {z}\Vert \le \omega ^2 n^2 \mu . \end{aligned}$$

As $(\Delta {x}, \Delta {\lambda }, \Delta {z})$ solves (4) and $(x, \lambda , z) \in \mathcal {N}_s(\gamma , \beta )$, using the previous inequality and a component-wise version of (9) we get

$$\begin{aligned}{}[x + \alpha \Delta {x}]_i [z + \alpha \Delta {z}]_i&\le \frac{1 - \alpha }{\gamma } \mu + \alpha \sigma \mu + \alpha ^2 \omega ^2 n^2 \mu . \end{aligned}$$

(57)

Using similar arguments and Eq. (9) again, we also obtain

$$\begin{aligned} \begin{aligned} \frac{1}{\gamma }&\bar{\mu }= \frac{1}{\gamma } \frac{(x + \alpha \Delta {x})^T(z + \alpha \Delta {z})}{n} \\&\ge \frac{1 - \alpha }{\gamma } \mu + \frac{\alpha \sigma }{\gamma } \mu - \frac{\alpha ^2 |{\Delta {x}}^T \Delta {z}|}{\gamma n} \ge \frac{1 - \alpha }{\gamma } \mu + \frac{\alpha \sigma }{\gamma } \mu - \frac{\alpha ^2 \omega ^2 n}{\gamma } \mu . \end{aligned}\nonumber \\ \end{aligned}$$

(58)

Using (57), (58) and the fact that $n \ge 1$, we obtain the inequality

$$\begin{aligned}{}[x + \alpha \Delta {x}]_i [z + \alpha \Delta {z}]_i - \frac{1}{\gamma } \bar{\mu }\le \alpha \sigma \left( 1 - \frac{1}{\gamma } \right) \mu + \alpha ^2 \omega ^2 n^2 \left( 1 + \frac{1}{\gamma } \right) \mu . \end{aligned}$$

Its right-hand side is non-positive if $\alpha \le \frac{\sigma _{min}(1 - \gamma )}{\omega ^2 n^2 (1 + \gamma )}$. By defining $\delta _2$ as the minimum of $\frac{\sigma _{min}(1 - \gamma )}{\omega ^2 (1 + \gamma )}$ and $\delta _1$ defined in [15, Lemma 6.7], we obtain the desired result. $\square $
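The sign change at this threshold can be verified numerically. The sketch below uses hypothetical function names and parameter values; it evaluates the right-hand side of the last inequality, divided by $\mu $, and the step-size threshold $\sigma _{min}(1 - \gamma ) / (\omega ^2 n^2 (1 + \gamma ))$ with $\omega = 9 \beta / \gamma ^{1/2}$.

```python
import math


def omega(beta, gamma):
    """Constant omega = 9 * beta / sqrt(gamma)."""
    return 9.0 * beta / math.sqrt(gamma)


def upper_gap_bound(alpha, sigma, gamma, beta, n):
    """alpha*sigma*(1 - 1/gamma) + alpha^2 * omega^2 * n^2 * (1 + 1/gamma)."""
    w2n2 = (omega(beta, gamma) * n) ** 2
    return alpha * sigma * (1.0 - 1.0 / gamma) + alpha ** 2 * w2n2 * (1.0 + 1.0 / gamma)


def alpha_threshold(sigma_min, gamma, beta, n):
    """Largest alpha for which the bound is non-positive when sigma >= sigma_min."""
    return sigma_min * (1.0 - gamma) / ((omega(beta, gamma) * n) ** 2 * (1.0 + gamma))


# Hypothetical parameters, for illustration only.
a = alpha_threshold(0.1, 0.5, 1.0, 10)
print(a, upper_gap_bound(0.5 * a, 0.1, 0.5, 1.0, 10), upper_gap_bound(2.0 * a, 0.1, 0.5, 1.0, 10))
```

Below the threshold the bound is negative, so the right inequality of (56) holds; above it the quadratic term in $\alpha $ dominates and the bound no longer certifies the inequality.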

By Lemma 14, if $\alpha \le \delta _2 / n^2$ in a Newton iteration of Algorithm 1, then the point $(\bar{x}, \bar{\lambda }, \bar{z})$ satisfies (51). If only Newton steps are made, then $O(n^2)$ worst-case iteration complexity can be proved for $\mathcal {N}_s(\gamma , \beta )$ (see [15, Chapter 6]).

Based on the previous paragraphs, we state some assumptions that will be used in the remainder of this section. First, we recall that the indices were removed from all terms related to Newton and quasi-Newton iterates, including $\nu $, $r_b$ and $r_c$. Bars are added to quasi-Newton iterates and double bars to the iteration after a quasi-Newton one (which is again a Newton iteration).

Assumption 4

The initial point $(x^{0}, {\lambda }^{0}, z^{0})$ satisfies (52), both $(x, \lambda , z)$ and $\alpha $ satisfy (51), and $(\bar{x}, \bar{\lambda }, \bar{z}) \in \mathcal {N}_s(\gamma , \beta )$.

In our analysis, we will use various well-known results for Newton steps, such as Lemmas 13 and 14. We observe that, while in the standard Newton approach one has to find bounds for ${\Delta {x}}^T \Delta {z}$, in the quasi-Newton approach (see for example (13)) it is also necessary to obtain bounds for $\gamma _1$ and $p^T q = (\alpha \Delta {x}\! + \! \bar{\alpha }\Delta {\bar{x}})^T (\alpha \Delta {z}\! + \! \bar{\alpha }\Delta {\bar{z}})$. In the next lemma, we give a bound for $\gamma _1$ without assuming feasibility of iterates.

Lemma 15

Suppose that v in (8) is given by the right-hand side of (4) and Assumption 4 holds. If $\alpha \in (0, 1]$, then $| \gamma _1 | \le C_3 \sqrt{n} / \alpha $, where

$$\begin{aligned} C_3 = 2 (\gamma \alpha _\textrm{dec})^{-1} \left[ \left( \dfrac{\Vert (r_c^0, r_b^0)\Vert \beta \gamma }{\xi ^2} \right) ^2 \! + \! 1 \right] ^{1/2}. \end{aligned}$$

Proof

Using scalar $\bar{\nu }$ defined in the beginning of this section, vector v at a quasi-Newton iteration is given by

$$\begin{aligned} v = \begin{bmatrix} 0 \\ 0 \\ \bar{\sigma }\bar{\mu }e \end{bmatrix} - F(\bar{x}, \bar{\lambda }, \bar{z}) = \begin{bmatrix} - \bar{\nu }r_c^0 \\ - \bar{\nu }r_b^0 \\ \bar{\sigma }\bar{\mu }e - \bar{X} \bar{Z} e \end{bmatrix}. \end{aligned}$$

(59)

We observe that, since $e^T (\bar{\mu }e - \bar{X} \bar{Z} e) = 0$, by (51) we obtain

$$\begin{aligned} \begin{aligned} \Vert \bar{\sigma }\!\bar{\mu }e \!-\! \bar{X} \bar{Z} e \Vert ^2&\!=\! \Vert (\bar{\sigma }\!-\! 1) \bar{\mu }e \!+\! (\bar{\mu }e \!-\! \bar{X} \bar{Z} e) \Vert ^2 = (\bar{\sigma }- 1)^2 \bar{\mu }^2 n + \Vert \bar{\mu }e - \bar{X} \bar{Z} e \Vert ^2 \\&= (\bar{\sigma }- 1)^2 \bar{\mu }^2 n + \Vert \bar{X} \bar{Z} e \Vert ^2 - n \bar{\mu }^2 \\ {}&\le - (2 - \bar{\sigma }) \bar{\sigma }n \bar{\mu }^2 + \gamma ^{-2} n \bar{\mu }^2\ \le \ \gamma ^{-2} n \bar{\mu }^2. \end{aligned}\nonumber \\ \end{aligned}$$

(60)

Taking the 2-norm in (59) and using (60), we get

$$\begin{aligned} \begin{aligned} \Vert v \Vert ^2&= \Vert (r_c^{k + 1}, r_b^{k + 1})\Vert ^2 + \Vert \bar{\sigma }\bar{\mu }e - \bar{X} \bar{Z} e \Vert ^2 \\&\le \left( \frac{\Vert (r_c^0, r_b^0)\Vert \beta }{\mu _0} \right) ^2 \!\!\mu ^2_{k + 1} \!+\! \gamma ^{-2} n \bar{\mu }^2 \!\le \! \left[ \left( \frac{\Vert (r_c^0, r_b^0)\Vert \beta }{\xi ^2} \right) ^2 \!\!+\! \gamma ^{-2} \right] \!n \mu ^2_{k}, \end{aligned}\nonumber \\ \end{aligned}$$

(61)

where in the last inequality we used (51) to yield $\bar{\mu }\le \mu $ and (52) to yield $\mu _0 = \xi ^2$. By defining $\rho = \alpha _\textrm{dec}> 0$, we can see that condition (51) results in the sufficient decrease condition of Lemma 1. Since $\alpha \in (0, 1]$, this ensures that $\Vert \bar{y} \Vert > 0$. Using Lemma 1, (51) and (61) in Eq. (15), we obtain the desired result. $\square $
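The orthogonality argument behind (60) can be checked on a small example. The sketch below is illustrative, with a hypothetical function name and data: given the complementarity products $\bar{x}_i \bar{z}_i$ and a value of $\bar{\sigma }$, it evaluates the left-hand side of (60), the closed form obtained from $e^T (\bar{\mu }e - \bar{X} \bar{Z} e) = 0$, and the final bound $\gamma ^{-2} n \bar{\mu }^2$.

```python
def centrality_residual_identity(products, sigma, gamma):
    """Return (lhs, closed_form, bound) for the quantities in (60).

    products holds the values x_i * z_i of a point assumed to satisfy
    gamma * mu <= x_i * z_i <= mu / gamma.
    """
    n = len(products)
    mu = sum(products) / n
    lhs = sum((sigma * mu - p) ** 2 for p in products)
    closed_form = (sigma - 1.0) ** 2 * mu ** 2 * n + sum(p * p for p in products) - n * mu ** 2
    bound = n * mu ** 2 / gamma ** 2
    return lhs, closed_form, bound


# Hypothetical data, for illustration only.
print(centrality_residual_identity([0.8, 1.0, 1.2, 1.0], 0.3, 0.5))
```

The first two returned values agree up to rounding, and both stay below the third whenever the products satisfy the neighborhood bounds.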

Our goal now is to bound the term $p^T q$ in (13). We follow the same approach as [15] but compute, instead, bounds on $D^{-1} p$ and $D q$. It is important to note that the matrix $D = X^{1/2} Z^{-1/2}$ is related to the Newton iteration, hence we can use Eq. (8) and the properties of the true Jacobian discussed in Sect. 2.

When matrix A multiplies the combined direction p, we observe that

$$\begin{aligned} \begin{aligned} A p&= \alpha A \Delta {x}+ \bar{\alpha }A \Delta {\bar{x}}= - \alpha (A x \! - \! b) - \bar{\alpha }(A \bar{x} \! - \! b)\\&= - \alpha (A x \! - \! b) - \bar{\alpha }(A (x + \alpha \Delta {x}) \! - \! b) \\&= - \alpha (A x - b) - \bar{\alpha }(A x - b) + \alpha \bar{\alpha }(A x - b) = (\alpha \bar{\alpha }- \alpha - \bar{\alpha }) (A x - b) \\&= (\alpha \bar{\alpha }- \alpha - \bar{\alpha }) \nu (A x^0 - b) = A \left[ (\alpha \bar{\alpha }- \alpha - \bar{\alpha }) \nu (x^0 - x^*) \right] , \end{aligned} \end{aligned}$$

where $x^*$ is the primal solution of (1) used to define constant $\xi $ in (52). By defining $\hat{\nu }= - (\alpha \bar{\alpha }- \alpha - \bar{\alpha }) \nu = (1 - (1 - \alpha ) (1 - \bar{\alpha })) \nu $, we can see that the point $\hat{x} = p + \hat{\nu }(x^0 - x^*)$ is such that $A \hat{x} = 0$. Similar arguments can be used to show that $\hat{z} = q + \hat{\nu }(z^0 - z^*)$ and $\hat{\lambda }= \alpha \Delta {\lambda }+ \bar{\alpha }\Delta {\bar{\lambda }}+ \hat{\nu }(\lambda ^0 - \lambda ^*)$ satisfy $A^T \hat{\lambda }+ \hat{z} = 0$. Therefore, it is not hard to see that

$$\begin{aligned} {\hat{x}}^T \hat{z} = \left( p + \hat{\nu }(x^0 - x^*) \right) ^T \left( q + \hat{\nu }(z^0 - z^*) \right) = 0. \end{aligned}$$

(62)

We now multiply the third row of J by vector $(\hat{x}, \hat{\lambda }, \hat{z})$ to obtain $ Z \hat{x} + X \hat{z} = Z \left( p + \hat{\nu }(x^0 - x^*) \right) + X \left( q + \hat{\nu }(z^0 - z^*) \right) = Z p + X q + \hat{\nu }Z (x^0 - x^*) + \hat{\nu }X ( z^0 - z^*) $. Multiplying this equation on both sides by $(X Z)^{-1/2}$, we get

$$\begin{aligned} \begin{aligned} D^{-1} \hat{x} + D \hat{z}&= (X Z)^{-1/2} \big [ Z p + X q \big ] + \hat{\nu }D^{-1} (x^0 - x^*) + \hat{\nu }D (z^0 - z^*). \end{aligned} \end{aligned}$$

(63)

Using (62) and (63), we conclude that $ \Vert D^{-1} \hat{x} \Vert ^2 + \Vert D \hat{z} \Vert ^2 = \Vert D^{-1} \hat{x} + D \hat{z} \Vert ^2 = \Vert (X Z)^{-1/2} [ Z p + X q ] + \hat{\nu }D^{-1} (x^0 - x^*) + \hat{\nu }D ( z^0 - z^*) \Vert ^2 $. Using this relation we get

$$\begin{aligned} \begin{aligned} \Vert D^{-1} \hat{x} \Vert&\!\le \!\left\| (X Z)^{-1/2} \!\left[ Z p + X q \right] \right\| + \hat{\nu }\Vert D^{-1} (x^0 - x^*) \Vert + \hat{\nu }\Vert D ( z^0 - z^*) \Vert \end{aligned} \end{aligned}$$

(64)

and observe that the same bound holds for $\Vert D \hat{z} \Vert $. In the next two lemmas, we compute the bounds for the terms in the right-hand side of (64).

Lemma 16

Under Assumption 4, let $\omega $ and $C_3$ be the constants (independent of n) defined in Eq. (54) and Lemma 15, respectively. Then

$ \Big \Vert (X Z)^{\!-\!1/2} \left[ Z p + X q \right] \Big \Vert \le \gamma ^{-1/2} \left[ (\sigma _{max} + \gamma ^{-1})(\alpha + \bar{\alpha }) \!+\! (\alpha + C_3) \bar{\alpha }\alpha \omega ^2 \right] n^{5/2} \mu ^{1/2}. $

Proof

First, we observe that the diagonal matrices $\Delta {X}$ and $\Delta {Z}$ were computed in the Newton iteration and use (54) to obtain

$$\begin{aligned} \Vert \Delta {X}\Delta {Z}e \Vert \le \Vert D^{-1} \Delta {X}\Vert \Vert D \Delta {Z}e \Vert \le \Vert D^{-1} \Delta {X}\Vert \omega n \mu ^{1/2}. \end{aligned}$$

Since both $D^{-1}$ and $\Delta {X}$ are diagonal matrices, using the property of the induced 2-norm of matrices, we get $ \Vert D^{-1} \Delta {X}\Vert = \underset{i}{\max }\ \frac{|[\Delta {x}]_i|}{[D]_{ii}} \!=\! \Vert D^{-1} \!\Delta {X}e \Vert _\infty \!\le \!\Vert D^{-1} \!\Delta {X}e \Vert \le \omega n \mu ^{1/2}. $ Hence, $\Vert \Delta {X}\Delta {Z}e \Vert \le \omega ^2 n^2 \mu $. Now, we use Eq. (21) (which does not depend on the feasibility of the iterate or the type of the neighborhood) to expand the desired expression in the statement of this lemma. Additionally, we use Eq. (51) to bound $\bar{\mu }$ and $X Z e$, and Lemma 15 together with the previous inequality, to derive:

$$\begin{aligned} \begin{aligned} \Vert&(X Z)^{-1/2} \left[ Z p + X q \right] \Vert = \\&= \left( \sum _{i = 1}^n \frac{\left( (1 - \bar{\alpha }) \alpha \sigma \mu - (\bar{\alpha }+ \alpha (1 - \bar{\alpha })) x_i z_i + \bar{\alpha }\bar{\sigma }\bar{\mu }- (1 +\gamma _1) \bar{\alpha }\alpha ^2 [\Delta {x}]_i [\Delta {z}]_i \right) ^2}{x_i z_i} \right) ^{1/2} \\&\le (\gamma \mu )^{-1/2} \left\| (1 - \bar{\alpha }) \alpha \sigma \mu e - (\bar{\alpha }+ \alpha (1 - \bar{\alpha })) X Z e + \bar{\alpha }\bar{\sigma }\bar{\mu }e - (1 +\gamma _1) \bar{\alpha }\alpha ^2 \Delta {X}\Delta {Z}e \right\| \\&\le (\gamma \mu )^{-1/2} \left[ (1 - \bar{\alpha }) \alpha \sigma \sqrt{n} \mu + (\bar{\alpha }+ \alpha (1 - \bar{\alpha })) \gamma ^{-1} \sqrt{n} \mu + \bar{\alpha }\bar{\sigma }\sqrt{n} \mu + |1 +\gamma _1| \bar{\alpha }\alpha ^2 \omega ^2 n^2 \mu \right] \\&\le \gamma ^{-1/2} \left[ \left( (\sigma + \gamma ^{-1})(1 - \bar{\alpha }) \alpha + (\bar{\sigma }+ \gamma ^{-1}) \bar{\alpha }\right) \sqrt{n} + \left( 1 + C_3 \frac{\sqrt{n}}{\alpha } \right) \bar{\alpha }\alpha ^2 \omega ^2 n^2 \right] \mu ^{1/2} \\&\le \gamma ^{-1/2} \left[ (\sigma _{max} + \gamma ^{-1})(\alpha + \bar{\alpha }) + (\alpha + C_3) \bar{\alpha }\alpha \omega ^2 \right] n^{5/2} \mu ^{1/2}, \\ \end{aligned} \end{aligned}$$

which proves the lemma. $\square $

Lemma 17

Under Assumption 4, let $\omega $ and $C_3$ be the constants (independent of n) defined before. Then,

$$\begin{aligned} \begin{aligned} \Vert D^{-1} p \Vert&\le \left[ (\sigma _{max} + \gamma ^{-1} + 8 \beta ) (\alpha + \bar{\alpha }) + (\alpha + C_3) \omega ^2 \alpha \bar{\alpha }\right] \gamma ^{-1/2} n^{5/2} \mu ^{1/2} \\ \Vert D q \Vert&\le \left[ (\sigma _{max} + \gamma ^{-1} + 8 \beta ) (\alpha + \bar{\alpha }) + (\alpha + C_3) \omega ^2 \alpha \bar{\alpha }\right] \gamma ^{-1/2} n^{5/2} \mu ^{1/2}. \end{aligned} \end{aligned}$$

Proof

We will only consider $\Vert D^{-1} p \Vert $, since getting a bound for $\Vert D q \Vert $ follows the same arguments. We use the definition of $\hat{x}$ (see (62)), add and subtract $\hat{\nu }D^{-1} (x^0 - x^*)$ inside the norm, and then use the triangle inequality and (64) to obtain

$$\begin{aligned} \begin{aligned} \Vert D^{-1} p \Vert&= \Vert D^{-1} \hat{x} - \hat{\nu }D^{-1} (x^0 - x^*) \Vert \le \Vert D^{-1} \hat{x} \Vert + \hat{\nu }\Vert D^{-1} (x^0 - x^*) \Vert \\&\le \left\| (X Z)^{-1/2} \left[ Z p + X q \right] \right\| + 2 \hat{\nu }\Vert D^{-1} (x^0 - x^*) \Vert + 2 \hat{\nu }\Vert D ( z^0 - z^*) \Vert , \end{aligned}\nonumber \\ \end{aligned}$$

(65)

where another term $\hat{\nu }\Vert D ( z^0 - z^*) \Vert $ was added in the last inequality. We already have bounds for the first term in the right-hand side of (65), by Lemma 16. From [15, Lemma 6.6] we know that

$$\begin{aligned} \begin{aligned} \Vert D^{-1} (x^0 - x^*) \Vert&\le \Vert (x^0 - x^*) \Vert \Vert D^{-1} \Vert \le \xi \Vert D^{-1} \Vert \!=\! \xi \ \underset{i = 1, \dots , n}{\max }\ \frac{1}{[D]_{ii}} = \xi \Vert D^{-1} e \Vert _\infty \\&\le \xi \Vert D^{-1} e \Vert = \xi \Vert (X Z)^{-1/2} Z e \Vert \le \xi \Vert (X Z)^{-1/2} \Vert \Vert z \Vert _1 \end{aligned} \end{aligned}$$

and, similarly, $\Vert D ( z^0 - z^*) \Vert \, \le \, \xi \Vert (X Z)^{-1/2} \Vert \Vert x \Vert _1$. We recall that $\hat{\nu }\, = \, (1 - (1 - \alpha )(1 - \bar{\alpha })) \nu $ and apply all these inequalities together with (53) to obtain

$$\begin{aligned} \begin{aligned} 2 \hat{\nu }&\Vert D^{-1} (x^0 - x^*) \Vert + 2 \hat{\nu }\Vert D ( z^0 - z^*) \Vert \le \\&\le 2 \xi (1 - (1 - \alpha )(1 - \bar{\alpha })) \Vert (X Z)^{-1/2} \Vert \nu \Vert (x, z) \Vert _1 \\&\le 2 \xi (1 \!-\! (1 \!-\! \alpha )(1 \!-\! \bar{\alpha })) (\gamma \mu )^{-1/2} \frac{4 \beta }{\xi } n \mu \!=\! (1 \!-\! (1 \!-\! \alpha )(1 \!-\! \bar{\alpha })) \frac{8 \beta }{\gamma ^{1/2}} n \mu ^{1/2}, \end{aligned} \end{aligned}$$

which provides bounds for the last two terms in (65). Therefore, using Lemma 16 and the above inequality we write

$$\begin{aligned} \begin{aligned}&\Vert D^{-1} p \Vert \\&\!\le \! \left[ \frac{(\sigma _{max} +\gamma ^{-1}) (\alpha + \bar{\alpha }) \!+\! (\alpha + C_3) \bar{\alpha }\alpha \omega ^2 }{\gamma ^{1/2}} n^{5/2} + \frac{8 \beta (1 - (1 - \alpha )(1 - \bar{\alpha })) }{\gamma ^{1/2}} n \right] \mu ^{1/2} \\&\le \frac{ (\sigma _{max} + \gamma ^{-1}) (\alpha + \bar{\alpha }) + (\alpha + C_3) \bar{\alpha }\alpha \omega ^2 + 8 \beta (1 - (1 - \alpha )(1 - \bar{\alpha }))}{\gamma ^{1/2}} n^{5/2} \mu ^{1/2} \\&\le \left[ (\sigma _{max} + \gamma ^{-1} + 8 \beta ) (\alpha + \bar{\alpha }) + (\alpha + C_3) \omega ^2 \alpha \bar{\alpha }\right] \gamma ^{-1/2} n^{5/2} \mu ^{1/2} \end{aligned} \end{aligned}$$

and the lemma is proved. $\square $

If we restrict the choices of $\alpha $ and $\bar{\alpha }$, then the bounds obtained in Lemma 17 can be significantly simplified as shown in the corollary below.

Corollary 1

If $\alpha \le (\sigma _{max} + \gamma ^{-1} + 8 \beta ) ((1 + C_3) \omega ^2)^{-1}$ and $\bar{\alpha }\le C_3^{-1} \alpha $, then there exists a constant $C_6$, independent of n, such that $ \Vert D^{-1} p \Vert \le C_6 \alpha n^{5/2} \mu ^{1/2} \ \text {and}\ \Vert D q \Vert \le C_6 \alpha n^{5/2} \mu ^{1/2}. $

Proof

Using the bound on $\alpha $ assumed in the corollary, together with $\alpha \le 1$, we obtain $(\alpha + C_3) \omega ^2 \alpha \le (1 + C_3) \omega ^2 \alpha \le (\sigma _{max} + \gamma ^{-1} + 8 \beta )$; hence, by $\bar{\alpha }\le C_3^{-1} \alpha $,

$$\begin{aligned} \begin{aligned} (\sigma _{max} + \gamma ^{-1} + 8 \beta )&(\alpha + \bar{\alpha }) + (\alpha + C_3) \omega ^2 \alpha \bar{\alpha }\le (\sigma _{max} + \gamma ^{-1} + 8 \beta ) (\alpha + 2 \bar{\alpha }) \\&\le (\sigma _{max} + \gamma ^{-1} + 8 \beta ) (1 + 2 C_3^{-1}) \alpha , \end{aligned} \end{aligned}$$

and the conclusion follows by defining $C_6 = (\sigma _{max} + \gamma ^{-1} + 8 \beta ) (1 + 2 C_3^{-1}) \gamma ^{-1/2}$. $\square $

We are now ready to prove the polynomial worst-case iteration complexity of Algorithm 1 in the infeasible case. Lemma 14 dealt with the Newton step of the method. Theorems 5 and 6 provide the results for the quasi-Newton step.

Theorem 5

Suppose that Assumption 4 holds. If

$$\begin{aligned} \begin{aligned} \alpha _\textrm{dec}+ \sigma _{max}&\le 1 - \sigma _{min} \end{aligned} \end{aligned}$$

(66)

then there exists a constant $C_5$, independent of n, such that, if

$$\begin{aligned} \alpha \in \left( 0, \min \left\{ \frac{1}{(1 + C_3) \omega ^2}, \frac{C_5}{n^5} \right\} \right] \ \text {and} \ \ \bar{\alpha }\in \left[ (C_3 C_5)^{-1} n^5 \alpha ^2, C_3^{-1} \alpha \right] , \end{aligned}$$

(67)

then the iterate after the quasi-Newton step satisfies

$$\begin{aligned} (\bar{x} + \bar{\alpha }\Delta {\bar{x}})^T(\bar{z} + \bar{\alpha }\Delta {\bar{z}})&\ge (1 - \bar{\alpha }) {\bar{x}}^T \bar{z} \end{aligned}$$

(68)

$$\begin{aligned} (\bar{x} + \bar{\alpha }\Delta {\bar{x}})^T(\bar{z} + \bar{\alpha }\Delta {\bar{z}})&\le (1 - \alpha _\textrm{dec}\bar{\alpha }) {\bar{x}}^T \bar{z} \end{aligned}$$

(69)

$$\begin{aligned}{}[\bar{x} + \bar{\alpha }\Delta {\bar{x}}]_i [\bar{z} + \bar{\alpha }\Delta {\bar{z}}]_i&\ge \gamma \bar{\bar{\mu }} \end{aligned}$$

(70)

$$\begin{aligned}{}[\bar{x} + \bar{\alpha }\Delta {\bar{x}}]_i [\bar{z} + \bar{\alpha }\Delta {\bar{z}}]_i&\le \gamma ^{-1} \bar{\bar{\mu }} . \end{aligned}$$

(71)

Proof

By Lemma 15 we know that $C_3 \ge 1$. Hence, the given intervals for $\bar{\alpha }$ and $\alpha $ are well defined. We begin the proof with inequality (68). Using (54), we get

$$\begin{aligned} \alpha ^2 {\Delta {x}}^T \Delta {z}\le \alpha ^2 \Vert D^{-1} \Delta {x}\Vert \Vert D \Delta {z}\Vert \le \alpha ^2 \omega ^2 n^2 \mu . \end{aligned}$$

(72)

By Corollary 1, we obtain

$$\begin{aligned} p^T q \le \Vert D^{-1} p \Vert \Vert D q \Vert \le C_6^2 \alpha ^2 n^5 \mu . \end{aligned}$$

(73)

Finally, by using inequality (55) from Lemma 14, we get

$$\begin{aligned} \bar{\alpha }\bar{\sigma }{\bar{x}}^T \bar{z} \ge \bar{\alpha }(1 - \alpha ) \bar{\sigma }{x}^T z \ge \sigma _{min} \bar{\alpha }(1 - \alpha ) n \mu . \end{aligned}$$

(74)

Starting from Eq. (13), using (72)–(74), and then applying the bounds for $|\gamma _1|$ from Lemma 15 together with the bound $\bar{\alpha }\le C_3^{-1} \alpha $ (assumed in (67)) to deliver $1 + \bar{\alpha }|\gamma _1| \le 1 + C_3^{-1} \alpha C_3 \frac{\sqrt{n}}{\alpha } = 1 + \sqrt{n}$, we get

$$\begin{aligned} \begin{aligned} (\bar{x}&+ \bar{\alpha }\Delta {\bar{x}})^T (\bar{z} + \bar{\alpha }\Delta {\bar{z}}) - (1 - \bar{\alpha }) {\bar{x}}^T \bar{z} \ge \\&\ge \sigma _{min} \bar{\alpha }(1 - \alpha ) n \mu - C_6^2 \alpha ^2 n^5 \mu - (1 + \bar{\alpha }|\gamma _1|) \alpha ^2 |{\Delta {x}}^T \Delta {z}|\\&\ge \left( \sigma _{min} \bar{\alpha }- \sigma _{min} \bar{\alpha }\alpha - (1 + n^{1/2}) \alpha ^2 \omega ^2 n - C_6^2 \alpha ^2 n^4 \right) n \mu \\&\ge \left[ \sigma _{min} \bar{\alpha }- \left( C_3^{-1} \sigma _{min} + 2 \omega ^2 + C_6^2 \right) \alpha ^2 n^4 \right] n \mu . \end{aligned} \end{aligned}$$

(75)

By defining $\kappa = 2 \omega ^2 + C_6^2$ and $C_4 = C_3^{-1} \sigma _{min} \left( C_3^{-1} \sigma _{min} + \kappa \right) ^{-1}$, the right-hand side of (75) is non-negative if $\bar{\alpha }\ge (C_4 C_3)^{-1} n^4 \alpha ^2$ and $\alpha \le C_4 / n^4$. The lower bound for $\bar{\alpha }$ is necessary to avoid too small a step in the quasi-Newton direction. It is obtained by expressing the requirement on $\bar{\alpha }$ as a function of $\alpha $, so that the right-hand side of (75) is guaranteed to be non-negative. The upper bound for $\alpha $ is obtained by requiring that $(C_4 C_3)^{-1} n^4 \alpha ^2 \le C_3^{-1} \alpha $ holds.

For inequality (69), we start by subtracting $(1 - \alpha _\textrm{dec}\bar{\alpha }) {\bar{x}}^T \bar{z}$ from both sides of Eq. (13) to obtain an expression $ - (1 - \alpha _\textrm{dec}- \bar{\sigma }) \bar{\alpha }{\bar{x}}^T \bar{z} + p^T q - (1 + \bar{\alpha }\gamma _1) \alpha ^2 {\Delta {x}}^T \Delta {z}$.

We observe that the first term is negative by the assumption (66) of the Theorem, since $\bar{\sigma }\le \sigma _{max}$. Next, from the left inequality in (55), we have $(1 - \alpha ) n \mu \le n \bar{\mu }$ and then by using (66), (72) and (73), we conclude that

$$\begin{aligned} \begin{aligned} (\bar{x}&+ \bar{\alpha }\Delta {\bar{x}})^T (\bar{z} + \bar{\alpha }\Delta {\bar{z}}) - (1 - \alpha _\textrm{dec}\bar{\alpha }) {\bar{x}}^T \bar{z} \le \\&\le - \sigma _{min} \bar{\alpha }(1 - \alpha ) n \mu + (1 + n^{1/2}) \alpha ^2 \omega ^2 n^2 \mu + C_6^2 \alpha ^2 n^5 \mu \\&\le \left[ - \sigma _{min} \bar{\alpha }+ \left( \sigma _{min} C_3^{-1} + \kappa \right) \alpha ^2 n^4 \right] n \mu , \end{aligned} \end{aligned}$$

where in the last inequality we used the bound $\bar{\alpha }\le C_3^{-1} \alpha $ to deliver inequality $\sigma _{min} \bar{\alpha }\alpha n \mu \le \sigma _{min} C_3^{-1} \alpha ^2 n \mu $. The right-hand side of the previous inequality will be non-positive if, as before, $\bar{\alpha }\ge (C_4 C_3)^{-1} n^4 \alpha ^2$ and $\alpha \le C_4 / n^4$.

To prove inequalities (70) and (71), we initially look at (14), which is a component-wise version of (12). We also need to derive component-wise versions of (73) and (72). For (73) we have

$$\begin{aligned} \begin{aligned} p_i q_i \le |[D^{-1}]_{ii} p_i |\, |[D]_{ii} q_i |\le \Vert D^{-1} p \Vert \Vert D q \Vert \le C_6^2 \alpha ^2 n^5 \mu , \end{aligned} \end{aligned}$$

(76)

and, by a similar approach, Eq. (72), Lemmas 13 and 15 and the assumption of the theorem $\bar{\alpha }\le C_3^{-1} \alpha $ we get

$$\begin{aligned} \begin{aligned} (1 + \bar{\alpha }\gamma _1)&\alpha ^2 [\Delta {x}]_i [\Delta {z}]_i \le (1 + \bar{\alpha }|\gamma _1 |) \alpha ^2 |[\Delta {x}]_i [\Delta {z}]_i |\\&\le (1 + n^{1/2}) \alpha ^2 \Vert D^{-1} \Delta {X}\Vert \Vert D \Delta {Z}\Vert \le 2 \alpha ^2 \omega ^2 n^{5/2} \mu . \end{aligned} \end{aligned}$$

(77)

Now, taking (14) and applying (56) and (76)–(77) we get

$$\begin{aligned}{}[ \bar{x} + \bar{\alpha }\Delta {\bar{x}}]_i [\bar{z} + \bar{\alpha }\Delta {\bar{z}}]_i&\ge (1 - \bar{\alpha }) \gamma \bar{\mu }+ \bar{\alpha }\bar{\sigma }\bar{\mu }- |p_i q_i |- (1 + \bar{\alpha }|\gamma _1 |) \alpha ^2 |[\Delta {x}]_i [\Delta {z}]_i |\\&\ge (1 - \bar{\alpha }) \gamma \bar{\mu }+ \bar{\alpha }\bar{\sigma }\bar{\mu }- C_6^2 \alpha ^2 n^5 \mu - 2 \alpha ^2 \omega ^2 n^{5/2} \mu \\&\ge (1 - \bar{\alpha }) \gamma \bar{\mu }+ \bar{\alpha }\bar{\sigma }\bar{\mu }- \alpha ^2 \kappa \, n^5 \mu \end{aligned}$$

and from (13), using (72)–(73) we get

$$\begin{aligned} \gamma \bar{\bar{\mu }}&= \gamma \frac{(\bar{x} + \bar{\alpha }\Delta {\bar{x}})^T (\bar{z} + \bar{\alpha }\Delta {\bar{z}})}{n}\\ {}&\le \gamma \left[ (1 - \bar{\alpha }(1 \! - \! \bar{\sigma }) ) \bar{\mu }+ \frac{1 \! + \! \bar{\alpha }|\gamma _1 |}{n} \alpha ^2 |{\Delta {x}}^T \Delta {z}|+ \frac{|p^T q |}{n} \right] \\&\le (1 - \bar{\alpha }) \gamma \bar{\mu }+ \bar{\alpha }\bar{\sigma }\gamma \bar{\mu }+ (1 + n^{1/2}) \gamma \omega ^2 \alpha ^2 n \mu + C_6^2 \gamma \alpha ^2 n^4 \mu \\&\le (1 - \bar{\alpha }) \gamma \bar{\mu }+ \bar{\alpha }\bar{\sigma }\gamma \bar{\mu }+ \kappa \gamma \alpha ^2 n^4 \mu . \end{aligned}$$

By combining the above two inequalities, then using (74) and (55) and $\bar{\alpha }\le C_3^{-1} \alpha $, we obtain

$$\begin{aligned}{}[\bar{x} \!+\! \bar{\alpha }\Delta {\bar{x}}]_i&[\bar{z} \!+\! \bar{\alpha }\Delta {\bar{z}}]_i \!-\! \gamma \bar{\bar{\mu }} \!\ge \! (1 \!-\! \gamma ) \bar{\alpha }\bar{\sigma }\bar{\mu }\!-\! (1 \!+\! \gamma ) \kappa n^5 \alpha ^2 \mu \\&\ge (1 - \gamma ) \sigma _{min} \bar{\alpha }(1 - \alpha ) \mu - (1 + \gamma ) \kappa n^5 \alpha ^2 \mu \\&\ge \left[ \sigma _{min} \bar{\alpha }- \left( C_3^{-1} \sigma _{min} + \left( \frac{1 + \gamma }{1 - \gamma } \right) \kappa \right) n^5 \alpha ^2 \right] (1 - \gamma ) \mu . \end{aligned}$$

The right-hand side of this inequality is non-negative, and hence (70) holds, if $\bar{\alpha }\ge (C_3 C_5)^{-1} n^5 \alpha ^2$ and $\alpha \le C_5 / n^5$, with $ C_5 = C_3^{-1} \sigma _{min} \left( \frac{\sigma _{min}}{C_3} + \left( \frac{1 + \gamma }{1 - \gamma } \right) \kappa \right) ^{-1} $.

Finally, in order to show (71), we use very similar arguments and obtain the following two inequalities: $ [\bar{x} + \bar{\alpha }\Delta {\bar{x}}]_i [\bar{z} + \bar{\alpha }\Delta {\bar{z}}]_i \le \frac{1 - \bar{\alpha }}{\gamma } \bar{\mu }+ \bar{\alpha }\bar{\sigma }\bar{\mu }+ \kappa n^5 \alpha ^2 \mu $ and $ \frac{1}{\gamma } \bar{\bar{\mu }} \ge \frac{1 - \bar{\alpha }}{\gamma } \bar{\mu }+ \frac{\bar{\alpha }\bar{\sigma }}{\gamma } \bar{\mu }- \frac{\kappa n^4 \alpha ^2}{\gamma } \mu $. By combining them and using (55) and $\bar{\alpha }\le C_3^{-1} \alpha $ once more, we have that

$$\begin{aligned}{}[\bar{x} + \bar{\alpha }\Delta {\bar{x}}]_i&[\bar{z} + \bar{\alpha }\Delta {\bar{z}}]_i - \frac{1}{\gamma } \bar{\bar{\mu }} \le - \left( \frac{1 - \gamma }{\gamma } \right) \sigma _{min} \bar{\alpha }(1 - \alpha ) \mu + \left( \frac{1 + \gamma }{\gamma } \right) \kappa n^5 \alpha ^2 \mu \\&\le \left[ - \sigma _{min} \bar{\alpha }+ \left( C_3^{-1} \sigma _{min} + \left( \frac{1 + \gamma }{1 - \gamma } \right) \kappa \right) n^5 \alpha ^2 \right] \left( \frac{1 - \gamma }{\gamma } \right) \mu . \end{aligned}$$

Again, the right-hand side of this inequality is non-positive if $\bar{\alpha }\ge (C_3 C_5)^{-1} n^5 \alpha ^2$ and $\alpha \le C_5 / n^5$, with $C_5$ as defined above. Since $C_5 \le C_4$, we observe that $(C_3 C_5)^{-1} n^5 \ge (C_3 C_4)^{-1} n^5$, and this explains the lower bound on $\bar{\alpha }$. We also observe that, in (67), $\alpha $ is bounded from above by $C_5/n^5$, hence $(C_3 C_5)^{-1} n^5 \alpha ^2 \le C_3^{-1} \alpha $, which guarantees that the interval for $\bar{\alpha }$ is not empty. The upper bound on $\alpha $ in (67) is obtained by noting that $0 < C_5 / n^5 \le C_4 / n^4$ and using Corollary 1, since $(\sigma _{max} + \gamma ^{-1} + 8 \beta ) ((1 + C_3) \omega ^2)^{-1} \ge ((1 + C_3) \omega ^2)^{-1} > 0$. This concludes the proof.

$\square $
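The interplay between the bounds on $\alpha $ and $\bar{\alpha }$ at the end of this proof can be checked numerically. The following Python sketch is only illustrative: the function names are hypothetical, and the values used for $C_3$ and $\kappa $ are placeholders (their actual definitions are in Lemma 15 and Theorem 5 and are not reproduced here), chosen merely to satisfy $C_3 \ge 1$ and $\kappa \ge 1$.

```python
def c5(sigma_min: float, gamma: float, kappa: float, c3: float) -> float:
    """C_5 as written in the proof of Theorem 5."""
    ratio = (1.0 + gamma) / (1.0 - gamma)
    return (sigma_min / c3) / (sigma_min / c3 + ratio * kappa)


def alpha_bar_interval(alpha, n, sigma_min, gamma, kappa, c3):
    """Lower and upper bounds on the quasi-Newton step size alpha_bar."""
    const = c5(sigma_min, gamma, kappa, c3)
    lower = n**5 * alpha**2 / (c3 * const)  # (C_3 C_5)^{-1} n^5 alpha^2
    upper = alpha / c3                      # C_3^{-1} alpha
    return lower, upper


def bracket(alpha_bar, alpha, n, sigma_min, gamma, kappa, c3):
    """Bracketed term whose sign decides (70) and (71)."""
    ratio = (1.0 + gamma) / (1.0 - gamma)
    return sigma_min * alpha_bar - (sigma_min / c3 + ratio * kappa) * n**5 * alpha**2


if __name__ == "__main__":
    # Placeholder constants (assumptions, not values from the paper).
    params = dict(sigma_min=0.1, gamma=0.5, kappa=2.0, c3=1.5)
    n = 10
    alpha = 0.5 * c5(**params) / n**5  # any alpha <= C_5 / n^5
    lower, upper = alpha_bar_interval(alpha, n, **params)
    print("interval for alpha_bar:", lower, upper)
    print("bracket at upper end:", bracket(upper, alpha, n, **params))
```

For any $\alpha \le C_5 / n^5$ the interval is non-empty, and the bracket is non-negative throughout it, vanishing (up to rounding) at the lower end.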

To further show that $(\bar{\bar{x}}, \bar{\bar{\lambda }}, \bar{\bar{z}})$ belongs to the $\mathcal {N}_s(\gamma , \beta )$ neighborhood, and therefore satisfies (51), we need to ensure that $\gamma $ is not too close to 0.

Theorem 6

Let the hypotheses of Theorem 5 hold. If, in addition, constants $\gamma $, $\beta $ and $\sigma _{min}$ satisfy $ \gamma \ge 2 \left( - 8 \beta + \sqrt{(8 \beta + 2)^2 + \dfrac{4}{3 \sigma _{min}}} \right) ^{-1} $, then $(\bar{\bar{x}}, \bar{\bar{\lambda }}, \bar{\bar{z}}) \in \mathcal {N}_s(\gamma , \beta )$ and $\bar{\bar{\mu }} \le (1 - \alpha _\textrm{dec}\bar{\alpha }) \bar{\mu }$.

Proof

Using inequality (68) in Theorem 5, we have $ \frac{\Vert (\bar{\bar{r}}_b, \bar{\bar{r}}_c) \Vert }{\bar{\bar{\mu }}} \le \frac{(1 - \bar{\alpha }) \Vert ({\bar{r}}_b, {\bar{r}}_c) \Vert }{(1 - \bar{\alpha }) \bar{\mu }} \le \frac{\Vert (r^0_b, r^0_c) \Vert }{\mu _0} \beta $ and, by (70) and (71), we obtain $ 0 < \gamma \bar{\bar{\mu }} \le \bar{\bar{x}}_i \bar{\bar{z}}_i \le \frac{1}{\gamma } \bar{\bar{\mu }} $. To show that $\bar{\bar{x}} > 0$ and $\bar{\bar{z}} > 0$, we suppose, by contradiction, that $\bar{\bar{x}}_i \le 0$ or $\bar{\bar{z}}_i \le 0$ occurs for some $i$ and follow the same arguments as those used in Theorems 1 and 3 to conclude, using inequality (76), that $\gamma < C_6^2 n^5 \alpha ^2$ should hold.

By using the fact that $C_3 \ge 1$, defined in Lemma 15, and $\kappa \ge 1$ and $C_6 \ge 1$, defined in Theorem 5, and basic manipulation, we conclude that $C_6 \le 3 (1 + \gamma ^{-1} + 8 \beta ) \gamma ^{-1/2}$ and $C_5 \le \sigma _{min} (1 - \gamma )$, where $C_5$ was also defined in Theorem 5. Then, using (67) for $\alpha $ we obtain

$$\begin{aligned} \gamma < C_6^2 n^5 \alpha ^2 \le \frac{(C_5 C_6)^2}{n^5} \le \frac{9 (1 + \gamma ^{-1} + 8 \beta )^2 (1 - \gamma )^2}{\gamma } \sigma _{min}^2 \end{aligned}$$

(78)

which implies $ \gamma < 2 \left( - 8 \beta + \sqrt{(8\beta + 2)^2 + \frac{4}{3 \sigma _{min}}} \right) ^{-1}$ and contradicts the hypothesis on the lower bound of $\gamma $.

The above inequality is obtained by rearranging (78), namely, dropping the squares and solving the resulting (quadratic) inequality in the unknown $1/\gamma $. Therefore, we conclude that $(\bar{\bar{x}}, \bar{\bar{\lambda }}, \bar{\bar{z}}) \in \mathcal {N}_s(\gamma , \beta )$. The previously proved inequality (69) guarantees that the iterate obtained after the quasi-Newton step satisfies $\bar{\bar{\mu }} \le (1 - \alpha _\textrm{dec}\bar{\alpha }) \bar{\mu }$. $\square $
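To get a feeling for how restrictive the hypothesis of Theorem 6 is, the lower bound on $\gamma $ can be evaluated directly. The sketch below is a minimal illustration with a hypothetical function name; the sample values of $\beta $ and $\sigma _{min}$ are assumptions made only for the example.

```python
import math


def gamma_lower_bound(beta: float, sigma_min: float) -> float:
    """Right-hand side of the condition on gamma in Theorem 6."""
    if sigma_min <= 0.0:
        raise ValueError("sigma_min must be positive")
    root = math.sqrt((8.0 * beta + 2.0) ** 2 + 4.0 / (3.0 * sigma_min))
    return 2.0 / (root - 8.0 * beta)


if __name__ == "__main__":
    # Illustrative values only (not taken from the paper).
    for sigma_min in (0.5, 0.1, 0.01):
        print(sigma_min, gamma_lower_bound(beta=1.0, sigma_min=sigma_min))
```

Since the square root always exceeds $8 \beta + 2$ when $\sigma _{min} > 0$, the threshold is strictly below 1, so admissible values of $\gamma \in (0, 1)$ always exist; smaller values of $\sigma _{min}$ lower the threshold.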

Theorem 7

Suppose that $(x^{0}, {\lambda }^{0}, z^{0}) \in \mathcal {N}_s(\gamma , \beta )$ is chosen such that (52) holds, where $\gamma $ satisfies the condition of Theorem 6 and let $\{\mu _k\}_{k \in \mathbb {N}}$ be generated by Algorithm 1. If $\mu _0 = \epsilon ^\kappa $, for a given $\epsilon > 0$ and a constant $\kappa $, then there exists $K > 0$, $K = O(n^5)$, such that $\mu _k \le \epsilon $ for all $k \ge K$.

Proof

By (67) and $C_3 \ge 1$, for sufficiently large $n$ both $\alpha $ and $\bar{\alpha }$ are greater than or equal to $\frac{C_3^{-1}C_5}{n^5}$. Therefore, $ \mu _k \le (1 - \alpha _\textrm{dec}\alpha ) \mu _{k - 1} \le \left( 1 - \frac{\alpha _\textrm{dec}C_3^{-1}C_5}{n^5} \right) \mu _{k-1} \le \left( 1 - \frac{\alpha _\textrm{dec}C_3^{-1}C_5}{n^5} \right) ^k \mu _0 $, which yields an $O(n^5)$ worst-case iteration complexity for the infeasible case of Algorithm 1, using arguments similar to those of Theorem 2. $\square $
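The geometric decrease used in this proof translates into an explicit iteration count. The following sketch is a hedged illustration: `delta` is a hypothetical stand-in for the product $\alpha _\textrm{dec}C_3^{-1}C_5$, whose numerical value depends on constants not reproduced here, and the function name is likewise an assumption.

```python
import math


def iterations_to_tolerance(mu0: float, eps: float, n: int, delta: float) -> int:
    """Smallest k such that (1 - delta / n**5)**k * mu0 <= eps."""
    if mu0 <= eps:
        return 0
    rate = delta / n**5
    if not 0.0 < rate < 1.0:
        raise ValueError("delta / n**5 must lie in (0, 1)")
    # Solve (1 - rate)**k <= eps / mu0 for k; log1p keeps accuracy for tiny rates.
    return math.ceil(math.log(eps / mu0) / math.log1p(-rate))


if __name__ == "__main__":
    # Placeholder inputs (assumptions for illustration only).
    for n in (2, 4, 8):
        print(n, iterations_to_tolerance(mu0=1.0, eps=1e-6, n=n, delta=0.5))
```

Because $-\log (1 - t) \ge t$ for $t \in (0, 1)$, the returned count never exceeds $\frac{n^5}{\delta } \log (\mu _0 / \epsilon ) + 1$, which is the source of the $O(n^5)$ dependence.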

For sufficiently large $n$, we clearly have that $\min \left\{ \left( (1 \! + \! C_3) \omega ^2 \right) ^{-1} \!\!, \frac{C_5}{n^5} \right\} = \frac{C_5}{n^5} < \frac{\delta _2}{n^2}$, where $\delta _2$ comes from Lemma 14, and this guarantees that Assumption 4 holds. Just as in the feasible case, delivering a complexity bound for an algorithm working in the $\mathcal {N}_{-\infty }(\gamma ,\beta )$ neighborhood would require extra effort and is left for possible future developments.

5 Conclusions and final observations

This work provided theoretical tools to analyze the worst-case iteration complexity of quasi-Newton primal-dual interior point algorithms. A simplified algorithm was considered, which consisted of alternating Newton and quasi-Newton steps. The quasi-Newton approach was based on the Broyden “bad” low-rank update of the inverse of the unreduced system. Feasible and infeasible algorithms and well-established neighborhoods of the central path have been considered.
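For readers less familiar with it, the textbook form of the Broyden “bad” update of an inverse approximation $H$ is $H_{+} = H + \frac{(s - H y) y^T}{y^T y}$, which enforces the secant equation $H_{+} y = s$. The sketch below illustrates only this generic rank-one formula on small dense matrices; the names are hypothetical, and it does not reproduce how the update is applied to the unreduced system in the algorithm analyzed in this paper.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def matvec(h: Matrix, v: Vector) -> Vector:
    """Dense matrix-vector product."""
    return [sum(hij * vj for hij, vj in zip(row, v)) for row in h]


def broyden_bad_update(h: Matrix, s: Vector, y: Vector) -> Matrix:
    """Rank-one update H + (s - H y) y^T / (y^T y) of an inverse approximation."""
    yty = sum(yi * yi for yi in y)
    if yty == 0.0:
        raise ValueError("y must be non-zero")
    hy = matvec(h, y)
    u = [si - hyi for si, hyi in zip(s, hy)]  # residual of the secant equation
    return [[h[i][j] + u[i] * y[j] / yty for j in range(len(y))]
            for i in range(len(s))]


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    s, y = [1.0, 2.0], [3.0, 4.0]
    h_new = broyden_bad_update(identity, s, y)
    print(matvec(h_new, y))  # should reproduce s up to rounding
```

The update costs one matrix-vector product and one outer product, which is the reason such low-rank corrections are attractive compared with refactorizing the matrix.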

The results showed that, in all cases, the degree of the polynomial in the worst-case bound is higher than in the corresponding result for Newton directions. This behavior has already been observed in [10], where the number of overall iterations increased, but the number of factorizations of the true unreduced system actually decreased.

An interesting and complicated question is the case where we allow more than one quasi-Newton step after a Newton step. The theoretical results obtained here show that the point after a single quasi-Newton step remains in a neighborhood of the central path. Extending the result to a sequence of quasi-Newton steps and demonstrating that the iterate remains in the neighborhood of the central path poses technical challenges. We expect that similar worst-case complexity bounds could be obtained, with possibly higher degrees of polynomials, but this still remains an open question.

The worst-case complexity results obtained when the $\mathcal {N}_s$ neighborhood was considered, in both the feasible and infeasible cases, seem rather pessimistic. The high polynomial degrees originate from Lemma 12 in the feasible case and from Lemma 17 in the infeasible case. Finding new ways of reducing the degrees of such expressions would definitely result in better complexity results, since all the other terms are at most of order $n$.

Data availability statement

The paper contains no data.

References

1. Gondzio, J., Grothey, A.: Exploiting structure in parallel implementation of interior point methods for optimization. CMS 6(2), 135–160 (2009). https://doi.org/10.1007/s10287-008-0090-3
2. Lustig, I.J., Marsten, R.E., Shanno, D.F.: Interior point methods for linear programming: computational state of the art. ORSA J. Comput. 6(1), 1–14 (1994). https://doi.org/10.1287/ijoc.6.1.1
3. D’Apuzzo, M., De Simone, V., Serafino, D.: On mutual impact of numerical linear algebra and large-scale optimization with focus on interior point methods. Comput. Optim. Appl. 45(2), 283–310 (2010). https://doi.org/10.1007/s10589-008-9226-1
4. Gondzio, J.: Interior point methods 25 years later. Eur. J. Oper. Res. 218(3), 587–601 (2012). https://doi.org/10.1016/j.ejor.2011.09.017
5. Greif, C., Moulding, E., Orban, D.: Bounds on eigenvalues of matrices arising from interior-point methods. SIAM J. Optim. 24(1), 49–83 (2014). https://doi.org/10.1137/120890600
6. Morini, B., Simoncini, V., Tani, M.: A comparison of reduced and unreduced KKT systems arising from interior point methods. Comput. Optim. Appl. 68(1), 1–27 (2017). https://doi.org/10.1007/s10589-017-9907-8
7. Bellavia, S., De Simone, V., Serafino, D., Morini, B.: Updating constraint preconditioners for KKT systems in quadratic programming via low-rank corrections. SIAM J. Optim. 25(3), 1787–1808 (2015). https://doi.org/10.1137/130947155
8. Bergamaschi, L., De Simone, V., Serafino, D., Martínez, Á.: BFGS-like updates of constraint preconditioners for sequences of KKT linear systems in quadratic programming. Numer. Linear Algebra Appl. 25(5), 2144 (2018). https://doi.org/10.1002/nla.2144
9. Gratton, S., Sartenaer, A., Tshimanga, J.: On a class of limited memory preconditioners for large scale linear systems with multiple right-hand sides. SIAM J. Optim. 21(3), 912–935 (2011). https://doi.org/10.1137/08074008
10. Gondzio, J., Sobral, F.N.C.: Quasi-Newton approaches to interior point methods for quadratic problems. Comput. Optim. Appl. 74(1), 93–120 (2019). https://doi.org/10.1007/s10589-019-00102-z
11. Ek, D., Forsgren, A.: A structured modified Newton approach for solving systems of nonlinear equations arising in interior-point methods for quadratic programming. Comput. Optim. Appl. 86, 1–48 (2023). https://doi.org/10.1007/s10589-023-00486-z
12. Gonzaga, C.C.: An algorithm for solving linear programming problems in $O(n^3 L)$ operations. In: Megiddo, N. (ed.) Progress Math. Program., pp. 1–28. Springer, New York (1989). https://doi.org/10.1007/978-1-4613-9617-8_1
13. Dennis, J.E., Jr., Morshedi, A.M., Turner, K.: A variable-metric variant of the Karmarkar algorithm for linear programming. Math. Program. 39(1), 1–20 (1987). https://doi.org/10.1007/BF02592068
14. Nesterov, Y., Nemirovskii, A.: Interior Point Polynomial Methods in Convex Programming. Society for Industrial and Applied Mathematics, Philadelphia (1994)
15. Wright, S.J.: Primal-dual Interior Point Methods. Society for Industrial and Applied Mathematics, Philadelphia (1997)
16. Schmieta, S.H., Alizadeh, F.: Extension of primal-dual interior point algorithms to symmetric cones. Math. Program. 96(3), 409–438 (2003). https://doi.org/10.1007/s10107-003-0380-z
17. Martínez, J.M.: Practical quasi-Newton methods for solving nonlinear systems. J. Comput. Appl. Math. 124(1–2), 97–121 (2000). https://doi.org/10.1016/S0377-0427(00)00434-9
18. Monteiro, R.D.C., Adler, I.: Interior path following primal-dual algorithms. Part I: Linear programming. Math. Program. 44(1–3), 27–41 (1989). https://doi.org/10.1007/BF01587075
19. Colombo, M., Gondzio, J.: Further development of multiple centrality correctors for interior point methods. Comput. Optim. Appl. 41(3), 277–305 (2008). https://doi.org/10.1007/s10589-007-9106-0
20. Gondzio, J.: HOPDM (version 212): a fast LP solver based on a primal-dual interior point method. Eur. J. Oper. Res. 85(1), 221–225 (1995). https://doi.org/10.1016/0377-2217(95)00163-K


Author information

Authors and Affiliations

School of Mathematics and Maxwell Institute for Mathematical Sciences, University of Edinburgh, JCMB, King’s Buildings, Edinburgh, EH9 3FD, Scotland, UK
Jacek Gondzio
Department of Mathematics, State University of Maringá, Avenida Colombo, 5790, Maringá, Paraná, 87020-900, Brazil
Francisco N. C. Sobral

Contributions

All authors contributed to the results presented in this article.

Corresponding authors

Correspondence to Jacek Gondzio or Francisco N. C. Sobral.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Gondzio, J., Sobral, F.N.C. Polynomial worst-case iteration complexity of quasi-Newton primal-dual interior point algorithms for linear programming. Comput Optim Appl (2024). https://doi.org/10.1007/s10589-024-00584-6

Received: 30 October 2023
Accepted: 18 May 2024
Published: 07 June 2024
DOI: https://doi.org/10.1007/s10589-024-00584-6
