1 Introduction

Strong feasibility of the primal and dual problems is a standard regularity condition in convex optimization, e.g., [24, 36, Chapter 3]. Once this condition is satisfied, powerful algorithms such as interior-point algorithms and the ellipsoid algorithm can be applied to solve the problems efficiently, at least in theory. On the other hand, if a problem at hand does not satisfy this condition, it can be much harder to solve. For instance, the problem may have a positive duality gap. With the advance of optimization modelling techniques, there are many problems which by nature do not satisfy primal-dual strong feasibility.

A first attempt to apply interior-point algorithms to such problems would be to perturb the problem to recover strong feasibility on both sides, i.e., “regularization.” But it is not clear how this perturbation affects the optimal value. In this paper, we focus on semidefinite programs (SDP) and conduct an asymptotic analysis of the optimal value function when the problem is perturbed slightly to recover primal-dual strong feasibility. The analysis is general enough to be applicable to any ill-behaved problem without assuming constraint qualifications, and has interesting implications for the convergence theory of interior-point algorithms.

It is known that every SDP falls into one of four statuses: strongly feasible, weakly feasible, weakly infeasible and strongly infeasible, e.g., see [23]. Difficult situations like a positive duality gap may occur when the problem is either weakly feasible or weakly infeasible. We may call such problems “singular”.

A standard method to deal with singular problems in semidefinite programming and general conic convex programming is facial reduction [4,5,6,7, 15, 40, 43, 46]. This approach recovers strong feasibility by finding the minimal face containing the feasible region. While many of the earlier papers on facial reduction focused on weakly feasible problems, it is relatively recent that weak infeasibility has been analyzed in this context [15, 18, 29]. Along this line of development, the paper [20] showed that, through double facial reduction, any SDP can be solved “completely” by calling an interior-point oracle polynomially many times, where the interior-point oracle is an idealized interior-point algorithm which returns primal-dual optimal solutions given a primal-dual strongly feasible SDP. In the context of SDPs with positive duality gaps, Ramana developed an extended Lagrangian dual SDP for which strong duality always holds [34]. Later it was shown in [35] that Ramana’s dual problem is strongly related to facial reduction, see also [28].

Implementation of a facial reduction algorithm is subtle and not easy, being vulnerable to rounding errors. Nevertheless, it is worth mentioning that there are several recent works focused on practical issues regarding facial reduction or on heuristics based on facial reduction [9, 30, 31, 49].

So far, we have discussed approaches based on (or related to) facial reduction in order to deal with singular SDPs. Unrelated to that, the paper [16] considered an application of the Douglas-Rachford algorithm to the analysis of pathological behavior in SDPs. Interestingly, they show it is sometimes possible to identify the presence of positive duality gaps by observing whether certain sequences converge to 0 or to \(\infty \), see [16, Fig. 1, Sects. 2.8 and 2.9].

As mentioned previously, in this paper we will consider yet another approach for analyzing singular SDPs: regularization. The idea is to perturb the problem slightly to recover strong feasibility on both the primal and dual sides. Once strong feasibility is recovered, we may, say, apply interior-point algorithms to the regularized problems. However, the resulting approximate optimal solution is not guaranteed to be close to the optimal solution of the original problem, though intuitively we might expect or hope so. In particular, if we consider an SDP problem with a finite and nonzero duality gap, it is not clear what happens with the optimal value and the optimal solutions of the regularized problem as functions of the perturbation when the perturbation is reduced to zero.

Analyzing this problem is one of the main topics of the current paper. We consider primal and dual pairs of semidefinite programs and assume they are singular, i.e., either weakly feasible or weakly infeasible (see Sect. 2.1 for definitions). Under these circumstances, there are arbitrarily small perturbations which make the perturbed pair primal-dual strongly feasible. Then, we fix two positive definite matrices, and shift the associated affine spaces of the primal and dual slightly in the direction of these matrices so that the perturbed problems have interior feasible solutions. Under this setting, we analyze the behavior of the optimal value of the perturbed problem as the perturbation is reduced to zero while the proportion between the primal and dual perturbations is kept fixed.

First, we demonstrate that, if perturbation is added only to the primal problem to recover strong feasibility, then the optimal value of the perturbed problem converges to the dual optimal value as the perturbation is reduced to zero, even in the presence of a nonzero duality gap. An analogous proposition holds for the dual problem. We derive these results as a significantly simplified version of the classical asymptotic strong duality theorem (see, for instance, [1, 3, 8, 23, 24, 36] and Chapter 2 of [42]).

Then we analyze the case where perturbation is added to both the primal and dual sides of the problem. We will demonstrate that in this case the optimal value of the perturbed problems converges to a value between the primal and dual optimal values of the original problem, even in the presence of a nonzero duality gap. The limiting optimal value is a function of the relative weight of the primal and dual perturbations, and decreases monotonically from the primal optimal value to the dual optimal value as the relative weight shifts from the dual side to the primal side.

The result has an interesting implication for the behavior of infeasible interior-point algorithms applied to general SDPs [12, 13, 25, 27, 32, 44, 48]. In particular, we pick two well-known polynomial-time infeasible interior-point algorithms, by Zhang [48] and by Potra and Sheng [32], and prove the following (see Theorems 5 and 6):

  1. If neither the primal nor the dual is strongly infeasible, then:

     (a) the algorithms always generate sequences \((X^k,S^k,y^k)\) that are asymptotically primal-dual feasible and such that the “duality gap” \(X^k\bullet S^k\) converges to zero;

     (b) the sequence of modified (primal and dual) objective values converges to a number in \([\theta _D, \theta _P]\), where \(\theta _P\) and \(\theta _D\) are the primal optimal value and the dual optimal value, respectively.

  2. Otherwise (i.e., if either the primal or the dual is strongly infeasible), the algorithms fail to generate a sequence such that the duality gap \(X^k\bullet S^k\) converges to zero. (Needless to say, there is no way to generate an asymptotically primal-dual feasible sequence in this case.)

One implication of the results above is that, at least in theory, these interior-point algorithms generate sequences converging to the optimal value as long as strong feasibility is satisfied at one side of the problem. Furthermore, even in the presence of a finite duality gap, they still generate sequences converging to values between the primal and dual optimal values. It is also worth mentioning that our analysis shows that, by setting appropriate initial iterates, it is possible to control how close the limit value will be to the primal or the dual optimal values.

Though this result is mostly of theoretical interest, it might be of some value if one wants to solve mixed-integer SDPs (MISDP) through branch-and-bound and linear SDP relaxations. As discussed in [10], it is quite possible that the relaxations eventually fail to satisfy strong feasibility on one of the sides of the problem.

Nevertheless, the values obtained by the infeasible interior-point methods described above can still be used as bounds on the optimal values of the relaxed linear SDPs regardless of regularity assumptions or constraint qualifications (at least in theory).

This paper is organized as follows. In Sect. 2, we describe our main results. Section 3 is a preliminary section where we review asymptotic strong duality, infeasible interior-point algorithms, and semialgebraic geometry. In Sect. 4, we prove our main results. In Sect. 5, we show an application of our results to the analysis of infeasible primal-dual interior-point algorithms. In Sect. 6, illustrative instances will be presented.

2 Main results

In this section, we introduce our main results after providing the setup and some preliminaries. We also review existing related results.

2.1 Setup and terminology

First we introduce the notation. The space of \(n\times n\) real symmetric matrices will be denoted by \({\mathcal {S}}^n\). We denote the cone of \(n\times n\) real symmetric positive semidefinite matrices and the cone of \(n\times n\) real symmetric positive definite matrices by \({{\mathcal {S}}^{n}_+}\) and \({{\mathcal {S}}^{n}_{++}}\), respectively. For \(U,V \in {\mathcal {S}}^n\), we define the inner product \(U\bullet V\) as \(\sum _{i,j} U_{ij} V_{ij}\) (equivalently, \(U\bullet V = \mathrm{Tr}(UV)\)), and we use \(U \succeq 0\) and \(U\succ 0\) to denote that \(U \in {{\mathcal {S}}^{n}_+}\) and \(U \in {{\mathcal {S}}^{n}_{++}}\), respectively. The \(n\times n\) identity matrix is denoted by I. We denote the Frobenius norm and the operator norm of a matrix X by \(\Vert {X}\Vert _{F}\) and \(\Vert {X}\Vert \), respectively. For \(v \in {\mathbb {R}}^k\), we denote by \(\Vert {v}\Vert \) its Euclidean norm.

In this paper, we deal with the following standard form primal-dual semidefinite programs

$$\begin{aligned} \mathbf{P:}&\ \ \ \min _{X} \ C \bullet X \ \ \hbox {s.t.}\ A_i \bullet X = b_i,\ i=1, \ldots , m, X\succeq 0\\ \mathbf{D:}&\ \ \ \max _{y,S} \ b^T y \ \ \ \ \hbox {s.t.} \ C - \sum _{i=1}^m A_i y_i = S,\ S\succeq 0, \end{aligned}$$

where C, \(A_i, i = 1, \ldots , m\), X, S are real symmetric \(n\times n\) matrices and \(y \in {\mathbb {R}}^m\). For ease of notation, we define the mapping A from \({\mathcal {S}}^n\) to \({\mathbb {R}}^m\):

$$\begin{aligned} A(Y) \equiv (A_1\bullet Y, \ldots , A_m\bullet Y), \end{aligned}$$
(1)

and introduce

$$\begin{aligned}{\mathcal {V}} \equiv \{X \in {\mathcal {S}}^n \mid A_i \bullet X = b_i,\ i=1, \ldots , m \} =\{X\in {\mathcal {S}}^n \mid A(X)=b\}.\end{aligned}$$

We denote by \(v(\mathbf{P})\) and \(v(\mathbf{D})\) the optimal values of P and D, respectively. We use analogous notation throughout the paper to denote the optimal value of an optimization problem. For a maximization problem, the optimal value \(+\,\infty \) means that the problem is unbounded above and the optimal value \(-\,\infty \) means that the problem is infeasible. For a minimization problem, the optimal value \(-\,\infty \) means that the problem is unbounded below and the optimal value \(+\,\infty \) means that the problem is infeasible.

It is well-known that \(v(\mathbf{P}) = v(\mathbf{D})\) holds under suitable regularity conditions, although, in general, we might have \(v(\mathbf{P}) \ne v(\mathbf{D})\), i.e., the problem may have a nonzero duality gap. We also note that \(v(\mathbf{P})\) and \(v(\mathbf{D})\) are not necessarily attained.
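As a quick illustration of a nonzero gap (a standard small example of this phenomenon, included here only for orientation; the instances analyzed in this paper are presented in Sect. 6), take \(n=3\), \(m=2\) and

$$\begin{aligned} C = \begin{pmatrix} 1 & 0 & 0\\ 0 & 0 & 0\\ 0 & 0 & 0 \end{pmatrix},\quad A_1 = \begin{pmatrix} 0 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 0 \end{pmatrix},\quad A_2 = \begin{pmatrix} 1 & 0 & 0\\ 0 & 0 & 1\\ 0 & 1 & 0 \end{pmatrix},\quad b = (0,1)^T. \end{aligned}$$

Any primal feasible X satisfies \(X_{22}=0\), hence \(X_{23}=0\) and \(X_{11}=1\), so \(v(\mathbf{P})=1\); on the dual side, \(S = C - y_1 A_1 - y_2 A_2 \succeq 0\) forces \(y_2=0\) (the (3,3) entry of S is zero while its (2,3) entry is \(-y_2\)), so \(v(\mathbf{D})=0\).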

In general, \(\mathbf{P}\) is known to be in exactly one of the following four mutually exclusive statuses (see [24]).

  1. Strongly feasible: there exists a positive definite matrix satisfying the constraints of \(\mathbf{P}\), i.e., \({\mathcal {V}} \cap {{\mathcal {S}}^{n}_{++}} \ne \emptyset \). This is the same as Slater’s condition.

  2. Weakly feasible: \(\mathbf{P}\) is feasible but not strongly feasible, i.e., \({\mathcal {V}} \cap {{\mathcal {S}}^{n}_{++}} =\emptyset \) but \({\mathcal {V}} \cap {{\mathcal {S}}^{n}_+} \ne \emptyset \).

  3. Weakly infeasible: \(\mathbf{P}\) is infeasible but the distance between \({{\mathcal {S}}^{n}_+}\) and the affine space \({\mathcal {V}}\) is zero, i.e., \({\mathcal {V}} \cap {{\mathcal {S}}^{n}_+} = \emptyset \) but the zero matrix belongs to the closure of \({{\mathcal {S}}^{n}_+}-{\mathcal {V}}\).

  4. Strongly infeasible: \(\mathbf{P}\) is infeasible but not weakly infeasible. Note that this includes the case where \({\mathcal {V}} = \emptyset \).

The status of \(\mathbf{D}\) is defined analogously by replacing \({\mathcal {V}}\) by the affine set

$$\begin{aligned}\left\{ S \in {\mathcal {S}}^n \mid \exists y \in {\mathbb {R}}^m, C - \sum _{i=1}^m A_i y_i = S \right\} .\end{aligned}$$

We say that a problem is asymptotically feasible if it is either feasible or weakly infeasible. As a reminder, we say that a problem is singular if it is either weakly feasible or weakly infeasible.
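For a concrete picture of weak infeasibility (a standard small example, included only for illustration): with \(n=2\), consider the affine set \({\mathcal {V}} = \{X \in {\mathcal {S}}^2 \mid X_{11} = 0,\ X_{12} = 1\}\). No positive semidefinite matrix has \(X_{11}=0\) and \(X_{12}=1\), so \({\mathcal {V}} \cap {{\mathcal {S}}^{2}_+} = \emptyset \), yet

$$\begin{aligned} X(\varepsilon ) = \begin{pmatrix} \varepsilon & 1 \\ 1 & 1/\varepsilon \end{pmatrix} \in {{\mathcal {S}}^{2}_+}, \qquad Y(\varepsilon ) = \begin{pmatrix} 0 & 1 \\ 1 & 1/\varepsilon \end{pmatrix} \in {\mathcal {V}}, \qquad \Vert X(\varepsilon ) - Y(\varepsilon )\Vert _F = \varepsilon \rightarrow 0, \end{aligned}$$

so the zero matrix belongs to the closure of \({{\mathcal {S}}^{2}_+}-{\mathcal {V}}\) and the corresponding problem is weakly infeasible.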

2.2 Main results

Now we introduce the main results of this paper. We say that a problem is asymptotically primal-dual feasible (or asymptotically pd-feasible, in short) if both P and D are asymptotically feasible. Evidently, the problem is asymptotically pd-feasible if and only if each of P and D is either feasible or weakly infeasible. The analysis in this paper is conducted mainly under this condition.

Note that asymptotic pd-feasibility is a rather weak condition. Many difficult situations, such as finite nonzero duality gaps and weak infeasibility of both P and D, are covered under this condition. Furthermore, since strong infeasibility can be detected by solving auxiliary SDPs that are both primal and dual strongly feasible (see [19]), whether a given problem is asymptotically pd-feasible can also be checked by solving SDPs that are primal and dual strongly feasible.

We consider the following primal-dual pair P(\(\varepsilon ,\eta \)) and \(\mathbf{D}(\varepsilon ,\eta )\) obtained by perturbing P and D with two positive definite matrices \(I_p\) and \(I_d\) and two nonnegative parameters \(\varepsilon \) and \(\eta \):

$$\begin{aligned} \mathbf{P}(\varepsilon ,\eta ):\ \ \min \ (C + \varepsilon I_d)\bullet X \ \ \hbox {s.t.}\ A_i \bullet X = b_i+ \eta A_i\bullet I_p,\ i=1, \ldots , m,\ X\succeq 0,\nonumber \\ \end{aligned}$$
(2)

and

$$\begin{aligned} \mathbf{D}(\varepsilon ,\eta ):\ \ \ \max \sum _{i=1}^m (b_i + \eta A_i\bullet I_p)y_i \ \ \hbox {s.t.} \ C - \sum _{i=1}^m A_i y_i + \varepsilon I_d= S,\ \ \ S\succeq 0. \end{aligned}$$
(3)

Using (1), we have

$$\begin{aligned} \mathbf{P}(\varepsilon ,\eta ):\ \ \ \min \ (C + \varepsilon I_d)\bullet X \ \ \hbox {s.t.}\ A(X) = b+ \eta A(I_p),\ X\succeq 0. \end{aligned}$$
(4)

While \(I_p\) and \(I_d\) represent the direction of perturbation, \(\varepsilon \) and \(\eta \) represent the amount of perturbation. In particular, we could take, for example, \(I_p= I_d= I\), where I is the \(n\times n\) identity matrix. We note that the perturbed pair (2) and (3) was used in the study of infeasible interior-point algorithms [32] and facial reduction [40].
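To make the regularization concrete, the following is a minimal numerical sketch (ours, not part of the theory developed in this paper) of how one might compute the optimal value of the perturbed primal (2) with an off-the-shelf modeling tool; the package cvxpy, the default solver it calls, and the toy data at the bottom are assumptions made only for illustration, and for very small \(\varepsilon \) and \(\eta \) the regularized problem is nearly singular, so the numerical output should be interpreted with care.

```python
import numpy as np
import cvxpy as cp

def v_perturbed(C, A, b, eps, eta, Ip=None, Id=None):
    """Optimal value of the perturbed primal P(eps, eta) in (2).

    A is a list of symmetric matrices A_i, b the right-hand side,
    Ip and Id the perturbation directions (identity by default).
    """
    n = C.shape[0]
    Ip = np.eye(n) if Ip is None else Ip
    Id = np.eye(n) if Id is None else Id
    X = cp.Variable((n, n), symmetric=True)
    constraints = [X >> 0]  # X positive semidefinite
    constraints += [cp.trace(Ai @ X) == bi + eta * np.trace(Ai @ Ip)
                    for Ai, bi in zip(A, b)]
    prob = cp.Problem(cp.Minimize(cp.trace((C + eps * Id) @ X)), constraints)
    prob.solve()
    return prob.value

# Toy instance whose primal is weakly infeasible (hypothetical data):
C = np.zeros((2, 2))
A = [np.array([[1.0, 0.0], [0.0, 0.0]]), np.array([[0.0, 1.0], [1.0, 0.0]])]
b = [0.0, 2.0]
for t in [1e-1, 1e-2, 1e-3]:
    print(t, v_perturbed(C, A, b, t, t))  # tracks v(t, t) along theta = pi/4
```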

If the problem is asymptotically pd-feasible, \(\mathbf{D}(\varepsilon ,\eta )\) is strongly feasible for any \(\varepsilon > 0\) and \(\mathbf{P}(\varepsilon ,\eta )\) is strongly feasible for any \(\eta >0\). To see the strong feasibility of \(\mathbf{P}(\varepsilon ,\eta )\), we observe that there always exists \({{\widetilde{X}}}\succeq -\eta I_p/2\) satisfying \(A_i\bullet \widetilde{X} = b_i, i=1, \ldots , m\), since P is weakly infeasible or feasible. Then the matrix \(X={{\widetilde{X}}}+\eta I_p\) satisfies \(A(X)=b+\eta A(I_p)\) and \(X = ({{\widetilde{X}}}+\eta I_p/2)+\eta I_p/2 \succeq \eta I_p/2 \succ 0\), so it is a positive definite feasible solution to \(\mathbf{P}(\varepsilon ,\eta )\). We emphasize that the primal-dual pair \(\mathbf{P}(\varepsilon ,\eta )\) and \(\mathbf{D}(\varepsilon ,\eta )\) is a natural, and possibly one of the simplest, regularizations of \(\mathbf{P}\) and \(\mathbf{D}\) that ensures primal-dual strong feasibility under perturbation.

We define \(v(\varepsilon ,\eta )\) to be the common optimal value of \(\mathbf{P}(\varepsilon ,\eta )\) and \(\mathbf{D}(\varepsilon ,\eta )\) if they coincide. If the optimal values differ, \(v(\varepsilon ,\eta )\) is not defined. Suppose that P and D are asymptotically pd-feasible. In this case, from the duality theory of convex programs, the function \(v(\varepsilon ,\eta )\) has the following properties:

  1. \(v(\varepsilon ,\eta )\) is finite if \(\varepsilon >0\) and \(\eta >0\).

  2. \(v(\varepsilon ,0)\) is well-defined as long as \(\varepsilon > 0\), and it takes the value \(+\infty \) if P is infeasible.

  3. \(v(0,\eta )\) is well-defined as long as \(\eta > 0\), and it takes the value \(-\infty \) if D is infeasible.

  4. \(v(\varepsilon ,\eta )\) may not be defined at (0, 0). This is because \(\mathbf{P}=\mathbf{P}(0,0)\) and \(\mathbf{D}=\mathbf{D}(0,0)\) may have different optimal values, i.e., P and D may have a nonzero duality gap.

Therefore, although the regularized pair \(\mathbf{P}(\varepsilon ,\eta )\) and \(\mathbf{D}(\varepsilon ,\eta )\) satisfies primal-dual strong feasibility if \(\varepsilon > 0\) and \(\eta > 0\), it is not clear whether this is actually useful in solving SDPs under notorious situations such as the presence of nonzero duality gaps. This is precisely one of the main topics of this paper: an analysis of the behavior of the regularized problems without imposing any restrictive assumption.

In this context, it is worth mentioning that the following asymptotic strong duality results

$$\begin{aligned} \hbox {(i)}\ \lim _{\varepsilon \downarrow 0}v(\varepsilon , 0) = \lim _{\varepsilon \downarrow 0}v(\mathbf{D}(\varepsilon , 0)) = v(\mathbf{P})\ \hbox {under dual asymptotic feasibility} \end{aligned}$$

and

$$\begin{aligned} \hbox {(ii)}\ \lim _{\eta \downarrow 0}v(0,\eta )=\lim _{\eta \downarrow 0}v(\mathbf{P}(0,\eta )) = v(\mathbf{D}) \hbox { under primal asymptotic feasibility} \end{aligned}$$

are obtained as corollaries of the classical asymptotic strong duality theorem established in the 1950’s and 1960’s [1, 8]. This theory received renewed attention with the emergence of conic linear programming; see, for instance, [3, 23, 24, 36] and Chapter 2 of [42]. We will prove (i) and (ii) in the next section, see Theorem 3. In comparison with the classical asymptotic strong duality theorem, Theorem 3 considers a smaller perturbation space.

Now we are ready to describe the main results. They are developed to interpolate between (i) and (ii). The first result is the following theorem.

Theorem 1

Let \(\alpha \ge 0\), \(\beta \ge 0\) and \((\alpha ,\beta )\not =(0,0)\). If the problem is asymptotically pd-feasible, then \(\lim _{t\downarrow 0} v(t\alpha , t\beta )\) exists.

Here we remark that Theorem 1 includes the case where the limit is \(\pm \infty \). Theorem 1 implies that the limit of the optimal value of the perturbed system exists but it is a function of the direction used to approach (0, 0). For \(\theta \in [0,\pi /2]\), let us consider the function

$$\begin{aligned} {v_a}(\theta ) \equiv \lim _{t\downarrow 0}v(t\cos \theta ,t\sin \theta ), \end{aligned}$$

which is the limiting optimal value of \(v(\cdot ,\cdot )\) when \((\varepsilon ,\eta )\) approaches (0, 0) along the direction making an angle of \(\theta \) with the \(\varepsilon \) axis. With that, \({v_a}(0)\) and \({v_a}(\pi /2)\) are the special cases corresponding to dual-only perturbation and primal-only perturbation, respectively. So we abuse notation slightly and define

$$\begin{aligned} {v_a}(\mathbf{D}) \equiv {v_a}(0) \quad \text { and }\quad {v_a}(\mathbf{P}) \equiv {v_a}(\pi /2). \end{aligned}$$
(5)

Below is our second main result.

Theorem 2

If the problem is asymptotically pd-feasible, the following statements hold.

  1. \({v_a}(0) = {v_a}(\mathbf{D})=v(\mathbf{P})\) and \({v_a}(\pi /2) ={v_a}(\mathbf{P})= v(\mathbf{D})\).

  2. \({v_a}(\theta )\) is monotone decreasing in \([0, \pi /2]\), and is continuous in \((0,\pi /2)\).

Theorem 2 is proved by using Theorem 4 which establishes monotonicity and convexity of \(\lim _{t\rightarrow 0} v(t, t\beta )\).

Now we turn our attention to the connection of these main results to the convergence analysis of primal-dual infeasible interior-point algorithms. Indeed, the pair (2) and (3) appears often in the analysis of infeasible interior-point algorithms. In particular, primal-dual infeasible interior-point algorithms typically generate a sequence of feasible solutions to \(\mathbf{P}(t^k,t^k)\) and \(\mathbf{D}(t^k,t^k)\), where \(I_p\) and \(I_d\) are determined by the initial iterate of the algorithm and \(t^k\) is a positive sequence converging to 0. By Theorem 2, the common optimal value \(v(t^k, t^k)\) of \(\mathbf{P}(t^k,t^k)\) and \(\mathbf{D}(t^k,t^k)\) converges to \({v_a}(\pi /4)\), which is between \(v(\mathbf{P})\) and \(v(\mathbf{D})\). Therefore, if we can show that an infeasible interior-point algorithm generates a sequence which approaches \(v(t^k,t^k)\) as \(k\rightarrow \infty \), we can conclude that the sequence converges to \({v_a}(\pi /4)\).

Exploiting this idea, we obtain the following convergence results without any assumption on the feasibility status of the problem. We consider two representative, well-known polynomial-time algorithms, by Zhang [48] and by Potra and Sheng [32], but the idea can be applied to a broad class of infeasible interior-point algorithms to obtain analogous results. The results are stated formally in Theorems 5 and 6, and are summarized as follows:

  1. The algorithms [32, 48] generate asymptotically pd-feasible sequences, with the duality gap \(X^k\bullet S^k\) and \(t^k\) converging to zero, if and only if P and D are asymptotically pd-feasible.

  2. If P and D are asymptotically pd-feasible, the sequence of modified primal and dual objective values converges to a common value between the primal optimal value \(v(\mathbf{P})\) and the dual optimal value \(v(\mathbf{D})\), even in the presence of a nonzero duality gap.

The modified primal and dual objective values mentioned in the statements can be easily computed using the current iterate and do not require any extra knowledge.

If P and D are not asymptotically pd-feasible, namely, if one of the problems is strongly infeasible, the algorithms get stuck at a certain point: they fail to generate an asymptotically pd-feasible sequence and fail to drive the duality gap and \(t^k\) to 0. On the other hand, the algorithms never fail to generate asymptotically pd-feasible sequences as long as the problems are asymptotically pd-feasible.

We note that Theorems 5 and 6 are to some extent surprising in that infeasible interior-point algorithms work in a meaningful manner without any restrictive assumptions, at least in theory. This might have interesting implications when solving SDP relaxations arising from hard optimization problems such as MISDP by using infeasible interior-point algorithms. The theorems guarantee that the modified objective function value converges to a value between the primal and dual optimal values. Therefore, the limiting modified objective value can always be used to bound the optimal value of the linear SDP relaxations obtained when solving MISDP via, say, branch-and-bound as in [10]. We should mention, however, that if one tries to implement this idea, one would still need to find a way to overcome the severe numerical difficulties that may happen when attempting to solve singular SDPs directly.

Finally, while the results of this paper clarify some aspects of the limiting behavior of infeasible interior-point algorithms when applied to a problem with a nonzero duality gap, we remark that deriving similar results for self-dual embedding approaches is still an open problem.

2.3 Related work

Our work is closely related to perturbation theory and sensitivity analysis which are, of course, classic topics in the optimization literature. In particular, there are a number of results on perturbation of semidefinite programs and closely related topics, see [3, 23, 24, 42]. The book by Bonnans and Shapiro [3], for instance, has many results on the perturbation and sensitivity analysis of general conic programs that are later specialized to nonlinear SDPs in Sect. 5.3 therein. See also [37] for earlier results in the context of convex optimization. However, many of those results require that some sort of constraint qualification holds.

In particular, in Chapter 4 of [3] there is a discussion of a family of optimization problems having the format

$$\begin{aligned} \min _{x\in X} f(x,u) \ \ \text {s.t.} \ \ G(x,u) \in {\mathcal {K}}, \end{aligned}$$
(6)

where f and G are functions depending on the parameter u and \( {\mathcal {K}}\) is a closed convex set in some Banach space. Denote by v(u) the optimal value of (6). For some fixed \(u_0\), many results are proved about the continuity of \(v(\cdot )\) [3, Proposition 4.4] or the directional derivatives of \(v(\cdot )\) in a neighborhood of \(u_0\) [3, Theorem 4.24].

However, these existing results do not cover the situations we deal with in this paper. [3, Proposition 4.4], for example, requires a condition called inf-compactness, which implies, in particular, that the set of optimal solutions of the problem associated with \(v(u_0)\) is compact. [3, Theorem 4.24], on the other hand, requires that the set of optimal solutions associated with \(v(u_0)\) be non-empty. In contrast, neither compactness nor non-emptiness is assumed in this paper.

The perturbation we consider is closely related to the infeasible central path appearing in primal-dual infeasible interior-point algorithms. In fact, we use some properties of the infeasible central path in our proof. The papers [21, 33] showed the analyticity of the entire trajectory, including the end point at the optimal set, under the existence of primal-dual optimal solutions satisfying strict complementarity conditions. A very recent paper [41] analyzes the limiting behavior of singular infeasible central paths taking into account the singularity degree. Therein, the authors analyze the speed of convergence under the assumption that the feasible region is non-empty and bounded. No strong feasibility assumption is made, although we remark that if the feasible region of a primal SDP is non-empty and bounded, then its dual counterpart must satisfy Slater’s condition. While [41] conducts a detailed analysis of the asymptotic behavior of the central path itself, our analysis deals with the limiting behavior of the optimal value of the perturbed system under weaker assumptions.

In reality, it may be necessary to estimate the error of an approximate optimal solution to a problem with a finite perturbation. In this regard, an interesting and closely related topic to the limiting perturbation analysis is error bounds. Error bound analysis is relatively easy under primal-dual strong feasibility, but it becomes much harder for singular SDPs. See [22, 43] for SDP and SOCP, and [17] for a more general class of convex programs. The relationship between forward and backward errors of a semidefinite feasibility system is closely related to its singularity degree, which, roughly, is defined as the number of facial reduction steps necessary for regularizing the problem. Recently, some analysis of the limiting behavior of the external (or infeasible) central path involving the singularity degree was developed in [41]. Finally, we mention [39], which conducted a sensitivity analysis of SDP under perturbation of the coefficient matrices “\(A_i\)”.

3 Preliminaries

In this section, we introduce three ingredients of this paper, namely, asymptotic strong duality, infeasible interior-point algorithms and semialgebraic geometry.

3.1 Asymptotic strong duality

A main difference between the duality theory in linear programming and in general convex programming is that the latter requires some regularity conditions for strong duality to hold. If such a regularity condition is violated, then the primal and dual may have a nonzero duality gap [34]. Nevertheless, the so-called asymptotic strong duality holds even in such singular cases [1, 3, 8, 23, 24, 36, 42]. Here we quickly review the result and work on it a bit to derive a modified and simplified version suitable for our purposes.

Let \(\hbox {a-val}(\mathbf{P})\) and \(\hbox {a-val}(\mathbf{D})\) be

$$\begin{aligned} \hbox {a-val}(\mathbf{P})&\equiv \lim _{\varepsilon \downarrow 0} \inf _{\Vert \Delta b\Vert<\varepsilon } \inf \{C\bullet X |\ A(X) = b+\Delta b,\ X\succeq 0\}, \nonumber \\ \hbox {a-val}(\mathbf{D})&\equiv \lim _{\varepsilon \downarrow 0} \sup _{\Vert \Delta C\Vert <\varepsilon } \sup \left\{ b^T y| \ C+\Delta C -\sum _{i} A_i y_i \succeq 0 \right\} . \end{aligned}$$
(7)

Here, \(\hbox {a-val}(\mathbf{P})\) and \(\hbox {a-val}(\mathbf{D})\) are called the asymptotic optimal values of P and D, respectively [36]. (They are also called subvalues in [1, 3, 8, 23, 24].) The following asymptotic duality theorem holds; see also [8, Theorem 1], [1, Lemmas 1 and 2], [23, Theorem 2], [24, Theorem 6] for similar statements.

Theorem

(Asymptotic Duality Theorem, e.g., [36, Theorem 3.2.4])

  1. If P is asymptotically feasible, then \(\hbox {a-val}(\mathbf{P}) = v(\mathbf{D})\).

  2. If D is asymptotically feasible, then \(\hbox {a-val}(\mathbf{D}) = v(\mathbf{P})\).

Note that the Asymptotic Duality Theorem includes the cases where \(\hbox {a-val}(\cdot )=\pm \infty \).

Now we develop a simplified version of the Asymptotic Duality Theorem. Let \(\varepsilon \ge 0\), and let \(\mathbf{D}(\varepsilon )\) be \(\mathbf{D}(\varepsilon ,0)\), i.e., the relaxed dual problem

$$\begin{aligned} \max \,\, b^T y \ \ \hbox {s.t.} \ C - \sum _{i=1}^m A_i y_i +\varepsilon I_d= S,\ \ \ S\succeq 0. \end{aligned}$$
(8)

According to the notation introduced in Sect. 2.2, the optimal value of (8) is written as \(v(\varepsilon , 0)\). Recall also that

$$\begin{aligned} \lim _{\varepsilon \downarrow 0} v(\varepsilon , 0) = {v_a}(0) ={v_a}(\mathbf{D}). \end{aligned}$$

Next we consider an analogous relaxation at the primal side. Notice that (8) is obtained by shifting the semidefinite cone by \(-\varepsilon I_d\). The analogous perturbation of the primal problem is given by

$$\begin{aligned} \min \ C\bullet {{\widetilde{X}}} \ \ \hbox {s.t.}\ A({{\widetilde{X}}})=b, \ {{\widetilde{X}}}\succeq -\eta I_p, \end{aligned}$$
(9)

where \(\eta \ge 0\). Letting \(X\equiv {{\widetilde{X}}} + \eta I_p\), we obtain

$$\begin{aligned} \min \ C\bullet X -\eta C\bullet I_p\ \ \hbox {s.t.}\ A(X) = b+ \eta A(I_p),\ \ X\succeq 0. \end{aligned}$$
(10)
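For completeness, the substitution can be checked directly: with \(X = {{\widetilde{X}}} + \eta I_p\),

$$\begin{aligned} C\bullet {{\widetilde{X}}} = C\bullet X - \eta C\bullet I_p, \qquad A({{\widetilde{X}}}) = A(X) - \eta A(I_p), \qquad {{\widetilde{X}}}\succeq -\eta I_p\ \Longleftrightarrow \ X \succeq 0, \end{aligned}$$

so (9) and (10) are reformulations of one another and have the same optimal value.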

The optimal value of (9) is monotone decreasing in \(\eta \), because the feasible region enlarges as \(\eta \) is increased (strictly speaking, it does not shrink). Observe also that this problem is \(\mathbf{P}(0, \eta )\) with the objective function shifted by a constant \(- \eta C\bullet I_p\). Since this constant vanishes as \(\eta \rightarrow 0\), we obtain

$$\begin{aligned} {v_a}\left( \frac{\pi }{2}\right) ={v_a}(\mathbf{P})= \lim _{\eta \downarrow 0}v(0,\eta )= \lim _{\eta \downarrow 0}\{\hbox {The optimal value of (10)}\}. \end{aligned}$$

Now we prove Theorem 3, which is a simplified version of the asymptotic duality theorem discussed above. Compared with the asymptotic duality results in [1, 3, 8, 23, 24, 36], the key difference is the size of the perturbation space: in the Asymptotic Duality Theorem as stated above, the perturbations range over \(\Vert {\Delta b}\Vert < \varepsilon \) and \(\Vert \Delta C \Vert < \varepsilon \) on the primal and dual sides, respectively, whereas in Theorem 3 below we only consider perturbations along a single direction on each side (namely, along \(I_p\) and \(I_d\), respectively). Since it is not a priori obvious that this smaller perturbation space is still enough to close the duality gap, we provide a detailed proof showing how to go from the Asymptotic Duality Theorem to Theorem 3.

Theorem 3

The following statements hold.

  1. If D is asymptotically feasible, then

     $$\begin{aligned} {v_a}(0)={v_a}(\mathbf{D}){=\lim _{\varepsilon \downarrow 0}v(\varepsilon , 0)}=v(\mathbf{P}). \end{aligned}$$
     (11)

  2. If P is asymptotically feasible, then

     $$\begin{aligned} {v_a}({\pi }/2)={v_a}(\mathbf{P}){=\lim _{\eta \downarrow 0}v(0,\eta )}=v(\mathbf{D}). \end{aligned}$$
     (12)

Proof

Recall that by definition (see (5)), we have \({v_a}(0)={v_a}(\mathbf{D})\) and \({v_a}({\pi }/2)={v_a}(\mathbf{P})\).

First we show that \({v_a}(\mathbf{D})=v(\mathbf{P})\). From the Asymptotic Duality Theorem, \(\hbox {a-val}(\mathbf{D}) = v(\mathbf{P})\) holds including the special cases where \(\hbox {a-val}(\mathbf{D}) = \pm \infty \). We observe that \(\hbox {a-val}(\mathbf{D})\) satisfies

$$\begin{aligned} \hbox {a-val}(\mathbf{D})=\lim _{\varepsilon \downarrow 0} \sup _{y,\Delta C}\ \left\{ b^T y \mid C+\Delta C - \sum _i A_i y_i \succeq 0,\ \Vert \Delta C\Vert \le \varepsilon \right\} , \end{aligned}$$

where \(\Vert \Delta C\Vert <\varepsilon \) in (7) is changed to \(\Vert \Delta C\Vert \le \varepsilon \).

Since \({v_a}(0)\) is obtained by restricting the condition on \(\Delta C\) from “\(\Vert \Delta C\Vert \le \varepsilon \)” to “\(\Delta C = \varepsilon I_d/\Vert I_d\Vert \)”, we obtain \({v_a}(0) \le \hbox {a-val}(\mathbf{D})\). We also have the converse inequality \(\hbox {a-val}(\mathbf{D}) \le {v_a}(0) = {v_a}(\mathbf{D})\) because

$$\begin{aligned} \hbox {a-val}(\mathbf{D})&= \lim _{\varepsilon \downarrow 0} \sup \left\{ b^T y \mid C+\Delta C - \sum _i A_i y_i \succeq 0,\ \Vert \Delta C\Vert \le \varepsilon \right\} \\&= \lim _{\varepsilon \downarrow 0} \sup \left\{ b^T y \mid C+\Delta C - \sum _i A_i y_i \succeq 0,\ -\varepsilon I \preceq \Delta C \preceq \varepsilon I\right\} \\&\le \lim _{\varepsilon \downarrow 0} \sup \left\{ b^T y \mid C+\Delta C - \sum _i A_i y_i \succeq 0,\ \Delta C \preceq \varepsilon I\right\} \\&\le \lim _{\varepsilon \downarrow 0} \sup \left\{ b^T y \mid C+\Delta C - \sum _i A_i y_i \succeq 0,\ \Delta C \preceq \varepsilon \Vert I_d^{-1}\Vert I_d\right\} \\&\le \lim _{\varepsilon \downarrow 0} \sup \left\{ b^T y \mid C+\varepsilon I_d- \sum _i A_i y_i \succeq 0 \right\} ={v_a}(\mathbf{D}). \end{aligned}$$

Here we used \(I \preceq \Vert I_d^{-1}\Vert I_d\) for the second inequality. The proof of item 1 is complete.

We proceed to prove item 2. From the Asymptotic Duality Theorem again, we have \(v(\mathbf{D})=\hbox {a-val}(\mathbf{P})\). Hence, for the sake of proving assertion 2, it suffices to show that \({v_a}(\mathbf{P})=\hbox {a-val}(\mathbf{P})\). The proof of the inequality \({v_a}(\mathbf{P})\ge \hbox {a-val}(\mathbf{P})\) is analogous to the proof for \({v_a}(\mathbf{D})\le \hbox {a-val}(\mathbf{D})\). We will now show the converse inequality. If \(\hbox {a-val}(\mathbf{P})=+\infty \), then \({v_a}(\mathbf{P})\ge \hbox {a-val}(\mathbf{P})\) implies that \({v_a}(\mathbf{P}) = +\infty = \hbox {a-val}(\mathbf{P})\), so there is nothing left to prove. Therefore, in what follows we assume that \(\hbox {a-val}(\mathbf{P})<+\infty \).

By assumption, \(\mathbf{P}\) is not strongly infeasible (see Sect. 2.1). By the definition of \(\hbox {a-val}(\mathbf{P})\), for every \(\varepsilon > 0\) sufficiently small, there exist \({X_{\varepsilon }}\) and \({\Delta b_{\varepsilon }}\) such that \(\Vert \Delta b_{\varepsilon }\Vert \le \varepsilon \), \({X_{\varepsilon }}\) is feasible to “\(A(X)=b + {\Delta b_{\varepsilon }}, \ X\succeq 0\)”, and

$$\begin{aligned} \hbox {a-val}(\mathbf{P}) = \lim _{\varepsilon \downarrow 0 } C \bullet X_{\varepsilon }. \end{aligned}$$
(13)

Note that this is still valid even when \(\hbox {a-val}(\mathbf{P})=-\infty \).

In addition, the fact that \(\mathbf{P}\) is not strongly infeasible implies the existence of a solution to the system “\(A(X') = b\)”. As a consequence, “\(A(Y)=\Delta b_{\varepsilon }\)” too has a solution when \(\Delta b_{\varepsilon }\) is as described above. Otherwise, “\(A(X) = b+\Delta b_{\varepsilon }\)” is infeasible, contradicting the existence of \(X_{\varepsilon }\) above.

Next, we show that there exists \(M > 0\) depending only on A such that “\(A(Y)=\Delta b_{\varepsilon }\)” has a solution with norm bounded by \(M \Vert {\Delta b_{\varepsilon }}\Vert \). Let \({\mathcal {V}}\) denote the set of solutions to “\(A(Y)=\Delta b\)” and let S be a symmetric matrix. Denote by \(\text {dist}\,(S,{\mathcal {V}}) \) the Euclidean distance between S and \({\mathcal {V}}\). Hoffman’s lemma (e.g., [11, Theorem 11.26]) says that there exists a constant M depending on A but not on \(\Delta b\) such that for every S, we have that \(\text {dist}\,(S,{\mathcal {V}}) \) is bounded above by \(M\Vert {\Delta b-A(S)}\Vert \). Taking \(S = 0\), we conclude the existence of Y satisfying \(A(Y)= \Delta b\) and \(\Vert {Y}\Vert \le M\Vert {\Delta b}\Vert \).
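As an aside (not needed for the argument, and only as an illustration under the extra assumption that the \(A_i\) are linearly independent), a concrete constant of this kind can be written down explicitly: the minimum-norm solution of “\(A(Y)=\Delta b\)” lies in \(\mathrm{span}\{A_1,\ldots ,A_m\}\) and is \(Y = \sum _i z_i A_i\) with \(z = G^{-1}\Delta b\), where \(G_{ij} \equiv A_i\bullet A_j\) is the (positive definite) Gram matrix, so that

$$\begin{aligned} \Vert Y\Vert _F^2 = z^TGz = \Delta b^T G^{-1}\Delta b \le \Vert G^{-1}\Vert \,\Vert \Delta b\Vert ^2, \end{aligned}$$

and \(M = \Vert G^{-1}\Vert ^{1/2}\) works in this special case.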

Let \(Y_{\varepsilon }\) be one such solution. Then \(\Vert Y_{\varepsilon }\Vert \le M\Vert \Delta b_{\varepsilon }\Vert \le M\varepsilon \) for each sufficiently small \(\varepsilon >0\) and hence

$$\begin{aligned} \lim _{\varepsilon \downarrow 0}\Vert Y_{\varepsilon }\Vert =0. \end{aligned}$$
(14)

Observing that \(\Vert I_p^{-1}\Vert I_p\succeq I\) and \(\Vert Y_{\varepsilon }\Vert I - Y_{\varepsilon }\succeq 0\) yield \(\Vert Y_{\varepsilon }\Vert \Vert I_p^{-1}\Vert I_p- Y_{\varepsilon }\succeq 0\), we let

$$\begin{aligned} X'_{\varepsilon } \equiv X_{\varepsilon } + \Vert Y_{\varepsilon }\Vert \Vert I_p^{-1}\Vert I_p- Y_{\varepsilon }. \end{aligned}$$

With that, \(X'_{\varepsilon }\) is positive semidefinite and is a feasible solution to \(\mathbf{P}(0, \eta )\) with \(\eta =\Vert Y_{\varepsilon }\Vert \Vert I_p^{-1}\Vert \) (see (4)). Furthermore,

$$\begin{aligned} |C\bullet X_{\varepsilon } - C\bullet X'_{\varepsilon }| =|C\bullet (\Vert Y_{\varepsilon }\Vert \Vert I_p^{-1}\Vert I_p- Y_{\varepsilon })| \le 2\Vert C\Vert \Vert Y_{\varepsilon }\Vert \Vert I_p^{-1}\Vert \Vert I_p\Vert _F,\nonumber \\ \end{aligned}$$
(15)

which approaches 0 by driving \(\varepsilon \rightarrow 0\) because of (14).

We are now ready to show the desired assertion. Notice that we have \(\lim _{\varepsilon \downarrow 0} C \bullet X'_{\varepsilon }\ge {v_a}(\mathbf{P})\), since \(X'_{\varepsilon }\) is feasible to \(\mathbf{P}(0, \Vert Y_{\varepsilon }\Vert \Vert I_p^{-1}\Vert )\) and (14) holds. This fact combined with (13) and (15) implies \({v_a}(\mathbf{P})\le \hbox {a-val}(\mathbf{P})\). The proof is complete. \(\square \)

Theorem 3 motivates our subsequent discussion and leads naturally to an examination of what happens when P and D are simultaneously perturbed, which is the focus of Theorems 1, 2 and 4.

3.2 Infeasible primal-dual interior-point algorithms

We introduce some basic concepts of infeasible primal-dual interior-point algorithms for SDP [32, 45, 47, 48]. This is because our analysis leads to a novel convergence property of the infeasible primal-dual interior-point algorithms when applied to singular problems. We also need some theoretical results about infeasible interior-point algorithms in the proof of Theorem 1. In this subsection, we assume that \(A_i\) \((i=1, \ldots , m)\) are linearly independent. This assumption is not essential; it is made only to ensure uniqueness of y and \(\Delta y\) in systems of equations of the form \(S=\sum _i A_i y_i +C'\) and \(\Delta S = \sum _i A_i \Delta y_i + R'\) with respect to \((S,y)\) and \((\Delta S,\Delta y)\), respectively, where \(C'\) and \(R'\) are constants; such systems appear throughout the analysis.

3.2.1 Outline of infeasible primal-dual interior-point algorithms

Primal-dual interior-point methods for P and D are based on the following optimality conditions:

$$\begin{aligned} XS = 0,\ \ \ C - \sum _i A_i y_i = S,\ \ \ A(X) =b,\ \ \ X \succeq 0,\ \ \ S \succeq 0. \end{aligned}$$
(16)

Rather than solving this system directly, a relaxed problem

$$\begin{aligned} XS = \nu I,\ \ \ C - \sum _i A_i y_i = S,\ \ \ A(X) = b, \ \ X \succeq 0,\ \ \ S \succeq 0, \end{aligned}$$
(17)

is considered, where \(\nu > 0\). The algorithm solves (16) by repeatedly solving (17) approximately while gradually reducing \(\nu \) to zero. This amounts to following the central path

$$\begin{aligned} \{(X_\nu ,S_\nu ,y_\nu ) \mid (X,S,y)=(X_\nu ,S_\nu ,y_\nu )\hbox { is a solution to (17)},\ \nu \in (0, \infty ]\} \end{aligned}$$
(18)

towards “\(\nu = 0\)”. Let us take a closer look at the algorithm proposed by Zhang, more precisely, Algorithm-B of [48].

Let (XSy) be the current iterate such that \(X\succ 0\) and \(S\succ 0\). The method employs the Newton direction to solve the system (17). More precisely, the first equation \(XS=\nu I\) is replaced with an equivalent symmetric reformulation

$$\begin{aligned} \Phi (X,S) = \frac{1}{2}(PXSP^{-1}+P^{-1}SXP)=\nu I, \end{aligned}$$
(19)

where P is a constant nonsingular matrix. In Zhang’s algorithm, the constant matrix P is set to \(S^{1/2}\). We then consider the nonlinear system obtained from (17) by replacing \(XS=\nu I\) with (19). The Newton direction \((\Delta X, \Delta S, \Delta y)\) for that modified system at the point \((X,S,y)\) is the unique solution to the following system of linear equations.

$$\begin{aligned}&\Phi (X,S)+ L_\Phi (\Delta X, \Delta S) =\nu I,\nonumber \\&C - \sum _i A_i (y_i +\Delta y_i) = S + \Delta S,\\&A(X+\Delta X) = b,\nonumber \end{aligned}$$
(20)

where \(L_\Phi \) is a linearization of \(\Phi (X,S)\).

Starting from the kth iterate \((X^k,S^k,y^k)=(X,S,y)\), the next iterate \((X^{k+1}, S^{k+1}, y^{k+1})\) is determined as:

$$\begin{aligned} (X^{k+1}, S^{k+1}, y^{k+1}) = (X^k, S^k, y^k) + s^k (\Delta X, \Delta S, \Delta y). \end{aligned}$$
(21)

The stepsize \(0 < s^k \le 1\) is chosen not only so that \(X^{k+1}\) and \(S^{k+1}\) are positive definite, but also carefully so that the iterates stay close to the central path in order to ensure good convergence properties. Then \(\nu \) is updated appropriately and the iteration continues.

Now we briefly describe another representative polynomial-time infeasible primal-dual interior-point algorithm developed by Potra and Sheng [32]. Let \((X^0, S^0, y^0)\) be a point satisfying \(X^0\succ 0\) and \(S^0\succ 0\) and consider the path defined as follows.

$$\begin{aligned}&\{(X,S, y) \mid XS = t I,\ \ \ C - \sum A_i y_i -S = t(C - \sum A_i y_i^0 -S^0),\nonumber \\&\ \ \ A(X) - b =t(A(X^0)-b), \ X \succeq 0,\ S \succeq 0,\ t\in (0,1]\}. \end{aligned}$$
(22)

The algorithm follows this path by driving \(t\rightarrow 0\) and using a predictor-corrector method.

We note that polynomial-time convergence is proved for both algorithms [32, 48] assuming the existence of optimal solutions \((X^*,S^*,y^*)\) to P and D. In the analysis, the initial iterate \((X^0,S^0, y^0)\) is set to \((\rho _0 I, \rho _1 I, 0)\) where \(\rho _0\) and \(\rho _1\) are selected to be large enough in order to satisfy the conditions \(X^0-X^*\succ 0\) and \(S^0-S^*\succ 0\). Although the polynomial convergence analysis was conducted using this initial iterate, the algorithms themselves can be applied to any SDP problem by choosing \((X^0, S^0, y^0)\) such that \(X^0\succ 0\) and \(S^0\succ 0\) as the initial iterate.

In many practical implementations of the algorithms [45, 47], different stepsizes are taken in the primal and dual spaces for the sake of practical efficiency. For simplicity of presentation, we only analyze the case (21), which corresponds to taking the same stepsize in the primal and dual spaces.

The following well-known property connects Theorems 1 and 2 to the analysis of infeasible interior-point algorithms.

Proposition 1

Let \(X^0 \succ 0\) and \(S^0 \succ 0\), and let \(\{(X^k, S^k, y^k)\}\) be a sequence generated by the primal-dual infeasible interior-point algorithms in [32, 48] with initial iterate \((X^0, S^0, y^0)\). Let \(I'_d\equiv S^0 - (C-\sum A_i y_i^0)\) and let \(I'_p \equiv X^0 - {{\widetilde{X}}}\) where \(A({{\widetilde{X}}}) = b\). Then, there exists a nonnegative sequence \(\{t^k\}\) such that the following equations hold:

$$\begin{aligned} (C + t^k I'_d) - \sum _i A_i y^k = S^k, \quad A(X^k) = b + t^k A(I'_p). \end{aligned}$$
(23)

(cf. The linear equality constraints of (2) and (3))

Proof

This result is a fundamental tool used in the analysis of the algorithms in [32, 48]. For the sake of completeness, here we prove the result only for Zhang’s algorithm.

We prove the first relation of (23) by induction. For \(k=0\), the proposition holds by taking \(t^0\equiv 1\). Suppose that relation (23) holds for k. Then the search direction \((\Delta X, \Delta S, \Delta y)\) is the solution to the linear system of equations (20) with \((X, S, y) = (X^k, S^k, y^k)\). Because of the second equation of (20), we have

$$\begin{aligned} C- \sum A_i (y_i^k +\Delta y_i) - (S^k+\Delta S)=0. \end{aligned}$$

Therefore,

$$\begin{aligned} C- \sum A_i (y_i^k + s^k\Delta y_i) -(S^k + s^k\Delta S) = (1-s^k)\left( C- \sum A_i y_i^k - S^k\right) . \end{aligned}$$

Since \(y_i^{k+1} = y_i^k + s^k\Delta y_i ~\mathrm{and}~ S^{k+1} = S^k + s^k\Delta S\), we obtain

$$\begin{aligned} C - \sum A_i y_i^{k+1} - S^{k+1} = (1-s^k)t^k I'_d=t^{k+1} I'_d \end{aligned}$$

as we desired, because \(C- \sum A_i y_i^k - S^k = t^k(C -\sum A_i y_i^0 - S^0)=t^k I'_d\) holds by the induction assumption. The primal relation, i.e., the right side in (23), follows similarly. \(\square \)

Remark

In view of Proposition 1, by convention, we treat \(t^k\) as part of the iterates of the algorithms. By construction, we have \(t^0=1\) and

$$\begin{aligned} t^{k+1}=\prod _{l=0}^k (1-s^l) \end{aligned}$$
(24)

for \(k=0, 1, \ldots \)
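As a quick sanity check of the dual-residual recursion used in the proof of Proposition 1, here is a toy numpy sketch (ours; the direction \((\Delta y,\Delta S)\) below is an arbitrary direction satisfying the second equation of (20) rather than a genuine Newton direction, and the primal side and positive definiteness of the iterates are ignored):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 3

def sym(M):
    return (M + M.T) / 2

# Random symmetric data and a starting dual point (y^0, S^0) with S^0 > 0.
A = [sym(rng.standard_normal((n, n))) for _ in range(m)]
C = sym(rng.standard_normal((n, n)))
y = rng.standard_normal(m)
S = np.eye(n)                       # S^0 = I, positive definite

Id_prime = S - (C - sum(yi * Ai for yi, Ai in zip(y, A)))  # I'_d of Prop. 1
t = 1.0                             # t^0 = 1
for k in range(5):
    # Any (Delta y, Delta S) satisfying the second equation of (20):
    dy = rng.standard_normal(m)
    dS = C - sum((yi + dyi) * Ai for yi, dyi, Ai in zip(y, dy, A)) - S
    s = 0.3                         # some stepsize in (0, 1]
    y, S, t = y + s * dy, S + s * dS, (1 - s) * t
    # The dual residual stays equal to -t^{k+1} I'_d, as in (23):
    residual = C - sum(yi * Ai for yi, Ai in zip(y, A)) - S
    print(k, np.linalg.norm(residual + t * Id_prime))  # ~ 0 up to rounding
```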

3.2.2 Path formed by points on the central path of perturbed problems

We fix \(\nu \) to be a positive number, and consider the following system of equations and semidefinite conditions parametrized by \(t > 0\):

$$\begin{aligned} XS = \nu I,\ \ \ C + t\alpha I_d- \sum _i A_i y_i = S,\ \ \ A_i\bullet (X - t \beta I_p) = b_i,\ i=1, \ldots , m,\ \ \ X\succeq 0,\ \ \ S\succeq 0. \end{aligned}$$
(25)

We denote by \(w_\nu (t) \equiv (X_\nu (t), S_\nu (t), y_\nu (t))\) the solution of (25) (if it exists). If the problem is asymptotically pd-feasible, for any \(t>0\), \(\mathbf{P}(t\alpha ,t\beta )\) and \(\mathbf{D}(t\alpha ,t\beta )\) are strongly feasible. Then the solution of (25) defines a point on the central path with parameter \(\nu \) of the primal-dual pair of strongly feasible SDP:

$$\begin{aligned} \min \ (C + t\alpha I_d) \bullet X\ \ \ \hbox {s.t.} \ A(X - t \beta I_p) = b,\ \ \ X\succeq 0 \end{aligned}$$
(26)

and

$$\begin{aligned} \max \ \sum _i (b_i + t\beta A_i\bullet I_p) y_i\ \ \ \hbox {s.t.}\ C + t\alpha I_d- \sum _i A_i y_i= S, \ \ \ S \succeq 0, \end{aligned}$$
(27)

where we note that t is fixed in (26) and (27). In this case, \(w_\nu (t)\) is ensured to exist and is uniquely determined for all \(t \in (0,\infty )\) (due to the assumption of linear independence of \(A_i\), \(i=1, \ldots , m\)). Moreover, the set

$$\begin{aligned} {{{\mathcal {C}}}}\equiv \{w_{\nu }(t) \mid t\in (0,\infty )\} \end{aligned}$$
(28)

forms an analytic path running through \({{\mathcal {S}}^{n}_{++}}\times {{\mathcal {S}}^{n}_{++}}\times {\mathbb {R}}^m\). The existence and analyticity of \({{{\mathcal {C}}}}\) are folklore results (e.g., [21, 33]), but we outline a proof in Appendix A based on a result in [26]. We note that the existence and analyticity of the path rely only on local conditions, so the existence of optimal solutions of P and D is not necessary. A special case where \(\nu =1\) and \(C=0\) is analyzed in [40] in the context of facial reduction.

Since \(A(X_\nu (t))=b+t\beta A(I_p)\), \(C + t\alpha I_d- \sum _i A_i y_{\nu i}(t) = S_\nu (t)\), and \(X_\nu (t) S_\nu (t)=\nu I\) hold, we have

$$\begin{aligned} 0\le & {} (C+t\alpha I_d)\bullet X_\nu (t) -\sum _{i=1}^m (b_i+t\beta A_i\bullet I_p) y_{\nu i}(t)\nonumber \\= & {} (C+t\alpha I_d)\bullet X_\nu (t) -\sum _{i=1}^m A_i\bullet X_\nu (t) y_{\nu i}(t) \nonumber \\= & {} S_\nu (t)\bullet X_\nu (t) = \mathrm{Tr}(X_\nu (t)S_\nu (t))=\mathrm{Tr}(\nu I) = n \nu . \end{aligned}$$
(29)

Let us denote by \(v_{\mathrm{opt}}(t)\) the common optimal value of (26) and (27). Since \(v_{\mathrm{opt}}(t)\) is between \((C+t\alpha I_d)\bullet X_\nu (t)\) and \(\sum _{i=1}^m (b_i+t\beta A_i\bullet I_p) y_{\nu i}(t)\), i.e.,

$$\begin{aligned} v_{\mathrm{opt}}(t)\in \left[ \sum _{i=1}^m (b_i+t\beta A_i\bullet I_p) y_{\nu i}(t), (C+t\alpha I_d)\bullet X_\nu (t)\right] \end{aligned}$$
(30)

holds by weak duality, we see, together with (29), that

$$\begin{aligned} 0\le (C+t\alpha I_d)\bullet X_\nu (t) - v_{\mathrm{opt}}(t) \le n\nu \end{aligned}$$
(31)

holds for each \(t > 0\).

3.3 Semialgebraic sets and the Tarski-Seidenberg Theorem

A set S in \({\mathbb {R}}^k\) is called basic semialgebraic if it can be written as the set of solutions of finitely many polynomial equalities and strict polynomial inequalities. Then, a set is said to be semialgebraic if it is a union of finitely many basic semialgebraic sets. In particular, a semialgebraic set in \({\mathbb {R}}\) is a union of finitely many points and intervals. For \(x = (x_1, \ldots , x_k) \in {\mathbb {R}}^k\), let T(x) be the coordinate projection to \({\mathbb {R}}^{k-1}\) defined as \(T(x)\equiv (x_2,\ldots ,x_k)\). The Tarski-Seidenberg Theorem, stated below, asserts that a coordinate projection of a semialgebraic set is again a semialgebraic set in the lower-dimensional space.

Tarski-Seidenberg Theorem (e.g. Theorem 2.2.1 of [2])

Let \(W \subseteq {\mathbb {R}}^k\) be a semialgebraic set. Then, T(W) is a semialgebraic set in \({\mathbb {R}}^{k-1}\).
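As a simple illustration (ours, not from [2]): with \(k=2\), the hyperbola \(W=\{(x_1,x_2)\in {\mathbb {R}}^2 \mid x_1x_2=1\}\) is semialgebraic, and its projection \(T(W)=\{x_2\in {\mathbb {R}} \mid x_2\ne 0\}=(-\infty ,0)\cup (0,\infty )\) is again semialgebraic: it is a union of two intervals, described by the single strict inequality \(x_2^2>0\), although it is no longer the solution set of polynomial equalities alone.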

4 Proof of the main results

In this section, we prove Theorems 1 and 2. We start with some basic properties of \(v(\varepsilon ,\eta )\).

Proposition 2

If the problem is asymptotically pd-feasible, the following statements hold.

  1. \(v(\varepsilon , \eta )\) is well-defined for all \((\varepsilon , \eta ) \ge 0\) not equal to (0, 0). Furthermore,

     (i) \(\lim _{\varepsilon \downarrow 0} v(\varepsilon , 0) = v(\mathbf{P})\) and

     (ii) \(\lim _{\eta \downarrow 0} v(0,\eta ) = v(\mathbf{D})\)

     hold, including the cases where their values are \(\pm \infty \).

  2. \(v(\varepsilon ,\eta )\) is a monotone increasing concave function in \(\varepsilon \). (From item 1, if \(\eta > 0\), \(v(\varepsilon ,\eta )\) is well-defined over \([0,\infty )\). If \(\eta = 0\), \(v(\varepsilon ,\eta )\) is well-defined over \((0,\infty )\).)

  3. \(v_P(\varepsilon ,\eta )\equiv v(\varepsilon ,\eta )-\eta C\bullet I_p-\eta \varepsilon I_d\bullet I_p\) is a monotone decreasing and convex function in \(\eta \). (From item 1, if \(\varepsilon > 0\), \(v_P(\varepsilon ,\eta )\) is well-defined over \([0,\infty )\). If \(\varepsilon = 0\), \(v_P(\varepsilon ,\eta )\) is well-defined over \((0,\infty )\).)

Proof

Item 1. follows directly from Theorem 3. Next, we move on to item 2. Let \((\varepsilon _1,\eta )\ge 0\), \((\varepsilon _2, \eta ) \ge 0\) and, without loss of generality, we may assume that \(0 \le \varepsilon _1 < \varepsilon _2\). By definition, \(v(\varepsilon ,\eta )\) coincides with \(v(\mathbf{D}(\varepsilon , \eta ))\) whenever \(v(\varepsilon ,\eta )\) is well-defined, see Sect. 2.2. Then, \(v(\varepsilon ,\eta )\) is monotonically increasing in \(\varepsilon \) because if y is feasible for \(\mathbf{D}(\varepsilon _1, \eta )\) then y is feasible for \(\mathbf{D}(\varepsilon _2, \eta )\) too. Next, we prove concavity and we will start by first considering the case \(\eta > 0\).

There are two sub-cases to consider: when \(\varepsilon _1=0\) and when \(\varepsilon _1>0\). In the latter sub-case, \(\mathbf{D}(\varepsilon _1, \eta )\) and \(\mathbf{D}(\varepsilon _2, \eta )\) are both feasible, since \(\varepsilon _1, \varepsilon _2\) and \(\eta \) are all positive and asymptotic primal-dual feasibility was assumed. For simplicity, we define \({{\hat{b}}}\) as the vector corresponding to the objective function of \(\mathbf{D}(\varepsilon , \eta )\) so that

$$\begin{aligned} {{\hat{b}}}^Ty = \sum _{i=1}^m ({b}_i + \eta A_i\bullet I_p)y_i,\qquad \forall y \in {\mathbb {R}}^m. \end{aligned}$$

We let \(y^k\) and \({\bar{y}}^k\) be sequences of feasible solutions of \(\mathbf{D}(\varepsilon _1, \eta )\) and \(\mathbf{D}(\varepsilon _2, \eta )\) satisfying

$$\begin{aligned} {{\hat{b}}}^Ty^k \rightarrow v(\mathbf{D}(\varepsilon _1, \eta )) \quad \text {and}\quad {{\hat{b}}}^T {\bar{y}}^k \rightarrow v(\mathbf{D}(\varepsilon _2, \eta )). \end{aligned}$$

Then, for \(t \in [0,1]\), we have that \(t y^k+(1-t) {\bar{y}}^k \) is a feasible solution to \(\mathbf{D}(t\varepsilon _1+(1-t)\varepsilon _2, \eta )\) with objective value \({\hat{b}}^T(ty^k+(1-t){\bar{y}}^k)\). Then it follows

$$\begin{aligned} v(\mathbf{D}(t\varepsilon _1+(1-t)\varepsilon _2, \eta ))&= v(t\varepsilon _1+(1-t)\varepsilon _2, \eta )\\&\ge {\hat{b}}^T(ty^k+(1-t){\bar{y}}^k) = t {\hat{b}}^Ty^k+ (1-t) {\hat{b}}^T {\bar{y}}^k. \end{aligned}$$

Taking the limit with respect to k, we obtain

$$\begin{aligned} v(t\varepsilon _1+(1-t)\varepsilon _2,\eta ) \ge t v(\varepsilon _1, \eta )+(1-t) v(\varepsilon _2, \eta ) \end{aligned}$$
(32)

as we desired.

Now we deal with the sub-case where \(\varepsilon _1=0\). By assumption, we have \(\varepsilon _2>\varepsilon _1=0\), implying that \(v(\varepsilon _2, \eta )\) is finite. Then, we can proceed analogously except that \(\mathbf{D}(0, \eta )\) may be infeasible so that \(v(\varepsilon _1,\eta ) =v(0, \eta )=-\infty \). However, in that case, since \({v(t\varepsilon _1+(1-t)\varepsilon _2,\eta ) =v((1-t)\varepsilon _2, \eta )}\) is finite for all \(t\in [0,1)\), we see that (32) indeed holds. This concludes the proof for the case where \(\eta > 0\).

Finally, we deal with the case \(\eta = 0\). In this case, we may assume that \(\varepsilon _1\) is positive, since \(v(\varepsilon _1, 0)\) might not be well-defined otherwise. By assumption, \(\mathbf {D}\) is asymptotically feasible, so \(\mathbf{D}(\varepsilon , 0)\) is always feasible for \(\varepsilon > 0\). Thus the optimal value of \(\mathbf{D}(\varepsilon _1, 0)\) is either finite or is \(+\infty \).

There are two sub-cases to consider. First, suppose that the optimal value of \(\mathbf{D}(\varepsilon , 0)\) is \(+\infty \) for some \(\varepsilon > 0\). Then, \(\mathbf{P}(\varepsilon , 0)\) is infeasible. However, the feasible region of \(\mathbf{P}(\varepsilon , 0)\) is the same for all \(\varepsilon > 0\), which implies infeasibility of \(\mathbf{P}(\varepsilon , 0)\) for all \(\varepsilon > 0\). Consequently, \(v(\varepsilon ,0) = +\infty \) for all \(\varepsilon > 0\) and (32) holds.

The next sub-case is when the optimal value of \(\mathbf{D}(\varepsilon , 0)\) is finite for all \(\varepsilon > 0\). In particular, \(v(\mathbf{D}(\varepsilon _1, \eta ))\) and \(v(\mathbf{D}(\varepsilon _2, \eta ))\) are both finite and we can proceed as in the proof of the case \(\eta > 0\). This concludes the proof of item 2.

Now we prove item 3. First, we recall that the optimal value of (10) (or, equivalently, (9)) is monotone decreasing in \(\eta \). If we replace C with \(C +\varepsilon I_d\) in (10) we obtain

$$\begin{aligned} \min \ (C+\varepsilon I_d)\bullet X -\eta (C+\varepsilon I_d)\bullet I_p\ \ \hbox {s.t.}\ A(X) = b+ \eta A(I_p),\ X\succeq 0. \end{aligned}$$
(33)

Similarly, the optimal value of (33) is monotone decreasing in \(\eta \), when \(\varepsilon \) is fixed. Since (33) differs from \(\mathbf{P}(\varepsilon ,\eta )\) by the term \((\eta (C\bullet I_p)+\eta \varepsilon I_d\bullet I_p)\) in the objective function, the optimal value of (33) can be written as

$$\begin{aligned} v(\varepsilon ,\eta )-(\eta (C\bullet I_p)+\eta \varepsilon I_d\bullet I_p), \end{aligned}$$

which is precisely \(v_P(\varepsilon ,\eta )\). Therefore \(v_P(\varepsilon ,\eta )\) is monotone decreasing with respect to \(\eta \).

Finally, for fixed \(\varepsilon \), \(v_P(\varepsilon ,\eta )\) and \(v(\varepsilon ,\eta )\) differ by a linear function in \(\eta \). So to prove that \(v_P(\varepsilon ,\eta )\) is convex as a function of \(\eta \), it is enough to prove that \(v(\varepsilon ,\eta )\) is convex as a function of \(\eta \). This can be done analogously to the proof of item 2., so we omit the details. \(\square \)

In the following, we prove Theorem 1. The theorem claims that, even though v(0, 0) is not well-defined, the limiting value exists when (0, 0) is approached along any straight line emanating from the origin in a direction of the nonnegative orthant.

Proof of Theorem 1

Although the result holds even if the \(A_i\)’s are linearly dependent, for the sake of simplicity, in this proof we assume linear independence of the \(A_i\) \((i=1, \ldots , m)\). In addition, we write \(v(t\alpha , t\beta )\) as \(v_{\mathrm{opt}}(t)\), since \(v(t\alpha ,t\beta )\) is the common optimal value of the primal-dual pair \(\mathbf{P}(t\alpha ,t\beta )\) and \(\mathbf{D}(t\alpha ,t\beta )\). We also assume that \(\alpha > 0\) and \(\beta >0\), since the proof for the case where either \(\alpha \) or \(\beta \) is 0 (but \((\alpha ,\beta )\not =0\)) has already been established in Proposition 2.

Recall that we introduced the analytic path \({{\mathcal {C}}}\) in Sect. 3.2.2 (see (25)–(28)); we follow the notation introduced therein. The path \({{\mathcal {C}}}\) is parametrized by t. We divide the proof into the following two steps:

(Step 1) For any fixed \(\nu >0\), we prove the monotonicity of \((C+t\alpha I_d)\bullet X_\nu (t)\) when \(t>0\) is sufficiently small.

(Step 2) Prove the existence of \(\lim _{t\downarrow 0} v_{\mathrm{opt}}(t)\).

(Step 1)

We analyze the behavior of \((C+t\alpha I_d)\bullet X_\nu (t)\) along the path \({{{\mathcal {C}}}}\) as \(t\rightarrow 0\). Recall that (25) is the system parametrized by t which defines the path \({{{\mathcal {C}}}}\). By differentiating the three equations in (25) with respect to t, we see that the following system of equations in \((t, X ,S , y, \delta X, \delta S, \delta y)\) (with semidefinite constraints on X and S)

$$\begin{aligned} \begin{array}{l} X \delta S + \delta X S = 0, \\ \alpha I_d-\sum \limits _i A_i \delta {y}_i = \delta S, \\ A_i\bullet (\delta X - \beta I_p) = 0, \qquad (i=1, \ldots , m),\\ XS=\nu I, \\ C + t\alpha I_d - \sum \limits _i A_i y_i = S, \\ A_i\bullet (X - t \beta I_p) = b_i,\qquad (i=1, \ldots , m),\\ X\succeq 0,\ S\succeq 0,\ t > 0, \end{array} \end{aligned}$$
(34)

has a unique solution

$$\begin{aligned} (t, X, S, y, \delta X, \delta S, \delta y) = \left( t, X_\nu (t), S_\nu (t), y_\nu (t), \frac{d X_\nu (t)}{dt},\frac{d S_\nu (t)}{dt}, \frac{d y_\nu (t)}{dt}\right) \end{aligned}$$

for each \(t\in (0,\infty )\). That is, (34) is a system of equations with semidefinite constraints which determines the curve \((X_\nu (t), S_\nu (t), y_\nu (t))\) and its tangent \(\left( \frac{d X_\nu (t)}{dt},\frac{d S_\nu (t)}{dt}, \frac{d y_\nu (t)}{dt}\right) \). The reason is as follows. By the discussion in Sect. 3.2.2, for fixed \(t > 0\), \(X_\nu (t)\) and \(S_\nu (t)\) are uniquely defined. Since the \(A_i\) are linearly independent, \(y_\nu (t)\) must be unique as well. In order to see that \(\delta X, \delta S, \delta y\) are also uniquely determined, we take a look at the first three equations of (34) for fixed positive definite matrices X and S. They become linear equations in \(\delta X, \delta S, \delta y\) and determine a unique solution if and only if the kernel of \(\phi : (U, V, z )\mapsto (X U + V S,\ V + \sum _{i}A_i z_i,\ A(U) )\) is trivial. Suppose \(\phi (U,V,z) = 0\). Then \(U\bullet V = 0\), because \(V = -\sum _i A_i z_i\) and \(A(U) = 0\) imply \(U\bullet V = -\sum _i z_i (A_i\bullet U) = 0\). Considering the first component of \(\phi \), we have the equation \(XU = -VS\), which, upon multiplying by S from the left and using \(SX = \nu I\), implies that \(\nu U = -SVS\). Taking the inner product with V and using \(U\bullet V = 0\), we obtain \(0 = (SVS)\bullet V = \Vert S^{1/2}VS^{1/2}\Vert _F^2 \). Therefore, \(S^{1/2}VS^{1/2} = 0\) and, since S is invertible, \(V = 0\). By \(\nu U = -SVS\), we have \(U = 0\). Finally, \(V + \sum _i A_i z_i = 0\) together with the linear independence of the \(A_i\) yields \(z = 0\), so the kernel of \(\phi \) is indeed trivial.
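The triviality of the kernel of \(\phi \) can also be checked numerically. The following is a minimal sketch, assuming numpy; the dimensions, the random data and the point \((X,S)\) with \(XS=\nu I\) are made up purely for illustration.

import numpy as np

rng = np.random.default_rng(0)
n, m, nu = 4, 2, 0.7

# A point with X, S positive definite and X S = nu I, as on the central path.
M = rng.standard_normal((n, n))
X = M @ M.T + n * np.eye(n)
S = nu * np.linalg.inv(X)

# Random (almost surely linearly independent) symmetric A_i.
A = [(B + B.T) / 2 for B in rng.standard_normal((m, n, n))]

# Basis of the space of symmetric n x n matrices.
basis = []
for i in range(n):
    for j in range(i, n):
        E = np.zeros((n, n))
        E[i, j] = E[j, i] = 1.0
        basis.append(E)
d = len(basis)

# Matrix representation of phi(U, V, z) = (X U + V S, V + sum_i A_i z_i, A(U)).
cols = []
for k in range(2 * d + m):
    U = basis[k] if k < d else np.zeros((n, n))
    V = basis[k - d] if d <= k < 2 * d else np.zeros((n, n))
    z = np.eye(m)[k - 2 * d] if k >= 2 * d else np.zeros(m)
    img1 = X @ U + V @ S
    img2 = V + sum(zi * Ai for zi, Ai in zip(z, A))
    img3 = np.array([float(np.sum(Ai * U)) for Ai in A])
    cols.append(np.concatenate([img1.ravel(), img2.ravel(), img3]))
Phi = np.column_stack(cols)

# Full column rank means the kernel of phi is trivial, as argued above.
print(np.linalg.matrix_rank(Phi), 2 * d + m)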

Now we are ready to proceed. Let us denote by \({{{\mathcal {D}}}}\) the set of solutions to (34), that is, the set of tuples \((t, X, S, y, \delta X, \delta S, \delta y)\) satisfying all the conditions in (34).

Each element of \({{{\mathcal {D}}}}\) can be seen as a pair consisting of a point on \({{{\mathcal {C}}}}\) and its tangent. Since the semidefinite conditions \(S\succeq 0\) and \(X\succeq 0\) can be written as the solution set of finitely many polynomial inequalities, \({{{\mathcal {D}}}}\) is a semialgebraic set.

Now we claim that \((C + t\alpha I_d) \bullet X_\nu (t)\) is either monotonically increasing or monotonically decreasing for sufficiently small t. To this end, we analyze the set of local minimum points and local maximum points of \((C + t\alpha I_d)\bullet X_\nu (t)\) over \((0,\infty )\). A necessary condition for local minimum and maximum points is:

$$\begin{aligned} \frac{d(C + t\alpha I_d)\bullet X_\nu (t)}{dt} = (C+t\alpha I_d) \bullet \frac{d X_\nu (t)}{dt} + \alpha I_d\bullet X_\nu (t) =0. \end{aligned}$$

Recall that for \({{\hat{t}}}>0\), \((\frac{d X_\nu }{dt}({{\hat{t}}}),\frac{d S_\nu }{dt}({{\hat{t}}}), \frac{d y_\nu }{dt}({{\hat{t}}}))\) is the tangent part \((\delta X, \delta S, \delta y)\) of the unique solution to (34) with \(t={{\hat{t}}}\). With that in mind, a necessary condition for \((C+t \alpha I_d)\bullet X_\nu (t)\) to have an extreme value at t is that t is in the set

$$\begin{aligned} {{{\mathcal {T}}}}_1 \equiv \{t \mid (t, X, S, y, \delta X, \delta S, \delta y)\in {{{\mathcal {T}}}}\}, \end{aligned}$$

where

$$\begin{aligned} {{{\mathcal {T}}}} \equiv \{ (t, X, S, y, \delta X, \delta S, \delta y)\in {{{\mathcal {D}}}} \mid \ (C+t\alpha I_d)\bullet \delta X+\alpha I_d\bullet X = 0\}. \end{aligned}$$

Since \({{{\mathcal {D}}}}\) is a semialgebraic set, so is \({{{\mathcal {T}}}}\). Since \({{{\mathcal {T}}}}_1\) is the projection of \({{{\mathcal {T}}}}\) onto the t coordinate, by applying the Tarski-Seidenberg Theorem, we see that \({{{\mathcal {T}}}}_1\) is a semialgebraic set.

Thus, \({{{\mathcal {T}}}}_1\) is a semialgebraic subset of \({\mathbb {R}}\), and therefore it can be expressed as a union of finitely many points and intervals. Since \((C+t\alpha I_d)\bullet X_\nu (t)\) is an analytic function (see Sect. 3.2.2), the same is true for its derivative. Therefore, if \({{{\mathcal {T}}}}_1\) contains an interval, then the derivative of \((C+t\alpha I_d)\bullet X_\nu (t)\) with respect to t must, in fact, be zero throughout \((0,\infty )\) (see Footnote 1). In particular, \((C+ t\alpha I_d)\bullet X_\nu (t)\) is constant for all \(t > 0\), and hence trivially monotone in this case.

Now we deal with the case where \({{{\mathcal {T}}}}_1\) consists of a finite number of points only. We recall that \((C+t\alpha I_d)\bullet X_\nu (t)\) takes an extreme value at t only if \(t\in {{{\mathcal {T}}}}_1\). This implies that the number of extremal points of \((C+t\alpha I_d)\bullet X_\nu (t)\) is finite and hence \((C+t\alpha I_d)\bullet X_\nu (t)\) is monotonically increasing or monotonically decreasing for sufficiently small t.

(Step 2)

It follows from Step 1 that there are three possibilities.

  (i) \(\lim _{t\downarrow 0} (C+t\alpha I_d)\bullet X_\nu (t) = \infty \),

  (ii) \(\lim _{t\downarrow 0}(C+t\alpha I_d) \bullet X_\nu (t)= -\infty \),

  (iii) \(\lim _{t\downarrow 0} (C+t\alpha I_d)\bullet X_\nu (t)\) is a finite value.

First we consider cases (i) and (ii). Recalling (31), we have \(|(C+t\alpha I_d)\bullet X_\nu (t)-v_{\mathrm{opt}}(t)|\le n\nu \). Therefore, \(v_{\mathrm{opt}}(t)\) diverges to \(+\infty \) and \(-\infty \), respectively. This corresponds to the case of the theorem where the limit is \(\pm \infty \).

Next, we proceed to case (iii). In this case, \(v_{\mathrm{opt}}(t)\) is bounded for sufficiently small \(t >0\) because \(|v_{\mathrm{opt}}(t) - (C+t\alpha I_d)\bullet X_\nu (t)|\le n\nu \) and \((C+t\alpha I_d)\bullet X_\nu (t)\) is bounded for sufficiently small \(t>0\). Therefore, there exist constants \(M_1 < M_2\) and \({{\bar{t}}} >0\) for which

$$\begin{aligned} v_{\mathrm{opt}}(t) \in [M_1, M_2]\quad \hbox {if}\ t\in (0, {{\bar{t}}}]. \end{aligned}$$

For the sake of obtaining a contradiction, we assume that \(v_{\mathrm{opt}}(t)\) does not have a limit as \(t\downarrow 0\). Then, there exists an infinite sequence \(\{t^k\}\) with \(\lim _{k\rightarrow \infty } t^k = 0\) for which \(\{v_{\mathrm{opt}}(t^k)\}\) has two distinct accumulation points, \(v_1\) and \(v_2\), say. Without loss of generality, we let \(v_1 > v_2\) and set \(z \equiv v_1 - v_2\).

Let \({{\tilde{\nu }}} \equiv z/(6n)\). By Step 1, it follows that \((C+t\alpha I_d) \bullet X_{{{\tilde{\nu }}}}(t)\) is a monotone function for sufficiently small \(t>0\). Furthermore, since \(v_{\mathrm{opt}}(t)\) is bounded for sufficiently small t, (31) implies that \((C+t\alpha I_d) \bullet X_{{{\tilde{\nu }}}}(t)\) does not diverge and has a limit as \(t\downarrow 0\). Let us denote by \(c_{{{\tilde{\nu }}}}^*\) the limit value, and let \({{\tilde{t}}} >0\) be such that

$$\begin{aligned} |(C+t\alpha I_d) \bullet X_{{{\tilde{\nu }}}}(t) -c_{{{\tilde{\nu }}}}^*|\le \frac{z}{6} \end{aligned}$$
(35)

holds for any \(t \in (0,{{\tilde{t}}}]\). On the other hand,

$$\begin{aligned} |(C+t\alpha I_d) \bullet X_{{{\tilde{\nu }}}}(t) -v_{\mathrm{opt}}(t)|= (C+t\alpha I_d) \bullet X_{{{\tilde{\nu }}}}(t) -v_{\mathrm{opt}}(t) \le n{{\tilde{\nu }}} =\frac{z}{6}\nonumber \\ \end{aligned}$$
(36)

holds due to (31). Combining (35) and (36) via the triangle inequality, we see that

$$\begin{aligned} |c^*_{{{\tilde{\nu }}}}-v_{\mathrm{opt}}(t)|\le \frac{z}{3}, \ \ \ \hbox {i.e.}, \ \ \ c^*_{{{\tilde{\nu }}}}-\frac{1}{3}z\le v_{\mathrm{opt}}(t) \le c^*_{{{\tilde{\nu }}}}+\frac{1}{3}z \end{aligned}$$

holds for any \(t \in (0,{{\tilde{t}}}]\). Together with the fact that \(v_1>v_2\) are the two accumulation points of \(\{v_{\mathrm{opt}}(t)\}\), the above relation yields

$$\begin{aligned} c^*_{{{\tilde{\nu }}}}-\frac{1}{3}z\le v_2< v_1 \le c^*_{{{\tilde{\nu }}}}+\frac{1}{3}z. \end{aligned}$$

This implies \(z= v_1-v_2\le 2z/3\) and hence \(z \le 0\), which, however, contradicts \(z >0\). Therefore, the accumulation point of \(v_{\mathrm{opt}}(t)\) is unique and the limit of \(v_{\mathrm{opt}}(t)\) exists as \(t\downarrow 0\). \(\square \)

Now we are ready to prove Theorem 2. Let

$$\begin{aligned} {{\tilde{v}}}(\beta ) \equiv \lim _{t\downarrow 0}v(t, t\beta )\ \hbox {for}\ \beta \in [0,\infty ),\ \ {{\tilde{v}}}(\infty ) \equiv \lim _{t\downarrow 0}v(0, t). \end{aligned}$$

We note that

$$\begin{aligned} {v_a}(\theta ) = \lim _{t\downarrow 0} v(t\cos \theta , t\sin \theta ) = {{\tilde{v}}}(\tan \theta ). \end{aligned}$$

Theorem 2 is a direct consequence of the following theorem.

Theorem 4

If the problem is asymptotically pd-feasible, then \({{\tilde{v}}}(\beta )\) is a monotone decreasing function in \(\beta \) in the interval \([0,+\infty ]\) and the following relation holds.

$$\begin{aligned} v(\mathbf{D})={{\tilde{v}}}(\infty ) \le {{\tilde{v}}}(\beta ) \le {{\tilde{v}}}(0) =v(\mathbf{P}). \end{aligned}$$

Furthermore, \({{\tilde{v}}}(\beta )\) is a convex function in the interval \([0,\infty )\).

Proof

We first show that \({{\tilde{v}}}\) is a monotone decreasing function on \([0, \infty )\). Suppose, for the sake of contradiction, that monotonicity is violated, namely, that there exist \(\beta _1\) and \(\beta _2\) such that \(\beta _1 < \beta _2\) and \({{\tilde{v}}}(\beta _1) < {{\tilde{v}}}(\beta _2)\). Let \(u = {{\tilde{v}}}(\beta _2) -{{\tilde{v}}}(\beta _1)>0\). Recall that

$$\begin{aligned} {{\tilde{v}}}(\beta ) = \lim _{t\rightarrow 0} v(t, t\beta ). \end{aligned}$$

We show that for sufficiently small t

$$\begin{aligned} v(t,t\beta _2) - v(t,t\beta _1) \le u/2 \end{aligned}$$

holds, which contradicts \({{\tilde{v}}}(\beta _2) -\tilde{v}(\beta _1)=\lim _{t\downarrow 0} (v(t,t\beta _2)-v(t,t\beta _1))=u\). In fact, since \(v_{P}(\varepsilon ,\eta )=v(\varepsilon ,\eta )-\eta (C\bullet I_p+ \varepsilon I_d\bullet I_p)\) is a monotone decreasing function in \(\eta \) (see item 3 of Proposition 2),

$$\begin{aligned} v(t,t\beta )- t\beta (C\bullet I_p+ t I_d\bullet I_p) \end{aligned}$$

is a monotone decreasing function in \(\beta \). Therefore,

$$\begin{aligned} v(t,t\beta _2)-t\beta _2(C\bullet I_p+ tI_d\bullet I_p) \le v(t,t\beta _1)-t\beta _1(C\bullet I_p+ tI_d\bullet I_p) \end{aligned}$$

holds. This implies that, for sufficiently small \(t>0\),

$$\begin{aligned} v(t,t\beta _2)-v(t,t\beta _1) \le t(\beta _2-\beta _1)(C\bullet I_p+ t I_d\bullet I_p) \le \frac{u}{2} \end{aligned}$$

and hence letting \(t\rightarrow 0\), we obtain

$$\begin{aligned} 0 < u = {{\tilde{v}}} (\beta _2)-{{\tilde{v}}}(\beta _1)\le \frac{u}{2}, \end{aligned}$$

a contradiction.

Now we confirm monotonicity at \(\beta =\infty \). Since \(\tilde{v}(\infty ) = \lim _{t\downarrow 0}v(0,t)\), what we need to show is \({{\tilde{v}}}(\infty ) \le {{\tilde{v}}}(\beta )\) for any finite \(\beta \). This is confirmed as follows:

$$\begin{aligned} {{\tilde{v}}}(\infty ) =\lim _{t\downarrow 0}v(0,t) \le \lim _{t\downarrow 0}v(\beta ^{-1}t,t) = \lim _{t\downarrow 0}v(t,\beta t) = \tilde{v}(\beta )\ (\beta >0). \end{aligned}$$

The first inequality is due to item 2. of Proposition 2. The second equality holds because the limit along a ray depends only on the direction of the ray and not on its parametrization: substituting \(s=\beta ^{-1}t\) gives \(\lim _{t\downarrow 0}v(\beta ^{-1}t,t) = \lim _{s\downarrow 0} v(s,\beta s)\).

Now we prove convexity of \({{\tilde{v}}}(\beta )\). We define the function \(v_k\) as

$$\begin{aligned} v_k(\beta ) \equiv v_P\left( \frac{1}{k},\frac{1}{k}\beta \right) . \end{aligned}$$

Then it follows for any \(\beta \in [0,\infty )\) that

$$\begin{aligned} \lim _{k\rightarrow \infty } v_k(\beta ) = \lim _{k\rightarrow \infty }v_P\left( \frac{1}{k},\frac{1}{k}\beta \right) =\tilde{v}(\beta ). \end{aligned}$$

Thus, \(\{v_k\}\) converges pointwise to \({{\tilde{v}}}\). By item 3. of Proposition 2, \(v_k\) is convex on \((0,\infty )\), so it follows from [38, Theorem 10.8] that \({{\tilde{v}}}\) is also a convex function on \((0,\infty )\). Since \({{\tilde{v}}}\) is monotone decreasing on \([0,\infty )\), we have \({{\tilde{v}}}(0) \ge \lim _{\beta \downarrow 0}{{\tilde{v}}}(\beta )\), and hence \({{\tilde{v}}}\) is convex on \([0,\infty )\). This completes the proof of the theorem. \(\square \)

Proof of Theorem 2

We recall that a convex function is continuous over the relative interior of its domain, e.g., [38, Theorem 10.1], so the function \({{\tilde{v}}}\) in Theorem 4 is continuous over \((0,\infty )\). We also recall that \({v_a}(\theta )= \lim _{t\downarrow 0} v(t\cos \theta ,t\sin \theta )\). We have, for \(\theta \in [0,\pi /2]\),

$$\begin{aligned} {v_a}(\theta ) = \lim _{t\downarrow 0} v(t\cos \theta ,t\sin \theta ) = \lim _{t\downarrow 0} v(t ,t\tan \theta ) = {{\tilde{v}}}(\tan \theta ). \end{aligned}$$

Since \({v_a}(\theta ) ={{\tilde{v}}}(\tan \theta )\) and \(\tan \) is a strictly monotone increasing function in \(\theta \), Theorem 2 readily follows.

5 Application to infeasible interior-point algorithms

The analysis in the previous section indicates that the limiting common optimal value of \(\mathbf{P}(t\alpha , t\beta )\) and \(\mathbf{D}(t\alpha , t\beta )\) exists as \(t\rightarrow 0\) and the value is between \(v(\mathbf{D})\) and \(v(\mathbf{P})\). In this section, we discuss an application to the convergence analysis of infeasible primal-dual interior-point algorithms.

While the efficiency of infeasible interior-point algorithms is supported by a powerful polynomial-convergence analysis when they are applied to primal-dual strongly feasible problems, their behavior on singular problems has not been clear. Our analysis leads to a clearer picture of what happens when infeasible interior-point algorithms are applied to arbitrary SDP problems. As indicated in Sect. 3.2, we focus on the two polynomial-time algorithms by Zhang [48] and by Potra and Sheng [32], but the idea and the analysis can be applied to many other variants.

Suppose that \({{\hat{X}}}\) is a solution to \(A(X) = b\), \(({{\hat{S}}}, {{\hat{y}}})\) is a solution to \(S = C - \sum _i A_i y_i\), and let

$$\begin{aligned}(X^0, S^0, y^0) \equiv ({{\hat{X}}} + \rho \sin \theta I_p, {{\hat{S}}} + \rho \cos \theta I_d,{{\hat{y}}} ),\end{aligned}$$

where \(\theta \in (0, \pi /2)\) and \(\rho > 0\) is sufficiently large so that \(X^0\succ 0\) and \(S^0\succ 0\) hold. This is an interior feasible point to the primal-dual pair \(\mathbf{P}(\rho \cos \theta , \rho \sin \theta )\) and \(\mathbf{D}(\rho \cos \theta , \rho \sin \theta )\), see (2) and (3). In the following, we analyze infeasible primal-dual interior-point algorithms started from this point.

For simplicity of notation, we let \(\alpha \equiv \cos \theta \) and \(\beta \equiv \sin \theta \). As discussed in Sect. 3.2.1, in particular as stated in Proposition 1, the infeasible primal-dual interior-point algorithms we are considering generate a sequence \((X^k, S^k, y^k)\) of interior feasible points to the perturbed system

$$\begin{aligned} C + t^k \alpha I_d- \sum _i A_i y_i^k= S^k,\ \ \ A (X^k - t^k \beta I_p)=b,\ \ X^k\succeq 0,\ \ S^k \succeq 0 \end{aligned}$$
(37)

for \(t^k \ge 0\). We define

$$\begin{aligned} (C + t^k \alpha I_d)\bullet X^k\ \ \ \hbox {and}\ \ \ \sum _i (b_i+ t^k\beta A_i\bullet I_p) y_i^k \end{aligned}$$
(38)

as the modified primal objective function and the modified dual objective function, respectively. If \((X^k, S^k, y^k,t^k)\) is a sequence satisfying (37) for every k and \(t^k\downarrow 0\), then it is an asymptotically pd-feasible sequence in the sense that \(X^k\), \(S^k\) satisfy the conic constraints of \(\mathbf{P}\) and \(\mathbf{D}\) and the distance between \((X^k,S^k,y^k)\) and the set of solutions to the linear constraints of \(\mathbf{P}\) and \(\mathbf{D}\) goes to 0 as \(k \rightarrow \infty \) (see Footnote 2).
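As a concrete illustration of how the modified objective values (38) would be evaluated along the iterates, here is a minimal sketch, assuming numpy; the data layout (a list of symmetric matrices \(A_i\) and a vector b) and the function name are ours for illustration and are not taken from [32, 48].

import numpy as np

def modified_objectives(C, A_list, b, I_p, I_d, X, y, t, alpha, beta):
    """Modified primal and dual objective values (38) at an iterate (X, y)
    satisfying the perturbed system (37) with residual parameter t."""
    primal = float(np.sum((C + t * alpha * I_d) * X))            # (C + t alpha I_d) . X^k
    dual = float(sum((bi + t * beta * np.sum(Ai * I_p)) * yi     # sum_i (b_i + t beta A_i . I_p) y_i^k
                     for Ai, bi, yi in zip(A_list, b, y)))
    return primal, dual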

Now we are ready to describe and prove our first result on infeasible interior-point algorithms.

Theorem 5

Suppose that \({{\hat{X}}}\) is a solution to \(A({{\hat{X}}})=b\), \(({{\hat{S}}}, {{\hat{y}}})\) is a solution to \(C - \sum _i A_i y_i = S\), and let \((X^0, S^0, y^0) \equiv ({{\hat{X}}} + \rho \sin \theta I_p, {{\hat{S}}} + \rho \cos \theta I_d,{{\hat{y}}} )\), where \(\theta \in (0, \pi /2)\) and \(\rho > 0\) is sufficiently large so that \(X^0\succ 0\) and \(S^0\succ 0\) hold. Also, let \(t^0\equiv 1\). Apply Algorithm-B of [48] or Algorithm 2.1 of [32] to solve P and D, and let \(\{(X^k, S^k, y^k, t^k)\}\) be the generated sequence. Then the following statements hold.

  1. \(t^k\rightarrow 0\) and \(X^k\bullet S^k \rightarrow 0\) hold if and only if P and D are asymptotically pd-feasible, namely, the algorithms generate an asymptotically pd-feasible sequence with duality gap converging to zero if and only if P and D are asymptotically pd-feasible. See the remark after the proof of the theorem for the behavior of the algorithms when P and D are not asymptotically pd-feasible.

  2. If the problem is asymptotically pd-feasible, then the generated sequence of the modified primal and dual objective values (38) converges to the value \({v_a}(\theta ) \in [v(\mathbf{D}), v(\mathbf{P})]\). Here, we include the possibility that \({v_a}(\theta ) =+\infty \) and \({v_a}(\theta ) =-\infty \), interpreting them as divergence to \(+ \infty \) and \(-\infty \), respectively.

  3. In item 2., as \(\theta \) gets closer to 0, the limiting modified objective values of the infeasible primal-dual algorithm get closer to the primal optimal value \(v(\mathbf{P})\) of the original problem. As \(\theta \) gets closer to \(\pi /2\), the limiting modified objective value gets closer to the dual optimal value \(v(\mathbf{D})\).

Proof

First, we discuss item 1. If \(t^k\rightarrow 0\), then, as noted above, \(\{(X^k, S^k, y^k, t^k)\}\) is an asymptotically pd-feasible sequence, so \(\mathbf{P}\) and \(\mathbf{D}\) must be asymptotically pd-feasible. Next, we take a look at the converse. Although the analyses conducted in [32, 48] assume the existence of a solution to (16), this assumption is in fact not necessary for showing convergence of \(t^k\) and \(X^k \bullet S^k\) to zero under asymptotic pd-feasibility. Under asymptotic pd-feasibility, the perturbed problems are strongly feasible for any \(t> 0\), and this is enough for showing \(t^k\rightarrow 0\) and \(X^k\bullet S^k\rightarrow 0\) in these algorithms. We give more details of the proof in Appendix B.

Now we prove items 2. and 3. The following relations hold at the k-th iteration:

$$\begin{aligned}&(C+t^k \alpha I_d)\bullet X^k - \sum _i (b_i + t^k\beta A_i \bullet I_p) y_i^k = X^k\bullet S^k. \end{aligned}$$
(39)
$$\begin{aligned}&v(t^k\alpha , t^k\beta ) \in \left[ \sum _i (b_i + t^k\beta A_i \bullet I_p) y_i^k , (C+t^k \alpha I_d)\bullet X^k \right] \end{aligned}$$
(40)

(See also (29) and (30) for the derivation of these relations.)

Then it follows from (39), (40) and \(X^k\bullet S^k \rightarrow 0\) that the sets of accumulation points of \(\{(C+t^k \alpha I_d)\bullet X^k\}\), \(\{v(t^k\alpha , t^k\beta )\}\), and \(\{ \sum (b_i +t^k \beta A_i \bullet I_p) y_i^k\}\) coincide. Since \(t^k\rightarrow 0\), this implies that \( v(t^k\alpha , t^k\beta )=v(t^k \cos \theta , t^k \sin \theta ) \) converges to \({v_a}(\theta )\). Then the sequences of the modified objective functions (38) also converge to \({v_a}(\theta )\). \(\square \)

Remark

When P and D are not asymptotically pd-feasible, \(\lim _{k\rightarrow \infty } t^k\) is positive for both algorithms [32, 48], but the behavior of the duality gap \(X^k\bullet S^k\) differs slightly between them. In the case of Zhang’s algorithm, the sequence \(X^k\bullet S^k\) also converges to a positive value, whereas in the case of Potra and Sheng’s algorithm, what we can say is that \(\liminf X^k\bullet S^k\) is positive. This is because the sequence \(X^k\bullet S^k\) is not necessarily monotonically decreasing in Potra and Sheng’s algorithm.

Now we present the last theorem. A typical choice of the initial iterate \((X^0, S^0, y^0)\) for primal-dual infeasible interior-point algorithms is \((X^0, S^0, y^0)=(\rho _0 I, \rho _1 I, 0)\) with \(\rho _0>0\) and \(\rho _1>0\) sufficiently large. This is different from the one adopted in Theorem 5. In concluding this section, we discuss how our results can be adapted to this case.

Let \({{\hat{X}}}\) be a solution to \(A(X)=b\). If we set \(I_p\equiv \rho _0 I - {{\hat{X}}}\) and \(I_d\equiv \rho _1 I - C\) with \(\rho _0\) and \(\rho _1\) sufficiently large so that \(I_p\succ 0\) and \(I_d\succ 0\) hold, then \((X^0, S^0, y^0)=(\rho _0 I, \rho _1 I, 0)\) is a feasible solution to \(\mathbf{P}(1, 1)\) and \(\mathbf{D}(1, 1)\). Applying the argument we developed to derive Theorem 5, with this choice of \(I_p\) and \(I_d\), we obtain the following theorem.

Theorem 6

Let \((X^0, S^0, y^0) \equiv (\rho _0 I, \rho _1 I,0 )\), where \(\rho _0 > 0\) and \(\rho _1 > 0\) are sufficiently large so that \(I_p=\rho _0 I- {{\hat{X}}} \succ 0\) and \(I_d=\rho _1 I- C \succ 0\) hold, with \({{\hat{X}}}\) a solution to \(A(X) = b\). Apply Algorithm-B of [48] or Algorithm 2.1 of [32] with the initial iterate \((X^0, S^0, y^0)\) and \(t^0=1\), and let \(\{(X^k, S^k, y^k, t^k)\}\) be the generated sequence. Then the following statements hold:

  1. \(t^k\rightarrow 0\) and \(X^k\bullet S^k \rightarrow 0\) hold if and only if P and D are asymptotically pd-feasible, namely, the algorithm generates an asymptotically pd-feasible sequence with duality gap converging to zero if and only if P and D are asymptotically pd-feasible. If P and D are not asymptotically pd-feasible, then the remark after Theorem 5 applies.

  2. If the problem is asymptotically pd-feasible, then the generated sequence of the modified primal and dual objective values (38) converges to the value \({v_a}(\pi /4) \in [v(\mathbf{D}), v(\mathbf{P})]\). Here, we include the possibility that \({v_a}(\pi /4) =+\infty \) and \({v_a}(\pi /4) =-\infty \), interpreting them as divergence to \(+ \infty \) and \(-\infty \), respectively.

6 Examples

In this section, we present three examples with nonzero duality gaps to illustrate Theorems 1 and 2. The optimal values of P and D are both finite in Example 1, the optimal value of P is finite but D is weakly infeasible in Example 2, and both problems are weakly infeasible in Example 3. In the latter two cases the duality gaps are infinite.

Example 1

We start with a simple instance with a finite nonzero duality gap taken from Ramana’s famous paper [34]. The following problem has a duality gap of one.

The problem D is

$$\begin{aligned} \max \ y_1\ \hbox {s.t.}\ \ \left( \begin{array}{ccc} 1 -y_1 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad -y_2 &{}\quad -y_1 \\ 0 &{}\quad -y_1 &{}\quad 0 \end{array}\right) \succeq 0. \end{aligned}$$

With that, we have

$$\begin{aligned} C= \left( \begin{array}{ccc} 1 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 \end{array}\right) ,\ \ \ A_1= \left( \begin{array}{ccc} 1 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 1 \\ 0 &{}\quad 1 &{}\quad 0 \end{array}\right) ,\ \ \ A_2= \left( \begin{array}{ccc} 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 1 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 \end{array}\right) ,\ \ \ b_1=1,\ b_2=0. \end{aligned}$$

The optimal value \(v(\mathbf{D})= 0\) for this problem, since \(y_1=0\) is the only possible value for the lower-right \(2\times 2\) submatrix to be positive semidefinite.

The associated primal P is

$$\begin{aligned} \min \ x_{11}\ \ \hbox {s.t.}\ x_{11}+ 2 x_{23} = 1, \ x_{22}=0,\ \left( \begin{array}{ccc} x_{11} &{}\quad x_{12} &{}\quad x_{13}\\ x_{12} &{}\quad x_{22} &{}\quad x_{23}\\ x_{13} &{}\quad x_{23} &{}\quad x_{33}\end{array}\right) \succeq 0. \end{aligned}$$

The optimal value \(v(\mathbf{P})=1\) for this problem, since \(x_{23}=0\) must hold for positive semidefiniteness of the lower-right \(2\times 2\) submatrix, which drives \(x_{11}\) to be 1.

Now we consider the problem \(\mathbf{D}(\varepsilon ,\eta )\)

$$\begin{aligned} \max \ (1+\eta ) y_1 + \eta y_2\ \ \hbox {s.t.}\ \left( \begin{array}{ccc} 1+\varepsilon -y_1 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad \varepsilon -y_2 &{}\quad -y_1 \\ 0 &{}\quad -y_1 &{}\quad \varepsilon \end{array}\right) \succeq 0. \end{aligned}$$

This is equivalent to

$$\begin{aligned} \max \ (1+\eta ) y_1 + \eta y_2\ \ \hbox {s.t.}\ 1+\varepsilon -y_1 \ge 0,\ \ \ \varepsilon (\varepsilon - y_2) - y_1^2 \ge 0. \end{aligned}$$

Since the objective is linear, there is an optimal solution such that at least one of the inequality constraints is active. Taking into account that the second constraint is quadratic, we analyze the following three subproblems, and take the maximum of them.

$$\begin{aligned}&\mathrm{(Case\ 1)}\ \max \ (1+\eta ) y_1 + \eta y_2\ \ \hbox {s.t.}\ 1+\varepsilon -y_1 = 0,\ \ \ \varepsilon (\varepsilon - y_2) - y_1^2 \ge 0.\\&\mathrm{(Case\ 2)}\ \max \ (1+\eta ) y_1 + \eta y_2\ \ \hbox {s.t.}\ 1+\varepsilon -y_1 \ge 0,\ \ \ y_1=\sqrt{\varepsilon (\varepsilon - y_2)}.\\&\mathrm{(Case\ 3)}\ \max \ (1+\eta ) y_1 + \eta y_2\ \ \hbox {s.t.}\ 1+\varepsilon -y_1 \ge 0,\ \ \ y_1=-\sqrt{\varepsilon (\varepsilon - y_2) }. \end{aligned}$$

(Case 1)

In this case, the second constraint yields

$$\begin{aligned} \varepsilon - \frac{(1+\varepsilon )^2}{\varepsilon } \ge y_2. \end{aligned}$$

Together with \(y_1=1+\varepsilon \), the problem reduces to a linear program, and it follows that the maximum is

$$\begin{aligned} v_1(\varepsilon ,\eta )\equiv (1+\eta )(1+\varepsilon )+\eta \varepsilon - \frac{\eta (1+\varepsilon )^2}{\varepsilon }. \end{aligned}$$

(Case 2)

Under this condition, the objective function is written as

$$\begin{aligned} f(y_2)\equiv (1+\eta )\sqrt{\varepsilon (\varepsilon -y_2)}+\eta y_2. \end{aligned}$$

By computing the derivative, we see that the function takes the unique maximum at

$$\begin{aligned} y_2 = \varepsilon -\frac{\varepsilon (1+\eta )^2}{4\eta ^2} \end{aligned}$$
(41)

and

$$\begin{aligned} \sqrt{\varepsilon (\varepsilon -y_2)} = \frac{\varepsilon (1+\eta )}{2\eta }. \end{aligned}$$
(42)

Then, we see that

$$\begin{aligned} f(y_2) = \varepsilon \eta + \frac{\varepsilon }{4\eta }(1+\eta )^2. \end{aligned}$$
(43)

But we should recall that this maximum is obtained by ignoring the constraint

$$\begin{aligned} 1+ \varepsilon - y_1 = 1+\varepsilon - \sqrt{\varepsilon (\varepsilon - y_2)}\ge 0. \end{aligned}$$

Substituting (41) and (42) into this constraint, we see that (43) is the maximum only if

$$\begin{aligned} 1+\varepsilon - \frac{\varepsilon (1+\eta )}{2\eta } \ge 0, \hbox {or, equivalently}, \ \frac{2\eta }{1-\eta } \ge \varepsilon \end{aligned}$$
(44)

is satisfied.

If (44) does not hold, then the maximum of \(f(y_2)\) is attained at the boundary of the constraint \(1+\varepsilon -y_1 \ge 0\), i.e., at the \(y_2\) satisfying the condition

$$\begin{aligned} 1+\varepsilon = \sqrt{\varepsilon (\varepsilon -y_2)}. \end{aligned}$$

Solving this equation with respect to \(y_2\), we obtain

$$\begin{aligned} y_2=-2-\frac{1}{\varepsilon },\ y_1 = 1+\varepsilon ,\ \ f(y_2) = (1+\eta )(1+\varepsilon )-\eta \left( 2+\frac{1}{\varepsilon }\right) . \end{aligned}$$

In summary, the maximum value in (Case 2) is as follows:

$$\begin{aligned}&v_{2}(\varepsilon ,\eta )\equiv \varepsilon \eta + \frac{\varepsilon }{4\eta }(1+\eta )^2\ \ \hbox {if}\ \frac{2\eta }{1-\eta } \ge \varepsilon , \\&v_2(\varepsilon ,\eta )\equiv (1+\eta )(1+\varepsilon )-\eta \left( 2+\frac{1}{\varepsilon }\right) \ \ \hbox {if}\ \frac{2\eta }{1-\eta } \le \varepsilon . \end{aligned}$$
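The stationary-point computation behind (41) and (43) can be double-checked symbolically. The following is a minimal sketch, assuming sympy; the sample values \(\varepsilon =1/100\), \(\eta =3/100\) used for the spot check of (43) are arbitrary.

import sympy as sp

eps, eta = sp.symbols('epsilon eta', positive=True)
y2 = sp.symbols('y2', real=True)
f = (1 + eta) * sp.sqrt(eps * (eps - y2)) + eta * y2

# Stationary point of f: should agree with (41).
sol = sp.solve(sp.Eq(sp.diff(f, y2), 0), y2)
print(sp.simplify(sol[0] - (eps - eps * (1 + eta)**2 / (4 * eta**2))))    # 0

# Value of f there: spot-check against (43) at an arbitrary sample (eps, eta).
vals = {eps: sp.Rational(1, 100), eta: sp.Rational(3, 100)}
gap = f.subs(y2, sol[0]).subs(vals) - (eps * eta + (eps / (4 * eta)) * (1 + eta)**2).subs(vals)
print(float(gap))                                                         # 0.0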

(Case 3)

In this case, \(1+\varepsilon - y_1 \ge 0\) holds trivially. Therefore, the maximization problem in this case is

$$\begin{aligned} \max -(1+\eta )\sqrt{\varepsilon (\varepsilon -y_2)}+\eta y_2 \end{aligned}$$

under the condition that \(y_2 \le \varepsilon \). The function is monotone increasing, so that the maximum is attained when \(y_2 = \varepsilon \) and the maximum value is

$$\begin{aligned} v_3(\varepsilon ,\eta )\equiv \eta \varepsilon . \end{aligned}$$

Now we are ready to combine the three results to complete the evaluation of \({{\tilde{v}}}\) and \({v_a}\). By letting \(\varepsilon = t\alpha \), \(\eta =t\beta \) with \(t> 0\) and letting \(t\downarrow 0\), we see that

(Case 1) \(\lim _{t\downarrow 0}v_1(t\alpha ,t\beta )=1-\frac{\beta }{\alpha }\).

(Case 2) \(\lim _{t\downarrow 0}v_2(t\alpha ,t\beta )=\frac{\alpha }{4\beta }\) if \(\frac{\beta }{\alpha }\ge \frac{1}{2}\), and \(\lim _{t\downarrow 0}v_2(t\alpha ,t\beta )=1-\frac{\beta }{\alpha }\) if \(\frac{\beta }{\alpha }\le \frac{1}{2}\).

(Case 3) \(\lim _{t\downarrow 0}v_3(t\alpha ,t\beta )= 0\).

The maximum among the three corresponds to \({{\tilde{v}}}\). Comparing the three, we see that Case 2 always attains the maximum: it coincides with Case 1 when \(\beta /\alpha \le 1/2\) and dominates both Case 1 and Case 3 otherwise. This means

$$\begin{aligned} {{\tilde{v}}}(\beta )=1 - \beta \ (\beta \in [0,\frac{1}{2}]),\ \ \ \tilde{v}(\beta )=\frac{1}{4\beta }\ (\beta \in [\frac{1}{2}, \infty )),\ \ \tilde{v}(\infty )=0. \end{aligned}$$
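As a numerical sanity check of this formula, one can evaluate the optimal value of \(\mathbf{D}(t,t\beta )\) for small \(t>0\) and compare it with \({{\tilde{v}}}(\beta )\). The following is a minimal sketch, assuming numpy; the direction \(\beta =2\), the grid-based elimination of \(y_2\) and the function name are chosen purely for illustration.

import numpy as np

def v_perturbed(eps, eta, grid=200001):
    """Optimal value of D(eps, eta) in Example 1.  For eta > 0 the quadratic
    constraint eps*(eps - y2) >= y1**2 is active at the optimum, so we may set
    y2 = eps - y1**2/eps and scan y1 over the range allowed by 1 + eps - y1 >= 0."""
    y1 = np.linspace(-(1 + eps), 1 + eps, grid)
    val = (1 + eta) * y1 + eta * (eps - y1**2 / eps)
    return val.max()

beta = 2.0                                    # direction (alpha, beta) = (1, 2)
for t in (1e-2, 1e-3, 1e-4):
    print(t, v_perturbed(t, t * beta))        # approaches tilde v(2) = 1/(4*2) = 0.125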

Example 2

The next example is such that D is weakly infeasible but P is weakly feasible and has a finite optimal value.

The problem D is

$$\begin{aligned}&\max \ -y_1\ \ \hbox {s.t.}\ \left( \begin{array}{ccc} y_2 &{}\quad 0 &{}\quad 1 \\ 0 &{}\quad y_1 &{}\quad 0 \\ 1 &{}\quad 0 &{}\quad 0 \end{array}\right) \succeq 0. \\&C= \left( \begin{array}{ccc} 0 &{}\quad 0 &{}\quad 1 \\ 0 &{}\quad 0 &{}\quad 0 \\ 1 &{}\quad 0 &{}\quad 0 \end{array}\right) ,\ \ \ A_1= \left( \begin{array}{ccc} 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad -1 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 \end{array}\right) ,\ \ \ A_2= \left( \begin{array}{ccc} -1 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 \end{array}\right) , \ \ \ b_1=-1,\ b_2=0. \end{aligned}$$

This system is weakly infeasible: since the (3,3) entry is 0 while the (1,3) entry is 1, there is no feasible point, yet taking \(y_1=0\) and letting \(y_2\rightarrow \infty \) brings the matrix arbitrarily close to the positive semidefinite cone. Hence \(v(\mathbf{D})=-\infty \).

The associated primal P is

$$\begin{aligned} \min \ 2x_{13}\ \ \hbox {s.t.}\ x_{11} = 0, \ x_{22}=1,\ \left( \begin{array}{ccc} x_{11} &{}\quad x_{12} &{}\quad x_{13}\\ x_{12} &{}\quad x_{22} &{}\quad x_{23}\\ x_{13} &{}\quad x_{23} &{}\quad x_{33}\end{array}\right) \succeq 0. \end{aligned}$$

The optimal value \(v(\mathbf{P})=0\) for this problem, since \(x_{13}=0\) must hold for feasibility.

Now we consider the problem \(\mathbf{D}(\varepsilon ,\eta )\)

$$\begin{aligned} \max \ -(1+\eta ) y_1 - \eta y_2\ \ \hbox {s.t.}\ \left( \begin{array}{ccc} y_2+\varepsilon &{}\quad 0 &{}\quad 1 \\ 0 &{}\quad y_1+\varepsilon &{}\quad 0 \\ 1 &{}\quad 0 &{}\quad \varepsilon \end{array}\right) \succeq 0. \end{aligned}$$

It follows that

$$\begin{aligned} y_1 \ge -\varepsilon ,\ \ \ y_2 \ge \frac{1-\varepsilon ^2}{\varepsilon }. \end{aligned}$$

Therefore, we see that the maximum value is

$$\begin{aligned} v(\varepsilon ,\eta ) = (1+\eta )\varepsilon - \frac{1-\varepsilon ^2}{\varepsilon }\eta . \end{aligned}$$

Now we are ready to evaluate \({{\tilde{v}}}\) and \({v_a}\). By letting \(\varepsilon = t\alpha \), \(\eta =t\beta \) with \(t> 0\) and letting \(t\downarrow 0\), we see that

$$\begin{aligned} \lim _{t\downarrow 0} v(t\alpha ,t\beta )=-\frac{\beta }{\alpha }. \end{aligned}$$

and

$$\begin{aligned} {{\tilde{v}}}(\beta )=-\beta \ (\beta \in [0,\infty ]). \end{aligned}$$
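As a small numerical confirmation, the maximizer \((y_1,y_2)=(-\varepsilon ,(1-\varepsilon ^2)/\varepsilon )\) used above indeed lies on the boundary of the semidefinite constraint of \(\mathbf{D}(\varepsilon ,\eta )\). A minimal sketch, assuming numpy; the value \(\varepsilon =10^{-3}\) is arbitrary.

import numpy as np

eps = 1e-3
y1, y2 = -eps, (1 - eps**2) / eps             # maximizer of D(eps, eta) derived above
M = np.array([[y2 + eps, 0.0, 1.0],
              [0.0, y1 + eps, 0.0],
              [1.0, 0.0, eps]])
print(np.linalg.eigvalsh(M).min())            # ~0: both derived inequalities are active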

Finally, we deal with a pathological case where both primal and dual are weakly infeasible.

Example 3

The problem D is

$$\begin{aligned}&\max \ y_1\ \ \hbox {s.t.}\ \left( \begin{array}{ccc} y_2 &{}\quad 0 &{}\quad 1+\frac{1}{2}y_1 \\ 0 &{}\quad 1+y_1 &{}\quad 0 \\ 1+\frac{1}{2}y_1 &{}\quad 0 &{}\quad 0 \end{array}\right) \succeq 0. \\&C= \left( \begin{array}{ccc} 0 &{}\quad 0 &{}\quad 1 \\ 0 &{}\quad 1 &{}\quad 0 \\ 1 &{}\quad 0 &{}\quad 0 \end{array}\right) ,\ \ \ A_1= \left( \begin{array}{ccc} 0 &{}\quad 0 &{}\quad -\frac{1}{2} \\ 0 &{}\quad -1 &{}\quad 0 \\ -\frac{1}{2} &{}\quad 0 &{}\quad 0 \end{array}\right) ,\ \ \ A_2= \left( \begin{array}{ccc} -1 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 \\ 0 &{}\quad 0 &{}\quad 0 \end{array}\right) ,\ \ \ b_1=1,\ b_2=0. \end{aligned}$$

The optimal value \(v(\mathbf{D})= -\infty \) for this problem: feasibility would require \(y_1=-2\) (so that the (1,3) entry vanishes together with the (3,3) entry), but then the (2,2) entry becomes \(-1\), so the problem is infeasible. By taking \(y_1=0\) and letting \(y_2\) be large, we confirm that the problem is weakly infeasible.

The associated primal P is

$$\begin{aligned} \min \ 2x_{13}+x_{22}\ \ \hbox {s.t.}\ x_{13}+ x_{22} = -1, \ x_{11}=0,\ \left( \begin{array}{ccc} x_{11} &{}\quad x_{12} &{}\quad x_{13}\\ x_{12} &{}\quad x_{22} &{}\quad x_{23}\\ x_{13} &{}\quad x_{23} &{}\quad x_{33}\end{array}\right) \succeq 0. \end{aligned}$$

This problem is weakly infeasible.

Now we consider the problem \(\mathbf{D}(\varepsilon ,\eta )\)

$$\begin{aligned} \max \ (1-\eta ) y_1 - \eta y_2\ \ \hbox {s.t.}\ \left( \begin{array}{ccc} \varepsilon +y_2 &{}\quad 0 &{}\quad 1+\frac{1}{2} y_1 \\ 0 &{}\quad 1+\varepsilon +y_1 &{}\quad 0 \\ 1+\frac{1}{2} y_1 &{}\quad 0 &{}\quad \varepsilon \end{array}\right) \succeq 0. \end{aligned}$$

This is equivalent to

$$\begin{aligned}&\max \ (1-\eta ) y_1 - \eta y_2\ \ \hbox {s.t.}\ \varepsilon +y_2 \ge 0, \ \ \varepsilon (\varepsilon + y_2) - \left( 1+\frac{1}{2}y_1\right) ^2 \ge 0, \\&\quad 1+\varepsilon +y_1 \ge 0. \end{aligned}$$

Since the objective is linear, there is an optimal solution such that at least one of the inequality constraints is active. Taking into account that the second constraint is quadratic, we analyze the following three subproblems and take the maximum of them.

$$\begin{aligned} \mathrm{(Case\ 1)}&\max \ (1-\eta ) y_1 - \eta y_2\ \ \hbox {s.t.}\ \varepsilon +y_2 = 0, \ \varepsilon (\varepsilon + y_2) - \left( 1+\frac{1}{2}y_1\right) ^2 \ge 0, \\&1+\varepsilon +y_1 \ge 0. \\ \mathrm{(Case\ 2)}&\max \ (1-\eta ) y_1 - \eta y_2\ \ \hbox {s.t.}\ \varepsilon +y_2 \ge 0, \ \varepsilon (\varepsilon + y_2) - \left( 1+\frac{1}{2}y_1\right) ^2 = 0, \\&1+\varepsilon +y_1 \ge 0. \\ \mathrm{(Case\ 3)}&\max \ (1-\eta ) y_1 - \eta y_2\ \ \hbox {s.t.}\ \varepsilon +y_2 \ge 0, \ \varepsilon (\varepsilon + y_2) - \left( 1+\frac{1}{2}y_1\right) ^2 \ge 0, \\&1+\varepsilon +y_1 = 0. \end{aligned}$$

(Case 1)

In this case, we have \(y_2=-\varepsilon \), \(y_1=-2\). Then the third constraint becomes \(\varepsilon -1 \ge 0\). Since we are interested in the situation where \(\varepsilon \) is approaching zero, we may exclude this case.

(Case 2)

In this case, we have

$$\begin{aligned} \varepsilon (\varepsilon +y_2)=\left( 1+\frac{1}{2}y_1\right) ^2. \end{aligned}$$

This implies that

$$\begin{aligned} y_1 = 2\left( -1\pm \sqrt{\varepsilon (\varepsilon +y_2)} \right) . \end{aligned}$$

Since the condition \(1+\varepsilon +y_1 \ge 0\) yields

$$\begin{aligned} \pm \sqrt{\varepsilon (\varepsilon +y_2)} \ge \frac{1 - \varepsilon }{2}, \end{aligned}$$

choosing the ‘\(-\)’ sign is not compatible with our analysis, since we are interested in the case where \(\varepsilon \) is close to zero (the left-hand side would then be nonpositive while the right-hand side is close to 1/2). Therefore, we pick the ‘+’ sign and seek the maximum of the objective function

$$\begin{aligned} 2(1-\eta )\left( -1+\sqrt{\varepsilon (\varepsilon +y_2)}\right) -\eta y_2. \end{aligned}$$

By differentiation, we see that the function attains its maximum at

$$\begin{aligned} y_1=2\left( -1+\frac{\varepsilon (1-\eta )}{\eta }\right) ,\ \ \ y_2=\frac{\varepsilon }{\eta ^2}(1-2\eta ). \end{aligned}$$

We see that the first constraint is always satisfied at the maximum. The third constraint \(1+y_1 + \varepsilon \ge 0\) is satisfied if

$$\begin{aligned} \frac{\varepsilon }{\eta } \ge \frac{1+\varepsilon }{2}. \end{aligned}$$

If this condition is not satisfied, then \(1+y_1+\varepsilon =0\) holds at the maximum, so we can leave the analysis to the third case. Substituting \(y_1\) and \(y_2\) into the objective, we conclude that, if \(\varepsilon /\eta \ge (1+\varepsilon )/2\), then the maximum is

$$\begin{aligned} v_2(\varepsilon ,\eta ) \equiv 2(1-\eta )\left( -1-\varepsilon +\frac{\varepsilon }{\eta }\right) -\frac{\varepsilon }{\eta }+2\varepsilon , \end{aligned}$$

and if the aforementioned condition is not satisfied, then the analysis reduces to the third case below.

(Case 3)

We have \(y_1=-1-\varepsilon \). After simple manipulation, we see that the other two inequalities are satisfied if and only if

$$\begin{aligned} y_2\ge \frac{1}{\varepsilon }\left( \frac{1-\varepsilon }{2}\right) ^2 - \varepsilon . \end{aligned}$$

Therefore, the maximum is

$$\begin{aligned} v_3(\varepsilon ,\eta )\equiv -(1-\eta )(1+\varepsilon )-\frac{\eta }{\varepsilon }\left( \frac{1-\varepsilon }{2}\right) ^2 + \varepsilon \eta . \end{aligned}$$

Now we are ready to combine the three results to complete the evaluation of \({{\tilde{v}}}\) and \({v_a}\). By letting \(\varepsilon = t\alpha \), \(\eta =t\beta \) with \(t> 0\) and letting \(t\downarrow 0\), we see that

(Case 1) Cannot occur.

(Case 2) \(\lim _{t\downarrow 0}v_2(t\alpha ,t\beta )=-2+\frac{\alpha }{\beta }\) if \(\frac{\alpha }{\beta }\ge \frac{1}{2}\).

(Case 3) \(\lim _{t\downarrow 0}v_3(t\alpha ,t\beta )= -1-\frac{1}{4}\frac{\beta }{\alpha }\).

The maximum of the latter two corresponds to \({{\tilde{v}}}\): for \(\beta \le 2\), Case 2 dominates Case 3 since \((-2+1/\beta )-(-1-\beta /4)=(\beta -2)^2/(4\beta )\ge 0\), while for \(\beta \ge 2\) only Case 3 applies. Thus, we obtain that

$$\begin{aligned} {{\tilde{v}}}(\beta )=-2+\frac{1}{\beta }\ (\beta \in [0,2]),\ \ \ \tilde{v}(\beta )=-1-\frac{\beta }{4}\ (\beta \in [2, \infty ]), \end{aligned}$$

where we used the convention \(1/0 =\infty \).
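As in Example 1, these limiting values can be confirmed numerically by solving \(\mathbf{D}(t,t\beta )\) for small \(t>0\). The following is a minimal sketch, assuming numpy; the directions \(\beta \in \{1,4\}\), the grid-based elimination of \(y_2\) and the function name are chosen purely for illustration.

import numpy as np

def v_perturbed(eps, eta, grid=400001):
    """Optimal value of D(eps, eta) in Example 3.  The coefficient of y2 in the
    objective is -eta < 0, so y2 sits at its smallest feasible value for each y1,
    namely y2 = (1 + y1/2)**2/eps - eps; it remains to scan y1 >= -(1 + eps)."""
    y1 = np.linspace(-(1 + eps), 4.0, grid)
    y2 = (1 + y1 / 2)**2 / eps - eps
    return ((1 - eta) * y1 - eta * y2).max()

t = 1e-4
for beta, target in ((1.0, -1.0), (4.0, -2.0)):   # tilde v(1) = -2 + 1, tilde v(4) = -1 - 1
    print(beta, v_perturbed(t, t * beta), target)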

7 Concluding discussion

In this paper, we developed a perturbation analysis for singular primal-dual semidefinite programs. We assumed that the primal and dual problems are asymptotically feasible and added positive definite perturbations to recover strong feasibility. A major innovation is that we considered perturbations of the primal and dual problems simultaneously. It was shown that the primal-dual common optimal value of the perturbed problem has a directional limit when the perturbation is reduced to zero along a line. Representing the direction of approach by an angle \(\theta \) between 0 and \(\pi /2\), where the former corresponds to the dual-only perturbation and the latter to the primal-only perturbation, we demonstrated that the limiting objective value is a monotone decreasing function of \(\theta \) which takes the primal optimal value \(v(\mathbf{P})\) at \(\theta =0\) and the dual optimal value \(v(\mathbf{D})\) at \(\theta = \pi /2\). Based on this result, we showed that the modified objective values of the two infeasible primal-dual interior-point algorithms by Zhang and by Potra and Sheng converge to a value between the optimal values of P and D. The modified primal and dual objective functions are easily computed from the current iterate. The development of analogous results for homogeneous self-dual interior-point algorithms and the design of robust infeasible primal-dual interior-point algorithms reflecting the theory developed in this paper are interesting topics for further research.