1 Introduction

The problem of approximating a convex set C by a polyhedron in the Hausdorff distance has been studied systematically for at least a century [26] and has a variety of applications in mathematical programming. These include algorithms for convex optimization problems that approximate the feasible region by a sequence of polyhedra [4, 19] and solution concepts for convex vector optimization problems [10, 23, 31]. Moreover, there are multiple algorithms for mixed-integer convex optimization problems that are based on polyhedral outer approximations, see [7, 21, 38]. Interest in this problem is fueled by the fact that polyhedra have a simple structure: they can be described by finitely many points and directions. This finite structure makes computations with polyhedra more viable than with general convex sets. Hence, it is desirable to work with polyhedra that approximate some complicated set well. If the set C to be approximated is assumed to be compact, then numerous theoretical results are available. These include asymptotic [11, 32] and explicit [3] bounds on the number of vertices that a polyhedron needs in order to approximate C to within a prescribed accuracy. Moreover, iterative algorithms, so-called augmenting and cutting schemes, for the polyhedral approximation of convex bodies are known, see [16]. Some convergence properties are discussed in [17] and [18]. An overview of combinatorial aspects of the approximation of convex bodies is given in the survey article by Bronshteĭn [2].

If boundedness of the set C is not assumed, the literature on the problem is scarce, although there are various applications in which unbounded convex sets arise naturally. In convex vector optimization, for example, it is known that the so-called extended feasible image or upper image contains the set of nondominated points on its boundary, see, e.g., [10, 22]. Due to its geometric properties, it is advantageous to work with this unbounded set instead of with the feasible image itself. Another application is in large deviations theory, which, generally speaking, is the study of the asymptotic behaviour of tails of sequences of probability distributions, see [36]. Under certain conditions, bounds for this behaviour can be obtained in terms of rate functions of probability distributions. In [28], such bounds are obtained under the condition that the level sets of a specific convex rate function can be approximated by polyhedra. Moreover, the authors of [25] generalize the algorithm in [7] and consider mixed-integer convex optimization problems whose feasible region is not necessarily bounded. The problems are solved by computing polyhedral outer approximations of the feasible region in such a fashion that reaching a globally optimal solution is guaranteed.

The most notable result about the polyhedral approximation of C is due to Ney and Robinson [28] who give a characterization of the class of sets that can be approximated arbitrarily well by polyhedra in the Hausdorff distance. However, this class is relatively small as restrictive assumptions on the recession cone of C have to be made, such as polyhedrality. The reason boils down to the fact that the Hausdorff distance is seldom suitable to measure the similarity between unbounded sets. In fact, the Hausdorff distance between closed and convex sets is finite only if the recession cones of the sets are equal, see Proposition 3.2 in Section 3. Due to this difficulty, additional assumptions about the structure of the problem have to be made in each of the aforementioned applications. These include polyhedrality of the ordering cone or boundedness of the problem in convex vector optimization, see, e.g., [8, 23], polyhedrality of a cone generated by the rate function in large deviations theory, or strong duality when dealing with convex optimization problems. In 2018, Ulus [35] characterized the tractability of convex vector optimization problems in terms of polyhedral approximations. One important necessary condition is the so-called self-boundedness of the problem.

Considering the facts mentioned, polyhedral approximation of unbounded convex sets requires a notion that does not rely solely on the Hausdorff distance. To this end, our main contribution is the introduction of the notion of \(\left( \varepsilon , \delta \right) \)-approximation for closed convex sets C that do not contain lines. One feature of \(\left( \varepsilon , \delta \right) \)-approximations is that the recession cones of the involved sets play an important role. We show that \(\left( \varepsilon , \delta \right) \)-approximations define a meaningful notion of polyhedral approximation in the sense that a sequence of approximations converges to the set C as \(\varepsilon \) and \(\delta \) diminish. This convergence is understood in the sense of Painlevé–Kuratowski set convergence, see [30]. Moreover, we present an algorithm for the computation of \(\left( \varepsilon , \delta \right) \)-approximations when the set C is a spectrahedron, i.e., defined by a linear matrix inequality. We also prove correctness and finiteness of the algorithm. Its main purpose, however, is to show that \(\left( \varepsilon , \delta \right) \)-approximations can, at least in principle, be constructed in finitely many steps.

This article is organized as follows. In the next section, we introduce the necessary notation and provide definitions. In Section 3, we compare the results by Ney and Robinson [28] with the results by Ulus [35] and put them in relation. In particular, we show that self-boundedness is a special case of the property that the excess of a set over its own recession cone is finite. The concept of \(\left( \varepsilon , \delta \right) \)-approximations is introduced in Section 4. We prove a bound on the Hausdorff distance between truncations of an \(\left( \varepsilon , \delta \right) \)-approximation and truncations of C. The main result is Theorem 4.2. It states that a sequence of \(\left( \varepsilon , \delta \right) \)-approximations of C converges to C in the sense of Painlevé–Kuratowski as \(\varepsilon \) and \(\delta \) tend to zero. In the last section, we present the aforementioned algorithm and prove correctness and finiteness as well as illustrate it with two examples.

2 Preliminaries

Throughout this article, we denote by \({{\,\mathrm{cl}\,}}C\), \({{\,\mathrm{int}\,}}C\), \(0^+ {C}\), \({{\,\mathrm{conv}\,}}C\), and \({{\,\mathrm{cone}\,}}C\) the closure, interior, recession cone, convex hull and conical hull of a set C, respectively. A compact convex set with nonempty interior is called a convex body. The Euclidean ball with radius r centred at a point \({c \in {\mathbb {R}}^n}\) is denoted by \(B_r(c)\). A point c of a convex set C is called an extreme point of C, if \(C \setminus \{c\}\) is convex. Extreme points are exactly the points of C that cannot be written as a proper convex combination of elements of C [29, p. 162]. For finite sets \(V, D \subseteq {\mathbb {R}}^n\), the set

$$\begin{aligned} P = {{\,\mathrm{conv}\,}}V + {{\,\mathrm{cone}\,}}D \end{aligned}$$
(1)

is called a polyhedron. The plus sign denotes Minkowski addition. The sets V, D in (1) are called a V-representation of P, as P is expressed in terms of its vertices and directions. A polyhedron can equivalently be expressed as a finite intersection of closed halfspaces [29, Theorem 19.1], i.e.,

$$\begin{aligned} P = \{x \in {\mathbb {R}}^n \mid Ax \le b\} \end{aligned}$$
(2)

for a matrix \(A \in {\mathbb {R}}^{m \times n}\) and a vector \(b \in {\mathbb {R}}^m\). The data (A, b) are called an H-representation of P. The extreme points of a polyhedron P are called vertices of P and are denoted by \({{\,\mathrm{vert}\,}}P\). For symmetric matrices \(A_0,A_1,\dots ,A_n\) of arbitrary fixed size, we define

$$\begin{aligned} {\bar{{\mathcal {A}}}}(x) := \sum _{i=1}^n x_iA_i \quad \text {and} \quad {\mathcal {A}}(x) := {\bar{{\mathcal {A}}}}(x) + A_0, \end{aligned}$$
(3)

i.e., a linear combination of the \(A_i\) and a translation of \({\bar{{\mathcal {A}}}}(x)\) by \(A_0\), respectively. We denote by \(A \succ 0\) (\(A \succcurlyeq 0\)) positive (semi-)definiteness of the symmetric matrix A. A set of the form \(\{x \in {\mathbb {R}}^n \mid {\mathcal {A}}(x) \succcurlyeq 0\}\) is called a spectrahedron. Spectrahedra are a generalization of polyhedra for which many geometric properties of polyhedra generalize nicely, e.g., the recession cone of a spectrahedron C is obtained as \(\{x \in {\mathbb {R}}^n \mid {\bar{{\mathcal {A}}}}(x) \succcurlyeq 0 \}\), see [12], whereas the recession cone of a polyhedron in H-representation is \(\{x \in {\mathbb {R}}^n \mid Ax \le 0\}\). Given a cone K the set \({K}^{\circ } = \{y \in {\mathbb {R}}^n \mid \forall x \in K: x^{{\mathsf {T}}}y \le 0\}\) is called the polar cone of K. The polar \({(0^+ {C})}^{\circ }\) of the recession cone of a spectrahedron C is computed as

$$\begin{aligned} {(0^+ {C})}^{\circ } = {{\,\mathrm{cl}\,}}\left\{ \left( -A_1 \cdot X, \dots , -A_n \cdot X \right) ^{{\mathsf {T}}}\mid X \succcurlyeq 0 \right\} , \end{aligned}$$
(4)

where \(A_i \cdot X\) means the trace of the matrix product \(A_iX\), see [12, Section 3]. A cone K is called polyhedral if \({K = {{\,\mathrm{cone}\,}}D}\) for some finite set D and pointed if \({K \cap (-K) = \{0\}}\). A set whose recession cone is pointed is called line-free. It is well known that a closed convex set contains an extreme point if and only if it is line-free, see [29, Corollary 18.5.3]. Given nonempty sets \({C_1, C_2 \subseteq {\mathbb {R}}^n}\), the excess of \(C_1\) over \(C_2\), \({e[{C_1},{C_2}]}\), is defined as

$$\begin{aligned} e[{C_1},{C_2}] = \mathop {\sup }\limits _{c_1 \in C_1} \mathop {\inf }\limits _{c_2 \in C_2} \left\Vert {c_1-c_2} \right\Vert , \end{aligned}$$
(5)

where \(\left\Vert {\cdot } \right\Vert \) denotes the Euclidean norm. The Hausdorff distance between \(C_1\) and \(C_2\), \({d_{{\mathsf {H}}}\left( {C_1},{C_2}\right) }\), is then expressed as

$$\begin{aligned} d_{{\mathsf {H}}}\left( {C_1},{C_2}\right) = \max \{e[{C_1},{C_2}], e[{C_2},{C_1}]\}. \end{aligned}$$
(6)

It is well known that the Hausdorff distance defines a metric on the space of nonempty compact subsets of \({\mathbb {R}}^n\). Between unbounded sets the Hausdorff distance may be infinite. A polyhedron P is called an \(\varepsilon \)-approximation of a convex set C if \(d_{{\mathsf {H}}}\left( {P},{C}\right) \le \varepsilon \).
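For finite point sets, the quantities in (5) and (6) reduce to maxima and minima of pairwise distances. The following Python sketch illustrates the definitions on two small sets; the data are purely illustrative and not taken from this article.

```python
import numpy as np
from scipy.spatial.distance import cdist

def excess(C1, C2):
    """e[C1, C2] = sup over C1 of the distance to C2, for finite point sets, cf. (5)."""
    return cdist(C1, C2).min(axis=1).max()

def hausdorff(C1, C2):
    """d_H(C1, C2) = max of the two excesses, cf. (6)."""
    return max(excess(C1, C2), excess(C2, C1))

C1 = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
C2 = np.array([[0.0, 0.0], [2.0, 0.0]])
print(excess(C1, C2), excess(C2, C1), hausdorff(C1, C2))  # 1.0 1.0 1.0
```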

3 Polyhedral Approximation in the Hausdorff Distance

Every convex body can be approximated arbitrarily well by a polytope in the Hausdorff distance, see, e.g., [32]. Moreover, algorithms for the computation of \(\varepsilon \)-approximations exist for which the convergence rate is known [16]. For the approximation of unbounded convex sets, only the following theorem is known, which provides a characterization of the sets that can be approximated by polyhedra in the Hausdorff distance.

Theorem 3.1

(see [28, Theorem 2.1]) Let \(C \subseteq {\mathbb {R}}^n\) be nonempty and closed. Then the following are equivalent:

  1. (i)

    C is convex, \(0^+ {C}\) is polyhedral, and \(e[{C},{0^+ {C}}] < +\infty \).

  2. (ii)

    There exists a polyhedral cone K such that for every \(\varepsilon > 0\) there exists a finite set \(V \subseteq {\mathbb {R}}^n\) such that \({d_{{\mathsf {H}}}\left( {{{\,\mathrm{conv}\,}}V + K},{C}\right) \le \varepsilon }\).

Further, if (ii) holds then \(K = 0^+ {C}\).

A related result is found in [35], where the approximate solvability of convex vector optimization problems in terms of polyhedral approximations is investigated. In order to state the result and establish the relationship to Theorem 3.1, we need the following definition from [35].

Definition 3.1

A set \(C \subsetneq {\mathbb {R}}^n\) with a nontrivial recession cone is called self-bounded if there exists \(y \in {\mathbb {R}}^n\) such that \({\{y\}+0^+ {C} \supseteq C}\).

Adjusted to our notation, the mentioned result can be stated as:

Proposition 3.1

(see [35, Proposition 3.7]) Let \(C \subseteq {\mathbb {R}}^n\) be closed and convex. If C is self-bounded, then for every \(\varepsilon > 0\) there exists a finite set \(V \subseteq {\mathbb {R}}^n\) such that \({d_{{\mathsf {H}}}\left( {{{\,\mathrm{conv}\,}}V + 0^+ {C}},{C}\right) \le \varepsilon }\).

If \(0^+ {C}\) is polyhedral, then, clearly, C can be approximated by a polyhedron. The difference from Theorem 3.1 is the self-boundedness of C instead of the finiteness of \({e[{C},{0^+ {C}}]}\). The following theorem points out the connection between these conditions and shows that, under an additional assumption, both coincide. The relationships are illustrated in Fig. 1.

Theorem 3.2

Given a nonempty, closed and convex set \(C \subseteq {\mathbb {R}}^n\), consider the statements

  1. (i)

    \(e[{C},{0^+ {C}}] < +\infty \),

  2. (ii)

    C is self-bounded,

  3. (iii)

    There is a compact set \(K \subseteq {\mathbb {R}}^n\) such that \({K + 0^+ {C} \supseteq C}\).

Then the following implications are true: (i) \(\Leftrightarrow \) (iii) and (ii) \(\Rightarrow \) (iii). If, additionally, \(0^+ {C}\) is solid, then (i) – (iii) are equivalent.

Proof

We start with the assertion (i) \(\Rightarrow \) (iii). Let \({M = e[{C},{0^+ {C}}]}\), let \({c \in C}\) be arbitrary, and let \({r_c \in 0^+ {C}}\) be such that \({\left\Vert {c-r_c} \right\Vert = \inf _{r \in 0^+ {C}} \left\Vert {c-r} \right\Vert }\). This infimum is uniquely attained, because \(0^+ {C}\) is closed and convex. Then \({\left\Vert {c-r_c} \right\Vert \le M}\) and we conclude \({c = (c-r_c)+r_c \in B_M(0) + 0^+ {C}}\). Therefore, \({B_M(0) + 0^+ {C} \supseteq C}\).

To show (iii) \(\Rightarrow \) (i), let \({K + 0^+ {C} \supseteq C}\) for some compact set K. Then we have

$$\begin{aligned} \begin{aligned} e[{C},{0^+ {C}}]&\le e[{K+0^+ {C}},{0^+ {C}}] \\&= \mathop {\sup }\limits _{\begin{array}{c} k \in K\\ r \in 0^+ {C} \end{array}} \inf _{{\bar{r}} \in 0^+ {C}} \left\Vert {(k+r)-{\bar{r}}} \right\Vert \\&\le \sup _{\begin{array}{c} k \in K\\ r \in 0^+ {C} \end{array}} \left\Vert {(k+r)-r} \right\Vert \\&= \sup _{k \in K} \left\Vert {k} \right\Vert < +\infty . \end{aligned} \end{aligned}$$

The implication (ii) \(\Rightarrow \) (iii) is trivial with \({K = \{y\}}\). For the last part, we show (iii) \(\Rightarrow \) (ii), assuming \(0^+ {C}\) is solid. If \({e[{C},{0^+ {C}}] = 0}\), then \({C \subseteq 0^+ {C}}\) and we can set \({y = 0}\). Now, let \({e[{C},{0^+ {C}}] = M > 0}\) and fix \({c \in {{\,\mathrm{int}\,}}0^+ {C}}\). Then there exists \({\varepsilon > 0}\) such that \({B_{\varepsilon }(c) \subseteq 0^+ {C}}\). We have

$$\begin{aligned} \begin{aligned} B_M\left( \frac{M}{\varepsilon } c\right)&= \frac{M}{\varepsilon } B_{\varepsilon }(c) \\&\subseteq \frac{M}{\varepsilon } 0^+ {C} \\&= 0^+ {C}. \end{aligned} \end{aligned}$$

Therefore, \({\left\{ -\frac{M}{\varepsilon } c \right\} + 0^+ {C} \supseteq B_M(0)}\) and, since \(0^+ {C}\) is convex, \(\left\{ -\frac{M}{\varepsilon } c \right\} + 0^+ {C} \supseteq B_M(0) + 0^+ {C}\). From the first part of the proof, we know that \({B_M(0) + 0^+ {C} \supseteq C}\), which completes the proof. \(\square \)

Fig. 1

Illustration of Theorem 3.2. Left: The set C is contained in its own recession cone. Therefore, it is self-bounded and \(e[{C},{0^+ {C}}]=0\). Centre: The excess of C over its recession cone is finite and attained at either of the two vertices. However, C is not self-bounded, because it cannot be contained in a translate of \(0^+ {C}\). Right: A set that is neither self-bounded nor satisfies \(e[{C},{0^+ {C}}]<\infty \). Traversing the parabolic arc, the distance to \(0^+ {C}\) grows without bound

Example 3.1

To see that \({e[{C},{0^+ {C}}] < +\infty }\) does not imply self-boundedness of C unless \(0^+ {C}\) is solid, consider the following counterexample. In \({\mathbb {R}}^2\) let \(C = {{\,\mathrm{conv}\,}}\{\pm e_1\}+ {{\,\mathrm{cone}\,}}\{e_2\}\), where \(e_i\) denotes the ith unit vector. Then one has the equality \(e[{C},{0^+ {C}}]=1\), but C is not self-bounded. The set is illustrated in the centre of Figure 1.
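The value of the excess can be verified directly: a point of C has the form (x, y) with \(|x| \le 1\) and \(y \ge 0\), and its nearest point in \(0^+ {C} = {{\,\mathrm{cone}\,}}\{e_2\}\) is (0, y), so

$$\begin{aligned} e[{C},{0^+ {C}}] = \sup _{|x| \le 1,\, y \ge 0} \left\Vert {(x,y)-(0,y)} \right\Vert = \sup _{|x| \le 1} |x| = 1. \end{aligned}$$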

In view of the above result, we suggest calling a set self-bounded if it satisfies property (iii). On the one hand, this extends the notion to sets whose recession cone is not solid. On the other hand, it makes every compact set self-bounded, rather than just singletons. Since in [35] cones are assumed to be solid, Theorem 3.2 proves that a convex vector optimization problem is tractable in terms of polyhedral approximations if and only if the upper image [35, Equation 6] of the problem satisfies (i) in Theorem 3.1.

The reason that many unbounded convex sets are beyond the scope of polyhedral approximation in the Hausdorff distance is that this distance is, by its very nature, well behaved only for compact sets. The following proposition makes this precise.

Proposition 3.2

For closed and convex sets \(C_1, C_2 \subseteq {\mathbb {R}}^n\), it is true that \(d_{{\mathsf {H}}}\left( {C_1},{C_2}\right) < +\infty \) only if \({0^+ {C_1} = 0^+ {C_2}}\).

Proof

Assume \({0^+ {C_1} \ne 0^+ {C_2}}\) and let w.l.o.g. \({r \in 0^+ {C_1} \setminus 0^+ {C_2}}\). Consider the equivalent definition of the Hausdorff distance:

$$\begin{aligned} d_{{\mathsf {H}}}\left( {C_1},{C_2}\right) = \inf \left\{ \varepsilon > 0 \mid C_1 \subseteq C_2 + B_{\varepsilon }(0), C_2 \subseteq C_1 + B_{\varepsilon }(0) \right\} . \end{aligned}$$

Let \(\varepsilon \) be large enough such that \({C_1 \cap \left( C_2 + B_{\varepsilon }(0)\right) \ne \emptyset }\) and let z be an element of this set. Then \({z +\mu r \in C_1}\) for all \({\mu \ge 0}\). The recession cone of \({C_2+B_{\varepsilon }(0)}\) is \(0^+ {C_2}\) according to [29, Proposition 9.1.2]. Therefore, there exists some \(\mu _{\varepsilon }\) such that \({z+\mu r \notin C_2+B_{\varepsilon }(0)}\) for all \({\mu \ge \mu _{\varepsilon }}\). This yields \({d_{{\mathsf {H}}}\left( {C_1},{C_2}\right) \ge \varepsilon }\) and the claim follows with \({\varepsilon \rightarrow \infty }\). \(\square \)

4 A Polyhedral Approximation Scheme for Closed Convex Line-Free Sets

We have seen that, in order to approximate a set C by a polyhedron P in the Hausdorff distance, their recession cones need to be identical. Theorems 3.1 and 3.2 tell us that this is achievable only for specific sets C. To treat a larger class of sets, a concept is needed that quantifies similarity between closed convex cones, similar to how the Hausdorff distance quantifies similarity between compact sets.

Definition 4.1

Given nonempty closed convex cones \({K_1, K_2 \subseteq {\mathbb {R}}^n}\), the truncated Hausdorff distance between \(K_1\) and \(K_2\), \({{\bar{d}}_{{\mathsf {H}}}(K_1,K_2)}\), is defined as

$$\begin{aligned} {\bar{d}}_{{\mathsf {H}}}(K_1,K_2) := d_{{\mathsf {H}}}\left( {K_1 \cap B_1(0)},{K_2 \cap B_1(0)}\right) . \end{aligned}$$
(7)
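As a simple illustration of (7), consider two rays spanned by unit vectors \(u, v \in {\mathbb {R}}^n\) that enclose an angle \(\theta \in [0, \pi /2]\). The truncated cones are the segments [0, u] and [0, v], and for \(t \in [0,1]\) the point of [0, v] nearest to tu is \(t\cos (\theta )\,v\), so

$$\begin{aligned} {\bar{d}}_{{\mathsf {H}}}({{\,\mathrm{cone}\,}}\{u\},{{\,\mathrm{cone}\,}}\{v\}) = \max _{t \in [0,1]} t\sin (\theta ) = \sin (\theta ), \end{aligned}$$

where the excess in the other direction has the same value by symmetry.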

Since every cone contains the origin, it is immediate that \({{\bar{d}}_{{\mathsf {H}}}(K_1,K_2) \le 1}\). The truncated Hausdorff distance defines a metric on the set of closed convex cones in \({\mathbb {R}}^n\), see [13]. However, it is only one way among many to measure the distance between convex cones. We suggest the survey in [15] for a more thorough discussion of the topic. With the truncated Hausdorff distance, we define the following notion of polyhedral approximation of convex sets that are not necessarily bounded.

Definition 4.2

Given a nonempty closed convex and line-free set \({C \subseteq {\mathbb {R}}^n}\), a line-free polyhedron P is called an \({\left( \varepsilon , \delta \right) }\)-approximation of C if

  1. (i)

    \(e[{{{\,\mathrm{vert}\,}}P},{C}] \le \varepsilon \),

  2. (ii)

    \({\bar{d}}_{{\mathsf {H}}}(0^+ {P},0^+ {C}) \le \delta \),

  3. (iii)

    \(P \supseteq C\).

Fig. 2

Left: Polyhedron P is an \(\left( \varepsilon , \delta \right) \)-approximation of the grey set C. Right: Recession cones of the sets on the left. The truncated Hausdorff distance between them is at most \(\delta \)

Remark 4.1

The assumption that P is line-free is equivalent to \({{\,\mathrm{vert}\,}}P \ne \emptyset \) and hence required for condition (i) in the definition. Condition (iii) means that P is an outer approximation of C. This is required, because otherwise the roles of P and C would have to be interchanged in (i). However, it is not clear how to proceed with this in a meaningful fashion. The analogue of considering vertices of P would be to consider extreme points of C instead. The set of extreme points of C may be unbounded and it is in general not possible to enforce the upper bound of \(\varepsilon \). Lastly, we decided to make a distinction between \(\varepsilon \) and \(\delta \), because scales of these error measures may be very different depending on the sets, i.e., \(\delta \) is always bounded from above by 1, but for \(\varepsilon \) it may be useful to allow values larger than 1.
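To make the three conditions tangible, the following Python sketch checks (i) and (ii) numerically for a hypothetical example that is not taken from this article: \(C = \{(x,y) \mid y \ge x^2\}\), whose recession cone is \({{\,\mathrm{cone}\,}}\{(0,1)\}\), and the outer polyhedron P bounded by the tangent lines of the parabola at \(x = -1, 0, 1\). Condition (iii) holds by construction, since tangent lines support the epigraph; the truncated Hausdorff distance in (ii) is only estimated by sampling the truncated cones.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.spatial.distance import cdist

# Hypothetical data: P = {y >= 0, y >= 2x - 1, y >= -2x - 1} is an outer
# approximation of C = epi(x^2).
vertices = np.array([[-0.5, 0.0], [0.5, 0.0]])   # vert P
P_gens = np.array([[-1.0, 2.0], [1.0, 2.0]])     # generators of 0^+P
C_gens = np.array([[0.0, 1.0]])                  # generators of 0^+C

def dist_to_C(p):
    """Euclidean distance from p to the epigraph of x^2."""
    if p[1] >= p[0] ** 2:
        return 0.0
    res = minimize_scalar(lambda x: (x - p[0]) ** 2 + (x ** 2 - p[1]) ** 2)
    return float(np.sqrt(res.fun))

def truncated_cone_samples(gens, n_dir=60, n_rad=40):
    """Sample points of (cone gens) intersected with the unit ball (2-D cones only)."""
    gens = gens / np.linalg.norm(gens, axis=1, keepdims=True)
    lam = np.linspace(0.0, 1.0, n_dir)[:, None]
    dirs = lam * gens[0] + (1.0 - lam) * gens[-1]
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    radii = np.linspace(0.0, 1.0, n_rad)
    return (radii[:, None, None] * dirs[None, :, :]).reshape(-1, 2)

def hausdorff(X, Y):
    D = cdist(X, Y)
    return max(D.min(axis=1).max(), D.min(axis=0).max())

eps = max(dist_to_C(v) for v in vertices)                         # condition (i), roughly 0.19
delta = hausdorff(truncated_cone_samples(P_gens),
                  truncated_cone_samples(C_gens))                 # condition (ii), roughly 0.45
print(f"e[vert P, C] ~ {eps:.3f},  d_H(truncated cones) ~ {delta:.3f}")
```

For this toy instance, P is thus, for example, a (0.2, 0.5)-approximation of C.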

Figure 2 illustrates the definition. We will show that an \(\left( \varepsilon , \delta \right) \)-approximation of a set C approximates C in a meaningful way. To this end, we consider the Painlevé–Kuratowski convergence, a notion of set convergence that is suitable for a broader class of sets than convergence with respect to the Hausdorff distance.

Definition 4.3

A sequence \(\{C^{\nu }\}_{\nu \in {\mathbb {N}}}\) of subsets of \({\mathbb {R}}^n\) is said to converge to \({C \subseteq {\mathbb {R}}^n}\) in the sense of Painlevé–Kuratowski, denoted by \({C^{\nu } \rightarrow C}\), if the following equalities hold:

$$\begin{aligned} \begin{aligned} C&= \left\{ x \in {\mathbb {R}}^n \Big | \begin{array}{l} \text {for all open neighbourhoods } U \text { of } x,\\ U \cap C^{\nu } \ne \emptyset \text { for large enough } \nu \end{array} \right\} \\&= \left\{ x \in {\mathbb {R}}^n \Big | \begin{array}{l} \text {for all open neighbourhoods } U \text { of } x,\\ U \cap C^{\nu } \ne \emptyset \text { for infinitely many }\nu \end{array}\right\} . \end{aligned} \end{aligned}$$

To conserve space and enhance readability, we will denote by \(C^{\nu }\) the sequence \(\{C^{\nu }\}_{\nu \in {\mathbb {N}}}\) as well as the specific element \(C^{\nu }\) of this sequence whenever there is no ambiguity. The two sets in the definition are called the inner and outer limit of \(C^{\nu }\), respectively. Convergence in the sense of Painlevé–Kuratowski is weaker than convergence in the Hausdorff distance, but both concepts coincide when restricted to compact subsets, see Example 4.1 and [30, pp. 131–138]. However, for convex sets Painlevé–Kuratowski convergence can be characterized using the Hausdorff distance.

Example 4.1

(see [30, p. 118]) Consider the sequence of sets for which \(C^{\nu } = \left\{ {x,y^{\nu }} \right\} \) for \(x,y^{\nu } \in {\mathbb {R}}^n\) and \(\left\Vert {y^{\nu }} \right\Vert \rightarrow \infty \). Then \(C^{\nu }\) converges in the sense of Painlevé–Kuratowski to the singleton \(C=\left\{ {x} \right\} \), but does not converge in the Hausdorff distance, because \(d_{{\mathsf {H}}}\left( {C^{\nu }},{C}\right) =\left\Vert {x-y^{\nu }} \right\Vert \rightarrow \infty \).

Theorem 4.1

(see [30, p. 120]) A sequence \(C^{\nu }\) of nonempty closed and convex sets converges to C in the sense of Painlevé–Kuratowski if and only if there exist \(x \in {\mathbb {R}}^n\) and \(r_0 \in {\mathbb {R}}\), such that for all \(r \ge r_0\) it holds that

$$\begin{aligned} d_{{\mathsf {H}}}\left( {C^{\nu } \cap B_r(x)},{C \cap B_r(x)}\right) \rightarrow 0. \end{aligned}$$
(8)

In geometric terms, this means that a sequence of nonempty closed and convex sets converges in the sense of Painlevé–Kuratowski if and only if it converges in the Hausdorff distance on every nonempty compact subset. In the remainder of this section we show that \(\left( \varepsilon , \delta \right) \)-approximations provide a meaningful notion of polyhedral approximation for unbounded sets in the sense that a sequence of approximations converges as defined in Definition 4.2 if \(\varepsilon \) and \(\delta \) tend to zero. To this end, we need some preparatory results. The first one yields a bound on the Hausdorff distance between truncations of a set and truncations of an \(\left( \varepsilon , \delta \right) \)-approximation.

Proposition 4.1

Let \(C \subseteq {\mathbb {R}}^n\) be nonempty closed convex and line-free and let P be an \(\left( \varepsilon , \delta \right) \)-approximation of C. Then for every \(x \in {{\,\mathrm{conv}\,}}{{\,\mathrm{vert}\,}}P\) and \(r \ge \varepsilon \) it holds true that

$$\begin{aligned} d_{{\mathsf {H}}}\left( {P \cap B_r(x)},{C \cap B_r(x)}\right) \le 2\left( \varepsilon +\delta \left( r+\left\Vert {x-v} \right\Vert \right) \right) \end{aligned}$$
(9)

for some \(v \in {{\,\mathrm{conv}\,}}{{\,\mathrm{vert}\,}}P\). In particular, if \(d_{{\mathsf {H}}}\left( {P \cap B_r(x)},{C \cap B_r(x)}\right) \) is attained as \(\left\Vert {p-c} \right\Vert \) with \(p \in P\), then \(p = v + td\) for some \(d \in 0^+ {P}\) and \(t \ge 0\).

Proof

Denote \(P \cap B_r(x)\), \(C \cap B_r(x)\), and \({{\,\mathrm{conv}\,}}{{\,\mathrm{vert}\,}}P\) by \({\bar{P}}\), \({\bar{C}}\), and V, respectively. Since \({\bar{P}}, {\bar{C}}\) are nonempty convex and compact, \(d_{{\mathsf {H}}}\left( {{\bar{P}}},{{\bar{C}}}\right) = \left\Vert {p^*-c^*} \right\Vert \) for some \(p^* \in {\bar{P}}\) and \(c^* \in {\bar{C}}\). For \(\uplambda \in [0,1]\) let \(z(\uplambda ) = \uplambda p^* + (1-\uplambda ) x\). We distinguish two cases. First, assume \(p^* \in V\). Then \(z(\uplambda ) \in V\) for every \({\uplambda \in [0,1]}\), and due to (i) in Definition 4.2, there is \({c_{\uplambda } \in {\bar{C}}}\) with \({\left\Vert {z(\uplambda )-c_{\uplambda }} \right\Vert \le \varepsilon }\). If \(\left\Vert {p^*-x} \right\Vert \le \varepsilon \), then

$$\begin{aligned} \left\Vert {p^*-c^*} \right\Vert \le \left\Vert {p^*-c_0} \right\Vert \le \left\Vert {p^*-x} \right\Vert + \left\Vert {x-c_0} \right\Vert \le 2\varepsilon . \end{aligned}$$

If \(\left\Vert {p^*-x} \right\Vert > \varepsilon \), set \({\uplambda }^* = \frac{\varepsilon }{\left\Vert {p^*-x} \right\Vert }\). Similarly, we have

$$\begin{aligned} \left\Vert {p^*-c^*} \right\Vert \le \left\Vert {p^*-c_{{\uplambda }^*}} \right\Vert \le \left\Vert {p^*-z({\uplambda }^*)} \right\Vert + \left\Vert {z({\uplambda }^*)-c_{{\uplambda }^*}} \right\Vert \le 2\varepsilon . \end{aligned}$$

Now, assume \({p^* \notin V}\). Then there exists \({\bar{\uplambda }} \in (0,1)\), such that \(z({\bar{\uplambda }}) \in V\) and \(z(\uplambda ) \notin V\) for all \(\uplambda \in ({\bar{\uplambda }},1]\). For \({\uplambda \in ({\bar{\uplambda }},1]}\), \(z(\uplambda )\) can be written as \({z(\uplambda ) = v_{\uplambda } + t_{\uplambda }d_{\uplambda }}\) with \({v_{\uplambda } = {{\,\mathrm{argmin}\,}}\left\{ \left\Vert {z({\bar{\uplambda }}) - v} \right\Vert \;\Big |\; v \in \left( \{z(\uplambda )\} - 0^+ {P}\right) \cap V \right\} }\), \(t_{\uplambda } \ge 0\) and \(d_{\uplambda } \in 0^+ {P}\), \(\left\Vert {d_{\uplambda }} \right\Vert = 1\). By the definition of \(\left( \varepsilon , \delta \right) \)-approximation, there exist \(c_{\uplambda } \in C\) and \(\bar{d_{\uplambda }} \in 0^+ {C}\) such that \(\left\Vert {v_{\uplambda }-c_{\uplambda }} \right\Vert \le \varepsilon \) and \(\left\Vert {d_{\uplambda }-\bar{d_{\uplambda }}} \right\Vert \le \delta \). Now we have

$$\begin{aligned} \left\Vert {z(\uplambda ) - (c_{\uplambda } + t_{\uplambda }\bar{d_{\uplambda }})} \right\Vert \le \left\Vert {v_{\uplambda }-c_{\uplambda }} \right\Vert + t_{\uplambda } \left\Vert {d_{\uplambda }-\bar{d_{\uplambda }}} \right\Vert \le \varepsilon + t_{\uplambda }\delta . \end{aligned}$$
(10)

Furthermore, \(t_{\uplambda } = \left\Vert {z(\uplambda )-v_{\uplambda }} \right\Vert \), because \(\left\Vert {d_{\uplambda }} \right\Vert = 1\). Hence,

$$\begin{aligned} \begin{aligned} t_{\uplambda }&\le \left\Vert {z(\uplambda )-z({\bar{\uplambda }})} \right\Vert + \left\Vert {z({\bar{\uplambda }}) - v_{\uplambda }} \right\Vert \le \left\Vert {z(\uplambda )-z({\bar{\uplambda }})} \right\Vert + \left\Vert {z({\bar{\uplambda }})-v_1} \right\Vert \\&\le \left\Vert {z(\uplambda )-z({\bar{\uplambda }})} \right\Vert + \left\Vert {z({\bar{\uplambda }})-x} \right\Vert + \left\Vert {x -v_1} \right\Vert \\&= \left\Vert {z(\uplambda )-x} \right\Vert + \left\Vert {x-v_1} \right\Vert \\&\le r + \left\Vert {x-v_1} \right\Vert . \end{aligned} \end{aligned}$$
(11)

If \(\frac{r-\varepsilon }{r+\left\Vert {x-v_1} \right\Vert } < \delta \le 1\), then

$$\begin{aligned} \left\Vert {p^*-c^*} \right\Vert \le \left\Vert {p^*-c_0} \right\Vert \le \left\Vert {p^*-x} \right\Vert + \left\Vert {x-c_0} \right\Vert \le r + \varepsilon \le 2(\varepsilon + \delta (r+\left\Vert {x-v_1} \right\Vert )). \end{aligned}$$

Otherwise, the last inequality is violated. In this case, let \({\uplambda }^* = 1-\frac{\varepsilon + \delta (r+\left\Vert {x-v_1} \right\Vert )}{\left\Vert {p^*-x} \right\Vert }\). If \({\uplambda }^* \le {\bar{\uplambda }}\), then \(z({\uplambda }^*) \in V\), and there exists \(c_{{\uplambda }^*} \in C\) with \(\left\Vert {z({\uplambda }^*) - c_{{\uplambda }^*}} \right\Vert \le \varepsilon \). Therefore,

$$\begin{aligned} \left\Vert {p^*-c^*} \right\Vert \le \left\Vert {p^*-z({\uplambda }^*)} \right\Vert + \left\Vert {z({\uplambda }^*)-c_{{\uplambda }^*}} \right\Vert \le 2 \varepsilon + \delta (r+\left\Vert {x-v_1} \right\Vert ). \end{aligned}$$

If \({\uplambda }^* > {\bar{\uplambda }}\) then, according to (10) and (11), there exists \(c \in C\) such that \(\left\Vert {z({\uplambda }^*)-c} \right\Vert \le \varepsilon + \delta (r+\left\Vert {x-v_1} \right\Vert )\). Altogether this yields

$$\begin{aligned} \left\Vert {p^*-c^*} \right\Vert \le \left\Vert {p^*-c} \right\Vert \le \left\Vert {p^*-z({\uplambda }^*)} \right\Vert + \left\Vert {z({\uplambda }^*)-c} \right\Vert \le 2 \left( \varepsilon + \delta (r+\left\Vert {x-v_1} \right\Vert )\right) , \end{aligned}$$

which completes the proof. \(\square \)

We need two more results before we can prove Theorem 4.2.

Lemma 4.1

Let \(C \subseteq {\mathbb {R}}^n\) be nonempty closed and convex and let there be sequences \(v^{\nu }\), \(r^{\nu }\) such that \(\inf _{c \in C} \left\Vert {v^{\nu }-c} \right\Vert \rightarrow 0\), \(\inf _{r \in 0^+ {C}} \left\Vert {r^{\nu }-r} \right\Vert \rightarrow 0\), and \(v^{\nu }+r^{\nu } \in B_M(x)\) for some \(M \ge 0\) and \(x \in {\mathbb {R}}^n\). If C is line-free, then \(\left\Vert {v^{\nu }} \right\Vert \) is bounded.

Proof

Assume that \(\left\Vert {v^{\nu }} \right\Vert \) is unbounded. This implies that \(\left\Vert {r^{\nu }} \right\Vert \) is also unbounded and \(0^+ {C} \ne \{0\}\). Without loss of generality, let \(r^{\nu } \ne 0\) for all \(\nu \). Then \(d^{\nu } := r^{\nu }/\left\Vert {r^{\nu }} \right\Vert \) is bounded and has a convergent subsequence. Without loss of generality, we can assume that \(d^{\nu } \rightarrow d \in 0^+ {C}\). We will show that \(-d \in 0^+ {C}\). Therefore let \(c \in C\), \(t \ge 0\) and define \(y^{\nu } = {{\,\mathrm{argmin}\,}}_{y \in C} \left\Vert {v^{\nu }-y} \right\Vert \). By the triangle inequality, it holds true that

$$\begin{aligned} \left\Vert {y^{\nu }+r^{\nu }} \right\Vert \le \left\Vert {y^{\nu }-v^{\nu }} \right\Vert + \left\Vert {v^{\nu }+r^{\nu }-x} \right\Vert + \left\Vert {x} \right\Vert \le \left\Vert {y^{\nu }-v^{\nu }} \right\Vert + M + \left\Vert {x} \right\Vert . \end{aligned}$$
(12)

Note that \(\left\Vert {y^{\nu }+r^{\nu }} \right\Vert \) is bounded from above by some \({\overline{M}}\), because \(\left\Vert {v^{\nu }-y^{\nu }} \right\Vert \rightarrow 0\). For every \(T \ge 0\), there exists some \(\nu _T\) such that \(\left\Vert {r^{\nu _T}} \right\Vert \ge T\). Let \(T \ge t\) and define

$$\begin{aligned} {\bar{y}} := \frac{t}{\left\Vert {r^{\nu _T}} \right\Vert } y^{\nu _T} + \left( 1-\frac{t}{\left\Vert {r^{\nu _T}} \right\Vert } \right) c \in C. \end{aligned}$$

Putting it all together, one gets

$$\begin{aligned} \left\Vert {{\bar{y}} - \left( c-td^{\nu _T}\right) } \right\Vert = \frac{t}{\left\Vert {r^{\nu _T}} \right\Vert }\left\Vert {y^{\nu _T}+r^{\nu _T}-c} \right\Vert \le \frac{t}{\left\Vert {r^{\nu _T}} \right\Vert }\left( \left\Vert {y^{\nu _T}+r^{\nu _T}} \right\Vert + \left\Vert {c} \right\Vert \right) \le \frac{t}{T}\left( {\overline{M}}+\left\Vert {c} \right\Vert \right) , \end{aligned}$$

where the last inequality holds due to (12) and the boundedness of \(\left\Vert {y^{\nu _T}+r^{\nu _T}} \right\Vert \) by \({\overline{M}}\). Since C is closed and \(d^{\nu } \rightarrow d \in 0^+ {C}\), taking the limit \(T \rightarrow +\infty \) yields that \(c-td \in C\). As \(c \in C\) and \(t \ge 0\) were arbitrary, this means \(-d \in 0^+ {C}\), which is a contradiction to the pointedness of \(0^+ {C}\). \(\square \)

Every closed and line-free convex set C can be written as the convex hull of its extreme points plus its recession cone [14, p. 35]. In particular, Lemma 4.1 implies that the set of convex combinations of extreme points that can appear in such a decomposition of a given point of C is compact. The next result establishes a relation between extreme points of C and the vertices of an \(\left( \varepsilon , \delta \right) \)-approximation.

Proposition 4.2

Let \(C \subseteq {\mathbb {R}}^n\) be nonempty closed convex and line-free. For \(\nu \in {\mathbb {N}}\), let  \(P^{\nu }\) be an \((\varepsilon ^{\nu },\delta ^{\nu })\)-approximation of C. If  \({(\varepsilon ^{\nu },\delta ^{\nu }) \rightarrow (0,0)}\), then for every extreme point c of C there exists a sequence  \({x^{\nu } \rightarrow c}\) such that \(x^{\nu } \in {{\,\mathrm{conv}\,}}{{\,\mathrm{vert}\,}}P^{\nu }\).

Proof

Since C is line-free, it has at least one extreme point. Let c be one such extreme point. Assume that for every sequence \(x^{\nu }\) with  \({x^{\nu } \in {{\,\mathrm{conv}\,}}{{\,\mathrm{vert}\,}}P^{\nu }}\) there exists a \(\gamma > 0\), such that  \({\left\Vert {x^{\nu }-c} \right\Vert > \gamma }\) for infinitely many \(\nu \). Then, without loss of generality, there exists one such sequence such that  \(\left\Vert {x^{\nu }-c} \right\Vert > \gamma \) for every \(\nu \) and, since \(C \subseteq P^{\nu }\), \(c=x^{\nu }+r^{\nu }\) for some \(r^{\nu } \in 0^+ {P^{\nu }}\). By Lemma 4.1, it holds that \(\left\Vert {x^{\nu }} \right\Vert \) and \(\left\Vert {r^{\nu }} \right\Vert \) are bounded. Then there exist subsequences \(x^{\nu _k}\), \(r^{\nu _k}\) such that

$$\begin{aligned} x^{\nu _k} \rightarrow x \in C, \quad r^{\nu _k} \rightarrow r \in 0^+ {C}. \end{aligned}$$

Note that \(r \ne 0\), because \(\left\Vert {r^{\nu }} \right\Vert = \left\Vert {c - x^{\nu }} \right\Vert > \gamma \) for all \(\nu \). Finally,

$$\begin{aligned} c=x+r=\frac{1}{2}\left( x+2r \right) + \frac{1}{2} x. \end{aligned}$$

This is a contradiction to c being an extreme point of C. \(\square \)

We are now ready to prove the main result.

Theorem 4.2

Let \(C \subseteq {\mathbb {R}}^n\) be nonempty closed convex and line-free. For \(\nu \in {\mathbb {N}}\) let \(P^{\nu }\) be an \((\varepsilon ^{\nu },\delta ^{\nu })\)-approximation of C. If \({(\varepsilon ^{\nu },\delta ^{\nu }) \rightarrow (0,0)}\), then \({P^{\nu } \rightarrow C}\) in the sense of Painlevé–Kuratowski.

Proof

By Theorem 4.1, we must show that there exist \(c \in {\mathbb {R}}^n\) and \(r_0 \ge 0\) such that \(d_{{\mathsf {H}}}\left( {P^{\nu } \cap B_r(c)},{C \cap B_r(c)}\right) \rightarrow 0\) for all \(r \ge r_0\). Let \({r \ge \max _{\nu \in {\mathbb {N}}} \varepsilon ^{\nu }}\) and let c be an extreme point of C, which exists, because C contains no lines. By Proposition 4.2, there exists a sequence \({x^{\nu } \rightarrow c}\) with \({x^{\nu } \in {{\,\mathrm{conv}\,}}{{\,\mathrm{vert}\,}}P^{\nu }}\). Applying the triangle inequality and Proposition 4.1 yields

$$\begin{aligned} \begin{aligned}&d_{{\mathsf {H}}}\left( {P^{\nu } \cap B_r(c)},{C \cap B_r(c)}\right) \\&\le d_{{\mathsf {H}}}\left( {P^{\nu } \cap B_r(c)},{P^{\nu } \cap B_r(x^{\nu })}\right) \\&+ d_{{\mathsf {H}}}\left( {P^{\nu } \cap B_r(x^{\nu })},{C \cap B_r(x^{\nu })}\right) \\&+ d_{{\mathsf {H}}}\left( {C \cap B_r(x^{\nu })},{C \cap B_r(c)}\right) \\&\le d_{{\mathsf {H}}}\left( {P^{\nu } \cap B_r(c)},{P^{\nu } \cap B_r(x^{\nu })}\right) \\&+ 2(\varepsilon ^{\nu }+\delta ^{\nu }(r+\left\Vert {x^{\nu }-v^{\nu }} \right\Vert )) \\&+ d_{{\mathsf {H}}}\left( {C \cap B_r(x^{\nu })},{C \cap B_r(c)}\right) , \end{aligned} \end{aligned}$$

for some \(v^{\nu } \in {{\,\mathrm{conv}\,}}{{\,\mathrm{vert}\,}}P^{\nu }\). The first and third term in this sum converge to zero as \({x^{\nu } \rightarrow c}\). It remains to show that \(\left\Vert {x^{\nu }-v^{\nu }} \right\Vert \) is bounded. Since \(C \subseteq P^{\nu }\), the distance \({d_{{\mathsf {H}}}\left( {P^{\nu } \cap B_r(x^{\nu })},{C \cap B_r(x^{\nu })}\right) }\) is attained as \(e[{P^{\nu } \cap B_r(x^{\nu })},{C \cap B_r(x^{\nu })}]\). Let the supremum be attained by \(p^{\nu } \in P^{\nu }\). Then \({p^{\nu } = v^{\nu } + d^{\nu }}\) for some \(d^{\nu } \in 0^+ {P^{\nu }}\). It holds

$$\begin{aligned} \left\Vert {p^{\nu }-c} \right\Vert \le \left\Vert {p^{\nu }-x^{\nu }} \right\Vert + \left\Vert {x^{\nu }-c} \right\Vert \le r + \max _{\nu \in {\mathbb {N}}} \left\Vert {x^{\nu }-c} \right\Vert < +\infty , \end{aligned}$$

i.e., \(v^{\nu }+d^{\nu } \in B_M(c)\) for some \(M \ge 0\). Therefore, the sequence \(\left\Vert {v^{\nu }} \right\Vert \) is bounded according to Lemma 4.1. Hence, \(\left\Vert {x^{\nu }-v^{\nu }} \right\Vert \) is also bounded and \(d_{{\mathsf {H}}}\left( {P^{\nu } \cap B_r(c)},{C \cap B_r(c)}\right) \rightarrow 0\), which was to be proved. \(\square \)

Theorem 4.2 justifies the definition of \(\left( \varepsilon , \delta \right) \)-approximations, i.e., it states that they define a meaningful notion of approximation. We close this section with the observation that \(\left( \varepsilon , \delta \right) \)-approximations reduce to \(\varepsilon \)-approximations in the compact case.

Corollary 4.1

Let \(C \subseteq {\mathbb {R}}^n\) be a convex body and \(P \subseteq {\mathbb {R}}^n\) be a polyhedron. For \(\varepsilon \ge 0\) and \(\delta \in [0,1)\) the following are equivalent.

  1. (i)

    P is an \(\left( \varepsilon , \delta \right) \)-approximation of C.

  2. (ii)

    \(P \supseteq C\) and \(d_{{\mathsf {H}}}\left( {P},{C}\right) \le \varepsilon \).

Proof

Since C is compact, \(0^+ {C} = \{0\}\). Then \({\bar{d}}_{{\mathsf {H}}}(0^+ {P},\{0\}) < 1\) implies that \(0^+ {P} = \{0\}\), i.e., P is compact as well, because otherwise one would have \({\bar{d}}_{{\mathsf {H}}}(0^+ {P},\{0\}) = 1\). Therefore, P is the convex hull of its vertices. Because \(P \supseteq C\), \(d_{{\mathsf {H}}}\left( {P},{C}\right) \) is attained as \(e[{P},{C}]\). But \(e[{P},{C}]\) is attained in a vertex of P, i.e., \(e[{P},{C}] = e[{{{\,\mathrm{vert}\,}}P},{C}]\). Hence, \(d_{{\mathsf {H}}}\left( {P},{C}\right) \le \varepsilon \). On the other hand, if \(d_{{\mathsf {H}}}\left( {P},{C}\right) \le \varepsilon \), then P must be compact by Proposition 3.2. Then \(0^+ {P} = 0^+ {C}\) and P is an \((\varepsilon ,0)\)-approximation of C and in particular an \(\left( \varepsilon , \delta \right) \)-approximation. \(\square \)

5 An Algorithm for the Polyhedral Approximation of Unbounded Spectrahedra

In this section, we present an algorithm for computing \(\left( \varepsilon , \delta \right) \)-approximations of closed convex and line-free sets C whose interior is nonempty. We also prove correctness and finiteness of the algorithm. The algorithm employs a cutting scheme, a procedure for approximating convex bodies by polyhedra that is introduced in [16]. A cutting scheme is an iterative algorithm that computes a sequence of polyhedral outer approximations by successively intersecting the approximation with new halfspaces. In doing so, vertices of the current approximation are cut off, hence the name. The calculation of these halfspaces is explained in Proposition 5.2.

Since we are dealing with unbounded sets, we pursue the idea of reducing the computations to certain compact sets and then applying a cutting scheme. Furthermore, we have to be able to access the set \(0^+ {C}\) computationally. Since this is difficult in the general case, we only consider sets C that are spectrahedra, because for these a representation of the recession cone is readily available.

Throughout this section, we consider the following semidefinite programs related to a closed spectrahedron \(C = \{x \in {\mathbb {R}}^n \mid {\mathcal {A}}(x) \succcurlyeq 0 \}\) with nonempty interior. For a direction \(w \in {\mathbb {R}}^n \setminus \{0\}\), consider

$$\begin{aligned} \max \; w^{{\mathsf {T}}}x \quad \text {s.t.} \quad {\mathcal {A}}(x) \succcurlyeq 0. \qquad \qquad (\text {P}_1(w)) \end{aligned}$$

Solving (P\(_1\)(w)) is equivalent to determining the maximal shifting of a hyperplane with normal w within C. The following result is well known in the literature, see, e.g., [29, Corollary 14.2.1].

Proposition 5.1

For every \(w \in {{\,\mathrm{int}\,}}{(0^+ {C})}^{\circ }\), an optimal solution to (P\(_1\)(w)) exists.

The second problem we consider is

$$\begin{aligned} \min \; t \quad \text {s.t.} \quad {\mathcal {A}}(x) \succcurlyeq 0, \quad x = v + td, \qquad \qquad (\text {P}_2(v,d)) \end{aligned}$$

where \(v \in {\mathbb {R}}^n\) and \(d \in {\mathbb {R}}^n\setminus \{0\}\) are given and \((x,t) \in {\mathbb {R}}^n \times {\mathbb {R}}\) are the variables. Solving (P\(_2\)(v,d)) can be described as the task of determining the maximum distance one can move in direction d starting at point v until the set C is reached. If this distance is finite and \(v \notin C\), then a solution to (P\(_2\)(v,d)) yields a point on the boundary of C, namely one of the points that are obtained by intersecting the boundary of C with the affine set \(\{v+td \mid t \in {\mathbb {R}}\}\). The Lagrangian dual problem of (P\(_2\)(v,d)) is

$$\begin{aligned} \max \; -A_0 \cdot U - v^{{\mathsf {T}}}w \quad \text {s.t.} \quad A_i \cdot U = w_i, \; i = 1,\dots ,n, \quad d^{{\mathsf {T}}}w = 1, \quad U \succcurlyeq 0. \qquad \qquad (\text {D}_2(v,d)) \end{aligned}$$

Solutions to (P\(_2\)(v,d)) and (D\(_2\)(v,d)) give rise to a supporting hyperplane of C as described in the next proposition.

Proposition 5.2

Let \(v \notin C\) and set \(d=c-v\) for some \(c \in {{\,\mathrm{int}\,}}C\). Then solutions \((x^*,t^*)\) to (P\(_2\)(v,d)) and \((U^*,w^*)\) to (D\(_2\)(v,d)) exist. Moreover, \(w^{*{\mathsf {T}}} x \ge w^{*{\mathsf {T}}}v+t^*\) for all \(x \in C\) and equality holds for \(x=x^*\).

Proof

Without loss of generality, we can assume that \({{\,\mathrm{int}\,}}C = \{x \in {\mathbb {R}}^n \mid {\mathcal {A}}(x) \succ 0\}\), see [12, Corollary 5]. Then (c, 1) is strictly feasible for (P\(_2\)(v,d)), i.e., the well-known Slater constraint qualification of convex optimization is satisfied. Since \(v \notin C\), by convexity the first constraint is violated whenever \(t \le 0\). Since C is closed, an optimal solution \((x^*,t^*)\) to (P\(_2\)(v,d)) with \(t^* \in [0,1]\) exists. Slater's constraint qualification now implies strong duality, i.e., an optimal solution \((U^*,w^*)\) to (D\(_2\)(v,d)) exists and the optimal values coincide. Next, let \(x \in C\) and observe that

$$\begin{aligned} \begin{aligned} w^{*{\mathsf {T}}}x-w^{*{\mathsf {T}}}v-t^*&= \sum _{i=1}^n x_i (A_i \cdot U^*) -w^{*{\mathsf {T}}}v-t^* \\&= {\bar{{\mathcal {A}}}}(x) \cdot U^* -\sum _{i=1}^n v_i (A_i \cdot U^*) -t^* \\&= {\bar{{\mathcal {A}}}}(x) \cdot U^* - {\bar{{\mathcal {A}}}}(v) \cdot U^* + {\mathcal {A}}(v) \cdot U^* \\&= {\bar{{\mathcal {A}}}}(x) \cdot U^* + A_0 \cdot U^* \\&= {\mathcal {A}}(x) \cdot U^* \ge 0. \end{aligned} \end{aligned}$$

The third equality holds due to strong duality. Lastly, for \(x=x^*\) we have equality, because \(x^* = v+t^*d\) and \(w^{*{\mathsf {T}}}d = 1\). \(\square \)
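The primal part of this construction can also be illustrated without a semidefinite programming solver: along the segment from v to an interior point c, the parameters t with \({\mathcal {A}}(v+td) \succcurlyeq 0\) form an interval containing 1, so the boundary point \(x^* = v + t^*d\) can be located by bisection on the smallest eigenvalue. The following Python sketch does this for a hypothetical spectrahedron (the unit disk); note that, unlike the dual solution \((U^*, w^*)\), it does not provide the supporting hyperplane of Proposition 5.2.

```python
import numpy as np

# Hypothetical spectrahedron: A(x) = A0 + x1*A1 + x2*A2 >= 0 describes the unit disk.
A0 = np.eye(2)
A1 = np.array([[1.0, 0.0], [0.0, -1.0]])
A2 = np.array([[0.0, 1.0], [1.0, 0.0]])

def is_feasible(x):
    return np.linalg.eigvalsh(A0 + x[0] * A1 + x[1] * A2).min() >= -1e-12

def boundary_point(v, c, tol=1e-10):
    """Smallest t in [0, 1] with v + t*(c - v) feasible, found by bisection.

    Assumes v lies outside and c in the interior of the spectrahedron, so that
    t = 0 is infeasible, t = 1 is feasible and the feasible t form an interval."""
    d = c - v
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if is_feasible(v + mid * d):
            hi = mid
        else:
            lo = mid
    return v + hi * d, hi

x_star, t_star = boundary_point(np.array([2.0, 0.0]), np.zeros(2))
print(x_star, t_star)   # approximately [1, 0] and 0.5
```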

We want to describe the functioning of the algorithm geometrically before we present the details in pseudo code. The method consists of two phases. In the first phase, an initial polyhedron \(P_0\), such that \(P_0 \supseteq C\) and \({\bar{d}}_{{\mathsf {H}}}(0^+ {P_0},0^+ {C}) \le \delta \), is constructed as follows: For \(w \in {{\,\mathrm{int}\,}}{(0^+ {C})}^{\circ }\), the set

$$\begin{aligned} M := 0^+ {C} \cap \{x \in {\mathbb {R}}^n \mid w^{{\mathsf {T}}}x = -(1+\delta )\} \end{aligned}$$
(13)

is a compact basis of \(0^+ {C}\), i.e., \(0^+ {C} = {{\,\mathrm{cone}\,}}M\). We use a cutting scheme to compute a polyhedral \(\delta \)-approximation \(\overline{{M}}\) of M with \(M \subseteq {{\,\mathrm{int}\,}}\overline{M}\). If in (13) we set \(\left\Vert {w} \right\Vert =1\), then

$$\begin{aligned} K := {{\,\mathrm{cone}\,}}\overline{M} \end{aligned}$$
(14)

is a polyhedral cone with \({\bar{d}}_{{\mathsf {H}}}(K,0^+ {C}) \le \delta \). Next, we need to construct a polyhedron \(P_0\) with recession cone K that contains C. To this end, we compute an H-representation (R, 0) of K and solve (P\(_1\)(r)) for every row r of R, that is, for every normal of the supporting hyperplanes that define K. Note that a solution always exists, because \(r \in {{\,\mathrm{int}\,}}{(0^+ {C})}^{\circ }\) by construction. For a solution \(x_r^*\) to (P\(_1\)(r)), the set

$$\begin{aligned} \{x \in {\mathbb {R}}^n \mid r^{{\mathsf {T}}}x = r^{{\mathsf {T}}}x_r^*\} \end{aligned}$$
(15)

is a hyperplane that supports C in \(x_r^*\). For the initial approximation, we then set

$$\begin{aligned} P_0 = \bigcap _r \{x \in {\mathbb {R}}^n \mid r^{{\mathsf {T}}}x \le r^{{\mathsf {T}}}x_r^*\}. \end{aligned}$$
(16)

Clearly, it holds that \(0^+ {P_0} = K\) and that \(P_0\) has at least one vertex, because K is pointed.

In the second phase of the algorithm, \(P_0\) is refined by successively cutting off vertices until all vertices are within distance at most \(\varepsilon \) from C. This is achieved by iteratively intersecting \(P_0\) with halfspaces that support C in some point of its boundary. To guarantee finiteness of the algorithm, we restrict the computations to a compact subset of \(P_0\) and a compact subset \(\overline{{C}}\) of C, namely

$$\begin{aligned} P_0 \cap \{x \in {\mathbb {R}}^n \mid w^{{\mathsf {T}}}x \ge \min _r w^{{\mathsf {T}}}x_r^* - \varepsilon \} \end{aligned}$$
(17)

and

$$\begin{aligned} \overline{C} = C \cap \{x \in {\mathbb {R}}^n \mid w^{{\mathsf {T}}}x \ge \min _r w^{{\mathsf {T}}}x_r^* - \varepsilon \}, \end{aligned}$$
(18)

where w is the same as in (13) and the \(x_r^*\) are the optimal solutions from (15). A cutting scheme is then applied to compute an outer \(\varepsilon \)-approximation \(\overline{{P}}\) of \(\overline{{C}}\). Finally, an \(\left( \varepsilon , \delta \right) \)-approximation of C is obtained as

$$\begin{aligned} \overline{P} + K. \end{aligned}$$
(19)

In Algorithm 1, we state the aforementioned cutting scheme due to [16], specialized to spectrahedral sets, as it is used in the computation of an \(\left( \varepsilon , \delta \right) \)-approximation.

Algorithm 1 Cutting scheme for the computation of a polyhedral \(\varepsilon \)-approximation of a compact spectrahedron, following [16]

The vectors e and \(e_i\), \(i=1,\dots ,n\), in line 1 denote the vector in \({\mathbb {R}}^n\) with components all equal to one and the ith unit vector, respectively. Since C is compact, it holds that \({{\,\mathrm{int}\,}}{(0^+ {C})}^{\circ } = {\mathbb {R}}^n\). Therefore, Proposition 5.1 implies that optimal solutions \(x_w^*\) in line 1 always exist. Note that \(\kappa \) in line 12 is an upper bound on the Hausdorff distance between P and C due to the following observation. The Hausdorff distance between P and C is attained in a vertex of P, because \(C \subseteq P\). Since the part \(x_v^*\) of an optimal solution of (P\(_2\)(vd)) is an element of the boundary of C, we conclude \(\inf _{x \in C} \left\Vert {x-v} \right\Vert \le \left\Vert {x_v^*-v} \right\Vert = t_v^*\left\Vert {c-v} \right\Vert \) for every \(v \in {{\,\mathrm{vert}\,}}P\). Hence, the algorithm terminates with \(d_{{\mathsf {H}}}\left( {P},{C}\right) \le t_{{\bar{v}}}^*\left\Vert {c-{\bar{v}}} \right\Vert \le \varepsilon \). For the special class of spectrahedral sets, the cutting scheme algorithm terminates after finitely many steps. This is proved in [5,  Theorem 4.38].
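To illustrate the mechanics of such a cutting scheme in isolation, the following self-contained Python sketch applies the cut-the-worst-vertex idea to the unit disk, for which the boundary point reached from a vertex and the corresponding supporting halfspace are available in closed form. It is an illustration only and not the implementation used in this article (which relies on the semidefinite subproblems, GNU Octave and bensolve tools); the halfspace intersections are computed with scipy.

```python
import numpy as np
from scipy.spatial import HalfspaceIntersection

# Toy cutting scheme for C = unit disk in R^2.  Halfspaces are stored in scipy's
# format [a1, a2, b], meaning a1*x1 + a2*x2 + b <= 0.
eps = 0.05
c = np.zeros(2)                                     # interior point of C
halfspaces = np.array([[ 1.0,  0.0, -2.0],          # initial box [-2, 2]^2
                       [-1.0,  0.0, -2.0],
                       [ 0.0,  1.0, -2.0],
                       [ 0.0, -1.0, -2.0]])

while True:
    verts = HalfspaceIntersection(halfspaces, c).intersections
    dists = np.linalg.norm(verts, axis=1) - 1.0     # distance of each vertex to the disk
    kappa = dists.max()                             # upper bound on d_H(P, C)
    if kappa <= eps:
        break
    v = verts[dists.argmax()]                       # vertex to be cut off
    x_star = v / np.linalg.norm(v)                  # boundary point hit when moving from v towards c
    # supporting halfspace of the disk at x_star:  x_star . x <= 1
    halfspaces = np.vstack([halfspaces, [x_star[0], x_star[1], -1.0]])

print(len(halfspaces) - 4, "cuts, final error bound", kappa)
```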

Remark 5.1

As mentioned at the beginning of this section, Algorithm 1 falls into the class of cutting scheme algorithms. In [16], convergence properties for a similar class of algorithms, called Hausdorff schemes, are established. The authors define a Hausdorff scheme as a polyhedral approximation algorithm fulfilling the condition

$$\begin{aligned} d_{{\mathsf {H}}}\left( {P^k},{P^{k+1}}\right) \ge \gamma d_{{\mathsf {H}}}\left( {P^k},{C}\right) \end{aligned}$$

for a positive constant \(\gamma \) in every iteration. Here, \(P^k\) denotes the polyhedral approximation obtained in iteration k. They show that for every \(\varepsilon > 0\) there exists an index \(k_0\), such that for all \(k \ge k_0\)

$$\begin{aligned} d_{{\mathsf {H}}}\left( {P^k},{C}\right) \le (1+\varepsilon )\varGamma (C,n)\frac{1}{k^{1/(n-1)}} \end{aligned}$$

holds for a positive constant \(\varGamma (C,n)\). Note that if in step 8 of Algorithm 1 we were able to choose d such that \(\kappa \) in line 12 was equal to \(d_{{\mathsf {H}}}\left( {P},{C}\right) \), then our algorithm would be a Hausdorff scheme with constant \(\gamma =1\) and the bound would hold.

Remark 5.2

Algorithm 1 uses similar techniques as the supporting hyperplane method introduced in [37] for the maximization of a linear function subject to quasiconvex constraints. The supporting hyperplane method also constructs a sequence of polyhedral outer approximations of a convex body by successively introducing supporting hyperplanes. In order to find the corresponding boundary points, the same geometric idea is employed, i.e., moving from vertices of the current approximation towards an interior point until the boundary is met. However, the algorithms differ in multiple aspects. Firstly, in each iteration we choose the vertex with the largest distance to the set C with respect to the direction d, while in [37] the vertex that realizes the smallest objective function value is chosen. Secondly, we do not assume C to have a continuously differentiable boundary. In particular, if \({\mathcal {A}}(x)\) is a diagonal matrix, then C is a polyhedron. Therefore, our algorithm can handle a larger class of sets. Finally, the supporting hyperplane method approximates the set only in a neighbourhood of the optimal solution to the underlying optimization problem, while we are interested in an approximation of the whole set C.

We are now prepared to present Algorithm 2, an algorithm for the computation of \(\left( \varepsilon , \delta \right) \)-approximations of closed and line-free spectrahedra with nonempty interior.

Algorithm 2 Computation of an \(\left( \varepsilon , \delta \right) \)-approximation of a closed, line-free spectrahedron with nonempty interior

Steps 6 and 13 in Algorithm 1 and 5, 6 and 11 in Algorithm 2 require the computation of a V-representation from an H-representation or vice versa. These problems are known as vertex enumeration and facet enumeration, respectively, and are difficult problems on their own. It is beyond the scope of this paper to discuss these problems in more detail. Therefore, we only point out that there exist toolboxes that are able to perform these tasks numerically, such as bensolve tools [6, 24]. In practice, however, the computations often become infeasible in dimensions three and higher when the number of halfspaces defining the polyhedron is large. It is also known that vertex enumeration for unbounded polyhedra is NP-hard, see [20]. Thus, since vertex enumeration has to be performed in every iteration of Algorithm 1 and for the unbounded polyhedron P in step 11 of Algorithm 2, one cannot expect the algorithms to be computationally efficient.
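For bounded polyhedra, the two conversions can be carried out with standard computational-geometry routines, as the following Python sketch shows on an illustrative unit square; the experiments in this article use bensolve tools instead, which also handle the unbounded case.

```python
import numpy as np
from scipy.spatial import ConvexHull, HalfspaceIntersection

# V-representation -> H-representation (facet enumeration) for a bounded polyhedron.
V = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])   # unit square
hull = ConvexHull(V)
A, b = hull.equations[:, :-1], -hull.equations[:, -1]            # rows satisfy A x <= b

# H-representation -> V-representation (vertex enumeration).  scipy expects rows
# [A | -b], i.e. A x - b <= 0, together with a strictly interior point.
halfspaces = np.hstack([A, -b[:, None]])
verts = HalfspaceIntersection(halfspaces, np.array([0.5, 0.5])).intersections
print(np.round(verts, 6))   # recovers the four vertices of the square
```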

Theorem 5.1

As inputs for Algorithm 2, let \(\varepsilon , \delta > 0\) and let the spectrahedron C defined by the matrix \({\mathcal {A}}(x)\) be closed, convex, line-free, and have nonempty interior. Then Algorithm 2 works correctly, i.e., if it terminates, it returns an \(\left( \varepsilon , \delta \right) \)-approximation of C according to Definition 4.2.

Proof

Since C is closed and does not contain any lines, its recession cone is also closed and pointed. This implies that \({(0^+ {C})}^{\circ }\) has nonempty interior, see, e.g., [1, p. 53]. The direction w defined in line 1 is an element of \({(0^+ {C})}^{\circ }\) according to (4). Note that \(w \ne 0\), because \(0^+ {C} \ne \left\{ {0} \right\} \) and the pointedness of \(0^+ {C}\) implies that the matrices \(A_1,\dots ,A_n\) are linearly independent [27, Lemma 3.2.9]. To see that w indeed lies in the interior of \({(0^+ {C})}^{\circ }\), observe that for every \(x \in 0^+ {C}\setminus \{0\}\) it holds that

$$\begin{aligned} w^{{\mathsf {T}}}x = -\sum _{i=1}^n x_i (A_i \cdot I) = - \left( \sum _{i=1}^n x_iA_i\right) \cdot I = - {\bar{{\mathcal {A}}}}(x) \cdot I < 0. \end{aligned}$$

The last inequality holds, because at least one eigenvalue of \({\bar{{\mathcal {A}}}}(x)\) is positive. The set M defined in line 2 is compact, because \(w \in {{\,\mathrm{int}\,}}{(0^+ {C})}^{\circ }\). Note that M is not full-dimensional; however, treating its affine hull as the ambient space, M is a valid input for Algorithm 1 in line 3. By enlarging \(\overline{{M}}\) in line 4, it remains polyhedral as the Minkowski sum of polyhedra. The cone K is then polyhedral and it satisfies \(0^+ {C}\setminus \{0\} \subseteq {{\,\mathrm{int}\,}}K\) and \({\bar{d}}_{{\mathsf {H}}}(K,0^+ {C}) \le \delta \). The first assertion is immediate from the observation that \(0^+ {C} = {{\,\mathrm{cone}\,}}M\) and \(M \subseteq {{\,\mathrm{int}\,}}\overline{M}\). Secondly, it is true that \(\left\Vert {x} \right\Vert \ge 1+\delta \) for every x satisfying \(w^{{\mathsf {T}}}x = -(1+\delta )\). Therefore, \(\left\Vert {x} \right\Vert \ge 1\) for every \(x \in \overline{M}\) due to the construction of the set. Assume \({\bar{d}}_{{\mathsf {H}}}(K,0^+ {C})\) is attained as \(\left\Vert {k-c} \right\Vert \) and let \(\alpha \) be chosen such that \(\alpha k \in \overline{M}\), in particular \(\alpha \ge 1\). Then we obtain the second claim by the following observation:

Due to polarity for convex cones and the properties of K, the relation \({K}^{\circ } \subseteq {{\,\mathrm{int}\,}}{(0^+ {C})}^{\circ }\) holds. Also, K is pointed, because \(0 \notin \overline{M}\). By Proposition 5.1, optimal solutions \(x_r^*\) in line 7 exist for every r. The set \(\overline{C}\) in line 8 is compact by the same argument as for M. Moreover, it has nonempty interior, because C itself has nonempty interior and it contains the convex hull of the points \(x_r^*\), of which every relative interior point is an interior point of \(\overline{C}\). Therefore, a polyhedral \(\varepsilon \)-approximation \(\overline{P}\) is computed correctly. It remains to show that \(P = \overline{P} + K\) is indeed an \(\left( \varepsilon , \delta \right) \)-approximation of C. The recession cone of P is K since \(\overline{P}\) is compact. Thus, we have \({\bar{d}}_{{\mathsf {H}}}(0^+ {P},0^+ {C}) \le \delta \) and P is line-free. As \({{\,\mathrm{vert}\,}}P \subseteq {{\,\mathrm{vert}\,}}\overline{P}\) and \(\overline{{P}}\) is an \(\varepsilon \)-approximation of \(\overline{{C}}\), it holds that \(e[{{{\,\mathrm{vert}\,}}P},{C}] \le \varepsilon \). In order to complete the proof, we must show that \(C \subseteq P\). To this end, denote by \(C^+\) the set \(C \cap \left\{ {x \in {\mathbb {R}}^n \mid w^{{\mathsf {T}}}x \le \min _r w^{{\mathsf {T}}}x_r^* - \varepsilon } \right\} \), i.e., \(C = \overline{C} \cup C^{+}\). Then for every row r of R in the H-representation of K, \(\sup \left\{ {r^{{\mathsf {T}}}x \mid x \in C^+} \right\} \) is attained by some \({\bar{x}}_r\) with

$$\begin{aligned} w^{{\mathsf {T}}}{\bar{x}}_r = \min _r w^{{\mathsf {T}}}x_r^* - \varepsilon \end{aligned}$$

and it holds that

$$\begin{aligned} C^+ \subseteq \left( \bigcap _r \left\{ {x \in {\mathbb {R}}^n \mid r^{{\mathsf {T}}}x \le r^{{\mathsf {T}}}{\bar{x}}_r} \right\} \right) \cap \left\{ {x \in {\mathbb {R}}^n \mid w^{{\mathsf {T}}}x \le \min _r w^{{\mathsf {T}}}x_r^* - \varepsilon } \right\} . \end{aligned}$$

But since the recession cone of \(\bigcap _r \left\{ {x \in {\mathbb {R}}^n \mid r^{{\mathsf {T}}}x \le r^{{\mathsf {T}}}{\bar{x}}_r} \right\} \) is K and all \({\bar{x}}_r\) lie in the same hyperplane, it is also true that

Altogether, we conclude \(C = \overline{C} \cup C^{+} \subseteq P\). \(\square \)

Corollary 5.1

Algorithm 2 terminates after finitely many steps.

Proof

This is a consequence of the finiteness of Algorithm 1, see [5, Theorem 4.38]. Therefore, the executions of Algorithm 1 in lines 3 and 8 of Algorithm 2 terminate after finitely many steps, which implies that Algorithm 2 itself is finite. \(\square \)

We close this section by illustrating Algorithm 2 with the following two examples.

Example 5.1

Consider the spectrahedron \(C \subseteq {\mathbb {R}}^2\) defined by the matrix inequality

$$\begin{aligned} \begin{pmatrix} x_1 &{} 1 &{} 0 &{} 0 \\ 1 &{} x_2 &{} 0 &{} 0 \\ 0 &{} 0 &{} 1 &{} x_1 \\ 0 &{} 0 &{} x_1 &{} x_2 \end{pmatrix} \succcurlyeq 0. \end{aligned}$$

It is the intersection of the epigraphs of the functions \(x \mapsto 1/x\), restricted to the positive real line, and \(x \mapsto x^2\). We use the solver SDPT3 [33, 34] and the software bensolve tools [6, 24] to solve the semidefinite subproblems and perform vertex and facet enumeration, respectively. The algorithm is implemented in GNU Octave [9]. Figure 3 shows the polyhedral approximations of C at different stages of Algorithm 2 for the tolerances \(\left( \varepsilon , \delta \right) = (0.1, 0.1)\).
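That C is this intersection can be read off blockwise: the matrix is block diagonal, and a symmetric \(2 \times 2\) matrix is positive semidefinite if and only if its diagonal entries and its determinant are nonnegative, i.e.,

$$\begin{aligned} \begin{pmatrix} x_1 &{} 1 \\ 1 &{} x_2 \end{pmatrix} \succcurlyeq 0 \;\Leftrightarrow \; x_1, x_2 \ge 0,\; x_1x_2 \ge 1 \quad \text {and} \quad \begin{pmatrix} 1 &{} x_1 \\ x_1 &{} x_2 \end{pmatrix} \succcurlyeq 0 \;\Leftrightarrow \; x_2 \ge x_1^2. \end{aligned}$$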

Fig. 3

Computed polyhedra at different steps of Algorithm 2 for the error tolerances \(\left( \varepsilon , \delta \right) = (0.1,0.1)\). Figures 3a and 3c-3f are scaled vertically by a factor of 0.077, Figure 3b by a factor of 0.054 for visibility

Computational results for different values of \(\varepsilon \) and \(\delta \) are presented in Table 1. It can be seen that the number of subproblems that have to be solved is larger than the number of vertices the polyhedral approximation has. The reason is that one instance of (P\(_2\)(vd)) is solved for every vertex of the current approximation in every iteration of Algorithm 1, but only one of these vertices is cut off. Moreover, the number of solved subproblems grows quickly as \(\varepsilon \) decreases, because more iterations of Algorithm 1 are needed to reach the desired accuracy and the number of solved subproblems grows with every iteration. Since the recession cone of C is just a ray and easy to approximate, most of the computational effort is put into approximating \(\overline{C}\) in line 9. However, for fixed \(\varepsilon \) and decreasing \(\delta \) the number of solved subproblems grows. This is due to the fact that \(\overline{C}\) depends on the approximate recession cone K. As \(\delta \) decreases the rays generating K will be closer to each other with respect to the truncated Hausdorff distance. Therefore, the set \(\overline{C}\) will have a larger area and it takes more iterations to compute an \(\varepsilon \)-approximation of it. Note that for \(\left( \varepsilon , \delta \right) \) equal to (0.3, 0.2), (0.5, 0.15) or (0.5, 0.2) the same number of subproblems are solved and the approximations have the same number of vertices. For the tolerances (0.3, 0.2) and (0.5, 0.2), the values are identical, because during the approximation of \(\overline{C}\) the approximation error in Algorithm 1 changes from a value larger than 0.5 to a value smaller than 0.3 in one iteration. Therefore, the resulting \(\left( \varepsilon , \delta \right) \)-approximations are identical. For \(\left( \varepsilon , \delta \right) = (0.5,0.15)\), the approximation is different and it is a coincidence that the values coincide.

Example 5.2

Algorithm 2 can also be used to compute polyhedral approximations of closed and pointed convex cones. Consider for example the positive semidefinite cone of \(2 \times 2\) matrices

$$\begin{aligned} S = \left\{ x \in {\mathbb {R}}^3 \Big \vert \begin{pmatrix} x_1 &{} x_3 \\ x_3 &{} x_2 \end{pmatrix} \succcurlyeq 0 \right\} . \end{aligned}$$

It is a closed and pointed convex cone with nonempty interior. Thus, we can apply Algorithm 2 to it. Since S is a cone, its only vertex is the origin and we can terminate the algorithm after K has been computed in line 5. Then K is a polyhedral cone and it holds \({\bar{d}}_{{\mathsf {H}}}(K,S) \le \delta \). Figure 4 shows a polyhedral approximation of S with 20 extreme rays and \({\bar{d}}_{{\mathsf {H}}}(K,S) \le 0.1\).
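In coordinates, membership in S is again characterized by the diagonal entries and the determinant of the matrix:

$$\begin{aligned} x \in S \;\Leftrightarrow \; x_1 \ge 0, \quad x_2 \ge 0, \quad x_1x_2 \ge x_3^2. \end{aligned}$$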

Table 1 Computational results for Example 5.1 and different values of \(\varepsilon \) and \(\delta \). Every cell shows the number of solved semidefinite subproblems, the number of vertices of the polyhedral approximation, as well as the elapsed CPU time
Fig. 4

A polyhedral approximation of the cone of \(2 \times 2\) positive semidefinite matrices obtained by Algorithm 2 for \(\delta = 0.1\), see Example 5.2

6 Conclusion

We have introduced the notion of \(\left( \varepsilon , \delta \right) \)-approximations for the polyhedral approximation of unbounded convex sets. Since polyhedral approximation in the Hausdorff distance can only be achieved for unbounded sets under restrictive assumptions, \(\left( \varepsilon , \delta \right) \)-approximations are of particular interest, because they allow the treatment of a larger class of sets. An important observation is that the recession cones of the involved sets must play a crucial role in a meaningful concept of approximation for unbounded sets. We have shown that \(\left( \varepsilon , \delta \right) \)-approximations define a suitable notion of approximation in the sense that a sequence of such approximations converges and that \(\left( \varepsilon , \delta \right) \)-approximations generalize the polyhedral approximation of compact sets with respect to the Hausdorff distance. Finally, we have presented an algorithm that allows for the computation of \(\left( \varepsilon , \delta \right) \)-approximations of spectrahedra and have shown that the algorithm is finite.