Abstract
We construct a general framework for deriving error bounds for conic feasibility problems. In particular, our approach allows one to work with cones that fail to be amenable or even to have computable projections, two previously challenging barriers. To this end, we first show how error bounds may be constructed using objects called one-step facial residual functions. Then, we develop several tools to compute these facial residual functions even in the absence of closed-form expressions for the projections onto the cones. We demonstrate the use and power of our results by computing tight error bounds for the exponential cone feasibility problem. Interestingly, we discover a natural example for which the tightest error bound is related to the Boltzmann–Shannon entropy. We also produce an example of sets for which a Hölderian error bound holds but the supremum of the set of admissible exponents is not itself an admissible exponent.
1 Introduction
Our main object of interest is the following convex conic feasibility problem:
where \(\mathcal {L}\) is a subspace contained in some finite-dimensional real Euclidean space \(\mathcal {E}\), \({{\textbf {a}}}\in \mathcal {E}\) and \( \mathcal {K}\subseteq \mathcal {E}\) is a closed convex cone. For a discussion of some applications and algorithms for (Feas) see [22]. See also [5] for a broader analysis of convex feasibility problems. We also recall that a conic linear program (CLP) is the problem of minimizing a linear function subject to a constraint of the form described in (Feas). In addition, when the optimal set of a CLP is non-empty it can be written as the intersection of a cone with an affine set. This provides yet another motivation for analyzing (Feas): to better understand feasible regions and optimal sets of conic linear programs. Here, our main interest is in obtaining error bounds for (Feas). That is, assuming \((\mathcal {L}+{{\textbf {a}}})\cap \mathcal {K}\ne \emptyset \), we want an inequality that, given some arbitrary \({{\textbf {x}}}\in \mathcal {E}\), relates the individual distances \(\text {d}({{\textbf {x}}}, \mathcal {L}+{{\textbf {a}}}), \text {d}({{\textbf {x}}}, \mathcal {K})\) to the distance to the intersection \(\text {d}({{\textbf {x}}}, (\mathcal {L}+{{\textbf {a}}})\cap \mathcal {K})\). Considering that \(\mathcal {E}\) is equipped with some norm \(\Vert \cdot \Vert \) induced by some inner product \(\langle \cdot , \cdot \rangle \), we recall that the distance function to a convex set C is defined as follows:
When \( \mathcal {K}\) is a polyhedral cone, the classical Hoffman’s error bound [23] gives a relatively complete picture of the way that the individual distances relate to the distance to the intersection. If \( \mathcal {K}\) is not polyhedral, but \(\mathcal {L}+{{\textbf {a}}}\) intersects \( \mathcal {K}\) in a sufficiently well-behaved fashion (say, for example, when \(\mathcal {L}+{{\textbf {a}}}\) intersects \(\text {ri}\, \mathcal {K}\), the relative interior of \( \mathcal {K}\); see Proposition 2.2), we may still expect “good” error bounds to hold, e.g., [6, Corollary 3]. However, checking whether \(\mathcal {L}+{{\textbf {a}}}\) intersects \(\text {ri}\, \mathcal {K}\) is not necessarily a trivial task; and, in general, \((\mathcal {L}+{{\textbf {a}}})\cap \text {ri}\, \mathcal {K}\) can be void.
Here, we focus on error bound results that do not require any assumption on the way that the affine space \(\mathcal {L}+{{\textbf {a}}}\) intersects \( \mathcal {K}\). So, for example, we want results that are valid even if, say, \(\mathcal {L}+{{\textbf {a}}}\) fails to intersect the relative interior of \( \mathcal {K}\). Inspired by Sturm’s pioneering work on error bounds for positive semidefinite systems [50], the class of amenable cones was proposed in [34] and it was shown that the following three ingredients can be used to obtain general error bounds for (Feas): (i) amenable cones, (ii) facial reduction [13, 45, 52] and (iii) the so-called facial residual functions (FRFs) [34, Definition 16].
In this paper, we will show that, in fact, it is possible to obtain error bounds for (Feas) by using the so-called one-step facial residual functions directly in combination with facial reduction. It is fair to say that computing the facial residual functions is the most critical step in obtaining error bounds for (Feas). We will demonstrate techniques that are readily adaptable for the purpose.
All the techniques discussed here will be showcased with error bounds for the so-called exponential cone, which is defined as follows:
Put succinctly, the exponential cone is the closure of the epigraph of the perspective function of \(z=e^x\). It is quite useful in entropy optimization, see [15]. Furthermore, it is also implemented in the MOSEK package, see [17, 37, Chapter 5], and the many modelling examples in Sect. 5.4 therein. There are several other solvers that either support the exponential cone or convex sets closely related to it [16, 25, 39, 41]. See also [20] for an algorithm for projecting onto the exponential cone. So convex optimization with exponential cones is widely available even if, as of this writing, it is not as widespread as, say, semidefinite programming.
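As an illustration (ours, not part of the paper), the closure can be written explicitly as \(K_{\exp } = \{(x,y,z) : y>0,\ z \ge y e^{x/y}\} \cup \{(x,0,z) : x \le 0,\ z \ge 0\}\), which yields a direct membership test; the function name and tolerance below are our own:

```python
import math

def in_exp_cone(x, y, z, tol=1e-12):
    """Membership test for K_exp = cl{(x, y, z) : y > 0, z >= y * exp(x / y)}.

    Taking the closure adds the face {(x, 0, z) : x <= 0, z >= 0}.
    """
    if y > 0:
        return z >= y * math.exp(x / y) - tol
    if y == 0:
        # boundary face picked up by the closure
        return x <= tol and z >= -tol
    return False  # no point with y < 0 belongs to the cone
```

For instance, \((0,1,1)\) lies on the boundary (since \(1 = 1\cdot e^{0}\)), while \((1,1,1)\) does not belong to the cone because \(e^{1} > 1\).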
The exponential cone \(K_{\exp }\) appears, at a glance, to be simple. However, it possesses a very intricate geometric structure that illustrates a wide range of challenges practitioners may face in computing error bounds. First of all, being non-facially-exposed, it is not amenable, so the theory developed in [34] does not directly apply to it. Another difficulty is that the projection onto \(K_{\exp }\) is only implicitly specified, and few analytical tools have been developed to deal with it (in contrast with, for example, the projection operator onto PSD cones). Until now, these issues have made it challenging to establish error bounds for objects like \(K_{\exp }\), many of which are of growing interest in the mathematical programming community.
Our research is at the intersection of two topics: error bounds and the facial structure of cones. General information on the former can be found, for example, in [26, 40]. Classically, there seems to be a focus on so-called Hölderian error bounds (see also [27,28,29]), but we will see in this paper that non-Hölderian behaviour can still appear even in relatively natural settings such as conic feasibility problems associated to the exponential cone.
Facts on the facial structure of convex cones can be found, for example, in [3, 4, 42]. We recall that a cone is said to be facially exposed if each face arises as the intersection of the whole cone with some supporting hyperplane. Stronger forms of facial exposedness have also been studied to some extent; examples include projectional exposedness [13, 51], niceness [44, 46], tangential exposedness [47] and amenability [34]. See also [36] for a comparison between a few different types of facial exposedness. These notions are useful in many topics, e.g., regularization of convex programs and extended duals [13, 32, 45], studying the closure of certain linear images [32, 43], lifts of convex sets [21] and error bounds [34]. However, as can be seen in Fig. 1, the exponential cone is not even a facially exposed cone, so none of the aforementioned notions apply (in particular, the face \( \mathcal {F}_{ne}:=\{(0,0,z)\;|\; z \ge 0\}\) is not exposed). This was one of the motivations for looking beyond facial exposedness and developing a framework for deriving error bounds for feasibility problems associated to general closed convex cones.
1.1 Outline and results
The goal of this paper is to build a robust framework that may be used to obtain error bounds for previously inaccessible cones, and to demonstrate the use of this framework by applying it to fully describe error bounds for (Feas) with \( \mathcal {K}= K_{\exp }\).
In Sect. 2, we recall preliminaries. New contributions begin in Sect. 3. We first recall some rules for chains of faces and the diamond composition. Then we show how error bounds may be constructed using objects known as one-step facial residual functions. In Sect. 3.1, we build our general framework for constructing one-step facial residual functions. Our key result, Theorem 3.10, obviates the need to compute the projection onto the cone explicitly. Instead, we make use of a parametrization of the boundary of the cone and of projections onto the proper faces of the cone: thus, our approach is advantageous when these projections are easier to analyze than the projection onto the whole cone itself. We emphasize that all of the results of Sect. 3 are applicable to a general closed convex cone.
In Sect. 4, we use our new framework to fully describe error bounds for (Feas) with \( \mathcal {K}= K_{\exp }\). This was previously a problem lacking a clear strategy, because all projections onto \(K_{\exp }\) are implicitly specified. However, having obviated the need to project onto \(K_{\exp }\), we successfully obtain all the necessary FRFs, partly because it is easier to project onto the proper faces of \(K_{\exp }\) than to project onto \(K_{\exp }\) itself. Surprisingly, we discover that different collections of faces and exposing hyperplanes admit very different FRFs. In Sect. 4.2.1, we show that for the unique 2-dimensional face, any exponent in \(\left( 0,1\right) \) may be used to build a valid FRF, while the supremum over all the admissible exponents cannot be. Furthermore, a better FRF for the 2D face can be obtained if we go beyond Hölderian error bounds and consider a so-called entropic error bound which uses a modified Boltzmann–Shannon entropy function, see Theorem 4.2. The curious discoveries continue; for infinitely many 1-dimensional faces, the FRF, and the final error bound, feature exponent 1/2. For the final outstanding 1-dimensional exposed face, the FRF, and the final error bound, are Lipschitzian for all exposing hyperplanes except exactly one, for which no exponent will suffice. However, for this exceptional case, our framework still successfully finds an FRF, which is logarithmic in character (Corollary 4.11). Consequently, the system consisting of \(\{(0,0,1)\}^\perp \) and \(K_{\exp }\) possesses a kind of “logarithmic error bound” (see Example 4.20) instead of a Hölderian error bound. In Theorems 4.13 and 4.17, we give explicit error bounds by using our FRFs and the suite of tools we developed in Sect. 3. We also show that the error bound given in Theorem 4.13 is tight, see Remark 4.14.
These findings about the exponential cone are surprising, since we are not aware of other objects having this litany of odd behaviour hidden in their structure all at once. One possible reason these phenomena were not previously reported may be the sheer absence of tools for obtaining error bounds for general cones. In this sense, we believe that the machinery developed in Sect. 3 might be a reasonable first step towards filling this gap. In Sect. 4.4, we document additional odd consequences and connections to other concepts, with particular relevance to the Kurdyka–Łojasiewicz (KL) property [1, 2, 8,9,10, 30]. In particular, we exhibit two sets satisfying a Hölderian error bound for every \(\gamma \in \left( 0,1\right) \) such that the supremum of allowable exponents is not itself allowable. Consequently, one obtains a function with the KL property with exponent \(\alpha \) for any \(\alpha \in \left( 1/2,1\right) \) at the origin, but not for \(\alpha = 1/2\). We conclude in Sect. 5.
2 Preliminaries
We recall that \(\mathcal {E}\) denotes an arbitrary finite-dimensional real Euclidean space. We will adopt the following convention: vectors will be boldfaced while scalars will use normal typeface. For example, if \({{\textbf {p}}}\in \mathbb {R}^3\), we write \({{\textbf {p}}}= (p_x,p_y,p_z)\), where \(p_x,p_y,p_z \in \mathbb {R}\).
We denote by \(B(\eta )\) the closed ball of radius \(\eta \) centered at the origin, i.e., \(B(\eta ) = \{{{\textbf {x}}}\in \mathcal {E}\mid \Vert {{\textbf {x}}}\Vert \le \eta \}\). Let \(C\subseteq \mathcal {E}\) be a convex set. We denote the relative interior and the linear span of C by \(\text {ri}\,C\) and \(\text {span}\,C\), respectively. We also denote the boundary of C by \(\partial C\), and \(\text {cl}\, C\) is the closure of C. We denote the projection operator onto C by \(P_C\), so that \(P_C({{\textbf {x}}}) = \text {argmin}_{{{\textbf {y}}}\in C} \Vert {{\textbf {x}}}-{{\textbf {y}}}\Vert \). Given closed convex sets \(C_1,C_2 \subseteq \mathcal {E}\), we note the following properties of the projection operator:
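As a concrete illustration of \(P_C\) and \(\text {d}(\cdot ,C)\) (ours, not taken from the paper), consider the closed ball \(B(\eta )\), whose projection has a closed form; the helper names below are our own:

```python
import math

def proj_ball(x, eta):
    """Project a point x (a tuple) onto the closed ball B(eta) centered at 0:
    P_B(x) = x if ||x|| <= eta, otherwise eta * x / ||x||."""
    nrm = math.hypot(*x)
    if nrm <= eta:
        return tuple(x)
    return tuple(eta * xi / nrm for xi in x)

def dist(x, proj, *args):
    """d(x, C) = ||x - P_C(x)|| for a closed convex set C with projection `proj`."""
    p = proj(x, *args)
    return math.hypot(*(xi - pi for xi, pi in zip(x, p)))
```

For example, projecting \((3,4)\) onto \(B(1)\) gives \((0.6, 0.8)\), and the distance is \(\Vert (3,4)\Vert - 1 = 4\).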
2.1 Cones and their faces
Let \( \mathcal {K}\) be a closed convex cone. We say that \( \mathcal {K}\) is pointed if \( \mathcal {K}\cap - \mathcal {K}= \{{\textbf {0}}\}\). The dimension of \( \mathcal {K}\) is denoted by \(\dim ( \mathcal {K})\) and is the dimension of the linear subspace spanned by \( \mathcal {K}\). A face of \( \mathcal {K}\) is a closed convex cone \( \mathcal {F}\) satisfying \( \mathcal {F}\subseteq \mathcal {K}\) and the following property
In this case, we write \( \mathcal {F}\mathrel {\unlhd } \mathcal {K}\). We say that \( \mathcal {F}\) is proper if \( \mathcal {F}\ne \mathcal {K}\). A face is said to be nontrivial if \( \mathcal {F}\ne \mathcal {K}\) and \( \mathcal {F}\ne \mathcal {K}\cap - \mathcal {K}\). In particular, if \( \mathcal {K}\) is pointed (as is the case of the exponential cone), a nontrivial face is neither \( \mathcal {K}\) nor \(\{{\textbf {0}}\}\). Next, let \( \mathcal {K}^*\) denote the dual cone of \( \mathcal {K}\), i.e., \( \mathcal {K}^* = \{{{\textbf {z}}}\in \mathcal {E}\mid \langle {{\textbf {x}}} , {{\textbf {z}}} \rangle \ge 0, \forall {{\textbf {x}}}\in \mathcal {K}\}\). We say that \( \mathcal {F}\) is an exposed face if there exists \({{\textbf {z}}}\in \mathcal {K}^*\) such that \( \mathcal {F}= \mathcal {K}\cap \{{{\textbf {z}}}\}^\perp \).
A chain of faces of \( \mathcal {K}\) is a sequence of faces of \( \mathcal {K}\) satisfying \( \mathcal {F}_\ell \subsetneq \cdots \subsetneq \mathcal {F}_{1}\), where all inclusions are proper. The length of the chain is defined to be \(\ell \). With that, we define the distance to polyhedrality of \( \mathcal {K}\) as the length minus one of the longest chain of faces of \( \mathcal {K}\) such that \( \mathcal {F}_{\ell }\) is polyhedral and \( \mathcal {F}_{i}\) is not polyhedral for \(i < \ell \); see [35, Sect. 5.1]. We denote the distance to polyhedrality by \(\ell _{\text {poly}}( \mathcal {K})\).
2.2 Lipschitzian and Hölderian error bounds
In this subsection, suppose that \(C_1,\ldots , C_{\ell } \subseteq \mathcal {E}\) are convex sets with nonempty intersection. We recall the following definitions.
Definition 2.1
(Hölderian and Lipschitzian error bounds) The sets \(C_1,\ldots , C_\ell \) are said to satisfy a Hölderian error bound if for every bounded set \(B \subseteq \mathcal {E}\) there exist some \(\kappa _B > 0\) and an exponent \(\gamma _B\in (0, 1]\) such that
If we can take the same \(\gamma _B = \gamma \in (0,1]\) for all B, then we say that the bound is uniform. If the bound is uniform with \(\gamma = 1\), we call it a Lipschitzian error bound.
We note that the concepts in Definition 2.1 also have different names throughout the literature. When \(C_1,\ldots , C_\ell \) satisfy a Hölderian error bound it is said that they satisfy bounded Hölder regularity, e.g., see [11, Definition 2.2]. When a Lipschitzian error bound holds, \(C_1,\ldots , C_\ell \) are said to satisfy bounded linear regularity, see [5, Sect. 5] or [6]. Bounded linear regularity is also closely related to the notion of subtransversality [24, Definition 7.5].
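To make the role of the exponent \(\gamma _B\) concrete, here is a small numerical sanity check on a classical pair of sets not taken from the paper: \(C_1\) the \(x\)-axis and \(C_2\) the epigraph of \(x^2\) in \(\mathbb {R}^2\), whose intersection is the origin. For \({{\textbf {p}}}=(t,0)\) with small t we have \(\text {d}({{\textbf {p}}},C_1)=0\) and \(\text {d}({{\textbf {p}}},C_2)\approx t^2\), yet \(\text {d}({{\textbf {p}}},C_1\cap C_2)=|t|\), so the best uniform Hölder exponent here is \(1/2\). The sketch below (function names are ours) checks this:

```python
import math

# C1 = {(x, y) : y = 0}, C2 = {(x, y) : y >= x^2}, C1 ∩ C2 = {(0, 0)}.
def dist_to_epi_sq(t):
    """Distance from (t, 0) to {(x, y) : y >= x^2}, computed via Newton's
    method on the stationarity condition 2(x - t) + 4x^3 = 0 for the
    nearest boundary point (t, 0) projects onto."""
    x = t
    for _ in range(50):
        x -= (2 * (x - t) + 4 * x**3) / (2 + 12 * x**2)
    return math.hypot(x - t, x * x)

for t in (0.1, 0.01, 0.001):
    # d(p, intersection) / d(p, C2)^(1/2) stays bounded as t -> 0,
    # whereas any exponent gamma > 1/2 would make the ratio blow up.
    print(round(t / math.sqrt(dist_to_epi_sq(t)), 3))
```

The printed ratios stay close to 1, consistent with a Hölderian bound of exponent \(1/2\) on any bounded set containing the origin.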
Hölderian and Lipschitzian error bounds will appear frequently in our results, but we also encounter non-Hölderian bounds as in Theorems 4.2 and 4.10. Next, we recall the following result which ensures a Lipschitzian error bound holds between families of convex sets when a constraint qualification is satisfied.
Proposition 2.2
[6, Corollary 3] Let \(C_1,\ldots , C_{\ell } \subseteq \mathcal {E}\) be convex sets such that \(C_{1},\ldots , C_{k}\) are polyhedral. If
then for every bounded set B there exists \(\kappa _B>0\) such that
In view of (Feas), we say that Slater’s condition is satisfied if \((\mathcal {L}+ {{\textbf {a}}}) \cap \text {ri}\, \mathcal {K}\ne \emptyset \). If \( \mathcal {K}\) can be written as \( \mathcal {K}^1 \times \mathcal {K}^2\subseteq \mathcal {E}^1 \times \mathcal {E}^2\), where \(\mathcal {E}^1\) and \(\mathcal {E}^2\) are real Euclidean spaces and \( \mathcal {K}^1\subseteq \mathcal {E}^1\) is polyhedral, we say that the partial polyhedral Slater’s (PPS) condition is satisfied if
Adding a dummy coordinate, if necessary, we can see Slater’s condition as a particular case of the PPS condition. By convention, we consider that the PPS condition is satisfied for (Feas) if one of the following is satisfied: 1) \(\mathcal {L}+{{\textbf {a}}}\) intersects \(\text {ri}\, \mathcal {K}\); 2) \((\mathcal {L}+{{\textbf {a}}})\cap \mathcal {K}\ne \emptyset \) and \( \mathcal {K}\) is polyhedral; or 3) \( \mathcal {K}\) can be written as a direct product \( \mathcal {K}^1 \times \mathcal {K}^2\) where \( \mathcal {K}^1\) is polyhedral and (2.3) is satisfied.
Noting that \((\mathcal {L}+ {{\textbf {a}}}) \cap ( \mathcal {K}^1 \times (\text {ri}\, \mathcal {K}^2) ) = (\mathcal {L}+ {{\textbf {a}}})\cap ( \mathcal {K}^1 \times \mathcal {E}^2) \cap (\mathcal {E}^1 \times (\text {ri}\, \mathcal {K}^2) )\), we deduce the following result from Proposition 2.2.
Proposition 2.3
(Error bound under PPS condition) Suppose that (Feas) satisfies the partial polyhedral Slater’s condition. Then, for every bounded set B there exists \(\kappa _B>0\) such that
We recall that for \(a,b \in \mathbb {R}_+\) we have \(a+b \le 2\max \{a,b\} \le 2(a+b)\), so Propositions 2.2 and 2.3 can also be equivalently stated in terms of sums of distances.
3 Facial residual functions and error bounds
In this section, we discuss a strategy for obtaining error bounds for the conic linear system (Feas) based on the so-called facial residual functions that were introduced in [34]. In contrast to [34], we will not require that \( \mathcal {K}\) be amenable.
The motivation for our approach is as follows. If it were the case that (Feas) satisfies some constraint qualification, we would have a Lipschitzian error bound per Proposition 2.3, see also [6] for other sufficient conditions. Unfortunately, this does not happen in general. However, as long as (Feas) is feasible, there is always a face of \( \mathcal {K}\) that contains the feasible region of (Feas) and for which a constraint qualification holds. The error bound computation essentially boils down to understanding how to compute the distance to this special face. The first result towards our goal is the following.
Proposition 3.1
(An error bound when a face satisfying a CQ is known) Suppose that (Feas) is feasible and let \( \mathcal {F}\mathrel {\unlhd } \mathcal {K}\) be a face such that
-
(a)
\( \mathcal {F}\) contains \( \mathcal {K}\cap (\mathcal {L}+{{\textbf {a}}})\).
-
(b)
\(\{ \mathcal {F}, \mathcal {L}+{{\textbf {a}}}\}\) satisfies the PPS condition.
Then, for every bounded set B, there exists \(\kappa _B > 0\) such that
Proof
Since \( \mathcal {F}\) is a face of \( \mathcal {K}\), assumption (a) implies \( \mathcal {K}\cap (\mathcal {L}+ {{\textbf {a}}}) = \mathcal {F}\cap (\mathcal {L}+{{\textbf {a}}})\). Then, the result follows from assumption (b) and Proposition 2.3. \(\square \)
From Proposition 3.1 we see that the key to obtaining an error bound for the system (Feas) is to find a face \( \mathcal {F}\mathrel {\unlhd } \mathcal {K}\) satisfying (a) and (b); we must then know how to estimate the quantity \(\text {d}({{\textbf {x}}}, \mathcal {F})\) from the available information \(\text {d}({{\textbf {x}}}, \mathcal {K})\) and \(\text {d}({{\textbf {x}}}, \mathcal {L}+{{\textbf {a}}})\).
This is where we will make use of facial reduction and facial residual functions. The former will help us find \( \mathcal {F}\) and the latter will be instrumental in upper bounding \(\text {d}({{\textbf {x}}}, \mathcal {F})\). First, we recall below a result that follows from the analysis of the FRA-poly facial reduction algorithm developed in [35].
Proposition 3.2
[34, Proposition 5] Let \( \mathcal {K}= \mathcal {K}^1\times \cdots \times \mathcal {K}^s\), where each \( \mathcal {K}^i\) is a closed convex cone. Suppose (Feas) is feasible. Then there is a chain of faces
of length \(\ell \) and vectors \(\{{{\textbf {z}}}_1,\ldots , {{\textbf {z}}}_{\ell -1}\}\) satisfying the following properties.
-
(i)
\(\ell -1\le \sum _{i=1}^{s} \ell _{\text {poly}}( \mathcal {K}^i) \le \dim { \mathcal {K}}\).
-
(ii)
For all \(i \in \{1,\ldots , \ell -1\}\), we have
$$\begin{aligned} {{\textbf {z}}}_i \in \mathcal {F}_i^* \cap \mathcal {L}^\perp \cap \{{{\textbf {a}}}\}^\perp \quad \text {and}\quad \mathcal {F}_{i+1} = \mathcal {F}_{i} \cap \{{{\textbf {z}}}_i\}^\perp . \end{aligned}$$ -
(iii)
\( \mathcal {F}_{\ell } \cap (\mathcal {L}+{{\textbf {a}}}) = \mathcal {K}\cap (\mathcal {L}+ {{\textbf {a}}})\) and \(\{ \mathcal {F}_{\ell },\mathcal {L}+{{\textbf {a}}}\}\) satisfies the PPS condition.
In view of Proposition 3.2, we define the distance to the PPS condition \(d_{\text {PPS}}( \mathcal {K},\mathcal {L}+{{\textbf {a}}})\) as the length minus one of the shortest chain of faces (as in (3.1)) satisfying items (ii) and (iii) in Proposition 3.2. For example, if (Feas) satisfies the PPS condition, we have \(d_{\text {PPS}}( \mathcal {K},\mathcal {L}+{{\textbf {a}}}) = 0\).
Next, we recall the definition of facial residual functions from [34, Definition 16].
Definition 3.3
(Facial residual function) Let \( \mathcal {K}\) be a closed convex cone, \( \mathcal {F}\mathrel {\unlhd } \mathcal {K}\) be a face, and let \({{\textbf {z}}}\in \mathcal {F}^*\). Suppose that \(\psi _{ \mathcal {F},{{\textbf {z}}}} : \mathbb {R}_+\times \mathbb {R}_+\rightarrow \mathbb {R}_+\) satisfies the following properties:
-
(i)
\(\psi _{ \mathcal {F},{{\textbf {z}}}}\) is nonnegative, monotone nondecreasing in each argument and \(\psi _{ \mathcal {F},{{\textbf {z}}}}(0,t) = 0\) for every \(t \in \mathbb {R}_+\).
-
(ii)
The following implication holds for any \({{\textbf {x}}}\in \text {span}\, \mathcal {K}\) and any \(\epsilon \ge 0\):
$$\begin{aligned} \text {d}({{\textbf {x}}}, \mathcal {K}) \le \epsilon , \quad \langle {{\textbf {x}}} , {{\textbf {z}}} \rangle \le \epsilon , \quad \text {d}({{\textbf {x}}}, \text {span}\, \mathcal {F}) \le \epsilon \quad \Rightarrow \quad \text {d}({{\textbf {x}}}, \mathcal {F}\cap \{{{\textbf {z}}}\}^{\perp }) \le \psi _{ \mathcal {F},{{\textbf {z}}}} (\epsilon , \Vert {{\textbf {x}}}\Vert ). \end{aligned}$$
Then, \(\psi _{ \mathcal {F},{{\textbf {z}}}}\) is said to be a facial residual function for \( \mathcal {F}\) and \({{\textbf {z}}}\) with respect to \( \mathcal {K}\).
Definition 3.3, in its most general form, represents “two steps” along the facial structure of a cone: we have a cone \( \mathcal {K}\), a face \( \mathcal {F}\) (which could be different from \( \mathcal {K}\)) and a third face defined by \( \mathcal {F}\cap \{{{\textbf {z}}}\}^\perp \). In this work, however, we will be focused on the following special case of Definition 3.3.
Definition 3.4
(One-step facial residual function (\(\mathbb {1}\)-FRF)) Let \( \mathcal {K}\) be a closed convex cone and \({{\textbf {z}}}\in \mathcal {K}^*\). A function \(\psi _{ \mathcal {K},{{\textbf {z}}}}:\mathbb {R}_+\times \mathbb {R}_+\rightarrow \mathbb {R}_+\) is called a one-step facial residual function (\(\mathbb {1}\)-FRF) for \( \mathcal {K}\) and \({{\textbf {z}}}\) if it is a facial residual function of \( \mathcal {K}\) and \({{\textbf {z}}}\) with respect to \( \mathcal {K}\). That is, \(\psi _{ \mathcal {K},{{\textbf {z}}}}\) satisfies item (i) of Definition 3.3 and for every \({{\textbf {x}}}\in \text {span}\, \mathcal {K}\) and any \(\epsilon \ge 0\):
Remark 3.5
(Concerning the implication in Definition 3.4) In view of the monotonicity of \(\psi _{ \mathcal {K},{{\textbf {z}}}}\), the implication in Definition 3.4 can be equivalently and more succinctly written as
The unfolded form presented in Definition 3.4 is more convenient in our discussions and analysis below.
Facial residual functions always exist (see [34, Sect. 3.2] for the case of pointed cones, although the argument holds in general), but their computation is often nontrivial. Next, we review a few examples.
Example 3.6
(Examples of facial residual functions) If \( \mathcal {K}\) is a symmetric cone (i.e., a self-dual homogeneous cone, see [18, 19]), then given \( \mathcal {F}\mathrel {\unlhd } \mathcal {K}\) and \({{\textbf {z}}}\in \mathcal {F}^*\), there exists a \(\kappa > 0\) such that \(\psi _{ \mathcal {F},{{\textbf {z}}}}(\epsilon ,t) {:}{=}\kappa \epsilon + \kappa \sqrt{\epsilon t}\) is a one-step facial residual function for \( \mathcal {F}\) and \({{\textbf {z}}}\), see [34, Theorem 35].
If \( \mathcal {K}\) is a polyhedral cone, the function \(\psi _{ \mathcal {F},{{\textbf {z}}}}(\epsilon ,t) {:}{=}\kappa \epsilon \) can be taken instead, with no dependency on t, see [34, Proposition 18].
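The two templates in Example 3.6 are easy to spot-check against item (i) of Definition 3.3 (nonnegativity, vanishing at \(\epsilon = 0\), and monotonicity in each argument). The following sketch, with our own function names and \(\kappa = 1\), does so on a small grid:

```python
import math
from itertools import product

def frf_symmetric(eps, t, kappa=1.0):
    """1-FRF template for symmetric cones [34, Theorem 35]: kappa*eps + kappa*sqrt(eps*t)."""
    return kappa * eps + kappa * math.sqrt(eps * t)

def frf_polyhedral(eps, t, kappa=1.0):
    """1-FRF template for polyhedral cones [34, Proposition 18]: kappa*eps, no t-dependence."""
    return kappa * eps

# Spot-check the structural properties of Definition 3.3(i) on a grid.
grid = [0.0, 0.5, 1.0, 2.0]
for psi in (frf_symmetric, frf_polyhedral):
    assert all(psi(e, t) >= 0 for e, t in product(grid, grid))      # nonnegative
    assert all(psi(0.0, t) == 0 for t in grid)                      # psi(0, t) = 0
    assert all(psi(e1, t) <= psi(e2, t)                             # nondecreasing in eps
               for t in grid for e1, e2 in zip(grid, grid[1:]))
    assert all(psi(e, t1) <= psi(e, t2)                             # nondecreasing in t
               for e in grid for t1, t2 in zip(grid, grid[1:]))
```

Note the \(\sqrt{\epsilon t}\) term: for fixed bounded t it gives the familiar Hölderian exponent 1/2, while for the polyhedral template the bound is Lipschitzian.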
Moving on, we say that a function \(\tilde{\psi }_{ \mathcal {F},{{\textbf {z}}}}\) is a positively rescaled shift of \(\psi _{ \mathcal {F},{{\textbf {z}}}}\) if there are positive constants \(M_1,M_2,M_3\) and a nonnegative constant \(M_4\) such that
This is a generalization of the notion of positive rescaling in [34], which sets \(M_4 = 0\). We also need to compose facial residual functions in a special manner. Let \(f:\mathbb {R}_+\times \mathbb {R}_+\rightarrow \mathbb {R}_+\) and \(g:\mathbb {R}_+\times \mathbb {R}_+\rightarrow \mathbb {R}_+\) be functions. We define the diamond composition \(f\diamondsuit g\) to be the function satisfying
Note that the above composition is not associative in general. When we have functions \(f_i:\mathbb {R}_+\times \mathbb {R}_+\rightarrow \mathbb {R}_+\), \(i = 1,\ldots ,m\) with \(m\ge 3\), we define \(f_m\diamondsuit \cdots \diamondsuit f_1\) inductively as the function \(\varphi _m\) such that
With that, we have \( f_m\diamondsuit f_{m-1} \diamondsuit \cdots \diamondsuit f_2 \diamondsuit f_1 {:}{=}f_m\diamondsuit (f_{m-1}\diamondsuit (\cdots \diamondsuit (f_2\diamondsuit f_1)))\).
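As a minimal sketch, assuming the diamond composition acts as \((f\diamondsuit g)(\epsilon ,t) = f(\epsilon + g(\epsilon ,t), t)\) (an assumption on our part, stated here since the display is not reproduced above), the failure of associativity is easy to verify numerically:

```python
def diamond(f, g):
    """Diamond composition, ASSUMING (f ◊ g)(eps, t) = f(eps + g(eps, t), t)."""
    return lambda eps, t: f(eps + g(eps, t), t)

# Non-associativity: take f(e,t) = e^2, g(e,t) = 2e, h(e,t) = e.
f = lambda e, t: e * e
g = lambda e, t: 2 * e
h = lambda e, t: e

# (f ◊ g)(e, t) = f(3e) = 9 e^2, so ((f ◊ g) ◊ h)(e, t) = 9 * (2e)^2 = 36 e^2,
# while (g ◊ h)(e, t) = 4e gives (f ◊ (g ◊ h))(e, t) = (5e)^2 = 25 e^2.
left = diamond(diamond(f, g), h)
right = diamond(f, diamond(g, h))
```

The right-nested call `diamond(f, diamond(g, h))` mirrors the convention \(f_m\diamondsuit (f_{m-1}\diamondsuit (\cdots \diamondsuit (f_2\diamondsuit f_1)))\) adopted in the text.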
The following lemma, which holds for a general closed convex cone \( \mathcal {K}\), shows how (positively rescaled shifts of) one-step facial residual functions for the faces of \( \mathcal {K}\) can be combined via the diamond composition to derive useful bounds on the distance to faces. A version of it was proved in [34, Lemma 22], which required the cones to be pointed and made use of general (i.e., not necessarily one-step) facial residual functions with respect to \( \mathcal {K}\). This is a subtle but crucial difference, which allows us to relax the assumptions in [34].
Lemma 3.7
(Diamond composing facial residual functions) Suppose (Feas) is feasible and let
be a chain of faces of \( \mathcal {K}\) together with \({{\textbf {z}}}_i \in \mathcal {F}_i^*\cap \mathcal {L}^\perp \cap \{{{\textbf {a}}}\}^\perp \) such that \( \mathcal {F}_{i+1} = \mathcal {F}_i\cap \{{{\textbf {z}}}_i\}^\perp \), for \(i = 1,\ldots , \ell - 1\). For each i, let \(\psi _{i}\) be a \(\mathbb {1}\)-FRF for \( \mathcal {F}_i\) and \({{\textbf {z}}}_i\). Then, there is a positively rescaled shift of \(\psi _i\) (still denoted as \(\psi _i\) by an abuse of notation) so that for every \({{\textbf {x}}}\in \mathcal {E}\) and \(\epsilon \ge 0\):
where \(\varphi = \psi _{{\ell -1}}\diamondsuit \cdots \diamondsuit \psi _{{1}}\), if \(\ell \ge 2\). If \(\ell = 1\), we let \(\varphi \) be the function satisfying \(\varphi (\epsilon , t) = \epsilon \).
Proof
For \(\ell = 1\), we have \( \mathcal {F}_{\ell } = \mathcal {K}\), so the lemma follows immediately. Now, we consider the case \(\ell \ge 2\). First we note that \(\mathcal {L}+ {{\textbf {a}}}\) is contained in all the \(\{{{\textbf {z}}}_{i}\}^\perp \) for \(i = 1, \ldots , \ell -1\). Since the distance of \({{\textbf {x}}}\in \mathcal {E}\) to \(\{{{\textbf {z}}}_{i}\}^\perp \) is given by \(\frac{|\langle {{\textbf {x}}} , {{\textbf {z}}}_i \rangle |}{\Vert {{\textbf {z}}}_i\Vert }\), we have the following chain of implications
Next, we proceed by induction. If \(\ell = 2\), we have that \(\psi _1\) is a one-step facial residual function for \( \mathcal {K}\) and \({{\textbf {z}}}_1\). By Definition 3.4, we have
In view of (3.4) and the monotonicity of \(\psi _1\), we see further that
Now, suppose that \({{\textbf {x}}}\in \mathcal {E}\) and \(\epsilon \ge 0\) are such that \(\text {d}({{\textbf {x}}}, \mathcal {K}) \le \epsilon \) and \(\text {d}({{\textbf {x}}},\mathcal {L}+ {{\textbf {a}}}) \le \epsilon \). Let \({\hat{{{\textbf {x}}}}} := P_{\text {span}\, \mathcal {K}}({{\textbf {x}}})\). Since \( \mathcal {K}\subseteq \text {span}\, \mathcal {K}\), we have \(\text {d}({{\textbf {x}}}, \text {span}\, \mathcal {K})\le \text {d}({{\textbf {x}}}, \mathcal {K})\) and, in view of (2.2), we have that
From (2.1), (3.5) and (3.6) we obtain
where the last inequality follows from the monotonicity of \(\psi _{{1}}\) and the fact that \(\Vert {\hat{{{\textbf {x}}}}}\Vert \le \Vert {{\textbf {x}}}\Vert \). This proves the lemma for chains of length \(\ell = 2\) because the function mapping \((\epsilon ,t)\) to \(\epsilon + \psi _{{1}}(2\epsilon (1+\Vert {{\textbf {z}}}_{1}\Vert ),t)\) is a positively rescaled shift of \(\psi _{{1}}\).
Now, suppose that the lemma holds for chains of length \({\hat{\ell }}\) and consider a chain of length \({\hat{\ell }} + 1\). By the induction hypothesis, we have
where \(\varphi = \psi _{{{\hat{\ell }}-1}}\diamondsuit \cdots \diamondsuit \psi _1\) and the \(\psi _i\) are (positively rescaled shifts of) one-step facial residual functions. By the definition of \(\psi _{{{\hat{\ell }}}}\) as a one-step facial residual function and using (3.4), we may positively rescale \(\psi _{{{\hat{\ell }}}}\) (still denoted as \(\psi _{{{\hat{\ell }}}}\) by an abuse of notation) so that for \({{\textbf {y}}}\in \text {span}\, \mathcal {F}_{{\hat{\ell }}}\) and \({\hat{\epsilon }} \ge 0\), the following implication holds:
Now, suppose that \({{\textbf {x}}}\in \mathcal {E}\) and \(\epsilon \ge 0\) satisfy \(\text {d}({{\textbf {x}}}, \mathcal {K}) \le \epsilon \) and \(\text {d}({{\textbf {x}}},\mathcal {L}+ {{\textbf {a}}}) \le \epsilon \). Let \({\hat{{{\textbf {x}}}}} := P_{\text {span}\, \mathcal {F}_{{\hat{\ell }}}}({{\textbf {x}}})\). As before, since \( \mathcal {F}_{{\hat{\ell }}} \subseteq \text {span}\, \mathcal {F}_{{\hat{\ell }}}\), we have \(\text {d}({{\textbf {x}}}, \text {span}\, \mathcal {F}_{{\hat{\ell }}})\le \text {d}({{\textbf {x}}}, \mathcal {F}_{{\hat{\ell }}})\) and, in view of (2.2), we have
Let \({\hat{\psi }}_{{\hat{\ell }}}\) be such that \({\hat{\psi }}_{{\hat{\ell }}}(s,t){:}{=}s + \psi _{{\hat{\ell }}}(2s,t)\), so that \({\hat{\psi }}_{{\hat{\ell }}}\) is a positively rescaled shift of \(\psi _{{\hat{\ell }}}\). Then, (3.9) together with (3.8) and (2.1) gives
where (a) follows from the monotonicity of \(\psi _{{\hat{\ell }}}\) and the fact that \(\Vert {\hat{{{\textbf {x}}}}}\Vert \le \Vert {{\textbf {x}}}\Vert \), and (b) follows from (3.7) and the monotonicity of \({\hat{\psi }}_{{\hat{\ell }}}\). This completes the proof. \(\square \)
We now have all the pieces to state an error bound result for (Feas) that does not require any constraint qualifications.
Theorem 3.8
(Error bound based on \(\mathbb {1}\)-FRFs) Suppose (Feas) is feasible and let
be a chain of faces of \( \mathcal {K}\) together with \({{\textbf {z}}}_i \in \mathcal {F}_i^*\cap \mathcal {L}^\perp \cap \{{{\textbf {a}}}\}^\perp \) such that \(\{ \mathcal {F}_{\ell }, \mathcal {L}+{{\textbf {a}}}\}\) satisfies the PPS condition and \( \mathcal {F}_{i+1} = \mathcal {F}_i\cap \{{{\textbf {z}}}_i\}^\perp \) for every i. For \(i = 1,\ldots , \ell - 1\), let \(\psi _{i}\) be a \(\mathbb {1}\)-FRF for \( \mathcal {F}_{i}\) and \({{\textbf {z}}}_i\).
Then, there is a suitable positively rescaled shift of the \(\psi _{i}\) (still denoted as \(\psi _i\) by an abuse of notation) such that for any bounded set B there is a positive constant \(\kappa _B\) (depending on \(B, \mathcal {L}, {{\textbf {a}}}, \mathcal {F}_{\ell }\)) such that
where \(M = \sup _{{{\textbf {x}}}\in B} \Vert {{\textbf {x}}}\Vert \) and \(\varphi = \psi _{{\ell -1}}\diamondsuit \cdots \diamondsuit \psi _{{1}}\) if \(\ell \ge 2\). If \(\ell = 1\), we let \(\varphi \) be the function satisfying \(\varphi (\epsilon , M) = \epsilon \).
Proof
The case \(\ell = 1\) follows from Proposition 3.1, by taking \( \mathcal {F}= \mathcal {F}_1\). Now, suppose \(\ell \ge 2\). We apply Lemma 3.7, which tells us that, after positively rescaling and shifting the \(\psi _i\), we have:
where \(\varphi = \psi _{{\ell -1}}\diamondsuit \cdots \diamondsuit \psi _{{1}} \). In particular, since \(\Vert {{\textbf {x}}}\Vert \le M\) for \({{\textbf {x}}}\in B\) we have
By assumption, \(\{ \mathcal {F}_{\ell }, \mathcal {L}+{{\textbf {a}}}\}\) satisfies the PPS condition. We invoke Proposition 3.1 to find \( \kappa _B > 0\) such that
Combining (3.10), (3.11), we conclude that if \({{\textbf {x}}}\in B\) and \(\epsilon \ge 0\) satisfy \(\text {d}({{\textbf {x}}}, \mathcal {K}) \le \epsilon \) and \(\text {d}({{\textbf {x}}},\mathcal {L}+ {{\textbf {a}}}) \le \epsilon \), then we have \(\text {d}\left( {{\textbf {x}}}, (\mathcal {L}+ {{\textbf {a}}}) \cap \mathcal {K}\right) \le \kappa _B(\epsilon +\varphi (\epsilon ,M))\). This completes the proof. \(\square \)
Theorem 3.8 is an improvement over [34, Theorem 23] because it removes the amenability assumption. Furthermore, it shows that it is enough to determine the one-step facial residual functions for \( \mathcal {K}\) and its faces, whereas [34, Theorem 23] may require all possible facial residual functions related to \( \mathcal {K}\) and its faces. Nevertheless, Theorem 3.8 is still an abstract error bound result: whether a concrete inequality can be written down depends on obtaining a formula for the function \(\varphi \), which in turn requires finding expressions for the one-step facial residual functions. In the next subsections, we will address this challenge.
3.1 How to compute one-step facial residual functions?
In this section, we present some general tools for computing one-step facial residual functions.
Lemma 3.9
(\(\mathbb {1}\)-FRF from error bound) Suppose that \( \mathcal {K}\) is a closed convex cone and let \({{\textbf {z}}}\in \mathcal {K}^*\) be such that \( \mathcal {F}= \{{{\textbf {z}}}\}^\perp \cap \mathcal {K}\) is a proper face of \( \mathcal {K}\). Let \(\mathfrak {g}:\mathbb {R}_+\rightarrow \mathbb {R}_+\) be monotone nondecreasing with \(\mathfrak {g}(0)=0\), and let \(\kappa _{{{\textbf {z}}},\mathfrak {s}}\) be a finite monotone nondecreasing nonnegative function in \(\mathfrak {s}\in \mathbb {R}_+\) such that
Define the function \(\psi _{ \mathcal {K},{{\textbf {z}}}}:\mathbb {R}_+\times \mathbb {R}_+\rightarrow \mathbb {R}_+\) by
Then we have
Moreover, \(\psi _{ \mathcal {K},{{\textbf {z}}}}\) is a \(\mathbb {1}\)-FRF for \( \mathcal {K}\) and \({{\textbf {z}}}\).
Proof
Suppose that \(\text {d}({{\textbf {p}}}, \mathcal {K}) \le \epsilon \) and \(\langle {{\textbf {p}}} , {{\textbf {z}}} \rangle \le \epsilon \). We first claim that
This can be shown as follows. Since \({{\textbf {z}}}\in \mathcal {K}^*\), we have \(\langle {{\textbf {p}}}+P_{ \mathcal {K}}({{\textbf {p}}})-{{\textbf {p}}} , {{\textbf {z}}} \rangle \ge 0\) and
We conclude that \(|\langle {{\textbf {p}}},{{\textbf {z}}}\rangle | \le \max \{\epsilon \Vert {{\textbf {z}}}\Vert ,\epsilon \}\). This, in combination with \(\text {d}({{\textbf {p}}},\{{{\textbf {z}}}\}^\perp ) = |\langle {{\textbf {p}}},{{\textbf {z}}}\rangle |/\Vert {{\textbf {z}}}\Vert \), leads to (3.14).
Next, let \({{\textbf {q}}}:=P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {p}}}\). Then we have that
where (a) follows from (3.14), (b) is a consequence of (3.12), (c) holds because \(\Vert {{\textbf {q}}}\Vert = \Vert P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {p}}}\Vert \le \Vert {{\textbf {p}}}\Vert \) so that \(\kappa _{{{\textbf {z}}},\Vert {{\textbf {q}}}\Vert }\le \kappa _{{{\textbf {z}}},\Vert {{\textbf {p}}}\Vert }\), and (d) holds because \(\mathfrak {g}\) is monotone nondecreasing and
here, the second inequality follows from (3.14) and the assumption that \(\text {d}({{\textbf {p}}}, \mathcal {K})\le \epsilon \). This proves (3.13). Finally, notice that \(\psi _{ \mathcal {K},{{\textbf {z}}}}\) is nonnegative, monotone nondecreasing in each argument, and that \(\psi _{ \mathcal {K},{{\textbf {z}}}}(0,t)=0\) for every \(t \in \mathbb {R}_+\). Hence, \(\psi _{ \mathcal {K},{{\textbf {z}}}}\) is a one-step facial residual function for \( \mathcal {K}\) and \({{\textbf {z}}}\). \(\square \)
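The distance formula \(\text {d}({{\textbf {p}}},\{{{\textbf {z}}}\}^\perp ) = |\langle {{\textbf {p}}},{{\textbf {z}}}\rangle |/\Vert {{\textbf {z}}}\Vert \) used in the proof comes from the explicit projection onto a hyperplane through the origin. A minimal numerical sketch in Python (the vectors are arbitrary illustrative data):

```python
import numpy as np

def proj_hyperplane(p, z):
    """Project p onto the hyperplane {z}^perp = {x : <x, z> = 0}."""
    return p - (p @ z) / (z @ z) * z

rng = np.random.default_rng(0)
p, z = rng.standard_normal(3), rng.standard_normal(3)
q = proj_hyperplane(p, z)

# q lies in {z}^perp, and ||p - q|| equals |<p, z>| / ||z||.
assert abs(q @ z) < 1e-12
assert abs(np.linalg.norm(p - q) - abs(p @ z) / np.linalg.norm(z)) < 1e-12
```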
In view of Lemma 3.9, one may construct one-step facial residual functions after establishing the error bound (3.12). In the next theorem, we present a characterization for the existence of such an error bound. Our result is based on the quantity (3.15) defined below being nonzero. Note that this quantity does not explicitly involve projections onto \( \mathcal {K}\); this enables us to work with the exponential cone later, whose projections do not seem to have simple expressions. Figure 2 provides a geometric interpretation of (3.15).
Theorem 3.10
(Characterization of the existence of error bounds) Suppose that \( \mathcal {K}\) is a closed convex cone and let \({{\textbf {z}}}\in \mathcal {K}^*\) be such that \( \mathcal {F}= \{{{\textbf {z}}}\}^\perp \cap \mathcal {K}\) is a nontrivial exposed face of \( \mathcal {K}\). Let \(\eta \ge 0\), \(\alpha \in (0,1]\) and let \(\mathfrak {g}:\mathbb {R}_+\rightarrow \mathbb {R}_+\) be monotone nondecreasing with \(\mathfrak {g}(0) = 0\) and \(\mathfrak {g}\ge |\cdot |^\alpha \). Define
Then the following statements hold.
-
(i)
If \(\gamma _{{{\textbf {z}}},\eta } \in (0,\infty ]\), then it holds that
$$\begin{aligned} \text {d}({{\textbf {q}}}, \mathcal {F}) \le \kappa _{{{\textbf {z}}},\eta } \mathfrak {g}(\text {d}({{\textbf {q}}}, \mathcal {K}))\ \ \text{ whenever } {{\textbf {q}}}\in \{{{\textbf {z}}}\}^\perp \cap B(\eta ), \end{aligned}$$(3.16)
where \(\kappa _{{{\textbf {z}}},\eta } := \max \left\{ 2\eta ^{1-\alpha }, 2\gamma _{{{\textbf {z}}},\eta }^{-1} \right\} < \infty \).
-
(ii)
If there exists \(\kappa _{_B} \in (0,\infty )\) so that
$$\begin{aligned} \text {d}({{\textbf {q}}}, \mathcal {F}) \le \kappa _{_B} \mathfrak {g}(\text {d}({{\textbf {q}}}, \mathcal {K}))\ \ \text{ whenever } {{\textbf {q}}}\in \{{{\textbf {z}}}\}^\perp \cap B(\eta ), \end{aligned}$$(3.17)
then \(\gamma _{{{\textbf {z}}},\eta } \in (0,\infty ]\).
Proof
We first consider item (i). If \(\eta = 0\) or \({{\textbf {q}}}\in \mathcal {F}\), the result is vacuously true, so let \(\eta > 0\) and \({{\textbf {q}}}\in \{{{\textbf {z}}}\}^\perp \cap B(\eta ) \backslash \mathcal {F}\). Then \({{\textbf {q}}}\notin \mathcal {K}\) because \( \mathcal {F}= \{{{\textbf {z}}}\}^\perp \cap \mathcal {K}\). Define
Then \({{\textbf {v}}}\in \partial \mathcal {K}\cap B(\eta )\) because \({{\textbf {q}}}\notin \mathcal {K}\) and \(\Vert {{\textbf {q}}}\Vert \le \eta \). If \({{\textbf {v}}}\in \mathcal {F}\), then we have \(\text {d}({{\textbf {q}}}, \mathcal {F}) = \text {d}({{\textbf {q}}}, \mathcal {K})\) and hence
where the first inequality holds because \(\Vert {{\textbf {q}}}\Vert \le \eta \), and the last inequality follows from the definitions of \(\mathfrak {g}\) and \(\kappa _{{{\textbf {z}}},\eta }\). Thus, from now on, we assume that \({{\textbf {v}}}\in \partial \mathcal {K}\cap B(\eta )\backslash \mathcal {F}\).
Next, since \({{\textbf {w}}}=P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {v}}}\), it holds that \({{\textbf {v}}}-{{\textbf {w}}}\in \{{{\textbf {z}}}\}^{\perp \perp }\) and hence \(\Vert {{\textbf {q}}}-{{\textbf {v}}}\Vert ^2 = \Vert {{\textbf {q}}}-{{\textbf {w}}}\Vert ^2+\Vert {{\textbf {w}}}-{{\textbf {v}}}\Vert ^2\). In particular, we have
where the equality follows from the definition of \({{\textbf {v}}}\). Now, to establish (3.16), we consider two cases.
-
(I)
\(\text {d}({{\textbf {q}}}, \mathcal {F}) \le 2\text {d}({{\textbf {w}}}, \mathcal {F})\);
-
(II)
\(\text {d}({{\textbf {q}}}, \mathcal {F}) > 2\text {d}({{\textbf {w}}}, \mathcal {F})\).
(I): In this case, we have from \({{\textbf {u}}}= P_{ \mathcal {F}}{{\textbf {w}}}\) and \({{\textbf {q}}}\notin \mathcal {F}\) that
where the first inequality follows from the assumption in this case (I). Hence,
where (a) is true by the definition of \(\kappa _{{{\textbf {z}}},\eta }\), (b) uses the condition that \({{\textbf {v}}}\in \partial \mathcal {K}\cap B(\eta )\backslash \mathcal {F}\), (3.19) and the definition of \(\gamma _{{{\textbf {z}}},\eta }\), (c) is true by (3.18) and the monotonicity of \(\mathfrak {g}\), and (d) follows from (3.19). This concludes case (I).
(II): Using the triangle inequality, we have
where the strict inequality follows from the condition for this case (II). Consequently, we have \(\text {d}({{\textbf {q}}}, \mathcal {F}) \le 2\Vert {{\textbf {q}}}-{{\textbf {w}}}\Vert \). Combining this with (3.18), we deduce further that
where the fourth inequality holds because \(\Vert {{\textbf {q}}}\Vert \le \eta \), and the last inequality follows from the definitions of \(\mathfrak {g}\) and \(\kappa _{{{\textbf {z}}},\eta }\). This proves item (i).
We next consider item (ii). Again, the result is vacuously true if \(\eta = 0\), so let \(\eta > 0\). Let \({{\textbf {v}}}\in \partial \mathcal {K}\cap B(\eta )\backslash \mathcal {F}\), \({{\textbf {w}}}= P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {v}}}\) and \({{\textbf {u}}}= P_{ \mathcal {F}}{{\textbf {w}}}\) with \({{\textbf {w}}}\ne {{\textbf {u}}}\). Then \({{\textbf {w}}}\in B(\eta )\), and we have in view of (3.17) that
where (a) holds because \({{\textbf {u}}}= P_{ \mathcal {F}}{{\textbf {w}}}\), (b) holds because of (3.17), \({{\textbf {w}}}\in \{{{\textbf {z}}}\}^\perp \) and \(\Vert {{\textbf {w}}}\Vert \le \eta \), and (c) is true because \(\mathfrak {g}\) is monotone nondecreasing and \({{\textbf {v}}}\in \mathcal {K}\). Thus, we have \(\gamma _{{{\textbf {z}}},\eta } \ge 1/\kappa _{_B} > 0\). This completes the proof. \(\square \)
Remark 3.11
(About \(\kappa _{{{\textbf {z}}},\eta }\) and \(\gamma _{{{\textbf {z}}},\eta }^{-1}\)) As \(\eta \) increases, the infimum in (3.15) is taken over a larger region, so \(\gamma _{{{\textbf {z}}},\eta }\) does not increase. Accordingly, \(\gamma _{{{\textbf {z}}},\eta }^{-1}\) does not decrease when \(\eta \) increases. Therefore, the \(\kappa _{{{\textbf {z}}},\eta }\) and \(\gamma _{{{\textbf {z}}},\eta }^{-1}\) considered in Theorem 3.10 are monotone nondecreasing as functions of \(\eta \) when \({{\textbf {z}}}\) is fixed. We are also using the convention that \(1/\infty = 0\) so that \(\kappa _{{{\textbf {z}}},\eta } = 2\eta ^{1-\alpha }\) when \(\gamma _{{{\textbf {z}}},\eta } = \infty \).
Thus, to establish an error bound as in (3.16), it suffices to show that \(\gamma _{{{\textbf {z}}},\eta } \in (0,\infty ]\) for the chosen \(\mathfrak {g}\) and \(\eta \ge 0\). Clearly, \(\gamma _{{{\textbf {z}}},0} = \infty \). The next lemma allows us to check whether \(\gamma _{{{\textbf {z}}},\eta } \in (0,\infty ]\) for a given \(\eta > 0\) by considering convergent sequences.
Lemma 3.12
Suppose that \( \mathcal {K}\) is a closed convex cone and let \({{\textbf {z}}}\in \mathcal {K}^*\) be such that \( \mathcal {F}= \{{{\textbf {z}}}\}^\perp \cap \mathcal {K}\) is a nontrivial exposed face of \( \mathcal {K}\). Let \(\eta > 0\), \(\alpha \in (0,1]\) and let \(\mathfrak {g}:\mathbb {R}_+\rightarrow \mathbb {R}_+\) be monotone nondecreasing with \(\mathfrak {g}(0) = 0\) and \(\mathfrak {g}\ge |\cdot |^\alpha \). Let \(\gamma _{{{\textbf {z}}},\eta }\) be defined as in (3.15). If \(\gamma _{{{\textbf {z}}},\eta } = 0\), then there exist \(\bar{{{\textbf {v}}}}\in \mathcal {F}\) and a sequence \(\{{{\textbf {v}}}^k\}\subset \partial \mathcal {K}\cap B(\eta ) \backslash \mathcal {F}\) such that
where \({{\textbf {w}}}^k = P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {v}}}^k\), \({{\textbf {u}}}^k = P_{ \mathcal {F}}{{\textbf {w}}}^k\) and \({{\textbf {w}}}^k\ne {{\textbf {u}}}^k\).
Proof
Suppose that \(\gamma _{{{\textbf {z}}},\eta } = 0\). Then, by the definition of infimum, there exists a sequence \(\{{{\textbf {v}}}^k\}\subset \partial \mathcal {K}\cap B(\eta ) \backslash \mathcal {F}\) such that
where \({{\textbf {w}}}^k = P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {v}}}^k\), \({{\textbf {u}}}^k = P_{ \mathcal {F}}{{\textbf {w}}}^k\) and \({{\textbf {w}}}^k\ne {{\textbf {u}}}^k\). Since \(\{{{\textbf {v}}}^k\}\subset B(\eta )\), by passing to a convergent subsequence if necessary, we may assume without loss of generality that
for some \(\bar{{{\textbf {v}}}}\in \mathcal {K}\cap B(\eta )\). In addition, since \(0\in \mathcal {F}\subseteq \{{{\textbf {z}}}\}^\perp \), and projections onto closed convex sets are nonexpansive, we see that \(\{{{\textbf {w}}}^k\}\subset B(\eta )\) and \(\{{{\textbf {u}}}^k\}\subset B(\eta )\), and hence the sequence \(\{\Vert {{\textbf {w}}}^k-{{\textbf {u}}}^k\Vert \}\) is bounded. Then we can conclude from (3.22) and the assumption \(\mathfrak {g}\ge |\cdot |^\alpha \) that
Now (3.24), (3.23), and the triangle inequality give \({{\textbf {w}}}^k\rightarrow \bar{{{\textbf {v}}}}\). Since \(\{{{\textbf {w}}}^k\}\subset \{{{\textbf {z}}}\}^\perp \), it then follows that \(\bar{{{\textbf {v}}}}\in \{{{\textbf {z}}}\}^\perp \). Thus, \(\bar{{{\textbf {v}}}}\in \{{{\textbf {z}}}\}^\perp \cap \mathcal {K}= \mathcal {F}\). This completes the proof. \(\square \)
Let \( \mathcal {K}\) be a closed convex cone. Lemma 3.9, Theorem 3.10 and Lemma 3.12 are tools to obtain one-step facial residual functions for \( \mathcal {K}\). These are exactly the kind of facial residual functions needed in the abstract error bound result, Theorem 3.8. We conclude this subsection with a result that connects the one-step facial residual functions of a product cone and those of its constituent cones, which is useful for deriving error bounds for product cones.
Proposition 3.13
(\(\mathbb {1}\)-FRF for products) Let \( \mathcal {K}^i \subseteq \mathcal {E}^i\) be closed convex cones for every \(i \in \{1,\ldots ,m\}\) and let \( \mathcal {K}= \mathcal {K}^1 \times \cdots \times \mathcal {K}^m\). Let \( \mathcal {F}\mathrel {\unlhd } \mathcal {K}\), \({{\textbf {z}}}\in \mathcal {F}^*\) and suppose that \( \mathcal {F}= \mathcal {F}^1\times \cdots \times \mathcal {F}^m\) with \( \mathcal {F}^i \mathrel {\unlhd } \mathcal {K}^i\) for every \(i \in \{1,\ldots ,m\}\). Write \({{\textbf {z}}}= ({{\textbf {z}}}_1,\ldots ,{{\textbf {z}}}_m)\) with \({{\textbf {z}}}_i \in ( \mathcal {F}^i)^*\).
For every i, let \(\psi _{ \mathcal {F}^i,{{\textbf {z}}}_i}\) be a \(\mathbb {1}\)-FRF for \( \mathcal {F}^i\) and \({{\textbf {z}}}_i\). Then, there exists a \(\kappa > 0\) such that the function \(\psi _{ \mathcal {F},{{\textbf {z}}}}\) satisfying
is a \(\mathbb {1}\)-FRF for \( \mathcal {F}\) and \({{\textbf {z}}}\).
Proof
Suppose that \({{\textbf {x}}}\in \text {span}\, \mathcal {F}\) and \(\epsilon \ge 0\) satisfy the inequalities
We note that
and that for every \(i \in \{1,\ldots ,m\}\),
Since \({{\textbf {z}}}_i \in ( \mathcal {F}^i)^*\), we have from (3.25) that
Using (3.26) for all i and recalling that \(\langle {{\textbf {z}}} , {{\textbf {x}}} \rangle \le \epsilon \), we have
where \({\hat{\kappa }} = 1+\sum _{i=1}^m\Vert {{\textbf {z}}}_i\Vert \). Since \(\langle {{\textbf {z}}}_i , P_{ \mathcal {F}^{i}}({{\textbf {x}}}_i) \rangle \ge 0\) for \(i \in \{1,\ldots ,m\}\), from (3.27) we obtain
This implies that for \(i \in \{1,\ldots ,m\}\) we have
where the inequality follows from (3.25) and (3.28). Now, recapitulating, the facial residual function \(\psi _{ \mathcal {F}^i,{{\textbf {z}}}_i}\) has the property that if \(\gamma _1,\gamma _2 \in \mathbb {R}_+\) then the relations
imply \(\text {d}({{\textbf {y}}}_i, \mathcal {F}^i\cap \{{{\textbf {z}}}_i\}^\perp ) \le \psi _{ \mathcal {F}^i,{{\textbf {z}}}_i}(\max _{1\le j \le 2}\{\gamma _j \},\Vert {{\textbf {y}}}_i\Vert )\). Therefore, from (3.25), (3.29) and the monotonicity of \(\psi _{ \mathcal {F}^i,{{\textbf {z}}}_i}\), we have upon recalling \({{\textbf {x}}}\in \text {span}\, \mathcal {F}\) that
Finally, from (3.30), we conclude that
where we also used the monotonicity of \(\psi _{ \mathcal {F}^i,{{\textbf {z}}}_i}\) for the last inequality. This completes the proof. \(\square \)
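The coordinatewise estimates in the proof rest on the elementary fact that squared distances to a product cone split across the factors, \(\text {d}({{\textbf {x}}}, \mathcal {K}^1\times \cdots \times \mathcal {K}^m)^2 = \sum _i \text {d}({{\textbf {x}}}_i, \mathcal {K}^i)^2\). A quick numerical check with two copies of \(\mathbb {R}_+\) (a choice made purely for illustration):

```python
import numpy as np

def dist_Rplus(t):
    """Distance from a scalar t to the cone R_+."""
    return max(-t, 0.0)

def dist_product(x):
    """Distance to R_+ x R_+, computed directly via projection."""
    proj = np.maximum(x, 0.0)
    return np.linalg.norm(x - proj)

x = np.array([-3.0, 4.0])
direct = dist_product(x)
split = np.sqrt(dist_Rplus(x[0])**2 + dist_Rplus(x[1])**2)
assert abs(direct - split) < 1e-12  # both equal 3.0 here
```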
4 The exponential cone
In this section, we will use all the techniques developed so far to obtain error bounds for the 3D exponential cone \(K_{\exp }\). We will start with a study of its facial structure in Sect. 4.1, then we will compute its one-step facial residual functions in Sect. 4.2. Finally, error bounds will be presented in Sect. 4.3. In Sect. 4.4, we summarize odd behaviour found in the facial structure of the exponential cone.
4.1 Facial structure
Recall that the exponential cone is defined as follows:
Its dual cone is given by
It may therefore be readily seen that \(K_{\exp }^*\) is a scaled and rotated version of \(K_{\exp }\). In this subsection, we will describe the nontrivial faces of \(K_{\exp }\); see Fig. 3. We will show that we have the following types of nontrivial faces:
-
(a)
infinitely many exposed extreme rays (1D faces) parametrized by \(\beta \in \mathbb {R}\) as follows:
$$\begin{aligned} {{\mathcal {F}}}_\beta {:}{=}\left\{ \left( -\beta y+y,y,e^{1-\beta }y \right) \;\bigg |\;y \in [0,\infty )\right\} . \end{aligned}$$(4.2)
-
(b)
a single “exceptional” exposed extreme ray denoted by \({{\mathcal {F}}}_{\infty }\):
$$\begin{aligned} {{\mathcal {F}}}_{\infty }{:}{=}\{(x,0,0)\;|\;x\le 0 \}. \end{aligned}$$(4.3)
-
(c)
a single non-exposed extreme ray denoted by \({{\mathcal {F}}}_{ne} \):
$$\begin{aligned} {{\mathcal {F}}}_{ne}{:}{=}\{(0,0,z)\;|\;z\ge 0 \}. \end{aligned}$$(4.4)
-
(d)
a single 2D exposed face denoted by \({{\mathcal {F}}}_{-\infty }\):
$$\begin{aligned} {{\mathcal {F}}}_{-\infty }{:}{=}\{(x,y,z)\;|\;x\le 0, z\ge 0, y=0\}, \end{aligned}$$(4.5)
where we note that \({{\mathcal {F}}}_{\infty }\) and \({{\mathcal {F}}}_{ne}\) are the extreme rays of \({{\mathcal {F}}}_{-\infty }\).
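For concreteness, the classification above can be exercised numerically. The sketch below assumes the closure description \(K_{\exp } = \text {cl}\,\{(x,y,z) \,|\, y > 0,\ z \ge y e^{x/y}\}\), whose closure adds exactly the points with \(y = 0\), \(x \le 0\), \(z \ge 0\), i.e., the face \({{\mathcal {F}}}_{-\infty }\); the helper name in_Kexp is ours:

```python
import math

def in_Kexp(x, y, z, tol=1e-12):
    """Membership test for K_exp = cl{(x,y,z) : y > 0, z >= y*exp(x/y)}.

    The closure adds the points with y = 0, x <= 0, z >= 0,
    which form the 2D face F_{-inf} described above."""
    if y > tol:
        return z >= y * math.exp(x / y) - tol
    return abs(y) <= tol and x <= tol and z >= -tol

# Points on the four types of nontrivial faces:
beta, yv = 0.7, 2.0
assert in_Kexp((1 - beta) * yv, yv, math.exp(1 - beta) * yv)  # F_beta, eq. (4.2)
assert in_Kexp(-1.0, 0.0, 0.0)   # F_inf, eq. (4.3)
assert in_Kexp(0.0, 0.0, 5.0)    # F_ne, eq. (4.4)
assert in_Kexp(-2.0, 0.0, 3.0)   # F_{-inf}, eq. (4.5)
assert not in_Kexp(1.0, 0.0, 1.0)  # outside the cone
```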
Notice that, except for case (c), all faces are exposed and thus arise as an intersection \(\{{{\textbf {z}}}\}^\perp \cap K_{\exp }\) for some \({{\textbf {z}}}\in K_{\exp }^*\). To establish the above characterization, we start by examining how the components of \({{\textbf {z}}}\) determine the corresponding exposed face.
4.1.1 Exposed faces
Let \({{\textbf {z}}}\in K_{\exp }^*\) be such that \(\{{{\textbf {z}}}\}^\perp \cap K_{\exp }\) is a nontrivial face of \(K_{\exp }\). Then \({{\textbf {z}}}\ne 0\) and \({{\textbf {z}}}\in \partial K_{\exp }^*\). We consider the following cases.
\(\underline{z_x < 0}\): Since \({{\textbf {z}}}\in \partial K_{\exp }^*\), we must have \(z_z e = -z_x e^{\frac{z_y}{z_x}}\) and hence
Since \(z_x \ne 0\), we see that \({{\textbf {q}}}\in \{{{\textbf {z}}}\}^\perp \) if and only if
Solving (4.7) for \(q_z\) and letting \(\beta :=\frac{z_y}{z_x}\) to simplify the exposition, we have
Thus, we obtain that \(\{{{\textbf {z}}}\}^\perp = \left\{ \left( x,y, e^{1-\beta }\left( x+y\beta \right) \right) \;\big |\; x,y \in \mathbb {R}\right\} \). Combining this with the definition of \(K_{\exp }\) and the fact that \(\{{{\textbf {z}}}\}^\perp \) is a supporting hyperplane (so that \(K_{\exp } \cap \{{{\textbf {z}}}\}^\perp = \partial K_{\exp }\cap \{{{\textbf {z}}}\}^\perp \)) yields
We now refine the above characterization in the next proposition.
Proposition 4.1
(Characterization of \({{\mathcal {F}}}_\beta \), \(\beta \in \mathbb {R}\)) Let \({{\textbf {z}}}\in K_{\exp }^*\) satisfy \({{\textbf {z}}}=(z_x,z_y,z_z)\), where \(z_z e = -z_x e^{\frac{z_y}{z_x}}\) and \(z_x <0\). Define \(\beta =\frac{z_y}{z_x}\) as in (4.8) and let \({{\mathcal {F}}}_\beta := K_{\exp }\cap \{{{\textbf {z}}}\}^\perp \). Then
Proof
Let \(\varOmega := \left\{ \left( -\beta y+y,y,e^{1-\beta }y \right) \;\big |\;y \in [0,\infty )\right\} \). In view of (4.9), we can check that \(\varOmega \subseteq {{\mathcal {F}}}_\beta \). To prove the converse inclusion, pick any \({{\textbf {q}}}=\left( x,y,e^{1-\beta }(x+y\beta )\right) \in {{\mathcal {F}}}_\beta \). We need to show that \({{\textbf {q}}}\in \varOmega \).
To this end, we note from (4.9) that if \(y = 0\), then necessarily \({{\textbf {q}}}= {\textbf {0}}\) and consequently \({{\textbf {q}}}\in \varOmega \). On the other hand, if \(y > 0\), then (4.9) gives \(ye^{x/y}= (x+\beta y)e^{1-\beta }\). Then we have the following chain of equivalences:
where (a) follows from the fact that the function \(t\mapsto te^t\) is strictly increasing on \([-1,\infty )\). Plugging the last expression back into \({{\textbf {q}}}\), we may compute
Altogether, (4.10), (4.11) together with \(y>0\) yield
This completes the proof. \(\square \)
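Proposition 4.1 can be sanity-checked numerically: taking the normalization \(z_x = -1\) (an illustrative choice), the relation \(z_z e = -z_x e^{z_y/z_x}\) from (4.6) gives \({{\textbf {z}}}= (-1,-\beta ,e^{\beta -1})\), and every point of \({{\mathcal {F}}}_\beta \) should lie both in \(\{{{\textbf {z}}}\}^\perp \) and on \(\partial K_{\exp }\):

```python
import math

beta = -0.4  # any real beta

# A supporting normal from (4.6), normalized so that z_x = -1:
# then z_y = beta * z_x = -beta and z_z = -z_x * e^{beta - 1}.
z = (-1.0, -beta, math.exp(beta - 1.0))

def f(y):
    """The ray F_beta from Proposition 4.1."""
    return ((1.0 - beta) * y, y, math.exp(1.0 - beta) * y)

for y in (0.5, 1.0, 7.0):
    fx, fy, fz = f(y)
    inner = z[0] * fx + z[1] * fy + z[2] * fz
    assert abs(inner) < 1e-9                        # f(y) lies in {z}^perp
    assert abs(fz - fy * math.exp(fx / fy)) < 1e-9  # f(y) lies on bd(K_exp)
```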
Next, we move on to the two remaining cases.
\(\underline{z_x = 0, z_z > 0}\): Notice that \({{\textbf {q}}}\in K_{\exp }\) means that \(q_y \ge 0\) and \(q_z\ge 0\). Since \(z_z > 0\) and \(z_y \ge 0\), in order to have \({{\textbf {q}}}\in \{{{\textbf {z}}}\}^\perp \), we must have \(q_z = 0\). Then the definition of \(K_{\exp }\) also forces \(q_y = 0\) and hence
This one-dimensional face is exposed by any hyperplane whose normal vector comes from the set \(\{(0,z_y,z_z):\; z_y\ge 0, z_z > 0\}\).
\({{\underline{z_x = 0, z_z = 0}}}\): In this case, we have \(z_y > 0\). In order to have \({{\textbf {q}}}\in \{{{\textbf {z}}}\}^\perp \), we must have \(q_y = 0\). Thus
which is the unique two-dimensional face of \(K_{\exp }\).
4.1.2 The single non-exposed face and completeness of the classification
The face \({{\mathcal {F}}}_{ne}\) is non-exposed because, as shown in Proposition 4.1, (4.12) and (4.13), it never arises as an intersection of the form \(\{{{\textbf {z}}}\}^\perp \cap K_{\exp }\), for \({{\textbf {z}}}\in K_{\exp }^*\).
We now show that all nontrivial faces of \(K_{\exp }\) were accounted for in (4.2), (4.3), (4.4), (4.5). First of all, by the discussion in Sect. 4.1.1, all nontrivial exposed faces must be among the ones in (4.2), (4.3) and (4.5). So, let \( \mathcal {F}\) be a non-exposed face of \(K_{\exp }\). Then, it must be contained in a nontrivial exposed face of \(K_{\exp }\). Therefore, \( \mathcal {F}\) must be a proper face of the unique 2D face (4.5). This implies that \( \mathcal {F}\) is one of the extreme rays of (4.5): \({{\mathcal {F}}}_{\infty }\) or \({{\mathcal {F}}}_{ne}\). By assumption, \( \mathcal {F}\) is non-exposed, so it must be \({{\mathcal {F}}}_{ne}\).
4.2 One-step facial residual functions
In this subsection, we will use the machinery developed in Sect. 3 to obtain the one-step facial residual functions for \(K_{\exp }\).
Let us first discuss how the discoveries were originally made, and how that process motivated the development of the framework we built in Sect. 3. The FRFs proven here were initially found by using the characterizations of Theorem 3.10 and Lemma 3.12 together with numerical experiments. Specifically, we used Maple to numerically evaluate limits of relevant sequences (3.21), as well as plotting lower dimensional slices of the function \({{\textbf {v}}}\mapsto \mathfrak {g}(\Vert {{\textbf {v}}}-{{\textbf {w}}}\Vert )/\Vert {{\textbf {w}}}-{{\textbf {u}}}\Vert \), where \({{\textbf {w}}}\) and \({{\textbf {u}}}\) are defined similarly as in (3.15).
A natural question is whether it might be simpler to change coordinates and work with the nearly equivalent \({{\textbf {w}}}\mapsto \mathfrak {g}(\Vert {{\textbf {v}}}-{{\textbf {w}}}\Vert )/\Vert {{\textbf {w}}}-{{\textbf {u}}}\Vert \), since \({{\textbf {w}}}\in \{{{\textbf {z}}}\}^\perp \). However, \(P_{\{{{\textbf {z}}}\}^\perp }^{-1}\{{{\textbf {w}}}\}\cap \partial \mathcal {K}\) may contain multiple points, which creates many challenges. We encountered an example of this when working with the exponential cone, where the change of coordinates from \({{\textbf {v}}}\) to \({{\textbf {w}}}\) necessitates the introduction of the two real branches of the Lambert \({\mathcal {W}}\) function (see, for example, [7, 12, 14] or [48] for the closely related Wright Omega function). With considerable effort, one can use such a parametrization to prove the FRFs for \({{\mathcal {F}}}_{\beta }, \beta \in \left[ -\infty ,\infty \right] \setminus \{\hat{\beta }:=-{\mathcal {W}}_{\mathrm{principal}}(2e^{-2})/2 \}\). However, the change of branches inhibits proving the result for the exceptional number \(\hat{\beta }\). The change of variables to \({{\textbf {v}}}\) cures this problem by obviating the need for a branch function in the analysis; see [31] for additional details. This is why we present Theorem 3.10 in terms of \({{\textbf {v}}}\). Computational investigation also pointed to the path of proof, though the proof we present may be understood without the aid of a computer.
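For reference, the exceptional value \(\hat{\beta }= -{\mathcal {W}}_{\mathrm{principal}}(2e^{-2})/2\) is easy to evaluate numerically; the sketch below uses a hand-rolled Newton iteration for the principal branch rather than any particular special-function library:

```python
import math

def lambertw0(x, iters=50):
    """Principal branch of Lambert W on x >= 0: solves w * exp(w) = x."""
    w = math.log1p(x)  # reasonable starting point for x >= 0
    for _ in range(iters):
        ew = math.exp(w)
        w -= (w * ew - x) / (ew * (1.0 + w))  # Newton step
    return w

x = 2.0 * math.exp(-2.0)
w = lambertw0(x)
assert abs(w * math.exp(w) - x) < 1e-12  # defining identity of W

beta_hat = -w / 2.0  # the exceptional beta, roughly -0.109
```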
4.2.1 \({{\mathcal {F}}}_{-\infty }\): the unique 2D face
Recall the unique 2D face of \(K_{\exp }\):
Define the piecewise modified Boltzmann–Shannon entropy \(\mathfrak {g}_{-\infty }:\mathbb {R}_+\rightarrow \mathbb {R}_+\) as follows:
For more on its usefulness in optimization, see, for example, [7, 14]. We note that \(\mathfrak {g}_{-\infty }\) is monotone increasing and there exists \(L\ge 1\) such that the following inequalities hold for every \(t \in \mathbb {R}_+\) and \(M > 0\):
With that, we prove in the next theorem that \(\gamma _{{{\textbf {z}}},\eta }\) is positive for \({{\mathcal {F}}}_{-\infty }\), which implies that an entropic error bound holds.
Theorem 4.2
(Entropic error bound concerning \({{\mathcal {F}}}_{-\infty }\)) Let \({{\textbf {z}}}\in K_{\exp }^*\) with \(z_x=z_z=0\) and \(z_y > 0\) so that \(\{{{\textbf {z}}}\}^\perp \cap K_{\exp }={{\mathcal {F}}}_{-\infty }\) is the two-dimensional face of \(K_{\exp }\). Let \(\eta > 0\) and let \(\gamma _{{{\textbf {z}}},\eta }\) be defined as in (3.15) with \(\mathfrak {g}= \mathfrak {g}_{-\infty }\) in (4.14). Then \(\gamma _{{{\textbf {z}}},\eta } \in (0,\infty ]\) and
Proof
In view of Lemma 3.12, take any \(\bar{{{\textbf {v}}}}\in {{\mathcal {F}}}_{-\infty }\) and a sequence \(\{{{\textbf {v}}}^k\}\subset \partial K_{\exp }\cap B(\eta ) \backslash {{\mathcal {F}}}_{-\infty }\) such that
where \({{\textbf {w}}}^k = P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {v}}}^k\), \({{\textbf {u}}}^k = P_{{{\mathcal {F}}}_{-\infty }}{{\textbf {w}}}^k\), and \({{\textbf {w}}}^k\ne {{\textbf {u}}}^k\). We will show that (3.21b) does not hold for \(\mathfrak {g}= \mathfrak {g}_{-\infty }\).
Since \({{\textbf {v}}}^k \notin {{\mathcal {F}}}_{-\infty }\), in view of (4.1) and (4.13), we have \(v^k_y>0\) and
where the second representation is obtained by solving for \(v^k_x\) from \(v^k_z = v^k_ye^{v^k_x/v^k_y} > 0\). Using the second representation in (4.18), we then have
here, we made use of the fact that \({{\textbf {w}}}^k\ne {{\textbf {u}}}^k\), which implies that \(v^k_y\ln (v^k_z/v^k_y) > 0\) and thus the resulting formula for \({{\textbf {u}}}^k\). In addition, we also note from \(v^k_y\ln (v^k_z/v^k_y) > 0\) (and \(v^k_y>0\)) that
Furthermore, since \(\bar{{{\textbf {v}}}} \in {{\mathcal {F}}}_{-\infty }\), we see from (4.13) and (4.17) that
Now, using (4.18), (4.19), (4.21) and the definition of \(\mathfrak {g}_{-\infty }\), we see that for k sufficiently large,
We will show that (3.21b) does not hold for \(\mathfrak {g}= \mathfrak {g}_{-\infty }\) in each of the following cases.
-
(I)
\(\bar{v}_z >0\).
-
(II)
\(\bar{v}_z =0\).
(I): In this case, we deduce from (4.21) and (4.22) that
Thus (3.21b) does not hold for \(\mathfrak {g}= \mathfrak {g}_{-\infty }\).
(II): By passing to a subsequence if necessary, we may assume that \(v^k_z < 1\) for all k. This together with (4.20) gives \(\frac{\ln (v_z^k)}{\ln (v_y^k)} \in (0,1)\) for all k. Thus, we conclude from (4.22) that for all k,
Consequently, (3.21b) also fails for \(\mathfrak {g}= \mathfrak {g}_{-\infty }\) in this case.
Having shown that (3.21b) does not hold for \(\mathfrak {g}= \mathfrak {g}_{-\infty }\) in any case, we conclude by Lemma 3.12 that \(\gamma _{{{\textbf {z}}},\eta } \in \left( 0,\infty \right] \). With that, (4.16) follows from Theorem 3.10 and (4.15). \(\square \)
Using Theorem 4.2, we can also show weaker Hölderian error bounds.
Corollary 4.3
Let \({{\textbf {z}}}\in K_{\exp }^*\) with \(z_x=z_z=0\) and \(z_y > 0\) so that \(\{{{\textbf {z}}}\}^\perp \cap K_{\exp }={{\mathcal {F}}}_{-\infty }\) is the two-dimensional face of \(K_{\exp }\). Let \(\eta >0\), \(\alpha \in (0,1)\), and \(\gamma _{{{\textbf {z}}},\eta }\) be as in (3.15) with \(\mathfrak {g}= |\cdot |^\alpha \). Then \(\gamma _{{{\textbf {z}}},\eta } \in (0,\infty ]\) and
Proof
Suppose that \(\gamma _{{{\textbf {z}}},\eta } = 0\) and let sequences \(\{{{\textbf {v}}}^k\},\{{{\textbf {w}}}^k\},\{{{\textbf {u}}}^k\}\) be as in Lemma 3.12. Then \({{\textbf {v}}}^k\ne {{\textbf {w}}}^k\) for all k because \(\{{{\textbf {v}}}^k\}\subset K_{\exp }\backslash {{\mathcal {F}}}_{-\infty }\), \(\{{{\textbf {w}}}^k\}\subset \{{{\textbf {z}}}\}^\perp \), and \({{\mathcal {F}}}_{-\infty } = K_{\exp }\cap \{{{\textbf {z}}}\}^\perp \). Since \(\mathfrak {g}_{-\infty }(t)/|t|^{\alpha } \downarrow 0 \) as \(t \downarrow 0\), we have
which contradicts Theorem 4.2 because the quantity in (3.15) should be positive for \(\mathfrak {g}= \mathfrak {g}_{-\infty }\). \(\square \)
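The limit \(\mathfrak {g}_{-\infty }(t)/|t|^{\alpha } \downarrow 0\) invoked in the proof can be observed numerically. The sketch below assumes only that \(\mathfrak {g}_{-\infty }\) behaves like \(-t\ln t\) near the origin, in line with its description as a modified Boltzmann–Shannon entropy:

```python
import math

alpha = 0.5  # any exponent in (0, 1) works similarly

def entropy_like(t):
    """-t*ln(t): assumed Boltzmann-Shannon-type behaviour of
    g_{-inf} near the origin (for illustration only)."""
    return -t * math.log(t)

ts = [10.0**-k for k in range(1, 7)]
ratios = [entropy_like(t) / t**alpha for t in ts]
# -t^{1-alpha} * ln(t) decreases to 0 along this range of t,
# so the entropic bound eventually beats every Holder bound.
assert all(r > 0 for r in ratios)
assert all(a > b for a, b in zip(ratios, ratios[1:]))
```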
Recalling (4.15), we obtain one-step facial residual functions using Theorem 4.2 and Corollary 4.3 in combination with Theorem 3.10, Remark 3.11 and Lemma 3.9.
Corollary 4.4
(\(\mathbb {1}\)-FRF concerning \({{\mathcal {F}}}_{-\infty }\)) Let \({{\textbf {z}}}\in K_{\exp }^*\) be such that \(\{{{\textbf {z}}}\}^\perp \cap K_{\exp }={{\mathcal {F}}}_{-\infty }\) is the two-dimensional face of \(K_{\exp }\). Let \(\mathfrak {g}= \mathfrak {g}_{-\infty }\) in (4.14) or \(\mathfrak {g}= |\cdot |^\alpha \) for \(\alpha \in (0,1)\).
Let \(\kappa _{{{\textbf {z}}},t}\) be defined as in (3.16). Then the function \(\psi _{ \mathcal {K},{{\textbf {z}}}}:\mathbb {R}_+\times \mathbb {R}_+\rightarrow \mathbb {R}_+\) given by
is a \(\mathbb {1}\)-FRF for \(K_{\exp }\) and \({{\textbf {z}}}\). In particular, there exist \(\kappa > 0\) and a nonnegative monotone nondecreasing function \(\rho :\mathbb {R}_+ \rightarrow \mathbb {R}_+\) such that the function \({\hat{\psi }}_{ \mathcal {K},{{\textbf {z}}}}\) given by \({\hat{\psi }}_{ \mathcal {K},{{\textbf {z}}}}(\epsilon ,t) {:}{=}\kappa \epsilon + \rho (t)\mathfrak {g}(\epsilon )\) is a \(\mathbb {1}\)-FRF for \(K_{\exp }\) and \({{\textbf {z}}}\).
4.2.2 \({{\mathcal {F}}}_\beta \): the family of one-dimensional faces \(\beta \in \mathbb {R}\)
Recall from Proposition 4.1 that for each \(\beta \in \mathbb {R}\),
is a one-dimensional face of \(K_{\exp }\). We will now show that for \({{\mathcal {F}}}_\beta \), \(\beta \in \mathbb {R}\), the \(\gamma _{{{\textbf {z}}},\eta }\) defined in Theorem 3.10 is positive when \(\mathfrak {g}= |\cdot |^\frac{1}{2}\). Our discussion will be centered around the following quantities, which were also defined and used in the proof of Theorem 3.10. Specifically, for \({{\textbf {z}}}\in K_{\exp }^*\) such that \({{\mathcal {F}}}_{\beta } = K_{\exp }\cap \{{{\textbf {z}}}\}^\perp \), we let \({{\textbf {v}}}\in \partial K_{\exp }\cap B(\eta )\backslash {{\mathcal {F}}}_\beta \) and define
We first note the following three important vectors:
Note that \(\widehat{{{\textbf {z}}}}\) is parallel to \({{\textbf {z}}}\) in (4.6) (recall that \(z_x < 0\) for \({{\mathcal {F}}}_\beta \), where \(\beta := \frac{z_y}{z_x}\in \mathbb {R}\)), \({{\mathcal {F}}}_\beta \) is the conic hull of \(\{\widehat{{{\textbf {f}}}}\}\) according to Proposition 4.1, \(\langle \widehat{{{\textbf {z}}}},\widehat{{{\textbf {f}}}}\rangle =0\) and \(\widehat{{{\textbf {p}}}}= \widehat{{{\textbf {z}}}}\times \widehat{{{\textbf {f}}}}\ne {{\varvec{0}}}\). These three nonzero vectors form a mutually orthogonal set. The next lemma represents \(\Vert {{\textbf {u}}}- {{\textbf {w}}}\Vert \) and \(\Vert {{\textbf {w}}}- {{\textbf {v}}}\Vert \) in terms of inner products of \({{\textbf {v}}}\) with these vectors, whenever possible.
Lemma 4.5
Let \(\beta \in \mathbb {R}\) and \({{\textbf {z}}}\in K_{\exp }^*\) with \(z_x<0\) such that \({{\mathcal {F}}}_\beta = \{{{\textbf {z}}}\}^\perp \cap K_{\exp }\) is a one-dimensional face of \(K_{\exp }\). Let \(\eta >0\), \({{\textbf {v}}}\in \partial K_{\exp }\cap B(\eta )\backslash {{\mathcal {F}}}_\beta \) and define \({{\textbf {w}}}\) and \({{\textbf {u}}}\) as in (4.23). Let \(\{\widehat{{{\textbf {z}}}},\widehat{{{\textbf {f}}}},\widehat{{{\textbf {p}}}}\}\) be as in (4.24). Then
Moreover, when \(\langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}\rangle \ge 0\), we have \({{\textbf {u}}}= P_{\mathrm{span}{{\mathcal {F}}}_\beta }{{\textbf {w}}}\).
Proof
Since \(\{\widehat{{{\textbf {z}}}},\widehat{{{\textbf {f}}}},\widehat{{{\textbf {p}}}}\}\) is orthogonal, one can decompose \({{\textbf {v}}}\) as
with
Also, since \(\widehat{{{\textbf {z}}}}\) is parallel to \({{\textbf {z}}}\), we must have \({{\textbf {w}}}= \lambda _2 \widehat{{{\textbf {f}}}}+ \lambda _3 \widehat{{{\textbf {p}}}}\). Thus, it holds that \(\Vert {{\textbf {w}}}-{{\textbf {v}}}\Vert = |\lambda _1|\Vert \widehat{{{\textbf {z}}}}\Vert \) and the first conclusion follows from this and (4.26).
Next, we have \({{\textbf {u}}}= {\hat{t}} \,\widehat{{{\textbf {f}}}}\), where
Moreover, observe from (4.25) that \(\langle \widehat{{{\textbf {f}}}},{{\textbf {w}}}\rangle = \langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}- \lambda _1\widehat{{{\textbf {z}}}}\rangle = \langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}\rangle \). These mean that when \(\langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}\rangle < 0\), we have \({{\textbf {u}}}= 0\), while when \(\langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}\rangle \ge 0\), we have \({{\textbf {u}}}= \frac{\langle {{\textbf {w}}},\widehat{{{\textbf {f}}}}\rangle }{\Vert \widehat{{{\textbf {f}}}}\Vert ^2}\widehat{{{\textbf {f}}}}=P_{\mathrm{span}{{\mathcal {F}}}_\beta }{{\textbf {w}}}\) and
where the second and the third equalities follow from (4.25), (4.26), and the fact that \({{\textbf {w}}}= \lambda _2 \widehat{{{\textbf {f}}}}+ \lambda _3 \widehat{{{\textbf {p}}}}\). This completes the proof. \(\square \)
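The algebra in Lemma 4.5 can be sanity-checked numerically. The sketch below uses a hypothetical orthogonal triple in place of the concrete vectors from (4.24) (whose closed forms are not repeated here); it verifies that \(\Vert {{\textbf {w}}}-{{\textbf {v}}}\Vert = |\lambda _1|\Vert \widehat{{{\textbf {z}}}}\Vert \), that \(\langle \widehat{{{\textbf {f}}}},{{\textbf {w}}}\rangle = \langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}\rangle \), and that projecting \({{\textbf {w}}}\) onto the ray generated by \(\widehat{{{\textbf {f}}}}\) agrees with the projection onto its span when \(\langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}\rangle \ge 0\):

```python
import numpy as np

# hypothetical mutually orthogonal triple playing the roles of (z^, f^, p^)
z_hat = np.array([1.0, -1.0, 0.0])
f_hat = np.array([1.0, 1.0, 1.0])
p_hat = np.cross(z_hat, f_hat)           # orthogonal to both by construction

v = np.array([0.3, 0.2, 0.5])            # an arbitrary test point
lam1 = (v @ z_hat) / (z_hat @ z_hat)

# w = projection of v onto {z}^perp (z parallel to z_hat)
w = v - lam1 * z_hat
assert np.isclose(np.linalg.norm(w - v), abs(lam1) * np.linalg.norm(z_hat))

# <f^, w> = <f^, v>, since <f^, z^> = 0
assert np.isclose(f_hat @ w, f_hat @ v)

# here <f^, v> >= 0, so projecting w onto the ray cone{f^} equals
# projecting onto span{f^}, as in the last claim of the lemma
u_ray = max(f_hat @ w, 0.0) / (f_hat @ f_hat) * f_hat
u_span = ((f_hat @ w) / (f_hat @ f_hat)) * f_hat
assert np.allclose(u_ray, u_span)
```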
We now prove our main theorem in this section.
Theorem 4.6
(Hölderian error bound concerning \({{\mathcal {F}}}_\beta \), \(\beta \in \mathbb {R}\)) Let \(\beta \in \mathbb {R}\) and \({{\textbf {z}}}\in K_{\exp }^*\) with \(z_x<0\) such that \({{\mathcal {F}}}_\beta = \{{{\textbf {z}}}\}^\perp \cap K_{\exp }\) is a one-dimensional face of \(K_{\exp }\). Let \(\eta > 0\) and let \(\gamma _{{{\textbf {z}}},\eta }\) be defined as in (3.15) with \(\mathfrak {g}= |\cdot |^\frac{1}{2}\). Then \(\gamma _{{{\textbf {z}}},\eta } \in (0,\infty ]\) and
Proof
In view of Lemma 3.12, take any \(\bar{{{\textbf {v}}}}\in {{\mathcal {F}}}_\beta \) and a sequence \(\{{{\textbf {v}}}^k\}\subset \partial K_{\exp }\cap B(\eta ) \backslash {{\mathcal {F}}}_\beta \) such that
where \({{\textbf {w}}}^k = P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {v}}}^k\), \({{\textbf {u}}}^k = P_{{{\mathcal {F}}}_\beta }{{\textbf {w}}}^k\), and \({{\textbf {w}}}^k\ne {{\textbf {u}}}^k\). We will show that (3.21b) does not hold for \(\mathfrak {g}= |\cdot |^\frac{1}{2}\).
We first suppose that \({{\textbf {v}}}^k\in {{\mathcal {F}}}_{-\infty }\) infinitely often. By extracting a subsequence if necessary, we may assume that \({{\textbf {v}}}^k\in {{\mathcal {F}}}_{-\infty }\) for all k. From the definition of \({{\mathcal {F}}}_{-\infty }\) in (4.13), we have \(v_x^k\le 0\), \(v_y^k=0\) and \(v_z^k \ge 0\). Thus, recalling the definition of \(\widehat{{{\textbf {z}}}}\) in (4.24), it holds that
where the last equality holds because \(v_y^k=0\). Next, using properties of projections onto subspaces, we have \(\Vert {{\textbf {w}}}^k\Vert \le \Vert {{\textbf {v}}}^k\Vert \). This together with Lemma 4.5 and (4.28) shows that
Since \({{\textbf {w}}}^k \rightarrow \bar{{{\textbf {v}}}}\) and \({{\textbf {v}}}^k \rightarrow \bar{{{\textbf {v}}}}\), the above display shows that (3.21b) does not hold for \(\mathfrak {g}= |\cdot |^\frac{1}{2}\) in this case.
Next, suppose that \({{\textbf {v}}}^k\notin {{\mathcal {F}}}_{-\infty }\) for all large k instead. By passing to a subsequence, we assume that \({{\textbf {v}}}^k\notin {{\mathcal {F}}}_{-\infty }\) for all k. In view of (4.1) and (4.13), this means in particular that
We consider two cases and show that (3.21b) does not hold for \(\mathfrak {g}= |\cdot |^\frac{1}{2}\) in either of them:
-
(I)
\(\langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}^k\rangle \ge 0\) infinitely often;
-
(II)
\(\langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}^k\rangle < 0\) for all large k.
(I): Since \(\langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}^k\rangle \ge 0\) infinitely often, by extracting a subsequence if necessary, we assume that \(\langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}^k\rangle \ge 0\) for all k. Now, consider the following functions:
Using these functions, Lemma 4.5, (4.24) and (4.29), one can see immediately that
Note that \(h_1\) is zero if and only if \(\zeta = 1 - \beta \). Furthermore, we have \(h_1'(1-\beta ) = 0\) and \(h_1''(1-\beta ) = -1\). Then, considering the Taylor expansion of \(h_1\) around \(1-\beta \) we have
Also, one can check that \(h_2(1-\beta )=0\) and that
Therefore, we have the following Taylor expansion of \(h_2\) around \(1-\beta \):
Thus, using the Taylor expansions of \(h_1\) and \(h_2\) at \(1-\beta \) we have
Hence, there exist \(C_h > 0\) and \(\epsilon >0\) so that
Next, consider the following function
Then it is easy to check that H is proper closed and is never zero. Moreover, by direct computation, we have \(\lim \limits _{\zeta \rightarrow \infty }H(\zeta ) = \frac{e^{\beta -1}}{\beta ^2-\beta +1} > 0\) and
Thus, we deduce that \(\inf H > 0\).
Now, if it happens that \(|v^k_x/v^k_y - (1-\beta )|> \epsilon \) for all large k, upon letting \(\zeta _k:= v^k_x/v^k_y\), we have from (4.30) that for all large k,
where the second equality holds because of the definition of H and the facts that \({{\textbf {w}}}^k\ne {{\textbf {u}}}^k\) (so that \(h_2(\zeta _k)\ne 0\) by (4.30)) and \(|v^k_x/v^k_y - (1-\beta )|> \epsilon \) for all large k.
On the other hand, if it holds that \(|v^k_x/v^k_y - (1-\beta )|\le \epsilon \) infinitely often, then by extracting a further subsequence, we may assume that \(|v^k_x/v^k_y - (1-\beta )|\le \epsilon \) for all k. Upon letting \(\zeta _k:= v^k_x/v^k_y\), we have from (4.30) that
where the first inequality holds thanks to \(|v^k_x/v^k_y - (1-\beta )|\le \epsilon \) for all k, (4.33) and the fact that \({{\textbf {w}}}^k\ne {{{\textbf {u}}}}^k\) (so that \(h_2(\zeta _k)\ne 0\) and hence \(h_1(\zeta _k)\ne 0\)), and the second inequality holds because \({{\textbf {v}}}^k\in B(\eta )\).
Using (4.34) and (4.35) together with (4.27), we see that (3.21b) does not hold for \(\mathfrak {g}= |\cdot |^\frac{1}{2}\). This concludes case (I).
(II): By passing to a subsequence, we may assume that \(\langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}^k\rangle < 0\) for all k. Then we see from (4.24) and (4.29) that
Using this together with the fact that \((1-\beta )^2+1+e^{2(1-\beta )} > 0\), we deduce that there exists \(\epsilon > 0\) so that
Now, consider the following function
Then G is continuous and is zero if and only if \(\zeta = 1-\beta \). Moreover, by direct computation, we have \(\lim \nolimits _{\zeta \rightarrow \infty } G(\zeta ) = e^{\beta -1} > 0\) and \(\lim \nolimits _{\zeta \rightarrow -\infty } G(\zeta ) = 1 > 0\). Thus, it follows that
Finally, since \(\langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}^k\rangle < 0\) for all k, we see that
where (a) follows from \(\Vert {{\textbf {w}}}^k-{{\textbf {u}}}^k\Vert = \Vert {{\textbf {w}}}^k\Vert \) (see Lemma 4.5) and \(\Vert {{\textbf {w}}}^k\Vert \le \Vert {{\textbf {v}}}^k\Vert \) (because \({{\textbf {w}}}^k\) is the projection of \({{{\textbf {v}}}}^k\) onto a subspace), (b) follows from Lemma 4.5 and (4.29), (c) holds because \(v^k_y > 0\) (see (4.29)) and we defined \(\zeta _k:= v_x^k/v_y^k\), (d) follows from (4.24) and the definition of G, and (e) follows from (4.36) and (4.37). The above together with (4.27) shows that (3.21b) does not hold for \(\mathfrak {g}= |\cdot |^\frac{1}{2}\), which is what we wanted to show in case (II).
Summarizing the above cases, we conclude that no sequence \(\{{{\textbf {v}}}^k\}\) (with its associated \(\{{{\textbf {w}}}^k\}\) and \(\{{{\textbf {u}}}^k\}\)) exists for which (3.21b) holds for \(\mathfrak {g}= |\cdot |^\frac{1}{2}\). By Lemma 3.12, it must then hold that \(\gamma _{{{\textbf {z}}},\eta }\in (0,\infty ]\), and we have the desired error bound in view of Theorem 3.10. This completes the proof. \(\square \)
Combining Theorems 4.6, 3.10 and Lemma 3.9, and using the observation that \(\gamma _{{{\textbf {z}}},0}=\infty \) (see (3.15)), we obtain a one-step facial residual function in the following corollary.
Corollary 4.7
(\(\mathbb {1}\)-FRF concerning \({{\mathcal {F}}}_\beta \), \(\beta \in \mathbb {R}\)) Let \(\beta \in \mathbb {R}\) and \({{\textbf {z}}}\in K_{\exp }^*\) with \(z_x<0\) such that \({{\mathcal {F}}}_\beta = \{{{\textbf {z}}}\}^\perp \cap K_{\exp }\) is a one-dimensional face of \(K_{\exp }\). Let \(\kappa _{{{\textbf {z}}},t}\) be defined as in (3.16) with \(\mathfrak {g}= |\cdot |^\frac{1}{2}\). Then the function \(\psi _{ \mathcal {K},{{\textbf {z}}}}:\mathbb {R}_+\times \mathbb {R}_+\rightarrow \mathbb {R}_+\) given by
is a \(\mathbb {1}\)-FRF for \(K_{\exp }\) and \({{\textbf {z}}}\). In particular, there exist \(\kappa > 0\) and a nonnegative monotone nondecreasing function \(\rho :\mathbb {R}_+ \rightarrow \mathbb {R}_+\) such that the function \({\hat{\psi }}_{ \mathcal {K},{{\textbf {z}}}}\) given by \({\hat{\psi }}_{ \mathcal {K},{{\textbf {z}}}}(\epsilon ,t) {:}{=}\kappa \epsilon + \rho (t)\sqrt{\epsilon }\) is a \(\mathbb {1}\)-FRF for \(K_{\exp }\) and \({{\textbf {z}}}\).
4.2.3 \({{\mathcal {F}}}_{\infty }\): the exceptional one-dimensional face
Recall the special one-dimensional face of \(K_{\exp }\) defined by
We first show that we have a Lipschitz error bound for any exposing normal vector \({{\textbf {z}}}= (0, z_y,z_z)\) with \(z_y > 0\) and \(z_z > 0\).
Theorem 4.8
(Lipschitz error bound concerning \({{\mathcal {F}}}_{\infty }\)) Let \({{\textbf {z}}}\in K_{\exp }^*\) with \(z_x=0\), \(z_y > 0\) and \(z_z>0\) so that \(\{{{\textbf {z}}}\}^\perp \cap K_{\exp }={{\mathcal {F}}}_\infty \). Let \(\eta > 0\) and let \(\gamma _{{{\textbf {z}}},\eta }\) be defined as in (3.15) with \(\mathfrak {g}= |\cdot |\). Then \(\gamma _{{{\textbf {z}}},\eta } \in (0,\infty ]\) and
Proof
Without loss of generality, upon scaling, we may assume that \({{\textbf {z}}}= (0,a,1)\) for some \(a > 0\). Similarly as in the proof of Theorem 4.6, we will consider the following vectors:
Here, \({{\mathcal {F}}}_\infty \) is the conic hull of \(\widetilde{{{\textbf {f}}}}\) (see (4.12)), and \(\widetilde{{{\textbf {p}}}}\) is constructed so that \(\{\widetilde{{{\textbf {z}}}},\widetilde{{{\textbf {f}}}},\widetilde{{{\textbf {p}}}}\}\) is orthogonal.
Now, let \({{\textbf {v}}}\in \partial K_{\exp }\cap B(\eta )\backslash {{\mathcal {F}}}_\infty \), \({{\textbf {w}}}= P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {v}}}\) and \({{\textbf {u}}}=P_{{{\mathcal {F}}}_\infty }{{\textbf {w}}}\) with \({{\textbf {u}}}\ne {{\textbf {w}}}\). Then, as in Lemma 4.5, by decomposing \({{\textbf {v}}}\) as a linear combination of \(\{\widetilde{{{\textbf {z}}}},\widetilde{{{\textbf {f}}}},\widetilde{{{\textbf {p}}}}\}\), we have
We consider the following cases for estimating \(\gamma _{{{\textbf {z}}},\eta }\).
-
(I)
\({{\textbf {v}}}\in {{\mathcal {F}}}_{-\infty }\backslash {{\mathcal {F}}}_\infty \);
-
(II)
\({{\textbf {v}}}\notin {{\mathcal {F}}}_{-\infty }\) with \(v_x \le 0\);
-
(III)
\({{\textbf {v}}}\notin {{\mathcal {F}}}_{-\infty }\) with \(v_x > 0\).
(I): In this case, \({{\textbf {v}}}= (v_x,0,v_z)\) with \(v_x\le 0\le v_z\); see (4.13). Then \(\langle \widetilde{{{\textbf {f}}}},{{\textbf {v}}}\rangle = -v_x\ge 0\) and \(|\langle \widetilde{{{\textbf {z}}}},{{\textbf {v}}}\rangle | = |v_z| = \frac{1}{a}|\langle \widetilde{{{\textbf {p}}}},{{\textbf {v}}}\rangle |\). Consequently, we have from (4.38) that
(II): In this case, in view of (4.1) and (4.13), we have \({{\textbf {v}}}= (v_x,v_y,v_ye^{v_x/v_y})\) with \(v_x\le 0\) and \(v_y > 0\). Then \(\langle \widetilde{{{\textbf {f}}}},{{\textbf {v}}}\rangle = -v_x\ge 0\). Moreover, since \(v_y>0\), we have
Using (4.38), we then obtain that \(\Vert {{\textbf {w}}}- {{\textbf {v}}}\Vert \ge \frac{\min \{1,a\}\Vert \widetilde{{{\textbf {p}}}}\Vert }{\max \{1,a\}\Vert \widetilde{{{\textbf {z}}}}\Vert }\Vert {{\textbf {w}}}- {{\textbf {u}}}\Vert \).
(III): In this case, in view of (4.1) and (4.13), \({{\textbf {v}}}= (v_x,v_y,v_ye^{v_x/v_y})\) with \(v_x> 0\) and \(v_y > 0\). Then \(\langle \widetilde{{{\textbf {f}}}},{{\textbf {v}}}\rangle = -v_x< 0\) and hence \(\Vert {{\textbf {w}}}- {{\textbf {u}}}\Vert = \Vert {{\textbf {w}}}\Vert \le \Vert {{\textbf {v}}}\Vert \), where the equality follows from (4.38) and the inequality holds because \({{\textbf {w}}}\) is the projection of \({{\textbf {v}}}\) onto a subspace. Since \(v_y > 0\), we have
where we used \(v_y > 0\) and \(e^t\ge 1+t\) for all t in (a) and \(\Vert {{\textbf {v}}}\Vert _1\) denotes the 1-norm of \({{\textbf {v}}}\). Combining this with (4.38) and the fact that \(\Vert {{\textbf {w}}}\Vert \le \Vert {{\textbf {v}}}\Vert \), we see that
Summarizing the three cases, we conclude that
In view of Theorem 3.10, we have the desired error bound. This completes the proof. \(\square \)
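The boundedness of \(\Vert {{\textbf {w}}}-{{\textbf {u}}}\Vert /\Vert {{\textbf {w}}}-{{\textbf {v}}}\Vert \) established above can also be observed numerically. The sketch below is an illustration only: it assumes \({{\textbf {z}}}=(0,a,1)\) with \(a=1\), assumes \({{\mathcal {F}}}_\infty \) is the ray generated by \((-1,0,0)\) (consistent with \(\langle \widetilde{{{\textbf {f}}}},{{\textbf {v}}}\rangle = -v_x\) in case (I)), and samples boundary points \((x,y,ye^{x/y})\):

```python
import math

z = (0.0, 1.0, 1.0)                      # assumed exposing vector (a = 1)
nz2 = sum(c * c for c in z)

def proj_hyperplane(v):
    """Projection of v onto the hyperplane {z}^perp."""
    lam = sum(vi * zi for vi, zi in zip(v, z)) / nz2
    return tuple(vi - lam * zi for vi, zi in zip(v, z))

def proj_Finf(w):
    """Projection onto the ray {(t, 0, 0) : t <= 0} (assumed form of F_inf)."""
    return (min(w[0], 0.0), 0.0, 0.0)

def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

ratios = []
for t in (-5.0, -1.0, -0.1, 0.1, 1.0, 5.0, 20.0):   # t plays the role of x/y
    for y in (1e-3, 1e-1, 0.5):
        v = (t * y, y, y * math.exp(t))              # boundary point of K_exp
        w = proj_hyperplane(v)
        u = proj_Finf(w)
        if dist(w, v) > 0:
            ratios.append(dist(w, u) / dist(w, v))

# ||w - u|| / ||w - v|| stays bounded over the samples, consistent with
# the Lipschitz error bound of Theorem 4.8
assert max(ratios) < 2.0
```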
We next turn to the supporting hyperplane defined by \({{\textbf {z}}}= (0,0,z_z)\) for some \(z_z>0\), so that \(\{{{\textbf {z}}}\}^\perp \) is the xy-plane. The following lemma demonstrates that a Hölderian-type error bound in the form of (3.16) with \(\mathfrak {g}= |\cdot |^\alpha \), \(\alpha \in (0,1]\), no longer works in this case.
Lemma 4.9
(Nonexistence of Hölderian error bounds) Let \({{\textbf {z}}}\in K_{\exp }^*\) with \(z_x= z_y = 0\) and \(z_z>0\) so that \(\{{{\textbf {z}}}\}^\perp \cap K_{\exp }={{\mathcal {F}}}_\infty \). Let \(\alpha \in (0,1]\) and \(\eta > 0\). Then
Proof
For each \(k\in \mathbb {N}\), let \({{\textbf {q}}}^k := (-\frac{\eta }{2},\frac{\eta }{2k},0)\). Then \({{\textbf {q}}}^k\in \{{{\textbf {z}}}\}^\perp \cap B(\eta )\backslash {{\mathcal {F}}}_\infty \) and we have \(\text {d}({{\textbf {q}}}^k,{{\mathcal {F}}}_\infty ) = \frac{\eta }{2k}\). Moreover, since \((q^k_x,q^k_y,q^k_ye^{q^k_x/q^k_y})\in K_{\exp }\), we have \(\text {d}({{\textbf {q}}}^k,K_{\exp })\le q^k_ye^{q^k_x/q^k_y} = \frac{\eta }{2k}e^{-k}\). Then it holds that
since \(\alpha \in (0,1]\). This completes the proof. \(\square \)
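The blow-up in Lemma 4.9 is easy to observe numerically. In the sketch below (an illustration under the assumption that \({{\mathcal {F}}}_\infty \) is the ray generated by \((-1,0,0)\)), the distance from \({{\textbf {q}}}^k\) to \({{\mathcal {F}}}_\infty \) decays like \(1/k\) while the distance to \(K_{\exp }\) decays like \(e^{-k}/k\); the Hölder ratio is \((\eta /2k)^{1-\alpha }e^{\alpha k}\), which diverges for every \(\alpha \in (0,1]\), and we take \(\alpha =1\) here:

```python
import math

eta, alpha = 1.0, 1.0

prev = 0.0
for k in (5, 10, 20, 40):
    q = (-eta / 2, eta / (2 * k), 0.0)
    d_face = eta / (2 * k)                      # d(q^k, F_inf), via projection (-eta/2, 0, 0)
    d_cone_ub = (eta / (2 * k)) * math.exp(-k)  # d(q^k, K_exp) <= q_y * exp(q_x / q_y)
    ratio = d_face / d_cone_ub ** alpha         # equals e^k for alpha = 1
    assert ratio > prev                         # strictly increasing, unbounded
    prev = ratio
```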
Since a zero-at-zero monotone nondecreasing function of the form \((\cdot )^\alpha \) no longer works, we opt for the following function \(\mathfrak {g}_\infty :\mathbb {R}_+\rightarrow \mathbb {R}_+\) that grows faster around \(t=0\):
Similar to \(\mathfrak {g}_{-\infty }\) in (4.14), \(\mathfrak {g}_{\infty }\) is monotone increasing and there exists a constant \({\widehat{L}} \ge 1\) such that the following inequalities hold for every \(t \in \mathbb {R}_+\) and \(M > 0\):
We next show that an error bound in the form of (3.16) holds for \({{\textbf {z}}}= (0,0,z_z)\), \(z_z>0\), if we use \(\mathfrak {g}_\infty \).
Theorem 4.10
(Log-type error bound concerning \({{\mathcal {F}}}_{\infty }\)) Let \({{\textbf {z}}}\in K_{\exp }^*\) with \(z_x=z_y = 0\) and \(z_z>0\) so that \(\{{{\textbf {z}}}\}^\perp \cap K_{\exp }={{\mathcal {F}}}_\infty \). Let \(\eta > 0\) and let \(\gamma _{{{\textbf {z}}},\eta }\) be defined as in (3.15) with \(\mathfrak {g}= \mathfrak {g}_\infty \) in (4.39). Then \(\gamma _{{{\textbf {z}}},\eta } \in (0,\infty ]\) and
Proof
Take \(\bar{{{\textbf {v}}}}\in {{\mathcal {F}}}_\infty \) and a sequence \(\{{{\textbf {v}}}^k\}\subset \partial K_{\exp }\cap B(\eta ) \backslash {{\mathcal {F}}}_\infty \) such that
where \({{\textbf {w}}}^k = P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {v}}}^k\), \({{\textbf {u}}}^k = P_{{{\mathcal {F}}}_\infty }{{\textbf {w}}}^k\), and \({{\textbf {w}}}^k\ne {{\textbf {u}}}^k\). Since \({{\textbf {w}}}^k\ne {{\textbf {u}}}^k\), in view of (4.12) and (4.13), we must have \({{\textbf {v}}}^k\notin {{\mathcal {F}}}_{-\infty }\). Then, from (4.1) and (4.12), we have
Since \({{\textbf {w}}}^k\rightarrow \bar{{{\textbf {v}}}}\) and \({{\textbf {v}}}^k\rightarrow \bar{{{\textbf {v}}}}\), without loss of generality, by passing to a subsequence if necessary, we assume in addition that \(\Vert {{\textbf {w}}}^k - {{\textbf {v}}}^k\Vert \le e^{-2}\) for all k. From (4.41) we conclude that \({{\textbf {v}}}^k \ne {{\textbf {w}}}^k\), hence \(\mathfrak {g}_\infty (\Vert {{\textbf {w}}}^k - {{\textbf {v}}}^k\Vert ) = -(\ln \Vert {{\textbf {w}}}^k - {{\textbf {v}}}^k\Vert )^{-1}\).
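It is instructive to check numerically that \(\mathfrak {g}_\infty \) decays to zero more slowly than any power, which is the reason a log-type bound can hold where Hölderian bounds fail (Lemma 4.9). A small sketch, assuming \(\mathfrak {g}_\infty (t) = -1/\ln t\) for small \(t>0\), as in the display above:

```python
import math

def g_inf(t):
    # assumed form of g_inf near zero: -1 / ln(t), valid for 0 < t <= e^{-2}
    return -1.0 / math.log(t)

# g_inf(t) -> 0 as t -> 0+, but slower than any fixed power t^alpha:
# the ratio g_inf(t) / t^alpha grows without bound along t -> 0+
for alpha in (0.5, 0.1):
    ratios = [g_inf(t) / t ** alpha for t in (1e-4, 1e-8, 1e-16)]
    assert ratios[0] < ratios[1] < ratios[2]
```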
We consider the following two cases in order to show that (3.21b) does not hold for \(\mathfrak {g}= \mathfrak {g}_\infty \):
-
(I)
\(\bar{{{\textbf {v}}}}\ne {\textbf {0}}\);
-
(II)
\(\bar{{{\textbf {v}}}}= {\textbf {0}}\).
(I): In this case, we have \(\bar{{{\textbf {v}}}}= ({\bar{v}}_x,0,0)\) for some \({\bar{v}}_x < 0\). This implies that \(v^k_x<0\) for all large k. Hence, we have from (4.41) that for all large k,
since \(v^k_y\rightarrow 0\) and \(v^k_x\rightarrow {\bar{v}}_x < 0\). This shows that (3.21b) does not hold for \(\mathfrak {g}= \mathfrak {g}_\infty \).
(II): If \(v_x^k\le 0\) infinitely often, by extracting a subsequence, we assume that \(v_x^k\le 0\) for all k. Since \({{\textbf {w}}}^k\ne {{\textbf {u}}}^k\) (and \({{\textbf {w}}}^k\ne {{\textbf {v}}}^k\)), we note from (4.41) that
Since \(\{-(v^k_y\ln v^k_y + v^k_x)\}\) is a positive sequence and it converges to zero as \((v^k_x,v^k_y)\rightarrow 0\), it follows that \(\lim \nolimits _{k\rightarrow \infty }\frac{-(\ln \Vert {{\textbf {w}}}^k-{{\textbf {v}}}^k\Vert )^{-1}}{\Vert {{\textbf {w}}}^k - {{\textbf {u}}}^k\Vert }=\infty \). This shows that (3.21b) does not hold for \(\mathfrak {g}= \mathfrak {g}_\infty \).
Now, it remains to consider the case that \(v_x^k>0\) for all large k. By passing to a subsequence if necessary, we assume that \(v_x^k>0\) for all k. By solving for \(v_x^k\) from \(v^k_z=v^k_y e^{v^k_x/v^k_y} > 0\) and noting (4.41), we obtain that
Also, we note from \(v_x^k=v^k_y\ln (v_z^k/v_y^k)>0\), \(v_y^k>0\) and the monotonicity of \(\ln (\cdot )\) that for all k,
Next consider the function \(h(t) := \frac{1}{t}\sqrt{1+(\ln t)^2}\) on \([1,\infty )\). Then h is continuous and positive. Since \(h(1)=1\) and \(\lim _{t\rightarrow \infty }h(t) = 0\), there exists \(M_h\) such that \(h(t)\le M_h\) for all \(t\ge 1\). Now, using (4.42), we have, upon defining \(t_k:= v_z^k/v_y^k\) that
where the division by \(v_z^k\) in (a) is legitimate because \(v_z^k>0\), (b) follows from the definition of h and the fact that \(t_k > 1\) (see (4.43)), and (c) holds because of the definition of \(M_h\) and the fact that \(-\ln v_z^k > 0\) (thanks to \(v_z^k = \Vert {{\textbf {w}}}^k - {{\textbf {v}}}^k\Vert \le e^{-2}\)). Since \(v^k_z\rightarrow 0\), it then follows that \(\left\{ \frac{\Vert {{\textbf {w}}}^k - {{\textbf {u}}}^k\Vert }{-(\ln \Vert {{\textbf {w}}}^k-{{\textbf {v}}}^k\Vert )^{-1}}\right\} \) is a positive sequence that converges to zero. Thus, \(\lim \nolimits _{k\rightarrow \infty }\frac{-(\ln \Vert {{\textbf {w}}}^k-{{\textbf {v}}}^k\Vert )^{-1}}{\Vert {{\textbf {w}}}^k - {{\textbf {u}}}^k\Vert }=\infty \), which again shows that (3.21b) does not hold for \(\mathfrak {g}= \mathfrak {g}_\infty \).
Having shown that (3.21b) does not hold for \(\mathfrak {g}= \mathfrak {g}_\infty \), in view of Lemma 3.12, we must have \(\gamma _{{{\textbf {z}}},\eta } \in (0,\infty ]\). Then the result follows from Theorem 3.10 and (4.40). \(\square \)
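The divergence established in case (II) with \(v_x^k\le 0\) can be reproduced numerically. The sketch below is illustrative: it assumes \({{\textbf {z}}}=(0,0,1)\) (so \({{\textbf {w}}}^k\) is the projection onto the xy-plane), assumes \({{\mathcal {F}}}_\infty \) is the ray generated by \((-1,0,0)\), and takes boundary points \({{\textbf {v}}}^k=(-1/k,\,1/k^2,\,k^{-2}e^{-k})\):

```python
import math

def g_inf(t):
    # assumed form of g_inf for small t (as in the proof): -1 / ln(t)
    return -1.0 / math.log(t)

prev = 0.0
for k in (3, 10, 30, 100):
    vx, vy = -1.0 / k, 1.0 / k**2
    v = (vx, vy, vy * math.exp(vx / vy))   # boundary point of K_exp
    w = (vx, vy, 0.0)                      # projection onto {z}^perp = xy-plane
    u = (vx, 0.0, 0.0)                     # projection of w onto F_inf (vx <= 0)
    d_wv = v[2]                            # ||w - v|| = v_z
    d_wu = vy                              # ||w - u|| = v_y
    ratio = g_inf(d_wv) / d_wu             # = k^2 / (k + 2 ln k) -> infinity
    assert ratio > prev
    prev = ratio
```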
Combining Theorem 4.8, Theorem 4.10 and Lemma 3.9, and noting (4.40) and \(\gamma _{{{\textbf {z}}},0}=\infty \) (see (3.15)), we can now summarize the one-step facial residual functions derived in this section in the following corollary.
Corollary 4.11
(\(\mathbb {1}\)-FRF concerning \({{\mathcal {F}}}_{\infty }\)) Let \({{\textbf {z}}}\in K_{\exp }^*\) with \(z_x=0\) and \(\{{{\textbf {z}}}\}^\perp \cap K_{\exp }={{\mathcal {F}}}_\infty \).
-
(i)
In the case when \(z_y > 0\), let \(\kappa _{{{\textbf {z}}},t}\) be defined as in (3.16) with \(\mathfrak {g}= |\cdot |\). Then the function \(\psi _{ \mathcal {K},{{\textbf {z}}}}:\mathbb {R}_+\times \mathbb {R}_+\rightarrow \mathbb {R}_+\) given by
$$\begin{aligned} \psi _{ \mathcal {K},{{\textbf {z}}}}(\epsilon ,t):=\max \left\{ \epsilon ,\epsilon /\Vert {{\textbf {z}}}\Vert \right\} + \kappa _{{{\textbf {z}}},t}(\epsilon +\max \left\{ \epsilon ,\epsilon /\Vert {{\textbf {z}}}\Vert \right\} ) \end{aligned}$$is a \(\mathbb {1}\)-FRF for \(K_{\exp }\) and \({{\textbf {z}}}\). In particular, there exist \(\kappa > 0\) and a nonnegative monotone nondecreasing function \(\rho :\mathbb {R}_+ \rightarrow \mathbb {R}_+\) such that the function \({\hat{\psi }}_{ \mathcal {K},{{\textbf {z}}}}\) given by \({\hat{\psi }}_{ \mathcal {K},{{\textbf {z}}}}(\epsilon ,t) {:}{=}\kappa \epsilon + \rho (t)\epsilon \) is a \(\mathbb {1}\)-FRF for \(K_{\exp }\) and \({{\textbf {z}}}\).
-
(ii)
In the case when \(z_y = 0\), let \(\kappa _{{{\textbf {z}}},t}\) be defined as in (3.16) with \(\mathfrak {g}= \mathfrak {g}_\infty \) in (4.39). Then the function \(\psi _{ \mathcal {K},{{\textbf {z}}}}:\mathbb {R}_+\times \mathbb {R}_+\rightarrow \mathbb {R}_+\) given by
$$\begin{aligned} \psi _{ \mathcal {K},{{\textbf {z}}}}(\epsilon ,t):=\max \left\{ \epsilon ,\epsilon /\Vert {{\textbf {z}}}\Vert \right\} + \kappa _{{{\textbf {z}}},t}\mathfrak {g}_\infty (\epsilon +\max \left\{ \epsilon ,\epsilon /\Vert {{\textbf {z}}}\Vert \right\} ) \end{aligned}$$is a \(\mathbb {1}\)-FRF for \(K_{\exp }\) and \({{\textbf {z}}}\). In particular, there exist \(\kappa > 0\) and a nonnegative monotone nondecreasing function \(\rho :\mathbb {R}_+ \rightarrow \mathbb {R}_+\) such that the function \({\hat{\psi }}_{ \mathcal {K},{{\textbf {z}}}}\) given by \({\hat{\psi }}_{ \mathcal {K},{{\textbf {z}}}}(\epsilon ,t) {:}{=}\kappa \epsilon + \rho (t)\mathfrak {g}_{\infty }(\epsilon )\) is a \(\mathbb {1}\)-FRF for \(K_{\exp }\) and \({{\textbf {z}}}\).
4.2.4 The non-exposed face \({{\mathcal {F}}}_{ne}\)
Recall the unique non-exposed face of \(K_{\exp }\):
In this subsection, we take a look at \({{\mathcal {F}}}_{ne}\). Note that \({{\mathcal {F}}}_{ne}\) is an exposed face of \({{\mathcal {F}}}_{-\infty }\), which is polyhedral. This observation leads immediately to the following corollary, which also follows from [34, Proposition 18] by letting \( \mathcal {F}{:}{=} \mathcal {K}{:}{=}{{\mathcal {F}}}_{-\infty }\) therein. We omit the proof for brevity.
Corollary 4.12
(\(\mathbb {1}\)-FRF for \({{\mathcal {F}}}_{ne}\)) Let \({{\textbf {z}}}\in \ {{\mathcal {F}}}_{-\infty }^*\) be such that \({{\mathcal {F}}}_{ne} = {{\mathcal {F}}}_{-\infty } \cap \{{{\textbf {z}}}\}^\perp \). Then there exists \(\kappa > 0\) such that
is a \(\mathbb {1}\)-FRF for \({{\mathcal {F}}}_{-\infty }\) and \({{\textbf {z}}}\).
4.3 Error bounds
In this subsection, we return to the feasibility problem (Feas) and consider the case where \( \mathcal {K}= K_{\exp }\). We now have all the tools for obtaining error bounds. Recalling Definition 2.1, we can state the following result.
Theorem 4.13
(Error bounds for (Feas) with \( \mathcal {K}= K_{\exp }\)) Let \(\mathcal {L}\subseteq \mathbb {R}^3\) be a subspace and \({{\textbf {a}}}\in \mathbb {R}^3\) such that \((\mathcal {L}+ {{\textbf {a}}}) \cap K_{\exp }\ne \emptyset \). Then the following items hold.
-
(i)
The distance to the PPS condition of \(\{K_{\exp }, \mathcal {L}+{{\textbf {a}}}\}\) satisfies \(d_{\text {PPS}}(K_{\exp },\mathcal {L}+{{\textbf {a}}}) \le 1\).
-
(ii)
If \(d_{\text {PPS}}(K_{\exp },\mathcal {L}+{{\textbf {a}}})=0 \), then \(K_{\exp }\) and \(\mathcal {L}+{{\textbf {a}}}\) satisfy a Lipschitzian error bound.
-
(iii)
Suppose \(d_{\text {PPS}}(K_{\exp },\mathcal {L}+{{\textbf {a}}})=1\) and let \( \mathcal {F}\subsetneq K_{\exp }\) be a chain of faces of length 2 satisfying items (ii) and (iii) of Proposition 3.2. We have the following possibilities.
-
(a)
If \( \mathcal {F}= {{\mathcal {F}}}_{-\infty }\) then \(K_{\exp }\) and \(\mathcal {L}+{{\textbf {a}}}\) satisfy an entropic error bound as in (4.44). In addition, for all \(\alpha \in (0,1)\), a uniform Hölderian error bound with exponent \(\alpha \) holds.
-
(b)
If \( \mathcal {F}= {{\mathcal {F}}}_{\beta }\), with \(\beta \in \mathbb {R}\), then \(K_{\exp }\) and \(\mathcal {L}+{{\textbf {a}}}\) satisfy a uniform Hölderian error bound with exponent 1/2.
-
(c)
Suppose that \( \mathcal {F}= {{\mathcal {F}}}_{\infty }\). If there exists \({{\textbf {z}}}\in K_{\exp }^* \cap \mathcal {L}^\perp \cap \{{{\textbf {a}}}\}^\perp \) with \(z_x=0\), \(z_y > 0\) and \(z_z>0\), then \(K_{\exp }\) and \(\mathcal {L}+{{\textbf {a}}}\) satisfy a Lipschitzian error bound. Otherwise, \(K_{\exp }\) and \(\mathcal {L}+{{\textbf {a}}}\) satisfy a log-type error bound as in (4.45).
-
(d)
If \( \mathcal {F}= \{{\textbf {0}} \}\), then \(K_{\exp }\) and \(\mathcal {L}+{{\textbf {a}}}\) satisfy a Lipschitzian error bound.
Proof
-
(i):
All proper faces of \(K_{\exp }\) are polyhedral, therefore \(\ell _{\text {poly}}(K_{\exp }) = 1\). By Proposition 3.2, there exists a chain of faces of length 2 satisfying items (ii) and (iii) of that proposition. Therefore, \(d_{\text {PPS}}(K_{\exp },\mathcal {L}+{{\textbf {a}}}) \le 1\).
-
(ii):
If \(d_{\text {PPS}}(K_{\exp },\mathcal {L}+{{\textbf {a}}}) = 0\), it is because \(\{K_{\exp }, \mathcal {L}+{{\textbf {a}}}\}\) satisfies the PPS condition, which implies a Lipschitzian error bound by Proposition 2.2.
-
(iii):
Next, suppose \(d_{\text {PPS}}(K_{\exp },\mathcal {L}+{{\textbf {a}}})=1\) and let \( \mathcal {F}\subsetneq K_{\exp }\) be a chain of faces of length 2 satisfying items (ii) and (iii) of Proposition 3.2, together with \({{\textbf {z}}}\in K_{\exp }^* \cap \mathcal {L}^\perp \cap \{{{\textbf {a}}}\}^\perp \) such that
$$\begin{aligned} \mathcal {F}= K_{\exp }\cap \{{{\textbf {z}}}\}^\perp . \end{aligned}$$
Since positively scaling \({{\textbf {z}}}\) does not affect the chain of faces, we may assume that \(\Vert {{\textbf {z}}}\Vert = 1\). Also, in what follows, for simplicity, we define
Then, we prove each item by applying Theorem 3.8 with the corresponding facial residual function.
-
(a)
If \( \mathcal {F}= {{\mathcal {F}}}_{-\infty }\), the one-step facial residual functions are given by Corollary 4.4. First we consider the case where \(\mathfrak {g}= \mathfrak {g}_{-\infty }\) and we have
$$\begin{aligned} \psi _{ \mathcal {K},{{\textbf {z}}}}(\epsilon ,t):= \epsilon + \kappa _{{{\textbf {z}}},t}\mathfrak {g}_{-\infty }(2\epsilon ), \end{aligned}$$where \(\mathfrak {g}_{-\infty }\) is as in (4.14). Then, if \(\psi \) is a positively rescaled shift of \(\psi _{ \mathcal {K},{{\textbf {z}}}}\), using the monotonicity of \(\mathfrak {g}_{-\infty }\) and of \(\kappa _{{{\textbf {z}}},t}\) as a function of t, we conclude that there exists \({\widehat{M}} > 0\) such that
$$\begin{aligned} \psi (\epsilon ,t) \le {\widehat{M}} \epsilon + {\widehat{M}} \kappa _{{{\textbf {z}}}, {\widehat{M}} t}\mathfrak {g}_{-\infty }({\widehat{M}}\epsilon ). \end{aligned}$$Invoking Theorem 3.8, using the monotonicity of all functions involved in the definition of \(\psi \) and recalling (4.15), we conclude that for every bounded set B, there exists \(\kappa _B > 0\) such that
$$\begin{aligned} \text {d}\left( {{\textbf {x}}}, (\mathcal {L}+ {{\textbf {a}}}) \cap K_{\exp }\right) \le \kappa _{B}\mathfrak {g}_{-\infty }({\widehat{\text {d}}}({{\textbf {x}}})), \qquad \forall {{\textbf {x}}}\in B, \end{aligned}$$(4.44)which shows that an entropic error bound holds. Next, we consider the case \(\mathfrak {g}= |\cdot |^{\alpha }\). Given \(\alpha \in (0,1)\), we have the following one-step facial residual function:
$$\begin{aligned} \psi _{ \mathcal {K},{{\textbf {z}}}}(\epsilon ,t):= \epsilon + \kappa _{{{\textbf {z}}},t}2^\alpha \epsilon ^\alpha , \end{aligned}$$where \(\kappa _{{{\textbf {z}}},t}\) is defined as in (3.16). Invoking Theorem 3.8, we conclude that for every bounded set B, there exists \(\kappa _B > 0\) such that
$$\begin{aligned} \text {d}\left( {{\textbf {x}}}, (\mathcal {L}+ {{\textbf {a}}}) \cap K_{\exp }\right) \le \kappa _B {\widehat{\text {d}}}({{\textbf {x}}}) + \kappa _{B} {\widehat{\text {d}}}({{\textbf {x}}})^\alpha , \qquad \forall {{\textbf {x}}}\in B, \end{aligned}$$In addition, for \({{\textbf {x}}}\in B\), we have \({\widehat{\text {d}}}({{\textbf {x}}}) \le {\widehat{\text {d}}}({{\textbf {x}}})^\alpha {M}\), where \(M = \sup _{{{\textbf {x}}}\in B} {\widehat{\text {d}}}({{\textbf {x}}})^{1-\alpha }\). In conclusion, for \(\kappa = 2\kappa _{B}\max \{M,1\}\), we have
$$\begin{aligned} \text {d}\left( {{\textbf {x}}}, (\mathcal {L}+ {{\textbf {a}}}) \cap K_{\exp }\right) \le \kappa {\widehat{\text {d}}}({{\textbf {x}}})^\alpha , \qquad \forall {{\textbf {x}}}\in B. \end{aligned}$$That is, a uniform Hölderian error bound holds with exponent \(\alpha \).
-
(b)
If \( \mathcal {F}= {{\mathcal {F}}}_{\beta }\), with \(\beta \in \mathbb {R}\), then the one-step facial residual function is given by Corollary 4.7, that is, we have
$$\begin{aligned} \psi _{ \mathcal {K},{{\textbf {z}}}}(\epsilon ,t) := \epsilon + \kappa _{{{\textbf {z}}},t}\sqrt{2} \epsilon ^{1/2}, \end{aligned}$$Then, following the same argument as in the second half of item (a), we conclude that a uniform Hölderian error bound holds with exponent 1/2.
-
(c)
If \( \mathcal {F}= {{\mathcal {F}}}_{\infty }\), the one-step facial residual functions are given by Corollary 4.11 and they depend on \({{\textbf {z}}}\). Since \( \mathcal {F}= {{\mathcal {F}}}_{\infty }\), we must have \(z_x = 0\) and \(z_z > 0\), see Sect. 4.1.1. The deciding factor is whether \(z_y\) is positive or zero. If \(z_y > 0\), then we have the following one-step facial residual function:
$$\begin{aligned} \psi _{ \mathcal {K},{{\textbf {z}}}}(\epsilon ,t):= (1+2 \kappa _{{{\textbf {z}}},t})\epsilon , \end{aligned}$$where \(\kappa _{{{\textbf {z}}},t}\) is defined as in (3.16). In this case, analogously to items (a) and (b) we have a Lipschitzian error bound. If \(z_y = 0\), we have
$$\begin{aligned} \psi _{ \mathcal {K},{{\textbf {z}}}}(\epsilon ,t):= \epsilon + \kappa _{{{\textbf {z}}},t}\mathfrak {g}_\infty (2\epsilon ), \end{aligned}$$where \(\mathfrak {g}_\infty \) is as in (4.39). Analogous to the proof of item (a) but making use of (4.40) in place of (4.15), we conclude that for every bounded set B, there exists \(\kappa _B > 0\) such that
$$\begin{aligned} \text {d}\left( {{\textbf {x}}}, (\mathcal {L}+ {{\textbf {a}}}) \cap K_{\exp }\right) \le \kappa _{B}\mathfrak {g}_\infty ({\widehat{\text {d}}}({{\textbf {x}}})), \qquad \forall {{\textbf {x}}}\in B. \end{aligned}$$(4.45)
-
(d)
See [34, Proposition 27].
\(\square \)
Remark 4.14
(Tightness of Theorem 4.13) We will argue that Theorem 4.13 is tight by showing that for every situation described in item (iii), there is a specific choice of \(\mathcal {L}\) and a sequence \(\{{{\textbf {w}}}^k\}\) in \(\mathcal {L}\backslash K_{\exp }\) with \(\text {d}({{\textbf {w}}}^k,K_{\exp }) \rightarrow 0\) along which the corresponding error bound for \(K_{\exp }\) and \(\mathcal {L}\) is off by at most a multiplicative constant.
-
(a)
Let \(\mathcal {L}= \text {span}\,{{\mathcal {F}}}_{-\infty } = \{(x,y,z) \mid y = 0 \}\) (see (4.5)) and consider the sequence \(\{{{\textbf {w}}}^k\}\) where \({{\textbf {w}}}^k = ((1/(k+1))\ln (k+1),0,1)\), for every \(k \in \mathbb {N}\). Then, \(\mathcal {L}\cap K_{\exp }= {{\mathcal {F}}}_{-\infty }\) and we are under the conditions of item (iii)(a) of Theorem 4.13. Since \(\{{{\textbf {w}}}^k\} =: B \subseteq \mathcal {L}\), there exists \(\kappa _B > 0\) such that
$$\begin{aligned} \text {d}\left( {{\textbf {w}}}^k, \mathcal {L}\cap K_{\exp }\right) \le \kappa _{B}\mathfrak {g}_{-\infty }(\text {d}({{\textbf {w}}}^k, K_{\exp })), \quad \forall k \in \mathbb {N}. \end{aligned}$$Then, the projection of \({{\textbf {w}}}^k\) onto \({{\mathcal {F}}}_{-\infty }\) is given by (0, 0, 1). Therefore,
$$\begin{aligned} \frac{\ln (k+1)}{k+1} = \text {d}({{\textbf {w}}}^k,\mathcal {L}\cap K_{\exp }) \le \kappa _{B}\mathfrak {g}_{-\infty }(\text {d}({{\textbf {w}}}^k, K_{\exp })). \end{aligned}$$Let \({{\textbf {v}}}^k = ((1/(k+1))\ln (k+1),1/(k+1),1)\) for every k. Then, we have \({{\textbf {v}}}^k \in K_{\exp }\). Therefore, \(\text {d}({{\textbf {w}}}^k, K_{\exp }) \le 1/(k+1)\). In view of the definition of \(\mathfrak {g}_{-\infty }\) (see (4.14)), we conclude that for large enough k we have
$$\begin{aligned} \frac{\ln (k+1)}{k+1} = \text {d}({{\textbf {w}}}^k,\mathcal {L}\cap K_{\exp }) \le \kappa _{B}\mathfrak {g}_{-\infty }(\text {d}({{\textbf {w}}}^k, K_{\exp })) \le \kappa _B\frac{\ln (k+1)}{k+1}. \end{aligned}$$Thus, it holds that for all sufficiently large k,
$$\begin{aligned} 1\le \frac{\text {d}({{\textbf {w}}}^k,\mathcal {L}\cap K_{\exp })}{\mathfrak {g}_{-\infty }(\text {d}({{\textbf {w}}}^k,K_{\exp }))} \le \kappa _B. \end{aligned}$$Consequently, for any given nonnegative function \(\mathfrak {g}:\mathbb {R}_+\rightarrow \mathbb {R}_+\) such that \(\lim _{t\downarrow 0}\frac{\mathfrak {g}(t)}{\mathfrak {g}_{-\infty }(t)}=0\), we have upon noting \(\text {d}({{\textbf {w}}}^k,K_{\exp })\rightarrow 0\) that
$$\begin{aligned} \frac{\text {d}({{\textbf {w}}}^k,\mathcal {L}\cap K_{\exp })}{\mathfrak {g}(\text {d}({{\textbf {w}}}^k,K_{\exp }))} = \frac{\text {d}({{\textbf {w}}}^k,\mathcal {L}\cap K_{\exp })}{\mathfrak {g}_{-\infty }(\text {d}({{\textbf {w}}}^k,K_{\exp }))}\frac{\mathfrak {g}_{-\infty }(\text {d}({{\textbf {w}}}^k,K_{\exp }))}{\mathfrak {g}(\text {d}({{\textbf {w}}}^k,K_{\exp }))} \rightarrow \infty , \end{aligned}$$which shows that the choice of \(\mathfrak {g}_{-\infty }\) in the error bound is tight.
-
(b)
Let \(\beta \in \mathbb {R}\) and let \(\widehat{{{\textbf {z}}}}\), \(\widehat{{{\textbf {p}}}}\) and \(\widehat{{{\textbf {f}}}}\) be as in (4.24). Let \(\mathcal {L}= \{{{\textbf {z}}}\}^\perp \) with \(z_x < 0\) such that \(K_{\exp }\cap \mathcal {L}= {{\mathcal {F}}}_{\beta }\). We are then under the conditions of item (iii)(b) of Theorem 4.13. We consider the following sequences
$$\begin{aligned} {{\textbf {v}}}^k = \begin{bmatrix} 1-\beta +1/k\\ 1\\ e^{1-\beta + 1/k} \end{bmatrix},\quad {{\textbf {w}}}^k = P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {v}}}^k,\quad {{\textbf {u}}}^k = P_{{{\mathcal {F}}}_\beta }{{\textbf {w}}}^k. \end{aligned}$$For every k we have \({{\textbf {v}}}^k \in \partial K_{\exp }\setminus {{\mathcal {F}}}_{\beta }\), and \({{\textbf {v}}}^k \ne {{\textbf {w}}}^k\) (because otherwise, we would have \({{\textbf {v}}}^k \in K_{\exp }\cap \{{{\textbf {z}}}\}^\perp = {{\mathcal {F}}}_{\beta }\)). In addition, we have \({{\textbf {v}}}^k \rightarrow \widehat{{{\textbf {f}}}}\) and, since \(\widehat{{{\textbf {f}}}}\in {{\mathcal {F}}}_{\beta }\), we have \({{\textbf {w}}}^k \rightarrow \widehat{{{\textbf {f}}}}\) as well. Next, notice that we have \(\langle \widehat{{{\textbf {f}}}}, {{\textbf {v}}}^k \rangle \ge 0\) for k sufficiently large and \(|v_x^k/v_y^k - (1-\beta )| \rightarrow 0\). Then, following the computations outlined in case (I) of the proof of Theorem 4.6 and letting \(\zeta _k{:}{=}v_x^k/v_y^k\), we have from (4.30) and (4.31) that \(h_2(\zeta _k)\ne 0\) for all large k (hence, \({{\textbf {w}}}^k\ne {{\textbf {u}}}^k\) for all large k), and that
$$\begin{aligned} L_{\beta } {:}{=}\lim _{k \rightarrow \infty }\frac{\Vert {{\textbf {w}}}^k-{{\textbf {v}}}^k\Vert ^{\frac{1}{2}}}{\Vert {{\textbf {w}}}^k-{{\textbf {u}}}^k\Vert }= & {} \lim _{k \rightarrow \infty }\frac{\Vert \widehat{{{\textbf {p}}}}\Vert }{\Vert \widehat{{{\textbf {z}}}}\Vert ^{\frac{1}{2}}}\frac{|h_1(\zeta _k)|^{\frac{1}{2}}}{|h_2(\zeta _k)|} \nonumber \\= & {} \frac{\Vert \widehat{{{\textbf {p}}}}\Vert }{\Vert \widehat{{{\textbf {z}}}}\Vert ^{\frac{1}{2}}}\frac{1}{\sqrt{2}(e^{\beta -1} + (\beta ^2+1)e^{1-\beta })} \in (0,\infty ),\nonumber \\ \end{aligned}$$(4.46)where the latter equality is from (4.32). On the other hand, from item (iii)(b) of Theorem 4.13, for \(B {:}{=}\{{{\textbf {w}}}^k\}\), there exists \(\kappa _B > 0\) such that for all \(k\in \mathbb {N}\),
$$\begin{aligned} \Vert {{\textbf {w}}}^k-{{\textbf {u}}}^k\Vert = \text {d}({{\textbf {w}}}^k, \mathcal {L}\cap K_{\exp }) \le \kappa _B \text {d}({{\textbf {w}}}^k,K_{\exp })^{\frac{1}{2}} \le \kappa _B\Vert {{\textbf {w}}}^k-{{\textbf {v}}}^k\Vert ^{\frac{1}{2}}. \end{aligned}$$However, from (4.46), for large enough k, we have \(\Vert {{\textbf {w}}}^k-{{\textbf {u}}}^k\Vert \ge 1/(2L_{\beta })\Vert {{\textbf {w}}}^k-{{\textbf {v}}}^k\Vert ^{\frac{1}{2}}\). Therefore, for large enough k we have
$$\begin{aligned} \frac{1}{2L_{\beta }}\Vert {{\textbf {w}}}^k-{{\textbf {v}}}^k\Vert ^{\frac{1}{2}} \le \text {d}({{\textbf {w}}}^k, \mathcal {L}\cap K_{\exp })\le \kappa _B \text {d}({{\textbf {w}}}^k,K_{\exp })^{\frac{1}{2}} \le \kappa _B\Vert {{\textbf {w}}}^k-{{\textbf {v}}}^k\Vert ^{\frac{1}{2}}. \end{aligned}$$Consequently, it holds that for all large enough k,
$$\begin{aligned} \frac{1}{2L_\beta }\le \frac{\text {d}({{\textbf {w}}}^k,\mathcal {L}\cap K_{\exp })}{\text {d}({{\textbf {w}}}^k,K_{\exp })^\frac{1}{2}} \le \kappa _B. \end{aligned}$$Arguing similarly as in case (a), we can also conclude that the choice of \(|\cdot |^\frac{1}{2}\) in the error bound is tight.
-
(c)
Let \({{\textbf {z}}}= (0,0,1)\) and \(\mathcal {L}= \{(x,y,0) \mid x,y \in \mathbb {R}\} = \{{{\textbf {z}}}\}^\perp \). Then, from (4.12), we have \(\mathcal {L}\cap K_{\exp }= {{\mathcal {F}}}_{\infty }\). We are then under case (iii)(c) of Theorem 4.13. Because there is no \({\hat{{{\textbf {z}}}}} \in \mathcal {L}^\perp \) with \({\hat{z}} _y > 0\), we have a log-type error bound as in (4.45). We proceed as in item (a) using sequences such that \({{\textbf {w}}}^k=(-1,1/k,0)\), \({{\textbf {v}}}^k=(-1,1/k,(1/k)e^{-k})\), \({{\textbf {u}}}^k=(-1,0,0)\), for every k. Note that \({{\textbf {w}}}^k \in \mathcal {L}, {{\textbf {v}}}^k \in K_{\exp }\) and \({\text {P} }_{\negthinspace \negthinspace {{\mathcal {F}}}_\infty }({{\textbf {w}}}^k) = {{\textbf {u}}}^k\), for every k. Therefore, there exists \(\kappa _B > 0\) such that
$$\begin{aligned} \frac{1}{k} = \text {d}({{\textbf {w}}}^k, \mathcal {L}\cap K_{\exp }) \le \kappa _B \mathfrak {g}_{\infty }(\text {d}({{\textbf {w}}}^k,K_{\exp }))\le \kappa _{B}\mathfrak {g}_{\infty }\left( \frac{1}{ke^k}\right) , \quad \forall k \in \mathbb {N}. \end{aligned}$$In view of the definition of \(\mathfrak {g}_{\infty }\) (see (4.39)), there exists \(L > 0\) such that for large enough k we have
$$\begin{aligned} \frac{1}{k} = \text {d}({{\textbf {w}}}^k, \mathcal {L}\cap K_{\exp }) \le \kappa _B \mathfrak {g}_{\infty }(\text {d}({{\textbf {w}}}^k,K_{\exp })) \le \frac{L}{k}. \end{aligned}$$Consequently, it holds that for all large enough k,
$$\begin{aligned} \frac{\kappa _B}{L}\le \frac{\text {d}({{\textbf {w}}}^k,\mathcal {L}\cap K_{\exp })}{\mathfrak {g}_{\infty }(\text {d}({{\textbf {w}}}^k,K_{\exp }))} \le \kappa _B. \end{aligned}$$Arguing similarly as in case (a), we conclude that the choice of \(\mathfrak {g}_{\infty }\) is tight.
Note that a Lipschitz error bound is always tight up to a constant, because \(\text {d}({{\textbf {x}}}, \mathcal {K}\cap (\mathcal {L}+{{\textbf {a}}})) \ge \max \{\text {d}({{\textbf {x}}}, \mathcal {K}),\text {d}({{\textbf {x}}},\mathcal {L}+{{\textbf {a}}})\}\). Therefore, the error bounds in items (ii), (iii)(d) and in the first half of (iii)(c) are tight.
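As a quick numerical sanity check of the non-Lipschitzian tightness claims in items (a) and (c), one can evaluate the sequences above. The functions \(\mathfrak {g}_{-\infty }\) and \(\mathfrak {g}_{\infty }\) from (4.14) and (4.39) are not reproduced here, so the sketch below uses illustrative surrogates with the expected asymptotic behavior near zero (an entropy-like \(-t\ln t\) and a log-type \(-1/\ln t\), respectively); these surrogate choices are assumptions made purely for illustration.

```python
import math

# Illustrative surrogates (assumptions, not the definitions (4.14)/(4.39)):
g_minus_inf = lambda t: -t * math.log(t)   # entropy-like behavior near 0
g_inf = lambda t: -1.0 / math.log(t)       # log-type behavior near 0

for k in (10, 100, 500):
    # Item (a): d(w^k, L ∩ K_exp) = ln(k+1)/(k+1) and d(w^k, K_exp) <= 1/(k+1).
    da_int, da = math.log(k + 1) / (k + 1), 1.0 / (k + 1)
    # Item (c): d(w^k, L ∩ K_exp) = 1/k and d(w^k, K_exp) <= exp(-k)/k.
    dc_int, dc = 1.0 / k, math.exp(-k) / k
    print(k, da_int / g_minus_inf(da), da_int / da, dc_int / g_inf(dc))
```

The first ratio is identically 1 and the last tends to 1 as k grows, while the middle (Lipschitzian) ratio equals \(\ln (k+1)\) and diverges, matching the discussion above.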
Sometimes we may need to consider direct products of multiple copies of \(K_{\exp }\) in order to model certain problems, i.e., our problem of interest could have the following shape:
$$\begin{aligned} \text {find} \quad {{\textbf {x}}}\in (\mathcal {L}+{{\textbf {a}}})\cap \mathcal {K}, \end{aligned}$$where \( \mathcal {K}= K_{\exp }\times \cdots \times K_{\exp }\) is a direct product of m exponential cones.
Fortunately, we already have all the tools required to extend Theorem 4.13 and compute error bounds for this case too. We recall that the faces of a direct product of cones are direct products of the faces of the individual cones (see footnote 11). Therefore, using Proposition 3.13, we are able to compute all the necessary one-step facial residual functions for \( \mathcal {K}\). Once they are obtained, we can invoke Theorem 3.8. Unfortunately, there are quite a number of different cases to consider, so we cannot give a concise statement of an all-encompassing tight error bound result.
We will, however, give an error bound result under the following simplifying assumption of non-exceptionality, or SANE.
Assumption 4.15
(SANE: simplifying assumption of non-exceptionality) Suppose (Feas) is feasible with \( \mathcal {K}= K_{\exp }\times \cdots \times K_{\exp }\) being a direct product of m exponential cones. We say that \( \mathcal {K}\) and \(\mathcal {L}+{{\textbf {a}}}\) satisfy the simplifying assumption of non-exceptionality (SANE) if there exists a chain of faces \( \mathcal {F}_{\ell } \subsetneq \cdots \subsetneq \mathcal {F}_1 = \mathcal {K}\) as in Proposition 3.2 with \(\ell - 1 = {d_{\text {PPS}}( \mathcal {K},\mathcal {L}+{{\textbf {a}}})}\) such that for all i, the exceptional face \({{\mathcal {F}}}_{\infty }\) of \(K_{\exp }\) never appears as one of the blocks of \( \mathcal {F}_{i}\).
Remark 4.16
(SANE is not unreasonable) In many modelling applications of the exponential cone presented in [37, Chapter 5], translating to our notation, the \({{\textbf {y}}}\) variable is fixed to be 1 in (4.1). For example, the hypograph of the logarithm function “\(x \le \ln (z)\)” can be represented as the constraint “\((x,y,z) \in K_{\exp }\cap (\mathcal {L}+{{\textbf {a}}})\)”, where \(\mathcal {L}+{{\textbf {a}}}= \{(x,y,z) \mid y = 1\}\). Because the y variable is fixed to be 1, the feasible region does not intersect the 2D face \({{\mathcal {F}}}_{-\infty }\) nor its subfaces \({{\mathcal {F}}}_{\infty }\) and \({{\mathcal {F}}}_{ne}\). In particular, SANE is satisfied. More generally, if \( \mathcal {K}\) is a direct product of exponential cones and the affine space \(\mathcal {L}+{{\textbf {a}}}\) is such that the \({{\textbf {y}}}\) components of each block are fixed positive constants, then \( \mathcal {K}\) and \(\mathcal {L}+{{\textbf {a}}}\) satisfy SANE.
On the other hand, problems involving the relative entropy \(D(x,y) {:}{=}x \ln (x/y)\) are often modelled as “minimize t” subject to “\((-t,x,y) \in K_{\exp }\)” and additional constraints. We could also have sums, so that the problem is of the form “minimize \(\sum t_i\)” subject to “\((-t_i,x_i,y_i) \in K_{\exp }\)” and additional constraints. In those cases, SANE may fail to be satisfied.
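To make the hypograph representation in Remark 4.16 concrete, here is a small membership sketch using the description \(K_{\exp }= \{(x,y,z) \mid y > 0,\ z \ge ye^{x/y}\} \cup \{(x,0,z) \mid x \le 0,\ z \ge 0\}\), consistent with (4.1) and the faces used above; the function name and tolerance are implementation choices.

```python
import math

def in_Kexp(x, y, z, tol=1e-12):
    """Membership test for the (closed) exponential cone."""
    if y > 0:
        return z >= y * math.exp(x / y) - tol
    # boundary part with y = 0: the 2D face F_{-inf} = {(x,0,z) : x <= 0, z >= 0}
    return y == 0 and x <= tol and z >= -tol

# "x <= ln(z)" modelled as (x, 1, z) in K_exp, i.e., the y variable is fixed
# to 1, so the feasible region avoids F_{-inf} and SANE is satisfied.
assert in_Kexp(math.log(5.0), 1.0, 5.0)   # boundary: x = ln(z)
assert in_Kexp(1.0, 1.0, 5.0)             # 1 <= ln(5)
assert not in_Kexp(2.0, 1.0, 5.0)         # 2 > ln(5)
```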
Under SANE, we can state the following result.
Theorem 4.17
(Error bounds for direct products of exponential cones) Suppose (Feas) is feasible with \( \mathcal {K}= K_{\exp }\times \cdots \times K_{\exp }\) being a direct product of m exponential cones. Then the following hold.
-
(i)
The distance to the PPS condition of \(\{ \mathcal {K}, \mathcal {L}+{{\textbf {a}}}\}\) satisfies \(d_{\text {PPS}}( \mathcal {K},\mathcal {L}+{{\textbf {a}}}) \le m\).
-
(ii)
If SANE is satisfied, then \( \mathcal {K}\) and \(\mathcal {L}+{{\textbf {a}}}\) satisfy a uniform Hölderian error bound with exponent \(2^{-d_{\text {PPS}}(K_{\exp },\mathcal {L}+{{\textbf {a}}})}\).
Proof
-
(i):
All proper faces of \(K_{\exp }\) are polyhedral, therefore \(\ell _{\text {poly}}(K_{\exp }) = 1\). By Proposition 3.2, there exists a chain of length \(\ell \) as in that proposition such that \(\ell -1 \le m\). Therefore, \(d_{\text {PPS}}( \mathcal {K},\mathcal {L}+{{\textbf {a}}})\le \ell -1 \le m\).
-
(ii):
If SANE is satisfied, then there exists a chain \( \mathcal {F}_{\ell } \subsetneq \cdots \subsetneq \mathcal {F}_1 = \mathcal {K}\) of length \(\ell \le m +1\) as in Proposition 3.2, together with the corresponding \({{\textbf {z}}}_{1},\ldots ,{{\textbf {z}}}_{\ell -1}\). Also, the exceptional face \({{\mathcal {F}}}_{\infty }\) never appears as one of the blocks of the \( \mathcal {F}_i\).
In what follows, for simplicity, we define
Then, we invoke Theorem 3.8, which implies that given a bounded set B, there exists a constant \(\kappa _B > 0\) such that
where \(M = \sup _{{{\textbf {x}}}\in B} \Vert {{\textbf {x}}}\Vert \) and there are two cases for \(\varphi \). If \(\ell = 1\), \(\varphi \) is the function such that \(\varphi (\epsilon ,M) = \epsilon \). If \(\ell \ge 2\), we have \(\varphi = \psi _{{\ell -1}}\diamondsuit \cdots \diamondsuit \psi _{{1}}\), where \(\psi _{i}\) is a (suitable positively rescaled shift of a) one-step facial residual function for \( \mathcal {F}_{i}\) and \({{\textbf {z}}}_i\). In the former case, the PPS condition is satisfied, so we have a Lipschitzian error bound and we are done. We therefore assume that the latter case occurs with \(\ell - 1 = {d_{\text {PPS}}( \mathcal {K},\mathcal {L}+{{\textbf {a}}})}\).
First, we compute the one-step facial residual functions for each \( \mathcal {F}_i\). In order to do that, we recall that each \( \mathcal {F}_{i}\) is a direct product \( \mathcal {F}_{i}^1\times \cdots \times \mathcal {F}_{i}^m\) where each \( \mathcal {F}_{i}^j\) is a face of \(K_{\exp }\), excluding \({{\mathcal {F}}}_{\infty }\) by SANE. Therefore, a one-step facial residual function for \( \mathcal {F}_{i}^j\) can be obtained from Corollary 4.4, 4.7 or 4.12. In particular, taking the worst case (see footnote 12) into consideration, and taking the maximum of the facial residual functions, there exists a nonnegative monotone nondecreasing function \(\rho :\mathbb {R}_+ \rightarrow \mathbb {R}_+\) such that the function \(\psi \) given by
is a one-step facial residual function for each \( \mathcal {F}_{i}^j\). In what follows, in order to simplify the notation, we define \({\hat{\mathfrak {g}}}(t) {:}{=}\sqrt{t}\). Also, for every j, we use \({\hat{\mathfrak {g}}}_j\) to denote the composition of \({\hat{\mathfrak {g}}}\) with itself j times, i.e.,
and we set \({\hat{\mathfrak {g}}}_0\) to be the identity map.
Using the above notation and Proposition 3.13, we conclude the existence of a nonnegative monotone nondecreasing function \(\sigma : \mathbb {R}_+ \rightarrow \mathbb {R}_+\) such that the function \(\psi _{i}\) given by
is a one-step facial residual function for \( \mathcal {F}_i\) and \({{\textbf {z}}}_i\). Therefore, for \({{\textbf {x}}}\in B\), we have
where \(M = \sup _{{{\textbf {x}}}\in B} \Vert {{\textbf {x}}}\Vert \).
Next we are going to make a series of arguments related to the following informal principle: over a bounded set only the terms \({\hat{\mathfrak {g}}}_j\) with largest j matter. We start by noting that for any \({{\textbf {x}}}\in B\) and any \(0\le k\le j\le \ell \),
where \({\hat{\kappa }}_{j,k}:= \sup _{x\in B}{\widehat{\text {d}}}({{\textbf {x}}})^{(2^{-k} - 2^{-j})} < \infty \) because \({{\textbf {x}}}\mapsto {\widehat{\text {d}}}({{\textbf {x}}})^{(2^{-k} - 2^{-j})}\) is continuous, and \({\hat{\kappa }} := \max _{0\le k\le j\le \ell }{\hat{\kappa }}_{j,k}\).
Now, let \(\varphi _j {:}{=}\psi _{{j}}\diamondsuit \cdots \diamondsuit \psi _{{1}}\), where \(\diamondsuit \) is the diamond composition defined in (3.3). We will show by induction that for every \(j \le \ell -1\) there exists \(\kappa _j\) such that
For \(j = 1\), it follows directly from (4.49) and (4.50). Now, suppose that the claim is valid for some j such that \(j+1 \le \ell -1\). By the inductive hypothesis, we have
where \(\tilde{\kappa }_j {:}{=}2\max \{{\hat{\kappa }},\kappa _j\}\) and the last inequality follows from (4.50). Then, we plug \(\epsilon = \tilde{\kappa }_j{\hat{\mathfrak {g}}}_{j}({\widehat{\text {d}}}({{\textbf {x}}}))\) into (4.49) to obtain
where the last inequality follows from (4.50). Combining (4.52) and (4.53) concludes the induction proof. In particular, (4.51) is valid for \(j = \ell -1\). Then, taking into account some positive rescaling and shifting (see (3.2)) and adjusting constants, from (4.47), (4.51) and (4.50) we deduce that there exists \(\kappa > 0\) such that
with \({\hat{\mathfrak {g}}}_{\ell -1}\) as in (4.48). To complete the proof, we recall that \({d_{\text {PPS}}( \mathcal {K},\mathcal {L}+{{\textbf {a}}})} = \ell -1\). \(\square \)
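For intuition on the exponent in item (ii): the iterated square root \({\hat{\mathfrak {g}}}_j\) from (4.48) has the closed form \({\hat{\mathfrak {g}}}_j(t) = t^{2^{-j}}\), which is exactly where the Hölder exponent \(2^{-d_{\text {PPS}}( \mathcal {K},\mathcal {L}+{{\textbf {a}}})}\) comes from. A minimal sketch of this identity:

```python
import math

def g_hat_j(t, j):
    """Compose g_hat(t) = sqrt(t) with itself j times; j = 0 is the identity."""
    for _ in range(j):
        t = math.sqrt(t)
    return t

# Iterating the square root j times yields t -> t**(2**-j).
for j in range(5):
    assert abs(g_hat_j(0.3, j) - 0.3 ** (2.0 ** -j)) < 1e-12
```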
Remark 4.18
(Variants of Theorem 4.17) Theorem 4.17 is not tight and admits variants that are somewhat cumbersome to describe precisely. For example, the \(\mathfrak {g}_{-\infty }\) function was not taken into account explicitly but simply “relaxed” to \(t\mapsto \sqrt{t}\).
Going for greater generality, we can also drop the SANE assumption altogether and try to be as tight as our analysis permits when dealing with possibly inSANE instances. Although there are several possibilities one must consider, the overall strategy is the same as outlined in the proof of Theorem 4.17: invoke Theorem 3.8, fix a bounded set B, pick a chain of faces as in Proposition 3.2 and upper bound the diamond composition of facial residual functions as in (4.51). Intuitively, whenever sums of function compositions appear, only the “higher” compositions matter. However, the analysis must consider the possibility of \(\mathfrak {g}_{-\infty }\) or \(\mathfrak {g}_{\infty }\) appearing. After this is done, it is just a matter of plugging this upper bound into (4.47).
We conclude this subsection with an application. In [11], among other results, the authors showed that when a Hölderian error bound holds, it is possible to derive the convergence rate of several algorithms from the exponent of the error bound. As a consequence, Theorems 4.13 and 4.17 allow us to apply some of their results (e.g., [11, Corollary 3.8]) to the conic feasibility problem with exponential cones, whenever a Hölderian error bound holds. For the non-Hölderian error bounds appearing in Theorem 4.13, different techniques are necessary, such as the ones discussed in [33] for deriving convergence rates under more general error bounds.
4.4 Miscellaneous odd behavior and connections to other notions
In this final subsection, we collect several instances of pathological behaviour that can be found inside the facial structure of the exponential cone.
Example 4.19
(Hölderian bounds and the non-attainment of admissible exponents) We recall Definition 2.1 and we consider the special case of two closed convex sets \(C_1,C_2\) with non-empty intersection. We say that \(\gamma \in (0,1]\) is an admissible exponent for \(C_1, C_2\) if \(C_1\) and \(C_2\) satisfy a uniform Hölderian error bound with exponent \(\gamma \). It turns out that the supremum of the set of admissible exponents need not itself be admissible. In particular, if \(C_1 = K_{\exp }\) and \(C_2 = \text {span}\,{{\mathcal {F}}}_{-\infty }\), then we see from Corollary 4.3 that \(C_1 \cap C_2 = {{\mathcal {F}}}_{-\infty }\) and that \(C_1\) and \(C_2\) satisfy a uniform Hölderian error bound for all \(\gamma \in (0,1)\); however, in view of the sequence constructed in Remark 4.14(a), the exponent cannot be chosen to be \(\gamma = 1\).
In fact, from Theorem 4.13 and Remark 4.14(a), \(C_1\) and \(C_2\) satisfy an entropic error bound which is tight and is, in a sense, better than any Hölderian error bound with \(\gamma \in (0,1)\) but worse than a Lipschitzian error bound.
Example 4.20
(Non-Hölderian error bound) The facial structure of \(K_{\exp }\) can be used to derive an example of two sets that provably do not have a Hölderian error bound. Let \(C_1 = K_{\exp }\) and \(C_2 = \{{{\textbf {z}}}\}^\perp \), where \(z_x=z_y = 0\) and \(z_z=1\) so that \(C_1\cap C_2={{\mathcal {F}}}_\infty \). Then, for every \(\eta > 0\) and every \(\alpha \in (0,1]\), there is no constant \(\kappa > 0\) such that
This is because if there were such a positive \(\kappa \), the infimum in Lemma 4.9 would be positive, which it is not. This shows that \(C_1\) and \(C_2\) do not have a Hölderian error bound. However, as seen in Theorem 4.10, \(C_1\) and \(C_2\) have a log-type error bound. In particular if \({{\textbf {q}}}\in B(\eta )\), using (2.1), (2.2) and Theorem 4.10, we have
where \({\widehat{\text {d}}}({{\textbf {q}}}) {:}{=}\max \{\text {d}({{\textbf {q}}},K_{\exp }),\text {d}({{\textbf {q}}},\{{{\textbf {z}}}\}^\perp ) \}\) and in the last inequality we used the monotonicity of \(\mathfrak {g}_\infty \).
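The failure of every Hölderian exponent can also be glimpsed numerically via the sequence \({{\textbf {w}}}^k = (-1,1/k,0)\) of Remark 4.14(c): these points lie in \(C_2\), \(\text {d}({{\textbf {w}}}^k, C_1\cap C_2) = 1/k\), and \(\text {d}({{\textbf {w}}}^k,C_1) \le e^{-k}/k\). A small sketch (the sampled values of k and \(\alpha \) are arbitrary):

```python
import math

def holder_ratio(k, alpha):
    """d(w^k, C1 ∩ C2) divided by the alpha-th power of an upper bound
    on max{d(w^k, C1), d(w^k, C2)}, for w^k = (-1, 1/k, 0)."""
    d_int = 1.0 / k            # distance to the intersection F_inf
    d_hat = math.exp(-k) / k   # upper bound on the max of the two distances
    return d_int / d_hat ** alpha

for alpha in (1.0, 0.5, 0.1):
    print(alpha, [holder_ratio(k, alpha) for k in (10, 50, 100)])
# Every row increases without bound along k, so no kappa > 0 can work
# for any exponent alpha in (0, 1].
```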
Let \(C_1, \ldots , C_m\) be closed convex sets having nonempty intersection and let \(C {:}{=}\cap _{i=1}^m C_i\). Following [33], we say that \(\varphi : \mathbb {R}_+\times \mathbb {R}_+ \rightarrow \mathbb {R}_+ \) is a consistent error bound function (CEBF) for \(C_1, \ldots , C_m\) if the following inequality holds
and the following technical conditions are satisfied for every \(a,b\in \mathbb {R}_+\): \(\varphi (\cdot ,b)\) is monotone nondecreasing, right-continuous at 0 and \(\varphi (0,b) = 0\); \(\varphi (a,\cdot )\) is monotone nondecreasing. CEBFs are a framework for expressing error bounds and can be used in the convergence analysis of algorithms for convex feasibility problems, see [33, Sects. 3 and 4]. For example, \(C_1,\ldots , C_m\) satisfy a Hölderian error bound (Definition 2.1) if and only if these sets admit a CEBF of the form \(\varphi (a,b) {:}{=}\rho (b)\max \{a,a^{\gamma (b)}\}\), where \(\rho :\mathbb {R}_+ \rightarrow \mathbb {R}_+\) and \(\gamma :\mathbb {R}_+ \rightarrow (0,1]\) are monotone nondecreasing functions [33, Theorem 3.4].
We remark that in Example 4.20, although the sets \(C_1, C_2\) do not satisfy a Hölderian error bound, the log-type error bound displayed therein is covered under the framework of consistent error bound functions. This is because \(\mathfrak {g}_\infty \) is a continuous monotone nondecreasing function and \(\gamma _{{{\textbf {z}}},\eta }^{-1}\) is monotone nondecreasing as a function of \(\eta \) (Remark 3.11). Therefore, in view of (4.54), the function given by \(\varphi (a,b) {:}{=}a + \max \{2,2\gamma _{{{\textbf {z}}},b}^{-1}\}\mathfrak {g}_\infty (2a)\) is a CEBF for \(C_1\) and \(C_2\).
Incidentally, it seems conceivable that many of our results in Sect. 3.1 can be adapted to derive CEBFs for arbitrary convex sets. Specifically, Lemma 3.9, Theorem 3.10, and Lemma 3.12 only rely on convexity rather than on the more specific structure of cones.
Next, we will see that we can also adapt Examples 4.19 and 4.20 to find instances of odd behavior of the so-called Kurdyka-Łojasiewicz (KL) property [1, 2, 8,9,10, 30]. First, we recall some notation and definitions. Let \(f: \mathbb {R}^n\rightarrow \mathbb {R}\cup \{+\infty \}\) be a proper closed convex extended-real-valued function. We denote by \(\text {dom}\partial f\) the set of points for which the subdifferential \(\partial f({{\textbf {x}}})\) is non-empty and by \([a< f < b]\) the set of \({{\textbf {x}}}\) such that \(a< f({{\textbf {x}}}) < b\). As in [10, Sect. 2.3], we define for \(r_0\in (0,\infty )\) the set
Let \(B({{\textbf {x}}},\epsilon )\) denote the closed ball of radius \(\epsilon > 0\) centered at \({{\textbf {x}}}\). With that, we say that f satisfies the KL property at \({{\textbf {x}}}\in \text {dom}\partial f\) if there exist \(r_0 \in (0,\infty )\), \(\epsilon > 0\) and \(\phi \in \mathcal {K}(0,r_0)\) such that for all \({{\textbf {y}}}\in B({{\textbf {x}}},\epsilon ) \cap [f({{\textbf {x}}})< f < f({{\textbf {x}}}) + r_0 ]\) we have
In particular, as in [30], we say that f satisfies the KL property with exponent \(\alpha \in [0,1)\) at \({{\textbf {x}}}\in \text {dom}\partial f\), if \(\phi \) can be taken to be \(\phi (t) = ct^{1-\alpha }\) for some positive constant c. Next, we need a result which is a corollary of [10, Theorem 5].
Proposition 4.21
Let \(C_1, C_2 \subseteq \mathbb {R}^n\) be closed convex sets with \(C_1 \cap C_2 \ne \emptyset \). Define \(f: \mathbb {R}^n \rightarrow \mathbb {R}\) as
Let \({{\textbf {x}}}\in C_1\cap C_2\), \(\gamma \in (0,1]\). Then, there exist \(\kappa > 0\) and \(\epsilon > 0 \) such that
if and only if f satisfies the KL property with exponent \(1-\gamma /2\) at \({{\textbf {x}}}\).
Proof
Note that \(\inf f = 0\) and \(\text {argmin}f = C_1\cap C_2\). Furthermore, (4.55) is equivalent to the existence of \(\kappa ' > 0\) and \(\epsilon > 0\) such that
where \(\varphi \) is the function given by \(\varphi (r) = \kappa ' r^{\gamma /2}\). With that, the result follows from [10, Theorem 5]. \(\square \)
Example 4.22
(Examples in the KL world) In Example 4.19, we have two sets \(C_1, C_2\) satisfying a uniform Hölderian error bound for \(\gamma \in (0,1)\) but not for \(\gamma = 1\). Because \(C_1\) and \(C_2\) are cones and the corresponding distance functions are positively homogeneous, this implies that for \({\textbf {0}} \in C_1 \cap C_2\), a Lipschitzian error bound never holds in any neighbourhood of \({\textbf {0}}\). That is, given \(\eta > 0\), there is no \(\kappa > 0\) such that
holds. Consequently, the function f in Proposition 4.21 satisfies the KL property with exponent \(\alpha \) for any \(\alpha \in (1/2,1)\) at the origin, but not for \(\alpha = 1/2\). To the best of our knowledge, this is the first explicitly constructed function in the literature such that the infimum of KL exponents at a point is not itself a KL exponent.
Similarly, from Example 4.20 we obtain \(C_1,C_2\) for which (4.55) does not hold for \({\textbf {0}} \in C_1\cap C_2\) with any chosen \(\kappa ,\varepsilon >0,\;\gamma \in \left( 0,1 \right] \). Thus from Proposition 4.21 we obtain a function f that does not satisfy the KL property with exponent \(\beta \in [1/2,1)\) at the origin. Since a function satisfying the KL property with exponent \(\alpha \in [0,1)\) at an \({{\textbf {x}}}\in \text {dom}\partial f\) necessarily satisfies it with exponent \(\beta \) for any \(\beta \in [\alpha ,1)\) at \({{\textbf {x}}}\), we see that this f does not satisfy the KL property with any exponent at the origin. In passing, we would like to point out that there are functions known in the literature that fail to satisfy the KL property; e.g., [9, Example 1].
5 Concluding remarks
In this work, we presented an extension of the results of [34] and showed how to obtain error bounds for conic linear systems using one-step facial residual functions and facial reduction (Theorem 3.8) even when the underlying cone is not amenable. Related to facial residual functions, we also developed techniques that aid in their computation; see Sect. 3.1. Finally, all techniques and results developed in Sect. 3 were used in some shape or form in order to obtain error bounds for the exponential cone in Sect. 4. Our new framework unlocks analysis for cones not reachable with the techniques developed in [34]; these include cones that are not facially exposed, as well as cones for which the projection operator has no simple closed form or is only implicitly specified. These were, until now, significant barriers against error bound analysis for many cones of interest.
As future work, we are planning to use the techniques developed in this paper to analyze and obtain error bounds for some of these other cones that have been previously unapproachable. Potential examples include the cone of \(n\times n\) completely positive matrices and its dual, the cone of \(n\times n\) copositive matrices. The former is not facially exposed when \(n\ge 5\) (see [53]) and the latter is not facially exposed when \(n \ge 2\). It would be interesting to clarify how far error bound problems for these cones can be tackled by our framework. Or, more ambitiously, we could try to obtain some of the facial residual functions and some error bound results. Of course, a significant challenge is that their facial structure is not completely understood, but we believe that even partial results for general n or complete results for specific values of n would be relevant and, possibly, quite non-trivial. Finally, as suggested by one of the reviewers, our framework may be enriched by investigating further geometric interpretations of the key quantity \(\gamma _{{{\textbf {z}}},\eta }\) in (3.15), beyond Fig. 2. For instance, it would be interesting to see whether the positivity of \(\gamma _{{{\textbf {z}}},\eta }\) is related to some generalization of the angle condition in [38], which was originally proposed for the study of Lipschitz error bounds.
Notes
To be fair, the exponential function is the classical example of a non-semialgebraic analytic function. Given that semialgebraicity is connected to the KL property (which is related to error bounds), one may argue that it is not that surprising that the exponential cone has its share of quirks. Nevertheless, given how natural the exponential cone is, the amount of quirks is still somewhat surprising.
As a reminder, the PPS condition is, by convention, a shorthand for three closely related conditions, see remarks after (2.3).
In particular, in view of (3.20), we see that this case only happens when \(\gamma _{{{\textbf {z}}},\eta } < \infty \).
This is a general fact. A proper face of a closed convex cone is contained in a proper exposed face, e.g., [13, Proposition 3.6].
Though one could use any suitable computer algebra package.
The third relation in (4.15) is derived from the second relation and the monotonicity of \(\mathfrak {g}_{-\infty }\) as follows: \(\mathfrak {g}_{-\infty }(Mt) = \mathfrak {g}_{-\infty }(2^{\log _2 M}t )\le \mathfrak {g}_{-\infty }(2^{\lceil |\log _2 M|\rceil }t )\le L^{\lceil |\log _2 M|\rceil }\mathfrak {g}_{-\infty }(t)\le L^{1 +|\log _2 M|}\mathfrak {g}_{-\infty }(t)\).
Notice that this function is well defined because \(h_1\) is zero only at \(1-\beta \) and thus we will not end up with \(\frac{0}{0}\).
Here is a sketch of the proof. If \( \mathcal {F}^1 \mathrel {\unlhd } \mathcal {K}^1, \mathcal {F}^{2} \mathrel {\unlhd } \mathcal {K}^2\), then the definition of face implies that \( \mathcal {F}^1 \times \mathcal {F}^{2} \mathrel {\unlhd } \mathcal {K}^1 \times \mathcal {K}^2\). For the converse, let \( \mathcal {F}\mathrel {\unlhd } \mathcal {K}^1 \times \mathcal {K}^2\) and let \( \mathcal {F}^1, \mathcal {F}^2\) be the projection of \( \mathcal {F}\) onto the first and second variables, respectively. Suppose that \({{\textbf {x}}},{{\textbf {y}}}\in \mathcal {K}^1\) are such that \({{\textbf {x}}}+{{\textbf {y}}}\in \mathcal {F}^1\). Then, \(({{\textbf {x}}}+{{\textbf {y}}},{{\textbf {z}}}) \in \mathcal {F}\) for some \({{\textbf {z}}}\in \mathcal {K}^2\). Since \(({{\textbf {x}}}+{{\textbf {y}}},{{\textbf {z}}}) = ({{\textbf {x}}},{{\textbf {z}}}/2) + ({{\textbf {y}}},{{\textbf {z}}}/2)\) and \( \mathcal {F}\) is a face, we conclude that \(({{\textbf {x}}},{{\textbf {z}}}/2), ({{\textbf {y}}},{{\textbf {z}}}/2) \in \mathcal {F}\) and \({{\textbf {x}}}, {{\textbf {y}}}\in \mathcal {F}^1\). Therefore \( \mathcal {F}^1 \mathrel {\unlhd } \mathcal {K}^1\) and, similarly, \( \mathcal {F}^2 \mathrel {\unlhd } \mathcal {K}^2\). Then, the equality \( \mathcal {F}= \mathcal {F}^1 \times \mathcal {F}^2\) is proven using the definition of face and the fact that \(({{\textbf {x}}},{{\textbf {z}}}) = ({{\textbf {x}}},0) + (0,{{\textbf {z}}})\).
\(\sqrt{\cdot }\) is “worse” than \(\mathfrak {g}_{-\infty }\) in that, near zero, \(\sqrt{t} \ge \mathfrak {g}_{-\infty }(t)\). The function \(\mathfrak {g}_{\infty }\) need not be considered because, by SANE, \( \mathcal {F}_{\infty }\) never appears.
References
Attouch, H., Bolte, J., Redont, P., Soubeyran, A.: Proximal alternating minimization and projection methods for nonconvex problems: an approach based on the Kurdyka-Łojasiewicz inequality. Math. Oper. Res. 35, 438–457 (2010)
Attouch, H., Bolte, J., Svaiter, B.F.: Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward-backward splitting, and regularized Gauss–Seidel methods. Math. Program. 137, 92–129 (2013)
Barker, G.P.: The lattice of faces of a finite dimensional cone. Linear Algebra Appl. 7(1), 71–82 (1973)
Barker, G.P.: Theory of cones. Linear Algebra Appl. 39, 263–291 (1981)
Bauschke, H.H., Borwein, J.M.: On projection algorithms for solving convex feasibility problems. SIAM Rev. 38(3), 367–426 (1996)
Bauschke, H.H., Borwein, J.M., Li, W.: Strong conical hull intersection property, bounded linear regularity, Jameson’s property (G), and error bounds in convex optimization. Math. Program. 86(1), 135–160 (1999)
Bauschke, H.H., Lindstrom, S.B.: Proximal averages for minimization of entropy functionals. arXiv preprint arXiv:1807.08878 (2020)
Bolte, J., Daniilidis, A., Lewis, A.: The Łojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems. SIAM J. Optim. 17, 1205–1223 (2007)
Bolte, J., Daniilidis, A., Lewis, A., Shiota, M.: Clarke subgradients of stratifiable functions. SIAM J. Optim. 18, 556–572 (2007)
Bolte, J., Nguyen, T.P., Peypouquet, J., Suter, B.W.: From error bounds to the complexity of first-order descent methods for convex functions. Math. Program. 165(2), 471–507 (2017)
Borwein, J.M., Li, G., Tam, M.K.: Convergence rate analysis for averaged fixed point iterations in common fixed point problems. SIAM J. Optim. 27(1), 1–33 (2017)
Borwein, J.M., Lindstrom, S.B.: Meetings with Lambert W and other special functions in optimization and analysis. Pure and App. Func. Anal. 1(3), 361–396 (2016)
Borwein, J.M., Wolkowicz, H.: Regularizing the abstract convex program. J. Math. Anal. Appl. 83(2), 495–530 (1981)
Burachik, R.S., Dao, M.N., Lindstrom, S.B.: The generalized Bregman distance. arXiv preprint arXiv:1909.08206 (2020)
Chandrasekaran, V., Shah, P.: Relative entropy optimization and its applications. Math. Program. 161(1), 1–32 (2017)
Coey, C., Kapelevich, L., Vielma, J.P.: Solving natural conic formulations with Hypatia.jl. arXiv e-prints (2021)
Dahl, J., Andersen, E.D.: A primal-dual interior-point algorithm for nonsymmetric exponential-cone optimization. Math. Program. (2021)
Faraut, J., Korányi, A.: Analysis on Symmetric Cones. Oxford Mathematical Monographs. Clarendon Press, Oxford (1994)
Faybusovich, L.: Several Jordan-algebraic aspects of optimization. Optimization 57(3), 379–393 (2008)
Friberg, H.A.: Projection onto the exponential cone: a univariate root-finding problem. Optimization Online (2021)
Gouveia, J., Parrilo, P.A., Thomas, R.R.: Lifts of convex sets and cone factorizations. Math. Oper. Res. 38(2), 248–264 (2013)
Henrion, D., Malick, J.: Projection methods for conic feasibility problems: applications to polynomial sum-of-squares decompositions. Optim. Methods Softw. 26(1), 23–46 (2011)
Hoffman, A.J.: On approximate solutions of systems of linear inequalities. J. Res. Natl. Bur. Stand. 49(4), 263–265 (1952)
Ioffe, A.D.: Variational Analysis of Regular Mappings: Theory and Applications. Springer Monographs in Mathematics. Springer, Berlin (2017)
Karimi, M., Tunçel, L.: Domain-Driven Solver (DDS) Version 2.0: a MATLAB-based software package for convex optimization problems in domain-driven form. arXiv e-prints (2019)
Lewis, A.S., Pang, J.S.: Error bounds for convex inequality systems. In: Crouzeix, J.P., Martínez-Legaz, J.E., Volle, M. (eds.) Generalized Convexity, Generalized Monotonicity: Recent Results, pp. 75–110. Springer, US (1998)
Li, G.: On the asymptotically well behaved functions and global error bound for convex polynomials. SIAM J. Optim. 20(4), 1923–1943 (2010)
Li, G.: Global error bounds for piecewise convex polynomials. Math. Program. 137(1), 37–64 (2013)
Li, G., Mordukhovich, B.S., Phạm, T.S.: New fractional error bounds for polynomial systems with applications to Hölderian stability in optimization and spectral theory of tensors. Math. Program. 153(2), 333–362 (2015)
Li, G., Pong, T.K.: Calculus of the exponent of Kurdyka–Łojasiewicz inequality and its applications to linear convergence of first-order methods. Found. Comput. Math. 18, 1199–1232 (2018)
Lindstrom, S.B.: The art of modern homo habilis mathematicus, or: What would Jon Borwein do? In: B. Sriraman (ed.) Handbook of the Mathematics of the Arts and Sciences. Springer (2020)
Liu, M., Pataki, G.: Exact duals and short certificates of infeasibility and weak infeasibility in conic linear programming. Math. Program. 167(2), 435–480 (2018)
Liu, T., Lourenço, B.F.: Convergence analysis under consistent error bounds. Found. Comput. Math. (2020)
Lourenço, B.F.: Amenable cones: error bounds without constraint qualifications. Math. Program. 186, 1–48 (2021)
Lourenço, B.F., Muramatsu, M., Tsuchiya, T.: Facial reduction and partial polyhedrality. SIAM J. Optim. 28(3), 2304–2326 (2018)
Lourenço, B.F., Roshchina, V., Saunderson, J.: Amenable cones are particularly nice. SIAM J. Optim. 32(3) (2022). https://doi.org/10.1137/20M138466X
MOSEK ApS: MOSEK Modeling Cookbook Release 3.2.2 (2020). https://docs.mosek.com/modeling-cookbook/index.html
Ng, K.F., Yang, W.H.: Error bounds for abstract linear inequality systems. SIAM J. Optim. 13(1), 24–43 (2002)
O’Donoghue, B., Chu, E., Parikh, N., Boyd, S.: Conic optimization via operator splitting and homogeneous self-dual embedding. J. Optim. Theory Appl. 169(3), 1042–1068 (2016)
Pang, J.S.: Error bounds in mathematical programming. Math. Program. 79(1), 299–332 (1997)
Papp, D., Yıldız, S.: Alfonso: Matlab package for nonsymmetric conic optimization. To appear in INFORMS J. Comput.
Pataki, G.: The geometry of semidefinite programming. In: H. Wolkowicz, R. Saigal, L. Vandenberghe (eds.) Handbook of semidefinite programming: theory, algorithms, and applications. Kluwer, online version at http://www.unc.edu/~pataki/papers/chapter.pdf (2000)
Pataki, G.: On the closedness of the linear image of a closed convex cone. Math. Oper. Res. 32(2), 395–412 (2007)
Pataki, G.: On the connection of facially exposed and nice cones. J. Math. Anal. Appl. 400(1), 211–221 (2013)
Pataki, G.: Strong duality in conic linear programming: facial reduction and extended duals. In: Computational and Analytical Mathematics, vol. 50, pp. 613–634. Springer, New York (2013)
Roshchina, V.: Facially exposed cones are not always nice. SIAM J. Optim. 24(1), 257–268 (2014)
Roshchina, V., Tunçel, L.: Facially dual complete (nice) cones and lexicographic tangents. SIAM J. Optim. 29(3), 2363–2387 (2019)
Serrano, S.A.: Algorithms for unsymmetric cone optimization and an implementation for problems with the exponential cone. Ph.D. thesis, Stanford University (2015)
Skajaa, A., Ye, Y.: A homogeneous interior-point algorithm for nonsymmetric convex conic optimization. Math. Program. 150(2), 391–422 (2015)
Sturm, J.F.: Error bounds for linear matrix inequalities. SIAM J. Optim. 10(4), 1228–1248 (2000)
Sung, C.H., Tam, B.S.: A study of projectionally exposed cones. Linear Algebra Appl. 139, 225–252 (1990)
Waki, H., Muramatsu, M.: Facial reduction algorithms for conic optimization problems. J. Optim. Theory Appl. 158(1), 188–215 (2013)
Zhang, Q.: Completely positive cones: are they facially exposed? Linear Algebra Appl. 558, 195–204 (2018)
Acknowledgements
We thank the referees for their comments, which helped to improve the paper.
Funding
Open Access funding enabled and organized by CAUL and its Member Institutions.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Bruno F. Lourenço was supported partly by the JSPS Grant-in-Aid for Early-Career Scientists 19K20217 and the Grant-in-Aid for Scientific Research (B) 18H03206 and 21H03398.
Ting Kei Pong was supported partly by Hong Kong Research Grants Council PolyU153003/19p.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Lindstrom, S.B., Lourenço, B.F. & Pong, T.K. Error bounds, facial residual functions and applications to the exponential cone. Math. Program. 200, 229–278 (2023). https://doi.org/10.1007/s10107-022-01883-8