1 Introduction

Our main object of interest is the following convex conic feasibility problem:

$$\begin{aligned} \text {find}&\quad {{\textbf {x}}}\in (\mathcal {L}+ {{\textbf {a}}}) \cap \mathcal {K}, \end{aligned}$$
(Feas)

where \(\mathcal {L}\) is a subspace contained in some finite-dimensional real Euclidean space \(\mathcal {E}\), \({{\textbf {a}}}\in \mathcal {E}\) and \( \mathcal {K}\subseteq \mathcal {E}\) is a closed convex cone. For a discussion of some applications and algorithms for (Feas) see [22]. See also [5] for a broader analysis of convex feasibility problems. We also recall that a conic linear program (CLP) is the problem of minimizing a linear function subject to a constraint of the form described in (Feas). In addition, when the optimal set of a CLP is non-empty, it can be written as the intersection of a cone with an affine set. This provides yet another motivation for analyzing (Feas): to better understand feasible regions and optimal sets of conic linear programs. Here, our main interest is in obtaining error bounds for (Feas). That is, assuming \((\mathcal {L}+{{\textbf {a}}})\cap \mathcal {K}\ne \emptyset \), we want an inequality that, given some arbitrary \({{\textbf {x}}}\in \mathcal {E}\), relates the individual distances \(\text {d}({{\textbf {x}}}, \mathcal {L}+{{\textbf {a}}}), \text {d}({{\textbf {x}}}, \mathcal {K})\) to the distance to the intersection \(\text {d}({{\textbf {x}}}, (\mathcal {L}+{{\textbf {a}}})\cap \mathcal {K})\). Considering that \(\mathcal {E}\) is equipped with a norm \(\Vert \cdot \Vert \) induced by an inner product \(\langle \cdot , \cdot \rangle \), we recall that the distance function to a convex set C is defined as follows:

$$\begin{aligned} \text {d}({{\textbf {x}}}, C) {:}{=}\inf _{{{\textbf {y}}}\in C} \Vert {{\textbf {x}}}-{{\textbf {y}}}\Vert . \end{aligned}$$
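When the projection \(P_C\) has a closed form, the infimum is attained and \(\text {d}({{\textbf {x}}}, C) = \Vert {{\textbf {x}}}- P_C({{\textbf {x}}})\Vert \). A minimal Python sketch for two easy cases used repeatedly below, a polyhedral cone and an affine set (the function names are ours, chosen for illustration):

```python
import math

def dist_orthant(x):
    """Distance from x to the nonnegative orthant: project by clipping at 0."""
    p = [max(xi, 0.0) for xi in x]
    return math.sqrt(sum((xi - pi) ** 2 for xi, pi in zip(x, p)))

def dist_affine_line(x, a, d):
    """Distance from x to the affine line {a + t*d}: project onto the line."""
    t = sum((xi - ai) * di for xi, ai, di in zip(x, a, d)) / sum(di * di for di in d)
    p = [ai + t * di for ai, di in zip(a, d)]
    return math.sqrt(sum((xi - pi) ** 2 for xi, pi in zip(x, p)))
```

For instance, `dist_orthant([-3.0, 4.0])` projects onto `[0.0, 4.0]` and returns `3.0`.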

When \( \mathcal {K}\) is a polyhedral cone, Hoffman’s classical error bound [23] gives a relatively complete picture of the way that the individual distances relate to the distance to the intersection. If \( \mathcal {K}\) is not polyhedral, but \(\mathcal {L}+{{\textbf {a}}}\) intersects \( \mathcal {K}\) in a sufficiently well-behaved fashion (say, for example, when \(\mathcal {L}+{{\textbf {a}}}\) intersects \(\text {ri}\, \mathcal {K}\), the relative interior of \( \mathcal {K}\); see Proposition 2.2), we may still expect “good” error bounds to hold, e.g., [6, Corollary 3]. However, checking whether \(\mathcal {L}+{{\textbf {a}}}\) intersects \(\text {ri}\, \mathcal {K}\) is not necessarily a trivial task; and, in general, \((\mathcal {L}+{{\textbf {a}}})\cap \text {ri}\, \mathcal {K}\) can be empty.

Here, we focus on error bound results that do not require any assumption on the way that the affine space \(\mathcal {L}+{{\textbf {a}}}\) intersects \( \mathcal {K}\). So, for example, we want results that are valid even if, say, \(\mathcal {L}+{{\textbf {a}}}\) fails to intersect the relative interior of \( \mathcal {K}\). Inspired by Sturm’s pioneering work on error bounds for positive semidefinite systems [50], the class of amenable cones was proposed in [34] and it was shown that the following three ingredients can be used to obtain general error bounds for (Feas): (i) amenable cones, (ii) facial reduction [13, 45, 52] and (iii) the so-called facial residual functions (FRFs) [34, Definition 16].

In this paper, we will show that, in fact, it is possible to obtain error bounds for (Feas) by using the so-called one-step facial residual functions directly in combination with facial reduction. It is fair to say that computing the facial residual functions is the most critical step in obtaining error bounds for (Feas). We will demonstrate techniques that are readily adaptable for the purpose.

All the techniques discussed here will be showcased with error bounds for the so-called exponential cone, which is defined as follows:

$$\begin{aligned} K_{\exp }:=&\left\{ (x,y,z)\in \mathbb {R}^3\;|\;y>0,z\ge ye^{x/y}\right\} \cup \left\{ (x,y,z)\;|\; x \le 0, z\ge 0, y=0 \right\} . \end{aligned}$$

Put succinctly, the exponential cone is the closure of the epigraph of the perspective function of \(z=e^x\). It is quite useful in entropy optimization, see [15]. Furthermore, it is implemented in the MOSEK package; see [17, 37, Chapter 5] and the many modelling examples in Sect. 5.4 therein. There are several other solvers that either support the exponential cone or convex sets closely related to it [16, 25, 39, 41]. See also [20] for an algorithm for projecting onto the exponential cone. So convex optimization with exponential cones is widely available even if, as of this writing, it is not as widespread as, say, semidefinite programming.
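The two branches of the definition translate directly into a membership test. The following Python sketch is ours and purely illustrative (the tolerance parameter `tol` is our own device for floating-point comparisons, not part of the definition):

```python
import math

def in_exp_cone(x, y, z, tol=1e-12):
    """Membership test for K_exp, following its two-branch definition."""
    if y > 0:
        # first branch: z >= y * e^(x/y)
        return z >= y * math.exp(x / y) - tol
    if y == 0:
        # second branch (the closure piece): x <= 0, z >= 0, y = 0
        return x <= tol and z >= -tol
    return False  # points with y < 0 are never in K_exp
```

For instance, (0, 1, 1) lies on the boundary of the first branch, while (-1, 0, 5) lies in the closure piece.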

The exponential cone \(K_{\exp }\) appears, at a glance, to be simple. However, it possesses a very intricate geometric structure that illustrates a wide range of challenges practitioners may face in computing error bounds. First of all, being non-facially-exposed, it is not amenable, so the theory developed in [34] does not directly apply to it. Another difficulty is that few analytical tools have been developed to deal with the projection operator onto \(K_{\exp }\), which is only implicitly specified (in contrast with, for example, the projection operator onto PSD cones). Until now, these issues have made it challenging to establish error bounds for objects like \(K_{\exp }\), many of which are of growing interest in the mathematical programming community.

Our research is at the intersection of two topics: error bounds and the facial structure of cones. General information on the former can be found, for example, in [26, 40]. Classically, there seems to be a focus on the so-called Hölderian error bounds (see also [27,28,29]), but we will see in this paper that non-Hölderian behavior can still appear even in relatively natural settings, such as conic feasibility problems associated with the exponential cone.

Facts on the facial structure of convex cones can be found, for example, in [3, 4, 42]. We recall that a cone is said to be facially exposed if each face arises as the intersection of the whole cone with some supporting hyperplane. Stronger forms of facial exposedness have also been studied to some extent; examples include projectional exposedness [13, 51], niceness [44, 46], tangential exposedness [47] and amenability [34]. See also [36] for a comparison between a few different types of facial exposedness. These notions are useful in many topics, e.g.: regularization of convex programs and extended duals [13, 32, 45], studying the closure of certain linear images [32, 43], lifts of convex sets [21] and error bounds [34]. However, as can be seen in Fig. 1, the exponential cone is not even a facially exposed cone, so none of the aforementioned notions apply (in particular, the face \( \mathcal {F}_{ne}:=\{(0,0,z)\;|\; z \ge 0\}\) is not exposed). This was one of the motivations for looking beyond facial exposedness and developing a framework for deriving error bounds for feasibility problems associated to general closed convex cones.

Fig. 1 The exponential cone is the union of the two labelled sets

1.1 Outline and results

The goal of this paper is to build a robust framework that may be used to obtain error bounds for previously inaccessible cones, and to demonstrate the use of this framework by applying it to fully describe error bounds for (Feas) with \( \mathcal {K}= K_{\exp }\).

In Sect. 2, we recall preliminaries. New contributions begin in Sect. 3. We first recall some rules for chains of faces and the diamond composition. Then we show how error bounds may be constructed using objects known as one-step facial residual functions. In Sect. 3.1, we build our general framework for constructing one-step facial residual functions. Our key result, Theorem 3.10, obviates the need to compute the projection onto the cone explicitly. Instead, we make use of a parametrization of the boundary of the cone and of projections onto the proper faces of the cone: thus, our approach is advantageous when these projections are easier to analyze than the projection onto the whole cone itself. We emphasize that all of the results of Sect. 3 are applicable to a general closed convex cone.

In Sect. 4, we use our new framework to fully describe error bounds for (Feas) with \( \mathcal {K}= K_{\exp }\). This was previously a problem lacking a clear strategy, because all projections onto \(K_{\exp }\) are implicitly specified. However, having obviated the need to project onto \(K_{\exp }\), we successfully obtain all the necessary FRFs, partly because it is easier to project onto the proper faces of \(K_{\exp }\) than onto \(K_{\exp }\) itself. Surprisingly, we discover that different collections of faces and exposing hyperplanes admit very different FRFs. In Sect. 4.2.1, we show that for the unique 2-dimensional face, any exponent in \(\left( 0,1\right) \) may be used to build a valid FRF, while the supremum over all the admissible exponents cannot be. Furthermore, a better FRF for the 2D face can be obtained if we go beyond Hölderian error bounds and consider a so-called entropic error bound, which uses a modified Boltzmann-Shannon entropy function; see Theorem 4.2. The curious discoveries continue: for infinitely many 1-dimensional faces, the FRF, and the final error bound, feature exponent 1/2. For the final outstanding 1-dimensional exposed face, the FRF, and the final error bound, are Lipschitzian for all exposing hyperplanes except exactly one, for which no exponent will suffice. However, for this exceptional case, our framework still successfully finds an FRF, which is logarithmic in character (Corollary 4.11). Consequently, the system consisting of \(\{(0,0,1)\}^\perp \) and \(K_{\exp }\) possesses a kind of “logarithmic error bound” (see Example 4.20) instead of a Hölderian error bound. In Theorems 4.13 and 4.17, we give explicit error bounds by using our FRFs and the suite of tools we developed in Sect. 3. We also show that the error bound given in Theorem 4.13 is tight; see Remark 4.14.

These findings about the exponential cone are surprising, since we are not aware of other objects having this litany of odd behaviour hidden in their structure all at once. One possible reason for the absence of previous reports on these phenomena might be the sheer absence of tools for obtaining error bounds for general cones. In this sense, we believe that the machinery developed in Sect. 3 might be a reasonable first step towards filling this gap. In Sect. 4.4, we document additional odd consequences and connections to other concepts, with particular relevance to the Kurdyka-Łojasiewicz (KL) property [1, 2, 8,9,10, 30]. In particular, we exhibit two sets satisfying a Hölderian error bound for every \(\gamma \in \left( 0,1\right) \) such that the supremum of allowable exponents is not allowable. Consequently, one obtains a function with the KL property with exponent \(\alpha \) for any \(\alpha \in \left( 1/2,1\right) \) at the origin, but not for \(\alpha = 1/2\). We conclude in Sect. 5.

2 Preliminaries

We recall that \(\mathcal {E}\) denotes an arbitrary finite-dimensional real Euclidean space. We will adopt the following convention: vectors will be boldfaced, while scalars will use normal typeface. For example, if \({{\textbf {p}}}\in \mathbb {R}^3\), we write \({{\textbf {p}}}= (p_x,p_y,p_z)\), where \(p_x,p_y,p_z \in \mathbb {R}\).

We denote by \(B(\eta )\) the closed ball of radius \(\eta \) centered at the origin, i.e., \(B(\eta ) = \{{{\textbf {x}}}\in \mathcal {E}\mid \Vert {{\textbf {x}}}\Vert \le \eta \}\). Let \(C\subseteq \mathcal {E}\) be a convex set. We denote the relative interior and the linear span of C by \(\text {ri}\,C\) and \(\text {span}\,C\), respectively. We also denote the boundary of C by \(\partial C\), and \(\text {cl}\, C\) is the closure of C. We denote the projection operator onto C by \(P_C\), so that \(P_C({{\textbf {x}}}) = \text {argmin}_{{{\textbf {y}}}\in C} \Vert {{\textbf {x}}}-{{\textbf {y}}}\Vert \). Given closed convex sets \(C_1,C_2 \subseteq \mathcal {E}\), we note the following properties of the projection operator

$$\begin{aligned} \text {d}({{\textbf {x}}},C_1)&\le \text {d}({{\textbf {x}}},C_2) + \text {d}(P_{C_2}({{\textbf {x}}}),C_1) \end{aligned}$$
(2.1)
$$\begin{aligned} \text {d}(P_{C_2}({{\textbf {x}}}),C_1)&\le \text {d}({{\textbf {x}}},C_2) + \text {d}({{\textbf {x}}},C_1). \end{aligned}$$
(2.2)
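Both (2.1) and (2.2) are instances of the triangle inequality applied through the point \(P_{C_2}({{\textbf {x}}})\). A quick numerical sanity check in Python, with two closed convex sets of our own choosing (the nonnegative orthant and a line):

```python
import math

def proj_orthant(x):
    """Projection onto the nonnegative orthant (clip at zero)."""
    return [max(xi, 0.0) for xi in x]

def proj_line(x, d):
    """Projection onto the line through the origin with direction d."""
    t = sum(xi * di for xi, di in zip(x, d)) / sum(di * di for di in d)
    return [t * di for di in d]

def dist(p, q):
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

# C1 = R^2_+, C2 = line spanned by (1, 1); both are closed convex sets.
d_line = [1.0, 1.0]
for x in [[-3.0, 4.0], [2.0, -1.0], [-0.5, -0.5], [1.0, 3.0]]:
    d1 = dist(x, proj_orthant(x))       # d(x, C1)
    p2 = proj_line(x, d_line)           # P_{C2}(x)
    d2 = dist(x, p2)                    # d(x, C2)
    dp = dist(p2, proj_orthant(p2))     # d(P_{C2}(x), C1)
    assert d1 <= d2 + dp + 1e-12        # inequality (2.1)
    assert dp <= d2 + d1 + 1e-12        # inequality (2.2)
```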

2.1 Cones and their faces

Let \( \mathcal {K}\) be a closed convex cone. We say that \( \mathcal {K}\) is pointed if \( \mathcal {K}\cap - \mathcal {K}= \{{\textbf {0}}\}\). The dimension of \( \mathcal {K}\) is denoted by \(\dim ( \mathcal {K})\) and is the dimension of the linear subspace spanned by \( \mathcal {K}\). A face of \( \mathcal {K}\) is a closed convex cone \( \mathcal {F}\) satisfying \( \mathcal {F}\subseteq \mathcal {K}\) and the following property

$$\begin{aligned} {{\textbf {x}}},{{\textbf {y}}}\in \mathcal {K}, {{\textbf {x}}}+{{\textbf {y}}}\in \mathcal {F}\Rightarrow {{\textbf {x}}},{{\textbf {y}}}\in \mathcal {F}. \end{aligned}$$

In this case, we write \( \mathcal {F}\mathrel {\unlhd } \mathcal {K}\). We say that \( \mathcal {F}\) is proper if \( \mathcal {F}\ne \mathcal {K}\). A face is said to be nontrivial if \( \mathcal {F}\ne \mathcal {K}\) and \( \mathcal {F}\ne \mathcal {K}\cap - \mathcal {K}\). In particular, if \( \mathcal {K}\) is pointed (as is the case of the exponential cone), a nontrivial face is neither \( \mathcal {K}\) nor \(\{{\textbf {0}}\}\). Next, let \( \mathcal {K}^*\) denote the dual cone of \( \mathcal {K}\), i.e., \( \mathcal {K}^* = \{{{\textbf {z}}}\in \mathcal {E}\mid \langle {{\textbf {x}}} , {{\textbf {z}}} \rangle \ge 0, \forall {{\textbf {x}}}\in \mathcal {K}\}\). We say that \( \mathcal {F}\) is an exposed face if there exists \({{\textbf {z}}}\in \mathcal {K}^*\) such that \( \mathcal {F}= \mathcal {K}\cap \{{{\textbf {z}}}\}^\perp \).
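As a standard concrete illustration (not from the paper), consider the nonnegative orthant, whose faces can be listed exhaustively and are all exposed:

```latex
% The faces of K = R^2_+ are {0}, the two extreme rays, and K itself:
\mathcal{F} \mathrel{\unlhd} \mathbb{R}^2_+
\iff
\mathcal{F} \in \bigl\{ \{\mathbf{0}\},\ \{(x,0) : x \ge 0\},\ \{(0,y) : y \ge 0\},\ \mathbb{R}^2_+ \bigr\}.
% Each face is exposed; e.g., with z = (0,1) \in (R^2_+)^* = R^2_+ we have
\{(x,0) : x \ge 0\} \;=\; \mathbb{R}^2_+ \cap \{(0,1)\}^{\perp}.
```

By contrast, no \({{\textbf {z}}}\in K_{\exp }^*\) exposes the face \( \mathcal {F}_{ne}\) of \(K_{\exp }\) shown in Fig. 1.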

A chain of faces of \( \mathcal {K}\) is a sequence of faces satisfying \( \mathcal {F}_\ell \subsetneq \cdots \subsetneq \mathcal {F}_{1}\) such that each \( \mathcal {F}_{i}\) is a face of \( \mathcal {K}\) and the inclusions \( \mathcal {F}_{i+1} \subsetneq \mathcal {F}_{i}\) are all proper. The length of the chain is defined to be \(\ell \). With that, we define the distance to polyhedrality of \( \mathcal {K}\) as the length minus one of the longest chain of faces of \( \mathcal {K}\) such that \( \mathcal {F}_{\ell }\) is polyhedral and \( \mathcal {F}_{i}\) is not polyhedral for \(i < \ell \), see [35, Sect. 5.1]. We denote the distance to polyhedrality by \(\ell _{\text {poly}}( \mathcal {K})\).

2.2 Lipschitzian and Hölderian error bounds

In this subsection, suppose that \(C_1,\ldots , C_{\ell } \subseteq \mathcal {E}\) are convex sets with nonempty intersection. We recall the following definitions.

Definition 2.1

(Hölderian and Lipschitzian error bounds) The sets \(C_1,\ldots , C_\ell \) are said to satisfy a Hölderian error bound if for every bounded set \(B \subseteq \mathcal {E}\) there exist some \(\kappa _B > 0\) and an exponent \(\gamma _B\in (0, 1]\) such that

$$\begin{aligned} \text {d}({{\textbf {x}}}, \cap _{i=1}^\ell C_i) \le \kappa _B\max _{1\le i\le \ell }\text {d}({{\textbf {x}}}, \, C_i)^{\gamma _B}, \qquad \forall \ {{\textbf {x}}}\in B. \end{aligned}$$

If we can take the same \(\gamma _B = \gamma \in (0,1]\) for all B, then we say that the bound is uniform. If the bound is uniform with \(\gamma = 1\), we call it a Lipschitzian error bound.

We note that the concepts in Definition 2.1 also have different names throughout the literature. When \(C_1,\ldots , C_\ell \) satisfy a Hölderian error bound it is said that they satisfy bounded Hölder regularity, e.g., see [11, Definition 2.2]. When a Lipschitzian error bound holds, \(C_1,\ldots , C_\ell \) are said to satisfy bounded linear regularity, see [5, Sect. 5] or [6]. Bounded linear regularity is also closely related to the notion of subtransversality [24, Definition 7.5].
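To make Definition 2.1 concrete, here is a self-contained Python illustration with two sets of our own choosing (not from the paper): \(C_1\) the x-axis and \(C_2\) the epigraph of \(x^2\). Their intersection is the origin; for \({{\textbf {p}}}= (t,0)\) one has \(\text {d}({{\textbf {p}}}, C_1 \cap C_2) = t\) while \(\text {d}({{\textbf {p}}}, C_2) \approx t^2\), so a Hölderian bound holds with exponent 1/2, but no Lipschitzian bound holds near the origin:

```python
import math

def dist_to_parabola_epigraph(t):
    """Distance from the point (t, 0), t > 0, to C2 = {(x, y) : y >= x^2},
    computed by ternary search on the strictly convex squared distance
    f(x) = (x - t)^2 + x^4 to the boundary parabola."""
    f = lambda x: (x - t) ** 2 + x ** 4
    lo, hi = 0.0, t  # f'(0) < 0 < f'(t), so the minimizer lies in (0, t)
    for _ in range(200):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return math.sqrt(f((lo + hi) / 2))

# C1 = x-axis, C2 = epigraph of x^2, so C1 ∩ C2 = {(0, 0)}.
# For p = (t, 0): d(p, C1) = 0 and d(p, C1 ∩ C2) = t, while d(p, C2) ≈ t^2.
for t in [0.1, 0.01, 0.001]:
    d2 = dist_to_parabola_epigraph(t)
    assert 0.9 < t / math.sqrt(d2) < 1.2   # exponent 1/2 stays bounded...
assert 0.01 / dist_to_parabola_epigraph(0.01) > 50  # ...but exponent 1 blows up
```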

Hölderian and Lipschitzian error bounds will appear frequently in our results, but we also encounter non-Hölderian bounds as in Theorems 4.2 and 4.10. Next, we recall the following result which ensures a Lipschitzian error bound holds between families of convex sets when a constraint qualification is satisfied.

Proposition 2.2

[6, Corollary 3] Let \(C_1,\ldots , C_{\ell } \subseteq \mathcal {E}\) be convex sets such that \(C_{1},\ldots , C_{k}\) are polyhedral. If

$$\begin{aligned}\textstyle \left( \bigcap \limits _{i=1}^k C_i\right) \bigcap \left( \bigcap \limits _{j=k+1}^\ell \text {ri}\,C_j\right) \ne \emptyset , \end{aligned}$$

then for every bounded set B there exists \(\kappa _B>0\) such that

$$\begin{aligned} \text {d}\left( {{\textbf {x}}}, \cap _{i=1}^\ell C_i\right) \le \kappa _B\left( \max _{1 \le i \le \ell } \text {d}({{\textbf {x}}}, C_i)\right) , \qquad \forall {{\textbf {x}}}\in B. \end{aligned}$$

In view of (Feas), we say that Slater’s condition is satisfied if \((\mathcal {L}+ {{\textbf {a}}}) \cap \text {ri}\, \mathcal {K}\ne \emptyset \). If \( \mathcal {K}\) can be written as \( \mathcal {K}^1 \times \mathcal {K}^2\subseteq \mathcal {E}^1 \times \mathcal {E}^2\), where \(\mathcal {E}^1\) and \(\mathcal {E}^2\) are real Euclidean spaces and \( \mathcal {K}^1\subseteq \mathcal {E}^1\) is polyhedral, we say that the partial polyhedral Slater’s (PPS) condition is satisfied if

$$\begin{aligned} (\mathcal {L}+ {{\textbf {a}}}) \cap ( \mathcal {K}^1 \times (\text {ri}\, \mathcal {K}^2) ) \ne \emptyset . \end{aligned}$$
(2.3)

Adding a dummy coordinate, if necessary, we can see Slater’s condition as a particular case of the PPS condition. By convention, we consider that the PPS condition is satisfied for (Feas) if one of the following holds: 1) \(\mathcal {L}+{{\textbf {a}}}\) intersects \(\text {ri}\, \mathcal {K}\); 2) \((\mathcal {L}+{{\textbf {a}}})\cap \mathcal {K}\ne \emptyset \) and \( \mathcal {K}\) is polyhedral; or 3) \( \mathcal {K}\) can be written as a direct product \( \mathcal {K}^1 \times \mathcal {K}^2\) where \( \mathcal {K}^1\) is polyhedral and (2.3) is satisfied.

Noting that \((\mathcal {L}+ {{\textbf {a}}}) \cap ( \mathcal {K}^1 \times (\text {ri}\, \mathcal {K}^2) ) = (\mathcal {L}+ {{\textbf {a}}})\cap ( \mathcal {K}^1 \times \mathcal {E}^2) \cap (\mathcal {E}^1 \times (\text {ri}\, \mathcal {K}^2) )\), we deduce the following result from Proposition 2.2.

Proposition 2.3

(Error bound under PPS condition) Suppose that (Feas) satisfies the partial polyhedral Slater’s condition. Then, for every bounded set B there exists \(\kappa _B>0\) such that

$$\begin{aligned} \text {d}({{\textbf {x}}}, (\mathcal {L}+{{\textbf {a}}})\cap \mathcal {K}) \le \kappa _B \max \{\text {d}({{\textbf {x}}}, \mathcal {K}),\text {d}({{\textbf {x}}},\mathcal {L}+{{\textbf {a}}})\}, \qquad \forall {{\textbf {x}}}\in B. \end{aligned}$$

We recall that for \(a,b \in \mathbb {R}_+\) we have \(a+b \le 2\max \{a,b\} \le 2(a+b)\), so Propositions 2.2 and 2.3 can also be equivalently stated in terms of sums of distances.

3 Facial residual functions and error bounds

In this section, we discuss a strategy for obtaining error bounds for the conic linear system (Feas) based on the so-called facial residual functions that were introduced in [34]. In contrast to [34], we will not require that \( \mathcal {K}\) be amenable.

The motivation for our approach is as follows. If it were the case that (Feas) satisfies some constraint qualification, we would have a Lipschitzian error bound per Proposition 2.3; see also [6] for other sufficient conditions. Unfortunately, this does not happen in general. However, as long as (Feas) is feasible, there is always a face of \( \mathcal {K}\) that contains the feasible region of (Feas) and for which a constraint qualification holds. The error bound computation essentially boils down to understanding how to compute the distance to this special face. The first result towards our goal is the following.

Proposition 3.1

(An error bound when a face satisfying a CQ is known) Suppose that (Feas) is feasible and let \( \mathcal {F}\mathrel {\unlhd } \mathcal {K}\) be a face such that

  1. (a)

    \( \mathcal {F}\) contains \( \mathcal {K}\cap (\mathcal {L}+{{\textbf {a}}})\).

  2. (b)

    \(\{ \mathcal {F}, \mathcal {L}+{{\textbf {a}}}\}\) satisfies the PPS condition.

Then, for every bounded set B, there exists \(\kappa _B > 0\) such that

$$\begin{aligned} \text {d}({{\textbf {x}}}, \mathcal {K}\cap (\mathcal {L}+ {{\textbf {a}}})) \le \kappa _B(\text {d}({{\textbf {x}}}, \mathcal {F}) + \text {d}({{\textbf {x}}}, \mathcal {L}+{{\textbf {a}}})), \qquad \forall {{\textbf {x}}}\in B. \end{aligned}$$

Proof

Since \( \mathcal {F}\) is a face of \( \mathcal {K}\), assumption (a) implies \( \mathcal {K}\cap (\mathcal {L}+ {{\textbf {a}}}) = \mathcal {F}\cap (\mathcal {L}+{{\textbf {a}}})\). Then, the result follows from assumption (b) and Proposition 2.3. \(\square \)

From Proposition 3.1 we see that the key to obtaining an error bound for the system (Feas) is to find a face \( \mathcal {F}\mathrel {\unlhd } \mathcal {K}\) satisfying (a) and (b), and to know how to estimate the quantity \(\text {d}({{\textbf {x}}}, \mathcal {F})\) from the available information \(\text {d}({{\textbf {x}}}, \mathcal {K})\) and \(\text {d}({{\textbf {x}}}, \mathcal {L}+{{\textbf {a}}})\).

This is where we will make use of facial reduction and facial residual functions. The former will help us find \( \mathcal {F}\) and the latter will be instrumental in upper bounding \(\text {d}({{\textbf {x}}}, \mathcal {F})\). First, we recall below a result that follows from the analysis of the FRA-poly facial reduction algorithm developed in [35].

Proposition 3.2

[34, Proposition 5] Let \( \mathcal {K}= \mathcal {K}^1\times \cdots \times \mathcal {K}^s\), where each \( \mathcal {K}^i\) is a closed convex cone. Suppose (Feas) is feasible. Then there is a chain of faces

$$\begin{aligned} \mathcal {F}_{\ell } \subsetneq \cdots \subsetneq \mathcal {F}_1 = \mathcal {K}\end{aligned}$$
(3.1)

of length \(\ell \) and vectors \(\{{{\textbf {z}}}_1,\ldots , {{\textbf {z}}}_{\ell -1}\}\) satisfying the following properties.

  1. (i)

    \(\ell -1\le \sum _{i=1}^{s} \ell _{\text {poly}}( \mathcal {K}^i) \le \dim { \mathcal {K}}\).

  2. (ii)

    For all \(i \in \{1,\ldots , \ell -1\}\), we have

    $$\begin{aligned} {{\textbf {z}}}_i \in \mathcal {F}_i^* \cap \mathcal {L}^\perp \cap \{{{\textbf {a}}}\}^\perp \ \ \ \text {and}\ \ \ \mathcal {F}_{i+1} = \mathcal {F}_{i} \cap \{{{\textbf {z}}}_i\}^\perp . \end{aligned}$$
  3. (iii)

    \( \mathcal {F}_{\ell } \cap (\mathcal {L}+{{\textbf {a}}}) = \mathcal {K}\cap (\mathcal {L}+ {{\textbf {a}}})\) and \(\{ \mathcal {F}_{\ell },\mathcal {L}+{{\textbf {a}}}\}\) satisfies the PPS condition.

In view of Proposition 3.2, we define the distance to the PPS condition \(d_{\text {PPS}}( \mathcal {K},\mathcal {L}+{{\textbf {a}}})\) as the length minus one of the shortest chain of faces (as in (3.1)) satisfying items (ii) and (iii) in Proposition 3.2. For example, if (Feas) satisfies the PPS condition, we have \(d_{\text {PPS}}( \mathcal {K},\mathcal {L}+{{\textbf {a}}}) = 0\).

Next, we recall the definition of facial residual functions from [34, Definition 16].

Definition 3.3

(Facial residual function) Let \( \mathcal {K}\) be a closed convex cone, \( \mathcal {F}\mathrel {\unlhd } \mathcal {K}\) be a face, and let \({{\textbf {z}}}\in \mathcal {F}^*\). Suppose that \(\psi _{ \mathcal {F},{{\textbf {z}}}} : \mathbb {R}_+\times \mathbb {R}_+\rightarrow \mathbb {R}_+\) satisfies the following properties:

  1. (i)

    \(\psi _{ \mathcal {F},{{\textbf {z}}}}\) is nonnegative, monotone nondecreasing in each argument and \(\psi _{ \mathcal {F},{{\textbf {z}}}}(0,t) = 0\) for every \(t \in \mathbb {R}_+\).

  2. (ii)

    The following implication holds for any \({{\textbf {x}}}\in \text {span}\, \mathcal {K}\) and any \(\epsilon \ge 0\):

    $$\begin{aligned} \text {d}({{\textbf {x}}}, \mathcal {K}) \le \epsilon , \quad \langle {{\textbf {x}}} , {{\textbf {z}}} \rangle \le \epsilon , \quad \text {d}({{\textbf {x}}}, \text {span}\, \mathcal {F}) \le \epsilon \quad \Rightarrow \quad \text {d}({{\textbf {x}}}, \mathcal {F}\cap \{{{\textbf {z}}}\}^{\perp }) \le \psi _{ \mathcal {F},{{\textbf {z}}}} (\epsilon , \Vert {{\textbf {x}}}\Vert ). \end{aligned}$$

Then, \(\psi _{ \mathcal {F},{{\textbf {z}}}}\) is said to be a facial residual function for \( \mathcal {F}\) and \({{\textbf {z}}}\) with respect to \( \mathcal {K}\).

Definition 3.3, in its most general form, represents “two-steps” along the facial structure of a cone: we have a cone \( \mathcal {K}\), a face \( \mathcal {F}\) (which could be different from \( \mathcal {K}\)) and a third face defined by \( \mathcal {F}\cap \{{{\textbf {z}}}\}^\perp \). In this work, however, we will be focused on the following special case of Definition 3.3.

Definition 3.4

(One-step facial residual function (\(\mathbb {1}\)-FRF)) Let \( \mathcal {K}\) be a closed convex cone and \({{\textbf {z}}}\in \mathcal {K}^*\). A function \(\psi _{ \mathcal {K},{{\textbf {z}}}}:\mathbb {R}_+\times \mathbb {R}_+\rightarrow \mathbb {R}_+\) is called a one-step facial residual function (\(\mathbb {1}\)-FRF) for \( \mathcal {K}\) and \({{\textbf {z}}}\) if it is a facial residual function of \( \mathcal {K}\) and \({{\textbf {z}}}\) with respect to \( \mathcal {K}\). That is, \(\psi _{ \mathcal {K},{{\textbf {z}}}}\) satisfies item (i) of Definition 3.3 and for every \({{\textbf {x}}}\in \text {span}\, \mathcal {K}\) and any \(\epsilon \ge 0\):

$$\begin{aligned} \text {d}({{\textbf {x}}}, \mathcal {K}) \le \epsilon , \quad \langle {{\textbf {x}}} , {{\textbf {z}}} \rangle \le \epsilon \quad \Rightarrow \quad \text {d}({{\textbf {x}}}, \mathcal {K}\cap \{{{\textbf {z}}}\}^{\perp }) \le \psi _{ \mathcal {K},{{\textbf {z}}}} (\epsilon , \Vert {{\textbf {x}}}\Vert ). \end{aligned}$$

Remark 3.5

(Concerning the implication in Definition 3.4) In view of the monotonicity of \(\psi _{ \mathcal {K},{{\textbf {z}}}}\), the implication in Definition 3.4 can be equivalently and more succinctly written as

$$\begin{aligned} \text {d}({{\textbf {x}}}, \mathcal {K}\cap \{{{\textbf {z}}}\}^\perp ) \le \psi _{ \mathcal {K},{{\textbf {z}}}}(\max \{\text {d}({{\textbf {x}}}, \mathcal {K}), \langle {{\textbf {x}}} , {{\textbf {z}}} \rangle \}, \Vert {{\textbf {x}}}\Vert ),\ \ \forall {{\textbf {x}}}\in \text {span}\, \mathcal {K}. \end{aligned}$$

The unfolded form presented in Definition 3.4 is more convenient for the discussions and analysis below.

Facial residual functions always exist (see [34, Sect. 3.2] for the case of pointed cones, although the argument holds in general), but their computation is often nontrivial. Next, we review a few examples.

Example 3.6

(Examples of facial residual functions) If \( \mathcal {K}\) is a symmetric cone (i.e., a self-dual homogeneous cone, see [18, 19]), then given \( \mathcal {F}\mathrel {\unlhd } \mathcal {K}\) and \({{\textbf {z}}}\in \mathcal {F}^*\), there exists a \(\kappa > 0\) such that \(\psi _{ \mathcal {F},{{\textbf {z}}}}(\epsilon ,t) {:}{=}\kappa \epsilon + \kappa \sqrt{\epsilon t}\) is a one-step facial residual function for \( \mathcal {F}\) and \({{\textbf {z}}}\), see [34, Theorem 35].

If \( \mathcal {K}\) is a polyhedral cone, the function \(\psi _{ \mathcal {F},{{\textbf {z}}}}(\epsilon ,t) {:}{=}\kappa \epsilon \) can be taken instead, with no dependency on t, see [34, Proposition 18].
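As a sanity check, both facial residual functions in Example 3.6 satisfy the structural requirements of item (i) of Definition 3.3. A small Python sketch, with an arbitrary choice \(\kappa = 2\) of our own:

```python
import math

kappa = 2.0  # an arbitrary positive constant chosen for illustration

def psi_sym(eps, t):
    """FRF shape for symmetric cones: kappa*eps + kappa*sqrt(eps*t)."""
    return kappa * eps + kappa * math.sqrt(eps * t)

def psi_poly(eps, t):
    """FRF shape for polyhedral cones: kappa*eps (no dependence on t)."""
    return kappa * eps

# Check the requirements of Definition 3.3(i) on a small grid.
grid = [0.0, 0.5, 1.0, 2.0]
for psi in (psi_sym, psi_poly):
    for t in grid:
        assert psi(0.0, t) == 0.0             # psi(0, t) = 0
        for e1, e2 in zip(grid, grid[1:]):
            assert psi(e1, t) <= psi(e2, t)   # nondecreasing in eps
    for e in grid:
        for t1, t2 in zip(grid, grid[1:]):
            assert psi(e, t1) <= psi(e, t2)   # nondecreasing in t
```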

Moving on, we say that a function \(\tilde{\psi }_{ \mathcal {F},{{\textbf {z}}}}\) is a positively rescaled shift of \(\psi _{ \mathcal {F},{{\textbf {z}}}}\) if there are positive constants \(M_1,M_2,M_3\) and a nonnegative constant \(M_4\) such that

$$\begin{aligned} \tilde{\psi }_{ \mathcal {F},{{\textbf {z}}}}(\epsilon ,t) = M_3\psi _{ \mathcal {F},{{\textbf {z}}}} (M_1\epsilon ,M_2t) + M_4\epsilon . \end{aligned}$$
(3.2)

This is a generalization of the notion of positive rescaling in [34], which sets \(M_4 = 0\). We also need to compose facial residual functions in a special manner. Let \(f:\mathbb {R}_+\times \mathbb {R}_+\rightarrow \mathbb {R}_+\) and \(g:\mathbb {R}_+\times \mathbb {R}_+\rightarrow \mathbb {R}_+\) be functions. We define the diamond composition \(f\diamondsuit g\) to be the function satisfying

$$\begin{aligned} (f\diamondsuit g)(a,b) = f(a+g(a,b),b), \qquad \forall a,b \in \mathbb {R}_+. \end{aligned}$$
(3.3)

Note that the above composition is not associative in general. When we have functions \(f_i:\mathbb {R}_+\times \mathbb {R}_+\rightarrow \mathbb {R}_+\), \(i = 1,\ldots ,m\) with \(m\ge 3\), we define \(f_m\diamondsuit \cdots \diamondsuit f_1\) inductively as the function \(\varphi _m\) such that

$$\begin{aligned} \varphi _i&{:}{=}f_i \diamondsuit \varphi _{i-1},\qquad i \in \{2,\ldots ,m\}\\ \varphi _1&{:}{=}f_1. \end{aligned}$$

With that, we have \( f_m\diamondsuit f_{m-1} \diamondsuit \cdots \diamondsuit f_2 \diamondsuit f_1 {:}{=}f_m\diamondsuit (f_{m-1}\diamondsuit (\cdots \diamondsuit (f_2\diamondsuit f_1)))\).
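A small Python sketch of the diamond composition (3.3) and its inductive chaining, using toy functions of our own choosing to confirm that \(\diamondsuit \) is not associative:

```python
def diamond(f, g):
    """The diamond composition (3.3): (f ◇ g)(a, b) = f(a + g(a, b), b)."""
    return lambda a, b: f(a + g(a, b), b)

def diamond_chain(fs):
    """Builds f_m ◇ (f_{m-1} ◇ (... ◇ f_1)) for fs = [f_1, ..., f_m]."""
    phi = fs[0]
    for f in fs[1:]:
        phi = diamond(f, phi)
    return phi

# Toy functions (not facial residual functions, just a counterexample).
f = lambda a, b: a * a
g = lambda a, b: a
h = lambda a, b: b

left = diamond(diamond(f, g), h)   # (f ◇ g) ◇ h: (a, b) -> 4*(a + b)^2
right = diamond(f, diamond(g, h))  # f ◇ (g ◇ h): (a, b) -> (2a + b)^2
# At (1, 1): left gives 16, right gives 9, so ◇ is not associative;
# diamond_chain([h, g, f]) builds the right-associated form f ◇ (g ◇ h).
```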

The following lemma, which holds for a general closed convex cone \( \mathcal {K}\), shows how (positively rescaled shifts of) one-step facial residual functions for the faces of \( \mathcal {K}\) can be combined via the diamond composition to derive useful bounds on the distance to faces. A version of it was proved in [34, Lemma 22], which required the cones to be pointed and made use of general (i.e., not necessarily one-step) facial residual functions with respect to \( \mathcal {K}\). This is a subtle but crucial difference, which will allow us to relax the assumptions in [34].

Lemma 3.7

(Diamond composing facial residual functions) Suppose (Feas) is feasible and let

$$\begin{aligned} \mathcal {F}_{\ell } \subsetneq \cdots \subsetneq \mathcal {F}_1 = \mathcal {K}\end{aligned}$$

be a chain of faces of \( \mathcal {K}\) together with \({{\textbf {z}}}_i \in \mathcal {F}_i^*\cap \mathcal {L}^\perp \cap \{{{\textbf {a}}}\}^\perp \) such that \( \mathcal {F}_{i+1} = \mathcal {F}_i\cap \{{{\textbf {z}}}_i\}^\perp \), for \(i = 1,\ldots , \ell - 1\). For each i, let \(\psi _{i}\) be a \(\mathbb {1}\)-FRF for \( \mathcal {F}_i\) and \({{\textbf {z}}}_i\). Then, there is a positively rescaled shift of \(\psi _i\) (still denoted as \(\psi _i\) by an abuse of notation) so that for every \({{\textbf {x}}}\in \mathcal {E}\) and \(\epsilon \ge 0\):

$$\begin{aligned} \quad \text {d}({{\textbf {x}}}, \mathcal {K}) \le \epsilon , \quad \text {d}({{\textbf {x}}},\mathcal {L}+ {{\textbf {a}}}) \le \epsilon \quad \Rightarrow \quad \text {d}({{\textbf {x}}}, \mathcal {F}_{\ell }) \le \varphi (\epsilon ,\Vert {{\textbf {x}}}\Vert ), \end{aligned}$$

where \(\varphi = \psi _{{\ell -1}}\diamondsuit \cdots \diamondsuit \psi _{{1}}\), if \(\ell \ge 2\). If \(\ell = 1\), we let \(\varphi \) be the function satisfying \(\varphi (\epsilon , t) = \epsilon \).

Proof

For \(\ell = 1\), we have \( \mathcal {F}_{\ell } = \mathcal {K}\), so the lemma follows immediately. Now, we consider the case \(\ell \ge 2\). First we note that \(\mathcal {L}+ {{\textbf {a}}}\) is contained in all the \(\{{{\textbf {z}}}_{i}\}^\perp \) for \(i = 1, \ldots , \ell -1\). Since the distance of \({{\textbf {x}}}\in \mathcal {E}\) to \(\{{{\textbf {z}}}_{i}\}^\perp \) is given by \(\frac{|\langle {{\textbf {x}}} , {{\textbf {z}}}_i \rangle |}{\Vert {{\textbf {z}}}_i\Vert }\), we have the following chain of implications

$$\begin{aligned} \text {d}({{\textbf {x}}},\mathcal {L}+ {{\textbf {a}}}) \le \epsilon \quad \Rightarrow \quad \text {d}({{\textbf {x}}},\{{{\textbf {z}}}_i\}^\perp ) \le \epsilon \quad \Rightarrow \quad \langle {{\textbf {x}}} , {{\textbf {z}}}_i \rangle \le \epsilon \Vert {{\textbf {z}}}_i\Vert . \end{aligned}$$
(3.4)

Next, we proceed by induction. If \(\ell = 2\), we have that \(\psi _1\) is a one-step facial residual function for \( \mathcal {K}\) and \({{\textbf {z}}}_1\). By Definition 3.4, we have

$$\begin{aligned} {{\textbf {y}}}\in \text {span}\, \mathcal {K}, \quad \text {d}({{\textbf {y}}}, \mathcal {K}) \le \epsilon , \quad \langle {{\textbf {y}}} , {{\textbf {z}}}_1 \rangle \le \epsilon \quad \Rightarrow \quad \text {d}({{\textbf {y}}}, \mathcal {F}_{2}) \le \psi _{1} (\epsilon , \Vert {{\textbf {y}}}\Vert ). \end{aligned}$$

In view of (3.4) and the monotonicity of \(\psi _1\), we see further that

$$\begin{aligned} {{\textbf {y}}}\in \text {span}\, \mathcal {K}, \, \text {d}({{\textbf {y}}}, \mathcal {K}) \le \epsilon , \, \text {d}({{\textbf {y}}},\mathcal {L}+ {{\textbf {a}}}) \le \epsilon \, \Rightarrow \, \text {d}({{\textbf {y}}}, \mathcal {F}_{2}) \le \psi _{1} (\epsilon (1+\Vert {{\textbf {z}}}_{1}\Vert ), \Vert {{\textbf {y}}}\Vert ). \end{aligned}$$
(3.5)

Now, suppose that \({{\textbf {x}}}\in \mathcal {E}\) and \(\epsilon \ge 0\) are such that \(\text {d}({{\textbf {x}}}, \mathcal {K}) \le \epsilon \) and \(\text {d}({{\textbf {x}}},\mathcal {L}+ {{\textbf {a}}}) \le \epsilon \). Let \({\hat{{{\textbf {x}}}}} := P_{\text {span}\, \mathcal {K}}({{\textbf {x}}})\). Since \( \mathcal {K}\subseteq \text {span}\, \mathcal {K}\), we have \(\text {d}({{\textbf {x}}}, \text {span}\, \mathcal {K})\le \text {d}({{\textbf {x}}}, \mathcal {K})\) and, in view of (2.2), we have that

$$\begin{aligned} \begin{aligned} \text {d}({\hat{{{\textbf {x}}}}}, \mathcal {K})&\le \text {d}({{\textbf {x}}},\text {span}\, \mathcal {K}) + \text {d}({{\textbf {x}}}, \mathcal {K})\le 2\epsilon ,\\ \text {d}({\hat{{{\textbf {x}}}}},\mathcal {L}+ {{\textbf {a}}})&\le \text {d}({{\textbf {x}}},\text {span}\, \mathcal {K}) + \text {d}({{\textbf {x}}},\mathcal {L}+ {{\textbf {a}}})\le 2\epsilon . \end{aligned} \end{aligned}$$
(3.6)

From (2.1), (3.5) and (3.6) we obtain

$$\begin{aligned} \begin{aligned} \text {d}({{\textbf {x}}}, \mathcal {F}_2) \le \text {d}({{\textbf {x}}}, \text {span}\, \mathcal {K}) + \text {d}({\hat{{{\textbf {x}}}}}, \mathcal {F}_{2})&\le \epsilon + \psi _{{1}}(2\epsilon (1+\Vert {{\textbf {z}}}_{1}\Vert ),\Vert {\hat{{{\textbf {x}}}}}\Vert )\\&\le \epsilon + \psi _{{1}}(2\epsilon (1+\Vert {{\textbf {z}}}_{1}\Vert ),\Vert {{\textbf {x}}}\Vert ),\\ \end{aligned} \end{aligned}$$

where the last inequality follows from the monotonicity of \(\psi _{{1}}\) and the fact that \(\Vert {\hat{{{\textbf {x}}}}}\Vert \le \Vert {{\textbf {x}}}\Vert \). This proves the lemma for chains of length \(\ell = 2\) because the function mapping \((\epsilon ,t)\) to \(\epsilon + \psi _{{1}}(2\epsilon (1+\Vert {{\textbf {z}}}_{1}\Vert ),t)\) is a positively rescaled shift of \(\psi _{{1}}\).

Now, suppose that the lemma holds for chains of length \({\hat{\ell }}\) and consider a chain of length \({\hat{\ell }} + 1\). By the induction hypothesis, we have

$$\begin{aligned} \text {d}({{\textbf {x}}}, \mathcal {K}) \le \epsilon , \quad \text {d}({{\textbf {x}}},\mathcal {L}+ {{\textbf {a}}}) \le \epsilon \quad \Rightarrow \quad \text {d}({{\textbf {x}}}, \mathcal {F}_{{\hat{\ell }}}) \le \varphi (\epsilon , \Vert {{\textbf {x}}}\Vert ), \end{aligned}$$
(3.7)

where \(\varphi = \psi _{{{\hat{\ell }}-1}}\diamondsuit \cdots \diamondsuit \psi _1\) and the \(\psi _i\) are (positively rescaled shifts of) one-step facial residual functions. By the definition of \(\psi _{{{\hat{\ell }}}}\) as a one-step facial residual function and using (3.4), we may positively rescale \(\psi _{{{\hat{\ell }}}}\) (still denoted as \(\psi _{{{\hat{\ell }}}}\) by an abuse of notation) so that for \({{\textbf {y}}}\in \text {span}\, \mathcal {F}_{{\hat{\ell }}}\) and \({\hat{\epsilon }} \ge 0\), the following implication holds:

$$\begin{aligned} \text {d}({{\textbf {y}}}, \mathcal {F}_{{\hat{\ell }}}) \le {\hat{\epsilon }}, \quad \text {d}({{\textbf {y}}},\mathcal {L}+ {{\textbf {a}}}) \le {\hat{\epsilon }} \quad \Rightarrow \quad \text {d}({{\textbf {y}}}, \mathcal {F}_{{\hat{\ell }}+1}) \le \psi _{{\hat{\ell }}}({\hat{\epsilon }}, \Vert {{\textbf {y}}}\Vert ). \end{aligned}$$
(3.8)

Now, suppose that \({{\textbf {x}}}\in \mathcal {E}\) and \(\epsilon \ge 0\) satisfy \(\text {d}({{\textbf {x}}}, \mathcal {K}) \le \epsilon \) and \(\text {d}({{\textbf {x}}},\mathcal {L}+ {{\textbf {a}}}) \le \epsilon \). Let \({\hat{{{\textbf {x}}}}} := P_{\text {span}\, \mathcal {F}_{{\hat{\ell }}}}({{\textbf {x}}})\). As before, since \( \mathcal {F}_{{\hat{\ell }}} \subseteq \text {span}\, \mathcal {F}_{{\hat{\ell }}}\), we have \(\text {d}({{\textbf {x}}}, \text {span}\, \mathcal {F}_{{\hat{\ell }}})\le \text {d}({{\textbf {x}}}, \mathcal {F}_{{\hat{\ell }}})\) and, in view of (2.2), we have

$$\begin{aligned} \begin{aligned} \text {d}({\hat{{{\textbf {x}}}}}, \mathcal {F}_{{\hat{\ell }}})&\le \text {d}({{\textbf {x}}},\text {span}\, \mathcal {F}_{{\hat{\ell }}}) + \text {d}({{\textbf {x}}}, \mathcal {F}_{{\hat{\ell }}})\le 2\text {d}({{\textbf {x}}}, \mathcal {F}_{{\hat{\ell }}}) \le 2\text {d}({{\textbf {x}}}, \mathcal {F}_{{\hat{\ell }}}) + \epsilon ,\\ \text {d}({\hat{{{\textbf {x}}}}},\mathcal {L}+ {{\textbf {a}}})&\le \text {d}({{\textbf {x}}},\text {span}\, \mathcal {F}_{{\hat{\ell }}}) + \text {d}({{\textbf {x}}},\mathcal {L}+ {{\textbf {a}}})\le \text {d}({{\textbf {x}}}, \mathcal {F}_{{\hat{\ell }}})+\epsilon \le 2\text {d}({{\textbf {x}}}, \mathcal {F}_{{\hat{\ell }}}) + \epsilon . \end{aligned} \end{aligned}$$
(3.9)

Let \({\hat{\psi }}_{{\hat{\ell }}}\) be such that \({\hat{\psi }}_{{\hat{\ell }}}(s,t){:}{=}s + \psi _{{\hat{\ell }}}(2s,t)\), so that \({\hat{\psi }}_{{\hat{\ell }}}\) is a positively rescaled shift of \(\psi _{{\hat{\ell }}}\). Then, (3.9) together with (3.8) and (2.1) gives

$$\begin{aligned} \text {d}({{\textbf {x}}}, \mathcal {F}_{{\hat{\ell }}+1})&\le \text {d}({{\textbf {x}}},\text {span}\, \mathcal {F}_{{\hat{\ell }}}) + \text {d}({\hat{{{\textbf {x}}}}}, \mathcal {F}_{{\hat{\ell }}+1})\le \text {d}({{\textbf {x}}}, \mathcal {F}_{{\hat{\ell }}}) + \text {d}({\hat{{{\textbf {x}}}}}, \mathcal {F}_{{\hat{\ell }}+1})\\&\le \text {d}({{\textbf {x}}}, \mathcal {F}_{{\hat{\ell }}}) + \psi _{{\hat{\ell }}}(\epsilon +2\text {d}({{\textbf {x}}}, \mathcal {F}_{{\hat{\ell }}}) , \Vert {\hat{{{\textbf {x}}}}}\Vert ) \overset{\mathrm{(a)}}{\le }\text {d}({{\textbf {x}}}, \mathcal {F}_{{\hat{\ell }}}) + \psi _{{\hat{\ell }}}(2\epsilon +2\text {d}({{\textbf {x}}}, \mathcal {F}_{{\hat{\ell }}}) , \Vert {{\textbf {x}}}\Vert )\\&\le {\hat{\psi }}_{{\hat{\ell }}}(\epsilon +\text {d}({{\textbf {x}}}, \mathcal {F}_{{\hat{\ell }}}), \Vert {{\textbf {x}}}\Vert )\overset{\mathrm{(b)}}{\le }{\hat{\psi }}_{{\hat{\ell }}}(\epsilon +\varphi (\epsilon , \Vert {{\textbf {x}}}\Vert ), \Vert {{\textbf {x}}}\Vert )=({\hat{\psi }}_{{\hat{\ell }}} \diamondsuit \varphi )(\epsilon ,\Vert {{\textbf {x}}}\Vert ), \end{aligned}$$

where (a) follows from the monotonicity of \(\psi _{{\hat{\ell }}}\) and the fact that \(\Vert {\hat{{{\textbf {x}}}}}\Vert \le \Vert {{\textbf {x}}}\Vert \), and (b) follows from (3.7) and the monotonicity of \({\hat{\psi }}_{{\hat{\ell }}}\). This completes the proof. \(\square \)
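The final chain of inequalities makes the role of the diamond composition explicit: \((\hat{\psi }_{{\hat{\ell }}} \diamondsuit \varphi )(\epsilon ,t) = \hat{\psi }_{{\hat{\ell }}}(\epsilon + \varphi (\epsilon ,t), t)\). The following minimal sketch shows how a chain of one-step facial residual functions composes under this operation; the Hölder-type functions used are illustrative placeholders, not the facial residual functions of any particular cone.

```python
# Sketch: composing one-step facial residual functions (FRFs) with the
# diamond operation (psi ⋄ phi)(eps, t) = psi(eps + phi(eps, t), t).
# The Hölder-type FRFs below are placeholders for illustration only.

def holder_frf(kappa, gamma):
    """Return a monotone FRF-like map psi(s, t) = kappa*(s + s**gamma)."""
    return lambda s, t: kappa * (s + s**gamma)

def diamond(psi, phi):
    """Diamond composition used in Lemma 3.7."""
    return lambda eps, t: psi(eps + phi(eps, t), t)

def compose_chain(frfs):
    """Build phi = psi_{l-1} ⋄ ... ⋄ psi_1 for a chain of length l."""
    phi = frfs[0]
    for psi in frfs[1:]:
        phi = diamond(psi, phi)
    return phi

psi1 = holder_frf(2.0, 0.5)        # e.g. a sqrt-type one-step bound
psi2 = holder_frf(1.5, 0.5)
phi = compose_chain([psi1, psi2])  # phi = psi2 ⋄ psi1

assert phi(0.0, 10.0) == 0.0              # phi vanishes with eps
assert phi(1e-4, 10.0) < phi(1e-2, 10.0)  # monotone in eps
```

As in the lemma, the composed function still vanishes at \(\epsilon = 0\) and is monotone in each argument, which is what makes the induction go through.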

We now have all the pieces to state an error bound result for (Feas) that does not require any constraint qualifications.

Theorem 3.8

(Error bound based on \(\mathbb {1}\)-FRFs) Suppose (Feas) is feasible and let

$$\begin{aligned} \mathcal {F}_{\ell } \subsetneq \cdots \subsetneq \mathcal {F}_1 = \mathcal {K}\end{aligned}$$

be a chain of faces of \( \mathcal {K}\) together with \({{\textbf {z}}}_i \in \mathcal {F}_i^*\cap \mathcal {L}^\perp \cap \{{{\textbf {a}}}\}^\perp \) such that \(\{ \mathcal {F}_{\ell }, \mathcal {L}+{{\textbf {a}}}\}\) satisfies the PPS condition and \( \mathcal {F}_{i+1} = \mathcal {F}_i\cap \{{{\textbf {z}}}_i\}^\perp \) for every i. For \(i = 1,\ldots , \ell - 1\), let \(\psi _{i}\) be a \(\mathbb {1}\)-FRF for \( \mathcal {F}_{i}\) and \({{\textbf {z}}}_i\).

Then, there is a suitable positively rescaled shift of the \(\psi _{i}\) (still denoted as \(\psi _i\) by an abuse of notation) such that for any bounded set B there is a positive constant \(\kappa _B\) (depending on \(B, \mathcal {L}, {{\textbf {a}}}, \mathcal {F}_{\ell }\)) such that

$$\begin{aligned} {{\textbf {x}}}\in B, \text {d}({{\textbf {x}}}, \mathcal {K}) \le \epsilon , \text {d}({{\textbf {x}}},\mathcal {L}+ {{\textbf {a}}}) \le \epsilon \Rightarrow \text {d}\left( {{\textbf {x}}}, (\mathcal {L}+ {{\textbf {a}}}) \cap \mathcal {K}\right) \le \kappa _B (\epsilon +\varphi (\epsilon ,M)), \end{aligned}$$

where \(M = \sup _{{{\textbf {x}}}\in B} \Vert {{\textbf {x}}}\Vert \), \(\varphi = \psi _{{\ell -1}}\diamondsuit \cdots \diamondsuit \psi _{{1}}\), if \(\ell \ge 2\). If \(\ell = 1\), we let \(\varphi \) be the function satisfying \(\varphi (\epsilon , M) = \epsilon \).

Proof

The case \(\ell = 1\) follows from Proposition 3.1, by taking \( \mathcal {F}= \mathcal {F}_1\). Now, suppose \(\ell \ge 2\). We apply Lemma 3.7, which tells us that, after positively rescaling and shifting the \(\psi _i\), we have:

$$\begin{aligned} \text {d}({{\textbf {x}}}, \mathcal {K}) \le \epsilon , \quad \text {d}({{\textbf {x}}},\mathcal {L}+ {{\textbf {a}}}) \le \epsilon \implies \text {d}({{\textbf {x}}}, \mathcal {F}_{\ell }) \le \varphi (\epsilon ,\Vert {{\textbf {x}}}\Vert ), \end{aligned}$$

where \(\varphi = \psi _{{\ell -1}}\diamondsuit \cdots \diamondsuit \psi _{{1}} \). In particular, since \(\Vert {{\textbf {x}}}\Vert \le M\) for \({{\textbf {x}}}\in B\), we have

$$\begin{aligned} \quad \text {d}({{\textbf {x}}}, \mathcal {K}) \le \epsilon , \quad \text {d}({{\textbf {x}}},\mathcal {L}+ {{\textbf {a}}}) \le \epsilon \implies \text {d}({{\textbf {x}}}, \mathcal {F}_{\ell }) \le \varphi (\epsilon ,M), \qquad \forall {{\textbf {x}}}\in B. \end{aligned}$$
(3.10)

By assumption, \(\{ \mathcal {F}_{\ell }, \mathcal {L}+{{\textbf {a}}}\}\) satisfies the PPS condition. We invoke Proposition 3.1 to find \( \kappa _B > 0\) such that

$$\begin{aligned} \text {d}({{\textbf {x}}}, \mathcal {K}\cap (\mathcal {L}+ {{\textbf {a}}})) \le \kappa _B(\text {d}({{\textbf {x}}}, \mathcal {F}_{\ell }) + \text {d}({{\textbf {x}}}, \mathcal {L}+{{\textbf {a}}})), \qquad \forall {{\textbf {x}}}\in B. \end{aligned}$$
(3.11)

Combining (3.10), (3.11), we conclude that if \({{\textbf {x}}}\in B\) and \(\epsilon \ge 0\) satisfy \(\text {d}({{\textbf {x}}}, \mathcal {K}) \le \epsilon \) and \(\text {d}({{\textbf {x}}},\mathcal {L}+ {{\textbf {a}}}) \le \epsilon \), then we have \(\text {d}\left( {{\textbf {x}}}, (\mathcal {L}+ {{\textbf {a}}}) \cap \mathcal {K}\right) \le \kappa _B(\epsilon +\varphi (\epsilon ,M))\). This completes the proof. \(\square \)

Theorem 3.8 is an improvement over [34, Theorem 23] because it removes the amenability assumption. Furthermore, it shows that it is enough to determine the one-step facial residual functions for \( \mathcal {K}\) and its faces, whereas [34, Theorem 23] may require all possible facial residual functions related to \( \mathcal {K}\) and its faces. Nevertheless, Theorem 3.8 is still an abstract error bound result; whether some concrete inequality can be written down depends on obtaining a formula for the \(\varphi \) function, which in turn requires finding expressions for the one-step facial residual functions. We address this challenge in the next subsections.

3.1 How to compute one-step facial residual functions?

In this section, we present some general tools for computing one-step facial residual functions.

Lemma 3.9

(\(\mathbb {1}\)-FRF from error bound) Suppose that \( \mathcal {K}\) is a closed convex cone and let \({{\textbf {z}}}\in \mathcal {K}^*\) be such that \( \mathcal {F}= \{{{\textbf {z}}}\}^\perp \cap \mathcal {K}\) is a proper face of \( \mathcal {K}\). Let \(\mathfrak {g}:\mathbb {R}_+\rightarrow \mathbb {R}_+\) be monotone nondecreasing with \(\mathfrak {g}(0)=0\), and let \(\kappa _{{{\textbf {z}}},\mathfrak {s}}\) be a finite monotone nondecreasing nonnegative function in \(\mathfrak {s}\in \mathbb {R}_+\) such that

$$\begin{aligned} \text {d}({{\textbf {q}}}, \mathcal {F}) \le \kappa _{{{\textbf {z}}},\Vert {{\textbf {q}}}\Vert } \mathfrak {g}(\text {d}({{\textbf {q}}}, \mathcal {K}))\ \ \text{ whenever }\ \ {{\textbf {q}}}\in \{{{\textbf {z}}}\}^\perp . \end{aligned}$$
(3.12)

Define the function \(\psi _{ \mathcal {K},{{\textbf {z}}}}:\mathbb {R}_+\times \mathbb {R}_+\rightarrow \mathbb {R}_+\) by

$$\begin{aligned} \psi _{ \mathcal {K},{{\textbf {z}}}}(s,t) := \max \left\{ s,s/\Vert {{\textbf {z}}}\Vert \right\} + \kappa _{{{\textbf {z}}},t}\mathfrak {g}\left( s +\max \left\{ s,s/\Vert {{\textbf {z}}}\Vert \right\} \right) . \end{aligned}$$

Then we have

$$\begin{aligned} \text {d}({{\textbf {p}}}, \mathcal {F}) \le \psi _{ \mathcal {K},{{\textbf {z}}}}(\epsilon ,\Vert {{\textbf {p}}}\Vert ) \text{ whenever } \text {d}({{\textbf {p}}}, \mathcal {K}) \le \epsilon \text{ and } \langle {{\textbf {p}}} , {{\textbf {z}}} \rangle \le \epsilon \text{. } \end{aligned}$$
(3.13)

Moreover, \(\psi _{ \mathcal {K},{{\textbf {z}}}}\) is a \(\mathbb {1}\)-FRF for \( \mathcal {K}\) and \({{\textbf {z}}}\).

Proof

Suppose that \(\text {d}({{\textbf {p}}}, \mathcal {K}) \le \epsilon \) and \(\langle {{\textbf {p}}} , {{\textbf {z}}} \rangle \le \epsilon \). We first claim that

$$\begin{aligned} \text {d}({{\textbf {p}}},\{{{\textbf {z}}}\}^\perp ) \le \max \left\{ \epsilon ,\epsilon /\Vert {{\textbf {z}}}\Vert \right\} . \end{aligned}$$
(3.14)

This can be shown as follows. Since \({{\textbf {z}}}\in \mathcal {K}^*\), we have \(\langle {{\textbf {p}}}+P_{ \mathcal {K}}({{\textbf {p}}})-{{\textbf {p}}} , {{\textbf {z}}} \rangle \ge 0\) and

$$\begin{aligned} \langle {{\textbf {p}}} , {{\textbf {z}}} \rangle \ge - \langle P_{ \mathcal {K}}({{\textbf {p}}})-{{\textbf {p}}} , {{\textbf {z}}} \rangle \ge -\epsilon \Vert {{\textbf {z}}}\Vert . \end{aligned}$$

We conclude that \(|\langle {{\textbf {p}}},{{\textbf {z}}}\rangle | \le \max \{\epsilon \Vert {{\textbf {z}}}\Vert ,\epsilon \}\). This, in combination with \(\text {d}({{\textbf {p}}},\{{{\textbf {z}}}\}^\perp ) = |\langle {{\textbf {p}}},{{\textbf {z}}}\rangle |/\Vert {{\textbf {z}}}\Vert \), leads to (3.14).

Next, let \({{\textbf {q}}}:=P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {p}}}\). Then we have that

$$\begin{aligned} \begin{aligned} \text {d}({{\textbf {p}}}, \mathcal {F})&\le \Vert {{\textbf {p}}}- {{\textbf {q}}}\Vert +\text {d}({{\textbf {q}}}, \mathcal {F})\overset{\mathrm{(a)}}{\le }\max \left\{ \epsilon ,\epsilon /\Vert {{\textbf {z}}}\Vert \right\} + \text {d}({{\textbf {q}}}, \mathcal {F})\\&\overset{\mathrm{(b)}}{\le }\max \left\{ \epsilon ,\epsilon /\Vert {{\textbf {z}}}\Vert \right\} + \kappa _{{{\textbf {z}}},\Vert {{\textbf {q}}}\Vert } \mathfrak {g}\left( \text {d}({{\textbf {q}}}, \mathcal {K})\right) \\&\overset{\mathrm{(c)}}{\le }\max \left\{ \epsilon ,\epsilon /\Vert {{\textbf {z}}}\Vert \right\} + \kappa _{{{\textbf {z}}},\Vert {{\textbf {p}}}\Vert } \mathfrak {g}\left( \text {d}({{\textbf {q}}}, \mathcal {K})\right) \\&\overset{\mathrm{(d)}}{\le }\max \left\{ \epsilon ,\epsilon /\Vert {{\textbf {z}}}\Vert \right\} + \kappa _{{{\textbf {z}}},\Vert {{\textbf {p}}}\Vert } \mathfrak {g}\left( \epsilon +\max \left\{ \epsilon ,\epsilon /\Vert {{\textbf {z}}}\Vert \right\} \right) , \end{aligned} \end{aligned}$$

where (a) follows from (3.14), (b) is a consequence of (3.12), (c) holds because \(\Vert {{\textbf {q}}}\Vert = \Vert P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {p}}}\Vert \le \Vert {{\textbf {p}}}\Vert \) so that \(\kappa _{{{\textbf {z}}},\Vert {{\textbf {q}}}\Vert }\le \kappa _{{{\textbf {z}}},\Vert {{\textbf {p}}}\Vert }\), and (d) holds because \(\mathfrak {g}\) is monotone nondecreasing and

$$\begin{aligned} \text {d}({{\textbf {q}}}, \mathcal {K}) \le \text {d}({{\textbf {p}}}, \mathcal {K}) + \Vert {{\textbf {q}}}- {{\textbf {p}}}\Vert \le \epsilon + \max \left\{ \epsilon ,\epsilon /\Vert {{\textbf {z}}}\Vert \right\} ; \end{aligned}$$

here, the second inequality follows from (3.14) and the assumption that \(\text {d}({{\textbf {p}}}, \mathcal {K})\le \epsilon \). This proves (3.13). Finally, notice that \(\psi _{ \mathcal {K},{{\textbf {z}}}}\) is nonnegative, monotone nondecreasing in each argument, and that \(\psi _{ \mathcal {K},{{\textbf {z}}}}(0,t)=0\) for every \(t \in \mathbb {R}_+\). Hence, \(\psi _{ \mathcal {K},{{\textbf {z}}}}\) is a one-step facial residual function for \( \mathcal {K}\) and \({{\textbf {z}}}\). \(\square \)
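To illustrate Lemma 3.9 numerically, consider the simplest polyhedral instance (a toy example of ours, not one treated here): \( \mathcal {K}= \mathbb {R}^2_+\) and \({{\textbf {z}}}= (0,1)\), so that \( \mathcal {F}= \mathcal {K}\cap \{{{\textbf {z}}}\}^\perp = \mathbb {R}_+ \times \{0\}\). In this case (3.12) holds with \(\mathfrak {g}(t) = t\) and \(\kappa \equiv 1\), and since \(\Vert {{\textbf {z}}}\Vert = 1\), the lemma produces \(\psi _{ \mathcal {K},{{\textbf {z}}}}(s,t) = s + \mathfrak {g}(2s) = 3s\). The sketch below checks the conclusion (3.13) on random points:

```python
import math
import random

# Numeric sanity check of (3.13) for K = R^2_+, z = (0, 1),
# F = K ∩ {z}^⊥ = R_+ x {0}.  Here (3.12) holds with g(t) = t and
# kappa = 1, so Lemma 3.9 yields psi(s, t) = 3s.  This toy instance
# is chosen for illustration only.

def dist_K(p):      # distance to R^2_+
    return math.hypot(min(p[0], 0.0), min(p[1], 0.0))

def dist_F(p):      # distance to R_+ x {0}
    return math.hypot(min(p[0], 0.0), p[1])

def psi(s):         # the 1-FRF from Lemma 3.9 with ||z|| = 1, kappa = 1
    return 3.0 * s

random.seed(0)
for _ in range(10000):
    p = (random.uniform(-2, 2), random.uniform(-2, 2))
    # smallest eps for which the premise of (3.13) holds at p,
    # since <p, z> = p[1]:
    eps = max(dist_K(p), p[1])
    assert dist_F(p) <= psi(eps) + 1e-12
```

Of course, for polyhedral cones Hoffman's error bound already gives such Lipschitzian behavior; the interest of Lemma 3.9 lies in the non-polyhedral case, where \(\mathfrak {g}\) need not be linear.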

Fig. 2

Theorem 3.10 shows that we may replace the problem of showing that \(\mathfrak {g}(\Vert {{\textbf {q}}}-P_{ \mathcal {K}}{{\textbf {q}}}\Vert )/\Vert {{\textbf {q}}}-P_{ \mathcal {F}}{{\textbf {q}}}\Vert \) is uniformly bounded away from zero for all \({{\textbf {q}}}\in \{{{\textbf {z}}}\}^\perp \cap B(\eta )\setminus \mathcal {F}\) with the equivalent problem of showing that \(\mathfrak {g}(\Vert {{\textbf {v}}}-P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {v}}}\Vert )/\Vert P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {v}}}- P_{ \mathcal {F}}\circ P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {v}}}\Vert \) is uniformly bounded away from zero for all \({{\textbf {v}}}\in \partial \mathcal {K}\cap B(\eta )\setminus \mathcal {F}\) with \(P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {v}}}\ne P_{ \mathcal {F}}\circ P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {v}}}\). This second problem can sometimes be easier to deal with, because it obviates the need to project onto \( \mathcal {K}\) and projects onto the nontrivial exposed face \( \mathcal {F}\) instead. To depict a possibly higher-dimensional problem in 2D, we represent \(\{{{\textbf {z}}}\}^\perp \) with a line, \(\mathcal {K}\) with a 2D slice, and \({{\mathcal {F}}}\) with a dot; of course, this is an oversimplification, since \({{\textbf {q}}},{{\textbf {w}}},{{\textbf {u}}}\) are not generically collinear, nor would any of the points necessarily lie in the same horizontal slice. The scenario shown is meant to suggest intuition, but it is not a plausible configuration of points.

In view of Lemma 3.9, one may construct one-step facial residual functions after establishing the error bound (3.12). In the next theorem, we present a characterization for the existence of such an error bound. Our result is based on the quantity (3.15) defined below being nonzero. Note that this quantity does not explicitly involve projections onto \( \mathcal {K}\); this enables us to work with the exponential cone later, whose projections do not seem to have simple expressions. Figure 2 provides a geometric interpretation of (3.15).

Theorem 3.10

(Characterization of the existence of error bounds) Suppose that \( \mathcal {K}\) is a closed convex cone and let \({{\textbf {z}}}\in \mathcal {K}^*\) be such that \( \mathcal {F}= \{{{\textbf {z}}}\}^\perp \cap \mathcal {K}\) is a nontrivial exposed face of \( \mathcal {K}\). Let \(\eta \ge 0\), \(\alpha \in (0,1]\) and let \(\mathfrak {g}:\mathbb {R}_+\rightarrow \mathbb {R}_+\) be monotone nondecreasing with \(\mathfrak {g}(0) = 0\) and \(\mathfrak {g}\ge |\cdot |^\alpha \). Define

$$\begin{aligned} \gamma _{{{\textbf {z}}},\eta } := \inf _{{{\textbf {v}}}} \left\{ \frac{\mathfrak {g}(\Vert {{\textbf {w}}}-{{\textbf {v}}}\Vert )}{\Vert {{\textbf {w}}}-{{\textbf {u}}}\Vert }\;\bigg |\; {{\textbf {v}}}\in \partial \mathcal {K}\cap B(\eta )\backslash \mathcal {F},\; {{\textbf {w}}}= P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {v}}},\; {{\textbf {u}}}= P_{ \mathcal {F}}{{\textbf {w}}},\; {{\textbf {w}}}\ne {{\textbf {u}}}\right\} . \end{aligned}$$
(3.15)

Then the following statements hold.

(i) If \(\gamma _{{{\textbf {z}}},\eta } \in (0,\infty ]\), then it holds that

    $$\begin{aligned} \text {d}({{\textbf {q}}}, \mathcal {F}) \le \kappa _{{{\textbf {z}}},\eta } \mathfrak {g}(\text {d}({{\textbf {q}}}, \mathcal {K}))\ \ \text{ whenever } {{\textbf {q}}}\in \{{{\textbf {z}}}\}^\perp \cap B(\eta ), \end{aligned}$$
    (3.16)

    where \(\kappa _{{{\textbf {z}}},\eta } := \max \left\{ 2\eta ^{1-\alpha }, 2\gamma _{{{\textbf {z}}},\eta }^{-1} \right\} < \infty \).

(ii) If there exists \(\kappa _{_B} \in (0,\infty )\) so that

    $$\begin{aligned} \text {d}({{\textbf {q}}}, \mathcal {F}) \le \kappa _{_B} \mathfrak {g}(\text {d}({{\textbf {q}}}, \mathcal {K}))\ \ \text{ whenever } {{\textbf {q}}}\in \{{{\textbf {z}}}\}^\perp \cap B(\eta ), \end{aligned}$$
    (3.17)

    then \(\gamma _{{{\textbf {z}}},\eta } \in (0,\infty ]\).

Proof

We first consider item (i). If \(\eta = 0\) or \({{\textbf {q}}}\in \mathcal {F}\), the result is vacuously true, so let \(\eta > 0\) and \({{\textbf {q}}}\in \{{{\textbf {z}}}\}^\perp \cap B(\eta ) \backslash \mathcal {F}\). Then \({{\textbf {q}}}\notin \mathcal {K}\) because \( \mathcal {F}= \{{{\textbf {z}}}\}^\perp \cap \mathcal {K}\). Define

$$\begin{aligned} {{\textbf {v}}}=P_{ \mathcal {K}}{{\textbf {q}}},\quad {{\textbf {w}}}= P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {v}}},\quad \text {and}\quad {{\textbf {u}}}= P_{ \mathcal {F}}{{\textbf {w}}}. \end{aligned}$$

Then \({{\textbf {v}}}\in \partial \mathcal {K}\cap B(\eta )\) because \({{\textbf {q}}}\notin \mathcal {K}\) and \(\Vert {{\textbf {q}}}\Vert \le \eta \). If \({{\textbf {v}}}\in \mathcal {F}\), then we have \(\text {d}({{\textbf {q}}}, \mathcal {F}) = \text {d}({{\textbf {q}}}, \mathcal {K})\) and hence

$$\begin{aligned} \text {d}({{\textbf {q}}}, \mathcal {F}) = \text {d}({{\textbf {q}}}, \mathcal {K})^{1-\alpha }\cdot \text {d}({{\textbf {q}}}, \mathcal {K})^\alpha \le \eta ^{1-\alpha }\text {d}({{\textbf {q}}}, \mathcal {K})^\alpha \le \kappa _{{{\textbf {z}}},\eta } \mathfrak {g}(\text {d}({{\textbf {q}}}, \mathcal {K})), \end{aligned}$$

where the first inequality holds because \(\Vert {{\textbf {q}}}\Vert \le \eta \), and the last inequality follows from the definitions of \(\mathfrak {g}\) and \(\kappa _{{{\textbf {z}}},\eta }\). Thus, from now on, we assume that \({{\textbf {v}}}\in \partial \mathcal {K}\cap B(\eta )\backslash \mathcal {F}\).

Next, since \({{\textbf {w}}}=P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {v}}}\), it holds that \({{\textbf {v}}}-{{\textbf {w}}}\in \{{{\textbf {z}}}\}^{\perp \perp }\) and hence \(\Vert {{\textbf {q}}}-{{\textbf {v}}}\Vert ^2 = \Vert {{\textbf {q}}}-{{\textbf {w}}}\Vert ^2+\Vert {{\textbf {w}}}-{{\textbf {v}}}\Vert ^2\). In particular, we have

$$\begin{aligned} \text {d}({{\textbf {q}}}, \mathcal {K}) = \Vert {{\textbf {q}}}-{{\textbf {v}}}\Vert \ge \max \{\Vert {{\textbf {v}}}-{{\textbf {w}}}\Vert ,\Vert {{\textbf {q}}}-{{\textbf {w}}}\Vert \}, \end{aligned}$$
(3.18)

where the equality follows from the definition of \({{\textbf {v}}}\). Now, to establish (3.16), we consider two cases.

(I) \(\text {d}({{\textbf {q}}}, \mathcal {F}) \le 2\text {d}({{\textbf {w}}}, \mathcal {F})\);

(II) \(\text {d}({{\textbf {q}}}, \mathcal {F}) > 2\text {d}({{\textbf {w}}}, \mathcal {F})\).

(I): In this case, we have from \({{\textbf {u}}}= P_{ \mathcal {F}}{{\textbf {w}}}\) and \({{\textbf {q}}}\notin \mathcal {F}\) that

$$\begin{aligned} 2\Vert {{\textbf {w}}}- {{\textbf {u}}}\Vert = 2\text {d}({{\textbf {w}}}, \mathcal {F}) \ge \text {d}({{\textbf {q}}}, \mathcal {F}) > 0, \end{aligned}$$
(3.19)

where the first inequality follows from the assumption in this case (I). Hence,

$$\begin{aligned} \frac{1}{\kappa _{{{\textbf {z}}},\eta }} {\mathop {\le }\limits ^\mathrm{(a)}} \frac{1}{2}\gamma _{{{\textbf {z}}},\eta } {\mathop {\le }\limits ^\mathrm{(b)}} \frac{\mathfrak {g}(\Vert {{\textbf {w}}}-{{\textbf {v}}}\Vert )}{2\Vert {{\textbf {w}}}-{{\textbf {u}}}\Vert } {\mathop {\le }\limits ^\mathrm{(c)}} \frac{\mathfrak {g}(\text {d}({{\textbf {q}}}, \mathcal {K}))}{2\Vert {{\textbf {w}}}-{{\textbf {u}}}\Vert }{\mathop {\le }\limits ^\mathrm{(d)}} \frac{\mathfrak {g}(\text {d}({{\textbf {q}}}, \mathcal {K}))}{\text {d}({{\textbf {q}}}, \mathcal {F})}, \end{aligned}$$
(3.20)

where (a) is true by the definition of \(\kappa _{{{\textbf {z}}},\eta }\), (b) uses the condition that \({{\textbf {v}}}\in \partial \mathcal {K}\cap B(\eta )\backslash \mathcal {F}\), (3.19) and the definition of \(\gamma _{{{\textbf {z}}},\eta }\), (c) is true by (3.18) and the monotonicity of \(\mathfrak {g}\), and (d) follows from (3.19). This concludes case (I).

(II): Using the triangle inequality, we have

$$\begin{aligned} 2\text {d}({{\textbf {q}}}, \mathcal {F}) \le 2\Vert {{\textbf {q}}}-{{\textbf {w}}}\Vert +2\text {d}({{\textbf {w}}}, \mathcal {F})< 2\Vert {{\textbf {q}}}-{{\textbf {w}}}\Vert +\text {d}({{\textbf {q}}}, \mathcal {F}), \end{aligned}$$

where the strict inequality follows from the condition for this case (II). Consequently, we have \(\text {d}({{\textbf {q}}}, \mathcal {F}) \le 2\Vert {{\textbf {q}}}-{{\textbf {w}}}\Vert \). Combining this with (3.18), we deduce further that

$$\begin{aligned} \begin{aligned} \text {d}({{\textbf {q}}}, \mathcal {F})&\le 2\Vert {{\textbf {q}}}-{{\textbf {w}}}\Vert \le 2 \max \{\Vert {{\textbf {v}}}-{{\textbf {w}}}\Vert ,\Vert {{\textbf {q}}}-{{\textbf {w}}}\Vert \} \\&\le 2\text {d}({{\textbf {q}}}, \mathcal {K}) = 2\text {d}({{\textbf {q}}}, \mathcal {K})^{1-\alpha }\cdot \text {d}({{\textbf {q}}}, \mathcal {K})^\alpha \\&\le 2\eta ^{1-\alpha }\text {d}({{\textbf {q}}}, \mathcal {K})^\alpha \le \kappa _{{{\textbf {z}}},\eta } \mathfrak {g}(\text {d}({{\textbf {q}}}, \mathcal {K})), \end{aligned} \end{aligned}$$

where the fourth inequality holds because \(\Vert {{\textbf {q}}}\Vert \le \eta \), and the last inequality follows from the definitions of \(\mathfrak {g}\) and \(\kappa _{{{\textbf {z}}},\eta }\). This proves item (i).

We next consider item (ii). Again, the result is vacuously true if \(\eta = 0\), so let \(\eta > 0\). Let \({{\textbf {v}}}\in \partial \mathcal {K}\cap B(\eta )\backslash \mathcal {F}\), \({{\textbf {w}}}= P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {v}}}\) and \({{\textbf {u}}}= P_{ \mathcal {F}}{{\textbf {w}}}\) with \({{\textbf {w}}}\ne {{\textbf {u}}}\). Then \({{\textbf {w}}}\in B(\eta )\), and we have in view of (3.17) that

$$\begin{aligned} \Vert {{\textbf {w}}}-{{\textbf {u}}}\Vert \overset{\mathrm{(a)}}{=} \text {d}({{\textbf {w}}}, \mathcal {F}) \overset{\mathrm{(b)}}{\le }\kappa _{_B}\mathfrak {g}(\text {d}({{\textbf {w}}}, \mathcal {K})) \overset{\mathrm{(c)}}{\le }\kappa _{_B}\mathfrak {g}(\Vert {{\textbf {w}}}- {{\textbf {v}}}\Vert ), \end{aligned}$$

where (a) holds because \({{\textbf {u}}}= P_{ \mathcal {F}}{{\textbf {w}}}\), (b) holds because of (3.17), \({{\textbf {w}}}\in \{{{\textbf {z}}}\}^\perp \) and \(\Vert {{\textbf {w}}}\Vert \le \eta \), and (c) is true because \(\mathfrak {g}\) is monotone nondecreasing and \({{\textbf {v}}}\in \mathcal {K}\). Thus, we have \(\gamma _{{{\textbf {z}}},\eta } \ge 1/\kappa _{_B} > 0\). This completes the proof. \(\square \)
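Theorem 3.10 also suggests a way of probing (3.15) numerically before attempting a proof. The sketch below (our illustrative choice of cone and constants, not an instance from this section) samples boundary points of the second-order cone \( \mathcal {K}= \{(x_1,x_2,t) : \Vert (x_1,x_2)\Vert \le t\} \subseteq \mathbb {R}^3\), with \({{\textbf {z}}}= (1,0,1)/\sqrt{2} \in \mathcal {K}^* = \mathcal {K}\) and \( \mathcal {F}\) the ray through \((-1,0,1)\), taking \(\mathfrak {g}= \sqrt{\cdot }\), \(\alpha = 1/2\) and \(\eta = 1\). The sampled ratios stay bounded away from zero, consistent with the known Hölderian (exponent 1/2) behavior of the second-order cone:

```python
import math
import random

# Probing the infimum (3.15) for the 3D second-order cone
# K = {(x1, x2, t) : sqrt(x1^2 + x2^2) <= t}, with the unit vector
# z = (1, 0, 1)/sqrt(2) in K* = K, F = K ∩ {z}^⊥ the ray through
# f = (-1, 0, 1)/sqrt(2), g = sqrt (alpha = 1/2), eta = 1.
# This instance is an illustrative choice, not taken from the paper.

SQ2 = math.sqrt(2.0)
Z = (1.0 / SQ2, 0.0, 1.0 / SQ2)
F = (-1.0 / SQ2, 0.0, 1.0 / SQ2)

def ratio(r, theta):
    """g(||w - v||)/||w - u|| for v = r(cos θ, sin θ, 1) on bd K."""
    v = (r * math.cos(theta), r * math.sin(theta), r)
    ip = sum(vi * zi for vi, zi in zip(v, Z))
    w = tuple(vi - ip * zi for vi, zi in zip(v, Z))   # w = P_{z^⊥} v
    lam = max(sum(wi * fi for wi, fi in zip(w, F)), 0.0)
    u = tuple(lam * fi for fi in F)                   # u = P_F w
    return math.sqrt(math.dist(w, v)) / math.dist(w, u)

random.seed(1)
# r <= 1/sqrt(2) keeps ||v|| = r*sqrt(2) <= eta = 1; theta is kept away
# from 0 and 2*pi, where w = u (a case excluded in (3.15)).
samples = [ratio(random.uniform(1e-6, 1.0 / SQ2),
                 random.uniform(0.05, 2.0 * math.pi - 0.05))
           for _ in range(20000)]
gamma_est = min(samples)
assert 0.5 < gamma_est < 2.0   # bounded away from zero, so (i) applies
```

For the exponential cone studied later, no closed form for \(P_{ \mathcal {K}}\) is available, which is precisely why (3.15) is formulated in terms of \(P_{\{{{\textbf {z}}}\}^\perp }\) and \(P_{ \mathcal {F}}\) only.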

Remark 3.11

(About \(\kappa _{{{\textbf {z}}},\eta }\) and \(\gamma _{{{\textbf {z}}},\eta }^{-1}\)) As \(\eta \) increases, the infimum in (3.15) is taken over a larger region, so \(\gamma _{{{\textbf {z}}},\eta }\) does not increase. Accordingly, \(\gamma _{{{\textbf {z}}},\eta }^{-1}\) does not decrease when \(\eta \) increases. Therefore, the \(\kappa _{{{\textbf {z}}},\eta }\) and \(\gamma _{{{\textbf {z}}},\eta }^{-1}\) considered in Theorem 3.10 are monotone nondecreasing as functions of \(\eta \) when \({{\textbf {z}}}\) is fixed. We are also using the convention that \(1/\infty = 0\) so that \(\kappa _{{{\textbf {z}}},\eta } = 2\eta ^{1-\alpha }\) when \(\gamma _{{{\textbf {z}}},\eta } = \infty \).
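For concreteness, here is a hypothetical instance of Theorem 3.10 (the numbers are ours, for illustration only): take \(\alpha = 1/2\), \(\mathfrak {g}= \sqrt{\cdot }\), and suppose one has verified that \(\gamma _{{{\textbf {z}}},\eta } = 1/\sqrt{2}\) for some \(\eta > 0\). Then item (i) yields

$$\begin{aligned} \kappa _{{{\textbf {z}}},\eta } = \max \left\{ 2\eta ^{1/2}, 2\gamma _{{{\textbf {z}}},\eta }^{-1}\right\} = \max \left\{ 2\sqrt{\eta }, 2\sqrt{2}\right\} \quad \text {and}\quad \text {d}({{\textbf {q}}}, \mathcal {F}) \le \kappa _{{{\textbf {z}}},\eta } \sqrt{\text {d}({{\textbf {q}}}, \mathcal {K})} \ \text { for } {{\textbf {q}}}\in \{{{\textbf {z}}}\}^\perp \cap B(\eta ), \end{aligned}$$

that is, a Hölderian error bound with exponent 1/2.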

Thus, to establish an error bound as in (3.16), it suffices to show that \(\gamma _{{{\textbf {z}}},\eta } \in (0,\infty ]\) for the chosen \(\mathfrak {g}\) and \(\eta \ge 0\). Clearly, \(\gamma _{{{\textbf {z}}},0} = \infty \). The next lemma allows us to check whether \(\gamma _{{{\textbf {z}}},\eta } \in (0,\infty ]\) for a given \(\eta > 0\) by considering convergent sequences.

Lemma 3.12

Suppose that \( \mathcal {K}\) is a closed convex cone and let \({{\textbf {z}}}\in \mathcal {K}^*\) be such that \( \mathcal {F}= \{{{\textbf {z}}}\}^\perp \cap \mathcal {K}\) is a nontrivial exposed face of \( \mathcal {K}\). Let \(\eta > 0\), \(\alpha \in (0,1]\) and let \(\mathfrak {g}:\mathbb {R}_+\rightarrow \mathbb {R}_+\) be monotone nondecreasing with \(\mathfrak {g}(0) = 0\) and \(\mathfrak {g}\ge |\cdot |^\alpha \). Let \(\gamma _{{{\textbf {z}}},\eta }\) be defined as in (3.15). If \(\gamma _{{{\textbf {z}}},\eta } = 0\), then there exist \(\bar{{{\textbf {v}}}}\in \mathcal {F}\) and a sequence \(\{{{\textbf {v}}}^k\}\subset \partial \mathcal {K}\cap B(\eta ) \backslash \mathcal {F}\) such that

$$\begin{aligned} \underset{k \rightarrow \infty }{\lim }{{\textbf {v}}}^k = \underset{k \rightarrow \infty }{\lim }{{\textbf {w}}}^k&= \bar{{{\textbf {v}}}} \end{aligned}$$
(3.21a)
$$\begin{aligned} \mathrm{and}\ \ \ \lim _{k\rightarrow \infty } \frac{\mathfrak {g}(\Vert {{\textbf {w}}}^k - {{\textbf {v}}}^k\Vert )}{\Vert {{\textbf {w}}}^k- {{\textbf {u}}}^k\Vert }&= 0, \end{aligned}$$
(3.21b)

where \({{\textbf {w}}}^k = P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {v}}}^k\), \({{\textbf {u}}}^k = P_{ \mathcal {F}}{{\textbf {w}}}^k\) and \({{\textbf {w}}}^k\ne {{\textbf {u}}}^k\).

Proof

Suppose that \(\gamma _{{{\textbf {z}}},\eta } = 0\). Then, by the definition of infimum, there exists a sequence \(\{{{\textbf {v}}}^k\}\subset \partial \mathcal {K}\cap B(\eta ) \backslash \mathcal {F}\) such that

$$\begin{aligned} \underset{k \rightarrow \infty }{\lim } \frac{\mathfrak {g}(\Vert {{\textbf {w}}}^k - {{\textbf {v}}}^k\Vert )}{\Vert {{\textbf {w}}}^k - {{\textbf {u}}}^k\Vert } = 0, \end{aligned}$$
(3.22)

where \({{\textbf {w}}}^k = P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {v}}}^k\), \({{\textbf {u}}}^k = P_{ \mathcal {F}}{{\textbf {w}}}^k\) and \({{\textbf {w}}}^k\ne {{\textbf {u}}}^k\). Since \(\{{{\textbf {v}}}^k\}\subset B(\eta )\), by passing to a convergent subsequence if necessary, we may assume without loss of generality that

$$\begin{aligned} \underset{k \rightarrow \infty }{\lim }{{\textbf {v}}}^k = \bar{{{\textbf {v}}}}\end{aligned}$$
(3.23)

for some \(\bar{{{\textbf {v}}}}\in \mathcal {K}\cap B(\eta )\). In addition, since \(0\in \mathcal {F}\subseteq \{{{\textbf {z}}}\}^\perp \), and projections onto closed convex sets are nonexpansive, we see that \(\{{{\textbf {w}}}^k\}\subset B(\eta )\) and \(\{{{\textbf {u}}}^k\}\subset B(\eta )\), and hence the sequence \(\{\Vert {{\textbf {w}}}^k-{{\textbf {u}}}^k\Vert \}\) is bounded. Then we can conclude from (3.22) and the assumption \(\mathfrak {g}\ge |\cdot |^\alpha \) that

$$\begin{aligned} \underset{k \rightarrow \infty }{\lim }\Vert {{\textbf {w}}}^k-{{\textbf {v}}}^k\Vert =0. \end{aligned}$$
(3.24)

Now (3.24), (3.23), and the triangle inequality give \({{\textbf {w}}}^k\rightarrow \bar{{{\textbf {v}}}}\). Since \(\{{{\textbf {w}}}^k\}\subset \{{{\textbf {z}}}\}^\perp \), it then follows that \(\bar{{{\textbf {v}}}}\in \{{{\textbf {z}}}\}^\perp \). Thus, \(\bar{{{\textbf {v}}}}\in \{{{\textbf {z}}}\}^\perp \cap \mathcal {K}= \mathcal {F}\). This completes the proof. \(\square \)

Let \( \mathcal {K}\) be a closed convex cone. Lemma 3.9, Theorem 3.10 and Lemma 3.12 are tools to obtain one-step facial residual functions for \( \mathcal {K}\). These are exactly the kind of facial residual functions needed in the abstract error bound result, Theorem 3.8. We conclude this subsection with a result that connects the one-step facial residual functions of a product cone and those of its constituent cones, which is useful for deriving error bounds for product cones.

Proposition 3.13

(\(\mathbb {1}\)-FRF for products) Let \( \mathcal {K}^i \subseteq \mathcal {E}^i\) be closed convex cones for every \(i \in \{1,\ldots ,m\}\) and let \( \mathcal {K}= \mathcal {K}^1 \times \cdots \times \mathcal {K}^m\). Let \( \mathcal {F}\mathrel {\unlhd } \mathcal {K}\), \({{\textbf {z}}}\in \mathcal {F}^*\) and suppose that \( \mathcal {F}= \mathcal {F}^1\times \cdots \times \mathcal {F}^m\) with \( \mathcal {F}^i \mathrel {\unlhd } \mathcal {K}^i\) for every \(i \in \{1,\ldots ,m\}\). Write \({{\textbf {z}}}= ({{\textbf {z}}}_1,\ldots ,{{\textbf {z}}}_m)\) with \({{\textbf {z}}}_i \in ( \mathcal {F}^i)^*\).

For every i, let \(\psi _{ \mathcal {F}^i,{{\textbf {z}}}_i}\) be a \(\mathbb {1}\)-FRF for \( \mathcal {F}^i\) and \({{\textbf {z}}}_i\). Then, there exists a \(\kappa > 0\) such that the function \(\psi _{ \mathcal {F},{{\textbf {z}}}}\) satisfying

$$\begin{aligned} \psi _{ \mathcal {F},{{\textbf {z}}}}(\epsilon ,t) = \sum _{i=1}^m \psi _{ \mathcal {F}^i,{{\textbf {z}}}_i}(\kappa \epsilon ,t) \end{aligned}$$

is a \(\mathbb {1}\)-FRF for \( \mathcal {F}\) and \({{\textbf {z}}}\).

Proof

Suppose that \({{\textbf {x}}}\in \text {span}\, \mathcal {F}\) and \(\epsilon \ge 0\) satisfy the inequalities

$$\begin{aligned} \text {d}({{\textbf {x}}}, \mathcal {F}) \le \epsilon , \quad \langle {{\textbf {x}}} , {{\textbf {z}}} \rangle \le \epsilon . \end{aligned}$$

We note that

$$\begin{aligned} \mathcal {F}\cap \{{{\textbf {z}}}\}^{\perp } = ( \mathcal {F}^{1} \cap \{{{\textbf {z}}}_1\}^\perp ) \times \cdots \times ( \mathcal {F}^{m} \cap \{{{\textbf {z}}}_m\}^\perp ), \end{aligned}$$

and that for every \(i \in \{1,\ldots ,m\}\),

$$\begin{aligned} \text {d}({{\textbf {x}}}_i, \mathcal {F}^i)\le \text {d}({{\textbf {x}}}, \mathcal {F})\le \epsilon . \end{aligned}$$
(3.25)

Since \({{\textbf {z}}}_i \in ( \mathcal {F}^i)^*\), we have from (3.25) that

$$\begin{aligned} 0 \le \langle {{\textbf {z}}}_i , P_{ \mathcal {F}^{i}}({{{\textbf {x}}}}_i) \rangle = \langle {{\textbf {z}}}_i , P_{ \mathcal {F}^{i}}({{{\textbf {x}}}}_i) - {{{\textbf {x}}}}_i + {{{\textbf {x}}}}_i \rangle \le \epsilon \Vert {{\textbf {z}}}_i\Vert + \langle {{\textbf {z}}}_i , {{{\textbf {x}}}}_i \rangle . \end{aligned}$$
(3.26)

Using (3.26) for all i and recalling that \(\langle {{\textbf {z}}} , {{\textbf {x}}} \rangle \le \epsilon \), we have

$$\begin{aligned} \langle ({{\textbf {z}}}_1,\ldots ,{{\textbf {z}}}_m) , (P_{ \mathcal {F}^{1}}({{{\textbf {x}}}}_1),\ldots ,P_{ \mathcal {F}^{m}}({{{\textbf {x}}}}_m)) \rangle \le \sum _{i=1}^m\left[ \epsilon \Vert {{\textbf {z}}}_i\Vert + \langle {{\textbf {z}}}_i , {{\textbf {x}}}_i \rangle \right] \le {\hat{\kappa }} \epsilon , \end{aligned}$$
(3.27)

where \({\hat{\kappa }} = 1+\sum _{i=1}^m\Vert {{\textbf {z}}}_i\Vert \). Since \(\langle {{\textbf {z}}}_i , P_{ \mathcal {F}^{i}}({{\textbf {x}}}_i) \rangle \ge 0\) for \(i \in \{1,\ldots ,m\}\), from (3.27) we obtain

$$\begin{aligned} \langle {{\textbf {z}}}_i , P_{ \mathcal {F}^{i}}({{\textbf {x}}}_i) \rangle \le {\hat{\kappa }} \epsilon , \qquad i \in \{1,\ldots ,m\}. \end{aligned}$$
(3.28)

This implies that for \(i \in \{1,\ldots ,m\}\) we have

$$\begin{aligned} \langle {{\textbf {z}}}_i , {{\textbf {x}}}_i \rangle = \langle {{\textbf {z}}}_i , {{\textbf {x}}}_i -P_{ \mathcal {F}^{i}}({{\textbf {x}}}_i) + P_{ \mathcal {F}^{i}}({{\textbf {x}}}_i) \rangle \le \epsilon \Vert {{\textbf {z}}}\Vert + {\hat{\kappa }} \epsilon , \end{aligned}$$
(3.29)

where the inequality follows from (3.25) and (3.28). Now, recapitulating, the facial residual function \(\psi _{ \mathcal {F}^i,{{\textbf {z}}}_i}\) has the property that if \(\gamma _1,\gamma _2 \in \mathbb {R}_+\) then the relations

$$\begin{aligned} {{\textbf {y}}}_i\in \text {span}\, \mathcal {F}^{i},\quad \text {d}({{\textbf {y}}}_i, \mathcal {F}^{i}) \le \gamma _1,\quad \langle {{\textbf {y}}}_i , {{\textbf {z}}}_i \rangle \le \gamma _2 \end{aligned}$$

imply \(\text {d}({{\textbf {y}}}_i, \mathcal {F}^i\cap \{{{\textbf {z}}}_i\}^\perp ) \le \psi _{ \mathcal {F}^i,{{\textbf {z}}}_i}(\max _{1\le j \le 2}\{\gamma _j \},\Vert {{\textbf {y}}}_i\Vert )\). Therefore, from (3.25), (3.29) and the monotonicity of \(\psi _{ \mathcal {F}^i,{{\textbf {z}}}_i}\), we have upon recalling \({{\textbf {x}}}\in \text {span}\, \mathcal {F}\) that

$$\begin{aligned} \text {d}({{\textbf {x}}}_i, \mathcal {F}^i\cap \{{{\textbf {z}}}_i\}^\perp ) \le \psi _{ \mathcal {F}^i,{{\textbf {z}}}_i}(\max \{1,{\hat{\kappa }}+\Vert {{\textbf {z}}}\Vert \}\epsilon ,\Vert {{\textbf {x}}}_i\Vert ). \end{aligned}$$
(3.30)

Finally, from (3.30), we conclude that

$$\begin{aligned} \text {d}({{\textbf {x}}}, \mathcal {F}\cap \{{{\textbf {z}}}\}^\perp ) \le \sum _{i=1}^m{\text {d}({{{\textbf {x}}}}_i, \mathcal {F}^i \cap \{{{\textbf {z}}}_i\}^\perp )} \le \sum _{i=1}^m\psi _{ \mathcal {F}^i,{{\textbf {z}}}_i}(\max \{1,{\hat{\kappa }}+\Vert {{\textbf {z}}}\Vert \}\epsilon ,\Vert {{\textbf {x}}}\Vert ), \end{aligned}$$

where we also used the monotonicity of \(\psi _{ \mathcal {F}^i,{{\textbf {z}}}_i}\) for the last inequality. This completes the proof. \(\square \)
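The first inequality in the final display rests on the elementary fact that squared distances to a product set add up componentwise, so the distance to the product is at most the sum of the componentwise distances. As an informal illustration outside the formal development (the choice of nonnegative orthants is made only for this sketch), this can be checked numerically:

```python
import math

def dist_orthant(x):
    """Distance from x to the nonnegative orthant of matching dimension."""
    return math.sqrt(sum(min(xi, 0.0) ** 2 for xi in x))

# squared distances to a product add up, so the distance to the product
# is at most the sum of the componentwise distances
x1, x2 = (-1.0, 2.0), (0.5, -3.0, -4.0)
d1, d2 = dist_orthant(x1), dist_orthant(x2)
d_prod = dist_orthant(x1 + x2)        # concatenation = point of the product space
assert abs(d_prod - math.sqrt(d1 ** 2 + d2 ** 2)) < 1e-12
assert d_prod <= d1 + d2 + 1e-12
```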

4 The exponential cone

In this section, we will use all the techniques developed so far to obtain error bounds for the 3D exponential cone \(K_{\exp }\). We will start with a study of its facial structure in Sect. 4.1, then we will compute its one-step facial residual functions in Sect. 4.2. Finally, error bounds will be presented in Sect. 4.3. In Sect. 4.4, we summarize the odd behaviour found in the facial structure of the exponential cone.

4.1 Facial structure

Recall that the exponential cone is defined as follows:

$$\begin{aligned} K_{\exp }:=&\left\{ (x,y,z)\;|\;y>0,z\ge ye^{x/y}\right\} \cup \left\{ (x,y,z)\;|\; x \le 0, z\ge 0, y=0 \right\} . \end{aligned}$$
(4.1)
Fig. 3 The exponential cone and its dual, with faces and exposing vectors labeled according to our index \(\beta \)

Its dual cone is given by

$$\begin{aligned} K_{\exp }^*:=&\left\{ (x,y,z)\;|\;x<0, ez\ge -xe^{y/x}\right\} \cup \left\{ (x,y,z)\;|\; x = 0, z\ge 0, y\ge 0 \right\} . \end{aligned}$$

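These two closed-form descriptions lend themselves to a quick numerical sanity check: every point of \(K_{\exp }\) should have nonnegative inner product with every point of \(K_{\exp }^*\). The following Python snippet is an informal illustration only (the membership tests, sampling ranges, and tolerances are our own choices, not part of the paper's formal development):

```python
import math, random

def in_Kexp(x, y, z, tol=1e-9):
    """Membership test for the exponential cone (4.1), with a small relative tolerance."""
    if y > 0:
        rhs = y * math.exp(x / y)
        return z >= rhs - tol * (1.0 + rhs)
    return y == 0 and x <= tol and z >= -tol

def in_Kexp_dual(x, y, z, tol=1e-9):
    """Membership test for the dual cone K_exp^* described above."""
    if x < 0:
        rhs = -x * math.exp(y / x)
        return math.e * z >= rhs - tol * (1.0 + rhs)
    return x == 0 and y >= -tol and z >= -tol

random.seed(0)
for _ in range(1000):
    # point of K_exp: boundary point z = y*exp(x/y) pushed inward by a slack
    x, y = random.uniform(-3, 3), random.uniform(0.1, 3)
    v = (x, y, y * math.exp(x / y) + random.uniform(0, 2))
    # point of K_exp^*: boundary point e*z = -x*exp(y/x) pushed inward by a slack
    a, b = random.uniform(-3, -0.1), random.uniform(-3, 3)
    d = (a, b, -a * math.exp(b / a) / math.e + random.uniform(0, 2))
    assert in_Kexp(*v) and in_Kexp_dual(*d)
    # duality: every such pair has nonnegative inner product
    assert sum(p * q for p, q in zip(v, d)) >= -1e-9
```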
It may therefore be readily seen that \(K_{\exp }^*\) is a scaled and rotated version of \(K_{\exp }\). In this subsection, we will describe the nontrivial faces of \(K_{\exp }\); see Fig. 3. We will show that we have the following types of nontrivial faces:

  1. (a)

    infinitely many exposed extreme rays (1D faces) parametrized by \(\beta \in \mathbb {R}\) as follows:

    $$\begin{aligned} {{\mathcal {F}}}_\beta {:}{=}\left\{ \left( -\beta y+y,y,e^{1-\beta }y \right) \;\bigg |\;y \in [0,\infty )\right\} . \end{aligned}$$
    (4.2)
  2. (b)

    a single “exceptional” exposed extreme ray denoted by \({{\mathcal {F}}}_{\infty }\):

    $$\begin{aligned} {{\mathcal {F}}}_{\infty }{:}{=}\{(x,0,0)\;|\;x\le 0 \}. \end{aligned}$$
    (4.3)
  3. (c)

    a single non-exposed extreme ray denoted by \({{\mathcal {F}}}_{ne} \):

    $$\begin{aligned} {{\mathcal {F}}}_{ne}{:}{=}\{(0,0,z)\;|\;z\ge 0 \}. \end{aligned}$$
    (4.4)
  4. (d)

    a single 2D exposed face denoted by \({{\mathcal {F}}}_{-\infty }\):

    $$\begin{aligned} {{\mathcal {F}}}_{-\infty }{:}{=}\{(x,y,z)\;|\;x\le 0, z\ge 0, y=0\}, \end{aligned}$$
    (4.5)

    where we note that \({{\mathcal {F}}}_{\infty }\) and \({{\mathcal {F}}}_{ne}\) are the extreme rays of \({{\mathcal {F}}}_{-\infty }\).

Notice that except for the case (c), all faces are exposed and thus arise as an intersection \(\{{{\textbf {z}}}\}^\perp \cap K_{\exp }\) for some \({{\textbf {z}}}\in K_{\exp }^*\). To establish the above characterization, we start by examining how the components of \({{\textbf {z}}}\) determine the corresponding exposed face.

4.1.1 Exposed faces

Let \({{\textbf {z}}}\in K_{\exp }^*\) be such that \(\{{{\textbf {z}}}\}^\perp \cap K_{\exp }\) is a nontrivial face of \(K_{\exp }\). Then \({{\textbf {z}}}\ne 0\) and \({{\textbf {z}}}\in \partial K_{\exp }^*\). We consider the following cases.

\(\underline{z_x < 0}\): Since \({{\textbf {z}}}\in \partial K_{\exp }^*\), we must have \(z_z e = -z_x e^{\frac{z_y}{z_x}}\) and hence

$$\begin{aligned} {{\textbf {z}}}=(z_x,z_y,-z_x e^{\frac{z_y}{z_x}-1}). \end{aligned}$$
(4.6)

Since \(z_x \ne 0\), we see that \({{\textbf {q}}}\in \{{{\textbf {z}}}\}^\perp \) if and only if

$$\begin{aligned} q_x+ q_y\left( \frac{z_y}{z_x}\right) -q_z e^{\frac{z_y}{z_x}-1} = 0. \end{aligned}$$
(4.7)

Solving (4.7) for \(q_z\) and letting \(\beta :=\frac{z_y}{z_x}\) to simplify the exposition, we have

$$\begin{aligned}&q_z = e^{1-\frac{z_y}{z_x}}\left( q_x+q_y\cdot \frac{z_y}{z_x} \right) = e^{1-\beta }\left( q_x+q_y\beta \right) \nonumber \\&\quad \text {with}\;\;\beta :=\frac{z_y}{z_x}\in (-\infty ,\infty ). \end{aligned}$$
(4.8)

Thus, we obtain that \(\{{{\textbf {z}}}\}^\perp = \left\{ \left( x,y, e^{1-\beta }\left( x+y\beta \right) \right) \;\big |\; x,y \in \mathbb {R}\right\} \). Combining this with the definition of \(K_{\exp }\) and the fact that \(\{{{\textbf {z}}}\}^\perp \) is a supporting hyperplane (so that \(K_{\exp } \cap \{{{\textbf {z}}}\}^\perp = \partial K_{\exp }\cap \{{{\textbf {z}}}\}^\perp \)) yields

$$\begin{aligned}&K_{\exp } \cap \{{{\textbf {z}}}\}^\perp = \partial K_{\exp }\cap \{{{\textbf {z}}}\}^\perp \nonumber \\&\quad = \left\{ \left( x,y, e^{1-\beta }\left( x+y\beta \right) \right) \;\;\big |\;\; e^{1-\beta }\left( x+y\beta \right) = ye^{\frac{x}{y}},y>0 \right\} \nonumber \\&\qquad \bigcup \left\{ \left( x,y, e^{1-\beta }\left( x+y\beta \right) \right) \;\;\big |\;\; x \le 0, e^{1-\beta }\left( x+y\beta \right) \ge 0,y=0 \right\} \nonumber \\&\quad =\left\{ \left( x,y, e^{1-\beta }\left( x+y\beta \right) \right) \;\;\big |\;\; e^{1-\beta }\left( x+y\beta \right) = ye^{\frac{x}{y}},y>0 \right\} \cup \{0\}. \end{aligned}$$
(4.9)

We now refine the above characterization in the next proposition.

Proposition 4.1

(Characterization of \({{\mathcal {F}}}_\beta \), \(\beta \in \mathbb {R}\)) Let \({{\textbf {z}}}\in K_{\exp }^*\) satisfy \({{\textbf {z}}}=(z_x,z_y,z_z)\), where \(z_z e = -z_x e^{\frac{z_y}{z_x}}\) and \(z_x <0\). Define \(\beta =\frac{z_y}{z_x}\) as in (4.8) and let \({{\mathcal {F}}}_\beta := K_{\exp }\cap \{{{\textbf {z}}}\}^\perp \). Then

$$\begin{aligned} {{\mathcal {F}}}_\beta = \left\{ \left( -\beta y+y,y,e^{1-\beta }y \right) \;\bigg |\;y \in [0,\infty )\right\} . \end{aligned}$$

Proof

Let \(\varOmega := \left\{ \left( -\beta y+y,y,e^{1-\beta }y \right) \;\big |\;y \in [0,\infty )\right\} \). In view of (4.9), we can check that \(\varOmega \subseteq {{\mathcal {F}}}_\beta \). To prove the converse inclusion, pick any \({{\textbf {q}}}=\left( x,y,e^{1-\beta }(x+y\beta )\right) \in {{\mathcal {F}}}_\beta \). We need to show that \({{\textbf {q}}}\in \varOmega \).

To this end, we note from (4.9) that if \(y = 0\), then necessarily \({{\textbf {q}}}= {\textbf {0}}\) and consequently \({{\textbf {q}}}\in \varOmega \). On the other hand, if \(y > 0\), then (4.9) gives \(ye^{x/y}= (x+\beta y)e^{1-\beta }\). Then we have the following chain of equivalences:

$$\begin{aligned} \begin{array}{crl} &{} ye^{x/y}&{}\!= (x+\beta y)e^{1-\beta } \\ \iff &{} -e^{-1} &{}\!=-(x/y+\beta )e^{-(x/y+\beta )} \\ \overset{\mathrm{(a)}}{\iff }&{} -x/y-\beta &{}\!=-1 \\ \iff &{} x&{}\!=y-y\beta , \end{array} \end{aligned}$$
(4.10)

where (a) follows from the fact that the function \(t\mapsto te^t\) is strictly increasing on \([-1,\infty )\). Plugging the last expression back into \({{\textbf {q}}}\), we may compute

$$\begin{aligned} q_z = e^{1-\beta }(x+y\beta ) = e^{1-\beta }(y-y\beta +y\beta )=ye^{1-\beta }. \end{aligned}$$
(4.11)

Altogether, (4.10), (4.11) together with \(y>0\) yield

$$\begin{aligned} {{\textbf {q}}}=\left( y-\beta y ,y,ye^{1-\beta } \right) \in \varOmega . \end{aligned}$$

This completes the proof. \(\square \)
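As an informal numerical aside (not part of the proof; the normalization \(z_x = -1\) and the sampling ranges are our own choices), one can verify Proposition 4.1 by checking that every point of the ray in (4.2) lies on the boundary surface of \(K_{\exp }\) and is orthogonal to the exposing vector built from (4.6):

```python
import math, random

def z_of_beta(beta, zx=-1.0):
    """Exposing vector (4.6), normalized so that z_x = -1 and z_y/z_x = beta."""
    zy = beta * zx
    return (zx, zy, -zx * math.exp(zy / zx - 1))

def ray_point(beta, y):
    """Point of F_beta as parametrized in (4.2)."""
    return (y - beta * y, y, math.exp(1 - beta) * y)

random.seed(1)
for _ in range(200):
    beta = random.uniform(-4, 4)
    z = z_of_beta(beta)
    y = random.uniform(0, 5)
    f = ray_point(beta, y)
    # f lies on the boundary surface z = y*exp(x/y) of K_exp ...
    if y > 0:
        assert abs(f[2] - f[1] * math.exp(f[0] / f[1])) < 1e-8 * max(1.0, f[2])
    # ... and is orthogonal to the exposing vector z
    assert abs(sum(p * q for p, q in zip(z, f))) < 1e-8 * max(1.0, y)
```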

Next, we move on to the two remaining cases.

\(\underline{z_x = 0, z_z > 0}\): Notice that \({{\textbf {q}}}\in K_{\exp }\) means that \(q_y \ge 0\) and \(q_z\ge 0\). Since \(z_z > 0\) and \(z_y \ge 0\), in order to have \({{\textbf {q}}}\in \{{{\textbf {z}}}\}^\perp \), we must have \(q_z = 0\). Then the definition of \(K_{\exp }\) also forces \(q_y = 0\) and hence

$$\begin{aligned} \{{{\textbf {z}}}\}^\perp \cap K_{\exp }=\{(x,0,0)\;|\; x \le 0\} =:{{\mathcal {F}}}_{\infty }. \end{aligned}$$
(4.12)

This one-dimensional face is exposed by any hyperplane whose normal vector comes from the set \(\{(0,z_y,z_z)\;|\; z_y\ge 0,\, z_z > 0\}\).

\({{\underline{z_x = 0, z_z = 0}}}\): In this case, we have \(z_y > 0\). In order to have \({{\textbf {q}}}\in \{{{\textbf {z}}}\}^\perp \), we must have \(q_y = 0\). Thus

$$\begin{aligned} \{{{\textbf {z}}}\}^\perp \cap K_{\exp } = \{(x,y,z)\;|\;x\le 0, z\ge 0, y=0\} =: {{\mathcal {F}}}_{-\infty }, \end{aligned}$$
(4.13)

which is the unique two-dimensional face of \(K_{\exp }\).

4.1.2 The single non-exposed face and completeness of the classification

The face \({{\mathcal {F}}}_{ne}\) is non-exposed because, as shown in Proposition 4.1, (4.12) and (4.13), it never arises as an intersection of the form \(\{{{\textbf {z}}}\}^\perp \cap K_{\exp }\), for \({{\textbf {z}}}\in K_{\exp }^*\).

We now show that all nontrivial faces of \(K_{\exp }\) were accounted for in (4.2), (4.3), (4.4), (4.5). First of all, by the discussion in Sect. 4.1.1, all nontrivial exposed faces must be among the ones in (4.2), (4.3) and (4.5). So, let \( \mathcal {F}\) be a non-exposed face of \(K_{\exp }\). Then, it must be contained in a nontrivial exposed face of \(K_{\exp }\). Therefore, \( \mathcal {F}\) must be a proper face of the unique 2D face (4.5). This implies that \( \mathcal {F}\) is one of the extreme rays of (4.5): \({{\mathcal {F}}}_{\infty }\) or \({{\mathcal {F}}}_{ne}\). By assumption, \( \mathcal {F}\) is non-exposed, so it must be \({{\mathcal {F}}}_{ne}\).

4.2 One-step facial residual functions

In this subsection, we will use the machinery developed in Sect. 3 to obtain the one-step facial residual functions for \(K_{\exp }\).

Let us first discuss how the discoveries were originally made, and how that process motivated the development of the framework we built in Sect. 3. The FRFs proven here were initially found by using the characterizations of Theorem 3.10 and Lemma 3.12 together with numerical experiments. Specifically, we used Maple to numerically evaluate limits of relevant sequences (3.21) and to plot lower-dimensional slices of the function \({{\textbf {v}}}\mapsto \mathfrak {g}(\Vert {{\textbf {v}}}-{{\textbf {w}}}\Vert )/\Vert {{\textbf {w}}}-{{\textbf {u}}}\Vert \), where \({{\textbf {w}}}\) and \({{\textbf {u}}}\) are defined similarly as in (3.15).

A natural question is whether it might be simpler to change coordinates and work with the nearly equivalent \({{\textbf {w}}}\mapsto \mathfrak {g}(\Vert {{\textbf {v}}}-{{\textbf {w}}}\Vert )/\Vert {{\textbf {w}}}-{{\textbf {u}}}\Vert \), since \({{\textbf {w}}}\in \{{{\textbf {z}}}\}^\perp \). However, \(P_{\{{{\textbf {z}}}\}^\perp }^{-1}\{{{\textbf {w}}}\}\cap \partial \mathcal {K}\) may contain multiple points, which creates many challenges. We encountered an example of this when working with the exponential cone, where the change of coordinates from \({{\textbf {v}}}\) to \({{\textbf {w}}}\) necessitates the introduction of the two real branches of the Lambert \({\mathcal {W}}\) function (see, for example, [7, 12, 14] or [48] for the closely related Wright Omega function). With considerable effort, one can use such a parametrization to prove the FRFs for \({{\mathcal {F}}}_{\beta }, \beta \in \left[ -\infty ,\infty \right] \setminus \{\hat{\beta }:=-{\mathcal {W}}_{\mathrm{principal}}(2e^{-2})/2 \}\). However, the change of branches prevents the proof from going through at the exceptional number \(\hat{\beta }\). The change of variables to \({{\textbf {v}}}\) cures this problem by obviating the need for a branch function in the analysis; see [31] for additional details. This is why we present Theorem 3.10 in terms of \({{\textbf {v}}}\). Computational investigation also pointed to the path of proof, though the proof we present may be understood without the aid of a computer.

4.2.1 \({{\mathcal {F}}}_{-\infty }\): the unique 2D face

Recall the unique 2D face of \(K_{\exp }\):

$$\begin{aligned} \mathcal {F}_{-\infty }:=\{(x,y,z)\;|\; x\le 0, z\ge 0, y=0 \}. \end{aligned}$$

Define the piecewise modified Boltzmann–Shannon entropy \(\mathfrak {g}_{-\infty }:\mathbb {R}_+\rightarrow \mathbb {R}_+\) as follows:

$$\begin{aligned} \mathfrak {g}_{-\infty }(t) := {\left\{ \begin{array}{ll} 0 &{} \text {if}\;\; t=0,\\ -t \ln (t) &{} \text {if}\;\; t\in \left( 0,1/e^2\right] ,\\ t+ \frac{1}{e^2} &{} \text {if}\;\; t>1/e^2. \end{array}\right. } \end{aligned}$$
(4.14)

For more on its usefulness in optimization, see, for example, [7, 14]. We note that \(\mathfrak {g}_{-\infty }\) is monotone increasing and there exists \(L\ge 1\) such that the following inequalities hold for every \(t \in \mathbb {R}_+\) and \(M > 0\):

$$\begin{aligned}&|t| \le \mathfrak {g}_{-\infty }(t), \quad \mathfrak {g}_{-\infty }(2t) \le L\mathfrak {g}_{-\infty }(t),\nonumber \\&\quad \mathfrak {g}_{-\infty }(Mt) \le L^{1+|\log _2(M)|}\mathfrak {g}_{-\infty }(t). \end{aligned}$$
(4.15)
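As an informal check outside the text (the sample grid and tolerances are our own choices), one can verify numerically that \(\mathfrak {g}_{-\infty }\) is continuous at the breakpoint \(t = 1/e^2\), monotone increasing, and dominates \(|t|\), consistent with the first inequality in (4.15):

```python
import math

def g_minus_inf(t):
    """The piecewise modified Boltzmann-Shannon entropy (4.14)."""
    if t == 0:
        return 0.0
    if t <= 1 / math.e ** 2:
        return -t * math.log(t)
    return t + 1 / math.e ** 2

# continuity at the breakpoint t = 1/e^2: both branches give 2/e^2
b = 1 / math.e ** 2
assert abs(g_minus_inf(b) - 2 * b) < 1e-15

# monotone increasing, and |t| <= g_{-inf}(t), on a sample grid
ts = [k / 1000 for k in range(0, 3001)]
vals = [g_minus_inf(t) for t in ts]
assert all(v1 <= v2 + 1e-15 for v1, v2 in zip(vals, vals[1:]))
assert all(t <= g_minus_inf(t) + 1e-15 for t in ts)
```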

With that, we prove in the next theorem that \(\gamma _{{{\textbf {z}}},\eta }\) is positive for \({{\mathcal {F}}}_{-\infty }\), which implies that an entropic error bound holds.

Theorem 4.2

(Entropic error bound concerning \({{\mathcal {F}}}_{-\infty }\)) Let \({{\textbf {z}}}\in K_{\exp }^*\) with \(z_x=z_z=0\) and \(z_y > 0\) so that \(\{{{\textbf {z}}}\}^\perp \cap K_{\exp }={{\mathcal {F}}}_{-\infty }\) is the two-dimensional face of \(K_{\exp }\). Let \(\eta > 0\) and let \(\gamma _{{{\textbf {z}}},\eta }\) be defined as in (3.15) with \(\mathfrak {g}= \mathfrak {g}_{-\infty }\) in (4.14). Then \(\gamma _{{{\textbf {z}}},\eta } \in (0,\infty ]\) and

$$\begin{aligned}&\text {d}({{\textbf {q}}},{{\mathcal {F}}}_{-\infty })\le \max \{2,2\gamma _{{{\textbf {z}}},\eta }^{-1}\}\cdot \mathfrak {g}_{-\infty }(\text {d}({{\textbf {q}}},K_{\exp }))\nonumber \\&\quad \text{ whenever } \,{{{\textbf {q}}}\in \{{{\textbf {z}}}\}^\perp \cap B(\eta ).} \end{aligned}$$
(4.16)

Proof

In view of Lemma 3.12, take any \(\bar{{{\textbf {v}}}}\in {{\mathcal {F}}}_{-\infty }\) and a sequence \(\{{{\textbf {v}}}^k\}\subset \partial K_{\exp }\cap B(\eta ) \backslash {{\mathcal {F}}}_{-\infty }\) such that

$$\begin{aligned} \underset{k \rightarrow \infty }{\lim }{{\textbf {v}}}^k = \underset{k \rightarrow \infty }{\lim }{{\textbf {w}}}^k =\bar{{{\textbf {v}}}}, \end{aligned}$$
(4.17)

where \({{\textbf {w}}}^k = P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {v}}}^k\), \({{\textbf {u}}}^k = P_{{{\mathcal {F}}}_{-\infty }}{{\textbf {w}}}^k\), and \({{\textbf {w}}}^k\ne {{\textbf {u}}}^k\). We will show that (3.21b) does not hold for \(\mathfrak {g}= \mathfrak {g}_{-\infty }\).

Since \({{\textbf {v}}}^k \notin {{\mathcal {F}}}_{-\infty }\), in view of (4.1) and (4.13), we have \(v^k_y>0\) and

$$\begin{aligned} {{\textbf {v}}}^k = (v^k_x,v^k_y,v^k_ye^{v^k_x/v^k_y}) = (v^k_y\ln (v^k_z/v^k_y),v^k_y,v^k_z), \end{aligned}$$
(4.18)

where the second representation is obtained by solving for \(v^k_x\) from \(v^k_z = v^k_ye^{v^k_x/v^k_y} > 0\). Using the second representation in (4.18), we then have

$$\begin{aligned} {{\textbf {w}}}^k = (v^k_y\ln (v^k_z/v^k_y),0,v^k_z)\ \ \mathrm{and}\ \ {{\textbf {u}}}^k = (0,0,v^k_z); \end{aligned}$$
(4.19)

here, we made use of the fact that \({{\textbf {w}}}^k\ne {{\textbf {u}}}^k\), which implies that \(v^k_y\ln (v^k_z/v^k_y) > 0\) and thus the resulting formula for \({{\textbf {u}}}^k\). In addition, we also note from \(v^k_y\ln (v^k_z/v^k_y) > 0\) (and \(v^k_y>0\)) that

$$\begin{aligned} v^k_z> v^k_y > 0. \end{aligned}$$
(4.20)

Furthermore, since \(\bar{{{\textbf {v}}}} \in {{\mathcal {F}}}_{-\infty }\), we see from (4.13) and (4.17) that

$$\begin{aligned} \lim _{k\rightarrow \infty } v_y^k = 0. \end{aligned}$$
(4.21)

Now, using (4.18), (4.19), (4.21) and the definition of \(\mathfrak {g}_{-\infty }\), we see that for k sufficiently large,

$$\begin{aligned} \frac{\mathfrak {g}_{-\infty }(\Vert {{\textbf {v}}}^k-{{\textbf {w}}}^k\Vert )}{\Vert {{\textbf {u}}}^k-{{\textbf {w}}}^k\Vert } = \frac{-v_y^k \ln (v_y^k)}{v_y^k\ln (v_z^k/v_y^k)} = \frac{-\ln (v_y^k)}{\ln (v_z^k)-\ln (v_y^k)}. \end{aligned}$$
(4.22)

We will show that (3.21b) does not hold for \(\mathfrak {g}= \mathfrak {g}_{-\infty }\) in each of the following cases.

  1. (I)

    \(\bar{v}_z >0\).

  2. (II)

    \(\bar{v}_z =0\).

 (I): In this case, we deduce from (4.21) and (4.22) that

$$\begin{aligned} \lim _{k\rightarrow \infty } \frac{\mathfrak {g}_{-\infty }(\Vert {{\textbf {v}}}^k-{{\textbf {w}}}^k\Vert )}{\Vert {{\textbf {u}}}^k-{{\textbf {w}}}^k\Vert } = 1. \end{aligned}$$

Thus (3.21b) does not hold for \(\mathfrak {g}= \mathfrak {g}_{-\infty }\).

 (II): By passing to a subsequence if necessary, we may assume that \(v^k_z < 1\) for all k. This together with (4.20) gives \(\frac{\ln (v_z^k)}{\ln (v_y^k)} \in (0,1)\) for all k. Thus, we conclude from (4.22) that for all k,

$$\begin{aligned} \frac{\mathfrak {g}_{-\infty }(\Vert {{\textbf {v}}}^k-{{\textbf {w}}}^k\Vert )}{\Vert {{\textbf {u}}}^k-{{\textbf {w}}}^k\Vert }&= \frac{-\ln (v_y^k)}{\ln (v_z^k)-\ln (v_y^k)} = \frac{1}{1 - \frac{\ln (v_z^k)}{\ln (v_y^k)}}> 1. \end{aligned}$$

Consequently, (3.21b) also fails for \(\mathfrak {g}= \mathfrak {g}_{-\infty }\) in this case.

Having shown that (3.21b) does not hold for \(\mathfrak {g}= \mathfrak {g}_{-\infty }\) in any case, we conclude by Lemma 3.12 that \(\gamma _{{{\textbf {z}}},\eta } \in \left( 0,\infty \right] \). With that, (4.16) follows from Theorem 3.10 and (4.15). \(\square \)
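To illustrate case (I) concretely (an informal numerical aside; the choice \(\bar{v}_z = 0.5\) is arbitrary), one can tabulate the quotient (4.22) along a sequence with \(v_y^k \downarrow 0\) and watch it approach 1, confirming that (3.21b) fails for \(\mathfrak {g}= \mathfrak {g}_{-\infty }\):

```python
import math

def ratio(vy, vz):
    """The quotient (4.22): g_{-inf}(||v - w||) / ||u - w||, for v on the
    boundary of K_exp with v_y = vy, v_z = vz (second form in (4.18))."""
    vx = vy * math.log(vz / vy)     # so that vz = vy * exp(vx / vy)
    # w = P_{z^perp} v zeroes the y-coordinate; u = P_{F_{-inf}} w, cf. (4.19)
    dist_vw = vy                    # v and w differ only in the y-coordinate
    dist_uw = vx                    # positive here, since vz > vy
    return (-dist_vw * math.log(dist_vw)) / dist_uw   # g_{-inf}(t) = -t ln t for small t

# take bar v = (0, 0, 0.5), i.e. bar v_z = 0.5 > 0 as in case (I)
vals = [ratio(10.0 ** (-k), 0.5) for k in range(2, 15)]
assert all(v > 1.0 for v in vals)         # vz < 1, so the quotient exceeds 1
assert abs(vals[-1] - 1.0) < 0.05         # ... and it tends to 1 as vy -> 0
```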

Using Theorem 4.2, we can also show weaker Hölderian error bounds.

Corollary 4.3

Let \({{\textbf {z}}}\in K_{\exp }^*\) with \(z_x=z_z=0\) and \(z_y > 0\) so that \(\{{{\textbf {z}}}\}^\perp \cap K_{\exp }={{\mathcal {F}}}_{-\infty }\) is the two-dimensional face of \(K_{\exp }\). Let \(\eta >0\), \(\alpha \in (0,1)\), and \(\gamma _{{{\textbf {z}}},\eta }\) be as in (3.15) with \(\mathfrak {g}= |\cdot |^\alpha \). Then \(\gamma _{{{\textbf {z}}},\eta } \in (0,\infty ]\) and

$$\begin{aligned} \text {d}({{\textbf {q}}},{{\mathcal {F}}}_{-\infty })\le \max \{2\eta ^{1-\alpha },2\gamma _{{{\textbf {z}}},\eta }^{-1}\}\cdot \text {d}({{\textbf {q}}},K_{\exp })^\alpha \ \ \ \text{ whenever } {{\textbf {q}}}\in \{{{\textbf {z}}}\}^\perp \cap B(\eta )\text{. } \end{aligned}$$

Proof

Suppose that \(\gamma _{{{\textbf {z}}},\eta } = 0\) and let sequences \(\{{{\textbf {v}}}^k\},\{{{\textbf {w}}}^k\},\{{{\textbf {u}}}^k\}\) be as in Lemma 3.12. Then \({{\textbf {v}}}^k\ne {{\textbf {w}}}^k\) for all k because \(\{{{\textbf {v}}}^k\}\subset K_{\exp }\backslash {{\mathcal {F}}}_{-\infty }\), \(\{{{\textbf {w}}}^k\}\subset \{{{\textbf {z}}}\}^\perp \), and \({{\mathcal {F}}}_{-\infty } = K_{\exp }\cap \{{{\textbf {z}}}\}^\perp \). Since \(\mathfrak {g}_{-\infty }(t)/|t|^{\alpha } \downarrow 0 \) as \(t \downarrow 0\) we have

$$\begin{aligned} \liminf _{k\rightarrow \infty } \frac{\mathfrak {g}_{-\infty }(\Vert {{\textbf {w}}}^k - {{\textbf {v}}}^k\Vert )}{\Vert {{\textbf {w}}}^k- {{\textbf {u}}}^k\Vert } = \liminf _{k\rightarrow \infty } \frac{\mathfrak {g}_{-\infty }(\Vert {{\textbf {w}}}^k - {{\textbf {v}}}^k\Vert )}{\Vert {{\textbf {w}}}^k- {{\textbf {v}}}^k\Vert ^\alpha } \frac{\Vert {{\textbf {w}}}^k - {{\textbf {v}}}^k\Vert ^{\alpha }}{\Vert {{\textbf {w}}}^k- {{\textbf {u}}}^k\Vert } = 0, \end{aligned}$$

which contradicts Theorem 4.2 because the quantity in (3.15) should be positive for \(\mathfrak {g}= \mathfrak {g}_{-\infty }\). \(\square \)

Recalling (4.15), we obtain one-step facial residual functions using Theorem 4.2 and Corollary 4.3 in combination with Theorem 3.10, Remark 3.11 and Lemma 3.9.

Corollary 4.4

(\(\mathbb {1}\)-FRF concerning \({{\mathcal {F}}}_{-\infty }\)) Let \({{\textbf {z}}}\in K_{\exp }^*\) be such that \(\{{{\textbf {z}}}\}^\perp \cap K_{\exp }={{\mathcal {F}}}_{-\infty }\) is the two-dimensional face of \(K_{\exp }\). Let \(\mathfrak {g}= \mathfrak {g}_{-\infty }\) in (4.14) or \(\mathfrak {g}= |\cdot |^\alpha \) for \(\alpha \in (0,1)\).

Let \(\kappa _{{{\textbf {z}}},t}\) be defined as in (3.16). Then the function \(\psi _{ \mathcal {K},{{\textbf {z}}}}:\mathbb {R}_+\times \mathbb {R}_+\rightarrow \mathbb {R}_+\) given by

$$\begin{aligned} \psi _{ \mathcal {K},{{\textbf {z}}}}(\epsilon ,t):=\max \left\{ \epsilon ,\epsilon /\Vert {{\textbf {z}}}\Vert \right\} + \kappa _{{{\textbf {z}}},t}\mathfrak {g}(\epsilon +\max \left\{ \epsilon ,\epsilon /\Vert {{\textbf {z}}}\Vert \right\} ) \end{aligned}$$

is a \(\mathbb {1}\)-FRF for \(K_{\exp }\) and \({{\textbf {z}}}\). In particular, there exist \(\kappa > 0\) and a nonnegative monotone nondecreasing function \(\rho :\mathbb {R}_+ \rightarrow \mathbb {R}_+\) such that the function \({\hat{\psi }}_{ \mathcal {K},{{\textbf {z}}}}\) given by \({\hat{\psi }}_{ \mathcal {K},{{\textbf {z}}}}(\epsilon ,t) {:}{=}\kappa \epsilon + \rho (t)\mathfrak {g}(\epsilon )\) is a \(\mathbb {1}\)-FRF for \(K_{\exp }\) and \({{\textbf {z}}}\).

4.2.2 \({{\mathcal {F}}}_\beta \): the family of one-dimensional faces \(\beta \in \mathbb {R}\)

Recall from Proposition 4.1 that for each \(\beta \in \mathbb {R}\),

$$\begin{aligned} {{\mathcal {F}}}_\beta := \left\{ \left( -\beta y+y,y,e^{1-\beta }y \right) \;\bigg |\;y \in [0,\infty )\right\} \end{aligned}$$

is a one-dimensional face of \(K_{\exp }\). We will now show that for \({{\mathcal {F}}}_\beta \), \(\beta \in \mathbb {R}\), the \(\gamma _{{{\textbf {z}}},\eta }\) defined in Theorem 3.10 is positive when \(\mathfrak {g}= |\cdot |^\frac{1}{2}\). Our discussion will be centered around the following quantities, which were also defined and used in the proof of Theorem 3.10. Specifically, for \({{\textbf {z}}}\in K_{\exp }^*\) such that \({{\mathcal {F}}}_{\beta } = K_{\exp }\cap \{{{\textbf {z}}}\}^\perp \), we let \({{\textbf {v}}}\in \partial K_{\exp }\cap B(\eta )\backslash {{\mathcal {F}}}_\beta \) and define

$$\begin{aligned} {{\textbf {w}}}:=P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {v}}}\ \ \ \mathrm{and}\ \ \ {{\textbf {u}}}:=P_{{{\mathcal {F}}}_\beta }{{\textbf {w}}}. \end{aligned}$$
(4.23)

We first note the following three important vectors:

$$\begin{aligned} \widehat{{{\textbf {z}}}}:= \begin{bmatrix} 1\\ \beta \\ -e^{\beta -1} \end{bmatrix},\ \ \ \widehat{{{\textbf {f}}}}= \begin{bmatrix} 1-\beta \\ 1\\ e^{1-\beta } \end{bmatrix},\ \ \ \widehat{{{\textbf {p}}}}= \begin{bmatrix} \beta e^{1-\beta } + e^{\beta -1}\\ -e^{1-\beta } - (1-\beta )e^{\beta -1}\\ \beta ^2-\beta +1 \end{bmatrix}. \end{aligned}$$
(4.24)

Note that \(\widehat{{{\textbf {z}}}}\) is parallel to \({{\textbf {z}}}\) in (4.6) (recall that \(z_x < 0\) for \({{\mathcal {F}}}_\beta \), where \(\beta := \frac{z_y}{z_x}\in \mathbb {R}\)), \({{\mathcal {F}}}_\beta \) is the conic hull of \(\{\widehat{{{\textbf {f}}}}\}\) according to Proposition 4.1, \(\langle \widehat{{{\textbf {z}}}},\widehat{{{\textbf {f}}}}\rangle =0\) and \(\widehat{{{\textbf {p}}}}= \widehat{{{\textbf {z}}}}\times \widehat{{{\textbf {f}}}}\ne {{\varvec{0}}}\). These three nonzero vectors form a mutually orthogonal set. The next lemma represents \(\Vert {{\textbf {u}}}- {{\textbf {w}}}\Vert \) and \(\Vert {{\textbf {w}}}- {{\textbf {v}}}\Vert \) in terms of inner products of \({{\textbf {v}}}\) with these vectors, whenever possible.
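The orthogonality relations and the cross-product identity just listed can be confirmed numerically; the following snippet is an informal sanity check (the sample values of \(\beta \) and the tolerances are our own choices):

```python
import math

def hat_vectors(beta):
    """The triple (4.24): z-hat, f-hat, and p-hat = z-hat x f-hat."""
    e1, e2 = math.exp(beta - 1), math.exp(1 - beta)
    zh = (1.0, beta, -e1)
    fh = (1.0 - beta, 1.0, e2)
    ph = (beta * e2 + e1, -e2 - (1.0 - beta) * e1, beta ** 2 - beta + 1.0)
    return zh, fh, ph

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

for beta in (-3.0, -0.5, 0.0, 1.0, 2.5):
    zh, fh, ph = hat_vectors(beta)
    scale = max(map(abs, zh + fh + ph))
    # mutual orthogonality and the cross-product identity p-hat = z-hat x f-hat
    assert abs(dot(zh, fh)) < 1e-12 * scale ** 2
    assert abs(dot(zh, ph)) < 1e-12 * scale ** 2
    assert abs(dot(fh, ph)) < 1e-12 * scale ** 2
    assert all(abs(c - p) < 1e-12 * scale ** 2 for c, p in zip(cross(zh, fh), ph))
```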

Lemma 4.5

Let \(\beta \in \mathbb {R}\) and \({{\textbf {z}}}\in K_{\exp }^*\) with \(z_x<0\) such that \({{\mathcal {F}}}_\beta = \{{{\textbf {z}}}\}^\perp \cap K_{\exp }\) is a one-dimensional face of \(K_{\exp }\). Let \(\eta >0\), \({{\textbf {v}}}\in \partial K_{\exp }\cap B(\eta )\backslash {{\mathcal {F}}}_\beta \) and define \({{\textbf {w}}}\) and \({{\textbf {u}}}\) as in (4.23). Let \(\{\widehat{{{\textbf {z}}}},\widehat{{{\textbf {f}}}},\widehat{{{\textbf {p}}}}\}\) be as in (4.24). Then

$$\begin{aligned} \Vert {{\textbf {w}}}- {{\textbf {v}}}\Vert = \frac{|\langle \widehat{{{\textbf {z}}}},{{\textbf {v}}}\rangle |}{\Vert \widehat{{{\textbf {z}}}}\Vert }\ \ \ \mathrm{and}\ \ \ \Vert {{\textbf {w}}}- {{\textbf {u}}}\Vert ={\left\{ \begin{array}{ll} \frac{|\langle \widehat{{{\textbf {p}}}},{{\textbf {v}}}\rangle |}{\Vert \widehat{{{\textbf {p}}}}\Vert } &{} \mathrm{if}\ \langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}\rangle \ge 0,\\ \Vert {{\textbf {w}}}\Vert &{} \mathrm{otherwise}. \end{array}\right. } \end{aligned}$$

Moreover, when \(\langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}\rangle \ge 0\), we have \({{\textbf {u}}}= P_{\mathrm{span}{{\mathcal {F}}}_\beta }{{\textbf {w}}}\).

Proof

Since \(\{\widehat{{{\textbf {z}}}},\widehat{{{\textbf {f}}}},\widehat{{{\textbf {p}}}}\}\) is orthogonal, one can decompose \({{\textbf {v}}}\) as

$$\begin{aligned} {{\textbf {v}}}= \lambda _1 \widehat{{{\textbf {z}}}}+ \lambda _2 \widehat{{{\textbf {f}}}}+ \lambda _3 \widehat{{{\textbf {p}}}}, \end{aligned}$$
(4.25)

with

$$\begin{aligned} \lambda _1 = \langle \widehat{{{\textbf {z}}}},{{\textbf {v}}}\rangle /\Vert \widehat{{{\textbf {z}}}}\Vert ^2, \ \ \lambda _2 = \langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}\rangle /\Vert \widehat{{{\textbf {f}}}}\Vert ^2\ \ \mathrm{and}\ \ \lambda _3 = \langle \widehat{{{\textbf {p}}}},{{\textbf {v}}}\rangle /\Vert \widehat{{{\textbf {p}}}}\Vert ^2. \end{aligned}$$
(4.26)

Also, since \(\widehat{{{\textbf {z}}}}\) is parallel to \({{\textbf {z}}}\), we must have \({{\textbf {w}}}= \lambda _2 \widehat{{{\textbf {f}}}}+ \lambda _3 \widehat{{{\textbf {p}}}}\). Thus, it holds that \(\Vert {{\textbf {w}}}-{{\textbf {v}}}\Vert = |\lambda _1|\Vert \widehat{{{\textbf {z}}}}\Vert \) and the first conclusion follows from this and (4.26).

Next, we have \({{\textbf {u}}}= {\hat{t}} \,\widehat{{{\textbf {f}}}}\), where

$$\begin{aligned} {\hat{t}}= \text {argmin}_{t\ge 0}\Vert {{\textbf {w}}}- t\widehat{{{\textbf {f}}}}\Vert = {\left\{ \begin{array}{ll} \frac{\langle {{\textbf {w}}},\widehat{{{\textbf {f}}}}\rangle }{\Vert \widehat{{{\textbf {f}}}}\Vert ^2} &{} \mathrm{if}\ \langle \widehat{{{\textbf {f}}}},{{\textbf {w}}}\rangle \ge 0,\\ 0 &{} \mathrm{otherwise}. \end{array}\right. } \end{aligned}$$

Moreover, observe from (4.25) that \(\langle \widehat{{{\textbf {f}}}},{{\textbf {w}}}\rangle = \langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}- \lambda _1\widehat{{{\textbf {z}}}}\rangle = \langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}\rangle \). These mean that when \(\langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}\rangle < 0\), we have \({{\textbf {u}}}= 0\), while when \(\langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}\rangle \ge 0\), we have \({{\textbf {u}}}= \frac{\langle {{\textbf {w}}},\widehat{{{\textbf {f}}}}\rangle }{\Vert \widehat{{{\textbf {f}}}}\Vert ^2}\widehat{{{\textbf {f}}}}=P_{\mathrm{span}{{\mathcal {F}}}_\beta }{{\textbf {w}}}\) and

$$\begin{aligned} \Vert {{\textbf {w}}}- {{\textbf {u}}}\Vert = \left\| {{\textbf {w}}}- \frac{\langle {{\textbf {w}}},\widehat{{{\textbf {f}}}}\rangle }{\Vert \widehat{{{\textbf {f}}}}\Vert ^2}\widehat{{{\textbf {f}}}}\right\| = |\lambda _3|\Vert \widehat{{{\textbf {p}}}}\Vert = |\langle \widehat{{{\textbf {p}}}},{{\textbf {v}}}\rangle |/\Vert \widehat{{{\textbf {p}}}}\Vert , \end{aligned}$$

where the second and the third equalities follow from (4.25), (4.26), and the fact that \({{\textbf {w}}}= \lambda _2 \widehat{{{\textbf {f}}}}+ \lambda _3 \widehat{{{\textbf {p}}}}\). This completes the proof. \(\square \)
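The closed-form distances of Lemma 4.5 can also be checked against the projections computed explicitly; the following is an informal numerical verification (the sampling ranges are our own choices, restricted so that the exponentials stay well scaled):

```python
import math, random

def norm(a):
    return math.sqrt(sum(x * x for x in a))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def check_lemma45(beta, v):
    """Compare explicit projection distances with the closed forms of Lemma 4.5."""
    e1, e2 = math.exp(beta - 1), math.exp(1 - beta)
    zh = (1.0, beta, -e1)                                   # the triple (4.24)
    fh = (1.0 - beta, 1.0, e2)
    ph = (beta * e2 + e1, -e2 - (1.0 - beta) * e1, beta ** 2 - beta + 1.0)
    # explicit projections: w = P_{z^perp} v and u = P_{F_beta} w
    lam = dot(zh, v) / dot(zh, zh)
    w = tuple(vi - lam * zi for vi, zi in zip(v, zh))
    t = max(0.0, dot(w, fh) / dot(fh, fh))
    u = tuple(t * fi for fi in fh)
    # the two distance formulas of the lemma
    lhs1 = norm(tuple(wi - vi for wi, vi in zip(w, v)))
    rhs1 = abs(dot(zh, v)) / norm(zh)
    lhs2 = norm(tuple(wi - ui for wi, ui in zip(w, u)))
    rhs2 = abs(dot(ph, v)) / norm(ph) if dot(fh, v) >= 0 else norm(w)
    assert abs(lhs1 - rhs1) < 1e-9 * (1.0 + rhs1)
    assert abs(lhs2 - rhs2) < 1e-9 * (1.0 + rhs2)

random.seed(2)
for _ in range(500):
    beta = random.uniform(-2, 2)
    x, y = random.uniform(-2, 2), random.uniform(0.5, 2)
    check_lemma45(beta, (x, y, y * math.exp(x / y)))    # v on the boundary of K_exp
```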

We now prove our main theorem in this section.

Theorem 4.6

(Hölderian error bound concerning \({{\mathcal {F}}}_\beta \), \(\beta \in \mathbb {R}\)) Let \(\beta \in \mathbb {R}\) and \({{\textbf {z}}}\in K_{\exp }^*\) with \(z_x<0\) such that \({{\mathcal {F}}}_\beta = \{{{\textbf {z}}}\}^\perp \cap K_{\exp }\) is a one-dimensional face of \(K_{\exp }\). Let \(\eta > 0\) and let \(\gamma _{{{\textbf {z}}},\eta }\) be defined as in (3.15) with \(\mathfrak {g}= |\cdot |^\frac{1}{2}\). Then \(\gamma _{{{\textbf {z}}},\eta } \in (0,\infty ]\) and

$$\begin{aligned} \text {d}({{\textbf {q}}},{{\mathcal {F}}}_\beta )\le \max \left\{ 2\sqrt{\eta },2\gamma _{{{\textbf {z}}},\eta }^{-1}\right\} \cdot \text {d}({{\textbf {q}}},K_{\exp })^\frac{1}{2}\ \ \ \text{ whenever } {{\textbf {q}}}\in \{{{\textbf {z}}}\}^\perp \cap B(\eta )\text{. } \end{aligned}$$

Proof

In view of Lemma 3.12, take any \(\bar{{{\textbf {v}}}}\in {{\mathcal {F}}}_\beta \) and a sequence \(\{{{\textbf {v}}}^k\}\subset \partial K_{\exp }\cap B(\eta ) \backslash {{\mathcal {F}}}_\beta \) such that

$$\begin{aligned} \underset{k \rightarrow \infty }{\lim }{{\textbf {v}}}^k = \underset{k \rightarrow \infty }{\lim }{{\textbf {w}}}^k =\bar{{{\textbf {v}}}}, \end{aligned}$$
(4.27)

where \({{\textbf {w}}}^k = P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {v}}}^k\), \({{\textbf {u}}}^k = P_{{{\mathcal {F}}}_\beta }{{\textbf {w}}}^k\), and \({{\textbf {w}}}^k\ne {{\textbf {u}}}^k\). We will show that (3.21b) does not hold for \(\mathfrak {g}= |\cdot |^\frac{1}{2}\).

We first suppose that \({{\textbf {v}}}^k\in {{\mathcal {F}}}_{-\infty }\) infinitely often. By extracting a subsequence if necessary, we may assume that \({{\textbf {v}}}^k\in {{\mathcal {F}}}_{-\infty }\) for all k. From the definition of \({{\mathcal {F}}}_{-\infty }\) in (4.13), we have \(v_x^k\le 0\), \(v_y^k=0\) and \(v_z^k \ge 0\). Thus, recalling the definition of \(\widehat{{{\textbf {z}}}}\) in (4.24), it holds that

$$\begin{aligned} \begin{aligned} |\langle \widehat{{{\textbf {z}}}},{{\textbf {v}}}^k\rangle |&= |v^k_x - v_z^ke^{\beta -1}| = -v^k_x + v_z^ke^{\beta -1} = |v^k_x| + e^{\beta -1}|v_z^k| \\&\ge \min \{1,e^{\beta -1}\}\sqrt{|v^k_x|^2 + |v_z^k|^2} = \min \{1,e^{\beta -1}\}\Vert {{\textbf {v}}}^k\Vert , \end{aligned} \end{aligned}$$
(4.28)

where the last equality holds because \(v_y^k=0\). Next, using properties of projections onto subspaces, we have \(\Vert {{\textbf {w}}}^k\Vert \le \Vert {{\textbf {v}}}^k\Vert \). This together with Lemma 4.5 and (4.28) shows that

$$\begin{aligned} \Vert {{\textbf {w}}}^k - {{\textbf {u}}}^k\Vert = \text {d}({{\textbf {w}}}^k,{{\mathcal {F}}}_\beta )\le \Vert {{\textbf {w}}}^k\Vert \le \Vert {{\textbf {v}}}^k\Vert&\le \frac{1}{\min \{1,e^{\beta -1}\}}|\langle \widehat{{{\textbf {z}}}},{{\textbf {v}}}^k\rangle |\\&= \frac{\Vert \widehat{{{\textbf {z}}}}\Vert }{\min \{1,e^{\beta -1}\}}\Vert {{\textbf {w}}}^k - {{\textbf {v}}}^k\Vert . \end{aligned}$$

Since \({{\textbf {w}}}^k \rightarrow \bar{{{\textbf {v}}}}\) and \({{\textbf {v}}}^k \rightarrow \bar{{{\textbf {v}}}}\), the above display shows that (3.21b) does not hold for \(\mathfrak {g}= |\cdot |^\frac{1}{2}\) in this case.

Next, suppose that \({{\textbf {v}}}^k\notin {{\mathcal {F}}}_{-\infty }\) for all large k instead. By passing to a subsequence, we assume that \({{\textbf {v}}}^k\notin {{\mathcal {F}}}_{-\infty }\) for all k. In view of (4.1) and (4.13), this means in particular that

$$\begin{aligned} {{\textbf {v}}}^k = (v^k_x,v^k_y,v^k_ye^{v^k_x/v^k_y})\ \ \mathrm{and}\ \ v^k_y>0\ \text{ for } \text{ all }\ k. \end{aligned}$$
(4.29)

We consider two cases and show that (3.21b) does not hold for \(\mathfrak {g}= |\cdot |^\frac{1}{2}\) in either of them:

  (I) \(\langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}^k\rangle \ge 0\) infinitely often;

  (II) \(\langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}^k\rangle < 0\) for all large k.

 (I): Since \(\langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}^k\rangle \ge 0\) infinitely often, by extracting a subsequence if necessary, we assume that \(\langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}^k\rangle \ge 0\) for all k. Now, consider the following functions:

$$\begin{aligned} \begin{aligned} h_1(\zeta )&:= \zeta + \beta - e^{\beta +\zeta -1},\\ h_2(\zeta )&:= (\beta e^{1-\beta } + e^{\beta -1})\zeta - e^{1-\beta }-(1-\beta )e^{\beta -1} + (\beta ^2-\beta +1)e^\zeta . \end{aligned} \end{aligned}$$

Using these functions, Lemma 4.5, (4.24) and (4.29), one can see immediately that

$$\begin{aligned} \begin{aligned} \Vert {{\textbf {w}}}^k - {{\textbf {v}}}^k\Vert&= \frac{|\langle \widehat{{{\textbf {z}}}},{{\textbf {v}}}^k\rangle |}{\Vert \widehat{{{\textbf {z}}}}\Vert } = \frac{v^k_y |h_1(v^k_x/v^k_y)|}{\Vert \widehat{{{\textbf {z}}}}\Vert },\\ \Vert {{\textbf {w}}}^k - {{\textbf {u}}}^k\Vert&= \frac{|\langle \widehat{{{\textbf {p}}}},{{\textbf {v}}}^k\rangle |}{\Vert \widehat{{{\textbf {p}}}}\Vert } = \frac{v^k_y |h_2(v^k_x/v^k_y)|}{\Vert \widehat{{{\textbf {p}}}}\Vert }. \end{aligned} \end{aligned}$$
(4.30)

Note that \(h_1\) is zero if and only if \(\zeta = 1 - \beta \). Furthermore, we have \(h_1'(1-\beta ) = 0\) and \(h_1''(1-\beta ) = -1\). Then, considering the Taylor expansion of \(h_1\) around \(1-\beta \) we have

$$\begin{aligned} h_1(\zeta ) = 1 + (\zeta + \beta -1) - e^{\beta +\zeta -1} = -\frac{(\zeta +\beta -1)^2}{2} + O(|\zeta +\beta -1|^3)\ \ \ \mathrm{as}\ \ \zeta \rightarrow 1-\beta . \end{aligned}$$

Also, one can check that \(h_2(1-\beta )=0\) and that

$$\begin{aligned} h_2'(1-\beta ) = \beta e^{1-\beta } + e^{\beta -1} + (\beta ^2-\beta +1)e^{1-\beta } = e^{\beta -1} + (\beta ^2+1)e^{1-\beta } > 0. \end{aligned}$$

Therefore, we have the following Taylor expansion of \(h_2\) around \(1-\beta \):

$$\begin{aligned} h_2(\zeta ) = (e^{\beta -1} + (\beta ^2+1)e^{1-\beta })(\zeta + \beta -1) + O(|\zeta +\beta -1|^2)\ \ \ \mathrm{as}\ \ \zeta \rightarrow 1-\beta . \end{aligned}$$
(4.31)

Thus, using the Taylor expansions of \(h_1\) and \(h_2\) at \(1-\beta \) we have

$$\begin{aligned} \lim \limits _{\zeta \rightarrow 1-\beta }\frac{|h_1(\zeta )|^\frac{1}{2}}{|h_2(\zeta )|} = \frac{1}{\sqrt{2}(e^{\beta -1} + (\beta ^2+1)e^{1-\beta })}> 0. \end{aligned}$$
(4.32)

Hence, there exist \(C_h > 0\) and \(\epsilon >0\) so that

$$\begin{aligned} |h_1(\zeta )|^\frac{1}{2} \ge C_h|h_2(\zeta )| \ \ \mathrm{whenever}\ |\zeta - (1-\beta )| \le \epsilon . \end{aligned}$$
(4.33)
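The expansions of \(h_1\) and \(h_2\) and the limit (4.32) are easy to probe numerically. The following Python sketch (an illustration only, with the arbitrary test value \(\beta = 0.3\)) verifies that both functions vanish at \(\zeta = 1-\beta \) and that \(|h_1|^{1/2}/|h_2|\) approaches the claimed limit.

```python
import math

beta = 0.3  # an arbitrary test value of beta

def h1(z):
    return z + beta - math.exp(beta + z - 1)

def h2(z):
    return ((beta * math.exp(1 - beta) + math.exp(beta - 1)) * z
            - math.exp(1 - beta) - (1 - beta) * math.exp(beta - 1)
            + (beta ** 2 - beta + 1) * math.exp(z))

# both functions vanish at zeta = 1 - beta
assert abs(h1(1 - beta)) < 1e-12 and abs(h2(1 - beta)) < 1e-10
# the limit in (4.32)
limit = 1.0 / (math.sqrt(2) * (math.exp(beta - 1) + (beta ** 2 + 1) * math.exp(1 - beta)))
for s in [1e-2, 1e-3, 1e-4]:
    z = 1 - beta + s
    assert abs(math.sqrt(abs(h1(z))) / abs(h2(z)) - limit) < 10 * s
```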

Next, consider the following function:

$$\begin{aligned} H(\zeta ):={\left\{ \begin{array}{ll} \frac{|h_1(\zeta )|}{|h_2(\zeta )|} &{} \mathrm{if}\ |\zeta -(1-\beta )|\ge \epsilon , h_2(\zeta )\ne 0,\\ \infty &{} \mathrm{otherwise}. \end{array}\right. } \end{aligned}$$

Then it is easy to check that H is proper closed and is never zero. Moreover, by direct computation, we have \(\lim \limits _{\zeta \rightarrow \infty }H(\zeta ) = \frac{e^{\beta -1}}{\beta ^2-\beta +1} > 0\) and

$$\begin{aligned} \lim \limits _{\zeta \rightarrow -\infty }H(\zeta ) = {\left\{ \begin{array}{ll} |\beta e^{1-\beta } + e^{\beta -1}|^{-1} &{} \mathrm{if}\ \beta e^{1-\beta } + e^{\beta -1}\ne 0,\\ \infty &{} \mathrm{otherwise}. \end{array}\right. } \end{aligned}$$

Thus, we deduce that \(\inf H > 0\).
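The tail limits of H and the positivity of \(\inf H\) can likewise be checked numerically. In the Python sketch below, \(\beta = 0.3\) is an arbitrary choice (for which \(\beta e^{1-\beta } + e^{\beta -1}\ne 0\)), and the grid-based lower bound is indicative only, not a proof.

```python
import math

beta = 0.3  # arbitrary; note beta*e^(1-beta) + e^(beta-1) != 0 here
c = beta * math.exp(1 - beta) + math.exp(beta - 1)

def h1(z):
    return z + beta - math.exp(beta + z - 1)

def h2(z):
    return (c * z - math.exp(1 - beta) - (1 - beta) * math.exp(beta - 1)
            + (beta ** 2 - beta + 1) * math.exp(z))

def H(z, tol=1e-12):
    # mirrors the definition of H: the ratio where h2 != 0, infinity otherwise
    return abs(h1(z)) / abs(h2(z)) if abs(h2(z)) > tol else float("inf")

# the two tail limits of H
assert abs(H(50.0) - math.exp(beta - 1) / (beta ** 2 - beta + 1)) < 1e-6
assert abs(H(-1e6) - 1.0 / abs(c)) < 1e-3
# H stays bounded away from zero on a grid excluding a neighbourhood of 1 - beta
eps = 0.1
grid = [-20 + i * 0.01 for i in range(4001)]
assert min(H(z) for z in grid if abs(z - (1 - beta)) >= eps) > 1e-3
```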

Now, if it happens that \(|v^k_x/v^k_y - (1-\beta )|> \epsilon \) for all large k, upon letting \(\zeta _k:= v^k_x/v^k_y\), we have from (4.30) that for all large k,

$$\begin{aligned} \begin{aligned} \frac{\Vert {{\textbf {w}}}^k-{{\textbf {v}}}^k\Vert }{\Vert {{\textbf {w}}}^k-{{\textbf {u}}}^k\Vert }&=\frac{\Vert \widehat{{{\textbf {p}}}}\Vert }{\Vert \widehat{{{\textbf {z}}}}\Vert }\frac{|h_1(\zeta _k)|}{|h_2(\zeta _k)|} = \frac{\Vert \widehat{{{\textbf {p}}}}\Vert }{\Vert \widehat{{{\textbf {z}}}}\Vert }H(\zeta _k)\ge \frac{\Vert \widehat{{{\textbf {p}}}}\Vert }{\Vert \widehat{{{\textbf {z}}}}\Vert }\inf H >0, \end{aligned} \end{aligned}$$
(4.34)

where the second equality holds because of the definition of H and the facts that \({{\textbf {w}}}^k\ne {{\textbf {u}}}^k\) (so that \(h_2(\zeta _k)\ne 0\) by (4.30)) and \(|v^k_x/v^k_y - (1-\beta )|> \epsilon \) for all large k.

On the other hand, if it holds that \(|v^k_x/v^k_y - (1-\beta )|\le \epsilon \) infinitely often, then by extracting a further subsequence, we may assume that \(|v^k_x/v^k_y - (1-\beta )|\le \epsilon \) for all k. Upon letting \(\zeta _k:= v^k_x/v^k_y\), we have from (4.30) that

$$\begin{aligned} \begin{aligned} \frac{\Vert {{\textbf {w}}}^k-{{\textbf {u}}}^k\Vert }{\Vert {{\textbf {w}}}^k-{{\textbf {v}}}^k\Vert ^\frac{1}{2}} =\frac{\sqrt{v^k_y\Vert \widehat{{{\textbf {z}}}}\Vert }}{\Vert \widehat{{{\textbf {p}}}}\Vert }\frac{|h_2(\zeta _k)|}{|h_1(\zeta _k)|^\frac{1}{2}} \le \frac{\sqrt{v^k_y\Vert \widehat{{{\textbf {z}}}}\Vert }}{C_h\Vert \widehat{{{\textbf {p}}}}\Vert } \le \frac{\sqrt{\eta \Vert \widehat{{{\textbf {z}}}}\Vert }}{C_h\Vert \widehat{{{\textbf {p}}}}\Vert }, \end{aligned} \end{aligned}$$
(4.35)

where the first inequality holds thanks to \(|v^k_x/v^k_y - (1-\beta )|\le \epsilon \) for all k, (4.33) and the fact that \({{\textbf {w}}}^k\ne {{{\textbf {u}}}}^k\) (so that \(h_2(\zeta _k)\ne 0\) and hence \(h_1(\zeta _k)\ne 0\)), and the second inequality holds because \({{\textbf {v}}}^k\in B(\eta )\).

Using (4.34) and (4.35) together with (4.27), we see that (3.21b) does not hold for \(\mathfrak {g}= |\cdot |^\frac{1}{2}\). This concludes case (I).

 (II): By passing to a subsequence, we may assume that \(\langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}^k\rangle < 0\) for all k. Then we see from (4.24) and (4.29) that

$$\begin{aligned} (1-\beta )\frac{v_x^k}{v_y^k} + 1 + e^{1-\beta }e^{v_x^k/v_y^k} = \frac{1}{v_y^k} \langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}^k\rangle < 0. \end{aligned}$$

Since the continuous function \(\zeta \mapsto (1-\beta )\zeta + 1 + e^{1-\beta }e^{\zeta }\) takes the positive value \((1-\beta )^2+1+e^{2(1-\beta )}\) at \(\zeta = 1-\beta \), while the above display shows it is negative at each \(\zeta = v_x^k/v_y^k\), we deduce that there exists \(\epsilon > 0\) so that

$$\begin{aligned} \left| \frac{v_x^k}{v_y^k}-(1-\beta )\right| \ge \epsilon \ \ \text{ for } \text{ all }\ k. \end{aligned}$$
(4.36)

Now, consider the following function

$$\begin{aligned} G(\zeta ):= \frac{|\zeta +\beta -e^{\beta +\zeta -1}|}{\sqrt{\zeta ^2+1+e^{2\zeta }}}. \end{aligned}$$

Then G is continuous and is zero if and only if \(\zeta = 1-\beta \). Moreover, by direct computation, we have \(\lim \nolimits _{\zeta \rightarrow \infty } G(\zeta ) = e^{\beta -1} > 0\) and \(\lim \nolimits _{\zeta \rightarrow -\infty } G(\zeta ) = 1 > 0\). Thus, it follows that

$$\begin{aligned} \underline{G}:= \inf _{|\zeta +\beta -1|\ge \epsilon }G(\zeta ) > 0. \end{aligned}$$
(4.37)
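The stated properties of G (its unique zero at \(\zeta = 1-\beta \), its two tail limits, and the positivity of \(\underline{G}\)) admit a quick numerical check; in the Python sketch below, \(\beta = 0.3\) and \(\epsilon = 0.1\) are arbitrary choices and the grid check is illustrative only.

```python
import math

beta = 0.3  # an arbitrary test value

def G(z):
    return abs(z + beta - math.exp(beta + z - 1)) / math.sqrt(z * z + 1 + math.exp(2 * z))

assert G(1 - beta) < 1e-12                       # the only zero of G
assert abs(G(40.0) - math.exp(beta - 1)) < 1e-6  # limit as zeta -> +inf
assert abs(G(-1e6) - 1.0) < 1e-3                 # limit as zeta -> -inf
# G is bounded away from zero once |zeta + beta - 1| >= eps
eps = 0.1
assert min(G(-20 + i * 0.01) for i in range(4001)
           if abs(-20 + i * 0.01 - (1 - beta)) >= eps) > 1e-3
```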

Finally, since \(\langle \widehat{{{\textbf {f}}}},{{\textbf {v}}}^k\rangle < 0\) for all k, we see that

$$\begin{aligned} \begin{aligned} \frac{\Vert {{\textbf {w}}}^k-{{\textbf {v}}}^k\Vert }{\Vert {{\textbf {w}}}^k-{{\textbf {u}}}^k\Vert }&\overset{\mathrm{(a)}}{\ge }\frac{\Vert {{\textbf {w}}}^k-{{\textbf {v}}}^k\Vert }{\Vert {{\textbf {v}}}^k\Vert } \overset{\mathrm{(b)}}{=} \frac{|{\widehat{z}}_xv^k_x + {\widehat{z}}_yv^k_y + {\widehat{z}}_zv^k_ye^{v^k_x/v^k_y}|}{\Vert \widehat{{{\textbf {z}}}}\Vert \sqrt{(v^k_x)^2+(v^k_y)^2+(v^k_y)^2e^{2v^k_x/v^k_y}}}\\&\overset{\mathrm{(c)}}{=}\frac{|{\widehat{z}}_x\zeta _k + {\widehat{z}}_y + {\widehat{z}}_ze^{\zeta _k}|}{\Vert \widehat{{{\textbf {z}}}}\Vert \sqrt{(\zeta _k)^2 + 1 + e^{2\zeta _k}}} \overset{\mathrm{(d)}}{=} \frac{1}{\Vert \widehat{{{\textbf {z}}}}\Vert }G(\zeta _k)\overset{\mathrm{(e)}}{\ge }\frac{1}{\Vert \widehat{{{\textbf {z}}}}\Vert }\underline{G} > 0, \end{aligned} \end{aligned}$$

where (a) follows from \(\Vert {{\textbf {w}}}^k-{{\textbf {u}}}^k\Vert = \Vert {{\textbf {w}}}^k\Vert \) (see Lemma 4.5) and \(\Vert {{\textbf {w}}}^k\Vert \le \Vert {{\textbf {v}}}^k\Vert \) (because \({{\textbf {w}}}^k\) is the projection of \({{{\textbf {v}}}}^k\) onto a subspace), (b) follows from Lemma 4.5 and (4.29), (c) holds because \(v^k_y > 0\) (see (4.29)) and we defined \(\zeta _k:= v_x^k/v_y^k\), (d) follows from (4.24) and the definition of G, and (e) follows from (4.36) and (4.37). The above together with (4.27) shows that (3.21b) does not hold for \(\mathfrak {g}= |\cdot |^\frac{1}{2}\), which is what we wanted to show in case (II).

Summarizing the above cases, we conclude that there is no sequence \(\{{{\textbf {v}}}^k\}\) (with its associated \(\{{{\textbf {w}}}^k\}\) and \(\{{{\textbf {u}}}^k\}\)) for which (3.21b) holds with \(\mathfrak {g}= |\cdot |^\frac{1}{2}\). By Lemma 3.12, it must then hold that \(\gamma _{{{\textbf {z}}},\eta }\in (0,\infty ]\) and we have the desired error bound in view of Theorem 3.10. This completes the proof. \(\square \)

Combining Theorems 4.6, 3.10 and Lemma 3.9, and using the observation that \(\gamma _{{{\textbf {z}}},0}=\infty \) (see (3.15)), we obtain a one-step facial residual function in the following corollary.

Corollary 4.7

(\(\mathbb {1}\)-FRF concerning \({{\mathcal {F}}}_\beta \), \(\beta \in \mathbb {R}\)) Let \(\beta \in \mathbb {R}\) and \({{\textbf {z}}}\in K_{\exp }^*\) with \(z_x<0\) such that \({{\mathcal {F}}}_\beta = \{{{\textbf {z}}}\}^\perp \cap K_{\exp }\) is a one-dimensional face of \(K_{\exp }\). Let \(\kappa _{{{\textbf {z}}},t}\) be defined as in (3.16) with \(\mathfrak {g}= |\cdot |^\frac{1}{2}\). Then the function \(\psi _{ \mathcal {K},{{\textbf {z}}}}:\mathbb {R}_+\times \mathbb {R}_+\rightarrow \mathbb {R}_+\) given by

$$\begin{aligned} \psi _{ \mathcal {K},{{\textbf {z}}}}(\epsilon ,t) := \max \left\{ \epsilon ,\epsilon /\Vert {{\textbf {z}}}\Vert \right\} + \kappa _{{{\textbf {z}}},t}(\epsilon +\max \left\{ \epsilon ,\epsilon /\Vert {{\textbf {z}}}\Vert \right\} )^\frac{1}{2} \end{aligned}$$

is a \(\mathbb {1}\)-FRF for \(K_{\exp }\). In particular, there exist \(\kappa > 0\) and a nonnegative monotone nondecreasing function \(\rho :\mathbb {R}_+ \rightarrow \mathbb {R}_+\) such that the function \({\hat{\psi }}_{ \mathcal {K},{{\textbf {z}}}}\) given by \({\hat{\psi }}_{ \mathcal {K},{{\textbf {z}}}}(\epsilon ,t) {:}{=}\kappa \epsilon + \rho (t)\sqrt{\epsilon }\) is a \(\mathbb {1}\)-FRF for \(K_{\exp }\) and \({{\textbf {z}}}\).

4.2.3 \({{\mathcal {F}}}_{\infty }\): the exceptional one-dimensional face

Recall the special one-dimensional face of \(K_{\exp }\) defined by

$$\begin{aligned} \mathcal {F}_{\infty }:= \{(x,0,0) \;|\; x\le 0 \}. \end{aligned}$$

We first show that we have a Lipschitz error bound for any exposing normal vector \({{\textbf {z}}}= (0, z_y,z_z)\) with \(z_y > 0\) and \(z_z > 0\).

Theorem 4.8

(Lipschitz error bound concerning \({{\mathcal {F}}}_{\infty }\)) Let \({{\textbf {z}}}\in K_{\exp }^*\) with \(z_x=0\), \(z_y > 0\) and \(z_z>0\) so that \(\{{{\textbf {z}}}\}^\perp \cap K_{\exp }={{\mathcal {F}}}_\infty \). Let \(\eta > 0\) and let \(\gamma _{{{\textbf {z}}},\eta }\) be defined as in (3.15) with \(\mathfrak {g}= |\cdot |\). Then \(\gamma _{{{\textbf {z}}},\eta } \in (0,\infty ]\) and

$$\begin{aligned} \text {d}({{\textbf {q}}},{{\mathcal {F}}}_{\infty })\le \max \{2,2\gamma _{{{\textbf {z}}},\eta }^{-1}\}\cdot \text {d}({{\textbf {q}}},K_{\exp })\ \ \ \text{ whenever } {{\textbf {q}}}\in \{{{\textbf {z}}}\}^\perp \cap B(\eta )\text{. } \end{aligned}$$

Proof

Without loss of generality, upon scaling, we may assume that \({{\textbf {z}}}= (0,a,1)\) for some \(a > 0\). As in the proof of Theorem 4.6, we will consider the following vectors:

$$\begin{aligned} \widetilde{{{\textbf {z}}}}:= \begin{bmatrix} 0\\ a\\ 1 \end{bmatrix}, \ \ \widetilde{{{\textbf {f}}}}:= \begin{bmatrix} -1\\ 0\\ 0 \end{bmatrix},\ \ \widetilde{{{\textbf {p}}}}:= \begin{bmatrix} 0\\ 1\\ -a \end{bmatrix}. \end{aligned}$$

Here, \({{\mathcal {F}}}_\infty \) is the conical hull of \(\widetilde{{{\textbf {f}}}}\) (see (4.12)), and \(\widetilde{{{\textbf {p}}}}\) is constructed so that \(\{\widetilde{{{\textbf {z}}}},\widetilde{{{\textbf {f}}}},\widetilde{{{\textbf {p}}}}\}\) is orthogonal.

Now, let \({{\textbf {v}}}\in \partial K_{\exp }\cap B(\eta )\backslash {{\mathcal {F}}}_\infty \), \({{\textbf {w}}}= P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {v}}}\) and \({{\textbf {u}}}=P_{{{\mathcal {F}}}_\infty }{{\textbf {w}}}\) with \({{\textbf {u}}}\ne {{\textbf {w}}}\). Then, as in Lemma 4.5, by decomposing \({{\textbf {v}}}\) as a linear combination of \(\{\widetilde{{{\textbf {z}}}},\widetilde{{{\textbf {f}}}},\widetilde{{{\textbf {p}}}}\}\), we have

$$\begin{aligned} \Vert {{\textbf {w}}}- {{\textbf {v}}}\Vert = \frac{|\langle \widetilde{{{\textbf {z}}}},{{\textbf {v}}}\rangle |}{\Vert \widetilde{{{\textbf {z}}}}\Vert }\ \ \mathrm{and}\ \ \Vert {{\textbf {w}}}-{{\textbf {u}}}\Vert = {\left\{ \begin{array}{ll} \frac{|\langle \widetilde{{{\textbf {p}}}},{{\textbf {v}}}\rangle |}{\Vert \widetilde{{{\textbf {p}}}}\Vert } &{} \mathrm{if}\ \langle \widetilde{{{\textbf {f}}}},{{\textbf {v}}}\rangle \ge 0,\\ \Vert {{\textbf {w}}}\Vert &{} \mathrm{otherwise}. \end{array}\right. } \end{aligned}$$
(4.38)

We consider the following cases for estimating \(\gamma _{{{\textbf {z}}},\eta }\).

  (I) \({{\textbf {v}}}\in {{\mathcal {F}}}_{-\infty }\backslash {{\mathcal {F}}}_\infty \);

  (II) \({{\textbf {v}}}\notin {{\mathcal {F}}}_{-\infty }\) with \(v_x \le 0\);

  (III) \({{\textbf {v}}}\notin {{\mathcal {F}}}_{-\infty }\) with \(v_x > 0\).

 (I): In this case, \({{\textbf {v}}}= (v_x,0,v_z)\) with \(v_x\le 0\le v_z\); see (4.13). Then \(\langle \widetilde{{{\textbf {f}}}},{{\textbf {v}}}\rangle = -v_x\ge 0\) and \(|\langle \widetilde{{{\textbf {z}}}},{{\textbf {v}}}\rangle | = |v_z| = \frac{1}{a}|\langle \widetilde{{{\textbf {p}}}},{{\textbf {v}}}\rangle |\). Consequently, we have from (4.38) that

$$\begin{aligned} \Vert {{\textbf {w}}}- {{\textbf {v}}}\Vert = \frac{|\langle \widetilde{{{\textbf {z}}}},{{\textbf {v}}}\rangle |}{\Vert \widetilde{{{\textbf {z}}}}\Vert } = \frac{|\langle \widetilde{{{\textbf {p}}}},{{\textbf {v}}}\rangle |}{a\Vert \widetilde{{{\textbf {z}}}}\Vert } = \frac{\Vert \widetilde{{{\textbf {p}}}}\Vert }{a\Vert \widetilde{{{\textbf {z}}}}\Vert }\Vert {{\textbf {w}}}- {{\textbf {u}}}\Vert . \end{aligned}$$

 (II): In this case, in view of (4.1) and (4.13), we have \({{\textbf {v}}}= (v_x,v_y,v_ye^{v_x/v_y})\) with \(v_x\le 0\) and \(v_y > 0\). Then \(\langle \widetilde{{{\textbf {f}}}},{{\textbf {v}}}\rangle = -v_x\ge 0\). Moreover, since \(v_y>0\), we have

$$\begin{aligned} \begin{aligned} |\langle \widetilde{{{\textbf {z}}}},{{\textbf {v}}}\rangle |&= |av_y+v_ye^{v_x/v_y}| = av_y+v_ye^{v_x/v_y} \ge \min \{1,a\}(v_y+v_ye^{v_x/v_y})\\&\ge \frac{\min \{1,a\}}{\max \{1,a\}}(v_y+av_ye^{v_x/v_y})\ge \frac{\min \{1,a\}}{\max \{1,a\}}|v_y-av_ye^{v_x/v_y}| \\&= \frac{\min \{1,a\}}{\max \{1,a\}}|\langle \widetilde{{{\textbf {p}}}},{{\textbf {v}}}\rangle |. \end{aligned} \end{aligned}$$

Using (4.38), we then obtain that \(\Vert {{\textbf {w}}}- {{\textbf {v}}}\Vert \ge \frac{\min \{1,a\}\Vert \widetilde{{{\textbf {p}}}}\Vert }{\max \{1,a\}\Vert \widetilde{{{\textbf {z}}}}\Vert }\Vert {{\textbf {w}}}- {{\textbf {u}}}\Vert \).

 (III): In this case, in view of (4.1) and (4.13), \({{\textbf {v}}}= (v_x,v_y,v_ye^{v_x/v_y})\) with \(v_x> 0\) and \(v_y > 0\). Then \(\langle \widetilde{{{\textbf {f}}}},{{\textbf {v}}}\rangle = -v_x< 0\) and hence \(\Vert {{\textbf {w}}}- {{\textbf {u}}}\Vert = \Vert {{\textbf {w}}}\Vert \le \Vert {{\textbf {v}}}\Vert \), where the equality follows from (4.38) and the inequality holds because \({{\textbf {w}}}\) is the projection of \({{\textbf {v}}}\) onto a subspace. Since \(v_y > 0\), we have

$$\begin{aligned} \begin{aligned} |\langle \widetilde{{{\textbf {z}}}},{{\textbf {v}}}\rangle |&= |av_y+v_ye^{v_x/v_y}| = av_y+0.5v_ye^{v_x/v_y} + 0.5v_ye^{v_x/v_y}\\&\overset{\mathrm{(a)}}{\ge }av_y+0.5v_y(1 + v_x/v_y) + 0.5v_ye^{v_x/v_y}\\&\ge 0.5v_y + 0.5v_x + 0.5v_ye^{v_x/v_y}\\&= 0.5\Vert {{\textbf {v}}}\Vert _1, \end{aligned} \end{aligned}$$

where we used \(v_y > 0\) and \(e^t\ge 1+t\) for all t in (a) and \(\Vert {{\textbf {v}}}\Vert _1\) denotes the 1-norm of \({{\textbf {v}}}\). Combining this with (4.38) and the fact that \(\Vert {{\textbf {w}}}\Vert \le \Vert {{\textbf {v}}}\Vert \), we see that

$$\begin{aligned} \Vert {{\textbf {w}}}- {{\textbf {v}}}\Vert = \frac{|\langle \widetilde{{{\textbf {z}}}},{{\textbf {v}}}\rangle |}{\Vert \widetilde{{{\textbf {z}}}}\Vert }\ge \frac{\Vert {{\textbf {v}}}\Vert _1}{2\Vert \widetilde{{{\textbf {z}}}}\Vert }\ge \frac{\Vert {{\textbf {v}}}\Vert }{2\Vert \widetilde{{{\textbf {z}}}}\Vert } \ge \frac{\Vert {{\textbf {w}}}\Vert }{2\Vert \widetilde{{{\textbf {z}}}}\Vert } = \frac{\Vert {{\textbf {w}}}-{{\textbf {u}}}\Vert }{2\Vert \widetilde{{{\textbf {z}}}}\Vert }. \end{aligned}$$

Summarizing the three cases, we conclude that

$$\begin{aligned}\gamma _{{{\textbf {z}}},\eta } \ge \min \left\{ \frac{\Vert \widetilde{{{\textbf {p}}}}\Vert }{a\Vert \widetilde{{{\textbf {z}}}}\Vert },\frac{\min \{1,a\}\Vert \widetilde{{{\textbf {p}}}}\Vert }{\max \{1,a\}\Vert \widetilde{{{\textbf {z}}}}\Vert },\frac{1}{2\Vert \widetilde{{{\textbf {z}}}}\Vert }\right\} > 0.\end{aligned}$$

In view of Theorem 3.10, we have the desired error bound. This completes the proof. \(\square \)
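The key estimate in case (III), \(|\langle \widetilde{{{\textbf {z}}}},{{\textbf {v}}}\rangle |\ge 0.5\Vert {{\textbf {v}}}\Vert _1\) for boundary points with \(v_x,v_y>0\), can be spot-checked numerically. The Python sketch below uses the arbitrary value \(a = 0.75\) and a small grid of boundary points; it is an illustration, not a substitute for the inequality \(e^t\ge 1+t\) used in the proof.

```python
import math

a = 0.75  # an arbitrary a > 0, i.e., z = (0, a, 1) after scaling

for vx in [0.1, 0.5, 1.0, 3.0, 10.0]:
    for vy in [0.05, 0.5, 2.0, 5.0]:
        vz = vy * math.exp(vx / vy)   # boundary point of K_exp with vx, vy > 0
        lhs = abs(a * vy + vz)        # |<z~, v>|
        norm1 = vx + vy + vz          # ||v||_1 (all entries positive here)
        assert lhs >= 0.5 * norm1 - 1e-9
```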

We next turn to the supporting hyperplane defined by \({{\textbf {z}}}= (0,0,z_z)\) for some \(z_z>0\), so that \(\{{{\textbf {z}}}\}^\perp \) is the xy-plane. The following lemma shows that no Hölderian-type error bound in the form of (3.16) with \(\mathfrak {g}= |\cdot |^\alpha \), \(\alpha \in (0,1]\), can hold in this case.

Lemma 4.9

(Nonexistence of Hölderian error bounds) Let \({{\textbf {z}}}\in K_{\exp }^*\) with \(z_x= z_y = 0\) and \(z_z>0\) so that \(\{{{\textbf {z}}}\}^\perp \cap K_{\exp }={{\mathcal {F}}}_\infty \). Let \(\alpha \in (0,1]\) and \(\eta > 0\). Then

$$\begin{aligned} \inf _{{\textbf {q}}}\left\{ \frac{\text {d}({{\textbf {q}}},K_{\exp })^\alpha }{\text {d}({{\textbf {q}}},{{\mathcal {F}}}_\infty )}\;\bigg |\;{{\textbf {q}}}\in \{{{\textbf {z}}}\}^\perp \cap B(\eta )\backslash {{\mathcal {F}}}_\infty \right\} = 0. \end{aligned}$$

Proof

For each \(k\in \mathbb {N}\), let \({{\textbf {q}}}^k := (-\frac{\eta }{2},\frac{\eta }{2k},0)\). Then \({{\textbf {q}}}^k\in \{{{\textbf {z}}}\}^\perp \cap B(\eta )\backslash {{\mathcal {F}}}_\infty \) and we have \(\text {d}({{\textbf {q}}}^k,{{\mathcal {F}}}_\infty ) = \frac{\eta }{2k}\). Moreover, since \((q^k_x,q^k_y,q^k_ye^{q^k_x/q^k_y})\in K_{\exp }\), we have \(\text {d}({{\textbf {q}}}^k,K_{\exp })\le q^k_ye^{q^k_x/q^k_y} = \frac{\eta }{2k}e^{-k}\). Then it holds that

$$\begin{aligned} \frac{\text {d}({{\textbf {q}}}^k,K_{\exp })^\alpha }{\text {d}({{\textbf {q}}}^k,{{\mathcal {F}}}_\infty )} \le \frac{\eta ^{\alpha -1}}{2^{\alpha -1}}k^{1-\alpha }e^{-\alpha k} \rightarrow 0 \ \ \ \mathrm{as}\ \ k\rightarrow \infty \end{aligned}$$

since \(\alpha \in (0,1]\). This completes the proof. \(\square \)
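The sequence \({{\textbf {q}}}^k = (-\frac{\eta }{2},\frac{\eta }{2k},0)\) used in the proof can be traced numerically. The Python sketch below (with the arbitrary choices \(\eta = 1\) and \(\alpha = 1/2\)) evaluates the upper bound on the ratio, which for these values equals \(\sqrt{2k/\eta }\,e^{-k/2}\), and confirms that it decreases to 0.

```python
import math

eta, alpha = 1.0, 0.5  # arbitrary eta > 0 and alpha in (0, 1]

ratios = []
for k in range(1, 41):
    d_face = eta / (2 * k)                       # d(q^k, F_inf) = eta/(2k)
    d_cone_ub = (eta / (2 * k)) * math.exp(-k)   # d(q^k, K_exp) <= (eta/(2k)) e^{-k}
    ratios.append(d_cone_ub ** alpha / d_face)   # upper bound on the ratio in the infimum

assert all(r2 < r1 for r1, r2 in zip(ratios, ratios[1:]))  # strictly decreasing
assert ratios[-1] < 1e-6                                   # tends to 0, so the infimum is 0
```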

Since a zero-at-zero monotone nondecreasing function of the form \((\cdot )^\alpha \) no longer works, we opt for the following function \(\mathfrak {g}_\infty :\mathbb {R}_+\rightarrow \mathbb {R}_+\) that grows faster around \(t=0\):

$$\begin{aligned} \mathfrak {g}_\infty (t) := {\left\{ \begin{array}{ll} 0 &{}\text {if}\;t=0,\\ -\frac{1}{\ln (t)} &{} \text {if}\;0 < t\le \frac{1}{e^2},\\ \frac{1}{4}+\frac{1}{4}e^2t &{} \text {if}\;t>\frac{1}{e^2}. \end{array}\right. } \end{aligned}$$
(4.39)

Similar to \(\mathfrak {g}_{-\infty }\) in (4.14), \(\mathfrak {g}_{\infty }\) is monotone increasing and there exists a constant \({\widehat{L}} \ge 1\) such that the following inequalities hold for every \(t \in \mathbb {R}_+\) and \(M > 0\):

$$\begin{aligned} |t| \le \mathfrak {g}_{\infty }(t), \quad \mathfrak {g}_{\infty }(2t) \le {\widehat{L}}\mathfrak {g}_{\infty }(t),\quad \mathfrak {g}_{\infty }(Mt) \le {{\widehat{L}}}^{1+|\log _2(M)|}\mathfrak {g}_{\infty }(t). \end{aligned}$$
(4.40)
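The three properties in (4.40) can be checked numerically on a grid. In the Python sketch below, the doubling constant L = 4 is an empirical value that suffices for the tested grid, not the constant \({\widehat{L}}\) of the text.

```python
import math

def g_inf(t):
    # the function g_infty from (4.39)
    if t == 0:
        return 0.0
    if t <= math.exp(-2):
        return -1.0 / math.log(t)
    return 0.25 + 0.25 * math.exp(2) * t

ts = [0.0] + [10.0 ** (-i) for i in range(12, -2, -1)]  # 0, 1e-12, ..., 1, 10
# t <= g_inf(t), and g_inf is monotone nondecreasing along the grid
assert all(t <= g_inf(t) for t in ts)
assert all(g_inf(s) <= g_inf(t) for s, t in zip(ts, ts[1:]))
# doubling inequality g_inf(2t) <= L * g_inf(t) for a modest L
L = 4.0  # an empirical constant for this grid, not the paper's L-hat
assert all(g_inf(2 * t) <= L * g_inf(t) for t in ts if t > 0)
```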

We next show that error bounds in the form of (3.16) holds for \({{\textbf {z}}}= (0,0,z_z)\), \(z_z>0\), if we use \(\mathfrak {g}_\infty \).

Theorem 4.10

(Log-type error bound concerning \({{\mathcal {F}}}_{\infty }\)) Let \({{\textbf {z}}}\in K_{\exp }^*\) with \(z_x=z_y = 0\) and \(z_z>0\) so that \(\{{{\textbf {z}}}\}^\perp \cap K_{\exp }={{\mathcal {F}}}_\infty \). Let \(\eta > 0\) and let \(\gamma _{{{\textbf {z}}},\eta }\) be defined as in (3.15) with \(\mathfrak {g}= \mathfrak {g}_\infty \) in (4.39). Then \(\gamma _{{{\textbf {z}}},\eta } \in (0,\infty ]\) and

$$\begin{aligned} \text {d}({{\textbf {q}}},{{\mathcal {F}}}_{\infty })\le \max \{2,2\gamma _{{{\textbf {z}}},\eta }^{-1}\}\cdot \mathfrak {g}_\infty (\text {d}({{\textbf {q}}},K_{\exp }))\ \ \ \text{ whenever } {{\textbf {q}}}\in \{{{\textbf {z}}}\}^\perp \cap B(\eta )\text{. } \end{aligned}$$

Proof

Take \(\bar{{{\textbf {v}}}}\in {{\mathcal {F}}}_\infty \) and a sequence \(\{{{\textbf {v}}}^k\}\subset \partial K_{\exp }\cap B(\eta ) \backslash {{\mathcal {F}}}_\infty \) such that

$$\begin{aligned} \underset{k \rightarrow \infty }{\lim }{{\textbf {v}}}^k = \underset{k \rightarrow \infty }{\lim }{{\textbf {w}}}^k =\bar{{{\textbf {v}}}}, \end{aligned}$$

where \({{\textbf {w}}}^k = P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {v}}}^k\), \({{\textbf {u}}}^k = P_{{{\mathcal {F}}}_\infty }{{\textbf {w}}}^k\), and \({{\textbf {w}}}^k\ne {{\textbf {u}}}^k\). Since \({{\textbf {w}}}^k\ne {{\textbf {u}}}^k\), in view of (4.12) and (4.13), we must have \({{\textbf {v}}}^k\notin {{\mathcal {F}}}_{-\infty }\). Then, from (4.1) and (4.12), we have

$$\begin{aligned} \begin{aligned}&{{\textbf {v}}}^k = (v^k_x,v^k_y,v^k_ye^{v^k_x/v^k_y}) \text{ with } v^k_y>0,\\&{{\textbf {w}}}^k = (v^k_x,v^k_y,0)\ \ \mathrm{and}\ \ {{\textbf {u}}}^k = (\min \{v^k_x,0\},0,0). \end{aligned} \end{aligned}$$
(4.41)

Since \({{\textbf {w}}}^k\rightarrow \bar{{{\textbf {v}}}}\) and \({{\textbf {v}}}^k\rightarrow \bar{{{\textbf {v}}}}\), without loss of generality, by passing to a subsequence if necessary, we assume in addition that \(\Vert {{\textbf {w}}}^k - {{\textbf {v}}}^k\Vert \le e^{-2}\) for all k. From (4.41) we conclude that \({{\textbf {v}}}^k \ne {{\textbf {w}}}^k\), hence \(\mathfrak {g}_\infty (\Vert {{\textbf {w}}}^k - {{\textbf {v}}}^k\Vert ) = -(\ln \Vert {{\textbf {w}}}^k - {{\textbf {v}}}^k\Vert )^{-1}\).

We consider the following two cases in order to show that (3.21b) does not hold for \(\mathfrak {g}= \mathfrak {g}_\infty \):

  (I) \(\bar{{{\textbf {v}}}}\ne {\textbf {0}}\);

  (II) \(\bar{{{\textbf {v}}}}= {\textbf {0}}\).

 (I): In this case, we have \(\bar{{{\textbf {v}}}}= ({\bar{v}}_x,0,0)\) for some \({\bar{v}}_x < 0\). This implies that \(v^k_x<0\) for all large k. Hence, we have from (4.41) that for all large k,

$$\begin{aligned} \frac{\mathfrak {g}_\infty (\Vert {{\textbf {w}}}^k - {{\textbf {v}}}^k\Vert ) }{\Vert {{\textbf {w}}}^k - {{\textbf {u}}}^k \Vert } = \frac{-(\ln \Vert {{\textbf {w}}}^k-{{\textbf {v}}}^k\Vert )^{-1}}{\Vert {{\textbf {w}}}^k - {{\textbf {u}}}^k\Vert }&= -\frac{1}{v^k_y\left( v^k_x/v^k_y + \ln v^k_y \right) } \\&= -\frac{1}{v^k_y\ln v^k_y + v^k_x} \rightarrow -\frac{1}{{\bar{v}}_x} > 0 \end{aligned}$$

since \(v^k_y\rightarrow 0\) and \(v^k_x\rightarrow {\bar{v}}_x < 0\). This shows that (3.21b) does not hold for \(\mathfrak {g}= \mathfrak {g}_\infty \).
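The limit computed in case (I) can be reproduced numerically along a concrete sequence. The Python sketch below uses the arbitrary choice \({\bar{v}}_x = -1/2\) and works with \(\ln \Vert {{\textbf {w}}}^k-{{\textbf {v}}}^k\Vert = \ln v^k_y + {\bar{v}}_x/v^k_y\) directly, since \(v^k_z\) itself underflows in floating point for large k.

```python
import math

xbar = -0.5  # the limit point is (xbar, 0, 0) with xbar < 0 (an arbitrary choice)
vals = []
for k in [10, 100, 1000, 10000]:
    vy = 1.0 / k
    # v^k = (xbar, vy, vy*exp(xbar/vy)); use log(v_z) directly to avoid underflow
    log_vz = math.log(vy) + xbar / vy        # = ln ||w^k - v^k||
    g = -1.0 / log_vz                        # g_inf(||w^k - v^k||), valid since v_z <= e^-2
    vals.append(g / vy)                      # ||w^k - u^k|| = vy in this case
assert abs(vals[-1] - (-1.0 / xbar)) < 1e-2  # the ratio tends to -1/xbar > 0
```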

 (II): If \(v_x^k\le 0\) infinitely often, by extracting a subsequence, we assume that \(v_x^k\le 0\) for all k. Since \({{\textbf {w}}}^k\ne {{\textbf {u}}}^k\) (and \({{\textbf {w}}}^k\ne {{\textbf {v}}}^k\)), we note from (4.41) that

$$\begin{aligned} -\frac{1}{v^k_y\ln v^k_y + v^k_x} = \frac{-(\ln \Vert {{\textbf {w}}}^k-{{\textbf {v}}}^k\Vert )^{-1}}{\Vert {{\textbf {w}}}^k - {{\textbf {u}}}^k\Vert } \in (0,\infty )\ \ \text{ for } \text{ all }\ k. \end{aligned}$$

Since \(\{-(v^k_y\ln v^k_y + v^k_x)\}\) is a positive sequence and it converges to zero as \((v^k_x,v^k_y)\rightarrow 0\), it follows that \(\lim \nolimits _{k\rightarrow \infty }\frac{-(\ln \Vert {{\textbf {w}}}^k-{{\textbf {v}}}^k\Vert )^{-1}}{\Vert {{\textbf {w}}}^k - {{\textbf {u}}}^k\Vert }=\infty \). This shows that (3.21b) does not hold for \(\mathfrak {g}= \mathfrak {g}_\infty \).

Now, it remains to consider the case that \(v_x^k>0\) for all large k. By passing to a subsequence if necessary, we assume that \(v_x^k>0\) for all k. By solving for \(v_x^k\) from \(v^k_z=v^k_y e^{v^k_x/v^k_y} > 0\) and noting (4.41), we obtain that

$$\begin{aligned} \begin{aligned}&{{\textbf {v}}}^k = (v^k_y\ln (v_z^k/v_y^k),v^k_y,v^k_z) \text{ with } v^k_y>0,\\&{{\textbf {w}}}^k = (v^k_y\ln (v_z^k/v_y^k),v^k_y,0)\ \ \mathrm{and}\ \ {{\textbf {u}}}^k = (0,0,0). \end{aligned} \end{aligned}$$
(4.42)

Also, we note from \(v_x^k=v^k_y\ln (v_z^k/v_y^k)>0\), \(v_y^k>0\) and the monotonicity of \(\ln (\cdot )\) that for all k,

$$\begin{aligned} v^k_z>v^k_y>0. \end{aligned}$$
(4.43)

Next consider the function \(h(t) := \frac{1}{t}\sqrt{1+(\ln t)^2}\) on \([1,\infty )\). Then h is continuous and positive. Since \(h(1)=1\) and \(\lim _{t\rightarrow \infty }h(t) = 0\), there exists \(M_h\) such that \(h(t)\le M_h\) for all \(t\ge 1\). Now, using (4.42), we have, upon defining \(t_k:= v_z^k/v_y^k\) that

$$\begin{aligned} \begin{aligned} \frac{\Vert {{\textbf {w}}}^k - {{\textbf {u}}}^k\Vert }{-(\ln \Vert {{\textbf {w}}}^k-{{\textbf {v}}}^k\Vert )^{-1}}&= \frac{v_y^k\sqrt{1+[\ln (v_z^k/v_y^k)]^2}}{-(\ln v_z^k)^{-1}} = -v_y^k\sqrt{1+[\ln (v_z^k/v_y^k)]^2}\ln v_z^k\\&\overset{\mathrm{(a)}}{=} -\frac{v_y^k}{v_z^k}\sqrt{1+\left[ \ln \left( \frac{v_z^k}{v_y^k}\right) \right] ^2}v_z^k\ln v_z^k \overset{\mathrm{(b)}}{=} - h(t_k)v_z^k\ln v_z^k \\&\overset{\mathrm{(c)}}{\le }-M_hv_z^k\ln v_z^k, \end{aligned} \end{aligned}$$

where the division by \(v_z^k\) in (a) is legitimate because \(v_z^k>0\), (b) follows from the definition of h and the fact that \(t_k > 1\) (see (4.43)), and (c) holds because of the definition of \(M_h\) and the fact that \(-\ln v_z^k > 0\) (thanks to \(v_z^k = \Vert {{\textbf {w}}}^k - {{\textbf {v}}}^k\Vert \le e^{-2}\)). Since \(v^k_z\rightarrow 0\), it then follows that \(\left\{ \frac{\Vert {{\textbf {w}}}^k - {{\textbf {u}}}^k\Vert }{-(\ln \Vert {{\textbf {w}}}^k-{{\textbf {v}}}^k\Vert )^{-1}}\right\} \) is a positive sequence that converges to zero. Thus, \(\lim \nolimits _{k\rightarrow \infty }\frac{-(\ln \Vert {{\textbf {w}}}^k-{{\textbf {v}}}^k\Vert )^{-1}}{\Vert {{\textbf {w}}}^k - {{\textbf {u}}}^k\Vert }=\infty \), which again shows that (3.21b) does not hold for \(\mathfrak {g}= \mathfrak {g}_\infty \).
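The boundedness of h on \([1,\infty )\) and the decay of \(-v_z^k\ln v_z^k\) are easily confirmed numerically. In the Python sketch below, the grid bound \(M_h = 1\) is an empirical observation (the maximum over the grid is attained at t = 1).

```python
import math

def h(t):
    return math.sqrt(1.0 + math.log(t) ** 2) / t

# h(1) = 1 and h(t) -> 0 as t -> infinity; on this grid the max is h(1) = 1
M_h = max(h(1 + i * 0.01) for i in range(10000))
assert M_h <= 1.0 + 1e-12
# -v * ln(v) -> 0 as v -> 0+, which drives the final ratio to zero
seq = [-v * math.log(v) for v in (10.0 ** (-i) for i in range(1, 10))]
assert all(b < a for a, b in zip(seq, seq[1:]))  # strictly decreasing
assert seq[-1] < 1e-7
```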

Having shown that (3.21b) does not hold for \(\mathfrak {g}= \mathfrak {g}_\infty \), in view of Lemma 3.12, we must have \(\gamma _{{{\textbf {z}}},\eta } \in (0,\infty ]\). Then the result follows from Theorem 3.10 and (4.40). \(\square \)

Combining Theorem 4.8, Theorem 4.10 and Lemma 3.9, and noting (4.40) and \(\gamma _{{{\textbf {z}}},0}=\infty \) (see (3.15)), we can now summarize the one-step facial residual functions derived in this section in the following corollary.

Corollary 4.11

(\(\mathbb {1}\)-FRF concerning \({{\mathcal {F}}}_{\infty }\)) Let \({{\textbf {z}}}\in K_{\exp }^*\) with \(z_x=0\) and \(\{{{\textbf {z}}}\}^\perp \cap K_{\exp }={{\mathcal {F}}}_\infty \).

  (i) In the case when \(z_y > 0\), let \(\kappa _{{{\textbf {z}}},t}\) be defined as in (3.16) with \(\mathfrak {g}= |\cdot |\). Then the function \(\psi _{ \mathcal {K},{{\textbf {z}}}}:\mathbb {R}_+\times \mathbb {R}_+\rightarrow \mathbb {R}_+\) given by

    $$\begin{aligned} \psi _{ \mathcal {K},{{\textbf {z}}}}(\epsilon ,t):=\max \left\{ \epsilon ,\epsilon /\Vert {{\textbf {z}}}\Vert \right\} + \kappa _{{{\textbf {z}}},t}(\epsilon +\max \left\{ \epsilon ,\epsilon /\Vert {{\textbf {z}}}\Vert \right\} ) \end{aligned}$$

    is a \(\mathbb {1}\)-FRF for \(K_{\exp }\) and \({{\textbf {z}}}\). In particular, there exist \(\kappa > 0\) and a nonnegative monotone nondecreasing function \(\rho :\mathbb {R}_+ \rightarrow \mathbb {R}_+\) such that the function \({\hat{\psi }}_{ \mathcal {K},{{\textbf {z}}}}\) given by \({\hat{\psi }}_{ \mathcal {K},{{\textbf {z}}}}(\epsilon ,t) {:}{=}\kappa \epsilon + \rho (t)\epsilon \) is a \(\mathbb {1}\)-FRF for \(K_{\exp }\) and \({{\textbf {z}}}\).

  (ii) In the case when \(z_y = 0\), let \(\kappa _{{{\textbf {z}}},t}\) be defined as in (3.16) with \(\mathfrak {g}= \mathfrak {g}_\infty \) in (4.39). Then the function \(\psi _{ \mathcal {K},{{\textbf {z}}}}:\mathbb {R}_+\times \mathbb {R}_+\rightarrow \mathbb {R}_+\) given by

    $$\begin{aligned} \psi _{ \mathcal {K},{{\textbf {z}}}}(\epsilon ,t):=\max \left\{ \epsilon ,\epsilon /\Vert {{\textbf {z}}}\Vert \right\} + \kappa _{{{\textbf {z}}},t}\mathfrak {g}_\infty (\epsilon +\max \left\{ \epsilon ,\epsilon /\Vert {{\textbf {z}}}\Vert \right\} ) \end{aligned}$$

    is a \(\mathbb {1}\)-FRF for \(K_{\exp }\) and \({{\textbf {z}}}\). In particular, there exist \(\kappa > 0\) and a nonnegative monotone nondecreasing function \(\rho :\mathbb {R}_+ \rightarrow \mathbb {R}_+\) such that the function \({\hat{\psi }}_{ \mathcal {K},{{\textbf {z}}}}\) given by \({\hat{\psi }}_{ \mathcal {K},{{\textbf {z}}}}(\epsilon ,t) {:}{=}\kappa \epsilon + \rho (t)\mathfrak {g}_{\infty }(\epsilon )\) is a \(\mathbb {1}\)-FRF for \(K_{\exp }\) and \({{\textbf {z}}}\).

4.2.4 The non-exposed face \({{\mathcal {F}}}_{ne}\)

Recall the unique non-exposed face of \(K_{\exp }\):

$$\begin{aligned} \mathcal {F}_{ne} := \{(0,0,z) \;|\; z \ge 0\}. \end{aligned}$$

In this subsection, we take a look at \({{\mathcal {F}}}_{ne}\). Note that \({{\mathcal {F}}}_{ne}\) is an exposed face of \({{\mathcal {F}}}_{-\infty }\), which is polyhedral. This observation leads immediately to the following corollary, which also follows from [34, Proposition 18] by letting \( \mathcal {F}{:}{=} \mathcal {K}{:}{=}{{\mathcal {F}}}_{-\infty }\) therein. We omit the proof for brevity.

Corollary 4.12

(\(\mathbb {1}\)-FRF for \({{\mathcal {F}}}_{ne}\)) Let \({{\textbf {z}}}\in \ {{\mathcal {F}}}_{-\infty }^*\) be such that \({{\mathcal {F}}}_{ne} = {{\mathcal {F}}}_{-\infty } \cap \{{{\textbf {z}}}\}^\perp \). Then there exists \(\kappa > 0\) such that

$$\begin{aligned} \psi _{{{\mathcal {F}}}_{-\infty },{{\textbf {z}}}}(\epsilon ,t) {:}{=}\kappa \epsilon \end{aligned}$$

is a \(\mathbb {1}\)-FRF for \({{\mathcal {F}}}_{-\infty }\) and \({{\textbf {z}}}\).

4.3 Error bounds

In this subsection, we return to the feasibility problem (Feas) and consider the case where \( \mathcal {K}= K_{\exp }\). We now have all the tools for obtaining error bounds. Recalling Definition 2.1, we can state the following result.

Theorem 4.13

(Error bounds for (Feas) with \( \mathcal {K}= K_{\exp }\)) Let \(\mathcal {L}\subseteq \mathbb {R}^3\) be a subspace and \({{\textbf {a}}}\in \mathbb {R}^3\) such that \((\mathcal {L}+ {{\textbf {a}}}) \cap K_{\exp }\ne \emptyset \). Then the following items hold.

  1. (i)

    The distance to the PPS condition of \(\{K_{\exp }, \mathcal {L}+{{\textbf {a}}}\}\) satisfies \(d_{\text {PPS}}(K_{\exp },\mathcal {L}+{{\textbf {a}}}) \le 1\).

  2. (ii)

    If \(d_{\text {PPS}}(K_{\exp },\mathcal {L}+{{\textbf {a}}})=0 \), then \(K_{\exp }\) and \(\mathcal {L}+{{\textbf {a}}}\) satisfy a Lipschitzian error bound.

  3. (iii)

    Suppose \(d_{\text {PPS}}(K_{\exp },\mathcal {L}+{{\textbf {a}}})=1\) and let \( \mathcal {F}\subsetneq K_{\exp }\) be a chain of faces of length 2 satisfying items (ii) and (iii) of Proposition 3.2. We have the following possibilities.

    1. (a)

      If \( \mathcal {F}= {{\mathcal {F}}}_{-\infty }\) then \(K_{\exp }\) and \(\mathcal {L}+{{\textbf {a}}}\) satisfy an entropic error bound as in (4.44). In addition, for all \(\alpha \in (0,1)\), a uniform Hölderian error bound with exponent \(\alpha \) holds.

    2. (b)

      If \( \mathcal {F}= {{\mathcal {F}}}_{\beta }\), with \(\beta \in \mathbb {R}\), then \(K_{\exp }\) and \(\mathcal {L}+{{\textbf {a}}}\) satisfy a uniform Hölderian error bound with exponent 1/2.

    3. (c)

      Suppose that \( \mathcal {F}= {{\mathcal {F}}}_{\infty }\). If there exists \({{\textbf {z}}}\in K_{\exp }^* \cap \mathcal {L}^\perp \cap \{{{\textbf {a}}}\}^\perp \) with \(z_x=0\), \(z_y > 0\) and \(z_z>0\), then \(K_{\exp }\) and \(\mathcal {L}+{{\textbf {a}}}\) satisfy a Lipschitzian error bound. Otherwise, \(K_{\exp }\) and \(\mathcal {L}+{{\textbf {a}}}\) satisfy a log-type error bound as in (4.45).

    4. (d)

      If \( \mathcal {F}= \{{\textbf {0}} \}\), then \(K_{\exp }\) and \(\mathcal {L}+{{\textbf {a}}}\) satisfy a Lipschitzian error bound.

Proof

  1. (i):

    All proper faces of \(K_{\exp }\) are polyhedral, therefore \(\ell _{\text {poly}}(K_{\exp }) = 1\). By item  of Proposition 3.2, there exists a chain of length 2 satisfying item  of Proposition 3.2. Therefore, \(d_{\text {PPS}}(K_{\exp },\mathcal {L}+{{\textbf {a}}}) \le 1\).

  2. (ii):

    If \(d_{\text {PPS}}(K_{\exp },\mathcal {L}+{{\textbf {a}}}) = 0\), it is because \(\{K_{\exp }, \mathcal {L}+{{\textbf {a}}}\}\) satisfies the PPS condition, which implies a Lipschitzian error bound by Proposition 2.2.

  3. (iii):

    Next, suppose \(d_{\text {PPS}}(K_{\exp },\mathcal {L}+{{\textbf {a}}})=1\) and let \( \mathcal {F}\subsetneq K_{\exp }\) be a chain of faces of length 2 satisfying items (ii) and (iii) of Proposition 3.2, together with \({{\textbf {z}}}\in K_{\exp }^* \cap \mathcal {L}^\perp \cap \{{{\textbf {a}}}\}^\perp \) such that

    $$\begin{aligned} \mathcal {F}= K_{\exp }\cap \{{{\textbf {z}}}\}^\perp . \end{aligned}$$

Since positively scaling \({{\textbf {z}}}\) does not affect the chain of faces, we may assume that \(\Vert {{\textbf {z}}}\Vert = 1\). Also, in what follows, for simplicity, we define

$$\begin{aligned} {\widehat{\text {d}}}({{\textbf {x}}}) {:}{=}\max \{\text {d}({{\textbf {x}}},\mathcal {L}+{{\textbf {a}}}), \text {d}({{\textbf {x}}}, K_{\exp }) \}. \end{aligned}$$

Then, we prove each item by applying Theorem 3.8 with the corresponding facial residual function.

  1. (a)

    If \( \mathcal {F}= {{\mathcal {F}}}_{-\infty }\), the one-step facial residual functions are given by Corollary 4.4. First we consider the case where \(\mathfrak {g}= \mathfrak {g}_{-\infty }\) and we have

    $$\begin{aligned} \psi _{ \mathcal {K},{{\textbf {z}}}}(\epsilon ,t):= \epsilon + \kappa _{{{\textbf {z}}},t}\mathfrak {g}_{-\infty }(2\epsilon ), \end{aligned}$$

    where \(\mathfrak {g}_{-\infty }\) is as in (4.14). Then, if \(\psi \) is a positively rescaled shift of \(\psi _{ \mathcal {K},{{\textbf {z}}}}\), using the monotonicity of \(\mathfrak {g}_{-\infty }\) and of \(\kappa _{{{\textbf {z}}},t}\) as a function of t, we conclude that there exists \({\widehat{M}} > 0\) such that

    $$\begin{aligned} \psi (\epsilon ,t) \le {\widehat{M}} \epsilon + {\widehat{M}} \kappa _{{{\textbf {z}}}, {\widehat{M}} t}\mathfrak {g}_{-\infty }({\widehat{M}}\epsilon ). \end{aligned}$$

    Invoking Theorem 3.8, using the monotonicity of all functions involved in the definition of \(\psi \) and recalling (4.15), we conclude that for every bounded set B, there exists \(\kappa _B > 0\) such that

    $$\begin{aligned} \text {d}\left( {{\textbf {x}}}, (\mathcal {L}+ {{\textbf {a}}}) \cap K_{\exp }\right) \le \kappa _{B}\mathfrak {g}_{-\infty }({\widehat{\text {d}}}({{\textbf {x}}})), \qquad \forall {{\textbf {x}}}\in B, \end{aligned}$$
    (4.44)

    which shows that an entropic error bound holds. Next, we consider the case \(\mathfrak {g}= |\cdot |^{\alpha }\). Given \(\alpha \in (0,1)\), we have the following one-step facial residual function:

    $$\begin{aligned} \psi _{ \mathcal {K},{{\textbf {z}}}}(\epsilon ,t):= \epsilon + \kappa _{{{\textbf {z}}},t}2^\alpha \epsilon ^\alpha , \end{aligned}$$

    where \(\kappa _{{{\textbf {z}}},t}\) is defined as in (3.16). Invoking Theorem 3.8, we conclude that for every bounded set B, there exists \(\kappa _B > 0\) such that

    $$\begin{aligned} \text {d}\left( {{\textbf {x}}}, (\mathcal {L}+ {{\textbf {a}}}) \cap K_{\exp }\right) \le \kappa _B {\widehat{\text {d}}}({{\textbf {x}}}) + \kappa _{B} {\widehat{\text {d}}}({{\textbf {x}}})^\alpha , \qquad \forall {{\textbf {x}}}\in B. \end{aligned}$$

    In addition, for \({{\textbf {x}}}\in B\), we have \({\widehat{\text {d}}}({{\textbf {x}}}) \le {\widehat{\text {d}}}({{\textbf {x}}})^\alpha {M}\), where \(M = \sup _{{{\textbf {x}}}\in B} {\widehat{\text {d}}}({{\textbf {x}}})^{1-\alpha }\). In conclusion, for \(\kappa = 2\kappa _{B}\max \{M,1\}\), we have

    $$\begin{aligned} \text {d}\left( {{\textbf {x}}}, (\mathcal {L}+ {{\textbf {a}}}) \cap K_{\exp }\right) \le \kappa {\widehat{\text {d}}}({{\textbf {x}}})^\alpha , \qquad \forall {{\textbf {x}}}\in B. \end{aligned}$$

    That is, a uniform Hölderian error bound holds with exponent \(\alpha \).

  2. (b)

    If \( \mathcal {F}= {{\mathcal {F}}}_{\beta }\), with \(\beta \in \mathbb {R}\), then the one-step facial residual function is given by Corollary 4.7, that is, we have

    $$\begin{aligned} \psi _{ \mathcal {K},{{\textbf {z}}}}(\epsilon ,t) := \epsilon + \kappa _{{{\textbf {z}}},t}\sqrt{2} \epsilon ^{1/2}. \end{aligned}$$

    Then, following the same argument as in the second half of item (a), we conclude that a uniform Hölderian error bound holds with exponent 1/2.

  3. (c)

    If \( \mathcal {F}= {{\mathcal {F}}}_{\infty }\), the one-step facial residual functions are given by Corollary 4.11 and they depend on \({{\textbf {z}}}\). Since \( \mathcal {F}= {{\mathcal {F}}}_{\infty }\), we must have \(z_x = 0\) and \(z_z > 0\), see Sect. 4.1.1. The deciding factor is whether \(z_y\) is positive or zero. If \(z_y > 0\), then we have the following one-step facial residual function:

    $$\begin{aligned} \psi _{ \mathcal {K},{{\textbf {z}}}}(\epsilon ,t):= (1+2 \kappa _{{{\textbf {z}}},t})\epsilon , \end{aligned}$$

    where \(\kappa _{{{\textbf {z}}},t}\) is defined as in (3.16). In this case, analogously to items (a) and (b) we have a Lipschitzian error bound. If \(z_y = 0\), we have

    $$\begin{aligned} \psi _{ \mathcal {K},{{\textbf {z}}}}(\epsilon ,t):= \epsilon + \kappa _{{{\textbf {z}}},t}\mathfrak {g}_\infty (2\epsilon ), \end{aligned}$$

    where \(\mathfrak {g}_\infty \) is as in (4.39). Analogous to the proof of item (a) but making use of (4.40) in place of (4.15), we conclude that for every bounded set B, there exists \(\kappa _B > 0\) such that

    $$\begin{aligned} \text {d}\left( {{\textbf {x}}}, (\mathcal {L}+ {{\textbf {a}}}) \cap K_{\exp }\right) \le \kappa _{B}\mathfrak {g}_\infty ({\widehat{\text {d}}}({{\textbf {x}}})), \qquad \forall {{\textbf {x}}}\in B. \end{aligned}$$
    (4.45)
  4. (d)

    See [34, Proposition 27].

\(\square \)

Remark 4.14

(Tightness of Theorem 4.13) We will argue that Theorem 4.13 is tight by showing that for every situation described in item (iii), there is a specific choice of \(\mathcal {L}\) and a sequence \(\{{{\textbf {w}}}^k\}\) in \(\mathcal {L}\backslash K_{\exp }\) with \(\text {d}({{\textbf {w}}}^k,K_{\exp }) \rightarrow 0\) along which the corresponding error bound for \(K_{\exp }\) and \(\mathcal {L}\) is off by at most a multiplicative constant.

  1. (a)

    Let \(\mathcal {L}= \text {span}\,{{\mathcal {F}}}_{-\infty } = \{(x,y,z) \mid y = 0 \}\) (see (4.5)) and consider the sequence \(\{{{\textbf {w}}}^k\}\) where \({{\textbf {w}}}^k = ((1/(k+1))\ln (k+1),0,1)\), for every \(k \in \mathbb {N}\). Then, \(\mathcal {L}\cap K_{\exp }= {{\mathcal {F}}}_{-\infty }\) and we are under the conditions of item (iii)(a) of Theorem 4.13. Since the set \(B {:}{=}\{{{\textbf {w}}}^k\} \subseteq \mathcal {L}\) is bounded, there exists \(\kappa _B > 0\) such that

    $$\begin{aligned} \text {d}\left( {{\textbf {w}}}^k, \mathcal {L}\cap K_{\exp }\right) \le \kappa _{B}\mathfrak {g}_{-\infty }(\text {d}({{\textbf {w}}}^k, K_{\exp })), \quad \forall k \in \mathbb {N}. \end{aligned}$$

    Then, the projection of \({{\textbf {w}}}^k\) onto \({{\mathcal {F}}}_{-\infty }\) is given by (0, 0, 1). Therefore,

    $$\begin{aligned} \frac{\ln (k+1)}{k+1} = \text {d}({{\textbf {w}}}^k,\mathcal {L}\cap K_{\exp }) \le \kappa _{B}\mathfrak {g}_{-\infty }(\text {d}({{\textbf {w}}}^k, K_{\exp })). \end{aligned}$$

    Let \({{\textbf {v}}}^k = ((1/(k+1))\ln (k+1),1/(k+1),1)\) for every k. Then, we have \({{\textbf {v}}}^k \in K_{\exp }\). Therefore, \(\text {d}({{\textbf {w}}}^k, K_{\exp }) \le 1/(k+1)\). In view of the definition of \(\mathfrak {g}_{-\infty }\) (see (4.14)), we conclude that for large enough k we have

    $$\begin{aligned} \frac{\ln (k+1)}{k+1} = \text {d}({{\textbf {w}}}^k,\mathcal {L}\cap K_{\exp }) \le \kappa _{B}\mathfrak {g}_{-\infty }(\text {d}({{\textbf {w}}}^k, K_{\exp })) \le \kappa _B\frac{\ln (k+1)}{k+1}. \end{aligned}$$

    Thus, it holds that for all sufficiently large k,

    $$\begin{aligned} 1\le \frac{\text {d}({{\textbf {w}}}^k,\mathcal {L}\cap K_{\exp })}{\mathfrak {g}_{-\infty }(\text {d}({{\textbf {w}}}^k,K_{\exp }))} \le \kappa _B. \end{aligned}$$

    Consequently, for any given nonnegative function \(\mathfrak {g}:\mathbb {R}_+\rightarrow \mathbb {R}_+\) such that \(\lim _{t\downarrow 0}\frac{\mathfrak {g}(t)}{\mathfrak {g}_{-\infty }(t)}=0\), we have upon noting \(\text {d}({{\textbf {w}}}^k,K_{\exp })\rightarrow 0\) that

    $$\begin{aligned} \frac{\text {d}({{\textbf {w}}}^k,\mathcal {L}\cap K_{\exp })}{\mathfrak {g}(\text {d}({{\textbf {w}}}^k,K_{\exp }))} = \frac{\text {d}({{\textbf {w}}}^k,\mathcal {L}\cap K_{\exp })}{\mathfrak {g}_{-\infty }(\text {d}({{\textbf {w}}}^k,K_{\exp }))}\frac{\mathfrak {g}_{-\infty }(\text {d}({{\textbf {w}}}^k,K_{\exp }))}{\mathfrak {g}(\text {d}({{\textbf {w}}}^k,K_{\exp }))} \rightarrow \infty , \end{aligned}$$

    which shows that the choice of \(\mathfrak {g}_{-\infty }\) in the error bound is tight.

  2. (b)

    Let \(\beta \in \mathbb {R}\) and let \(\widehat{{{\textbf {z}}}}\), \(\widehat{{{\textbf {p}}}}\) and \(\widehat{{{\textbf {f}}}}\) be as in (4.24). Let \(\mathcal {L}= \{{{\textbf {z}}}\}^\perp \) with \(z_x < 0\) such that \(K_{\exp }\cap \mathcal {L}= {{\mathcal {F}}}_{\beta }\). We are then under the conditions of item (iii)(b) of Theorem 4.13. We consider the following sequences

    $$\begin{aligned} {{\textbf {v}}}^k = \begin{bmatrix} 1-\beta +1/k\\ 1\\ e^{1-\beta + 1/k} \end{bmatrix},\quad {{\textbf {w}}}^k = P_{\{{{\textbf {z}}}\}^\perp }{{\textbf {v}}}^k,\quad {{\textbf {u}}}^k = P_{{{\mathcal {F}}}_\beta }{{\textbf {w}}}^k. \end{aligned}$$

    For every k we have \({{\textbf {v}}}^k \in \partial K_{\exp }\setminus {{\mathcal {F}}}_{\beta }\), and \({{\textbf {v}}}^k \ne {{\textbf {w}}}^k\) (because otherwise, we would have \({{\textbf {v}}}^k \in K_{\exp }\cap \{{{\textbf {z}}}\}^\perp = {{\mathcal {F}}}_{\beta }\)). In addition, we have \({{\textbf {v}}}^k \rightarrow \widehat{{{\textbf {f}}}}\) and, since \(\widehat{{{\textbf {f}}}}\in {{\mathcal {F}}}_{\beta }\), we have \({{\textbf {w}}}^k \rightarrow \widehat{{{\textbf {f}}}}\) as well. Next, notice that we have \(\langle \widehat{{{\textbf {f}}}}, {{\textbf {v}}}^k \rangle \ge 0\) for k sufficiently large and \(|v_x^k/v_y^k - (1-\beta )| \rightarrow 0\). Then, following the computations outlined in case (I) of the proof of Theorem 4.6 and letting \(\zeta _k{:}{=}v_x^k/v_y^k\), we have from (4.30) and (4.31) that \(h_2(\zeta _k)\ne 0\) for all large k (hence, \({{\textbf {w}}}^k\ne {{\textbf {u}}}^k\) for all large k), and that

    $$\begin{aligned} L_{\beta } {:}{=}\lim _{k \rightarrow \infty }\frac{\Vert {{\textbf {w}}}^k-{{\textbf {v}}}^k\Vert ^{\frac{1}{2}}}{\Vert {{\textbf {w}}}^k-{{\textbf {u}}}^k\Vert } = \lim _{k \rightarrow \infty }\frac{\Vert \widehat{{{\textbf {p}}}}\Vert }{\Vert \widehat{{{\textbf {z}}}}\Vert ^{\frac{1}{2}}}\frac{|h_1(\zeta _k)|^{\frac{1}{2}}}{|h_2(\zeta _k)|} = \frac{\Vert \widehat{{{\textbf {p}}}}\Vert }{\Vert \widehat{{{\textbf {z}}}}\Vert ^{\frac{1}{2}}}\frac{1}{\sqrt{2}(e^{\beta -1} + (\beta ^2+1)e^{1-\beta })} \in (0,\infty ), \end{aligned}$$
    (4.46)

    where the latter equality is from (4.32). On the other hand, from item (iii)(b) of Theorem 4.13, for \(B {:}{=}\{{{\textbf {w}}}^k\}\), there exists \(\kappa _B > 0\) such that for all \(k\in \mathbb {N}\),

    $$\begin{aligned} \Vert {{\textbf {w}}}^k-{{\textbf {u}}}^k\Vert = \text {d}({{\textbf {w}}}^k, \mathcal {L}\cap K_{\exp }) \le \kappa _B \text {d}({{\textbf {w}}}^k,K_{\exp })^{\frac{1}{2}} \le \kappa _B\Vert {{\textbf {w}}}^k-{{\textbf {v}}}^k\Vert ^{\frac{1}{2}}. \end{aligned}$$

    However, from (4.46), for large enough k, we have \(\Vert {{\textbf {w}}}^k-{{\textbf {u}}}^k\Vert \ge \frac{1}{2L_{\beta }}\Vert {{\textbf {w}}}^k-{{\textbf {v}}}^k\Vert ^{\frac{1}{2}}\). Therefore, for large enough k we have

    $$\begin{aligned} \frac{1}{2L_{\beta }}\Vert {{\textbf {w}}}^k-{{\textbf {v}}}^k\Vert ^{\frac{1}{2}} \le \text {d}({{\textbf {w}}}^k, \mathcal {L}\cap K_{\exp })\le \kappa _B \text {d}({{\textbf {w}}}^k,K_{\exp })^{\frac{1}{2}} \le \kappa _B\Vert {{\textbf {w}}}^k-{{\textbf {v}}}^k\Vert ^{\frac{1}{2}}. \end{aligned}$$

    Consequently, it holds that for all large enough k,

    $$\begin{aligned} \frac{1}{2L_\beta }\le \frac{\text {d}({{\textbf {w}}}^k,\mathcal {L}\cap K_{\exp })}{\text {d}({{\textbf {w}}}^k,K_{\exp })^\frac{1}{2}} \le \kappa _B. \end{aligned}$$

    Arguing similarly as in case (a), we can also conclude that the choice of \(|\cdot |^\frac{1}{2}\) in the error bound is tight.

  3. (c)

    Let \({{\textbf {z}}}= (0,0,1)\) and \(\mathcal {L}= \{(x,y,0) \mid x,y \in \mathbb {R}\} = \{{{\textbf {z}}}\}^\perp \). Then, from (4.12), we have \(\mathcal {L}\cap K_{\exp }= {{\mathcal {F}}}_{\infty }\). We are then under case (iii)(c) of Theorem 4.13. Because there is no \({\hat{{{\textbf {z}}}}} \in \mathcal {L}^\perp \) with \({\hat{z}} _y > 0\), we have a log-type error bound as in (4.45). We proceed as in item (a) using sequences such that \({{\textbf {w}}}^k=(-1,1/k,0)\), \({{\textbf {v}}}^k=(-1,1/k,(1/k)e^{-k})\), \({{\textbf {u}}}^k=(-1,0,0)\), for every k. Note that \({{\textbf {w}}}^k \in \mathcal {L}, {{\textbf {v}}}^k \in K_{\exp }\) and \({\text {P} }_{\negthinspace \negthinspace {{\mathcal {F}}}_\infty }({{\textbf {w}}}^k) = {{\textbf {u}}}^k\), for every k. Therefore, there exists \(\kappa _B > 0\) such that

    $$\begin{aligned} \frac{1}{k} = \text {d}({{\textbf {w}}}^k, \mathcal {L}\cap K_{\exp }) \le \kappa _B \mathfrak {g}_{\infty }(\text {d}({{\textbf {w}}}^k,K_{\exp }))\le \kappa _{B}\mathfrak {g}_{\infty }\left( \frac{1}{ke^k}\right) , \quad \forall k \in \mathbb {N}. \end{aligned}$$

    In view of the definition of \(\mathfrak {g}_{\infty }\) (see (4.39)), there exists \(L > 0\) such that for large enough k we have

    $$\begin{aligned} \frac{1}{k} = \text {d}({{\textbf {w}}}^k, \mathcal {L}\cap K_{\exp }) \le \kappa _B \mathfrak {g}_{\infty }(\text {d}({{\textbf {w}}}^k,K_{\exp })) \le \frac{L}{k}. \end{aligned}$$

    Consequently, it holds that for all large enough k,

    $$\begin{aligned} \frac{\kappa _B}{L}\le \frac{\text {d}({{\textbf {w}}}^k,\mathcal {L}\cap K_{\exp })}{\mathfrak {g}_{\infty }(\text {d}({{\textbf {w}}}^k,K_{\exp }))} \le \kappa _B. \end{aligned}$$

    Arguing similarly as in case (a), we conclude that the choice of \(\mathfrak {g}_{\infty }\) is tight.

Note that a Lipschitzian error bound is always tight up to a constant, because \(\text {d}({{\textbf {x}}}, \mathcal {K}\cap (\mathcal {L}+{{\textbf {a}}})) \ge \max \{\text {d}({{\textbf {x}}}, \mathcal {K}),\text {d}({{\textbf {x}}},\mathcal {L}+{{\textbf {a}}})\}\). Therefore, the error bounds in items (ii), (iii)(d) and in the first half of (iii)(c) are tight.
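As a quick numerical check of the two non-Lipschitzian cases above (ours, not part of the formal development), the sketch below assumes that near zero \(\mathfrak {g}_{-\infty }(t)\) behaves like \(-t\ln t\) and \(\mathfrak {g}_{\infty }(t)\) like \(-1/\ln t\), which are the scales suggested by (4.14) and (4.39) although we do not reproduce those definitions here, and evaluates the ratios from items (a) and (c):

```python
import math

# Hypothetical stand-ins for the growth functions of (4.14) and (4.39);
# we ASSUME g_minus_inf(t) = -t*ln(t) and g_inf(t) = -1/ln(t) for small t > 0.
def g_minus_inf(t):
    return -t * math.log(t)

def g_inf(t):
    return -1.0 / math.log(t)

# Item (a): w^k = (ln(k+1)/(k+1), 0, 1), so d(w^k, L ∩ K_exp) = ln(k+1)/(k+1)
# and d(w^k, K_exp) <= ||w^k - v^k|| = 1/(k+1).
for k in [10, 1000, 10**6]:
    d_int = math.log(k + 1) / (k + 1)
    ratio = d_int / g_minus_inf(1.0 / (k + 1))
    assert abs(ratio - 1.0) < 1e-9   # the entropic scale matches exactly here

# Item (c): w^k = (-1, 1/k, 0), so d(w^k, L ∩ K_exp) = 1/k and
# d(w^k, K_exp) <= ||w^k - v^k|| = exp(-k)/k.
for k in [10, 100, 500]:
    ratio = (1.0 / k) / g_inf(math.exp(-k) / k)
    assert 1.0 <= ratio <= 2.0       # bounded above and below, as claimed
```

In both loops the ratio stays pinned between explicit constants, mirroring the two-sided bounds derived above with the surrogate \(\Vert {{\textbf {w}}}^k-{{\textbf {v}}}^k\Vert \) in place of the exact distance to the cone.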

Sometimes we may need to consider direct products of multiple copies of \(K_{\exp }\) in order to model certain problems, i.e., our problem of interest could have the following form:

$$\begin{aligned} \text {find} \quad {{\textbf {x}}}\in (\mathcal {L}+ {{\textbf {a}}}) \cap \mathcal {K}, \end{aligned}$$

where \( \mathcal {K}= K_{\exp }\times \cdots \times K_{\exp }\) is a direct product of m exponential cones.

Fortunately, we already have all the tools required to extend Theorem 4.13 and compute error bounds for this case too. We recall that the faces of a direct product of cones are direct products of the faces of the individual cones. Therefore, using Proposition 3.13, we are able to compute all the necessary one-step facial residual functions for \( \mathcal {K}\). Once they are obtained we can invoke Theorem 3.8. Unfortunately, there are quite a number of different cases one must consider, so we cannot give a concise statement of an all-encompassing tight error bound result.

We will, however, give an error bound result under the following simplifying assumption of non-exceptionality or SANE.

Assumption 4.15

(SANE: simplifying assumption of non-exceptionality) Suppose (Feas) is feasible with \( \mathcal {K}= K_{\exp }\times \cdots \times K_{\exp }\) being a direct product of m exponential cones. We say that \( \mathcal {K}\) and \(\mathcal {L}+{{\textbf {a}}}\) satisfy the simplifying assumption of non-exceptionality (SANE) if there exists a chain of faces \( \mathcal {F}_{\ell } \subsetneq \cdots \subsetneq \mathcal {F}_1 = \mathcal {K}\) as in Proposition 3.2 with \(\ell - 1 = {d_{\text {PPS}}( \mathcal {K},\mathcal {L}+{{\textbf {a}}})}\) such that for all i, the exceptional face \({{\mathcal {F}}}_{\infty }\) of \(K_{\exp }\) never appears as one of the blocks of \( \mathcal {F}_{i}\).

Remark 4.16

(SANE is not unreasonable) In many modelling applications of the exponential cone presented in [37, Chapter 5], translating to our notation, the \({{\textbf {y}}}\) variable is fixed to be 1 in (4.1). For example, the hypograph of the logarithm function “\(x \le \ln (z)\)” can be represented as the constraint “\((x,y,z) \in K_{\exp }\cap (\mathcal {L}+{{\textbf {a}}})\)”, where \(\mathcal {L}+{{\textbf {a}}}= \{(x,y,z) \mid y = 1\}\). Because the y variable is fixed to be 1, the feasible region does not intersect the 2D face \({{\mathcal {F}}}_{-\infty }\) or its subfaces \({{\mathcal {F}}}_{\infty }\) and \({{\mathcal {F}}}_{ne}\). In particular, SANE is satisfied. More generally, if \( \mathcal {K}\) is a direct product of exponential cones and the affine space \(\mathcal {L}+{{\textbf {a}}}\) is such that the \({{\textbf {y}}}\) components of each block are fixed positive constants, then \( \mathcal {K}\) and \(\mathcal {L}+{{\textbf {a}}}\) satisfy SANE.

On the other hand, problems involving the relative entropy \(D(x,y) {:}{=}x \ln (x/y)\) are often modelled as “minimize t” subject to “\((-t,x,y) \in K_{\exp }\)” and additional constraints. We could also have sums so that the problem is of the form “minimize \(\sum t_i\)” subject to “\((-t_i,x_i,y_i) \in K_{\exp }\)” and additional constraints. In those cases, it seems that it could happen that SANE is not satisfied.
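To make the modelling pattern of this remark concrete, the following sketch checks membership in \(K_{\exp }\) numerically; we assume the parametrization of (4.1) is the standard one, namely the closure of \(\{(x,y,z) : y > 0,\; y e^{x/y} \le z\}\), and the function name and tolerance are ours, purely for illustration.

```python
import math

def in_K_exp(x, y, z, tol=1e-12):
    """Membership in the exponential cone, ASSUMING the standard
    parametrization cl{(x, y, z) : y > 0, y*exp(x/y) <= z} of (4.1)."""
    if y > tol:
        return y * math.exp(x / y) <= z + tol
    if abs(y) <= tol:
        # boundary part with y = 0: requires x <= 0 and z >= 0
        return x <= tol and z >= -tol
    return False

# "x <= ln(z)" modelled by fixing y = 1: (x, 1, z) in K_exp iff exp(x) <= z
assert in_K_exp(math.log(5.0), 1.0, 5.0)
assert not in_K_exp(2.0, 1.0, 5.0)   # since exp(2) > 5
```

Fixing \(y = 1\) as in the hypograph model keeps every feasible point in the branch with \(y > 0\), away from the faces with \(y = 0\), which is the mechanism behind SANE in that setting.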

Under SANE, we can state the following result.

Theorem 4.17

(Error bounds for direct products of exponential cones) Suppose (Feas) is feasible with \( \mathcal {K}= K_{\exp }\times \cdots \times K_{\exp }\) being a direct product of m exponential cones. Then the following hold.

  1. (i)

    The distance to the PPS condition of \(\{ \mathcal {K}, \mathcal {L}+{{\textbf {a}}}\}\) satisfies \(d_{\text {PPS}}( \mathcal {K},\mathcal {L}+{{\textbf {a}}}) \le m\).

  2. (ii)

    If SANE is satisfied, then \( \mathcal {K}\) and \(\mathcal {L}+{{\textbf {a}}}\) satisfy a uniform Hölderian error bound with exponent \(2^{-d_{\text {PPS}}( \mathcal {K},\mathcal {L}+{{\textbf {a}}})}\).

Proof

  1. (i):

    All proper faces of \(K_{\exp }\) are polyhedral, therefore \(\ell _{\text {poly}}(K_{\exp }) = 1\). By item  of Proposition 3.2, there exists a chain of length \(\ell \) satisfying item  of Proposition 3.2 such that \(\ell -1 \le m\). Therefore, \(d_{\text {PPS}}( \mathcal {K},\mathcal {L}+{{\textbf {a}}})\le \ell -1 \le m\).

  2. (ii):

    If SANE is satisfied, then there exists a chain \( \mathcal {F}_{\ell } \subsetneq \cdots \subsetneq \mathcal {F}_1 = \mathcal {K}\) of length \(\ell \le m +1\) as in Proposition 3.2, together with the corresponding \({{\textbf {z}}}_{1},\ldots ,{{\textbf {z}}}_{\ell -1}\). Also, the exceptional face \({{\mathcal {F}}}_{\infty }\) never appears as one of the blocks of the \( \mathcal {F}_i\).

In what follows, for simplicity, we define

$$\begin{aligned} {\widehat{\text {d}}}({{\textbf {x}}}) {:}{=}\max \{\text {d}({{\textbf {x}}},\mathcal {L}+{{\textbf {a}}}), \text {d}({{\textbf {x}}}, \mathcal {K}) \}. \end{aligned}$$

Then, we invoke Theorem 3.8, which implies that given a bounded set B, there exists a constant \(\kappa _B > 0\) such that

$$\begin{aligned} \text {d}\left( {{\textbf {x}}}, (\mathcal {L}+ {{\textbf {a}}}) \cap \mathcal {K}\right) \le \kappa _B ({\widehat{\text {d}}}({{\textbf {x}}})+\varphi ({\widehat{\text {d}}}({{\textbf {x}}}),M)), \end{aligned}$$
(4.47)

where \(M = \sup _{{{\textbf {x}}}\in B} \Vert {{\textbf {x}}}\Vert \) and there are two cases for \(\varphi \). If \(\ell = 1\), \(\varphi \) is the function such that \(\varphi (\epsilon ,M) = \epsilon \). If \(\ell \ge 2\), we have \(\varphi = \psi _{{\ell -1}}\diamondsuit \cdots \diamondsuit \psi _{{1}}\), where \(\psi _{i}\) is a (suitable positively rescaled shift of a) one-step facial residual function for \( \mathcal {F}_{i}\) and \({{\textbf {z}}}_i\). In the former case, the PPS condition is satisfied, so we have a Lipschitzian error bound and we are done. We therefore assume that the latter case occurs with \(\ell - 1 = {d_{\text {PPS}}( \mathcal {K},\mathcal {L}+{{\textbf {a}}})}\).

First, we compute the one-step facial residual functions for each \( \mathcal {F}_i\). In order to do that, we recall that each \( \mathcal {F}_{i}\) is a direct product \( \mathcal {F}_{i}^1\times \cdots \times \mathcal {F}_{i}^m\) where each \( \mathcal {F}_{i}^j\) is a face of \(K_{\exp }\), excluding \({{\mathcal {F}}}_{\infty }\) by SANE. Therefore, a one-step facial residual function for \( \mathcal {F}_{i}^j\) can be obtained from Corollary 4.4, 4.7 or 4.12. In particular, taking the worst case into consideration, and taking the maximum of the facial residual functions, there exists a nonnegative monotone nondecreasing function \(\rho :\mathbb {R}_+ \rightarrow \mathbb {R}_+\) such that the function \(\psi \) given by

$$\begin{aligned} \psi (\epsilon ,t) {:}{=}\rho (t) \epsilon +\rho (t)\sqrt{\epsilon } \end{aligned}$$

is a one-step facial residual function for each \( \mathcal {F}_{i}^j\). In what follows, in order to simplify the notation, we define \({\hat{\mathfrak {g}}}(t) {:}{=}\sqrt{t}\). Also, for every j, we use \({\hat{\mathfrak {g}}}_j\) to denote the composition of \({\hat{\mathfrak {g}}}\) with itself j times, i.e.,

$$\begin{aligned} {\hat{\mathfrak {g}}}_j = \underbrace{{\hat{\mathfrak {g}}}\circ \cdots \circ {\hat{\mathfrak {g}}}}_{j \text { times}}; \end{aligned}$$
(4.48)

and we set \({\hat{\mathfrak {g}}}_0\) to be the identity map.
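As an aside of ours, not part of the formal development, both the closed form \({\hat{\mathfrak {g}}}_j(t) = t^{2^{-j}}\) and the domination of lower compositions by higher ones on \([0,1]\), which underlies (4.50), can be sanity-checked numerically:

```python
import math

def g_hat_j(t, j):
    # j-fold composition of \hat{g}(t) = sqrt(t) with itself; g_hat_0 = identity
    for _ in range(j):
        t = math.sqrt(t)
    return t

for t in [1e-10, 0.25, 0.9]:
    for j in range(6):
        # closed form: \hat{g}_j(t) = t^(2^{-j})
        assert abs(g_hat_j(t, j) - t ** (2.0 ** (-j))) < 1e-12
        # for t in [0, 1] and k <= j we have \hat{g}_k(t) <= \hat{g}_j(t):
        # the informal principle behind (4.50), with kappa_hat = 1 on [0, 1]
        for k in range(j + 1):
            assert g_hat_j(t, k) <= g_hat_j(t, j) + 1e-15
```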

Using the above notation and Proposition 3.13, we conclude the existence of a nonnegative monotone nondecreasing function \(\sigma : \mathbb {R}_+ \rightarrow \mathbb {R}_+\) such that the function \(\psi _{i}\) given by

$$\begin{aligned} \psi _{i}(\epsilon ,t) {:}{=}\sigma (t)\epsilon + \sigma (t){\hat{\mathfrak {g}}}{(\epsilon )} \end{aligned}$$

is a one-step facial residual function for \( \mathcal {F}_i\) and \({{\textbf {z}}}_i\). Therefore, for \({{\textbf {x}}}\in B\), we have

$$\begin{aligned} \psi _{i}(\epsilon ,\Vert {{\textbf {x}}}\Vert ) \le \sigma (M)\epsilon + \sigma (M){\hat{\mathfrak {g}}}{(\epsilon )} = \psi _{i}(\epsilon ,M), \end{aligned}$$
(4.49)

where \(M = \sup _{{{\textbf {x}}}\in B} \Vert {{\textbf {x}}}\Vert \).

Next we are going to make a series of arguments related to the following informal principle: over a bounded set only the terms \({\hat{\mathfrak {g}}}_j\) with largest j matter. We start by noting that for any \({{\textbf {x}}}\in B\) and any \(0\le k\le j\le \ell \),

$$\begin{aligned} {\hat{\mathfrak {g}}}_k({\widehat{\text {d}}}({{\textbf {x}}})) = {\widehat{\text {d}}}({{\textbf {x}}})^{2^{-k}} = {\widehat{\text {d}}}({{\textbf {x}}})^{(2^{-k} - 2^{-j})}{\widehat{\text {d}}}({{\textbf {x}}})^{2^{-j}} \le {\hat{\kappa }}_{j,k}{\widehat{\text {d}}}({{\textbf {x}}})^{2^{-j}} \le {\hat{\kappa }}{\hat{\mathfrak {g}}}_j({\widehat{\text {d}}}({{\textbf {x}}})), \end{aligned}$$
(4.50)

where \({\hat{\kappa }}_{j,k}:= \sup _{{{\textbf {x}}}\in B}{\widehat{\text {d}}}({{\textbf {x}}})^{(2^{-k} - 2^{-j})} < \infty \) because \({{\textbf {x}}}\mapsto {\widehat{\text {d}}}({{\textbf {x}}})^{(2^{-k} - 2^{-j})}\) is continuous and B is bounded, and \({\hat{\kappa }} := \max _{0\le k\le j\le \ell }{\hat{\kappa }}_{j,k}\).

Now, let \(\varphi _j {:}{=}\psi _{{j}}\diamondsuit \cdots \diamondsuit \psi _{{1}}\), where \(\diamondsuit \) is the diamond composition defined in (3.3). We will show by induction that for every \(j \le \ell -1\) there exists \(\kappa _j\) such that

$$\begin{aligned} \varphi _j({\widehat{\text {d}}}({{\textbf {x}}}),M) \le \kappa _j{\hat{\mathfrak {g}}}_{j}({\widehat{\text {d}}}({{\textbf {x}}})), \qquad \forall {{\textbf {x}}}\in B. \end{aligned}$$
(4.51)

For \(j = 1\), it follows directly from (4.49) and (4.50). Now, suppose that the claim is valid for some j such that \(j+1 \le \ell -1\). By the inductive hypothesis, we have

$$\begin{aligned} \varphi _{j+1}({\widehat{\text {d}}}({{\textbf {x}}}),M)&= \psi _{j+1}({\widehat{\text {d}}}({{\textbf {x}}})+ \varphi _{j}({\widehat{\text {d}}}({{\textbf {x}}}),M),M) \nonumber \\&\le \psi _{j+1}({\widehat{\text {d}}}({{\textbf {x}}})+ \kappa _j{\hat{\mathfrak {g}}}_{j}({\widehat{\text {d}}}({{\textbf {x}}})),M) \nonumber \\&\le \psi _{j+1}(\tilde{\kappa }_j{\hat{\mathfrak {g}}}_{j}({\widehat{\text {d}}}({{\textbf {x}}})),M), \end{aligned}$$
(4.52)

where \(\tilde{\kappa }_j {:}{=}2\max \{{\hat{\kappa }},\kappa _j\}\) and the last inequality follows from (4.50). Then, we plug \(\epsilon = \tilde{\kappa }_j{\hat{\mathfrak {g}}}_{j}({\widehat{\text {d}}}({{\textbf {x}}}))\) in (4.49) to obtain

$$\begin{aligned} \psi _{j+1}(\tilde{\kappa }_j{\hat{\mathfrak {g}}}_{j}({\widehat{\text {d}}}({{\textbf {x}}})),M)&= \sigma (M)\tilde{\kappa }_j{\hat{\mathfrak {g}}}_{j}({\widehat{\text {d}}}({{\textbf {x}}})) + \sigma (M){\hat{\mathfrak {g}}}(\tilde{\kappa }_j{\hat{\mathfrak {g}}}_{j}({\widehat{\text {d}}}({{\textbf {x}}}))) \nonumber \\&= \sigma (M) \tilde{\kappa }_j {\hat{\mathfrak {g}}}_{j}({\widehat{\text {d}}}({{\textbf {x}}})) + \sigma (M) \sqrt{\tilde{\kappa }_j} {\hat{\mathfrak {g}}}_{j+1}({\widehat{\text {d}}}({{\textbf {x}}})) \nonumber \\&\le \sigma (M)(\tilde{\kappa }_j{\hat{\kappa }} + \sqrt{\tilde{\kappa }_j}){\hat{\mathfrak {g}}}_{j+1}({\widehat{\text {d}}}({{\textbf {x}}})) , \end{aligned}$$
(4.53)

where the last inequality follows from (4.50). Combining (4.52) and (4.53) concludes the induction proof. In particular, (4.51) is valid for \(j = \ell -1\). Then, taking into account some positive rescaling and shifting (see (3.2)) and adjusting constants, from (4.47), (4.51) and (4.50) we deduce that there exists \(\kappa > 0\) such that

$$\begin{aligned} \text {d}\left( {{\textbf {x}}}, (\mathcal {L}+ {{\textbf {a}}}) \cap \mathcal {K}\right) \le \kappa {\hat{\mathfrak {g}}}_{\ell -1}({\widehat{\text {d}}}({{\textbf {x}}})), \qquad \forall {{\textbf {x}}}\in B \end{aligned}$$

with \({\hat{\mathfrak {g}}}_{\ell -1}\) as in (4.48). To complete the proof, we recall that \({d_{\text {PPS}}( \mathcal {K},\mathcal {L}+{{\textbf {a}}})} = \ell -1\). \(\square \)

Remark 4.18

(Variants of Theorem 4.17) Theorem 4.17 is not tight and admits variants that are somewhat cumbersome to describe precisely. For example, the \(\mathfrak {g}_{-\infty }\) function was not taken into account explicitly but simply “relaxed” to \(t\mapsto \sqrt{t}\).

Going for greater generality, we can also drop the SANE assumption altogether and try to be as tight as our analysis permits when dealing with possibly inSANE instances. Although there are several possibilities one must consider, the overall strategy is the same as outlined in the proof of Theorem 4.17: invoke Theorem 3.8, fix a bounded set B, pick a chain of faces as in Proposition 3.2 and upper bound the diamond composition of facial residual functions as in (4.51). Intuitively, whenever sums of function compositions appear, only the “higher” compositions matter. However, the analysis must consider the possibility of \(\mathfrak {g}_{-\infty }\) or \(\mathfrak {g}_{\infty }\) appearing. After this is done, it is just a matter of plugging this upper bound into (4.47).
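To illustrate the strategy on a toy instance, the sketch below composes two one-step functions of the form \(\sigma \epsilon + \sigma \sqrt{\epsilon }\) via the recursion in (4.52), which is how the diamond composition is evaluated there, and checks that over a bounded range the result is dominated by a multiple of \(\epsilon ^{1/4} = {\hat{\mathfrak {g}}}_2(\epsilon )\); the constants are ours and purely illustrative.

```python
import math

SIGMA = 2.0  # illustrative stand-in for the constant sigma(M)

def psi(eps):
    # one-step facial residual function of the shape used in Theorem 4.17
    return SIGMA * eps + SIGMA * math.sqrt(eps)

def diamond(psi_outer, psi_inner):
    # diamond composition as it appears in the recursion (4.52):
    # (psi_outer ◊ psi_inner)(eps) = psi_outer(eps + psi_inner(eps))
    return lambda eps: psi_outer(eps + psi_inner(eps))

phi2 = diamond(psi, psi)
# over the bounded range below, phi2 is dominated by a multiple of eps^(1/4):
# only the "highest" composition of the square root matters
for e in [10.0 ** (-i) for i in range(1, 13)]:
    assert phi2(e) <= 7.0 * e ** 0.25
```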

We conclude this subsection with an application. In [11], among other results, the authors showed that when a Hölderian error bound holds, it is possible to derive the convergence rate of several algorithms from the exponent of the error bound. As a consequence, Theorems 4.13 and 4.17 allow us to apply some of their results (e.g., [11, Corollary 3.8]) to the conic feasibility problem with exponential cones, whenever a Hölderian error bound holds. For non-Hölderian error bounds appearing in Theorem 4.13, different techniques are necessary, such as the ones discussed in [33] for deriving convergence rates under more general error bounds.

4.4 Miscellaneous odd behavior and connections to other notions

In this final subsection, we collect several instances of pathological behavior that can be found inside the facial structure of the exponential cone.

Example 4.19

(Hölderian bounds and the non-attainment of admissible exponents) We recall Definition 2.1 and consider the special case of two closed convex sets \(C_1,C_2\) with non-empty intersection. We say that \(\gamma \in (0,1]\) is an admissible exponent for \(C_1, C_2\) if \(C_1\) and \(C_2\) satisfy a uniform Hölderian error bound with exponent \(\gamma \). It turns out that the supremum of the set of admissible exponents need not itself be admissible. In particular, if \(C_1 = K_{\exp }\) and \(C_2 = \text {span}\,{{\mathcal {F}}}_{-\infty }\), then we see from Corollary 4.3 that \(C_1 \cap C_2 = {{\mathcal {F}}}_{-\infty }\) and that \(C_1\) and \(C_2\) satisfy a uniform Hölderian error bound for all \(\gamma \in (0,1)\); however, in view of the sequence constructed in Remark 4.14(a), the exponent cannot be chosen to be \(\gamma = 1\).

In fact, from Theorem 4.13 and Remark 4.14(a), \(C_1\) and \(C_2\) satisfy an entropic error bound which is tight and is, in a sense, better than any Hölderian error bound with \(\gamma \in (0,1)\) but worse than a Lipschitzian error bound.

Example 4.20

(Non-Hölderian error bound) The facial structure of \(K_{\exp }\) can be used to derive an example of two sets that provably do not have a Hölderian error bound. Let \(C_1 = K_{\exp }\) and \(C_2 = \{{{\textbf {z}}}\}^\perp \), where \(z_x=z_y = 0\) and \(z_z=1\) so that \(C_1\cap C_2={{\mathcal {F}}}_\infty \). Then, for every \(\eta > 0\) and every \(\alpha \in (0,1]\), there is no constant \(\kappa > 0\) such that

$$\begin{aligned} \text {d}({{\textbf {x}}},{{\mathcal {F}}}_\infty ) \le \kappa \max \{\text {d}({{\textbf {x}}},K_{\exp })^\alpha , \text {d}({{\textbf {x}}},\{{{\textbf {z}}}\}^\perp )^\alpha \}, \qquad \forall \ {{\textbf {x}}}\in B(\eta ). \end{aligned}$$

This is because if there were such a positive \(\kappa \), the infimum in Lemma 4.9 would be positive, which it is not. This shows that \(C_1\) and \(C_2\) do not have a Hölderian error bound. However, as seen in Theorem 4.10, \(C_1\) and \(C_2\) have a log-type error bound. In particular if \({{\textbf {q}}}\in B(\eta )\), using (2.1), (2.2) and Theorem 4.10, we have

$$\begin{aligned} \text {d}({{\textbf {q}}}, {{\mathcal {F}}}_\infty )&\le \text {d}({{\textbf {q}}},\{{{\textbf {z}}}\}^\perp ) + \text {d}(P_{\{{{\textbf {z}}}\}^\perp }({{\textbf {q}}}),{{\mathcal {F}}}_\infty ) \nonumber \\&\le \text {d}({{\textbf {q}}},\{{{\textbf {z}}}\}^\perp ) + \max \{2,2\gamma _{{{\textbf {z}}},\eta }^{-1}\}\mathfrak {g}_\infty (\text {d}(P_{\{{{\textbf {z}}}\}^\perp }({{\textbf {q}}}),K_{\exp })) \nonumber \\&\le {\widehat{\text {d}}}({{\textbf {q}}}) + \max \{2,2\gamma _{{{\textbf {z}}},\eta }^{-1}\}\mathfrak {g}_\infty (2{\widehat{\text {d}}}({{\textbf {q}}})) , \end{aligned}$$
(4.54)

where \({\widehat{\text {d}}}({{\textbf {q}}}) {:}{=}\max \{\text {d}({{\textbf {q}}},K_{\exp }),\text {d}({{\textbf {q}}},\{{{\textbf {z}}}\}^\perp ) \}\) and in the last inequality we used the monotonicity of \(\mathfrak {g}_\infty \).

Let \(C_1, \ldots , C_m\) be closed convex sets having nonempty intersection and let \(C {:}{=}\cap _{i=1}^m C_i\). Following [33], we say that \(\varphi : \mathbb {R}_+\times \mathbb {R}_+ \rightarrow \mathbb {R}_+ \) is a consistent error bound function (CEBF) for \(C_1, \ldots , C_m\) if the following inequality holds

$$\begin{aligned} \text {d}({{\textbf {x}}},\, C) \le \varphi \left( \max _{1 \le i \le m}\text {d}({{\textbf {x}}}, C_i), \, \Vert {{\textbf {x}}}\Vert \right) \ \ \ \forall \ {{\textbf {x}}}\in \mathcal {E}; \end{aligned}$$

and the following technical conditions are satisfied for every \(a,b\in \mathbb {R}_+\): \(\varphi (\cdot ,b)\) is monotone nondecreasing, right-continuous at 0 and \(\varphi (0,b) = 0\); \(\varphi (a,\cdot )\) is monotone nondecreasing. CEBFs are a framework for expressing error bounds and can be used in the convergence analysis of algorithms for convex feasibility problems, see [33, Sects. 3 and 4]. For example, \(C_1,\ldots , C_m\) satisfy a Hölderian error bound (Definition 2.1) if and only if these sets admit a CEBF of the format \(\varphi (a,b) {:}{=}\rho (b)\max \{a,a^{\gamma (b)}\}\), where \(\rho :\mathbb {R}_+ \rightarrow \mathbb {R}_+\) and \(\gamma :\mathbb {R}_+ \rightarrow (0,1]\) are monotone nondecreasing functions [33, Theorem 3.4].
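As a concrete sanity check of the CEBF definition, the following sketch verifies the defining inequality numerically for a toy pair of halfspaces in \(\mathbb {R}^2\) with closed-form projections. The sets, the constant \(\sqrt{2}\), and all identifiers below are illustrative assumptions chosen for this sketch, not objects from the paper; for polyhedral sets, Hoffman's error bound yields a Lipschitzian CEBF that does not depend on the second argument \(b = \Vert {{\textbf {x}}}\Vert \).

```python
import numpy as np

# Toy instance (an assumption for illustration):
#   C1 = {x : x_1 <= 0},  C2 = {x : x_2 <= 0},
#   C  = C1 ∩ C2 = nonpositive orthant.
# Candidate CEBF: phi(a, b) = sqrt(2) * a, which is monotone
# nondecreasing in each argument and satisfies phi(0, b) = 0.

def d_C1(x):  # distance to the halfspace {x : x[0] <= 0}
    return max(x[0], 0.0)

def d_C2(x):  # distance to the halfspace {x : x[1] <= 0}
    return max(x[1], 0.0)

def d_C(x):   # distance to the orthant = norm of the positive part of x
    return float(np.linalg.norm(np.maximum(x, 0.0)))

def phi(a, b):
    # Lipschitzian CEBF; independent of b because the sets are polyhedral
    return np.sqrt(2.0) * a

rng = np.random.default_rng(0)
for _ in range(1000):
    x = rng.uniform(-5.0, 5.0, size=2)
    a = max(d_C1(x), d_C2(x))
    # the CEBF inequality: d(x, C) <= phi(max_i d(x, C_i), ||x||)
    assert d_C(x) <= phi(a, np.linalg.norm(x)) + 1e-12
print("CEBF inequality verified on 1000 random points")
```

The bound holds here because \(\text {d}({{\textbf {x}}},C)^2 = \max \{x_1,0\}^2 + \max \{x_2,0\}^2 \le 2\max \{\text {d}({{\textbf {x}}},C_1),\text {d}({{\textbf {x}}},C_2)\}^2\).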

We remark that in Example 4.20, although the sets \(C_1, C_2\) do not satisfy a Hölderian error bound, the log-type error bound displayed therein is covered under the framework of consistent error bound functions. This is because \(\mathfrak {g}_\infty \) is a continuous monotone nondecreasing function and \(\gamma _{{{\textbf {z}}},\eta }^{-1}\) is monotone nondecreasing as a function of \(\eta \) (Remark 3.11). Therefore, in view of (4.54), the function given by \(\varphi (a,b) {:}{=}a + \max \{2,2\gamma _{{{\textbf {z}}},b}^{-1}\}\mathfrak {g}_\infty (2a)\) is a CEBF for \(C_1\) and \(C_2\).

In passing, it seems conceivable that many of our results in Sect. 3.1 can be adapted to derive CEBFs for arbitrary convex sets. Specifically, Lemma 3.9, Theorem 3.10, and Lemma 3.12 rely only on convexity rather than on the more specific structure of cones.

Next, we will see that we can also adapt Examples 4.19 and 4.20 to find instances of odd behavior of the so-called Kurdyka-Łojasiewicz (KL) property [1, 2, 8,9,10, 30]. First, we recall some notation and definitions. Let \(f: \mathbb {R}^n\rightarrow \mathbb {R}\cup \{+\infty \}\) be a proper closed convex extended-real-valued function. We denote by \(\text {dom}\partial f\) the set of points for which the subdifferential \(\partial f({{\textbf {x}}})\) is non-empty and by \([a< f < b]\) the set of \({{\textbf {x}}}\) such that \(a< f({{\textbf {x}}}) < b\). As in [10, Sect. 2.3], we define for \(r_0\in (0,\infty )\) the set

$$\begin{aligned} {{\mathcal {K}}}(0,r_0) := \{\phi \in C[0,r_0)\cap C^1(0,r_0)\;|\; \phi \text{ is } \text{ concave }, \ \phi (0) = 0, \ \phi '(r) > 0\ \forall r\in (0,r_0)\}. \end{aligned}$$

Let \(B({{\textbf {x}}},\epsilon )\) denote the closed ball of radius \(\epsilon > 0\) centered at \({{\textbf {x}}}\). With that, we say that f satisfies the KL property at \({{\textbf {x}}}\in \text {dom}\partial f\) if there exist \(r_0 \in (0,\infty )\), \(\epsilon > 0\) and \(\phi \in \mathcal {K}(0,r_0)\) such that for all \({{\textbf {y}}}\in B({{\textbf {x}}},\epsilon ) \cap [f({{\textbf {x}}})< f < f({{\textbf {x}}}) + r_0 ]\) we have

$$\begin{aligned} \phi '(f({{\textbf {y}}})-f({{\textbf {x}}}))\text {d}(0,\partial f({{\textbf {y}}})) \ge 1. \end{aligned}$$

In particular, as in [30], we say that f satisfies the KL property with exponent \(\alpha \in [0,1)\) at \({{\textbf {x}}}\in \text {dom}\partial f\), if \(\phi \) can be taken to be \(\phi (t) = ct^{1-\alpha }\) for some positive constant c. Next, we need a result which is a corollary of [10, Theorem 5].

Proposition 4.21

Let \(C_1, C_2 \subseteq \mathbb {R}^n\) be closed convex sets with \(C_1 \cap C_2 \ne \emptyset \). Define \(f: \mathbb {R}^n \rightarrow \mathbb {R}\) as

$$\begin{aligned} f({{\textbf {y}}}) = \text {d}({{\textbf {y}}},C_1)^2 + \text {d}({{\textbf {y}}},C_2)^2. \end{aligned}$$

Let \({{\textbf {x}}}\in C_1\cap C_2\), \(\gamma \in (0,1]\). Then, there exist \(\kappa > 0\) and \(\epsilon > 0 \) such that

$$\begin{aligned} \text {d}({{\textbf {y}}}, C_1\cap C_2) \le \kappa \max \{\text {d}({{\textbf {y}}},C_1),\text {d}({{\textbf {y}}},C_2)\}^\gamma ,\qquad \forall {{\textbf {y}}}\in B({{\textbf {x}}},\epsilon ) \end{aligned}$$
(4.55)

if and only if f satisfies the KL property with exponent \(1-\gamma /2\) at \({{\textbf {x}}}\).

Proof

Note that \(\inf f = 0\) and \(\text {argmin}f = C_1\cap C_2\). Furthermore, (4.55) is equivalent to the existence of \(\kappa ' > 0\) and \(\epsilon > 0\) such that

$$\begin{aligned} \text {d}({{\textbf {y}}}, \text {argmin}f) \le \varphi (f({{\textbf {y}}})),\qquad \forall {{\textbf {y}}}\in B({{\textbf {x}}},\epsilon ), \end{aligned}$$

where \(\varphi \) is the function given by \(\varphi (r) = \kappa ' r^{\gamma /2}\). With that, the result follows from [10, Theorem 5]. \(\square \)
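To illustrate Proposition 4.21 on a pair of sets with closed-form projections, the following sketch checks both sides of the equivalence for the toy choice \(C_1 = \mathbb {R}\times \{0\}\), \(C_2 = \{0\}\times \mathbb {R}\) in \(\mathbb {R}^2\) (the sets and all identifiers are illustrative assumptions for this sketch). Here \(C_1 \cap C_2 = \{{\textbf {0}}\}\), the error bound (4.55) holds with \(\gamma = 1\) and \(\kappa = \sqrt{2}\), and, matching the proposition, \(f\) satisfies the KL property at the origin with exponent \(1-\gamma /2 = 1/2\).

```python
import numpy as np

# Toy instance (an assumption for illustration):
#   C1 = x-axis, C2 = y-axis, so d(y,C1) = |y_2|, d(y,C2) = |y_1| and
#   f(y) = d(y,C1)^2 + d(y,C2)^2 = ||y||^2, argmin f = C1 ∩ C2 = {0}.

def f(y):
    return float(y[1] ** 2 + y[0] ** 2)  # = d(y,C1)^2 + d(y,C2)^2

def grad_f(y):
    return 2.0 * y  # f is smooth, so d(0, ∂f(y)) = ||∇f(y)|| = 2||y||

rng = np.random.default_rng(1)
for _ in range(1000):
    y = rng.uniform(-1.0, 1.0, size=2)
    if np.linalg.norm(y) < 1e-8:
        continue
    # error bound (4.55) with gamma = 1, kappa = sqrt(2):
    assert np.linalg.norm(y) <= np.sqrt(2.0) * max(abs(y[1]), abs(y[0])) + 1e-12
    # KL inequality at 0 with phi(t) = t^{1/2}, i.e. exponent 1 - gamma/2 = 1/2:
    phi_prime = 0.5 / np.sqrt(f(y))        # phi'(f(y) - f(0))
    assert phi_prime * np.linalg.norm(grad_f(y)) >= 1.0 - 1e-9
print("error bound (gamma = 1) and KL property (exponent 1/2) verified")
```

Note that \(\phi '(f({{\textbf {y}}}))\,\Vert \nabla f({{\textbf {y}}})\Vert = \tfrac{1}{2\Vert {{\textbf {y}}}\Vert }\cdot 2\Vert {{\textbf {y}}}\Vert = 1\) identically in this toy case, so the KL inequality holds with equality.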

Example 4.22

(Examples in the KL world) In Example 4.19, we have two sets \(C_1, C_2\) satisfying a uniform Hölderian error bound for \(\gamma \in (0,1)\) but not for \(\gamma = 1\). Because \(C_1\) and \(C_2\) are cones and the corresponding distance functions are positively homogeneous, this implies that for \({\textbf {0}} \in C_1 \cap C_2\), a Lipschitzian error bound holds in no neighborhood of \({\textbf {0}}\). That is, given \(\eta > 0\), there is no \(\kappa > 0\) such that

$$\begin{aligned} \text {d}({{\textbf {y}}}, C_1\cap C_2) \le \kappa \max \{\text {d}({{\textbf {y}}},C_1),\text {d}({{\textbf {y}}},C_2)\},\qquad \forall {{\textbf {y}}}\in B(\eta ) \end{aligned}$$

holds. Consequently, the function f in Proposition 4.21 satisfies the KL property with exponent \(\alpha \) for any \(\alpha \in (1/2,1)\) at the origin, but not for \(\alpha = 1/2\). To the best of our knowledge, this is the first explicitly constructed function in the literature such that the infimum of KL exponents at a point is not itself a KL exponent.

Similarly, from Example 4.20 we obtain \(C_1,C_2\) for which (4.55) fails for \({\textbf {0}} \in C_1\cap C_2\) for every choice of \(\kappa , \epsilon > 0\) and \(\gamma \in (0,1]\). Thus, from Proposition 4.21, we obtain a function f that does not satisfy the KL property with any exponent \(\beta \in [1/2,1)\) at the origin. Since a function satisfying the KL property with exponent \(\alpha \in [0,1)\) at an \({{\textbf {x}}}\in \text {dom}\partial f\) necessarily satisfies it with exponent \(\beta \) for any \(\beta \in [\alpha ,1)\) at \({{\textbf {x}}}\), we see that this f does not satisfy the KL property with any exponent at the origin. In passing, we point out that functions failing to satisfy the KL property were already known in the literature; e.g., [9, Example 1].

5 Concluding remarks

In this work, we presented an extension of the results of [34] and showed how to obtain error bounds for conic linear systems using one-step facial residual functions and facial reduction (Theorem 3.8) even when the underlying cone is not amenable. Related to facial residual functions, we also developed techniques that aid in their computation; see Sect. 3.1. Finally, all techniques and results developed in Sect. 3 were used in some shape or form in order to obtain error bounds for the exponential cone in Sect. 4. Our new framework unlocks analysis for cones not reachable with the techniques developed in [34]; these include cones that are not facially exposed, as well as cones for which the projection operator has no simple closed form or is only implicitly specified. These were, until now, significant barriers against error bound analysis for many cones of interest.

As future work, we are planning to use the techniques developed in this paper to analyze and obtain error bounds for some of these other cones that have been previously unapproachable. Potential examples include the cone of \(n\times n\) completely positive matrices and its dual, the cone of \(n\times n\) copositive matrices. The former is not facially exposed when \(n\ge 5\) (see [53]) and the latter is not facially exposed when \(n \ge 2\). It would be interesting to clarify how far error bound problems for these cones can be tackled by our framework. Or, more ambitiously, we could try to obtain some of the facial residual functions and some error bound results. Of course, a significant challenge is that their facial structure is not completely understood, but we believe that even partial results for general n or complete results for specific values of n would be relevant and, possibly, quite non-trivial. Finally, as suggested by one of the reviewers, our framework may be enriched by investigating further geometric interpretations of the key quantity \(\gamma _{{{\textbf {z}}},\eta }\) in (3.15), beyond Fig. 2. For instance, it would be interesting to see whether the positivity of \(\gamma _{{{\textbf {z}}},\eta }\) is related to some generalization of the angle condition in [38], which was originally proposed for the study of Lipschitz error bounds.