1 Introduction

Hoffman constants for systems of linear inequalities, and more general error bounds for feasibility problems, play a central role in mathematical programming. In particular, Hoffman constants provide a key building block for the convergence of a variety of algorithms [1, 3, 10, 11, 13, 23]. Since Hoffman's seminal work [7], Hoffman constants and more general error bounds have been widely studied [2, 4, 6, 12, 14, 18, 24, 25]. However, there has been very limited work on algorithmic procedures that compute or bound Hoffman constants. The only two references that appear to tackle this computational challenge are the 1995 article by Klatte and Thiere [9] and the more recent 2021 article by Peña, Vera, and Zuluaga [16]. However, as discussed in both [9] and [16], the algorithmic schemes proposed in these articles have significant limitations.

The central goal of this paper is to devise a procedure that computes an upper bound on the following homogeneous Hoffman constant \(H_0(A)\). Suppose \(A\in {\mathbb R}^{m\times n}\). Let \(P:=\{x:Ax\le 0\}\) and define \(H_0(A)\) as

$$\begin{aligned} H_0(A):=\sup _{u\in {\mathbb R}^n \setminus P} \frac{{{\,\textrm{dist}\,}}(u,P)}{{{\,\textrm{dist}\,}}(Au, {\mathbb R}^m_-)}. \end{aligned}$$

By convention, let \(H_0(A):= 0\) when \(P={\mathbb R}^n\); this occurs precisely when \(A = 0\).

To position this work in the context of Hoffman constants, we next recall the local and global Hoffman constants \(H(A,b)\) and H(A) associated to systems of linear inequalities defined by A. The homogeneous Hoffman constant \(H_0(A)\) is a special case of the following local Hoffman constant \(H(A,b)\). Suppose \(A\in {\mathbb R}^{m\times n}\) and \(b\in A{\mathbb R}^n + {\mathbb R}^m_+\). Let \(P_A(b):=\{x\in {\mathbb R}^n: Ax\le b\}\) and define \(H(A,b)\) as

$$\begin{aligned} H(A,b):=\sup _{u \in {\mathbb R}^n\setminus P_A(b)} \frac{{{\,\text {dist}\,}}(u,P_A(b))}{{{\,\text {dist}\,}}(Au-b, {\mathbb R}^m_-)}. \end{aligned}$$

It is evident that \(H_0(A) = H(A,0)\) and thus \(H_0(A)\) is bounded above by the following global Hoffman constant H(A). Suppose \(A\in {\mathbb R}^{m\times n}\). Define

$$\begin{aligned} H(A):=\sup _{b\in A{\mathbb R}^n + {\mathbb R}^m_+} H(A,b). \end{aligned}$$

In his seminal paper [7], Hoffman showed that H(A) is finite and consequently so are \(H_0(A)\) and \(H(A,b)\) for all \(b\in A{\mathbb R}^n + {\mathbb R}^m_+\).

The articles [9, 16] propose algorithms to compute or estimate the global Hoffman constant H(A). These algorithms readily yield a computational procedure to bound \(H_0(A)\). However, as detailed in [9, 16], except for very special cases the computation or even approximation of H(A) is an extremely challenging problem. Indeed, the recent results in [15] show that the Stewart-Todd condition measure \(\chi (A)\) [20, 21] is the same as \(H(\textbf{A})\) where \(\textbf{A} = \begin{bmatrix} A\\ -A \end{bmatrix}\). Since the quantity \(\chi (A)\) is known to be NP-hard to approximate [8], so is H(A). The computation of the (non-homogeneous) local Hoffman constant \(H(A,b)\), as discussed in [2, 25], also poses similar computational challenges. In sharp contrast, the procedure proposed in this paper for upper bounding the more specialized Hoffman constant \(H_0(A)\) is entirely tractable and easily implementable for any \(A\in {\mathbb R}^{m\times n}\). The bound is a formalization of the following three-step approach detailed in Sect. 2.

First, upper bound \(H_0(A)\) in the following two special cases:

  (i) When \(A\hat{x} < 0\) for some \(\hat{x} \in {\mathbb R}^n\), or equivalently when \(A^{\textsf{T}}y = 0, y\ge 0 \Rightarrow y =0\). (See Proposition 1.)

  (ii) When \(A^{\textsf{T}}\hat{y} = 0\) for some \(\hat{y} > 0\), or equivalently when \(Ax\le 0\Rightarrow Ax=0\). (See Proposition 2.)

Second, use a canonical partition \(A = \begin{bmatrix} A_B\\ A_N \end{bmatrix}\) of the rows of A such that \(A_N\) is as in case (i) and \(A_B\) is as in case (ii) above. (See Proposition 3.)

Third, upper bound \(H_0(A)\) by stitching together the Hoffman constants \(H_0(A_B)\), \(H_0(A_N),\) and a third Hoffman constant \(\mathcal {H}(L,K)\) associated to the intersection of the subspace \(L:=\{x: A_Bx = 0\}\) and the cone \(K:=\{x: A_N x \le 0\}\). (See Theorem 1.)

The above steps suggest the following computational procedure to upper bound \(H_0(A)\): First, compute the partition \(B, N\). Second, compute upper bounds on \(H_0(A_B)\) and on \(H_0(A_N)\). Third, upper bound \(\mathcal {H}(L,K)\). Section 3 details this procedure. As explained in Sect. 3, the total computational work in the entire procedure consists of two linear programs, two quadratic programs, a convex program, and a singular value calculation, all of which are computationally tractable. This is noteworthy in light of the challenges associated to estimating the Hoffman constants H(A) and \(H(A,b)\). A Python implementation and some illustrative examples of this procedure are publicly available at https://github.com/javi-pena.

For ease of notation and computability, we assume throughout the paper that the norm in \({\mathbb R}^m\) satisfies the following componentwise compatibility condition: if \(y,z\in {\mathbb R}^m\) and \(|y|\le |z|\) componentwise then \(\Vert y\Vert \le \Vert z\Vert \). The componentwise compatibility condition in particular implies that for all \(u\in {\mathbb R}^n\)

$$\begin{aligned} {{\,\textrm{dist}\,}}(Au,{\mathbb R}^m_-) = \Vert (Au)^+\Vert \end{aligned}$$

where \((Au)^+ = \max \{Au,0\}\) componentwise. Consequently,

$$\begin{aligned} H_0(A)=\sup _{u\in {\mathbb R}^n \setminus P} \frac{{{\,\textrm{dist}\,}}(u,P)}{\Vert (Au)^+\Vert }. \end{aligned}$$

Observe that most of the usual norms in \({\mathbb R}^m\), including the \(\ell _p\) norms for \(1\le p \le \infty \), satisfy the componentwise compatibility condition.

We conclude this introduction by highlighting that our developments for bounding \(H_0(A)\) rely critically on the features of homogeneous systems of inequalities. In contrast to non-homogeneous systems of inequalities and more general affine cone inclusions, homogeneous systems of inequalities and more general homogeneous affine cone inclusions possess a number of attractive properties, as discussed in [5, 17, 19, 22]. In particular, although it is tempting to conjecture that a bound on the non-homogeneous Hoffman constant \(H(A,b)\) could be obtained from some \(H_0(A_b)\) via homogenization, that is not the case, as we next detail. Indeed, consider the natural homogenization \(A_b z \le 0\) of the system of inequalities \(Ax\le b\) where

$$\begin{aligned} A_b:= \begin{bmatrix} A & -b\\ 0 & -1 \end{bmatrix}, \; z:= \begin{bmatrix} x\\ t \end{bmatrix}.\end{aligned}$$

The following example shows that \(H(A,b)\) cannot be bounded above by any reasonable multiple of \(H_0(A_b)\). Suppose \(0< \epsilon < 1\) and let

$$\begin{aligned} A = \begin{bmatrix} 1 & \epsilon \\ -1 & \epsilon \\ 0 & -1 \end{bmatrix}, \; b = \begin{bmatrix} 1\\ 1\\ 0 \end{bmatrix}. \end{aligned}$$

Then

$$\begin{aligned} A_b = \begin{bmatrix} 1 & \epsilon & -1\\ -1 & \epsilon & -1\\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{bmatrix}. \end{aligned}$$

For ease of computation, suppose all relevant spaces are endowed with the \(\ell _\infty \) norm. Then the remarks following Proposition 1 below imply that \(H_0(A_b) \le 1+\epsilon \le 2\): indeed, \(\bar{z} = \begin{bmatrix} 0&-1&-(1+\epsilon ) \end{bmatrix}^{\textsf{T}}\) satisfies \(A_b\bar{z} \ge \textbf{1}\) and \(\Vert \bar{z}\Vert _\infty = 1+\epsilon \). On the other hand, \(H(A,b) \ge 1/\epsilon \) because \(Ax \le b\) implies that \(x_2\le 1/\epsilon \) and thus for \(u =\begin{bmatrix} 0&2/\epsilon \end{bmatrix}^{\textsf {T}}\) we have \(\Vert (Au-b)^+\Vert _{\infty }=1\) but \(\Vert u-x\Vert _{\infty }\ge 1/\epsilon = 1/\epsilon \cdot \Vert (Au-b)^+\Vert _{\infty }\) for any x such that \(Ax \le b\). Since this holds for any \(0<\epsilon < 1\), it follows that \(H(A,b)\) cannot be bounded above in terms of \(H_0(A_b)\).
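The claims in this example are easy to verify numerically. The following sketch (ours, not part of the implementation at https://github.com/javi-pena; plain numpy) checks the witness \(\bar{z}\) above and the lower bound on \(H(A,b)\):

```python
# Numerical check of the example above (our sketch; requires only numpy).
import numpy as np

eps = 0.01
A = np.array([[1.0, eps], [-1.0, eps], [0.0, -1.0]])
b = np.array([1.0, 1.0, 0.0])
A_b = np.block([[A, -b.reshape(3, 1)],
                [np.zeros((1, 2)), -np.ones((1, 1))]])

# Witness for the remark after Proposition 1: A_b @ zbar >= 1 componentwise,
# hence H_0(A_b) <= ||zbar||_inf = 1 + eps.
zbar = np.array([0.0, -1.0, -(1.0 + eps)])
assert np.all(A_b @ zbar >= 1.0 - 1e-12)
print("H_0(A_b) <=", np.linalg.norm(zbar, np.inf))

# Lower bound on H(A,b): for u = (0, 2/eps) we get ||(Au - b)^+||_inf = 1,
# while every x with Ax <= b has x_2 <= 1/eps, so dist(u, P_A(b)) >= 1/eps.
u = np.array([0.0, 2.0 / eps])
print("||(Au - b)^+||_inf =", np.max(np.maximum(A @ u - b, 0.0)))
print("H(A,b) >=", 1.0 / eps)
```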

2 Upper bounds on \(H_0(A)\)

2.1 Upper bounds on \(H_0(A)\) in two special cases

We next consider two special cases that can be seen as dual counterparts of each other.

Proposition 1

Suppose \(A\in {\mathbb R}^{m\times n}\) and \(A\hat{x} < 0\) for some \(\hat{x}\in {\mathbb R}^n\) or equivalently \(A^{\textsf{T}}y = 0, y \ge 0 \Rightarrow y=0\). Then

$$\begin{aligned} H_0(A) \le \max _{\begin{array}{c} y\in {\mathbb R}^m \\ \Vert y\Vert \le 1 \end{array}} \min _{\begin{array}{c} x\in {\mathbb R}^n\\ Ax\le y \end{array}} \Vert x\Vert . \end{aligned}$$
(1)

Proof

For ease of notation, let H denote the right-hand side expression in (1), that is,

$$\begin{aligned} H:= \max _{\begin{array}{c} y\in {\mathbb R}^m\\ \Vert y\Vert \le 1 \end{array}} \min _{\begin{array}{c} x\in {\mathbb R}^n\\ Ax\le y \end{array}} \Vert x\Vert = \max _{y\in {\mathbb R}^m\setminus \{0\}} \min _{\begin{array}{c} x\in {\mathbb R}^n\\ Ax\le y \end{array}} \frac{\Vert x\Vert }{\Vert y\Vert }. \end{aligned}$$

Observe that \(H < +\infty \) because the assumption on A implies that \(A{\mathbb R}^n + {\mathbb R}^m_+ = {\mathbb R}^m\).

We need to show that \(H_0(A) \le H\). To that end, let \(P:=\{x\in {\mathbb R}^n: Ax \le 0\}\) and suppose that \(u \in {\mathbb R}^n{\setminus } P\). Let \(y:=(Au)^+ \in {\mathbb R}^m\). The construction of H implies that there exists \(x \in {\mathbb R}^n\) such that \(Ax\le -y\) and \(\Vert x\Vert \le H \cdot \Vert y\Vert = H \cdot \Vert (Au)^+\Vert \). Thus \(x+u \in P\) because

$$\begin{aligned} A(x+u) = Ax + Au \le -y + Au = -(Au)^+ + Au\le 0. \end{aligned}$$

Furthermore \(\Vert (x+u) - u\Vert = \Vert x\Vert \le H \cdot \Vert (Au)^+\Vert \). Since this holds for all \(u\in {\mathbb R}^n\setminus P\), it follows that \(H_0(A) \le H\).

In addition to the simple direct proof above, an alternative proof of Proposition 1 can be obtained from [16]. Indeed, [16, Proposition 2] implies that when \(A\in {\mathbb R}^{m\times n}\) satisfies the assumption in Proposition 1, the right-hand side in (1) is precisely the global Hoffman constant H(A), which, as previously noted, is at least as large as \(H_0(A)\).

For computational purposes, it is useful to note that when \({\mathbb R}^m\) is endowed with the \(\ell _\infty \) norm, the upper bound in Proposition 1 can be computed via the following convex optimization problem:

$$\begin{aligned} \min \{ \Vert x\Vert : Ax \ge \textbf{1}\}. \end{aligned}$$

In particular, any \(\bar{x} \in {\mathbb R}^n\) such that \(A\bar{x} \ge \textbf{1}\) yields the upper bound

$$\begin{aligned} H_0(A) \le \Vert \bar{x}\Vert . \end{aligned}$$
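For instance, when \({\mathbb R}^n\) is also endowed with the \(\ell _\infty \) norm, the problem \(\min \{\Vert x\Vert _\infty : Ax \ge \textbf{1}\}\) is a linear program. A minimal sketch (ours; it uses scipy, and the toy instance is hypothetical):

```python
# Sketch of the remark above with the ell_infinity norm also on R^n:
# min ||x||_inf s.t. Ax >= 1, written as an LP in the variables (x, r).
import numpy as np
from scipy.optimize import linprog

def prop1_bound_inf(A):
    m, n = A.shape
    c = np.concatenate([np.zeros(n), [1.0]])          # minimize r
    A_ub = np.vstack([
        np.hstack([-A, np.zeros((m, 1))]),            # -Ax <= -1
        np.hstack([np.eye(n), -np.ones((n, 1))]),     #  x_j <= r
        np.hstack([-np.eye(n), -np.ones((n, 1))]),    # -x_j <= r
    ])
    b_ub = np.concatenate([-np.ones(m), np.zeros(2 * n)])
    res = linprog(c, A_ub, b_ub, bounds=[(None, None)] * (n + 1))
    return res.x[-1] if res.success else np.inf      # upper bound on H_0(A)

A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])   # toy instance
print(prop1_bound_inf(A))                            # prints 1.0
```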

The following proposition, which can be seen as a dual counterpart of Proposition 1, relies on the dual norms in \({\mathbb R}^m\) and \({\mathbb R}^n\). More precisely, suppose both \({\mathbb R}^m\) and \({\mathbb R}^n\) are endowed with their canonical inner products. In each case let \(\Vert \cdot \Vert ^*\) denote the dual norm defined as

$$\begin{aligned} \Vert u\Vert ^* = \max _{\Vert x\Vert \le 1} \left\langle u , x \right\rangle . \end{aligned}$$

Proposition 2

Suppose \(A\in {\mathbb R}^{m\times n}\) is such that \(A^{\textsf{T}}\hat{y} = 0\) for some \(\hat{y} >0 \) or equivalently \(Ax \le 0 \Rightarrow Ax = 0\). Then

$$\begin{aligned} H_0(A) \le \max _{\begin{array}{c} v \in A^{\textsf{T}}({\mathbb R}^m) \\ \Vert v\Vert ^*\le 1 \end{array}} \min _{y\in {\mathbb R}^m_+, A^{\textsf{T}}y = v} \Vert y\Vert ^*. \end{aligned}$$
(2)

Proof

We shall assume that \(A\ne 0\) as otherwise \(H_0(A) = 0\) and (2) trivially holds. Again for ease of notation, let H denote the right-hand side expression in (2), that is,

$$\begin{aligned} H:= \max _{\begin{array}{c} v \in A^{\textsf{T}}({\mathbb R}^m) \\ \Vert v\Vert ^*\le 1 \end{array}} \min _{y\in {\mathbb R}^m_+, A^{\textsf{T}}y = v} \Vert y\Vert ^* = \max _{\begin{array}{c} v \in A^{\textsf{T}}({\mathbb R}^m) \\ v \ne 0 \end{array}} \min _{y\in {\mathbb R}^m_+, A^{\textsf{T}}y = v} \frac{\Vert y\Vert ^*}{\Vert v\Vert ^*}. \end{aligned}$$

Observe that \(H < +\infty \) because the assumption on A implies that \(A^{\textsf{T}}{\mathbb R}^m_+= A^{\textsf{T}}{\mathbb R}^m\).

We need to show that \(H_0(A) \le H\). To that end, let \(P:=\{x\in {\mathbb R}^n: Ax \le 0\}= \{x \in {\mathbb R}^n: Ax = 0\}\) and suppose that \(u \in {\mathbb R}^n{\setminus } P\). Let

$$\begin{aligned} \bar{x}:= \mathop {\hbox {arg min}}\limits _{x \in P} \Vert u-x\Vert = \mathop {\hbox {arg min}}\limits _{x: Ax=0} \Vert u-x\Vert . \end{aligned}$$

The optimality conditions of the latter problem imply that there exists \(v\in A^{\textsf{T}}{\mathbb R}^m\) with \(\Vert v\Vert ^*=1\) such that

$$\begin{aligned} \Vert u-\bar{x}\Vert = \left\langle v , u-\bar{x} \right\rangle . \end{aligned}$$

The construction of H implies that there exists \(y \in {\mathbb R}^m_+\) such that \(A^{\textsf{T}}y =v\) and \(\Vert y\Vert ^* \le H.\) Since \(v = A^{\textsf{T}}y\) we have

$$\begin{aligned} \Vert u-\bar{x}\Vert = \left\langle v , u-\bar{x} \right\rangle = \left\langle A^{\textsf{T}}y , u-\bar{x} \right\rangle = \left\langle y , A(u-\bar{x}) \right\rangle =\left\langle y , Au \right\rangle . \end{aligned}$$

In addition, since \(y \in {\mathbb R}^m_+\) and \(\Vert y\Vert ^* \le H\), we also have

$$\begin{aligned} \Vert u-\bar{x}\Vert = \left\langle y , Au \right\rangle \le \left\langle y , (Au)^+ \right\rangle \le \Vert y\Vert ^* \cdot \Vert (Au)^+\Vert \le H \cdot \Vert (Au)^+\Vert . \end{aligned}$$

Since this holds for all \(u\in {\mathbb R}^n\setminus P\), it follows that \(H_0(A) \le H\).

For computational purposes, it is useful to note that when \({\mathbb R}^m\) is endowed with the \(\ell _\infty \) norm, the upper bound in Proposition 2 can be computed as follows

$$\begin{aligned} \max _{\begin{array}{c} v\in A^{\textsf{T}}({\mathbb R}^m) \\ \Vert v\Vert ^*\le 1 \end{array}} \min _{\begin{array}{c} y\in {\mathbb R}^m_+\\ A^{\textsf{T}}y = v \end{array}} \textbf{1}^{\textsf{T}}y. \end{aligned}$$

The reciprocal of the latter quantity in turn is the radius of the largest ball in \(A^{\textsf{T}}({\mathbb R}^m)\) centered at 0 and contained in the set

$$\begin{aligned} \{A^{\textsf{T}}y: y \in {\mathbb R}^m_+, \textbf{1}^{\textsf{T}}y = 1\} = \{A^{\textsf{T}}y: y \in {\mathbb R}^m_+, \textbf{1}^{\textsf{T}}y \le 1\}. \end{aligned}$$

Therefore, if in addition \({\mathbb R}^n\) is endowed with the \(\ell _2\) norm then any \(\bar{y}\in {\mathbb R}^m_{++}\) with \(\textbf{1}^{\textsf{T}}\bar{y} =1\) and \(A^{\textsf{T}}\bar{y} = 0\) yields the upper bound

$$\begin{aligned} H_0(A) \le \frac{2}{\sigma _{\min }^+(A^{\textsf{T}}\bar{Y})}, \end{aligned}$$
(3)

where \(\bar{Y} = \text {Diag}(\bar{y})\) and \(\sigma _{\min }^+(A^{\textsf{T}}\bar{Y})\) denotes the smallest positive singular value of \(A^{\textsf{T}}\bar{Y}\). To see why (3) holds, observe that if \(v \in A^{\textsf{T}}{\mathbb R}^m\) and \(\Vert v\Vert _2 \le \frac{\sigma _{\min }^+(A^{\textsf{T}}\bar{Y})}{2}\) then \(2v = A^{\textsf{T}}\bar{Y} z\) for some \(\Vert z\Vert _2\le 1\). The latter implies that \(|\bar{Y}z| \le \bar{y}\) componentwise and thus \(2v = A^{\textsf{T}}(\bar{y} + \bar{Y}z)\) with

$$\begin{aligned} \bar{y} + \bar{Y}z \in {\mathbb R}^m_+ \text { and } \textbf{1}^{\textsf{T}}(\bar{y}+\bar{Y}z) \le 2\cdot \textbf{1}^{\textsf{T}}\bar{y} =2. \end{aligned}$$

In particular, \(v\in \{A^{\textsf{T}}y: y\in {\mathbb R}^m_+, \textbf{1}^{\textsf{T}}y \le 1\}\). Since this holds for any \(v\in A^{\textsf{T}}{\mathbb R}^m\) with \(\Vert v\Vert _2 \le \frac{\sigma _{\min }^+(A^{\textsf{T}}\bar{Y})}{2}\), it follows that the radius of the largest ball in \(A^{\textsf{T}}({\mathbb R}^m)\) centered at 0 and contained in the set

$$\begin{aligned} \{A^{\textsf{T}}y: y \in {\mathbb R}^m_+, \textbf{1}^{\textsf{T}}y = 1\} = \{A^{\textsf{T}}y: y \in {\mathbb R}^m_+, \textbf{1}^{\textsf{T}}y \le 1\} \end{aligned}$$

is at least \(\frac{\sigma _{\min }^+(A^{\textsf{T}}\bar{Y})}{2}\).
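The bound (3) is immediate to evaluate numerically. A minimal sketch (ours; numpy only, with a hypothetical toy instance):

```python
# Evaluate the bound (3): given ybar > 0 with 1^T ybar = 1 and A^T ybar = 0,
# H_0(A) <= 2 / sigma_min^+(A^T Ybar), where Ybar = Diag(ybar).
import numpy as np

def bound_3(A, ybar, tol=1e-10):
    s = np.linalg.svd(A.T @ np.diag(ybar), compute_uv=False)
    s_plus = s[s > tol]                    # positive singular values
    return np.inf if s_plus.size == 0 else 2.0 / s_plus.min()

A = np.array([[1.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])  # toy instance
ybar = np.ones(3) / 3                                 # A.T @ ybar == 0
print(bound_3(A, ybar))                               # prints 6.0
```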

2.2 Upper bound on \(H_0(A)\) for general A

An upper bound on \(H_0(A)\) for general \(A\in {\mathbb R}^{m\times n}\) follows by stitching together the cases in the above two propositions via the canonical partition result in Proposition 3 and the additional Hoffman constant \(\mathcal {H}(L,K)\) defined in (4) below.

The following result is a consequence of the classical Goldman-Tucker partition theorem. To make our exposition self-contained, we include a proof.

Proposition 3

Let \(A\in {\mathbb R}^{m\times n}\). There exists a unique partition \(B\cup N = \{1,\dots ,m\}\) such that

$$\begin{aligned} A_B \hat{x} = 0, \; A_N\hat{x} < 0 \text { for some } \hat{x}\in {\mathbb R}^n \end{aligned}$$

and

$$\begin{aligned} A_B^{\textsf{T}}\hat{y}_B = 0 \text { for some } \hat{y}_B > 0. \end{aligned}$$

Proof

Let \(N\subseteq \{1,\dots ,m\}\) be the largest subset of \(\{1,\dots ,m\}\) such that

$$\begin{aligned} Ax \le 0 \text { and } A_Nx < 0 \end{aligned}$$

has a solution. In other words,

$$\begin{aligned} N:=\{i\in \{1,\dots ,m\}: Ax\le 0 \text { and } (Ax)_i < 0 \text { for some } x\in {\mathbb R}^n\}. \end{aligned}$$

Observe that N is well-defined and unique and thus so is \(B:=\{1,\dots ,m\}\setminus N\). Furthermore the construction of N implies that \(A_Bx= 0\) and \(A_Nx<0\) for some \(x\in {\mathbb R}^n\). Hence to finish the proof it suffices to show that

$$\begin{aligned} A_B^{\textsf{T}}y_B = 0,\; y_B > 0 \end{aligned}$$

has a solution. To that end, for \(i\in \{1,\dots ,m\}\) let \(e_i\in {\mathbb R}^m\) be the vector with i-th component equal to one and all others equal to zero. Observe that \(i \in B\) if and only if the following system of equations and inequalities does not have a solution:

$$\begin{aligned} \begin{bmatrix} A&e_i \end{bmatrix}\begin{bmatrix} x\\ t \end{bmatrix} \le 0, \begin{bmatrix} 0&1 \end{bmatrix}\begin{bmatrix} x\\ t \end{bmatrix} > 0. \end{aligned}$$

Farkas' Lemma thus implies that \(i\in B\) if and only if the following system of equations and inequalities has a solution:

$$\begin{aligned} \begin{bmatrix} A^{\textsf{T}}\\ e_i^{\textsf{T}} \end{bmatrix} y = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, y \ge 0. \end{aligned}$$

Since this holds for each \(i\in B\), it follows that \(A_B^{\textsf{T}}y_B =0, y_B > 0\) has a solution. Indeed, for each \(i\in B\) pick such a solution \(y^i\) and let \(y:=\sum _{i\in B} y^i\), so that \(y\ge 0\), \(A^{\textsf{T}}y = 0\), and \(y_i \ge 1\) for all \(i\in B\). Furthermore, for \(\hat{x}\) with \(A_B\hat{x} = 0\) and \(A_N\hat{x} < 0\) we have \(\left\langle y_N , A_N\hat{x} \right\rangle = \left\langle y , A\hat{x} \right\rangle = \left\langle A^{\textsf{T}}y , \hat{x} \right\rangle = 0\), which forces \(y_N = 0\). Hence \(y_B\) is the required solution.

We should note that, depending on A, the set N in Proposition 3 could be any subset of \(\{1,\dots ,m\}\). In particular, \(N = \emptyset \) if \(A^{\textsf{T}}y = 0\) for some \(y > 0\), and \(N = \{1,\dots ,m\}\) if \(Ax<0\) for some \(x\in \mathbb {R}^n\). For instance, \(N = \emptyset \) if \(A = \begin{bmatrix} 1\\ -1 \end{bmatrix}\) and \(N = \{1,2\}\) if \(A = \begin{bmatrix} 1\\ 1 \end{bmatrix}\).

Suppose \(L\subseteq {\mathbb R}^n\) is a linear subspace and \(K \subseteq {\mathbb R}^n\) is a closed convex cone. Let

$$\begin{aligned} \mathcal {H}(L,K) = \sup _{u \in {\mathbb R}^n\setminus (L\cap K)} \frac{{{\,\textrm{dist}\,}}(u,L\cap K)}{\max \{{{\,\textrm{dist}\,}}(u,L),{{\,\textrm{dist}\,}}(u,K)\}}, \end{aligned}$$
(4)

with the convention that \(\mathcal {H}(L,K) =0\) when \(L\cap K = {\mathbb R}^n\).

In the remainder of this paper, we will use the following notation for \(A\in {\mathbb R}^{m\times n}\): Let \(B, N\) denote the canonical partition defined by A as in Proposition 3 and let \(L\subseteq {\mathbb R}^n, K\subseteq {\mathbb R}^n\) be defined as

$$\begin{aligned} L:=\{x: A_B x = 0\}, K:= \{x: A_N x \le 0\}, \end{aligned}$$

with the convention that \(L = {\mathbb R}^n\) if \(B=\emptyset \) and \(K = {\mathbb R}^n\) if \(N=\emptyset \).

Observe that L is a linear subspace, K is a closed convex cone, and \(\{x: Ax\le 0\} = L\cap K\). We now have all the necessary ingredients to upper bound \(H_0(A)\).

Theorem 1

Suppose \(A\in {\mathbb R}^{m\times n}\) and the norm in \({\mathbb R}^m\) satisfies the componentwise compatibility condition. Let \(B, N\) and \(L, K\) be as above. Then

$$\begin{aligned} H_0(A) \le \mathcal {H}(L,K) \cdot \max \{H_0(A_N),H_0(A_B)\}. \end{aligned}$$
(5)

Proof

Suppose \(u\in {\mathbb R}^n \setminus P\). The construction of \(\mathcal {H}(\cdot ,\cdot )\) and \(H_0(\cdot )\), together with the componentwise compatibility condition, implies that there exists \(x \in P=L\cap K\) such that

$$\begin{aligned} \Vert x-u\Vert&\le \mathcal {H}(L,K)\cdot \max \{{{\,\textrm{dist}\,}}(u,L),{{\,\textrm{dist}\,}}(u,K)\} \\&\le \mathcal {H}(L,K) \cdot \max \{H_0(A_B)\cdot \Vert (A_Bu)^+\Vert ,H_0(A_N)\cdot \Vert (A_Nu)^+\Vert \} \\&\le \mathcal {H}(L,K) \cdot \max \{H_0(A_B),H_0(A_N)\}\cdot \Vert (Au)^+\Vert . \end{aligned}$$

Since this holds for all \(u\in {\mathbb R}^n\setminus P\), the inequality in (5) follows.

Observe that, unlike \(H_0(A)\), which depends on the data representation \(A\in {\mathbb R}^{m\times n}\) of the cone \(P=\{x: Ax\le 0\}\), the constant \(\mathcal {H}(L,K)\) depends only on the sets \(L\subseteq {\mathbb R}^n\) and \(K\subseteq {\mathbb R}^n\). In particular, \(\mathcal {H}(L,K)\) does not depend on the norm in \({\mathbb R}^m\) while \(H_0(A)\) evidently does.

The next proposition provides an upper bound on \(\mathcal {H}(L,K)\) analogous to the upper bounds on \(H_0(A)\) in Propositions 1 and 2. It will be useful for the computational procedure in Sect. 3.

Proposition 4

Suppose \(L\subseteq {\mathbb R}^n\) is a linear subspace and \(K \subseteq {\mathbb R}^n\) is a closed convex cone. Then

$$\begin{aligned} \mathcal {H}(L,K) \le 1 + 2\cdot \max _{\begin{array}{c} u\in {\mathbb R}^n\\ \Vert u\Vert \le 1 \end{array}} \min _{\begin{array}{c} x\in L, y\in K\\ x-y = u \end{array}} \Vert x\Vert . \end{aligned}$$

Proof

To ease notation, let

$$\begin{aligned} H:=\max _{\begin{array}{c} u\in {\mathbb R}^n\\ \Vert u\Vert \le 1 \end{array}} \min _{\begin{array}{c} x\in L, y\in K\\ x-y = u \end{array}} \Vert x\Vert . \end{aligned}$$

We need to show that \(\mathcal {H}(L,K) \le 1 + 2H.\) To that end, suppose \(u \in {\mathbb R}^n\setminus (L\cap K)\). Let \(u_L:=\mathop {\hbox {arg min}}\limits _v\{\Vert u-v\Vert : v \in L\}\) and \(u_K:=\mathop {\hbox {arg min}}\limits _v\{\Vert u-v\Vert : v \in K\}\). The construction of H implies that there exist \(x\in L, y\in K\) such that \(\Vert x\Vert \le H\cdot \Vert u_K-u_L\Vert \) and \( x-y = u_K-u_L. \) Hence \( u_L+x = u_K+y \in L\cap K \) and

$$\begin{aligned} {{\,\textrm{dist}\,}}(u,L\cap K)&\le \Vert u-u_L-x\Vert \\&\le \Vert u-u_L\Vert + \Vert x\Vert \\&\le \Vert u-u_L\Vert + H\cdot \Vert u_K-u_L\Vert \\&\le \max \{{{\,\textrm{dist}\,}}(u,L),{{\,\textrm{dist}\,}}(u,K)\} + H \cdot ({{\,\textrm{dist}\,}}(u,K) + {{\,\textrm{dist}\,}}(u,L))\\&\le (1+2 H) \cdot \max \{{{\,\textrm{dist}\,}}(u,L),{{\,\textrm{dist}\,}}(u,K)\}. \end{aligned}$$

Since this holds for any \(u\in {\mathbb R}^n\setminus (L\cap K)\), it follows that

$$\begin{aligned} \mathcal {H}(L,K) \le 1+ 2H. \end{aligned}$$

For computational purposes, it is useful to note that if \(\bar{x} \in L \cap \textrm{int}(K)\) is such that \(\bar{x} + u \in K\) for all \(\Vert u\Vert \le 1\) then Proposition 4 implies that

$$\begin{aligned} \mathcal {H}(L,K) \le 1 + 2\Vert \bar{x}\Vert . \end{aligned}$$

3 A computable procedure to bound \(H_0(A)\)

We next describe a procedure to compute an upper bound on \(H_0(A)\). The procedure consists of four main steps. First, compute the partition \(B, N\). Second, compute an upper bound on \(H_0(A_B)\). Third, compute an upper bound on \(H_0(A_N)\). Fourth, compute an upper bound on \(\mathcal {H}(L,K)\). An upper bound on \(H_0(A)\) thereby follows from Theorem 1. For computational convenience, throughout this section we assume that \({\mathbb R}^m\) is endowed with the \(\ell _\infty \) norm and \({\mathbb R}^n\) is endowed with the \(\ell _2\) norm. A Python implementation and some illustrative examples of this procedure are publicly available at https://github.com/javi-pena.

3.1 Step 1: Partition \(B, N\)

The partition \(B, N\) can be obtained from any point \((x,y,s,t)\) that satisfies the following system of equations and inequalities for some \(t > 0\):

$$\begin{aligned} \begin{array}{rl} &A^{\textsf{T}}y = 0\\ &Ax + s = 0 \\ &y + s - t\textbf{1} \ge 0 \\ &\textbf{1}^{\textsf{T}}y + \textbf{1}^{\textsf{T}}s = 1\\ &y \ge 0, \; s \ge 0. \end{array} \end{aligned}$$
(6)

More precisely, if \((x,y,s,t)\) satisfies (6) with \(t>0\) then \(B, N\) can be obtained as follows:

$$\begin{aligned} B:=\{i: y_i>0\}, \; \; N:=\{i:s_i > 0\}. \end{aligned}$$

Proposition 3 guarantees that a solution \((x,y,s,t)\) to (6) with \(t>0\) always exists and that the associated partition \(B, N\) is unique. Such a point \((x,y,s,t)\) can be computed via the following linear program:

$$\begin{aligned} \begin{array}{rl} \displaystyle \max _{x,y,s,t} & t \\ &A^{\textsf{T}}y = 0\\ &Ax + s = 0 \\ &y + s - t\textbf{1} \ge 0 \\ &\textbf{1}^{\textsf{T}}y + \textbf{1}^{\textsf{T}}s = 1\\ &y \ge 0, \; s \ge 0. \end{array} \end{aligned}$$
(7)
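A minimal sketch of this step (ours, not the reference implementation; scipy only, with a hypothetical toy instance):

```python
# Sketch of Step 1: solve the linear program (7) and read off the canonical
# partition B, N of Proposition 3 from the supports of y and s.
import numpy as np
from scipy.optimize import linprog

def partition_BN(A, tol=1e-9):
    m, n = A.shape
    # variable ordering: [x (n), y (m), s (m), t (1)]; maximize t
    c = np.zeros(n + 2 * m + 1); c[-1] = -1.0
    A_eq = np.zeros((n + m + 1, n + 2 * m + 1))
    A_eq[:n, n:n + m] = A.T                        # A^T y = 0
    A_eq[n:n + m, :n] = A                          # Ax + s = 0
    A_eq[n:n + m, n + m:n + 2 * m] = np.eye(m)
    A_eq[-1, n:n + 2 * m] = 1.0                    # 1^T y + 1^T s = 1
    b_eq = np.zeros(n + m + 1); b_eq[-1] = 1.0
    A_ub = np.zeros((m, n + 2 * m + 1))            # t*1 - y - s <= 0
    A_ub[:, n:n + m] = -np.eye(m)
    A_ub[:, n + m:n + 2 * m] = -np.eye(m)
    A_ub[:, -1] = 1.0
    bounds = [(None, None)] * n + [(0, None)] * (2 * m + 1)
    res = linprog(c, A_ub, np.zeros(m), A_eq, b_eq, bounds=bounds)
    assert res.status == 0
    y, s = res.x[n:n + m], res.x[n + m:n + 2 * m]
    return [i for i in range(m) if y[i] > tol], [i for i in range(m) if s[i] > tol]

A = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]])   # toy instance
print(partition_BN(A))    # prints ([0, 1], [2])
```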

3.2 Step 2: Upper bound on \(H_0(A_N)\)

Suppose \(N \ne \emptyset \) as otherwise \(H_0(A_N) = 0\). The remarks following Proposition 1 show that

$$\begin{aligned} H_0(A_N)\le \Vert \bar{x}\Vert _2 \end{aligned}$$

for any \(\bar{x}\in {\mathbb R}^n\) such that \(A_N\bar{x} \ge \textbf{1}\). The best such upper bound can be computed via the following quadratic program

$$\begin{aligned} \bar{x}:= \mathop {\hbox {arg min}}\limits \{ \Vert x\Vert ^2_2: A_Nx \ge \textbf{1}\}. \end{aligned}$$
(8)
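A minimal sketch of this step (ours; it assumes the cvxpy package and a hypothetical toy instance):

```python
# Sketch of Step 2: H_0(A_N) <= ||xbar||_2 where xbar solves the QP (8).
import cvxpy as cp
import numpy as np

def step2_bound(A_N):
    x = cp.Variable(A_N.shape[1])
    prob = cp.Problem(cp.Minimize(cp.sum_squares(x)), [A_N @ x >= 1])
    prob.solve()
    return np.linalg.norm(x.value)          # upper bound on H_0(A_N)

A_N = np.array([[-1.0, 0.0], [0.0, -1.0]])  # toy instance: A_N xhat < 0
print(step2_bound(A_N))                     # sqrt(2), attained at (-1, -1)
```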

3.3 Step 3: Upper bound on \(H_0(A_B)\)

Suppose \(B \ne \emptyset \) as otherwise \(H_0(A_B) = 0\). The remarks following Proposition 2 show that

$$\begin{aligned} H_0(A_B) \le \frac{2}{\sigma _{\min }^+(A_B^{\textsf{T}}\bar{Y})} \end{aligned}$$

for any \(\bar{y}\in {\mathbb R}^B_{++}\) such that \(\textbf{1}_B^{\textsf{T}}\bar{y} = 1\) and \(A_B^{\textsf{T}}\bar{y} = 0\). Although the best such upper bound is challenging to compute, an upper bound of this kind that is within a factor of \(\sqrt{|B|}\) of the best possible one can be computed via the following convex program

$$\begin{aligned} \bar{y}:= \mathop {\hbox {arg min}}\limits _{y\in {\mathbb R}^B_{++}}\left\{ -\sum _{i\in B} \log (y_i): \textbf{1}_B^{\textsf{T}}y = 1, A_B^{\textsf{T}}y = 0\right\} . \end{aligned}$$
(9)
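A minimal sketch of this step (ours; it assumes cvxpy, whose default conic solvers handle the logarithmic objective, and combines (9) with the singular-value bound (3)):

```python
# Sketch of Step 3: solve (9) for ybar, then apply the bound (3) to A_B.
import cvxpy as cp
import numpy as np

def step3_bound(A_B, tol=1e-10):
    y = cp.Variable(A_B.shape[0])
    prob = cp.Problem(cp.Minimize(-cp.sum(cp.log(y))),
                      [cp.sum(y) == 1, A_B.T @ y == 0])
    prob.solve()
    s = np.linalg.svd(A_B.T @ np.diag(y.value), compute_uv=False)
    return 2.0 / s[s > tol].min()           # upper bound on H_0(A_B)

A_B = np.array([[1.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])  # toy: A_B^T yhat = 0
print(step3_bound(A_B))                                 # prints about 6.0
```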

3.4 Step 4: Upper bound on \(\mathcal {H}(L,K)\)

Suppose both \(N \ne \emptyset \) and \(B \ne \emptyset \), as otherwise \(\mathcal {H}(L,K)=1\) or \(\mathcal {H}(L,K) = 0\). Let Q be a matrix whose columns form an orthonormal basis of \(L:=\{x:A_Bx =0\}\) and let \(M = DA_NQ\), where D is the diagonal matrix with positive diagonal entries such that all rows of \(DA_N\) have Euclidean norm equal to one. Then the remarks following Proposition 4 imply that

$$\begin{aligned} \mathcal {H}(L,K) \le 1+2\Vert \bar{z}\Vert _2 \end{aligned}$$

for any \(\bar{z}\) such that \(M\bar{z} \ge \textbf{1}\). The best such upper bound can be computed via the following quadratic program

$$\begin{aligned} \bar{z}:= \mathop {\hbox {arg min}}\limits \{ \Vert z\Vert ^2_2: Mz \ge \textbf{1}\}. \end{aligned}$$
(10)
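A minimal sketch of this step (ours; it assumes cvxpy and uses scipy's null_space for the orthonormal basis Q; the toy instance is hypothetical):

```python
# Sketch of Step 4: H(L,K) <= 1 + 2 ||zbar||_2 where zbar solves the QP (10);
# xbar = -Q zbar is the witness for the remark after Proposition 4.
import cvxpy as cp
import numpy as np
from scipy.linalg import null_space

def step4_bound(A_B, A_N):
    Q = null_space(A_B)                                        # basis of L
    D_A_N = A_N / np.linalg.norm(A_N, axis=1, keepdims=True)   # unit rows
    M = D_A_N @ Q
    z = cp.Variable(M.shape[1])
    prob = cp.Problem(cp.Minimize(cp.sum_squares(z)), [M @ z >= 1])
    prob.solve()
    return 1.0 + 2.0 * np.linalg.norm(z.value)   # upper bound on H(L,K)

A_B = np.array([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]])  # L = {x : x_1 = 0}
A_N = np.array([[0.0, 0.0, -1.0]])                   # K = {x : x_3 >= 0}
print(step4_bound(A_B, A_N))                         # prints 3.0
```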

3.5 Putting it all together: A procedure to bound \(H_0(A)\)

Theorem 1 allows us to stitch together the partition \(B, N\) and the upper bounds on \(H_0(A_B),\) \(H_0(A_N),\) and \(\mathcal {H}(L,K)\) to obtain an upper bound on \(H_0(A)\), as detailed in Algorithm 1 below.

Algorithm 1 Upper bound on \(H_0(A)\)

Input: \(A\in {\mathbb R}^{m\times n}\).
Step 1: compute the partition \(B, N\) via the linear program (7).
Step 2: if \(N\ne \emptyset \), compute \(\bar{x}\) via (8), so that \(H_0(A_N)\le \Vert \bar{x}\Vert _2\); otherwise take this bound to be 0.
Step 3: if \(B\ne \emptyset \), compute \(\bar{y}\) via (9), so that \(H_0(A_B)\le 2/\sigma _{\min }^+(A_B^{\textsf{T}}\bar{Y})\); otherwise take this bound to be 0.
Step 4: if \(B\ne \emptyset \) and \(N\ne \emptyset \), compute \(\bar{z}\) via (10), so that \(\mathcal {H}(L,K)\le 1+2\Vert \bar{z}\Vert _2\); otherwise take this bound to be 1.
Output: the product of the bound from Step 4 and the larger of the bounds from Steps 2 and 3, which upper bounds \(H_0(A)\) by Theorem 1.
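For concreteness, the following end-to-end sketch assembles Steps 1-4 (ours, not the author's reference implementation at https://github.com/javi-pena; it assumes cvxpy and scipy, and for brevity solves the linear program (7) in cvxpy form):

```python
# End-to-end sketch of Algorithm 1: upper bound on H_0(A) via Theorem 1.
import cvxpy as cp
import numpy as np
from scipy.linalg import null_space

def hoffman_H0_upper_bound(A, tol=1e-7):
    m, n = A.shape
    if not A.any():
        return 0.0                                   # H_0(0) = 0 by convention
    # Step 1: canonical partition B, N via the linear program (7).
    x, y, s, t = cp.Variable(n), cp.Variable(m), cp.Variable(m), cp.Variable()
    cp.Problem(cp.Maximize(t),
               [A.T @ y == 0, A @ x + s == 0, y + s >= t,
                cp.sum(y) + cp.sum(s) == 1, y >= 0, s >= 0]).solve()
    B = np.where(y.value > tol)[0]
    N = np.where(s.value > tol)[0]
    # Step 2: H_0(A_N) <= ||xbar||_2 via the quadratic program (8).
    HN = 0.0
    if N.size:
        xv = cp.Variable(n)
        cp.Problem(cp.Minimize(cp.sum_squares(xv)), [A[N] @ xv >= 1]).solve()
        HN = np.linalg.norm(xv.value)
    # Step 3: H_0(A_B) <= 2/sigma_min^+(A_B^T Ybar) via (9) and (3);
    # zero rows of A always land in B, and an all-zero A_B contributes 0.
    HB = 0.0
    if B.size and A[B].any():
        yv = cp.Variable(B.size)
        cp.Problem(cp.Minimize(-cp.sum(cp.log(yv))),
                   [cp.sum(yv) == 1, A[B].T @ yv == 0]).solve()
        sv = np.linalg.svd(A[B].T @ np.diag(yv.value), compute_uv=False)
        HB = 2.0 / sv[sv > tol].min()
    # Step 4: H(L,K) <= 1 + 2||zbar||_2 via the quadratic program (10).
    HLK = 1.0
    if B.size and N.size:
        Q = null_space(A[B])
        M = (A[N] / np.linalg.norm(A[N], axis=1, keepdims=True)) @ Q
        zv = cp.Variable(M.shape[1])
        cp.Problem(cp.Minimize(cp.sum_squares(zv)), [M @ zv >= 1]).solve()
        HLK = 1.0 + 2.0 * np.linalg.norm(zv.value)
    return HLK * max(HB, HN)                         # inequality (5)

A = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]])  # toy instance
print(hoffman_H0_upper_bound(A))                     # prints about 8.485
```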