An easily computable upper bound on the Hoffman constant for homogeneous inequality systems

Let $A\in \mathbb{R}^{m\times n}\setminus \{0\}$ and $P:=\{x:Ax\le 0\}$. This paper provides a procedure to compute an upper bound on the following homogeneous Hoffman constant: \[ H_0(A) := \sup_{u\in \mathbb{R}^n \setminus P} \frac{\text{dist}(u,P)}{\text{dist}(Au, \mathbb{R}^m_-)}. \] In sharp contrast to the intractability of computing more general Hoffman constants, the procedure described in this paper is entirely tractable and easily implementable.


Introduction
Hoffman constants for systems of linear inequalities, and more general error bounds for feasibility problems, play a central role in mathematical programming. In particular, Hoffman constants provide a key building block for the convergence analysis of a variety of algorithms [1,3,10,11,13,23]. Since Hoffman's seminal work [7], Hoffman constants and more general error bounds have been widely studied [2,4,6,12,14,18,24,25]. However, there has been very limited work on algorithmic procedures that compute or bound Hoffman constants. The only two references that appear to tackle this computational challenge are the 1995 article by Klatte and Thiere [9] and the more recent 2021 article by Peña, Vera, and Zuluaga [16]. However, as discussed in both [9] and [16], the algorithmic schemes proposed in these articles have significant limitations.
The central goal of this paper is to devise a procedure that computes an upper bound on the following homogeneous Hoffman constant $H_0(A)$. Suppose $A\in \mathbb{R}^{m\times n}$. Let $P := \{x : Ax \le 0\}$ and define
\[ H_0(A) := \sup_{u\in \mathbb{R}^n \setminus P} \frac{\text{dist}(u,P)}{\text{dist}(Au, \mathbb{R}^m_-)}. \]
For notational convenience, we adopt the convention that $H_0(A) := 0$ when $P = \mathbb{R}^n$, which occurs precisely when $A = 0$.
To position this work in the context of Hoffman constants, we next recall the local and global Hoffman constants $H(A,b)$ and $H(A)$ associated to the linear systems of inequalities defined by $A$. The homogeneous Hoffman constant $H_0(A)$ is a special case of the following local Hoffman constant: for $b \in A\mathbb{R}^n + \mathbb{R}^m_+$ let $P_b := \{x : Ax \le b\}$ and
\[ H(A,b) := \sup_{u\in \mathbb{R}^n \setminus P_b} \frac{\text{dist}(u,P_b)}{\text{dist}(Au - b, \mathbb{R}^m_-)}. \]
It is evident that $H_0(A) = H(A,0)$ and thus $H_0(A)$ is bounded above by the following global Hoffman constant $H(A)$. Suppose $A\in\mathbb{R}^{m\times n}$ and define
\[ H(A) := \sup_{b\in A\mathbb{R}^n + \mathbb{R}^m_+} H(A,b). \]
In his seminal paper [7], Hoffman showed that $H(A)$ is finite and consequently so are $H_0(A)$ and $H(A,b)$ for all $b\in A\mathbb{R}^n + \mathbb{R}^m_+$. The articles [9,16] propose algorithms to compute or estimate the global Hoffman constant $H(A)$. These algorithms readily yield a computational procedure to bound $H_0(A)$. However, as detailed in [9,16], except for very special cases the computation or even approximation of $H(A)$ is an extremely challenging problem. Indeed, the recent results in [15] show that the Stewart-Todd condition measure $\chi(A)$ [20,21] is the same as $H(\bar A)$ where $\bar A = \begin{bmatrix} A \\ -A \end{bmatrix}$. Since the quantity $\chi(A)$ is known to be NP-hard to approximate [8], so is $H(A)$. The computation of the (non-homogeneous) local Hoffman constant $H(A,b)$, as discussed in [2,25], poses similar computational challenges. In sharp contrast, the procedure proposed in this paper for upper bounding the more specialized Hoffman constant $H_0(A)$ is entirely tractable and easily implementable for any $A\in\mathbb{R}^{m\times n}$. The bound is a formalization of the following three-step approach detailed in Section 2.
First, upper bound $H_0(A)$ in the following two special cases: (i) when $Ax < 0$ for some $x\in\mathbb{R}^n$ or, equivalently, when $A^Ty = 0,\ y\ge 0 \Rightarrow y = 0$ (see Proposition 1); (ii) when $A^T\hat y = 0$ for some $\hat y > 0$ or, equivalently, when $Ax\le 0 \Rightarrow Ax = 0$ (see Proposition 2). Second, use a canonical partition $A = \begin{bmatrix} A_B \\ A_N \end{bmatrix}$ of the rows of $A$ such that $A_N$ is as in case (i) and $A_B$ is as in case (ii) above (see Proposition 3). Third, upper bound $H_0(A)$ by stitching together the Hoffman constants $H_0(A_B)$, $H_0(A_N)$, and a third Hoffman constant $H(L,K)$ associated to the intersection of the subspace $L := \{x : A_Bx = 0\}$ and the cone $K := \{x : A_Nx \le 0\}$ (see Theorem 1).

The above steps suggest the following computational procedure to upper bound $H_0(A)$: first, compute the partition $B, N$; second, compute upper bounds on $H_0(A_B)$ and on $H_0(A_N)$; third, upper bound $H(L,K)$. Section 3 details this procedure. As explained in Section 3, the total computational work in the entire procedure consists of two linear programs, two quadratic programs, a convex program, and a singular value calculation, all of which are computationally tractable. This is noteworthy in light of the challenges associated with estimating the Hoffman constants $H(A)$ and $H(A,b)$. A Python implementation and some illustrative examples of this procedure are publicly available at https://github.com/javi-pena

For ease of notation and computability, we assume throughout the paper that the norm in $\mathbb{R}^m$ satisfies the following componentwise compatibility condition: if $y,z\in\mathbb{R}^m$ and $|y|\le|z|$ componentwise, then $\|y\|\le\|z\|$. The componentwise compatibility condition in particular implies that for all $u\in\mathbb{R}^n$
\[ \text{dist}(Au, \mathbb{R}^m_-) = \|(Au)^+\|, \]
where $(Au)^+ = \max\{Au, 0\}$ componentwise. Consequently,
\[ H_0(A) = \sup_{u\in\mathbb{R}^n\setminus P} \frac{\text{dist}(u,P)}{\|(Au)^+\|}. \]
Observe that most of the usual norms in $\mathbb{R}^m$, including the $\ell_p$ norms for $1\le p\le\infty$, satisfy the componentwise compatibility condition.

We conclude this introduction by highlighting that our developments for bounding $H_0(A)$ rely critically on the features
of homogeneous systems of inequalities. In contrast to non-homogeneous systems of inequalities and more general affine cone inclusions, homogeneous systems of inequalities and more general homogeneous affine cone inclusions possess a number of attractive properties, as discussed in [5,17,19,22]. In particular, although it is tempting to conjecture that a bound on the non-homogeneous Hoffman constant $H(A,b)$ could be obtained from some $H_0(A_b)$ via homogenization, that is not the case, as we next detail. Indeed, consider the natural homogenization $A_bz \le 0$ of the system of inequalities $Ax \le b$. The following example shows that $H(A,b)$ cannot be bounded above by any reasonable multiple of $H_0(A_b)$. Suppose $0 < \epsilon < 1$. For ease of computation, suppose all relevant spaces are endowed with the $\ell_\infty$ norm. Hence the remarks following Proposition 1 below imply that $H_0(A_b) \le 1$. On the other hand, $H(A,b) \ge 1/\epsilon$ because $Ax \le b$ implies that $x_2 \le 1/\epsilon$ and thus for $\bar x = (0, 2/\epsilon)$ we have $\|\bar x - x\|_\infty \ge 1/\epsilon$ for any $x$ such that $Ax \le b$. Since this holds for any $0 < \epsilon < 1$, it follows that $H(A,b)$ cannot be bounded above in terms of $H_0(A_b)$.
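To make the definition above concrete, the supremum defining $H_0(A)$ can be estimated by brute force in simple cases. The following sketch uses a hypothetical example, not one from this paper: it takes $A$ to be the $2\times 2$ identity, so that $P = \mathbb{R}^2_-$ and both distances in the ratio have closed forms, with the $\ell_2$ norm on $\mathbb{R}^n$ and the $\ell_\infty$ norm on $\mathbb{R}^m$.

```python
import math

# Brute-force estimate of H_0(A) from its definition for the hypothetical
# case A = I (2x2 identity), where P = {x : x <= 0} and dist_2(u, P) = ||u^+||_2.
# With the l_inf norm on R^m, dist(Au, R^m_-) = ||(Au)^+||_inf = max(u^+).

def hoffman_ratio(u):
    up = [max(ui, 0.0) for ui in u]          # u^+ componentwise
    num = math.sqrt(sum(t * t for t in up))  # dist_2(u, P)
    den = max(up)                            # ||(Au)^+||_inf with A = I
    return num / den if den > 0 else 0.0

# sample u over a grid and keep the largest ratio
grid = [k / 10.0 for k in range(-10, 11)]
estimate = max(hoffman_ratio((u1, u2)) for u1 in grid for u2 in grid)
print(estimate)  # sqrt(2), attained in the direction (1, 1)
```

For this particular $A$ the supremum is exactly $\sqrt{2}$, which the grid search recovers since the grid contains the maximizing direction $(1,1)$.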
Upper bounds on $H_0(A)$

Upper bounds on $H_0(A)$ in two special cases

We next consider two special cases that can be seen as dual counterparts of each other.
Proof. For ease of notation, let $H$ denote the right-hand side expression in (1), that is,
\[ H = \max_{\substack{v\in\mathbb{R}^m_+ \\ \|v\|\le 1}} \min\{\|x\| : Ax \le -v\}. \]
Observe that $H < +\infty$ because the assumption on $A$ implies that $A\mathbb{R}^n + \mathbb{R}^m_+ = \mathbb{R}^m$. We need to show that $H_0(A) \le H$. To that end, let $P := \{x\in\mathbb{R}^n : Ax \le 0\}$ and suppose that $u\in\mathbb{R}^n\setminus P$. Let $y := (Au)^+\in\mathbb{R}^m$. The construction of $H$ implies that there exists $x\in\mathbb{R}^n$ such that $Ax \le -y$ and $\|x\| \le H\cdot\|y\| = H\cdot\|(Au)^+\|$. Thus $x + u\in P$ because
\[ A(x+u) = Ax + Au \le -(Au)^+ + Au \le 0. \]
Hence $\text{dist}(u,P) \le \|x\| \le H\cdot\|(Au)^+\| = H\cdot\text{dist}(Au,\mathbb{R}^m_-)$. Since this holds for all $u\in\mathbb{R}^n\setminus P$, it follows that $H_0(A) \le H$.
In addition to the simple direct proof above, an alternative proof of Proposition 1 can also be obtained from [16]. Indeed, [16, Proposition 2] implies that when $A\in\mathbb{R}^{m\times n}$ satisfies the assumption in Proposition 1, the right-hand side in (1) is precisely the global Hoffman constant $H(A)$, which is at least as large as $H_0(A)$ as previously noted.
For computational purposes, it is useful to note that when $\mathbb{R}^m$ is endowed with the $\ell_\infty$ norm, the upper bound in Proposition 1 can be computed via the following convex optimization problem:
\[ \min\{\|x\| : Ax \ge \mathbf{1}\}. \]
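When $\mathbb{R}^n$ carries the $\ell_2$ norm, this problem is equivalent to the quadratic program $\min\{\tfrac12\|x\|^2 : Ax\ge\mathbf{1}\}$, which can be solved through its dual by projected gradient ascent. The following is a minimal sketch on a small hypothetical $3\times 2$ matrix (not an example from this paper); the dual variable $\lambda\ge 0$ maximizes $\mathbf{1}^T\lambda - \tfrac12\lambda^TAA^T\lambda$ and the primal solution is recovered as $x = A^T\lambda$.

```python
import math

# min{||x||_2 : Ax >= 1} via the quadratic reformulation
#   min (1/2)||x||^2  s.t.  Ax >= 1,
# solved through its dual  max_{lam >= 0} 1^T lam - (1/2) lam^T A A^T lam,
# with primal recovery x = A^T lam.  Any feasible x certifies H_0(A) <= ||x||.
# A below is a hypothetical 3x2 instance.

A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
m, n = len(A), len(A[0])

def At_lam(lam):                      # x = A^T lam
    return [sum(A[i][j] * lam[i] for i in range(m)) for j in range(n)]

lam = [0.0] * m
step = 0.2                            # safe: step < 1 / lambda_max(A A^T) = 1/3
for _ in range(2000):
    x = At_lam(lam)
    grad = [1.0 - sum(A[i][j] * x[j] for j in range(n)) for i in range(m)]
    lam = [max(0.0, lam[i] + step * grad[i]) for i in range(m)]

x = At_lam(lam)
bound = math.sqrt(sum(t * t for t in x))
print(bound)  # sqrt(2): the minimizer is x = (1, 1)
```

For this instance the third constraint $x_1 + x_2 \ge 1$ is inactive at the optimum, so its multiplier vanishes and the method converges to $x = (1,1)$ with certified bound $\|x\|_2 = \sqrt{2}$.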
In particular, any $x\in\mathbb{R}^n$ such that $Ax \ge \mathbf{1}$ yields the upper bound $H_0(A) \le \|x\|$.

The following proposition, which can be seen as a dual counterpart of Proposition 1, relies on the dual norms in $\mathbb{R}^m$ and $\mathbb{R}^n$. More precisely, suppose both $\mathbb{R}^m$ and $\mathbb{R}^n$ are endowed with their canonical inner products. In each case let $\|\cdot\|_*$ denote the norm defined as
\[ \|v\|_* := \max_{\|u\|\le 1} \langle v, u\rangle. \]

Proof. We shall assume that $A \ne 0$ as otherwise $H_0(A) = 0$ and (2) trivially holds. Again for ease of notation, let $H$ denote the right-hand side expression in (2). Observe that $H < +\infty$ because the assumption on $A$ implies that $A^T\mathbb{R}^m_+ = A^T\mathbb{R}^m$. We need to show that $H_0(A) \le H$. To that end, suppose $u\in\mathbb{R}^n\setminus P$ and consider the projection of $u$ onto $P$. The optimality conditions of the latter problem imply that there exists a suitable $v\in A^T\mathbb{R}^m_+$. The construction of $H$ implies that there exists $y\in\mathbb{R}^m_+$ such that $A^Ty = v$ and $\|y\|_* \le H$. Since $v = A^Ty$ with $y\in\mathbb{R}^m_+$ and $\|y\|_* \le H$, it follows that $\text{dist}(u,P) \le H\cdot\|(Au)^+\| = H\cdot\text{dist}(Au,\mathbb{R}^m_-)$. Since this holds for all $u\in\mathbb{R}^n\setminus P$, it follows that $H_0(A) \le H$.
For computational purposes, it is useful to note that when $\mathbb{R}^m$ is endowed with the $\ell_\infty$ norm, the dual norm $\|\cdot\|_*$ is the $\ell_1$ norm and hence $\|y\|_* = \mathbf{1}^Ty$ for $y\in\mathbb{R}^m_+$. The reciprocal of the latter quantity in turn is the radius of the largest ball in $A^T(\mathbb{R}^m)$ centered at $0$ and contained in the set $A^T\{y\in\mathbb{R}^m_+ : \mathbf{1}^Ty = 1\}$. Therefore, if in addition $\mathbb{R}^n$ is endowed with the $\ell_2$ norm, then any $\bar y\in\mathbb{R}^m_{++}$ with $\mathbf{1}^T\bar y = 1$ and $A^T\bar y = 0$ yields the upper bound
\[ H_0(A) \le \frac{2}{\sigma^+_{\min}(A^T\bar Y)}, \tag{3} \]
where $\bar Y = \text{Diag}(\bar y)$ and $\sigma^+_{\min}(A^T\bar Y)$ denotes the smallest positive singular value of $A^T\bar Y$. To see why (3) holds, observe that any $v\in A^T(\mathbb{R}^m)$ with $\|v\| \le \sigma^+_{\min}(A^T\bar Y)/2$ can be written as $v = A^Ty$ for some $y\in\mathbb{R}^m_+$ with $\mathbf{1}^Ty \le 1$; since $A^T\bar y = 0$, such $y$ can be adjusted to satisfy $\mathbf{1}^Ty = 1$, and it follows that the radius of the largest ball in $A^T(\mathbb{R}^m)$ centered at $0$ and contained in the set $A^T\{y\in\mathbb{R}^m_+ : \mathbf{1}^Ty = 1\}$ is at least $\sigma^+_{\min}(A^T\bar Y)/2$.
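The bound (3) can be evaluated with elementary linear algebra once a certificate $\bar y$ is available. The following sketch uses a hypothetical $4\times 2$ matrix (not an example from this paper) whose rows come in opposite pairs, so the uniform vector serves as $\bar y$; the smallest positive singular value is read off the $2\times 2$ Gram matrix $MM^T$.

```python
import math

# Upper bound H_0(A) <= 2 / sigma^+_min(A^T Ybar) for a hypothetical 4x2
# matrix A admitting ybar > 0 with A^T ybar = 0 (case (ii)).  The singular
# values of M = A^T Ybar are the square roots of the eigenvalues of M M^T.

AB = [[1.0, 0.0], [-1.0, 0.0], [0.0, 2.0], [0.0, -2.0]]
ybar = [0.25, 0.25, 0.25, 0.25]      # ybar > 0, 1^T ybar = 1, A^T ybar = 0
m, n = len(AB), len(AB[0])

# M = A^T Ybar  (n x m), with Ybar = Diag(ybar)
M = [[AB[i][j] * ybar[i] for i in range(m)] for j in range(n)]

# Gram matrix G = M M^T (2x2) and its eigenvalues via the quadratic formula
G = [[sum(M[r][k] * M[s][k] for k in range(m)) for s in range(n)] for r in range(n)]
tr = G[0][0] + G[1][1]
det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
eigs = [(tr + disc) / 2.0, (tr - disc) / 2.0]

sigma_min_pos = math.sqrt(min(e for e in eigs if e > 1e-12))
print(2.0 / sigma_min_pos)  # 4*sqrt(2) for this instance
```

Here $M$ has orthogonal rows of norms $\tfrac{1}{2\sqrt 2}$ and $\tfrac{1}{\sqrt 2}$, so $\sigma^+_{\min} = \tfrac{1}{2\sqrt 2}$ and the certified bound is $4\sqrt 2$.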

Upper bound on $H_0(A)$ for general $A$
An upper bound on $H_0(A)$ for general $A\in\mathbb{R}^{m\times n}$ follows by stitching together the cases in the above two propositions via the canonical partition result in Proposition 3 and the additional Hoffman constant $H(L,K)$ defined in (4) below.
The following result is a consequence of the classical Goldman-Tucker partition theorem. To make our exposition self-contained, we include a proof.

Proposition 3. Let $A\in\mathbb{R}^{m\times n}$. There exists a unique partition $B\cup N = \{1,\dots,m\}$ such that $A_Bx = 0,\ A_Nx < 0$ for some $x\in\mathbb{R}^n$ and $A_B^T\hat y_B = 0$ for some $\hat y_B > 0$.

Proof. Let $N\subseteq\{1,\dots,m\}$ be the largest subset of $\{1,\dots,m\}$ such that $Ax\le 0$ and $A_Nx < 0$ has a solution. Observe that $N$ is well-defined and unique and thus so is $B := \{1,\dots,m\}\setminus N$. Furthermore, the construction of $N$ implies that $Ax\le 0$ and $A_Nx < 0$ for some $x\in\mathbb{R}^n$. Hence to finish the proof it suffices to show that $A_B^Ty_B = 0,\ y_B > 0$ has a solution. To that end, for $i\in\{1,\dots,m\}$ let $e_i\in\mathbb{R}^m$ be the vector with $i$-th component equal to one and all others equal to zero. Observe that $i\in B$ if and only if the following system of inequalities does not have a solution:
\[ Ax\le 0,\quad e_i^TAx < 0. \]
Farkas' Lemma thus implies that $i\in B$ if and only if the following system of equations and inequalities has a solution:
\[ A^Ty = 0,\quad y\ge 0,\quad y_i > 0. \]
Any such $y$ satisfies $y_N = 0$: for $x$ with $Ax\le 0$ and $A_Nx < 0$ we have $0 = y^TAx = y_B^TA_Bx + y_N^TA_Nx$ with both terms nonpositive, and $A_Nx < 0$ then forces $y_N = 0$. Since this holds for each $i\in B$, summing the corresponding solutions shows that $A_B^Ty_B = 0,\ y_B > 0$ has a solution.

We should note that, depending on $A$, the set $N$ in Proposition 3 could be any subset of $\{1,\dots,m\}$. In particular, $N = \emptyset$ if $A^Ty = 0$ for some $y > 0$, and $N = \{1,\dots,m\}$ if $Ax < 0$ for some $x\in\mathbb{R}^n$. The constant $H(L,K)$ mentioned above is defined as
\[ H(L,K) := \sup_{u\in\mathbb{R}^n\setminus(L\cap K)} \frac{\text{dist}(u, L\cap K)}{\max\{\text{dist}(u,L),\ \text{dist}(u,K)\}}, \tag{4} \]
with the convention that $H(L,K) = 0$ when $L\cap K = \mathbb{R}^n$.
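The two certificates in Proposition 3 are easy to verify numerically once exhibited. The following sketch uses a small hypothetical $3\times 2$ matrix (not an example from this paper) with $B = \{1,2\}$ and $N = \{3\}$, and checks both certificates.

```python
# Numerical check of the canonical partition certificates from Proposition 3
# for a hypothetical A in R^{3x2}: rows 1,2 are (1,0), (-1,0) and row 3 is
# (0,1).  Here B = {1,2} and N = {3}: xhat = (0,-1) satisfies A_B xhat = 0
# and A_N xhat < 0, while yhat_B = (1,1) > 0 satisfies A_B^T yhat_B = 0.

A = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]]
B, N = [0, 1], [2]                     # 0-based row indices
xhat = [0.0, -1.0]
yhatB = [1.0, 1.0]

Ax = [sum(A[i][j] * xhat[j] for j in range(2)) for i in range(3)]
assert all(Ax[i] == 0.0 for i in B)    # A_B xhat = 0
assert all(Ax[i] < 0.0 for i in N)     # A_N xhat < 0

AtB = [sum(A[B[k]][j] * yhatB[k] for k in range(len(B))) for j in range(2)]
assert all(v == 0.0 for v in AtB)      # A_B^T yhat_B = 0
print("partition certificates verified")
```

By the uniqueness statement of Proposition 3, exhibiting both certificates for a candidate pair $B, N$ proves that it is the canonical partition.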
In the remainder of this paper, we will use the following notation for $A\in\mathbb{R}^{m\times n}$: let $B, N$ denote the canonical partition defined by $A$ as in Proposition 3 and let $L\subseteq\mathbb{R}^n$, $K\subseteq\mathbb{R}^n$ be defined as
\[ L := \{x : A_Bx = 0\},\qquad K := \{x : A_Nx \le 0\}. \]
Observe that $L$ is a linear subspace, $K$ is a closed convex cone, and $\{x : Ax\le 0\} = L\cap K$. We now have all the necessary ingredients to upper bound $H_0(A)$.
Theorem 1. Suppose $A\in\mathbb{R}^{m\times n}$ and the norm in $\mathbb{R}^m$ satisfies the componentwise compatibility condition. Let $B, N$ and $L, K$ be as above. Then
\[ H_0(A) \le H(L,K)\cdot\max\{H_0(A_B),\ H_0(A_N)\}. \tag{5} \]
Proof. Suppose $u\in\mathbb{R}^n\setminus P$. The construction of $H(\cdot,\cdot)$ and $H_0(\cdot)$, and the componentwise compatibility condition imply that there exists $x\in P = L\cap K$ such that
\[ \|u - x\| \le H(L,K)\cdot\max\{H_0(A_B),\ H_0(A_N)\}\cdot\|(Au)^+\|. \]
Since this holds for all $u\in\mathbb{R}^n\setminus P$, the inequality in (5) follows.
Observe that unlike $H_0(A)$, which depends on the data representation $A\in\mathbb{R}^{m\times n}$ of the cone $P = \{x : Ax\le 0\}$, the constant $H(L,K)$ only depends on the sets $L\subseteq\mathbb{R}^n$ and $K\subseteq\mathbb{R}^n$. In particular, $H(L,K)$ does not depend on the norm in $\mathbb{R}^m$, while $H_0(A)$ evidently does.
The next proposition provides an upper bound on $H(L,K)$ analogous to the upper bounds on $H_0(A)$ in Proposition 1 and Proposition 2. It will be useful for the computational procedure in Section 3.
Proof. To ease notation, let
\[ H := \min\{\|x\| : x\in L \text{ and } x + u\in K \text{ for all } u \text{ with } \|u\|\le 1\}. \]
We need to show that $H(L,K) \le 1 + 2H$. To that end, suppose $u\in\mathbb{R}^n\setminus(L\cap K)$.

Since this holds for any such $u$, it follows that $H(L,K) \le 1 + 2H$.
For computational purposes, it is useful to note that if $\bar x\in L\cap\text{int}(K)$ is such that $\bar x + u\in K$ for all $\|u\|\le 1$, then Proposition 4 implies that $H(L,K) \le 1 + 2\|\bar x\|$.
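This certificate-based bound on $H(L,K)$ is easy to check numerically. The following sketch uses a hypothetical planar instance (not an example from this paper): $L = \{x : x_1 = 0\}$ and $K = \{x : x_2 \le 0\}$, with $\bar x = (0,-1)$, whose unit ball stays inside $K$.

```python
import math

# Certificate check for the remark after Proposition 4: if xbar lies in
# L ∩ int(K) and the unit ball around xbar stays in K, then
# H(L, K) <= 1 + 2||xbar||.  Hypothetical instance in R^2:
# L = {x : x_1 = 0}, K = {x : x_2 <= 0}, xbar = (0, -1).

xbar = (0.0, -1.0)
assert xbar[0] == 0.0 and xbar[1] < 0.0      # xbar in L ∩ int(K)

# check xbar + u in K for sampled unit vectors u
for k in range(360):
    t = 2.0 * math.pi * k / 360.0
    u = (math.cos(t), math.sin(t))
    assert xbar[1] + u[1] <= 1e-12           # second coordinate stays <= 0

bound = 1.0 + 2.0 * math.hypot(*xbar)
print(bound)  # 3.0
```

For this instance the sampled check mirrors the exact condition $\bar x_2 \le -1$, and the certified bound is $H(L,K) \le 3$.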

A computable procedure to bound $H_0(A)$
We next describe a procedure to compute an upper bound on $H_0(A)$. The procedure consists of four main steps. First, compute the partition $B, N$. Second, compute an upper bound on $H_0(A_B)$. Third, compute an upper bound on $H_0(A_N)$. Fourth, compute an upper bound on $H(L,K)$. An upper bound on $H_0(A)$ thereby follows from Theorem 1. For computational convenience, throughout this section we assume that $\mathbb{R}^m$ is endowed with the $\ell_\infty$ norm and $\mathbb{R}^n$ is endowed with the $\ell_2$ norm. A Python implementation and some illustrative examples of this procedure are publicly available at https://github.com/javi-pena

Algorithm 1 Upper bound on $H_0(A)$
1: input: $A\in\mathbb{R}^{m\times n}\setminus\{0\}$
2: solve (7) to enough accuracy to get a solution $(x, y, s, t)$ to (6) with $t > 0$
3: let $B := \{i : y_i > 0\}$, $N := \{i : s_i > 0\}$
4: if $N\ne\emptyset$ then
5: solve (8) to enough accuracy to get $x\in\mathbb{R}^n$ such that $A_Nx \ge \mathbf{1}$
6: end if
7: if $B\ne\emptyset$ then
8: solve (9) to enough accuracy to get $\bar y\in\mathbb{R}^B_{++}$ such that $\mathbf{1}_B^T\bar y = 1$ and $A_B^T\bar y = 0$
9: end if
10: if $N = \emptyset$ then return the upper bound $H_0(A) \le \dfrac{2}{\sigma^+_{\min}(A_B^T\bar Y)}$
11: end if
12: if $B = \emptyset$ then return the upper bound $H_0(A) \le \|x\|_2$
13: end if
14: let $Q$ be an orthonormal basis for $L := \{x : A_Bx = 0\}$ and $M = DA_NQ$, where $D$ is the diagonal matrix with positive diagonal entries such that all rows of $DA_N$ have Euclidean norm equal to one
15: solve (10) to enough accuracy to get $z$ such that $Mz \ge \mathbf{1}$
16: return the upper bound $H_0(A) \le (1 + 2\|z\|_2)\cdot\max\left\{\|x\|_2,\ \dfrac{2}{\sigma^+_{\min}(A_B^T\bar Y)}\right\}$
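The following end-to-end sketch traces Algorithm 1 on a small hypothetical instance. It does not reproduce the numbered problems (6)-(10): the partition and its certificates, which for this instance are available in closed form, are supplied by hand, and the final bound is assembled exactly as in step 16.

```python
import math

# End-to-end sketch of Algorithm 1 on a hypothetical instance
# A = [[1,0], [-1,0], [0,1]] with the l_inf norm on R^m and l2 on R^n.
# The certificates are supplied by hand instead of solving (6)-(10).

A = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]]
B, N = [0, 1], [2]                     # canonical partition (0-based)

# steps 8 and 10: ybar > 0 with 1^T ybar = 1 and A_B^T ybar = 0; here
# M = A_B^T Ybar = [[1/2, -1/2], [0, 0]] has orthogonal rows, so its
# positive singular values are the nonzero row norms
ybar = [0.5, 0.5]
M = [[A[B[i]][j] * ybar[i] for i in range(2)] for j in range(2)]
row_norms = [math.hypot(*row) for row in M]
sigma_min_pos = min(r for r in row_norms if r > 1e-12)   # = 1/sqrt(2)
bound_B = 2.0 / sigma_min_pos                            # = 2*sqrt(2)

# step 5: minimal-norm x with A_N x >= 1 is x = (0, 1)
x = [0.0, 1.0]
bound_N = math.hypot(*x)                                 # = 1

# steps 14-15: Q spans L = {x : x_1 = 0}; the normalized system M z >= 1
# reduces to 1*z >= 1, so z = 1 and ||z||_2 = 1
z_norm = 1.0

# step 16: stitch the pieces together
bound = (1.0 + 2.0 * z_norm) * max(bound_N, bound_B)
print(bound)  # 6*sqrt(2), approximately 8.485
```

For this instance the binding term is the $A_B$ bound $2\sqrt 2$, and the final certified bound is $H_0(A) \le 3\cdot 2\sqrt 2 = 6\sqrt 2$.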