Generating Valid Linear Inequalities for Nonlinear Programs via Sums of Squares

Valid linear inequalities play a substantial role in linear and convex mixed-integer programming. This article deals with the computation of valid linear inequalities for nonlinear programs. Given a point in the feasible set, we consider the task of computing a tight valid inequality. We reformulate this geometrically as the problem of finding a hyperplane that minimizes the distance to the given point. A characterization of the existence of optimal solutions is given. If the constraints are given by polynomial functions, we show that it is possible to approximate the minimal distance by solving a hierarchy of sum of squares programs. Furthermore, using a result from real algebraic geometry, we show that the hierarchy converges if the relaxed feasible set is bounded. We have implemented our approach, showing that our ideas work in practice.


Introduction
The problem we want to solve is the following: Given a subset S of R^n and a point q in S, find a valid linear inequality for S which is as close as possible to q (a formal definition is given in Sect. 2). Our motivation stems from the fact that valid linear inequalities play an important role in solving mixed-integer linear and mixed-integer convex programs. It is thus a natural task to study valid inequalities for the more general class of mixed-integer nonlinear programs (MINLP). More specifically, we search for valid inequalities for the feasible set F_I and its continuous relaxation F. We also consider the special case where we require the objective and constraint functions to be polynomials, which we refer to as mixed-integer polynomial programming (MIPP). To avoid unnecessary clutter, we state our results for the set S, which can be thought of as being equal to F or F_I.
In mixed-integer linear and convex programming, one is interested in finding valid inequalities for F_I. One reason for this interest is that the convex hull of F_I can be described by finitely many valid inequalities for rational data in the mixed-integer linear case [1]. This result does not generalize to the convex case; however, in mixed-integer convex programming, a common solution approach is the generation of cuts. A cut is a valid linear inequality for the feasible set that is violated at some point of the relaxed feasible set. A second motivation to find valid linear inequalities is "polyhedrification", a special form of convexification [2], that is, outer approximation of the sets F and F_I by polyhedra. Note that the meaning of outer approximation is twofold in the literature: It is the name of a celebrated solution method for a special class of MINLP [3,4], and it also describes the process of relaxing a complicated set to a larger set that is easier to handle.
An early result on cuts in mixed-integer linear programming is the algorithmic generation of so-called Gomory cuts [5]. Later, it was shown that the repeated application of all Gomory-type cuts yields the convex hull of F_I for linear integer programming, see Theorem 1.1 in [6]. Nowadays, the underlying theory of cuts has become quite deep, and the number of different types of cuts is vast, even though the underlying ideas of the cuts are often related. The article [6] is a modern presentation of the most influential cuts, and the article [7] explores the relationships in the cut zoo. Recently, maximal lattice-free polyhedra have attracted attention, since it can be shown that the strongest cuts are derived from maximal lattice-free polyhedra, see [8].
Methods for generating cuts also exist in convex programming; for an early approach, see [9]. A modern introduction into the key ideas on cuts for continuous convex problems is given in [10]. For an overview on cuts for mixed-integer convex problems, we refer to Chapter 4 in [11].
There is work on the computation of valid linear inequalities in the non-convex setting: In [12], the authors compute outer approximations for separable non-convex mixed-integer nonlinear programs, and require the feasible set to be contained in a known polytope, i.e., a compact polyhedron. Another recent approach which has proven to be quite successful is via so-called S-free sets, generalizing the idea of lattice-free polyhedra. For example, in [13], an oracle-based cut generating algorithm is presented that computes an arbitrarily precise (as measured by the Hausdorff distance) approximation of the convex hull of a closed set, if the latter is contained in a known polytope. Related is the work [14], and its extension [15], where the authors derive cuts for a pre-image of a closed set under a linear mapping via so-called cut generating functions and show that these functions are intimately related to S-free sets. There are also new results on cut generation for special cases. A framework is proposed in [16] that creates valid inequalities for a reverse convex set using a cut generating linear program. Also, maximal S-free sets for quadratically constrained problems have been computed [17].
Finding a valid linear inequality can be considered as a hyperplane location problem. For an overview on a location theory approach, we refer to [18,19].
In this article, sos programming plays a key role. This technique can be traced back to [20][21][22][23]. See [24] for a survey, and [25] for a focus on geometric aspects. An algebraic approach is [26]. For a concise treatment, let us mention [27].
Nonlinear mixed-integer programming itself is a large problem class, and the literature is extensive. For an overview and several pointers to key results, let us mention the survey [28], as well as [11,29].
The remainder of the paper is structured as follows. Section 2 settles notation and prerequisites from sum of squares programming. Section 3 formulates the task of finding a valid linear inequality for a given subset S of R^n that is close to a point q in S. The distance is measured by a gauge. We give geometric characterizations that ensure the existence of feasible and optimal solutions and formulate the problem as a non-convex and semi-infinite optimization problem. In order to make the problem tractable, we first linearize the objective function using a result from [30] in Sect. 4. In Sect. 5, we give a convex reformulation if the distance is measured by a polyhedral gauge. Furthermore, restricting ourselves to a semi-algebraic set S in Sect. 6, we obtain a hierarchy of sos programs whose values converge to the optimal value of the original program if S is bounded. Illustrating examples are provided in Sect. 7. Extensions are discussed in Sect. 8.
Our first main contribution (definitions are deferred to Sect. 2) is Proposition 5.1: If valid linear inequalities exist, we can find one that is tight with respect to a polyhedral gauge by solving finitely many linear semi-infinite problems. This result holds without any structural assumptions on the set S, which is, to the best of our knowledge, a new contribution. The second main contribution is Theorem 6.1: If S is semi-algebraic and the gauge polyhedral, we can give a weakened formulation in terms of a hierarchy of sos programs. Provided feasibility, the hierarchy yields a sequence of hyperplanes with decreasing distances to q. If the corresponding quadratic module is Archimedean, we can guarantee convergence of the hierarchy towards a tight valid inequality. In contrast with the approach of [12], we do not require S to be contained in a polytope to produce feasible solutions (see Sect. 7 for an unbounded example). Similarly, for the method in [13], an oracle is needed, and in the case of polynomial constraints, an oracle is only provided if the feasible set is contained in a polytope. The approach in [14] is not algorithmic and is, without further research, not directly applicable in our setting.
Regarding the scope of the paper, let us note that, for MINLP, it is an important question how cuts can be generated from valid inequalities. One reason for this is that they can improve the strength of an optimization model by only using information inherent in the model, see, e.g., [5,6,8,[13][14][15][16][17]. Also, it is an interesting question how different choices of q affect the obtained valid inequalities. Both questions are not within the scope of this article and are left as starting points for future research.

Throughout this article, the natural numbers N do not contain 0, and we denote N_0 := N ∪ {0} and put [k] := {1, . . . , k} for k ∈ N.

Tight Valid Inequalities
An inequality (a^T x ≤ b) is given by some a ∈ R^n, a ≠ 0, and b ∈ R. We say that the inequality is
- valid for S if a^T x ≤ b holds for all x ∈ S,
- violated by some x if a^T x > b,
- tight for S if it is valid for S and sup_{x∈S} a^T x = b,
- tight for S at q ∈ S if the inequality is tight for S and a^T q = b.
The associated hyperplane is denoted by H(a, b) := {x ∈ R^n : a^T x = b}.

Polynomials, Sum of Squares, and Quadratic Modules
We denote the ring of polynomials in n unknowns X_1, . . . , X_n with coefficients in R by R[X_1, . . . , X_n]. A polynomial is a sum of squares, or sos for short, if it has a representation as a sum of squared polynomials. Formally, p ∈ R[X_1, . . . , X_n] is sos if there are q_1, . . . , q_l ∈ R[X_1, . . . , X_n] with

p = q_1^2 + · · · + q_l^2. (1)
We denote the set of all sos polynomials by Σ_n. What makes this notion useful is that an immediate consequence of a representation of p as in (1) is that p is nonnegative on all of R^n, and the q_i certify nonnegativity. It turns out that deciding whether a polynomial is a sum of squares can be reformulated as a semidefinite program (SDP), and SDPs in turn are well understood and can be solved efficiently, see, e.g., [31,32]. The set Σ_n of all sos polynomials is a convex cone in R[X_1, . . . , X_n].
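The SDP reformulation rests on the Gram matrix characterization: p is sos if and only if p(x) = m(x)^T Q m(x) for some positive semidefinite matrix Q and a vector m(x) of monomials. The following small numerical sketch (our own illustration, not part of the source) verifies such a certificate for the polynomial p(x) = x^4 + 2x^2 + 1, whose decomposition p = (1 + x^2)^2 corresponds to the Gram matrix below.

```python
import numpy as np

# p(x) = x^4 + 2x^2 + 1.  A polynomial p is sos iff p = m(x)^T Q m(x)
# for a positive semidefinite "Gram" matrix Q and a monomial vector m(x).
# Here m(x) = (1, x, x^2); this Q corresponds to p = (1 + x^2)^2.
Q = np.array([[1.0, 0.0, 1.0],
              [0.0, 0.0, 0.0],
              [1.0, 0.0, 1.0]])

def p(x):
    return x**4 + 2 * x**2 + 1

def gram_value(x):
    m = np.array([1.0, x, x**2])
    return m @ Q @ m

# Q is PSD, so the representation certifies p >= 0 on all of R.
assert np.linalg.eigvalsh(Q).min() >= -1e-9

# Sanity check: the Gram representation reproduces p at sample points.
for x in np.linspace(-2.0, 2.0, 9):
    assert abs(p(x) - gram_value(x)) < 1e-9
print("Gram matrix certifies p as a sum of squares")
```

In an sos solver, the entries of Q are the semidefinite decision variables; here we only check a fixed certificate.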
We also need the notion of semi-algebraic sets: Given a finite collection of multivariate polynomials h_1, . . . , h_s ∈ R[X_1, . . . , X_n], consider the subset of R^n where all polynomials h_i attain nonnegative values,

K(h_1, . . . , h_s) := {x ∈ R^n : h_i(x) ≥ 0 for all i ∈ [s]}. (2)

A subset of R^n is called basic closed semi-algebraic, or in this article semi-algebraic for short, if it is of the form (2) for some polynomials h_1, . . . , h_s. For example, we note that for a MIPP with constraint polynomials h_1, . . . , h_s, we have F = K(h_1, . . . , h_s). Given the constraints h_i(x) ≥ 0 of a MIPP, a way to infer further valid inequalities is to scale the h_i by sos (and thus nonnegative) polynomials and add them up. This is formalized in the algebraic definition of the quadratic module generated by the h_i,

M(h_1, . . . , h_s) := {σ_0 h_0 + σ_1 h_1 + · · · + σ_s h_s : σ_0, . . . , σ_s ∈ Σ_n}, (3)

where h_0 := 1.
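By construction, every element of the quadratic module is nonnegative on K(h_1, . . . , h_s), since each term σ_i h_i is nonnegative there, while no such guarantee holds outside. A numeric sketch of this mechanism (with hypothetical choices of h_1, σ_0, σ_1, not taken from the source):

```python
import numpy as np

# p = sigma0 + sigma1 * h1 with sigma0, sigma1 sos lies in M(h1) and is
# therefore nonnegative wherever h1 >= 0, but may go negative outside K(h1).
def h1(x):
    return 1 - x**2                # K(h1) = [-1, 1]

def sigma0(x):
    return (x - 0.5)**2            # a square, hence sos

sigma1 = 2.0                       # nonnegative constant = (sqrt(2))^2, sos

def p(x):
    return sigma0(x) + sigma1 * h1(x)

xs = np.linspace(-1.0, 1.0, 201)   # sample points of K(h1)
assert (p(xs) >= 0).all()          # nonnegative on K(h1), as certified
assert p(2.0) < 0                  # no guarantee outside K(h1)
print("membership in M(h1) certifies nonnegativity on K(h1) only")
```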
In sos programming, the coefficients of the σ_i appearing in (3) are the unknowns over which we optimize. As there is no degree bound on the σ_i, this is impractical. Hence, we instead use the truncated quadratic module of order k ∈ {−∞} ∪ N, given by

M(h_1, . . . , h_s)[k] := {σ_0 h_0 + σ_1 h_1 + · · · + σ_s h_s : σ_i ∈ Σ_n, deg(σ_i h_i) ≤ k for all i}, (4)

where, again, h_0 := 1.
We address now the question how polynomials in M(h_1, . . . , h_s) and polynomials nonnegative on K(h_1, . . . , h_s) are related. The following observation, which derives a geometric statement from an algebraic one, follows directly from the definitions of M and K.

Observation 2.1 Every p ∈ M(h_1, . . . , h_s) is nonnegative on K(h_1, . . . , h_s).

The question addressing the "converse direction", namely whether a polynomial p that is nonnegative on K(h_1, . . . , h_s) satisfies p ∈ M(h_1, . . . , h_s), is more difficult to answer. Conditions that guarantee such representations are addressed in Positivstellensätzen. In this article, we use a Positivstellensatz by Putinar. It holds under a technical condition that we outline next.

The Archimedean Property and Putinar's Positivstellensatz
The condition needed for the Positivstellensatz to hold is that the quadratic module M(h_1, . . . , h_s) is Archimedean. The following equivalent characterization is useful for our purposes.

Theorem 2.1 The quadratic module M(h_1, . . . , h_s) is Archimedean if and only if there is some N ∈ N such that N − X_1^2 − · · · − X_n^2 ∈ M(h_1, . . . , h_s).
A consequence of M(h_1, . . . , h_s) being Archimedean, which is straightforward and well known but nevertheless important, is that the basic closed semi-algebraic set K(h_1, . . . , h_s) associated with the polynomials h_i is compact.
Proof This follows from Observation 2.1 and Theorem 2.1.
On the other hand, if K (h 1 , . . . , h s ) is compact, then M(h 1 , . . . , h s ) need not be Archimedean, see, e.g., Example 7.3.1 in [26]. As one of our main results (Theorem 6.1) requires M(h 1 , . . . , h s ) to be Archimedean, it is a natural question if one can decide whether a given quadratic module satisfies this property.
Remark 2.1 It is possible to enforce the Archimedean property on the associated quadratic module M if we have a known bound R ≥ 0 such that x_1^2 + · · · + x_n^2 ≤ R for all x ∈ S. Specifically, after adding the redundant constraint h_{s+1}(x) := R − x_1^2 − · · · − x_n^2 ≥ 0, the quadratic module M(h_1, . . . , h_{s+1}) is Archimedean, see, e.g., [33].

Gauge Functions
A gauge is a function γ : R^n → R of the form

γ(x) = γ(x; A) := inf{λ ≥ 0 : x ∈ λA}

for A ⊂ R^n compact and convex with 0 ∈ int A. Note that every norm with unit ball B is the gauge γ(·; B). On the other hand, a gauge γ(·; A) satisfies definiteness, positive homogeneity, and the triangle inequality, as norms do. It is a norm if additionally absolute homogeneity holds, equivalently, if A is symmetric, i.e., −A = A.
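For a polytope A = conv{v_1, . . . , v_m} with 0 ∈ int A, the gauge value γ(x; A) can be computed by a small linear program, since x ∈ λA is equivalent to x = Σ μ_i v_i with μ ≥ 0 and Σ μ_i = λ. The following sketch (our own illustration, assuming SciPy is available) checks this against the 1-norm, whose unit ball is the cross-polytope.

```python
import numpy as np
from scipy.optimize import linprog

# gamma(x; A) = inf{ lam >= 0 : x in lam * A }.  For A = conv{v_1,...,v_m}
# with 0 in int A this is the LP
#   min sum(mu)  s.t.  sum_i mu_i v_i = x,  mu >= 0,
# because x in lam*A  <=>  x = sum mu_i v_i with sum mu_i = lam, mu >= 0.
def polyhedral_gauge(x, vertices):
    V = np.column_stack(vertices)          # one column per vertex of A
    m = V.shape[1]
    res = linprog(c=np.ones(m), A_eq=V, b_eq=x, bounds=(0, None))
    assert res.success
    return res.fun

# The cross-polytope conv{+-e1, +-e2} is the unit ball of the 1-norm,
# so its gauge must agree with ||.||_1.
verts = [np.array([1.0, 0.0]), np.array([-1.0, 0.0]),
         np.array([0.0, 1.0]), np.array([0.0, -1.0])]
x = np.array([0.3, -0.7])
assert abs(polyhedral_gauge(x, verts) - np.abs(x).sum()) < 1e-8
print("gauge of the cross-polytope equals the 1-norm")
```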
Given C, D ⊂ R^n and a gauge γ on R^n, the distance from C to D is

d(C, D) := inf{γ(y − x) : x ∈ C, y ∈ D}.

For a singleton set C = {c}, we also write d(c, D). Analogous to norms, the distance measured by gauges between two nonempty sets C, K ⊂ R^n is attained if C is compact and K closed, i.e., there exist c* ∈ C and k* ∈ K with

d(C, K) = γ(k* − c*) (5)

in this case.
The polar of a set A ⊂ R^n is

A° := {y ∈ R^n : y^T x ≤ 1 for all x ∈ A}. (6)

It can be shown that if A is compact and convex with 0 ∈ int A, then the same holds for A°, see, e.g., Corollary 14.5.1 in [34]. For a gauge γ = γ(·; A), the function

γ°(y) := sup{y^T x : γ(x) ≤ 1}

is the polar of γ. It then holds that

γ(·; A)° = γ(·; A°), (7)

that is, the polar of a gauge is again a gauge, see, e.g., Theorem 15.1 in [34].
We also consider gauges γ that are polyhedral: A gauge γ(·; A) is called polyhedral if A is a polyhedron. As the polar of a polyhedron is a polyhedron (see, e.g., Corollary 19.2.2 in [34]), it is clear in view of (7) that the polar of a polyhedral gauge is again a polyhedral gauge. For a polyhedral gauge γ(·; A), the extreme points of A are called the fundamental directions of γ.

A Geometric Reformulation and Its Properties
In this section we formulate the task of finding a tight valid linear inequality as the following geometric optimization problem: Given q ∈ S, find a valid linear inequality for S such that the associated hyperplane has minimum distance (measured by an arbitrary gauge function) to q:

min d(q, H(a, b))  s.t.  a^T x ≤ b for all x ∈ S,  (a, b) ∈ R^n × R, a ≠ 0. (V1)
Let us interpret solutions of Program V1 geometrically.

Observation 3.1 Let q ∈ S ⊂ R^n. Then:
1. Every feasible solution (a, b) of V1 yields a valid inequality for S.
2. Every optimal solution (a, b) yields an inequality that is tight for S.
3. Every optimal solution (a, b) with objective value 0 yields an inequality that is tight for S at q.
Proof Claim 1 is clear. To see Claim 2, let (a, b) be an optimal solution and assume the contrary, i.e., that the inequality is not tight for S, so that b' := sup_{x∈S} a^T x < b. Choose u ∈ H(a, b) with d(q, u) = d(q, H(a, b)). Using q ∈ S, we get the inequalities a^T q ≤ b' < b = a^T u and hence a^T (u − q) > 0. Put

û := q + (b' − a^T q)/(a^T (u − q)) · (u − q).
Observe that û is a point on H(a, b'): Indeed, a^T û = b' by construction. Moreover, û lies on the segment from q to u, so that d(q, H(a, b')) ≤ d(q, û) < d(q, u) = d(q, H(a, b)).
Hence (a, b ) is a feasible solution to V1 with better objective value, contradicting optimality of (a, b).
To see Claim 3, note that if the objective value is 0 at (a, b), we know from (5) that d(q, u) = 0 for some u ∈ H(a, b), hence q = u and we conclude q ∈ H(a, b), i.e., a^T q = b. Together with Claim 2, the claim follows.
It turns out that feasibility of V1 is sufficient for the existence of optimal solutions.

Theorem 3.1 Let S ⊂ R^n and q ∈ S. Then, the following are equivalent:
1. Program V1 is feasible.
2. Program V1 has an optimal solution.
3. conv S ≠ R^n.

Proof For the implication Claim 3 ⇒ Claim 1, let z ∈ R^n \ conv S. By the Separating Hyperplane Theorem (see, e.g., Theorem 4.4 in [35]), we may separate z from conv S by a hyperplane H(a, b) with a^T x ≤ b for all x ∈ conv S, and this hyperplane yields a feasible solution to Program V1.
To see Claim 1 ⇒ Claim 2, we construct an optimal solution that corresponds to a supporting hyperplane at a suitably chosen point on the boundary of the closure of the convex hull of S. So let (a^T x ≤ b) be an inequality that is valid for S and thus for conv S. Moreover, as half-spaces are closed, (a^T x ≤ b) remains valid for C := cl conv S, and we conclude C ≠ R^n. Also, C is convex as it is the closure of a convex set (see, e.g., Corollary 11.5.1 in [34]). As q ∈ S ⊂ C, the set C is a nonempty, proper, closed subset of R^n, so its boundary B := bd C is nonempty. As B is closed, (5) ensures the existence of x_1 ∈ B with d(q, x_1) = d(q, B). By the Supporting Hyperplane Theorem (see, e.g., Chapter 2.5.2, p. 51 in [36]), there is a supporting hyperplane H_1 := H(a_1, b_1) of C at x_1, and (a_1^T x ≤ b_1) is feasible for V1 with d_1 := d(q, H_1) = d(q, x_1). Assume that (a_1, b_1) is not optimal. Then there is a feasible solution (a_2, b_2) such that (a_2^T x ≤ b_2) is valid for S and the corresponding hyperplane H_2 := H(a_2, b_2) satisfies the inequality d_2 := d(q, H_2) < d_1. By (5), there is x_2 ∈ H_2 with d(q, x_2) = d_2. We now distinguish two possible locations for x_2 and derive a contradiction in every case.
1. x_2 ∈ R^n \ int C: As q ∈ C, the line segment from x_2 to q crosses the boundary B of C at a point x_3. But then d(q, x_3) ≤ d_2 < d_1, contradicting the choice of x_1.
2. x_2 ∈ int C: Since half-spaces are convex and closed, the half-space {x ∈ R^n : a_2^T x ≤ b_2} contains C, so the hyperplane H_2 cannot meet the interior of C, a contradiction.
We conclude that x_2 cannot exist, so neither can (a_2, b_2). Hence (a_1, b_1) is an optimal solution to Program V1.
There is nothing to prove for Claim 2 ⇒ Claim 1.

Linearizing the Objective
In this section, we linearize the objective function d(q, H(a, b)) in Program V1. As a first step, we use an analytic expression for the objective from the literature.

Lemma 4.1 (see [30]) Let γ be a gauge on R^n, let (a, b) define a hyperplane H(a, b), and let q ∈ R^n with a^T q ≤ b. Then

d(q, H(a, b)) = (b − a^T q)/γ°(a). (9)

Let us first note that γ°(a) > 0: Since γ is a gauge, there is A ⊂ R^n compact, convex with 0 ∈ int A such that γ(·) = γ(·; A). By (7), we have γ°(·) = γ(·; A°), and we saw in Sect. 2.4 that A° is also compact, convex with 0 ∈ int A°. Thus γ°(x) = 0 if and only if x = 0, hence γ°(a) > 0. The variable a enters the fraction in (9) in a nonlinear fashion. Moreover, the constraint a ≠ 0 is not closed. Now compare Program V1 with the following program, which has a linear objective and avoids a constraint of the form a ≠ 0:

min b − a^T q  s.t.  a^T x ≤ b for all x ∈ S,  γ°(a) ≥ 1. (V2)

It turns out that Programs V1 and V2 are closely related. To this end, let us introduce the following notion: Two solutions (a, b) and (a', b') are geometrically equivalent if H(a, b) = H(a', b').
We say that two programs have geometrically equivalent feasible/optimal solutions if for every feasible/optimal solution (a, b) of the first program there is a geometrically equivalent feasible/optimal solution (a , b ) of the second program and vice versa.

Proposition 4.1
Let q ∈ S ⊂ R n and γ be a gauge on R n . Then, the following hold: 1. Programs V1 and V2 have geometrically equivalent feasible solutions.

2. The optimal values of both programs coincide.
3. In particular, both programs have geometrically equivalent optimal solutions.
Proof By (9) and using the fact that a^T q ≤ b for feasible solutions, Program V1 is equivalent to

min (b − a^T q)/γ°(a)  s.t.  a^T x ≤ b for all x ∈ S,  a ≠ 0. (V1')

For the first claim, let (a, b) be feasible for V1'. Then γ°(a)^{-1} · (a, b) is feasible for V2 and geometrically equivalent to (a, b). On the other hand, if (a, b) is feasible for V2, it is also feasible for V1'. For the second claim, we note that the programs V1' and V2 are either both feasible or both infeasible, so in the following we may assume that they are feasible. Let z_1 be the optimal value of V1' and z_2 be the optimal value of V2. Let (a, b) be feasible for V1'. Then γ°(a)^{-1} · (a, b) is feasible for V2, is geometrically equivalent to (a, b), and attains in V2 the same objective value that (a, b) attains in V1'. This shows that z_2 ≤ z_1. Now, let (a, b) be feasible for V2. Then (a, b) is feasible for V1'. Since γ°(a) ≥ 1, we have

(b − a^T q)/γ°(a) ≤ b − a^T q.

As (a, b) was an arbitrary feasible solution to V2, we have shown that z_1 ≤ z_2. The claim about geometrically equivalent optimal solutions follows immediately from the first two statements.
To summarize, instead of solving V1 we may solve V2.
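For the Euclidean norm, which is its own polar gauge, the distance formula (9) reduces to the classical point-to-hyperplane distance. The following numeric sketch (our own check, with arbitrarily chosen data) verifies it against an explicit projection.

```python
import numpy as np

# d(q, H(a,b)) = (b - a^T q) / gamma°(a) for a^T q <= b.  For the Euclidean
# norm, gamma° is again the Euclidean norm, so the formula must agree with
# the distance to the projection of q onto H(a, b).
a = np.array([2.0, -1.0, 3.0])
b = 5.0
q = np.array([0.5, 1.0, -0.2])
assert a @ q <= b                       # q lies on the valid side

d_formula = (b - a @ q) / np.linalg.norm(a)

# Projection of q onto H(a, b): move along a until a^T x = b.
x_star = q + ((b - a @ q) / (a @ a)) * a
assert abs(a @ x_star - b) < 1e-9       # x_star lies on the hyperplane
d_projection = np.linalg.norm(x_star - q)

assert abs(d_formula - d_projection) < 1e-9
print("distance formula matches the Euclidean projection")
```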

A Linear Semi-Infinite Program for Polyhedral Gauges
Program V2 contains the non-convex constraint γ • (a) ≥ 1. This constraint can be linearized if we restrict ourselves to polyhedral gauges. This is not a hard restriction since due to [37] every norm can be approximated arbitrarily closely by a block norm, and similarly every gauge by a polyhedral gauge.
We need the following characterization of the facets of a unit ball in terms of the extreme points of the polar polyhedron defined in (6).

Theorem 5.1 (see, e.g., [38]) Let B ⊂ R^n be a polytope with 0 ∈ int B, and let v_1, . . . , v_l be the extreme points of B°. Then

B = ∩_{j∈[l]} {x ∈ R^n : v_j^T x ≤ 1}. (10)
We use this characterization as follows.

Corollary 5.1 Let γ be a polyhedral gauge and denote its fundamental directions by v_1, . . . , v_l. Then
{x ∈ R^n : γ°(x) ≥ 1} = ∪_{j∈[l]} {x ∈ R^n : v_j^T x ≥ 1} (11)

and

{x ∈ R^n : γ°(x) = 1} = ∪_{j∈[l]} ({x ∈ R^n : v_j^T x = 1} ∩ ∩_{i∈[l]} {x ∈ R^n : v_i^T x ≤ 1}).

Proof For the interior of the unit ball B° of γ° it holds that

int B° = {x ∈ R^n : γ°(x) < 1}. (12)

Since B° is polyhedral, we have from (10)

int B° = {x ∈ R^n : v_j^T x < 1 for all j ∈ [l]}.

This means that the set {x ∈ R^n : γ°(x) ≥ 1} equals the complement ∪_{j∈[l]} {x ∈ R^n : v_j^T x ≥ 1}, proving the first equality. The second equality follows from the fact that

{x ∈ R^n : γ°(x) = 1} = {x ∈ R^n : γ°(x) ≥ 1} ∩ {x ∈ R^n : γ°(x) ≤ 1} (13)

and then using distributivity of union and intersection of sets on the explicit representations (10) and (11) of the two sets on the right-hand side of (13).
The idea now is to use Corollary 5.1 to decompose the nonlinear program V2 into a set of l linear programs, one for each fundamental direction v_j, j ∈ [l], of the polyhedral gauge γ. These programs are given as

min b − a^T q  s.t.  a^T x ≤ b for all x ∈ S,  v_j^T a ≥ 1. (V3_j)

The relation between the programs V3_j and V2 is described next.

Proposition 5.1 Let q ∈ S ⊂ R^n and γ be a polyhedral gauge on R^n with fundamental directions v_1, . . . , v_l ∈ R^n. Denote the optimal value of Program V2 by z* and that of V3_j by z*_j. Let (a, b) ∈ R^n × R. Then the following hold:
1. (a, b) is feasible for V2 if and only if it is feasible for V3_j for some j ∈ [l].
2. z* = min_{j∈[l]} z*_j.
3. (a, b) is an optimal solution of V2 if and only if there is j_0 ∈ [l] with z*_{j_0} = min_{j∈[l]} z*_j such that (a, b) is an optimal solution to V3_{j_0}.
Proof Denote the feasible set of V2 by F and of V3 j by F j . From (11) we then have F = j∈[l] F j and all claims follow easily.

Remark 5.1
Note that V3 j has a single linear constraint involving the fundamental directions. This is the reason why in V2, we did not use the constraint γ • (a) = 1 instead of γ • (a) ≥ 1: In view of Corollary 5.1, we would have l additional constraints involving the fundamental directions.

Remark 5.2
For practical purposes, let us note that the number l of fundamental directions of a gauge, and therefore the number of programs V3_j, can vary tremendously. For example, the gauge given by the 1-norm on R^n has l = 2n fundamental directions, i.e., the number is linear in the dimension n, whilst the gauge given by the ∞-norm on R^n has l = 2^n fundamental directions, i.e., exponential in the dimension, see, e.g., p. 5 in [39].
To summarize, instead of solving V1 we may solve the linear and semi-infinite programs V3 j for all j ∈ [l]. How the semi-infinite constraint can be circumvented is shown in the next section.
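When S is replaced by finitely many sample points, each semi-infinite program V3_j collapses to an ordinary LP, which makes the decomposition of Proposition 5.1 easy to illustrate. The following sketch (our own discretized toy example, assuming SciPy; validity is only guaranteed at the samples, not on all of S) solves the four LPs for the 1-norm gauge on a sampled unit circle.

```python
import numpy as np
from scipy.optimize import linprog

# Discretized V3_j:  min b - a^T q  s.t.  a^T x <= b for all samples x,
#                    v_j^T a >= 1,   in the variables (a, b).
theta = np.linspace(0.0, 2 * np.pi, 360, endpoint=False)
samples = np.column_stack([np.cos(theta), np.sin(theta)])  # sampled S
q = np.array([0.6, 0.0])                                   # point in S

# fundamental directions of the 1-norm gauge (extreme points of its ball)
directions = [np.array([1.0, 0.0]), np.array([-1.0, 0.0]),
              np.array([0.0, 1.0]), np.array([0.0, -1.0])]

best = np.inf
for v in directions:
    c = np.array([-q[0], -q[1], 1.0])                    # objective b - a^T q
    A_ub = np.hstack([samples, -np.ones((len(samples), 1))])  # a^T x - b <= 0
    A_ub = np.vstack([A_ub, [-v[0], -v[1], 0.0]])             # v^T a >= 1
    b_ub = np.concatenate([np.zeros(len(samples)), [-1.0]])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(None, None))
    if res.success:
        best = min(best, res.fun)

# The 1-norm distance from q = (0.6, 0) to the boundary of the disk is 0.4,
# attained by the supporting hyperplane x_1 <= 1 (direction v = (1, 0)).
assert abs(best - 0.4) < 1e-6
print(f"minimal distance over all V3_j LPs: {best:.4f}")
```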

An Approximating Hierarchy for Polynomial Constraints and Polyhedral Gauges
In this section we approximate Program V3_j by a hierarchy of sos programs. The main reason is that the constraint a^T x ≤ b for all x ∈ S is semi-infinite if S contains infinitely many points. There is much literature on semi-infinite programming problems. Classical overview articles are, e.g., [40,41]; a more recent survey is [42]. A bi-level approach is explored in [43]. Also, several numerical solution methods exist; for an overview, we refer to [44][45][46]. However, in this article we take a different route. Let us explore how semi-infinite constraints can be sidestepped by the requirement of a semi-algebraic S and a polyhedral gauge γ. For example, when considering MIPP, the set S = F is semi-algebraic. In the following, we use an arbitrary basic closed semi-algebraic set S = K(h_1, . . . , h_s).
With this in mind, we consider the following hierarchy of programs, where k ∈ N:

min b − a^T q  s.t.  b − a^T X ∈ M(h_1, . . . , h_s)[k],  v_j^T a ≥ 1. (VR_{j,k})

Here b − a^T X denotes the linear polynomial p(a, b) := b − a_1 X_1 − · · · − a_n X_n. The number k is called the truncation order of program VR_{j,k}. Next we show that VR_{j,k} is an sos program, i.e., that it has the form

max c^T y  s.t.  p_{i0} + y_1 p_{i1} + · · · + y_m p_{im} ∈ Σ_n, i ∈ [r], (SOSP)

for c ∈ R^m and fixed polynomials p_{i0}, p_{ij} ∈ R[X_1, . . . , X_n], i ∈ [r], j ∈ [m], and decision variables y ∈ R^m. This is helpful since it is possible to solve sos programs. For a detailed introduction to sos programming, we refer to [24,25].

Proposition 6.1 Program VR j,k is an sos program.
Proof As is common in sos programming, a constraint of the form

p_0 + y_1 p_1 + · · · + y_m p_m ∈ M(h_1, . . . , h_s)[k]

for some p_i ∈ R[X_1, . . . , X_n] and k ∈ N translates to classical sos programming constraints as follows: The statement is, using the fact that h_0 = 1 in the defining equation (4) of M[k], equivalent to the existence of sos polynomials σ_0, . . . , σ_s with deg(σ_i h_i) ≤ k such that

p_0 + y_1 p_1 + · · · + y_m p_m = σ_0 + σ_1 h_1 + · · · + σ_s h_s.

The degree bounds ensure that only finitely many real decision variables appear in the σ_i, and thus the constraint can be rewritten as constraints of the form SOSP. Note that we tacitly assume h_i ≠ 0; otherwise we may remove the constraint. Also, linear programming constraints can be used in sos programming, since for c ∈ R, the requirement c ≥ 0 is equivalent to c ∈ Σ_n. Finally, we note that the objective of Program VR_{j,k} is linear.
The next proposition shows that feasible solutions to Program VR_{j,k} yield feasible solutions to Program V3_j.

Proposition 6.2 Every solution (a, b) that is feasible for VR_{j,k} is feasible for V3_j.

Proof Let (a, b) be feasible for VR_{j,k}. Then b − a^T X ∈ M(h_1, . . . , h_s)[k] ⊂ M(h_1, . . . , h_s), so by Observation 2.1 the polynomial b − a^T X is nonnegative on S = K(h_1, . . . , h_s), i.e., a^T x ≤ b for all x ∈ S. Hence (a, b) is feasible for V3_j.
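A small worked instance of the truncation (our own, not from the source) already shows the hierarchy at work: for S = K(h_1) with h_1 = 1 − X^2 (so S = [−1, 1]), q = 0, and normalization a = 1, the order-2 constraint b − X = σ_0(X) + c · (1 − X^2) with σ_0 sos and c ≥ 0 forces σ_0(X) = c X^2 − X + (b − c) to be nonnegative, which holds iff c ≥ 0, b ≥ c and 1 ≤ 4c(b − c), i.e., b ≥ c + 1/(4c). The sketch below checks numerically that the minimum is b = 1 at c = 1/2, so the tight inequality x ≤ 1 is recovered already at order k = 2.

```python
import numpy as np

# Minimize b(c) = c + 1/(4c) over c > 0 on a fine grid.
cs = np.linspace(0.01, 3.0, 10_000)
b_of_c = cs + 1.0 / (4.0 * cs)
b_star = b_of_c.min()
c_star = cs[b_of_c.argmin()]

# The minimum b = 1 at c = 1/2 recovers the tight valid inequality x <= 1.
assert abs(b_star - 1.0) < 1e-4
assert abs(c_star - 0.5) < 1e-2

# Check the certificate identity 1 - X = sigma0(X) + c (1 - X^2) for
# (b, c) = (1, 1/2), where sigma0(X) = 0.5 (X - 1)^2 is a square, hence sos.
xs = np.linspace(-2, 2, 41)
sigma0 = 0.5 * xs**2 - xs + 0.5
assert np.allclose(1 - xs, sigma0 + 0.5 * (1 - xs**2))
print("order-2 truncation already yields the tight inequality x <= 1")
```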
The next theorem shows that, if M(h_1, . . . , h_s) is Archimedean, we obtain a hierarchy of sos programs indexed by the truncation order k, producing a sequence of valid inequalities for S. Moreover, as k → ∞, the distance of the hyperplanes to the point q is monotonically decreasing and converges to the optimal value of V1.
Now fix some ε > 0, and note that p(a*, b* + ε) is positive on S. The Positivstellensatz then yields k_ε ∈ N with p(a*, b* + ε) ∈ M(h_1, . . . , h_s)[k_ε], so (a*, b* + ε) is feasible for VR_{j,k_ε}. Concerning Claim 2, we note that compactness of S implies feasibility of V1, and hence z* < +∞. By Theorem 3.1, V1 has an optimal solution (a, b). By Proposition 4.1 and rescaling if necessary, we may further assume that (a, b) solves V2 to optimality, that is, z* = b − a^T q. By Proposition 5.1, there is j_0 ∈ [l] such that (a, b) solves V3_{j_0} to optimality. Let ε > 0. Hence, the linear polynomial p(a, b + ε) is positive on S, and by the Positivstellensatz there is k_ε ∈ N with p(a, b + ε) ∈ M(h_1, . . . , h_s)[k_ε], so (a, b + ε) is feasible for VR_{j_0,k_ε}. Note that z* ≤ z_{j_0,k_ε}, and our estimates combine to

z* ≤ z_{j_0,k_ε} ≤ b + ε − a^T q = z* + ε,

and we conclude z_{j_0,k_ε} → z* for ε → 0. On the other hand, we have z_{j_0,k+1} ≤ z_{j_0,k} for all k, which yields z_{j_0,k} ↘ z* for k → +∞. The claim min_{j∈[l]} z_{j,k} = z_k ↘ z* follows.

Illustration
We have implemented the hierarchy using SOSTOOLS and SeDuMi and illustrate our results on some examples. Our implementation is published as open-source software [49]. In our first example, we consider polynomials h_1 and h_2 such that the set S = {x ∈ R^2 : h_1(x) ≥ 0, h_2(x) ≥ 0} ⊂ R^2 is the compact set given by the intersection of the Euclidean unit norm ball and the epigraph of a function; by Theorem 2.1, the associated quadratic module M(h_1, h_2) is Archimedean, and the hierarchy converges by Theorem 6.1. The point q = (0.4, −0.5) lies in S. We have solved our hierarchy for the polyhedral gauge γ = ‖·‖_1 (in this case a block norm) with fundamental directions {(1, 0), (0, 1), (−1, 0), (0, −1)} and two different truncation orders, which we report in Table 1. Figures 1 and 2 show the vanishing sets V(h_1) and V(h_2) of h_1 and h_2, that is, the sets of points where h_1 and h_2 vanish, and the point q. The figures show a computed optimal hyperplane for a low (k = 2) and a high (k = 5) truncation order k. The optimal solutions and optimal values, along with the computation times, can be found in Table 1. Allowing for a higher truncation order of k = 6 did not improve the result further. We infer from these first examples that a low truncation order of, say, k = 2 cannot be expected to give an optimal hyperplane; however, for k = 2, we get a valid inequality that can already be used as an approximation in a very short computation time. The examples also show that an optimal solution (a, b) for VR_{j,k} does not necessarily yield a tight inequality for S.

Fig. 1 Bounded example, k low
An unbounded example (Fig. 3) is given by polynomials h_1 and h_2 for which the set S = {x ∈ R^2 : h_1(x) ≥ 0, h_2(x) ≥ 0} is the intersection of the filled unit parabola given by x_2 ≥ x_1^2 and the outside of the rotated unit parabola given by x_1 ≤ x_2^2. The set S is thus indeed unbounded, hence M(h_1, h_2) cannot be Archimedean. Nevertheless, we can apply our approach (now without a convergence guarantee). We choose q = (0.25, 0.5) on the boundary of S. We report the computed values for k = 4 in Table 1. This example shows that, even though the Archimedean condition does not hold, it can still be possible to obtain a valid inequality that is close to q and nearly tight. For lower orders (k = 3), no solution was found. This can also be concluded directly from (4), as we would have to express a nontrivial linear polynomial as a linear combination of h_1 and h_2, which is impossible. The objective did not improve by increasing the order to k = 5 or k = 6.

Modifications and Extensions
In this section we consider some modifications of Program V1. Namely, we consider the case that a point q ∈ S is not known, and the case that the normal a is fixed. We will state the modifications as general optimization problems (similar to V1) along with a reformulation using sos programming (similar to VR j,k ).

Finding Valid Inequalities Without a Known q ∈ S
As a first modification, we search for a tight valid inequality for S without knowing a point q ∈ S. We formulate the program and its sos variant with the constraint γ°(a) = 1, which leads to more constraints (cf. Remark 5.1), as follows. As before, let S ⊂ R^n, M = M(h_1, . . . , h_s) for h_i ∈ R[X_1, . . . , X_n], and let v_j ∈ R^n denote the fundamental directions:

min b  s.t.  a^T x ≤ b for all x ∈ S,  γ°(a) = 1. (Vb)

min b  s.t.  b − a^T X ∈ M[k],  v_j^T a = 1,  v_i^T a ≤ 1 for all i ∈ [l]. (VbR_{j,k})
We require an equation in Vb, as opposed to the inequality in V2, because otherwise the program would be unbounded from below whenever the optimal objective is negative. Let us again state some observations.

Proposition 8.1 Let S ⊂ R^n and γ be a gauge on R^n. Every feasible solution (a, b) of VbR_{j,k} is feasible for Vb.

Proof Let (a, b) be a feasible solution for VbR_{j,k}. Then a^T x ≤ b for all x ∈ S by Observation 2.1, and v_j^T a ≥ 1 for some j ∈ [l]. The claim follows.

For this modification, we consider the following example (Fig. 4): The set S = {x ∈ R^2 : h_1(x) ≥ 0, h_2(x) ≥ 0} is the intersection of a branch of a hyperbola, {(x_1, x_2) ∈ R^2 : x_1 < 0 and x_2 ≤ 1/(8x_1)}, and a strip given by {(x_1, x_2) ∈ R^2 : |x_1 + 1/2| ≤ 4}. Hence, S is unbounded. We ran the hierarchy VbR_{j,k} for k = 4 (let us stress that we did not specify a point q ∈ S). The values of the computed tight inequality are shown in Table 1. No feasible solution could be found for k = 3, and the objective did not improve for k = 5 or k = 6, which is to be expected since Fig. 4 reveals that the hyperplane is tight.

Fixed Normal
The second modification we consider is the variant where a fixed normal a ∈ R^n, a ≠ 0, is given and we want to find b ∈ R such that (a^T x ≤ b) is valid for S and as tight as possible, i.e., b ∈ R is the only decision variable. The programs read

min b  s.t.  a^T x ≤ b for all x ∈ S. (Vn)

min b  s.t.  b − a^T X ∈ M[k]. (VnR_k)

In the next result we show that, for a fixed normal and provided some q ∈ S is known, the optimal solutions do not change if we replace the objective by d(q, H(a, b)).

Observation 8.1
Let q ∈ S ⊂ R^n, a gauge γ, and 0 ≠ a ∈ R^n be given. Consider the program

min d(q, H(a, b))  s.t.  a^T x ≤ b for all x ∈ S,  b ∈ R. (17)
Then, Vn and (17) have the same feasible and optimal solutions. Let us state some properties of Vn and VnR_k. We omit the proof since it is similar to the proof of the corresponding statements for V2 and VR_{j,k}. For an example with fixed normal, the computed values are again reported in Table 1. Again, no solution was found for k = 3, and the objective did not improve for k = 5 and k = 6, which is again to be expected since the figure reveals that the hyperplane is tight in this example, too.

Conclusions
To summarize, we have shown that the problem of finding a tight valid inequality for a subset S of R^n, with tightness measured by a polyhedral gauge γ, can be approximated with sos programming if the set S is semi-algebraic, i.e., if S is given as K(h_1, . . . , h_s) for some polynomials h_1, . . . , h_s. The approximating hierarchy is guaranteed to converge if the quadratic module M(h_1, . . . , h_s) is Archimedean. In view of Remark 2.1, this is the case if S is a bounded set.
Sos programs like ours are computationally tractable with current SDP solvers for small instances (few variables and low polynomial degrees). In this regime, we can hence find cuts for MIPP instances in reasonable time. If the corresponding semidefinite programs become too large for current state-of-the-art solvers, there are promising ideas that keep larger instances tractable, e.g., restrictions to subsets of sos polynomials that translate to linear or second-order cone programs [50,51], as well as column generation [52]. Still, our sos programs translate to SDPs, which can, leaving technical details aside, essentially be solved in polynomial time [53]. Note that we cannot expect much more, since MIPP is known to be NP-hard. In the continuous case, this can be seen from the fact that deciding nonnegativity of a polynomial of degree 4 is NP-hard [54], and thus minimizing a polynomial of degree 4 is NP-hard as well. In the integer case, it can be shown that no algorithm for integer programming with quadratic constraints exists [55].
Further research includes identifying situations in which a cut may be derived from a given tight valid inequality. That is, given a tight valid inequality for S, are there assumptions that ensure that there is a way to derive a related inequality which is valid for the integer points in S but violated at a (non-integer) point q in S? First ideas in this direction are given in [56].