1 Introduction

In convex integer programming, there exist various procedures to strengthen convex relaxations of sets of integer points. Formally, given a set \( S \subseteq {\mathbb {Z}}^n \) of integer points and a convex set \( Q \subseteq {\mathbb {R}}^n \) with \( Q \cap {\mathbb {Z}}^n = S \), these methods aim to construct a new convex set f(Q) satisfying \( S \subseteq f(Q) \subseteq Q \).

On the one hand, there exist several general-purpose methods that strengthen relaxations without specific knowledge of the set S, in a systematic way. The hierarchies of Sherali and Adams [27], Lovász and Schrijver [21], and Lasserre [20], which are tailored to 0/1-sets \( S \subseteq \{0,1\}^n \), are methods of this type.

On the other hand, various methods have been designed for obtaining strengthened relaxations for specific sets S. Such methods include, as an example, an impressive collection of families of valid inequalities of the traveling salesperson polytope that strengthen the classical subtour elimination formulation. Similar research has been performed for many other polytopes arising in combinatorial optimization, such as stable set polytopes and knapsack polytopes.

In this work, we propose a new method that interpolates between the two approaches described above. We design a procedure to strengthen any convex set \( Q \subseteq {\mathbb {R}}^n \) containing a set \( S \subseteq \{0,1\}^n \) by exploiting certain additional information about S. Namely, the required extra information will be in the form of a Boolean formula \( \phi \) defining the target set S. Instead of viewing a Boolean formula as taking 0/1-vectors as input, the improved relaxation is obtained by “feeding” the convex set Q into the formula \( \phi \), and will be denoted by \( \phi (Q) \).

While the formula \( \phi \) has to be provided as a further input, for certain problems, there is an “obvious” candidate for \(\phi \). For example, suppose that the set S arises from a 0/1-covering problem. That is, it is given by a matrix \( A \in \{0,1\}^{m \times n} \) such that \( S = \{ x \in \{0,1\}^n : Ax \geqslant \mathbf {1} \} \), where \( \mathbf {1} \) is the all-ones vector. Then S can be equivalently specified by the following Boolean formula in conjunctive normal form

$$\begin{aligned} \phi := \bigwedge _{i = 1}^m \bigvee _{j : A_{ij} = 1} x_j. \end{aligned}$$
(1)
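The correspondence between the covering system and formula (1) can be checked by brute force on small instances. The following is a minimal sketch, not part of the construction itself; the helper names (`covering_cnf`, `eval_cnf`, `satisfies_covering`) and the 3×4 matrix are assumptions made for illustration.

```python
from itertools import product

def covering_cnf(A):
    """Formula (1) as a list of clauses: clause i lists the indices j with A[i][j] == 1."""
    return [[j for j, a in enumerate(row) if a == 1] for row in A]

def eval_cnf(clauses, x):
    """Evaluate the CNF at a 0/1 vector x: every clause needs some variable equal to 1."""
    return all(any(x[j] == 1 for j in clause) for clause in clauses)

def satisfies_covering(A, x):
    """Check Ax >= 1 componentwise."""
    return all(sum(a * xj for a, xj in zip(row, x)) >= 1 for row in A)

# Hypothetical covering matrix, chosen only for illustration.
A = [[1, 1, 0, 0],
     [0, 1, 1, 0],
     [0, 0, 1, 1]]
phi = covering_cnf(A)

# The formula and the linear system accept exactly the same 0/1 points.
agree = all(eval_cnf(phi, x) == satisfies_covering(A, x)
            for x in product((0, 1), repeat=4))
```

Each row of A contributes one clause, so the size of the formula equals the number of ones in A.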

In addition, there is a vast literature on representing sets of 0/1-points via Boolean formulas, and we are free to use any of these formulas for our procedure.

An important property of \( \phi (Q) \) is that it can be described by an extended formulation whose size is bounded by the size of the formula \( \phi \) times the size of an extended formulation defining the input relaxation Q. Recall that an extended formulation of size m of a polytope P is determined by matrices \( T \in {\mathbb {R}}^{n \times d} \), \( A \in {\mathbb {R}}^{m \times d} \) and vectors \( t \in {\mathbb {R}}^n \), \( b \in {\mathbb {R}}^m \) such that \( P = \{ x \in {\mathbb {R}}^n : \exists y \in {\mathbb {R}}^d : Ay \geqslant b, \, x = Ty + t\} \). Therefore, provided that \(\phi \) has polynomial size, our procedure is efficient in the sense that a small extended formulation for Q can be converted into a small extended formulation for \(\phi (Q)\) in polynomial time.

To illustrate, the well-studied procedure [10] that maps a relaxation Q to its Chvátal-Gomory closure f(Q) is not efficient in the above sense (this is to be expected since determining membership in f(Q) is \(\mathsf {NP}\)-complete [12]). A striking example is given by choosing Q as the fractional matching polytope [26, Section 30.2]. In this case, the Chvátal-Gomory closure f(Q) is the matching polytope. It was recently shown by Rothvoß that all extended formulations of the matching polytope have exponential size [25].

Another property of our procedure is that it is complete in the sense that iterating it a finite number of times (in fact, at most n times) always yields the convex hull of S. Furthermore, our procedure can be applied to any convex set \( Q \subseteq [0,1]^n \) that contains the target set S. In particular, the set Q is even allowed to contain 0/1-points that do not belong to S. As an example, we can always apply our method with \( Q = [0,1]^n \), and thus finding an initial relaxation for \({\text {conv}}(S)\) is never an issue. Intuitively, this is possible since the information of which points belong to S is stored in \( \phi \).

By viewing an iterated application of our procedure as a hierarchy, we obtain a significant simplification of the Bienstock-Zuckerberg hierarchy [7]. This is a powerful hierarchy tailored to 0/1-covering problems. However, one of its drawbacks is that the definition of the hierarchy is quite complicated. Prior to this work, it had been simplified by Mastrolilli [22] using a modification of the Sherali-Adams hierarchy that is based on appropriately defined high-degree polynomials. Subsequent to our work, it has also been simplified by Bienstock and Zuckerberg [9] themselves. Despite the simplicity of our method, we obtain extended formulations whose size is often vastly smaller than those of Bienstock and Zuckerberg [7, 8] and Mastrolilli [22]. We discuss this in more detail in Sect. 6.

Another aspect of our work is that it should serve as a bridge between combinatorial optimization and circuit complexity. As a concrete example, our procedure yields a very simple proof that Rothvoß’ result [25] on the extension complexity of the matching polytope implies a seminal result of Raz and Wigderson [23, Theorem 4.1] on the size of monotone formulas required to describe the matching function. In the other direction, constructions of small formulas describing a set \( S \subseteq \{0,1\}^n \) can now be used to obtain small extended formulations for \({\text {conv}}(S)\). We give a few non-trivial examples of this in Sect. 5, but of course there are many more.

For readers familiar with circuit complexity, we mention that our work is inspired by a relatively unknown connection between Karchmer-Wigderson games [18] and nonnegative factorizations, pioneered by Hrubeš [17]. This connection was recently rediscovered by Göös, Jain and Watson [15] and exploited in [3]. As such, the proofs of the main properties of our hierarchy are also very short. Indeed, the proof of our main theorem can be regarded as a “polyhedral” Karchmer-Wigderson game, but no knowledge of communication complexity is required.

Paper outline. We start by describing our procedure to obtain \( \phi (Q) \) in Sect. 2. In Sect. 3, we introduce notions that allow us to quantify the strength of our relaxations. Our main results regarding properties of the set \( \phi (Q) \) are presented in Sect. 4. In Sect. 5, we discuss several applications of our method in detail. In Sect. 6, we compare our procedure to related work of Bienstock and Zuckerberg [7, 8] and Mastrolilli [22], and state some open problems.

2 Description of the procedure

In order to present the construction of the procedure, let us fix some notation concerning Boolean formulas. We consider formulas that are built out of input variables \( x_1,\dotsc ,x_n \), conjunctions \( \wedge \), disjunctions \( \vee \), and negations \( \lnot \) in the standard way. Here, we define the size of a Boolean formula as the total number of occurrences of input variables. We denote by \(|\phi |\) the size of \(\phi \).

Given a Boolean formula \( \phi \), we can interpret it as a function \( \{0,1\}^n \rightarrow \{0,1\} \), and for an input \( x = (x_1,\dotsc ,x_n) \in \{0,1\}^n \) we denote its output by \( \phi (x) \). We say that the set \( S = \{ x \in \{0,1\}^n : \phi (x) = 1 \} \) is defined by \( \phi \). Two formulas are said to be equivalent if they define the same set.

We say that a formula is reduced if negations are only applied to input variables. Note that, by De Morgan’s laws, every Boolean formula can be transformed into an equivalent reduced formula of the same size. As an example, the formulas

$$\begin{aligned} \phi _1&= \lnot \left( \left( x_1 \wedge \lnot x_2 \right) \vee \left( \lnot \left( x_1 \vee x_3 \right) \right) \right) \nonumber \\ \phi _2&= \left( \lnot x_1 \vee x_2 \right) \wedge \left( x_1 \vee x_3 \right) \end{aligned}$$
(2)

are equivalent and both have size 4, but only the second is in reduced form.

Below, we will repeatedly use the elementary fact that for every reduced formula \(\phi \) of size \(|\phi |\), one of the following holds:

  • \(|\phi | = 1\) and either \(\phi = x_i\) or \(\phi = \lnot x_i\) for some \(i \in [n]\), or

  • \(|\phi | \geqslant 2\) and \(\phi \) is either the conjunction or the disjunction of two reduced formulas \(\phi _1,\phi _2\) such that \( |\phi | = |\phi _1| + |\phi _2| \).

This gives a way to represent any reduced Boolean formula as a rooted tree each of whose inner nodes is labeled with \(\wedge \) or \(\vee \) and each of whose leaves is labeled with a non-negated variable \(x_i\) or a negated variable \(\lnot x_i\). Note that there may be many trees that represent the same reduced Boolean formula, but this will not matter. Observe that the size of a formula is the number of leaves in any one of its trees.
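The reduction via De Morgan's laws and the tree view of a formula can be sketched in a few lines. This is only an illustration under assumed conventions: formulas are nested tuples `("var", i)`, `("not", f)`, `("and", f, g)`, `("or", f, g)`, and the function names are hypothetical.

```python
from itertools import product

def reduce_formula(f, negate=False):
    """Push negations down to the variables using De Morgan's laws."""
    op = f[0]
    if op == "var":
        return ("not", f) if negate else f
    if op == "not":
        return reduce_formula(f[1], not negate)
    if negate:
        # Under a negation, 'and' and 'or' swap roles.
        op = "or" if op == "and" else "and"
    return (op, reduce_formula(f[1], negate), reduce_formula(f[2], negate))

def size(f):
    """Number of occurrences of input variables, i.e. leaves of the tree."""
    if f[0] == "var":
        return 1
    return sum(size(g) for g in f[1:])

def evaluate(f, x):
    """Evaluate the formula at a 0/1 vector x (variables are 0-indexed)."""
    op = f[0]
    if op == "var":
        return x[f[1]] == 1
    if op == "not":
        return not evaluate(f[1], x)
    if op == "and":
        return evaluate(f[1], x) and evaluate(f[2], x)
    return evaluate(f[1], x) or evaluate(f[2], x)

# The two formulas from (2), with x_1, x_2, x_3 stored at indices 0, 1, 2.
x1, x2, x3 = ("var", 0), ("var", 1), ("var", 2)
phi1 = ("not", ("or", ("and", x1, ("not", x2)), ("not", ("or", x1, x3))))
phi2 = ("and", ("or", ("not", x1), x2), ("or", x1, x3))

equivalent = all(evaluate(phi1, x) == evaluate(phi2, x)
                 for x in product((0, 1), repeat=3))
```

In this representation, reducing \(\phi _1\) yields exactly the tuple for \(\phi _2\), and the size is unchanged because the reduction only moves negations and swaps gate labels.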

We are ready to describe our method to strengthen a convex relaxation of a given set of points in \( \{0,1\}^n \).

Definition 1

Let \( \phi \) be a reduced Boolean formula with input variables \( x_1,\dotsc ,x_n \) and let \( Q \subseteq [0,1]^n \) be any convex set.

The set \( \phi (Q) \subseteq {\mathbb {R}}^n \) is recursively constructed from the formula \( \phi \) as follows.

  • Replace any non-negated input variable \( x_i \) by the set \( \{ x \in Q : x_i = 1 \} \).

  • Replace any negated input variable \( \lnot x_i \) by the set \( \{ x \in Q : x_i = 0 \} \).

  • Replace any conjunction \( \wedge \) of two sets by their intersection.

  • Replace any disjunction \( \vee \) of two sets by the convex hull of their union.

As an example, given any convex set \( Q \subseteq [0,1]^3 \) and the formula \( \phi _2 \) defined in (2), we have

$$\begin{aligned} \phi _2(Q)&= {\text {conv}}\big ( \{ x \in Q : x_1 = 0 \} \cup \{ x \in Q : x_2 = 1 \} \big ) \\&\quad \cap {\text {conv}}\big ( \{ x \in Q : x_1 = 1 \} \cup \{ x \in Q : x_3 = 1 \} \big ). \end{aligned}$$
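To make this concrete, consider the special case \( Q = [0,1]^3 \), chosen here only for illustration. A direct calculation, using that both polytopes on the right-hand side below have only 0/1-vertices, gives

$$\begin{aligned} {\text {conv}}\big ( \{ x \in Q : x_1 = 0 \} \cup \{ x \in Q : x_2 = 1 \} \big )&= \{ x \in [0,1]^3 : x_1 \leqslant x_2 \}, \\ {\text {conv}}\big ( \{ x \in Q : x_1 = 1 \} \cup \{ x \in Q : x_3 = 1 \} \big )&= \{ x \in [0,1]^3 : x_1 + x_3 \geqslant 1 \}, \end{aligned}$$

so that \( \phi _2([0,1]^3) = \{ x \in [0,1]^3 : x_1 \leqslant x_2, \, x_1 + x_3 \geqslant 1 \} \). Each constraint is the linear counterpart of one clause of \( \phi _2 \): the clause \( \lnot x_1 \vee x_2 \) becomes \( (1 - x_1) + x_2 \geqslant 1 \), and the clause \( x_1 \vee x_3 \) becomes \( x_1 + x_3 \geqslant 1 \). In this small example the constraint matrix is totally unimodular, so a single application of the procedure already yields the convex hull of the set defined by \( \phi _2 \); this should not be expected in general.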

In the remainder of this work, we will analyze several properties of \( \phi (Q) \). One simple observation will be that, if S is defined by \( \phi \) and \( S \subseteq Q \), then \( S \subseteq \phi (Q) \subseteq Q \). Furthermore, \( \phi (Q) \) is strictly contained in Q unless \( {\text {conv}}(S) = Q \). In order to quantify this improvement over Q, we will introduce useful measures in the next section.

3 Measuring the strength: pitch and notch

We now introduce two quantities that measure the strength of our procedure. To this end, note that for every linear inequality in variables \( x_1, \dotsc , x_n \) we can partition [n] into sets \( I^+, I^- \subseteq [n] \) (with \( I^+ \cup I^- = [n] \) and \( I^+ \cap I^- = \emptyset \)) such that the inequality can be written as

$$\begin{aligned} \sum _{i \in I^+} c_i x_i + \sum _{i \in I^-} c_i (1 - x_i) \geqslant \delta , \end{aligned}$$
(3)

where \( c = (c_1,\dotsc ,c_n)^\intercal \in {\mathbb {R}}^n_{\ge 0} \) and \( \delta \in {\mathbb {R}}\). Since we will only consider the intersection of \( [0,1]^n \) with the set of points satisfying such an inequality, we are only interested in inequalities where \( \delta \geqslant 0 \). In this case, we call (3) an inequality in standard form.

The notch of an inequality in standard form is the smallest number \( \nu \) such that

$$\begin{aligned} \sum _{j \in J} c_j \geqslant \delta \end{aligned}$$
(4)

holds for every \( J \subseteq [n] \) with \( |J| \geqslant \nu \), while its pitch is the smallest number \( p\) such that (4) holds for every \( J \subseteq {\text {supp}}(c) \) with \( |J| \geqslant p\). Note that the pitch of an inequality is at most its notch. For instance, the notch of the inequality \(x_1 + x_n \geqslant 1\) is \(n-1\), while its pitch equals 1. Both quantities appear in the study of Chvátal-Gomory closures of polytopes in \( [0,1]^n \).

Intuitively, the notch of an inequality is related to how “deep” it cuts the 0/1-cube. For simplicity, assume that \(I^- = \emptyset \), so that the origin minimizes the left-hand side of (3) over the cube. The notch of (3) is then the smallest number \( \nu \) such that no 0/1-vector of Hamming weight \( \nu \) or more is cut by the inequality. A similar intuition applies to the pitch.
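Both quantities are easy to compute for a given inequality in standard form. The sketch below relies on one observation: since \( c \geqslant 0 \), the worst set J of a given cardinality consists of the smallest coefficients. The function names are hypothetical, and returning `None` when even the full coefficient sum stays below \( \delta \) is a convention of this sketch.

```python
def _threshold(values, delta):
    """Smallest k such that the k smallest entries of values sum to at least delta."""
    if delta <= 0:
        return 0
    total = 0
    for k, v in enumerate(sorted(values), start=1):
        total += v
        if total >= delta:
            return k
    return None  # the sum of all entries is still below delta

def notch(c, delta):
    """Notch: (4) must hold for all J in [n] with |J| >= notch."""
    return _threshold(c, delta)

def pitch(c, delta):
    """Pitch: (4) must hold for all J in supp(c) with |J| >= pitch."""
    return _threshold([v for v in c if v > 0], delta)

# The inequality x_1 + x_n >= 1 for n = 5.
c, delta = [1, 0, 0, 0, 1], 1
example = (notch(c, delta), pitch(c, delta))
```

For the example above the sketch reproduces the values stated in the text: notch \( n - 1 = 4 \) and pitch 1.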

We extend the definition of notch from inequalities to sets of 0/1-points as follows. The notch of a non-empty set \( S \subseteq \{0,1\}^n \), denoted \(\nu (S)\), is the largest notch of any inequality in standard form that is valid for S. It can be shown that \( \nu (S) \) is equal to the smallest number k such that every k-dimensional face of \( [0,1]^n \) contains a point from S. This equivalent definition of notch was introduced in [6]. The main result of [6] is that if S has bounded notch and \({\text {conv}}(S)\) has bounded facet coefficients, then every polytope \(Q \subseteq [0,1]^n\) whose set of 0/1-points is S has bounded Chvátal-Gomory rank.
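The face-based characterization of \( \nu (S) \) lends itself to a brute-force check on small examples. The following sketch enumerates the k-dimensional faces of the cube by fixing \( n - k \) coordinates to 0/1 values; the function name `set_notch` is hypothetical, and the running time is exponential in n, so this is meant only for experimentation.

```python
from itertools import combinations, product

def set_notch(S, n):
    """Smallest k such that every k-dimensional face of [0,1]^n contains a point of S."""
    points = {tuple(x) for x in S}
    for k in range(n + 1):
        fixed_count = n - k  # a k-dimensional face fixes n - k coordinates
        every_face_hit = all(
            any(all(x[i] == v for i, v in zip(fixed, values)) for x in points)
            for fixed in combinations(range(n), fixed_count)
            for values in product((0, 1), repeat=fixed_count)
        )
        if every_face_hit:
            return k
    return None  # only reached when S is empty

# S = {x in {0,1}^2 : x_1 + x_2 >= 1}: every edge of the square meets S.
cover = [(0, 1), (1, 0), (1, 1)]
# S = {(1, 1)}: only the square itself is guaranteed to meet S.
single = [(1, 1)]
```

In the first example the value 1 matches the notch of the valid inequality \( x_1 + x_2 \geqslant 1 \), and in the second the value 2 matches the notch of \( x_1 + x_2 \geqslant 2 \).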

The term pitch was used by Bienstock and Zuckerberg [7], who defined it for monotone inequalities in standard form, that is, where \( I^- = \emptyset \). Bounded pitch inequalities are related to the Chvátal-Gomory closure as follows. Consider any constants \(\varepsilon > 0\) and \(\ell \in {\mathbb {Z}}_{\geqslant 1}\), and any relaxation \(Q := \{x \in [0,1]^n : Ax \geqslant b\}\) of a set \(S := Q \cap \{0,1\}^n\), with A, b nonnegative. Bienstock and Zuckerberg [8, Lemma 2.1] proved that adding all valid pitch-\(p\) inequalities for \(p\leqslant \lceil \ell / \ln (1+\varepsilon ) \rceil = \Theta (\ell / \varepsilon )\) to the system defining Q gives a relaxation R that is a \((1+\varepsilon )\)-approximation of the \(\ell \)-th Chvátal-Gomory closure of Q.

4 Main results

In this section, we prove several properties of the set \( \phi (Q) \). Let us start with the following simple observation.

Proposition 2

For every reduced Boolean formula \(\phi \) and every convex set \(Q \subseteq [0,1]^n\), the set \(\phi (Q)\) is a convex subset of Q. Moreover, \(\phi (Q)\) contains every point \(x \in \{0,1\}^n\) such that \(x \in Q\) and \(\phi (x) = 1\). In other words, \(\phi (Q)\) contains \(Q \cap \phi ^{-1}(1)\).

Proof

The fact that \(\phi (Q)\) is a convex set contained in Q is clear, since \(\phi (Q)\) is constructed from faces of Q by taking intersections and convex hulls of unions.

We prove the second part by induction on the size of \(\phi \). If \(|\phi | = 1\), then \(\phi \) is either \(\phi = x_i\) or \(\phi = \lnot x_i\) for some \(i \in [n]\). So either \(\phi (Q) = \{x \in Q : x_i = 1\}\) or \(\phi (Q) = \{x \in Q : x_i = 0\}\), respectively. We see immediately that \(\phi (Q)\) contains \(Q \cap \phi ^{-1}(1)\).

Now if \(|\phi | \geqslant 2\), then \(\phi \) is the conjunction or disjunction of two formulas of smaller size, say \(\phi _1\) and \(\phi _2\). In the first case, \(\phi = \phi _1 \wedge \phi _2\) and we have \(\phi (Q) = \phi _1(Q) \cap \phi _2(Q) \supseteq (Q \cap \phi ^{-1}_1(1)) \cap (Q \cap \phi ^{-1}_2(1)) = Q \cap (\phi ^{-1}_1(1) \cap \phi ^{-1}_2(1)) = Q \cap \phi ^{-1}(1)\), where the inclusion follows from induction. In the second case, \(\phi = \phi _1 \vee \phi _2\) and \(\phi (Q) = {\text {conv}}(\phi _1(Q) \cup \phi _2(Q)) \supseteq (Q \cap \phi ^{-1}_1(1)) \cup (Q \cap \phi ^{-1}_2(1)) = Q \cap (\phi ^{-1}_1(1) \cup \phi ^{-1}_2(1)) = Q \cap \phi ^{-1}(1)\). \(\square \)

Next, we argue that we can use \(\phi \) to transform any extended formulation for Q into one for \( \phi (Q) \). To this end, we make use of the extension complexity of a polytope P, which is defined as the smallest size of any extended formulation for P, and is denoted by \( {\text {xc}}(P) \). We need the following standard facts about extension complexity. First, if F is a non-empty face of P, then \( {\text {xc}}(F) \le {\text {xc}}(P) \). Second, for any non-empty polytopes \( P_1, P_2 \subseteq {\mathbb {R}}^n \) one has \( {\text {xc}}(P_1 \cap P_2) \le {\text {xc}}(P_1) + {\text {xc}}(P_2) \). Third, a slight refinement of Balas’ theorem [2] states that \( {\text {xc}}({\text {conv}}(P_1 \cup P_2)) \le \max \{{\text {xc}}(P_1), 1\} + \max \{{\text {xc}}(P_2), 1\} \), see [30, Prop. 3.1.1].

Proposition 3

Let \( \phi \) be a reduced Boolean formula and let \( Q \subseteq [0,1]^n \) be a polytope such that \( \phi (Q) \ne \emptyset \). Then \( \phi (Q) \) is a polytope with extension complexity \( {\text {xc}}(\phi (Q)) \le |\phi | {\text {xc}}(Q) \).

Proof

First, note that if \( {\text {xc}}(Q) = 0 \), then Q is a single point and so is \( \phi (Q) \), which implies \( {\text {xc}}(\phi (Q)) = 0 \) and hence the claimed inequality holds trivially. Thus, we may assume that \( {\text {xc}}(Q) \geqslant 1 \) holds.

We prove the claim by induction over the size of \( \phi \). If \( |\phi | = 1 \), then \( \phi = x_i\) or \(\phi = \lnot x_i\) for some \(i \in [n]\). So either \(\phi (Q) = \{x \in Q : x_i = 1\}\) or \(\phi (Q) = \{x \in Q : x_i = 0\}\), respectively. In both cases, \( \phi (Q) \) is a face of Q and hence \( {\text {xc}}(\phi (Q)) \leqslant {\text {xc}}(Q) \).

If \( |\phi | \geqslant 2 \), there exist reduced Boolean formulas \( \phi _1,\phi _2 \) (of size smaller than \( |\phi | \)) with \( |\phi | = |\phi _1| + |\phi _2| \) such that \( \phi = \phi _1 \wedge \phi _2 \) or \( \phi = \phi _1 \vee \phi _2 \). First, consider the case \( \phi = \phi _1 \wedge \phi _2 \), in which we have \( \phi (Q) = \phi _1(Q) \cap \phi _2(Q) \). Since \( \phi (Q) \) is non-empty, the same holds for \( \phi _1(Q) \) and \( \phi _2(Q) \) and hence, by the induction hypothesis, we have \( {\text {xc}}(\phi _i(Q)) \leqslant |\phi _i| {\text {xc}}(Q) \) for \( i = 1,2 \). Therefore,

$$\begin{aligned} {\text {xc}}(\phi (Q)) \leqslant {\text {xc}}(\phi _1(Q)) + {\text {xc}}(\phi _2(Q)) \leqslant |\phi _1| {\text {xc}}(Q) + |\phi _2| {\text {xc}}(Q) = |\phi | {\text {xc}}(Q). \end{aligned}$$

It remains to consider the case \( \phi = \phi _1 \vee \phi _2 \), in which we have \( \phi (Q) = {\text {conv}}(\phi _1(Q) \cup \phi _2(Q)) \).

Note that the claimed inequality holds if \( \phi _1(Q) = \emptyset \) or \( \phi _2(Q) = \emptyset \). Thus, we may assume that \( \phi _1(Q) \) and \( \phi _2(Q) \) are both non-empty. By the induction hypothesis, \( {\text {xc}}(\phi _i(Q)) \leqslant |\phi _i| {\text {xc}}(Q) \) for \( i = 1,2 \). Therefore,

$$\begin{aligned} {\text {xc}}(\phi (Q))&\leqslant \max \{{\text {xc}}(\phi _1(Q)), 1\} + \max \{{\text {xc}}(\phi _2(Q)), 1\} \\&\leqslant \max \{|\phi _1| {\text {xc}}(Q), 1\} + \max \{|\phi _2| {\text {xc}}(Q), 1\} \\&= |\phi _1| {\text {xc}}(Q) + |\phi _2| {\text {xc}}(Q) \\&= |\phi | {\text {xc}}(Q). \end{aligned}$$

\(\square \)

We remark that the upper bound provided by Proposition 3 is quite generous, and can be improved in some cases. For instance, if we let \(\tau \) denote the number of maximal rooted subtrees of \(\phi \) whose nodes are either input variables or \(\wedge \) gates, then we have \({\text {xc}}(\phi (Q)) \leqslant \tau {\text {xc}}(Q)\). This is due to the well-known fact that any intersection of faces of Q is a face of Q.
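The two bounds can be compared on small formulas. The sketch below is an illustration under assumed conventions: formulas are nested tuples as in `("and", f, g)` and `("or", f, g)`, leaves are `("var", i)` or `("not", ("var", i))`, literals of either sign are treated as leaves when counting \(\tau \), and all function names are hypothetical.

```python
def is_leaf(f):
    """A leaf is a variable or a negated variable (the formula is assumed reduced)."""
    return f[0] in ("var", "not")

def size(f):
    """Number of leaves of the formula tree."""
    return 1 if is_leaf(f) else size(f[1]) + size(f[2])

def only_and(f):
    """True if the subtree consists solely of leaves and 'and' gates."""
    if is_leaf(f):
        return True
    return f[0] == "and" and only_and(f[1]) and only_and(f[2])

def tau(f):
    """Number of maximal subtrees built only from leaves and 'and' gates."""
    if only_and(f):
        return 1
    return tau(f[1]) + tau(f[2])

def xc_bounds(f, xc_Q):
    """Return the bound of Proposition 3 and the refined bound tau * xc(Q)."""
    return size(f) * xc_Q, tau(f) * xc_Q

x1, x2, x3, x4 = (("var", i) for i in range(4))
dnf = ("or", ("and", x1, x2), ("and", x3, ("not", x4)))
cnf = ("and", ("or", x1, x2), ("or", x3, x4))

# xc([0,1]^4) is at most 8, the number of facets of the cube.
dnf_bounds = xc_bounds(dnf, 8)
cnf_bounds = xc_bounds(cnf, 8)
```

For the disjunctive normal form the refined bound halves the estimate, since each term is an intersection of faces. For the conjunctive normal form every leaf sits directly below a disjunction, so both bounds coincide.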

A Boolean formula is monotone if it does not contain negations. We are ready to prove our main theorem in the monotone case.

Theorem 4

Let \( \phi \) be a monotone Boolean formula defining a set \( S \subseteq \{0,1\}^n \) and let \( Q \subseteq [0,1]^n \) be any convex set containing S. If Q satisfies all monotone inequalities of pitch at most \( p\) that are valid for S, then \( \phi (Q) \) satisfies all monotone inequalities of pitch at most \( p+ 1 \) that are valid for S. Moreover, if Q is a polytope defined by an extended formulation of size \( \sigma \), then \( \phi (Q) \) is a polytope that can be defined by an extended formulation of size \( |\phi | \sigma \), where \( |\phi | \) is the size of the formula.

Proof

The second part of the theorem is implied by Proposition 3. For the first part, consider any monotone pitch-\((p+1)\) inequality in standard form that is valid for \(S = \{x \in \{0,1\}^n : \phi (x) = 1\}\),

$$\begin{aligned} \sum _{i \in I^+} c_i x_i \geqslant \delta . \end{aligned}$$
(5)

By the definition of pitch, we may assume \(c_i > 0\) for all \(i \in I^+\). We also assume \(\delta >0\); otherwise, there is nothing to prove. Let \(a \in \{0,1\}^n\) be the characteristic vector of \([n] {\setminus } I^+\). Thus, \(a_i = 1\) if \(i \in [n] {\setminus } I^+\) and \(a_i = 0\) if \(i \in I^+\). Notice that a violates (5). This implies \(\phi (a) = 0\).

By contradiction, suppose that (5) is not valid for \(\phi (Q)\). That is, there exists a point in \(\phi (Q)\) that violates (5). Let T be a tree that represents the formula \(\phi \). Each \(v \in V(T)\) has a corresponding formula, which is the formula computed by the subtree of T rooted at v. For notational convenience, we identify each node of T with its corresponding formula.

Our strategy is to find a root-to-leaf path in T such that for every node \(\psi \) on this path,

$$\begin{aligned} (\star ) \qquad \psi (a) = 0 \quad \hbox { and }\quad \hbox { there exists a point }\tilde{x} = \tilde{x}(\psi ) \in \psi (Q)\hbox { that violates }(5). \end{aligned}$$

This is satisfied at the root node \(\phi \).

Now consider any non-leaf node \(\psi \) in T that satisfies (\(\star \)). Let \(\psi _1\) and \(\psi _2\) denote the children of \(\psi \), so that \(\psi = \psi _1 \wedge \psi _2\) or \(\psi = \psi _1 \vee \psi _2\). We claim that, in both cases, there exists an index \(k \in \{1,2\}\) such that \(\psi _k\) satisfies (\(\star \)).

First, in case \(\psi = \psi _1 \wedge \psi _2\), we let \(\tilde{x}(\psi _1) = \tilde{x}(\psi _2) := \tilde{x}(\psi )\) and choose \(k \in \{1,2\}\) such that \(\psi _k(a) = 0\). Such an index is guaranteed to exist since \(\psi (a) = 0\). Then \(\psi _k\) satisfies (\(\star \)).

Second, in case \(\psi = \psi _1 \vee \psi _2\), we have \(\psi _1(a) = \psi _2(a) = \psi (a) = 0\). We let \(\tilde{x}(\psi _1)\) and \(\tilde{x}(\psi _2)\) be any points of \(\psi _1(Q)\) and \(\psi _2(Q)\) (respectively) such that the segment \([\tilde{x}(\psi _1),\tilde{x}(\psi _2)]\) contains \(\tilde{x}(\psi )\). For at least one \(k \in \{1,2\}\), the point \(\tilde{x}(\psi _k)\) violates (5), since otherwise every point of the segment, and in particular \(\tilde{x}(\psi )\), would satisfy (5). Thus \(\psi _k\) satisfies (\(\star \)) for that choice of k.

By iterating the argument above, starting at the root node \(\phi \), we reach a leaf node \(\psi \) that satisfies (\(\star \)). Note that \(\psi = x_j\) for some j, since \(\phi \) is monotone. We have \(a_j = \psi (a) = 0\), so \(j \in I^+\). Moreover, there exists a point \(\tilde{x} = \tilde{x}(\psi ) \in \psi (Q) = \{x \in Q : x_j = 1\}\) that violates (5).

Now consider the monotone inequality

$$\begin{aligned} \sum _{\begin{array}{c} i \in I^+ \\ i \ne j \end{array}} c_i x_i \geqslant \delta - c_j\,. \end{aligned}$$
(6)

This inequality is valid for S since it is the sum of (5) and \(c_j (1-x_j) \geqslant 0\), which are both valid. Since \(c_j (1-\tilde{x}_j) = 0\), (6) is also violated by \(\tilde{x} \in \psi (Q) \subseteq Q\). The key observation is that the pitch of (6) is at most \(p\), which contradicts our assumption that Q satisfies all monotone inequalities of pitch at most \(p\). \(\square \)
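The key observation above can be verified directly from the definition of pitch. Recall that \( c_i > 0 \) for all \( i \in I^+ \), so the support of (5) is \( I^+ \) and the support of (6) is \( I^+ {\setminus } \{j\} \). Let \( J \subseteq I^+ {\setminus } \{j\} \) with \( |J| \geqslant p\). Then \( J \cup \{j\} \subseteq I^+ \) has at least \( p+ 1 \) elements, and since (5) has pitch \( p+ 1 \), we obtain

$$\begin{aligned} \sum _{i \in J} c_i = \sum _{i \in J \cup \{j\}} c_i - c_j \geqslant \delta - c_j, \end{aligned}$$

which is exactly condition (4) for the inequality (6). As a small illustration, the inequality \( x_1 + x_2 + x_3 \geqslant 2 \) has pitch 2, and removing \( x_1 \) in this way yields \( x_2 + x_3 \geqslant 1 \), which has pitch 1.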

In the non-monotone case, we now prove a statement analogous to Theorem 4 where the pitch is replaced by the notch.

Theorem 5

Let \( \phi \) be a reduced Boolean formula defining a set \( S \subseteq \{0,1\}^n \) and let \( Q \subseteq [0,1]^n \) be any convex set containing S. If Q satisfies all inequalities of notch at most \( \nu \) that are valid for S, then \( \phi (Q) \) satisfies all inequalities of notch at most \( \nu + 1\) that are valid for S. Moreover, if Q is a polytope defined by an extended formulation of size \( \sigma \), then \( \phi (Q) \) is a polytope that can be defined by an extended formulation of size \( |\phi | \sigma \).

Proof

The proof is almost identical to that of Theorem 4. Instead of repeating the whole proof, here we only explain the differences. The starting point is a notch-\((\nu +1)\) inequality

$$\begin{aligned} \sum _{i \in I^+} c_i x_i + \sum _{i \in I^-} c_i (1 - x_i) \geqslant \delta \,, \end{aligned}$$
(7)

where \(I^+ \subseteq [n] \) and \(I^- \subseteq [n] \) satisfy \( I^+ \cap I^- = \emptyset \) and \( I^+ \cup I^- = [n] \), \(\delta >0\), and \(c_i \geqslant 0\) for all \(i \in [n]\). Contrary to the previous proof, here we allow \(c_i=0\). Let \(a \in \{0,1\}^n\) be the characteristic vector of \(I^-\). Notice that a violates (7). This implies \(\phi (a) = 0\).

Let T be a tree that represents the formula \(\phi \). Using the same proof strategy, we find a leaf node \(\psi = x_j\) or \(\psi = \lnot x_j\) of T such that \(\psi (a) = 0\), and there exists a point \(\tilde{x} = \tilde{x}(\psi ) \in \psi (Q)\) that violates (7).

If \(\psi = x_j\), then \(j \in I^+\) and we consider the valid inequality

$$\begin{aligned} \sum _{\begin{array}{c} i \in I^+ \\ i \ne j \end{array}} c_i x_i + \sum _{i \in I^-} c_i (1 - x_i) + \delta (1-x_j) \geqslant \delta - c_j\,. \end{aligned}$$

Otherwise, \(\psi = \lnot x_j\) and thus \(j \in I^-\). In this case, we consider the valid inequality

$$\begin{aligned} \sum _{i \in I^+} c_i x_i + \sum _{\begin{array}{c} i \in I^- \\ i \ne j \end{array}} c_i (1 - x_i) + \delta x_j \geqslant \delta - c_j\,. \end{aligned}$$

Since (7) is a notch-\((\nu +1)\) inequality, it is easy to check that the notch of both of the above inequalities is at most \(\nu \). However, they are violated by the point \(\tilde{x} = \tilde{x}(\psi ) \in Q\). As in the proof of Theorem 4, this gives the desired contradiction. \(\square \)
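The notch claim can be checked as follows for the first inequality; the second is handled in the same way. Let \( c' \) denote the coefficient vector of the first inequality, so that \( c'_i = c_i \) for \( i \ne j \) and \( c'_j = \delta \), and let \( J \subseteq [n] \) with \( |J| \geqslant \nu \). If \( j \in J \), then \( \sum _{i \in J} c'_i \geqslant c'_j = \delta \geqslant \delta - c_j \). If \( j \notin J \), then \( J \cup \{j\} \) has at least \( \nu + 1 \) elements, and since (7) has notch \( \nu + 1 \), we obtain

$$\begin{aligned} \sum _{i \in J} c'_i = \sum _{i \in J \cup \{j\}} c_i - c_j \geqslant \delta - c_j. \end{aligned}$$

Moreover, since \( \tilde{x}_j = 1 \) for the point \( \tilde{x} \in \psi (Q) = \{x \in Q : x_j = 1\} \), the term \( \delta (1 - \tilde{x}_j) \) vanishes and the left-hand side of the first inequality at \( \tilde{x} \) equals the left-hand side of (7) minus \( c_j \), which is less than \( \delta - c_j \).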

Setting \( \phi ^1(Q) := \phi (Q) \) and \( \phi ^{\ell + 1}(Q) := \phi (\phi ^\ell (Q)) \) for \( \ell \in {\mathbb {Z}}_{\ge 1} \), and using the trivial fact that the notch of a non-trivial inequality is at most n, we immediately obtain the following corollary.

Corollary 6

Let \( \phi \) be a reduced Boolean formula defining a set \( S \subseteq \{0,1\}^n \) and let \( Q \subseteq [0,1]^n \) be any convex set containing S. Then we have \( \phi ^n(Q) = {\text {conv}}(S) \).

Another consequence of Theorem 5 is that integer points not belonging to S are already excluded from \( \phi (Q) \).

Corollary 7

Let \( \phi \) be a reduced Boolean formula defining a set \( S \subseteq \{0,1\}^n \) and let \( Q \subseteq [0,1]^n \) be any convex set containing S. Then we have \( \phi (Q) \cap {\mathbb {Z}}^n = S \).

Proof

It suffices to show that no point from \( \{0,1\}^n {\setminus } S \) is contained in \( \phi (Q) \). To this end, fix \( {\bar{x}} \in \{0,1\}^n {\setminus } S \) and consider the inequality

$$\begin{aligned} \sum _{i \in [n] : \bar{x}_i = 0} x_i + \sum _{i \in [n] : \bar{x}_i = 1} (1 - x_i) \geqslant 1\,, \end{aligned}$$

which is violated by \( {\bar{x}} \), but valid for all other points of \(\{0,1\}^n\). Since the inequality has notch 1, by Theorem 5 it is also valid for \( \phi (Q) \) and hence \( {\bar{x}} \) is not contained in \( \phi (Q) \). \(\square \)

5 Applications

In this section, we present several applications of our procedure, in which we repeatedly make use of Theorems 4 and 5.

5.1 Monotone formulas for matching

As a first application, we demonstrate how our findings together with Rothvoß’ result [25] on the extension complexity of the matching polytope yield a very simple proof of a seminal result of Raz and Wigderson [23, Theorem 4.1], which states that any monotone Boolean formula deciding whether a graph on n nodes contains a perfect matching has size \( 2^{\Omega (n)} \). Before giving any further detail, we point out that Raz and Wigderson’s result extends to the bipartite case [23, Theorem 4.2], which is not the case for the polyhedral approach described below.

The fact that Rothvoß’ theorem implies Raz and Wigderson’s was first discovered by Göös, Jain and Watson [15]. While their arguments are based on connections between nonnegative ranks of certain slack matrices and Karchmer-Wigderson games, which implicitly play an important role in the proofs of Theorems 4 and 5, our results yield a straightforward proof that does not require any further notions.

To this end, let \( n \in {\mathbb {Z}}_{\ge 2} \) be even and let \( G = (V, E) \) denote the complete undirected graph on n nodes. The set S considered by Raz and Wigderson is the set

$$\begin{aligned} S := \{ x \in \{0,1\}^E : {\text {supp}}(x) \subseteq E \text { contains a perfect matching} \}. \end{aligned}$$

Let \( \phi \) be any monotone Boolean formula in variables \( x_e \) (\( e \in E \)) that defines S. Next, define the polytope

$$\begin{aligned} P := \{ x \in [0,1]^E : x(\delta (U)) \geqslant 1 \text { for every } U \subseteq V \text { with } |U| \text { odd} \}. \end{aligned}$$

It is a basic fact that S is contained in P. Furthermore, observe that every non-trivial inequality in the definition of P has pitch 1. Thus, by Theorem 4 applied to \( Q = [0,1]^E \) with \( p= 0 \), we have \( {\text {conv}}(S) \subseteq \phi ([0,1]^E) \subseteq P \). Moreover, if we consider the affine subspace

$$\begin{aligned} D := \{ x \in {\mathbb {R}}^E : x(\delta (\{u\})) = 1 \text { for every } u \in V \}, \end{aligned}$$

it is well-known that both \( {\text {conv}}(S) \cap D \) and \( P \cap D \) are equal to the perfect matching polytope of G, and hence we obtain that \( \phi ([0,1]^E) \cap D \) is also equal to the perfect matching polytope of G. By Rothvoß’ result, this implies \( {\text {xc}}(\phi ([0,1]^E)) = 2^{\Omega (n)} \). On the other hand, by Proposition 3 we also have \( {\text {xc}}(\phi ([0,1]^E)) \le |\phi | \cdot {\text {xc}}([0,1]^E) = |\phi | \cdot 2 |E| \le n^2 |\phi | \) and hence \( |\phi | \) must be exponential in n.

5.2 Covering problems: the binary case

In this section, we consider sets \( S \subseteq \{0,1\}^n \) that arise from 0/1-covering problems, in which there is a matrix \( A \in \{0,1\}^{m \times n} \) such that \( S = \{ x \in \{0,1\}^n : Ax \geqslant \mathbf {1} \} \), where \( \mathbf {1} \) is the all-ones vector. As an example, if A is the edge-node incidence matrix of an undirected graph G (with rows indexed by edges and columns by nodes), then the points of S correspond to vertex covers in G. This shows that, in general, the convex hull of such sets S may not admit polynomial-size (in n) extended formulations, see, for example, [4, 14, 15].

Moreover, general 0/1-hierarchies may have difficulties identifying basic inequalities even in simple instances. For example, in [8] it is shown that if \( Ax \geqslant \mathbf {1} \) consists of the inequalities \( \sum _{i \in [n] \setminus \{j\}} x_i \geqslant 1 \) for each \( j \in [n] \), then it takes at least \( n - 2 \) rounds of the Lovász-Schrijver or Sherali-Adams hierarchy to satisfy the pitch-2 inequality \( \sum _{i \in [n]} x_i \geqslant 2 \).
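The instance above is small enough to examine by brute force. The following sketch (with hypothetical names and the arbitrary choice \( n = 5 \)) checks that every feasible 0/1-point has at least two ones, while the fractional point with all coordinates equal to \( 1/(n-1) \) satisfies the covering system but violates \( \sum _{i \in [n]} x_i \geqslant 2 \).

```python
from fractions import Fraction
from itertools import product

def satisfies_system(x):
    """Check sum_{i != j} x_i >= 1 for every j."""
    total = sum(x)
    return all(total - xj >= 1 for xj in x)

n = 5
S = [x for x in product((0, 1), repeat=n) if satisfies_system(x)]

# Every feasible 0/1 point has Hamming weight at least 2.
min_weight = min(sum(x) for x in S)

# The fractional point (1/(n-1), ..., 1/(n-1)) lies in the linear relaxation.
frac_point = tuple(Fraction(1, n - 1) for _ in range(n))
frac_feasible = satisfies_system(frac_point)
frac_weight = sum(frac_point)  # n/(n-1), strictly below 2 for n > 2
```

Exact rational arithmetic is used for the fractional point so that the tight constraints are not affected by rounding.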

By developing a hierarchy tailored to 0/1-covering problems, Bienstock and Zuckerberg [7] were able to bypass some of these issues. As their main result, for each \( k \in {\mathbb {N}} \), they construct a polytope \( f^k(Q) \) containing S satisfying the following two properties. First, every inequality of pitch at most k that is valid for S is also valid for \( f^k(Q) \). Second, \( f^k(Q) \) can be described by an extended formulation of size \( (m + n)^{g(k)} \), where \( g(k) = \Omega (k^2) \). However, constructing the polytope \( f^k(Q) \) is quite technical and involved.

In contrast, our procedure directly implies significantly simpler and smaller extended formulations that satisfy all pitch-k inequalities.

Corollary 8

Let \( A \in \{0,1\}^{m \times n} \), \( S = \{ x \in \{0,1\}^n : Ax \geqslant \mathbf {1} \} \), and \(k \in {\mathbb {N}}\). Then there is a polyhedral relaxation P of S such that all points of P satisfy all valid inequalities of pitch at most k, and P can be defined by an extended formulation of size at most \(2n \cdot (mn)^k\).

Proof

Let \(\phi := \bigwedge _{i = 1}^m \bigvee _{j : A_{ij} = 1} x_j\). Since \([0,1]^n\) has 2n facets and \(\phi \) has size at most mn, we may take \(P=\phi ^k([0,1]^n)\) by Theorem 4. \(\square \)
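To illustrate the formula used in this proof, the sketch below builds the clauses of \(\phi \) from a small 0/1 matrix, checks by enumeration that \(\phi \) accepts exactly the points of S, and evaluates the size bound of Corollary 8. The function names, the example matrix, and the reading of \( |\phi | \) as the number of variable occurrences are assumptions made for illustration; the construction of \(\phi ^k([0,1]^n)\) itself is not implemented.

```python
from itertools import product

def cnf_from_matrix(A):
    """Clause i lists the columns j with A[i][j] == 1."""
    return [[j for j, a in enumerate(row) if a == 1] for row in A]

def evaluate(clauses, x):
    """Evaluate the conjunction of disjunctions on a 0/1 vector x."""
    return all(any(x[j] == 1 for j in clause) for clause in clauses)

def in_S(A, x):
    """Direct test of Ax >= 1 (componentwise)."""
    return all(sum(a * xj for a, xj in zip(row, x)) >= 1 for row in A)

def size_bound(m, n, k):
    """The bound 2n * (mn)^k stated in Corollary 8."""
    return 2 * n * (m * n) ** k

# Hypothetical example: edges of a path on 4 vertices (rows are edges).
A = [[1, 1, 0, 0],
     [0, 1, 1, 0],
     [0, 0, 1, 1]]
m, n = len(A), len(A[0])
clauses = cnf_from_matrix(A)
num_occurrences = sum(len(c) for c in clauses)  # at most m * n
agrees = all(evaluate(clauses, x) == in_S(A, x)
             for x in product((0, 1), repeat=n))
bound_k2 = size_bound(m, n, 2)
```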

5.3 Covering problems: bounded coefficients

Next, we consider a more general form of a covering problem in which \( S = \{ x \in \{0,1\}^n : Ax \geqslant b \} \) for some non-negative integer matrix \( A \in {\mathbb {Z}}_{\ge 0}^{m \times n} \) and \( b \in {\mathbb {Z}}_{\ge 1}^m \). We first restrict ourselves to the case that all entries in A and b are bounded by some constant \( \Delta \in {\mathbb {Z}}_{\ge 2} \).

Based on their results in [7], Bienstock and Zuckerberg [8] provide an extended formulation of size \( O(m + n^\Delta )^{g(k)} \), where \( g(k) = \Omega (k^2) \). Our method yields a significantly smaller extended formulation, via the following lemma.

Lemma 9

For every \( A \in {\mathbb {Z}}_{\ge 0}^{m \times n} \) and \( b \in {\mathbb {Z}}_{\ge 1}^m \) with entries bounded by \( \Delta \), the set \( S = \{ x \in \{0,1\}^n : Ax \geqslant b \} \) can be defined by a monotone formula \( \phi \) of size at most \( \Delta ^{5.3} m n \log ^{O(1)}(n) \). Moreover, this formula can be constructed in randomized polynomial time.

Proof

Fix \( i \in [m] \), let \( n' := \sum _{j=1}^n A_{ij} \) and let \( \psi _i \) be a monotone formula defining the set \( \{ y \in \{0,1\}^{n'} : \sum _{k=1}^{n'} y_k \ge b_i\} \). Next, pick any function \( h : [n'] \rightarrow [n] \) such that \( |h^{-1}(j)| = A_{ij} \) for all \( j \in [n] \). In formula \(\psi _i\), replace every occurrence of \( y_{k} \) by \( x_{h(k)} \), for \( k \in [n'] \). We obtain a monotone formula \(\phi _i\) defining the set \( \{x \in \{0,1\}^n : \sum _{j=1}^n A_{ij} x_j \ge b_i\} \). By using the construction of Hoory, Magen and Pitassi [16] for the initial formula \( \psi _i \), the resulting formula \( \phi _i \) has size

$$\begin{aligned} |\phi _i| = |\psi _i| \le \Delta ^{4.3} n' \log ^{O(1)}(n'/\Delta ) \le \Delta ^{5.3} n \log ^{O(1)} (n)\,, \end{aligned}$$

since \( n' \le n \Delta \) and \( b_i \le \Delta \). The result follows by taking \( \phi := \bigwedge _{i=1}^m \phi _i \). \(\square \)
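The substitution step in this proof can be sketched as follows. As a loud caveat, the threshold formula \( \psi _i \) below is a naive recursive monotone formula chosen for brevity; it is not the construction of Hoory, Magen and Pitassi [16] and does not achieve the stated size bound. The tuple encoding of formulas and all function names are assumptions for illustration.

```python
from itertools import product

def threshold(ks, t):
    """Naive monotone formula for 'at least t of the variables y_k, k in ks'."""
    if t <= 0:
        return ("true",)
    if t > len(ks):
        return ("false",)
    first, rest = ks[0], ks[1:]
    # Either y_first holds and t-1 more are needed, or t hold among the rest.
    return ("or",
            ("and", ("var", first), threshold(rest, t - 1)),
            threshold(rest, t))

def substitute(f, h):
    """Replace every occurrence of y_k by x_{h[k]}."""
    if f[0] == "var":
        return ("var", h[f[1]])
    if f[0] in ("and", "or"):
        return (f[0], substitute(f[1], h), substitute(f[2], h))
    return f  # constants are unchanged

def evaluate(f, x):
    """Evaluate a formula on a 0/1 vector x."""
    op = f[0]
    if op == "true":
        return True
    if op == "false":
        return False
    if op == "var":
        return x[f[1]] == 1
    left, right = evaluate(f[1], x), evaluate(f[2], x)
    return (left and right) if op == "and" else (left or right)

def row_formula(row, b):
    """Formula for sum_j row[j] * x_j >= b via duplication of variables."""
    # h maps [n'] to [n] with exactly row[j] preimages of each j.
    h = [j for j, a in enumerate(row) for _ in range(a)]
    psi = threshold(list(range(len(h))), b)
    return substitute(psi, h)

row, b = [2, 0, 1, 3], 3  # hypothetical row of A and right-hand side
phi_i = row_formula(row, b)
matches = all(
    evaluate(phi_i, x) == (sum(a * xj for a, xj in zip(row, x)) >= b)
    for x in product((0, 1), repeat=len(row))
)
```

The check enumerates all 0/1 points, so it is only meant for very small rows.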

Corollary 10

Let \( A \in {\mathbb {Z}}_{\ge 0}^{m \times n} \), \( b \in {\mathbb {Z}}_{\ge 1}^m \) with entries bounded by \( \Delta \), \( S = \{ x \in \{0,1\}^n : Ax \geqslant b \} \), and \(k \in {\mathbb {N}}\). Then there is a polyhedral relaxation P of S such that all points of P satisfy all valid inequalities of pitch at most k, and P can be defined by an extended formulation of size at most \((\Delta ^{5.3} m n \log ^{O(1)}(n))^k\).

Proof

By Theorem 4, we may take \(P= \phi ^k([0,1]^n) \), where \(\phi \) is the formula from Lemma 9. \(\square \)

5.4 Covering problems: the general case

In some cases, especially when \( m = O(1) \), the matrix \( A \in {\mathbb {Z}}_{\ge 0}^{m \times n} \) and vector \( b \in {\mathbb {Z}}_{\ge 1}^m \) may have coefficients as large as \( 2^{\Omega (n \log n)} \). For such general instances, we can improve the bound from Corollary 10.

Corollary 11

Let \( A \in {\mathbb {Z}}_{\ge 0}^{m \times n} \), \( b \in {\mathbb {Z}}_{\ge 1}^m \), \( S = \{ x \in \{0,1\}^n : Ax \geqslant b \} \), and \(k \in {\mathbb {N}}\). Then there is a polyhedral relaxation P of S such that all points of P satisfy all valid inequalities of pitch at most k, and P can be defined by an extended formulation of size at most \( \left( m n^{O(\log n)} \right) ^k \).

Proof

Beimel and Weinreb [5] show that, for every \( a_1,\dotsc ,a_n,\delta \in {\mathbb {R}}_{\ge 0} \), the set \( \{ x \in \{0,1\}^n : \sum _{j=1}^n a_j x_j \geqslant \delta \} \) can be decided by a monotone formula of size \( n^{O(\log n)} \). Let \( \phi \) be the conjunction of these formulas for each inequality in \( Ax \geqslant b \). By Theorem 4, we may take \(P= \phi ^k([0,1]^n) \). \(\square \)

In comparison, for this general case, Bienstock and Zuckerberg [7] have no nontrivial upper bound.

5.5 Constant notch 0/1-sets

In this section, we consider non-empty sets \(S \subseteq \{0,1\}^n\) with constant notch \( \nu (S) \). These sets have several desirable properties. For example, as noted in [3] (and implicitly in [11]), there is an easy polynomial-time algorithm to optimize a linear function over a constant notch set S, provided that we have a polynomial-time membership oracle for S. On the other hand, sets with constant notch do not necessarily admit small extended formulations. Indeed, counting arguments developed in [1, 24] show that even for a “generic” set \( S \subseteq \{0,1\}^n \) with notch \( \nu (S) = 1 \), \({\text {conv}}(S)\) requires extended formulations of size \( 2^{\Omega (n)} \).

This raises the question of which constant notch sets do admit compact extended formulations. As an immediate corollary to Theorem 5, we have the following nice partial answer.

Corollary 12

If \( S \subseteq \{0,1\}^n \) has constant notch and S can be described by a formula \(\phi \) of size polynomial in n, then \({\text {conv}}(S)\) can be described by a polynomial-size extended formulation.

Notice that every explicit 0/1-set S of constant notch such that \({\text {xc}}({\text {conv}}(S))\) is large would thus provide an explicit Boolean function requiring large depth circuits, and solve one of the hardest open problems in circuit complexity.

6 Comparison and conclusion

In this paper, we propose a new method for strengthening convex relaxations of 0/1-sets. Our approach currently yields the simplest and smallest linear extended formulations expressing inequalities of constant pitch in the monotone case, and constant notch in the general case.

By viewing an iterated application of our procedure as a hierarchy, we obtain a significant simplification of the Bienstock-Zuckerberg hierarchy [7]. Prior to our work, [7] had been simplified by Mastrolilli [22] using a modification of the Sherali-Adams hierarchy. Subsequent to our work, [7] has also been simplified by Bienstock and Zuckerberg [9] themselves for the case of \( A \in \{0,1\}^{m \times n} \). The way [9] construct their extended formulation is similar to what we do, except that they replace the canonical monotone formula in (1) by a (logically equivalent) non-monotone formula, which might yield a tighter relaxation in some cases.

Although [22] is an important simplification of [7], our approach is from first principles and assumes no knowledge of polynomial optimization. Moreover, despite the simplicity of our approach, our extended formulations (see Corollaries 8, 10, and 11) are significantly smaller than those provided by [7, 22]. This is possible since we allow any monotone formula, and can thus use any known construction from the literature. In contrast, Bienstock and Zuckerberg [7] implicitly only consider formulas in conjunctive normal form. The number of clauses in every formula in conjunctive normal form is at least the number of minimal covers (see Footnote 5), which makes it impossible for them to construct small extended formulations in situations where the number of minimal covers is large.

Furthermore, the way in which we derive our extended formulations is conceptually different from that of [22]. Mastrolilli [22] first writes down a proof of validity of any bounded-pitch inequality that has a certain “polynomial” form (similar to a sum-of-squares proof, except that no square is necessary). He then uses this proof to recursively define a set of polynomials \(\mathcal {S} = \mathcal {S}(A,k)\), and then constructs an extended formulation generalizing the Sherali-Adams hierarchy from \(\mathcal {S}\). At the heart of his approach is a lemma due to Bienstock and Zuckerberg [7, Lemma 4.2].

In our paper, we give a direct way to strengthen any given relaxation by “feeding” it into a Boolean formula \(\phi \) defining the set of feasible 0/1 solutions. That is, we first describe how to construct the extended formulation. Then we prove that each iteration (of the same procedure) “gives at least one extra unit of pitch”. At the heart of our analysis lies a new ingredient (coming from a Karchmer-Wigderson game) replacing the lemma from Bienstock and Zuckerberg. This is the reason why we improve the exponential dependence in k from \(k^2\) to k in Corollary 8.

Finally, as far as we can tell, our results from Sects. 5.1 and 5.5 are completely independent from [7, 9, 22]. To conclude, we state a few open questions raised by our work.

  1. Do the new extended formulations lead to any new interesting algorithmic application, in particular for covering problems? This appears to be connected to the following question. How good are the lower bounds on the optimum value obtained after performing a few rounds of the Chvátal-Gomory closure? For some problems, such as the vertex cover problem in graphs or more generally in q-uniform hypergraphs with \(q = O(1)\), the bounds turn out to be quite poor in the worst case [4, 28]. The situation is less clear for other problems, such as network design problems. Recent work [13] on the tree augmentation problem uses certain inequalities from the first Chvátal-Gomory closure in an essential way. For the related 2-edge connected spanning subgraph problem, our work implies that one can approximately optimize over the \(\ell \)-th Chvátal-Gomory closure in quasi-polynomial time, for every \(\ell = O(1)\).

  2. For which classes of polytopes in \( [0,1]^n \) can one approximate a constant number of rounds of the Chvátal-Gomory closure with compact extended formulations? Mastrolilli [22] shows that this is possible for packing problems. However, his approach crucially uses positive semi-definite extended formulations. Packing problems are unlikely to admit compact linear extended formulations, although we do not have a proof of this.

  3. Can one find polynomial-size monotone formulas for any nonnegative weighted threshold function, that is, for every min-knapsack \(\{x \in \{0,1\}^n : \sum _{i=1}^n a_i x_i \geqslant \beta \}\)? This would improve on the \(n^{O(\log n)}\) upper bound by Beimel and Weinreb [5]. Klabjan, Nemhauser and Tovey show that separating pitch-1 inequalities for such sets is NP-hard [19]. However, this does not rule out a polynomial-size extended formulation defining a relaxation that would be stronger than that provided by pitch-1 inequalities.