1 Introduction

Consider a (convex) polytope P in \(\mathbb{R}^d\). An extension (or extended formulation) of P is a polytope Q in \(\mathbb{R}^e\) such that P is the image of Q under a linear projection from \(\mathbb{R}^e\) to \(\mathbb{R}^d\). The main motivation for seeking extensions Q of the polytope P is perhaps that the number of facets of Q can sometimes be significantly smaller than that of P. This phenomenon has already found numerous applications in optimization, in particular in linear and integer programming. To our knowledge, systematic investigations began at the end of the 1980s with the work of Martin [13] and Yannakakis [17], among others. Recently, the subject has been receiving an increasing amount of attention. See, e.g., the surveys by Conforti, Cornuéjols and Zambelli [4], Vanderbeck and Wolsey [16], and Kaibel [10].

A striking example, which is relevant to this paper, arises when P is a regular n-gon in \(\mathbb{R}^2\). As follows from results of Ben-Tal and Nemirovski [2], for such a polytope P, one can construct an extension Q with as few as O(log n) facets. It remained an open question to determine to what extent such a dramatic decrease in the number of facets is possible when P is a non-regular n-gon. This is the main question we address in this paper.

Before giving an outline of the paper, we state a few more definitions. The size of an extension Q is simply the number of facets of Q. The extension complexity of P, denoted by xc(P), is the minimum size of an extension of P. See Fig. 1 for an illustration.

Fig. 1 Proof by picture that the extension complexity of a regular 8-gon is at most 6. Here P ⊆ \(\mathbb{R}^2\) is a regular 8-gon, Q ⊆ \(\mathbb{R}^3\) is a polytope combinatorially equivalent to a 3-cube, and \(\pi:\mathbb{R}^3\to\mathbb{R}^2\) is a linear projection map such that π(Q)=P

Notice that the extension complexity of every n-gon is Ω(log n). This follows from the fact that any extension Q with k facets has at most \(2^k\) faces. Since each face of P is the projection of a face of the extension Q, it follows that Q must have at least \(\log_2 f\) facets if P has f faces [7]. An n-gon has 2n+2 faces (n vertices, n edges, the empty face and P itself), so \(\mathrm{xc}(P)\geqslant\log_2(2n+2)=\Omega(\log n)\). When P is a regular n-gon, we have xc(P)=Θ(log n).

One of the fundamental results that can be found in Yannakakis’ groundbreaking paper [17] is a characterization of the extension complexity of a polytope in terms of the non-negative rank of its slack matrix. Although this is discussed in detail in Sect. 2, we include a brief description here. To each polytope P one can associate a matrix S(P) that records, in the entry in the ith row and jth column, the slack of the jth vertex with respect to the ith facet. This matrix is the ‘slack matrix’ of P. It turns out that computing xc(P) amounts to determining the minimum number r such that there exists a factorization of the slack matrix of P as S(P)=TU, where T is a non-negative matrix with r columns and U is a non-negative matrix with r rows. Such a factorization is called a ‘rank r non-negative factorization’ of the slack matrix S(P).

In Sect. 3, we give an explicit non-negative factorization of rank O(log n) of the slack matrix of a regular n-gon. This provides a new proof that the extension complexity of every regular n-gon is O(log n). Our proof technique directly generalizes to other polytopes, such as the permutahedron. In particular, we obtain a new proof of the fact that the extension complexity of the n-permutahedron is O(n log n), a result due to Goemans [7]. Our approach builds on a new proof of this result by Kaibel and Pashkovich [11] but is different because it works by directly constructing a non-negative factorization of the slack matrix.

In Sect. 4, we prove that there exist n-gons whose extension complexity is at least \(\sqrt{2n}\). However, the proof uses polygons whose coordinates are transcendental numbers, which is perhaps not entirely satisfactory. For instance, one might ask whether a similar result holds when the encoding length of each vertex of the polygon is O(logn).

In Sect. 5, we settle this last question by proving the existence of n-gons whose vertices belong to an O(n) × O(\(n^2\)) integer grid and with extension complexity \(\varOmega (\sqrt{n}/\sqrt{\log n})\). This is inspired by recent work of one of the authors on the extension complexity of 0/1-polytopes [14].

2 Slack Matrices and Non-negative Factorizations

Consider a polytope P in \(\mathbb{R}^d\) with m facets and n vertices. Let \(A_1x\leqslant b_1,\ldots,A_mx\leqslant b_m\) denote the facet-defining inequalities of P, where \(A_1,\ldots,A_m\) are row vectors. Let also \(v_1,\ldots,v_n\) denote the vertices of P. The slack matrix of P is the non-negative m×n matrix S=S(P) with \(S_{ij}=b_i-A_iv_j\).
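As a small illustration of this definition, the following Python sketch computes the slack matrix of a convex polygon from its vertices listed in counterclockwise order. The function name `slack_matrix` and the use of unnormalized outward edge normals as the rows \(A_i\) are assumptions made for this example only; rescaling a row by a positive factor gives an equally valid slack matrix.

```python
def slack_matrix(vertices):
    """Slack matrix of a convex polygon whose vertices are given counterclockwise.

    Row i corresponds to the edge from vertices[i] to vertices[i+1],
    column j to vertices[j]; the entry is b_i - A_i v_j.
    """
    n = len(vertices)
    S = []
    for i in range(n):
        (px, py), (qx, qy) = vertices[i], vertices[(i + 1) % n]
        # Outward normal of a counterclockwise edge with direction (dx, dy) is (dy, -dx).
        ax, ay = qy - py, px - qx
        b = ax * px + ay * py
        S.append([b - (ax * vx + ay * vy) for vx, vy in vertices])
    return S


# Unit square: each edge has slack 0 at its two endpoints and 1 at the other two vertices.
square_slack = slack_matrix([(0, 0), (1, 0), (1, 1), (0, 1)])
```

With integer input the entries are exact integers, so the non-negativity of S can be checked directly.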

A rank r non-negative factorization of a non-negative matrix S is an expression of S as a product S=TU where T and U are non-negative matrices with r columns and r rows, respectively. The non-negative rank of S, denoted by \(\mathrm{rank}_+(S)\), is the minimum number r such that S admits a rank r non-negative factorization [3].

The following theorem is (essentially) due to Yannakakis, see also [6].

Theorem 1

(Yannakakis [17])

For all polytopes P,

$$\mathop {\mathrm {xc}}(P) = \mathop {\mathrm {rank}_+}\bigl(S(P)\bigr).$$

To conclude this section, we briefly indicate how to obtain extensions from non-negative factorizations, and prove half of Theorem 1. Assuming \(P=\{x\in\mathbb{R}^d : Ax\leqslant b\}\), consider a rank r non-negative factorization S(P)=TU of the slack matrix of P. Then it can be shown that the image of the polyhedron \(Q:=\{(x,y)\in\mathbb{R}^{d+r} : Ax+Ty=b,\ y\geqslant 0\}\) under the projection \(\mathbb{R}^{d+r}\to\mathbb{R}^d : (x,y)\mapsto x\) is exactly P. Notice that Q has at most r facets, since its only inequalities are the r constraints y ⩾ 0. Now if we take \(r=\mathrm{rank}_+(S(P))\), then Q is actually a polytope [5]. Thus Q is an extension of P with at most \(\mathrm{rank}_+(S(P))\) facets, and hence \(\mathrm{xc}(P)\leqslant\mathrm{rank}_+(S(P))\).

3 Regular Polygons

First, we give a new proof of the tight logarithmic upper bound on the extension complexity of a regular n-gon. This result is implicit in work by Ben-Tal and Nemirovski [2] (although for n being a power of two). Another proof can be found in Kaibel and Pashkovich [11]. Then, we discuss a generalization of the proof to related higher-dimensional polytopes.

Theorem 2

Let P be a regular n-gon in \(\mathbb{R}^2\). Then xc(P)=O(log n).

Proof

Without loss of generality, we may assume that the origin is the barycenter of P. After numbering the vertices of P counterclockwise as \(v_1,\ldots,v_n\), we define a sequence \(\ell_0,\ldots,\ell_{q-1}\) of axes of symmetry of P, as follows.

Initialize i to 0, and k to n. While k>1, repeat the following steps:

  • define \(\ell_i\) as the line through the origin and the midpoint of vertices \(v_{ \lceil\frac{k}{2} \rceil}\) and \(v_{ \lceil\frac{k+1}{2} \rceil}\);

  • replace k by \(\lfloor\frac{k+1}{2} \rfloor\);

  • increase i by one.

Define q as the final value of i. Thus, q is the number of axes of symmetry \(\ell_i\) defined. Note that when k=k(i) is odd, \(\ell_i\) passes through one of the vertices of P. Note also that q=O(log n). For each i=0,…,q−1, one of the two closed half-planes bounded by \(\ell_i\) contains \(v_1\). We denote it by \(\ell_{i}^{+}\), and the other by \(\ell_{i}^{-}\).
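The loop above only depends on n, so the number q of axes can be tabulated directly. The following Python sketch records the successive values of k; the function name `folding_schedule` is introduced here for illustration only.

```python
def folding_schedule(n):
    """Successive values of k used to define the axes l_0, ..., l_{q-1} of a regular n-gon."""
    ks = []
    k = n
    while k > 1:
        ks.append(k)          # the axis l_i is defined from this value of k
        k = (k + 1) // 2      # replace k by floor((k + 1) / 2)
    return ks


schedule_15 = folding_schedule(15)   # values of k for the 15-gon of Fig. 2
q_15 = len(schedule_15)              # number of axes
```

Since each step replaces k by ⌈k/2⌉, the number of axes is q = ⌈log₂ n⌉, which agrees with q=4 for n=15.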

Now, consider a vertex v of P. We define the folding sequence \(v^{(0)},v^{(1)},\ldots,v^{(q)}\) of v as follows. We let \(v^{(0)}:=v\), and for i=0,…,q−1, we let \(v^{(i+1)}\) denote the image of \(v^{(i)}\) under the reflection with respect to \(\ell_i\) if \(v^{(i)}\) is not in the half-plane \(\ell_{i}^{+}\), and we let \(v^{(i+1)}:=v^{(i)}\) otherwise. In other words, \(v^{(i+1)}\) is the image of \(v^{(i)}\) under the conditional reflection with respect to the half-plane \(\ell_{i}^{+}\). By construction, we always have \(v^{(q)}=v_1\).

Next, consider a facet F of P. The folding sequence \(F^{(0)},F^{(1)},\ldots,F^{(q)}\) of the facet F is defined similarly to the folding sequence of a vertex v. Pick any inequality \(a^{T}x\leqslant\beta\) defining F. We let \(a^{(0)}:=a\), and for i=0,…,q−1, we let \(a^{(i+1)}\) denote the image of \(a^{(i)}\) under the conditional reflection with respect to \(\ell_{i}^{+}\). Then \(F^{(i)}\) is the facet of P defined by \((a^{(i)})^{T}x\leqslant\beta\). The last facet \(F^{(q)}\) in the folding sequence is always either the segment \([v_1,v_2]\) or the segment \([v_1,v_n]\). See Fig. 2 for an illustration with n=15, and thus q=4.

Fig. 2 A 15-gon with four axes of symmetry, a vertex- and a facet-folding sequence

Finally, we define a non-negative factorization S(P)=TU of the slack matrix of P, of rank 2q=O(log n). Below, let \(d(x,\ell_i)\) denote the distance of \(x\in\mathbb{R}^2\) to the line \(\ell_i\).

In the left factor T of the factorization, the row corresponding to facet F is of the form \((t_0,\ldots,t_{q-1})\), where \(t_{i} := (\sqrt{2}\,d(a^{(i)},\ell_{i}),0)\) if \(a^{(i)}\) is not in \(\ell_{i}^{+}\) and \(t_{i} :=(0,\sqrt{2}\,d(a^{(i)},\ell_{i}))\) otherwise. Similarly, in the right factor U, the column corresponding to vertex v is of the form \((u_0,\ldots,u_{q-1})^{T}\), where \(u_{i} := (0,\sqrt{2}\,d(v^{(i)},\ell_{i}))^{T}\) if \(v^{(i)}\) is not in \(\ell_{i}^{+}\) and \(u_{i} := (\sqrt{2}\,d(v^{(i)},\ell_{i}),0)^{T}\) otherwise.

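The correctness of the factorization rests on the following simple observation: for i=0,…,q−1 the slack of \(v^{(i)}\) with respect to \(F^{(i)}\) equals the slack of \(v^{(i+1)}\) with respect to \(F^{(i+1)}\) plus some correction term. If \(a^{(i)}\) and \(v^{(i)}\) are on opposite sides of \(\ell_i\), then the correction term is \(2\,d(a^{(i)},\ell_i)\,d(v^{(i)},\ell_i)\). Otherwise, it is zero (no correction is necessary). Indeed, letting \(n_i\) denote a unit vector normal to \(\ell_i\), and assuming that \(v^{(i)}\) and \(a^{(i)}\) are on opposite sides of \(\ell_i\), exactly one of the two points is reflected, and we have

$$\beta - \bigl(a^{(i)}\bigr)^{T} v^{(i)} = \beta - \bigl(a^{(i+1)}\bigr)^{T} v^{(i+1)} - 2\bigl(n_i^{T}a^{(i)}\bigr)\bigl(n_i^{T}v^{(i)}\bigr) = \beta - \bigl(a^{(i+1)}\bigr)^{T} v^{(i+1)} + 2\,d\bigl(a^{(i)},\ell_i\bigr)\,d\bigl(v^{(i)},\ell_i\bigr).$$

When \(v^{(i)}\) and \(a^{(i)}\) are on the same side of \(\ell_i\), either both are reflected or neither is, so we obviously have

$$\beta - \bigl(a^{(i)}\bigr)^{T} v^{(i)} = \beta - \bigl(a^{(i+1)}\bigr)^{T} v^{(i+1)}.$$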

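Observe that the slack of \(v^{(q)}\) with respect to \(F^{(q)}\) is always 0, since \(v^{(q)}=v_1\) lies on \(F^{(q)}\). Summing the correction terms over i=0,…,q−1 therefore shows that the entry of TU indexed by F and v equals the slack of v with respect to F. The theorem follows. □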
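The construction in this proof can be checked numerically. The Python sketch below builds S, T and U for a regular n-gon and measures the largest entry of |S − TU|. Several choices are assumptions of this sketch rather than part of the proof: the circumradius is 1, \(v_j\) is placed at angle 2π(j−1)/n, facet normals have unit length (so β = cos(π/n)), and the axis \(\ell_i\) is taken in direction π(k−1)/n, which is what the midpoint rule gives under this placement. The function names are hypothetical.

```python
import math


def regular_ngon_factorization(n):
    """Return (S, T, U) for a regular n-gon of circumradius 1; U is stored column by column."""
    # Vertex j (0-based) is v_{j+1}; facet j is the edge from v_{j+1} to v_{j+2}.
    verts = [(math.cos(2 * math.pi * j / n), math.sin(2 * math.pi * j / n)) for j in range(n)]
    normals = [(math.cos(math.pi * (2 * j + 1) / n), math.sin(math.pi * (2 * j + 1) / n))
               for j in range(n)]
    beta = math.cos(math.pi / n)  # every facet reads a^T x <= beta with |a| = 1

    # Unit normal n_i of each axis l_i, oriented so that v_1 = (1, 0) lies in l_i^+.
    axes, k = [], n
    while k > 1:
        phi = math.pi * (k - 1) / n  # direction of l_i for the current k
        axes.append((math.sin(phi), -math.cos(phi)))
        k = (k + 1) // 2

    def fold(x, is_facet):
        """Fold x along all axes; return its row of T (facet) or column of U (vertex)."""
        out = []
        for nx, ny in axes:
            s = nx * x[0] + ny * x[1]          # signed distance; s < 0 means x is not in l_i^+
            w = math.sqrt(2) * abs(s)
            reflected = s < 0
            if reflected:
                x = (x[0] - 2 * s * nx, x[1] - 2 * s * ny)
            # Facets put w first when reflected, vertices put w first when not reflected.
            out += [w, 0.0] if reflected == is_facet else [0.0, w]
        return out

    T = [fold(a, True) for a in normals]
    U = [fold(v, False) for v in verts]
    S = [[beta - (a[0] * v[0] + a[1] * v[1]) for v in verts] for a in normals]
    return S, T, U


def max_factorization_error(n):
    """Largest absolute entry of S - TU for the regular n-gon."""
    S, T, U = regular_ngon_factorization(n)
    return max(abs(S[f][v] - sum(t * u for t, u in zip(T[f], U[v])))
               for f in range(n) for v in range(n))
```

In floating-point arithmetic the error is expected to be at the level of rounding noise. Points lying exactly on an axis have distance zero, so it does not matter on which side the sign test places them.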

The n-permutahedron is the polytope of dimension n−1 in \(\mathbb{R}^n\) whose n! vertices are the points obtained by permuting the coordinates of \((1,2,\ldots,n)^{T}\). It has \(2^n-2\) facets, defined by the inequalities \(\sum_{j\in S}x_j\leqslant g(|S|)\) for all proper non-empty subsets S of [n]:={1,2,…,n}, where \(g(s) := {n+1 \choose2} - {n-s+1 \choose2}\).

Let j and k denote two elements of [n] such that j<k. We denote by \(H_{j,k}\) the hyperplane defined by \(x_j=x_k\), and by \(H_{j,k}^{+}\) the closed half-space defined by \(x_j\leqslant x_k\). Applying the conditional reflection with respect to \(H_{j,k}^{+}\) to a vector \(x\in\mathbb{R}^n\) amounts to swapping the coordinates \(x_j\) and \(x_k\) if and only if \(x_j>x_k\). Intuitively, the conditional reflection with respect to \(H_{j,k}^{+}\) sorts the coordinates \(x_j\) and \(x_k\).

The proof of Theorem 2 can be modified to give a new proof of the existence of an extension of the n-permutahedron of size O(n log n) [7], as follows. Since there exists a sorting network of size O(n log n) for sorting n inputs, a celebrated result of Ajtai, Komlós and Szemerédi [1], there exist q=O(n log n) half-spaces \(H_{j_{0},k_{0}}^{+}\), \(H_{j_{1},k_{1}}^{+}, \ldots, H_{j_{q-1},k_{q-1}}^{+}\) such that sequentially applying the conditional reflection with respect to \(H_{j_{i},k_{i}}^{+}\) for i=0,…,q−1 to any point \(x\in\mathbb{R}^n\) sorts this point x.

Therefore, the folding sequence of any vertex v of the n-permutahedron always ends with the vertex \((1,2,\ldots,n)^{T}\). Moreover, the folding sequence of the facet defined by \(\sum_{j\in S}x_j\leqslant g(|S|)\) always ends with the facet defined by \(\sum_{j=n-|S|+1}^{n} x_{j} \leqslant g(|S|)\). Note that this last facet contains the vertex \((1,2,\ldots,n)^{T}\). Hence the proof technique used above for a regular n-gon extends to the n-permutahedron.
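The correspondence between conditional reflections and comparators can be illustrated with a short Python sketch. Note that it uses the odd-even transposition network, which has O(n²) comparators, purely as an easy-to-write stand-in: it is not the Ajtai–Komlós–Szemerédi network, and the function names are assumptions of this example. Indices are 0-based.

```python
from itertools import permutations


def conditional_reflection(x, j, k):
    """Conditional reflection w.r.t. H_{j,k}^+ (j < k): swap x[j], x[k] iff x[j] > x[k]."""
    x = list(x)
    if x[j] > x[k]:
        x[j], x[k] = x[k], x[j]
    return x


def odd_even_transposition_network(n):
    """Comparators (j, j+1) of the odd-even transposition network with n rounds."""
    return [(j, j + 1) for rnd in range(n) for j in range(rnd % 2, n - 1, 2)]


def fold(x, network):
    """Apply the conditional reflections of the network to x, in order."""
    for j, k in network:
        x = conditional_reflection(x, j, k)
    return x


network = odd_even_transposition_network(5)
# Every vertex of the 5-permutahedron folds to (1, 2, 3, 4, 5).
all_vertices_sorted = all(fold(p, network) == [1, 2, 3, 4, 5]
                          for p in permutations(range(1, 6)))
# The normal vector of the facet for S = {1, 3} folds to that of S = {4, 5}.
folded_normal = fold([1, 0, 1, 0, 0], network)
```

The second example mirrors the statement about facets: the 0/1 normal vector of a facet is sorted into the indicator vector of the last |S| coordinates.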

In fact, it turns out that the proof technique further extends to the permutahedron of any finite reflection group. One simply has to choose the right sequence of conditional reflections. Such sequences were constructed by Kaibel and Pashkovich [11], with the help of Ajtai–Komlós–Szemerédi sorting networks. Thus we can re-prove their main results about permutahedra of finite reflection groups. Our proof is different in the sense that we explicitly construct a non-negative factorization of the slack matrix.

4 Generic Polygons

We begin by recalling some basic facts about field extensions (see, e.g., Hungerford [9], Lang [12], or Stewart [15]). Let L be a field and K be a subfield of L. Then L is an extension field of K, and L/K is a field extension. We say that the field extension L/K is algebraic if every element of L is algebraic over K, that is, for each element of L there exists a non-zero polynomial with coefficients in K that has the element as one of its roots.

For \(\alpha_1,\ldots,\alpha_q\in L\), the inclusion-wise minimal subfield of L that contains both K and \(\{\alpha_1,\ldots,\alpha_q\}\) is denoted by \(K(\{\alpha_1,\ldots,\alpha_q\})\), or simply \(K(\alpha_1,\ldots,\alpha_q)\). It is also the subfield formed by all fractions \(\frac{f(\alpha_{1},\ldots,\alpha_{q})}{g(\alpha_{1},\ldots,\alpha_{q})}\) where f and g are polynomials with coefficients in K and \(g(\alpha_1,\ldots,\alpha_q)\neq 0\).

A subset X of L is said to be algebraically independent over K if no non-trivial polynomial relation with coefficients in K holds among the elements of X. The transcendence degree of the field extension L/K is defined as the largest cardinality of an algebraically independent subset of L over K. It is also the minimum cardinality of a subset Y of L such that L/K(Y) is algebraic.

We say that a polygon in \(\mathbb{R}^2\) is generic if the coordinates of its vertices are distinct and form a set that is algebraically independent over the rationals.

Theorem 3

If P is a generic convex n-gon in \(\mathbb{R}^2\) then \(\mathop {\mathrm {xc}}(P)\geqslant\sqrt{2n}\).

Proof

Let \(\alpha_1,\ldots,\alpha_{2n}\) denote the coordinates of the n vertices of P, listed in any order. Thus \(X:=\{\alpha_1,\ldots,\alpha_{2n}\}\) is algebraically independent over ℚ.

Now suppose that P is the projection of a d-dimensional polytope Q with k facets. Without loss of generality, we may assume that Q lives in \(\mathbb{R}^d\) and that the projection is onto the first two coordinates.

Consider any linear description of Q. This description is defined by k(d+1) real numbers: the kd entries of the constraint matrix and the k right-hand sides. We denote these reals by \(\beta_1,\ldots,\beta_{k(d+1)}\). Each vertex of P is the projection of a vertex of Q, which is the unique solution of a linear system formed by d of the constraints. Hence, by Cramer’s rule, each \(\alpha_i\) can be written as \(\alpha_{i} = \frac{f_{i}(\beta_{1},\ldots,\beta_{k(d+1)})}{g_{i}(\beta_{1},\ldots,\beta_{k(d+1)})}\) where \(f_i\) and \(g_i\) are polynomials with rational coefficients and \(g_i(\beta_1,\ldots,\beta_{k(d+1)})\neq 0\). In particular, this means that each \(\alpha_i\) is in the extension field \(L:=\mathbb{Q}(\beta_1,\ldots,\beta_{k(d+1)})\).

Since X is algebraically independent over ℚ and X ⊆ L, the transcendence degree of L/ℚ is at least 2n. But on the other hand, the transcendence degree of L/ℚ is at most k(d+1). Indeed, letting \(Y:=\{\beta_1,\ldots,\beta_{k(d+1)}\}\), we have ℚ(Y)=L and thus L/ℚ(Y) is algebraic. It follows that k(d+1) ⩾ 2n. Since k ⩾ d+1, we see that \(k^2\geqslant 2n\), hence \(k \geqslant\sqrt{2n}\). □

5 Polygons with Integer Vertices

Since encoding transcendental numbers would require an infinite number of bits, an objection might be raised that Theorem 3 is not very satisfying. In this section we provide a slightly weaker lower bound with polygons whose vertices can be encoded efficiently. In particular we will now show that for every n there exist polygons with vertices on an O(n) × O(\(n^2\)) grid and whose extension complexity is large. To do this we will need a slightly modified version of a rounding lemma proved by Rothvoß [14], see Lemma 5 below.

For a matrix A let \(A_{\ell}\) (resp. \(A^{\ell}\)) denote the ℓth row (resp. ℓth column) of A. Similarly, for a subset I of row indices of A, let \(A_I\) denote the submatrix of A obtained by picking the rows indexed by the elements of I.

Let T and U be m×r and r×n non-negative matrices. Since below T and U will be respectively the left and right factor of a factorization of some slack matrix, we can assume that no column of T is identically zero and, similarly, no row of U is identically zero. The pair T,U is said to be normalized if \(\|T^{\ell}\|_{\infty}=\|U_{\ell}\|_{\infty}\) for every ℓ∈[r], where \(\|\cdot\|_{\infty}\) denotes the largest absolute value of an entry. Since multiplying column ℓ of T by λ>0 and simultaneously dividing row ℓ of U by λ leaves the product TU unchanged, we can always scale the columns of T and the rows of U so that the pair is normalized without changing TU.

Lemma 4

(Rothvoß [14])

If the pair T,U is normalized, then \(\max\{\|T\|_{\infty}, \|U\|_{\infty}\}\leqslant\sqrt{ \|TU\|_{\infty}}\).

Proof

Let S:=TU. Suppose, for the sake of contradiction, that the assertion does not hold. Without loss of generality, we may assume that \(\|T\|_{\infty}> \sqrt{ \|TU\|_{\infty}}\). Thus \(T_{i\ell} > \sqrt{ \|TU\|_{\infty}}\) for some indices i and ℓ. Since T,U is normalized, \(\|U_{\ell}\|_{\infty}= \|T^{\ell}\|_{\infty}> \sqrt{ \|TU\|_{\infty}}\) and there must be an index j such that \(U_{\ell j} > \sqrt{ \|TU\|_{\infty}}\). Then \(S_{ij}\geqslant T_{i\ell}U_{\ell j}>\|TU\|_{\infty}\), which is a contradiction. □
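The normalization step and the bound of Lemma 4 can be tried out on a small example. In the Python sketch below, the scaling factor λ = √(‖U_ℓ‖∞/‖T^ℓ‖∞) is one natural choice that equalizes the two norms; the helper names `normalize_pair`, `matmul` and `max_norm` are introduced here for illustration, and matrices are plain lists of rows.

```python
import math


def matmul(T, U):
    """Product of two matrices given as lists of rows."""
    return [[sum(T[i][l] * U[l][j] for l in range(len(U))) for j in range(len(U[0]))]
            for i in range(len(T))]


def max_norm(M):
    """Largest absolute value of an entry of M."""
    return max(abs(x) for row in M for x in row)


def normalize_pair(T, U):
    """Rescale column l of T and row l of U so that their max-norms agree; TU is unchanged.

    Assumes, as in the text, that no column of T and no row of U is identically zero.
    """
    T = [row[:] for row in T]
    U = [row[:] for row in U]
    for l in range(len(U)):
        t = max(T[i][l] for i in range(len(T)))
        u = max(U[l])
        lam = math.sqrt(u / t)          # after scaling, both norms equal sqrt(t * u)
        for i in range(len(T)):
            T[i][l] *= lam
        U[l] = [x / lam for x in U[l]]
    return T, U


T0 = [[4.0, 0.0], [1.0, 2.0]]
U0 = [[1.0, 0.0, 1.0], [0.0, 8.0, 2.0]]
T1, U1 = normalize_pair(T0, U0)
lemma4_holds = max(max_norm(T1), max_norm(U1)) <= math.sqrt(max_norm(matmul(T1, U1))) + 1e-12
```

In this example the bound of Lemma 4 holds with equality: the largest entry of TU is 16, and both normalized factors have largest entry 4.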

Consider a set V of n convex independent points in the plane lying on an integer grid of size polynomial in n, its convex hull P:=conv(V), and \(X:=\mathbb{Z}^2\cap P\). The next crucial lemma (adapted from a similar result in [14]) implies that the description of an extension Q:={(x,y)∣Ax+Ty=b, y≥0} for P, which potentially contains irrational numbers, can be rounded such that an integer point x is in X if and only if there is a y≥0 such that \(\bar{A}x + \bar{T}y \approx\bar {b}\) holds for the rounded system. Moreover, all coefficients in the rounded system come from a domain whose size is bounded by a polynomial in n.

Lemma 5

For d,N≥2 let \(V=\{v_1,\ldots,v_n\}\subseteq\mathbb{Z}^d\) be a convex independent and non-empty set of points with \(\|v_i\|_{\infty}\leqslant N\) for i∈[n]. Let P:=conv(V) and let \(X:=P\cap\mathbb{Z}^d\). Denote r:=xc(P) and \(\varDelta:=((d+1)N)^d\). Then there are matrices \(\bar{A} \in\mathbb{Z}^{(d+r)\times d},~\bar{T} \in(\frac{1}{4r(d+r)\varDelta }\mathbb {Z}_{+})^{(d+r) \times r}\) and a vector \(\bar{b} \in\mathbb{Z}^{d+r}\) with \(\|\bar{A}\|_{\infty}, \|\bar{b}\|_{\infty}, \|\bar{T}\|_{\infty} \leqslant \varDelta \) such that

$$X = \biggl\{ x \in\mathbb{Z}^d \mid\exists y\in[0,\varDelta ]^r: \| \bar{A}x + \bar{T}y - \bar{b} \|_{\infty}\leqslant\frac {1}{4(d+r)} \biggr\}.$$

Proof

Let Ax ⩽ b be a non-redundant description of P with integral coefficients. We may assume (see, e.g., [8, Lemma D.4.1]) that \(\|A\|_{\infty},\|b\|_{\infty}\leqslant\varDelta=((d+1)N)^d\). Since xc(P)=r, by Yannakakis’ Theorem 1 there exist matrices \(T\in\mathbb{R}^{m\times r}_{+}\) and \(U\in\mathbb {R}^{r\times n}_{+}\) such that S:=TU is the slack matrix of P, and \(P=\{x\in\mathbb{R}^d\mid\exists y\in\mathbb{R}^r: Ax+Ty=b,\ y\geqslant 0\}\). Without loss of generality assume that the pair T,U is normalized. Note that

$$\|S\|_{\infty} = \max_{i \in[m] \atop j \in[n]} (b_i-A_iv_j)\leqslant \varDelta + dN\varDelta \leqslant \varDelta ^2.$$

Since the pair T,U is normalized, Lemma 4 gives \(\|T\|_{\infty}\leqslant\varDelta\) and \(\|U\|_{\infty}\leqslant\varDelta\).

Let \(W:=\operatorname{span}(\{(A_i,T_i)\mid i\in[m]\})\) be the row span of the constraint matrix of the system Ax+Ty=b and let k:=dim(W) be the dimension of W. Choose I⊆{1,…,m} of size |I|=k such that the volume of the parallelepiped spanned by the vectors \(\{(A_i,T_i)\mid i\in I\}\), denoted by \(\operatorname{vol}(\{(A_i,T_i)\mid i\in I\})\), is maximized. Let \(T_{I}'\) be the matrix obtained from rounding the coefficients of \(T_I\) to the nearest multiple of \(\frac{1}{4r(d+r)\varDelta }\). Our choice will be \(\bar{A}:= A_{I}\), \(\bar{T}:=T_{I}'\) and \(\bar{b}:=b_{I}\). Let

$$Y := \biggl\{ x \in\mathbb{Z}^d \mid\exists y\in[0,\varDelta ]^r: \bigl\| A_Ix + T_I'y -b_I \bigr\|_{\infty} \leqslant\frac{1}{4(d+r)} \biggr\}.$$

Then it is sufficient to show that X=Y.

Claim 6

X ⊆ Y.

Proof of claim

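Consider an arbitrary vertex \(v_j\in V\). Since S=TU, we can choose \(y:=U^{j}\geqslant 0\) such that \(Av_j+Ty=b\). Since the pair T,U is normalized, we have \(\|y\|_{\infty}\leqslant\|U\|_{\infty}\leqslant\varDelta\). Note that \(\|T_I- T_I'\|_{\infty} \leqslant\frac {1}{4r(d+r)\varDelta }\). By the triangle inequality,

$$\bigl\|A_I v_j + T_I' y - b_I\bigr\|_{\infty} \leqslant \bigl\|A_I v_j + T_I y - b_I\bigr\|_{\infty} + \bigl\|(T_I' - T_I)y\bigr\|_{\infty} \leqslant 0 + r\cdot\frac{1}{4r(d+r)\varDelta}\cdot\varDelta = \frac{1}{4(d+r)}.$$

Thus \(v_j\in Y\) and hence V ⊆ Y. Applying the same argument to convex combinations of the vertices and of the corresponding vectors y, it follows that X ⊆ Y. □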

Claim 7

X ⊇ Y.

Proof of claim

We show that \(x\in\mathbb{Z}^d\setminus X\) implies x ∉ Y. Since x ∉ X and x is integral, we have x ∉ P, so there must be a row ℓ with \(A_{\ell}x>b_{\ell}\). Since A, b and x are integral, one even has \(A_{\ell}x\geqslant b_{\ell}+1\). Note that in general ℓ is not among the selected constraints with row indices in I. But there are unique coefficients \(\lambda\in\mathbb{R}^k\) such that we can express the constraint \(A_{\ell}x+T_{\ell}y=b_{\ell}\) as a linear combination of those with indices in I, i.e.

$$\begin{pmatrix} A_{\ell}, T_{\ell}\end{pmatrix} = \sum_{i\in I}\lambda_{i} \begin{pmatrix} A_{i}, T_{i}\end{pmatrix} .$$

It is easy to see that \(\sum_{i\in I}\lambda_i b_i=b_{\ell}\), since otherwise the system Ax+Ty=b could not have any solution (x,y) at all and P=∅. The next step is to bound the coefficients \(\lambda_i\). Here we recall that by Cramer’s rule,

$$|\lambda_{i}| = \dfrac{\operatorname{vol} ( \{ ( A_{i'},T_{i'})\mid i' \in I \backslash\{ i \}\cup\{ \ell\} \} )}{\operatorname{vol} ( \{ (A_{i'}, T_{i'})\mid i' \in I \} )} \leqslant1,$$

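since we picked I such that \(\operatorname{vol}(\{ ( A_{i'},T_{i'}) \mid i' \in I\})\) is maximized. Fix an arbitrary \(y\in[0,\varDelta]^r\). Since \(A_{\ell}x-b_{\ell}\geqslant 1\) and \(T_{\ell}y\geqslant 0\), we have

$$1 \leqslant \bigl|A_{\ell}x - b_{\ell} + T_{\ell}y\bigr| = \biggl|\sum_{i\in I}\lambda_i\bigl(A_i x - b_i + T_i y\bigr)\biggr| \leqslant \sum_{i\in I}|\lambda_i|\,\bigl|A_i x - b_i + T_i y\bigr| \leqslant (d+r)\,\bigl\|A_I x - b_I + T_I y\bigr\|_{\infty} \tag{1}$$

using the triangle inequality and the fact that |I|⩽d+r. Again making use of the triangle inequality yields

$$\bigl\|A_I x - b_I + T_I' y\bigr\|_{\infty} \geqslant \bigl\|A_I x - b_I + T_I y\bigr\|_{\infty} - \bigl\|(T_I - T_I')y\bigr\|_{\infty} \geqslant \bigl\|A_I x - b_I + T_I y\bigr\|_{\infty} - \frac{1}{4(d+r)}. \tag{2}$$

Combining (1) and (2) gives \(\|A_{I}x - b_{I} + T_{I}'y\|_{\infty} \geqslant\frac{1}{d+r} - \frac {1}{4(d+r)} > \frac{1}{4(d+r)}\) for all \(y\in[0,\varDelta]^r\) and consequently x ∉ Y. □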

The lemma follows. Note that by padding zeros, we can ensure that \(\bar{A}\), \(\bar{T}\) and \(\bar{b}\) have exactly d+r rows. □

Now we are ready to prove our lower bound for the extension complexity of polygons.

Theorem 8

For every n≥3, there exists a convex n-gon P with vertices in \([2n]\times[4n^2]\) and \(\mathop {\mathrm {xc}}(P) = \varOmega (\sqrt{n}/\sqrt{\log n})\).

Proof

The 2n points of the set \(Z:=\{(z,z^2)\mid z\in[2n]\}\) are obviously convex independent. In other words, every subset X ⊆ Z of size |X|=n yields a different convex n-gon. The number of such n-gons is \({2n \choose n} \geqslant2^{n}\). Let \(R:=\max\{\mathrm{xc}(\mathrm{conv}(X))\mid X\subseteq Z,\ |X|=n\}\). Lemma 5 provides a map Φ which takes X as input and provides the rounded system \((\bar{A}, \bar{T}, \bar{b})\). (If the choice of A, b and I is not unique, make an arbitrary canonical choice.) By padding zeros, we may assume that this system is of size (2+R)×(3+R).

Also, Lemma 5 guarantees that for each system \((\bar {A}, \bar{T} , \bar{b})\), the corresponding set X can be reconstructed. In other words, the map Φ must be injective and the number of such systems must be at least \(2^n\). Thus it suffices to bound the number of such systems: the entries in each system \((\bar{A}, \bar{T}, \bar{b})\) are integer multiples of \(\frac{1}{4r(d+r)\varDelta } = \frac{1}{4r(2+r)144n^{4}}\) for some r∈[R], using d=2, \(N=4n^2\) and \(\varDelta=(12n^2)^2=144n^4\). Since no entry exceeds Δ in absolute value, for each entry there are at most \(1 + \sum_{r=1}^{R} (165888\,r(2+r)n^{8}) \leqslant cn^{11}\) possible choices for some fixed constant c (note that R ⩽ n). Thus the number of such systems is bounded by \((cn^{11})^{(3+R)\cdot(2+R)}\leqslant2^{c' \log{n}\cdot R^{2}}\) for some constant c′.

We conclude that \(2^{c'\log_{2} n\cdot R^{2}} \geqslant2^{n}\) and thus \(R =\varOmega (\sqrt{n} / \sqrt{\log n})\). □
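The elementary facts used in this counting argument can be sanity-checked for small n. The Python sketch below verifies that the points (z, z²) form a strictly convex chain inside the grid, that \({2n \choose n}\geqslant 2^n\), and that the number of integer multiples of 1/(4r(2+r)Δ) in [−Δ, Δ] is 165888 r(2+r)n⁸ + 1. The function names are hypothetical and the check is an illustration, not part of the proof.

```python
from math import comb


def parabola_points(n):
    """The 2n points (z, z^2) for z = 1, ..., 2n."""
    return [(z, z * z) for z in range(1, 2 * n + 1)]


def is_strictly_convex_chain(points):
    """True if consecutive triples (sorted by x) always turn left.

    For points with distinct x-coordinates this means every point is a vertex
    of the convex hull, i.e. the set is convex independent.
    """
    pts = sorted(points)
    return all(
        (q[0] - p[0]) * (r[1] - q[1]) - (q[1] - p[1]) * (r[0] - q[0]) > 0
        for p, q, r in zip(pts, pts[1:], pts[2:])
    )


def choices_per_entry(n, r):
    """Number of integer multiples of 1/(4 r (2+r) Delta) in [-Delta, Delta], Delta = 144 n^4."""
    delta = (3 * 4 * n * n) ** 2          # ((d+1) N)^d with d = 2 and N = 4 n^2
    denominator = 4 * r * (2 + r) * delta
    return 2 * delta * denominator + 1


n = 6
points = parabola_points(n)
convex_ok = is_strictly_convex_chain(points)
enough_polygons = comb(2 * n, n) >= 2 ** n
```

The constant 165888 is simply 8 · 144², which is where the n⁸ term in the bound on the number of choices per entry comes from.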

6 Concluding Remarks

Although the two lower bounds presented here on the worst-case extension complexity of an n-gon are \(\tilde{\varOmega }(\sqrt{n})\), it is plausible that the true answer is \(\tilde{\varOmega }(n)\). We leave this as an open problem.