1 Introduction

The theta number of a graph, introduced by Lovász [18] to determine the Shannon capacity of the pentagon, is one of the founding results of semidefinite programming and has inspired numerous developments in combinatorics (see Grötschel, Lovász, and Schrijver [12, Chapter 9] and Schrijver [23, Chapter 67]), coding theory (see Schrijver [24]), and discrete geometry (see Oliveira and Vallentin [5]). It is a graph parameter that provides at the same time an upper bound for the independence number of a graph and a lower bound for the chromatic number of its complement, a result known as Lovász’s sandwich theorem. The theta number also has weighted variants, and both Lovász’s original parameter and its variants can be computed in polynomial time. To this day, the only known polynomial-time algorithms to compute a maximum-weight independent set or a minimum-weight coloring in a perfect graph compute the weighted theta number as a subroutine.

The sandwich theorem has a geometrical counterpart, the theta body. The theta body of a graph \(G = (V, E)\) was introduced by Grötschel, Lovász, and Schrijver [13]; it is the convex body \({{\,\textrm{TH}\,}}(G) \subseteq \mathbb {R}^V\) given by the feasible region of the optimization program defining the theta number. It contains the independent-set polytope of G and is contained in the polytope defined by the clique inequalities of G. One can optimize linear functions over the theta body in polynomial time, that is, the weak optimization problem over \({{\,\textrm{TH}\,}}(G)\) can be solved in polynomial time. The theta body provides a characterization of perfect graphs: \({{\,\textrm{TH}\,}}(G)\) is a polytope, and in this case is exactly the independent-set polytope, if and only if G is a perfect graph.

In this paper we extend the definition of the theta body from graphs to hypergraphs, derive fundamental properties of this extension, and discuss applications.

1.1 Independence in Hypergraphs

Let \(H = (V, E)\) be an r-uniform hypergraph for some integer \(r \ge 1\), so V is a finite set and \(E \subseteq \left( {\begin{array}{c}V\\ r\end{array}}\right) \), where \(\left( {\begin{array}{c}V\\ r\end{array}}\right) \) denotes the set of r-element subsets of V. For \(r=2\) this gives the usual notion of a graph, while the case \(r=1\) is somewhat degenerate but convenient for inductive arguments. The complement of H is the r-uniform hypergraph \({\overline{H}}\) with vertex set V in which an r-subset e of V is an edge if and only if e is not an edge of H.

A set \(I \subseteq V\) is independent in H if no edge of H is contained in I. Given a weight function \(w \in \mathbb {R}^V\), the weighted independence number of H is

$$\begin{aligned} \alpha (H, w) = \max \{\, w(I): I \subseteq V \text { is independent}\,\}, \end{aligned}$$

where \(w(I) = \sum _{x \in I} w(x)\). When \(w = \textbf{1}\) is the constant-one function, \(\alpha (H, w)\) is the independence number of H, denoted simply by \(\alpha (H)\). Computing the independence number of a graph is a known NP-hard problem [16] and computing its hypergraph counterpart is also NP-hard.
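For very small instances the definition can be evaluated directly by enumeration. The following is a minimal brute-force sketch, not an efficient algorithm; the function names and the example hypergraph are our own illustrative choices.

```python
from itertools import combinations

def is_independent(subset, edges):
    """Return True if no edge of the hypergraph is contained in `subset`."""
    s = set(subset)
    return not any(set(e) <= s for e in edges)

def weighted_independence_number(vertices, edges, w):
    """Brute-force alpha(H, w): maximum weight of an independent set."""
    best = 0  # the empty set is independent and has weight 0
    for k in range(1, len(vertices) + 1):
        for subset in combinations(vertices, k):
            if is_independent(subset, edges):
                best = max(best, sum(w[x] for x in subset))
    return best

# Hypothetical example: the complete 3-uniform hypergraph on 4 vertices.
V = [0, 1, 2, 3]
E = list(combinations(V, 3))
print(weighted_independence_number(V, E, {x: 1 for x in V}))  # 2
```

The running time is exponential in \(|V|\), in line with the NP-hardness noted above.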

The independent-set polytope of H is the convex hull of characteristic functions of independent sets of H, namely

$$\begin{aligned} {{\,\textrm{IND}\,}}(H) = {{\,\textrm{conv}\,}}\{\, \chi _I \in \mathbb {R}^V: I \subseteq V \text { is independent}\,\}, \end{aligned}$$

where \(\chi _S \in \mathbb {R}^V\) is the characteristic function of \(S \subseteq V\). The weighted independence number \(\alpha (H, w)\) can be computed by maximizing \(w^\textsf{T}f\) over \(f \in {{\,\textrm{IND}\,}}(H)\), and so optimizing over \({{\,\textrm{IND}\,}}(H)\) is an NP-hard problem.

A clique of H is a set \(C \subseteq V\) such that every r-subset of C is an edge. Note that cliques of H are independent sets of \({\overline{H}}\) and vice versa. Note also that any set with fewer than r elements is both a clique and an independent set (the same happens with graphs: single vertices are both cliques and independent sets).

If C is a clique of H and if \(f \in {{\,\textrm{IND}\,}}(H)\), then \(f(C) \le r - 1\). These valid inequalities for \({{\,\textrm{IND}\,}}(H)\) are called clique inequalities; they give a relaxation of the independent-set polytope, namely the polytope

$$\begin{aligned} {{\,\textrm{QIND}\,}}(H) = \{\, f \in [0, 1]^V: f(C) \le r - 1 \text { for every clique}~C \subseteq V\, \}. \end{aligned}$$
(1)

Clearly, \({{\,\textrm{IND}\,}}(H) \subseteq {{\,\textrm{QIND}\,}}(H) \subseteq [0, 1]^V\). The integer vectors in \({{\,\textrm{QIND}\,}}(H)\) are precisely the characteristic functions of independent sets, and so the integer hull of \({{\,\textrm{QIND}\,}}(H)\) is \({{\,\textrm{IND}\,}}(H)\).
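The claim about integer vectors can be checked by brute force on a small example. The sketch below enumerates all cliques and all 0-1 vectors; the helper names and the example hypergraph are hypothetical, and the approach is only feasible for a handful of vertices.

```python
from itertools import combinations

def subsets(vertices):
    """Yield every subset of `vertices` as a tuple."""
    for k in range(len(vertices) + 1):
        yield from combinations(vertices, k)

def is_independent(S, edges):
    """No edge is contained in S."""
    return not any(e <= set(S) for e in edges)

def is_clique(C, edges, r):
    """Every r-subset of C is an edge (vacuously true if |C| < r)."""
    return all(frozenset(t) in edges for t in combinations(C, r))

def in_qind(f, vertices, edges, r):
    """Check 0 <= f <= 1 and f(C) <= r - 1 for every clique C."""
    if any(not 0 <= f[x] <= 1 for x in vertices):
        return False
    return all(sum(f[x] for x in C) <= r - 1
               for C in subsets(vertices) if is_clique(C, edges, r))

# Hypothetical 3-uniform example on 5 vertices.
V = [0, 1, 2, 3, 4]
E = {frozenset(e) for e in [(0, 1, 2), (1, 2, 3), (2, 3, 4)]}
r = 3
agree = all(
    in_qind({x: int(x in S) for x in V}, V, E, r) == is_independent(S, E)
    for S in subsets(V)
)
print(agree)  # True
```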

Since cliques of H are independent sets of \({\overline{H}}\), the separation problem over \({{\,\textrm{QIND}\,}}(H)\) consists of finding a maximum-weight independent set of \({\overline{H}}\), and it is therefore NP-hard. As a consequence, optimizing over \({{\,\textrm{QIND}\,}}(H)\) is NP-hard as well.

1.2 The Theta Body of Graphs and Hypergraphs

Grötschel, Lovász, and Schrijver [13] defined the theta body of a graph G: a convex relaxation of \({{\,\textrm{IND}\,}}(G)\) stronger than \({{\,\textrm{QIND}\,}}(G)\) over which it is possible to optimize a linear function in polynomial time.

For a symmetric matrix A, write

$$\begin{aligned} R(A) = \begin{pmatrix} 1&{}a^\textsf{T}\\ a&{}A \end{pmatrix}, \end{aligned}$$

where \(a = {{\,\textrm{diag}\,}}A\) is the diagonal of A. The theta body of a graph \(G = (V, E)\) is

$$\begin{aligned} \begin{aligned} {{\,\textrm{TH}\,}}(G) = \{\, f \in \mathbb {R}^V:\ {}&\text {there is}~F \in \mathbb {R}^{V \times V} \text { such that } f = {{\,\textrm{diag}\,}}F,\\&F(x, y) = 0\quad \text {if~}\{x, y\} \in E, \text { and}\\&R(F) \text { is positive semidefinite}\,\}. \end{aligned} \end{aligned}$$
(2)

(This specific formulation was given by Lovász and Schrijver [17].) Here and elsewhere, positive semidefinite matrices are always symmetric.

The theta body is a closed and convex set satisfying

$$\begin{aligned} {{\,\textrm{IND}\,}}(G) \subseteq {{\,\textrm{TH}\,}}(G) \subseteq {{\,\textrm{QIND}\,}}(G) \end{aligned}$$

for every graph G; since \({{\,\textrm{QIND}\,}}(G)\) is bounded, the theta body is compact. Moreover, optimizing over the theta body is the same as solving a semidefinite program, and in this case this can be done to any desired precision in polynomial time using either the ellipsoid method [12, Chapter 9] or the interior-point method [4].

The Lovász theta number of G for a weight function \(w \in \mathbb {R}^V\) is obtained by optimizing over the theta body, namely

$$\begin{aligned} \vartheta (G, w) = \max \{\, w^\textsf{T}f: f \in {{\,\textrm{TH}\,}}(G)\, \}; \end{aligned}$$

for \(w = \textbf{1}\) we recover the theta number as originally defined by Lovász [18], which we denote simply by \(\vartheta (G)\). Immediately we get

$$\begin{aligned} \alpha (G, w) \le \vartheta (G, w). \end{aligned}$$

Our aim is to extend the definition of the theta body, and therefore of the theta number, to r-uniform hypergraphs for \(r \ge 3\). We do so recursively, and the base of our recursion is \(r = 1\). By taking this as the base, we can give uniform proofs without relying on what is known about the theta body of a graph. So we will always take \(r = 1\) as the base, unless this choice would lead us into trouble.

Let \(H = (V, E)\) be an r-uniform hypergraph for \(r \ge 2\). Given \(x \in V\), the link of x in H is the \((r-1)\)-uniform hypergraph \(H_x\) with vertex set

$$\begin{aligned} V_x = \{\, y \in V\setminus \{x\}: \text {there is}~e \in E \text { containing}~x \text { and}~y\,\}, \end{aligned}$$

in which an \((r-1)\)-subset e of \(V_x\) is an edge if and only if \(e \cup \{x\}\) is an edge of H.
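The link construction is easy to carry out explicitly. A minimal sketch, assuming edges are given as sets of vertices and \(r \ge 2\); the function name `link` and the examples are our own.

```python
from itertools import combinations

def link(edges, x):
    """Return (V_x, E_x): the vertex set and edge set of the link of x."""
    # Edges of the link are the sets e \ {x} for the edges e containing x.
    edges_x = {frozenset(e) - {x} for e in edges if x in e}
    # V_x consists of all vertices sharing some edge with x.
    vertices_x = set().union(*edges_x)
    return vertices_x, edges_x

# Hypothetical example: complete 3-uniform hypergraph on {0, 1, 2, 3}.
E = [frozenset(e) for e in combinations(range(4), 3)]
Vx, Ex = link(E, 0)
print(sorted(Vx))                      # [1, 2, 3]
print(sorted(sorted(e) for e in Ex))   # [[1, 2], [1, 3], [2, 3]]
```

For a graph (\(r = 2\)), the same function returns the neighborhood of x together with one singleton edge per neighbor, so every vertex of the link is itself an edge.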

Given a matrix \(A \in \mathbb {R}^{V \times V}\) and \(x \in V\), let \(A_x \in \mathbb {R}^V\) denote the row of A indexed by x, that is, \(A_x(y) = A(x, y)\). If \(f:V \rightarrow \mathbb {R}\) is a function and \(U \subseteq V\) is a set, denote by f[U] the restriction of f to U.

We are now ready to give our main definition.

Definition 1.1

Let \(H = (V, E)\) be an r-uniform hypergraph. For \(r = 1\), the theta body of H is \({{\,\textrm{TH}\,}}(H) = {{\,\textrm{IND}\,}}(H)\). For \(r\ge 2\), the theta body of H is

$$\begin{aligned} \begin{aligned} {{\,\textrm{TH}\,}}(H) = \{\, f \in \mathbb {R}^V:\ {}&\text {there is}~F \in \mathbb {R}^{V \times V} \text { such that } f = {{\,\textrm{diag}\,}}F,\\&F_x[V_x] \in F(x, x) {{\,\textrm{TH}\,}}(H_x)\quad \text {for every}~x \in V, \text { and}\\&R(F) \text { is positive semidefinite}\,\}, \end{aligned} \end{aligned}$$

where, if a link \(H_x\) is empty, no constraint is imposed on the row \(F_x\).

Since the links of an r-uniform hypergraph are \((r-1)\)-uniform hypergraphs, we have a recursive definition. When \(r = 2\), we have \({{\,\textrm{TH}\,}}(H_x) = \{0\}\) for every nonempty link, and so we recover the usual definition (2) of the theta body of a graph.

The theta number can now be extended to hypergraphs: given a weight function \(w \in \mathbb {R}^V\), the theta number of H for w is

$$\begin{aligned} \vartheta (H, w) = \max \{\, w^\textsf{T}f: f \in {{\,\textrm{TH}\,}}(H)\,\}. \end{aligned}$$
(3)

For unit weights, we write \(\vartheta (H)\) instead of \(\vartheta (H, \textbf{1})\).

In Sect. 2, we will see how \({{\,\textrm{TH}\,}}(H)\) defined above is in many ways analogous to the theta body of a graph defined in (2). In particular, we will see in Theorem 2.1 that

$$\begin{aligned} {{\,\textrm{IND}\,}}(H) \subseteq {{\,\textrm{TH}\,}}(H) \subseteq {{\,\textrm{QIND}\,}}(H), \end{aligned}$$

and therefore \(\alpha (H, w) \le \vartheta (H, w)\) for every weight function w. Moreover, as shown in Theorem 2.5, it is possible to optimize linear functions over \({{\,\textrm{TH}\,}}(H)\) in polynomial time.

1.3 The Weighted Fractional Chromatic Number

Let \(H = (V, E)\) be an r-uniform hypergraph for some \(r \ge 2\). The chromatic number of H, denoted by \(\chi (H)\), is the minimum number of colors needed to color the vertices of H in such a way that no edge is monochromatic. In other words, \(\chi (H)\) is the minimum number of disjoint independent sets needed to partition the vertex set of H.

Given \(w \in \mathbb {R}_+^V\), the weighted fractional chromatic number of H is

$$\begin{aligned} \begin{array}{ll} \chi ^*(H, w)={}&{}\text {minimum of } \lambda _1 + \cdots + \lambda _k, \text { where } \lambda _1, \dots ,~\lambda _k \ge 0 \text { and there are}\\ &{} \text {independent sets } I_1, \dots , I_k \text { such that}~\lambda _1 \chi _{I_1} + \cdots + \lambda _k \chi _{I_k} = w. \end{array} \end{aligned}$$

When \(w = \textbf{1}\) is the constant-one function, \(\chi ^*(H, w)\) is the fractional chromatic number, denoted simply by \(\chi ^*(H)\). Note also that k is not specified, so we may consider any number of independent sets. In this way, if \(w = \textbf{1}\) and the \(\lambda _i\) are required to be integers, then we get the chromatic number, so \(\chi ^*(H) \le \chi (H)\).
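As a small worked example (ours, included only for illustration), let H be the 3-uniform hypergraph on \(V = \{1, 2, 3\}\) whose only edge is V itself. The independent sets of H are the proper subsets of V, so each has at most two elements. If \(\lambda _1 \chi _{I_1} + \cdots + \lambda _k \chi _{I_k} = \textbf{1}\), then summing over all coordinates gives

$$\begin{aligned} 3 = \sum _{i=1}^k \lambda _i |I_i| \le 2 (\lambda _1 + \cdots + \lambda _k), \end{aligned}$$

so \(\chi ^*(H) \ge 3/2\). Putting weight 1/2 on each of the three 2-element subsets of V attains this bound, hence \(\chi ^*(H) = 3/2\), whereas \(\chi (H) = 2\).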

For the chromatic or weighted fractional chromatic number, the case \(r = 1\) is degenerate: if the hypergraph has an edge, then there is no coloring, hence the restriction to \(r \ge 2\).

For a graph \(G = (V, E)\) and a weight function \(w \in \mathbb {R}_+^V\), it is known [12, Chapter 9] that \(\vartheta (G, w) \le \chi ^*({\overline{G}}, w)\). (The same inequality for the chromatic number and \({w = \textbf{1}}\) was proved by Lovász [18].) Corollary 2.3 generalizes this inequality to the setting of hypergraphs: if \(H = (V, E)\) is an r-uniform hypergraph and \(w \in \mathbb {R}_+^V\) is a weight function, then \(\vartheta (H, w) \le (r-1)\chi ^*({\overline{H}}, w)\).

1.4 The Hoffman Bound

The Lovász theta number is also related to a well-known spectral upper bound for the independence number of regular graphs, originally due to Hoffman. If G is a d-regular graph on n vertices and if \(\lambda \) is the smallest eigenvalue of its adjacency matrix, then

$$\begin{aligned} \alpha (G) \le \frac{-\lambda }{d - \lambda } n; \end{aligned}$$

this upper bound for the independence number is known as the Hoffman bound.
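For a concrete feel for the bound, it can be evaluated on cycles, whose adjacency eigenvalues are known in closed form. The sketch below is an illustration only; the function name is hypothetical, and it relies on the standard fact that the eigenvalues of the cycle \(C_n\) are \(2\cos (2\pi k/n)\) for \(k = 0\), ..., \(n-1\).

```python
import math

def cycle_hoffman_bound(n):
    """Hoffman bound -lambda * n / (d - lambda) for the cycle C_n, n >= 3."""
    d = 2  # every vertex of a cycle has degree 2
    # Smallest adjacency eigenvalue of C_n.
    lam = min(2 * math.cos(2 * math.pi * k / n) for k in range(n))
    return -lam * n / (d - lam)

print(cycle_hoffman_bound(5))  # approximately sqrt(5) = 2.236...
print(cycle_hoffman_bound(6))  # approximately 3
```

For the pentagon the bound is \(\sqrt{5}\), while \(\alpha (C_5) = 2\); for the hexagon it equals the independence number 3.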

The Hoffman bound connects spectral graph theory with extremal combinatorics, and as such has found many applications in combinatorics and theoretical computer science. Recently, it has been extended to the high-dimensional setting of edge-weighted hypergraphs by Filmus, Golubev, and Lifshitz [10], who also derived interesting applications in extremal set theory.

Lovász [18, Theorem 9] showed that the theta number \(\vartheta (G)\) is always at least as good as the Hoffman bound, that is, \(\alpha (G) \le \vartheta (G) \le -\lambda n/(d - \lambda )\) for every d-regular graph G. In Sect. 3 we will extend Lovász’s result to hypergraphs, showing that the hypergraph theta number \(\vartheta (H)\) is also at least as good as the high-dimensional Hoffman bound.

1.5 The Antiblocker of the Theta Body

A convex set \(K \subseteq \mathbb {R}^n\) is of antiblocking type if \(\emptyset \ne K \subseteq \mathbb {R}_+^n\) and if \(x \in K\) and \(0 \le y \le x\) implies that \(y \in K\). The antiblocker of K is

$$\begin{aligned} A(K) = \{\, x \in \mathbb {R}_+^n: x^\textsf{T}y \le 1 \text { for all}~y \in K\,\}. \end{aligned}$$

Note that the antiblocker of a convex set of antiblocking type is also a convex set of antiblocking type. If K is also assumed to be closed, then \(A(A(K)) = K\) (see Grötschel, Lovász, and Schrijver [12, p. 11]).

If G is a graph, then the antiblocker of \({{\,\textrm{TH}\,}}(G)\) is \({{\,\textrm{TH}\,}}({\overline{G}})\) (see Grötschel, Lovász, and Schrijver [12, Chapter 9]). This fact is essential for proving that a graph is perfect if and only if its theta body is a polytope.

The same, however, does not hold for hypergraphs in general. In Sect. 4 we will describe the antiblocker of \({{\,\textrm{TH}\,}}(H)\) explicitly, and this will lead to another relaxation of \({{\,\textrm{IND}\,}}(H)\) and corresponding bounds for the weighted independence number and the weighted fractional chromatic number.

1.6 Symmetry and Applications

When a hypergraph is highly symmetric, it is possible to greatly simplify the optimization problem giving the theta number, as we explore in Sect. 5.

By exploiting symmetry we are able to explicitly compute the theta number in the following two illustrative cases. In Sect. 6 we consider a family of hypergraphs related to Mantel’s theorem in extremal graph theory. In this toy example, we compute the theta number of these hypergraphs, showing that it gives a tight bound for the independence number leading to a proof of Mantel’s theorem.

In Sect. 7 we consider 3-uniform hypergraphs over the Hamming cube whose edges are all triangles with a given side length in Hamming distance. We give a closed formula for the theta number, and we show numerical results supporting our conjecture (see Conjecture 7.3) that the density of such triangle-avoiding sets in the Hamming cube decays exponentially fast with the dimension.

1.7 Notation

For an integer \(n\ge 1\) we write \([n] = \{1, \ldots , n\}\). For a set V and \(S \subseteq V\) we denote by \(\chi _S:V \rightarrow \mathbb {R}\) the characteristic function of S, which is defined by \(\chi _S(x) = 1\) if \(x \in S\) and \(\chi _S(x) = 0\) otherwise. If \(f:V \rightarrow \mathbb {R}\) is a function and \(S \subseteq V\), then \(f(S) = \sum _{x \in S} f(x)\). The collection of all r-subsets of V is denoted by \(\left( {\begin{array}{c}V\\ r\end{array}}\right) \).

If \(H = (V, E)\) is an r-uniform hypergraph, we denote by \({\overline{H}}\) the complement of H, which is the hypergraph with vertex set V and edge set \(\left( {\begin{array}{c}V\\ r\end{array}}\right) \setminus E\).

We denote by \({{\,\textrm{diag}\,}}A\) the vector giving the diagonal of a square matrix A. The trace inner product between symmetric matrices A, \(B \in \mathbb {R}^{n \times n}\) is \(\langle A, B \rangle = {{\,\textrm{tr}\,}}AB = \sum _{i,j=1}^n A_{ij} B_{ij}\). Positive semidefinite matrices are always symmetric. For a symmetric matrix A with diagonal a, we write

$$\begin{aligned} R(A) = \begin{pmatrix} 1&{}a^\textsf{T}\\ a&{}A \end{pmatrix}. \end{aligned}$$

2 Properties of the Theta Body

Given an r-uniform hypergraph \(H = (V, E)\) for \(r \ge 2\), it is useful to consider the lifted version of the theta body as given in Definition 1.1, namely

$$\begin{aligned} \begin{aligned} {{\,\textrm{LTH}\,}}(H) = \{\, F \in \mathbb {R}^{V \times V}:\&F_x[V_x] \in F(x, x) {{\,\textrm{TH}\,}}(H_x)\text { for every}~x \in V \text { and}\\&R(F) \text { is positive semidefinite}\,\}. \end{aligned} \end{aligned}$$

Note that \({{\,\textrm{TH}\,}}(H)\) is the projection of \({{\,\textrm{LTH}\,}}(H)\) onto the subspace of diagonal matrices, being therefore a projected spectrahedron.

Theorem 2.1

If H is an r-uniform hypergraph, then \({{\,\textrm{TH}\,}}(H)\) is compact, convex, and satisfies

$$\begin{aligned} {{\,\textrm{IND}\,}}(H) \subseteq {{\,\textrm{TH}\,}}(H) \subseteq {{\,\textrm{QIND}\,}}(H). \end{aligned}$$
(4)

Proof

The proof proceeds by induction on r. The base case is \(r = 1\), for which the statement is easily seen to hold.

Assume \(r \ge 2\). By the induction hypothesis, the statement of the theorem holds for the theta body of every link. This implies that \({{\,\textrm{LTH}\,}}(H)\) is convex, and hence \({{\,\textrm{TH}\,}}(H)\) is convex.

Let us show that \({{\,\textrm{LTH}\,}}(H)\) is compact and, since \({{\,\textrm{TH}\,}}(H)\) is a projection of \({{\,\textrm{LTH}\,}}(H)\), it will follow that \({{\,\textrm{TH}\,}}(H)\) is compact.

Let \((F_k)_{k \ge 1}\) be a sequence of points in \({{\,\textrm{LTH}\,}}(H)\) that converges to F. Immediately we have that \(R(F)\) is positive semidefinite. Now fix \(x \in V\) and let \(a^\textsf{T}f \le \beta \) be any valid inequality for \({{\,\textrm{TH}\,}}(H_x)\). Then

$$\begin{aligned} a^\textsf{T}F_x[V_x] = \lim _{k\rightarrow \infty } a^\textsf{T}(F_k)_x[V_x] \le \lim _{k\rightarrow \infty } F_k(x, x) \beta = F(x, x)\beta , \end{aligned}$$

and we see that \(F_x[V_x] \in F(x, x) {{\,\textrm{TH}\,}}(H_x)\), proving that \({{\,\textrm{LTH}\,}}(H)\) is closed.

To see that \({{\,\textrm{LTH}\,}}(H)\) is bounded, note that for every \(x \in V\) the \(2\times 2\) submatrix

$$\begin{aligned} \begin{pmatrix} 1&{}f(x)\\ f(x)&{}f(x) \end{pmatrix} \end{aligned}$$

of \(R(F)\) is positive semidefinite (where \(f = {{\,\textrm{diag}\,}}F\)), hence \(f(x) - f(x)^2 \ge 0\) and so \(|F(x, x)| = |f(x)| \le 1\) for all \(x \in V\). This implies that \({{\,\textrm{tr}\,}}F \le |V|\) and, since F is positive semidefinite, its Frobenius norm \(\langle F, F \rangle ^{1/2}\) is at most \({{\,\textrm{tr}\,}}F \le |V|\). This finishes the proof that \({{\,\textrm{LTH}\,}}(H)\) is compact.

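It remains to show that (4) holds. For the first inclusion, let \(I \subseteq V\) be an independent set and fix \(x \in V\). If \(x \in I\), then \(I \cap V_x\) is an independent set of the link \(H_x\), so by the induction hypothesis \(\chi _I[V_x] \in {{\,\textrm{TH}\,}}(H_x)\). If \(x \notin I\), then the row of \(\chi _I \chi _I^\textsf{T}\) indexed by x is zero and the link constraint holds trivially. Since \(R(\chi _I \chi _I^\textsf{T}) = (1; \chi _I)(1; \chi _I)^\textsf{T}\) is positive semidefinite, it follows that \(\chi _I \chi _I^\textsf{T}\in {{\,\textrm{LTH}\,}}(H)\), and so, by convexity, \({{\,\textrm{IND}\,}}(H) \subseteq {{\,\textrm{TH}\,}}(H)\).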

For the second inclusion in (4), note first that \({{\,\textrm{TH}\,}}(H) \subseteq [0, 1]^V\). Let \(C \subseteq V\) be a clique and let \(F \in {{\,\textrm{LTH}\,}}(H)\); write \(f = {{\,\textrm{diag}\,}}F\). If \(|C| \le r - 1\), then since \(f \le \textbf{1}\) we have \(f(C) \le |C| \le r - 1\) and we are done.

So assume \(|C| \ge r\). Since \(R(F)\) is positive semidefinite we have

$$\begin{aligned} 0 \le (r-1; -\chi _C)^\textsf{T}R(F) (r-1; -\chi _C) = (r-1)^2 - 2(r-1)\chi _C^\textsf{T}f + \chi _C^\textsf{T}F \chi _C. \end{aligned}$$
(5)

Since \(|C| \ge r\), every r-element subset of C is an edge of H, and so for every \(x \in C\) we know that \(C \setminus \{x\} \subseteq V_x\) is a clique of the link \(H_x\). By the induction hypothesis we know that \({{\,\textrm{TH}\,}}(H_x) \subseteq {{\,\textrm{QIND}\,}}(H_x)\), hence

$$\begin{aligned} \begin{aligned} \chi _C^\textsf{T}F \chi _C = \sum _{x, y \in C} F(x, y)&= \sum _{x \in C} \biggl (F(x, x) + \sum _{y \in C \setminus \{x\}} F(x, y)\biggr )\\&\le \sum _{x \in C} (F(x, x) + F(x, x) (r - 2)) = (r-1) f(C). \end{aligned} \end{aligned}$$

Together with (5) we get \(0 \le (r-1)^2 - (r-1)f(C)\), whence \(f(C) \le r-1\), as desired. \(\square \)

As a corollary we get that \({{\,\textrm{TH}\,}}(H)\) is a formulation of \({{\,\textrm{IND}\,}}(H)\), that is, the integer hull of the theta body is the independent-set polytope.

Corollary 2.2

If H is an r-uniform hypergraph and if \(f \in {{\,\textrm{TH}\,}}(H)\) is an integral vector, then f is the characteristic function of an independent set of H.

Proof

As \({{\,\textrm{TH}\,}}(H) \subseteq {{\,\textrm{QIND}\,}}(H) \subseteq [0, 1]^V\), we know that f is a 0–1 vector that satisfies all clique inequalities, and the conclusion follows. \(\square \)

Since \({{\,\textrm{IND}\,}}(H) \subseteq {{\,\textrm{TH}\,}}(H)\), it follows immediately from the definition (3) that \(\alpha (H, w) \le \vartheta (H, w)\) for every \(w \in \mathbb {R}^V\). What about a lower bound for the chromatic number?

For a graph \(G = (V, E)\) and \(w \in \mathbb {R}_+^V\) we also have \(\vartheta (G, w) \le \chi ^*({\overline{G}}, w)\). (Recall the definition of \(\chi ^*\) from Sect. 1.3.) A simple example shows that the same cannot be true for hypergraphs in general. Indeed, fix \(r \ge 3\) and let H be the complete r-uniform hypergraph on r vertices (that is, H has exactly one edge containing all of its vertices). The complement of H is the empty hypergraph. Then \(\vartheta (H) = r-1\), whereas \(\chi ^*({\overline{H}}) = 1\), and the inequality fails to hold. It can, however, be extended, and this simple example shows that the extension is tight.

Corollary 2.3

If H is an r-uniform hypergraph and \(w \in \mathbb {R}_+^V\), then \(\alpha (H, w) \le \vartheta (H, w)\). If moreover \(r \ge 2\), then \(\vartheta (H, w) \le (r-1) \chi ^*({\overline{H}}, w)\).

Proof

The first statement follows immediately from \({{\,\textrm{IND}\,}}(H) \subseteq {{\,\textrm{TH}\,}}(H)\).

The second statement follows from \({{\,\textrm{TH}\,}}(H) \subseteq {{\,\textrm{QIND}\,}}(H)\). Indeed, let \(\lambda _1\), ..., \(\lambda _k\) be nonnegative numbers and \(C_1\), ..., \(C_k\) be independent sets of \({\overline{H}}\) such that \(\lambda _1 \chi _{C_1} + \cdots + \lambda _k \chi _{C_k} = w\). If \(f \in {{\,\textrm{TH}\,}}(H)\), then f satisfies all clique inequalities, so since each \(C_i\) is a clique of H we have \(\chi _{C_i}^\textsf{T}f = f(C_i) \le r-1\). Hence

$$\begin{aligned} w^\textsf{T}f = (\lambda _1 \chi _{C_1} + \cdots + \lambda _k \chi _{C_k})^\textsf{T}f \le (r-1) (\lambda _1 + \cdots + \lambda _k), \end{aligned}$$

and we are done. \(\square \)

Just like the theta body of a graph, the theta body of a hypergraph can be shown to be of antiblocking type (see Sect. 1.5 for background), and this leads to an inequality description of the theta body in terms of the theta number.

Theorem 2.4

If \(H = (V, E)\) is an r-uniform hypergraph, then \({{\,\textrm{TH}\,}}(H)\) is of antiblocking type and \({{\,\textrm{TH}\,}}(H) = \{\, f \in \mathbb {R}_+^V: w^\textsf{T}f \le \vartheta (H, w) \text { for all}~w \in \mathbb {R}_+^V\, \}\).

Proof

We proceed by induction on r. The statement is immediate for the base case \(r = 1\), so assume \(r \ge 2\). We claim: if \(w \in \mathbb {R}^V\) and \(w_+(x) = \max \{0, w(x)\}\) for all \(x \in V\), then \(\vartheta (H, w) = \vartheta (H, w_+)\).

Since \({{\,\textrm{TH}\,}}(H) \subseteq \mathbb {R}_+^V\), it is clear that \(\vartheta (H, w) \le \vartheta (H, w_+)\) for every \(w \in \mathbb {R}^V\); let us now prove the reverse inequality. Let \(F \in {{\,\textrm{LTH}\,}}(H)\) be a matrix such that \(w_+^\textsf{T}f = \vartheta (H, w_+)\), where \(f = {{\,\textrm{diag}\,}}F\). Let \(S = \{\, x \in V: w(x) \ge 0\,\}\) and denote by \({\bar{F}}\) the Hadamard (entrywise) product of F and \(\chi _S \chi _S^\textsf{T}\); write \({\bar{f}} = {{\,\textrm{diag}\,}}{\bar{F}}\).

Note that \(R({\bar{F}})\) is the Hadamard product of \(R(F)\) and \((1; \chi _S)(1; \chi _S)^\textsf{T}\), hence \(R({\bar{F}})\) is positive semidefinite. For every \(x \in V\) we have \(0 \le {\bar{F}}_x[V_x] \le F_x[V_x]\), and so the induction hypothesis implies that \({\bar{F}}_x[V_x] \in {\bar{F}}(x, x) {{\,\textrm{TH}\,}}(H_x)\). Hence \({\bar{F}} \in {{\,\textrm{LTH}\,}}(H)\), and \(\vartheta (H, w) \ge w^\textsf{T}{\bar{f}} = w_+^\textsf{T}f = \vartheta (H, w_+)\), proving the claim.

The inequality description follows immediately. Every \(f \in {{\,\textrm{TH}\,}}(H)\) is nonnegative and satisfies \(w^\textsf{T}f \le \vartheta (H, w)\) for all \(w \in \mathbb {R}_+^V\). For the reverse inclusion note that, since \({{\,\textrm{TH}\,}}(H)\) is closed and convex,

$$\begin{aligned} {{\,\textrm{TH}\,}}(H) = \{\, f \in \mathbb {R}^V: w^\textsf{T}f \le \vartheta (H, w) \text { for all}~w \in \mathbb {R}^V\, \}. \end{aligned}$$

So let \(f\ge 0\) be such that \(w^\textsf{T}f \le \vartheta (H, w)\) for all \(w \in \mathbb {R}_+^V\). For \(w \in \mathbb {R}^V\), let \(w_+\) be defined as above, so \(w_+ \ge 0\). Then, for every \(w \in \mathbb {R}^V\) we have

$$\begin{aligned} w^\textsf{T}f \le w_+^\textsf{T}f \le \vartheta (H, w_+) = \vartheta (H, w), \end{aligned}$$

and we see that \(f \in {{\,\textrm{TH}\,}}(H)\).

To finish, let us show that the theta body is of antiblocking type. If \(f \in {{\,\textrm{TH}\,}}(H)\) and \(0\le g \le f\), then for every \(w \in \mathbb {R}_+^V\) we have \(w^\textsf{T}g \le w^\textsf{T}f \le \vartheta (H, w)\), and so \(g \in {{\,\textrm{TH}\,}}(H)\). \(\square \)

Finally, for every fixed \(r \ge 1\) it is possible to optimize over \({{\,\textrm{TH}\,}}(H)\) in polynomial time. More precisely, in the language of Grötschel, Lovász, and Schrijver [12, Chapter 4], we have:

Theorem 2.5

If \(r \ge 1\) is fixed, then the weak optimization problem over \({{\,\textrm{TH}\,}}(H)\) can be solved in polynomial time for every r-uniform hypergraph H.

Proof

The result is trivial for \(r = 1\). For graphs, that is, \(r = 2\), the statement was proven by Grötschel, Lovász, and Schrijver [12, Theorem 9.3.30], and here it is easier to take \(r = 2\) as our base case, as will become clear soon. So we assume that \(r \ge 3\) and that the weak optimization problem can be solved in polynomial time for \((r-1)\)-uniform hypergraphs; we want to show how to solve the weak optimization problem in polynomial time for r-uniform hypergraphs.

Let \(H = (V, E)\) be an r-uniform hypergraph. If we show that we can solve the weak optimization problem over the convex set \({{\,\textrm{LTH}\,}}(H)\), then we are done. It suffices [12, Chapter 4] to show that \({{\,\textrm{LTH}\,}}(H)\) has the required inscribed and circumscribed balls of appropriate size, and that the weak membership problem for \({{\,\textrm{LTH}\,}}(H)\) can be solved in polynomial time.

It can be easily checked that all entries of a matrix in \({{\,\textrm{LTH}\,}}(H)\) are bounded in absolute value by 1, and so the origin-centered ball of radius |V| contains \({{\,\textrm{LTH}\,}}(H)\). To find an inscribed ball, note that the full-dimensional convex set

$$\begin{aligned} {{\,\textrm{conv}\,}}\{\, \chi _I \chi _I^\textsf{T}\in \mathbb {R}^{V \times V}: I \subseteq V, |I| \le 2\,\} \end{aligned}$$

is a subset of \({{\,\textrm{LTH}\,}}(H)\), so it contains a ball which is also contained in \({{\,\textrm{LTH}\,}}(H)\). (This assertion fails when H is a graph, which is why we take \(r = 2\) as the base to simplify the proof.)

Now, given a symmetric matrix \(F \in \mathbb {R}^{V \times V}\), to test whether \(F \in {{\,\textrm{LTH}\,}}(H)\) we first test whether \(R(F)\) is positive semidefinite using (for instance) Cholesky decomposition. By induction, the weak optimization problem for each link can be solved in polynomial time, hence so can the weak membership problem for each link. We then finish by calling the weak membership oracle for each link. \(\square \)
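The positive-semidefiniteness step of this membership test can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the proof: the names `border` and `is_psd` are hypothetical, the diagonal shift `tol` is our own device for applying Cholesky decomposition to singular matrices (so the test is only a weak one), and the calls to the link membership oracles are omitted.

```python
import math

def border(F):
    """Build R(F): the matrix F bordered by 1 and by its own diagonal."""
    n = len(F)
    diag = [F[i][i] for i in range(n)]
    return [[1.0] + diag] + [[diag[i]] + list(F[i]) for i in range(n)]

def is_psd(M, tol=1e-9):
    """Weak PSD test: try to compute a Cholesky factor of M + tol * I."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        pivot = M[j][j] + tol - sum(L[j][k] ** 2 for k in range(j))
        if pivot <= 0:
            return False  # factorization breaks down: not (weakly) PSD
        L[j][j] = math.sqrt(pivot)
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = (M[i][j] - s) / L[j][j]
    return True

# chi_I chi_I^T for I = {0, 1} inside V = {0, 1, 2} passes the test.
print(is_psd(border([[1, 1, 0], [1, 1, 0], [0, 0, 0]])))  # True
# A diagonal entry larger than 1 violates f(x) - f(x)^2 >= 0.
print(is_psd(border([[2.0]])))  # False
```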

3 Relationship to the Hoffman Bound

Let G be a d-regular graph on n vertices and let \(\lambda \) be the smallest eigenvalue of its adjacency matrix. Hoffman showed that

$$\begin{aligned} \alpha (G) \le \frac{-\lambda }{d-\lambda } n; \end{aligned}$$

the right-hand side above became known as the Hoffman bound. Hoffman never published this particular result, though he did publish a similar lower bound [15] for the chromatic number which also came to be known as the Hoffman bound; see Haemers [14] for a historical account. Lovász [18] showed that the theta number is always at least as good as the Hoffman bound and that, when the graph is edge transitive, both bounds coincide. The Hoffman bound has also been extended to certain infinite graphs, and its relation to extensions of the theta number has been studied [1].

Filmus, Golubev, and Lifshitz [10] extended the Hoffman bound to edge-weighted hypergraphs and described several applications to extremal combinatorics. Our goal in this section is to show that our extension of the theta number to hypergraphs is always at least as good as the extended Hoffman bound. We begin with the extension of Filmus, Golubev, and Lifshitz.

To simplify the presentation and to be consistent with the setup used so far, we restrict ourselves to weighted hypergraphs without loops. A weighted r-uniform hypergraph is a pair \(X = (V, \mu )\) where V is a finite set, called the vertex set of the hypergraph, and \(\mu \) is a probability measure on \(\left( {\begin{array}{c}V\\ r\end{array}}\right) \). The underlying hypergraph of X is the r-uniform hypergraph on V whose edge set is the support of \(\mu \).

Let \(X = (V, \mu )\) be a weighted r-uniform hypergraph and let \(H = (V, E)\) be its underlying hypergraph. For \(i = 1\), ..., \(r-1\), the measure \(\mu \) induces a probability measure \(\mu ^{(i)}\) on \(\left( {\begin{array}{c}V\\ i\end{array}}\right) \) by the following experiment: we first choose an edge e of X according to \(\mu \) and then we choose an i-subset of e uniformly at random. Concretely, for \(\sigma \in \left( {\begin{array}{c}V\\ i\end{array}}\right) \) we have

$$\begin{aligned} \mu ^{(i)}(\sigma ) = \left( {\begin{array}{c}r\\ i\end{array}}\right) ^{-1} \mu (\{\, e \in E: \sigma \subseteq e\,\}). \end{aligned}$$
(6)

Note that \(\mu ^{(1)}\) can be seen as a weight function on V. We define the independence number of X as \(\alpha (X) = \alpha (H, \mu ^{(1)})\).
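Formula (6) can be evaluated mechanically. The following sketch uses exact rational arithmetic; the function name and the example measure are hypothetical and serve only to illustrate the construction.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def induced_measure(mu, r, i):
    """Compute mu^(i) via formula (6); `mu` maps frozenset edges to weights."""
    out = {}
    for e, p in mu.items():
        # Each i-subset of e receives an equal share of the weight of e.
        for sigma in combinations(sorted(e), i):
            key = frozenset(sigma)
            out[key] = out.get(key, 0) + p / comb(r, i)
    return out

# Hypothetical weighted 3-uniform hypergraph on {0, 1, 2, 3}.
mu = {frozenset({0, 1, 2}): Fraction(1, 2),
      frozenset({0, 1, 3}): Fraction(1, 4),
      frozenset({1, 2, 3}): Fraction(1, 4)}
mu1 = induced_measure(mu, 3, 1)
mu2 = induced_measure(mu, 3, 2)
print(sum(mu1.values()), sum(mu2.values()))  # 1 1
print(mu1[frozenset({1})])                   # 1/3
```

Both induced measures sum to 1, as expected of probability measures.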

Let \(X^{(i)} \subseteq \left( {\begin{array}{c}V\\ i\end{array}}\right) \) be the support of \(\mu ^{(i)}\). We may assume, without loss of generality, that \(X^{(1)} = V\), since vertices not in the support of \(\mu ^{(1)}\) are isolated and do not contribute to the independence number.

The link of \(\sigma \in X^{(i)}\) is the weighted \((r-i)\)-uniform hypergraph \(X_\sigma = (V, \mu _\sigma )\), where \(\mu _\sigma \) is the probability measure on \(\left( {\begin{array}{c}V\\ r-i\end{array}}\right) \) defined by the following experiment: sample a random edge \(e \in \left( {\begin{array}{c}V\\ r\end{array}}\right) \) according to \(\mu \) conditioned on \(\sigma \subseteq e\) and output \(e \setminus \sigma \). We also say that \(X_\sigma \) is an i-link of X. Concretely, for \(e \in \left( {\begin{array}{c}V {\setminus } \sigma \\ r-i\end{array}}\right) \) we have

$$\begin{aligned} \mu _\sigma (e) = \frac{\mu (e \cup \sigma )}{\mu (\{\, e' \in E: \sigma \subseteq e'\,\})}. \end{aligned}$$
(7)

For a vertex \(x \in V\) we write \(X_x\) instead of \(X_{\{x\}}\) for the link of x. Note that the underlying hypergraph of \(X_x\), minus its isolated vertices, is exactly \(H_x\), the link of x in the underlying hypergraph of X, which we have used so far.

Equip \(\mathbb {R}^V\) with the inner product

$$\begin{aligned} (f, g)= \sum _{x \in V} f(x) g(x) \mu ^{(1)}(x) \end{aligned}$$

for f, \(g \in \mathbb {R}^V\). Since V is the support of \(\mu ^{(1)}\), this inner product is nondegenerate. The normalized adjacency operator of X is the operator \(T_X\) on \(\mathbb {R}^V\) such that

$$\begin{aligned} (T_X f)(x) = \sum _{y \in V} f(y) \mu _x^{(1)}(y) \end{aligned}$$

for all \(f \in \mathbb {R}^V\). Here, \(\mu _x^{(1)} = (\mu _x)^{(1)}\) is the measure on V induced by the measure \(\mu _x\) defining the link of x. Combine (6) and (7) to get

$$\begin{aligned} \mu _x^{(1)}(y) = \frac{1}{r-1} \frac{\mu (\{\, e \in E: x, y \in e\,\})}{\mu (\{\, e \in E: x \in e\, \})}. \end{aligned}$$
(8)

Now use (6) and (8) to see that

$$\begin{aligned} \mu _x^{(1)}(y) = \frac{\mu ^{(2)}(\{x, y\})}{2\mu ^{(1)}(x)} \end{aligned}$$
(9)

for every \(x \in V\) and \(y \in V_x\). Hence

$$\begin{aligned} (T_X f, g)= \sum _{x \in V} \sum _{y \in V} f(y) \mu _x^{(1)}(y) g(x) \mu ^{(1)}(x) = \sum _{\{x, y\} \in \left( {\begin{array}{c}V\\ 2\end{array}}\right) } f(x) g(y) \mu ^{(2)}(\{x, y\})\nonumber \\ \end{aligned}$$
(10)

for f, \(g \in \mathbb {R}^V\). It follows at once that \(T_X\) is self-adjoint and thus has real eigenvalues.

Note that \(T_X\textbf{1}= \textbf{1}\), hence the constant one vector is an eigenvector of \(T_X\) with associated eigenvalue 1. Moreover, the largest eigenvalue of \(T_X\) is 1. Indeed, recall that if \(A \in \mathbb {R}^{n \times n}\) is a matrix and if \(\lambda \) is an eigenvalue of A, then \(|\lambda | \le \Vert A\Vert _\infty = \max _{i \in [n]} \sum _{j=1}^n |A_{ij}|\). Since \(\Vert T_X\Vert _\infty = 1\) by construction, it follows that 1 is the largest eigenvalue of \(T_X\).
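The two properties of \(T_X\) established so far, namely \(T_X\textbf{1}= \textbf{1}\) and self-adjointness with respect to the weighted inner product, can be checked on a small instance. The sketch below is only an illustration under assumptions: the helper `normalized_adjacency` and the example measure are hypothetical, the matrix entries are computed from (8), and every vertex is assumed to lie in the support of \(\mu ^{(1)}\).

```python
from fractions import Fraction

def normalized_adjacency(mu):
    """Return (vertices, mu1, T) with T[x, y] = mu_x^{(1)}(y), following equation (8)."""
    r = len(next(iter(mu)))                      # uniformity of the hypergraph
    vertices = sorted(set().union(*mu))          # support of mu^{(1)}
    weight = {x: sum(p for e, p in mu.items() if x in e) for x in vertices}
    mu1 = {x: weight[x] / r for x in vertices}   # equation (6) with i = 1
    T = {}
    for x in vertices:
        for y in vertices:
            both = sum(p for e, p in mu.items() if x != y and x in e and y in e)
            T[x, y] = Fraction(both) / ((r - 1) * weight[x])
    return vertices, mu1, T

# Hypothetical example: a weighted 3-uniform hypergraph on {1, 2, 3, 4}.
mu = {
    frozenset({1, 2, 3}): Fraction(1, 2),
    frozenset({1, 2, 4}): Fraction(1, 4),
    frozenset({2, 3, 4}): Fraction(1, 4),
}
vertices, mu1, T = normalized_adjacency(mu)

# T_X 1 = 1 means that every row of the matrix sums to one.
row_sums = [sum(T[x, y] for y in vertices) for x in vertices]
# Self-adjointness for the weighted inner product: mu1(x) T(x, y) = mu1(y) T(y, x).
self_adjoint = all(
    mu1[x] * T[x, y] == mu1[y] * T[y, x] for x in vertices for y in vertices
)
```

Note that the matrix of \(T_X\) itself need not be symmetric; it is the weighting by \(\mu ^{(1)}\) that makes the operator self-adjoint.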

Let \(\lambda (X)\) be the smallest eigenvalue of \(T_X\), which is negative since \({{\,\textrm{tr}\,}}T_X = 0\) as is clear from (10). For \(i = 1\), ..., \(r-2\), let \(\lambda _i(X)\) be the minimum possible eigenvalue of the normalized adjacency operator of any i-link of X, that is,

$$\begin{aligned} \lambda _i(X) = \min _{\sigma \in X^{(i)}} \lambda (X_\sigma ), \end{aligned}$$

and set \(\lambda _0(X) = \lambda (X)\).

With this notation, the Hoffman bound of X introduced by Filmus, Golubev, and Lifshitz [10] is

$$\begin{aligned} {{\,\textrm{Hoff}\,}}(X) = 1 - \frac{1}{(1-\lambda _0(X))(1 - \lambda _1(X)) \cdots (1 - \lambda _{r-2}(X))}. \end{aligned}$$

Say \(G = (V, E)\) is a d-regular graph and introduce on its edges the uniform probability measure, obtaining a weighted 2-uniform hypergraph \(X = (V, \mu )\). In this case, \(\mu ^{(1)}\) is the uniform probability measure on V and the normalized adjacency operator \(T_X\) is simply the adjacency matrix of G divided by d. If \(\lambda \) is the smallest eigenvalue of the adjacency matrix of G, then \(\lambda (X) = \lambda / d\) and the high-dimensional Hoffman bound reads

$$\begin{aligned} {{\,\textrm{Hoff}\,}}(X) = 1 - \frac{1}{1-\lambda _0(X)} = \frac{-\lambda }{d-\lambda }, \end{aligned}$$

which is, up to normalization, the Hoffman bound for \(\alpha (G)\).
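As a small numerical illustration of the definition, the sketch below evaluates the Hoffman bound from a list of eigenvalues \(\lambda _0(X)\), ..., \(\lambda _{r-2}(X)\). The function name `hoffman_bound` is hypothetical and the two examples are assumptions chosen for illustration: the Petersen graph (3-regular, smallest adjacency eigenvalue \(-2\)) with the uniform measure on its edges, and the complete 3-uniform hypergraph on 5 vertices with the uniform measure, for which \(T_X = (J - I)/4\) and every link is a complete graph on 4 vertices.

```python
from fractions import Fraction

def hoffman_bound(lambdas):
    """Hoff(X) = 1 - 1 / prod_i (1 - lambda_i(X)) for lambda_0, ..., lambda_{r-2}."""
    prod = Fraction(1)
    for lam in lambdas:
        prod *= 1 - Fraction(lam)
    return 1 - 1 / prod

# Petersen graph: d = 3 and smallest adjacency eigenvalue -2, so lambda(X) = -2/3.
petersen = hoffman_bound([Fraction(-2, 3)])

# Complete 3-uniform hypergraph on 5 vertices: lambda_0 = -1/4 and lambda_1 = -1/3.
complete = hoffman_bound([Fraction(-1, 4), Fraction(-1, 3)])
```

In both examples the bound is attained: the Petersen graph has an independent set with 4 of its 10 vertices, and in the complete 3-uniform hypergraph any 2 of the 5 vertices form an independent set.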

Filmus, Golubev, and Lifshitz showed that \(\alpha (X) \le {{\,\textrm{Hoff}\,}}(X)\) and that this bound does not change when one takes tensor powers of the hypergraph, a fact that has implications for some problems in extremal combinatorics. The next theorem relates the hypergraph theta number to the high-dimensional Hoffman bound.

Theorem 3.1

If \(X = (V, \mu )\) is a weighted r-uniform hypergraph for some \(r \ge 2\) and if H is its underlying hypergraph, then \(\alpha (X) \le \vartheta (H, \mu ^{(1)}) \le {{\,\textrm{Hoff}\,}}(X)\).

A few remarks before the proof. The theta number is a bound for the weighted independence number, where the weights are placed on the vertices. The Hoffman bound on the other hand is defined for an edge-weighted hypergraph, and since edge weights naturally induce vertex weights, it is possible to compare it to the theta number. However, not every weight function on vertices can be derived from a weight function on edges, so in this sense the theta number applies in more general circumstances.

Moreover, even when a vertex-weight function \(w:V \rightarrow \mathbb {R}_+\) can be derived from an edge-weight function, it is not clear how to efficiently find an edge-weight function that gives w and for which the Hoffman bound gives a good upper bound for \(\alpha (H, w)\). A natural idea is to compute the optimal Hoffman bound, that is, to find the edge weights inducing w for which the corresponding Hoffman bound is the smallest possible. This was proposed by Filmus, Golubev, and Lifshitz [10, Sect. 4.3], but for \(r \ge 3\) the resulting optimization problem has a nonconvex objective function, and it is not clear how to solve it efficiently. In contrast, one can always efficiently compute the theta number of a hypergraph (see Theorem 2.5), and Theorem 3.1 says that the bound so obtained will always be at least as good as the optimal Hoffman bound.

Finally, an important property of the extension of the Hoffman bound is that it is invariant under the tensor power operation, while it is unclear whether the hypergraph theta number behaves nicely under natural hypergraph products.

Proof of Theorem 3.1

By definition we have \(\alpha (X) = \alpha (H, \mu ^{(1)})\) which, by Corollary 2.3, is at most \(\vartheta (H, \mu ^{(1)})\).

The proof of the inequality \(\vartheta (H, \mu ^{(1)}) \le {{\,\textrm{Hoff}\,}}(X)\) proceeds by induction on r. The base is \(r = 2\), in which case the statement was shown by Lovász [18, Theorem 9]. (Note that the Hoffman bound is not defined for \(r = 1\), which is why we take \(r = 2\) as the base.)

So assume \(r \ge 3\). Let \(f \in {{\,\textrm{TH}\,}}(H)\) be such that \((\mu ^{(1)})^\textsf{T}f = \vartheta (H, \mu ^{(1)})\) and let \(F \in {{\,\textrm{LTH}\,}}(H)\) be a matrix such that \(f = {{\,\textrm{diag}\,}}F\). Since \(R(F)\) is positive semidefinite, by taking the Schur complement we see that \(F - f f^\textsf{T}\) is also positive semidefinite, and so

$$\begin{aligned} \sum _{x, y \in V} F(x, y) \mu ^{(1)}(x) \mu ^{(1)}(y) \ge ((\mu ^{(1)})^\textsf{T}f)^2 = \vartheta (H, \mu ^{(1)})^2. \end{aligned}$$

To finish the proof it then suffices to show that

$$\begin{aligned} \sum _{x, y \in V} F(x, y) \mu ^{(1)}(x) \mu ^{(1)}(y) \le \vartheta (H, \mu ^{(1)}) {{\,\textrm{Hoff}\,}}(X). \end{aligned}$$
(11)

Since \(F \in {{\,\textrm{LTH}\,}}(H)\), we have \(F_x[V_x] \in F(x, x) {{\,\textrm{TH}\,}}(H_x)\) for every \(x \in V\), and so

$$\begin{aligned} \sum _{y \in V_x} F(x, y) \mu _x^{(1)}(y) \le F(x, x) \vartheta (H_x, \mu _x^{(1)}). \end{aligned}$$

By induction, \(\vartheta (H_x, \mu _x^{(1)}) \le {{\,\textrm{Hoff}\,}}(X_x)\), hence

$$\begin{aligned} \sum _{x \in V} \mu ^{(1)}(x) \sum _{y \in V_x} F(x, y) \mu _x^{(1)}(y) \le \sum _{x \in V} \mu ^{(1)}(x) F(x, x) \vartheta (H_x, \mu _x^{(1)}) \le \vartheta (H, \mu ^{(1)}) M, \end{aligned}$$

where \(M = \max _{x \in V} {{\,\textrm{Hoff}\,}}(X_x)\). Use (9) on the left-hand side above to get

$$\begin{aligned} \sum _{\{x,y\} \in \left( {\begin{array}{c}V\\ 2\end{array}}\right) } F(x, y) \mu ^{(2)}(\{x, y\}) \le \vartheta (H, \mu ^{(1)}) M, \end{aligned}$$
(12)

which already looks much closer to (11).

We work henceforth on the space \(\mathbb {R}^V\) equipped with the nondegenerate inner product \((\cdot ,\cdot )\) defined above. Since F is positive semidefinite, let \(g_1\), ..., \(g_n\) be an orthonormal basis of eigenvectors of F, with associated nonnegative eigenvalues \(\lambda _1\), ..., \(\lambda _n\). We then have \(F(x, y) = \sum _{i=1}^n \lambda _i g_i(x) g_i(y)\) and

$$\begin{aligned} \sum _{i=1}^n \lambda _i = \sum _{i=1}^n \lambda _i (g_i, g_i )= \sum _{x\in V} F(x, x) \mu ^{(1)}(x) = \vartheta (H, \mu ^{(1)}), \end{aligned}$$
(13)
$$\begin{aligned} \sum _{i=1}^n \lambda _i (g_i, \textbf{1})^2 = \sum _{x, y \in V} F(x, y) \mu ^{(1)}(x) \mu ^{(1)}(y), \end{aligned}$$
(14)

and, using (10),

$$\begin{aligned} \begin{aligned} \sum _{i=1}^n \lambda _i (T_X g_i, g_i)&= \sum _{i=1}^n \lambda _i \sum _{\{x,y\} \in \left( {\begin{array}{c}V\\ 2\end{array}}\right) } g_i(x) g_i(y) \mu ^{(2)}(\{x,y\})\\&= \sum _{\{x,y\} \in \left( {\begin{array}{c}V\\ 2\end{array}}\right) } F(x, y) \mu ^{(2)}(\{x,y\}). \end{aligned} \end{aligned}$$
(15)

Let \(\textbf{1}= v_1\), \(v_2\), ..., \(v_n\) be an orthonormal basis of eigenvectors of \(T_X\) with associated eigenvalues \(1 = \alpha _1 \ge \alpha _2 \ge \cdots \ge \alpha _n = \lambda (X)\). For every i we have

$$\begin{aligned} 1 = (g_i, g_i )= \sum _{j=1}^n (g_i, v_j)^2, \end{aligned}$$

whence \(\sum _{j=2}^n (g_i, v_j)^2 = 1 - (g_i, \textbf{1})^2\). It follows that

$$\begin{aligned} \begin{aligned} (T_X g_i, g_i)&= \sum _{j=1}^n \alpha _j (g_i, v_j)^2\\&\ge (g_i, \textbf{1})^2 + \sum _{j=2}^n \alpha _n (g_i, v_j)^2\\&= (1 - \alpha _n) (g_i, \textbf{1})^2 + \alpha _n. \end{aligned} \end{aligned}$$

Combine this with (15) and (12) to get

$$\begin{aligned} \sum _{i=1}^n \lambda _i ((1 - \alpha _n)(g_i, \textbf{1})^2 + \alpha _n) \le \sum _{\{x,y\} \in \left( {\begin{array}{c}V\\ 2\end{array}}\right) } F(x, y) \mu ^{(2)}(\{x,y\})\le \vartheta (H, \mu ^{(1)}) M. \end{aligned}$$

By (13) this implies that

$$\begin{aligned} (1-\alpha _n)\sum _{i=1}^n \lambda _i (g_i, \textbf{1})^2 \le \vartheta (H, \mu ^{(1)}) (M - \alpha _n). \end{aligned}$$

Since \(\alpha _n < 0\) and hence \(1 - \alpha _n > 0\), using (14) we finally get

$$\begin{aligned} \sum _{x, y \in V} F(x, y) \mu ^{(1)}(x) \mu ^{(1)}(y) = \sum _{i=1}^n \lambda _i (g_i, \textbf{1})^2 \le \vartheta (H, \mu ^{(1)}) \frac{M - \alpha _n}{1 - \alpha _n}. \end{aligned}$$
(16)

We are now essentially done. Indeed, every i-link of \(X_x\) is an \((i+1)\)-link of X, so \(\lambda _i(X_x) \ge \lambda _{i+1}(X)\) for all \(x \in V\) and \(i = 0\), ..., \(r-3\), hence

$$\begin{aligned} 1 - \frac{1}{(1 -\lambda _0(X_x)) \cdots (1 - \lambda _{r-3}(X_x))} \le 1 - \frac{1}{(1 - \lambda _1(X)) \cdots (1 - \lambda _{r-2}(X))}. \end{aligned}$$

The left-hand side above is precisely \({{\,\textrm{Hoff}\,}}(X_x)\) and the right-hand side is equal to \(\lambda _0(X) + (1 - \lambda _0(X)) {{\,\textrm{Hoff}\,}}(X)\). We conclude that

$$\begin{aligned} \frac{M - \lambda _0(X)}{1 - \lambda _0(X)} = \max _{x \in V} \frac{{{\,\textrm{Hoff}\,}}(X_x) - \lambda _0(X)}{1 - \lambda _0(X)} \le {{\,\textrm{Hoff}\,}}(X). \end{aligned}$$

Since \(\alpha _n = \lambda _0(X)\) by definition, this combined with (16) gives (11), as wished. \(\square \)
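The identity used for the right-hand side in the last step of the proof can be checked in one line. As a sketch, write \(P = (1 - \lambda _1(X)) \cdots (1 - \lambda _{r-2}(X))\), a shorthand introduced here only for this computation, so that \({{\,\textrm{Hoff}\,}}(X) = 1 - ((1 - \lambda _0(X)) P)^{-1}\). Then

$$\begin{aligned} \lambda _0(X) + (1 - \lambda _0(X)) {{\,\textrm{Hoff}\,}}(X) = \lambda _0(X) + (1 - \lambda _0(X)) - \frac{1}{P} = 1 - \frac{1}{P}, \end{aligned}$$

which is exactly the upper bound obtained for \({{\,\textrm{Hoff}\,}}(X_x)\) in terms of \(\lambda _1(X)\), ..., \(\lambda _{r-2}(X)\).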

We mentioned above that the Hoffman bound coincides with the theta number when the graph is edge transitive. More generally, if a weighted hypergraph and all its lower-order links are vertex transitive, then the Hoffman bound coincides with the theta number. The proof of this assertion is an adaptation of the proof of Theorem 3.1: use the results of Sect. 5 to take an invariant matrix \(F \in {{\,\textrm{LTH}\,}}(H)\) and check that since the hypergraph and all its lower-order links are vertex transitive, every inequality in the proof is tight.

4 The Antiblocker

Recall the definition of antiblocker from Sect. 1.5. If G is a graph, then the antiblocker of \({{\,\textrm{TH}\,}}(G)\) is \({{\,\textrm{TH}\,}}({\overline{G}})\) (see Grötschel, Lovász, and Schrijver [12, Chapter 9]). The same does not hold for hypergraphs in general. Consider, for instance, the hypergraph \(H = ([r], \{[r]\})\) and notice that \(f = \chi _{[r-1]} \in {{\,\textrm{IND}\,}}(H)\) and \(g = \chi _{[r]} \in {{\,\textrm{IND}\,}}({\overline{H}})\). So \(f \in {{\,\textrm{TH}\,}}(H)\) and \(g \in {{\,\textrm{TH}\,}}({\overline{H}})\), but \(f^\textsf{T}g = r-1\), which exceeds 1 when \(r \ge 3\). Hence, for \(r \ge 3\), \({{\,\textrm{TH}\,}}({\overline{H}})\) is not the antiblocker of \({{\,\textrm{TH}\,}}(H)\).

It seems from this simple example that we are off by a factor of \(r-1\), so is \(A({{\,\textrm{TH}\,}}(H)) = (r-1)^{-1} {{\,\textrm{TH}\,}}({\overline{H}})\)? The answer is again no, and the smallest example is the hypergraph on \(\{1, \ldots , 5\}\) with edges \(\{1, 2, 5\}\), \(\{1, 3, 4\}\), and \(\{2, 3, 4\}\).

To describe the antiblocker of the theta body, we start by defining an alternative theta number inspired by the dual of the theta number for graphs. For a number \(\lambda \) and a symmetric matrix A with diagonal a, write

$$\begin{aligned} R(\lambda , A) = \begin{pmatrix} \lambda &{}a^\textsf{T}\\ a&{}A \end{pmatrix}. \end{aligned}$$

Given an r-uniform hypergraph H for \(r \ge 2\) and a weight function \(w \in \mathbb {R}_+^V\), denote by \(\vartheta ^\circ (H, w)\) both the semidefinite program below and its optimal value:

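$$\begin{aligned} \begin{array}{rl} \min &{}\lambda \\ &{}Z(x, x) = w(x)\quad \text {for}~x \in V,\\ &{}Z_x[{\overline{V}}_x] \in w(x) {{\,\textrm{TH}\,}}({\overline{H}}_x)\quad \text {for}~x \in V,\\ &{}R(\lambda , Z)~\text {is positive semidefinite,} \end{array} \end{aligned}$$
(17)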

where \({\overline{V}}_x\) is the vertex set of the link of x in \({\overline{H}}\).

Now define

$$\begin{aligned} {{\,\mathrm{TH^\circ }\,}}(H) = \{\, g \in \mathbb {R}_+^V: w^\textsf{T}g \le \vartheta ^\circ (H, w) \text { for all}~w \in \mathbb {R}_+^V\,\}. \end{aligned}$$

This is a nonempty, closed, and convex set of antiblocking type contained in \([0, 1]^V\). Indeed, to see that \({{\,\mathrm{TH^\circ }\,}}(H) \subseteq [0, 1]^V\), fix \(x \in V\) and let \(w = \chi _{\{x\}}\). Then \(\lambda = 1\) and \(Z = \chi _{\{x\}} \chi _{\{x\}}^\textsf{T}\) is a feasible solution of \(\vartheta ^\circ (H, w)\), whence \(\vartheta ^\circ (H, w) \le 1\), implying that \(g(x) = w^\textsf{T}g \le 1\) for every \(g \in {{\,\mathrm{TH^\circ }\,}}(H)\), as we wanted.

Theorem 4.1

If \(H = (V, E)\) is an r-uniform hypergraph for \(r \ge 2\), then:

  (1) \(\vartheta ^\circ (H, w) = \max \{\, w^\textsf{T}g: g \in {{\,\mathrm{TH^\circ }\,}}(H)\,\}\) for every \(w \in \mathbb {R}_+^V\);

  (2) \(\vartheta (H, l) \vartheta ^\circ ({\overline{H}}, w) \ge l^\textsf{T}w\) for every l, \(w \in \mathbb {R}_+^V\);

  (3) \(A({{\,\textrm{TH}\,}}(H)) = {{\,\mathrm{TH^\circ }\,}}({\overline{H}})\).

If G is a graph, then \(A({{\,\textrm{TH}\,}}(G)) = {{\,\textrm{TH}\,}}({\overline{G}})\), and hence \({{\,\textrm{TH}\,}}(G) = {{\,\mathrm{TH^\circ }\,}}(G)\). For r-uniform hypergraphs with \(r \ge 3\), this is no longer always the case.

Proof

The proof of (1) will require the following facts:

  (i) if \(w \in \mathbb {R}_+^V\) and \(\alpha \ge 0\), then \(\vartheta ^\circ (H, \alpha w) = \alpha \vartheta ^\circ (H, w)\);

  (ii) if w, \(w' \in \mathbb {R}_+^V\), then \(\vartheta ^\circ (H, w + w') \le \vartheta ^\circ (H, w) + \vartheta ^\circ (H, w')\);

  (iii) if w, \(w' \in \mathbb {R}_+^V\) and \(w' \le w\), then \(\vartheta ^\circ (H, w') \le \vartheta ^\circ (H, w)\).

To show (i), note that if \((\lambda , Z)\) is a feasible solution of \(\vartheta ^\circ (H, w)\), then \((\alpha \lambda , \alpha Z)\) is a feasible solution of \(\vartheta ^\circ (H, \alpha w)\). For (ii), simply take feasible solutions for w and \(w'\) and note that their sum is a feasible solution for \(w + w'\). For (iii), we show that if \(w' \le w\) differs from w in a single entry \(x \in V\), then the inequality holds; by applying this result repeatedly, we then get (iii).

Indeed, fix \(x \in V\) and let \((\lambda , Z)\) be an optimal solution of \(\vartheta ^\circ (H, w)\). If \({\bar{Z}}\) is the Hadamard product of Z and \((\textbf{1}- \chi _{\{x\}})(\textbf{1}- \chi _{\{x\}})^\textsf{T}\), then \((\lambda , {\bar{Z}})\) is a feasible solution of \(\vartheta ^\circ (H, {\bar{w}})\), where \({\bar{w}}(x) = 0\) and \({\bar{w}}(y) = w(y)\) for \(y \ne x\). By taking convex combinations of \((\lambda , {\bar{Z}})\) and \((\lambda , Z)\), we then see that \(\vartheta ^\circ (H, w') \le \lambda = \vartheta ^\circ (H, w)\) for every \(w'\) such that \(0 \le w'(x) \le w(x)\) and \(w'(y) = w(y)\) for \(y \ne x\).

Back to (1), suppose \(\max \{\, w^\textsf{T}g: g \in {{\,\mathrm{TH^\circ }\,}}(H)\,\} < \vartheta ^\circ (H, w)\). Since \({{\,\mathrm{TH^\circ }\,}}(H)\) is a compact set, Theorem 8.1 from Appendix 1 gives us a function \(y:\mathbb {R}_+^V \rightarrow \mathbb {R}_+\), of finite support, such that

$$\begin{aligned} \sum _{{\bar{w}} \in \mathbb {R}_+^V} y({\bar{w}}) {\bar{w}} \ge w\quad \text {and}\quad \sum _{{\bar{w}} \in \mathbb {R}_+^V} y({\bar{w}}) \vartheta ^\circ (H, {\bar{w}}) < \vartheta ^\circ (H, w), \end{aligned}$$

and together with (i), (ii), and (iii) we get a contradiction.

To see (2), fix l, \(w \in \mathbb {R}_+^V\) and let \((\lambda , Z)\) be an optimal solution of \(\vartheta ^\circ ({\overline{H}}, w)\). If \(w = 0\), then the result is immediate, so assume \(w \ne 0\) and therefore \(\vartheta ^\circ ({\overline{H}}, w) > 0\). Then \(\lambda ^{-1} Z \in {{\,\textrm{LTH}\,}}(H)\) so \(\lambda ^{-1} w \in {{\,\textrm{TH}\,}}(H)\) and

$$\begin{aligned} \vartheta (H, l) \ge l^\textsf{T}(\lambda ^{-1} w) = \vartheta ^\circ ({\overline{H}}, w)^{-1} l^\textsf{T}w, \end{aligned}$$

proving (2).

To finish, we prove that if \(f \in {{\,\textrm{TH}\,}}(H)\) and \(g \in {{\,\mathrm{TH^\circ }\,}}({\overline{H}})\), then \(f^\textsf{T}g \le 1\), as (3) then follows by using Lehman’s length-width inequality (see footnote 2) together with (1) and (2). So take \(f \in {{\,\textrm{TH}\,}}(H)\) and \(g \in {{\,\mathrm{TH^\circ }\,}}({\overline{H}})\). Let \(A \in {{\,\textrm{LTH}\,}}(H)\) be such that \(\vartheta (H, g) = g^\textsf{T}a\), where \(a = {{\,\textrm{diag}\,}}A\). Note that \(\lambda = 1\) and \(Z = A\) is a feasible solution of \(\vartheta ^\circ ({\overline{H}}, a)\), so \(\vartheta ^\circ ({\overline{H}}, a) \le 1\). Hence

$$\begin{aligned} f^\textsf{T}g \le \vartheta (H, g) = g^\textsf{T}a \le \vartheta ^\circ ({\overline{H}}, a) \le 1, \end{aligned}$$

and we are done. \(\square \)

The antiblocker offers another relaxation of the independent-set polytope: we have the following analogue of Theorem 2.1 and Corollary 2.3.

Theorem 4.2

If H is an r-uniform hypergraph for \(r \ge 2\), then

$$\begin{aligned} (r-1)^{-1}{{\,\textrm{IND}\,}}(H) \subseteq {{\,\mathrm{TH^\circ }\,}}(H) \subseteq (r-1)^{-1} {{\,\textrm{QIND}\,}}(H) \end{aligned}$$
(18)

and \((r-1)^{-1} \alpha (H, w) \le \vartheta ^\circ (H, w) \le \chi ^*({\overline{H}}, w)\) for every \(w \in \mathbb {R}_+^V\).

Proof

The antiblocker of \({{\,\textrm{IND}\,}}(H)\) is \((r-1)^{-1} {{\,\textrm{QIND}\,}}({\overline{H}})\) (see Theorem 9.4 in Schrijver [25]). Since also \(A(\alpha K) = \alpha ^{-1}A(K)\) for every convex set of antiblocking type K and \(\alpha > 0\), we get (18) directly from Theorems 2.1 and 4.1.

It follows that \((r-1)^{-1}\alpha (H, w) \le \vartheta ^\circ (H, w)\). The proof of \(\vartheta ^\circ (H, w) \le \chi ^*({\overline{H}}, w)\) is a straightforward modification of the proof of Corollary 2.3. \(\square \)

5 Exploiting Symmetry

When a hypergraph is highly symmetric, the optimization problem over the theta body or its lifted counterpart can be significantly simplified. We enter the realm of invariant semidefinite programs, a topic which has been thoroughly explored in the last decade [2]. In this section, we discuss the aspects of the general theory that are most relevant to our applications.

Let V be a finite set and let \(\Gamma \) be a finite group that acts on V. The action of \(\Gamma \) extends naturally to a function \(f \in \mathbb {R}^V\): given \(\sigma \in \Gamma \) we define

$$\begin{aligned} (\sigma f)(x) = f(\sigma ^{-1}x). \end{aligned}$$

Similarly, the action extends to matrices \(A \in \mathbb {R}^{V \times V}\) by setting

$$\begin{aligned} (\sigma A)(x, y) = A(\sigma ^{-1} x, \sigma ^{-1} y) \end{aligned}$$

for every \(\sigma \in \Gamma \). We say that \(f\in \mathbb {R}^V\) is \(\Gamma \)-invariant if \(\sigma f = f\) for all \(\sigma \in \Gamma \). We define \(\Gamma \)-invariant matrices likewise.

An automorphism \(\sigma \) of a hypergraph \(H = (V, E)\) is a permutation of V that preserves edges: if \(e \in E\), then \(\sigma e \in E\). The set of all automorphisms forms a group under function composition, called the automorphism group of H and denoted by \({{\,\textrm{Aut}\,}}(H)\).

Let \(H = (V, E)\) be an r-uniform hypergraph for \(r \ge 2\). Consider first the optimization problem over \({{\,\textrm{LTH}\,}}(H)\): given \(w \in \mathbb {R}_+^V\), we want to find

$$\begin{aligned} \max \{\, w^\textsf{T}f: F \in {{\,\textrm{LTH}\,}}(H) \text { and}~f = {{\,\textrm{diag}\,}}F\,\}. \end{aligned}$$
(19)

If \(\Gamma \subseteq {{\,\textrm{Aut}\,}}(H)\) is a group and w is \(\Gamma \)-invariant, then when solving the optimization problem above we may restrict ourselves to \(\Gamma \)-invariant matrices F.

Indeed, for \(x \in V\), let \(H_x = (V_x, E_x)\) be the link of x. Since \(\Gamma \subseteq {{\,\textrm{Aut}\,}}(H)\), for every \(x \in V\) and every \(\sigma \in \Gamma \) we have that \(V_{\sigma x} = \sigma V_x\) and \(E_{\sigma x} = \sigma E_x\), hence \(H_{\sigma x} = \sigma H_x\). It follows that \({{\,\textrm{TH}\,}}(H_{\sigma x}) = \sigma {{\,\textrm{TH}\,}}(H_x)\), where the action of \(\sigma \) maps the function \(f:V_x \rightarrow \mathbb {R}\) to the function \(\sigma f:V_{\sigma x} \rightarrow \mathbb {R}\) by \((\sigma f)(\sigma y) = f(y)\) for \(y \in V_x\).

This implies that, if \(F \in {{\,\textrm{LTH}\,}}(H)\), then \(\sigma F \in {{\,\textrm{LTH}\,}}(H)\) for every \(\sigma \in \Gamma \). Since w is invariant, the objective values of F and \(\sigma F\) coincide for every \(\sigma \in \Gamma \). Use the convexity of \({{\,\textrm{LTH}\,}}(H)\) to conclude that, if \(F \in {{\,\textrm{LTH}\,}}(H)\), then

$$\begin{aligned} {\bar{F}} = \frac{1}{|\Gamma |} \sum _{\sigma \in \Gamma } \sigma F \end{aligned}$$

also belongs to \({{\,\textrm{LTH}\,}}(H)\). Now \({\bar{F}}\) is \(\Gamma \)-invariant and has the same objective value as F, hence when solving (19) we can restrict ourselves to \(\Gamma \)-invariant matrices.
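The averaging step can be illustrated on a toy instance. The sketch below is an assumption-laden illustration rather than the authors' code: the names `act` and `reynolds_average` are hypothetical, the group is the cyclic group of rotations of \(\{0, \ldots , 4\}\), and the matrix F is an arbitrary symmetric matrix. It only checks that the average is invariant and that the objective value for \(w = \textbf{1}\) is unchanged; membership in \({{\,\textrm{LTH}\,}}(H)\) is not tested.

```python
from fractions import Fraction

def act(sigma, A):
    """(sigma A)(x, y) = A(sigma^{-1} x, sigma^{-1} y), with sigma given as a dict."""
    inv = {image: x for x, image in sigma.items()}
    return {(x, y): A[inv[x], inv[y]] for (x, y) in A}

def reynolds_average(group, A):
    """Average of sigma A over all sigma in a finite group of permutations."""
    images = [act(sigma, A) for sigma in group]
    return {key: sum(B[key] for B in images) / len(group) for key in A}

# Hypothetical example: rotations of {0, ..., 4} and an arbitrary symmetric matrix.
V = range(5)
group = [{x: (x + k) % 5 for x in V} for k in range(5)]
F = {(x, y): Fraction(1 + x * y + x + y) for x in V for y in V}
F_bar = reynolds_average(group, F)

invariant = all(act(sigma, F_bar) == F_bar for sigma in group)
same_objective = sum(F_bar[x, x] for x in V) == sum(F[x, x] for x in V)
```

For a transitive group such as this one, the averaged matrix has a constant diagonal, which is the property exploited in Theorem 5.1.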

If \(\Gamma \) is a large group, this restriction allows us to simplify (19) considerably using standard techniques [2]. The case when \(\Gamma \) acts transitively on V is of particular interest to us.

Theorem 5.1

If \(H = (V, E)\) is an r-uniform hypergraph for \(r \ge 2\) and if \(\Gamma \subseteq {{\,\textrm{Aut}\,}}(H)\) acts transitively on V, then the optimal value of (19) for \(w = \textbf{1}\) is equal to the optimal value of the problem

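$$\begin{aligned} \begin{array}{rl} \max &{}|V|^{-1} \langle J, A\rangle \\ &{}A(x_0, x_0) = 1,\\ &{}A_{x_0}[V_{x_0}] \in {{\,\textrm{TH}\,}}(H_{x_0}),\\ &{}A \in \mathbb {R}^{V \times V}~\text {is positive semidefinite and}~\Gamma \text {-invariant,} \end{array} \end{aligned}$$
(20)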

where \(x_0 \in V\) is any fixed vertex and J is the all-ones matrix.

Proof

Note that \(w = \textbf{1}\) is \(\Gamma \)-invariant, so when solving (19) we can restrict ourselves to \(\Gamma \)-invariant matrices. Let F be a \(\Gamma \)-invariant feasible solution of (19) and set \(A = |V| (\textbf{1}^\textsf{T}f)^{-1} F\), where \(f = {{\,\textrm{diag}\,}}F\).

Since \(\Gamma \) acts transitively, all diagonal entries of F are equal, hence A is a feasible solution of (20). Now \(R(F)\) is positive semidefinite, and hence the Schur complement \(F - f f^\textsf{T}\) is also positive semidefinite. So

$$\begin{aligned} |V|^{-1} \langle J, A\rangle = (\textbf{1}^\textsf{T}f)^{-1} \langle J, F\rangle \ge (\textbf{1}^\textsf{T}f)^{-1} \langle J, f f^\textsf{T}\rangle = \textbf{1}^\textsf{T}f, \end{aligned}$$

and we see that the optimal value of (20) is at least that of (19).

For the reverse inequality, let A be a feasible solution of (20). Since the action of \(\Gamma \) is transitive, we immediately get that \(A(x, x) = 1\) for all \(x \in V\); it is a little more involved, though mechanical, to verify that \(A_x[V_x] \in {{\,\textrm{TH}\,}}(H_x)\) for all \(x \in V\).

So set \(F = |V|^{-2} \langle J, A\rangle A\) and \(f = {{\,\textrm{diag}\,}}F\); note that \(f = |V|^{-2} \langle J, A\rangle \textbf{1}\). Since \(\textbf{1}^\textsf{T}f = |V|^{-1} \langle J, A\rangle \), if we show that F is a feasible solution of (19), then we are done, and to show that F is feasible for (19) it suffices to show that \(R(F)\) is positive semidefinite.

This in turn can be achieved by showing that the Schur complement \(F - f f^\textsf{T}\) is positive semidefinite. Indeed, note that since A is \(\Gamma \)-invariant, the constant vector \(\textbf{1}\) is an eigenvector of A with eigenvalue \(|V|^{-1} \langle J, A\rangle \). Hence \(\textbf{1}\) is an eigenvector of both F and \(f f^\textsf{T}\) with the same eigenvalue; since all other eigenvalues of \(f f^\textsf{T}\) are zero and F is positive semidefinite, we are done. \(\square \)

Symmetry also simplifies testing whether a given vector is in the theta body.

Theorem 5.2

Let \(H = (V, E)\) be an r-uniform hypergraph with \(r \ge 2\) and let \(\Gamma \subseteq {{\,\textrm{Aut}\,}}(H)\) be a group. A \(\Gamma \)-invariant vector \(f \in \mathbb {R}^V\) is in \({{\,\textrm{TH}\,}}(H)\) if and only if \(f \ge 0\) and \(w^\textsf{T}f \le \vartheta (H, w)\) for every \(\Gamma \)-invariant \(w \in \mathbb {R}_+^V\).

Proof

Necessity being trivial from Theorem 2.4, let us prove sufficiency. If \(w\in \mathbb {R}_+^V\) is any weight function, then since f is \(\Gamma \)-invariant we have that

$$\begin{aligned} w^\textsf{T}f = \frac{1}{|\Gamma |} \sum _{\sigma \in \Gamma } w^\textsf{T}(\sigma f) = \frac{1}{|\Gamma |} \sum _{\sigma \in \Gamma } (\sigma ^{-1}w)^\textsf{T}f = {\bar{w}}^\textsf{T}f, \end{aligned}$$

where \({\bar{w}} = |\Gamma |^{-1}\sum _{\sigma \in \Gamma } \sigma ^{-1} w\). Note that \({\bar{w}}\) is \(\Gamma \)-invariant.

We claim that \(\vartheta (H, {\bar{w}}) \le \vartheta (H, w)\). Indeed, since \({\bar{w}}\) is \(\Gamma \)-invariant, let \(g \in {{\,\textrm{TH}\,}}(H)\) be a \(\Gamma \)-invariant vector such that \({\bar{w}}^\textsf{T}g = \vartheta (H, {\bar{w}})\). Then \((\sigma w)^\textsf{T}g = w^\textsf{T}(\sigma ^{-1} g) = w^\textsf{T}g\) for all \(\sigma \in \Gamma \), and so \(w^\textsf{T}g = {\bar{w}}^\textsf{T}g\), hence \(\vartheta (H, w) \ge {\bar{w}}^\textsf{T}g\), proving the claim.

Now use the claim to get \(w^\textsf{T}f = {\bar{w}}^\textsf{T}f \le \vartheta (H, {\bar{w}}) \le \vartheta (H, w)\), and with Theorem 2.4 we are done. \(\square \)

6 Triangle-Encoding Hypergraphs and Mantel’s Theorem

In a 1910 issue of the journal Wiskundige Opgaven, published by the Royal Dutch Mathematical Society, Mantel [19] asked what perhaps turned out to be the first question of extremal graph theory; in modern terminology: how many edges can a triangle-free graph on n vertices have? The complete bipartite graph on n vertices with parts of size \(\lfloor n/2\rfloor \) and \(\lceil n/2\rceil \) is triangle-free and has \(\lfloor n^2/4\rfloor \) edges. Can we do better?

The answer came in the same issue, supplied by Mantel and several others: a triangle-free graph on n vertices has at most \(\lfloor n^2/4\rfloor \) edges. Mantel’s theorem was later generalized by Turán to \(K_r\)-free graphs for \(r \ge 4\).

Given an integer \(n \ge 3\), we want to find the largest triangle-free graph on \([n] = \{1, \ldots , n\}\). So we construct a 3-uniform hypergraph \(H_n = (V_n, E_n)\) as follows:

  • the vertices of \(H_n\) are the edges of the complete graph \(K_n\) on [n];

  • three vertices of \(H_n\), corresponding to three edges of \(K_n\), form an edge of \(H_n\) if they form a triangle in \(K_n\).

Independent sets of \(H_n\) thus correspond to triangle-free subgraphs of \(K_n\), and the independent-set polytope of \(H_n\) coincides with the Turán polytope studied by Raymond [21]. In order to illustrate our methods we will compute the theta number \(\vartheta (H_n)\), which provides an upper bound of \(n^2/4\) for the independence number of \(H_n\). This bound, rounded down, coincides with the lower bound \(\lfloor n^2/4\rfloor \) given by the complete bipartite graph, showing that the theta number is essentially tight for this infinite family of hypergraphs. Incidentally, this gives another proof of Mantel’s theorem, though not a particularly short one.
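For very small n the correspondence between independent sets of \(H_n\) and triangle-free graphs can be checked by brute force. The following sketch uses hypothetical helper names (`triangle_hypergraph`, `independence_number`) and an exhaustive search that is only practical for n up to about 6; it is meant as a sanity check of the construction, not as a method for computing the bound.

```python
from itertools import combinations

def triangle_hypergraph(n):
    """Vertices: edges of K_n on {1, ..., n}; hyperedges: the three edges of each triangle."""
    vertices = [frozenset(pair) for pair in combinations(range(1, n + 1), 2)]
    edges = [
        frozenset(frozenset(pair) for pair in combinations(tri, 2))
        for tri in combinations(range(1, n + 1), 3)
    ]
    return vertices, edges

def independence_number(vertices, edges):
    """Largest vertex subset containing no hyperedge, found by exhaustive search."""
    best = 0
    for mask in range(1 << len(vertices)):
        chosen = {v for i, v in enumerate(vertices) if mask >> i & 1}
        if len(chosen) > best and not any(e <= chosen for e in edges):
            best = len(chosen)
    return best

# Independence numbers for n = 3, 4, 5, to be compared with n * n // 4.
alphas = {n: independence_number(*triangle_hypergraph(n)) for n in (3, 4, 5)}
```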

The symmetric group \(\mathcal {S}_n\) on n elements acts on [n], and therefore on \(V_n\), and this action preserves edges of \(H_n\), hence \(\mathcal {S}_n\) is a subgroup of \({{\,\textrm{Aut}\,}}(H_n)\). The action of \(\mathcal {S}_n\) is also transitive, so we set \(x_0 = \{1, 2\}\) and use Theorem 5.1 to get

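$$\begin{aligned} \begin{array}{rl} \vartheta (H_n) = \max &{}|V_n|^{-1} \langle J, A\rangle \\ &{}A(x_0, x_0) = 1,\\ &{}A_{x_0}[(V_n)_{x_0}] \in {{\,\textrm{TH}\,}}((H_n)_{x_0}),\\ &{}A \in \mathbb {R}^{V_n \times V_n}~\text {is positive semidefinite and}~\mathcal {S}_n\text {-invariant.} \end{array} \end{aligned}$$
(21)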

The link of \(x_0 = \{1,2\}\) is the graph with vertex set

$$\begin{aligned} (V_n)_{x_0} = \{\, \{1, k\}: k \in \{3, \ldots , n\}\,\} \cup \{\, \{2, k\}: k \in \{3, \ldots , n\}\,\} \end{aligned}$$

and edge set

$$\begin{aligned} (E_n)_{x_0} = \{\,\{\{1, k\}, \{2,k\}\}: k \in \{3, \ldots , n\}\,\}, \end{aligned}$$

that is, it is a matching with \(2(n-2)\) vertices and \(n-2\) edges (see Fig. 1).

Fig. 1 On the left, the hypergraph \(H_4\), where each vertex is the edge of \(K_4\) shown. On the right, the link of one of its vertices, consisting of a matching with 4 vertices and 2 edges
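The description of the link of \(x_0 = \{1, 2\}\) as a perfect matching can be confirmed directly from the definition of \(H_n\). The sketch below is illustrative only: the helpers `triangle_edges` and `link` are hypothetical names, the value \(n = 6\) is an arbitrary choice, and the link is computed by removing \(x_0\) from every hyperedge that contains it.

```python
from itertools import combinations

def triangle_edges(n):
    """Hyperedges of H_n: the three K_n-edges of each triangle on {1, ..., n}."""
    return [
        frozenset(frozenset(pair) for pair in combinations(tri, 2))
        for tri in combinations(range(1, n + 1), 3)
    ]

def link(edges, x):
    """Link of vertex x: drop x from every hyperedge that contains it."""
    link_edges = [e - {x} for e in edges if x in e]
    link_vertices = set().union(*link_edges)
    return link_vertices, link_edges

n = 6
x0 = frozenset({1, 2})
link_vertices, link_edges = link(triangle_edges(n), x0)

# In a perfect matching every vertex lies in exactly one edge.
is_matching = all(sum(v in e for e in link_edges) == 1 for v in link_vertices)
```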

The row \(A_{x_0}\) of an \(\mathcal {S}_n\)-invariant matrix \(A \in \mathbb {R}^{V_n \times V_n}\) is invariant under the stabilizer of \(x_0\), and so \(A_{x_0}[(V_n)_{x_0}]\) is a constant function since the stabilizer acts transitively on \((V_n)_{x_0}\). Theorem 5.2 then implies that \(A_{x_0}[(V_n)_{x_0}] \in {{\,\textrm{TH}\,}}((H_n)_{x_0})\) if and only if

$$\begin{aligned} 0 \le A(x_0, \{1, 3\}) \le \frac{\vartheta ((H_n)_{x_0})}{2(n-2)} = \frac{1}{2}, \end{aligned}$$
(22)

since \(n-2 \le \alpha ((H_n)_{x_0}) \le \vartheta ((H_n)_{x_0}) \le \chi (\overline{(H_n)_{x_0}}) \le n-2\).

We simplify this problem further by computing a basis of the space of \(\mathcal {S}_n\)-invariant symmetric matrices in \(\mathbb {R}^{V_n \times V_n}\). The action of \(\mathcal {S}_n\) on \(V_n\) extends naturally to an action on \(V_n \times V_n\). There are three orbits of \(V_n \times V_n\) under this action, namely \(R_k = \{\, (x, y): |x \cap y| = 2-k\, \}\) for \(k = 0\), 1, and 2. So a basis of the invariant subspace is given by the matrices \(A_k\) such that

$$\begin{aligned} A_k(x, y) = {\left\{ \begin{array}{ll} 1,&{}\text {if}~(x, y) \in R_k;\\ 0,&{}\text {otherwise.} \end{array}\right. } \end{aligned}$$

Note that \(A_0\) is the identity matrix.

A feasible solution of (21) is then of the form

$$\begin{aligned} A = A_0 + \alpha A_1 + \beta A_2 \end{aligned}$$
(23)

for some real numbers \(\alpha \) and \(\beta \). We see moreover that \(A(x_0, \{1,3\}) = \alpha \), and so (22) becomes \(0 \le \alpha \le 1/2\). The objective function is

$$\begin{aligned} \begin{aligned} |V_n|^{-1} \langle J, A\rangle&= |V_n|^{-1} (\langle J, A_0\rangle + \alpha \langle J, A_1 \rangle + \beta \langle J, A_2\rangle )\\&= 1 + |V_n|^{-1} |R_1| \alpha + |V_n|^{-1} |R_2| \beta . \end{aligned} \end{aligned}$$
(24)

For the positive semidefiniteness constraint on A, we observe that \(\{A_0, A_1, A_2\}\) is the Johnson scheme \(\mathcal {J}(n, 2)\) (see Godsil and Meagher [11, Chapter 6]). The algebra spanned by the scheme (its Bose-Mesner algebra) is commutative, unital, and closed under transposition; its matrices then share a common basis of eigenvectors, say \(v_1\), \(v_2\), and \(v_3\), and can therefore be simultaneously diagonalized. The eigenvalues of \(v_1\), \(v_2\), and \(v_3\) for each matrix are (see Theorem 6.5.2, ibid.):

$$\begin{aligned} \begin{array}{ll} A_0:&{}1,\ 1,\ 1;\\ A_1:&{}-2,\ n-4,\ 2n-4;\\ A_2:&{}1,\ -(n-3),\ (n-2)(n-3)/2. \end{array} \end{aligned}$$
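The eigenvalue table can be verified exactly for small n by applying each \(A_k\) to explicit common eigenvectors. In the sketch below the three vectors are hypothetical choices made for illustration (any vectors from the corresponding eigenspaces would do): a signed indicator supported on the pairs \(\{1,2\}\), \(\{3,4\}\), \(\{1,3\}\), \(\{2,4\}\), a centered indicator of the pairs containing the point 1, and the all-ones vector. The function name `johnson_check` is also hypothetical, and \(n \ge 4\) is assumed.

```python
from itertools import combinations

def johnson_check(n):
    """Check the eigenvalue table of the Johnson scheme J(n, 2) on explicit eigenvectors."""
    V = [frozenset(pair) for pair in combinations(range(1, n + 1), 2)]

    def apply(k, f):
        # (A_k f)(x) is the sum of f(y) over all y with |x & y| = 2 - k.
        return {x: sum(f[y] for y in V if len(x & y) == 2 - k) for x in V}

    ab, cd, ac, bd = (frozenset(p) for p in ((1, 2), (3, 4), (1, 3), (2, 4)))
    v1 = {x: (x in (ab, cd)) - (x in (ac, bd)) for x in V}   # signed 4-cycle vector
    v2 = {x: n * (1 in x) - 2 for x in V}                    # centered star indicator
    v3 = {x: 1 for x in V}                                   # all-ones vector
    table = {
        0: (1, 1, 1),
        1: (-2, n - 4, 2 * n - 4),
        2: (1, -(n - 3), (n - 2) * (n - 3) // 2),
    }
    return all(
        apply(k, v) == {x: theta * v[x] for x in V}
        for k, thetas in table.items()
        for v, theta in zip((v1, v2, v3), thetas)
    )
```

Calling `johnson_check(n)` for a few small values of n confirms that each listed number is the eigenvalue of the corresponding \(A_k\) on these vectors.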

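Putting it all together, our original problem can be rewritten as

$$\begin{aligned} \begin{array}{rl} \max &{}1 + |V_n|^{-1} |R_1| \alpha + |V_n|^{-1} |R_2| \beta \\ &{}0 \le \alpha \le 1/2,\\ &{}1 - 2\alpha + \beta \ge 0,\\ &{}1 + (n-4)\alpha - (n-3)\beta \ge 0,\\ &{}1 + (2n-4)\alpha + \tfrac{1}{2}(n-2)(n-3)\beta \ge 0. \end{array} \end{aligned}$$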

This is a linear program on two variables. Using the dual, or finding all vertices of the primal feasible region, it is easy to verify that one optimal solution is

$$\begin{aligned} \alpha = 1/2\quad \text {and}\quad \beta = \frac{n-2}{2(n-3)} \end{aligned}$$

for all \(n \ge 4\). This gives us an optimal value of \(n^2/4\), which rounded down coincides with the lower bound coming from complete bipartite graphs.
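The claimed optimum can be double-checked by enumerating the vertices of the feasible region in exact arithmetic. The sketch below rests on assumptions: the linear program is taken to consist of \(0 \le \alpha \le 1/2\) from (22) together with nonnegativity of the three eigenvalues of \(A = A_0 + \alpha A_1 + \beta A_2\), the objective coefficients are \(|V_n|^{-1}|R_1| = 2(n-2)\) and \(|V_n|^{-1}|R_2| = (n-2)(n-3)/2\), and the function name `mantel_lp_optimum` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def mantel_lp_optimum(n):
    """Maximise 1 + c1*a + c2*b over the feasible polygon by vertex enumeration (n >= 4)."""
    c1 = 2 * (n - 2)                      # |R_1| / |V_n|
    c2 = Fraction((n - 2) * (n - 3), 2)   # |R_2| / |V_n|
    # Each triple (p, q, r) encodes the constraint p*a + q*b + r >= 0.
    constraints = [
        (1, 0, 0),                    # a >= 0
        (-1, 0, Fraction(1, 2)),      # a <= 1/2, from (22)
        (-2, 1, 1),                   # eigenvalue of A on v_1
        (n - 4, -(n - 3), 1),         # eigenvalue of A on v_2
        (c1, c2, 1),                  # eigenvalue of A on v_3
    ]
    best = None
    for (p1, q1, r1), (p2, q2, r2) in combinations(constraints, 2):
        det = p1 * q2 - p2 * q1
        if det == 0:
            continue                  # parallel boundary lines never meet
        # Cramer's rule for the intersection of the two boundary lines.
        a = Fraction(r2 * q1 - r1 * q2) / det
        b = Fraction(p2 * r1 - p1 * r2) / det
        if all(p * a + q * b + r >= 0 for p, q, r in constraints):
            value = 1 + c1 * a + c2 * b
            if best is None or value > best[0]:
                best = (value, a, b)
    return best
```

The function returns a triple consisting of the optimal value and the maximizing \(\alpha \) and \(\beta \). Vertex enumeration is adequate here because the feasible region is a bounded polygon in two variables.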

7 Triangle-Avoiding Sets in the Hamming Cube

For an integer \(n \ge 1\), consider the Hamming cube \(\mathbb {H}^n = \{0,1\}^n\) equipped with the Hamming distance, which for x, \(y \in \mathbb {H}^n\) is denoted by d(xy) and equals the number of bits in which x and y differ. A classical problem in coding theory is to determine the parameter A(nd), which is the maximum size of a subset I of \(\mathbb {H}^n\) such that \(d(x, y) \ge d\) for all distinct x, \(y \in I\).

If we let G(n, d) be the graph with vertex set \(\mathbb {H}^n\) in which x, \(y \in \mathbb {H}^n\) are adjacent if \(d(x, y) < d\), then \(A(n, d) = \alpha (G(n, d))\). A simple variant of the Lovász theta number of G(n, d), obtained by requiring that F in (2) be nonnegative as well, then provides an upper bound for A(n, d), which is easy to compute given the abundant symmetry of G(n, d). This bound, known as the linear programming bound, was originally described by Delsarte [6]; its relation to the theta number was later discovered by McEliece, Rodemich, and Rumsey [20] and Schrijver [22].

We now consider a hypergraph analogue of this problem. Let \(s \ge 1\) be an integer. Three distinct points \(x_1\), \(x_2\), \(x_3 \in \mathbb {H}^n\) form an s-triangle if \(d(x_i, x_j) = s\) for all \(i \ne j\). It is easy to show that there is an s-triangle in \(\mathbb {H}^n\) if and only if s is even and \(0 < s \le \lfloor 2n/3\rfloor \).
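The existence criterion for s-triangles can be confirmed by brute force for small n. The sketch below relies on the fact that translations of the cube are isometries, so one vertex of a triangle may be placed at 0; the other two are then words of weight s at distance s from each other. Words are encoded as integers, and the function name `has_s_triangle` is hypothetical.

```python
from itertools import combinations


def has_s_triangle(n, s):
    """Return True if {0,1}^n contains three points at pairwise distance s.

    One point is fixed at 0 (translations are isometries), so it suffices
    to find two words of weight s that are at Hamming distance s.
    """
    weight_s_words = [w for w in range(2 ** n) if bin(w).count("1") == s]
    return any(
        bin(x ^ y).count("1") == s for x, y in combinations(weight_s_words, 2)
    )


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, [s for s in range(1, n + 1) if has_s_triangle(n, s)])
```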

We want to find the largest size of a set of points in \(\mathbb {H}^n\) that avoids s-triangles. More precisely, given integers n, \(s \ge 1\), we consider the hypergraph H(n, s) whose vertex set is \(\mathbb {H}^n\) and whose edges are all s-triangles and we want to find its independence number. The theta number \(\vartheta (H(n, s))\) defined in (3) gives us an upper bound.

To compute \(\vartheta (H(n, s))\), start by noting that \({{\,\textrm{Iso}\,}}(\mathbb {H}^n)\), the group of isometries of \(\mathbb {H}^n\), is a subgroup of the automorphism group of \(H = H(n, s)\) and, since it acts transitively on \(\mathbb {H}^n\), we can use Theorem 5.1 to simplify our problem. To do so we choose \(x_0 = 0\).

The vertex set of the link \(H_0\) of 0 is \(\mathbb {H}^n_s\), the set of all words of weight s, the weight of a word being the number of 1s in it; two words are adjacent in \(H_0\) if they are at distance s. The isometry group \({{\,\textrm{Iso}\,}}(\mathbb {H}^n_s)\) of \(\mathbb {H}^n_s\) is a subgroup of the automorphism group of \(H_0\).

If \(A:\mathbb {H}^n \times \mathbb {H}^n \rightarrow \mathbb {R}\) is an \({{\,\textrm{Iso}\,}}(\mathbb {H}^n)\)-invariant symmetric matrix, then A(x, y) depends only on d(x, y), and so \(a = A_0[V_0]\) is a constant function. We write A(t) for the value of A(x, y) when \(d(x, y) = t\).

By Theorem 5.2, we have \(a \in {{\,\textrm{TH}\,}}(H_0)\) if and only if \(a \ge 0\) and \(w^\textsf{T}a \le \vartheta (H_0, w)\) for every \({{\,\textrm{Iso}\,}}(\mathbb {H}^n_s)\)-invariant \(w \in \mathbb {R}_+^{V_0}\). Since \({{\,\textrm{Iso}\,}}(\mathbb {H}^n_s)\) acts transitively on \(\mathbb {H}^n_s\), every such invariant w is constant, and we conclude that \(A_0[V_0] \in {{\,\textrm{TH}\,}}(H_0)\) if and only if \(0 \le |\mathbb {H}^n_s| A(s) \le \vartheta (H_0)\).

The problem can be further simplified. A matrix \(A:\mathbb {H}^n \times \mathbb {H}^n \rightarrow \mathbb {R}\) is \({{\,\textrm{Iso}\,}}(\mathbb {H}^n)\)-invariant and positive semidefinite if and only if there are numbers \(a_0\), ..., \(a_n \ge 0\) such that

$$\begin{aligned} A(t) = \sum _{k=0}^n a_k K^n_k(t), \end{aligned}$$

where \(K^n_k\) is the Krawtchouk polynomial of degree k, normalized so \(K^n_k(0) = 1\). This polynomial can be defined on integers \(t \in \{0, \ldots ,n\}\) by the formula

$$\begin{aligned} K^n_k(t) = \left( {\begin{array}{c}n\\ k\end{array}}\right) ^{-1} \sum _{i=0}^k (-1)^i \left( {\begin{array}{c}t\\ i\end{array}}\right) \left( {\begin{array}{c}n-t\\ k-i\end{array}}\right) . \end{aligned}$$

If \(E_k(x, y) = K^n_k(d(x, y))\), then we have the orthogonality relations \(\langle E_k, E_l\rangle = 0\) for \(k \ne l\); see Dunkl [8].
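The normalization and the orthogonality relations can be checked numerically for small n. The sketch below evaluates \(K^n_k(t)\) from the formula above in exact arithmetic and computes \(\langle E_k, E_l\rangle \) by grouping the ordered pairs (x, y) according to their distance: there are \(2^n \left( {\begin{array}{c}n\\ t\end{array}}\right) \) pairs at distance t. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def krawtchouk(n, k, t):
    """Krawtchouk polynomial K^n_k(t), normalized so that K^n_k(0) = 1."""
    total = sum(
        (-1) ** i * comb(t, i) * comb(n - t, k - i) for i in range(k + 1)
    )
    return Fraction(total, comb(n, k))


def inner_product(n, k, l):
    """<E_k, E_l>, summing over ordered pairs grouped by Hamming distance."""
    return 2 ** n * sum(
        comb(n, t) * krawtchouk(n, k, t) * krawtchouk(n, l, t)
        for t in range(n + 1)
    )


if __name__ == "__main__":
    n = 6
    for k in range(n + 1):
        print([str(inner_product(n, k, l)) for l in range(n + 1)])
```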

With this characterization, and noting that \(E_0 = J\) is the all-ones matrix, we have

$$\begin{aligned} \langle J, A\rangle = \langle J, a_0 E_0 \rangle = |\mathbb {H}^n|^2 a_0 = 2^{2n} a_0. \end{aligned}$$

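Rewriting (20), we see that \(\vartheta (H(n,s))\) is the optimal value of the problem

$$\begin{aligned} \begin{array}{rl} \max &{} 2^n a_0\\ &{} \sum _{k=0}^n a_k = 1,\\ &{} \sum _{k=0}^n a_k K^n_k(s) \le |\mathbb {H}^n_s|^{-1} \vartheta (H_0),\\ &{} a_k \ge 0\quad \text {for }k = 0, \ldots , n. \end{array} \end{aligned}$$
(25)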

Here, we have omitted the constraint \(0 \le |\mathbb {H}_s^n| A(s)\), since it is automatically satisfied by the optimal solution.

Problem (25) has only two constraints, and so its optimal solution admits a simple expression. With \(M_K^n(s) = \min \{\, K^n_k(s): {k = 0, \dots ,~n}\,\}\) for \(s \ge 0\) we have:

Theorem 7.1

If \(n \ge 1\) is an integer and \(0 < s \le \lfloor 2n/3\rfloor \) is an even integer, then

$$\begin{aligned} \vartheta (H(n, s)) = 2^n\frac{M_K^n(s) - |\mathbb {H}^n_s|^{-1} \vartheta (H(n, s)_0)}{M_K^n(s) - 1}. \end{aligned}$$
(26)

Proof

Write \(H = H(n, s)\). By our choice of s, there are s-triangles in \(\mathbb {H}^n\), so \(H_0\) is a nonempty graph. Hence \(\vartheta (H_0) \le \chi (\overline{H_0}) < |\mathbb {H}^n_s|\), and so a feasible solution of (25) has to use some variable \(a_k\) for \(k > 0\).

To solve our problem we want to maximize \(a_0\) keeping the convex combination

$$\begin{aligned} \sum _{k=0}^n a_k K^n_k(s) \end{aligned}$$

below \(|\mathbb {H}^n_s|^{-1} \vartheta (H_0)\). We cannot achieve this by using only \(a_0\), so the best way to do it is to let \(k^*\) be such that \(K^n_{k^*}(s) = M_K^n(s)\) and use only the variables \(a_0\) and \(a_{k^*}\). This leads us to the system

$$\begin{aligned} a_0 + a_{k^*}&= 1,\\ a_0 + a_{k^*} M_K^n(s)&= |\mathbb {H}^n_s|^{-1} \vartheta (H_0), \end{aligned}$$

whose solution yields exactly (26). \(\square \)
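For completeness, the elimination behind the last step can be written out. This is a sketch under two assumptions: that the objective of the problem being solved is \(2^n a_0\), as suggested by \(\langle J, A\rangle = 2^{2n} a_0\) together with the normalization by the number of vertices, and that \(M_K^n(s) \ne 1\). Substituting \(a_{k^*} = 1 - a_0\) into the second equation gives

$$\begin{aligned} a_0 \bigl (1 - M_K^n(s)\bigr ) = |\mathbb {H}^n_s|^{-1} \vartheta (H_0) - M_K^n(s), \qquad \text {so}\qquad 2^n a_0 = 2^n \frac{M_K^n(s) - |\mathbb {H}^n_s|^{-1} \vartheta (H_0)}{M_K^n(s) - 1}. \end{aligned}$$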

To compute \(\vartheta (H_0)\) we again use symmetry. Let \(A:\mathbb {H}^n_s \times \mathbb {H}^n_s \rightarrow \mathbb {R}\) be a matrix. If A is \({{\,\textrm{Iso}\,}}(\mathbb {H}^n_s)\)-invariant, then A(x, y) depends only on d(x, y), and so we write A(t) for the value of A(x, y) when \(d(x, y) = t\). The matrix A is \({{\,\textrm{Iso}\,}}(\mathbb {H}^n_s)\)-invariant and positive semidefinite if and only if there are numbers \(a_0\), ..., \(a_s \ge 0\) such that

$$\begin{aligned} A(t) = \sum _{k=0}^s a_k Q^{n,s}_k(t/2) \end{aligned}$$

(note that Hamming distances in \(\mathbb {H}^n_s\) are always even), where \(Q^{n,s}_k\) is the Hahn polynomial of degree k, normalized so \(Q_k^{n,s}(0) = 1\). For an integer \(0 \le t \le s\), these polynomials are given by the formula

$$\begin{aligned} Q^{n,s}_k(t) = \sum _{i=0}^k (-1)^i \left( {\begin{array}{c}s\\ i\end{array}}\right) ^{-1} \left( {\begin{array}{c}n-s\\ i\end{array}}\right) ^{-1} \left( {\begin{array}{c}k\\ i\end{array}}\right) \left( {\begin{array}{c}n+1-k\\ i\end{array}}\right) \left( {\begin{array}{c}t\\ i\end{array}}\right) . \end{aligned}$$

If \(E_k(x, y) = Q^{n,s}_k(d(x, y) / 2)\), then \(\langle E_k, E_l\rangle = 0\) whenever \(k \ne l\) (see Delsarte [7], in particular Theorem 5, and Dunkl [9]).

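With this characterization, \(\langle J, A\rangle = |\mathbb {H}^n_s|^2 a_0\) since \(E_0 = J\). Rewriting (20), we see that \(\vartheta (H_0)\) is the optimal value of the problem

$$\begin{aligned} \begin{array}{rl} \max &{} |\mathbb {H}^n_s| a_0\\ &{} \sum _{k=0}^s a_k = 1,\\ &{} \sum _{k=0}^s a_k Q^{n,s}_k(s/2) = 0,\\ &{} a_k \ge 0\quad \text {for }k = 0, \ldots , s. \end{array} \end{aligned}$$
(27)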

Writing \(M^n_Q(s) = \min \{\, Q^{n,s}_k(s/2): {k=0, \dots ,~s}\,\}\), we have the analogue of Theorem 7.1.

Theorem 7.2

If \(n \ge 1\) is an integer and \(0 < s \le \lfloor 2n/3\rfloor \) is an even integer, then

$$\begin{aligned} \vartheta (H(n, s)_0) = |\mathbb {H}^n_s| \frac{M_Q^n(s)}{M_Q^n(s) - 1}. \end{aligned}$$

Proof

Adapt the proof of Theorem 7.1. \(\square \)

The upshot is that \(\vartheta (H(n,s))\) may be expressed entirely in terms of the parameters

$$\begin{aligned} M_K^n(s)&= \min \{\, K^n_k(s) : k = 0, \dots ,n \,\}\quad \text {and} \end{aligned}$$
(28)
$$\begin{aligned} M^n_Q(s)&= \min \{\, Q^{n,s}_k(s/2) : k=0, \dots ,s \,\}. \end{aligned}$$
(29)
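To illustrate, the two formulas can be evaluated exactly for small parameters. The sketch below computes (28) and (29) from the explicit formulas for the Krawtchouk and Hahn polynomials and then applies Theorems 7.2 and 7.1 in turn. One assumption is made and should be read as such: the minimum in (29) is taken only over \(k \le \min \{s, n-s\}\), so that the binomial coefficients in the denominators of the Hahn formula never vanish when \(s > n/2\). All function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def krawtchouk(n, k, t):
    """Krawtchouk polynomial K^n_k(t), normalized so that K^n_k(0) = 1."""
    total = sum(
        (-1) ** i * comb(t, i) * comb(n - t, k - i) for i in range(k + 1)
    )
    return Fraction(total, comb(n, k))


def hahn(n, s, k, t):
    """Hahn polynomial Q^{n,s}_k(t), normalized so that Q^{n,s}_k(0) = 1."""
    return sum(
        Fraction(
            (-1) ** i * comb(k, i) * comb(n + 1 - k, i) * comb(t, i),
            comb(s, i) * comb(n - s, i),
        )
        for i in range(k + 1)
    )


def theta_link(n, s):
    """Theta number of the link H(n, s)_0, following Theorem 7.2."""
    # Assumption: restrict to k <= min(s, n - s) to avoid zero denominators.
    m_q = min(hahn(n, s, k, s // 2) for k in range(min(s, n - s) + 1))
    return comb(n, s) * m_q / (m_q - 1)


def theta(n, s):
    """Theta number of H(n, s), following Theorem 7.1."""
    m_k = min(krawtchouk(n, k, s) for k in range(n + 1))
    return 2 ** n * (m_k - theta_link(n, s) / comb(n, s)) / (m_k - 1)


if __name__ == "__main__":
    for n, s in [(3, 2), (4, 2), (6, 2), (6, 4)]:
        print(n, s, theta_link(n, s), theta(n, s))
```

For \(n = 3\) and \(s = 2\), the link is a triangle and the sketch returns 1 for its theta number and 4 for \(\vartheta (H(3, 2))\). The latter equals the independence number in this case, since any three words of equal parity in \(\mathbb {H}^3\) form a 2-triangle.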

Very similar expressions can be derived for the theta number in the more general setting of q-ary cubes \(\{0, \dots , q-1\}^n\) for any integer \(q \ge 2\); in this case we must use Krawtchouk polynomials with weight \((q-1)/q\) (see Dunkl [8]) and q-ary Hahn polynomials [7].

The theta number for hypergraphs can also be extended to some well-behaved infinite hypergraphs, and can be used in particular to provide upper bounds for the density of simplex-avoiding sets on the sphere and in Euclidean space [3]. For triangle-avoiding sets on the sphere \(S^{n-1}\), for instance, the bound obtained is like the one in Theorems 7.1 and 7.2, with both the Krawtchouk and Hahn polynomials replaced by Gegenbauer (ultraspherical) polynomials \(P^n_k\) (resp. \(P^{n-1}_k\)), which are the orthogonal polynomials on the interval \([-1, 1]\) for the weight function \((1-x^2)^{(n-3)/2}\). In this setting, the link of a vertex \(x \in S^{n-1}\) is a scaled copy of \(S^{n-2}\).

This bound can be analyzed asymptotically, yielding an upper bound for the density of simplex-avoiding sets that decays exponentially fast in the dimension of the underlying space. The key point in the analysis is to show exponential decay of the parameter \(M_P^n(t) = \min \{\, P^n_k(t): k \ge 0 \,\}\) for \(t \in (0,1)\). This is done in two steps. First, one uses results on the asymptotic behavior of the roots of Gegenbauer polynomials to show that \(\min \{\, P^n_k(t): k \ge 0 \,\}\) is attained at \(k = \Omega (n)\). Then, one shows that \(|P^n_k(t)|\) tends to 0 exponentially fast if \(k = \Omega (n)\) by exploiting a particular integral representation for the Gegenbauer polynomials [3, Lemma 4.2].

Fig. 2

The plot shows, for every \(n = 20\), ..., 150 on the horizontal axis, the value of \(\ln (\vartheta (H(n, s)) / 2^n)\) on the vertical axis, where s is the even integer closest to n/2 (in green), n/3 (in red), and n/4 (in blue)

The same can be attempted for the Hamming cube: how does the density of a subset of \(\mathbb {H}^n\) that avoids s-triangles behave as n goes to infinity? For a fixed s, the answer is simple, since \(|\mathbb {H}^n_s|\) is exponentially smaller than \(|\mathbb {H}^n|\). We should therefore consider a regime where s and n both tend to infinity; for instance, we could take \(s = s(n, c)\) to be the even integer closest to n/c for some \(c > 1\). Numerical evidence (see Fig. 2) supports the following conjecture.

Conjecture 7.3

With s(n, c) defined as above, \(\vartheta (H(n, s(n, c))) / 2^n\) decays exponentially fast with n for every fixed \(c > 2\), whereas \(\vartheta (H(n, s(n, 2))) / 2^n\) decays linearly fast with n.

We leave open the question of whether this conjecture, for \(c > 2\), can be proven using Theorems 7.1 and 7.2. Following the strategy of Castro-Silva, Oliveira, Slot, and Vallentin [3], it is possible to show that the minima in (28) and (29) are attained at \(k = \Omega (n)\), using results on the roots of Krawtchouk and Hahn polynomials. For \(c = 2\), it appears that the minimum in (28) is always attained at \(k = 2\) when n is a multiple of 4, implying in this case that \(M_K^n(n/2) = K_2^n(n/2) = -1/(n-1)\). The remaining obstacle to finishing the analysis of the asymptotic behavior of \(M_K^n(s)\) and \(M_Q^n(s)\) is the lack of a suitable integral representation for the Krawtchouk and Hahn polynomials, as was available for the Gegenbauer polynomials.
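The observation for \(c = 2\) is easy to test numerically. The sketch below computes all values \(K^n_0(t)\), ..., \(K^n_n(t)\) at once, using the fact that the sum in the definition of \(K^n_k(t)\) is the coefficient of \(z^k\) in \((1-z)^t (1+z)^{n-t}\), and then checks where the minimum lies for \(t = n/2\). The function name `krawtchouk_row` is hypothetical, and the check covers only the finitely many values of n that are tried.

```python
from fractions import Fraction
from math import comb


def krawtchouk_row(n, t):
    """Return [K^n_0(t), ..., K^n_n(t)] via (1 - z)^t (1 + z)^(n - t)."""
    coeffs = [1]
    for sign in [-1] * t + [1] * (n - t):
        # Multiply the current polynomial by (1 + sign * z).
        coeffs = [a + sign * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return [Fraction(c, comb(n, k)) for k, c in enumerate(coeffs)]


if __name__ == "__main__":
    for n in range(4, 41, 4):
        row = krawtchouk_row(n, n // 2)
        print(n, min(row), row.index(min(row)))
```

For each multiple of 4 tried here, the minimum equals \(-1/(n-1)\) and its first occurrence is at \(k = 2\).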