1 Introduction

Full row-rank integer matrices with minors bounded in absolute value by a given constant \(\varDelta \) have been extensively studied in integer linear programming as well as in matroid theory. The interest from optimization was initiated by the paper of Artmann, Weismantel and Zenklusen [2], who showed that integer linear programs with a bimodular constraint matrix, meaning that all its maximal-size minors are bounded by two in absolute value, can be solved in strongly polynomial time. With the goal of generalizing the results of Artmann et al. beyond the bimodular case, Nägele, Santiago and Zenklusen [12] studied feasibility and proximity questions for a subclass of integer programs with bounded subdeterminants. Fiorini et al. [6] obtained a strongly polynomial-time algorithm for integer linear programs whose defining coefficient matrix has the property that all its subdeterminants are bounded by a constant and all of its rows contain at most two nonzero entries. For more information on developments regarding this topic, we refer to the three contributions cited above and the references therein.

For a matrix \(A \in \mathbb {R}^{m \times n}\) and for \(1 \le k \le \min \{m,n\}\), we write

$$\begin{aligned} \varDelta _k(A) := \max \{ |\det (B)| : B \text { is a } k \times k \text { submatrix of } A \} \end{aligned}$$

for the maximal absolute value of a \(k \times k\) minor of A. Given an integer \(\varDelta \in \mathbb {Z}_{>0}\), a matrix \(A \in \mathbb {R}^{m \times n}\) of rank m is said to be \(\varDelta \)-modular and \(\varDelta \)-submodular, if \(\varDelta _m(A) = \varDelta \) and \(\varDelta _m(A) \le \varDelta \), respectively. Moreover, a matrix \(A \in \mathbb {R}^{m \times n}\) is said to be totally \(\varDelta \)-modular and totally \(\varDelta \)-submodular, if \(\max _{k \in [m]} \varDelta _k(A) = \varDelta \) and \(\max _{k \in [m]} \varDelta _k(A) \le \varDelta \), respectively, where \([m] := \{1,2,\ldots ,m\}\).
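As an illustration of these definitions, the following Python sketch (ours, not part of the original development) computes \(\varDelta _k(A)\) by brute-force enumeration of all \(k \times k\) minors; the example matrix is an arbitrary choice.

```python
from itertools import combinations, permutations

def det(M):
    # determinant via the Leibniz formula; fine for the tiny matrices here
    n = len(M)
    total = 0
    for perm in permutations(range(n)):
        sign = 1
        for i in range(n):
            for j in range(i + 1, n):
                if perm[i] > perm[j]:
                    sign = -sign
        prod = 1
        for i in range(n):
            prod *= M[i][perm[i]]
        total += sign * prod
    return total

def delta_k(A, k):
    # maximum absolute value of a k x k minor of A
    m, n = len(A), len(A[0])
    return max(abs(det([[A[i][j] for j in cols] for i in rows]))
               for rows in combinations(range(m), k)
               for cols in combinations(range(n), k))

A = [[1, 0, 1],     # columns e1, e2, e1 + 2*e2
     [0, 1, 2]]
assert delta_k(A, 1) == 2
assert delta_k(A, 2) == 2   # so this A is 2-modular (and 2-submodular)
```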

Our object of study is the generalized Heller constant, which we define as

$$\begin{aligned} {{\,\textrm{h}\,}}(\varDelta ,m) := \max \bigl \{ n \in \mathbb {Z}_{>0} : \ &A \in \mathbb {Z}^{m \times n} \text { has pairwise distinct columns} \\ &\text {and } \varDelta _m(A) = \varDelta \bigr \}. \end{aligned}$$

The value \({{\,\textrm{h}\,}}(\varDelta ,m)\) is directly related to the value \(\mathfrak {c}(\varDelta ,m)\) studied in [8, 11] and defined as the maximum number n of columns in a \(\varDelta \)-submodular integer matrix A with m rows with the properties that A has no zero columns and for any two columns \(A_i\) and \(A_j\) with \(1 \le i <j \le n\) one has \(A_i \ne A_j\) and \(A_i \ne - A_j\). It is clear that

$$\begin{aligned} \mathfrak {c}(\varDelta ,m) = \frac{1}{2} \bigl ( \max \{{{\,\textrm{h}\,}}(1,m),\ldots , {{\,\textrm{h}\,}}(\varDelta ,m)\} -1 \bigr ), \end{aligned}$$

showing that \(\mathfrak {c}(\varDelta ,m)\) and \({{\,\textrm{h}\,}}(\varDelta ,m)\) are “equivalent” in many respects. However, our proofs are more naturally phrased in terms of \({{\,\textrm{h}\,}}(\varDelta ,m)\) rather than \(\mathfrak {c}(\varDelta ,m)\), as we prefer to prescribe \(\varDelta _m(A)\) rather than providing an upper bound on \(\varDelta _m(A)\), and we do not want to eliminate the potential symmetries within A coming from taking columns \(A_i\) and \(A_j\) that satisfy \(A_i = -A_j\).

Upper bounds on the number of columns in (totally) \(\varDelta \)-(sub)modular integer matrices with m rows have been gradually improved over time. In the case \(\varDelta = 1\), we are concerned with the notion of (totally) unimodular integer matrices. The maximal number of pairwise distinct columns in a (totally) unimodular integer matrix with m rows has been shown by Heller [9] to be equal to \({{\,\textrm{h}\,}}(1,m) = m^2 + m + 1\). Lee [10, Sect. 10] initiated the study of the maximal number of columns beyond unimodular matrices. In 1989, he proved a bound of \(\mathcal {O}(r^{2\varDelta })\) for matrices of row-rank r that have what he calls a \(\varDelta \)-regular row-space. This class of matrices includes the totally \(\varDelta \)-submodular integer matrices of row-rank r. Glanzer, Weismantel and Zenklusen [8] revived the story by extending the investigation to \(\varDelta \)-submodular integer matrices and obtaining a polynomial bound in the parameter m. More precisely, they showed that for each \(\varDelta \ge 2\), \({{\,\textrm{h}\,}}(\varDelta ,m)\) is in \(\mathcal {O}(\varDelta ^{2 + \log _2 \log _2 \varDelta } \cdot m^2)\). This result has recently been improved by Lee, Paat, Stallknecht and Xu [11, Theorem 2 and Propositions 1 and 2], who obtained the exact value

$$\begin{aligned} {{\,\textrm{h}\,}}(\varDelta ,m) = m^2 + m + 1 + 2m(\varDelta - 1) \qquad \text {if} \qquad \varDelta \le 2 \ \text { or }\ m \le 2, \end{aligned}$$
(1)

and, for every \(\varDelta ,m \in \mathbb {Z}_{\ge 3}\), proved the estimates

$$\begin{aligned} m^2 + m + 1 + 2m(\varDelta - 1) \le {{\,\textrm{h}\,}}(\varDelta ,m) \le \left( m^2 + m\right) \varDelta ^2 + 1. \end{aligned}$$
(2)

Bounds on \({{\,\textrm{h}\,}}(\varDelta ,m)\) can also be derived using the machinery of matroid theory. In their recent work, Geelen, Nelson and Walsh [7, Proposition 8.6.1] rely on the fact that the class of matroids representable by integer \(\varDelta \)-submodular matrices is minor-closed and that the line on \(2 \varDelta +2\) points (that is, the uniform matroid of rank two with \(2 \varDelta +2\) elements) is an excluded minor for \(\varDelta \)-submodular representability. This shows that \({{\,\textrm{h}\,}}(\varDelta ,m)\) can be bounded by providing a bound, for given positive integers t and m, on the size of a simple matroid of rank m that is representable over the real numbers and has no \((t+2)\)-point line as a minor. Employing this approach, in [7, Theorem 2.2.4] the bound \({{\,\textrm{h}\,}}(\varDelta ,m) \le m^2 + f(\varDelta ) m\) is derived with \(f(\varDelta )\) being at least doubly exponential in \(\varDelta \) (see the comment in [11, p. 3]).

The best known upper bounds on \({{\,\textrm{h}\,}}(\varDelta ,m)\) to date have the form of a quadratic polynomial \(a(\varDelta ) m^2 + b(\varDelta ) m + c\) in m, with the coefficients of \(m^2\) and m possibly depending on \(\varDelta \), and the constant term \(c \in \mathbb {R}\) being independent of \(\varDelta \). These bounds are incomparable: as \(\varDelta \rightarrow \infty \), some of them have a large \(a(\varDelta )\) but a small \(b(\varDelta )\), while for others it is the other way around.

The lower bound \({{\,\textrm{h}\,}}(\varDelta ,m) \ge m^2 + m + 1 + 2m (\varDelta - 1)\) in (2) is obtained from the \(\varDelta \)-modular integer matrix with m rows and whose columns are the differences of any two vectors in the set

$$\begin{aligned} \left\{ \textbf{0},e_1,e_2,\ldots ,e_m \right\} \cup \left\{ 2e_1,3e_1,\ldots ,\varDelta e_1\right\} , \end{aligned}$$

where \(e_i\) denotes the ith coordinate unit vector. This is a natural generalization of the unimodular matrix (\(\varDelta = 1\)) that attains Heller’s result \({{\,\textrm{h}\,}}(1,m) = m^2+m+1\). With this perspective and their precise result (1), for \(\varDelta \le 2\) or \(m \le 2\), Lee et al. [11] conjecture that the lower bound in (2) is actually the correct value of \({{\,\textrm{h}\,}}(\varDelta ,m)\), for any choice of \(\varDelta ,m \in \mathbb {Z}_{>0}\).

Conjecture 1

(Lee et al. [11]) For every \(\varDelta ,m \in \mathbb {Z}_{>0}\), it holds that

$$\begin{aligned} {{\,\textrm{h}\,}}(\varDelta ,m) = m^2 + m + 1 + 2m (\varDelta - 1). \end{aligned}$$
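The column count of the lower-bound construction above can be verified by a short brute-force computation (a sketch of ours): the distinct pairwise differences of \(\{\textbf{0},e_1,\ldots ,e_m\} \cup \{2e_1,\ldots ,\varDelta e_1\}\) number exactly \(m^2 + m + 1 + 2m(\varDelta - 1)\).

```python
def num_difference_columns(delta, m):
    # the point set {0, e1, ..., em} U {2*e1, ..., delta*e1}
    pts = {tuple(0 for _ in range(m))}
    for i in range(m):
        pts.add(tuple(1 if j == i else 0 for j in range(m)))
    for c in range(2, delta + 1):
        pts.add(tuple(c if j == 0 else 0 for j in range(m)))
    # all pairwise differences (these are the columns of the matrix)
    diffs = {tuple(u[j] - v[j] for j in range(m)) for u in pts for v in pts}
    return len(diffs)

# matches m^2 + m + 1 + 2m(delta - 1) for all small parameters
for delta in range(1, 7):
    for m in range(1, 7):
        assert num_difference_columns(delta, m) == m*m + m + 1 + 2*m*(delta - 1)
```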

On the qualitative side, Conjecture 1 implies that \({{\,\textrm{h}\,}}(\varDelta ,m) \le a(\varDelta ) m^2 + b(\varDelta ) m+c\) holds with \(a(\varDelta ) \in \mathcal {O}(1)\) and \(b(\varDelta ) \in \mathcal {O}(\varDelta )\). If this were true, we would also have \({{\,\textrm{h}\,}}(\varDelta ,m) \le \mathcal {O}(\varDelta ) m^2\), but even this estimate has not yet been confirmed, since the currently available bounds are asymptotically too large for \(\varDelta \rightarrow \infty \). The authors of [11, p. 24] ask if there exists a bound of the form \(\mathcal {O}(m^d) \varDelta \), for some constant \(d \in \mathbb {Z}_{>0}\). As our main result, we answer this question in the affirmative by establishing a bound of \(\mathcal {O}(m^4) \varDelta \).

Theorem 1

Let \(\varDelta ,m \in \mathbb {Z}_{>0}\).

  1. (i)

    If \(m \ge 5\), then

    $$\begin{aligned} {{\,\textrm{h}\,}}(\varDelta ,m) \le m^2 + m + 1 + 2\,(\varDelta - 1) \cdot \sum _{i=0}^4 \left( {\begin{array}{c}m\\ i\end{array}}\right) \in \mathcal {O}\left( m^4\right) \cdot \varDelta . \end{aligned}$$
  2. (ii)

    If \(m \ge 4\) and \(\varDelta \) is odd, then

    $$\begin{aligned} {{\,\textrm{h}\,}}(\varDelta ,m) \le m^2 + m + 1 + 2\,(\varDelta - 1) \cdot \sum _{i=0}^3 \left( {\begin{array}{c}m\\ i\end{array}}\right) \in \mathcal {O}\left( m^3\right) \cdot \varDelta . \end{aligned}$$

It remains an open question whether our bound can be improved, for all \(\varDelta \), to a bound of \(\mathcal {O}(m^d) \varDelta \) for some exponent \(d < 4\).

Based on computational experiments for small values of m and \(\varDelta \), we found counterexamples to Conjecture 1 for \(\varDelta \in \{4,8,16\}\) and sufficiently large m.

Theorem 2

We have

$$\begin{aligned} {{\,\textrm{h}\,}}(4,m)&\ge m^2 + 9m - 3 \quad \quad \text { for } \quad m \ge 3, \\ {{\,\textrm{h}\,}}(8,m)&\ge m^2 + 19m - 11 \quad \quad \text { for } \quad m \ge 4, \quad \text { and} \\ {{\,\textrm{h}\,}}(16,m)&\ge m^2 + 33m - 17 \quad \quad \text { for } \quad m \ge 10. \end{aligned}$$

These lower bounds exceed the conjecture for \({{\,\textrm{h}\,}}(4,m),{{\,\textrm{h}\,}}(8,m)\) and \({{\,\textrm{h}\,}}(16,m)\) by Lee et al. by the additive terms \(2(m-2)\), \(4(m-3)\), and \(2(m-9)\), respectively. This means that the qualitative side of Conjecture 1 as described above still stands.
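The quoted additive gaps follow by elementary arithmetic; the following sketch (ours) compares the lower bounds of Theorem 2 with the conjectured value \(m^2 + m + 1 + 2m(\varDelta - 1)\).

```python
def conjectured(delta, m):
    # the value of h(delta, m) predicted by Conjecture 1
    return m*m + m + 1 + 2*m*(delta - 1)

# Theorem 2 exceeds the conjectured value by 2(m-2), 4(m-3), 2(m-9)
for m in range(3, 50):
    assert (m*m + 9*m - 3) - conjectured(4, m) == 2*(m - 2)
for m in range(4, 50):
    assert (m*m + 19*m - 11) - conjectured(8, m) == 4*(m - 3)
for m in range(10, 50):
    assert (m*m + 33*m - 17) - conjectured(16, m) == 2*(m - 9)
```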

Organization of the paper In Sect. 2, we describe our geometric idea that explains the linearity in \(\varDelta \) of the bounds in Theorem 1, and we introduce two variants of the Heller constant \({{\,\textrm{h}\,}}(1,m)\) which we polynomially bound in Sects. 3 and 4. In Sect. 5, we describe our approach to computing the generalized Heller constant \({{\,\textrm{h}\,}}(\varDelta ,m)\) for small parameters \(\varDelta ,m\). We also discuss the results of our computer experiments, and identify counterexamples to Conjecture 1 whose structure leads us to construct the lower bounds in Theorem 2. Finally, in Sect. 6, we pose some natural open problems that result from our investigations.

2 Counting by residue classes

Our main idea is to count the columns of a \(\varDelta \)-modular integer matrix by residue classes of a certain lattice. This is the geometric explanation for the linearity in \(\varDelta \) of our upper bounds in Theorem 1.

To be able to count in the non-trivial residue classes, we need to extend the Heller constant \({{\,\textrm{h}\,}}(1,m)\) to a shifted setting. Given a translation vector \(t \in \mathbb {R}^m\) and a matrix \(A \in \mathbb {R}^{m \times n}\), the shifted matrix \(t + A := t \textbf{1}^\intercal + A\) has columns \(t + A_i\), where \(A_1,\ldots ,A_n\) are the columns of A, and \(\textbf{1}\) denotes the all-one vector.

Definition 1

(Shifted Heller constants) Let \(m \in \mathbb {Z}_{>0}\) and \(\delta \in \mathbb {Z}_{\ge 2}\).

  1. (i)

    We define the shifted Heller constant \({{\,\mathrm{h_s}\,}}(m)\) as the maximal number n such that there exists a translation vector \(t \in [0,1)^m \setminus \{\textbf{0}\}\) and a matrix \(A \in \{-1,0,1\}^{m \times n}\) with pairwise distinct columns such that \(t+A\) is totally 1-submodular, that is, \(\max _{k \in [m]} \varDelta _k(t+A) \le 1\).

  2. (ii)

    We define the refined shifted Heller constant \({\text {h}_\text {s}}^{\delta }{(m)}\) as the maximal number n such that there exists a vector \(t \in [0,1)^m \cap (\tfrac{1}{\delta }\mathbb {Z})^m \setminus \{\textbf{0}\}\) and a matrix \(A \in \{-1,0,1\}^{m \times n}\) with pairwise distinct columns such that \(t+A\) is totally 1-submodular, that is, \(\max _{k \in [m]} \varDelta _k(t+A) \le 1\).

Note that, in contrast to the generalized Heller constant \({{\,\textrm{h}\,}}(\varDelta ,m)\), we do not necessarily require \(t+A\) to have full rank in the above definition, but we restrict A to have entries in \(\{-1,0,1\}\) only. Compared with \({{\,\mathrm{h_s}\,}}(m)\), in the definition of \({\text {h}_\text {s}}^{\delta }{(m)}\), we only allow the translation vectors t to have rational coordinates all of whose denominators are divisors of \(\delta \). Hence, we clearly have \({\text {h}_\text {s}}^{\delta }{(m)} \le {{\,\mathrm{h_s}\,}}(m)\), for any \(\delta \ge 2\).

The reason for restricting the non-zero translation vectors to the half-open unit cube \([0,1)^m\) becomes apparent in the proof of our main lemma. However, we need to prepare for it with an observation on the representation of integer points modulo a sublattice of \(\mathbb {Z}^m\).

Proposition 1

Let \(\varLambda \subseteq \mathbb {Z}^m\) be a full-dimensional sublattice of \(\mathbb {Z}^m\) with basis \(b_1,\ldots ,b_m \in \varLambda \) and index \(\varDelta = |\det (b_1,\ldots ,b_m)|\). Then, for every \(x \in \mathbb {Z}^m\), the uniquely determined coefficients \(\alpha _1,\ldots ,\alpha _m\) in the representation

$$\begin{aligned} x = \alpha _1 b_1 + \ldots + \alpha _m b_m \end{aligned}$$

satisfy \(\alpha _i \in \tfrac{1}{\varDelta } \mathbb {Z}\), for every \(1 \le i \le m\).

Proof

By definition, the finite group \(\mathbb {Z}^m / \varLambda \) has order \(\varDelta \). Hence, the order of any of its elements \(x + \varLambda \), where \(x \in \mathbb {Z}^m\), divides \(\varDelta \). \(\square \)
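The following small computation (ours) illustrates Proposition 1 for a concrete sublattice of \(\mathbb {Z}^2\) of index \(\varDelta = 3\); the basis vectors are an arbitrary choice.

```python
from fractions import Fraction
from itertools import product

b1, b2 = (1, 1), (-1, 2)                 # our example basis
d = b1[0]*b2[1] - b2[0]*b1[1]            # det of the basis matrix, here 3
delta = abs(d)                            # index of the sublattice in Z^2

for x in product(range(-3, 4), repeat=2):
    # Cramer's rule gives the exact coefficients in x = a1*b1 + a2*b2
    a1 = Fraction(x[0]*b2[1] - b2[0]*x[1], d)
    a2 = Fraction(b1[0]*x[1] - x[0]*b1[1], d)
    assert a1*b1[0] + a2*b2[0] == x[0] and a1*b1[1] + a2*b2[1] == x[1]
    # Proposition 1: both coefficients lie in (1/delta)*Z
    assert (a1 * delta).denominator == 1 and (a2 * delta).denominator == 1
```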

Lemma 1

For every \(\varDelta , m \in \mathbb {Z}_{>0}\), we have

$$\begin{aligned} {{\,\textrm{h}\,}}(\varDelta ,m) \le {{\,\textrm{h}\,}}(1,m) + (\varDelta -1) \cdot {\text {h}_\text {s}}^{\varDelta }{(m)}. \end{aligned}$$

In particular, \({{\,\textrm{h}\,}}(\varDelta ,m) \le {{\,\textrm{h}\,}}(1,m) + (\varDelta -1) \cdot {{\,\mathrm{h_s}\,}}(m)\).

Proof

Let \(A \in \mathbb {Z}^{m \times n}\) be a matrix with \(\varDelta _m(A) = \varDelta \) and pairwise distinct columns, and let \(X_A \subseteq \mathbb {Z}^m\) be the set of columns of A. Further, let \(b_1,\ldots ,b_m \in X_A\) be such that \(|\det (b_1,\ldots ,b_m)| = \varDelta \) and consider the parallelepiped

$$\begin{aligned} P_A := [-b_1,b_1] + \ldots + [-b_m,b_m] = \left\{ \sum _{i=1}^m \alpha _i b_i : -1 \le \alpha _i \le 1, \forall i \in [m]\right\} . \end{aligned}$$

Observe that \(X_A \subseteq P_A\). Indeed, assume to the contrary that there is an \(x = \sum _{i=1}^m \alpha _i b_i \in X_A\), with, say \(|\alpha _j| > 1\). Then,

$$\begin{aligned} |\det (b_1,\ldots ,b_{j-1},x,b_{j+1},\ldots ,b_m)| = |\alpha _j| \varDelta > \varDelta , \end{aligned}$$

which contradicts that A was chosen to be \(\varDelta \)-modular.

Now, consider the sublattice \(\varLambda := \mathbb {Z}b_1 + \ldots + \mathbb {Z}b_m\) of \(\mathbb {Z}^m\), whose index in \(\mathbb {Z}^m\) equals \(\varDelta \). We seek to bound the number of elements of \(X_A\) that fall into a fixed residue class of \(\mathbb {Z}^m\) modulo \(\varLambda \). To this end, let \(x \in \mathbb {Z}^m\) and consider the residue class \(x + \varLambda \). Every element \(z \in (x + \varLambda ) \cap P_A\) is of the form \(z = \sum _{i=1}^m \alpha _i b_i\), for some uniquely determined \(\alpha _1,\ldots ,\alpha _m \in [-1,1]\) which, in view of Proposition 1, satisfy \(\alpha _i \in \tfrac{1}{\varDelta } \mathbb {Z}\), for every \(1 \le i \le m\). Moreover, z can be written as

$$\begin{aligned} z = \sum _{i=1}^m \lfloor \alpha _i \rfloor b_i + \sum _{i=1}^m \{\alpha _i\} b_i, \end{aligned}$$
(3)

where \(\{\alpha _i\} = \alpha _i - \lfloor \alpha _i \rfloor \in \left\{ 0,\tfrac{1}{\varDelta },\ldots ,\tfrac{\varDelta -1}{\varDelta }\right\} \) is the fractional part of \(\alpha _i\), and where \(\bar{x} := \sum _{i=1}^m \{\alpha _i\} b_i\) is the unique representative of \(x + \varLambda \) in the half-open parallelepiped \([\textbf{0},b_1) + \ldots + [\textbf{0},b_m)\), and in particular, is independent of z. We use the notation \(\lfloor z \rfloor := (\lfloor \alpha _1 \rfloor , \ldots , \lfloor \alpha _m \rfloor ) \in \{-1,0,1\}^m\) and \(\{ z \} := (\{\alpha _1\},\ldots ,\{\alpha _m\}) \in [0,1)^m\) and thus have \(z = B (\lfloor z \rfloor + \{ z \})\), where \(B = (b_1,\ldots ,b_m) \in \mathbb {Z}^{m \times m}\).

Because the vectors in \((x + \varLambda ) \cap X_A\) constitute a \(\varDelta \)-submodular system and because \(|\det (b_1,\ldots ,b_m)| = \varDelta \), the set of vectors \(\{ \lfloor z \rfloor + \{ z \} : z \in (x + \varLambda ) \cap X_A \}\) constitutes a 1-submodular system. For the trivial residue class \(\varLambda \), this system is given by \(\{ \lfloor z \rfloor : z \in \varLambda \cap X_A \} \subseteq \{-1,0,1\}^m\) and moreover has full rank, as it contains \(e_1,\ldots ,e_m\); we are thus in the setting of the classical Heller constant \({{\,\textrm{h}\,}}(1,m)\).

For the \(\varDelta - 1\) non-trivial residue classes \(x + \varLambda \), \(x \notin \varLambda \), we are in the setting of the refined shifted Heller constant \({\text {h}_\text {s}}^{\varDelta }{(m)}\). Indeed, as the matrix with columns \(\{b_1,\ldots ,b_m\} \cup \left( (x + \varLambda )\cap X_A\right) \subseteq X_A\) is \(\varDelta \)-submodular, the matrix with columns

$$\begin{aligned} \{e_1,\ldots ,e_m\} \cup \left\{ \lfloor z \rfloor + \{ z \} : z \in (x + \varLambda ) \cap X_A \right\} \end{aligned}$$

has all its minors, of any size, bounded by 1 in absolute value. By the definition of \({\text {h}_\text {s}}^{\varDelta }{(m)}\), the second set in this union has at most \({\text {h}_\text {s}}^{\varDelta }{(m)}\) elements. As a consequence, we get \(n = |X_A| \le {{\,\textrm{h}\,}}(1,m) + (\varDelta -1) \cdot {\text {h}_\text {s}}^{\varDelta }{(m)}\), as desired. \(\square \)
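The counting scheme of the proof can be made concrete in a small case (a sketch of ours): for \(m = 2\) and \(\varDelta = 2\), we take the 11-column 2-modular matrix arising from the lower bound construction, choose the basis columns \(b_1 = (2,0)^\intercal \), \(b_2 = (0,1)^\intercal \) with \(|\det | = 2\), and sort the columns into the \(\varDelta = 2\) residue classes modulo \(\varLambda \).

```python
from fractions import Fraction

# columns: 0, +-e1, +-e2, +-(e1-e2), +-2e1, +-(2e1-e2)
cols = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1),
        (2, 0), (-2, 0), (2, -1), (-2, 1)]
delta = 2   # index of Lambda = Z*(2,0) + Z*(0,1) in Z^2

classes = {}
for x in cols:
    # coefficients of x in the basis (b1, b2): a1 = x0/2, a2 = x1
    a1, a2 = Fraction(x[0], 2), Fraction(x[1], 1)
    assert -1 <= a1 <= 1 and -1 <= a2 <= 1   # every column lies in P_A
    assert (a1 * delta).denominator == 1     # Proposition 1: a1 in (1/2)*Z
    classes.setdefault(x[0] % 2, []).append(x)  # residue class modulo Lambda

# the trivial class carries 7 = h(1,2) columns, the non-trivial one 4
assert len(classes) == delta
assert sorted(len(c) for c in classes.values()) == [4, 7]
```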

Remark 1

  1. (i)

    The proof above shows that we actually want to bound the number of columns n of a matrix \(A \in \{-1,0,1\}^{m \times n}\) such that the system

    $$\begin{aligned} \{e_1,\ldots ,e_m\} \cup \{t+A_1, \ldots , t + A_n\} \end{aligned}$$

    is 1-submodular, for some \(t \in [0,1)^m \setminus \{\textbf{0}\}\). However, \(t+A\) is totally 1-submodular if and only if \(\{e_1,\ldots ,e_m\} \cup (t+A)\) is 1-submodular.

  2. (ii)

    As any matrix \(A \in \{-1,0,1\}^{m \times n}\) with pairwise distinct columns can have at most \(3^m\) columns, one trivially gets the bound \({{\,\mathrm{h_s}\,}}(m) \le 3^m\). Thus, Lemma 1 directly implies the estimate \({{\,\textrm{h}\,}}(\varDelta ,m) \le 3^m \cdot \varDelta \).

  3. (iii)

    The separation of \({{\,\textrm{h}\,}}(1,m)\) and \({\text {h}_\text {s}}^{\varDelta }{(m)}\) in Lemma 1 is possible because we excluded \(t = \textbf{0}\) in the definition of both \({{\,\mathrm{h_s}\,}}(m)\) and \({\text {h}_\text {s}}^{\delta }{(m)}\). This is motivated by a comparison to Conjecture 1; we know the exact value of \({{\,\textrm{h}\,}}(1,m)\), while we may hope to bound \({\text {h}_\text {s}}^{\varDelta }{(m)}\) subquadratically in future work.

2.1 Small dimensions and lower bounds in the shifted setting

Recall that the original Heller constant is given by \({{\,\textrm{h}\,}}(1,m) = m^2 + m + 1\). The following exact results for dimensions two and three show the difference between this original (unshifted) setting and the shifted setting captured by \({{\,\mathrm{h_s}\,}}(m)\). Note that, as the shifted setting requires \(t \ne \textbf{0}\), the Heller constant \({{\,\textrm{h}\,}}(1,m)\) is not a lower bound on the shifted Heller constant \({{\,\mathrm{h_s}\,}}(m)\). The proof for these special cases might give a hint as to how to approach the determination of exact values or the asymptotic growth of \({{\,\mathrm{h_s}\,}}(m)\) for general m.

Proposition 2

We have \({{\,\mathrm{h_s}\,}}(2) = 6\) and \({{\,\mathrm{h_s}\,}}(3) = 12\).

Proof

First, we show that \({{\,\mathrm{h_s}\,}}(2) = 6\). Let \(A \in \{-1,0,1\}^{2 \times n}\) have distinct columns, and let \(t \in [0,1)^2 \setminus \{\textbf{0}\}\) be such that \(t+A\) is totally 1-submodular. Since \(t \ne \textbf{0}\), it has a non-zero coordinate, say \(t_1 > 0\). As the \(1\times 1\) minors of \(t+A\), that is, the entries of \(t+A\), are bounded in absolute value by 1, we get that the first row of A can only have entries in \(\{-1,0\}\). This shows already that \(n \le 6\), as there are only 6 options for the columns of A respecting this condition.

An example attaining this bound is given by

$$\begin{aligned} A = { \left[ \!\begin{array}{rrr|rrr} -1 &{}\quad -1 &{}\quad -1 &{}\quad 0 &{}\quad 0 &{}\quad 0 \\ -1 &{}\quad 0 &{}\quad 1 &{}\quad -1 &{}\quad 0 &{} 1 \end{array}\!\right] } \quad \text { and }\quad t = { \begin{bmatrix} 1/2 \\ 0 \end{bmatrix}}. \end{aligned}$$

One can check that (up to permutations of rows and columns) this is actually the unique example (At) with 6 columns in A.

Now, we turn our attention to demonstrating that \({{\,\mathrm{h_s}\,}}(3) = 12\). The lower bound follows by the existence of the following matrix and translation vector

$$\begin{aligned} A = { \left[ \!\begin{array}{rrr|rrr|rrr|rrr} -1 &{}\quad -1 &{}\quad -1 &{}\quad -1 &{}\quad -1 &{}\quad -1 &{}\quad 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 &{}\quad 0 \\ -1 &{}\quad -1 &{}\quad -1 &{}\quad 0 &{}\quad 0 &{}\quad 0 &{}\quad -1 &{}\quad -1 &{}\quad -1 &{}\quad 0 &{}\quad 0 &{}\quad 0 \\ -1 &{}\quad 0 &{}\quad 1 &{}\quad -1 &{}\quad 0 &{}\quad 1 &{}\quad -1 &{}\quad 0 &{}\quad 1 &{}\quad -1 &{}\quad 0 &{}\quad 1 \end{array}\!\right] } \ ,\ t = { \begin{bmatrix} 1/2 \\ 1/2 \\ 0 \end{bmatrix}}. \end{aligned}$$

Checking that \(t+A\) is indeed totally 1-submodular is a routine task that we leave to the reader.
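For readers who prefer to delegate the routine task, the following brute-force check (ours) verifies total 1-submodularity of both examples of this proof exactly, using rational arithmetic.

```python
from fractions import Fraction
from itertools import combinations, permutations

def det(M):
    # determinant via the Leibniz formula; fine for the tiny matrices here
    n = len(M)
    total = 0
    for perm in permutations(range(n)):
        sign = 1
        for i in range(n):
            for j in range(i + 1, n):
                if perm[i] > perm[j]:
                    sign = -sign
        prod = 1
        for i in range(n):
            prod *= M[i][perm[i]]
        total += sign * prod
    return total

def totally_1_submodular(A, t):
    # check max_k Delta_k(t + A) <= 1 by enumerating all minors
    m, n = len(A), len(A[0])
    TA = [[t[i] + A[i][j] for j in range(n)] for i in range(m)]
    return all(abs(det([[TA[i][j] for j in cols] for i in rows])) <= 1
               for k in range(1, m + 1)
               for rows in combinations(range(m), k)
               for cols in combinations(range(n), k))

half = Fraction(1, 2)
A2 = [[-1, -1, -1, 0, 0, 0],
      [-1, 0, 1, -1, 0, 1]]
A3 = [[-1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0],
      [-1, -1, -1, 0, 0, 0, -1, -1, -1, 0, 0, 0],
      [-1, 0, 1, -1, 0, 1, -1, 0, 1, -1, 0, 1]]
assert totally_1_submodular(A2, [half, 0])          # the h_s(2) = 6 example
assert totally_1_submodular(A3, [half, half, 0])    # the h_s(3) = 12 example
```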

For the upper bound, let \(A \in \{-1,0,1\}^{3 \times n}\) and \(t \in [0,1)^3 \setminus \{\textbf{0}\}\) be such that \(t+A\) is totally 1-submodular. Let s be the number of non-zero entries of \(t \ne \textbf{0}\). Just as we observed for \({{\,\mathrm{h_s}\,}}(2)\), we get that there are \(s \ge 1\) rows of A only containing elements from \(\{-1,0\}\). Thus, if \(s=3\) there are only \(2^3 = 8\) possible columns and if \(s=2\), there are only \(2^2 \cdot 3 = 12\) possible columns, showing that \(n \le 12\) in both cases.

We are left with the case that \(s=1\), and we may assume that A has no entry equal to 1 in the first row and that \(t_1 > 0\). Assume for contradiction that \(n \ge 13\). There must be \(\ell \ge 7\) columns of A with the same first coordinate. We let \(A'\) be the submatrix of A consisting of these \(\ell \) columns. By the identity \({{\,\textrm{h}\,}}(1,2) = 7\) applied to the last two rows of A, and as \(t_2 = t_3 = 0\), we must have \(\ell = 7\) and up to permutations and multiplication of any of the last two rows by \(-1\), \(A' = {\tiny \left[ \!\begin{array}{*{7}r} a &{} a &{} a &{} a &{} a &{} a &{} a \\ 0 &{} 1 &{} 0 &{} -1 &{} 0 &{} 1 &{} -1 \\ 0 &{} 0 &{} 1 &{} 0 &{} -1 &{} -1 &{} 1 \end{array}\!\right] }\), for some \(a \in \{-1,0\}\). Because the absolute values of the \(2 \times 2\) minors of \(t+A\) are bounded by 1, the remaining \(n-\ell \ge 6\) columns of A are different from \((b,1,1)^\intercal \) and \((b,-1,-1)^\intercal \), where b is such that \(\{a,b\} = \{-1,0\}\). Under these conditions, we find that A contains either \(B = {\tiny \left[ \!\begin{array}{*{3}r} -1 &{} -1 &{} 0 \\ 1 &{} 0 &{} 1 \\ 0 &{} -1 &{} -1 \end{array}\!\right] }\), \(B' = {\tiny \left[ \!\begin{array}{*{3}r} -1 &{} -1 &{} 0 \\ -1 &{} 0 &{} -1 \\ 0 &{} 1 &{} 1 \end{array}\!\right] }\), \(C = {\tiny \left[ \!\begin{array}{*{3}r} 0 &{} 0 &{} -1 \\ 1 &{} 0 &{} 1 \\ 0 &{} -1 &{} -1 \end{array}\!\right] }\) or \(C' = {\tiny \left[ \!\begin{array}{*{3}r} 0 &{} 0 &{} -1 \\ -1 &{} 0 &{} -1 \\ 0 &{} 1 &{} 1 \end{array}\!\right] }\) as a submatrix. However, both the conditions \(|\det (t+B)| \le 1\) and \(|\det (t+B')| \le 1\) give \(t_1 \ge 1\), and both \(|\det (t+C)| \le 1\) and \(|\det (t+C')| \le 1\) give \(t_1 \le 0\). Hence, in either case we get a contradiction to the assumption that \(0< t_1 < 1\). \(\square \)

Combining Lemma 1, the identity \({{\,\textrm{h}\,}}(1,m) = m^2 + m + 1\), and Proposition 2 yields the bounds \({{\,\textrm{h}\,}}(\varDelta ,2) \le 6 \varDelta + 1\) and \({{\,\textrm{h}\,}}(\varDelta ,3) \le 12 \varDelta + 1\). The latter bound improves upon the bound (2) of Lee et al. However, as \({{\,\textrm{h}\,}}(\varDelta ,2) = 4 \varDelta + 3\) by (1), we see that the approach via the shifted Heller constant \({{\,\mathrm{h_s}\,}}(m)\) cannot give optimal results for all m.

A quadratic lower bound on \({{\,\mathrm{h_s}\,}}(m)\) can be obtained as follows:

Proposition 3

For every \(m \in \mathbb {Z}_{>0}\), we have

$$\begin{aligned} {{\,\mathrm{h_s}\,}}(m) \ge {{\,\textrm{h}\,}}(1,m-1) = m(m-1)+1. \end{aligned}$$

Proof

Let \(A' \in \{-1,0,1\}^{(m-1) \times n}\) be a totally unimodular matrix with \(n = {{\,\textrm{h}\,}}(1,m-1)\) columns, and let \(A \in \{-1,0,1\}^{m \times n}\) be obtained from \(A'\) by simply adding a zero-row as the first row. Then, for the translation vector \(t = (\frac{1}{m},0,\ldots ,0)^\intercal \) the matrix \(t+A\) is totally 1-submodular.

Indeed, we only need to look at its \(k \times k\) minors, for \(k \le m\), that involve the first row, as \(A'\) is totally unimodular by choice. But then, the triangle inequality combined with the Laplace expansion of the given minor with respect to the first row shows that its absolute value is bounded by 1. \(\square \)

The proof demonstrates that the same lower bound holds for the refined shifted Heller constant \({\text {h}_\text {s}}^{ m }{(m)}\). Note, however, that the denominators of the allowed translation vectors for this constant depend on m.
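A worked instance (ours) of the construction for \(m = 3\): take as \(A'\) the totally unimodular \(2 \times 7\) matrix with columns \(\textbf{0},\pm e_1,\pm e_2,\pm (e_1-e_2)\), prepend a zero row, and shift by \(t = (\tfrac{1}{3},0,0)^\intercal \).

```python
from fractions import Fraction
from itertools import combinations, permutations

def det(M):
    # determinant via the Leibniz formula; fine for the tiny matrices here
    n = len(M)
    total = 0
    for perm in permutations(range(n)):
        sign = 1
        for i in range(n):
            for j in range(i + 1, n):
                if perm[i] > perm[j]:
                    sign = -sign
        prod = 1
        for i in range(n):
            prod *= M[i][perm[i]]
        total += sign * prod
    return total

# A': totally unimodular 2 x 7 matrix, columns 0, +-e1, +-e2, +-(e1-e2)
Ap = [[0, 1, -1, 0, 0, 1, -1],
      [0, 0, 0, 1, -1, -1, 1]]
A = [[0] * 7] + Ap                  # prepend a zero row
t = [Fraction(1, 3), 0, 0]          # t = (1/m, 0, ..., 0) with m = 3
TA = [[t[i] + A[i][j] for j in range(7)] for i in range(3)]

# t + A is totally 1-submodular: every minor is at most 1 in absolute value
assert all(abs(det([[TA[i][j] for j in cols] for i in rows])) <= 1
           for k in (1, 2, 3)
           for rows in combinations(range(3), k)
           for cols in combinations(range(7), k))
```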

3 Polynomial upper bounds on \({\text {h}_\text {s}}^{\varDelta }{(m)}\) and \({{\,\mathrm{h_s}\,}}(m)\)

An elegant alternative proof of Heller's result that \({{\,\textrm{h}\,}}(1,m) = m^2 + m + 1\) was suggested by Bixby and Cunningham [5] and carried out in detail in Schrijver's book [14, Sect. 21.3]. They first reduce the problem by considering only the supports of the columns of a given (totally) unimodular matrix and then apply Sauer's Lemma from extremal set theory, which guarantees the existence of a large-cardinality set that is shattered by any large enough family of subsets of [m].

We show that this approach can in fact be adapted to the shifted Heller constant \({{\,\mathrm{h_s}\,}}(m)\). Note that we are not the first to apply Sauer's Lemma to bound the size of integer matrices beyond the totally unimodular case. Its utility in this context was shown by Anstee [1], who established a bound of \(\mathcal {O}(m^{2 \varDelta (1 + \log \varDelta )})\) for the number of columns of a totally \(\varDelta \)-submodular matrix with pairwise distinct columns and m rows.

Compared to \({{\,\textrm{h}\,}}(1,m)\), the additional freedom in \({{\,\mathrm{h_s}\,}}(m)\) that is introduced by the translation vectors \(t \in [0,1)^m \setminus \{\textbf{0}\}\) makes the argument more involved, but still gives a low degree polynomial bound. Toward this end, we write \({{\,\textrm{supp}\,}}(y) := \left\{ j \in [m] : y_j \ne 0 \right\} \) for the support of a vector \(y \in \mathbb {R}^m\) and

$$\begin{aligned} \mathcal {E}_A := \left\{ {{\,\textrm{supp}\,}}(A_i) : i \in [n] \right\} \subseteq 2^{[m]} \end{aligned}$$

for the family of column supports in a matrix \(A \in \mathbb {R}^{m \times n}\) with columns \(A_1,\ldots ,A_n\). We use the notation \(2^Y\) for the power set of a finite set Y.

Just as in the unshifted Heller setting, each support can be realized by at most two columns of A, if there exists a translation vector \(t \in [0,1)^m\) such that \(t+A\) is totally 1-submodular.

Proposition 4

Let \(A \in \{-1,0,1\}^{m \times n}\) and \(t \in [0,1)^m\) be such that \(\varDelta _k(t + A) \le 1\), for \(k \in \{1,2\}\). Then, each \(E \in \mathcal {E}_A\) is the support of at most two columns of A.

Proof

Observe that in view of the condition \(\varDelta _1(t+A) \le 1\) and the assumption that \(t_i \ge 0\), for every \(i \in [m]\), we must have \(t_r = 0\), as soon as there is an entry equal to 1 in the r-th row of A.

Now, assume to the contrary that there are three columns \(A_i,A_j,A_k\) of A having the same support \(E \in \mathcal {E}_A\). Then, clearly \(|E| \ge 2\) and the restriction of the matrix \((A_i,A_j,A_k) \in \{-1,0,1\}^{m \times 3}\) to the rows indexed by E is a \(\pm 1\)-matrix. Also observe that there must be two rows \(r,s \in E\) so that \((A_i,A_j,A_k)\) contains an entry equal to 1 in both of these rows. Indeed, if there is at most one such row, then the columns \(A_i\), \(A_j\), \(A_k\) cannot be pairwise distinct. Therefore, we necessarily have \(t_r = t_s = 0\). Now, there are two options. Either two of the columns \(A_i,A_j,A_k\) are such that their restriction to the rows rs give linearly independent \(\pm 1\)-vectors. This however would yield a \(2 \times 2\) submatrix of \(t+A\) with minor \(\pm 2\), contradicting that \(\varDelta _2(t+A) \le 1\). In the other case, the restriction of the three columns to the rows rs has the form \(\pm {\tiny \begin{bmatrix}1 &{} 1 &{} 1 \\ 1 &{} 1 &{} 1 \end{bmatrix}}\) or \(\pm {\tiny \begin{bmatrix}1 &{} 1 &{} -1 \\ 1 &{} 1 &{} -1\end{bmatrix}}\), up to permutation of the indices ijk. If \(|E| = 2\), then this cannot happen as A is assumed to have pairwise distinct columns. So, \(|E| \ge 3\), and considering the columns, say \(A_i,A_j\), which agree in the rows rs, there must be another index \(\ell \in E \setminus \{r,s\}\) such that \((A_i)_\ell = 1\) and \((A_j)_\ell = -1\), or vice versa. In either case this means that also \(t_\ell = 0\) and that there is a \(2 \times 2\) submatrix of \(t+A\) in the rows \(r,\ell \) consisting of linearly independent \(\pm 1\)-vectors. Again this contradicts that \(\varDelta _2(t+A) \le 1\), and thus proves the claim. \(\square \)

As mentioned above, this observation on the supports allows us to use Sauer's Lemma from extremal set theory, which we state for the reader's convenience. It was independently published by Sauer [13] and Shelah [15] (who also credits M. Perles) in 1972, and again independently by Vapnik and Chervonenkis [18] a few years earlier.

Lemma 2

Let \(m,k \in \mathbb {Z}_{>0}\) be such that \(m > k\). If \(\mathcal {E}\subseteq 2^{[m]}\) is such that \( |\mathcal {E}| > \left( {\begin{array}{c}m\\ 0\end{array}}\right) + \left( {\begin{array}{c}m\\ 1\end{array}}\right) + \ldots + \left( {\begin{array}{c}m\\ k\end{array}}\right) , \) then there is a subset \(Y \subseteq [m]\) with \(k+1\) elements that is shattered by \(\mathcal {E}\), meaning that \(\left\{ E \cap Y : E \in \mathcal {E}\right\} = 2^Y\).
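The statement of Lemma 2 can be confirmed exhaustively in the smallest non-trivial case \(m = 3\), \(k = 1\) (a sketch of ours): every family of more than \(\left( {\begin{array}{c}3\\ 0\end{array}}\right) + \left( {\begin{array}{c}3\\ 1\end{array}}\right) = 4\) subsets of [3] shatters some 2-element set.

```python
from itertools import combinations

# all 8 subsets of {0, 1, 2}
subsets = [frozenset(s) for r in range(4) for s in combinations(range(3), r)]

def shatters(family, Y):
    # Y is shattered if the traces E & Y realize the full power set of Y
    traces = {E & Y for E in family}
    power_set = {frozenset(s) for r in range(len(Y) + 1)
                 for s in combinations(sorted(Y), r)}
    return traces == power_set

# every 5-member family (5 > 4) shatters some 2-element set Y
for fam in combinations(subsets, 5):
    assert any(shatters(fam, frozenset(Y)) for Y in combinations(range(3), 2))
```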

Now, the strategy for bounding the number of columns in a matrix \(A \in \{-1,0,1\}^{m \times n}\) such that \(t+A\) is totally 1-submodular for some \(t \in [0,1)^m\) is to use the inequality \(|\mathcal {E}_A| \ge \frac{1}{2} n\), which holds by Proposition 4, and then to argue by contradiction. Indeed, if \(n > 2 \sum _{i=0}^{k-1} \left( {\begin{array}{c}m\\ i\end{array}}\right) \), then by Sauer’s Lemma there would be a k-element subset \(Y \subseteq [m]\) that is shattered by \(\mathcal {E}_A\). In terms of the matrix A, this means that (possibly after permuting rows or columns) it contains a submatrix of size \(k \times 2^k\) which has exactly one column for each of the \(2^k\) possible supports and where in each column the non-zero entries are chosen arbitrarily from \(\{-1,1\}\). For convenience we call any such matrix a Sauer Matrix of size k. For concreteness, a Sauer Matrix of size 3 is of the form

$$\begin{aligned} \left[ \!\begin{array}{*8{c}} 0 &{}\quad \pm 1 &{}\quad 0 &{}\quad 0 &{}\quad \pm 1 &{}\quad \pm 1 &{}\quad 0 &{}\quad \pm 1 \\ 0 &{}\quad 0 &{}\quad \pm 1 &{}\quad 0 &{}\quad \pm 1 &{}\quad 0 &{}\quad \pm 1 &{}\quad \pm 1 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad \pm 1 &{}\quad 0 &{}\quad \pm 1 &{}\quad \pm 1 &{}\quad \pm 1 \end{array}\!\right] , \end{aligned}$$

for any choice of signs.

The combinatorial proof of \({{\,\textrm{h}\,}}(1,m) = m^2+m+1\) is based on the fact that

$$\begin{aligned} \text {No Sauer Matrix of size } 3 \text { is totally } 1\text {-submodular.} \end{aligned}$$
(4)

This is discussed in Schrijver [14, Sect. 21.3], Bixby and Cunningham [5], Tutte [17], and also implicitly in the analysis of the first equation on page 1361 of Heller’s paper [9]. In order to extend this kind of argument to the (refined) shifted setting, we need some more notation.
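Fact (4) also lends itself to an exhaustive computational check. The following Python sketch (an illustration of ours, not part of the original argument) enumerates all Sauer Matrices of size 3, up to negating columns, which leaves all absolute minors unchanged and hence allows the three singleton-support columns to be fixed, and verifies that none of them is totally 1-submodular.

```python
from itertools import combinations, product

def det(M):
    # exact determinant by cofactor expansion; fine for tiny matrices
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def totally_1_submodular(A):
    m, n = len(A), len(A[0])
    return all(abs(det([[A[i][j] for j in cols] for i in rows])) <= 1
               for k in range(1, min(m, n) + 1)
               for rows in combinations(range(m), k)
               for cols in combinations(range(n), k))

supports = [E for size in range(4) for E in combinations(range(3), size)]  # the 8 subsets of {0,1,2}
big = [(i, c) for c, E in enumerate(supports) if len(E) >= 2 for i in E]   # 9 free sign positions

feasible_exists = False
for signs in product([1, -1], repeat=len(big)):
    A = [[0] * 8 for _ in range(3)]
    for c, E in enumerate(supports):
        if len(E) == 1:
            A[E[0]][c] = 1            # WLOG +1, by negating columns
    for s, (i, c) in zip(signs, big):
        A[i][c] = s
    if totally_1_submodular(A):
        feasible_exists = True
        break
print(feasible_exists)  # False: no size-3 Sauer Matrix is totally 1-submodular
```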

Definition 2

Let S be a Sauer Matrix of size k. We say that a vector \(r \in [0,1)^k\) is feasible for S if \(r+S\) is totally 1-submodular. Further, we say that S is feasible for translations if there exists a vector \(r \in [0,1)^k\) that is feasible for S, and otherwise we say that S is infeasible for translations.

Moreover, the Sauer Matrix S is said to be of type \((s,k-s)\), if there are exactly s rows in S that contain at least one entry equal to 1.

Note that there is (up to permuting rows or columns) only one Sauer Matrix of type (0, k). As feasibility of a Sauer Matrix of type \((s,k-s)\) is invariant under permuting rows, we always make the assumption that each of its first s rows contains an entry equal to 1.

Proposition 5

Let \(m,k,\varDelta \in \mathbb {Z}_{>0}\) be such that \(m > k\).

  1. (i)

    If no Sauer Matrix of size k is feasible for translations, then

    $$\begin{aligned} {{\,\mathrm{h_s}\,}}(m) \le 2 \cdot \sum _{i=0}^{k-1} \left( {\begin{array}{c}m\\ i\end{array}}\right) \in \mathcal {O}\left( m^{k-1}\right) . \end{aligned}$$
  2. (ii)

    If no Sauer Matrix S of size k admits a translation vector \(t \in [0,1)^k \cap (\frac{1}{\varDelta } \mathbb {Z})^k \setminus \{\textbf{0}\}\) such that \(t+S\) is totally 1-submodular, then

    $$\begin{aligned} {\text {h}_\text {s}}^{\varDelta }{(m)} \le 2 \cdot \sum _{i=0}^{k-1} \left( {\begin{array}{c}m\\ i\end{array}}\right) \in \mathcal {O}\left( m^{k-1}\right) . \end{aligned}$$

Proof

(i): Assume for contradiction that there is a matrix \(A \in \{-1,0,1\}^{m \times n}\) and a translation vector \(t \in [0,1)^m\) such that \(t+A\) is totally 1-submodular and \(n > 2 \sum _{i=0}^{k-1} \left( {\begin{array}{c}m\\ i\end{array}}\right) \). By Proposition 4, we have \(|\mathcal {E}_A| \ge \frac{1}{2} n > \sum _{i=0}^{k-1} \left( {\begin{array}{c}m\\ i\end{array}}\right) \) and thus by Sauer’s Lemma (up to permuting rows or columns) the matrix A has a Sauer Matrix S of size k as a submatrix. Writing \(r \in [0,1)^k\) for the restriction of t to the k rows of A in which we find the Sauer Matrix S, we get that by the total 1-submodularity of \(t+A\), the matrix \(r+S\) necessarily must be totally 1-submodular as well. This however contradicts the assumption.

(ii): The argument is analogous to the one given for part (i). \(\square \)

In contrast to the unshifted setting, for the sizes 3 and 4 there are Sauer Matrices S and vectors r such that \(r+S\) is totally 1-submodular. Consider, for instance, for size 3 the pair

$$\begin{aligned} S = { \left[ \!\begin{array}{*{8}r} 0 &{}\quad -1 &{}\quad 0 &{}\quad 0 &{}\quad -1 &{}\quad -1 &{}\quad 0 &{}\quad -1 \\ 0 &{}\quad 0 &{}\quad -1 &{}\quad 0 &{}\quad -1 &{}\quad 0 &{}\quad -1 &{}\quad -1 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad -1 &{}\quad 0 &{}\quad -1 &{}\quad -1 &{}\quad -1 \end{array}\!\right] } \,,\, r = { \begin{bmatrix} 1/2 \\ 1/2 \\ 1/2 \end{bmatrix}} , \end{aligned}$$

and, for size 4 the pair

$$\begin{aligned} S = { \left[ \!\begin{array}{*{16}r} 0 &{} -1 &{} 0 &{} 0 &{} 0 &{} -1 &{} -1 &{} -1 &{} 0 &{} 0 &{} 0 &{} -1 &{} -1 &{} -1 &{} 0 &{} -1 \\ 0 &{} 0 &{} -1 &{} 0 &{} 0 &{} -1 &{} 0 &{} 0 &{} -1 &{} -1 &{} 0 &{} -1 &{} -1 &{} 0 &{} -1 &{} -1 \\ 0 &{} 0 &{} 0 &{} -1 &{} 0 &{} 0 &{} -1 &{} 0 &{} -1 &{} 0 &{} -1 &{} -1 &{} 0 &{} -1 &{} -1 &{} -1 \\ 0 &{} 0 &{} 0 &{} 0 &{} -1 &{} 0 &{} 0 &{} -1 &{} 0 &{} -1 &{} -1 &{} 0 &{} -1 &{} -1 &{} -1 &{} -1 \end{array}\!\right] } \,,\, r = { \begin{bmatrix} 1/2 \\ 1/2 \\ 1/2 \\ 1/2 \end{bmatrix}} . \end{aligned}$$

In both cases, \(2(r+S)\) is a matrix all of whose entries are either 1 or \(-1\). By Hadamard's inequality, the absolute value of the determinant of any \(\pm 1\)-matrix of size k is at most \(k^{k/2}\), which is bounded by \(2^k\) for \(k \le 4\). Thus \(\varDelta _k(r+S) \le 1\) for all \(k \le 4\) in the two examples above.
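The two examples can also be verified directly in exact rational arithmetic; the following Python sketch (our own check, with ad hoc helper names) confirms that both shifted matrices are totally 1-submodular.

```python
from fractions import Fraction
from itertools import combinations

def det(M):
    # exact determinant by cofactor expansion; fine for tiny matrices
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def totally_1_submodular(A):
    m, n = len(A), len(A[0])
    return all(abs(det([[A[i][j] for j in cols] for i in rows])) <= 1
               for k in range(1, min(m, n) + 1)
               for rows in combinations(range(m), k)
               for cols in combinations(range(n), k))

half = Fraction(1, 2)

S3 = [[0, -1, 0, 0, -1, -1, 0, -1],
      [0, 0, -1, 0, -1, 0, -1, -1],
      [0, 0, 0, -1, 0, -1, -1, -1]]

S4 = [[0, -1, 0, 0, 0, -1, -1, -1, 0, 0, 0, -1, -1, -1, 0, -1],
      [0, 0, -1, 0, 0, -1, 0, 0, -1, -1, 0, -1, -1, 0, -1, -1],
      [0, 0, 0, -1, 0, 0, -1, 0, -1, 0, -1, -1, 0, -1, -1, -1],
      [0, 0, 0, 0, -1, 0, 0, -1, 0, -1, -1, 0, -1, -1, -1, -1]]

def shift_all_half(S):
    # the paper's r + S with r = (1/2, ..., 1/2)
    return [[half + a for a in row] for row in S]

ok3 = totally_1_submodular(shift_all_half(S3))
ok4 = totally_1_submodular(shift_all_half(S4))
print(ok3, ok4)  # True True
```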

Our aim is to show that this pattern does not extend to higher dimensions. In particular, we prove that no Sauer Matrix of size 5 is feasible for translations, which leads to Theorem 1 (i), and that feasible vectors for Sauer Matrices of size 4 must have all their entries in \(\{0,\frac{1}{2}\}\), which leads to Theorem 1 (ii). The proof requires a more detailed study of Sauer Matrices of special types and sizes 4 and 5.

Proposition 6

Let S be a Sauer Matrix of size 4 and let \(r \in [0,1)^4\) be feasible for S.

  1. (i)

    If S is of type (0, 4), then \(r = (\frac{1}{2},\frac{1}{2},\frac{1}{2},\frac{1}{2})^\intercal \).

  2. (ii)

    If S is of type (1, 3), then \(r = (0,\frac{1}{2},\frac{1}{2},\frac{1}{2})^\intercal \).

  3. (iii)

    If S is of type (2, 2), then \(r = (0,0,\frac{1}{2},\frac{1}{2})^\intercal \).

  4. (iv)

    If S is of type (3, 1) or (4, 0), then S is infeasible for translations.

Proposition 7

There does not exist a Sauer Matrix S of size 5 and a translation vector \(r \in [0,1)^5\) such that \(r+S\) is totally 1-submodular.

The proof of these statements is based on identifying certain full-rank submatrices of the respective Sauer Matrix for which the minor condition provides an obstruction for feasibility. The technical details are given in Sect. 4.

With these preparations, we are now able to prove our main result.

Proof (Theorem 1)

(i): In view of Lemma 1, we have \({{\,\textrm{h}\,}}(\varDelta ,m) \le {{\,\textrm{h}\,}}(1,m) + (\varDelta - 1) \cdot {{\,\mathrm{h_s}\,}}(m)\). The claimed bound now follows by Heller’s identity \({{\,\textrm{h}\,}}(1,m) = m^2 + m + 1\) and the fact that \({{\,\mathrm{h_s}\,}}(m) \le 2 \sum _{i=0}^4 \left( {\begin{array}{c}m\\ i\end{array}}\right) \), which holds by combining Proposition 5 (i) and Proposition 7.

(ii): Again, in view of Lemma 1, we have \({{\,\textrm{h}\,}}(\varDelta ,m) \le {{\,\textrm{h}\,}}(1,m) + (\varDelta - 1) \cdot {\text {h}_\text {s}}^{\varDelta }{(m)}\). By Proposition 6, for every Sauer Matrix S of size 4, any vector \(t \in [0,1)^4\) such that \(t+S\) is totally 1-submodular has all its coordinates in \(\{0,\frac{1}{2}\}\). If \(\varDelta \) is odd, then using Proposition 1, such translation vectors are excluded from the definition of \({\text {h}_\text {s}}^{\varDelta }{(m)}\). Thus, by Proposition 5 (ii), we get \({\text {h}_\text {s}}^{\varDelta }{(m)} \le 2 \sum _{i=0}^3 \left( {\begin{array}{c}m\\ i\end{array}}\right) \), which together with Heller’s identity \({{\,\textrm{h}\,}}(1,m) = m^2 + m + 1\) proves the claimed bound. \(\square \)

4 Feasibility of Sauer matrices in low dimensions

Here, we complete the discussion from the previous section and give the technical details and the proofs of Propositions 6 and 7. Parts of the arguments are based on the fact that the condition \(|\det (r+M)| \le 1\), for any \(M \in \mathbb {R}^{k \times k}\), is equivalent to a pair of linear inequalities in the coordinates of \(r \in \mathbb {R}^k\). This turns the question of whether a given Sauer Matrix is feasible for translations into the question of whether an associated polyhedron is non-empty. To this end, let \(S \in \{-1,0,1\}^{k \times 2^k}\) be a Sauer Matrix of size k and consider the set

$$\begin{aligned} \mathcal {P}(S) = \left\{ r \in [0,1]^k : \varDelta _\ell (r+S) \le 1 \text { for each } 1 \le \ell \le k \right\} \end{aligned}$$

of feasible vectors for S in the unit cube \([0,1]^k\).

Proposition 8

For every Sauer Matrix S, the set \(\mathcal {P}(S)\) is a polytope.

Proof

A vector \(r \in [0,1]^k\) is contained in \(\mathcal {P}(S)\) if and only if \(|\det (r_I+S_{I,J})| \le 1\), for every \(I \subseteq [k]\) and \(J \subseteq [2^k]\) with \(|I| = |J| = \ell \) and \(1 \le \ell \le k\), and where \(r_I\) denotes the subvector of r with coordinates indexed by I and \(S_{I,J}\) denotes the submatrix of S with rows and columns indexed by I and J, respectively. Now, in general, for an \(\ell \times \ell \) matrix A with columns \(a_1,\ldots ,a_\ell \) and a vector \(t \in \mathbb {R}^\ell \), we have

$$\begin{aligned} \det (t+A) = \det (t + a_1,\ldots ,t + a_\ell ) = \det \left( \begin{array}{llll} a_1 &{} \ldots &{} a_\ell &{} -t \\ 1 &{} \ldots &{} 1 &{} 1 \end{array} \right) . \end{aligned}$$
(5)

Thus, the multilinearity of the determinant translates the condition \(|\det (r_I+S_{I,J})| \le 1\) into a pair of linear inequalities in the entries of r. \(\square \)
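Identity (5) is easy to test numerically; the following Python sketch (an ad hoc illustration of ours, with an arbitrarily chosen matrix and translation vector) compares both sides in exact arithmetic.

```python
from fractions import Fraction

def det(M):
    # exact determinant by cofactor expansion; fine for tiny matrices
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

# arbitrary sample data (not from the paper)
A = [[2, -1, 3], [0, 1, 4], [5, 2, -2]]
t = [Fraction(1, 3), Fraction(1, 2), Fraction(2, 5)]

# left-hand side of (5): t added to every column of A
lhs = det([[A[i][j] + t[i] for j in range(3)] for i in range(3)])

# right-hand side of (5): bordered matrix with extra column -t and a row of ones
bordered = [A[i] + [-t[i]] for i in range(3)] + [[1, 1, 1, 1]]
rhs = det(bordered)

print(lhs == rhs)  # True
```

Since t occurs only in the last column of the bordered matrix, the right-hand side is visibly affine in the entries of t, which is exactly what makes \(|\det (r_I+S_{I,J})| \le 1\) a pair of linear inequalities.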

4.1 Sauer matrices of size 3

We start by illustrating the polyhedrality of \(\mathcal {P}(S)\) on the Sauer Matrix \(S_{(0,3)}\) of type (0, 3). Note that the columns of \(-S_{(0,3)}\) are given by the eight 0/1 vectors in \(\{0,1\}^3\). Further, for any \(3 \times 3\) submatrix \(A = (a_1,a_2,a_3)\) of \(S_{(0,3)}\), we have

$$\begin{aligned} |\det (r+A)| = |\det (-a_1 - r,-a_2 - r,-a_3 - r)| = 6 {{\,\textrm{vol}\,}}({{\,\textrm{conv}\,}}\{-a_1,-a_2,-a_3,r\}), \end{aligned}$$

so the condition \(|\det (r+A)| \le 1\) is a condition on the volume of the simplex in \([0,1]^3\) with vertices \(-a_1,-a_2,-a_3,r\). It is straightforward to see that the only case in which this imposes a condition on r is when the vertices \(-a_1,-a_2,-a_3 \in \{0,1\}^3\) are chosen such that they are pairwise not connected by an edge of the cube. Also the \(2 \times 2\) minors of \(r+S_{(0,3)}\) do not further restrict the feasibility of r, and in summary we thus get

$$\begin{aligned} \mathcal {P}(S_{(0,3)})&= \Big \{ r \in [0,1]^3 : 1 \le r_1 + r_2 + r_3 \le 2 , \ 0 \le -r_1 + r_2 + r_3 \le 1 , \nonumber \\&\quad \,0 \le r_1 - r_2 + r_3 \le 1 , \ 0 \le r_1 + r_2 - r_3 \le 1 \,\Big \} \nonumber \\&= {{\,\textrm{conv}\,}}\left\{ {\begin{bmatrix} 0 \\ 1/2 \\ 1/2 \end{bmatrix}}, {\begin{bmatrix} 1/2 \\ 0 \\ 1/2 \end{bmatrix}}, {\begin{bmatrix} 1/2 \\ 1/2 \\ 0 \end{bmatrix}}, {\begin{bmatrix} 1 \\ 1/2 \\ 1/2 \end{bmatrix}}, {\begin{bmatrix} 1/2 \\ 1 \\ 1/2 \end{bmatrix}}, {\begin{bmatrix} 1/2 \\ 1/2 \\ 1 \end{bmatrix}}\right\} . \end{aligned}$$
(6)

This is a regular crosspolytope with each of its vertices being the center of a facet of the cube \([0,1]^3\).
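The equality (6) between the minor description and the linear-inequality description can be spot-checked computationally; the following Python sketch (our own illustration, with ad hoc helper names) compares the two membership tests on the grid of quarter-points of \([0,1]^3\).

```python
from fractions import Fraction
from itertools import combinations, product

def det(M):
    # exact determinant by cofactor expansion; fine for tiny matrices
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def totally_1_submodular(A):
    m, n = len(A), len(A[0])
    return all(abs(det([[A[i][j] for j in cols] for i in rows])) <= 1
               for k in range(1, min(m, n) + 1)
               for rows in combinations(range(m), k)
               for cols in combinations(range(n), k))

cols = list(product([0, -1], repeat=3))
S = [[c[i] for c in cols] for i in range(3)]     # S_{(0,3)}: its columns are {0,-1}^3

def in_P_by_minors(r):
    return totally_1_submodular([[r[i] + S[i][j] for j in range(8)] for i in range(3)])

def in_P_by_inequalities(r):
    r1, r2, r3 = r
    return (1 <= r1 + r2 + r3 <= 2 and 0 <= -r1 + r2 + r3 <= 1
            and 0 <= r1 - r2 + r3 <= 1 and 0 <= r1 + r2 - r3 <= 1)

grid = [Fraction(a, 4) for a in range(5)]        # quarter-steps in [0,1]
agree = all(in_P_by_minors(r) == in_P_by_inequalities(r)
            for r in product(grid, repeat=3))
print(agree)  # True
```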

Our second goal is to characterize the feasible vectors for Sauer Matrices of size 3 and type (1, 2).

Proposition 9

Let S be a Sauer Matrix of type (1, 2), that is,

$$\begin{aligned} S = \left[ \!\begin{array}{*8{c}} 0 &{}\quad \pm 1 &{}\quad 0 &{}\quad 0 &{}\quad \pm 1 &{}\quad \pm 1 &{}\quad 0 &{}\quad \pm 1 \\ 0 &{}\quad 0 &{}\quad - 1 &{}\quad 0 &{}\quad - 1 &{}\quad 0 &{}\quad - 1 &{}\quad - 1 \\ 0 &{}\quad 0 &{}\quad 0 &{}\quad - 1 &{}\quad 0 &{}\quad - 1 &{}\quad - 1 &{}\quad - 1 \end{array}\!\right] \end{aligned}$$

for any signs and such that at least one entry in the first row of S equals 1. Let \(\sigma \in \{+,-\}^4\) be the sign vector of the non-zero entries in the first row of S, and let \(r \in [0,1)^3\) be feasible for S.

  1. (i)

    If \(\sigma \in \left\{ (+,-,+,-) , (-,+,-,+) \right\} \), then \(r \in {{\,\textrm{conv}\,}}\left\{ (0,\frac{1}{4},\frac{1}{2})^\intercal , (0,\frac{3}{4},\frac{1}{2})^\intercal \right\} \).

  2. (ii)

    If \(\sigma \in \left\{ (+,+,-,-) , (-,-,+,+) \right\} \), then \(r \in {{\,\textrm{conv}\,}}\left\{ (0,\frac{1}{2},\frac{1}{4})^\intercal , (0,\frac{1}{2},\frac{3}{4})^\intercal \right\} \).

  3. (iii)

    If \(\sigma \) is different from any of those in parts (i) and (ii), then \(r = (0,\frac{1}{2},\frac{1}{2})^\intercal \).

Proof

We start with some general observations: By assumption, we have \(r_1 = 0\) in any case, because there is an entry equal to 1 in the first row of S. This means that the possibilities for feasible vectors r are not affected by possibly negating the first row of S. It thus suffices to consider only those \(\sigma \) with \(\sigma _1 = +\), and we make this assumption throughout the following.

Now, for any \(i,j \in \{1,2,3\}\) and any \(k,\ell \in \{1,2,\ldots ,8\}\), we denote by \(S_{ij,k\ell }\) the \(2 \times 2\) submatrix of S formed by the rows \(i,j\) and the columns \(k,\ell \). With this notation and the condition \(\varDelta _2(r+S) \le 1\) on the minors of size 2, we obtain:

(C\(_{12}\)):

If \(\sigma _2 \ne \sigma _1\), then \(r+S\) contains \({\small \begin{bmatrix} 0 \\ r_3 \end{bmatrix}}+S_{13,25} = \left[ \!\begin{array}{*2{c}} \pm 1 &{} \mp 1 \\ r_3 &{} r_3 \end{array}\!\right] \), which implies \(r_3 \le \frac{1}{2}\).

(C\(_{13}\)):

If \(\sigma _3 \ne \sigma _1\), then \(r+S\) contains \({\small \begin{bmatrix} 0 \\ r_2 \end{bmatrix}}+S_{12,26} = \left[ \!\begin{array}{*2{c}} \pm 1 &{} \mp 1 \\ r_2 &{} r_2 \end{array}\!\right] \), which implies \(r_2 \le \frac{1}{2}\).

(C\(_{24}\)):

If \(\sigma _2 \ne \sigma _4\), then \(r+S\) contains \({\small \begin{bmatrix} 0 \\ r_2 \end{bmatrix}}+S_{12,58} = \left[ \!\begin{array}{*2{c}} \pm 1 &{} \mp 1 \\ r_2 - 1 &{} r_2 - 1 \end{array}\!\right] \), which implies \(r_2 \ge \frac{1}{2}\).

(C\(_{34}\)):

If \(\sigma _3 \ne \sigma _4\), then \(r+S\) contains \({\small \begin{bmatrix} 0 \\ r_3 \end{bmatrix}}+S_{13,68} = \left[ \!\begin{array}{*2{c}} \pm 1 &{} \mp 1 \\ r_3 - 1 &{} r_3 - 1 \end{array}\!\right] \), which implies \(r_3 \ge \frac{1}{2}\).

Finally, for any indices \(i,j,k \in \{1,2,\ldots ,8\}\), we denote by \(S_{ijk}\) the \(3 \times 3\) submatrix of S consisting of its columns indexed by \(i,j\), and k.

(i) Let \(\sigma = (+,-,+,-)\). As the first two and the last two entries of \(\sigma \) differ, in view of the conditions (C\(_{12}\)) and (C\(_{34}\)), we get \(r_3 = \frac{1}{2}\). Since \(\sigma _1 = +\) and \(\sigma _2 = -\), the matrix \(r+S\) contains \(r+S_{245} = \left[ \!\begin{array}{*3{c}} 1 &{} 0 &{} -1 \\ r_2 &{} r_2 &{} r_2 - 1 \\ r_3 &{} r_3 - 1 &{} r_3 \end{array}\!\right] \), for which the condition \(\varDelta _3(r+S) \le 1\) yields \(2r_2 + r_3 \le 2\), and thus \(r_2 \le \frac{3}{4}\). Similarly, \(r+S\) also contains \(r+S_{257} = \left[ \!\begin{array}{*3{c}} 1 &{} -1 &{} 0 \\ r_2 &{} r_2 - 1 &{} r_2 - 1 \\ r_3 &{} r_3 &{} r_3 - 1 \end{array}\!\right] \), for which the condition \(\varDelta _3(r+S) \le 1\) yields \(2r_2 - r_3 \ge 0\), and thus \(r_2 \ge \frac{1}{4}\).

(ii) Since interchanging the last two rows of S amounts to interchanging \(\sigma _2\) and \(\sigma _3\), and since this change of rows in S translates into exchanging the coordinates \(r_2\) and \(r_3\), the claim follows from (i).

(iii) By negating the first row of S, the case \(\sigma = (+,+,+,+)\) corresponds to the Sauer Matrix \(S_{(0,3)}\) of type (0, 3) with the additional restriction that \(r_1 = 0\). In view of the characterization (6) of its feasible vectors, this leaves as the only possibility \(r = (0,\frac{1}{2},\frac{1}{2})^\intercal \), as claimed. So, we may assume that at least one of the coordinates \(\sigma _2,\sigma _3,\sigma _4\) equals −.

First, assume that \(\sigma _1 = \sigma _4 = +\). If \(\sigma _2 = -\), then the submatrix \(S_{248}\) of S yields \(r_2 \le r_3\) via the minor condition \(\varDelta _3(r+S) \le 1\), just as in part (i). Since by (C\(_{12}\)) and (C\(_{24}\)) we have \(r_3 \le \frac{1}{2}\) and \(r_2 \ge \frac{1}{2}\), respectively, we obtain \(r = (0,\frac{1}{2},\frac{1}{2})^\intercal \), as claimed. An analogous argument works for the case that \(\sigma _3 = -\), by using the submatrix \(S_{238}\) of S.

Second, assume that \(\sigma _4 = - \ne \sigma _1\). The only cases left to consider are those for which \(\sigma _2 = \sigma _3\). If \(\sigma _2 = \sigma _3 = +\), then by (C\(_{24}\)) and (C\(_{34}\)), we get \(r_2 \ge \frac{1}{2}\) and \(r_3 \ge \frac{1}{2}\), respectively. Moreover, using the submatrix \(S_{156}\) of S as before yields that \(r_2 + r_3 \le 1\) implying the desired \(r = (0,\frac{1}{2},\frac{1}{2})^\intercal \). Again, the case \(\sigma _2 = \sigma _3 = -\) is completely analogous, using the conditions (C\(_{12}\)) and (C\(_{13}\)), and the submatrix \(S_{567}\) of S. \(\square \)
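Part (i) of Proposition 9 can be cross-checked by brute force for one admissible sign vector; the following Python sketch (our own illustration, with the matrix written out for \(\sigma = (+,-,+,-)\)) confirms that every feasible grid point lies on the claimed segment.

```python
from fractions import Fraction
from itertools import combinations, product

def det(M):
    # exact determinant by cofactor expansion; fine for tiny matrices
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def totally_1_submodular(A):
    m, n = len(A), len(A[0])
    return all(abs(det([[A[i][j] for j in cols] for i in rows])) <= 1
               for k in range(1, min(m, n) + 1)
               for rows in combinations(range(m), k)
               for cols in combinations(range(n), k))

# Sauer Matrix of type (1,2) with sign vector sigma = (+,-,+,-) in the first row
S = [[0, 1, 0, 0, -1, 1, 0, -1],
     [0, 0, -1, 0, -1, 0, -1, -1],
     [0, 0, 0, -1, 0, -1, -1, -1]]

grid = [Fraction(a, 8) for a in range(8)]        # eighth-steps in [0,1)
feasible = [r for r in product(grid, repeat=3)
            if totally_1_submodular([[r[i] + S[i][j] for j in range(8)] for i in range(3)])]

# every feasible grid point lies on the segment between (0,1/4,1/2) and (0,3/4,1/2)
on_segment = all(r[0] == 0 and r[2] == Fraction(1, 2)
                 and Fraction(1, 4) <= r[1] <= Fraction(3, 4) for r in feasible)
print(on_segment)  # True
```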

4.2 Sauer matrices of size 4

Based on the knowledge of the feasibility of Sauer Matrices of size 3, we are now in a position to prove Proposition 6. Sauer Matrices of the types (0, 4), (3, 1) and (4, 0) are easy to deal with, so let us start with those.

Proof of Proposition 6 (i) and (iv)

(i): Assume that \(r \in [0,1)^4\) is such that \(r+S\) is totally 1-submodular, and consider the following two \(4 \times 4\) submatrices of S:

$$\begin{aligned} M = { \left[ \!\begin{array}{r|rrr} 0 &{} 0 &{} 0 &{} 0 \\ \hline 0 &{} 0 &{} -1 &{} -1 \\ 0 &{} -1 &{} 0 &{} -1 \\ 0 &{} -1 &{} -1 &{} 0 \end{array}\!\right] } \quad \text { and }\quad N = { \left[ \!\begin{array}{r|rrr} -1 &{} -1 &{} -1 &{} -1 \\ \hline -1 &{} -1 &{} 0 &{} 0 \\ -1 &{} 0 &{} -1 &{} 0 \\ -1 &{} 0 &{} 0 &{} -1 \end{array}\!\right] }. \end{aligned}$$
(7)

By the \(4 \times 4\) minor condition on \(r+S\), we have

$$\begin{aligned} |\det (r+M)| = r_1 \cdot \det { \begin{bmatrix} 0 &{}\quad 1 &{}\quad 1 \\ 1 &{}\quad 0 &{}\quad 1 \\ 1 &{}\quad 1 &{}\quad 0 \end{bmatrix}} = 2 r_1 \le 1, \end{aligned}$$

and hence \(r_1 \le \frac{1}{2}\). Likewise, we have

$$\begin{aligned} |\det (r+N)| = (1-r_1) \cdot \det { \begin{bmatrix} 0 &{}\quad 1 &{}\quad 1 \\ 1 &{}\quad 0 &{}\quad 1 \\ 1 &{}\quad 1 &{}\quad 0 \end{bmatrix}} = 2 (1-r_1) \le 1, \end{aligned}$$

and hence \(r_1 \ge \frac{1}{2}\), so that actually \(r_1 = \frac{1}{2}\). Analogous arguments for the other coordinates of r show that \(r = (\frac{1}{2},\frac{1}{2},\frac{1}{2},\frac{1}{2})^\intercal \), as claimed.

(iv): If the ith row of a Sauer Matrix S contains an entry equal to 1, then \(r_i = 0\), because of \(\varDelta _1(r+S) \le 1\) and \(r_i \in [0,1)\). For the types (3, 1) and (4, 0), each of the first three rows of S contains an entry equal to 1, so that \(r_1 = r_2 = r_3 = 0\). The restriction of S to its first three rows and to the columns supported on these rows is a Sauer Matrix of size 3, and as a submatrix of \(r+S\) it would itself be totally 1-submodular. However, we noted in (4) that no such Sauer Matrix exists. \(\square \)
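The two determinant evaluations in the proof of part (i) can be confirmed in exact arithmetic; the following Python sketch (an ad hoc check of ours) verifies \(|\det (r+M)| = 2r_1\) and \(|\det (r+N)| = 2(1-r_1)\) for several sample vectors r, independently of the remaining coordinates.

```python
from fractions import Fraction

def det(M):
    # exact determinant by cofactor expansion; fine for tiny matrices
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def shift(r, A):
    # the paper's r + A: add r_i to every entry of row i
    return [[r[i] + a for a in row] for i, row in enumerate(A)]

M = [[0, 0, 0, 0], [0, 0, -1, -1], [0, -1, 0, -1], [0, -1, -1, 0]]
N = [[-1, -1, -1, -1], [-1, -1, 0, 0], [-1, 0, -1, 0], [-1, 0, 0, -1]]

ok = True
for num in range(4):
    r1 = Fraction(num, 4)
    r = [r1, Fraction(1, 3), Fraction(2, 7), Fraction(1, 5)]  # r_2, r_3, r_4 arbitrary
    ok &= abs(det(shift(r, M))) == 2 * r1
    ok &= abs(det(shift(r, N))) == 2 * (1 - r1)
print(ok)  # True
```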

Proof of Proposition 6 (ii)

We assume that the first row of each considered Sauer Matrix S of type (1, 3) contains an entry equal to 1, so that \(r_1 = 0\). We can now possibly multiply the first row of S by \(-1\) and thus assume that \((-1,-1,-1,-1)^\intercal \) is a column of S. After this set-up, we employ a case distinction based on the signs of the entries in the first row of the columns \(a = (\pm 1,-1,0,0)^\intercal \), \(b = (\pm 1,0,-1,0)^\intercal \), and \(c = (\pm 1,0,0,-1)^\intercal \) of S.

Case 1 \(a_1 = b_1 = c_1 = -1\).

Under this assumption, S contains the matrix N from (7) as a submatrix and thus \(r_1 \ge \frac{1}{2}\), contradicting that \(r_1 = 0\).

Case 2 \(a_1 = b_1 = c_1 = 1\).

In this case, S contains the submatrices

$$\begin{aligned} A = { \left[ \!\begin{array}{r|rrr} 0 &{} 1 &{} 1 &{} 1 \\ \hline 0 &{} -1 &{} 0 &{} 0 \\ 0 &{} 0 &{} -1 &{} 0 \\ 0 &{} 0 &{} 0 &{} -1 \end{array}\!\right] } \quad \text { and }\quad B = { \left[ \!\begin{array}{r|rrr} 0 &{} 1 &{} 1 &{} 1 \\ \hline -1 &{} -1 &{} 0 &{} 0 \\ -1 &{} 0 &{} -1 &{} 0 \\ -1 &{} 0 &{} 0 &{} -1 \end{array}\!\right] }. \end{aligned}$$

The conditions \(|\det (r+A)| \le 1\) and \(|\det (r+B)| \le 1\) translate into the contradictory inequalities \(r_2+r_3+r_4 \le 1\) and \(r_2+r_3+r_4 \ge 2\), respectively.

Case 3 Exactly two of the entries \(a_1,b_1,c_1\) equal \(-1\).

Without loss of generality, we may permute the last three rows of S, and assume that \(a_1 = b_1 = -1\). We find that S now contains the submatrices

$$\begin{aligned} C = { \left[ \!\begin{array}{r|rrr} 0 &{} -1 &{} -1 &{} 0 \\ \hline 0 &{} -1 &{} 0 &{} 0 \\ 0 &{} 0 &{} -1 &{} 0 \\ 0 &{} 0 &{} 0 &{} -1 \end{array}\!\right] } \,,\, D = { \left[ \!\begin{array}{r|rrr} 0 &{} -1 &{} -1 &{} 0 \\ \hline -1 &{} -1 &{} 0 &{} 0 \\ -1 &{} 0 &{} -1 &{} 0 \\ -1 &{} 0 &{} 0 &{} -1 \end{array}\!\right] } \,,\, E = { \left[ \!\begin{array}{r|rrr} -1 &{} -1 &{} -1 &{} 0 \\ \hline -1 &{} -1 &{} 0 &{} -1 \\ -1 &{} 0 &{} -1 &{} -1 \\ -1 &{} 0 &{} 0 &{} 0 \end{array}\!\right] }. \end{aligned}$$

The conditions \(|\det (r+C)| \le 1\), \(|\det (r+D)| \le 1\) and \(|\det (r+E)| \le 1\) translate into the contradictory inequalities \(r_2 + r_3 \le 1\), \(r_4 \ge \frac{1}{2}\), and \(r_4 + 1 \le r_2 + r_3\), respectively.

Case 4 Exactly two of the entries \(a_1,b_1,c_1\) equal 1.

As in Case 3, we may assume that \(a_1 = b_1 = 1\). Here, the following six matrices can be found as submatrices in S:

$$\begin{aligned}{} & {} \left[ \!\begin{array}{r|rrr} -1 &{} 0 &{} 0 &{} -1 \\ \hline -1 &{} -1 &{} 0 &{} 0 \\ -1 &{} 0 &{} -1 &{} 0 \\ -1 &{} 0 &{} 0 &{} -1 \end{array}\!\right] \ ,\ \left[ \!\begin{array}{r|rrr} -1 &{} 0 &{} 0 &{} -1 \\ \hline -1 &{} -1 &{} -1 &{} 0 \\ -1 &{} 0 &{} 0 &{} 0 \\ -1 &{} 0 &{} -1 &{} -1 \end{array}\!\right] \ ,\ \left[ \!\begin{array}{r|rrr} -1 &{} 0 &{} 0 &{} -1 \\ \hline -1 &{} 0 &{} 0 &{} 0 \\ -1 &{} -1 &{} -1 &{} 0 \\ -1 &{} -1 &{} 0 &{} -1 \end{array}\!\right] ,\\{} & {} \left[ \!\begin{array}{r|rrr} 0 &{} 1 &{} 1 &{} 0 \\ \hline -1 &{} -1 &{} 0 &{} 0 \\ -1 &{} 0 &{} -1 &{} 0 \\ -1 &{} 0 &{} 0 &{} -1 \end{array}\!\right] \ ,\ \left[ \!\begin{array}{r|rrr} 0 &{} 1 &{} 1 &{} 0 \\ \hline -1 &{} -1 &{} 0 &{} -1 \\ -1 &{} 0 &{} -1 &{} -1 \\ -1 &{} 0 &{} 0 &{} 0 \end{array}\!\right] \ ,\ \left[ \!\begin{array}{r|rrr} 0 &{} 1 &{} 1 &{} 0 \\ \hline 0 &{} -1 &{} 0 &{} 0 \\ 0 &{} 0 &{} -1 &{} 0 \\ 0 &{} 0 &{} 0 &{} -1 \end{array}\!\right] . \end{aligned}$$

The minor conditions for these matrices translate into the inequality system

$$\begin{aligned} r_4 \le \tfrac{1}{2}, \qquad r_3 \le r_2, \qquad r_2 \le r_3, \\ r_4 \ge \tfrac{1}{2}, \qquad r_2 + r_3 \ge 1, \qquad r_2 + r_3 \le 1, \end{aligned}$$

in the same order as the matrices were given above. Solving this inequality system shows that necessarily \(r_2 = r_3 = r_4 = \frac{1}{2}\), completing the proof. \(\square \)

Proof of Proposition 6 (iii)

We write a Sauer Matrix S of type (2, 2) in the form

$$\begin{aligned} S = \left[ \!\begin{array}{rrrr|rrrr|rrrr|rrrr} 0 &{} 0 &{} 0 &{} 0 &{} \pm 1 &{} \pm 1 &{} \pm 1 &{} \pm 1 &{} 0 &{} 0 &{} 0 &{} 0 &{} \pm 1 &{} \pm 1 &{} \pm 1 &{} \pm 1 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} \pm 1 &{} \pm 1 &{} \pm 1 &{} \pm 1 &{} \pm 1 &{} \pm 1 &{} \pm 1 &{} \pm 1 \\ 0 &{} -1 &{} 0 &{} -1 &{} 0 &{} -1 &{} 0 &{} -1 &{} 0 &{} -1 &{} 0 &{} -1 &{} 0 &{} -1 &{} 0 &{} -1 \\ 0 &{} 0 &{} -1 &{} -1 &{} 0 &{} 0 &{} -1 &{} -1 &{} 0 &{} 0 &{} -1 &{} -1 &{} 0 &{} 0 &{} -1 &{} -1 \end{array}\!\right] , \end{aligned}$$

where the four groups of columns are referred to as Block 1 to Block 4, and where in each of the first two rows we have at least one entry equal to 1. We find four Sauer Matrices of type (1, 2) as submatrices of S:

$$\begin{aligned} M_{1,1,2} - \text {consisting of rows } 1,3,4 \text { and the columns in Block 1 and Block 2}\\ M_{1,3,4} - \text {consisting of rows } 1,3,4 \text { and the columns in Block 3 and Block 4}\\ M_{2,1,4} - \text {consisting of rows } 2,3,4 \text { and the columns in Block 1 and Block 4}\\ M_{2,2,3} - \text {consisting of rows } 2,3,4 \text { and the columns in Block 2 and Block 3} \end{aligned}$$

Now, let \(r \in [0,1)^4\) be feasible for S. As the first two rows of S both contain an entry equal to 1, we always have \(r_1 = r_2 = 0\). We find that \((0,r_3,r_4)^\intercal \) must be a feasible vector for the Sauer Matrices \(M_{1,1,2}\), \(M_{1,3,4}\), \(M_{2,1,4}\), and \(M_{2,2,3}\). Using Proposition 9, we see that if we want to allow the possibility of r being different from \((0,0,\frac{1}{2},\frac{1}{2})^\intercal \), then the sign patterns in the \(1 \times 4\) blocks \([\pm 1,\pm 1,\pm 1,\pm 1]\) in those four matrices must all belong either to \(\{(+,-,+,-) , (-,+,-,+)\}\) or to \(\{(+,+,-,-) , (-,-,+,+)\}\). Moreover, as \(r_1 = r_2 = 0\), we may multiply the first or the second row of S with \(-1\) without losing the total 1-submodularity of \(r+S\). This means that we may assume that the sign patterns in the \(1 \times 4\) blocks corresponding to \((\text {Row 1},\text {Block 2})\) and \((\text {Row 2},\text {Block 3})\) are the same. Again as \(r_1 = r_2 = 0\), total 1-submodularity of \(r+S\) is also not affected if we exchange the first two rows of S, or Block 2 with Block 3.

These reductions show that the only Sauer Matrices of type (2, 2) that possibly allow feasible translation vectors \(r = (0,0,r_3,r_4)^\intercal \) different from \((0,0,\frac{1}{2},\frac{1}{2})^\intercal \) have their first two rows given by

$$\begin{aligned} R_1&= \left[ \!\begin{array}{rrrr|rrrr|rrrr|rrrr} 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} -1 &{} 1 &{} -1 &{} 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} -1 &{} 1 &{} -1 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} -1 &{} 1 &{} -1 &{} 1 &{} -1 &{} 1 &{} -1 \end{array}\!\right] , \\ R_2&= \left[ \!\begin{array}{rrrr|rrrr|rrrr|rrrr} 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} -1 &{} 1 &{} -1 &{} 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} -1 &{} 1 &{} -1 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} -1 &{} 1 &{} -1 &{} -1 &{} 1 &{} -1 &{} 1 \end{array}\!\right] , \\ R_3&= \left[ \!\begin{array}{rrrr|rrrr|rrrr|rrrr} 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} -1 &{} 1 &{} -1 &{} 0 &{} 0 &{} 0 &{} 0 &{} -1 &{} 1 &{} -1 &{} 1 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} -1 &{} 1 &{} -1 &{} -1 &{} 1 &{} -1 &{} 1 \end{array}\!\right] , \end{aligned}$$

or by

$$\begin{aligned} R_4&= \left[ \!\begin{array}{rrrr|rrrr|rrrr|rrrr} 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} 1 &{} -1 &{} -1 &{} 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} 1 &{} -1 &{} -1 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} 1 &{} -1 &{} -1 &{} 1 &{} 1 &{} -1 &{} -1 \end{array}\!\right] , \\ R_5&= \left[ \!\begin{array}{rrrr|rrrr|rrrr|rrrr} 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} 1 &{} -1 &{} -1 &{} 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} 1 &{} -1 &{} -1 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} 1 &{} -1 &{} -1 &{} -1 &{} -1 &{} 1 &{} 1 \end{array}\!\right] , \\ R_6&= \left[ \!\begin{array}{rrrr|rrrr|rrrr|rrrr} 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} 1 &{} -1 &{} -1 &{} 0 &{} 0 &{} 0 &{} 0 &{} -1 &{} -1 &{} 1 &{} 1 \\ 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 0 &{} 1 &{} 1 &{} -1 &{} -1 &{} -1 &{} -1 &{} 1 &{} 1 \end{array}\!\right] . \end{aligned}$$

Observe also that for the first three matrices, the feasible vectors are of the form \(r = (0,0,r_3,\frac{1}{2})^\intercal \), for \(\frac{1}{4} \le r_3 \le \frac{3}{4}\), whereas for the second three matrices they have the form \(r = (0,0,\frac{1}{2},r_4)^\intercal \), for \(\frac{1}{4} \le r_4 \le \frac{3}{4}\). Let us denote the \(4 \times 4\) submatrix of S indexed by the columns \(i,j,k,\ell \in \{1,2,\ldots ,16\}\) by \(S_{i,j,k,\ell }\).

Case 1: First two rows of S are either \(R_1\), \(R_2\), or \(R_3\), and \(r = (0,0,r_3,\frac{1}{2})^\intercal \).

For \(R_1\), applying the condition \(\varDelta _4(r+S) \le 1\) to the submatrices \(S_{1,6,10,16}\) and \(S_{2,5,9,15}\) yields \(r_3 \le \frac{1}{2}\) and \(r_3 \ge \frac{1}{2}\), respectively. Thus \(r = (0,0,\frac{1}{2},\frac{1}{2})^\intercal \).

For \(R_2\), applying the condition \(\varDelta _4(r+S) \le 1\) to the submatrices \(S_{1,5,10,15}\) and \(S_{2,6,9,16}\) yields \(r_3 \le \frac{1}{2}\) and \(r_3 \ge \frac{1}{2}\), respectively. Thus \(r = (0,0,\frac{1}{2},\frac{1}{2})^\intercal \).

For \(R_3\), the submatrix \(S_{1,6,10,14}\) gives determinant \(|\det (r+S_{1,6,10,14})| = \frac{3}{2}\), contradicting the condition \(\varDelta _4(r+S) \le 1\). Thus, there is no feasible vector \(r \in [0,1)^4\) at all in this case.

Case 2: First two rows of S are either \(R_4\), \(R_5\), or \(R_6\), and \(r = (0,0,\frac{1}{2},r_4)^\intercal \).

For \(R_4\), applying the condition \(\varDelta _4(r+S) \le 1\) to the submatrices \(S_{1,6,10,13}\) and \(S_{3,5,9,14}\) yields \(r_4 \le \frac{1}{2}\) and \(r_4 \ge \frac{1}{2}\), respectively. Thus \(r = (0,0,\frac{1}{2},\frac{1}{2})^\intercal \).

For \(R_5\), applying the condition \(\varDelta _4(r+S) \le 1\) to the submatrices \(S_{1,5,10,14}\) and \(S_{3,6,9,13}\) yields \(r_4 \le \frac{1}{2}\) and \(r_4 \ge \frac{1}{2}\), respectively. Thus \(r = (0,0,\frac{1}{2},\frac{1}{2})^\intercal \).

For \(R_6\), the submatrix \(S_{1,7,11,15}\) gives determinant \(|\det (r+S_{1,7,11,15})| = \frac{3}{2}\), contradicting the condition \(\varDelta _4(r+S) \le 1\). Thus, there is no feasible vector \(r \in [0,1)^4\) at all in this case.

Summarizing our results from above, we see that if a Sauer Matrix of type (2, 2) admits a feasible vector \(r \in [0,1)^4\), then \(r = (0,0,\frac{1}{2},\frac{1}{2})^\intercal \), as desired. \(\square \)

4.3 Sauer matrices of size 5

Proposition 10

  1. (i)

    The Sauer Matrix of type (0, 5) is infeasible for translations.

  2. (ii)

    No Sauer Matrix of type (1, 4) is feasible for translations.

  3. (iii)

    No Sauer Matrix of type (2, 3) is feasible for translations.

Proof

(i) The argument is similar to the one for Proposition 6 (i). Assume for contradiction that there is a vector \(r \in [0,1)^5\) with \(\varDelta _5(r+S) \le 1\). Consider the following two \(5 \times 5\) submatrices of S:

$$\begin{aligned} X = { \left[ \!\begin{array}{r|rrrr} 0 &{} 0 &{} 0 &{} 0 &{} 0 \\ \hline 0 &{} 0 &{} -1 &{} -1 &{} -1 \\ 0 &{} -1 &{} 0 &{} -1 &{} -1 \\ 0 &{} -1 &{} -1 &{} 0 &{} -1 \\ 0 &{} -1 &{} -1 &{} -1 &{} 0 \end{array}\!\right] } \quad \text { and }\quad Y = { \left[ \!\begin{array}{r|rrrr} -1 &{} -1 &{} -1 &{} -1 &{} -1 \\ \hline -1 &{} -1 &{} 0 &{} 0 &{} 0 \\ -1 &{} 0 &{} -1 &{} 0 &{} 0 \\ -1 &{} 0 &{} 0 &{} -1 &{} 0 \\ -1 &{} 0 &{} 0 &{} 0 &{} -1 \end{array}\!\right] }. \end{aligned}$$

By the \(5 \times 5\) minor condition on \(r+S\), we have

$$\begin{aligned} |\det (r+X)| = r_1 \cdot \det { \begin{bmatrix} 0 &{}\quad 1 &{}\quad 1 &{}\quad 1 \\ 1 &{}\quad 0 &{}\quad 1 &{}\quad 1 \\ 1 &{}\quad 1 &{}\quad 0 &{}\quad 1 \\ 1 &{}\quad 1 &{}\quad 1 &{}\quad 0 \end{bmatrix}} = 3 r_1 \le 1, \end{aligned}$$

and hence \(r_1 \le \frac{1}{3}\). Likewise, we have

$$\begin{aligned} |\det (r+Y)| = (1-r_1) \cdot \det { \begin{bmatrix} 0 &{}\quad 1 &{}\quad 1 &{}\quad 1 \\ 1 &{}\quad 0 &{}\quad 1 &{}\quad 1 \\ 1 &{}\quad 1 &{}\quad 0 &{}\quad 1 \\ 1 &{}\quad 1 &{}\quad 1 &{}\quad 0 \end{bmatrix}} = 3 (1-r_1) \le 1. \end{aligned}$$

Therefore, we get \(r_1 \ge \frac{2}{3}\), a contradiction.

(ii) Let S be a Sauer Matrix of type (1, 4) and without loss of generality, we may assume that the first row of S contains an entry equal to 1. We also assume for contradiction that there is some \(r \in [0,1)^5\) such that \(r+S\) is totally 1-submodular. As the entries of \(r+S\) are contained in \([-1,1]\), we get that \(r_1 = 0\). Moreover, the last four rows of S contain a Sauer Matrix of type (0, 4). By Proposition 6 (i), this means that \(r_2 = r_3 = r_4 = r_5 = \frac{1}{2}\), so that in summary there is only one possibility for the translation vector r.

Now, as \(r_1 = 0\), we may multiply the first row of S with \(-1\) if needed, and can assume that the vector \((-1,-1,-1,-1,-1)^\intercal \) is a column of S. If M denotes any of the four matrices

$$\begin{aligned}&{ \left[ \!\begin{array}{r|rrrr} -1 &{} -1 &{} 0 &{} 0 &{} 0 \\ \hline -1 &{} -1 &{} 0 &{} 0 &{} 0 \\ -1 &{} 0 &{} -1 &{} 0 &{} 0 \\ -1 &{} 0 &{} 0 &{} -1 &{} 0 \\ -1 &{} 0 &{} 0 &{} 0 &{} -1 \end{array}\!\right] \,,\, \left[ \!\begin{array}{r|rrrr} -1 &{} 0 &{} -1 &{} 0 &{} 0 \\ \hline -1 &{} -1 &{} 0 &{} 0 &{} 0 \\ -1 &{} 0 &{} -1 &{} 0 &{} 0 \\ -1 &{} 0 &{} 0 &{} -1 &{} 0 \\ -1 &{} 0 &{} 0 &{} 0 &{} -1 \end{array}\!\right] \,,} \\&{ \left[ \!\begin{array}{r|rrrr} -1 &{} 0 &{} 0 &{} -1 &{} 0 \\ \hline -1 &{} -1 &{} 0 &{} 0 &{} 0 \\ -1 &{} 0 &{} -1 &{} 0 &{} 0 \\ -1 &{} 0 &{} 0 &{} -1 &{} 0 \\ -1 &{} 0 &{} 0 &{} 0 &{} -1 \end{array}\!\right] \,,\, \left[ \!\begin{array}{r|rrrr} -1 &{} 0 &{} 0 &{} 0 &{} -1 \\ \hline -1 &{} -1 &{} 0 &{} 0 &{} 0 \\ -1 &{} 0 &{} -1 &{} 0 &{} 0 \\ -1 &{} 0 &{} 0 &{} -1 &{} 0 \\ -1 &{} 0 &{} 0 &{} 0 &{} -1 \end{array}\!\right] \,, } \end{aligned}$$

then the absolute value of the determinant of \(r+M\) equals 3/2. Thus, if indeed \(\varDelta _5(r+S) \le 1\), then none of these matrices can be a submatrix of S. As a consequence, each of the four columns of S whose support consists of row 1 and exactly one of the rows \(2,3,4,5\) must have its first entry equal to 1, which implies that

$$\begin{aligned} M' = \left[\begin{array}{r|rrrr} 0 &{} 1 &{} 1 &{} 1 &{} 1 \\ \hline -1 &{} -1 &{} 0 &{} 0 &{} 0 \\ -1 &{} 0 &{} -1 &{} 0 &{} 0 \\ -1 &{} 0 &{} 0 &{} -1 &{} 0 \\ -1 &{} 0 &{} 0 &{} 0 &{} -1 \end{array}\right] \end{aligned}$$

must be a submatrix of S. However, the determinant of \(r+M'\) equals \(-2\), in contradiction to \(r+S\) being totally 1-submodular.
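This determinant evaluation can also be verified numerically. A minimal sketch (numpy; it uses the translation vector \(r = (0,\frac{1}{2},\frac{1}{2},\frac{1}{2},\frac{1}{2})^\intercal\) derived above, added entrywise to every column of \(M'\)):

```python
import numpy as np

# the matrix M' from above
M_prime = np.array([
    [ 0,  1,  1,  1,  1],
    [-1, -1,  0,  0,  0],
    [-1,  0, -1,  0,  0],
    [-1,  0,  0, -1,  0],
    [-1,  0,  0,  0, -1],
], dtype=float)

# the only candidate translation vector r = (0, 1/2, 1/2, 1/2, 1/2)
r = np.array([0, 0.5, 0.5, 0.5, 0.5])

# adding r to every column shifts each row i by r_i
shifted = M_prime + r[:, None]
print(round(np.linalg.det(shifted), 6))  # → -2.0
```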

(iii) Assume that there is a Sauer Matrix S of type (2, 3) and a vector \(r \in [0,1)^5\) that is feasible for S. Observe that S contains feasible Sauer Matrices of types (1, 3) in its rows indexed by \(\{1,3,4,5\}\) and by \(\{2,3,4,5\}\). By Proposition 6 (ii) this means that necessarily we have \(r = (0,0,\frac{1}{2},\frac{1}{2},\frac{1}{2})^\intercal \), and we can now argue similarly as we did in part (ii).

First of all, as \(r_1 = r_2 = 0\), we may multiply the first or second row of S with \(-1\) if needed, and can assume that the vectors \((-1,0,-1,-1,-1)^\intercal \) and \((0,-1,0,0,0)^\intercal \) are columns of S. We distinguish cases based on the signs of the entries in the first or second row of the columns \(a = (\pm 1,0,-1,0,0)^\intercal \), \(b = (\pm 1,0,0,-1,0)^\intercal \), \(c = (\pm 1,0,0,0,-1)^\intercal \), and \(a' = (0,\pm 1,-1,0,0)^\intercal \), \(b' = (0,\pm 1,0,-1,0)^\intercal \), \(c' = (0,\pm 1,0,0,-1)^\intercal \) of S.

Case 1 \(a_1 = b_1 = c_1 = 1\) or \(a'_2 = b'_2 = c'_2 = -1\).

Here, one of the matrices

$$\begin{aligned} C_1 = \left[\begin{array}{rr|rrr} 0 &{} 0 &{} 1 &{} 1 &{} 1 \\ 0 &{} -1 &{} 0 &{} 0 &{} 0 \\ \hline -1 &{} 0 &{} -1 &{} 0 &{} 0 \\ -1 &{} 0 &{} 0 &{} -1 &{} 0 \\ -1 &{} 0 &{} 0 &{} 0 &{} -1 \end{array}\right] \quad \text { or } \quad C_2 = \left[\begin{array}{rr|rrr} 0 &{} -1 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} -1 &{} -1 &{} -1 \\ \hline 0 &{} -1 &{} -1 &{} 0 &{} 0 \\ 0 &{} -1 &{} 0 &{} -1 &{} 0 \\ 0 &{} -1 &{} 0 &{} 0 &{} -1 \end{array}\right] \end{aligned}$$

must be a submatrix of S, but the absolute value of the determinant of both \(r+C_1\) and \(r+C_2\) equals 3/2.

Case 2 Two of the entries \(a_1,b_1,c_1\) equal \(-1\) or two of the entries \(a'_2,b'_2,c'_2\) equal 1.

Without loss of generality, we may permute the last three rows of S, and assume that either \(a_1 = b_1 = -1\) or \(a'_2 = b'_2 = 1\). Now, one of the matrices

$$\begin{aligned} C_3 = \left[\begin{array}{rr|rrr} -1 &{} 0 &{} -1 &{} -1 &{} 0 \\ 0 &{} -1 &{} 0 &{} 0 &{} 0 \\ \hline -1 &{} 0 &{} -1 &{} 0 &{} 0 \\ -1 &{} 0 &{} 0 &{} -1 &{} 0 \\ -1 &{} 0 &{} 0 &{} 0 &{} -1 \end{array}\right] \quad \text { or } \quad C_4 = \left[\begin{array}{rr|rrr} -1 &{} 0 &{} 0 &{} 0 &{} 0 \\ 0 &{} -1 &{} 1 &{} 1 &{} 0 \\ \hline -1 &{} 0 &{} -1 &{} 0 &{} 0 \\ -1 &{} 0 &{} 0 &{} -1 &{} 0 \\ -1 &{} 0 &{} 0 &{} 0 &{} -1 \end{array}\right] \end{aligned}$$

must be a submatrix of S, but again the absolute value of the determinant of both \(r+C_3\) and \(r+C_4\) equals 3/2.

Case 3 Up to permuting the last three rows of S we have \(\left[\begin{smallmatrix} a_1 &{} b_1 &{} c_1 \\ a'_2 &{} b'_2 &{} c'_2 \end{smallmatrix}\right] = \left[\begin{smallmatrix} -1 &{} 1 &{} 1 \\ 1 &{} -1 &{} -1 \end{smallmatrix}\right]\).

With this assumption, one of the matrices

$$\begin{aligned}&\left[\begin{array}{rr|rrr} -1 &{} 0 &{} 0 &{} 1 &{} 1 \\ -1 &{} -1 &{} 0 &{} 0 &{} 0 \\ \hline -1 &{} 0 &{} -1 &{} 0 &{} 0 \\ -1 &{} 0 &{} 0 &{} -1 &{} 0 \\ -1 &{} 0 &{} 0 &{} 0 &{} -1 \end{array}\right] , \quad \left[\begin{array}{rr|rrr} -1 &{} -1 &{} 0 &{} 0 &{} 0 \\ 1 &{} 0 &{} 1 &{} 0 &{} 0 \\ \hline -1 &{} -1 &{} -1 &{} 0 &{} 0 \\ -1 &{} 0 &{} 0 &{} -1 &{} 0 \\ -1 &{} 0 &{} 0 &{} 0 &{} -1 \end{array}\right] , \\&\left[\begin{array}{rr|rrr} 1 &{} 0 &{} -1 &{} 1 &{} 1 \\ -1 &{} -1 &{} 0 &{} 0 &{} 0 \\ \hline -1 &{} 0 &{} -1 &{} 0 &{} 0 \\ -1 &{} 0 &{} 0 &{} -1 &{} 0 \\ -1 &{} 0 &{} 0 &{} 0 &{} -1 \end{array}\right] , \quad \left[\begin{array}{rr|rrr} 1 &{} -1 &{} 0 &{} 0 &{} 0 \\ 1 &{} 0 &{} 0 &{} -1 &{} -1 \\ \hline -1 &{} -1 &{} -1 &{} 0 &{} 0 \\ -1 &{} -1 &{} 0 &{} -1 &{} 0 \\ -1 &{} -1 &{} 0 &{} 0 &{} -1 \end{array}\right] \end{aligned}$$

must be a submatrix of S, because one of the four vectors \((\pm 1,\pm 1,-1,-1,-1)^\intercal \) must be a column of S. As before, if F denotes any of these four matrices, then the absolute value of the determinant of \(r+F\) equals 3/2.

Case 4 Up to permuting the last three rows of S we have \(\left[\begin{smallmatrix} a_1 &{} b_1 &{} c_1 \\ a'_2 &{} b'_2 &{} c'_2 \end{smallmatrix}\right] = \left[\begin{smallmatrix} -1 &{} 1 &{} 1 \\ -1 &{} -1 &{} 1 \end{smallmatrix}\right]\).

In this case, one of the matrices

$$\begin{aligned} C_7 = \left[\begin{array}{rr|rrr} -1 &{} 0 &{} 1 &{} 0 &{} 0 \\ 0 &{} -1 &{} 0 &{} -1 &{} 0 \\ \hline 0 &{} -1 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} -1 &{} -1 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} -1 \end{array}\right] \quad \text { or } \quad C_8 = \left[\begin{array}{rr|rrr} 1 &{} -1 &{} 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} -1 &{} -1 &{} 0 \\ \hline 0 &{} -1 &{} -1 &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 &{} -1 &{} 0 \\ 0 &{} 0 &{} 0 &{} 0 &{} -1 \end{array}\right] \end{aligned}$$

must be a submatrix of S, because one of the vectors \((\pm 1,0,0,0,0)^\intercal \) must be a column of S. As before, the absolute value of the determinant of both \(r+C_7\) and \(r+C_8\) equals 3/2.

In conclusion, in all cases we found a \(5 \times 5\) minor of \(r+S\) whose absolute value is greater than 1, and thus no feasible Sauer Matrix of type (2, 3) can exist. \(\square \)

With these preparations, we can now exclude the existence of any Sauer Matrix of size 5 that is feasible for translations.

Proof of Proposition 7

Just as in the proof of Proposition 6 (iv), whenever at least three rows of a Sauer Matrix contain an entry equal to 1, the matrix is infeasible for translations.

Thus, we may assume that S is a Sauer Matrix whose type is either (0, 5), (1, 4), or (2, 3). However, we have just proven in Proposition 10 that all such Sauer Matrices are infeasible for translations. \(\square \)

5 Computational experiments and the proof of Theorem 2

In this section, we describe a computational approach to determine so far unknown values of \({{\,\textrm{h}\,}}(\varDelta ,m)\) for small parameters \(m,\varDelta \in \mathbb {Z}_{>0}\). The results of our computations led us to identify a family of counterexamples to Conjecture 1 that lie behind the lower bounds in Theorem 2. Our approach is based on the sandwich factory classification scheme described and utilized by Averkov, Borger and Soprunov [3, Sect. 7.3].

As we have done implicitly already in previous sections, we now explicitly work with sets of integer points in \(\mathbb {Z}^m\), rather than with \(\varDelta \)-modular integer matrices with m rows and full rank. To this end, for a point set \(S \subseteq \mathbb {Z}^m\), we write

$$\begin{aligned} \varDelta (S) := \max \left\{ |\det (S')| : S' \subseteq S , |S'| = m \right\} \end{aligned}$$

for the maximum absolute value of the determinant of a matrix whose columns constitute an m-element subset of S. If \(P \subseteq \mathbb {R}^m\) is a lattice polytope, meaning that all its vertices belong to \(\mathbb {Z}^m\), then we write \(\varDelta (P) := \varDelta (P \cap \mathbb {Z}^m)\). Because the maximum determinant is attained by an m-element subset of the vertices of P, we also have \(\varDelta (P) = \varDelta (\{x \in \mathbb {Z}^m : x \text { is a vertex of } P\})\). For the same reason, we have \(\varDelta ({{\,\textrm{conv}\,}}\{S\}) = \varDelta (S)\), for every full-dimensional set \(S \subseteq \mathbb {Z}^m\). Since the value of \({{\,\textrm{h}\,}}(\varDelta ,m)\) is attained by a matrix A whose columns come in opposite pairs \(A_i, -A_i\), we restrict our attention to o-symmetric lattice polytopes \(P \subseteq \mathbb {R}^m\), that is, we require \(P = -P\) to hold. In this language the generalized Heller constant is

$$\begin{aligned} {{\,\textrm{h}\,}}(\varDelta ,m) = \max \bigl \{ |P \cap \mathbb {Z}^m| : P \subseteq \mathbb {R}^m \text { is an } o\text {-symmetric lattice polytope with } \varDelta (P) = \varDelta \bigr \}. \end{aligned}$$
(8)
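For small point sets, \(\varDelta (S)\) can be evaluated by brute force over all m-element subsets. A minimal sketch (numpy/itertools; the function name `Delta` is our choice, not taken from the paper's implementation):

```python
import itertools
import numpy as np

def Delta(S):
    """Maximal |det| over all m-element subsets of S, taken as matrix
    columns, where m is the ambient dimension."""
    m = len(S[0])
    return max(
        abs(round(np.linalg.det(np.array(cols).T)))
        for cols in itertools.combinations(S, m)
    )

# H_2 = {0, ±e1, ±e2, ±(e1 - e2)} attains the Heller constant h(1,2) = 7
H2 = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]
print(Delta(H2))  # → 1
```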

Therefore, in order to computationally determine the value \({{\,\textrm{h}\,}}(\varDelta ,m)\) for a pair \((\varDelta ,m)\) of parameters, we may want to solve any of the following classification problems.

Problem 1

Given \(m,\varDelta \in \mathbb {Z}_{>0}\), classify up to unimodular equivalence all o-symmetric lattice polytopes \(P \subseteq \mathbb {R}^m\) with \(\varDelta (P) = \varDelta \).

The second problem only aims to classify the extremal examples:

Problem 2

Given \(m,\varDelta \in \mathbb {Z}_{>0}\), classify up to unimodular equivalence all o-symmetric lattice polytopes \(Q \subseteq \mathbb {R}^m\) with \(\varDelta (Q) = \varDelta \) and with the maximal number of integer points under these constraints.

5.1 Classification by sandwich factory approach

As hinted above, we tackle these problems with the sandwich factory classification scheme of Averkov, Borger and Soprunov [3, Sect. 7.3]. This is a quite general and versatile approach that can be applied to various enumeration problems for lattice polytopes. Regarding Problem 2, the approach can be interpreted as a branch-and-bound procedure for the maximization of \(|Q \cap \mathbb {Z}^m|\) subject to \(\varDelta (Q) \le \varDelta \).

The basic idea is to use a so-called sandwich (A, B) with inner part A and outer part B, that is, a pair of lattice polytopes satisfying the inclusion \(A \subseteq B\). In the branch-and-bound interpretation, each sandwich corresponds to a node of a branch-and-bound tree. Such a sandwich represents the family of all lattice polytopes P that are unimodularly equivalent to a lattice polytope \(P'\) satisfying \(A \subseteq P' \subseteq B\); if the latter condition holds, one says that P occurs in the sandwich (A, B). Suppose a family \(\mathcal {F}\) of polytopes is to be enumerated up to unimodular equivalence, and one can find a finite list of sandwiches such that every polytope \(P \in \mathcal {F}\) occurs in one of them. The enumeration then proceeds by iteratively selecting sandwiches with \(A \varsubsetneq B\) for which the discrepancy between A and B is large, and replacing each such sandwich (A, B) with finitely many sandwiches that have a smaller discrepancy between the inner and the outer part. This can be understood as branching. For quantifying the discrepancy between A and B, different functions can be employed.

For our purposes it is natural to use the lattice point gap \(|B \cap \mathbb {Z}^m| - |A \cap \mathbb {Z}^m|\). A natural way to replace (A, B) by sandwiches with a smaller lattice point gap is to pick a vertex v of B that is not contained in A and to split (A, B) into two sandwiches: one whose inner part contains v and one whose outer part does not contain v. The iterative procedure continues until all sandwiches in the list have lattice point gap equal to zero.

There are two important aspects that allow us to optimize the running time. First, two sandwiches (A, B) and \((A',B')\) are called equivalent if there is a unimodular transformation that simultaneously maps A to \(A'\) and B to \(B'\). For the enumeration of polytopes with a property P that is invariant under unimodular equivalence, it therefore suffices to keep sandwiches up to this notion of equivalence; our enumeration task concerns the property P of a lattice polytope A that \(\varDelta (A) = \varDelta \). As described in [3, Lem. 7.9], equivalence of two given sandwiches can be expressed as unimodular equivalence of suitable higher-dimensional lattice polytopes associated with the two sandwiches. The second aspect is monotonicity. If P is the conjunction P \(=\) P\(_1 \wedge \) P\(_2\), where P\(_1\) is a downward closed property and P\(_2\) is an upward closed property, then we can prune every generated sandwich (A, B) for which A does not satisfy P\(_1\) or B does not satisfy P\(_2\).

A further tool for an efficient implementation of these ideas is the reduction of a polytope B relative to some polytope \(A \subseteq B\). This simply means that, before adding a possible new sandwich (A, B) during the iteration, we discard all integer points \(v \in B\) with \(\varDelta (A \cup \{v\}) > \varDelta (A)\). More precisely, the reduced sandwich \((A,B')\) of (A, B) is defined by \(B' = {{\,\textrm{conv}\,}}\{ v \in B \cap \mathbb {Z}^m : \varDelta (A \cup \{v\}) = \varDelta (A) \}\).
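In the same lattice-point-set representation, the reduction of the outer part can be sketched as below. This is a simplification: we filter the integer points directly and omit taking the convex hull; `Delta` is a brute-force evaluation of \(\varDelta \), and the helper names are ours:

```python
import itertools
import numpy as np

def Delta(S):
    """Maximal |det| over all m-element subsets of S (columns)."""
    m = len(S[0])
    return max(abs(round(np.linalg.det(np.array(c).T)))
               for c in itertools.combinations(S, m))

def reduce_outer(A, B):
    """Keep only the points v of B whose addition to the inner part A
    does not increase the maximal absolute minor."""
    dA = Delta(A)
    return [v for v in B if Delta(A + [v]) == dA]

# the unit cross polytope as inner part; adding (2, 1) would raise Delta to 2
A = [(1, 0), (-1, 0), (0, 1), (0, -1)]
B = A + [(1, 1), (2, 1)]
print(reduce_outer(A, B))  # (1, 1) survives, (2, 1) is discarded
```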

With these details of the implementation in mind, we can now describe the procedure to solve Problem 1 as done in Algorithm 1. Regarding the classification of all o-symmetric lattice polytopes \(P \subseteq \mathbb {R}^m\) with \(\varDelta (P) = \varDelta \) and \({{\,\textrm{h}\,}}(\varDelta ,m) = |P \cap \mathbb {Z}^m|\) in Problem 2, we need to make the following adjustments to Algorithm 1:

(i) We maintain a value \(\text {cmax}\) that we initialize with the valid lower bound \(m^2 + m + 1 + 2m(\varDelta -1)\) on \({{\,\textrm{h}\,}}(\varDelta ,m)\) (see (2)).

(ii) We never add a sandwich (A, B) with \(|B \cap \mathbb {Z}^m| < \text {cmax}\) during the algorithm. In the branch-and-bound interpretation, this adjustment corresponds to pruning the sandwich (A, B).

(iii) If we add a sandwich (A, B) to \(\mathcal {F}\) with \(|A \cap \mathbb {Z}^m| > \text {cmax}\), then we update \(\text {cmax}\) to \(|A \cap \mathbb {Z}^m|\).

[Algorithm 1]

Remark 2

The initialization of the sandwich factory in Step 1 of Algorithm 1 can be done by generating all Hermite normal forms of integer matrices \(M \in \mathbb {Z}^{m \times m}\) with \(\varDelta (M) = \varDelta \) (cf. Schrijver [14, Sect. 4.1]).
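For illustration, in the case \(m = 2\) these initial candidates can be enumerated directly: an upper triangular Hermite normal form \([[a,b],[0,d]]\) with positive diagonal and determinant \(\varDelta \) satisfies \(a \cdot d = \varDelta \) with the off-diagonal entry reduced, here taken as \(0 \le b < d\) (one common normalization; conventions vary between row- and column-style forms). A minimal sketch, not the paper's implementation:

```python
def hnf_2x2(delta):
    """All 2x2 Hermite normal forms [[a, b], [0, d]] with a * d = delta,
    a, d >= 1 and the off-diagonal entry reduced: 0 <= b < d."""
    forms = []
    for a in range(1, delta + 1):
        if delta % a:
            continue
        d = delta // a
        for b in range(d):
            forms.append([[a, b], [0, d]])
    return forms

# the number of such forms is the divisor sum sigma(delta)
print(len(hnf_2x2(2)))  # → 3
```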

5.2 Computational results

We have implemented the previously described algorithms in SageMath [16], based on the existing implementation of the sandwich factory used in [3] by Christopher Borger.Footnote 4 The source code as well as data files containing the results of our computations are available at https://github.com/mschymura/delta-classification.

The computational results regarding the constant \({{\,\textrm{h}\,}}(\varDelta ,m)\) and the number of equivalence classes of o-symmetric lattice polytopes for a given \(\varDelta \) are gathered in Tables 1, 2 and 3.

Table 1 determines the previously unknown exact values of \({{\,\textrm{h}\,}}(\varDelta ,m)\), for \(m=3\) and \(3 \le \varDelta \le 13\), as well as for \((\varDelta ,m) = (3,4)\). It also reveals a counterexample to Conjecture 1 for the case \((\varDelta ,m) = (4,3)\), whose structure we used to construct counterexamples for every \((\varDelta ,m) \in \{ (4,4) , (4,5) , (8,4) , (8,5) \}\) as well. The construction is discussed in Sect. 5.3 below.

Table 1 The values of \({{\,\textrm{h}\,}}(\varDelta ,m)\) for small numbers \(m,\varDelta \)

Table 2 reports on the classification Problem 1 and lists the number of equivalence classes of o-symmetric lattice polytopes \(P \subseteq \mathbb {R}^m\) with \(\varDelta (P) = \varDelta \), for the range of parameters \(\varDelta ,m\) specified in the table.

Table 2 The number of equivalence classes of o-symmetric lattice polytopes \(P \subseteq \mathbb {R}^m\) for given \(\varDelta (P) = \varDelta \)

For the pairs \((\varDelta ,3)\) with \(10 \le \varDelta \le 13\), we used the modification of Algorithm 1 that solves Problem 2. The corresponding numbers of equivalence classes of extremizers of \({{\,\textrm{h}\,}}(\varDelta ,m)\) are given in Table 3. It is interesting to observe that, starting from dimension \(m=3\), non-uniqueness of an extremizer of \({{\,\textrm{h}\,}}(\varDelta ,m)\) is the norm.

Table 3 The number of equivalence classes of o-symmetric lattice polytopes \(P \subseteq \mathbb {R}^m\) for given \(\varDelta (P) = \varDelta \), which satisfy \({{\,\textrm{h}\,}}(\varDelta ,m) = |P \cap \mathbb {Z}^m|\)

5.3 Construction of counterexamples and Theorem 2

For a finite set \(S \subseteq \mathbb {R}^m\), we denote by

$$\begin{aligned} \mathcal {D}(S) := S - S = \left\{ a - b : a,b \in S \right\} \end{aligned}$$

its difference set, and by

$$\begin{aligned} {{\,\textrm{pyr}\,}}(S) := \left( S \times \{0\} \right) \cup \{ e_{m+1} \} \subseteq \mathbb {R}^{m+1} \end{aligned}$$

the pyramid over S of height one. It turns out that a combination of these two operations behaves very well with respect to the maximal absolute value of the determinant of m-element subsets.

Lemma 3

For every finite non-empty set \(S \subseteq \mathbb {R}^m\), we have

$$\begin{aligned} |\mathcal {D}({{\,\textrm{pyr}\,}}(S))| = |\mathcal {D}(S)| + 2 |S| \quad \text {and} \quad \varDelta (\mathcal {D}({{\,\textrm{pyr}\,}}(S))) = \varDelta (\mathcal {D}(S)). \end{aligned}$$

Proof

The difference set of the pyramid over S is given by

$$\begin{aligned} \mathcal {D}({{\,\textrm{pyr}\,}}(S)) = \left( (S - S) \times \{0\} \right) \cup \left( -S \times \{1\} \right) \cup \left( S \times \{-1\} \right) . \end{aligned}$$

Since S is finite, this immediately yields the cardinality count.

Let us now prove the statement on the maximal subdeterminants. First of all, we have \(\varDelta (\mathcal {D}({{\,\textrm{pyr}\,}}(S))) \ge \varDelta (\mathcal {D}(S))\), because we can take an m-element subset \(S' \subseteq \mathcal {D}(S)\) with \(|\det (S')| = \varDelta (\mathcal {D}(S))\) and lift it to the set \(S'' := (S' \times \{0\}) \cup \{ (-s,1) \} \subseteq \mathcal {D}({{\,\textrm{pyr}\,}}(S))\), for some \(s \in S\). Clearly, \(|\det (S'')| = |\det (S')| = \varDelta (\mathcal {D}(S))\).

Conversely, let \(S' = \{s_0',s_1',\ldots ,s_m'\} \subseteq \mathcal {D}({{\,\textrm{pyr}\,}}(S))\) be a subset of size \(m+1\), and without loss of generality, let \(s_0',s_1',\ldots ,s_\ell ' \in \left( (S-S) \times \{0\} \right) \), say \(s_i' = (s_{i1}-s_{i2},0)\) for suitable \(s_{i1},s_{i2} \in S\), and let \(s_{\ell +1}',\ldots ,s_m' \in \left( -S \times \{1\} \right) \), say \(s_j' = (-s_j,1)\) for suitable \(s_j \in S\). With this notation, we obtain

$$\begin{aligned} |\det (S')|&= |\det \left( \genfrac(){0.0pt}1{s_{01}-s_{02}}{0},\ldots ,\genfrac(){0.0pt}1{s_{\ell 1}-s_{\ell 2}}{0},\genfrac(){0.0pt}1{-s_{\ell +1}}{1},\ldots ,\genfrac(){0.0pt}1{-s_{m}}{1}\right) |\\&= |\det \left( \genfrac(){0.0pt}1{s_{01}-s_{02}}{0},\ldots ,\genfrac(){0.0pt}1{s_{\ell 1}-s_{\ell 2}}{0},\genfrac(){0.0pt}1{s_m-s_{\ell +1}}{0},\ldots ,\genfrac(){0.0pt}1{s_m-s_{m-1}}{0},\genfrac(){0.0pt}1{-s_{m}}{1}\right) |\\&= |\det \left( s_{01}-s_{02},\ldots ,s_{\ell 1}-s_{\ell 2},s_m-s_{\ell +1},\ldots ,s_m-s_{m-1}\right) |\\&\le \varDelta (\mathcal {D}(S)). \end{aligned}$$

\(\square \)
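Lemma 3 is easy to confirm computationally for small examples. The following sketch (our helper names, brute-force evaluation of \(\varDelta \)) checks both equalities for \(S = \{\textbf{0}, e_1, e_2\} \subseteq \mathbb {Z}^2\), where \(\mathcal {D}(S) = H_2\):

```python
import itertools
import numpy as np

def diff_set(S):
    """Difference set D(S) = S - S."""
    return sorted({tuple(a[i] - b[i] for i in range(len(a)))
                   for a in S for b in S})

def pyr(S):
    """Pyramid over S of height one: (S x {0}) plus the new unit vector."""
    m = len(S[0])
    return [p + (0,) for p in S] + [(0,) * m + (1,)]

def Delta(S):
    m = len(S[0])
    return max(abs(round(np.linalg.det(np.array(c).T)))
               for c in itertools.combinations(S, m))

S = [(0, 0), (1, 0), (0, 1)]
print(len(diff_set(pyr(S))), len(diff_set(S)) + 2 * len(S))  # → 13 13
print(Delta(diff_set(pyr(S))), Delta(diff_set(S)))           # → 1 1
```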

Based on this lemma we can now construct series of examples that, for all sufficiently large m, exceed the conjectured values of \({{\,\textrm{h}\,}}(4,m)\), \({{\,\textrm{h}\,}}(8,m)\), and \({{\,\textrm{h}\,}}(16,m)\) in Conjecture 1 by an additive term that is linear in m. Our construction is based on the unique (up to unimodular equivalence) set attaining the value \({{\,\textrm{h}\,}}(4,3) = 33\), found by our enumeration approach described earlier (see Tables 1 and 3). This set can be written as

$$\begin{aligned} \mathcal {D}({{\,\textrm{pyr}\,}}(H_2)) = \left( 2 \cdot H_2 \times \{0\} \right) \cup \left( H_2 \times \{-1,1\} \right) , \end{aligned}$$

where \(H_2 = \mathcal {D}(\{\textbf{0},e_1,e_2\}) = \left\{ \textbf{0}, \pm e_1, \pm e_2, \pm (e_1 - e_2) \right\} \subseteq \mathbb {Z}^2\) is the two-dimensional set attaining the Heller constant \({{\,\textrm{h}\,}}(1,2) = 7\). Iterating this pyramid construction and using the \(\ell \)-dimensional set \(H_\ell = \mathcal {D}(\{\textbf{0},e_1,\ldots ,e_\ell \})\) attaining the Heller constant \({{\,\textrm{h}\,}}(1,\ell ) = \ell ^2 + \ell + 1\), we define

$$\begin{aligned} C_m^\ell := {{\,\textrm{pyr}\,}}^{m-\ell }(H_\ell ) = \underbrace{{{\,\textrm{pyr}\,}}(\ldots {{\,\textrm{pyr}\,}}({{\,\textrm{pyr}\,}}(H_\ell ))\ldots )}_{m-\ell \text { times}} \subseteq \mathbb {Z}^m. \end{aligned}$$

Proof of Theorem 2

The point sets yielding the claimed lower bounds are of the form \(\mathcal {D}(C_m^\ell )\), that is, the difference set of \(C_m^\ell \). In view of Lemma 3, we get

$$\begin{aligned} \varDelta \left( \mathcal {D}\left( C_m^\ell \right) \right) = \varDelta \left( \mathcal {D}\left( C_{m-1}^\ell \right) \right) = \ldots = \varDelta \left( \mathcal {D}\left( H_\ell \right) \right) = \varDelta \left( 2 \cdot H_\ell \right) = 2^\ell , \end{aligned}$$

and, using that the pyramid construction adds exactly one point to the set that the pyramid is taken over, we also obtain \(|C_m^\ell | = m - \ell + |H_\ell |\). Hence, using Lemma 3 once more, we get

$$\begin{aligned} \left| \mathcal {D}\left( C_m^\ell \right) \right|&= \left| \mathcal {D}\left( C_{m-1}^\ell \right) \right| + 2\left| C_{m-1}^\ell \right| = \left| \mathcal {D}\left( C_{m-2}^\ell \right) \right| + 2\left| C_{m-2}^\ell \right| + 2\left| C_{m-1}^\ell \right| \nonumber \\&= \left| \mathcal {D}(H_\ell )\right| + 2 \sum _{i=\ell }^{m-1} \left| C_i^\ell \right| = \left| \mathcal {D}(H_\ell )\right| + 2 \sum _{i=\ell }^{m-1} (i - \ell + \left| H_\ell \right| ) \nonumber \\&= |\mathcal {D}(H_\ell )| + 2 \sum _{j=0}^{m-\ell -1} (|H_\ell | + j) \nonumber \\&= |\mathcal {D}(H_\ell )| + 2(m-\ell )|H_\ell | + (m-\ell )(m-\ell -1) \nonumber \\&= m^2 + \left( 2 |H_\ell | - 2 \ell - 1 \right) m + \left( |\mathcal {D}(H_\ell )| - 2\ell |H_\ell | + \ell (\ell +1) \right) . \end{aligned}$$
(9)

The conjectured value in Conjecture 1 is \({{\,\textrm{h}\,}}(2^\ell ,m) = m^2 + (2^{\ell +1} - 1) m + 1\). Using \(|H_\ell | = \ell ^2 + \ell + 1\) and (9), this means that \(\mathcal {D}(C_m^\ell )\) is a counterexample to Conjecture 1, for fixed \(\ell \) and large enough m, if and only if

$$\begin{aligned} 2 |H_\ell | - 2 \ell - 1 = 2\ell ^2 + 1 > 2^{\ell +1} - 1. \end{aligned}$$

This holds exactly for \(\ell \in \{2,3,4\}\), and computing \(|\mathcal {D}(H_2)| = 19\), \(|\mathcal {D}(H_3)| = 55\), and \(|\mathcal {D}(H_4)| = 131\), we get by (9)

$$\begin{aligned} {{\,\textrm{h}\,}}(4,m)&\ge \left| \mathcal {D}\left( C_m^2\right) \right| = m^2 + 9m - 3{} & {} = m^2 + 7m + 1 + 2(m-2), \\ {{\,\textrm{h}\,}}(8,m)&\ge \left| \mathcal {D}\left( C_m^3\right) \right| = m^2 + 19m - 11{} & {} = m^2 + 15m + 1 + 4(m-3), \\ {{\,\textrm{h}\,}}(16,m)&\ge \left| \mathcal {D}\left( C_m^4\right) \right| = m^2 + 33m - 17{} & {} = m^2 + 31m + 1 + 2(m-9), \end{aligned}$$

and the claim follows. \(\square \)
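For the smallest case \((\varDelta ,m) = (4,3)\), the construction can be verified directly: \(\mathcal {D}(C_3^2) = \mathcal {D}({{\,\textrm{pyr}\,}}(H_2))\) has \(3^2 + 9 \cdot 3 - 3 = 33\) points and \(\varDelta = 4\), matching \({{\,\textrm{h}\,}}(4,3) = 33\) from Table 1. A brute-force sketch (our helper names, not the paper's implementation):

```python
import itertools
import numpy as np

def diff_set(S):
    """Difference set D(S) = S - S."""
    return sorted({tuple(a[i] - b[i] for i in range(len(a)))
                   for a in S for b in S})

def pyr(S):
    """Pyramid over S of height one."""
    m = len(S[0])
    return [p + (0,) for p in S] + [(0,) * m + (1,)]

def Delta(S):
    """Maximal |det| over all m-element subsets of S (columns)."""
    m = len(S[0])
    return max(abs(round(np.linalg.det(np.array(c).T)))
               for c in itertools.combinations(S, m))

H2 = diff_set([(0, 0), (1, 0), (0, 1)])  # 7 points, attains h(1,2) = 7
D = diff_set(pyr(H2))                    # D(C_3^2) in Z^3
print(len(D), Delta(D))  # → 33 4
```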

6 Open problems

The determination of the exact value of \({{\,\textrm{h}\,}}(\varDelta ,m)\) remains the major open problem. Note that the bounds from other sources and the bound we prove here are incomparable when both m and \(\varDelta \) vary. In order to understand the limits of our method for upper bounding \({{\,\textrm{h}\,}}(\varDelta ,m)\), it is necessary to determine the exact asymptotic behavior of \({{\,\mathrm{h_s}\,}}(m)\) or \({\text {h}_\text {s}}^{\varDelta }{(m)}\). Proposition 2 suggests that the following question may have an affirmative answer:

Question 1

Is it true that for every \(m \in \mathbb {Z}_{>0}\), we have \({{\,\mathrm{h_s}\,}}(m) \le {{\,\textrm{h}\,}}(1,m)\)?

This would imply a bound of order \({{\,\textrm{h}\,}}(\varDelta ,m) \in \mathcal {O}(m^2) \cdot \varDelta \), which, in view of Proposition 3, is the best possible bound achievable via \({{\,\mathrm{h_s}\,}}(m)\).

A relaxed question concerns the refined shifted Heller constant:

Question 2

Do we have \({\text {h}_\text {s}}^{\varDelta }{(m)} \le {{\,\textrm{h}\,}}(1,m)\), for every \(m,\varDelta \in \mathbb {Z}_{>0}\)?

Again an affirmative answer would imply the bound \({{\,\textrm{h}\,}}(\varDelta ,m) \in \mathcal {O}(m^2) \cdot \varDelta \), but Proposition 3 does not rule out the possibility that \({\text {h}_\text {s}}^{\varDelta }{(m)}\) actually grows sublinearly with m, for fixed \(\varDelta \), as the construction therein uses a translation vector with denominator equal to m.

Relaxing the question once again, we may ask

Question 3

Is it true that \({{\,\textrm{h}\,}}(\varDelta ,m) \in \mathcal {O}(m^2) \cdot \varDelta \)?

Our computational experiments described in Sect. 5 suggest that there are more constraints on the maximal size of a \(\varDelta \)-modular integer matrix, when \(\varDelta \) is a prime (compare also with the improved bound in Theorem 1 (ii) for odd \(\varDelta \)).

Question 4

Does Conjecture 1 hold for every \(m \in \mathbb {Z}_{>0}\), and every prime \(\varDelta \in \mathbb {Z}_{>0}\)? In particular, does it hold for \({{\,\textrm{h}\,}}(3,m)\)?

Moreover, from the data in Table 1 one may also suspect that for any given \(m \in \mathbb {Z}_{>0}\) there are only finitely many \(\varDelta \in \mathbb {Z}_{>0}\) that possibly violate Conjecture 1. For instance, it could very well be that the value \({{\,\textrm{h}\,}}(4,3) = 33\) is the only exception to Conjecture 1 in dimension \(m=3\).

Question 5

Given \(m \in \mathbb {Z}_{>0}\), is there always a threshold \(\varDelta (m) \in \mathbb {Z}_{>0}\) such that \({{\,\textrm{h}\,}}(\varDelta ,m) = m^2+m+1+2m(\varDelta -1)\), for every \(\varDelta \ge \varDelta (m)\)?

Finally, while investigating the extreme examples attaining \({{\,\textrm{h}\,}}(\varDelta ,m)\), for small values of m and \(\varDelta \), and which are enumerated in Table 3, we found that for each computed pair \((\varDelta ,m)\) there is at least one extremizer that can be written as the set of integer points in the convex hull of the difference set of some subset of \(\mathbb {Z}^m\). We wonder whether this is a general phenomenon:

Question 6

Is there always an extremizer for \({{\,\textrm{h}\,}}(\varDelta ,m)\) that can be written as the set of integer points in the convex hull of the difference set of a subset of \(\mathbb {Z}^m\)?