1 Background

Forcing additional structure on polynomials often simplifies certain problems in theory and practice. One of the most prominent examples is given by sparse polynomials, which arise in different areas of mathematics. Exploiting sparsity can reduce the complexity of otherwise hard problems; an important example is given by sparse polynomial optimization problems, see [20]. In this paper, we consider sparse polynomials with a special structure in terms of their Newton polytopes and supports. More precisely, we look at polynomials \(f\in \mathbb {R}[\mathbf {x}] = \mathbb {R}[x_1,\ldots ,x_n]\) whose Newton polytope is a simplex and whose support consists of all vertices of this simplex and one additional interior lattice point. Such polynomials have exactly \(n + 2\) monomials and can be regarded as supported on a circuit. Recall that \(A \subset \mathbb N^n\) is called a circuit if A is affinely dependent, but every proper subset of A is affinely independent, see [11]. We write these polynomials as

$$\begin{aligned} f \ = \ \sum _{j=0}^n b_j \mathbf {x}^{\alpha (j)} + c\, \mathbf {x}^y, \end{aligned}$$
(1)

where the Newton polytope \(\Delta = {{\mathrm{New}}}(f) = {{\mathrm{conv}}}\{\alpha (0), \dots , \alpha (n)\}\subset \mathbb {R}^n\) is a lattice simplex, \(y \in {{\mathrm{int}}}(\Delta )\), \(b_j \in \mathbb {R}_{>0}\) and \(c \in \mathbb {R}^*\). We denote this class of polynomials by \(P_{\Delta }^y\). In this setting, the goal of this paper is to connect and establish new results in two different areas of mathematics; namely, we link amoeba theory with nonnegative polynomials and sums of squares. The theory of amoebas deals with images of varieties \({\mathcal {V}} (f) \subset (\mathbb {C}^*)^n\) under the Log-absolute-value map

$$\begin{aligned} {{\mathrm{Log}}}|\cdot |: \left(\mathbb {C}^*\right)^n \rightarrow \mathbb {R}^n, \quad (z_1,\ldots ,z_n) \mapsto (\log |z_1|,\ldots , \log |z_n|), \end{aligned}$$
(2)

Amoebas have their origin in complex algebraic geometry and appear in various mathematical subjects, including complex analysis [10, 11], the topology of real algebraic curves [24], dynamical systems [8], and dimers/crystal shapes [18]; in particular, they have strong connections to tropical geometry, see [25, 28]. The cones of nonnegative polynomials and sums of squares arise as central objects in convex algebraic geometry and polynomial optimization, see [3, 21].

For both amoebas and nonnegative polynomials/sums of squares, work has been done for special configurations in the above setting. In [37], the authors characterize the amoebas of such polynomials, and in [9, 31], the authors settle questions of nonnegativity and sums of squares for very special coefficients and simplices in the above sparse setting. We aim to extend the results of all of these papers and to establish connections between them for polynomials \(f\in P_{\Delta }^y\).

We call a lattice point \(\alpha \in \mathbb {Z}^n\) even if every entry \(\alpha _j\) is even, i.e., \(\alpha \in (2\mathbb {Z})^n\). We call an integral polytope even if all its vertices are even. Finally, we call a polynomial a sum of monomial squares if every monomial \(b_\alpha \mathbf {x}^{\alpha }\) satisfies \(b_{\alpha } > 0\) and \(\alpha \) is even.

For the remainder of this article, we assume that every polytope is even unless it is explicitly stated otherwise. However, we will reemphasize this fact in key statements.

For \(f \in P_{\Delta }^y\), we define the circuit number \(\Theta _f\) as

$$\begin{aligned} \Theta _f \ = \ \prod _{j = 0}^n \left( \frac{b_j}{\lambda _j}\right) ^{\lambda _j}, \end{aligned}$$
(3)

where the \(\lambda _j\) are uniquely given by the convex combination \(\sum _{j = 0}^n \lambda _j \alpha (j) = y\) with \(\lambda _j \ge 0\) and \(\sum _{j = 0}^n \lambda _j = 1\). We show that every polynomial \(f \in P_{\Delta }^y\) is, up to an isomorphism on \(\mathbb {R}^n\), completely characterized by the \(\lambda _j\) and its circuit number \(\Theta _f\).
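To make these quantities concrete, the following small Python sketch (our own illustration, assuming numpy; not part of the original results) computes the \(\lambda _j\) and the circuit number for the Motzkin polynomial \(f = 1 + x_1^4x_2^2 + x_1^2x_2^4 - 3x_1^2x_2^2\), which serves as a running example below.

```python
import numpy as np

# Barycentric coordinates lambda_j of y with respect to the simplex vertices
# alpha(0), ..., alpha(n), and the circuit number Theta_f of (3).
alpha = np.array([[0, 0], [4, 2], [2, 4]])  # vertices alpha(0), alpha(1), alpha(2)
y = np.array([2, 2])                        # interior lattice point
b = np.array([1.0, 1.0, 1.0])               # coefficients b_0, b_1, b_2 > 0

# Solve sum_j lambda_j * alpha(j) = y together with sum_j lambda_j = 1.
M = np.vstack([np.ones(len(alpha)), alpha.T])
lam = np.linalg.solve(M, np.concatenate([[1], y]))
theta = np.prod((b / lam) ** lam)

print(lam)    # [1/3, 1/3, 1/3]
print(theta)  # ~3; by Theorem 1.1 (y even), f is nonnegative iff c >= -3,
              # and the Motzkin polynomial (c = -3) lies on the boundary
```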

Remember that we always have \(c \in \mathbb {R}^*\) by definition of \(P_{\Delta }^y\). Allowing \(c = 0\) would yield a sum of monomial squares, which is always nonnegative. This should be kept in mind when, with slight abuse of notation, \(c = 0\) is a possible choice in some statements. We now formulate our main theorems. The first theorem stated here is a composition of Theorem 3.8 and the Corollaries 3.9, 3.11 and 4.3 in the article.

Theorem 1.1

Let \(f \in P_{\Delta }^y\) and \(\Delta \) be an even simplex, i.e., \(\alpha (j) \in (2\mathbb {N})^n\) for all \(0 \le j \le n\). Then, the following statements are equivalent.

  1. (1)

Either \(c \in [-\Theta _f,\Theta _f]\) and \(y \notin (2\mathbb N)^n\), or \(c \ge -\Theta _f\) and \(y \in (2\mathbb {N})^n\).

  2. (2)

    f is nonnegative.

Furthermore, f is located on the boundary of the cone of nonnegative polynomials if and only if \(|c| = \Theta _f\) for \(y \notin (2\mathbb {N})^n\), respectively \(c = -\Theta _f\) for \(y \in (2\mathbb {N})^n\). In these cases, f has at most \(2^n\) real zeros, all of which differ only in their signs.

Assume furthermore that \(n \ge 2\) and that f is not a sum of monomial squares with \(c > 0\). Then, the following are equivalent.

  1. (1)

f is nonnegative, i.e., either \(c \in [-\Theta _f,\Theta _f]\) and \(y \notin (2\mathbb N)^n\), or \(c \in [-\Theta _f,0]\) and \(y \in (2\mathbb {N})^n\).

  2. (2)

    The amoeba \({\mathcal {A}} (f)\) is solid.

Note in this context that an amoeba \({\mathcal {A}} (f)\) of \(f \in P_\Delta ^y\) is solid if and only if its complement has no bounded components. Note furthermore that since \(\Delta \) is an even simplex, f is a sum of monomial squares (and hence trivially nonnegative) if and only if \(c \ge 0\) and \(y \in (2\mathbb {N})^n\).

Theorem 1.1 yields a very interesting relation between the structure of the amoebas of \(f \in P_{\Delta }^y\) and nonnegative polynomials \(f \in P_{\Delta }^y\), which are both completely characterized by the circuit number. Furthermore, it generalizes amoeba theoretic results from [37].

A crucial observation for \(f \in P_{\Delta }^y\) is that nonnegativity of such f does not imply that f is a sum of squares. It is particularly interesting that the question whether \(f \in P_{\Delta }^y\) is a sum of squares or not depends on the lattice point configuration of the Newton polytope of f alone. We give a precise characterization of the nonnegative \(f \in P_{\Delta }^y\) which are additionally a sum of squares in Sect. 5, Theorem 5.2. Here, we present a rough version of the statement.

Informal Statement 1.2

Let \(f \in P_{\Delta }^y\) and \(\Delta \) be an even simplex. Let f be nonnegative. Then, f is a sum of squares if and only if y is the midpoint of two even distinct lattice points contained in a particular subset of lattice points in \(\Delta \). In particular, this is independent of the choice of the coefficients \(b_j, c\).

Note that Theorem 1.1 and Statement 1.2 generalize the main results in [9] and [31] and yield them as special instances. In Sect. 5, we explain this relationship in more detail.

Based on these characterizations, we define a new convex cone \(C_{n,2d}\):

Definition 1.3

We define the set of sums of nonnegative circuit polynomials (SONC) as

$$\begin{aligned} C_{n,2d} \ = \ \left\{ f \in \mathbb {R}[\mathbf {x}]_{2d} \ : \ f = \sum _{i=1}^k \lambda _i g_i, \ \lambda _i \ge 0, \ g_i \in P_{\Delta _i}^{y(i)}\cap P_{n,2d}\right\} \end{aligned}$$

for even lattice simplices \(\Delta _i \subset \mathbb {R}^n\) and interior lattice points \(y(i) \in {{\mathrm{int}}}(\Delta _i)\).

It follows by construction that membership in the \(C_{n,2d}\) cone serves as a nonnegativity certificate, see also Proposition 7.2.

Corollary 1.4

Let \(f \in \mathbb {R}[\mathbf {x}]\). Then, f is nonnegative if there exist \(\mu _i \ge 0\), \(g_i \in C_{n,2d}\) for \(1 \le i\le k\) such that

$$\begin{aligned} f = \sum _{i=1}^k \mu _i g_i. \end{aligned}$$

In Sect. 7, we discuss the SONC cone in further detail. In Proposition 7.2, we show that for general n and d neither of the SONC cone and the SOS cone is contained in the other. We also prove that the existence of a SONC decomposition is equivalent to nonnegativity of f if \({{\mathrm{New}}}(f)\) is a simplex and there exists an orthant in which all terms of f except those corresponding to vertices have a negative sign (Corollary 7.5).

Finally, we prove the following result about convexity, see also Theorem 6.4.

Theorem 1.5

Let \(n \ge 2\) and \(f \in P_{\Delta }^y\) where \(\Delta \) is an even simplex. Then, f is not convex.

Recently, there has been much interest in understanding the cone of convex polynomials; Theorem 1.5 serves as an indication that sparsity is a structural property that can prevent polynomials from being convex.

Further contributions

  1. 1.

Gale duality is a standard concept for (convex) polytopes, matroids and sparse polynomial systems, see [2, 11, 14, 35]. We show that a polynomial \(f \in P_{\Delta }^y\) has a global norm minimizer \(e^{\mathbf {s}^*} \in \mathbb {R}^n\), see Sect. 3.2. The terms of f evaluated at \(e^{\mathbf {s}^*}\), together with the circuit number \(\Theta _f\), yield the Gale dual vector of the support matrix up to a scalar multiple (Corollary 3.10). Furthermore, it is an immediate consequence of our results that the circuit number is strongly related to the A-discriminant of f. In particular, \(f \in P_{n,2d} \cap P_{\Delta }^y\) is contained in the topological boundary of the nonnegativity cone, i.e., \(f \in \partial (P_{n,2d} \cap P_{\Delta }^y)\), if and only if the A-discriminant vanishes at f (Corollary 3.11). These facts about the A-discriminant were first shown in [26] and [37].

  2. 2.

We consider the case of multiple interior lattice points in the support of f. We prove that, if all coefficients of the interior monomials are negative, every such nonnegative polynomial lies in \(C_{n,2d}\). Furthermore, we show when such polynomials are sums of squares, again generalizing results in [9].

  3. 3.

Since the condition of being a sum of squares depends on the combinatorial structure of the simplex \(\Delta \), using techniques from toric geometry, we provide sufficient conditions on simplices \(\Delta \) such that every nonnegative polynomial in \(P_{\Delta }^y\) is a sum of squares, independently of the position of \(y \in {{\mathrm{int}}}(\Delta )\). This proves that for \(n = 2\) almost every nonnegative polynomial in \(P_{\Delta }^y\) is a sum of squares, and it also yields large sections on which nonnegative polynomials and sums of squares coincide.

  4. 4.

We answer a question of Reznick stated in [31], namely whether a certain lattice point criterion on a class of sparse support sets (more general than circuits) of nonnegative polynomials is equivalent to these polynomials being sums of squares.

This article is organized as follows. In Sect. 2, we introduce some notation and recall some results that are essential for the upcoming sections and the proofs of the main theorems. In Sect. 3, we characterize nonnegativity of polynomials \(f \in P_{\Delta }^y\). This is done via a norm relaxation method, which is outlined in the beginning of the section. Furthermore, Sect. 3 deals with invariants and properties of such polynomials and sets them in relation to Gale duals and A-discriminants. In Sect. 4, we discuss amoebas of polynomials \(f \in P_{\Delta }^y\) and how they are related to nonnegativity, respectively the circuit number. In Sect. 5, we completely characterize the section of the cone of sums of squares with \(P_{\Delta }^y\). Furthermore, we generalize results regarding nonnegativity and sums of squares to non-sparse polynomials with simplex Newton polytope. In Sect. 6, we completely characterize convex polynomials in \(P_{\Delta }^y\). In Sect. 7, we provide and discuss a new class of nonnegativity certificates given by SONC. In Sect. 8, we prove that for non-simplex Newton polytopes Q the lattice point criterion from the simplex case does not suffice to characterize sums of squares. We show that a necessary and sufficient criterion can be given by additionally taking into account the set of possible triangulations of Q. This solves an open problem stated by Reznick in [31]. Finally, in Sect. 9, we provide an outlook on future research possibilities.

Fig. 1 The amoeba of the polynomial \(f = x_1^2x_2 + x_1x_2^2 - 4x_1x_2 + 1 \in P_\Delta ^{(1,1)}\) with \(\Delta = {{\mathrm{conv}}}\{(0,0),(2,1),(1,2)\}\)

2 Preliminaries

2.1 Nonnegative polynomials and sums of squares

Let \(\mathbb {R}[\mathbf {x}]_d = \mathbb R[x_1,\ldots ,x_n]_{d}\) be the vector space of polynomials in n variables of degree at most d. Denote the convex cone of nonnegative polynomials by

$$\begin{aligned} P_{n,2d} \ = \ \{p \in \mathbb R[\mathbf {x}]_{2d} \ : \ p(\mathbf {x}) \ge 0 \ \text { for all } \ \mathbf {x}\in \mathbb {R}^n\}, \end{aligned}$$

and the convex cone of sums of squares as

$$\begin{aligned} \Sigma _{n,2d} \ = \ \left\{ p \in P_{n,2d} \ : \ p = \sum _{i=1}^k q_i^2 \ \text { for } \ q_i\in \mathbb {R}[\mathbf {x}]_{d}\right\} . \end{aligned}$$

For an introduction to nonnegative polynomials and sums of squares, see [3, 21, 22]. Since we are interested in nonnegative polynomials and sums of squares in the class \(P_{\Delta }^y\), we consider the sections

$$\begin{aligned} P_{n,2d}^y \ = \ P_{n,2d} \cap P_{\Delta }^y \quad \text { and }\quad \Sigma _{n,2d}^y \ = \ \Sigma _{n,2d} \cap P_{\Delta }^y. \end{aligned}$$

2.2 Amoebas

For a given Laurent polynomial \(f \in \mathbb {C}[z_1^{\pm 1},\ldots ,z_n^{\pm 1}]\) with support set \(A \subset \mathbb {Z}^n\) and variety \({\mathcal {V}} (f) \subset (\mathbb {C}^*)^n\), the amoeba \({\mathcal {A}} (f)\) is defined as the image of \({\mathcal {V}} (f)\) under the \({{\mathrm{Log}}}|\cdot |\)-map defined in (2). Amoebas were first introduced by Gelfand, Kapranov and Zelevinsky in [11]. For an example, see Fig. 1. For an overview, see [7, 25, 28, 33].

Amoebas are closed sets [10]. Their complements consist of finitely many convex components [11]. Each component of the complement of \({\mathcal {A}} (f)\) corresponds to a unique lattice point in \({{\mathrm{conv}}}(A) \cap \mathbb {Z}^n\) via an order map [10].

Components of the complement corresponding to vertices of \({{\mathrm{conv}}}(A)\) via the order map always exist. For all other components of the complement of an amoeba \({\mathcal {A}} (f)\), the existence depends non-trivially on the choice of the coefficients of f, see [11, 25, 28]. We denote the component of the complement of \({\mathcal {A}} (f)\) consisting of all points of order \(\alpha \in {{\mathrm{conv}}}(A) \cap \mathbb {Z}^n\) by \(E_{\alpha }(f)\).

The fiber \(\mathbb {F}_{\mathbf {w}}\) of each point \(\mathbf {w} \in \mathbb {R}^n\) with respect to the \({{\mathrm{Log}}}|\cdot |\)-map is given by

$$\begin{aligned} \mathbb {F}_{\mathbf {w}} \ = \ \{\mathbf {z} \in (\mathbb {C}^*)^n \ : \ {{\mathrm{Log}}}|\mathbf {z}| = \mathbf {w}\}. \end{aligned}$$

It is easy to see that \(\mathbb {F}_{\mathbf {w}}\) is homeomorphic to a real n-torus \((S^1)^n\). For \(f = \sum _{\alpha \in A} b_\alpha \mathbf {z}^{\alpha }\) and \(\mathbf {v} \in (\mathbb {C}^*)^n\), we define the fiber function

$$\begin{aligned} f^{|\mathbf {v}|}: (S^1)^n \rightarrow \mathbb {C}, \quad \phi \mapsto f(e^{{{\mathrm{Log}}}|\mathbf {v}| + i \phi }) = \sum _{\alpha \in A} b_\alpha \cdot |\mathbf {v}|^\alpha \cdot e^{i \langle \alpha , \phi \rangle }. \end{aligned}$$

This means that \(f^{|\mathbf {v}|}\) is the pullback \(\varphi _{|\mathbf {v}|}^*(f)\) of f under the homeomorphism \(\varphi _{|\mathbf {v}|}: (S^1)^n \rightarrow \mathbb {F}_{{{\mathrm{Log}}}|\mathbf {v}|} \subset (\mathbb {C}^*)^n\). The crucial fact about the fiber function is that for its zero set \({\mathcal {V}} (f^{|\mathbf {v}|})\) it holds that

$$\begin{aligned} {\mathcal {V}} (f^{|\mathbf {v}|}) \ \cong \ {\mathcal {V}} (f) \cap \mathbb {F}_{{{\mathrm{Log}}}|\mathbf {v}|}, \end{aligned}$$
(4)

and hence we have for the amoeba \({\mathcal {A}} (f)\) that

$$\begin{aligned} {{\mathrm{Log}}}|\mathbf {v}| \in {\mathcal {A}} (f) \ \Leftrightarrow \ {\mathcal {V}} (f^{|\mathbf {v}|}) \ne \emptyset . \end{aligned}$$
(5)

For more details on the fiber function, see [7, 25, 34, 37].
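Criterion (5) suggests a simple, non-certified numerical membership test, which we sketch here as a hedged illustration (assuming numpy, and that a coarse grid on the fiber torus suffices for a picture); the polynomial is the one of Fig. 1.

```python
import numpy as np

# Heuristic for (5): w lies in A(f) iff the fiber function f^{|v|} with
# Log|v| = w has a zero on (S^1)^n. We report min |f^{|v|}| over a grid;
# values near 0 indicate w in A(f). Here f = x1^2 x2 + x1 x2^2 - 4 x1 x2 + 1.
A = np.array([[2, 1], [1, 2], [1, 1], [0, 0]])   # support
b = np.array([1.0, 1.0, -4.0, 1.0])              # coefficients

def fiber_min(w, grid=500):
    phi = np.linspace(0.0, 2.0 * np.pi, grid, endpoint=False)
    p1, p2 = np.meshgrid(phi, phi)
    val = np.zeros_like(p1, dtype=complex)
    for a, c in zip(A, b):
        # term  c * |v|^alpha * e^{i <alpha, phi>}  of the fiber function
        val += c * np.exp(a @ w) * np.exp(1j * (a[0] * p1 + a[1] * p2))
    return np.abs(val).min()

print(fiber_min(np.array([0.0, 0.0])))                       # ~1: (0,0) lies in the
                                                             # bounded component E_y(f)
print(fiber_min(np.array([np.log((3 + 5**0.5) / 2), 0.0])))  # ~0: on A(f), since
                                                             # f((3+sqrt(5))/2, 1) = 0
```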

2.3 Agiforms

Asking for nonnegativity of polynomials supported on a circuit is closely related to objects called agiforms in [31]. Given an even lattice simplex \(\Delta \subset \mathbb R^n\) and an interior lattice point \(y \in {{\mathrm{int}}}(\Delta )\), the agiform corresponding to \(\Delta \) and y is given by

$$\begin{aligned} f(\Delta ,\lambda ,y) = \sum _{i=0}^{n} \lambda _i \mathbf {x}^{\alpha (i)} - \mathbf {x}^y \end{aligned}$$

where \(y = \sum _{i=0}^{n} \lambda _i\alpha (i) \in \mathbb {N}^n\) with \(\sum _{i=0}^{n} \lambda _i = 1\) and \(\lambda _i\ge 0\). The term agiform reflects the fact that the polynomial \(f(\Delta ,\lambda ,y) = \sum _{i=0}^{n} \lambda _i \mathbf {x}^{\alpha (i)} - \mathbf {x}^y\) is nonnegative by the arithmetic-geometric mean inequality. Note that an agiform has a zero at the all-ones vector \(\mathbf {1}\). This implies that agiforms lie on the boundary of the cone of nonnegative polynomials. A natural question is to characterize those agiforms that can be written as sums of squares. In [31], it is shown that this depends non-trivially and exclusively on the combinatorial structure of the simplex \(\Delta \) and the location of y in its interior. We need some definitions and results adapted from [31].
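For completeness, we recall the short argument: since all \(\alpha (i)\) are even, we have \(\mathbf {x}^{\alpha (i)} = |\mathbf {x}|^{\alpha (i)} \ge 0\) for every \(\mathbf {x} \in \mathbb {R}^n\), so the weighted arithmetic-geometric mean inequality with weights \(\lambda _i\) yields

$$\begin{aligned} \sum _{i=0}^{n} \lambda _i |\mathbf {x}|^{\alpha (i)} \ \ge \ \prod _{i=0}^{n} \left( |\mathbf {x}|^{\alpha (i)}\right) ^{\lambda _i} \ = \ |\mathbf {x}|^{y} \ \ge \ \mathbf {x}^y, \end{aligned}$$

with equality throughout at \(\mathbf {x} = \mathbf {1}\).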

Definition 2.1

Let \(\hat{\Delta }= \{0, \alpha (1), \ldots , \alpha (n)\}\subset (2\mathbb N)^n\) be such that \({{\mathrm{conv}}}(\hat{\Delta })\) is a simplex and let \(L \subseteq {{\mathrm{conv}}}(\hat{\Delta }) \cap \mathbb {Z}^n\).

  1. (1)

Define \(A(L) = \{\frac{1}{2}(s + t) \in \mathbb {Z}^n \ : \ s, t\in L \cap (2\mathbb {Z})^n\}\) and \(\overline{A}(L) = \{\frac{1}{2}(s + t) \in \mathbb {Z}^n \ : \ s\ne t, \ s, t\in L \cap (2\mathbb {Z})^n\}\) as the set of averages of even, respectively distinct even, points in L.

  2. (2)

    We say that L is \(\hat{\Delta }\)-mediated, if

    $$\begin{aligned} \hat{\Delta }\subseteq L \subseteq \overline{A}(L)\cup \hat{\Delta }, \end{aligned}$$

    i.e., every \(\beta \in L{\setminus }\hat{\Delta }\) is an average of two distinct even points in L.

Theorem 2.2

(Reznick [31]) There exists a \(\hat{\Delta }\)-mediated set \(\Delta ^*\) satisfying \(A(\hat{\Delta })\subseteq \Delta ^*\subseteq (\Delta \cap \mathbb {Z}^n)\), which contains every \(\hat{\Delta }\)-mediated set.

If \(A(\hat{\Delta }) = \Delta ^*\), then we say, motivated by Example 2.3 below, that \(\Delta \) is an M-simplex. Similarly, if \(\Delta ^* = (\Delta \cap \mathbb {Z}^n)\), then we call \(\Delta \) an H-simplex.
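Reading Theorem 2.2 algorithmically, \(\Delta ^*\) can be obtained by a fixed-point iteration: start with all lattice points of \(\Delta \) and repeatedly discard points outside \(\hat{\Delta }\) that are not averages of two distinct even points of the current set; every \(\hat{\Delta }\)-mediated set survives each round, and the stable set is itself mediated. The following Python sketch (our own hedged illustration, assuming numpy; the helper name is hypothetical) carries this out for the M-simplex of Example 2.3 below.

```python
import itertools
import numpy as np

def max_mediated_set(vertices):
    """Sketch: Delta^* for the even simplex with the given n+1 vertices."""
    V = [tuple(map(int, v)) for v in vertices]
    M = np.vstack([np.ones(len(V)), np.array(V, dtype=float).T])
    lo, hi = np.min(V, axis=0), np.max(V, axis=0)
    L = set()
    for p in itertools.product(*(range(a, b + 1) for a, b in zip(lo, hi))):
        lam = np.linalg.solve(M, np.array([1.0, *p]))  # barycentric coordinates
        if np.all(lam >= -1e-9):                       # p lies in the simplex
            L.add(p)
    while True:
        even = [p for p in L if all(c % 2 == 0 for c in p)]
        mids = {tuple(int(m) for m in (np.array(s) + np.array(t)) // 2)
                for s, t in itertools.combinations(even, 2)}
        newL = set(V) | (L & mids)
        if newL == L:
            return L
        L = newL

# Motzkin M-simplex: Delta^* = A(hat(Delta)); the interior point (2,2) is
# not contained in it, matching Theorem 2.4 (the agiform is not a SOS).
print(sorted(max_mediated_set([(0, 0), (4, 2), (2, 4)])))
# [(0, 0), (1, 2), (2, 1), (2, 4), (3, 3), (4, 2)]
```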

Example 2.3

The standard (Hurwitz-)simplex given by \({{\mathrm{conv}}}\{0,2d \cdot e_1,\ldots ,2d \cdot e_n\} \subset \mathbb {R}^n\) for \(d \in \mathbb {N}\) is an H-simplex. The Newton polytope \({{\mathrm{conv}}}\{0,(2,4),(4,2)\} \subset \mathbb {R}^2\) of the Motzkin polynomial \(f = 1 + x^4y^2 + x^2y^4 - 3x^2y^2\) is an M-simplex, see Fig. 2.

Fig. 2 On the left: the H-simplex \({{\mathrm{conv}}}\{(0,0),(6,0),(0,6)\} \subset \mathbb {R}^2\). On the right: the M-simplex \({{\mathrm{conv}}}\{0,(2,4),(4,2)\} \subset \mathbb {R}^2\). The red (light) points are the lattice points contained in the corresponding sets \(\Delta ^*\)

The main result in [31] concerning the question under which conditions agiforms are sums of squares is given by the following theorem.

Theorem 2.4

(Reznick [31]) Let \(f(\Delta ,\lambda ,y)\) be an agiform. Then, \(f(\Delta ,\lambda ,y) \in \Sigma _{n,2d}\) if and only if \(y \in \Delta ^*\).

3 Invariants and nonnegativity of polynomials supported on circuits

The main contribution of this section is the characterization of \(P_{n,2d}^y\), i.e., the set of nonnegative polynomials supported on a circuit (Theorem 3.8). Along the way, we provide standard forms and invariants, which reflect the nice structural properties of the class \(P_{\Delta }^y\).

In Sect. 3.1, we outline the norm relaxation method, which is the proof method used for the characterization of nonnegativity. In Sect. 3.2, we introduce standard forms for polynomials in \(P_{\Delta }^y\) and, in particular, prove the existence of a particular norm minimizer for polynomials, where the coefficient c equals the negative circuit number \(\Theta _f\) (Proposition 3.4). In Sect. 3.3, we put all pieces together and characterize nonnegativity of polynomials in \(P_{\Delta }^y\) (Theorem 3.8). In Sect. 3.4, we discuss connections to Gale duals and A-discriminants.

3.1 Nonnegativity via norm relaxation

We start with a short outline of the proof method, which we introduce and apply here to tackle the problem of nonnegativity of polynomials. Let \(f = \sum _{\alpha \in A} b_\alpha \mathbf {x}^{\alpha } \in \mathbb {R}[\mathbf {x}]\) be a polynomial with \(A \subset \mathbb {N}^n\) finite, \(0 \in A\), and such that \(\alpha \in (2\mathbb {N})^n\) as well as \(b_\alpha > 0\) whenever \(\alpha \) is contained in the vertex set \({{\mathrm{vert}}}(A)\) of \({{\mathrm{conv}}}(A)\). Instead of trying to answer the question whether \(f(\mathbf {x}) \ge 0\) for all \(\mathbf {x} \in \mathbb {R}^n\), we investigate the relaxed problem

$$\begin{aligned} \text {Is } \ f(|\mathbf {x}|) \ = \ \sum _{\alpha \in {{\mathrm{vert}}}(A)} b_\alpha \cdot |\mathbf {x}^{\alpha }| - \sum _{\alpha \in A {\setminus } {{\mathrm{vert}}}(A)} |b_\alpha | \cdot |\mathbf {x}^{\alpha }| \ \ge \ 0 \ \text { for all } \ \mathbf {x} \in \mathbb {R}^n_{\ge 0} \ \text {?} \end{aligned}$$
(6)

Since \(b_\alpha \cdot |\mathbf {x}^{\alpha }| = b_\alpha \cdot \mathbf {x}^{\alpha }\) for \(\alpha \in {{\mathrm{vert}}}(A)\) and \(-|b_\alpha | \cdot |\mathbf {x}^{\alpha }| \le b_\alpha \cdot \mathbf {x}^{\alpha }\) for \(\alpha \in A {\setminus } {{\mathrm{vert}}}(A)\), we have \(f(|\mathbf {x}|) \le f(\mathbf {x})\).

Since the strict positive orthant \(\mathbb {R}^n_{> 0}\) is an open dense set in \(\mathbb {R}^n_{\ge 0}\) and the componentwise exponential function \({{\mathrm{Exp}}}: \mathbb {R}^n \rightarrow \mathbb {R}_{> 0}^n, (x_1,\ldots ,x_n) \mapsto (\exp (x_1),\ldots ,\exp (x_n))\) is a bijection, Problem (6) is equivalent to the question

$$\begin{aligned} \text {Is } \ f(e^{\mathbf {w}}) \ = \ \sum _{\alpha \in {{\mathrm{vert}}}(A)} b_\alpha \cdot e^{\langle \mathbf {w},\alpha \rangle } - \sum _{\alpha \in A {\setminus } {{\mathrm{vert}}}(A)} |b_\alpha | \cdot e^{\langle \mathbf {w},\alpha \rangle } \ \ge \ 0 \ \text { for all } \ \mathbf {w} \in \mathbb {R}^n \ \text {?} \end{aligned}$$
(7)

Hence, an affirmative answer to (7) implies nonnegativity of f. The motivation for the relaxation is that, on the one hand, Question (7) is potentially easier to answer, since we have linear operations on the exponents, and, on the other hand, the gap between (7) and nonnegativity hopefully is not too big, in particular for sparse polynomials. We show that for polynomials supported on a circuit (and some more general classes of sparse polynomials) both statements are true: in fact, for circuit polynomials, nonnegativity and (7) are equivalent and can be characterized exactly, explicitly, and easily in terms of the coefficients of f and the combinatorial structure of A.
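As a hedged numerical illustration of (7) (assuming numpy, and that a coarse grid search over \(\mathbf {w}\) is accurate enough for a picture), consider the scaled Motzkin polynomial with our normalization \(b_0 = \lambda _0\):

```python
import numpy as np

# f = 1/3 + 1/3 x1^4 x2^2 + 1/3 x1^2 x2^4 - x1^2 x2^2: we evaluate f(e^w) on a
# grid and observe min f(e^w) = 0 at w = (0,0), so the answer to (7) is
# affirmative and f is nonnegative (here the relaxation is even tight).
A = np.array([[0, 0], [4, 2], [2, 4], [2, 2]])
b = np.array([1 / 3, 1 / 3, 1 / 3, -1.0])

g = np.linspace(-1.0, 1.0, 401)
w1, w2 = np.meshgrid(g, g)
vals = sum(c * np.exp(a[0] * w1 + a[1] * w2) for a, c in zip(A, b))
i = np.unravel_index(vals.argmin(), vals.shape)
print(vals[i], w1[i], w2[i])   # ~0 at w ~ (0, 0)
```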

An interesting side effect of the described relaxation is that (7) is strongly related to the amoeba of f, as we point out for circuit polynomials in Sect. 4. Thus, it will serve us as a bridge between real algebraic geometry and amoeba theory.

3.2 Standard forms and norm minimizers of polynomials supported on circuits

Let f be a polynomial of the Form (1) defined on a circuit \(A = \{\alpha (0),\ldots ,\alpha (n),y\}\) \(\subset \mathbb {Z}^n\). Observe that there exists a unique convex combination \(\sum _{j = 0}^n \lambda _j \alpha (j) = y\). In the following, we assume without loss of generality that \(\alpha (0) = 0\), which is possible, since we can factor out a monomial \(\mathbf {x}^{\alpha (0)}\) with \(\alpha (0) \in (2\mathbb {N})^n\) if necessary. We define the support matrix \(M^A\) by

$$\begin{aligned} M^A \ = \ \left( \begin{array}{ccccc} 1 & 1 & \cdots & 1 & 1 \\ 0 & \alpha (1)_1 & \cdots & \alpha (n)_1 & y_1 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & \alpha (1)_n & \cdots & \alpha (n)_n & y_n \\ \end{array} \right) \in {{\mathrm{Mat}}}(\mathbb {Z},(n+1) \times (n+2)), \end{aligned}$$

and \(M^A_j\) as the matrix obtained by deleting the j-th column of \(M^A\), where we start counting at 0. Furthermore, we always assume that \(b_0 = \lambda _0\), which is always possible, since multiplication by a positive scalar does not affect whether a polynomial is nonnegative. We denote the canonical basis of \(\mathbb {R}^n\) by \(e_1,\ldots ,e_n\).

Proposition 3.1

Let f be of the Form (1) supported on a circuit \(A = \{\alpha (0),\ldots ,\alpha (n),y\}\) \(\subset \mathbb {Z}^n\) and \(y = \sum _{j = 0}^n \lambda _j \alpha (j)\) with \(\sum _{j = 0}^n \lambda _j = 1\), \(0 < \lambda _j < 1\) for all j. Let \(\mu \in \mathbb {N}_{> 0}\) denote the least common multiple of the denominators of the \(\lambda _j\). Then, there exists a unique polynomial g of the Form (1) with \({{\mathrm{supp}}}(g) = A' = \{0,\alpha (1)',\ldots ,\alpha (n)',y'\} \subset \mathbb {Z}^n\) such that the following properties hold.

  1. (1)

\(M^A = \left( \begin{array}{cc} 1 & 0 \\ 0 & T \\ \end{array}\right) M^{A'}\) for some \(T \in GL_{n}(\mathbb {Q})\),

  2. (2)

    f and g have the same coefficients,

  3. (3)

    \(\alpha (j)' = \mu \cdot e_j\) for every \(1 \le j \le n\),

  4. (4)

    \(y' = \sum _{j = 1}^n \lambda _j \alpha (j)'\),

  5. (5)

    \(f(e^{\mathbf {w}}) = g(e^{T^t\mathbf {w}})\) for all \(\mathbf {w} \in \mathbb {R}^n\).

For every f of the Form (1), we call the polynomial g, which satisfies all the conditions of the proposition, the standard form of f. Note that \(f(e^{\mathbf {w}})\) is defined in the sense of (7) and the support matrix \(M^{A'}\) of the standard form of f is of the shape

$$\begin{aligned} M^{A'} \ = \ \left( \begin{array}{cccccc} 1 & 1 & \cdots & \cdots & 1 & 1 \\ 0 & \mu & 0 & \cdots & 0 & \mu \lambda _1\\ \vdots & 0 & \ddots & & \vdots & \vdots \\ \vdots & \vdots & & \ddots & 0 & \vdots \\ 0 & 0 & \cdots & 0 & \mu & \mu \lambda _n\\ \end{array}\right) \in {{\mathrm{Mat}}}(\mathbb {Z},(n+1) \times (n+2)). \end{aligned}$$
(8)

Proof

We assume without loss of generality that \(\alpha (0) = 0\). Let \(\overline{M}^A_{n+1}\) be the submatrix of \(M^A_{n+1}\) obtained by deleting the first row and column; analogously for \(\overline{M}^{A'}_{n+1}\). By definition, we have \(\alpha (j) = \overline{M}^A_{n+1} e_j\) and \(\alpha (j)' = \overline{M}^{A'}_{n+1} e_j\) for \(1 \le j \le n\). We construct the polynomial g. We choose the same coefficients for g as for f. Since \(0,\alpha (1),\ldots ,\alpha (n)\) form a simplex, there exists a unique matrix \(T \in GL_{n}(\mathbb {Q})\) such that

$$\begin{aligned} M^A_{n+1} \ = \ \left( \begin{array}{cc} 1 & 0 \\ 0 & T \\ \end{array}\right) M^{A'}_{n+1} \end{aligned}$$

with \(M^{A'}\) of the Form (8), given by \(T = \frac{1}{\mu }\, \overline{M}^A_{n+1}\). Since \(y = \sum _{j = 0}^n \lambda _j \alpha (j)\), it follows that \(y' = T^{-1} y = \sum _{j = 1}^n \lambda _j \alpha (j)' = \mu (\lambda _1,\ldots ,\lambda _n)\). Thus, (1) – (4) hold.

We show that \(f(e^{\mathbf {w}}) = g(e^{T^t \mathbf {w}})\) for every \(\mathbf {w} \in \mathbb {R}^n\). We investigate the monomial \(\mathbf {x}^{\alpha (j)}\):

$$\begin{aligned} b_j e^{\langle \alpha (j), \mathbf {w} \rangle } \ = \ b_j e^{\langle \overline{M}^{A}_{n+1} e_j, \mathbf {w} \rangle } \ = \ b_j e^{\langle T \overline{M}^{A'}_{n+1} e_j , \mathbf {w} \rangle } \ = \ b_j e^{\langle \alpha (j)', T^t \mathbf {w} \rangle } \end{aligned}$$

For the inner monomials y and \(y'\), we know that \(y = T y'\) and thus for \(y' = \sum _{j = 0}^n \lambda _j \alpha (j)'\) we have \(y = T (\sum _{j = 0}^n \lambda _j \alpha (j)') = \sum _{j = 0}^n \lambda _j T \alpha (j)' = \sum _{j = 0}^n \lambda _j \alpha (j)\). Therefore, (5) follows from

$$\begin{aligned} c e^{\langle y,\mathbf {w} \rangle } \ = \ c e^{\langle \sum _{j = 0}^n \lambda _j \alpha (j),\mathbf {w} \rangle } \ = \ c e^{\sum _{j = 0}^n \lambda _j \langle \alpha (j),\mathbf {w} \rangle } \ = \ c e^{\sum _{j = 0}^n \lambda _j \langle \alpha (j)',T^t \mathbf {w} \rangle } \ = \ c e^{\langle y',T^t \mathbf {w} \rangle }. \end{aligned}$$

\(\square \)

Proposition 3.1 can easily be generalized to polynomials

$$\begin{aligned} f \ = \ b_0 + \sum _{j = 1}^n b_j \mathbf {x}^{\alpha (j)} + \sum _{y(i) \in I} a_i \mathbf {x}^{y(i)} \in \mathbb {R}[\mathbf {x}], \end{aligned}$$
(9)

with \({{\mathrm{New}}}(f) = \Delta = {{\mathrm{conv}}}\{0,\alpha (1),\ldots ,\alpha (n)\}\) being a simplex and \(I \subset ({{\mathrm{int}}}(\Delta ) \cap \mathbb {Z}^n)\). Every y(i) has a unique convex combination \(y(i) = \sum _{j = 0}^n \lambda _j^{(i)} \alpha (j)\) with \(\sum _{j = 0}^n \lambda _j^{(i)} = 1\) and \(\lambda _j^{(i)} > 0\) for all i, j.

Corollary 3.2

Let f be defined as in (9). Then, Proposition 3.1 holds literally if we apply (4) for every y(i) and define \(\mu \) as the least common multiple of the denominators of all \(\lambda _j^{(i)}\).

Proof

By definition of \(\mu \), the support matrix \(M^{A'}\) is integral again. Since in the proof of Proposition 3.1 neither uniqueness of y is used nor special assumptions about y were made, the statement follows. \(\square \)

Now, we return to the case of circuit polynomials.

Proposition 3.3

Let \(f = \lambda _0 + \sum _{j = 1}^n b_j \mathbf {x}^{\alpha (j)} + c \mathbf {x}^{y} \in P_\Delta ^y\) be such that \(c < 0\) and \(y = \sum _{j = 1}^n \lambda _j \alpha (j)\) with \(\sum _{j = 0}^n \lambda _j = 1\), \(\lambda _j\ge 0\). Then, \(f(e^{\mathbf {w}})\) with \(\mathbf {w} \in \mathbb {R}^n\) has a unique extremal point, which is always a minimum.

This proposition was used in [37] (see Lemma 4.2 and Theorem 5.4). For convenience, we give our own, simpler proof here.

Proof

We investigate the standard form g of f. For the partial derivative \(x_j \partial g / \partial x_j\) (we may multiply by \(x_j\), since \(e^{\mathbf {w}} > 0\)), we have

$$\begin{aligned} x_j \frac{\partial g}{\partial x_j} \ = \ b_j \mu x_j^{\mu } + c \lambda _j \mu \prod _{k = 1}^n x_k^{\lambda _k \mu }. \end{aligned}$$

Hence, the partial derivative vanishes for some \(e^{\mathbf {w}}\) if and only if

$$\begin{aligned} \exp \left( \mu w_j - \sum _{k = 1}^n \lambda _k \mu w_k\right) \ = \ - \frac{c \lambda _j}{b_j}. \end{aligned}$$

Since the right-hand side is strictly positive, we can apply \(\log |\cdot |\) on both sides for every partial derivative and obtain the following linear system of equations

$$\begin{aligned} \left( E_n - \left( \begin{array}{ccc} \lambda _1 & \cdots & \lambda _n \\ \vdots & \ddots & \vdots \\ \lambda _1 & \cdots & \lambda _n \\ \end{array}\right) \right) \cdot \left( \begin{array}{c} w_1 \\ \vdots \\ w_n \\ \end{array}\right) \ = \ \left( \begin{array}{c} \frac{1}{\mu }(\log (\lambda _1) + \log (-c) - \log (b_1)) \\ \vdots \\ \frac{1}{\mu }(\log (\lambda _n) + \log (-c) - \log (b_n)) \\ \end{array}\right) . \end{aligned}$$

Since the matrix on the left-hand side has full rank, we have a unique solution.

For arbitrary f, we have \(f(e^{\mathbf {w}}) = g(e^{T^t \mathbf {w}})\) by Proposition 3.1 and, hence, if \(\mathbf {w}^*\) is the unique extremal point for \(g(e^{\mathbf {w}})\), then \((T^t)^{-1} \mathbf {w}^*\) is the unique extremal point for \(f(e^{\mathbf {w}})\).

For \(||\mathbf {w}|| \rightarrow \infty \), the function \(f(e^{\mathbf {w}})\) is dominated by the terms whose exponents are contained in a particular proper face of \({{\mathrm{New}}}(f)\). Since all these terms are strictly positive, \(f(e^{\mathbf {w}})\) converges to a value in \(\mathbb {R}_{> 0} \cup \{\infty \}\). Thus, the unique extremal point has to be a global minimum. \(\square \)

For \(f \in P_{\Delta }^y\), we define \(\mathbf {s}^*_f \in \mathbb {R}^n\) as the unique vector satisfying

$$\begin{aligned} \prod _{k = 1}^n \left( e^{s_{k,f}^*}\right) ^{\alpha (j)_k} \ = \ e^{\langle \mathbf {s}^*_f,\alpha (j) \rangle } \ = \ \frac{\lambda _j}{b_j} \ \text { for all } \ 1 \le j \le n. \end{aligned}$$

\(\mathbf {s}^*_f\) is indeed well defined, since applying \(\log |\cdot |\) on both sides yields a linear system of equations in the variables \(s_{k,f}^*\), and this system has rank n, since \({{\mathrm{conv}}}(A)\) is a simplex. If the context is clear, then we simply write \(\mathbf {s}^*\) instead of \(\mathbf {s}^*_f\) and \(e^{\mathbf {s}^*}\) instead of \(e^{\mathbf {s}^*_f}\). We recall that the circuit number associated to a polynomial \(f \in P_{\Delta }^y\) is given by \(\Theta _f = \prod _{j = 0}^n \left(\frac{b_j}{\lambda _j}\right)^{\lambda _j} = \prod _{j = 1}^n \left(\frac{b_j}{\lambda _j}\right)^{\lambda _j}\), where the second equality uses the normalization \(b_0 = \lambda _0\).
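A hedged computational sketch (assuming numpy; the example is the second one treated in Sect. 3.3 below) of how \(\mathbf {s}^*\) arises from this linear system:

```python
import numpy as np

# Solve <s*, alpha(j)> = log(lambda_j / b_j), 1 <= j <= n, for the polynomial
# f = 1/4 + 2 x1^2 x2^4 + x1^4 x2^4 + c x1^2 x2^3 (normalized: b_0 = lambda_0).
alpha = np.array([[2.0, 4.0], [4.0, 4.0]])   # rows alpha(1), alpha(2)
lam = np.array([0.5, 0.25])                  # lambda_1, lambda_2 (lambda_0 = 1/4)
b = np.array([2.0, 1.0])                     # b_1, b_2

s_star = np.linalg.solve(alpha, np.log(lam / b))
theta = np.prod((b / lam) ** lam)            # circuit number, = 2 sqrt(2)

# Sanity check of Proposition 3.4: f(e^{s*}) = 0 for c = -Theta_f.
y = np.array([2.0, 3.0])
print(0.25 + (b * np.exp(alpha @ s_star)).sum() - theta * np.exp(y @ s_star))  # ~0
```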

Proposition 3.4

For \(f \in P_{\Delta }^y\) and \(c = -\Theta _f\), the point \(\mathbf {s}^* \in \mathbb {R}^n\) is a root and the unique global minimizer of \(f(e^{\mathbf {w}})\).

Due to this proposition, we call the point \(\mathbf {s}^*\) the norm minimizer of f. We remark that this proposition was already shown for polynomials in \(P_{\Delta }^y\) in standard form in [9] and, for arbitrary simplices but in a more complicated way, in [37].

Proof

For \(f(e^{\mathbf {s}^*})\), we have

$$\begin{aligned} f\left( e^{\mathbf {s}^*}\right) \ = \ \lambda _0 + \sum _{j = 1}^n b_j e^{\langle \mathbf {s}^*, \alpha (j) \rangle } - \Theta _f e^{\langle \mathbf {s}^*, y \rangle } \ = \ \sum _{j = 0}^n \lambda _j - \Theta _f \cdot \prod _{j = 1}^n \left( \frac{\lambda _j}{b_j} \right) ^{\lambda _j} \ = \ 1 - 1 \ = \ 0. \end{aligned}$$

For the minimizer statement, we investigate the partial derivatives \(x_j \partial f / \partial x_j\) (we may multiply by \(x_j\), since \(e^{\mathbf {w}} > 0\)). Since \(y_j = \sum _{k = 1}^n \lambda _k \alpha _j(k)\), we obtain

$$\begin{aligned} x_j \frac{\partial f}{\partial x_j} \ = \ \sum _{k = 1}^n b_k \alpha _j(k) \mathbf {x}^{\alpha (k)} - \Theta _f \cdot \left( \sum _{k = 1}^n \lambda _k \alpha _j(k)\right) \mathbf {x}^{y}. \end{aligned}$$

Evaluation of the partial derivative at \(e^{\mathbf {s}^*}\) yields

$$\begin{aligned} x_j \frac{\partial f}{\partial x_j}(e^{\mathbf {s}^*}) \ = \ \sum _{k = 1}^n b_k \alpha _j(k) \left( \frac{\lambda _k}{b_k}\right) - \Theta _f \left( \sum _{k = 1}^n \lambda _k \alpha _j(k)\right) \cdot \prod _{l = 1}^n \left( \frac{\lambda _l}{b_l} \right) ^{\lambda _l} \ = \ \sum _{k = 1}^n \lambda _k \alpha _j(k) - \sum _{k = 1}^n \lambda _k \alpha _j(k) \ = \ 0. \end{aligned}$$

Finally, by Proposition 3.3, \(e^{\mathbf {s}^*}\) is the unique global minimizer of \(f(e^{\mathbf {w}})\). \(\square \)

In some contexts, it is more convenient to work with a Laurent polynomial supported on a circuit where the interior point y equals the origin. With the same argument as before, we find a suitable standard form.

Corollary 3.5

Let f and all notations be as in Proposition 3.1. Then, there exists a unique Laurent polynomial g of the Form (1) with \({{\mathrm{supp}}}(g) = A'' = \{\alpha (0)'',\ldots ,\alpha (n)'',0\} \subset \mathbb {Z}^n\) such that the following properties hold:

  1. (1)

\(M^A = \left( \begin{array}{cc} 1 & 0 \\ 0 & T \\ \end{array}\right) M^{A''}\) for some \(T \in GL_n(\mathbb {Q})\),

  2. (2)

    f and g have the same coefficients,

  3. (3)

    \(\alpha (j)'' = \mu \cdot e_j\) for every \(1 \le j \le n\),

  4. (4)

    \(\sum _{j = 0}^n \lambda _j \alpha (j)'' = 0\),

  5. (5)

    \(f(e^{\mathbf {w}}) = g(e^{T^t\mathbf {w}})\) for all \(\mathbf {w} \in \mathbb {R}^n\).

For every polynomial f of the Form (1), we call the polynomial g, which satisfies all conditions in Corollary 3.5, the zero standard form of f. Note that the support matrix \(M^{A''}\) of the zero standard form of f is of the shape

$$\begin{aligned} M^{A''} \ = \ \left( \begin{array}{cccccc} 1 & 1 & \cdots & \cdots & 1 & 1 \\ - \frac{\lambda _1 \mu }{\lambda _0} & \mu & 0 & \cdots & 0 & 0\\ \vdots & 0 & \ddots & & \vdots & \vdots \\ \vdots & \vdots & & \ddots & 0 & \vdots \\ - \frac{\lambda _n \mu }{\lambda _0} & 0 & \cdots & 0 & \mu & 0\\ \end{array}\right) \in {{\mathrm{Mat}}}(\mathbb {Z},(n+1) \times (n+2)). \end{aligned}$$
(10)

Proof

We divide f by \(\mathbf {x}^{y}\), which is always possible, since \(e^{\mathbf {w}} > 0\). We then apply the proof of Proposition 3.1 literally, with the exception of using the matrix \(M_0^A\) instead of \(M_{n+1}^A\) and the convex combination \(-\lambda _0 \alpha (0) = \sum _{j = 1}^n \lambda _j \alpha (j)\) instead of \(y = \sum _{j = 0}^n \lambda _j \alpha (j)\). \(\square \)

An advantage of the zero standard form is that the global minimizer no longer depends on the choice of c.

Corollary 3.6

For \(f \in P_{\Delta }^y\), the point \(e^{\mathbf {s}^*}\) is a global minimizer of \((f / \mathbf {x}^y)(e^{\mathbf {w}})\), independently of the choice of c.

Proof

By Corollary 3.5, we can transform f into zero standard form with \(y = 0\). Then, the proof of Proposition 3.4 applies literally again, with the exception that the minimal value \((f/\mathbf {x}^y)(e^{\mathbf {s}^*})\) equals 0 if and only if \(c = -\Theta _f\). \(\square \)

3.3 Nonnegativity of polynomials supported on a circuit

In this section, we characterize nonnegativity of polynomials in \(P_\Delta ^y\). The following lemma allows us to reduce the case of \(y \in \partial \Delta \) to the case \(y \in {{\mathrm{int}}}(\Delta )\).

Lemma 3.7

Let \(f = b_0 + \sum _{j=1}^n b_j\mathbf {x}^{\alpha (j)} + c\cdot \mathbf {x}^y\) be such that the Newton polytope is given by \(\Delta = {{\mathrm{New}}}(f) = {{\mathrm{conv}}}\{0,\alpha (1),\dots ,\alpha (n)\}\) and \(y \in \partial \Delta \). Furthermore, let F be the face of \(\Delta \) containing y. Then, f is nonnegative if and only if the restriction of f to the face F is nonnegative.

Proof

For the necessity of nonnegativity of the restricted polynomial, see [31]. Conversely, the restriction of f to the face F contains the monomial \(\mathbf {x}^y\) and is nonnegative by assumption. Since all other terms of f correspond to the (even) vertices of \(\Delta \) and have positive coefficients, the claim follows. \(\square \)

Now, we show the first part of our main Theorem 1.1 by characterizing nonnegative polynomials \(f \in P_{\Delta }^y\) supported on a circuit. Recall that we denote the set of such nonnegative polynomials of degree 2d in n variables by \(P_{n,2d}^y\). Note that this theorem covers the known special cases of agiforms [31] and circuit polynomials in standard form [9].

Theorem 3.8

Let \(f = \lambda _0 + \sum _{j=1}^n b_j\mathbf {x}^{\alpha (j)} + c\cdot \mathbf {x}^y \in P_{\Delta }^y\) be of the Form (1) with \(\alpha (j) \in (2\mathbb {N})^n\). Then, the following are equivalent.

  1. (1)

    \(f \in P_{n,2d}^y\), i.e., f is nonnegative.

  2. (2)

Either \(|c| \le \Theta _f\) and \(y \notin (2\mathbb N)^n\), or \(c \ge - \Theta _f\) and \(y \in (2\mathbb N)^n\).

Proof

First, observe that \(f \ge 0\) is trivial for \(c \ge 0\) and \(y \in (2\mathbb N)^n\), since in this case f is a sum of monomial squares.

We apply the norm relaxation strategy introduced in Sect. 3.1. Initially, we show that \(f(\mathbf {x}) \ge 0\) for all \(\mathbf {x} \in \mathbb {R}^n\) if and only if \(f(e^{\mathbf {w}}) \ge 0\) for all \(\mathbf {w} \in \mathbb {R}^n\). Let without loss of generality \(y_1,\ldots ,y_k\) be the odd entries of the exponent vector y. Thus, for every \(1 \le j \le k\), replacing \(x_j\) by \(-x_j\) changes the sign of the term \(c\cdot \mathbf {x}^y\). Since all other terms of f are nonnegative for every choice of \(\mathbf {x} \in \mathbb {R}^n\), we have \(f(\mathbf {x}) \ge 0\) if \({{\mathrm{sgn}}}(c) \cdot {{\mathrm{sgn}}}(x_1) \cdots {{\mathrm{sgn}}}(x_k) = 1\). Since furthermore \(c\cdot \mathbf {x}^y = -|c| \cdot |x_1|^{y_1} \cdots |x_n|^{y_n}\) whenever \({{\mathrm{sgn}}}(c) \cdot {{\mathrm{sgn}}}(x_1) \cdots {{\mathrm{sgn}}}(x_k) = -1\), we can assume \(c \le 0\) and \(\mathbf {x} \ge 0\) without loss of generality. Then, \(\lambda _0 + \sum _{j = 1}^n b_j \mathbf {x}^{\alpha (j)} - |c| \cdot |\mathbf {x}|^{y}\) is nonnegative for all \(\mathbf {x} \in \mathbb {R}^n\) if and only if this is the case for all \(\mathbf {x} \in \mathbb {R}_{\ge 0}^n\). And since \(\mathbb {R}_{> 0}^{n}\) is an open, dense set in \(\mathbb {R}_{\ge 0}^{n}\), we can restrict ourselves to the strictly positive orthant. With the componentwise bijection between \(\mathbb {R}^n\) and \(\mathbb {R}_{> 0}^n\) given by the \({{\mathrm{Exp}}}\)-map, it follows that \(f(\mathbf {x}) \ge 0\) for all \(\mathbf {x} \in \mathbb {R}^n\) if and only if \(f(e^{\mathbf {w}}) \ge 0\) for all \(\mathbf {w} \in \mathbb {R}^n\). Hence, the theorem is shown if we prove that \(f(e^{\mathbf {w}}) \ge 0\) for all \(\mathbf {w} \in \mathbb {R}^n\) if and only if \(c \in [-\Theta _f,0]\).

We fix arbitrary \(b_1,\ldots ,b_n \in \mathbb {R}_{> 0}\) and denote by \((f_c)_{c \in \mathbb {R}}\) the corresponding family of polynomials in \(P_{\Delta }^y\). By Proposition 3.4, for \(c = -\Theta _f\) the function \(f_c(e^{\mathbf {w}})\) has a unique global minimum, attained at \(\mathbf {s}^* \in \mathbb {R}^n\) and satisfying \(f_{-\Theta _f}(e^{\mathbf {s}^*}) = 0\). Since \(e^{\mathbf {s}^*}\) is a global (norm) minimizer, this implies, in particular, \(f_c(e^{\mathbf {w}}) \ge 0\) for all \(\mathbf {w} \in \mathbb {R}^n\) if \(c = -\Theta _f\).

But this fact also completes the proof for general \(c < 0\): since \(c \cdot e^{\langle \mathbf {w}, y \rangle }\) is the unique negative term in \(f_c(e^{\mathbf {w}})\) for all \(\mathbf {w} \in \mathbb {R}^n\), a term-by-term inspection yields that \(f_c(e^{\mathbf {w}}) < f_{-\Theta _f}(e^{\mathbf {w}})\) if and only if \(c < -\Theta _f\). Hence, \(f_c(e^{\mathbf {w}}) < 0\) for some \(\mathbf {w} \in \mathbb {R}^n\) if and only if \(c < -\Theta _f\). \(\square \)

An immediate consequence of the theorem is an upper bound for the number of zeros of polynomials \(f \in \partial P_{n,2d}^y\).

Corollary 3.9

Let \(f \in \partial P_{n,2d}^y\). Then, f has at most \(2^n\) affine real zeros \(\mathbf {v} \in \mathbb {R}^n\), all of which satisfy \(|v_j| = e^{s_j^*}\) for all \(1 \le j \le n\).

Proof

Assume \(f \in \partial P_{n,2d}^y\) and \(f(\mathbf {x}) = 0\) for some \(\mathbf {x} \in \mathbb {R}^n\). Then, we know by the proof of Theorem 3.8 that \(|x_j| = e^{s_j^*}\). Thus, \(\mathbf {x} = (\pm e^{s_1^*},\ldots ,\pm e^{s_n^*})\). \(\square \)

The bound in Corollary 3.9 is sharp, as demonstrated by the well-known Motzkin polynomial \(f = 1 + x_1^2x_2^4 + x_1^4x_2^2 - 3x_1^2x_2^2 \in P_{2,6}^y\). The zeros are given by \(\mathbf {x} = (\pm 1, \pm 1)\). Furthermore, it is important to note that the maximum number of zeros does not depend on the degree of the polynomials, which is in sharp contrast to previously known results concerning the maximum number of zeros of nonnegative polynomials and sums of squares, see [6].

To illustrate the results of this section, we give an example. Let \(f = 1 + x_1^2x_2^4 + x_1^4x_2^2 - 3x_1^2x_2^2\) be the Motzkin polynomial. f is supported on a circuit A with \(y = \sum _{j = 0}^2 \frac{1}{3} \alpha (j)\). We apply Proposition 3.1 and compute the standard form g of \(1/3 \cdot f\) (recall our normalization \(b_0 = \lambda _0\)). Then, g is the polynomial supported on a circuit \(A' = \{0,\alpha (1)',\alpha (2)'\}\) satisfying \(M^A = \left( \begin{array}{cc} 1 & 0 \\ 0 & T \\ \end{array}\right) M^{A'}\) for some \(T \in GL_{n}(\mathbb {Q})\) with \(\alpha (1)' = (\mu ,0)^t\), \(\alpha (2)' = (0,\mu )^t\) and \(y' = 1/3 \alpha (1)' + 1/3 \alpha (2)'\), where \(\mu = {{\mathrm{lcm}}}\{1/\lambda _0,1/\lambda _1,1/\lambda _2\} = {{\mathrm{lcm}}}\{3,3,3\} = 3\). Additionally, g has the same coefficients as \(1/3 \cdot f\). It is easy to see that

$$\begin{aligned} T \ = \ \left( \begin{array}{cc} 4/3 & 2/3 \\ 2/3 & 4/3 \\ \end{array}\right) \end{aligned}$$

and thus

$$\begin{aligned} g \ = \ 1/3 + 1/3\, x_1^3 + 1/3\, x_2^3 - x_1 x_2 \end{aligned}$$

and, by Proposition 3.1, we have \((1/3 \cdot f)(e^{\mathbf {w}}) = g(e^{T^t \mathbf {w}})\).

Since the circuit number only depends on the coefficients and on the convex combination of y, it is invariant under the transformation to standard form. Thus, for the scaled polynomial \(1/3 \cdot f\), we have

$$\begin{aligned} \Theta _{1/3 \cdot f} \ = \ \Theta _g \ = \ \prod _{j = 0}^2 \left( \frac{b_j}{\lambda _j}\right) ^{\lambda _j} \ = \ \left( \frac{1/3}{1/3}\right) ^{1/3} \cdot \left( \frac{1/3}{1/3}\right) ^{1/3} \cdot \left( \frac{1/3}{1/3}\right) ^{1/3} \ = \ 1. \end{aligned}$$

Since \(y = (2,2) \in (2\mathbb {N})^2\), Theorem 3.8 yields that \(1/3 \cdot f\) is nonnegative if and only if its inner coefficient c satisfies \(c \ge -\Theta _{1/3 \cdot f} = -1\). But the inner coefficient of \(1/3 \cdot f\) equals \(-1\), its negative circuit number. Hence, the Motzkin polynomial is contained in the boundary of the cone of nonnegative polynomials.

For \(c = -\Theta _{1/3 \cdot f}\), we know by Proposition 3.4 that \((1/3 \cdot f)(e^{\mathbf {w}})\) vanishes at the unique point \(\mathbf {s}^*\) with

$$\begin{aligned} 1/3 \cdot e^{4s_1^* + 2s_2^*} = 1/3 \quad \text { and } \quad 1/3 \cdot e^{2s_1^* + 4s_2^*} = 1/3. \end{aligned}$$

Thus, \(\mathbf {s}^* = (0,0)\). Since, by the proof of Theorem 3.8, \(f(\mathbf {x}) = 0\) only if \(f(|x_1|,|x_2|) = 0\), we can conclude that every affine root \(\mathbf {v} \in \mathbb {R}^n\) of the Motzkin polynomial satisfies \(|v_j| = 1\).

We give a second example where nonnegativity is not a priori known. Let \(f = 1/4 + 2 \cdot x_1^2x_2^4 + x_1^4x_2^4 - 2.5 \cdot x_1^2x_2^3\). Again, it is easy to see that \(\lambda _1 = 1/2\) and \(\lambda _2 = 1/4\). Hence,

$$\begin{aligned} \Theta _f \ = \ \left( \frac{b_1}{\lambda _1}\right) ^{\lambda _1} \cdot \left( \frac{b_2}{\lambda _2}\right) ^{\lambda _2} \ = \ (2 \cdot 2)^{1/2} \cdot (1 \cdot 4)^{1/4} \ = \ 2 \cdot \sqrt{2} \ \approx \ 2.828. \end{aligned}$$

And since \(|c| < \Theta _f\), we can conclude that f is a strictly positive polynomial.
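The computations of both examples can be bundled into a small decision procedure following Theorem 3.8; the sketch below (assuming numpy; the function name is ours) exploits that the criterion is invariant under positive scaling, so no normalization \(b_0 = \lambda _0\) is required.

```python
import numpy as np

def is_nonnegative(alpha, y, b, c):
    """Criterion of Theorem 3.8 for a circuit polynomial with even vertices
    alpha, interior point y, vertex coefficients b > 0, inner coefficient c."""
    M = np.vstack([np.ones(len(alpha)), np.asarray(alpha, dtype=float).T])
    lam = np.linalg.solve(M, np.array([1.0, *y]))
    theta = np.prod((np.asarray(b) / lam) ** lam)  # circuit number
    if all(v % 2 == 0 for v in y):
        return c >= -theta       # y even: only a lower bound on c
    return abs(c) <= theta       # y not even: |c| <= Theta_f

# Motzkin support, Theta_f = 3; the boundary case c = -3 would need exact arithmetic.
print(is_nonnegative([(0, 0), (4, 2), (2, 4)], (2, 2), [1, 1, 1], -2.9))    # True
print(is_nonnegative([(0, 0), (4, 2), (2, 4)], (2, 2), [1, 1, 1], -3.1))    # False
# Second example: |c| = 2.5 < 2 sqrt(2) ~ 2.828, hence strictly positive.
print(is_nonnegative([(0, 0), (2, 4), (4, 4)], (2, 3), [1/4, 2, 1], -2.5))  # True
```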

3.4 A-discriminants and Gale duals

For a given \((n+1) \times m\) support matrix \(M^A\) with \(A \subset \mathbb {Z}^n\) and \({{\mathrm{conv}}}(A)\) full dimensional, a Gale dual or Gale transformation is an integral \((m-n-1) \times m\) matrix \(M^B\) such that its rows span the \(\mathbb {Z}\)-kernel of \(M^A\). In other words, for every integral vector \(\mathbf {v} \in \mathbb {Z}^m\) with \(M^A \mathbf {v} = 0\), it holds that \(\mathbf {v}\) is an integral linear combination of the rows of \(M^B\), see [11, 28].

If A is a circuit, then \(M^B\) is a vector with \(n+2\) entries. It turns out that this vector is closely related to the global minimum \(e^{\mathbf {s}^*} \in \mathbb {R}^n\) and the circuit number \(\Theta _f\).

Corollary 3.10

Let \(f = \sum _{j = 0}^n b_j \mathbf {x}^{\alpha (j)} + c \mathbf {x}^y\) be a polynomial supported on a circuit A of the Form (1). Let \(e^{\mathbf {s}^*} \in \mathbb {R}^n\) denote the global minimizer and \(\Theta _f\) the circuit number. Then, the Gale dual \(M^B\) of the support matrix \(M^A\) is an integral multiple of the vector

$$\begin{aligned} \left( b_0 e^{\langle \mathbf {s}^*,\alpha (0) \rangle },\ldots ,b_n e^{\langle \mathbf {s}^*,\alpha (n) \rangle },-\Theta _f e^{\langle \mathbf {s}^*,y \rangle }\right) \ \in \ \mathbb {R}^{n+2}. \end{aligned}$$

Proof

The Gale dual \(M^B\) needs to satisfy \(M^A (M^B)^t = 0\). Since A is a circuit, \(M^B\) spans a one-dimensional vector space. From \(y = \sum _{j = 0}^n \lambda _j \alpha (j)\), it follows by construction of \(e^{\mathbf {s}^*}\) and \(\Theta _f\) (see proof of Proposition 3.4) that

$$\begin{aligned} \left( b_0 e^{\langle \mathbf {s}^*,\alpha (0) \rangle },\ldots ,b_n e^{\langle \mathbf {s}^*,\alpha (n) \rangle },-\Theta _f e^{\langle \mathbf {s}^*,y \rangle }\right) \ = \ (\lambda _0,\ldots ,\lambda _n,-1) \end{aligned}$$

and the statement follows by definition of \(M^A\) and y. \(\square \)
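A quick symbolic check of Corollary 3.10 (a hedged sketch assuming sympy) for the polynomial of Fig. 1, whose support is the circuit \(A = \{(0,0),(2,1),(1,2),(1,1)\}\):

```python
import sympy as sp

# Columns of M^A: alpha(0) = 0, alpha(1) = (2,1), alpha(2) = (1,2), y = (1,1).
MA = sp.Matrix([[1, 1, 1, 1],
                [0, 2, 1, 1],
                [0, 1, 2, 1]])
print(MA.nullspace())
# one basis vector of the kernel, proportional to
# (lambda_0, lambda_1, lambda_2, -1) = (1/3, 1/3, 1/3, -1);
# an integral Gale dual is the multiple (1, 1, 1, -3)
```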

Furthermore, we point out that the circuit number \(\Theta _f\) and the question of nonnegativity are closely related to A-discriminants. Let \(A = \{\alpha (1),\ldots ,\alpha (d)\} \subset \mathbb {Z}^n\) and let \(\mathbb {C}^A\) denote the space of all polynomials \(\sum _{j = 1}^d b_j \mathbf {z}^{\alpha (j)}\) with \(b_j \in \mathbb {C}\). Since every (Laurent) polynomial in \(\mathbb {C}^A\) is uniquely determined by its coefficients, \(\mathbb {C}^A\) can be identified with \(\mathbb {C}^d\). Let \(\nabla _A\) be the Zariski closure of the set of all polynomials f in \(\mathbb {C}^A\) for which there exists a point \(\mathbf {z} \in (\mathbb {C}^*)^n\) such that

$$\begin{aligned} f(\mathbf {z}) = 0 \text { and } \frac{\partial f}{\partial z_j}(\mathbf {z}) \ = \ 0 \ \text { for all } 1 \le j \le n. \end{aligned}$$

It is well known that \(\nabla _A\) is an irreducible \(\mathbb {Q}\)-variety. If \(\nabla _A\) is of codimension 1, then the A-discriminant \(\Delta _A\) is the integral, irreducible monic polynomial in \(\mathbb {C}[b_1,\ldots ,b_d]\) whose variety is \(\nabla _A\), see [11].

The following statement is an immediate consequence of Proposition 3.4 and Theorem 3.8. But it was (at least implicitly) already known before and can also be derived from [11, 37], and [26].

Corollary 3.11

The A-discriminant vanishes at a polynomial \(f \in P_{\Delta }^y\) if and only if \(f \in \partial P_{n,2d}^y\) or, equivalently, if and only if either \(c \in \{\pm \Theta _f\}\) and \(y \notin (2\mathbb {N})^n\), or \(c = -\Theta _f\) and \(y \in (2\mathbb {N})^n\).

4 Amoebas of real polynomials supported on a circuit

In this section, we investigate amoebas of real polynomials supported on a circuit. We show that for amoebas of polynomials of the Form (1) that are not a sum of monomial squares, a point \(\mathbf {w}\) is contained in a bounded component of the complement only if the norm of the “inner” monomial is greater than the sum of the norms of all “outer” monomials at \(\mathbf {w} \in \mathbb {R}^n\) (Theorem 4.2). This implies in particular that an amoeba of this type has a bounded component in the complement if and only if the “inner” coefficient c satisfies \(|c| > \Theta _f\), which proves the equivalence of (1) and (2) in Theorem 1.1. Furthermore, this result generalizes some statements in [37].

In this section, we always assume that \((f_c)\) is a parametric family of Laurent polynomials of the Form (1) with real parameter \(c \in \mathbb {R}_{\le 0}\), and that each \(f_c\) is given in zero standard form (see Sect. 3), i.e.,

$$\begin{aligned} f_c \ = \ \sum _{j = 1}^{n+1} b_j \mathbf {x}^{\alpha (j)} + c, \end{aligned}$$
(11)

with \(b_1,\ldots ,b_{n+1} \in \mathbb {R}_{> 0}\). Let \(\mathbf {w} \in \mathbb {R}^n\) be an arbitrary point in the underlying space of \({\mathcal {A}} (f_c)\). As introduced in Sect. 2.2, we denote the fiber with respect to the \({{\mathrm{Log}}}|\cdot |\)-map by \(\mathbb {F}_{\mathbf {w}}\) and the fiber function of \(f_c\) at the fiber \(\mathbb {F}_{\mathbf {w}}\) by \(f_c^{|\exp (\mathbf {w})|}\). We define the following parameters:

$$\begin{aligned} \Theta _{\mathbf {w}} \ = \ \sum _{j = 1}^{n+1} \left| b_j e^{\langle \mathbf {w}, \alpha (j) \rangle }\right| , \qquad \Psi _{\mathbf {w}} \ = \ \max _{1 \le j \le n+1} \left| b_j e^{\langle \mathbf {w}, \alpha (j) \rangle }\right| . \end{aligned}$$

The following facts about amoebas of polynomials supported on a circuit are well known.

Theorem 4.1

(Purbhoo, Rullgård, Theobald, de Wolff) Let \(f = \lambda _0 + \sum _{j = 1}^n b_j \mathbf {z}^{\alpha (j)} + c \mathbf {z}^{y} \in \mathbb {C}[z_1^{\pm 1},\ldots ,z_n^{\pm 1}]\) be a Laurent polynomial with \(b_j \in \mathbb {C}^*\) and \(c \in \mathbb {C}\) such that \({{\mathrm{New}}}(f)\) is a simplex and \(y \in {{\mathrm{int}}}({{\mathrm{New}}}(f))\).

  1. (1)

    The complement of \({\mathcal {A}} (f)\) has exactly \(n+1\) unbounded and at most one bounded component. If the bounded component \(E_y(f)\) exists, then it has order y.

  2. (2)

    \(\mathbf {w} \in E_y(f) \subset \mathbb {R}^n\) only if \(|c| > \Psi _{\mathbf {w}}\).

  3. (3)

    \(\mathbf {w} \in E_y(f) \subset \mathbb {R}^n\) if \(|c| > \Theta _{\mathbf {w}}\).

  4. (4)

    The complement of \({\mathcal {A}} (f)\) has a bounded component if \(|c| > \Theta _f\) and the bound is sharp if there exists a point \(\mathbf {\phi }\) on the unit torus \((S^1)^n \subset (\mathbb {C}^*)^n\) such that the fiber function \(f^{1}\) satisfies \(f^{1}(\mathbf {\phi }) = e^{i \psi } \cdot (\sum _{j = 0}^n |b_{\alpha (j)}| - |c|)\) for some \(\psi \in [0,2\pi )\).

Parts (1) and (2) are consequences of a theorem of Rullgård based on tropical geometry, which was applied to the circuit case by Theobald and the second author, see [37, Lemma 2.1] and also [7, Theorem 4.1]. Part (3) is an immediate consequence of Purbhoo's lopsidedness condition (also referred to as the generalized Pellet Theorem), see [30]. Part (4) is [37, Theorem 4.4] after investigating f in the standard form introduced in Sect. 3, which guarantees that the bound given in [37, Theorem 4.4] coincides with the circuit number \(\Theta _f\). Note that this means \(\Theta _f = \min _{\mathbf {w} \in \mathbb {R}^n} \Theta _{\mathbf {w}}\). Similarly, we define \(\Psi _{f} = \min _{\mathbf {w} \in \mathbb {R}^n} \Psi _{\mathbf {w}}\). We remark that \(\Psi _f\) is the minimal choice of |c| such that the tropical hypersurface \({\mathcal {T}} ({{\mathrm{trop}}}(f))\) of the tropical polynomial \({{\mathrm{trop}}}(f) = \bigoplus _{j = 1}^{n+1} \log |b_j| \odot \mathbf {x}^{\alpha (j)} \oplus \log |c|\) has genus one, see [7, 37] for details; for an introduction to tropical geometry see [23].

In summary, Theorem 4.1 yields that the complement of an amoeba \({\mathcal {A}} (f)\) of a real polynomial \(f \in P_{\Delta }^y\) has a bounded component for every choice of \(c < -\Theta _f\), that the complement of \({\mathcal {A}} (f)\) has no bounded component for \(c \in [-\Psi _{f},0]\), and that the situation is unclear for \(c \in (-\Theta _f,-\Psi _f)\), see Fig. 3. Hence, our goal in this section is to show the following theorem.

Theorem 4.2

Let \(f_c\) be of the Form (11) such that \(b_1,\ldots ,b_{n+1} \in \mathbb {R}_{> 0}\) and \(\mathbf {w} \in \mathbb {R}^n\). Then, \(\mathbf {w} \in {\mathcal {A}} (f_c)\) for every real \(c \in [-\Theta _{\mathbf {w}},-\Psi _{\mathbf {w}}]\).

Fig. 3 Existence of a bounded component in the complement depending on the choice of the “inner” coefficient c. If c is contained in the left (blue) interval, then the complement of \({\mathcal {A}} (f_c)\) has a bounded component. If c is contained in the right (green) interval, then \({\mathcal {A}} (f_c)\) is solid. If c is contained in the middle (red) interval, then it is a priori unclear whether the complement of \({\mathcal {A}} (f_c)\) has a bounded component or not
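The three regimes of Fig. 3 can be made explicit numerically; the following hedged sketch (assuming numpy, and that a grid search approximates the minima well) computes \(\Theta _f = \min _{\mathbf {w}} \Theta _{\mathbf {w}}\) and \(\Psi _f = \min _{\mathbf {w}} \Psi _{\mathbf {w}}\) for the polynomial of Fig. 1 in zero standard form:

```python
import numpy as np

# f_c = x1 + x2 + x1^{-1} x2^{-1} + c, obtained from the Fig. 1 polynomial by
# dividing by x1 x2; we approximate Theta_f and Psi_f on a grid.
A = np.array([[1, 0], [0, 1], [-1, -1]])
b = np.array([1.0, 1.0, 1.0])

g = np.linspace(-2.0, 2.0, 801)
w1, w2 = np.meshgrid(g, g)
terms = np.stack([c * np.exp(a[0] * w1 + a[1] * w2) for a, c in zip(A, b)])
print(terms.sum(axis=0).min())  # Theta_f ~ 3 (the circuit number)
print(terms.max(axis=0).min())  # Psi_f ~ 1
# c < -3: bounded complement component; c in [-1, 0]: solid by Theorem 4.1;
# c in (-3, -1): the regime settled by Theorem 4.2 (solid, by Corollary 4.3).
```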

Note that for real polynomials \(f_c \in P_{\Delta }^y\) we have \({\mathcal {A}} (f_c) = {\mathcal {A}} (f_{-c})\) if and only if \(y \notin (2\mathbb {N})^n\): if \(y_j\) is odd and \(\mathbf w \notin {\mathcal {A}} (f_c)\), then \(f_c(\mathbf {z}) \ne 0\) for all \(\mathbf {z}\) contained in the fiber torus \(\mathbb {F}_{\mathbf {w}} = \{\mathbf {z} : {{\mathrm{Log}}}|\mathbf {z}| = \mathbf {w}\}\). On the one hand, this torus is invariant under the variable transformation \(z_j \mapsto -z_j\); on the other hand, this transformation transforms \(f_c\) into \(f_{-c}\). Therefore, Theorem 4.2 implies in particular the following corollary, which is literally the equivalence between Parts (1) and (2) of our main Theorem 1.1.

Corollary 4.3

Let \(f_c\) be a polynomial in \(P_{\Delta }^y\) such that f is not a sum of monomial squares. Then, \({\mathcal {A}} (f_c)\) is solid if and only if \(|c| \in [0,\Theta _f]\).

Proof

The corollary follows immediately from Theorem 4.1 (2) and (4), Theorem 4.2 (including the note following it) and the fact that \(\Theta _f = \Theta _{\mathbf {s}^*} = \min _{\mathbf {w} \in \mathbb {R}^n} \Theta _{\mathbf {w}}\) by Corollary 3.6. \(\square \)

The proof of Theorem 4.2 requires quite some work. We need to show a couple of technical statements before we can tackle the actual proof. The first lemma we need was used similarly in [37, Theorem 4.1].

Lemma 4.4

Let \(g: S^1 \rightarrow \mathbb {C}, \phi \mapsto b_1 e^{i r \phi } + b_2 e^{i \cdot (\eta + s \phi )}\) for some \(b_1, b_2 \in \mathbb {C}^*\) with \(|b_1| \ge |b_2|\), \(\eta \in [0,2\pi )\) and \(r,s \in \mathbb {N}^*\). Then, there exist some \(\phi , \phi ' \in [0,2\pi )\) such that \(g(\phi ) \in \mathbb {R}_{\ge 0}\) and \(g(\phi ') \in \mathbb {R}_{\le 0}\).

For convenience, we provide the proof again. It is mainly based on Rouché's theorem. Recall that the winding number of a closed curve \(\gamma \) in the complex plane around a point z is given by \(\frac{1}{2\pi i} \int _{\gamma } \frac{d \zeta }{\zeta - z}\).

Proof

Assume \(|b_1| > |b_2|\). Clearly, the function \(b_1 \cdot e^{i r \phi }\) has a non-zero winding number around the origin. If g had winding number zero around the origin, then, by Rouché's theorem, there would exist some \(t \in (0,1)\) such that \(h(\phi ) = b_1 \cdot e^{i \cdot r \phi } + t \cdot b_2 \cdot e^{i \cdot (\eta + s \phi )}\) has a zero, i.e., the homotopy between g and \(b_1 \cdot e^{i r \phi }\) would pass through the origin. This is a contradiction, since \(|b_1| > t \cdot |b_2|\) for every \(t \in (0,1)\). Hence, the trajectory of g winds around the origin and therefore needs to intersect the real line both in its strictly positive and in its strictly negative part.

Since g depends continuously on the norms of its coefficients, the statement extends to \(|b_1| = |b_2|\), with intersections of the trajectory of g with the nonnegative as well as the nonpositive part of the real axis. \(\square \)

Now, we turn to complex-valued functions on the real n-torus \((S^1)^n\).

Lemma 4.5

Let \(g: (S^1)^n \rightarrow \mathbb {C}, \phi \mapsto \sum _{j = 1}^n b_j \cdot e^{i \phi _j} + b_{n+1} \cdot e^{-i \sum _{j = 1}^n \lambda _j \phi _j}\) with \(b_1 \ge \ldots \ge b_{n+1} \in \mathbb {R}_{> 0}\) and \(\lambda _j \in \mathbb {Q}\). Then, there exists a path \(\gamma : [0,1] \rightarrow (S^1)^n\) such that \(g(\gamma (t)) \in \mathbb {R}\) for all \(t \in [0,1]\), \(g(\gamma (0)) = \sum _{j = 1}^{n+1} b_j\) and \(g(\gamma (1)) \le b_1 + b_n + b_{n+1} - \sum _{j = 2}^{n-1} b_j\).

Proof

We set \(\gamma (0) = \mathbf {0} \in (S^1)^n\). We construct \(\gamma : [0,1] \rightarrow (S^1)^n\) piecewise on intervals \([k_{j-1},k_j] \subset [0,1]\) for every \(j \in \{2,\ldots ,n-1\}\) with \(k_1 = 0\) and \(k_{n-1} = 1\). In every interval \([k_{j-1},k_j]\), we only vary \(\phi _1,\phi _j\) and \(\phi _n\) and leave all other \(\phi _r\) invariant, i.e., in every interval \([k_{j-1},k_j]\) we only change the first, j-th, n-th and \((n+1)\)-st term.

In the interval \([k_{j-1},k_j]\), we continuously increase \(\phi _j\) from 0 to \(\pi \). For every \(\phi _j \in [0,\pi ]\), there exists \(\phi _1 \in [-\pi /2,0]\) such that \({{\mathrm{Im}}}(b_1 e^{i \phi _1} + b_j e^{i \phi _j}) = 0\), since \(|b_1| \ge |b_j|\). For every pair \((\phi _1,\phi _j) \in [-\pi /2,0] \times [0,\pi ]\), we find, by Lemma 4.4, a \(\phi _n\) such that \({{\mathrm{Im}}}(b_n e^{i \phi _n} + b_{n+1} e^{-i \sum _{r = 1}^n \lambda _r \phi _r}) = 0\), by setting \(\eta = -\sum _{r = 1}^{n-1} \lambda _r \phi _r\) in Lemma 4.4. Hence, for every \(l \in [k_{j-1},k_j]\) we have \(g(\gamma (l)) \in \mathbb {R}\). And since g is a smooth function, we obtain a smooth path segment in \((S^1)^n\) with a smooth real image under g.

At the endpoint \(\gamma (k_j)\) of the path segment on \([k_{j-1},k_j] \subset [0,1]\), we are therefore in the situation \(g(\gamma (k_j)) \le |b_1| + |b_{j+1}| + \cdots + |b_{n-1}| + {{\mathrm{Re}}}(b_n) + {{\mathrm{Re}}}(b_{n+1}) - \sum _{l = 2}^{j} |b_l|\). We can glue the different path segments together, since for each \(\gamma (k_j)\) we have \(\phi _1 = 0\) by construction and the value of \(\phi _n\) does not matter. Thus, we can successively repeat the procedure for all j until we reach \(j = n-1\) and obtain a complete path \(\gamma \subset (S^1)^n\) with the desired properties. \(\square \)

For the next step of the proof, we need to recall the definition of a hypotrochoid. A hypotrochoid with parameters \(R,r \in \mathbb {Q}_{> 0}\), \(d \in \mathbb {R}_{> 0}\) satisfying \(R \ge r\) is the plane algebraic curve \(\gamma \) in \(\mathbb {R}^2 \cong \mathbb {C}\) given by

$$\begin{aligned} \gamma : [0,2\pi ) \rightarrow \mathbb {C}, \quad \phi \mapsto (R - r) \cdot e^{i \cdot \phi } + d \cdot e^{i \cdot \left(\frac{r - R}{r}\right) \cdot \phi }. \end{aligned}$$
(12)

Geometrically, a hypotrochoid is obtained in the following way: let a small circle \(C_1\) with radius r roll along the inside of a larger circle \(C_2\) with radius R, and mark a point p at the end of a segment of length d starting at the center of \(C_1\). The hypotrochoid is the trajectory of p.

We say that a curve \(\gamma \) is a hypotrochoid up to a rotation if there exists some re-parametrization \(\rho _k : [0,2\pi ) \rightarrow [0,2\pi ), \phi \mapsto k + \phi \mod 2 \pi \) with \(k \in [0,2\pi )\) such that \(\gamma \circ \rho _k^{-1}\) is a hypotrochoid. If \(k = 0\) or \(k = \pi \), then we say that \(\gamma \) is a real hypotrochoid. Hypotrochoids are closed, continuous curves in the complex plane, which attain values in the closed annulus with outer radius \((R-r) + d\) and inner radius \((R-r) - d\) whenever \((R - r) \ge d\). Furthermore, if they are real, then they are symmetric along the real line. For an overview of hypotrochoids and other plane algebraic curves, see [4].
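For intuition, the following short Python sketch (our own illustration; the function name is ours) samples the parametrization (12) and confirms the annulus bounds numerically:

```python
import numpy as np

def hypotrochoid(R, r, d, num=4000):
    """Sample the parametrization (12) of a hypotrochoid with R >= r."""
    phi = np.linspace(0.0, 2.0 * np.pi, num, endpoint=False)
    return (R - r) * np.exp(1j * phi) + d * np.exp(1j * ((r - R) / r) * phi)

# For (R - r) >= d, the trajectory lies in the closed annulus with
# outer radius (R - r) + d and inner radius (R - r) - d:
z = hypotrochoid(R=3.0, r=1.0, d=1.5)
print(np.abs(z).min() >= (3.0 - 1.0) - 1.5 - 1e-9)  # True
print(np.abs(z).max() <= (3.0 - 1.0) + 1.5 + 1e-9)  # True
```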

To prove the second key lemma, which is needed for the proof of Theorem 4.2, we make use of the following special case of [38, Theorem 4.1].

Lemma 4.6

Let \(g: S^1 \rightarrow \mathbb {C}, \phi \mapsto e^{i s \phi } + q e^{-i t \phi } + p\) with \(p,q \in \mathbb {C}^*\). Then, g is a hypotrochoid up to a rotation around the point p with parameters \(R = (t + s) / t\), \(r = s / t\) and \(d = |q|\) rotated by \(\arg (q) \cdot s\).

The proof of this lemma is a straightforward computation.

Proof

The non-constant part \(g - p\) of the function g is given by

$$\begin{aligned} (g-p)(\phi )= & {} e^{i s \phi } + |q| \cdot e^{i \cdot (\arg (q) - t \phi )}. \end{aligned}$$

Since, with our choice of parameters, \(R - r = 1\) and \((r - R)/r = -t/s\), it follows by (12) after replacing \(\phi \) by \(\phi ' = s \phi \) that \(g - p\) is a hypotrochoid up to a rotation. \(\square \)

Lemma 4.6 about hypotrochoids allows us to prove the following technical lemma.

Lemma 4.7

Let \(g: (S^1)^2 \rightarrow \mathbb {C}, (\phi _1,\phi _2) \mapsto b_1 e^{i \phi _1} + b_2 e^{i \phi _2} + b_{3} e^{-i \cdot (\lambda _1 \phi _1 + \lambda _2 \phi _2)}\) with \(b_1, b_2 \in \mathbb {R}_{> 0}\), \(b_3 \in \mathbb {R}^*\), \(|b_1| \ge |b_2| \ge |b_3|\) and \(\lambda _1,\lambda _2 \in \mathbb {Q}\). Then, g attains all real values in the interval \( [b_1,b_1 + b_2 + b_3] \subset \mathbb {R}_{> 0}\).

With this lemma, we have everything that is needed to prove Theorem 4.2. We provide the rather long and technical proof of Lemma 4.7 after the proof of Theorem 4.2.

Proof

(Proof of Theorem 4.2) Let \(f_c = \sum _{j = 1}^{n+1} b_j \mathbf {x}^{\alpha (j)} - c\) with \(b_1 \ge \cdots \ge b_{n+1} \in \mathbb {R}_{> 0}\) and \(\mathbf {w} \in \mathbb {R}^n\) such that \(\{\alpha (1),\ldots ,\alpha (n+1),0\} \subset \mathbb {Z}^n\) forms a circuit with 0 in the interior of \({{\mathrm{conv}}}\{\alpha (1),\ldots ,\alpha (n+1)\}\). By Corollary 3.5, we can assume that \(f_c\) is in zero standard form, i.e., we can assume that \(\alpha (j) = \mu e_j \in \mathbb {N}^n\) for \(1 \le j \le n\), with \(\lambda _j \in \mathbb {Q}^*\) as in Section 3, \(\mu \in \mathbb {N}^*\) denoting the least common multiple of the denominators of \(\lambda _0,\ldots ,\lambda _n \in \mathbb {Q} \cap (0,1)\), and \(e_j\) denoting the j-th standard basis vector. By construction, we have \(\alpha (n+1) = - \mu / \lambda _0 \cdot \sum _{j = 1}^n \lambda _j e_j \in \mathbb {Z}^n\). Furthermore, we can assume without loss of generality \(\mathbf {w} = \mathbf {1}\) after adjusting the coefficients \(b_j\) if necessary.

We investigate the fiber function

$$\begin{aligned} f_c^{|\mathbf {1}|}= & {} \sum _{j = 1}^n b_j e^{i \mu \phi _j} + b_{n+1} e^{-i \cdot \frac{\mu }{\lambda _0} \cdot \sum _{j = 1}^n \lambda _j \phi _j} - c. \end{aligned}$$

We have to show that \({\mathcal {V}} (f_c^{|\mathbf {1}|}) \ne \emptyset \) for all c with \(\Psi _f = \Psi _{\mathbf {1}} = |b_1| \le c \le \Theta _{\mathbf {1}}\). By Lemma 4.5 and the intermediate value theorem, \(f_c^{|\mathbf {1}|}\) attains all real values in the interval \([|b_1| + |b_n| + |b_{n+1}| - \sum _{j = 2}^{n-1} |b_j| - c, \Theta _\mathbf {1} - c]\). Hence, if \(|b_n| + |b_{n+1}| - \sum _{j = 2}^{n-1} |b_j| \le 0\), then 0 is among these values and we are done. This is always the case if \(n \ge 4\), or if \(n = 3\) and \(b_2 \ge b_3 + b_4\).

Let now \(n \in \{2,3\}\). If \(n = 3\), then we apply Lemma 4.5, fix \(\phi _2 = \pi \), and restrict \(f_c^{|\mathbf {1}|}\) to

$$\begin{aligned} g(\phi _1,\phi _n)_c= & {} b_1 e^{i \phi _1} + b_n e^{i \phi _n} + b_{n+1} e^{- i \cdot \frac{\mu }{\lambda _0} \cdot (\pi \lambda _2 + \lambda _1 \phi _1 + \lambda _n \phi _n)} - \sum _{1 < j < n} b_j - c, \end{aligned}$$

which is defined on the sub-2-torus of \(\mathbb {F}_{|\mathbf {1}|}\) given by \((\phi _1,\phi _n)\). Since \((\mu \cdot \lambda _2) / \lambda _0\) is an integer, \(b_{n+1} \cdot e^{-i \pi \mu \lambda _2 / \lambda _0}\) is real and hence we can apply Lemma 4.7. It yields that \(g(\phi _1,\phi _n)_c\) attains all real values in the interval \([b_1 - \sum _{1 < j < n} b_j - c, b_1 + b_n + b_{n+1} - \sum _{1 < j < n} b_j - c]\). Thus, all real values in the interval \([|b_1| + |b_n| + |b_{n+1}| - \sum _{j = 2}^{n-1} |b_j| - c, \Theta _\mathbf {1} - c]\) are attained by \(f_c^{|\mathbf {1}|}\), and hence we find a root of \(f_c^{|\mathbf {1}|}\) for every choice of \(b_1 \le c \le \Theta _{\mathbf {1}}\). Therefore, \(\mathbf {1} \in {\mathcal {A}} (f_c)\) for all \(c \in [-\Theta _{\mathbf {1}},-\Psi _{\mathbf {1}}]\). \(\square \)

We close the section with the proof of Lemma 4.7.

Proof

(Proof of Lemma 4.7) For every fixed value of \(\phi _1 \in [0,2\pi )\), the values of g are given by a curve of the form

$$\begin{aligned} h_{\phi _1}: [0,2\pi ) \rightarrow \mathbb {C}, \quad \phi _2 \mapsto b_2 e^{i \phi _2} + b_3 e^{-i (\lambda _1 \phi _1 + \lambda _2 \phi _2)} + b_1 e^{i \phi _1}. \end{aligned}$$

By Lemma 4.6, \(h_{\phi _1}\) is a hypotrochoid up to a rotation, centered at the point \(b_1 e^{i \phi _1}\), attaining its values in the annulus \(A_{\phi _1}\) with outer radius \(b_2 + b_3\) and inner radius \(b_2 - b_3\) around the point \(b_1 e^{i \phi _1}\) (\(A_{\phi _1}\) degenerates to a disc for \(b_2 = b_3\)). Since \(|b_2| \ge |b_3|\), it follows from Lemma 4.4 that \(h_{0}\) intersects the real axis both in at least one point greater than or equal to \(b_1\) and in at least one point less than or equal to \(b_1\). More specifically, let \(\phi _2(1),\ldots ,\phi _2(k) \in [0,2\pi )\) denote the arguments such that \(\mu _j = g(0,\phi _2(j)) \in \mathbb {R}\) with \(\mu _1 \ge \cdots \ge \mu _k\). Analogously, we denote by \(\phi _2'(1),\ldots ,\phi _2'(l) \in [0,2\pi )\) the arguments such that \(\nu _j = g(\pi ,\phi _2'(j)) \in \mathbb {R}\) with \(\nu _1 \ge \cdots \ge \nu _l\). Note that \(\mu _1 \ge b_1 + b_2 - b_3\), \(\mu _k \le b_1 - b_2 + b_3\) and \(\nu _1 \le -b_1 + b_2 + b_3\), and therefore

$$\begin{aligned} \nu _1 \le \mu _k \le b_1 \le b_1 + b_2 - b_3 \le \mu _1. \end{aligned}$$

The key observation of the proof is the following: \(h_{\phi _1}\) depends continuously on \(\phi _1\). But this means that

$$\begin{aligned} H: [0,1] \times [0,2\pi ) \rightarrow \mathbb {C}, \quad H\left( \frac{\phi _1}{2\pi },\phi _2\right) = h_{\phi _1}(\phi _2) = g(\phi _1,\phi _2) = b_1 e^{i \phi _1} + b_2 e^{i \phi _2} + b_3 e^{i (- \lambda _1 \phi _1 - \lambda _2 \phi _2)} \end{aligned}$$

is a homotopy of hypotrochoid curves along the circle with radius \(b_1\).

Since \([b_1,b_1 + b_2 + b_3] \subset [\mu _k,\mu _1] \subset \mathbb {R}\), the proof is complete if we can show that all real values in the interval \([\mu _k,\mu _{k-1}] \cup \cdots \cup [\mu _2,\mu _{1}] = [\mu _k,\mu _1]\) are attained by g.

Since \(g(0,\phi _2)\) is a real hypotrochoid, i.e., in particular, connected and symmetric along the real line, for every \(1 \le j \le k-1\) there exists a closed connected subset \(\gamma _j\) of the trajectory of the hypotrochoid \(g(0,\phi _2)\), together with its pointwise complex conjugate \(\overline{\gamma _j}\), both connecting \(\mu _j\) and \(\mu _{j+1}\); i.e., \(\rho _j = \gamma _j \cup \overline{\gamma _j}\) forms a topological circle intersecting \(\mathbb {R}\) exactly in \(\mu _j\) and \(\mu _{j+1}\), and thus its projection onto \(\mathbb {R}\) covers \([\mu _{j+1},\mu _{j}]\). Hence, \(\bigcup _{j = 1}^{k-1} \rho _j\) projected onto the real line covers \([\mu _k,\mu _1]\).

Now, we restrict the homotopy H of hypotrochoids to a particular circle \(\rho _j\) and to moving \(\phi _1\) continuously from 0 to \(\pi \); i.e., the induced homotopy \(H_j: \rho _j \times [0,1] \rightarrow \mathbb {C}\) moves the circle \(\rho _j\) along the half-circle \(\{b_1 e^{i \cdot \phi _1} : \phi _1 \in [0,\pi ]\}\). Two cases can occur during the homotopy \(H_j\): Either \(\mathbb {R}\) intersects the circle \(\rho _j\) transversally in two points during the whole homotopy, or there exists a point \(\tau \in (0,1)\) such that the circle and \(\mathbb {R}\) intersect non-transversally at \(H_j(\rho _j,\tau )\).

First, assume that there exists a point \(\tau \in (0,1)\) along the homotopy such that \(H_j(\rho _j,\tau )\) intersects the real line non-transversally in a single point \(s \in \mathbb {R}\). Then, \(H_j\) in particular yields a new homotopy \(\widehat{H}_j: \{\mu _j,\mu _{j+1}\} \times [0,\tau ] \rightarrow \mathbb {R}\) moving both points \(\mu _j\) and \(\mu _{j+1}\) to s along the real line. Thus, for every point \(x \in [\mu _{j+1},\mu _j]\), there exists a \(\tau ' \in [0,\tau ]\) such that \(x = \widehat{H}_j(\mu _j,\tau ')\) or \(x = \widehat{H}_j(\mu _{j+1},\tau ')\); i.e., all points in \([\mu _{j+1},\mu _j]\) are visited during the homotopy \(\widehat{H}_j\), and hence every real value in \([\mu _{j+1},\mu _j]\) is attained by g (see Fig. 4).

Now assume that \(H_j(\rho _j,\tau )\) intersects the real line in two distinct points for every \(\tau \in [0,1]\). Thus, again, there is an induced homotopy of points \(\widehat{H}_j: \{\mu _j,\mu _{j+1}\} \times [0,1] \rightarrow \mathbb {R}\) along the real line. Since \(H_j\) is a restriction of H, we know that \(\widehat{H}_j(\{\mu _j,\mu _{j+1}\},1)\) are real points of the hypotrochoid \(H(\pi ,h_\pi (\phi _2))\), i.e., \(\widehat{H}_j(\{\mu _j,\mu _{j+1}\},1) \in \{\nu _1,\ldots ,\nu _l\}\). Since \(\nu _l \le \cdots \le \nu _1\) and \(\nu _1 \le \mu _j\) for all \(1 \le j \le k\), again, all points in \([\mu _{j+1},\mu _j]\) are visited during the homotopy \(\widehat{H}_j\) (see Fig. 5). \(\square \)

Fig. 4: Homotopy of a hypotrochoid where the intersection of the hypotrochoid with the real line becomes empty during the homotopy.

Fig. 5: Homotopy of a hypotrochoid always intersecting the real line.

5 Sums of squares supported on a circuit

In this section, we completely characterize the section \(\Sigma _{n,2d}^y\). It is particularly interesting that this section depends heavily on the lattice point configuration in \(\Delta \), thereby yielding a connection to the theory of lattice polytopes and toric geometry. By investigating this connection in more detail, we will prove that the sections \(P_{2,2d}^y\) and \(\Sigma _{2,2d}^y\) almost always coincide and that, for \(n > 2\), the cones \(P_{n,2d}^y\) and \(\Sigma _{n,2d}^y\) contain large sections on which nonnegative polynomials and sums of squares coincide, see Corollaries 5.10 and 5.12.

Surprisingly, the sums of squares condition is exactly the same as for the corresponding agiforms. To see this, we briefly review the Gram matrix method for sums of squares polynomials. For \(d \in \mathbb {N}\), let \(\mathbb {N}_d^n = \{\alpha \in \mathbb {N}^n : \alpha _1+\dots +\alpha _n \le d\}\) and \(p = \sum _{k=1}^{r}h_k^2\), where \(p(\mathbf {x}) = \sum _{\alpha \in \mathbb {N}_{2d}^n}a(\alpha )\mathbf {x}^{\alpha }\) and \(h_k(\mathbf {x}) = \sum _{\beta \in \mathbb {N}_d^n}b_k(\beta )\mathbf {x}^{\beta }\). Let \(B(\beta ) = (b_1(\beta ),\dots ,b_r(\beta ))\) and \(G(\beta ,\beta ') = B(\beta )\cdot B(\beta ') = \sum _{k = 1}^{r}b_k(\beta )b_k(\beta ')\) for \(\beta ,\beta '\in \mathbb {N}_d^n\). Comparing coefficients, one has

$$\begin{aligned} a(\alpha ) = \sum _{\beta +\beta ' = \alpha }^{}G(\beta ,\beta ') \ = \ \sum _{\beta \in \mathbb {N}^{n}_{d}} G(\beta ,\alpha -\beta ). \end{aligned}$$

In this case, \([B(\beta )\cdot B(\beta ')]_{\beta ,\beta ' \in \mathbb {N}_d^n}\) is a positive semidefinite matrix.
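To make the method concrete, here is a minimal numerical sketch of our own (the data and names are ours, not from [31]): for \(p = (x+y)^2 + (x-y)^2 = 2x^2 + 2y^2\) with monomial basis (x, y), we build the vectors \(B(\beta )\), the Gram matrix G, and verify that G is positive semidefinite and reproduces the coefficients of p.

```python
import numpy as np

# Squares h_1 = x + y, h_2 = x - y, so B(beta) = (b_1(beta), b_2(beta)).
basis = ["x", "y"]
B = {"x": np.array([1.0, 1.0]), "y": np.array([1.0, -1.0])}

# Gram matrix G(beta, beta') = B(beta) . B(beta').
G = np.array([[B[b] @ B[bp] for bp in basis] for b in basis])
print(G)                                    # [[2. 0.] [0. 2.]]
print(np.all(np.linalg.eigvalsh(G) >= 0))   # True: G is positive semidefinite

# Coefficient identity: alpha = (2,0) only splits as (1,0) + (1,0),
# so a(x^2) = G("x","x") = 2, matching p = 2x^2 + 2y^2.
```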

Furthermore, we need the following well-known lemma, see [3].

Lemma 5.1

Let \(f \in \Sigma _{n,2d}\) be a sum of squares and \(T \in GL_n(\mathbb {R})\) be a matrix yielding a variable transformation \(\mathbf {x} \mapsto T\mathbf {x}\). Then, \(f(T\mathbf {x})\) is also a sum of squares.

Now, we can characterize the sums of squares among nonnegative polynomials in \(P_{\Delta }^y\).

Theorem 5.2

Let \(f = \lambda _0 + \sum _{j=1}^n b_j \mathbf {x}^{\alpha (j)} + c\cdot \mathbf {x}^y \in P_{n,2d}^y\). Then,

$$\begin{aligned} f \in \Sigma _{n,2d}^y&\text { if and only if }&y \in \Delta ^* \ \text { or } \ \left( c > 0 \text { and } \ y \in (2\mathbb {N})^n\right) . \end{aligned}$$

Furthermore, if \(f \in \Sigma _{n,2d}^y\), then f is a sum of binomial squares.

Note again that for \(f \in P_\Delta ^y\) the condition “\(c > 0\) and \(y \in (2\mathbb {N})^n\)” holds if and only if f is a sum of monomial squares, in which case the above theorem holds trivially.

Proof

First, assume that \(f \in \Sigma _{n,2d}^y\). We can assume that \(c < 0\) by the following argument: If \(y \in (2\mathbb {N})^n\), then f is obviously a sum of (monomial) squares for \(c > 0\). If \(y \notin (2\mathbb {N})^n\) and \(c > 0\), then, by Lemma 5.1 and a suitable variable transformation as in the proof of Theorem 3.8, we can reduce to the case \(c < 0\). Let \(f = \sum _k h_k^2\) and define \(M = \{\beta : b_k(\beta ) \ne 0 \text { for some } k\}\) with \(\beta \) and \(b_k(\beta )\) as in the Gram matrix method. Following [31, Theorem 3.3], we claim that the set \(L = 2M \cup \hat{\Delta }\cup \{y\}\) is \(\hat{\Delta }\)-mediated and hence \(y \in \Delta ^*\). Here, \(\hat{\Delta }\) is the set of vertices of \(\Delta \). To show the claim, we write every \(\beta \in L{\setminus } \hat{\Delta }\) as a sum of two distinct points in M, which implies that \(\beta \) is an average of two distinct points in \(2M \subseteq L\). Note that if \(G(\alpha ,\alpha ') < 0\), then \(b_k(\alpha )b_k(\alpha ') < 0\) for some k and hence \(\alpha \ne \alpha '\) and \(\alpha , \alpha ' \in M\). Hence, it suffices to show that for every \(\beta \in L{\setminus }\hat{\Delta }\) there exists an \(\alpha \) with \(G(\alpha ,\beta -\alpha ) < 0\). We have \(a(y) = c < 0\), so \(G(\alpha _0, y-\alpha _0) < 0\) for some \(\alpha _0\). If \(\beta \ne y\), then \(\beta \in L{\setminus }(\hat{\Delta }\cup \{y\})\) and \(a(\beta ) = 0 = \sum _{\alpha } G(\alpha ,\beta -\alpha )\). But \(\beta \in 2M\), so \(G(\frac{1}{2}\beta ,\frac{1}{2}\beta ) > 0\) and hence there has to exist an \(\alpha \) with \(G(\alpha ,\beta -\alpha ) < 0\) to make the sum vanish.

Now let \(y \in \Delta ^*\). We investigate two cases. First, let \(y \notin (2\mathbb N)^n\). Then, it suffices to prove the statement for \(c = \pm \Theta _f\) by the following argument: Let \(f_1 =\lambda _0 + \sum _{j=1}^n b_j \mathbf {x}^{\alpha (j)} - c\cdot \mathbf {x}^y \in P_{n,2d}^y\) and \(f_2 = \lambda _0 + \sum _{j=1}^n b_j \mathbf {x}^{\alpha (j)} + c\cdot \mathbf {x}^y \in P_{n,2d}^y\). Let \(c^*\) be such that \(-c< c^* < c\) and \(f_3 = \lambda _0 + \sum _{j=1}^n b_j \mathbf {x}^{\alpha (j)} + c^*\cdot \mathbf {x}^y \in P_{n,2d}^y\). Then, we have \(f_3 = \lambda _1f_1 + \lambda _2f_2\) with \(\lambda _1 = \frac{c-c^*}{2c}\), \(\lambda _2 = \frac{c+c^*}{2c}\) and \(\lambda _1, \lambda _2 > 0\), \(\lambda _1 + \lambda _2 = 1\). By the same argument involving the variable transformation \(x_j \mapsto -x_j\) for some \(j \in \{1,\ldots ,n\}\) as before (proof of Theorem 3.8, Lemma 5.1), it suffices to investigate the case \(c = -\Theta _f\). Consider the following linear transformation of the variables \(x_1,\dots ,x_n\):

$$\begin{aligned} T : (x_1,\dots ,x_n) \mapsto \left( (e^{s^*})_1 x_1,\dots ,(e^{s^*})_nx_n\right) , \end{aligned}$$

where \((e^{s^*})_j\) denotes the j-th coordinate of the global minimizer \(e^{\mathbf {s}^*}\) of f, see Proposition 3.4 and proof of Theorem 3.8. By Lemma 5.1, \(f \in \Sigma _{n,2d}\) if and only if \(f(T(\mathbf {x})) \in \Sigma _{n,2d}\), where

$$\begin{aligned} f(T(\mathbf {x}))= & {} \lambda _0 + \sum _{j=1}^n \lambda _j\mathbf {x}^{\alpha (j)} - \mathbf {x}^y. \end{aligned}$$
(13)

But \(f(T(\mathbf {x}))\) is the dehomogenization of an agiform and, therefore, by Theorem 2.4, \(f \in \Sigma _{n,2d}^y\) if and only if \(y \in \Delta ^*\).

If \(y \in (2\mathbb N)^n\), then we use the same argument to prove that f is a sum of squares for \(c = -\Theta _f\). For \(c > -\Theta _f\), the polynomial f is obviously a sum of squares, since the inner monomial can be written as \(-\Theta _f \mathbf {x}^y\) plus the term \((c + \Theta _f)\mathbf {x}^y\), which is a monomial square.

In [31, Theorem 4.4], it is shown that the agiforms in (13) are sums of binomial squares. Thus, for \(y \in \Delta ^*\), the nonnegative polynomials \(f \in P_{n,2d}^y\) are also sums of binomial squares, since the binomial structure is preserved under the variable transformation T. \(\square \)

Agiforms can be recovered by setting \(b_j = \lambda _j\) and, hence, Theorems 3.8 and 5.2 generalize results for agiforms in [31]. Furthermore, by setting \(\alpha (j) = 2d \cdot e_j\) for \(1\le j\le n\), we recover the dehomogenized version of what is called an elementary diagonal minus tail form in [9], and, again, Theorems 3.8 and 5.2 generalize one of the main results in [9] to arbitrary simplices.

We remark that in [31] an algorithm is given for computing such a sum of squares representation in the case of agiforms as in Theorem 5.2; it can be generalized to arbitrary circuit polynomials. Furthermore, in [31], it is shown that every agiform in \(\Sigma _{n,2d}^y\) can be written as a sum of \(|L{\setminus }\hat{\Delta }|\) binomial squares. Using the variable transformation T from the proof of Theorem 5.2, we conclude that a general circuit polynomial \(f\in \Sigma _{n,2d}^y\) can also be written as a sum of \(|L{\setminus }\hat{\Delta }| = |L| - (n + 1)\) binomial squares.

Theorem 5.2 also comes with two immediate corollaries.

Corollary 5.3

Let \(\Delta \) be an H-simplex and \(f \in P_{\Delta }^y\). Then, \(f \in P_{n,2d}^y\) if and only if \(f \in \Sigma _{n,2d}^y\).

Proof

Since \(\Delta \) is an H-simplex, it holds that \(\Delta ^* = \Delta \cap \mathbb {Z}^n\) (see Sect. 2.3), so we always have \(y \in \Delta ^*\). Hence, the claim follows from Theorem 5.2. \(\square \)

The second corollary concerns sums of squares relaxations for minimizing polynomial functions. For this, note that the quantity \(f_{sos}^* = \max \{\lambda : f - \lambda \in \Sigma _{n,2d}\}\) is a lower bound for \(f^* = \min \{f(\mathbf {x}) : \mathbf {x} \in \mathbb {R}^n\}\), see for example [21].

Corollary 5.4

Let \(f \in P_{\Delta }^y\). Then, \(f_{sos}^* = f^*\) if and only if \(y \in \Delta ^*\).

Proof

We have \(f_{sos}^* = f^*\) if and only if \(f - f^* \in \Sigma _{n,2d}\). However, subtracting the minimum \(f^*\) from the polynomial f only changes the constant term and, hence, does not affect the question whether \(y\in \Delta ^*\) or not. Thus, if \(y\in \Delta ^*\) holds for f, it also holds for the nonnegative polynomial \(f - f^*\), and vice versa. \(\square \)

As an extension, we consider in the following the case of several interior support points, i.e., lattice points in the interior of the simplex \(\Delta = {{\mathrm{conv}}}\{0,\alpha (1),\ldots ,\alpha (n)\}\). Assume that all interior monomials come with a negative coefficient. Then, we can write the polynomial as a sum of nonnegative circuit polynomials if and only if it is nonnegative. Furthermore, we get an equivalence between nonnegativity and sums of squares if the whole support is contained in \(\Delta ^*\). In the following, let \(\{\lambda _0^{(i)},\dots ,\lambda _{n}^{(i)}\}\) be the (unique) convex combination of \(y(i) \in I \subseteq ({{\mathrm{int}}}(\Delta ) \cap \mathbb {N}^{n})\) in terms of the vertices of \(\Delta \), and scale f such that \(b_0 = \sum _{j=1}^{|I|} \lambda _0^{(j)}\).

Theorem 5.5

Let \(f = \sum _{j=1}^{|I|} \lambda _0^{(j)} + \sum _{j=1}^n b_j \mathbf {x}^{\alpha (j)} - \sum _{y(i)\in I}^{} a_i\mathbf {x}^{y(i)}\) such that \({{\mathrm{New}}}(f) = \Delta = {{\mathrm{conv}}}\{0,\alpha (1),\dots ,\alpha (n)\}\) is a simplex with \(\alpha (j) \in (2\mathbb {N})^n\), all \(a_i, b_j > 0\) and \(I \subseteq ({{\mathrm{int}}}(\Delta )\cap \mathbb {N}^n)\). Then,

$$\begin{aligned} f\in P_{n,2d} \ \text { if and only if } \ f \ = \ \sum _{i = 1}^{|I|} E_{y(i)}, \end{aligned}$$

where all \(E_{y(i)} \in P_{\Delta (i)}^{y(i)}\) are nonnegative with support sets \(\Delta (i) \subseteq \{0,\alpha (1),\ldots ,\alpha (n),y(i)\}\).

If furthermore \(I \subseteq \Delta ^*\), then we have

$$\begin{aligned} f\in P_{n,2d}&\text { if and only if } f\in \Sigma _{n,2d} \nonumber \\&\text { if and only if } f \text { is a sum of binomial squares}. \end{aligned}$$
(14)

Particularly, (14) always holds if \(\Delta \) is an H-simplex.

Again, we get an immediate corollary.

Corollary 5.6

Let f be as above with \(I \subseteq \Delta ^*\). Then, \(f_{sos}^* = f^*\).

To prove Theorem 5.5, we need the following lemma.

Lemma 5.7

Let \(f = b_0 + \sum _{j=1}^n b_j \mathbf {x}^{\alpha (j)} - \sum _{y(i)\in I}^{} a_i\mathbf {x}^{y(i)}\) be nonnegative with simplex Newton polytope \({{\mathrm{New}}}(f) = \Delta = {{\mathrm{conv}}}\{0,\alpha (1),\dots ,\alpha (n)\}\) for some \(\alpha (j) \in (2\mathbb {N})^n\). Furthermore, let \(I \subseteq ({{\mathrm{int}}}(\Delta )\cap \mathbb {N}^n)\) and \(a_i, b_j > 0\). Then, f has a global minimizer \(\mathbf {v}^* \in \mathbb {R}_{>0}^n\).

Proof

Since all \(b_j > 0\) and \(\alpha (j) \in (2\mathbb {N})^n\), clearly f has a global minimizer on \(\mathbb {R}^n\). Assume that no global minimizer is contained in \(\mathbb {R}_{\ge 0}^n\). We make a term-by-term comparison for a minimizer \(\mathbf {v}\) with \(|\mathbf {v}| = (|v_1|,\ldots ,|v_n|)\): For every vertex of \(\Delta \), we have \(b_j \mathbf {v}^{\alpha (j)} = b_j |\mathbf {v}^{\alpha (j)}|\); for every interior point, we have \(-a_i |\mathbf {v}|^{y(i)} \le -a_i \mathbf {v}^{y(i)}\), and hence \(f(\mathbf {v}) \ge f(|\mathbf {v}|)\). Thus, \(|\mathbf {v}|\) is a global minimizer as well, a contradiction. Therefore, at least one global minimizer \(\mathbf {v}^*\) is contained in \(\mathbb {R}^n_{\ge 0}\).

Assume that for at least one component \(v_j^*\) of \(\mathbf {v}^*\) it holds that \(v_j^* = 0\). We define \(g = b_0 + \sum _{j = 1}^n b_j \mathbf {x}^{\alpha (j)} - a_i \mathbf {x}^{y(i)}\) for one \(y(i) \in I\). By Proposition 3.3, \(g(e^{\mathbf {w}})\) has a unique global minimizer on \(\mathbb {R}^n\) and hence g has a unique global minimizer on \(\mathbb {R}_{> 0}^n\). But, by construction of f and g, we have \(f(\mathbf {x}) < g(\mathbf {x})\) for all \(\mathbf {x} \in \mathbb {R}_{> 0}^n\) and \(f(\mathbf {x}) = g(\mathbf {x})\) for \(\mathbf {x} \in \mathbb {R}_{\ge 0}^n {\setminus } \mathbb {R}_{> 0}^n\). Thus, \(v_j^* \ne 0\) for all \(1 \le j \le n\). \(\square \)

Proof

(Proof of Theorem 5.5) Let \(f = \sum _{j=1}^{|I|} \lambda _0^{(j)} + \sum _{j=1}^n b_j\mathbf {x}^{\alpha (j)} - \sum _{y(i)\in I}^{} a_i\mathbf {x}^{y(i)}\) be nonnegative and, by Lemma 5.7, let \(\mathbf {v} \in \mathbb {R}^n_{> 0}\) be a global minimizer of f.

First, we investigate the case \(\alpha (j) = \alpha _j e_j\) for some \(\alpha _j \in 2\mathbb {N}^*\) and \(e_j\) denoting the j-th standard vector. For any \(1 \le k \le n\), we have

$$\begin{aligned} \left( x_k \frac{\partial f}{\partial x_k}\right) (\mathbf {v})= & {} b_k \cdot \alpha (k)_k \cdot v_k^{\alpha _k} - \sum _{y(i) \in I}a_i\cdot y(i)_k \cdot \mathbf v ^{y(i)} = 0. \end{aligned}$$
(15)

Let, again, \(\lambda _0^{(i)},\dots ,\lambda _{n}^{(i)}\) be the coefficients of the unique convex combination of \(y(i) \in I\) and \(\lambda ^{(i)} = (\lambda _1^{(i)},\ldots ,\lambda _n^{(i)}) \in \mathbb {R}_{> 0}^n\). For \(y(i) \in I\), we define

$$\begin{aligned} b_{y(i),k}= & {} \frac{a_i\cdot \lambda _k^{(i)}\cdot \mathbf {v}^{y(i)}}{\mathbf {v}^{\alpha (k)}}. \end{aligned}$$
(16)

Since for all i and all k it holds that \(\sum _{j = 1}^n \lambda ^{(i)}_j \alpha (j)_k = y(i)_k\) and that \(\alpha (j)_k = 0\) unless \(j = k\), we obtain with (15) that

$$\begin{aligned} b_k= & {} \sum _{y(i)\in I} b_{y(i),k}. \end{aligned}$$

By Proposition 3.4 and Theorem 3.8, we conclude that

$$\begin{aligned} E_{y(i)}(\mathbf {x})= & {} \lambda _0^{(i)} + \sum _{k=1}^n b_{y(i),k}x_k^{\alpha _k} - a_i \mathbf {x}^{y(i)} \end{aligned}$$

is a nonnegative circuit polynomial attaining its minimum at \(\mathbf {v}\). We obtain

$$\begin{aligned} f(\mathbf {x})= & {} \sum _{j=1}^{|I|} \lambda _0^{(j)} + \sum _{k=1}^n b_k x_k^{\alpha _k} - \sum _{y(i)\in I}a_i \mathbf {x}^{y(i)} \nonumber \\= & {} \sum _{j=1}^{|I|} \lambda _0^{(j)} + \sum _{k=1}^n \left( \sum _{y(i)\in I} b_{y(i), k}\right) x_k^{\alpha _k} - \sum _{y(i) \in I}a_i \mathbf {x}^{y(i)} \nonumber \\= & {} \sum _{y(i) \in I} E_{y(i)}(\mathbf {x}). \end{aligned}$$
(17)

Now, we consider the case of arbitrary \(\alpha (j) \in (2\mathbb {N})^n\). Let \(\mathbf {v} \in \mathbb {R}_{> 0}^n\) be a global minimizer of f. By Corollary 3.2 (and Proposition 3.1), there exists a unique polynomial g satisfying

$$\begin{aligned} f(e^{\mathbf {w}})= & {} g(e^{T^t \mathbf {w}}) \text { for all } \mathbf {w} \in \mathbb {R}^n \end{aligned}$$
(18)

such that \(T \in GL_n(\mathbb {R})\) and g has a support matrix

$$\begin{aligned} M^{A'}= & {} \left( \begin{array}{cccccccc} 1 &{} 1 &{} \cdots &{} \cdots &{} 1 &{} 1 &{} \cdots &{} 1\\ 0 &{} \mu &{} 0 &{} \cdots &{} 0 &{} \mu \lambda _1^{(1)} &{} \cdots &{} \mu \lambda _1^{(|I|)} \\ \vdots &{} 0 &{} \ddots &{} &{} \vdots &{} \vdots &{} \cdots &{} \vdots \\ \vdots &{} \vdots &{} &{} \ddots &{} 0 &{} \vdots &{} \cdots &{} \vdots \\ 0 &{} 0 &{} \cdots &{} 0 &{} \mu &{} \mu \lambda _n^{(1)} &{} \cdots &{} \mu \lambda _n^{(|I|)} \\ \end{array}\right) \ \in \ {{\mathrm{Mat}}}(\mathbb {Z},(n+1) \times (n+ |I| + 1)), \end{aligned}$$

where \(\mu \) is the least common multiple of 2 and the denominators of all \(\lambda _j^{(i)}\) (the factor 2 since the vertices of \({{\mathrm{New}}}(g)\) shall be in \((2\mathbb {N})^n\)).

Since \(\mathbf {v} \in \mathbb {R}_{> 0}^n\), we can define \({{\mathrm{Log}}}|\mathbf {v}'| = T^{t} {{\mathrm{Log}}}|\mathbf {v}|\). By (17) and (18), it follows that \(\mathbf {v}'\) is a global minimizer for g and thus we have

$$\begin{aligned} f(\mathbf {v}) = f(e^{{{\mathrm{Log}}}|\mathbf {v}|}) = g(e^{T^{t}{{\mathrm{Log}}}|\mathbf {v}|}) = \sum _{i = 1}^{|I|} E_{\mu \lambda ^{(i)}}(e^{{{\mathrm{Log}}}|\mathbf {v}'|}), \end{aligned}$$

for some nonnegative circuit polynomials \(E_{\mu \lambda ^{(i)}}\) with global minimizer \(\mathbf {v}' \in \mathbb {R}_{> 0}^n\).

Since \({{\mathrm{supp}}}(E_{\mu \lambda ^{(i)}}) \subseteq {{\mathrm{supp}}}(g)\) and \({{\mathrm{New}}}(E_{\mu \lambda ^{(i)}}) = {{\mathrm{New}}}(g)\), we have, by Proposition 3.4,

$$\begin{aligned} E_{\mu \lambda ^{(i)}}(e^{{{\mathrm{Log}}}|\mathbf {v}'|}) = E_{y(i)}(e^{{{\mathrm{Log}}}|\mathbf {v}|}) \end{aligned}$$

such that each \(E_{y(i)}\) is a nonnegative circuit polynomial with global minimizer \(\mathbf {v}\) and support set \(\{0,\alpha (1),\ldots ,\alpha (n),y(i)\}\) satisfying \(f = \sum _{i = 1}^{|I|} E_{y(i)}\).

If, additionally, every \(y(i) \in \Delta ^*\) (for example, if \(\Delta \) is an H-simplex), then we know by Theorem 1.2 that all \(E_{y(i)}(\mathbf {x})\) are sums of (binomial) squares and, hence, f is a sum of (binomial) squares. \(\square \)

Note that Theorem 5.5 generalizes [9, Theorem 2.7], where an analogous statement is shown for the special case of diagonal minus tail forms f, which are given by \(\alpha (j) = 2d \cdot e_j\) for \(1 \le j\le n\).

We remark that the correct decomposition of the \(b_j\) in Theorem 5.5 for the case of a general simplex Newton polytope is also given by (16), since due to

$$\begin{aligned} e^{\langle {{\mathrm{Log}}}|\mathbf {v}|,y(i) - \alpha (j) \rangle } = e^{\langle (T^{t})^{-1} {{\mathrm{Log}}}|\mathbf {v}'|, T^t(\mu (\lambda ^{(i)} - e_j))\rangle } \ = \ e^{\langle {{\mathrm{Log}}}|\mathbf {v}'|, \mu (\lambda ^{(i)} - e_j)\rangle } \end{aligned}$$

these scalars remain invariant under the transformation T from and to the standard form.

Example 5.8

The polynomial \(f = 1 + \frac{1}{2}x^6 + \frac{1}{32}y^4 - \frac{1}{2}xy - \frac{1}{2}x^2y\) is nonnegative and has a zero at \(\mathbf {v} = (1,2)\). Using the construction in Theorem 5.5, we can decompose f as a sum of two polynomials in \(P_{n,2d}^y\) with \(y \in \{(1,1),(2,1)\}\), each vanishing at \(\mathbf {v}\). More precisely,

$$\begin{aligned} f = \left( \frac{7}{12} +\frac{1}{6} x^6 + \frac{1}{64}y^4 - \frac{1}{2}xy\right) + \left( \frac{5}{12} +\frac{1}{3} x^6 + \frac{1}{64}y^4 - \frac{1}{2}x^2y\right) . \end{aligned}$$

Since \(\Delta \) is an H-simplex, we have \(f \in \Sigma _{2,6}\). Using the algorithm in [31] and a suitable variable transformation (see proof of Theorem 5.2), we get the following representation for f as a sum of binomial squares:

$$\begin{aligned} f = \frac{1}{2}(x - x^3)^2 + \frac{1}{2}\left( \frac{1}{2}y - x\right) ^2 +\frac{1}{2}\left( \frac{1}{2}y - x^2\right) ^2 + \frac{1}{2}\left( 1 - x^2\right) ^2 + \frac{1}{2}\left( 1 - \frac{1}{4}y^2\right) ^2. \end{aligned}$$
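Both representations can be verified symbolically; the following sympy snippet (our own check, not part of the algorithm in [31]) confirms the circuit decomposition, the common zero at \(\mathbf {v} = (1,2)\), and the binomial-squares representation:

```python
import sympy as sp

x, y = sp.symbols("x y")
half = sp.Rational(1, 2)
f = 1 + half*x**6 + sp.Rational(1, 32)*y**4 - half*x*y - half*x**2*y

# The two nonnegative circuit summands from the decomposition above:
E1 = sp.Rational(7, 12) + sp.Rational(1, 6)*x**6 + sp.Rational(1, 64)*y**4 - half*x*y
E2 = sp.Rational(5, 12) + sp.Rational(1, 3)*x**6 + sp.Rational(1, 64)*y**4 - half*x**2*y
print(sp.expand(E1 + E2 - f) == 0)                   # True
print(E1.subs({x: 1, y: 2}), E2.subs({x: 1, y: 2}))  # 0 0: common zero at (1,2)

# The sum-of-binomial-squares representation:
sos = half*((x - x**3)**2 + (y/2 - x)**2 + (y/2 - x**2)**2
            + (1 - x**2)**2 + (1 - y**2/4)**2)
print(sp.expand(sos - f) == 0)                       # True
```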

5.1 A sufficient condition for H-simplices

By Theorem 5.2, all nonnegative polynomials in \(P_{\Delta }^y\) supported on an H-simplex are sums of squares. Here, we provide a sufficient condition for a lattice simplex \(\Delta \) to be an H-simplex, meaning that all lattice points in \(\Delta \) except the vertices are midpoints of two distinct even lattice points in \(\Delta \). In the following, we call a full dimensional lattice polytope \(P \subset \mathbb {R}^n\) k-normal if every lattice point in kP is a sum of exactly k lattice points in P, i.e.,

$$\begin{aligned} k \in \mathbb {N}, m \in kP \cap \mathbb {Z}^n \ \Rightarrow \ m = m_1 +\ldots + m_k, \quad m_1,\dots , m_k \in P \cap \mathbb {Z}^n. \end{aligned}$$
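For small simplices, 2-normality can be verified by brute force. The following Python sketch (our own; it enumerates lattice points of a 2-simplex via barycentric coordinates) checks the property for \(\frac{1}{2}\Delta \) of the Motzkin simplex appearing in Example 5.11 below:

```python
import numpy as np
from itertools import product

def simplex_lattice_points(verts):
    """Lattice points of a 2-simplex conv(verts), via barycentric coordinates."""
    V = np.array(verts, dtype=float)
    lo, hi = V.min(axis=0).astype(int), V.max(axis=0).astype(int)
    A = np.vstack([np.ones(3), V.T])  # affine system for barycentric coordinates
    pts = []
    for p in product(range(lo[0], hi[0] + 1), range(lo[1], hi[1] + 1)):
        lam = np.linalg.solve(A, np.array([1.0, p[0], p[1]]))
        if np.all(lam >= -1e-9):
            pts.append(p)
    return pts

def is_2_normal(verts):
    """Check: every lattice point of 2P is a sum of two lattice points of P."""
    P = simplex_lattice_points(verts)
    sums = {(a[0] + b[0], a[1] + b[1]) for a in P for b in P}
    double = simplex_lattice_points([(2 * vx, 2 * vy) for vx, vy in verts])
    return all(m in sums for m in double)

# 1/2 * Delta for the Motzkin simplex Delta = conv{(0,0),(4,2),(2,4)}:
print(is_2_normal([(0, 0), (2, 1), (1, 2)]))  # True (every 2-polytope is normal)
```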

For an introduction to toric ideals, see for example [36].

Theorem 5.9

Let \(\hat{\Delta }= \{\alpha (0), \alpha (1), \dots , \alpha (n)\}\subset (2\mathbb N)^n\) and \(\Delta = {{\mathrm{conv}}}(\hat{\Delta })\) be a lattice simplex. Furthermore, let \(B = \frac{1}{2}\Delta \cap \mathbb N^n\) and \(I_B\) be the corresponding toric ideal of B. If

  1. (1)

    \(I_B\) is generated in degree two, i.e., \(I_B = \langle I_{B,2} \rangle \) and

  2. (2)

    the simplex \(\frac{1}{2}\Delta \) is 2-normal,

then \(\Delta \) is an H-simplex.

Proof

Let \(L = (\Delta \cap \mathbb N^n){\setminus } \hat{\Delta }\). Note that for \(u \in L{\setminus } (2\mathbb N)^n\) the statement follows from the 2-normality of \(\frac{1}{2}\Delta \): we have \(u = s + t\) with \(s,t\in B\), where \(s \ne t\) since u is not even, and therefore \(u = \frac{2s + 2t}{2}\) is the midpoint of the two distinct even lattice points 2s and 2t. Now, let

$$\begin{aligned} \left\{ \frac{1}{2}\alpha (0),\dots ,\frac{1}{2}\alpha (n)\right\} = \{\alpha (0)',\dots ,\alpha (n)'\} \end{aligned}$$

be the vertices of \(\frac{1}{2}\Delta \) and consider \(u \in B{\setminus } \frac{1}{2}\hat{\Delta }\). By clearing denominators in the unique convex combination of u, we get a relation

$$\begin{aligned} N\cdot u = \lambda _0\alpha (0)'+\dots + \lambda _n\alpha (n)',\quad N = \sum _{i=0}^n \lambda _i,\quad \lambda _i \ge 0. \end{aligned}$$

For the corresponding toric ideal \(I_B\), this implies that \(x_u^N - \prod _{i=0}^nx_{\alpha (i)'}^{\lambda _i} \in I_B\). Since \(I_B\) is generated in degree two, we have the following representation:

$$\begin{aligned} x_u^N - \prod _{i=0}^nx_{\alpha (i)'}^{\lambda _i} = \sum _{\begin{array}{c} m,n \in \mathbb N^B\\ |m|=|n|=2 \end{array}}^{} f_{m,n}(x^m - x^n) \end{aligned}$$

for some polynomials \(f_{m,n}\). Matching monomials, it follows that there exists an m such that \(x^m = x_u^2\) (note that \(f_{m,n}\) contains \(x_u^{N-2}\)). Since \(|m| = 2\), we have \(x_u^2 - x_vx_{v'} \in I_B\) with \(v,v' \in B\), yielding the relation \(2u = \frac{2v + 2v'}{2}\), i.e., 2u is a convex combination of two even lattice points 2v and \(2v'\). \(\square \)

Corollary 5.10

Let \(\Delta \subset \mathbb R^2\) be a lattice simplex as in Theorem 5.9 such that \(\frac{1}{2}\Delta \) has at least four boundary lattice points. Then, \(\Delta \) is an H-simplex.

Proof

Since every lattice 2-polytope is normal, we only need to prove that the corresponding toric ideal is generated in degree two. But this is [19, Theorem 2.10]. \(\square \)

Hence, in \(\mathbb R^2\), almost every simplex \(\Delta \) corresponding to \(P_{\Delta }^y\) is an H-simplex, a fact that was announced in [31] without proof. This implies that the sections \(P_{2,2d}^y\) and \(\Sigma _{2,2d}^y\) almost always coincide.

Example 5.11

We demonstrate Theorem 5.9 with two interesting examples.

(1) The Newton polytope of the Motzkin polynomial

$$\begin{aligned} m = 1 + x^4y^2 + x^2y^4 - 3x^2y^2 \in P_{2,6}{\setminus }\Sigma _{2,6} \end{aligned}$$

is an M-simplex \(\Delta = {{\mathrm{conv}}}\{(0,0),(4,2),(2,4)\}\) such that \(\frac{1}{2}\Delta \) has exactly three boundary lattice points. One can check that the corresponding toric ideal \(I_B\) is generated by cubics.

(2) Note that the conditions in Theorem 5.9 are sufficient, but not necessary: the lattice simplex \(\Delta = {{\mathrm{conv}}}\{(0,0),(2,4),(10,6)\}\) is easily checked to be an H-simplex, but \(\partial \frac{1}{2}\Delta \) contains exactly three lattice points.

In higher dimensions, things get more involved, both in checking the conditions in Theorem 5.9 and in determining the maximal \(\hat{\Delta }\)-mediated set \(\Delta ^*\). Note that \(\Delta ^*\) can lie strictly between \(A(\hat{\Delta })\) and \(\Delta \cap \mathbb {Z}^n\), which correspond to M-simplices and H-simplices, respectively. In [31], an algorithm for the computation of \(\Delta ^*\) is given. One expects the existence of better algorithms but, to the best of our knowledge, no more efficient algorithm is known. On the other hand, checking normality of polytopes and quadratic generation of toric ideals is an active area of research. It is an open problem to decide whether every smooth lattice polytope is normal and whether the corresponding toric ideal is generated by quadrics, see [15, 36]. However, for an arbitrary lattice polytope P, the multiples kP are normal for \(k\ge \dim P - 1\), and their toric ideals are generated by quadrics for \(k \ge \dim P\) [5]. In light of these results, we can conclude another interesting corollary from Theorem 5.9.

Corollary 5.12

Let \(\Delta \subset \mathbb {R}^n\) be a lattice simplex as in Theorem 5.9 such that \(\frac{1}{2}\Delta = M\Delta '\) for a lattice simplex \(\Delta '\subset \mathbb {R}^n\) and \(M\ge n\). Then, \(\Delta \) is an H-simplex.

Proof

The result follows from the previously quoted results together with Theorem 5.9. \(\square \)

Note that Corollaries 5.10 and 5.12 yield large sections at which nonnegative polynomials and sums of squares coincide.

6 Convex polynomials and forms supported on circuits

In this section, we investigate convex polynomials and forms (i.e., homogeneous polynomials) supported on a circuit. Recently, there has been much interest in understanding the convex cone of convex polynomials/forms. Since deciding convexity of polynomials is NP-hard in general [1], but convexity is very important in different areas of mathematics, such as convex optimization, investigating the properties of the cones of convex polynomials and forms is a genuine problem.

Definition 6.1

Let \(f \in \mathbb R[\mathbf {x}]\). Then, f is convex if the Hessian \(H_f\) of f is positive semidefinite for all \(\mathbf {x} \in \mathbb R^n\), or, equivalently, \(\mathbf {v}^t H_f(\mathbf {x}) \mathbf {v} \ge 0\) for all \(\mathbf {x},\mathbf {v} \in \mathbb R^n\).

Unlike nonnegativity and the sums of squares property, convexity of polynomials is not preserved under homogenization. Therefore, we need to distinguish between convex polynomials and convex forms. The relationship between convexity on the one side and nonnegativity and sums of squares on the other side arises when considering homogeneous polynomials, since every convex form is nonnegative. However, the relation between convex forms and sums of squares is not well understood, except for the fact that neither of the corresponding cones is contained in the other. The problem of finding a convex form that is not a sum of squares is still open. For an overview and proofs of the previous facts, see [3, 32]. Here, we investigate convexity of polynomials and forms in the class \(P_{\Delta }^y\). We start with the univariate (nonhomogeneous) case.

Proposition 6.2

Let \(f = 1 + ax^y + bx^{2d} \in P_{\Delta }^y\) and \(b > 0\). Then, f is convex exactly in the following cases.

(1) \(y = 1\),

(2) \(a \ge 0\) and \(y=2l\) for some \(l \in \mathbb {N}\) with \(y > 1\).

Proof

Let \(f = 1 + ax^y + bx^{2d}\). Note that the degree is necessarily even and \(b>0\). f is convex if and only if \(D^2(f) \ge 0\), where \(D^2(f) = ay(y-1)x^{y-2} + 2db(2d-1)x^{2d-2}\). For \(y=1\), we have \(D^2(f) = 2db(2d-1)x^{2d-2} \ge 0\) and hence f is convex. Now, consider the case \(y > 1\). First, suppose that \(a < 0\). Then, \(D^2(f)\) is always indefinite, since the monomial \(x^{y-2}\) in \(D^2(f)\) corresponds to a vertex of the Newton polytope of \(D^2(f)\) and has a negative coefficient. Otherwise, if \(a \ge 0\) and \(y=2l\) for \(l \in \mathbb {N}\), then \(D^2(f)\ge 0\) and f is convex. If \(y=2l+1\), then \(x^{y-2}\) has an odd power and hence \(D^2(f)\) is indefinite, implying that f is not convex. \(\square \)
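The dichotomy is easy to observe symbolically; a small sympy check of our own (with d = 2 and hypothetical coefficients a = b = 1):

```python
import sympy as sp

x = sp.symbols("x", real=True)
# Case (2): y = 2 even, a > 0 -- the second derivative is nonnegative.
f = 1 + x**2 + x**4
print(sp.diff(f, x, 2))                              # 12*x**2 + 2 >= 0: convex
# Odd y = 3: the second derivative is indefinite.
g = 1 + x**3 + x**4
print(sp.diff(g, x, 2).subs(x, -sp.Rational(1, 4)))  # -3/4 < 0: not convex
```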

The homogeneous version is much more difficult than the affine version. We just prove the following claims instead of giving a full characterization.

Proposition 6.3

Let \(f = z^{2d} + ax^yz^{2d-y} + bx^{2d} \in P_{\Delta }^y\) be a form and \(b > 0\). Then, the following hold.

(1) For \(y = 2l - 1\), \(l\in \mathbb N\), or \(a \le 0\), the form f is not convex.

(2) For \(y = 2l\), \(l \in \mathbb {N}\), and \(0 \le a \le \frac{(y-1)(2d-y-1)}{y(2d-y)}\), the form f is convex.

Proof

We have

$$\begin{aligned} \frac{\partial ^2f}{\partial z^2} = 2d(2d - 1)z^{2d-2} + (2d - y)(2d - y -1)ax^yz^{2d-y-2}. \end{aligned}$$

Evaluating this partial derivative at \(z = 1\) and arguing as in the proof of Proposition 6.2, we see that y must be even and \(a \ge 0\) for it to be nonnegative, proving the first claim. For the second claim, we investigate the principal minors of \(H_f\). We have \(\frac{\partial ^2f}{\partial x^2}\ge 0\) if and only if \(D^2(f)\ge 0\), where \(D^2(f)\) is the dehomogenized polynomial \(\frac{\partial ^2f}{\partial x^2}(x,1)\). This yields \(y = 1\), or \(a \ge 0\) and \(y = 2l\). From \(\frac{\partial ^2f}{\partial z^2}\), we get again that y must be even and \(a \ge 0\). Finally, one can check that all exponents of the dehomogenized determinant \(\det H_f(x,1)\) are even and that all its coefficients are positive for \(0 \le a \le \frac{(y-1)(2d-y-1)}{y(2d-y)}\). Hence, for \(y = 2l\) and \(0 \le a \le \frac{(y-1)(2d-y-1)}{y(2d-y)}\), the form f is convex. \(\square \)

Note that for \(y = 1\) the form \(f = z^{2d} + ax^yz^{2d-y} + bx^{2d} \in P_{\Delta }^y\) is never convex, whereas, by Proposition 6.2, the dehomogenized polynomial is always convex. In sharp contrast, we prove the surprising result that for \(n\ge 2\) there are no convex polynomials in the class \(P_{\Delta }^y\), implying that there are no convex forms in \(P_{\Delta }^y\) for \(n \ge 3\).

Theorem 6.4

Let \(n\ge 2\) and \(f \in P_{\Delta }^y\). Then, f is not convex.

Proof

Let

$$\begin{aligned} f = 1 + \sum _{j=1}^n A_jx_1^{\alpha (j)_1}\cdot \ldots \cdot x_n^{\alpha (j)_n} + Bx_1^{y_1}\cdot \ldots \cdot x_n^{y_n} \end{aligned}$$

with \(A_j > 0\) for \(1\le j \le n\) and \(B \in \mathbb R^*\). We will prove that the \(2\times 2\) principal minor of the Hessian of f obtained by deleting all rows and columns except the first and the second one is indefinite, implying that the Hessian of f is not positive semidefinite and, hence, that the polynomial f is not convex. We have

$$\begin{aligned} \frac{\partial ^2 f}{\partial x_1^2}\frac{\partial ^2 f}{\partial x_2^2}-\left( \frac{\partial ^2 f}{\partial x_1 \partial x_2}\right) ^2= & {} \left( \sum _{j=1}^n \alpha (j)_1(\alpha (j)_1-1)A_jx_1^{\alpha (j)_1-2}x_2^{\alpha (j)_2}\cdot \cdots \cdot x_n^{\alpha (j)_n} + y_1(y_1-1)Bx_1^{y_1-2}x_2^{y_2}\cdot \cdots \cdot x_n^{y_n}\right) \\&\times \, \left( \sum _{i=1}^n \alpha (i)_2(\alpha (i)_2-1)A_ix_1^{\alpha (i)_1}x_2^{\alpha (i)_2-2}\cdot \cdots \cdot x_n^{\alpha (i)_n} + B y_2(y_2-1)x_1^{y_1}x_2^{y_2-2}x_3^{y_3}\cdot \cdots \cdot x_n^{y_n}\right) \\&-\left( \sum _{k=1}^n\alpha (k)_1\alpha (k)_2A_kx_1^{\alpha (k)_1-1}x_2^{\alpha (k)_2-1}x_3^{\alpha (k)_3}\cdot \cdots \cdot x_n^{\alpha (k)_n} + By_1y_2x_1^{y_1-1}x_2^{y_2-1}x_3^{y_3}\cdot \cdots \cdot x_n^{y_n}\right) ^2. \end{aligned}$$

We claim that there is a point \(\mathbf {x} \in \mathbb R^n\) at which this minor is negative. For this, note that all exponents in \(\left( \frac{\partial ^2 f}{\partial x_1 \partial x_2}\right) ^2\) are captured by those in \(\frac{\partial ^2 f}{\partial x_1^2}\frac{\partial ^2 f}{\partial x_2^2}\). Hence, we can restrict our attention to the latter ones. The \(\left( {\begin{array}{c}n+2\\ 2\end{array}}\right) \) different exponents are of the following types:

(1) \((2\alpha (j)_1 - 2, 2\alpha (j)_2 - 2,2\alpha (j)_3,\ldots ,2\alpha (j)_n)\) for \(1\le j\le n\),

(2) \((\alpha (i)_1 + \alpha (j)_1 - 2, \alpha (i)_2 + \alpha (j)_2 - 2,\alpha (i)_3 + \alpha (j)_3,\ldots ,\alpha (i)_n + \alpha (j)_n)\) for \(1\le i < j\le n\),

(3) \((\alpha (j)_1 + y_1 - 2, \alpha (j)_2 + y_2 - 2, \alpha (j)_3 + y_3,\ldots ,\alpha (j)_n + y_n)\) for \(1\le j\le n\),

(4) \((2y_1 - 2, 2y_2 - 2,2y_3,\ldots ,2y_n)\).

We claim that the point \((2y_1 - 2, 2y_2 - 2,2y_3,\ldots ,2y_n)\) is always a vertex of the convex hull of the points (1–4), i.e., of the Newton polytope of the investigated minor. The points in (2) are obviously convex combinations of appropriate points in (1), and the points in (3) are convex combinations of points in (1) and (4). Hence, it remains to show that (4) is not a convex combination of the points in (1). To this end, denote the points in (1) by \(P_j\) and the point in (4) by Q, and assume

$$\begin{aligned} Q= & {} \sum _{j=1}^n \mu _j P_j \text { with } \sum _{j=1}^n \mu _j = 1 \text { and } \mu _j\ge 0 \text { for all } \ 1\le j\le n. \end{aligned}$$

But since \(\sum _{j = 1}^n \mu _j (-2) = -2\), this equation is equivalent to

$$\begin{aligned} y= & {} \sum _{j=1}^n \mu _j\alpha (j) \text { with } \sum _{j=1}^n \mu _j = 1 \text { and } \mu _j \ge 0 \text { for all } 1\le j\le n. \end{aligned}$$

But this means that y lies on the boundary of \(\Delta \), the Newton polytope of f. This is a contradiction, since \(f \in P_{\Delta }^y\), i.e., \(y\in {{\mathrm{int}}}(\Delta )\). Hence, (4) is a vertex of the Newton polytope of the investigated minor. Extracting the coefficient of the corresponding monomial in the minor, we see that it equals \(-B^2y_1y_2(y_1 + y_2 - 1) < 0\). Therefore, the Newton polytope of the minor of the Hessian of f has a vertex whose monomial comes with a negative coefficient; hence, the minor is indefinite, proving the claim. \(\square \)

Note that this already implies that there is also no convex form in \(P_{\Delta }^y\) whenever \(n\ge 3\), since non-convexity is preserved under homogenization. Since it is mostly unclear which structures prevent polynomials from being convex, Theorem 6.4 is an indication that sparsity is among these structures.
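For n = 2, the Motzkin polynomial \(m = 1 + x^4y^2 + x^2y^4 - 3x^2y^2 \in P_{\Delta }^y\) gives a concrete instance of Theorem 6.4; the following sympy check (our own) exhibits a point at which the \(2\times 2\) minor from the proof is negative:

```python
import sympy as sp

x, y = sp.symbols("x y", real=True)
m = 1 + x**4*y**2 + x**2*y**4 - 3*x**2*y**2   # Motzkin polynomial

# The 2x2 principal minor of the Hessian from the proof of Theorem 6.4:
minor = sp.diff(m, x, 2)*sp.diff(m, y, 2) - sp.diff(m, x, y)**2
val = minor.subs({x: sp.Rational(1, 10), y: sp.Rational(1, 10)})
print(val < 0)   # True: the Hessian is not PSD, so m is not convex
```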

7 Sums of nonnegative circuits

Motivated by the results of the previous sections, we recall Definition 1.3 from the introduction, where we introduced sums of nonnegative circuit polynomials (SONCs), a new family of nonnegativity certificates.

Definition 7.1

We define the set of sums of nonnegative circuit polynomials (SONC) as

$$\begin{aligned} C_{n,2d} = \left\{ f \in \mathbb {R}[\mathbf {x}]_{2d} : f = \sum _{i=1}^k \lambda _i g_i, \lambda _i \ge 0, g_i \in P_{\Delta _i}^y\cap P_{n,2d}\right\} \end{aligned}$$

for some even lattice simplices \(\Delta _i \subset \mathbb {R}^n\).

Recall that membership in \(P_{n,2d}^y\) can easily be checked and is completely characterized by the circuit numbers \(\Theta _{g_i}\) (Theorem 3.8). Obviously, for \(\alpha ,\beta \in \mathbb {R}_{>0}\) and \(f,g \in C_{n,2d}\), it holds that \(\alpha f + \beta g \in C_{n,2d}\); hence, \(C_{n,2d}\) is a convex cone.
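As an illustration, the following Python sketch (our own code) computes circuit numbers numerically; it assumes the formula \(\Theta _f = \prod _{j} (b_j/\lambda _j)^{\lambda _j}\) from Section 3, where the \(\lambda _j\) are the barycentric coordinates of y with respect to the vertices \(\alpha (j)\):

```python
import numpy as np

def circuit_number(alphas, b, y):
    """Theta_f for f = sum_j b_j x^{alpha(j)} + c x^y on a simplex circuit.

    alphas: the n+1 simplex vertices; b: their positive coefficients;
    y: the interior lattice point.  Assumes Theta_f = prod (b_j/lam_j)^lam_j.
    """
    A = np.vstack([np.ones(len(alphas)), np.array(alphas, dtype=float).T])
    lam = np.linalg.solve(A, np.concatenate([[1.0], np.asarray(y, dtype=float)]))
    assert np.all(lam > 0), "y must be an interior point of the simplex"
    return float(np.prod((np.asarray(b, dtype=float) / lam) ** lam))

# Motzkin data: Delta = conv{(0,0),(4,2),(2,4)}, y = (2,2), b = (1,1,1):
print(circuit_number([(0, 0), (4, 2), (2, 4)], [1, 1, 1], (2, 2)))  # ~3.0,
# consistent with the Motzkin polynomial (c = -3) lying on the boundary
# of nonnegativity.
```

Then, we have the following relations.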

Proposition 7.2

The following relationships hold between the corresponding cones.

(1) \(C_{n,2d} \subset P_{n,2d}\) for all \(d,n \in \mathbb {N}\),

(2) \(C_{n,2d} \subset \Sigma _{n,2d}\) if and only if \((n,2d)\in \{(1,2d),(n,2),(2,4)\}\),

(3) \(\Sigma _{1,2} \subset C_{1,2}\) and \(\Sigma _{n,2d} \not \subset C_{n,2d}\) for all (n, 2d) with \(2d \ge 6\),

(4) \(P_{\Delta }^{y} \cap K_{n,2d} = \{0\}\) for \(n \ge 2\), where \(K_{n,2d}\) denotes the cone of convex polynomials.

Proof

Since all \(\lambda _i g_i \in P_{n,2d}\), the first inclusion is obvious. For the second part note that one direction follows from the first inclusion and Hilbert’s Theorem [16] stating that \((n,2d)\in \{(1,2d),(n,2),(2,4)\}\) if and only if \(P_{n,2d} = \Sigma _{n,2d}\). Conversely, if \((n,2d)\notin \{(1,2d),(n,2),(2,4)\}\) then one can use homogenizations of the Motzkin polynomial and the dehomogenized agiform \(N = 1 + x^2y^2 + y^2z^2 + x^2z^2 - 4xyz \in P_{3,4}{\setminus }\Sigma _{3,4}\) to obtain polynomials in \(C_{n,2d}{\setminus }\Sigma _{n,2d}\).

Considering (3), note that if \((n,2d) = (1,2)\), then \(\Sigma _{1,2} = P_{1,2} = C_{1,2}\). In the other cases, we make use of the following observations: By Corollary 3.9, a polynomial \(f \in C_{n,2d}\) has at most \(2^n\) zeros. Additionally, by [6, Proposition 4.1], there exist polynomials in \(\Sigma _{n,2d}\) with \(d^n\) zeros. The only case for which the claim does not follow from this argument is \((n,2d) = (n,4)\). (4) follows from Theorem 6.4. \(\square \)

Hence, the convex cone \(C_{n,2d}\) serves as a nonnegativity certificate, which, by Proposition 7.2, is independent from sums of squares certificates.

Example 7.3

Let \(f = 3 + 4y^4 + 6x^8 + x^4y^4 - 3xy + 5x^3y + 2x^4y^2\). The Newton polytope \({{\mathrm{New}}}(f) = {{\mathrm{conv}}}\{(0,0)^T, (0,4)^T, (4,4)^T, (8,0)^T\}\) is not a simplex, and \(f \in C_{2,8}\). An explicit representation is given by

$$\begin{aligned} f = (1 + 2x^8 + 2y^4 - 3xy) + (1 + 3x^8 + 2y^4 + 5x^3y) + (1 + x^8 + x^4y^4 + 2x^4y^2). \end{aligned}$$
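The first two summands can be checked with the circuit_number sketch from above: by Theorem 3.8, a circuit polynomial whose inner exponent is not even is nonnegative if and only if the absolute value of its inner coefficient is at most the circuit number (our paraphrase of the criterion from Section 3); the third summand is a sum of monomial squares.

```python
# 1 + 2x^8 + 2y^4 - 3xy on conv{(0,0),(8,0),(0,4)} with inner point (1,1):
print(circuit_number([(0, 0), (8, 0), (0, 4)], [1, 2, 2], (1, 1)))  # ~3.19 >= |-3|
# 1 + 3x^8 + 2y^4 + 5x^3y with inner point (3,1):
print(circuit_number([(0, 0), (8, 0), (0, 4)], [1, 3, 2], (3, 1)))  # ~5.30 >= |5|
```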

We give two further remarks on Proposition 7.2:

(1) As stated in the proof, (n, 4) is the case not covered in Part (3). We believe that \(\Sigma _{n,4} \not \subset C_{n,4}\) for all n, but we do not have an example.

(2) Let \(\widehat{C}_{n,2d}\) be the subset of \(C_{n,2d}\) containing all polynomials with a full dimensional Newton polytope. It is not obvious for which cases besides \((n,2d) \in \{(1,2d),(n,2),(2,4)\}\) it holds that \(\widehat{C}_{n,2d} \subseteq \Sigma _{n,2d}\). However, \(\widehat{C}_{n,2d} \not \subseteq \Sigma _{n,2d}\) whenever \(d \ge n+1\), as we show in the following example.

Example 7.4

Let \(f = 1 + \sum _{j = 1}^n \mathbf {x}^{\alpha (j)} - c \cdot \mathbf {x}^{(2,\ldots ,2)}\) with \(\alpha (j) = (2,\ldots ,2) + 2 \cdot e_j\), where \(e_j\) denotes the j-th unit vector, and \(0 < c \le n+1\). By Theorem 3.8, we conclude that f is a nonnegative circuit polynomial in n variables of degree \(2n+2\). Hence, \(f \in C_{n,2d}\) for all n and \(d \ge n+1\). Moreover, \({{\mathrm{New}}}(f)\) is an n-dimensional polytope by construction. But \(f \notin \Sigma _{n,2d}\): namely, it is easy to see that the simplex \(1/2 \cdot {{\mathrm{New}}}(f) = {{\mathrm{conv}}}\{0,1/2 \cdot \alpha (1),\ldots ,1/2 \cdot \alpha (n)\}\) only contains the lattice point \((1,\ldots ,1)\) in its interior. Therefore, \({{\mathrm{New}}}(f)\) has exactly one even lattice point in the interior, the point \((2,\ldots ,2)\). It follows from a statement by Reznick [31, Theorem 2.5] that \({{\mathrm{New}}}(f)\) is an M-simplex. Hence, \(f \notin \Sigma _{n,2d}\) by Theorem 1.2.

Of course, a priori it is completely unclear for which types of nonnegative polynomials a SONC decomposition exists and how big the gap between \(C_{n,2d}\) and \(P_{n,2d}\) is. Furthermore, it is not obvious how to compute such a decomposition if it exists. We discuss this question in the follow-up article [17], where we show in particular that for simplex Newton polytopes (with arbitrary support) such a decomposition exists if and only if a particular geometric optimization problem is feasible, which can be checked very efficiently. This generalizes similar results by Ghasemi and Marshall [12, 13]. Here, we deduce as a fruitful first step the following corollary from Theorem 5.5.

Corollary 7.5

Let \(f = b_0 + \sum _{j = 1}^n b_j \mathbf {x}^{\alpha (j)} + \sum _{i = 1}^k a_i \mathbf {x}^{y(i)}\) be nonnegative with \(b_j \in \mathbb {R}_{> 0}\) and \(a_i \in \mathbb {R}^*\) such that \({{\mathrm{New}}}(f) = \Delta = {{\mathrm{conv}}}\{0,\alpha (1),\ldots ,\alpha (n)\}\) is a simplex and all \(y(i) \in ({{\mathrm{int}}}(\Delta ) \cap \mathbb {N}^n)\). If there exists a vector \(\mathbf {v} \in (\mathbb {R}^*)^n\) such that \(a_i \mathbf {v}^{y(i)} < 0\) for all \(1 \le i \le k\), then f is SONC.

Proof

Every monomial square is a strictly positive term as well as a 0-simplex circuit polynomial. Thus, we can ignore these terms. If a particular vector \(\mathbf {v} \in (\mathbb {R}^*)^n\) with the desired properties exists, then Theorem 5.5 immediately yields a SONC decomposition after a variable transformation \(x_j \mapsto -x_j\) for all j with \(v_j < 0\). \(\square \)
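Since only the signs of the entries of \(\mathbf {v}\) matter, the hypothesis of Corollary 7.5 can be tested by enumerating all \(2^n\) sign patterns. A small brute-force sketch of our own, on hypothetical input data:

```python
from itertools import product
from math import prod

def sonc_sign_vector(inner_terms, n):
    """Search s in {-1,1}^n with a_i * s^{y(i)} < 0 for every inner term
    (a_i, y(i)); this is the hypothesis of Corollary 7.5."""
    for s in product((-1, 1), repeat=n):
        if all(a * prod(sj ** yj for sj, yj in zip(s, y)) < 0
               for a, y in inner_terms):
            return s
    return None

# Hypothetical inner terms -3*x*y and 5*x^2*y:
print(sonc_sign_vector([(-3, (1, 1)), (5, (2, 1))], 2))  # (-1, -1)
print(sonc_sign_vector([(-3, (1, 1)), (5, (3, 1))], 2))  # None: hypothesis fails
```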

8 Extension to arbitrary polytopes and counterexamples

In Sect. 5, we proved for \(f \in P_{\Delta }^y\) that \(f\in \Sigma _{n,2d}^y\) if and only if \(y \in \Delta ^*\) or f is a sum of monomial squares. One might wonder whether this equivalence also holds for arbitrary polytopes. More precisely, let \(Q \subset \mathbb {R}^n\) be an arbitrary lattice polytope and denote by \(AP_{Q}^y\) the set of all polynomials of the form \(\sum _{\alpha \in {{\mathrm{vert}}}(Q)} b_\alpha \mathbf {x}^{\alpha } + c \mathbf {x}^y\) that are supported on the vertices \({{\mathrm{vert}}}(Q)\) of Q and an additional interior lattice point \(y \in {{\mathrm{int}}}(Q)\). As a generalization of our previous notation, we call \(f \in AP_Q^y\) an agiform if \(\sum _{\alpha \in {{\mathrm{vert}}}(Q)} b_\alpha \alpha = y\) and \(\sum _{\alpha \in {{\mathrm{vert}}}(Q)} b_\alpha = 1\) as well as \(b_\alpha > 0\) and \(c = -1\).

In [31, Section 10], it is asked whether the lattice point criterion \(y \in Q^*\) is again an equivalent condition for a polynomial in \(AP_Q^y\) to be a sum of squares and, if not, how sums of squares can be characterized in this case. Here, we provide a solution to this question (Theorem 8.2). Let \(P_{Q}^y\), respectively \(\Sigma _{Q}^y\), denote the set of nonnegative polynomials, respectively sums of squares, in \(AP_Q^y\). As for a simplex \(\Delta \), we use the same definitions of an M-polytope and an H-polytope for an arbitrary lattice polytope Q.

The implication \(f\in \Sigma _{Q}^y \Rightarrow y \in Q^*\) always holds. For agiforms, this is proven already in [31]. The proof in the case of arbitrary coefficients follows exactly the same lines as the proof of Theorem 5.2. The converse implication, however, fails in general:

Proposition 8.1

There exist a lattice polytope Q, an interior lattice point \(y \in Q^*\), and a polynomial \(f\in P_{Q}^y{\setminus } \Sigma _{Q}^y\).

Proof

We provide an explicit example. Let

$$\begin{aligned} Q = {{\mathrm{conv}}}\{v_0,v_1,v_2,v_3\} = {{\mathrm{conv}}}\{(0,0), (4,0), (4,2), (2,4)\} \ \text { with } \ y \ = \ (2,2). \end{aligned}$$

It is easy to check that Q is an H-polytope (indeed, one can prove that Theorem 5.9 holds for arbitrary lattice polytopes, not just for simplices). Since Q is not a simplex, there are infinitely many convex combinations of y:

$$\begin{aligned} y = \lambda _0v_0 + \lambda _1v_1 + \lambda _2v_2 + \lambda _3v_3 \ \text { such that } \ \sum _{i=0}^3 \lambda _i = 1 \ \text { and } \ \lambda _i \ge 0. \end{aligned}$$

The set of convex combinations of y is given by

$$\begin{aligned} \left\{ (\lambda _0,\lambda _1,\lambda _2,\lambda _3) = \left( \frac{1}{2} - \frac{1}{2}\lambda _3, -\frac{1}{2} + \frac{3}{2}\lambda _3, 1 - 2\lambda _3, \lambda _3\right) : \frac{1}{3}\le \lambda _3 \le \frac{1}{2}\right\} . \end{aligned}$$
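This parameterization can be verified with a short symbolic computation. The following is a minimal sympy sketch (our aside, not part of the proof; the variable names are our own).

```python
import sympy as sp

l0, l1, l2, l3 = sp.symbols('lambda0:4')
verts = [(0, 0), (4, 0), (4, 2), (2, 4)]
y = (2, 2)

# convex-combination conditions, solved for (lambda0, lambda1, lambda2)
# with lambda3 left as the free parameter
eqs = [sp.Eq(sum(l * v[0] for l, v in zip((l0, l1, l2, l3), verts)), y[0]),
       sp.Eq(sum(l * v[1] for l, v in zip((l0, l1, l2, l3), verts)), y[1]),
       sp.Eq(l0 + l1 + l2 + l3, 1)]
sol = sp.solve(eqs, (l0, l1, l2), dict=True)[0]
print(sol)
# {lambda0: 1/2 - lambda3/2, lambda1: 3*lambda3/2 - 1/2, lambda2: 1 - 2*lambda3}
# imposing lambda_i >= 0 then yields exactly 1/3 <= lambda3 <= 1/2
```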

The corresponding agiform \(f(Q,\lambda ,y)\) is then given by

$$\begin{aligned} f(Q,\lambda ,y) = \left( \frac{1}{2} - \frac{1}{2}\lambda _3\right) + \left( -\frac{1}{2} + \frac{3}{2}\lambda _3\right) x^4 + ( 1 - 2\lambda _3)x^4y^2 + \lambda _3x^2y^4 - x^2y^2. \end{aligned}$$

For \(\lambda _3 = \frac{2}{5}\), the nonnegative polynomial

$$\begin{aligned} f = \frac{3}{10} + \frac{1}{10}x^4 + \frac{1}{5}x^4y^2 + \frac{2}{5}x^2y^4 - x^2y^2 \end{aligned}$$

can easily be checked, via the corresponding Gram matrix, not to be a sum of squares, although \(y \in Q^*\). \(\square \)
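This Gram-matrix computation can be reproduced with any SDP solver. Below is a minimal sketch using the Python libraries cvxpy and SCS (our substitute for the SOSTOOLS computations used elsewhere in this article); it relies on the standard fact that the squares in an SOS decomposition are supported on the lattice points of \(\frac{1}{2}{{\mathrm{New}}}(f)\), which here are \((0,0),(1,0),(2,0),(1,1),(2,1),(1,2)\).

```python
import cvxpy as cp
from collections import defaultdict

# monomial basis: lattice points of (1/2)*Q
basis = [(0, 0), (1, 0), (2, 0), (1, 1), (2, 1), (1, 2)]
# coefficients of f = 3/10 + 1/10 x^4 + 1/5 x^4 y^2 + 2/5 x^2 y^4 - x^2 y^2
coeffs = {(0, 0): 0.3, (4, 0): 0.1, (4, 2): 0.2, (2, 4): 0.4, (2, 2): -1.0}

n = len(basis)
G = cp.Variable((n, n), symmetric=True)

# group Gram entries by the exponent of the monomial product they contribute to
cells = defaultdict(list)
for i in range(n):
    for j in range(n):
        cells[(basis[i][0] + basis[j][0], basis[i][1] + basis[j][1])].append(G[i, j])

constraints = [G >> 0]
constraints += [sum(entries) == coeffs.get(expo, 0.0) for expo, entries in cells.items()]

problem = cp.Problem(cp.Minimize(0), constraints)
problem.solve(solver=cp.SCS)
print(problem.status)  # expected: 'infeasible' -- no PSD Gram matrix exists, so f is not SOS
```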

Actually, one can prove that the polynomial \(f(Q,\lambda ,y)\) in the above proof is a sum of squares if and only if \(\lambda _3 = \frac{1}{2}\). In [31], the author suspects, based on similar examples, that the condition \(y \in Q^*\) is not sufficient. However, in all of these examples, the constructed polynomials that are nonnegative but not sums of squares are not supported on the vertices of Q and an additional interior lattice point \(y \in {{\mathrm{int}}}(Q)\). We conclude that in the non-simplex case the problem of deciding the sums of squares property depends on the coefficients of the polynomial, in sharp contrast to the simplex case. Motivated by a question in [31] for agiforms, we are interested in the following sets: let C(y) denote the set of convex combinations of the interior lattice point \(y \in {{\mathrm{int}}}(Q)\), i.e.,

$$\begin{aligned} C(y) = \left\{ \lambda = (\lambda _0,\dots ,\lambda _s) \ : \ y = \sum _{i=0}^s \lambda _iv_i,\,\,\sum _{i=0}^s \lambda _i = 1,\,\, \lambda _i \ge 0\right\} \end{aligned}$$

where \(v_0,\dots ,v_s\) are the vertices of Q. Note that C(y) is a polytope. Fixing f and y, we define

$$\begin{aligned} {{\mathrm{SOS}}}(f,y) = \{\lambda \in C(y) \ : \ f(Q, \lambda , y) \ \text { is a sum of squares}\} \end{aligned}$$

where \(Q = {{\mathrm{New}}}(f)\). We have already seen in the proof of Proposition 8.1 that the structure of \({{\mathrm{SOS}}}(f,y)\) is unclear and depends strongly on the convex combinations of y. It is formulated as an open question in [31] whether one can say something about \({{\mathrm{SOS}}}(f,y)\) for fixed f and y. For this, let

$$\begin{aligned} Q = Q_1^{(i)}\cup \dots \cup Q_{r(i)}^{(i)} \end{aligned}$$

be a triangulation of Q for \(1 \le i \le t\), where t is the number of triangulations of Q that use no new vertices. We are interested in those simplices \(Q^{(i)}_j\) that contain the point \(y \in {{\mathrm{int}}}(Q)\) and in their maximal mediated sets \((Q^{(i)}_j)^*\). Recall that for every lattice simplex \(\Delta \) with vertex set \(\hat{\Delta }\), we denote by \(\Delta ^*\) the maximal \(\hat{\Delta }\)-mediated set (see Sect. 2.3).
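For readers who wish to experiment, here is a minimal Python sketch (our own, under the characterization of \(\hat{\Delta }\)-mediated sets recalled in Sect. 2.3) of the fixed-point computation of \(\Delta ^*\): start from all lattice points of \(\Delta \) and repeatedly discard every non-vertex point that is not the midpoint of two distinct even points of the current set.

```python
def is_even(p):
    return all(c % 2 == 0 for c in p)

def maximal_mediated_set(vertices, lattice_pts):
    """Greatest fixed point: prune lattice_pts down to Delta^*.

    A non-vertex point survives only if it is the midpoint of two distinct
    even points that themselves survive; iterate until nothing changes.
    """
    M = set(map(tuple, lattice_pts))
    V = set(map(tuple, vertices))
    changed = True
    while changed:
        changed = False
        evens = [p for p in M if is_even(p)]
        for p in list(M - V):
            if not any(all(2 * pc == ac + bc for pc, ac, bc in zip(p, a, b))
                       for k, a in enumerate(evens) for b in evens[k + 1:]):
                M.discard(p)
                changed = True
    return M

# example: the simplex conv{(0,0),(4,2),(2,4)} of the Motzkin polynomial
verts = [(0, 0), (4, 2), (2, 4)]
pts = [(0, 0), (1, 1), (1, 2), (2, 1), (2, 2), (2, 3), (3, 2), (3, 3), (4, 2), (2, 4)]
print(sorted(maximal_mediated_set(verts, pts)))
# the interior point (2,2) is pruned, i.e., (2,2) does not lie in Delta^*,
# consistent with the Motzkin polynomial not being a sum of squares
```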

Theorem 8.2

Let \(Q \subset \mathbb {R}^n\) be a lattice n-polytope, \(y \in {{\mathrm{int}}}(Q) \cap \mathbb {N}^n\) and \(f \in AP^y_Q\) be an agiform. Then, \({{\mathrm{SOS}}}(f,y) = C(y)\), i.e., every agiform is a sum of squares, if and only if \(y \in Q^{(i)}_j\) implies \(y \in (Q_{j}^{(i)})^*\) for every \(1\le i\le t\) and \(1 \le j \le r(i)\).

Proof

Assume \(y \in Q^{(i)}_j \Rightarrow y \in (Q_{j}^{(i)})^*\) for every \(1\le i\le t\) and \(1 \le j \le r(i)\), and let \(\lambda \in C(y)\) with corresponding agiform \(f(Q,\lambda ,y)\). By [31, Theorem 7.1], every agiform can be written as a convex combination of simplicial agiforms. In fact, following the proof of [31, Theorem 7.1], one can verify that the vertices of these simplicial agiforms form a subset of the vertices of Q, since the set C(y) of convex combinations of y is a polytope whose vertices correspond to convex combinations supported on subsets of \({{\mathrm{vert}}}(Q)\). Hence, these simplicial agiforms come from triangulations of the polytope Q into simplices without new vertices. Since \(y \in Q^{(i)}_j \Rightarrow y \in ({Q_j^{(i)}})^*\) for every i, j, Theorem 2.4 implies that the corresponding simplicial agiforms are sums of squares, and since \(f(Q,\lambda ,y)\) is a convex combination of them, the claim follows.

For the reverse direction, assume \(y \in Q_{j}^{(i)}\) and \(y \notin (Q_{j}^{(i)})^*\) for some ij. We prove that this implies \({{\mathrm{SOS}}}(f,y) \ne C(y)\). Suppose \({{\mathrm{vert}}}(Q) = \{v_1,\dots ,v_m\}\). Then, C(y) is a polytope of dimension \(d = m - (n+1)\). Let

$$\begin{aligned} f(Q,\lambda ,y) = \sum _{i=1}^m \lambda _i(\mu _1,\dots ,\mu _d)x^{v_i} - x^y \end{aligned}$$

be the corresponding family of agiforms; the coefficients \(\lambda _i\) depend on the d parameters \(\mu _1,\dots ,\mu _d\) since \(\dim C(y) = d\). By assumption, there exist \(a_1,\dots ,a_d \in \mathbb {R}_{> 0}\) such that \(g = f(Q,\lambda ,y)_{|(\mu _1,\dots ,\mu _d) = (a_1,\dots ,a_d)}\) is a simplicial agiform with respect to the simplex \(Q_j^{(i)}\). Since \(y \in Q_j^{(i)}\) but \(y \notin (Q_{j}^{(i)})^*\), the agiform g is not a sum of squares. By continuity, there exist arbitrarily small \(\varepsilon > 0\) such that \(f(Q,\lambda ,y)_{|(\mu _1,\dots ,\mu _d) = (a_1+\varepsilon ,\dots ,a_d+\varepsilon )}\) is an agiform with support \(\{v_1,\dots ,v_m,y\}\) that is not a sum of squares: otherwise, every member of a sequence with \(\varepsilon \rightarrow 0\) would be a sum of squares, and then so would the limit agiform g, since the cone of sums of squares is closed. Hence, \({{\mathrm{SOS}}}(f,y) \ne C(y)\). \(\square \)
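To see the criterion of Theorem 8.2 in action on the quadrilateral Q from Proposition 8.1, the following self-contained Python sketch (our illustration; the helper functions are our own and the midpoint test is written for the plane only) hardcodes the two triangulations of Q along its diagonals and tests \(y = (2,2)\).

```python
from itertools import combinations

def is_even(p):
    return p[0] % 2 == 0 and p[1] % 2 == 0

def maximal_mediated_set(vertices, pts):
    # greatest fixed point, as in the previous sketch (2D version)
    M, V = set(pts), set(vertices)
    changed = True
    while changed:
        changed = False
        evens = [p for p in M if is_even(p)]
        for p in list(M - V):
            if not any(2 * p[0] == a[0] + b[0] and 2 * p[1] == a[1] + b[1]
                       for a, b in combinations(evens, 2)):
                M.discard(p)
                changed = True
    return M

def contains(tri, p):
    # p lies in the (closed) triangle iff it is on the same side of all three edges
    def cross(o, u, v):
        return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])
    a, b, c = tri
    d = [cross(a, b, p), cross(b, c, p), cross(c, a, p)]
    return all(s >= 0 for s in d) or all(s <= 0 for s in d)

def lattice_points(tri):
    xs, ys = [v[0] for v in tri], [v[1] for v in tri]
    return [(u, w) for u in range(min(xs), max(xs) + 1)
            for w in range(min(ys), max(ys) + 1) if contains(tri, (u, w))]

# the two triangulations of Q = conv{(0,0),(4,0),(4,2),(2,4)} along its diagonals
triangulations = {"T1": [[(0, 0), (4, 0), (4, 2)], [(0, 0), (4, 2), (2, 4)]],
                  "T2": [[(0, 0), (4, 0), (2, 4)], [(4, 0), (4, 2), (2, 4)]]}
y = (2, 2)
for name, tris in triangulations.items():
    for tri in tris:
        if contains(tri, y):
            print(name, tri, y in maximal_mediated_set(tri, lattice_points(tri)))
# the simplex conv{(0,0),(4,2),(2,4)} in T1 reports False, so the criterion of
# Theorem 8.2 fails for y = (2,2) -- matching Example 8.3 below
```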

Example 8.3

Let again

$$\begin{aligned} Q = {{\mathrm{conv}}}\{v_0,v_1,v_2,v_3\} = {{\mathrm{conv}}}\{(0,0), (4,0), (4,2), (2,4)\} \end{aligned}$$

as in the proof of Proposition 8.1. There are six interior lattice points in Q given by

$$\begin{aligned} {{\mathrm{int}}}(Q) \cap \mathbb {N}^n = \{(1,1), (2,1), (3,1), (2,2), (2,3), (3,2)\}. \end{aligned}$$

Since Q has four vertices, for each \(y\in ({{\mathrm{int}}}(Q) \cap \mathbb {N}^n)\), the set C(y) has one free parameter \(\lambda _3\) (see the proof of Proposition 8.1). In the following table, for every such y, we provide the range of the free parameter \(\lambda _3\) yielding valid convex combinations for y as well as the set \({{\mathrm{SOS}}}(f,y)\).

The sets \({{\mathrm{SOS}}}(f,y)\) are computed with SOSTOOLS, see [29]. Note that Q has exactly two triangulations in this case (see Fig. 6). The lattice points (2, 1) and (3, 1) are the only ones satisfying \(y \in Q_j^{(i)} \Rightarrow y \in (Q_j^{(i)})^*\) for all \(i \in \{1,2\}\) and \(j \in \{1,\ldots ,r(i)\}\). Hence, exactly for \(y \in \{(2,1), (3,1)\}\), every agiform is a sum of squares.

$$\begin{array}{c|c|c}
y & \lambda _3 & {{\mathrm{SOS}}}(f,y) \\ \hline
(1, 1) & \frac{1}{6}\le \lambda _3\le \frac{1}{4} & \lambda _3\in [0.191;\frac{1}{4}] \\
(2, 1) & 0\le \lambda _3\le \frac{1}{4} & \lambda _3\in [0;\frac{1}{4}] \\
(3, 1) & 0\le \lambda _3\le \frac{1}{4} & \lambda _3\in [0;\frac{1}{4}] \\
(2, 2) & \frac{1}{3}\le \lambda _3\le \frac{1}{2} & \lambda _3\in \{\frac{1}{2}\} \\
(2, 3) & \frac{2}{3}\le \lambda _3\le \frac{3}{4} & \lambda _3\in [0.683;\frac{3}{4}] \\
(3, 2) & \frac{1}{6}\le \lambda _3\le \frac{1}{2} & \lambda _3\in [\frac{1}{4};\frac{1}{2}]
\end{array}$$

Fig. 6: The two triangulations of Q
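The table can be re-derived numerically by scanning the free parameter. The following sketch (again our own, using cvxpy/SCS in place of SOSTOOLS; solver tolerances blur the exact boundary points) reproduces the row for \(y = (2,2)\).

```python
import cvxpy as cp
from collections import defaultdict
from fractions import Fraction

basis = [(0, 0), (1, 0), (2, 0), (1, 1), (2, 1), (1, 2)]  # lattice points of (1/2)*Q

def is_sos(coeffs):
    """Gram-matrix feasibility test, as in the sketch after Proposition 8.1."""
    n = len(basis)
    G = cp.Variable((n, n), symmetric=True)
    cells = defaultdict(list)
    for i in range(n):
        for j in range(n):
            cells[(basis[i][0] + basis[j][0], basis[i][1] + basis[j][1])].append(G[i, j])
    cons = [G >> 0] + [sum(v) == float(coeffs.get(e, 0)) for e, v in cells.items()]
    prob = cp.Problem(cp.Minimize(0), cons)
    prob.solve(solver=cp.SCS)
    return prob.status == cp.OPTIMAL

# the agiforms f(Q, lambda, (2,2)), parameterized by lambda3 in [1/3, 1/2]
for l3 in (Fraction(1, 3), Fraction(2, 5), Fraction(9, 20), Fraction(1, 2)):
    coeffs = {(0, 0): Fraction(1, 2) - l3 / 2, (4, 0): 3 * l3 / 2 - Fraction(1, 2),
              (4, 2): 1 - 2 * l3, (2, 4): l3, (2, 2): -1}
    print(l3, is_sos(coeffs))
# up to solver tolerance, only lambda3 = 1/2 should report True,
# matching the row for y = (2,2) in the table above
```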

9 Outlook

We want to give an outlook on possible future research. Starting with the section on \(\Sigma _{n,2d}^y\), we renew some open questions already stated in [31]. Is there an algorithm to compute \(\Delta ^*\) that is more efficient than the one in [31]? What can be said about the asymptotic behavior of \(\Delta ^*\); in particular, what is the, say, “probability” that a simplex is an H-simplex? This is settled for \(\mathbb R^2\) in Corollary 5.10 but seems to be completely open for \(n > 2\). Considering this problem from the viewpoint of toric geometry (see Theorem 5.9), it would be a breakthrough to characterize simplices that are normal and whose corresponding toric ideals are generated by quadrics. In Sect. 7, we introduced the convex cone \(C_{n,2d}\) of sums of nonnegative circuit polynomials, which serve as nonnegativity certificates different from sums of squares. From a practical viewpoint, the major problem is to determine the complexity of checking membership in \(C_{n,2d}\). In particular, when is every nonnegative polynomial a sum of nonnegative circuit polynomials? As already mentioned in Sect. 7, the case of polynomials with simplex Newton polytopes is solved in [17] via geometric programming, generalizing earlier work by Ghasemi and Marshall [12, 13].

From the viewpoint of amoeba theory, one evident conjecture is that Theorem 4.2 can be generalized to arbitrary complex polynomials supported on a circuit. Taking into account the corresponding literature, in particular [27, 37], an answer to this conjecture can be regarded as the final missing piece for a complete characterization of amoebas of polynomials supported on a circuit.

In our opinion, the most interesting question is whether similar approaches can be generalized to broader classes of (sparse) polynomials and, accordingly, how deep the observed connection between the a priori very distinct mathematical topics “amoebas” and “nonnegativity of real polynomials” runs. We believe that exploiting methods from amoeba theory might eventually yield fundamental progress in understanding the nonnegativity of real polynomials.