1 Introduction and Preliminaries

Consider the problem of minimizing a homogeneous polynomial \(f\in {\mathbb{R}}[x]\) of degree d on the (standard) simplex

$$\Delta _n:=\{x\in {\mathbb{R}}_+^n:\sum _{i=1}^nx_i=1\}.$$

That is, we consider the global optimization problem

$$\underline{f}:=\min _{x\in \Delta _n}f(x), \ \ {\text {or}}\ \ \overline{f}:=\max _{x\in \Delta _n}f(x).$$
(1.1)

Here we focus on the problem of computing the minimum \(\underline{f}\) of f over \(\Delta _n\). This problem is well known to be NP-hard, as it contains the maximum stable set problem as a special case (when f is quadratic). Indeed, given a graph \(G=(V,E)\) with adjacency matrix A, Motzkin and Straus [8] show that the maximum stability number \(\alpha (G)\) can be obtained by

$${1\over \alpha (G)}=\min _{x\in \Delta _{|V|}}x^{\text{T}}(I+A)x,$$

where I denotes the identity matrix. Moreover, one can w.l.o.g. assume \(f\) is homogeneous. Indeed, if \(f=\sum _{s=0}^{d}f_s\), where \(f_s\) is homogeneous of degree s, then \(\min _{x\in \Delta _n}f(x)=\min _{x\in \Delta _n}f'(x)\), setting \(f'=\sum _{s=0}^d f_s\left( \sum _{i=1}^n x_i\right) ^{d-s}.\)
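As a quick numerical illustration of this homogenization step (the toy non-homogeneous polynomial below, with \(n=2\) and \(d=2\), is our own choice):

```python
# Toy example: f(x) = x1^2 + x1 + 1 has parts of degrees 2, 1, 0.
# Multiplying each degree-s part by (x1 + x2)^(d - s) with d = 2 gives the
# homogeneous f'(x) = x1^2 + x1 (x1 + x2) + (x1 + x2)^2, which agrees with
# f on the simplex, where x1 + x2 = 1.
def f(x):
    return x[0] ** 2 + x[0] + 1

def f_hom(x):
    s = x[0] + x[1]
    return x[0] ** 2 + x[0] * s + s ** 2

for t in [0.0, 0.25, 0.5, 1.0]:
    x = (t, 1.0 - t)                      # points of the simplex Delta_2
    assert abs(f(x) - f_hom(x)) < 1e-12   # equal values on the simplex
```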

For problem (1.1), many approximation algorithms have been studied in the literature. In fact, when f has fixed degree d, there is a polynomial time approximation scheme (PTAS) for this problem, see [1] for the case \(d=2\) and [5, 7] for \(d\geqslant 2\). For more results on its computational complexity, we refer to [3, 6].

We consider the following two bounds for \(\underline{f}\): an upper bound \(f_{\Delta (n,r)}\), obtained by taking the minimum value of f on a regular grid, and a lower bound \(f_{\min }^{(r-d)}\), based on Pólya’s representation theorem. Both bounds have been studied in the literature; see, e.g., [1, 5, 7] for \(f_{\Delta (n,r)}\) and [5, 14, 15] for \(f_{\min }^{(r-d)}\). The two ranges \(f_{\Delta (n,r)}-\underline{f}\) and \(\underline{f}- f_{\min }^{(r-d)}\) have been studied separately, and upper bounds for each of them have been shown in the above-mentioned works.

In this paper, we study these two ranges at the same time. More precisely, we analyze the larger range \(f_{\Delta (n,r)}-f_{\min }^{(r-d)}\) and provide upper bounds for it in terms of the range of function values \(\overline{f}-\underline{f}\). Of course, upper bounds for the range \(f_{\Delta (n,r)}-f_{\min }^{(r-d)}\) can be obtained by combining the known upper bounds for each of the two ranges \(f_{\Delta (n,r)}-\underline{f}\) and \(\underline{f}- f_{\min }^{(r-d)}\). Our new upper bound for \(f_{\Delta (n,r)}-f_{\min }^{(r-d)}\) refines these known bounds in the quadratic and cubic cases and provides an asymptotic refinement for general degree d.

1.1 Notation

Throughout, \({\mathcal{H}}_{n,d}\) denotes the set of all homogeneous polynomials in n variables with degree d, and \({\mathbb{R}}[x]\) denotes the set of all multivariate polynomials in the n variables \(x_1,x_2,\cdots ,x_n\). We let \([n]:=\{1,2,\cdots ,n\}\). We denote by \({\mathbb{R}}^n_+\) the set of all nonnegative real vectors and by \({\mathbb{N}}^n\) the set of all nonnegative integer vectors. For \(\alpha \in {\mathbb{N}}^n\), we define \(|\alpha |:=\sum _{i=1}^n\alpha _i\) and \(\alpha !:=\alpha _1!\alpha _2!\cdots \alpha _n!\), and we set \(I(n,d):=\{\alpha \in {\mathbb{N}}^n: |\alpha |=d\}\). We let e denote the all-ones vector and \(e_i\) the ith standard unit vector. For \(\alpha \in {\mathbb{N}}^n\), we denote \(x^{\alpha }:=\prod _{i=1}^nx_i^{\alpha _i}\), while for \(I\subseteq [n]\), we let \(x^{I}:=\prod _{i\in I}x_i\). Moreover, we denote \(x^{\underline{d}}:=x(x-1)(x-2)\cdots (x-d+1)\) for integer \(d\geqslant 0\) and \(x^{\underline{\alpha }}:=\prod _{i=1}^nx_{i}^{\underline{{\alpha }_i}}\) for \(\alpha \in {\mathbb{N}}^n\). Thus, \(x^{\underline{d}}=0\) if x is an integer with \(0\leqslant x\leqslant d-1\).

1.2 Upper Bounds Using Regular Grids

One can construct an upper bound for \(\underline{f}\) by taking the minimum of f on the regular grid

$$\Delta (n,r):=\{x\in \Delta _n:rx\in {\mathbb{N}}^n\},$$

for an integer \(r\geqslant 0\). We define

$$f_{\Delta (n,r)}:=\min _{x\in \Delta (n,r)}f(x).$$

Obviously, \(\underline{f}\leqslant f_{\Delta (n,r)}\leqslant \overline{f}\), and \(f_{\Delta (n,r)}\) can be computed by \(|\Delta (n,r)| = \binom{n+r-1}{r}\) evaluations of f. In fact, when considering polynomials f of fixed degree d, the parameters \(f_{\Delta (n,r)}\) (with increasing values of r) provide a PTAS for (1.1), as was proved by Bomze and de Klerk [1] (for \(d=2\)) and by de Klerk et al. [5] (for \(d\geqslant 2\)). Recently, de Klerk et al. [7] provide an alternative proof of this PTAS and refine the error bound for \(f_{\Delta (n,r)}-\underline{f}\) from [5] for cubic f.
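A brute-force sketch of this computation (the helper names are our own): enumerate \(I(n,r)\), scale by \(1/r\), and minimize f over the resulting grid.

```python
from math import comb

def compositions(n, r):
    """Yield all alpha in N^n with |alpha| = r, i.e. the set I(n, r)."""
    if n == 1:
        yield (r,)
        return
    for first in range(r + 1):
        for rest in compositions(n - 1, r - first):
            yield (first,) + rest

def grid_min(f, n, r):
    """f_{Delta(n,r)}: the minimum of f over the regular grid Delta(n, r)."""
    return min(f(tuple(a / r for a in alpha)) for alpha in compositions(n, r))

# Example: f(x) = sum_i x_i^2 on Delta_3 with r = 3.  The grid has
# C(n+r-1, r) = C(5, 3) = 10 points, and the minimum 1/3 is attained at the
# barycenter (1/3, 1/3, 1/3), which lies on the grid since n divides r.
f = lambda x: sum(t * t for t in x)
assert sum(1 for _ in compositions(3, 3)) == comb(5, 3)
assert abs(grid_min(f, 3, 3) - 1 / 3) < 1e-12
```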

In addition, some researchers have studied properties of the regular grid \(\Delta (n,r)\). For instance, given a point \(x \in \Delta _n\), Bomze et al. [2] give a scheme for finding the point of \(\Delta (n,r)\) closest to x with respect to a class of norms that includes the \(\ell _p\)-norms for \(p\geqslant 1\).

1.3 Lower Bounds Based on Pólya’s Representation Theorem

Given a polynomial \(f\in {\mathcal{H}}_{n,d}\), Pólya [12] shows that if f is positive over the simplex \(\Delta _n\), then the polynomial \((\sum _{i=1}^n x_i)^r f\) has nonnegative coefficients for all r large enough (see [13] for an explicit bound on r). Based on this result of Pólya, an asymptotically converging hierarchy of lower bounds for \(\underline{f}\) can be constructed as follows: for any integer \(r\geqslant d\), we define the parameter \(f_{\min }^{(r-d)}\) as

$$f_{\min }^{(r-d)}:=\max \lambda \ \ {\text {s.t.}}\ \ \left( \sum _{i=1}^nx_i\right) ^{r-d}\left( f-\lambda \left( \sum _{i=1}^nx_i\right) ^d\right) \ \ \text {has nonnegative coefficients.}$$
(1.2)

Notice that \(\underline{f}\) can be equivalently formulated as

$$\underline{f}=\max \ \ \lambda \ \ {\text {s.t.}}\ \ f(x)-\lambda \left( \sum _{i=1}^nx_i\right) ^d\geqslant 0\ \ \forall x\in {\mathbb{R}}^n_+. $$

Then, one can easily check the following inequalities:

$$f_{\min }^{(0)}\leqslant f_{\min }^{(1)}\leqslant \cdots \leqslant \underline{f}\leqslant f_{\Delta (n,r)}\leqslant \overline{f}.$$

Parrilo [9, 10] first introduces the idea of applying Pólya’s representation theorem to construct hierarchical approximations in copositive optimization. De Klerk et al. [5] consider \(f_{\min }^{(r-d)}\) and show upper bounds for \(\underline{f}-f_{\min }^{(r-d)}\) in terms of \(\overline{f}-\underline{f}\). Furthermore, Yildirim [15] and Sagol and Yildirim [14] analyze error bounds for \(f_{\min }^{(r-2)}\) for quadratic f.

Now we give an explicit formula for the parameter \(f_{\min }^{(r-d)}\), which follows from [13, relation (3)]; note that the quadratic case of this formula has also been observed in [11, 14, 15].

Lemma 1.1

For \(f=\sum _{\beta \in I(n,d)}f_{\beta }x^{\beta }\in {\mathcal{H}}_{n,d}\), one has

$$f_{\min }^{(r-d)}=\min _{\alpha \in I(n,r)}\sum _{\beta \in I(n,d)}f_{\beta }{ \alpha ^{\underline{\beta }} \over r^{\underline{d}}}.$$
(1.3)

Proof

By using the multinomial theorem \((\sum _{i=1}^n x_i)^d=\sum _{\alpha \in I(n,d)}{d!\over \alpha !}x^{\alpha }\), we obtain

$$\begin{aligned} \left( \sum _{i=1}^nx_i\right) ^{r-d}f-\lambda \left( \sum _{i=1}^nx_i\right) ^{r}&= \left( \sum _{\gamma \in I(n,r-d)}{(r-d)!\over \gamma !}x^{\gamma }\right) \left( \sum _{\beta \in I(n,d)}f_{\beta }x^{\beta }\right) -\lambda \sum _{\alpha \in I(n,r)}{r!\over \alpha !}x^{\alpha }\\ &= \sum _{\alpha \in I(n,r)}\left( \sum _{\beta \in I(n,d)}f_{\beta }{\alpha ^{\underline{\beta }}\over r^{\underline{d}}}\right) {r!\over \alpha !}x^{\alpha }-\lambda \sum _{\alpha \in I(n,r)}{r!\over \alpha !}x^{\alpha }\\ &= \sum _{\alpha \in I(n,r)}\left( \sum _{\beta \in I(n,d)}f_{\beta }{\alpha ^{\underline{\beta }}\over r^{\underline{d}}}-\lambda \right) {r!\over \alpha !}x^{\alpha }. \end{aligned}$$

Hence, by Definition (1.2), we obtain

$$\begin{aligned} f_{\min }^{(r-d)}&= \max \ \lambda \ \ {\text {s.t.}} \ \ \sum _{\beta \in I(n,d)}f_{\beta }{\alpha ^{\underline{\beta }}\over r^{\underline{d}}}-\lambda \geqslant 0\ \ \forall \alpha \in I(n,r)\\ &= \min _{\alpha \in I(n,r)}\sum _{\beta \in I(n,d)}f_{\beta }{\alpha ^{\underline{\beta }}\over r^{\underline{d}}}. \end{aligned}$$

\(\square \)

As for \(f_{\Delta (n,r)}\), by (1.3) the computation of \(f_{\min }^{(r-d)}\) requires \(|I(n,r)| = \binom{n+r-1}{r}\) evaluations of the expression \(\sum _{\beta \in I(n,d)}f_{\beta }\alpha ^{\underline{\beta }}{1\over r^{\underline{d}}}\).
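The formula (1.3) can be evaluated directly; the following sketch (helper names ours) does so for the toy quadratic \(f=x_1^2+x_2^2\):

```python
def falling(x, k):
    """Falling factorial x^{underline{k}} = x (x - 1) ... (x - k + 1)."""
    out = 1
    for j in range(k):
        out *= x - j
    return out

def compositions(n, r):
    """Yield all alpha in N^n with |alpha| = r (the set I(n, r))."""
    if n == 1:
        yield (r,)
        return
    for first in range(r + 1):
        for rest in compositions(n - 1, r - first):
            yield (first,) + rest

def f_min_bound(coeffs, n, d, r):
    """Lower bound f_min^{(r-d)} via formula (1.3).

    coeffs maps each exponent beta in I(n, d) to the coefficient f_beta."""
    rd = falling(r, d)
    def term(alpha, beta):
        p = 1
        for a, b in zip(alpha, beta):
            p *= falling(a, b)
        return p
    return min(sum(c * term(alpha, beta) for beta, c in coeffs.items()) / rd
               for alpha in compositions(n, r))

# f = x1^2 + x2^2 with n = 2, d = 2, r = 4: the minimizing alpha is (2, 2),
# giving f_min^{(2)} = (2*1 + 2*1) / (4*3) = 1/3, below the true minimum 1/2.
assert abs(f_min_bound({(2, 0): 1, (0, 2): 1}, 2, 2, 4) - 1 / 3) < 1e-12
```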

1.4 Bernstein Coefficients

For any polynomial \(f=\sum _{\beta \in I(n,d)}f_{\beta }x^{\beta }\in {\mathcal{H}}_{n,d}\), we can write it as

$$f=\sum _{\beta \in I(n,d)}f_{\beta }x^{\beta }=\sum _{\beta \in I(n,d)}\left( f_{\beta }{\beta !\over d!}\right) {d!\over \beta !}x^{\beta }. $$
(1.4)

For any \(\beta \in I(n,d)\), the scalars \(f_{\beta }{\beta !\over d!}\) are called the Bernstein coefficients of f (this terminology has also been used in [4, 7]), since they are the coefficients of f when it is expressed in the Bernstein basis \(\{{d!\over \beta !}x^\beta : \beta \in I(n,d)\}\) of \({\mathcal{H}}_{n,d}\). Applying the multinomial theorem together with (1.4), one obtains that, when evaluating f at a point \(x\in \Delta _n\), the value \(f(x)\) is a convex combination of the Bernstein coefficients \(f_{\beta }{\beta !\over d!}\). Therefore, we have

$$\min _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\leqslant \underline{f}\leqslant f_{\Delta (n,r)}\leqslant \overline{f}\leqslant \max _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}.$$
(1.5)
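A small numerical illustration of the sandwich (1.5), with a toy quadratic of our own choosing:

```python
from math import factorial

def multinomial(d, beta):
    """The Bernstein basis coefficient d!/beta!."""
    out = factorial(d)
    for b in beta:
        out //= factorial(b)
    return out

# Toy quadratic f = x1^2 + 4 x1 x2 + x2^2 in H_{2,2}.
coeffs = {(2, 0): 1.0, (1, 1): 4.0, (0, 2): 1.0}
d = 2
# Bernstein coefficients f_beta * beta!/d!: here 1, 2 and 1.
bern = {b: c / multinomial(d, b) for b, c in coeffs.items()}

def f(x):
    return sum(c * x[0] ** b[0] * x[1] ** b[1] for b, c in coeffs.items())

# On Delta_2 the Bernstein basis sums to (x1 + x2)^d = 1, so f(x) is a convex
# combination of the Bernstein coefficients, giving the sandwich (1.5).
for t in [0.0, 0.3, 0.5, 1.0]:
    x = (t, 1.0 - t)
    assert min(bern.values()) - 1e-12 <= f(x) <= max(bern.values()) + 1e-12
```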

For the analysis in Sect. 5, we need the following result of [5], which bounds the range of the Bernstein coefficients of f in terms of its range of values \(\overline{f}-\underline{f}\).

Theorem 1.2

[5, Theorem 2.2] For any polynomial \(f=\sum _{\beta \in I(n,d)}f_{\beta }x^{\beta }\in {\mathcal{H}}_{n,d}\), one has

$$\max _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}-\min _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\leqslant \binom{2d-1}{d}d^d (\overline{f}-\underline{f}).$$

1.5 Contribution of the Paper

In this paper, we consider upper bounds for \(f_{\Delta (n,r)}-f_{\min }^{(r-d)}\) in terms of \(\overline{f}-\underline{f}\). More precisely, we provide tighter upper bounds in the quadratic, cubic and square-free (also known as multilinear) cases; in the general case \(d\geqslant 2\), our upper bound is asymptotically tighter when \(r\) is large enough. We apply the formula (1.3) directly in the quadratic, cubic and square-free cases, while for the general case we use Theorem 1.2.

There are some relevant results in the literature. De Klerk et al. [5] give upper bounds for \(f_{\Delta (n,r)}-\underline{f}\) (the bound for cubic \(f\) has been refined by de Klerk et al. [7]) and for \(\underline{f}-f_{\min }^{(r-d)}\), both in terms of \(\overline{f}-\underline{f}\); by adding them up, one easily derives upper bounds for \(f_{\Delta (n,r)}-f_{\min }^{(r-d)}\). Furthermore, for a quadratic polynomial f, Yildirim [15] considers the upper bound \(\min _{k\leqslant r} f_{\Delta (n,k)}\) for \(\underline{f}\) (for \(r \geqslant 2\)) and bounds the range \(\min _{k\leqslant r} f_{\Delta (n,k)}-f_{\min }^{(r-2)}\) in terms of \(\overline{f}-\underline{f}\). Our results refine those of [5, 7, 15] in the quadratic and cubic cases (see Sects. 2 and 3, respectively), while in the general case our result refines that of [5] when r is sufficiently large (see Sect. 5).

1.6 Structure

The paper is organized as follows. In Sects. 2 and 3, we consider the quadratic and cubic cases, respectively, and refine the relevant results from [5, 7, 15]. Then, we look at the square-free (also known as multilinear) case in Sect. 4. Finally, in Sect. 5, we consider general (fixed-degree) polynomials and compare our new result with that of [5].

2 The Quadratic Case

For any quadratic polynomial f, we consider the range \(f_{\Delta (n,r)}-f_{\min }^{(r-2)}\) and derive the following upper bound in terms of \(\overline{f}-\underline{f}\).

Theorem 2.1

For any quadratic \(f=x^{\text{T}}Qx\) and \(r\geqslant 2\), one has

$$ f_{\Delta (n,r)}-f_{\min }^{(r-2)}\leqslant {1\over r-1}(Q_{\max }-f_{\Delta (n,r)}) \leqslant {1\over r-1}(\overline{f}-\underline{f}),$$
(2.1)

where \(Q_{\max }:=\max _{i\in [n]} Q_{ii}\).

Proof

By (1.3), we have

$$ f_{\min }^{(r-2)}=\min _{\alpha \in I(n,r)}{1\over r(r-1)}\left[ f(\alpha )-\sum _{i=1}^nQ_{ii}\alpha _i\right] .$$

Hence, \({r-1\over r}f_{\min }^{(r-2)}=\min _{\alpha \in I(n,r)}\left[ f({\alpha \over r})-\sum _{i=1}^nQ_{ii}{\alpha _i\over r}{1\over r}\right] .\) We obtain

$$ {r-1\over r}f_{\min }^{(r-2)}\geqslant \min _{\alpha \in I(n,r)}f({\alpha \over r})-\max _{\alpha \in I(n,r)}{1\over r}\sum _{i=1}^nQ_{ii}{\alpha _i\over r} = f_{\Delta (n,r)}-{1\over r} Q_{\max }.$$
(2.2)

One can easily obtain the first inequality in (2.1) by (2.2). For the second inequality in (2.1), we use the fact that \( Q_{\max } \leqslant \overline{f}\) (since \(Q_{ii}=f(e_i)\leqslant \overline{f}\) for \(i\in [n]\)) as well as the fact that \(f_{\Delta (n,r)}\geqslant \underline{f}\). \(\square \)

Now we point out that our result (2.1) refines the relevant result of [5]. De Klerk et al. [5] show the following theorem.

Theorem 2.2

[5, Theorem 3.2] Suppose \(f\in {\mathcal{H}}_{n,2}\) and \(r\geqslant 2\). Then

$$\underline{f}-f_{\min }^{(r-2)}\leqslant {1\over r-1}(\overline{f}-\underline{f}),$$
(2.3)
$$f_{\Delta (n,r)}-\underline{f}\leqslant {1\over r}(\overline{f}-\underline{f}). $$
(2.4)

By adding up (2.3) and (2.4), one gets

$$f_{\Delta (n,r)}-f_{\min }^{(r-2)}\leqslant \left( {1\over r-1}+{1\over r}\right) (\overline{f}-\underline{f}),$$

which is implied by our result (2.1).

Moreover, in [15], Yildirim considers one hierarchical upper bound of \(\underline{f}\) (when f is quadratic), which is defined by \(\min _{k\leqslant r}f_{\Delta (n,k)}.\) One can easily verify that

$$f_{\min }^{(r-2)}\leqslant \underline{f} \leqslant \min _{k\leqslant r}f_{\Delta (n,k)} \leqslant f_{\Delta (n,r)}.$$

In [15, Theorem 4.1], Yildirim shows \(\min _{k\leqslant r} f_{\Delta (n,k)}-f_{\min }^{(r-2)}\leqslant {1\over r-1}(Q_{\max }-\underline{f})\), which also follows easily from our result (2.1).

The following example shows that the upper bound (2.1) can be tight.

Example 2.3

[7, Example 2] Consider the quadratic polynomial \(f=\sum _{i=1}^n x_i^2\). As f is convex, one can check that \(\underline{f}={1\over n}\) (attained at \(x={1\over n}e\)) and \(\overline{f}=1\) (attained at any standard unit vector). To compute \(f_{\Delta (n,r)}\), we write r as \(r=kn+s\), where \(k\geqslant 0\) and \(0\leqslant s<n\). Then one can check that

$$f_{\Delta (n,r)}={1\over n} + {1\over r^2}{s(n-s)\over n}.$$

By (1.3), we have

$$f_{\Delta (n,r)}-f_{\min }^{(r-2)}={1\over r-1}\left( \overline{f}-\underline{f}\right) -{1\over r^2(r-1)}{s(n-s)\over n}.$$

Hence, for this example, the upper bound (2.1) is tight when \(s=0\).
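The computations of this example are easy to reproduce; the following sketch checks the case \(n=3\), \(r=6\) (so \(s=0\)), where the two sides of (2.1) coincide:

```python
def compositions(n, r):
    """Yield all alpha in N^n with |alpha| = r (the set I(n, r))."""
    if n == 1:
        yield (r,)
        return
    for first in range(r + 1):
        for rest in compositions(n - 1, r - first):
            yield (first,) + rest

n, r = 3, 6                                   # r = kn + s with k = 2, s = 0
f = lambda x: sum(t * t for t in x)           # f = sum_i x_i^2, so Q = I
f_grid = min(f(tuple(a / r for a in alpha)) for alpha in compositions(n, r))
# Formula (1.3) for quadratic f with Q = I: minimize (f(alpha) - sum_i alpha_i) / (r(r-1)).
f_low = min((sum(a * a for a in alpha) - r) / (r * (r - 1))
            for alpha in compositions(n, r))

assert abs(f_grid - 1 / n) < 1e-12            # f_Delta = 1/n since s = 0
gap = f_grid - f_low
bound = (1 / (r - 1)) * (1 - 1 / n)           # right side of (2.1): f_bar = 1, f_under = 1/n
assert abs(gap - bound) < 1e-12               # (2.1) holds with equality here
```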

3 The Cubic Case

For any cubic polynomial f, we consider the difference \(f_{\Delta (n,r)}-f_{\min }^{(r-3)}\) and show the following result.

Theorem 3.1

For any cubic polynomial f and \(r\geqslant 3\), one has

$$f_{\Delta (n,r)}-f_{\min }^{(r-3)}\leqslant {4r\over (r-1)(r-2)}(\overline{f}-\underline{f}). $$
(3.1)

Proof

We can write any cubic polynomial f as

$$f=\sum _{i=1}^nf_ix_i^3+\sum _{i<j}(f_{ij}x_ix_j^2+g_{ij}x_i^2x_j)+\sum _{i<j<k}f_{ijk}x_ix_jx_k.$$

Then by (1.3), one can check that

$$\begin{aligned} {(r-1)(r-2)\over r^2}f_{\min }^{(r-3)}&= \min _{\alpha \in I(n,r)}\left\{ f\left( {\alpha \over r}\right) -{1\over r^3}\left( 3\sum _{i=1}^n f_i\alpha _i^2-2\sum _{i=1}^n f_i\alpha _i+\sum _{i<j}(f_{ij}+g_{ij})\alpha _i\alpha _j \right) \right\} \\ &\geqslant f_{\Delta (n,r)}-{1\over r}\max _{\alpha \in I(n,r)}\left\{ 3\sum _{i=1}^n f_i\left( {\alpha _i\over r}\right) ^2+\sum _{i<j}(f_{ij}+g_{ij})\left( {\alpha _i\over r}\right) \left( {\alpha _j\over r}\right) \right\} +{1\over r^2}\min _{\alpha \in I(n,r)}2\sum _{i=1}^n f_i{\alpha _i\over r}\\ &\geqslant f_{\Delta (n,r)}-{1\over r}\max _{x\in \Delta _n}\left\{ 3\sum _{i=1}^n f_ix_i^2+\sum _{i<j}(f_{ij}+g_{ij})x_ix_j\right\} +{1\over r^2}\min _{x\in \Delta _n}2\sum _{i=1}^n f_ix_i. \end{aligned}$$
(3.2)

Evaluating f at \(e_i\) and \((e_i+e_j)/2\) yields, respectively, the relations:

$$\underline{f}\leqslant f_i\leqslant \overline{f},$$
(3.3)
$$f_i+f_j+f_{ij}+g_{ij}\leqslant 8\overline{f}.$$
(3.4)

Using (3.4) and the fact that \(\sum _{i=1}^nx_i=1\), one can obtain

$$\sum _{i<j}(f_{ij}+g_{ij})x_ix_j\leqslant \sum _{i<j}(8\overline{f}-f_i-f_j)x_ix_j=8\overline{f}\sum _{i<j}x_ix_j-\sum _{i=1}^n f_ix_i(1-x_i). $$
(3.5)

By (3.2), (3.3), (3.5) and the fact that \(\sum _{i=1}^nx_i=1\), one can get

$$(r-1)(r-2)f_{\min }^{(r-3)}\geqslant r^2f_{\Delta (n,r)}-4r\overline{f}+(r+2)\min _{x\in \Delta _n}\sum _{i=1}^nf_ix_i\geqslant r^2f_{\Delta (n,r)}-4r\overline{f}+(r+2)\underline{f}.$$

Hence, one has

$$(r-1)(r-2)\left( f_{\Delta (n,r)}-f_{\min }^{(r-3)}\right) \leqslant 4r\overline{f}-(3r-2)f_{\Delta (n,r)}-(r+2)\underline{f}\leqslant 4r(\overline{f}-\underline{f}). $$

\(\square \)

Now we observe that our result (3.1) refines the relevant upper bound obtained from [5, 7]. De Klerk et al. [5] show the following result.

Theorem 3.2

[5, Theorem 3.3] Suppose \(f\in {\mathcal{H}}_{n,3}\) and \(r\geqslant 3\). Then

$$\underline{f}-f_{\min }^{(r-3)}\leqslant {4r\over (r-1)(r-2)}(\overline{f}-\underline{f}),$$
(3.6)
$$f_{\Delta (n,r)}-\underline{f}\leqslant {4\over r}(\overline{f}-\underline{f}). $$
(3.7)

Recently, de Klerk et al. [7, Corollary 2] refine (3.7) to

$$f_{\Delta (n,r)}-\underline{f}\leqslant \left( {4\over r}-{4\over r^2}\right) (\overline{f}-\underline{f}).$$
(3.8)

Similar to the quadratic case (in Sect. 2), our new upper bound (3.1) implies the upper bound obtained by adding up (3.6) and (3.8). However, we have not found an example showing that the upper bound (3.1) is tight; establishing (or disproving) its tightness remains an open question.

4 The Square-free Case

Consider a square-free (also known as multilinear) polynomial \(f=\sum _{I:I\subseteq [n],|I|=d}f_Ix^I\in {\mathcal{H}}_{n,d}\). We have the following result for the difference \(f_{\Delta (n,r)}-f_{\min }^{(r-d)}\).

Theorem 4.1

For any square-free polynomial \(f=\sum _{I:I\subseteq [n],|I|=d}f_Ix^I\) and \(r\geqslant d\), one has

$$ f_{\Delta (n,r)}-f_{\min }^{(r-d)}\leqslant \left( {r^d\over r^{\underline{d}}}-1\right) \left( \overline{f}-\underline{f}\right) . $$
(4.1)

Proof

From (1.3), one can easily check that

$$ f_{\min }^{(r-d)}=\min _{\alpha \in I(n,r)}\sum _{I:I\subseteq [n],|I|=d}f_I{\alpha ^I\over r^{\underline{d}}}={1\over r^{\underline{d}}}\min _{\alpha \in I(n,r)}f(\alpha ). $$

As f is homogeneous of degree d, one has \(\min _{\alpha \in I(n,r)}f(\alpha )=r^d\min _{\alpha \in I(n,r)}f({\alpha \over r})=r^df_{\Delta (n,r)}\), and hence

$$f_{\min }^{(r-d)}={r^d\over r^{\underline{d}}}f_{\Delta (n,r)}. $$

For \(d=1\), the result (4.1) is clear.

Now we assume \(d\geqslant 2\). Since \(\overline{f}\geqslant 0\) (as \(f(e_i)=0\) for any \(i\in [n]\)), we obtain

$$f_{\Delta (n,r)}-f_{\min }^{(r-d)}=\left( 1-{r^d\over r^{\underline{d}}}\right) f_{\Delta (n,r)}\leqslant \left( 1-{r^d\over r^{\underline{d}}}\right) \underline{f} \leqslant \left( {r^d\over r^{\underline{d}}}-1\right) \left( \overline{f}-\underline{f}\right) . $$
(4.2)

\(\square \)

The following example shows that our upper bound (4.1) can be tight.

Example 4.2

[7, Example 4] Consider the square-free polynomial \(f=-x_1x_2\). One can check \(\overline{f}=0,\) \(\underline{f}=-{1\over 4}\), and

$$f_{\Delta (2,r)}=\left\{ \begin{array}{ll} -{1\over 4}, &{} \text {if }r\text { is even,}\\ -{1\over 4}+ {1\over 4r^2}, &{} \text {if }r\text { is odd.} \end{array} \right. $$

By (1.3), we have

$$f_{\Delta (2,r)}-f_{\min }^{(r-2)}=\left\{ \begin{array}{ll} {1\over r-1}\left( \overline{f}-\underline{f}\right) , &{} \text {if }r\text { is even,}\\ \left( {1\over r}+{1\over r^2}\right) \left( \overline{f}-\underline{f}\right) , &{} \text {if }r\text { is odd.} \end{array} \right.$$

For this example, the upper bound (4.1) is tight when \(r\) is even. In fact, one sees from (4.2) that the upper bound (4.1) is tight whenever \(f_{\Delta (n,r)}=\underline{f}-\overline{f}\) holds.
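A short sketch reproducing this example for \(r=4\) (even), checking both the square-free relation \(f_{\min }^{(r-d)}={r^d\over r^{\underline{d}}}f_{\Delta (n,r)}\) and the tightness of (4.1):

```python
# Example 4.2: f = -x1 x2 on Delta_2, with d = 2 and r = 4 even.
r = 4
grid = [(a, r - a) for a in range(r + 1)]                 # the set I(2, r)
f_grid = min(-(a / r) * (b / r) for a, b in grid)         # f_{Delta(2,r)}
f_low = min(-a * b for a, b in grid) / (r * (r - 1))      # f_min^{(r-2)} via (1.3)

assert abs(f_grid - (-1 / 4)) < 1e-12                     # = underline(f) for even r
# Square-free relation: f_min^{(r-d)} = (r^d / r^underline{d}) * f_{Delta(n,r)}.
assert abs(f_low - f_grid * r ** 2 / (r * (r - 1))) < 1e-12
# The bound (4.1) holds with equality: both sides equal 1/12 here.
gap = f_grid - f_low
bound = (r ** 2 / (r * (r - 1)) - 1) * (0 - (-1 / 4))     # f_bar = 0, f_under = -1/4
assert abs(gap - bound) < 1e-12
```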

5 The General Case

Now, we consider an arbitrary polynomial \(f=\sum _{\beta \in I(n,d)}f_{\beta }x^{\beta }\in {\mathcal{H}}_{n,d}\). We need the following notation to formulate our result. Consider the univariate polynomial \(t^d-t^{\underline{d}}\) (in the variable t), which can be written as

$$ t^d-t^{\underline{d}}=\sum _{k=1}^{d-1}(-1)^{d-k-1}a_{d-k}t^{k},$$
(5.1)

for some positive scalars \(a_1,a_2,\cdots ,a_{d-1}\). Moreover, one can easily check that

$$ \sum _{k=1}^{d-1}a_{d-k}t^{k}=(t+d-1)^{\underline{d}}-t^d. $$
(5.2)
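The scalars \(a_{d-k}\) are easy to compute from (5.2) by expanding \((t+d-1)^{\underline{d}}-t^d\); the sketch below (helper names ours) does so, recovering for \(d=4\) the coefficients 6, 11, 6 that reappear in the quartic bound of Example 5.4.

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists (index = power of t)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def a_coeffs(d):
    """The scalars a_{d-k} of (5.1)-(5.2), read off from (t+d-1)^underline{d} - t^d.

    Returns [a_{d-1}, ..., a_1], i.e. the coefficients of t^1, ..., t^{d-1}."""
    p = [1]
    for j in range(d):
        p = poly_mul(p, [d - 1 - j, 1])   # multiply by the factor (t + d - 1 - j)
    p[d] -= 1                             # subtract t^d
    return p[1:d]

# For d = 4: (t+3)(t+2)(t+1)t - t^4 = 6t^3 + 11t^2 + 6t, so a_3 = 6, a_2 = 11, a_1 = 6.
assert a_coeffs(4) == [6, 11, 6]
# For d = 2: (t+1)t - t^2 = t, so a_1 = 1, matching t^2 - t^underline{2} = t in (5.1).
assert a_coeffs(2) == [1]
```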

We can show the following error bound for the range \(f_{\Delta (n,r)}-f_{\min }^{(r-d)}\).

Theorem 5.1

For any polynomial \(f\in {\mathcal{H}}_{n,d}\) and \(r\geqslant d\), one has

$$ {f_{\Delta (n,r)}}-{f_{\min }^{(r-d)}}\leqslant {(r+d-1)^{\underline{d}}-r^d \over r^{\underline{d}}}\binom{2d-1}{d}d^d (\overline{f}-\underline{f}). $$
(5.3)

Note that when \(f\) is quadratic, cubic or square-free, we have shown better upper bounds in Theorems 2.1, 3.1 and 4.1.

In the proof, we will need the following Vandermonde-Chu identity (see [13] for a proof, or alternatively use induction on \(d\geqslant 1\)):

$$\left(\sum _{i=1}^n x_i\right)^{\underline{d}}=\sum _{\alpha \in I(n,d)}{d!\over \alpha !}x^{\underline{\alpha }}\quad \forall x\in {\mathbb{R}}^n, $$
(5.4)

which is an analogue of the multinomial theorem \((\sum _{i=1}^n x_i)^d=\sum _{\alpha \in I(n,d)}{d!\over \alpha !}x^{\alpha }.\) Now we prove Theorem 5.1.
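Before turning to the proof, here is a quick numerical sanity check of (5.4) at a generic real point of our own choosing, with \(n=3\) and \(d=3\):

```python
from math import factorial

def falling(x, k):
    """Falling factorial x^{underline{k}}, defined for real x."""
    out = 1.0
    for j in range(k):
        out *= x - j
    return out

def compositions(n, r):
    """Yield all alpha in N^n with |alpha| = r (the set I(n, r))."""
    if n == 1:
        yield (r,)
        return
    for first in range(r + 1):
        for rest in compositions(n - 1, r - first):
            yield (first,) + rest

def multinomial(d, alpha):
    out = factorial(d)
    for a in alpha:
        out //= factorial(a)
    return out

# (5.4) is a polynomial identity, so it must hold at any real point x.
x, d = (0.7, -1.3, 2.4), 3
lhs = falling(sum(x), d)
rhs = sum(multinomial(d, a) * falling(x[0], a[0]) * falling(x[1], a[1]) * falling(x[2], a[2])
          for a in compositions(3, d))
assert abs(lhs - rhs) < 1e-9
```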

Proof

(of Theorem 5.1) From (1.3), we have

$${r^{\underline{d}}\over r^d}f_{\min }^{(r-d)}=\min _{\alpha \in I(n,r)}\left\{ \sum _{\beta \in I(n,d)}f_{\beta }{\alpha ^{\beta }\over r^d}-\sum _{\beta \in I(n,d)}f_{\beta }{\alpha ^{\beta }-\alpha ^{\underline{\beta }}\over r^d}\right\}.$$

From this, we obtain the inequality:

$${r^{\underline{d}}\over r^d}f_{\min }^{(r-d)}\geqslant f_{\Delta (n,r)}-\max _{\alpha \in I(n,r)}\sum _{\beta \in I(n,d)}f_{\beta }{\alpha ^{\beta }-\alpha ^{\underline{\beta }}\over r^d}.$$
(5.5)

We now focus on the summation \(\sum _{\beta \in I(n,d)}f_{\beta }(\alpha ^{\beta }-\alpha ^{\underline{\beta }})\).

For any \(\beta \in I(n,d)\) and \(x\in {\mathbb{R}}^n\), we can write the polynomial \(x^{\beta }-x^{\underline{\beta }}\) as

$$x^{\beta }-x^{\underline{\beta }}=\sum _{\gamma :|\gamma |\leqslant d-1}(-1)^{d-|\gamma |-1}c_{\gamma }^{\beta }x^{\gamma },$$
(5.6)

for some nonnegative scalars \(c_{\gamma }^{\beta }\) (which is an analogue of (5.1)). We now claim that for any fixed \(k\in [d-1]\), the following identity holds:

$$\sum _{\gamma \in I(n,k)}\sum _{\beta \in I(n,d)}{d!\over \beta !}(-1)^{d-|\gamma |-1}c_{\gamma}^{\beta}x^{\gamma}=(-1)^{d-k-1}a_{d-k}\left(\sum _{i=1}^nx_i\right)^{k}.$$
(5.7)

For this, observe that the polynomials at both sides of (5.7) are homogeneous of degree k. Hence (5.7) will follow if we can show that the equality holds after summing each side over \(k\in [d-1]\). In other words, it suffices to show the identity:

$$ \sum _{k=1}^{d-1}\sum _{\gamma \in I(n,k)}\sum _{\beta \in I(n,d)}{d!\over \beta !}(-1)^{d-|\gamma |-1}c_{\gamma }^{\beta }x^{\gamma }=\sum _{k=1}^{d-1}(-1)^{d-k-1}a_{d-k}\left(\sum _{i=1}^nx_i\right)^{k}. $$

By the definition of \(a_{d-k}\) in (5.1), the right side of the above equation is equal to \((\sum _{i=1}^nx_i)^{d}-(\sum _{i=1}^nx_i)^{\underline{d}}\). Hence, we only need to show

$$ \sum _{k=1}^{d-1}\sum _{\gamma \in I(n,k)}\sum _{\beta \in I(n,d)}{d!\over \beta !}(-1)^{d-|\gamma |-1}c_{\gamma }^{\beta }x^{\gamma }=\left(\sum _{i=1}^nx_i\right)^{d}-\left(\sum _{i=1}^nx_i\right)^{\underline{d}}. $$
(5.8)

Summing over (5.6), we obtain

$$ \sum _{\beta \in I(n,d)}{d!\over \beta !}\left( x^{\beta }-x^{\underline{\beta }}\right) =\sum _{\beta \in I(n,d)}\sum _{\gamma :|\gamma |\leqslant d-1}{d!\over \beta !}(-1)^{d-|\gamma |-1}c_{\gamma }^{\beta }x^{\gamma }=\sum _{k=1}^{d-1}\sum _{\gamma \in I(n,k)}\sum _{\beta \in I(n,d)}{d!\over \beta !}(-1)^{d-|\gamma |-1}c_{\gamma }^{\beta }x^{\gamma }. $$

We can now conclude the proof of (5.8) (and thus of (5.7)). Indeed, by using the multinomial theorem and the Vandermonde-Chu identity (5.4), we see that the left-most side in the above relation is equal to \((\sum _{i=1}^nx_i)^d-(\sum _{i=1}^nx_i)^{\underline{d}}.\)

We partition \([d-1]\) as \([d-1]=I_{o}\cup I_{e}\), where \(I_{o}:=\{k\in [d-1]: d-k\ {\text {is odd}}\}\) and \(I_{e}:=\{k\in [d-1]: d-k\ {\text {is even}}\}\). Then, from (5.6), the summation \(\sum _{\beta \in I(n,d)}f_{\beta }(\alpha ^{\beta }-\alpha ^{\underline{\beta }})\) becomes

$$\begin{aligned} \sum _{\beta \in I(n,d)}f_{\beta }(\alpha ^{\beta }-\alpha ^{\underline{\beta }}) &=\sum _{\beta \in I(n,d)}f_{\beta }\sum _{\gamma :|\gamma |\leqslant d-1}(-1)^{d-|\gamma |-1}c_{\gamma }^{\beta }\alpha ^{\gamma }\\ &= \sum _{k=1}^{d-1}\sum _{\gamma \in I(n,k)}\sum _{\beta \in I(n,d)}f_{\beta }(-1)^{d-|\gamma |-1}c_{\gamma }^{\beta }\alpha ^{\gamma }\\ &\leqslant \left( \max _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) \sum _{k\in I_{o}}\sum _{\gamma \in I(n,k)}\sum _{\beta \in I(n,d)}{d!\over \beta !}c_{\gamma }^{\beta }\alpha ^{\gamma } -\left( \min _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) \sum _{k\in I_{e}}\sum _{\gamma \in I(n,k)}\sum _{\beta \in I(n,d)}{d!\over \beta !}c_{\gamma }^{\beta }\alpha ^{\gamma }. \end{aligned}$$

By (5.7), we obtain

$$ \sum _{\beta \in I(n,d)}f_{\beta }(\alpha ^{\beta }-\alpha ^{\underline{\beta }})\leqslant \left( \max _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) \sum _{k\in I_{o}}a_{d-k}\left(\sum _{i=1}^n\alpha _i\right)^{k}- \left( \min _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) \sum _{k\in I_{e}}a_{d-k}\left(\sum _{i=1}^n\alpha _i\right)^{k}. $$

Combining with (5.5), we get

$$r^{\underline{d}}f_{\min }^{(r-d)}\geqslant r^df_{\Delta (n,r)}-\left( \max _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) \sum _{k\in I_{o}}a_{d-k}r^{k} +\left( \min _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) \sum _{k\in I_{e}}a_{d-k}r^{k}.$$

That is,

$$ r^{\underline{d}}(f_{\Delta (n,r)}-f_{\min }^{(r-d)}) \leqslant (r^{\underline{d}}-r^d)f_{\Delta (n,r)}+ \left( \max _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) \sum _{k\in I_{o}}a_{d-k}r^{k} -\left( \min _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) \sum _{k\in I_{e}}a_{d-k}r^{k}. $$

Since \(r^{\underline{d}}-r^d=\sum _{k=1}^{d-1}(-1)^{d-k}a_{d-k}r^{k}\), we obtain

$$\begin{aligned} r^{\underline{d}}(f_{\Delta (n,r)}-f_{\min }^{(r-d)})&\leqslant \sum _{k=1}^{d-1}(-1)^{d-k}a_{d-k}r^{k}f_{\Delta (n,r)}+ \left( \max _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) \sum _{k\in I_{o}}a_{d-k}r^{k} - \left( \min _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) \sum _{k\in I_{e}}a_{d-k}r^{k}\\ &= \left( \max _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) \sum _{k\in I_{o}}a_{d-k}r^{k}+f_{\Delta (n,r)}\sum _{k\in I_{e}}a_{d-k}r^{k} -\left( \min _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) \sum _{k\in I_{e}}a_{d-k}r^{k}-f_{\Delta (n,r)}\sum _{k\in I_{o}}a_{d-k}r^{k}. \end{aligned}$$

According to (1.5), one has \(\min _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\le f_{\Delta (n,r)}\leqslant \max _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\). Therefore, we have

$$ r^{\underline{d}}(f_{\Delta (n,r)}-f_{\min }^{(r-d)}) \leqslant \left( \max _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}-\min _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) \sum _{k=1}^{d-1}a_{d-k}r^{k}. $$

That is,

$$ f_{\Delta (n,r)}-f_{\min }^{(r-d)}\leqslant {\sum _{k=1}^{d-1}a_{d-k}r^{k} \over r^{\underline{d}}}\left( \max _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}-\min _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) . $$

Finally, together with Theorem 1.2 and (5.2), we can conclude the result of Theorem 5.1.\(\square \)

Now, we compare the following theorem by De Klerk et al. [5] with our new result (5.3).

Theorem 5.2

[5, Theorem 1.3] Suppose \(f\in {\mathcal{H}}_{n,d}\) and \(r\geqslant d\). Then

$$\underline{f}-f_{\min }^{(r-d)}\leqslant \left( {r^d\over r^{\underline{d}}}-1\right) \binom{2d-1}{d}d^d(\overline{f}-\underline{f}), $$
(5.9)
$$ f_{\Delta (n,r)}-\underline{f}\leqslant \left( 1-{r^{\underline{d}}\over r^{d}}\right) \binom{2d-1}{d}d^d(\overline{f}-\underline{f}). $$
(5.10)

By adding up (5.9) and (5.10), we obtain

$$ f_{\Delta (n,r)}-f_{\min }^{(r-d)}\leqslant \left( {r^d\over r^{\underline{d}}}-{r^{\underline{d}}\over r^d}\right) \binom{2d-1}{d}d^d(\overline{f}-\underline{f}). $$
(5.11)

Lemma 5.3

When r is large enough, the upper bound (5.3) refines the upper bound (5.11).

Proof

It suffices to show that \({r^d\over r^{\underline{d}}}-{r^{\underline{d}}\over r^d}\) is larger than \({\sum _{k=1}^{d-1}a_{d-k}r^{k} \over r^{\underline{d}}}\) when \(r\) is sufficiently large. Since \({r^d\over r^{\underline{d}}}-{r^{\underline{d}}\over r^d}=\left( r^d-{(r^{\underline{d}})^2\over r^d}\right) /r^{\underline{d}}\), we only need to compare \(r^d-{(r^{\underline{d}})^2\over r^d}\) with \(\sum _{k=1}^{d-1}a_{d-k}r^{k}\). Expanding \(r^d-{(r^{\underline{d}})^2\over r^d}\) in powers of r, one can check that the coefficient of \(r^d\) is \(0\) and the coefficient of \(r^{d-1}\) is \(2a_{1}>0\), while in the sum \({\sum _{k=1}^{d-1}a_{d-k}r^{k}}\) the coefficient of \(r^{d-1}\) is only \(a_{1}>0\). Therefore, when \(r\) is sufficiently large, \(r^d-{(r^{\underline{d}})^2\over r^d}\) is larger than \(\sum _{k=1}^{d-1}a_{d-k}r^{k}\), which concludes the proof. \(\square \)

We illustrate the result in Lemma 5.3 in the case of quartic polynomials.

Example 5.4

Consider a polynomial \(f\in {\mathcal{H}}_{n,4}\) written as

$$f= \sum _{i=1}^nf_ix_i^4+\sum _{i<j}\left( f_{ij}x_i^3x_j+g_{ij}x_i^2x_j^2+h_{ij}x_ix_j^3\right) +\sum _{i<j<k}(f_{ijk}x_i^2x_jx_k+ g_{ijk}x_ix_j^2x_k +h_{ijk}x_ix_jx_k^2) +\sum _{i<j<k<l}f_{ijkl}x_ix_jx_kx_l. $$

In this case, (5.3) reads

$$ f_{\Delta (n,r)}-f_{\min }^{(r-4)}\leqslant {6r^2+11r+6\over (r-1)(r-2)(r-3)}\binom{7}{4}4^4 (\overline{f}-\underline{f}), $$
(5.12)

while (5.11) reads

$$ f_{\Delta (n,r)}-f_{\min }^{(r-4)}\leqslant {12r^2-58r+144-{193\over r}+{132\over r^2}-{36\over r^3}\over (r-1)(r-2)(r-3)}\binom{7}{4}4^4 (\overline{f}-\underline{f}). $$
(5.13)

One can check that (5.12) refines (5.13) when \(r\geqslant 10\).
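This comparison can be checked mechanically; the sketch below (function names ours) evaluates the two coefficient factors appearing in the quartic bounds and confirms the crossover at \(r=10\):

```python
def falling(r, d):
    """Falling factorial r^{underline{d}} = r (r - 1) ... (r - d + 1)."""
    out = 1
    for j in range(d):
        out *= r - j
    return out

d = 4

def new_factor(r):
    """Coefficient factor in (5.3): ((r+d-1)^underline{d} - r^d) / r^underline{d}."""
    return (falling(r + d - 1, d) - r ** d) / falling(r, d)

def old_factor(r):
    """Coefficient factor in (5.11): r^d / r^underline{d} - r^underline{d} / r^d."""
    return r ** d / falling(r, d) - falling(r, d) / r ** d

assert new_factor(9) > old_factor(9)                               # not yet refined at r = 9
assert all(new_factor(r) < old_factor(r) for r in range(10, 200))  # refined from r = 10 on
```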

Remark 5.5

We now consider the convergence rate of the sequence

$$ \alpha _r:={f_{\Delta (n,r)}-f_{\min }^{(r-d)}\over \overline{f}-\underline{f}},\ \ \ \ r=d,d+1,\cdots $$

Suppose the degree d of f is fixed. By (5.3), we have \(\alpha _r=O({1\over r})\). Since Example 4.2 shows that \(\alpha _r=\Omega ({1\over r})\) can hold, we conclude that the dependence of \(\alpha _r\) on r in (5.3) is tight, in the sense that there does not exist any \(\varepsilon >0\) such that \(\alpha _r=O({1\over r^{1+\varepsilon }})\).

In [7], De Klerk et al. consider the convergence rate of the sequence

$$ \beta _r:={{f_{\Delta (n,r)}-\underline{f}}\over \overline{f}-\underline{f}},\ \ \ \ r=1,2,\cdots $$

They consider several examples, all of which satisfy \(\beta _r=O({1\over r^2})\); however, determining the asymptotic convergence rate of \(\beta _r\) in general remains an open question.