1 Introduction and Preliminaries

Consider the problem of minimizing a homogeneous polynomial \(f\in {\mathbb{R}}[x]\) of degree d on the (standard) simplex

$$\Delta _n:=\{x\in {\mathbb{R}}_+^n:\sum _{i=1}^nx_i=1\}.$$

That is, we consider the global optimization problem

$$\underline{f}:=\min _{x\in \Delta _n}f(x), \ \ {\text {or}}\ \ \overline{f}:=\max _{x\in \Delta _n}f(x).$$
(1.1)

Here we focus on the problem of computing the minimum \(\underline{f}\) of f over \(\Delta _n\). This problem is well known to be NP-hard, as it contains the maximum stable set problem as a special case (when f is quadratic). Indeed, given a graph \(G=(V,E)\) with adjacency matrix A, Motzkin and Straus [8] show that the maximum stability number \(\alpha (G)\) can be obtained by

$${1\over \alpha (G)}=\min _{x\in \Delta _{|V|}}x^{\text{T}}(I+A)x,$$

where I denotes the identity matrix. Moreover, one can w.l.o.g. assume \(f\) is homogeneous. Indeed, if \(f=\sum _{s=0}^{d}f_s\), where \(f_s\) is homogeneous of degree s, then \(\min _{x\in \Delta _n}f(x)=\min _{x\in \Delta _n}f'(x)\), setting \(f'=\sum _{s=0}^d f_s\left( \sum _{i=1}^n x_i\right) ^{d-s}.\)
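As a quick numerical illustration of this homogenization step (the toy non-homogeneous polynomial below, with \(n=2\) and \(d=2\), is our own choice):

```python
# Toy example: f(x) = x1^2 + x1 + 1 has parts of degrees 2, 1, 0.
# Multiplying each degree-s part by (x1 + x2)^(d - s) with d = 2 gives the
# homogeneous f'(x) = x1^2 + x1 (x1 + x2) + (x1 + x2)^2, which agrees with
# f on the simplex, where x1 + x2 = 1.
def f(x):
    return x[0] ** 2 + x[0] + 1

def f_hom(x):
    s = x[0] + x[1]
    return x[0] ** 2 + x[0] * s + s ** 2

for t in [0.0, 0.25, 0.5, 1.0]:
    x = (t, 1.0 - t)                      # points of the simplex Delta_2
    assert abs(f(x) - f_hom(x)) < 1e-12   # equal values on the simplex
```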

For problem (1.1), many approximation algorithms have been studied in the literature. In fact, when f has fixed degree d, there is a polynomial time approximation scheme (PTAS) for this problem, see [1] for the case \(d=2\) and [5, 7] for \(d\geqslant 2\). For more results on its computational complexity, we refer to [3, 6].

We consider the following two bounds for \(\underline{f}\): an upper bound \(f_{\Delta (n,r)}\), obtained by taking the minimum value of f on a regular grid, and a lower bound \(f_{\min }^{(r-d)}\), based on Pólya’s representation theorem. Both bounds have been studied in the literature; see, e.g., [1, 5, 7] for \(f_{\Delta (n,r)}\) and [5, 14, 15] for \(f_{\min }^{(r-d)}\). The two ranges \(f_{\Delta (n,r)}-\underline{f}\) and \(\underline{f}- f_{\min }^{(r-d)}\) have been studied separately, and upper bounds for each of them have been shown in the above-mentioned works.

In this paper, we study these two ranges at the same time. More precisely, we analyze the larger range \(f_{\Delta (n,r)}-f_{\min }^{(r-d)}\) and provide upper bounds for it in terms of the range of function values \(\overline{f}-\underline{f}\). Of course, upper bounds for the range \(f_{\Delta (n,r)}-f_{\min }^{(r-d)}\) can be obtained by combining the known upper bounds for each of the two ranges \(f_{\Delta (n,r)}-\underline{f}\) and \(\underline{f}- f_{\min }^{(r-d)}\). Our new upper bound for \(f_{\Delta (n,r)}-f_{\min }^{(r-d)}\) refines these known bounds in the quadratic and cubic cases and provides an asymptotic refinement for general degree d.

1.1 Notation

Throughout, \({\mathcal{H}}_{n,d}\) denotes the set of all homogeneous polynomials in n variables with degree d, and \({\mathbb{R}}[x]\) denotes the set of all multivariate polynomials in the n variables \(x_1,x_2,\cdots ,x_n\). We let \([n]:=\{1,2,\cdots ,n\}\). We denote by \({\mathbb{R}}^n_+\) the set of all nonnegative real vectors and by \({\mathbb{N}}^n\) the set of all nonnegative integer vectors. For \(\alpha \in {\mathbb{N}}^n\), we define \(|\alpha |:=\sum _{i=1}^n\alpha _i\) and \(\alpha !:=\alpha _1!\alpha _2!\cdots \alpha _n!\), and we set \(I(n,d):=\{\alpha \in {\mathbb{N}}^n: |\alpha |=d\}\). We let e denote the all-ones vector and \(e_i\) the ith standard unit vector. For \(\alpha \in {\mathbb{N}}^n\), we denote \(x^{\alpha }:=\prod _{i=1}^nx_i^{\alpha _i}\), while for \(I\subseteq [n]\), we let \(x^{I}:=\prod _{i\in I}x_i\). Moreover, we denote \(x^{\underline{d}}:=x(x-1)(x-2)\cdots (x-d+1)\) for integer \(d\geqslant 0\) and \(x^{\underline{\alpha }}:=\prod _{i=1}^nx_{i}^{\underline{{\alpha }_i}}\) for \(\alpha \in {\mathbb{N}}^n\). Thus, \(x^{\underline{d}}=0\) if x is an integer with \(0\leqslant x\leqslant d-1\).

1.2 Upper Bounds Using Regular Grids

One can construct an upper bound for \(\underline{f}\) by taking the minimum of f on the regular grid

$$\Delta (n,r):=\{x\in \Delta _n:rx\in {\mathbb{N}}^n\},$$

for an integer \(r\geqslant 0\). We define

$$f_{\Delta (n,r)}:=\min _{x\in \Delta (n,r)}f(x).$$

Obviously, \(\underline{f}\leqslant f_{\Delta (n,r)}\leqslant \overline{f}\), and \(f_{\Delta (n,r)}\) can be computed by \(|\Delta (n,r)| = \binom{n+r-1}{r}\) evaluations of f. In fact, when considering polynomials f of fixed degree d, the parameters \(f_{\Delta (n,r)}\) (with increasing values of r) provide a PTAS for (1.1), as was proved by Bomze and de Klerk [1] (for \(d=2\)) and by de Klerk et al. [5] (for \(d\geqslant 2\)). Recently, de Klerk et al. [7] provide an alternative proof of this PTAS and refine the error bound for \(f_{\Delta (n,r)}-\underline{f}\) from [5] for cubic f.
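A brute-force sketch of this computation (the helper names are our own): enumerate \(I(n,r)\), scale by \(1/r\), and minimize f over the resulting grid.

```python
from math import comb

def compositions(n, r):
    """Yield all alpha in N^n with |alpha| = r, i.e. the set I(n, r)."""
    if n == 1:
        yield (r,)
        return
    for first in range(r + 1):
        for rest in compositions(n - 1, r - first):
            yield (first,) + rest

def grid_min(f, n, r):
    """f_{Delta(n,r)}: the minimum of f over the regular grid Delta(n, r)."""
    return min(f(tuple(a / r for a in alpha)) for alpha in compositions(n, r))

# Example: f(x) = sum_i x_i^2 on Delta_3 with r = 3.  The grid has
# C(n+r-1, r) = C(5, 3) = 10 points, and the minimum 1/3 is attained at the
# barycenter (1/3, 1/3, 1/3), which lies on the grid since n divides r.
f = lambda x: sum(t * t for t in x)
assert sum(1 for _ in compositions(3, 3)) == comb(5, 3)
assert abs(grid_min(f, 3, 3) - 1 / 3) < 1e-12
```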

In addition, some researchers have studied properties of the regular grid \(\Delta (n,r)\). For instance, given a point \(x \in \Delta _n\), Bomze et al. [2] give a scheme for finding the point of \(\Delta (n,r)\) closest to x with respect to a class of norms that includes the \(\ell _p\)-norms for \(p\geqslant 1\).

1.3 Lower Bounds Based on Pólya’s Representation Theorem

Given a polynomial \(f\in {\mathcal{H}}_{n,d}\), Pólya [12] shows that if f is positive over the simplex \(\Delta _n\), then the polynomial \((\sum _{i=1}^n x_i)^r f\) has nonnegative coefficients for all r large enough (see [13] for an explicit bound on r). Based on this result of Pólya, an asymptotically converging hierarchy of lower bounds for \(\underline{f}\) can be constructed as follows: for any integer \(r\geqslant d\), we define the parameter \(f_{\min }^{(r-d)}\) as

$$f_{\min }^{(r-d)}:=\max \lambda \ \ {\text {s.t.}}\ \ \left( \sum _{i=1}^nx_i\right) ^{r-d}\left( f-\lambda \left( \sum _{i=1}^nx_i\right) ^d\right) \ \ \text {has nonnegative coefficients.}$$
(1.2)

Notice that \(\underline{f}\) can be equivalently formulated as

$$\underline{f}=\max \ \ \lambda \ \ {\text {s.t.}}\ \ f(x)-\lambda \left( \sum _{i=1}^nx_i\right) ^d\geqslant 0\ \ \forall x\in {\mathbb{R}}^n_+. $$

Then, one can easily check the following inequalities:

$$f_{\min }^{(0)}\leqslant f_{\min }^{(1)}\leqslant \cdots \leqslant \underline{f}\leqslant f_{\Delta (n,r)}\leqslant \overline{f}.$$

Parrilo [9, 10] first introduces the idea of applying Pólya’s representation theorem to construct hierarchical approximations in copositive optimization. De Klerk et al. [5] consider \(f_{\min }^{(r-d)}\) and show upper bounds for \(\underline{f}-f_{\min }^{(r-d)}\) in terms of \(\overline{f}-\underline{f}\). Furthermore, Yildirim [15] and Sagol and Yildirim [14] analyze error bounds for \(f_{\min }^{(r-2)}\) for quadratic f.

Now we give an explicit formula for the parameter \(f_{\min }^{(r-d)}\), which follows from [13, relation (3)]; note that the quadratic case of this formula has also been observed in [11, 14, 15].

Lemma 1.1

For \(f=\sum _{\beta \in I(n,d)}f_{\beta }x^{\beta }\in {\mathcal{H}}_{n,d}\), one has

$$f_{\min }^{(r-d)}=\min _{\alpha \in I(n,r)}\sum _{\beta \in I(n,d)}f_{\beta }{ \alpha ^{\underline{\beta }} \over r^{\underline{d}}}.$$
(1.3)

Proof

By using the multinomial theorem \((\sum _{i=1}^n x_i)^d=\sum _{\alpha \in I(n,d)}{d!\over \alpha !}x^{\alpha }\), we obtain

$$\begin{aligned} \left( \sum _{i=1}^nx_i\right) ^{r-d}f-\lambda \left( \sum _{i=1}^nx_i\right) ^{r}&= \left( \sum _{\gamma \in I(n,r-d)}{(r-d)!\over \gamma !}x^{\gamma }\right) \left( \sum _{\beta \in I(n,d)}f_{\beta }x^{\beta }\right) -\lambda \sum _{\alpha \in I(n,r)}{r!\over \alpha !}x^{\alpha }\\ &= \sum _{\alpha \in I(n,r)}\left( \sum _{\beta \in I(n,d)}f_{\beta }{\alpha ^{\underline{\beta }}\over r^{\underline{d}}}\right) {r!\over \alpha !}x^{\alpha }-\lambda \sum _{\alpha \in I(n,r)}{r!\over \alpha !}x^{\alpha }\\ &= \sum _{\alpha \in I(n,r)}\left( \sum _{\beta \in I(n,d)}f_{\beta }{\alpha ^{\underline{\beta }}\over r^{\underline{d}}}-\lambda \right) {r!\over \alpha !}x^{\alpha }. \end{aligned}$$

Hence, by Definition (1.2), we obtain

$$\begin{aligned} f_{\min }^{(r-d)}&= \max \ \lambda \ \ {\text {s.t.}} \ \ \sum _{\beta \in I(n,d)}f_{\beta }{\alpha ^{\underline{\beta }}\over r^{\underline{d}}}-\lambda \geqslant 0\ \ \forall \alpha \in I(n,r)\\ &= \min _{\alpha \in I(n,r)}\sum _{\beta \in I(n,d)}f_{\beta }{\alpha ^{\underline{\beta }}\over r^{\underline{d}}}. \end{aligned}$$

\(\square \)

As for \(f_{\Delta (n,r)}\), by (1.3) the computation of \(f_{\min }^{(r-d)}\) requires \(|I(n,r)| = \binom{n+r-1}{r}\) evaluations of the expression \(\sum _{\beta \in I(n,d)}f_{\beta }\alpha ^{\underline{\beta }}{1\over r^{\underline{d}}}\).
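The formula (1.3) can be evaluated directly; the following sketch (helper names ours) does so for the toy quadratic \(f=x_1^2+x_2^2\):

```python
def falling(x, k):
    """Falling factorial x^{underline{k}} = x (x - 1) ... (x - k + 1)."""
    out = 1
    for j in range(k):
        out *= x - j
    return out

def compositions(n, r):
    """Yield all alpha in N^n with |alpha| = r (the set I(n, r))."""
    if n == 1:
        yield (r,)
        return
    for first in range(r + 1):
        for rest in compositions(n - 1, r - first):
            yield (first,) + rest

def f_min_bound(coeffs, n, d, r):
    """Lower bound f_min^{(r-d)} via formula (1.3).

    coeffs maps each exponent beta in I(n, d) to the coefficient f_beta."""
    rd = falling(r, d)
    def term(alpha, beta):
        p = 1
        for a, b in zip(alpha, beta):
            p *= falling(a, b)
        return p
    return min(sum(c * term(alpha, beta) for beta, c in coeffs.items()) / rd
               for alpha in compositions(n, r))

# f = x1^2 + x2^2 with n = 2, d = 2, r = 4: the minimizing alpha is (2, 2),
# giving f_min^{(2)} = (2*1 + 2*1) / (4*3) = 1/3, below the true minimum 1/2.
assert abs(f_min_bound({(2, 0): 1, (0, 2): 1}, 2, 2, 4) - 1 / 3) < 1e-12
```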

1.4 Bernstein Coefficients

For any polynomial \(f=\sum _{\beta \in I(n,d)}f_{\beta }x^{\beta }\in {\mathcal{H}}_{n,d}\), we can write it as

$$f=\sum _{\beta \in I(n,d)}f_{\beta }x^{\beta }=\sum _{\beta \in I(n,d)}\left( f_{\beta }{\beta !\over d!}\right) {d!\over \beta !}x^{\beta }. $$
(1.4)

For any \(\beta \in I(n,d)\), the scalars \(f_{\beta }{\beta !\over d!}\) are called the Bernstein coefficients of f (this terminology has also been used in [4, 7]), since they are the coefficients of f when it is expressed in the Bernstein basis \(\{{d!\over \beta !}x^\beta : \beta \in I(n,d)\}\) of \({\mathcal{H}}_{n,d}\). Applying the multinomial theorem together with (1.4), one obtains that, when evaluating f at a point \(x\in \Delta _n\), the value \(f(x)\) is a convex combination of the Bernstein coefficients \(f_{\beta }{\beta !\over d!}\). Therefore, we have

$$\min _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\leqslant \underline{f}\leqslant f_{\Delta (n,r)}\leqslant \overline{f}\leqslant \max _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}.$$
(1.5)
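A small numerical illustration of the sandwich (1.5), with a toy quadratic of our own choosing:

```python
from math import factorial

def multinomial(d, beta):
    """The Bernstein basis coefficient d!/beta!."""
    out = factorial(d)
    for b in beta:
        out //= factorial(b)
    return out

# Toy quadratic f = x1^2 + 4 x1 x2 + x2^2 in H_{2,2}.
coeffs = {(2, 0): 1.0, (1, 1): 4.0, (0, 2): 1.0}
d = 2
# Bernstein coefficients f_beta * beta!/d!: here 1, 2 and 1.
bern = {b: c / multinomial(d, b) for b, c in coeffs.items()}

def f(x):
    return sum(c * x[0] ** b[0] * x[1] ** b[1] for b, c in coeffs.items())

# On Delta_2 the Bernstein basis sums to (x1 + x2)^d = 1, so f(x) is a convex
# combination of the Bernstein coefficients, giving the sandwich (1.5).
for t in [0.0, 0.3, 0.5, 1.0]:
    x = (t, 1.0 - t)
    assert min(bern.values()) - 1e-12 <= f(x) <= max(bern.values()) + 1e-12
```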

For the analysis in Sect. 5, we need the following result of [5], which bounds the range of the Bernstein coefficients of f in terms of its range of values \(\overline{f}-\underline{f}\).

Theorem 1.2

[5, Theorem 2.2] For any polynomial \(f=\sum _{\beta \in I(n,d)}f_{\beta }x^{\beta }\in {\mathcal{H}}_{n,d}\), one has

$$\max _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}-\min _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\leqslant \binom{2d-1}{d}d^d (\overline{f}-\underline{f}).$$

1.5 Contribution of the Paper

In this paper, we consider upper bounds for \(f_{\Delta (n,r)}-f_{\min }^{(r-d)}\) in terms of \(\overline{f}-\underline{f}\). More precisely, we provide tighter upper bounds in the quadratic, cubic and square-free (also known as multilinear) cases; in the general case \(d\geqslant 2\), our upper bound is asymptotically tighter when \(r\) is large enough. We apply the formula (1.3) directly in the quadratic, cubic and square-free cases, while for the general case we use Theorem 1.2.

There are some relevant results in the literature. De Klerk et al. [5] give upper bounds for \(f_{\Delta (n,r)}-\underline{f}\) (the bound for cubic \(f\) has been refined by de Klerk et al. [7]) and for \(\underline{f}-f_{\min }^{(r-d)}\), both in terms of \(\overline{f}-\underline{f}\); by adding them up, one easily derives upper bounds for \(f_{\Delta (n,r)}-f_{\min }^{(r-d)}\). Furthermore, for a quadratic polynomial f, Yildirim [15] considers the upper bound \(\min _{k\leqslant r} f_{\Delta (n,k)}\) for \(\underline{f}\) (for \(r \geqslant 2\)) and bounds the range \(\min _{k\leqslant r} f_{\Delta (n,k)}-f_{\min }^{(r-2)}\) in terms of \(\overline{f}-\underline{f}\). Our results refine those of [5, 7, 15] in the quadratic and cubic cases (see Sects. 2 and 3, respectively), while in the general case our result refines that of [5] when r is sufficiently large (see Sect. 5).

1.6 Structure

The paper is organized as follows. In Sects. 2 and 3, we consider the quadratic and cubic cases, respectively, and refine the relevant results from [5, 7, 15]. Then, we look at the square-free (also known as multilinear) case in Sect. 4. Finally, in Sect. 5, we consider general (fixed-degree) polynomials and compare our new result with that of [5].

2 The Quadratic Case

For any quadratic polynomial f, we consider the range \(f_{\Delta (n,r)}-f_{\min }^{(r-2)}\) and derive the following upper bound in terms of \(\overline{f}-\underline{f}\).

Theorem 2.1

For any quadratic \(f=x^{\text{T}}Qx\) and \(r\geqslant 2\), one has

$$ f_{\Delta (n,r)}-f_{\min }^{(r-2)}\leqslant {1\over r-1}(Q_{\max }-f_{\Delta (n,r)}) \leqslant {1\over r-1}(\overline{f}-\underline{f}),$$
(2.1)

where \(Q_{\max }:=\max _{i\in [n]} Q_{ii}\).

Proof

By (1.3), we have

$$ f_{\min }^{(r-2)}=\min _{\alpha \in I(n,r)}{1\over r(r-1)}\left[ f(\alpha )-\sum _{i=1}^nQ_{ii}\alpha _i\right] .$$

Hence, \({r-1\over r}f_{\min }^{(r-2)}=\min _{\alpha \in I(n,r)}\left[ f({\alpha \over r})-\sum _{i=1}^nQ_{ii}{\alpha _i\over r}{1\over r}\right] .\) We obtain

$$ {r-1\over r}f_{\min }^{(r-2)}\geqslant \min _{\alpha \in I(n,r)}f({\alpha \over r})-\max _{\alpha \in I(n,r)}{1\over r}\sum _{i=1}^nQ_{ii}{\alpha _i\over r} = f_{\Delta (n,r)}-{1\over r} Q_{\max }.$$
(2.2)

One can easily obtain the first inequality in (2.1) by (2.2). For the second inequality in (2.1), we use the fact that \( Q_{\max } \leqslant \overline{f}\) (since \(Q_{ii}=f(e_i)\leqslant \overline{f}\) for \(i\in [n]\)) as well as the fact that \(f_{\Delta (n,r)}\geqslant \underline{f}\). \(\square \)

Now we point out that our result (2.1) refines the relevant result of [5]. De Klerk et al. [5] show the following theorem.

Theorem 2.2

[5, Theorem 3.2] Suppose \(f\in {\mathcal{H}}_{n,2}\) and \(r\geqslant 2\). Then

$$\underline{f}-f_{\min }^{(r-2)}\leqslant {1\over r-1}(\overline{f}-\underline{f}),$$
(2.3)
$$f_{\Delta (n,r)}-\underline{f}\leqslant {1\over r}(\overline{f}-\underline{f}). $$
(2.4)

By adding up (2.3) and (2.4), one gets

$$f_{\Delta (n,r)}-f_{\min }^{(r-2)}\leqslant \left( {1\over r-1}+{1\over r}\right) (\overline{f}-\underline{f}),$$

which is implied by our result (2.1).

Moreover, in [15], Yildirim considers one hierarchical upper bound of \(\underline{f}\) (when f is quadratic), which is defined by \(\min _{k\leqslant r}f_{\Delta (n,k)}.\) One can easily verify that

$$f_{\min }^{(r-2)}\leqslant \underline{f} \leqslant \min _{k\leqslant r}f_{\Delta (n,k)} \leqslant f_{\Delta (n,r)}.$$

In [15, Theorem 4.1], Yildirim shows \(\min _{k\leqslant r} f_{\Delta (n,k)}-f_{\min }^{(r-2)}\leqslant {1\over r-1}(Q_{\max }-\underline{f})\), which also follows easily from our result (2.1).

The following example shows that the upper bound (2.1) can be tight.

Example 2.3

[7, Example 2] Consider the quadratic polynomial \(f=\sum _{i=1}^n x_i^2\). As f is convex, one can check that \(\underline{f}={1\over n}\) (attained at \(x={1\over n}e\)) and \(\overline{f}=1\) (attained at any standard unit vector). To compute \(f_{\Delta (n,r)}\), we write r as \(r=kn+s\), where \(k\geqslant 0\) and \(0\leqslant s<n\). Then one can check that

$$f_{\Delta (n,r)}={1\over n} + {1\over r^2}{s(n-s)\over n}.$$

By (1.3), we have

$$f_{\Delta (n,r)}-f_{\min }^{(r-2)}={1\over r-1}\left( \overline{f}-\underline{f}\right) -{1\over r^2(r-1)}{s(n-s)\over n}.$$

Hence, for this example, the upper bound (2.1) is tight when \(s=0\).
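The computations of this example are easy to reproduce; the following sketch checks the case \(n=3\), \(r=6\) (so \(s=0\)), where the two sides of (2.1) coincide:

```python
def compositions(n, r):
    """Yield all alpha in N^n with |alpha| = r (the set I(n, r))."""
    if n == 1:
        yield (r,)
        return
    for first in range(r + 1):
        for rest in compositions(n - 1, r - first):
            yield (first,) + rest

n, r = 3, 6                                   # r = kn + s with k = 2, s = 0
f = lambda x: sum(t * t for t in x)           # f = sum_i x_i^2, so Q = I
f_grid = min(f(tuple(a / r for a in alpha)) for alpha in compositions(n, r))
# Formula (1.3) for quadratic f with Q = I: minimize (f(alpha) - sum_i alpha_i) / (r(r-1)).
f_low = min((sum(a * a for a in alpha) - r) / (r * (r - 1))
            for alpha in compositions(n, r))

assert abs(f_grid - 1 / n) < 1e-12            # f_Delta = 1/n since s = 0
gap = f_grid - f_low
bound = (1 / (r - 1)) * (1 - 1 / n)           # right side of (2.1): f_bar = 1, f_under = 1/n
assert abs(gap - bound) < 1e-12               # (2.1) holds with equality here
```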

3 The Cubic Case

For any cubic polynomial f, we consider the difference \(f_{\Delta (n,r)}-f_{\min }^{(r-3)}\) and show the following result.

Theorem 3.1

For any cubic polynomial f and \(r\geqslant 3\), one has

$$f_{\Delta (n,r)}-f_{\min }^{(r-3)}\leqslant {4r\over (r-1)(r-2)}(\overline{f}-\underline{f}). $$
(3.1)

Proof

We can write any cubic polynomial f as

$$f=\sum _{i=1}^nf_ix_i^3+\sum _{i<j}(f_{ij}x_ix_j^2+g_{ij}x_i^2x_j)+\sum _{i<j<k}f_{ijk}x_ix_jx_k.$$

Then by (1.3), one can check that

$$\begin{aligned} {(r-1)(r-2)\over r^2}f_{\min }^{(r-3)}&= \min _{\alpha \in I(n,r)}\left\{ f\left( {\alpha \over r}\right) -{1\over r^3}\left( 3\sum _{i=1}^n f_i\alpha _i^2-2\sum _{i=1}^n f_i\alpha _i+\sum _{i<j}(f_{ij}+g_{ij})\alpha _i\alpha _j \right) \right\} \\ &\geqslant f_{\Delta (n,r)}-{1\over r}\max _{\alpha \in I(n,r)}\left\{ 3\sum _{i=1}^n f_i\left( {\alpha _i\over r}\right) ^2+\sum _{i<j}(f_{ij}+g_{ij})\left( {\alpha _i\over r}\right) \left( {\alpha _j\over r}\right) \right\} +{1\over r^2}\min _{\alpha \in I(n,r)}2\sum _{i=1}^n f_i{\alpha _i\over r}\\ &\geqslant f_{\Delta (n,r)}-{1\over r}\max _{x\in \Delta _n}\left\{ 3\sum _{i=1}^n f_ix_i^2+\sum _{i<j}(f_{ij}+g_{ij})x_ix_j\right\} +{1\over r^2}\min _{x\in \Delta _n}2\sum _{i=1}^n f_ix_i. \end{aligned}$$
(3.2)

Evaluating f at \(e_i\) and \((e_i+e_j)/2\) yields, respectively, the relations:

$$\underline{f}\leqslant f_i\leqslant \overline{f},$$
(3.3)
$$f_i+f_j+f_{ij}+g_{ij}\leqslant 8\overline{f}.$$
(3.4)

Using (3.4) and the fact that \(\sum _{i=1}^nx_i=1\), one can obtain

$$\sum _{i<j}(f_{ij}+g_{ij})x_ix_j\leqslant \sum _{i<j}(8\overline{f}-f_i-f_j)x_ix_j=8\overline{f}\sum _{i<j}x_ix_j-\sum _{i=1}^n f_ix_i(1-x_i). $$
(3.5)

By (3.2), (3.3), (3.5) and the fact that \(\sum _{i=1}^nx_i=1\), one can get

$$(r-1)(r-2)f_{\min }^{(r-3)}\geqslant r^2f_{\Delta (n,r)}-4r\overline{f}+(r+2)\min _{x\in \Delta _n}\sum _{i=1}^nf_ix_i\geqslant r^2f_{\Delta (n,r)}-4r\overline{f}+(r+2)\underline{f}.$$

Hence, one has

$$(r-1)(r-2)\left( f_{\Delta (n,r)}-f_{\min }^{(r-3)}\right) \leqslant 4r\overline{f}-(3r-2)f_{\Delta (n,r)}-(r+2)\underline{f}\leqslant 4r(\overline{f}-\underline{f}). $$

\(\square \)

Now we observe that our result (3.1) refines the relevant upper bound obtained from [5, 7]. De Klerk et al. [5] show the following result.

Theorem 3.2

[5, Theorem 3.3] Suppose \(f\in {\mathcal{H}}_{n,3}\) and \(r\geqslant 3\). Then

$$\underline{f}-f_{\min }^{(r-3)}\leqslant {4r\over (r-1)(r-2)}(\overline{f}-\underline{f}),$$
(3.6)
$$f_{\Delta (n,r)}-\underline{f}\leqslant {4\over r}(\overline{f}-\underline{f}). $$
(3.7)

Recently, de Klerk et al. [7, Corollary 2] refine (3.7) to

$$f_{\Delta (n,r)}-\underline{f}\leqslant \left( {4\over r}-{4\over r^2}\right) (\overline{f}-\underline{f}).$$
(3.8)

Similar to the quadratic case (in Sect. 2), our new upper bound (3.1) implies the upper bound obtained by adding up (3.6) and (3.8). However, we have not found an example showing that the upper bound (3.1) is tight; establishing (or disproving) its tightness remains an open question.

4 The Square-free Case

Consider a square-free (also known as multilinear) polynomial \(f=\sum _{I:I\subseteq [n],|I|=d}f_Ix^I\in {\mathcal{H}}_{n,d}\). We have the following result for the difference \(f_{\Delta (n,r)}-f_{\min }^{(r-d)}\).

Theorem 4.1

For any square-free polynomial \(f=\sum _{I:I\subseteq [n],|I|=d}f_Ix^I\) and \(r\geqslant d\), one has

$$ f_{\Delta (n,r)}-f_{\min }^{(r-d)}\leqslant \left( {r^d\over r^{\underline{d}}}-1\right) \left( \overline{f}-\underline{f}\right) . $$
(4.1)

Proof

From (1.3), one can easily check that

$$ f_{\min }^{(r-d)}=\min _{\alpha \in I(n,r)}\sum _{I:I\subseteq [n],|I|=d}f_I{\alpha ^I\over r^{\underline{d}}}={1\over r^{\underline{d}}}\min _{\alpha \in I(n,r)}f(\alpha ). $$

As f is homogeneous of degree d, one has \(\min _{\alpha \in I(n,r)}f(\alpha )=r^d\min _{\alpha \in I(n,r)}f({\alpha \over r})=r^df_{\Delta (n,r)}\), and hence

$$f_{\min }^{(r-d)}={r^d\over r^{\underline{d}}}f_{\Delta (n,r)}. $$

For \(d=1\), the result (4.1) is clear.

Now we assume \(d\geqslant 2\). Since \(\overline{f}\geqslant 0\) (as \(f(e_i)=0\) for any \(i\in [n]\)), we obtain

$$f_{\Delta (n,r)}-f_{\min }^{(r-d)}=\left( 1-{r^d\over r^{\underline{d}}}\right) f_{\Delta (n,r)}\leqslant \left( 1-{r^d\over r^{\underline{d}}}\right) \underline{f} \leqslant \left( {r^d\over r^{\underline{d}}}-1\right) \left( \overline{f}-\underline{f}\right) . $$
(4.2)

\(\square \)

The following example shows that our upper bound (4.1) can be tight.

Example 4.2

[7, Example 4] Consider the square-free polynomial \(f=-x_1x_2\). One can check \(\overline{f}=0,\) \(\underline{f}=-{1\over 4}\), and

$$f_{\Delta (2,r)}=\left\{ \begin{array}{ll} -{1\over 4}, &{} \text {if }r\text { is even,}\\ -{1\over 4}+ {1\over 4r^2}, &{} \text {if }r\text { is odd.} \end{array} \right. $$

By (1.3), we have

$$f_{\Delta (2,r)}-f_{\min }^{(r-2)}=\left\{ \begin{array}{ll} {1\over r-1}\left( \overline{f}-\underline{f}\right) , &{} \text {if }r\text { is even,}\\ \left( {1\over r}+{1\over r^2}\right) \left( \overline{f}-\underline{f}\right) , &{} \text {if }r\text { is odd.} \end{array} \right.$$

For this example, the upper bound (4.1) is tight when \(r\) is even. In fact, one sees from (4.2) that the upper bound (4.1) is tight whenever \(f_{\Delta (n,r)}=\underline{f}-\overline{f}\) holds.
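A short sketch reproducing this example for \(r=4\) (even), checking both the square-free relation \(f_{\min }^{(r-d)}={r^d\over r^{\underline{d}}}f_{\Delta (n,r)}\) and the tightness of (4.1):

```python
# Example 4.2: f = -x1 x2 on Delta_2, with d = 2 and r = 4 even.
r = 4
grid = [(a, r - a) for a in range(r + 1)]                 # the set I(2, r)
f_grid = min(-(a / r) * (b / r) for a, b in grid)         # f_{Delta(2,r)}
f_low = min(-a * b for a, b in grid) / (r * (r - 1))      # f_min^{(r-2)} via (1.3)

assert abs(f_grid - (-1 / 4)) < 1e-12                     # = underline(f) for even r
# Square-free relation: f_min^{(r-d)} = (r^d / r^underline{d}) * f_{Delta(n,r)}.
assert abs(f_low - f_grid * r ** 2 / (r * (r - 1))) < 1e-12
# The bound (4.1) holds with equality: both sides equal 1/12 here.
gap = f_grid - f_low
bound = (r ** 2 / (r * (r - 1)) - 1) * (0 - (-1 / 4))     # f_bar = 0, f_under = -1/4
assert abs(gap - bound) < 1e-12
```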

5 The General Case

Now, we consider an arbitrary polynomial \(f=\sum _{\beta \in I(n,d)}f_{\beta }x^{\beta }\in {\mathcal{H}}_{n,d}\). We need the following notation to formulate our result. Consider the univariate polynomial \(t^d-t^{\underline{d}}\) (in the variable t), which can be written as

$$ t^d-t^{\underline{d}}=\sum _{k=1}^{d-1}(-1)^{d-k-1}a_{d-k}t^{k},$$
(5.1)

for some positive scalars \(a_1,a_2,\cdots ,a_{d-1}\). Moreover, one can easily check that

$$ \sum _{k=1}^{d-1}a_{d-k}t^{k}=(t+d-1)^{\underline{d}}-t^d. $$
(5.2)
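The scalars \(a_{d-k}\) are easy to compute from (5.2) by expanding \((t+d-1)^{\underline{d}}-t^d\); the sketch below (helper names ours) does so, recovering for \(d=4\) the coefficients 6, 11, 6 that reappear in the quartic bound of Example 5.4.

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists (index = power of t)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def a_coeffs(d):
    """The scalars a_{d-k} of (5.1)-(5.2), read off from (t+d-1)^underline{d} - t^d.

    Returns [a_{d-1}, ..., a_1], i.e. the coefficients of t^1, ..., t^{d-1}."""
    p = [1]
    for j in range(d):
        p = poly_mul(p, [d - 1 - j, 1])   # multiply by the factor (t + d - 1 - j)
    p[d] -= 1                             # subtract t^d
    return p[1:d]

# For d = 4: (t+3)(t+2)(t+1)t - t^4 = 6t^3 + 11t^2 + 6t, so a_3 = 6, a_2 = 11, a_1 = 6.
assert a_coeffs(4) == [6, 11, 6]
# For d = 2: (t+1)t - t^2 = t, so a_1 = 1, matching t^2 - t^underline{2} = t in (5.1).
assert a_coeffs(2) == [1]
```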

We can show the following error bound for the range \(f_{\Delta (n,r)}-f_{\min }^{(r-d)}\).

Theorem 5.1

For any polynomial \(f\in {\mathcal{H}}_{n,d}\) and \(r\geqslant d\), one has

$$ {f_{\Delta (n,r)}}-{f_{\min }^{(r-d)}}\leqslant {(r+d-1)^{\underline{d}}-r^d \over r^{\underline{d}}}\binom{2d-1}{d}d^d (\overline{f}-\underline{f}). $$
(5.3)

Note that when \(f\) is quadratic, cubic or square-free, we have shown better upper bounds in Theorems 2.1, 3.1 and 4.1.

In the proof, we will need the following Vandermonde-Chu identity (see [13] for a proof, or alternatively use induction on \(d\geqslant 1\)):

$$\left(\sum _{i=1}^n x_i\right)^{\underline{d}}=\sum _{\alpha \in I(n,d)}{d!\over \alpha !}x^{\underline{\alpha }}\quad \forall x\in {\mathbb{R}}^n, $$
(5.4)

which is an analogue of the multinomial theorem \((\sum _{i=1}^n x_i)^d=\sum _{\alpha \in I(n,d)}{d!\over \alpha !}x^{\alpha }.\) Now we prove Theorem 5.1.
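Before turning to the proof, here is a quick numerical sanity check of (5.4) at a generic real point of our own choosing, with \(n=3\) and \(d=3\):

```python
from math import factorial

def falling(x, k):
    """Falling factorial x^{underline{k}}, defined for real x."""
    out = 1.0
    for j in range(k):
        out *= x - j
    return out

def compositions(n, r):
    """Yield all alpha in N^n with |alpha| = r (the set I(n, r))."""
    if n == 1:
        yield (r,)
        return
    for first in range(r + 1):
        for rest in compositions(n - 1, r - first):
            yield (first,) + rest

def multinomial(d, alpha):
    out = factorial(d)
    for a in alpha:
        out //= factorial(a)
    return out

# (5.4) is a polynomial identity, so it must hold at any real point x.
x, d = (0.7, -1.3, 2.4), 3
lhs = falling(sum(x), d)
rhs = sum(multinomial(d, a) * falling(x[0], a[0]) * falling(x[1], a[1]) * falling(x[2], a[2])
          for a in compositions(3, d))
assert abs(lhs - rhs) < 1e-9
```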

Proof

(of Theorem 5.1) From (1.3), we have

$${r^{\underline{d}}\over r^d}f_{\min }^{(r-d)}=\min _{\alpha \in I(n,r)}\left\{ \sum _{\beta \in I(n,d)}f_{\beta }{\alpha ^{\beta }\over r^d}-\sum _{\beta \in I(n,d)}f_{\beta }{\alpha ^{\beta }-\alpha ^{\underline{\beta }}\over r^d}\right\}.$$

From this, we obtain the inequality:

$${r^{\underline{d}}\over r^d}f_{\min }^{(r-d)}\geqslant f_{\Delta (n,r)}-\max _{\alpha \in I(n,r)}\sum _{\beta \in I(n,d)}f_{\beta }{\alpha ^{\beta }-\alpha ^{\underline{\beta }}\over r^d}.$$
(5.5)

We now focus on the summation \(\sum _{\beta \in I(n,d)}f_{\beta }(\alpha ^{\beta }-\alpha ^{\underline{\beta }})\).

For any \(\beta \in I(n,d)\) and \(x\in {\mathbb{R}}^n\), we can write the polynomial \(x^{\beta }-x^{\underline{\beta }}\) as

$$x^{\beta }-x^{\underline{\beta }}=\sum _{\gamma :|\gamma |\leqslant d-1}(-1)^{d-|\gamma |-1}c_{\gamma }^{\beta }x^{\gamma },$$
(5.6)

for some nonnegative scalars \(c_{\gamma }^{\beta }\) (which is an analogue of (5.1)). We now claim that for any fixed \(k\in [d-1]\), the following identity holds:

$$\sum _{\gamma \in I(n,k)}\sum _{\beta \in I(n,d)}{d!\over \beta !}(-1)^{d-|\gamma |-1}c_{\gamma}^{\beta}x^{\gamma}=(-1)^{d-k-1}a_{d-k}\left(\sum _{i=1}^nx_i\right)^{k}.$$
(5.7)

For this, observe that the polynomials at both sides of (5.7) are homogeneous of degree k. Hence (5.7) will follow if we can show that the equality holds after summing each side over \(k\in [d-1]\). In other words, it suffices to show the identity:

$$ \sum _{k=1}^{d-1}\sum _{\gamma \in I(n,k)}\sum _{\beta \in I(n,d)}{d!\over \beta !}(-1)^{d-|\gamma |-1}c_{\gamma }^{\beta }x^{\gamma }=\sum _{k=1}^{d-1}(-1)^{d-k-1}a_{d-k}\left(\sum _{i=1}^nx_i\right)^{k}. $$

By the definition of \(a_{d-k}\) in (5.1), the right side of the above equation is equal to \((\sum _{i=1}^nx_i)^{d}-(\sum _{i=1}^nx_i)^{\underline{d}}\). Hence, we only need to show

$$ \sum _{k=1}^{d-1}\sum _{\gamma \in I(n,k)}\sum _{\beta \in I(n,d)}{d!\over \beta !}(-1)^{d-|\gamma |-1}c_{\gamma }^{\beta }x^{\gamma }=\left(\sum _{i=1}^nx_i\right)^{d}-\left(\sum _{i=1}^nx_i\right)^{\underline{d}}. $$
(5.8)

Summing over (5.6), we obtain

$$ \sum _{\beta \in I(n,d)}{d!\over \beta !}\left( x^{\beta }-x^{\underline{\beta }}\right) =\sum _{\beta \in I(n,d)}\sum _{\gamma :|\gamma |\leqslant d-1}{d!\over \beta !}(-1)^{d-|\gamma |-1}c_{\gamma }^{\beta }x^{\gamma }=\sum _{k=1}^{d-1}\sum _{\gamma \in I(n,k)}\sum _{\beta \in I(n,d)}{d!\over \beta !}(-1)^{d-|\gamma |-1}c_{\gamma }^{\beta }x^{\gamma }. $$

We can now conclude the proof of (5.8) (and thus of (5.7)). Indeed, by using the multinomial theorem and the Vandermonde-Chu identity (5.4), we see that the left-most side in the above relation is equal to \((\sum _{i=1}^nx_i)^d-(\sum _{i=1}^nx_i)^{\underline{d}}.\)

We partition \([d-1]\) as \([d-1]=I_{o}\cup I_{e}\), where \(I_{o}:=\{k\in [d-1]: d-k\ {\text {is odd}}\}\) and \(I_{e}:=\{k\in [d-1]: d-k\ {\text {is even}}\}\). Then, from (5.6), the summation \(\sum _{\beta \in I(n,d)}f_{\beta }(\alpha ^{\beta }-\alpha ^{\underline{\beta }})\) becomes

$$\begin{aligned} \sum _{\beta \in I(n,d)}f_{\beta }(\alpha ^{\beta }-\alpha ^{\underline{\beta }}) &=\sum _{\beta \in I(n,d)}f_{\beta }\sum _{\gamma :|\gamma |\leqslant d-1}(-1)^{d-|\gamma |-1}c_{\gamma }^{\beta }\alpha ^{\gamma }\\ &= \sum _{k=1}^{d-1}\sum _{\gamma \in I(n,k)}\sum _{\beta \in I(n,d)}f_{\beta }(-1)^{d-|\gamma |-1}c_{\gamma }^{\beta }\alpha ^{\gamma }\\ &\leqslant \left( \max _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) \sum _{k\in I_{o}}\sum _{\gamma \in I(n,k)}\sum _{\beta \in I(n,d)}{d!\over \beta !}c_{\gamma }^{\beta }\alpha ^{\gamma } -\left( \min _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) \sum _{k\in I_{e}}\sum _{\gamma \in I(n,k)}\sum _{\beta \in I(n,d)}{d!\over \beta !}c_{\gamma }^{\beta }\alpha ^{\gamma }. \end{aligned}$$

By (5.7), we obtain

$$ \sum _{\beta \in I(n,d)}f_{\beta }(\alpha ^{\beta }-\alpha ^{\underline{\beta }})\leqslant \left( \max _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) \sum _{k\in I_{o}}a_{d-k}\left(\sum _{i=1}^n\alpha _i\right)^{k}- \left( \min _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) \sum _{k\in I_{e}}a_{d-k}\left(\sum _{i=1}^n\alpha _i\right)^{k}. $$

Combining with (5.5), we get

$$r^{\underline{d}}f_{\min }^{(r-d)}\geqslant r^df_{\Delta (n,r)}-\left( \max _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) \sum _{k\in I_{o}}a_{d-k}r^{k} +\left( \min _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) \sum _{k\in I_{e}}a_{d-k}r^{k}.$$

That is,

$$ r^{\underline{d}}(f_{\Delta (n,r)}-f_{\min }^{(r-d)}) \leqslant (r^{\underline{d}}-r^d)f_{\Delta (n,r)}+ \left( \max _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) \sum _{k\in I_{o}}a_{d-k}r^{k} -\left( \min _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) \sum _{k\in I_{e}}a_{d-k}r^{k}. $$

Since \(r^{\underline{d}}-r^d=\sum _{k=1}^{d-1}(-1)^{d-k}a_{d-k}r^{k}\), we obtain

$$\begin{aligned} r^{\underline{d}}(f_{\Delta (n,r)}-f_{\min }^{(r-d)})&\leqslant \sum _{k=1}^{d-1}(-1)^{d-k}a_{d-k}r^{k}f_{\Delta (n,r)}+ \left( \max _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) \sum _{k\in I_{o}}a_{d-k}r^{k} - \left( \min _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) \sum _{k\in I_{e}}a_{d-k}r^{k}\\ &= \left( \max _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) \sum _{k\in I_{o}}a_{d-k}r^{k}+f_{\Delta (n,r)}\sum _{k\in I_{e}}a_{d-k}r^{k} -\left( \min _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) \sum _{k\in I_{e}}a_{d-k}r^{k}-f_{\Delta (n,r)}\sum _{k\in I_{o}}a_{d-k}r^{k}. \end{aligned}$$

According to (1.5), one has \(\min _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\le f_{\Delta (n,r)}\leqslant \max _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\). Therefore, we have

$$ r^{\underline{d}}(f_{\Delta (n,r)}-f_{\min }^{(r-d)}) \leqslant \left( \max _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}-\min _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) \sum _{k=1}^{d-1}a_{d-k}r^{k}. $$

That is,

$$ f_{\Delta (n,r)}-f_{\min }^{(r-d)}\leqslant {\sum _{k=1}^{d-1}a_{d-k}r^{k} \over r^{\underline{d}}}\left( \max _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}-\min _{\beta \in I(n,d)}f_{\beta }{\beta !\over d!}\right) . $$

Finally, together with Theorem 1.2 and (5.2), we can conclude the result of Theorem 5.1.\(\square \)

Now, we compare the following theorem by De Klerk et al. [5] with our new result (5.3).

Theorem 5.2

[5, Theorem 1.3] Suppose \(f\in {\mathcal{H}}_{n,d}\) and \(r\geqslant d\). Then

$$\underline{f}-f_{\min }^{(r-d)}\leqslant \left( {r^d\over r^{\underline{d}}}-1\right) \binom{2d-1}{d}d^d(\overline{f}-\underline{f}), $$
(5.9)
$$ f_{\Delta (n,r)}-\underline{f}\leqslant \left( 1-{r^{\underline{d}}\over r^{d}}\right) \binom{2d-1}{d}d^d(\overline{f}-\underline{f}). $$
(5.10)

By adding up (5.9) and (5.10), we obtain

$$ f_{\Delta (n,r)}-f_{\min }^{(r-d)}\leqslant \left( {r^d\over r^{\underline{d}}}-{r^{\underline{d}}\over r^d}\right) \binom{2d-1}{d}d^d(\overline{f}-\underline{f}). $$
(5.11)

Lemma 5.3

When r is large enough, the upper bound (5.3) refines the upper bound (5.11).

Proof

It suffices to show that \({r^d\over r^{\underline{d}}}-{r^{\underline{d}}\over r^d}\) is larger than \({\sum _{k=1}^{d-1}a_{d-k}r^{k} \over r^{\underline{d}}}\) when \(r\) is sufficiently large. Since \({r^d\over r^{\underline{d}}}-{r^{\underline{d}}\over r^d}=\left( r^d-{(r^{\underline{d}})^2\over r^d}\right) /r^{\underline{d}}\), we only need to compare \(r^d-{(r^{\underline{d}})^2\over r^d}\) with \(\sum _{k=1}^{d-1}a_{d-k}r^{k}\). Expanding \(r^d-{(r^{\underline{d}})^2\over r^d}\) in powers of r, one can check that the coefficient of \(r^d\) is \(0\) and the coefficient of \(r^{d-1}\) is \(2a_{1}>0\), while in the sum \({\sum _{k=1}^{d-1}a_{d-k}r^{k}}\) the coefficient of \(r^{d-1}\) is only \(a_{1}>0\). Therefore, when \(r\) is sufficiently large, \(r^d-{(r^{\underline{d}})^2\over r^d}\) is larger than \(\sum _{k=1}^{d-1}a_{d-k}r^{k}\), which concludes the proof. \(\square \)

We illustrate the result in Lemma 5.3 in the case of quartic polynomials.

Example 5.4

Consider a polynomial \(f\in {\mathcal{H}}_{n,4}\) written as

$$f= \sum _{i=1}^nf_ix_i^4+\sum _{i<j}\left( f_{ij}x_i^3x_j+g_{ij}x_i^2x_j^2+h_{ij}x_ix_j^3\right) +\sum _{i<j<k}(f_{ijk}x_i^2x_jx_k+ g_{ijk}x_ix_j^2x_k +h_{ijk}x_ix_jx_k^2) +\sum _{i<j<k<l}f_{ijkl}x_ix_jx_kx_l. $$

In this case, (5.3) reads

$$ f_{\Delta (n,r)}-f_{\min }^{(r-4)}\leqslant {6r^2+11r+6\over (r-1)(r-2)(r-3)}\binom{7}{4}4^4 (\overline{f}-\underline{f}), $$
(5.12)

while (5.11) reads

$$ f_{\Delta (n,r)}-f_{\min }^{(r-4)}\leqslant {12r^2-58r+144-{193\over r}+{132\over r^2}-{36\over r^3}\over (r-1)(r-2)(r-3)}\binom{7}{4}4^4 (\overline{f}-\underline{f}). $$
(5.13)

One can check that (5.12) refines (5.13) when \(r\geqslant 10\).
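This comparison can be checked mechanically; the sketch below (function names ours) evaluates the two coefficient factors appearing in the quartic bounds and confirms the crossover at \(r=10\):

```python
def falling(r, d):
    """Falling factorial r^{underline{d}} = r (r - 1) ... (r - d + 1)."""
    out = 1
    for j in range(d):
        out *= r - j
    return out

d = 4

def new_factor(r):
    """Coefficient factor in (5.3): ((r+d-1)^underline{d} - r^d) / r^underline{d}."""
    return (falling(r + d - 1, d) - r ** d) / falling(r, d)

def old_factor(r):
    """Coefficient factor in (5.11): r^d / r^underline{d} - r^underline{d} / r^d."""
    return r ** d / falling(r, d) - falling(r, d) / r ** d

assert new_factor(9) > old_factor(9)                               # not yet refined at r = 9
assert all(new_factor(r) < old_factor(r) for r in range(10, 200))  # refined from r = 10 on
```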

Remark 5.5

We now consider the convergence rate of the sequence

$$ \alpha _r:={f_{\Delta (n,r)}-f_{\min }^{(r-d)}\over \overline{f}-\underline{f}},\ \ \ \ r=d,d+1,\cdots $$

Suppose the degree d of f is fixed. By (5.3), we have \(\alpha _r=O({1\over r})\). Since Example 4.2 shows that \(\alpha _r=\Omega ({1\over r})\) can hold, we conclude that the dependence of \(\alpha _r\) on r in (5.3) is tight, in the sense that there does not exist any \(\varepsilon >0\) such that \(\alpha _r=O({1\over r^{1+\varepsilon }})\).

In [7], De Klerk et al. consider the convergence rate of the sequence

$$ \beta _r:={{f_{\Delta (n,r)}-\underline{f}}\over \overline{f}-\underline{f}},\ \ \ \ r=1,2,\cdots $$

They consider several examples, all of which satisfy \(\beta _r=O({1\over r^2})\); however, determining the asymptotic convergence rate of \(\beta _r\) in general remains an open question.