Skip to main content

Sum-of-squares hierarchies for binary polynomial optimization


We consider the sum-of-squares hierarchy of approximations for the problem of minimizing a polynomial f over the boolean hypercube \({{\mathbb {B}}^{n}=\{0,1\}^n}\). This hierarchy provides for each integer \(r \in {\mathbb {N}}\) a lower bound \(\smash {f_{({r})}}\) on the minimum \(f_{\min }\) of f, given by the largest scalar \(\lambda \) for which the polynomial \(f - \lambda \) is a sum-of-squares on \({\mathbb {B}}^{n}\) with degree at most 2r. We analyze the quality of these bounds by estimating the worst-case error \(f_{\min }- \smash {f_{({r})}}\) in terms of the least roots of the Krawtchouk polynomials. As a consequence, for fixed \(t \in [0, 1/2]\), we can show that this worst-case error in the regime \(r \approx t \cdot n\) is of the order \(1/2 - \sqrt{t(1-t)}\) as n tends to \(\infty \). Our proof combines classical Fourier analysis on \({\mathbb {B}}^{n}\) with the polynomial kernel technique and existing results on the extremal roots of Krawtchouk polynomials. This link to roots of orthogonal polynomials relies on a connection between the hierarchy of lower bounds \(\smash {f_{({r})}}\) and another hierarchy of upper bounds \(\smash {f^{({r})}}\), for which we are also able to establish the same error analysis. Our analysis extends to the minimization of a polynomial over the q-ary cube \(\mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}\). Furthermore, our results apply to the setting of matrix-valued polynomials.

This is a preview of subscription content, access via your institution.


  1. Technically, the program (31) allows for the density to be a sum of squares, whereas the program (35) requires the density to be an actual square. This is no true restriction, though, since, as a straightforward convexity argument shows, the optimum solution to (31) can in fact always be chosen to be a square [20].

  2. Note that as p is assumed to be real-valued, the coefficients \(\lambda _i\) must be real. Indeed, for each \(a \in \mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}\), we have \(\langle p, \chi _a \rangle _{\mu } = \lambda _{|a|} \Vert \chi _a\Vert ^2 = \lambda _{|a|} \Vert \chi ^{-1}_{a}\Vert ^2 = \langle p, \overline{\chi _a} \rangle _{\mu } = \overline{\langle p, \chi _a \rangle _{\mu }}\).


  1. Alon, N., Naor, A.: Approximating the cut-norm via Grothendieck’s inequality. In: 36th Annual ACM Symposium on Theory of Computing, pp. 72–80 (2004)

  2. Arora, S., Berger, E., Hazan, E., Kindler, G., Safra, M.: On non-approximability for quadratic programs. In: Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science, pp. 206–215 (2005)

  3. Balas, E., Ceria, S., Cornuéjols, G.: A lift-and-project cutting plane algorithm for mixed 0–1 programs. Math. Program. 58, 295–324 (1993)

    Article  MathSciNet  MATH  Google Scholar 

  4. Bansal, N., Blum, A., Chawla, S.: Correlation clustering. Mach. Learn. 46(1–3), 89–113 (2004)

    Article  MathSciNet  MATH  Google Scholar 

  5. Barak, B., Steurer, D.: Sum-of-squares proofs and the quest toward optimal algorithms. In Proceedings of International Congress of Mathematicians (ICM) (2014)

  6. Charikar, M., Wirth, A.: Maximizing quadratic programs: Extending Grothendieck’s inequality. In: Proceedings of the 45th Annual IEEE Symposium on Foundations of Computer Science, pp. 54–60 (2004)

  7. de Klerk, E., Laurent, M.: A survey of semidefinite programming approaches to the generalized problem of moments and their error analysis. In: Araujo C., Benkart G., Praeger C., Tanbay B. (eds), World Women in Mathematics 2018. Association for Women in Mathematics Series, vol. 20, pp. 17–56. Springer, Cham (2019)

  8. de Klerk, E., Laurent, M.: Convergence analysis of a Lasserre hierarchy of upper bounds for polynomial minimization on the sphere. Math. Program. (2020).

    Article  MATH  Google Scholar 

  9. de Klerk, E., Laurent, M.: Worst-case examples for Lasserre’s measure based hierarchy for polynomial optimization on the hypercube. Math. Oper. Res. 45(1), 86–98 (2020)

  10. Doherty, A.C., Wehner, S.: Convergence of SDP hierarchies for polynomial optimization on the hypersphere (2012). arXiv:1210.5048

  11. Fang, K., Fawzi, H.: The sum-of-squares hierarchy on the sphere, and applications in quantum information theory. Math. Program. (2020).

    Article  MATH  Google Scholar 

  12. Fawzi, H., Saunderson, J., Parrilo, P.A.: Sparse sums of squares on finite abelian groups and improved semidefinite lifts. Math. Program. 160(1–2), 149–191 (2016)

    Article  MathSciNet  MATH  Google Scholar 

  13. Goemans, M., Williamson, D.: Improved approximation algorithms for maximum cut and satisfiability problems using semidefinite programming. J. Assoc. Comput. Mach. 42(6), 1115–1145 (1995)

    Article  MathSciNet  MATH  Google Scholar 

  14. Karlin, A.R., Mathieu, C., Nguyen, C.T.: Integrality gaps of linear and semi-definite programming relaxations for knapsack. In: Günlük, O., Woeginger, G.J. (eds.) Integer Programming and Combinatoral Optimization, pp. 301–314. Springer, Berlin, Heidelberg (2011)

  15. Kurpisz, A., Leppänen, S., Mastrolilli, M.: Tight Sum-of-Squares lower bounds for binary polynomial optimization problems. In: Chatzigiannakis, I., et al. (eds) 43rd International Colloquium on Automata, Languages, and Programming (ICALP 2016) vol. 78, pp. 1–14 (2016)

  16. Lasserre, J.B.: Global optimization with polynomials and the problem of moments. SIAM J. Optim. 11(3), 796–817 (2001)

    Article  MathSciNet  MATH  Google Scholar 

  17. Lasserre, J.B.: A max-cut formulation of \(0/1\) programs. Oper. Res. Lett. 44, 158–164 (2016)

    Article  MathSciNet  MATH  Google Scholar 

  18. Lasserre, J.B.: An explicit exact sdp relaxation for nonlinear 0–1 programs. In: Aardal, K., Gerards, B. (eds.) Integer Programming and Combinatorial Optimization, pp. 293–303, Springer, Berlin, Heidelberg (2001)

  19. Lasserre, J.B.: Moments. Positive Polynomials and Their Applications. Imperial College Press, London (2009)

    Book  Google Scholar 

  20. Lasserre, J.B.: A new look at nonnegativity on closed sets and polynomial optimization. SIAM J. Optim. 21(3), 864–885 (2010)

    Article  MathSciNet  MATH  Google Scholar 

  21. Laurent, M.: A comparison of the Sherali–Adams, Lovász-Schrijver and Lasserre relaxations for 0-1 programming. Math. Oper. Res., 28(3):470–496 (2003)

  22. Laurent, M.: Lower bound for the number of iterations in semidefinite hierarchies for the cut polytope. Math. Oper. Res. 28(4), 871–883 (2003)

    Article  MathSciNet  MATH  Google Scholar 

  23. Laurent, M.: Semidefinite representations for finite varieties. Math. Program. 109, 1–26 (2007)

    Article  MathSciNet  MATH  Google Scholar 

  24. Laurent, M.: Sums of squares, moment matrices and optimization over polynomials. In: Putinar, M., Sullivant, S. (eds.) Emerging Applications of Algebraic Geometry, Vol. 149 of IMA Volumes in Mathematics and its Applications, pp. 157–270. Springer (2009)

  25. Lee, J.R., Raghavendra, P., Steurer, D.: Lower bounds on the size of semidefinite programming relaxations. In: STOC ’15: Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, pp. 567–576 (2015)

  26. Levenshtein, V.I.: Universal bounds for codes and designs. In: Handbook of Coding Theory, vol. 9, pp. 499–648. North-Holland, Amsterdam (1998)

  27. Lovász, L., Schrijver, A.: Cones of matrices and set-functions and 0–1 optimization. SIAM J. Optim. 1, 166–190 (1991)

    Article  MathSciNet  MATH  Google Scholar 

  28. Macwilliams, F., Sloane, N.: The Theory of Error Correcting Codes, volume 16 of North-Holland Mathematical Library. Elsevier (1983)

  29. Natanson, I.P.: Constructive Function Theory, Vol. I Uniform Approximation (1964)

  30. Nie, J., Schweighofer, M.: On the complexity of Putinar’s positivstellensatz. J. Complex. 23(1), 135–150 (2007)

  31. O’Donnell, R.: SOS is not obviously automatizable, even approximately. In: 8th Innovations in Theoretical Computer Science Conference vol. 59, pp. 1–10 (2017)

  32. Parrilo, P.A.: Structured semidefinite programs and semialgebraic geometry methods in robustness and optimization. Ph.D thesis, California Institure of Technology (2000)

  33. Raghavendra, P., Weitz, B.: On the bit complexity of sum-of-squares proofs. In: 44th International Colloquium on Automata, Languages, and Programming, vol. 80, pp. 1–13 (2017)

  34. Reznick, B.: Uniform denominators in Hilbert’s seventeenth problem. Math. Z. 220(1), 75–97 (1995)

  35. Rothvoss, T.: The Lasserre hierarchy in approximation algorithms. Lecture Notes for the MAPSP 2013 Tutorial (2013)

  36. Sakaue, S., Takeda, A., Kim, S., Ito, N.: Exact semidefinite programming relaxations with truncated moment matrix for binary polynomial optimization problems. SIAM J. Optim. 27(1), 565–582 (2017)

    Article  MathSciNet  MATH  Google Scholar 

  37. Scherer, C.W., Hol, C.W.J.: Matrix sum-of-squares relaxations for robust semi-definite programs. Math. Program. 107, 189–211 (2006)

    Article  MathSciNet  MATH  Google Scholar 

  38. Schweighofer, M.: On the complexity of Schmüdgen’s positivstellensatz. J. Complex. 20(4), 529–543 (2004)

  39. Sherali, H.D., Adams, W.P.: A hierarchy of relaxations between the continuous and convex hull representations for zero-one programming problems. SIAM J. Discret. Math. 3, 411–430 (1990)

    Article  MathSciNet  MATH  Google Scholar 

  40. Slot, L., Laurent, M.: Improved convergence analysis of Lasserre’s measure-based upper bounds for polynomial minimization on compact sets. Math. Program. (2020).

  41. Szegö, G.: Orthogonal Polynomials. vol. 23 in American Mathematical Society colloquium publications. American Mathematical Society (1959)

  42. Slot, L., Laurent, M.: Near-optimal analysis of of Lasserre’s univariate measure-based bounds for multivariate polynomial optimization. Math. Program. (2020).

  43. Terras, A.: Fourier Analysis on Finite Groups and Applications. London Mathematical Society Student Texts, Cambridge University Press, Cambridge (1999)

    Book  MATH  Google Scholar 

  44. Tunçel, L.: Polyhedral and Semidefinite Programming Methods in Combinatorial Optimization. Fields Institute Monograph (2010)

  45. Vallentin, F.: Semidefinite programs and harmonic analysis. arXiv:0809.2017 (2008)

Download references


We wish to thank Sven Polak and Pepijn Roos Hoefgeest for several useful discussions. We also thank the anonymous referees for their helpful comments and suggestions.

Author information

Authors and Affiliations


Corresponding author

Correspondence to Lucas Slot.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work is supported by the European Union’s Framework Programme for Research and Innovation Horizon 2020 under the Marie Skłodowska-Curie Actions Grant Agreement No. 764759 (MINOA)


The q-ary cube

In this section, we indicate how our results for the boolean cube \({\mathbb {B}}^{n}\) may be extended to the q-ary cube \(\mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n} = \{0, 1, \dots , q-1\}^n\) when \(q > 2\) is a fixed integer. Here \({\mathbb {Z}}/ q{\mathbb {Z}}\) denotes the cyclic group of order q, so that \(\mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}={\mathbb {B}}^{n}\) when \(q=2\). The lower bound \(\smash {f_{({r})}}\) for the minimum of a polynomial f over \(\mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}\) is defined analogously to the case \(q=2\); namely we set

$$\begin{aligned} \smash {f_{({r})}} := \sup _{\lambda \in {\mathbb {R}}} \left\{ f(x) - \lambda \text { is a sum-of-squares of degree at most } 2r \text { on } \mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n} \right\} , \end{aligned}$$

where the condition means that \(f(x) - \lambda \) agrees with a sum of squares \(s \in \varSigma [x]_{2r}\) for all \(x \in \mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}\) or, alternatively, that \(f - \lambda - s\) belongs to the ideal generated by the polynomials \(x_i(x_i -1)\ldots (x_i - q + 1)\) for \(i \in [n]\). Similarly, the upper bound \(\smash {f^{({r})}}\) is defined as in (7) after equipping \(\mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}\) with the uniform measure \(\mu \). The parameters \(\smash {f_{({r})}}\) and \(\smash {f^{({r})}}\) may again be computed by solving a semidefinite program of size polynomial in n for fixed \(r, q \in {\mathbb {N}}\), see [23].

As before d(xy) denotes the Hamming distance and |x| denotes the Hamming weight (number of nonzero components). Note that, for \(x,y\in \mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}\), d(xy) can again be expressed as a polynomial in xy, with degree \(q-1\) in each of x and y.

We will prove Theorem 11 below, which can be seen as an analog of Corollary 1 for \(\mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}\). The general structure of the proof is identical to that of the case \(q=2\). We therefore only give generalizations of arguments as necessary. For reasons that will become clear later, it is most convenient to consider the sum-of-squares bound \(\smash {f_{({r})}}\) on the minimum \(f_{\min }\) of a polynomial f with degree at most \((q-1)d\) over \(\mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}\), where \(d\le n\) is fixed.

Fourier analysis on \(\mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}\) and Krawtchouk polynomials Consider the space

$$\begin{aligned} {\mathcal {R}}[{x}] := {\mathbb {C}}[x] / (x_i(x_i-1) \ldots (x_i-q+1): i\in [n]) \end{aligned}$$

consisting of the polynomials on \(\mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}\) with complex coefficients. We equip \({\mathcal {R}}[{x}]\) with its natural inner product

$$\begin{aligned} \langle f,g\rangle _{\mu }= \int _{\mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}}f(x)\overline{g(x)}d\mu (x)={1\over q^n} \sum _{x \in \mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}} f(x)\overline{g(x)}, \end{aligned}$$

where \(\mu \) is the uniform measure on \(\mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}\). The space \({\mathcal {R}}[{x}]\) has dimension \(|\mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}| = q^n\) over \({\mathbb {C}}\) and it is spanned by the polynomials of degree up to \((q-1)n\). The reason we now need to work with polynomials with complex coefficients is that the characters have complex coefficients when \(q>2\).

Let \(\psi = e^{2\pi i / q}\) be a primitive q-th root of unity. For \(a \in \mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}\), the associated character \(\chi _a \in {\mathcal {R}}[{x}]\) is defined by:

$$\begin{aligned} \chi _a(x) = \psi ^{a \cdot x}\quad (x\in \mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}). \end{aligned}$$

So (14) is indeed the special case of this definition when \(q=2\). The set of all characters \(\{ \chi _a : a \in \mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n} \}\) forms an orthogonal basis for \({\mathcal {R}}[{x}]\) w.r.t. the above inner product \(\langle \cdot , \cdot \rangle _{\mu }\). A character \(\chi _a\) can be written as a polynomial of degree \((q-1) \cdot |a|\) on \(\mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}\), i.e., we have \(\chi _a \in {\mathcal {R}}[{x}]_{(q-1)|a|}\) for all \(a \in \mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}\).

As before, we have the direct sum decomposition into pairwise orthogonal subspaces:

$$\begin{aligned} {\mathcal {R}}[{x}] = H_0 \perp H_1 \perp \dots \perp H_n, \end{aligned}$$

where \(H_i\) is spanned by the set \(\{ \chi _a : |a| = i \}\) and \(H_i\subseteq {\mathbb {R}}[x]_{(q-1)i}\). The components \(H_i\) are invariant and irreducible under the action of \({{\,\mathrm{Aut}\,}}(\mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n})\), which is generated by the coordinate permutations and the action of \(\text {Sym}(q)\) on individual coordinates. Hence any \(p \in {\mathcal {R}}[{x}]\) of degree at most \((q-1)d\) can be (uniquely) decomposed as:

$$\begin{aligned} p = p_0 + p_1 + \dots + p_d \quad (p_i \in H_i). \end{aligned}$$

As before \({\mathrm{St}}(0) \subseteq {{\,\mathrm{Aut}\,}}(\mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n})\) denotes the stabilizer of \(0 \in \mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}\), which is generated by the coordinate permutations and the permutations in \(\text {Sym}(q)\) fixing 0 in \(\{0,1,\ldots ,q-1\}\) at any individual coordinate. We note for later reference that the subspace of \(H_i\) invariant under action of \({\mathrm{St}}(0)\) is of dimension one, and is spanned by the zonal spherical function:

$$\begin{aligned} X_i = \sum _{|a| = i}\chi _a \in H_i. \end{aligned}$$

The Krawtchouk polynomials introduced in Sect. 2 have the following generalization in the q-ary setting:

$$\begin{aligned} {\mathcal {K}}^n_k(t) = {\mathcal {K}}^n_{k, q} (t) := \sum _{i=0}^k (-1)^i (q-1)^{k-i} {t \atopwithdelims ()i} {n-t \atopwithdelims ()k-i}. \end{aligned}$$

Analogously to relation (20), the Krawtchouk polynomials \({\mathcal {K}}^n_k\) (\(0\le k\le n\)) are pairwise orthogonal w.r.t. the discrete measure \(\omega \) on [0 : n] given by:

$$\begin{aligned} \omega (t) = \frac{1}{q^n} \sum _{t=0}^n w(t) \delta _t, \text { with } w(t) := (q-1)^t {n \atopwithdelims ()t}. \end{aligned}$$

To be precise, we have:

$$\begin{aligned} \sum _{t=0}^n {\mathcal {K}}^n_{k}(t) {\mathcal {K}}^n_{k'}(t) (q-1)^t {n \atopwithdelims ()t} = \delta _{k, k'} (q-1)^k {n \atopwithdelims ()k}. \end{aligned}$$

As \({\mathcal {K}}^n_k(0) = (q-1)^k {n \atopwithdelims ()k} = \Vert {\mathcal {K}}^n_k \Vert ^2_{\omega }\), we may normalize \({\mathcal {K}}^n_k\) by setting:

$$\begin{aligned} \widehat{{\mathcal {K}}}^n_k(t) := {\mathcal {K}}^n_k(t) / {\mathcal {K}}^n_k(0) = {\mathcal {K}}^n_k(t) / \Vert {\mathcal {K}}^n_k \Vert ^2_{\omega }, \end{aligned}$$

so that \(\widehat{{\mathcal {K}}}^n_k\) satisfies \(\max _{t=0}^n \widehat{{\mathcal {K}}}^n_k(t) = \widehat{{\mathcal {K}}}^n_k(0) = 1\) (cf. 23).

We have the following connection (cf. 22) between the characters and the Krawtchouk polynomials:

$$\begin{aligned} \sum _{a \in \mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n} : |a| = k} \chi _a(x) = {\mathcal {K}}^n_k(i) \quad \text { for } x\in \mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n} \text { with } |x| = i. \end{aligned}$$

Note that for all \(a, x, y \in \mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}\), we have:

$$\begin{aligned} \chi _a^{-1}(x) = \overline{\chi _a(x)} = \chi _a(-x), \quad \chi _a(x) \chi _a(y) = \chi _a(x + y). \end{aligned}$$

Hence, for any \(x,y\in \mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}\), we also have (cf. 21):

$$\begin{aligned} \sum _{a \in \mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n} : |a| = k} \chi _a(x)\overline{\chi _a(y)}= & {} \sum _{a \in \mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n} : |a| = k} \chi _a(x-y) = {\mathcal {K}}^n_k(i) \quad \text {when } d(x,y)\\= & {} |x-y| = i. \end{aligned}$$

Invariant kernels In analogy to the binary case \(q=2\), for a degree r univariate polynomial \(u\in {\mathbb {R}}[t]_r\) we define the associated polynomial kernel \(K(x,y) := u^2(d(x,y))\) (\(x,y\in \mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}\)) and the associated kernel operator:

$$\begin{aligned} {\mathbf {K}}: p \mapsto {\mathbf {K}}p (x)= & {} \int _{\mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}} \overline{p(y)} K(x,y)d\mu (y) \\= & {} {1\over q^n}\sum _{y \in \mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}} \overline{p(y)}K(x,y) \quad (p \in {\mathcal {R}}[{x}]). \end{aligned}$$

Note that K(xy) is a polynomial on \(\mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}\) with degree \(2r(q-1)\) in each of x and y. Let us decompose the univariate polynomial \(u(t)^2\) in the Krawtchouk basis as

$$\begin{aligned} u(t)^2= \sum _{i=0}^{2r} \lambda _i {\mathcal {K}}^n_i(t). \end{aligned}$$

Then the kernel operator \({\mathbf {K}}\) acts as follows on characters: for \(z \in \mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}\),

$$\begin{aligned} {\mathbf {K}}\chi _z = \lambda _{|z|}\chi _z, \end{aligned}$$

which can be seen by retracing the proof of Theorem 6, and we obtain the Funk–Hecke formula (recall (28)): for any polynomial \(p \in {\mathcal {R}}[{x}]_{(q-1)d}\) with Harmonic decomposition \(p=p_0+\cdots +p_d\),

$$\begin{aligned} {\mathbf {K}}p = \lambda _0 p_0 + \dots + \lambda _d p_d. \end{aligned}$$

Performing the analysis It remains to find a univariate polynomial \(u \in {\mathbb {R}}[t]\) of degree r with \(u^2(t) = \sum _{i=0}^{2r}\lambda _i {\mathcal {K}}^n_i(t)\) for which \(\lambda _0 = 1\) and the other scalars \(\lambda _i\) are close to 1. As before (cf. 29), we have:

$$\begin{aligned} \lambda _i = \langle {\mathcal {K}}^n_i, u^2 \rangle _{\omega } / \Vert {\mathcal {K}}^n_i \Vert _{\omega }^2 = \langle \widehat{{\mathcal {K}}}^n_i, u^2 \rangle _{\omega }. \end{aligned}$$

So we would like to minimize \(\sum _{i=1}^{2r} (1-\lambda _i)\). We are therefore interested in the inner Lasserre hierarchy applied to the minimization of the function \(g(t) = d - \sum _{i = 0}^d \widehat{{\mathcal {K}}}^n_i(t)\) on the set [0 : n] (equipped with the measure \(\omega \) from (47)). We show first that this function g again has a nice linear upper estimator.

Lemma 18

We have:

$$\begin{aligned} \begin{aligned} |\widehat{{\mathcal {K}}}^n_k(t) - \widehat{{\mathcal {K}}}^n_k(t+1)|&\le \frac{2k}{n}, \quad \quad&(t=0,1,\dots ,n-1)\\ |\widehat{{\mathcal {K}}}^n_k(t) - 1|&\le {2k\over n} \cdot t&(t=0,1,\dots ,n) \end{aligned} \end{aligned}$$

for all \(k \le n\).


The proof is almost identical to that of Lemma 4. Let \(t\in [0:n-1]\) and \(0 < k \le d\). Consider the elements \(1^t 0^{n-t}, 1^{t+1} 0^{n-t-1} \in \mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}\) from Lemma 2. Then we have:

$$\begin{aligned} |{\mathcal {K}}^n_k(t) - {\mathcal {K}}^n_k(t+1)|&\overset{(48)}{=} |\sum _{|a| = k} \chi _a(1^t 0^{n-t}) - \chi _a(1^{t+1} 0^{n-t-1})| \\&\le 2\cdot \#\big \{ a \in \mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n} : |a| = k, a_{t+1} \ne 0 \big \}\\&=2\cdot (q-1)^k \cdot {n - 1 \atopwithdelims ()k-1}, \end{aligned}$$

where for the inequality we note that \(\chi _a(1^{t} 0^{n-t}) = \chi _a(1^{t+1} 0^{n-t-1})\) if \(a_{t+1} = 0\). As \({\mathcal {K}}^n_k(0) = (q-1)^k {n \atopwithdelims ()k}\), this implies that:

$$\begin{aligned} |\widehat{{\mathcal {K}}}^n_k(t) - \widehat{{\mathcal {K}}}^n_k(t+1)| \le 2\cdot {n - 1 \atopwithdelims ()k-1} / {n \atopwithdelims ()k} = \frac{2k}{n}. \end{aligned}$$

This shows the first inequality of (49). The second inequality follows using the triangle inequality, a telescope summation argument and the fact that \(\widehat{{\mathcal {K}}}^n_k(0)=1\). \(\square \)

From Lemma 18 we obtain that the function \(g(t)= d - \sum _{i = 0}^d \widehat{{\mathcal {K}}}^n_i(t) \) admits the following linear upper estimator: \(g(t) \le d(d+1) \cdot (t/n)\) for \(t \in [0:n]\). Now the same arguments as used for the case \(q=2\) enable us to conclude:

$$\begin{aligned} \smash {f^{({(q-1)r})}}-f_{\min }\le C_d \cdot \xi _{r+1,q}^n / n \end{aligned}$$

and, when \(d(d+1) \xi _{r+1,q}^n / n \le 1/2\),

$$\begin{aligned} f_{\min }- \smash {f_{({(q-1)r})}} \le 2 C_d \cdot \xi _{r+1,q}^n / n. \end{aligned}$$

Here \(C_d\) is a constant depending only on d and \(\xi _{r+1,q}^n\) is the least root of the Krawtchouk polynomial \({\mathcal {K}}^n_{r+1, q}\). Note that as the kernel \(K(x,y) = u^2(d(x,y))\) is of degree \(2(q-1)r\) in x (and y), we are only able to analyze the corresponding levels \((q-1)r\) of the hierarchies. We come back below to the question on how to show the existence of the above constant \(C_d\).

But first we finish the analysis. Having shown analogs of Theorem 1 and Theorem 3 in this setting, it remains to state the following more general version of Theorem 2, giving information about the smallest roots of the q-ary Krawtchouk polynomials.

Theorem 10

([26], Section 5) Fix \(t\in [0,{q-1\over q}]\). Then the smallest roots \(\xi ^n_{r,q}\) of the q-ary Krawtchouk polynomials \({\mathcal {K}}^n_{r, q}\) satisfy:

$$\begin{aligned} \lim _{r/n \rightarrow t} \xi _{r,q}^n / n = \varphi _q(t) := \frac{q-1}{q} - \left( \frac{q-2}{q}\cdot t + \frac{2}{q} \sqrt{(q-1)t(1-t)}\right) . \end{aligned}$$

Here the above limit means that, for any sequences \((n_j)_j\) and \((r_j)_j\) of integers such that \(\lim _{j\rightarrow \infty } n_j=\infty \) and \(\lim _{j\rightarrow \infty }r_j/n_j=t\), we have \(\lim _{j\rightarrow \infty } \xi _{r_j,q}^{n_j}/n_j =\varphi _q(t)\).

Note that for \(q=2\) we have \(\varphi _q(t) ={1\over 2} -\sqrt{t(1-t)}\), which is the function \(\varphi (t)\) from (5). To avoid technical details we only quote in Theorem 10 the asymptotic analog of Theorem 2 (and not the exact bound on the root \(\xi ^n_{r,q}\) for any n). Therefore we have shown the following q-analog of Corollary 1.

Theorem 11

Fix \(d\le n\) and for \(n,r\in {\mathbb {N}}\) write

$$\begin{aligned} E_{(r)}(n)&:= \sup _{f \in {\mathbb {R}}[x]_{(q-1)d}} \big \{f_{\min }- \smash {f_{({r})}} : \Vert f\Vert _\infty = 1 \big \}, \\ E^{(r)}(n)&:= \sup _{f \in {\mathbb {R}}[x]_{(q-1)d}} \big \{ \smash {f^{({r})}} - f_{\min }: \Vert f\Vert _\infty = 1 \big \}. \end{aligned}$$

There exists a constant \(C_d>0\) (depending also on q) such that, for any \(t\in [0,{q-1\over q}]\), we have:

$$\begin{aligned} \lim _{r/n\rightarrow t} E^{((q-1)r)}(n) \le C_d\cdot \varphi _q(t) \end{aligned}$$

and, if \(d(d+1)\cdot \varphi _q(t) \le 1/2\), then we also have:

$$\begin{aligned} \lim _{r/n\rightarrow t} E_{((q-1)r)}(n) \le 2\cdot C_d\cdot \varphi _q(t). \end{aligned}$$

Here \(\varphi _q(t)\) is the function defined in (50). Recall that the limit notation \(r/n\rightarrow t\) means that the claimed convergence holds for any sequences \((n_j)_j\) and \((r_j)_j\) of integers such that \(\lim _{j\rightarrow \infty } n_j=\infty \) and \(\lim _{j\rightarrow \infty }r_j/n_j=t\).

For reference, the function \(\varphi _q(t)\) is shown for several values of q in Fig. 1.

Fig. 1
figure 1

The function \(\varphi _q(t)\) for several values of q. Note that the case \(q=2\) corresponds to the function \(\varphi (t)\) of (5)

A generalization of Lemma 5 The arguments above omit a generalization of Lemma 5, which is instrumental to show the existence of the constant \(C_d\) claimed above. In other words, we still need to show that if \(p : \mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n} \rightarrow {\mathbb {R}}\) is a polynomial of degree \((q-1)d\) on \(\mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n}\) with harmonic decomposition \(p=p_0+\ldots +p_d\), there then exists a constant \(\gamma _d > 0\) (independent of n) such that:

$$\begin{aligned} \Vert p_i\Vert _{\infty } \le \gamma _d \Vert p \Vert _{\infty } \text { for all } 0\le i \le d. \end{aligned}$$

Then, as in the binary case, we may set \(C_d=d(d+1)\gamma _d\). The proof given in Sect. 6 for the case \(q=2\) applies almost directly to the general case, and we only generalize certain steps as required. So consider again the parameters:

$$\begin{aligned} \rho (n, d, k)&:= \sup \{\Vert p_k\Vert _\infty : p = p_0 + p_1 + \dots + p_d\in {\mathbb {R}}[x]_{(q-1)d}, \Vert p\Vert _{\infty } \le 1 \}, \text { and }\\ \rho (n, d)&:= \max _{0 \le k \le d} \rho (n, d, k). \end{aligned}$$

Lemmas 12 and 13, which show that the optimum solution p to \(\rho (n, d, k)\) may be assumed to be invariant under \({\mathrm{St}}(0) \subseteq {{\,\mathrm{Aut}\,}}(\mathbb ({\mathbb {Z}}/ q{\mathbb {Z}})^{n})\), clearly apply to the case \(q > 2\) as well. That is to say, we may assume p is of the form:Footnote 2

$$\begin{aligned} p(x) = \sum _{i=0}^d \lambda _i X_i(x) \quad (\lambda _i \in {\mathbb {R}}) \end{aligned}$$

where \(X_i = \sum _{|a| = i}\chi _a \in H_i\) is the zonal spherical function of degree \((q-1)i\) (cf. 46 and 18). Using (48), we obtain a reformulation of \(\rho (n, d, k)\) as an LP (cf. 41):

$$\begin{aligned} \begin{aligned}&\rho (n, d, k) = \max \quad \lambda _k \\&s.t. \quad -1 \le \sum _{i=0}^d \lambda _i \widehat{{\mathcal {K}}}^n_{i, q}(t) \le 1 \quad (t=0,1, \dots , n). \end{aligned} \end{aligned}$$

For \(k \in {\mathbb {N}}\), let \(\widehat{{\mathcal {K}}}^\infty _k(t) := \lim _{n \rightarrow \infty } \widehat{{\mathcal {K}}}^n_k(nt) = \Big (1 - \frac{q}{q-1}t\Big )^k\) and consider the program (cf. 43):

$$\begin{aligned} \begin{aligned}&\rho (\infty , d, k) := \max \quad \lambda _k \\&s.t. \quad -1 \le \sum _{i=0}^d \lambda _i \widehat{{\mathcal {K}}}^\infty _i(t) \le 1 \quad (t \in [0, 1]). \end{aligned} \end{aligned}$$

As before, we have \(\rho (n, d, k) \le \rho (\infty , d, k)\), noting that (the proofs of) Lemmas 16 and 17 may be applied directly to the case \(q>2\). From there, it suffices to show \(\rho (\infty , d, k) < \infty \), which can be argued in an analogous way to the case \(q=2\).

Matrix-valued polynomials

In this section, we show how the arguments used for the proofs of our main results in Theorems 1 and 3 may be applied in the setting of matrix-valued polynomials, thereby proving Theorems 4 and 5.

Recall that \({\mathcal {S}}^k\) is the space of \(k\times k\) real symmetric matrices and \({\mathcal {S}}^k[x] \subseteq {\mathbb {R}}^{k \times k}[x]={\mathbb {R}}[x]^{k\times k}\) is the space of n-variate polynomials whose coefficients lie in \({\mathcal {S}}^k\). Given a polynomial matrix \(F\in {\mathcal {S}}^k[x]\) we consider the matrix-valued polynomial optimization problem:

$$\begin{aligned} F_{\min } := \min _{x \in {\mathbb {B}}^{n}} \lambda _{\min }(F(x)), \end{aligned}$$

for which we have the outer Lasserre hierarchy:

$$\begin{aligned} F_{(r)} := \sup _{\lambda \in {\mathbb {R}}} \left\{ F(x) - \lambda \cdot I = S(x) \text { on } {\mathbb {B}}^{n} \text { for some } S \in \varSigma ^{k \times k}_r \right\} , \end{aligned}$$

and the inner Lasserre hierarchy:

$$\begin{aligned} F^{(r)} := \inf _{S \in \varSigma ^{k \times k}_r} \left\{ \int _{{\mathbb {B}}^{n}} \mathrm{Tr} \big ( F(x) S(x)\big ) d\mu (x) : \int _{{\mathbb {B}}^{n}} \mathrm{Tr} \big (S(x)\big ) d\mu (x) = 1\right\} . \end{aligned}$$

Here, the set \(\varSigma ^{k \times k}_r\) consists of all sum-of-squares polynomial matrices \(S \in {\mathcal {S}}^k[x]\), of the form:

$$\begin{aligned} S(x) = \sum _{i} U_i(x) U_i(x)^\top \quad (U_i \in {\mathbb {R}}^{k \times m}[x], ~~\mathrm{\deg } ~U_i \le r,~~m\in {\mathbb {N}}). \end{aligned}$$

The outer hierarchy (proof of Theorem4) We generalize the outline of Sect. 1.5 to the matrix-valued setting. Let \(F\in {\mathcal {S}}^k[x]\) be the polynomial matrix of degree d to be optimized, and assume w.l.o.g. that \(0 \le \Vert F\Vert _\infty \le 1\). Here, and throughout this section, \(\Vert F\Vert _\infty := \max _{x \in {\mathbb {B}}^{n}} \Vert F(x) \Vert \) is the largest absolute value of an eigenvalue of F(x) over \({\mathbb {B}}^{n}\). A kernel \(K\) of the form \(K(x, y) = u^2(d(x,y))\) with \(u \in {\mathbb {R}}[t]_r\) (cf. 8) induces a linear operator \({\mathbf {K}}\) on \({\mathcal {S}}^k[x]\) by:

$$\begin{aligned} {\mathbf {K}}P (x) := \int _{{\mathbb {B}}^{n}} P(y) K(x,y) d \mu (y) = \frac{1}{2^n} \sum _{y \in {\mathbb {B}}^{n}} P(y) K(x, y) \quad (P \in {\mathcal {S}}^k[x]). \end{aligned}$$

If \(P(x) \succeq 0\) for all \(x \in {\mathbb {B}}^{n}\), then the polynomial \({\mathbf {K}}P\) is a sum-of-squares polynomial matrix of degree at most 2r on \({\mathbb {B}}^{n}\). Indeed, we then have:

$$\begin{aligned} {\mathbf {K}}P(x) = \frac{1}{2^n} \sum _{y \in {\mathbb {B}}^{n}} U_y(x) U_y(x)^\top , \quad \text { where } U_y(x) = u(d(x,y)) \sqrt{P(y)}. \end{aligned}$$

Given \(\delta > 0\) to be determined later, set \({\tilde{F}} = F + \delta I\). Assuming that \({\mathbf {K}}\) is non-singular, we can write \(F = {\mathbf {K}}({\mathbf {K}}^{-1} {\tilde{F}})\). Therefore, assuming that \({\mathbf {K}}^{-1} {\tilde{F}}\) is positive semidefinite over \({\mathbb {B}}^{n}\), we find that \(F + \delta I\) is a sum-of-squares polynomial matrix of degree 2r on \({\mathbb {B}}^{n}\), and thus that \(F_{\min } - F_{(r)} \le \delta \).

To guarantee positive semidefiniteness of \({\tilde{F}}\), it suffices to ensure that (cf. 9):

$$\begin{aligned} \Vert {\mathbf {K}}^{-1} - I\Vert := \sup _{P \in {\mathcal {S}}^k[x]_d} \frac{\Vert {\mathbf {K}}^{-1} P - P \Vert _\infty }{\Vert P\Vert _\infty } \le \delta . \end{aligned}$$

Indeed, as the smallest eigenvalue of \({\tilde{F}}(x)\) is at least \(\delta \) for each \(x \in {\mathbb {B}}^{n}\), the smallest eigenvalue of \({\mathbf {K}}^{-1}F(x)\) must then be at least zero.

As in the case of scalar-valued polynomials, the eigenvalues of \({\mathbf {K}}\) are given by the coefficients \(\lambda _i\) in the expansion \({u^2(t) = \sum _{i = 0}^{2r} \lambda _i {\mathcal {K}}^n_i(t)}\). Indeed, if \(P \in {\mathcal {S}}^k[x]\) is a polynomial matrix of degree d then we may decompose it into harmonic components entry-wise to obtain \({P(x) = \sum _{i=0}^d P_i(x)}\) and (cf. Theorem 6):

$$\begin{aligned} {\mathbf {K}}P(x) = \sum _{i=0}^d \lambda _i P_i(x). \end{aligned}$$

It remains to express the quantity \(\Vert {\mathbf {K}}^{-1} - I\Vert \) in terms of the eigenvalues \(\lambda _i\) of \({\mathbf {K}}\), after which the proof proceeds as in the case of scalar polynomials. For this, note that:

$$\begin{aligned} \Vert {\mathbf {K}}^{-1} P - P\Vert _\infty= & {} \Vert \sum _{i=1}^d (\lambda _i^{-1} - 1)P_i \Vert _\infty \le \sum _{i=1}^d |\lambda _i^{-1} - 1| \Vert P_i\Vert _\infty \\\le & {} \sum _{i=1}^d |\lambda _i^{-1} - 1| \cdot \gamma _d \Vert P \Vert _\infty . \end{aligned}$$

where \(\gamma _d\) is the constant of Lemma 5 (cf. 34). The last inequality relies on the following generalization of Lemma 5, whose proof here is essentially as given in [11].

Lemma 19

Let \(P(x) = \sum _{i=0}^d P_i(x)\) be a polynomial matrix of degree d, decomposed into harmonic components. If \(\gamma _d\) is the constant of Lemma 5, we then have:

$$\begin{aligned} \Vert P_i\Vert _{\infty } \le \gamma _d \Vert P\Vert _\infty \quad \text { for all } i \le d. \end{aligned}$$


For any matrix \(M \in {\mathcal {S}}^k\), its spectral norm is \(\Vert M \Vert _{} = \max _{y \in {\mathbb {R}}^k} \{ |y^\top M y| : \Vert y\Vert = 1\}\). Therefore, we have:

$$\begin{aligned} \Vert P\Vert _\infty = \max _{x \in {\mathbb {B}}^{n}} \max _{\Vert y\Vert = 1} |y^\top P(x) y| \quad \text { and } \quad \Vert P_i\Vert _\infty = \max _{x \in {\mathbb {B}}^{n}} \max _{\Vert y\Vert = 1} |y^\top P_i(x) y|. \end{aligned}$$

For fixed y, the function \(p^y : x \mapsto y^\top P(x) y\) is a (scalar) polynomial on \({\mathbb {B}}^{n}\) of degree d, whose harmonic components are given by \(p^y_i : x \mapsto y^\top P_i(x) y\). Therefore, we may invoke Lemma 5 to bound:

$$\begin{aligned} \max _{x \in {\mathbb {B}}^{n}} |p_i^y(x)| \le \gamma _d \max _{x \in {\mathbb {B}}^{n}} |p^y(x)| ~ \text {for all } \Vert y\Vert = 1, \end{aligned}$$

and conclude that \(\Vert P_i\Vert _\infty \le \gamma _d \Vert P\Vert _\infty \). \(\square \)

The inner hierarchy (proof of Theorem5) We generalize the arguments of Sect. 5 to the matrix-valued setting. Let F again be the polynomial matrix of degree d to be optimized, and assume w.l.o.g. that \(0 \le \Vert F\Vert _\infty \le 1\) and that the minimum in the optimization problem (53) is attained at 0, i.e., that \(F_{\min } = \lambda _{\min } \big (F(0)\big )\).

As in the scalar case, we work to reduce problem (55) to a (now matrix-valued) instance of the inner hierachy in one variable. Note first that \(F^{(r)} \le F_\mathrm{sym}^{(r)}\) for each \(r \in {\mathbb {N}}\), where \(F_\mathrm{sym}^{(r)}\) is obtained by restricting the optimization in (55) to polynomial matrices S(x) of the form \(S(x) = U(|x|)\). Writing \({\widehat{F}}\) for the univariate polynomial matrix satisfying

$$\begin{aligned} {\widehat{F}}(|x|) = \frac{1}{|{\mathrm{St}}(0)|}\sum _{\sigma \in {\mathrm{St}}(0)} F \circ \sigma (x) \quad (x \in {\mathbb {B}}^{n}), \end{aligned}$$

we find (cf. 38):

$$\begin{aligned} F_\mathrm{sym}^{(r)} = \min _{U \in \varSigma ^{k \times k}_r[t]} \Big \{ \int _{[0:n]} \mathrm{Tr}\big (\widehat{F}(t)U(t)\big ) d\omega (t) : \int _{[0:n]} \mathrm{Tr}\big (U(t)\big ) d \omega (t) = 1 \Big \}. \end{aligned}$$

It remains to analyze the program (56). We first give a linear upper estimator for \({\widehat{F}}\) (cf. Lemma 9).

Lemma 20

For all \(t \in [0:n]\), we have:

$$\begin{aligned} {\widehat{F}}(t) \preceq {\widehat{G}}(t) := d(d+1) \cdot \gamma _d \cdot t/n \cdot I + C, \end{aligned}$$

where \(C={\widehat{F}}(0)\) is a constant matrix with \(\lambda _{\min }(C) = 0\).


We may write \({\widehat{F}}(t) = \sum _{i=0}^d \varLambda _i \widehat{{\mathcal {K}}}^n_i(t)\) for certain \(\varLambda _i \in {\mathcal {S}}^k\). We then have:

$$\begin{aligned} \sum _{i=0}^d \varLambda _i \widehat{{\mathcal {K}}}^n_i(t)&= \sum _{i=0}^d \varLambda _i \big (\widehat{{\mathcal {K}}}^n_i(t) - 1 \big ) + \sum _{i=0}^d \varLambda _i \\&\preceq \max _{i=0}^d \Vert \varLambda _i\Vert _\infty \cdot I \cdot \sum _{i=0}^d |1 - \widehat{{\mathcal {K}}}^n_i(t)| + \sum _{i=0}^d \varLambda _i \\&\preceq d(d+1) \cdot \gamma _d \cdot t/n \cdot I + \sum _{i=0}^d \varLambda _i, \end{aligned}$$

making use of Lemma 19 and (24) for the final inequality. It remains to note that \(\sum _{i=0}^d \varLambda _i = {\widehat{F}}(0)\), and that \(\lambda _{\min } ({\widehat{F}}(0)) = 0\) by assumption. \(\square \)

As \({\widehat{F}}(t) \preceq {\widehat{G}}(t)\) for all \(t \in [0:n]\), we have \(F_\mathrm{sym}^{(r)} \le {\widehat{G}}^{(r)}_{[0:n], \omega }\), where:

$$\begin{aligned} {\widehat{G}}^{(r)}_{[0:n], \omega } = \min _{U \in \varSigma ^{k \times k}_r[t]} \Big \{ \int _{[0:n]} \mathrm{Tr}\big ({\widehat{G}}(t)U(t)\big ) d\omega (t) : \int _{[0:n]} \mathrm{Tr}\big (U(t)\big ) d \omega (t) = 1 \Big \} \end{aligned}$$

is the inner Lasserre hierarchy for G computed on [0 : n] w.r.t. the measure \(\omega \). To conclude the argument, we prove the following generalization of Theorem 7 (see also Remark 1).

Corollary 3

Let \(G(t) = ct \cdot I + C\) be a linear matrix-valued polynomial with \(c>0\) and \(\lambda _{\min }(C) = 0\). Then we have:

$$\begin{aligned} G^{(r)}_{[0:n], \omega } \le c \cdot \xi ^n_{r+1}, \end{aligned}$$

where \(\xi ^n_{r+1}\) is the least root of the degree \(r+1\) Krawtchouk polynomial.


Let u be a unit eigenvector for C corresponding to (one of its) zero eigenvalues. Then for any univariate sum-of-squares polynomial \(s \in \varSigma _r\), the matrix-valued polynomial \(U(t) = s(t) u u^\top \) is a sum-of-squares polynomial matrix of degree 2r. Furthermore, for such a U we have:

$$\begin{aligned} \int _{[0:n]} \mathrm{Tr}\big ( G(t)U(t)\big ) d\omega (t) = \int _{[0:n]} ct \cdot s(t) d\omega (t) \end{aligned}$$


$$\begin{aligned} \int _{[0:n]} \mathrm{Tr}\big (U(t)\big ) d\omega (t) = \int _{[0:n]} s(t) d\omega (t). \end{aligned}$$

Therefore, writing \(g(t) = ct\), and making use of Theorem 7 and Remark 1, we have:

$$\begin{aligned} G^{(r)}_{[0:n], \omega } \le \inf _{s \in \varSigma _r} \left\{ \int _{[0:n]} ct \cdot s(t) d\omega (t) : \int _{[0:n]} s(t) d\omega (t) = 1 \right\} = g^{(r)}_{[0:n], \omega } = c \cdot \xi _{r+1}^n. \end{aligned}$$

This concludes the proof. \(\square \)

Rights and permissions

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Slot, L., Laurent, M. Sum-of-squares hierarchies for binary polynomial optimization. Math. Program. 197, 621–660 (2023).

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI:


  • Binary polynomial optimization
  • Lasserre hierarchy
  • Sum-of-squares polynomials
  • Fourier analysis
  • Krawtchouk polynomials
  • Polynomial kernels
  • Semidefinite programming
  • Polynomial matrices