1 Introduction

Given a form f that is a sum of squares of forms, there are usually many ways to write \(f=p_1^2+\cdots +p_r^2\) as a sum of squares. The set \(\mathrm {Gram}(f)\) of all sum of squares (sos) representations of f, modulo orthogonal equivalence, has a natural structure of a spectrahedron, so it is an object of geometric nature. Minimizing a homogeneous degree 2 polynomial in the coefficients of the \(p_i\) means to minimize a linear form over \(\mathrm {Gram}(f)\), hence can be performed using semidefinite programming. With probability one, the optimizer for a random such problem will be a unique extreme point of \(\mathrm {Gram}(f)\). Studying the convex-geometric properties of \(\mathrm {Gram}(f)\), and in particular its extreme points, is therefore a question relevant for optimization problems. For example, minimizing the \(L^2\)-norm of (the tuple of coefficient vectors of) a sum of squares representation \(f=p_1^2+\cdots +p_r^2\) means to minimize the trace form over \(\mathrm {Gram}(f)\). From an algebraic perspective, studying the extreme points of \(\mathrm {Gram}(f)\) is natural since every sos representation of f arises as a convex combination of representations that correspond to extreme points of \(\mathrm {Gram}(f)\).

Let \(f\in {{\mathbb {R}}}[x_1,\dots ,x_n]\) be a form of even degree 2d, let \(X=(x_1^d,x_1^{d-1}x_2,\dots ,x_n^d)^t\) be the sequence of all degree d monomials in some fixed order, let \(N=\genfrac(){0.0pt}{}{n+d-1}{n-1}\) be their number. Then \(\mathrm {Gram}(f)\) is the set of all psd symmetric real \(N\times N\) matrices G for which \(X^tGX=f\). For example, if \(n=2\) and \(f=x_1^6+x_2^6\), \(\mathrm {Gram}(f)\) consists of all psd real matrices of the form

$$\begin{aligned} G=\begin{pmatrix} 1&{}0&{}-a&{}-b\\ 0&{}2a&{}b&{}-c\\ -a&{}b&{}2c&{}0\\ -b&{}-c&{}0&{}1 \end{pmatrix}. \end{aligned}$$
(1)
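As a quick sanity check of (1), the following Python sketch expands \(X^tGX\) for the matrix G above and confirms that the result is \(x_1^6+x_2^6\) for several choices of a, b, c. The function names `gram_matrix` and `gram_to_coeffs` and the sample parameter values are illustrative assumptions, not notation from the text. Positive semidefiniteness of G is not tested here.

```python
from fractions import Fraction

def gram_matrix(a, b, c):
    """Matrix G from (1), monomial order x1^3, x1^2 x2, x1 x2^2, x2^3."""
    return [
        [1, 0, -a, -b],
        [0, 2 * a, b, -c],
        [-a, b, 2 * c, 0],
        [-b, -c, 0, 1],
    ]

def gram_to_coeffs(G):
    """Coefficients of X^t G X as a binary form of degree 6.

    Entry k of the result is the coefficient of x1^(6-k) x2^k, because the
    monomials with indices i and j multiply to x1^(6-i-j) x2^(i+j).
    """
    n = len(G)
    coeffs = [0] * (2 * n - 1)
    for i in range(n):
        for j in range(n):
            coeffs[i + j] += G[i][j]
    return coeffs

target = [1, 0, 0, 0, 0, 0, 1]  # x1^6 + x2^6
# Hypothetical sample parameters; any real values give the same form.
samples = [(0, 0, 0), (2, 0, 2), (Fraction(1, 2), Fraction(-3, 7), Fraction(5, 3))]
for a, b, c in samples:
    assert gram_to_coeffs(gram_matrix(a, b, c)) == target
```

The check only uses the linear conditions \(X^tGX=f\); the psd condition is what cuts the three-dimensional affine family down to \(\mathrm {Gram}(f)\).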

The first to introduce Gram spectrahedra were Choi et al. [4] in 1995. Among other things, they showed that the elements of \(\mathrm {Gram}(f)\) are in natural bijective correspondence with the orthogonal equivalence classes of sum of squares representations of f (see [4, Prop. 2.10]). In fact, the Gram matrix of f corresponding to an sos representation \(f=\sum _{i=1}^rp_i^2\) is \(\sum _{i=1}^ru_iu_i^t\) where \(u_i\) is the coefficient (column) vector of \(p_i\). A more systematic study of Gram spectrahedra was begun only recently. Gram spectrahedra of ternary quartics were considered by Plaumann et al. in [14]. The paper [5] by Chua et al. is a survey of results and open questions on Gram spectrahedra. Among other topics, the authors discuss Gram spectrahedra of binary forms, and they relate the Gram spectrahedra of sextic binary forms to Kummer surfaces in \({{\mathbb {P}}}^3\), see also [12].

For fixed positive integers m and n, consider spectrahedra of the form \(S=L\cap \mathrm {Sym}_n^{\scriptscriptstyle +}({{\mathbb {R}}})\), where \(\mathrm {Sym}_n^{\scriptscriptstyle +}({{\mathbb {R}}})\) is the cone of psd matrices in \(\mathrm {Sym}_n({{\mathbb {R}}})\) and L is an m-dimensional affine-linear subspace of \(\mathrm {Sym}_n({{\mathbb {R}}})\). There are upper and (for generically chosen L) lower bounds for the ranks r of extreme points of S, expressed in terms of m and n. The corresponding interval of values for r is known as the Pataki interval. For points of a Gram spectrahedron, the rank is identified with the length of the corresponding sos decomposition. In particular, the sum of squares length of f, and the collection of different sum of squares representations of a given length, are naturally encoded in \(\mathrm {Gram}(f)\). These are invariants that have received a lot of attention in particular cases, starting with Hilbert [6], and more recently [15], for ternary quartics. Lately, results in a similar spirit were obtained for varieties of minimal or almost minimal degree, see [1, 2, 5, 17].

In this paper we focus on Gram spectrahedra in the most basic case possible, namely binary forms. For f a sufficiently general positive binary form of arbitrary degree, we show that \(\mathrm {Gram}(f)\) has extreme points of all ranks in the Pataki range (Theorem 5.3). This gives a positive answer to [5, Quest. 4.2]. In fact we calculate the dimension of the set of extreme points of any given rank r, for f chosen generically (Corollary 5.5).

The proofs for these facts rely on a purely algebraic result of independent interest (Theorem 4.2): For any integers \(d\ge 0\) and \(r\ge 1\) with \(\genfrac(){0.0pt}{}{r+1}{2}\le 2d+1\), there exists a sequence \(p_1,\dots ,p_r\) of binary forms of degree d for which the \(\genfrac(){0.0pt}{}{r+1}{2}\) products \(p_ip_j\), \(1\le i\le j\le r\), are linearly independent. Any sequence with this property will be called quadratically independent.

Speaking generally, the boundary structure of a convex set reflects how complicated the set is. In the case of spectrahedra, one measure for this complicatedness is the ranks of the extreme points. From this perspective, Theorem 5.3 says that Gram spectrahedra, even of binary forms, are as complicated as the most general spectrahedra.

When f is a strictly positive binary form without multiple roots, \(\mathrm {Gram}(f)\) has precisely \(2^{d-1}\) extreme points of rank 2, where \(\deg (f)=2d\). Given two of these points, the line segment connecting them may or may not be a face (edge) of \(\mathrm {Gram}(f)\). For sextic forms we show that it never is an edge, while for \(2d\ge 10\) it always is an edge. Most interesting is the case \(\deg (f)=8\), where the edges between the eight rank two extreme points form a complete bipartite graph \(K_{4,4}\) (Theorem 6.4).

We briefly comment on our methods. If F is the supporting face of a point \(\vartheta \in \mathrm {Gram}(f)\), we constantly use the following characterization of \(\dim (F)\): If \(f=p_1^2+\cdots +p_r^2\) is the sos representation that corresponds to \(\vartheta \) (with \(p_1,\dots ,p_r\) linearly independent forms), \(\dim (F)\) is the number of quadratic relations between \(p_1,\dots ,p_r\). In particular, \(\vartheta \) is an extreme point of \(\mathrm {Gram}(f)\) if and only if \(p_1,\dots ,p_r\) are quadratically independent. Arguments of this kind are one reason why we prefer to use a coordinate-free approach, where symmetric matrices are replaced by symmetric tensors of polynomials. In this setting, the point (Gram matrix) of \(\mathrm {Gram}(f)\) corresponding to a given sos decomposition \(f=\sum _{i=1}^rp_i^2\) is simply the symmetric tensor \(\sum _{i=1}^rp_i\otimes p_i\).

The paper is organized as follows. In Sect. 2 we review well-known facts on the facial structure of spectrahedra, together with the Pataki range for the rank. In Sect. 3 we specialize to Gram spectrahedra and state the dimension formula for faces in terms of quadratic relations. Section 4 contains the proof of the existence of long quadratically independent sequences of binary forms. In Sects. 5 and 6 we present our analysis of the ranks of extreme points and of the edges between rank two extreme points.

For related recent work we refer to the paper [9] by Mayer, who proves that Gram spectrahedra of sufficiently general binary forms have polyhedral faces of large dimensions.

We use standard terminology from convex geometry. For \(K\subseteq {{\mathbb {R}}}^n\) a closed convex set, \(\text {aff}(K)\) denotes the affine-linear hull of K and \({{\,\mathrm{relint}\,}}(K)\) is the relative interior of K, i.e., the interior of K relative to \(\text {aff}(K)\). A convex subset \(F\subseteq K\) is a face of K if \(x,y\in K\), \(0<t<1\), and \((1-t)x+ty\in F\) imply \(x,y\in F\). For every \(x\in K\) there is a unique face F of K with \(x\in {{\,\mathrm{relint}\,}}(F)\), called the supporting face of x.

2 Review of Facial Structure of Spectrahedra

All results in this section are known, see for example Ramana and Goldman [16] for the first part and Pataki [13] for the second. We nevertheless give them a coordinate-free review here, i.e., without making reference to a particular basis of the underlying vector space, in order that we can freely use them later on.

2.1

Let V be a vector space over \({{\mathbb {R}}}\) with \(\dim (V)<\infty \). Let \(V^{\scriptscriptstyle \vee }\) be the dual space of V, and let \({\mathsf {S}}_2V\subseteq V\otimes V\) denote the space of symmetric tensors, i.e., tensors that are invariant under the involution \(v\otimes w\mapsto w\otimes v\). Of course, \({\mathsf {S}}_2V\) is canonically identified with \({\mathsf {S}}^2V\), the second symmetric power of V, but it seems preferable in our context to work with \({\mathsf {S}}_2\), rather than with \({\mathsf {S}}^2\). The natural pairing between \(v\in V\) and \(\lambda \in V^{\scriptscriptstyle \vee }\) is denoted \(\langle {v},{\lambda }\rangle =\langle {\lambda },{v}\rangle \). Elements of \({\mathsf {S}}_2V\) can be identified either with symmetric bilinear forms \(V^{\scriptscriptstyle \vee }\times V^{\scriptscriptstyle \vee }\rightarrow {{\mathbb {R}}}\), or with self-adjoint linear maps \(\varphi :V^{\scriptscriptstyle \vee }\rightarrow V\), where the adjoint refers to the natural pairing between V and \(V^{\scriptscriptstyle \vee }\). We shall adopt the second point of view. Let \(\varphi _\vartheta :V^{\scriptscriptstyle \vee }\rightarrow V\) denote the linear map that corresponds to a symmetric tensor \(\vartheta =\sum _{i=1}^rv_i\otimes w_i\in {\mathsf {S}}_2V\). So \(\varphi _\vartheta (\lambda )=\sum _{i=1}^r\lambda (v_i)w_i=\sum _{i=1}^r\lambda (w_i)v_i\) for \(\lambda \in V^{\scriptscriptstyle \vee }\). The range of \(\vartheta \in {\mathsf {S}}_2V\), written \({{\,\mathrm{im}\,}}(\vartheta )\), is the range (image) of the linear map \(\varphi _\vartheta \). Thus, if \(v_1,\dots ,v_r\) and \(w_1,\dots ,w_r\) are linearly independent, \({{\,\mathrm{im}\,}}(\vartheta )={{\,\mathrm{span}\,}}(v_1,\dots ,v_r)={{\,\mathrm{span}\,}}(w_1,\dots ,w_r)\). The rank of \(\vartheta \) is \({{\,\mathrm{rk}\,}}(\vartheta )=\dim ({{\,\mathrm{im}\,}}(\vartheta ))\).

2.2

\(\vartheta \in {\mathsf {S}}_2V\) is positive semidefinite (psd), written \(\vartheta \succeq 0\), if \(\langle {\varphi _\vartheta (\lambda )},{\lambda }\rangle \ge 0\) for every \(\lambda \in V^{\scriptscriptstyle \vee }\). If \(\vartheta =\sum _iv_i\otimes w_i\), this says \(\sum _i\lambda (v_i)\lambda (w_i)\ge 0\) for every \(\lambda \in V^{\scriptscriptstyle \vee }\). The set \({\mathsf {S}}_2^{\scriptscriptstyle +}V=\{\vartheta \in {\mathsf {S}}_2V:\vartheta \succeq 0\}\) is a closed convex cone in \({\mathsf {S}}_2V\). If \(v_1,\dots ,v_n\in V\) are linearly independent and \(\vartheta =\sum _{i,j=1}^na_{ij}v_i\otimes v_j\), where \(a_{ij}=a_{ji}\in {{\mathbb {R}}}\), then \(\vartheta \succeq 0\) if and only if the real symmetric matrix \((a_{ij})\) is psd, i.e., has nonnegative eigenvalues. So \({\mathsf {S}}_2^{\scriptscriptstyle +}V\) gets identified with the cone of real symmetric psd \(n\times n\) matrices (\(n=\dim (V)\)), after fixing a linear basis of V. We say that \(\vartheta \in {\mathsf {S}}_2V\) is positive definite, written \(\vartheta \succ 0\), if \(\langle {\varphi _\vartheta (\lambda )},{\lambda }\rangle >0\) for every \(0\ne \lambda \in V^{\scriptscriptstyle \vee }\). The fact that every real symmetric matrix can be diagonalized implies that every \(\vartheta \in {\mathsf {S}}_2V\) can be written \(\vartheta =\sum _{i=1}^r\varepsilon _iv_i\otimes v_i\), with \(r\ge 0\), \(\varepsilon _i=\pm 1\) and with \(v_1,\dots ,v_r\in V\) linearly independent. Of course, \(\vartheta \succeq 0\) is equivalent to \(\varepsilon _1=\ldots =\varepsilon _r=1\).

Lemma 2.3

Given \(\vartheta \in {\mathsf {S}}_2V\) and a linear subspace \(U\subseteq V\), we have \({{\,\mathrm{im}\,}}(\vartheta )\subseteq U\) if and only if \(\vartheta \in {\mathsf {S}}_2U\).

Proof

The “if” direction is clear. Conversely assume \({{\,\mathrm{im}\,}}(\vartheta )\subseteq U\), and write \(\vartheta =\sum _{i=1}^rc_iv_i\otimes v_i\) with \(0\ne c_i\in {{\mathbb {R}}}\) and \(v_1,\dots ,v_r\in V\) linearly independent. If \(\lambda _1,\dots ,\lambda _r\in V^{\scriptscriptstyle \vee }\) are chosen with \(\langle {v_i},{\lambda _j}\rangle =\delta _{ij}\) for all i, j, we have \(\varphi _\vartheta (\lambda _j)=\sum _ic_i\lambda _j(v_i)v_i=c_jv_j\), and by assumption this element lies in U for every j. Hence \(v_1,\dots ,v_r\in U\), and therefore \(\vartheta \in {\mathsf {S}}_2U\). \(\square \)

Lemma 2.4

If \(\vartheta ,\vartheta '\in {\mathsf {S}}_2V\) are psd, then \({{\,\mathrm{im}\,}}(\vartheta +\vartheta ')={{\,\mathrm{im}\,}}(\vartheta )+{{\,\mathrm{im}\,}}(\vartheta ')\).

Proof

This translates into the well-known fact that, for any two symmetric psd matrices A, B, one has \({{\,\mathrm{im}\,}}(A+B)={{\,\mathrm{im}\,}}(A)+{{\,\mathrm{im}\,}}(B)\). \(\square \)

Lemma 2.5

Let \(\vartheta ,\gamma \in {\mathsf {S}}_2V\) with \(\vartheta \succeq 0\) and \({{\,\mathrm{im}\,}}(\gamma )\subseteq {{\,\mathrm{im}\,}}(\vartheta )\). Then there is a real number \(\varepsilon >0\) with \(\vartheta -\varepsilon \gamma \succeq 0\).

Proof

This translates into the following well-known fact about real symmetric matrices: If A, B are such matrices with \({{\,\mathrm{im}\,}}(B)\subseteq {{\,\mathrm{im}\,}}(A)\), and if \(A\succeq 0\), there is \(\varepsilon >0\) with \(A-\varepsilon B\succeq 0\). \(\square \)

2.6

For the following we fix an affine-linear subspace \(L\subseteq {\mathsf {S}}_2V\) together with the corresponding spectrahedron \(S=L\cap {\mathsf {S}}_2^{\scriptscriptstyle +}V\). See Ramana and Goldman [16] for the results in 2.7–2.14 below. For any subset \(T\subseteq S\) we consider the linear subspace

$$\begin{aligned} {{\mathscr {U}}}(T):=\sum _{\vartheta \in T}{{\,\mathrm{im}\,}}(\vartheta ) \end{aligned}$$

of V. For any linear subspace \(U\subseteq V\), the set

$$\begin{aligned} {{\mathscr {F}}}(U):=\{\vartheta \in S:{{\,\mathrm{im}\,}}(\vartheta )\subseteq U\}=L\cap {\mathsf {S}}_2^{\scriptscriptstyle +}U \end{aligned}$$

(Lemma 2.3) is a face of S by Lemma 2.4.

Lemma 2.7

For any face \(F\ne \varnothing \) of S there is a linear subspace \(U\subseteq V\) with \(F={{\mathscr {F}}}(U)\). In fact we may take \(U={{\mathscr {U}}}(F)\).

Proof

The inclusion \(F\subseteq {{\mathscr {F}}}({{\mathscr {U}}}(F))\) is trivial. Conversely, there exist finitely many \(\vartheta _1,\dots ,\vartheta _m\in F\) with \({{\mathscr {U}}}(F)=\sum _{i=1}^m{{\,\mathrm{im}\,}}(\vartheta _i)\). Hence there exists a single \(\vartheta \in F\) with \({{\mathscr {U}}}(F)={{\,\mathrm{im}\,}}(\vartheta )\), e.g. \(\vartheta =(1/m)\sum _{i=1}^m\vartheta _i\) (Lemma 2.4). In order to prove \({{\mathscr {F}}}({{\,\mathrm{im}\,}}(\vartheta ))\subseteq F\) let \(\gamma \in {{\mathscr {F}}}({{\,\mathrm{im}\,}}(\vartheta ))\), so \(\gamma \in S\) and \({{\,\mathrm{im}\,}}(\gamma )\subseteq {{\,\mathrm{im}\,}}(\vartheta )\). Choose a real number \(t>0\) so that \(\vartheta ':=\vartheta -t(\gamma -\vartheta )\succeq 0\), using Lemma 2.5. Since \(\vartheta '\in S\) and \(\vartheta \) is a convex combination of \(\vartheta '\) and \(\gamma \), we conclude that \(\gamma \in F\). \(\square \)

Definition 2.8

We say that a linear subspace U of V is facial, or a face subspace (for the given spectrahedron \(S=L\cap {\mathsf {S}}_2^{\scriptscriptstyle +}V\)), if there exists \(\vartheta \in S\) with \(U={{\,\mathrm{im}\,}}(\vartheta )\).

The following lemma is obvious (cf. 2.4):

Lemma 2.9

If \(U,U'\subseteq V\) are face subspaces for S then so is their sum \(U+U'\).

Note that the intersection \(U\cap U'\) need not contain any face subspace.

Proposition 2.10

There is a natural inclusion-preserving bijection between the nonempty faces F of S and the face subspaces \(U\subseteq V\) for S, given by \(F\mapsto {{\mathscr {U}}}(F)\). The inverse is \(U\mapsto {{\mathscr {F}}}(U)\).

Proof

Let \(F\ne \varnothing \) be a face of S. As in the proof of Lemma 2.7, there is \(\vartheta \in F\) with \({{\,\mathrm{im}\,}}(\vartheta )={{\mathscr {U}}}(F)\). Hence the subspace \({{\mathscr {U}}}(F)\) of V is facial, and \(F={{\mathscr {F}}}({{\mathscr {U}}}(F))\) holds by 2.7. On the other hand, if \(U\subseteq V\) is a face subspace then \(U={{\mathscr {U}}}({{\mathscr {F}}}(U))\) holds. Indeed, \(\supseteq \) is tautologically true. Conversely there is \(\vartheta \in S\) with \(U={{\,\mathrm{im}\,}}(\vartheta )\), since U is facial, so we have \(\vartheta \in {{\mathscr {F}}}(U)\) and therefore \(U\subseteq {{\mathscr {U}}}({{\mathscr {F}}}(U))=\sum _{\gamma \in {{\mathscr {F}}}(U)}{{\,\mathrm{im}\,}}(\gamma )\). \(\square \)

In particular we see that:

Corollary 2.11

If \(U\subseteq V\) is a face subspace, the relative interior of \({{\mathscr {F}}}(U)\) is \(\{\vartheta \in S:{{\,\mathrm{im}\,}}(\vartheta )=U\}\). The supporting face of \(\vartheta \in S\) is \({{\mathscr {F}}}({{\,\mathrm{im}\,}}(\vartheta ))\).

Corollary 2.12

Let F be a face of S. Then \({{\,\mathrm{rk}\,}}(\vartheta )=\dim ({{\mathscr {U}}}(F))\) for every \(\vartheta \in {{\,\mathrm{relint}\,}}(F)\). We call this number the rank of F, denoted \({{\,\mathrm{rk}\,}}(F)\). If \(F'\) is a proper subface of F then \({{\,\mathrm{rk}\,}}(F')<{{\,\mathrm{rk}\,}}(F)\).

Here are equivalent characterizations of face subspaces:

Proposition 2.13

For a linear subspace \(U\subseteq V\), the following are equivalent:

(i) U is facial, i.e., there is \(\vartheta \in S\) with \({{\,\mathrm{im}\,}}(\vartheta )=U\);

(ii) U has a linear basis \(u_1,\dots ,u_r\) for which \(\sum _{i=1}^ru_i\otimes u_i\in S\);

(iii) U is linearly spanned by vectors \(u_1,\dots ,u_r\) for which \(\sum _{i=1}^ru_i\otimes u_i\in S\);

(iv) for every \(u\in U\) there are \(\varepsilon >0\) and \(u_2,\dots ,u_r\in U\) such that \(\varepsilon u\otimes u+\sum _{i=2}^ru_i\otimes u_i\in S\).

Proof

(i) \(\Rightarrow \) (ii): Let \(\vartheta \in S\) with \({{\,\mathrm{im}\,}}(\vartheta )=U\). By Lemma 2.3 we can write \(\vartheta =\sum _{i=1}^ru_i\otimes u_i\) where \(u_1,\dots ,u_r\in U\) are linearly independent. Since the \(u_i\) span \({{\,\mathrm{im}\,}}(\vartheta )\), they are a linear basis of U. (ii) \(\Rightarrow \) (iii) is trivial. (iii) \(\Rightarrow \) (iv): Let \(\vartheta =\sum _{i=1}^ru_i\otimes u_i\in S\) as in (iii). Then \({{\,\mathrm{im}\,}}(\vartheta )={{\,\mathrm{span}\,}}(u_1,\dots ,u_r)=U\) by Lemma 2.4. Given \(u\in U\) there exists \(\varepsilon >0\) such that \(\gamma :=\vartheta -\varepsilon u\otimes u\in {\mathsf {S}}_2^{\scriptscriptstyle +}U\) (Lemma 2.5). Hence there exist \(u_2,\dots ,u_r\in U\) with \(\gamma =\sum _{i=2}^ru_i\otimes u_i\). (iv) \(\Rightarrow \) (i): Let \(u\in U\), and let \(\gamma =\varepsilon u\otimes u+\sum _{i=2}^ru_i\otimes u_i\) be as in (iv). Then \(u\in {{\,\mathrm{im}\,}}(\gamma )\) (Lemma 2.4), and \({{\,\mathrm{im}\,}}(\gamma )\subseteq U\). This shows that there is a (finite) family of tensors \(\gamma _j\in {{\mathscr {F}}}(U)\) with \(\sum _j{{\,\mathrm{im}\,}}(\gamma _j)=U\). Hence U is facial. \(\square \)

Proposition 2.14

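Let \(S=L\cap {\mathsf {S}}_2^{\scriptscriptstyle +}V\), with \(L\subseteq {\mathsf {S}}_2V\) an affine-linear subspace. If F is a nonempty face of S and \(U={{\mathscr {U}}}(F)\), then \(\text {aff}(F)=L\cap {\mathsf {S}}_2U\). In particular, \(\dim (F)=\dim (L\cap {\mathsf {S}}_2U)\).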

Proof

Here \(\text {aff}(F)\) denotes the affine-linear hull of F. Since \(F=L\cap {\mathsf {S}}_2^{\scriptscriptstyle +}U\), it is clear that \(\text {aff}(F)\subseteq L\cap {\mathsf {S}}_2U\). For the other inclusion let \(\vartheta \in {{\,\mathrm{relint}\,}}(F)\), so \({{\,\mathrm{im}\,}}(\vartheta )=U\) (2.11), and let \(\gamma \in L\cap {\mathsf {S}}_2U\) be arbitrary. Then \(\gamma _t:=(1-t)\vartheta +t\gamma \succeq 0\) for \(|t|<\varepsilon \) and small \(\varepsilon >0\) (2.5), and therefore \(\gamma _t\in S\) for these t. Since \(\vartheta =(\gamma _t+\gamma _{-t})/2\), these \(\gamma _t\) lie in F, and we have proved \(\gamma \in \text {aff}(F)\). \(\square \)

The following result can be found in Pataki [13]. It describes the interval in which the ranks of the extreme points of a spectrahedron can possibly lie:

Proposition 2.15

[Pataki inequalities] Let \(\dim (V)=n\), let \(L\subseteq {\mathsf {S}}_2V\) be an affine subspace with \(\dim (L)=m\), and let \(S=L\cap {\mathsf {S}}_2^{\scriptscriptstyle +}V\).

(a) For any extreme point \(\vartheta \) of S, the rank \({{\,\mathrm{rk}\,}}(\vartheta )=r\) satisfies

$$\begin{aligned} m+\genfrac(){0.0pt}{}{r+1}{2}\le \genfrac(){0.0pt}{}{n+1}{2}. \end{aligned}$$

(b) When L is chosen generically among all affine subspaces of dimension m, every \(\vartheta \in S\) satisfies

$$\begin{aligned} m\ge \genfrac(){0.0pt}{}{n-{{\,\mathrm{rk}\,}}(\vartheta )+1}{2}. \end{aligned}$$

This formulation is taken from [5, Prop. 3.1]. See also [13, Cor. 3.3.4] and [11, Prop. 5].

Remark 2.16

Let \(S=L\cap {\mathsf {S}}_2^{\scriptscriptstyle +}V\), where \(\dim (V)=n\) and \(L\subseteq {\mathsf {S}}_2V\) is a nonempty affine subspace, \(\dim (L)=m\). The Pataki interval for the rank r of extreme points of S is described by the inequalities

$$\begin{aligned} m\ge \genfrac(){0.0pt}{}{n-r+1}{2}\quad \text { and }\quad m+\genfrac(){0.0pt}{}{r+1}{2}\le \genfrac(){0.0pt}{}{n+1}{2} \end{aligned}$$
(2)

from Proposition 2.15. This amounts to the range of integers r satisfying

$$\begin{aligned} n+\frac{1}{2}-\frac{\sqrt{8m+1}}{2}\le r\le -\frac{1}{2}+\frac{\sqrt{(2n+1)^2-8m}}{2}. \end{aligned}$$

Indeed, the first (resp. second) inequality in (2) says \(A_1\le r\le A_2\) (resp. \(B_1\le r\le B_2\)) where

$$\begin{aligned} A_i=n+\frac{1}{2}+\frac{(-1)^i\sqrt{8m+1}}{2},\qquad B_i=-\frac{1}{2}+\frac{(-1)^i\sqrt{(2n+1)^2-8m}}{2}, \end{aligned}$$

\(i=1,2\). It is elementary to check that \(B_1<0<A_1\le B_2<A_2\) holds. Therefore the Pataki interval is \(\lceil A_1\rceil \le r\le \lfloor B_2\rfloor \).
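The interval of Remark 2.16 can also be found by testing the two inequalities in (2) directly. The following Python sketch does this and compares the result with \(\lceil A_1\rceil \) and \(\lfloor B_2\rfloor \). The function names are illustrative assumptions, and the closed form is evaluated in floating point, so it should be trusted only for small n and m.

```python
from math import comb, sqrt, ceil, floor

def pataki_interval(n, m):
    """All integers 0 <= r <= n satisfying both inequalities in (2)."""
    return [
        r
        for r in range(0, n + 1)
        if m >= comb(n - r + 1, 2) and m + comb(r + 1, 2) <= comb(n + 1, 2)
    ]

def closed_form_bounds(n, m):
    """(ceil(A_1), floor(B_2)) from Remark 2.16, in floating point."""
    a1 = n + 0.5 - sqrt(8 * m + 1) / 2
    b2 = -0.5 + sqrt((2 * n + 1) ** 2 - 8 * m) / 2
    return ceil(a1), floor(b2)

# n = 4, m = 3 are the matrix size and the number of free parameters
# a, b, c in the example (1) of the introduction.
print(pataki_interval(4, 3))     # [2, 3]
print(closed_form_bounds(4, 3))  # (2, 3)
```

The brute-force version avoids rounding issues entirely, since it works with exact binomial coefficients.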

3 Gram Spectrahedra

See Choi et al. [4] for an introduction to Gram matrices of real polynomials, and Chua et al. [5] for a survey on Gram spectrahedra. In contrast to these texts we emphasize a coordinate-free approach.

3.1

Let A be an \({{\mathbb {R}}}\)-algebra. The multiplication map \(A\otimes A\rightarrow A\), \(a\otimes b\mapsto ab\) (with \(\otimes =\otimes _{{\mathbb {R}}}\) always) induces the \({{\mathbb {R}}}\)-linear map \(\mu :{\mathsf {S}}_2A\rightarrow A\), where \({\mathsf {S}}_2A\subseteq A\otimes A\) is the space of symmetric tensors as in Sect. 2. Given \(f\in A\), the symmetric tensors \(\vartheta \in {\mathsf {S}}_2A\) with \(\mu (\vartheta )=f\) are called the Gram tensors of f.

3.2

Let \(V\subseteq A\) be a finite-dimensional linear subspace, and let \(f\in A\). We define the Gram spectrahedron of f, relative to V, to be the set of all psd Gram tensors of f in \({\mathsf {S}}_2V\), i.e.,

$$\begin{aligned} \mathrm {Gram}_V(f):={\mathsf {S}}_2^{\scriptscriptstyle +}V\cap \mu ^{-1}(f). \end{aligned}$$

It is well known that \(\mathrm {Gram}_V(f)\) parametrizes the sums of squares representations \(f=\sum _{i=1}^rp_i^2\) with \(p_i\in V\) for all i, up to orthogonal equivalence. This means, the elements of \(\mathrm {Gram}_V(f)\) are the symmetric tensors \(\sum _{i=1}^rp_i\otimes p_i\) with \(r\ge 0\) and \(p_1,\dots ,p_r\in V\) such that \(\sum _{i=1}^rp_i^2=f\). Given two such tensors \(\vartheta =\sum _{i=1}^rp_i\otimes p_i\) and \(\vartheta '=\sum _{j=1}^sq_j\otimes q_j\), we may assume \(r=s\); then \(\vartheta =\vartheta '\) if and only if there is an orthogonal real matrix \((u_{ij})\) such that \(q_j=\sum _{i=1}^ru_{ij}p_i\) for all j. See [4, § 2].
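The orthogonal invariance described above can be illustrated numerically. In the following Python sketch, forms are represented by coefficient vectors, the Gram tensor \(\sum _ip_i\otimes p_i\) becomes the matrix \(\sum _iu_iu_i^t\), and a rotation plays the role of the orthogonal matrix \((u_{ij})\). The function names, the sample vectors and the angle are illustrative assumptions.

```python
from math import cos, sin, isclose

def gram(vectors):
    """Matrix sum_i u_i u_i^t for a list of coefficient vectors u_i."""
    n = len(vectors[0])
    return [[sum(u[i] * u[j] for u in vectors) for j in range(n)] for i in range(n)]

def rotate(p1, p2, angle):
    """Return (q1, q2) with q1 = c*p1 + s*p2 and q2 = -s*p1 + c*p2,
    where c = cos(angle) and s = sin(angle)."""
    c, s = cos(angle), sin(angle)
    q1 = [c * a + s * b for a, b in zip(p1, p2)]
    q2 = [-s * a + c * b for a, b in zip(p1, p2)]
    return q1, q2

# Hypothetical sample: two cubic binary forms as coefficient vectors.
p1, p2 = [1, 0, -2, 0], [0, 2, 0, -1]
q1, q2 = rotate(p1, p2, 0.7)
G, H = gram([p1, p2]), gram([q1, q2])
same = all(isclose(G[i][j], H[i][j], abs_tol=1e-12) for i in range(4) for j in range(4))
print(same)
```

The cross terms \(\pm cs\,(p_1\otimes p_2+p_2\otimes p_1)\) cancel and \(c^2+s^2=1\), which is why the two matrices agree up to rounding.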

Lemma 3.3

\(\mathrm {Gram}_V(f)\) is a spectrahedron, and is compact provided that the identity \(\sum _{i=1}^rp_i^2=0\) with \(p_1,\dots ,p_r\in V\) implies \(p_1=\ldots =p_r=0\).

Proof

By its definition, \(\mathrm {Gram}_V(f)\) is a spectrahedron. If \(\mathrm {Gram}_V(f)\) is unbounded, it has nonzero recession cone, which means that there is \(0\ne \vartheta \in {\mathsf {S}}_2V\) with \(\eta +\vartheta \in \mathrm {Gram}_V(f)\) for every \(\eta \in \mathrm {Gram}_V(f)\). It follows that \(\mu (\vartheta )=0\) and \(\vartheta \succeq 0\), so \(\vartheta =\sum _{i=1}^rp_i\otimes p_i\) with \(0\ne p_i\in V\) where \(\sum _{i=1}^rp_i^2=0\). \(\square \)

3.4

For \(U\subseteq A\) a linear subspace let \(\Sigma U^2=\bigl \{\sum _{i=1}^ru_i^2:r\ge 1,\,u_i\in U\bigr \}\). Usually we will consider Gram spectrahedra only in the case where sums of squares in A are strongly stable [10]. This means that there exists a filtration \(U_1\subseteq U_2\subseteq \ldots \subseteq \bigcup _{i\ge 1}U_i=A\) by finite-dimensional linear subspaces \(U_i\) such that for every \(i\ge 1\) there is \(j\ge 1\) with \(U_i\cap \Sigma A^2\subseteq \Sigma U_j^2\). In this case we simply write \(\mathrm {Gram}(f):=\mathrm {Gram}_{U_j}(f)\) for \(f\in U_i\). Examples are the polynomial rings \(A={{\mathbb {R}}}[x_1,\dots ,x_n]={{\mathbb {R}}}[x]\) with \(U_i={{\mathbb {R}}}[x]_{\le i}\), the space of polynomials of degree \(\le i\).

Example 3.5

Let \(A={{\mathbb {R}}}[x_1,x_2]\), and consider the form \(f=x_1^6+x_2^6\) as in the introduction. The Gram spectrahedron of f has dimension 3, and has four extreme points of rank 2, given by the coefficient vectors \((a,b,c)=(0,0,0)\), (2, 0, 2), and \(({1}/{2})(1,\pm \sqrt{3},1)\) in (1). They correspond to the four essentially different ways of writing f as a sum of two squares, namely \(x_1^6+x_2^6\), \((x_1^3-2x_1x_2^2)^2+(2x_1^2x_2-x_2^3)^2\), and

$$\begin{aligned} \biggl (\frac{2x_1^3-x_1x_2^2\mp \sqrt{3}x_2^3}{2}\biggr )^{\!2}+\biggl (\frac{2x_1^2x_2\pm \sqrt{3}x_1x_2^2-x_2^3}{2}\biggr )^{\!2}. \end{aligned}$$

The first one minimizes the sum of the squares of the coefficients of the \(p_i\), the second one maximizes it, over all sos representations \(f=\sum _{i=1}^rp_i^2\) of f.
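The correspondence between these representations and the parameters in (1) can be checked by forming \(\sum _iu_iu_i^t\) from the coefficient vectors. The following Python sketch does this in floating point; the function names and dictionary keys are illustrative assumptions, and the trace printed at the end is the sum of the squares of all coefficients.

```python
from math import sqrt, isclose

def gram_from_sos(vectors):
    """Gram matrix sum_i u_i u_i^t of the coefficient vectors u_i."""
    n = len(vectors[0])
    return [[sum(u[i] * u[j] for u in vectors) for j in range(n)] for i in range(n)]

def abc(G):
    """Read off (a, b, c) from a matrix of the shape (1)."""
    return (-G[0][2], -G[0][3], -G[1][3])

s = sqrt(3)
# Coefficient vectors with respect to x1^3, x1^2 x2, x1 x2^2, x2^3.
representations = {
    "first": [(1, 0, 0, 0), (0, 0, 0, 1)],
    "second": [(1, 0, -2, 0), (0, 2, 0, -1)],
    "third, upper signs": [(1, 0, -0.5, -s / 2), (0, 1, s / 2, -0.5)],
    "third, lower signs": [(1, 0, -0.5, s / 2), (0, 1, -s / 2, -0.5)],
}
for name, vecs in representations.items():
    G = gram_from_sos(vecs)
    trace = sum(G[i][i] for i in range(4))
    print(name, abc(G), trace)
```

In terms of (1) the trace equals \(2+2a+2c\), so it is smallest at \((0,0,0)\) and largest at (2, 0, 2) among the four points listed.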

3.6

We summarize what the formalism of Sect. 2 means. Let \(V\subseteq A\) be a linear subspace, \(\dim (V)<\infty \), and let \(f\in A\). We will say that a linear subspace \(U\subseteq V\) is a face subspace for f if U is a face subspace for the spectrahedron \(\mathrm {Gram}_V(f)\) in the sense of 2.8. In other words, U is a face subspace for f if there is \(\vartheta \in \mathrm {Gram}_V(f)\) with \(U={{\,\mathrm{im}\,}}(\vartheta )\). According to Proposition 2.10, the nonempty faces F of \(\mathrm {Gram}_V(f)\) are in bijection with the face subspaces U for f, via \(F\mapsto {{\mathscr {U}}}(F)\) and \(U\mapsto {{\mathscr {F}}}(U)\).

The dimension formula 2.14 for faces takes a particularly appealing form for Gram spectrahedra. If \(U\subseteq A\) is a linear subspace, let UU denote the linear subspace of A spanned by the products \(pp'\), \(p,p'\in U\).

Proposition 3.7

For \(U\subseteq V\) a face subspace for f, the face \({{\mathscr {F}}}(U)\) of \(\mathrm {Gram}_V(f)\) has dimension

$$\begin{aligned} \dim ({{\mathscr {F}}}(U))=\frac{1}{2}r(r+1)-s \end{aligned}$$

with \(r=\dim (U)\) and \(s=\dim (UU)\).

Proof

By Proposition 2.14, \(\dim ({{\mathscr {F}}}(U))\) is the dimension of the affine space \(\mu ^{-1}(f)\cap {\mathsf {S}}_2U\). Hence \(\dim ({{\mathscr {F}}}(U))=\dim (W)\) where W is the kernel of the surjective linear map \(\mu :{\mathsf {S}}_2U\rightarrow UU\). Since \(\dim (W)=\dim ({\mathsf {S}}_2U)-\dim (UU)=r(r+1)/2-s\), the proposition follows. \(\square \)

Corollary 3.8

Let \(f=\sum _{i=1}^rp_i^2\) with \(p_1,\dots ,p_r\in V\) linearly independent, let \(\vartheta =\sum _{i=1}^rp_i\otimes p_i\) be the corresponding Gram tensor of f. The dimension of the supporting face of \(\vartheta \) in \(\mathrm {Gram}_V(f)\) equals the number of independent linear relations between the products \(p_ip_j\), \(1\le i\le j\le r\).

We say that a sequence \(p_1,\dots ,p_r\) in A is quadratically independent if the \(\genfrac(){0.0pt}{}{r+1}{2}\) products \(p_ip_{\!j}\), \(1\le i\le j\le r\), are linearly independent. Using this terminology we get:

Corollary 3.9

A psd Gram tensor \(\sum _{i=1}^rp_i\otimes p_i\) of f, with \(p_1,\dots ,p_r\in V\) linearly independent, is an extreme point of \(\mathrm {Gram}_V(f)\) if and only if the sequence \(p_1,\dots ,p_r\) is quadratically independent. \(\square \)

In particular, whether or not \(\vartheta =\sum _{i=1}^rp_i\otimes p_i\) (with the \(p_i\) linearly independent) is an extreme point of \(\mathrm {Gram}_V(f)\), depends only on the linear subspace \(U:={{\,\mathrm{span}\,}}(p_1,\dots ,p_r)\), but not on \(f=\sum _{i=1}^rp_i^2\).
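For binary forms, the criterion of Corollary 3.9 and the dimension formula of Proposition 3.7 reduce to a rank computation. The following Python sketch represents a binary form of degree d by its list of \(d+1\) coefficients and computes \(\frac{1}{2}r(r+1)-\dim (UU)\) exactly over the rationals. The function names are illustrative assumptions, and the input forms are assumed to be linearly independent; the code does not check this.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

def multiply(p, q):
    """Product of two binary forms given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += Fraction(a) * Fraction(b)
    return out

def rank(rows):
    """Rank of a list of equally long rational vectors (Gaussian elimination)."""
    rows = [list(r) for r in rows]
    rk = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
    return rk

def face_dimension(forms):
    """r(r+1)/2 - dim(UU) for U spanned by the given (independent) forms."""
    r = len(forms)
    products = [multiply(p, q) for p, q in combinations_with_replacement(forms, 2)]
    return r * (r + 1) // 2 - rank(products)

# x1^3 and x2^3: the products x1^6, x1^3 x2^3, x2^6 are independent.
print(face_dimension([[1, 0, 0, 0], [0, 0, 0, 1]]))  # 0
# All four cubic monomials: 10 products spanning the 7-dimensional space of sextics.
print(face_dimension([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]))  # 3
```

The value 0 says that \(x_1^3,x_2^3\) is quadratically independent. The value 3 is consistent with the dimension stated in Example 3.5, under the assumption that the full space of cubic forms is a face subspace for that f.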

Corollary 3.10

Let \(f\in A\), let \(U\subseteq V\) be the linear subspace generated by all \(p\in V\) with \(f-p^2\in \Sigma V^2\). Then

$$\begin{aligned} \dim (\mathrm {Gram}_V(f))=\frac{r}{2}(r+1)-s, \end{aligned}$$

where \(r=\dim (U)\) and \(s=\dim (UU)\).

4 Quadratically Independent Binary Forms

4.1

Let k be a field, let A be a (commutative) k-algebra. If \(U\subseteq A\) is a k-linear subspace, let UU be the linear subspace of A spanned by the products \(pp'\), \(p,p'\in U\), as in 3.6. Assuming \(\dim (U)=r<\infty \), we say that U is quadratically independent if the natural multiplication map \({\mathsf {S}}_2U\rightarrow A\) is injective, i.e., if \(\dim (UU)=\genfrac(){0.0pt}{}{r+1}{2}\). A sequence \(p_1,\dots ,p_r\) of elements of A is quadratically independent if the \(p_i\) are a linear basis of a quadratically independent subspace U of A.

We will prove the following general result for binary forms:

Theorem 4.2

Let k be an infinite field, and let \(d,r\ge 1\) such that \(\genfrac(){0.0pt}{}{r+1}{2}\le 2d+1\). Then there exists a sequence of r binary forms of degree d over k that is quadratically independent.
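The numerical condition in Theorem 4.2 is necessary, because all products \(p_ip_j\) lie in the space of binary forms of degree 2d, which has dimension \(2d+1\). The following Python sketch tabulates, for small d, the largest r allowed by the condition. The function name `max_length` is an illustrative assumption.

```python
from math import comb

def max_length(d):
    """Largest r >= 1 with C(r+1, 2) <= 2d + 1."""
    r = 1
    while comb(r + 2, 2) <= 2 * d + 1:
        r += 1
    return r

for d in range(1, 8):
    print(d, max_length(d))
```

For instance, `max_length(3)` is 3 since \(6\le 7<10\), and `max_length(7)` is 5 since \(15\le 15<21\).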

4.3

For the rest of this section write \(A=k[x_1,x_2]=\bigoplus _{d\ge 0}A_d\), where \(A_d\) is the space of binary forms of degree d. Note that \(\dim (A_d)=d+1\). Clearly, the existence of a single quadratically independent sequence of length r in \(A_d\) implies that the generic length r sequence in \(A_d\) will be quadratically independent. We can therefore assume that the field k is algebraically closed. (This assumption is only made to simplify notation.)

4.4

Our proof of Theorem 4.2 proceeds by induction on \(r\ge 1\), the start being the case \(r=1\) and \(d=0\) (which is obvious). So let \(r\ge 2\) in the sequel. By induction there is a quadratically independent sequence \(q_1,\dots ,q_{r-1}\) in \(A_e\), where \(e\ge 0\) is minimal with \(\genfrac(){0.0pt}{}{r}{2}\le 2e+1\). Let \(d\ge 1\) be minimal with \(\genfrac(){0.0pt}{}{r+1}{2}\le 2d+1\). Given \(z_1,\dots ,z_m\in {{\mathbb {P}}}^1\) we put

$$\begin{aligned} W_d(z_1,\dots ,z_m):=\{f\in A_d: f(z_1)=\ldots =f(z_m)=0\}. \end{aligned}$$

Let \(\infty \in {{\mathbb {P}}}^1\) be a fixed point, let \(0\ne l\in A_1\) with \(l(\infty )=0\).

Lemma 4.5

Under these assumptions the following hold:

(a) For any linear subspace \(U\subseteq W_d(\infty )\) and any \(p\in U\),

$$\begin{aligned} \dim (pA_d\cap UU)\ge \max {\{\dim (U),\dim (UU)-d+1\}}. \end{aligned}$$

(b) There exists a subspace \(U\subseteq W_d(\infty )\) with \(\dim (U)=r-1\) and \(\dim (UU)=\genfrac(){0.0pt}{}{r}{2}\), together with a form \(p\in U\), such that equality holds in (a).

Proof

(a) From \(pU\subseteq pA_d\cap UU\) we get \(\dim (pA_d\cap UU)\ge \dim (U)\). Moreover \(\dim (pA_d+UU)\le 2d\) since \(pA_d+UU\subseteq W_{2d}(\infty )\), therefore \(\dim (pA_d\cap UU)\ge d+1+\dim (UU)-2d=\dim (UU)-d+1\).

(b) By induction we have a quadratically independent sequence \(q_1,\dots ,q_{r-1}\) in \(A_e\). Since \(e<d\), the \(r-1\) forms \(p_i:=l^{d-e}q_i\) (\(1\le i\le r-1\)) are in \(W_d(\infty )\) and are quadratically independent. Let \(V={{\,\mathrm{span}\,}}(p_2,\dots ,p_{r-1})\), we have \(\dim (VV)=\genfrac(){0.0pt}{}{r-1}{2}\). For sufficiently general \(q\in W_d(\infty )\) we claim that \(qA_d\cap VV=\{0\}\) (if \(r\le 5\)), resp. \(qA_d\cap VV\) has codimension \(d-1\) in VV (if \(r\ge 5\)). Indeed, if q has distinct zeros \(z_1,\dots ,z_{d-1},\infty \) in \({{\mathbb {P}}}^1\), we have \(qA_d\cap VV=W_{2d}(z_1,\dots ,z_{d-1})\cap VV\). For general enough choice of q, therefore, this intersection has codimension \(d-1\) in VV, resp. is zero if \(\dim (VV)\le d-1\) (which happens precisely for \(r\le 5\)). We can therefore modify \(p_1\in W_d(\infty )\) in such a way that

$$\begin{aligned} \dim (p_1A_d\cap VV)={\left\{ \begin{array}{ll}0&{}r\le 5,\\ \displaystyle \genfrac(){0.0pt}{}{r-1}{2}-d+1&{}r\ge 5\end{array}\right. } \end{aligned}$$

holds and the sequence \(p_1,\dots ,p_{r-1}\) remains quadratically independent. Writing \(U:={{\,\mathrm{span}\,}}(p_1,\dots ,p_{r-1})=kp_1\oplus V\) we have \(UU=p_1U\oplus VV\) since U is quadratically independent. Therefore

$$\begin{aligned} p_1A_d\cap UU=p_1U\oplus (p_1A_d\cap VV), \end{aligned}$$

and this subspace has dimension \(r-1\) (if \(r\le 5\)) resp. \(r-1+\genfrac(){0.0pt}{}{r-1}{2}-d+1=\genfrac(){0.0pt}{}{r}{2}-d+1\) (if \(r\ge 5\)). \(\square \)

4.6

According to Lemma 4.5, we can now fix a quadratically independent subspace \(U\subseteq W_d(\infty )\) with \(\dim (U)=r-1\) and such that

$$\begin{aligned} \dim (pA_d\cap UU)\ge {\left\{ \begin{array}{ll}r-1&{}r\le 5,\\ \displaystyle \genfrac(){0.0pt}{}{r}{2}-d+1&{}r\ge 5\end{array}\right. } \end{aligned}$$

holds for all \(p\in U\), with equality holding for p sufficiently general. We are going to show that we can extend U to a quadratically independent subspace of \(A_d\) of dimension r. Let \({{\mathbb {P}}}_U\) resp. \({{\mathbb {P}}}_{A_d}\) denote the projective spaces associated to the linear spaces U resp. \(A_d\), and consider the closed subvariety

$$\begin{aligned} X:=\{([p],[q])\in {{\mathbb {P}}}_U\times {{\mathbb {P}}}_{A_d}:pq\in UU\} \end{aligned}$$

of \({{\mathbb {P}}}_U\times {{\mathbb {P}}}_{A_d}\). (Here we write [p] for the element in \({{\mathbb {P}}}_U\) represented by \(0\ne p\in U\), and similarly [q] for \(0\ne q\in A_d\).) Let \(\pi _1:X\rightarrow {{\mathbb {P}}}_U\) and \(\pi _2:X\rightarrow {{\mathbb {P}}}_{A_d}\) denote the projections onto the two components.

Let \(\varepsilon \in \{0,1\}\) be defined by \(2d+1=\genfrac(){0.0pt}{}{r+1}{2}+\varepsilon \). We can calculate the dimension of X:

Lemma 4.7

\(\dim (X)=d-1\) if \(r\le 5\), and \(\dim (X)=d-1-\varepsilon \) if \(r\ge 5\).

Proof

Clearly \(\pi _1\) is surjective since \(([p],[p])\in X\) for \(0\ne p\in U\). For \(0\ne p\in U\), the fibre \(\pi _1^{-1}([p])\) has (projective) dimension \(\dim (pA_d\cap UU)-1\). From 4.6 we therefore see that the generic fibre of \(\pi _1\) has dimension \(r-2\) (if \(r\le 5\)) resp. \(\genfrac(){0.0pt}{}{r}{2}-d\) (if \(r\ge 5\)). It follows that \(\dim (X)=2r-4=d-1\) if \(r\le 5\), resp. \(\dim (X)=r-2+\genfrac(){0.0pt}{}{r}{2}-d=\genfrac(){0.0pt}{}{r+1}{2}-d-2=2d+1-\varepsilon -d-2=d-1-\varepsilon \) if \(r\ge 5\). \(\square \)
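The arithmetic in this proof can be checked mechanically. The following Python sketch is not part of the original argument: it assumes, as stated in 4.6, that the generic fibre of \(\pi _1\) has projective dimension \(\max \{r-1,\genfrac(){0.0pt}{}{r}{2}-d+1\}-1\), and the helper names `minimal_degree` and `dim_X` are ours.

```python
from math import comb

def minimal_degree(r):
    """Smallest d >= 0 with C(r+1, 2) <= 2d + 1 (the d of the induction step)."""
    d = 0
    while comb(r + 1, 2) > 2 * d + 1:
        d += 1
    return d

def dim_X(r):
    """dim(P_U) plus the generic fibre dimension of pi_1 (assumed as in 4.6)."""
    d = minimal_degree(r)
    base = r - 2                                  # dim P_U, since dim U = r - 1
    fibre = max(r - 1, comb(r, 2) - d + 1) - 1    # projective dimension of the generic fibre
    return base + fibre

for r in range(2, 40):
    d = minimal_degree(r)
    eps = 2 * d + 1 - comb(r + 1, 2)
    assert eps in (0, 1)
    # dim(VV) <= d - 1 happens precisely for r <= 5 (proof of Lemma 4.5)
    assert (comb(r - 1, 2) <= d - 1) == (r <= 5)
    if r <= 5:
        assert dim_X(r) == 2 * r - 4 == d - 1
    if r >= 5:
        assert dim_X(r) == d - 1 - eps
    assert dim_X(r) < d                           # dim X < dim P_{A_d}
```

The two cases overlap consistently at \(r=5\), where \(d=7\) and \(\varepsilon =0\).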

4.8

In particular, \(\dim (X)<\dim ({{\mathbb {P}}}_{A_d})\). For generically chosen \(q\in A_d\), therefore, we have \(\pi _2^{-1}([q])=\varnothing \), which means \(qU\cap UU=\{0\}\). In particular there is such \(q\in A_d\) with \(q(\infty )\ne 0\). Since \(qU\oplus UU\subseteq W_{2d}(\infty )\) and \(q^2\notin W_{2d}(\infty )\), we see that the r-dimensional subspace \(U+kq\) of \(A_d\) is quadratically independent. This completes the induction step, and thereby the proof of Theorem 4.2.

5 Pataki Range for Gram Spectrahedra of Binary Forms

5.1

Let \(n\ge 2\) be fixed. For \(d\ge 1\), \({{\mathbb {R}}}[x]_d\) denotes the space of forms of degree d in \({{\mathbb {R}}}[x]={{\mathbb {R}}}[x_1,\dots ,x_n]\). We write \(N_d=\dim ({{\mathbb {R}}}[x]_d)=\genfrac(){0.0pt}{}{n+d-1}{d}\). Let \(\Sigma _{2d}\subseteq {{\mathbb {R}}}[x]_{2d}\) denote the sums of squares cone, i.e., \(\Sigma _{2d}=\Sigma {{\mathbb {R}}}[x]_d^2\). For \(f\in \Sigma _{2d}\) let \(\mathrm {Gram}(f)\) be the (full) Gram spectrahedron of f, i.e., \(\mathrm {Gram}(f):=\mathrm {Gram}_V(f)\) with \(V:={{\mathbb {R}}}[x]_d\). Since \(\mathrm {Gram}(f)=\mu ^{-1}(f)\cap {\mathsf {S}}_2^{\scriptscriptstyle +}V\) and

$$\begin{aligned} \dim (\mu ^{-1}(f))=\dim ({\mathsf {S}}_2V)-\dim (VV)=\genfrac(){0.0pt}{}{N_d+1}{2}-N_{2d}, \end{aligned}$$

the Pataki interval (2.16) for \(\mathrm {Gram}(f)\) is characterized by the inequalities

$$\begin{aligned} N_{2d}+\genfrac(){0.0pt}{}{N_d-r+1}{2}\le \genfrac(){0.0pt}{}{N_d+1}{2}\quad \text { and }\quad \genfrac(){0.0pt}{}{r+1}{2}\le N_{2d}. \end{aligned}$$

For \(f\in {{\,\mathrm{int}\,}}(\Sigma _{2d})\) we have \(\dim (\mathrm {Gram}(f))=\dim (\mu ^{-1}(f))=\genfrac(){0.0pt}{}{N_d+1}{2}-N_{2d}\). In the case \(n=2\) of binary forms this means \(\dim (\mathrm {Gram}(f))=\genfrac(){0.0pt}{}{d+2}{2}-(2d+1)=\genfrac(){0.0pt}{}{d}{2}\) for \(f\in {{\,\mathrm{int}\,}}(\Sigma _{2d})\), and the Pataki range is described by the inequalities \(r\ge 2\) and \(\genfrac(){0.0pt}{}{r+1}{2}\le 2d+1\). (A rank one point exists in \(\mathrm {Gram}(f)\) if and only if f is a perfect square.)
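As a quick sanity check of these inequalities, here is a minimal Python sketch (the function names are ours, not notation from the text). It evaluates the two general Pataki inequalities for arbitrary n and compares them, for \(n=2\), with the simplified description \(r\ge 2\), \(\genfrac(){0.0pt}{}{r+1}{2}\le 2d+1\).

```python
from math import comb

def pataki_range_general(n, d):
    """Ranks r satisfying both Pataki inequalities for forms of degree 2d in n variables."""
    N_d = comb(n + d - 1, d)
    N_2d = comb(n + 2 * d - 1, 2 * d)
    return [r for r in range(1, N_d + 1)
            if N_2d + comb(N_d - r + 1, 2) <= comb(N_d + 1, 2)
            and comb(r + 1, 2) <= N_2d]

def pataki_range_binary(d):
    """Simplified description for n = 2: r >= 2 and C(r+1, 2) <= 2d + 1."""
    return [r for r in range(2, d + 2) if comb(r + 1, 2) <= 2 * d + 1]

def gram_dimension_binary(d):
    """dim Gram(f) for f in the interior of Sigma_{2d}, n = 2."""
    return comb(d + 2, 2) - (2 * d + 1)

print(pataki_range_binary(5))   # [2, 3, 4]
```

For instance, sextics (\(d=3\)) have Pataki range \(\{2,3\}\) and \(\dim \mathrm {Gram}(f)=3\).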

In what follows we always work with binary forms, i.e., \(n=2\) and \({{\mathbb {R}}}[x]={{\mathbb {R}}}[x_1,x_2]\). From Theorem 4.2 we get:

Corollary 5.2

Let \(d\ge 1\) and \(r\ge 0\) such that \(\genfrac(){0.0pt}{}{r+1}{2}\le 2d+1\). The set of quadratically independent r-tuples \((p_1,\dots ,p_r)\) in \(({{\mathbb {R}}}[x_1,x_2]_d)^r\) is open and dense.

Here is our first main result on extreme points of Gram spectrahedra. It gives an affirmative answer to [5, Quest. 4.2].

Theorem 5.3

For any given \(d\ge 1\), there is an open dense set of psd binary forms f of degree 2d for which the Gram spectrahedron \(\mathrm {Gram}(f)\) has extreme points of all ranks in the Pataki interval.

Proof

Let \(k\ge 1\) be the largest integer with \(\genfrac(){0.0pt}{}{k+1}{2}\le 2d+1\), so the Pataki interval for Gram spectrahedra of degree 2d forms is \(\{2,3,\dots ,k\}\). Fix \(r\in \{2,3,\dots ,k\}\), and let \(W_r\subseteq ({{\mathbb {R}}}[x]_d)^r\) be the set of all quadratically independent r-tuples \((p_1,\dots ,p_r)\) of forms. By Corollary 5.2, the set \(W_r\) is open and dense in \(({{\mathbb {R}}}[x]_d)^r\). Let

$$\begin{aligned} S_r:=\{p_1^2+\cdots +p_r^2:(p_1,\dots ,p_r)\in W_r\}. \end{aligned}$$

Since every psd form in \({{\mathbb {R}}}[x]\) is a sum of two squares, the set \(S_r\) is a dense semialgebraic subset of \(\Sigma _{2d}\), which means that \(\dim (\Sigma _{2d}\smallsetminus S_r)<\dim (\Sigma _{2d})\). Whenever \((p_1,\dots ,p_r)\in W_r\), if we put \(f:=\sum _{i=1}^rp_i^2\), the symmetric tensor \(\sum _{i=1}^rp_i\otimes p_i\) is an extreme point of \(\mathrm {Gram}(f)\) of rank r (Corollary 3.9). Therefore every \(f\in S_r\) has a rank r extreme point in its Gram spectrahedron. It now suffices to consider the intersection \(S:=\bigcap _{r=2}^kS_r\). Then S is a dense semialgebraic subset of \(\Sigma _{2d}\) since \(\dim (\Sigma _{2d}\smallsetminus S)<\dim (\Sigma _{2d})\). And for every \(f\in S\), the Gram spectrahedron of f has extreme points of all ranks in the Pataki interval. \(\square \)

Remark 5.4

It seems that only few families of spectrahedra of unbounded dimensions are known which have extreme points of all ranks in the Pataki interval (see the discussion in [5] before Question 4.2). Theorem 5.3 provides such a family. Another notable example is the family of elliptopes, i.e., the sets of correlation matrices of fixed size. The possible ranks of extreme points, and in fact of arbitrary faces, were determined for elliptopes in [7, 8]. In particular, the entire Pataki interval is realized by extreme points.

We can also determine the dimensions of the sets of extreme points of a fixed rank, for suitably general f. To have a short notation, let us write \(\mathrm {Ex}_r(f)\) for the (semialgebraic) set of all extreme points of \(\mathrm {Gram}(f)\) of rank r.

Corollary 5.5

Let \(d\ge 1\). There is an open dense subset U of \(\Sigma _{2d}\) such that, for every \(f\in U\) and every r in the Pataki range, we have

$$\begin{aligned} \dim (\mathrm {Ex}_r(f))=\frac{(r-2)(2d-r+1)}{2}. \end{aligned}$$

Proof

Let r be in the Pataki range. Using notation from the previous proof, consider the sum of squares map \(\sigma :W_r\rightarrow {{\mathbb {R}}}[x]_{2d}\), \((p_1,\dots ,p_r)\mapsto \sum _{i=1}^rp_i^2\). Its image is dense in \(\Sigma _{2d}\). It follows from local triviality of semialgebraic maps (Hardt’s theorem, see e.g. [3, Thm. 9.3.2]) that, for every f in an open dense set \(U_r\subseteq \Sigma _{2d}\), the fibre \(\sigma ^{-1}(f)\) has dimension \(r(d+1)-(2d+1)\). The orthogonal group O(r) has dimension \(\genfrac(){0.0pt}{}{r}{2}\). It acts on the fibre \(\sigma ^{-1}(f)\) with trivial stabilizer subgroups, and the orbits are precisely the extreme points of \(\mathrm {Gram}(f)\) of rank r. So we get

$$\begin{aligned} \dim (\mathrm {Ex}_r(f))=r(d+1)-(2d+1)-\genfrac(){0.0pt}{}{r}{2}=\frac{(r-2)(2d-r+1)}{2} \end{aligned}$$

for every \(f\in U_r\). Take U to be the intersection of the sets \(U_r\) for all r in the Pataki range, to get the desired conclusion. \(\square \)
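The final simplification is elementary, and can be confirmed by a short Python sketch (the function names are ours). It compares the dimension count from the proof with the closed form of Corollary 5.5.

```python
from math import comb

def ex_dim(d, r):
    """Fibre dimension r(d+1) - (2d+1) minus dim O(r) = C(r, 2)."""
    return r * (d + 1) - (2 * d + 1) - comb(r, 2)

def closed_form(d, r):
    # (r - 2) and (2d - r + 1) have opposite parity, so the product is even.
    return (r - 2) * (2 * d - r + 1) // 2

for d in range(1, 30):
    for r in range(2, d + 2):
        assert ex_dim(d, r) == closed_form(d, r)
```

For \(r=2\) the value is 0, in agreement with the fact that a general f has only finitely many Gram tensors of rank two.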

Remark 5.6

For strictly positive f with \(\deg (f)=2d\ge 12\), the boundary of \(\mathrm {Gram}(f)\) is a union of positive dimensional faces. Indeed, let \(\vartheta \) be an extreme point of \(\mathrm {Gram}(f)\), and let r be its rank. Let \(\vartheta '\) be any extreme point of rank 2, different from \(\vartheta \). Now r satisfies \(\genfrac(){0.0pt}{}{r+1}{2}\le 2d+1\), and one checks readily that this implies \(r+2\le d\) since \(d\ge 6\). In particular, \(r+2<d+1\), which means that the supporting face F of \(\{\vartheta ,\vartheta '\}\) is proper.

The fact that \(\partial \mathrm {Gram}(f)\) is a union of positive dimensional faces is reflected by the fact that, for \(2d\ge 8\) and any r in the Pataki range, the number \((r-2)(2d-r+1)/2\) from Corollary 5.5 is smaller than the dimension of the boundary of \(\mathrm {Gram}(f)\), which is \(\genfrac(){0.0pt}{}{d}{2}-1\), for general \(f\in \Sigma _{2d}\).
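The two numerical thresholds in this remark (\(d\ge 6\) for \(r+2\le d\), and \(2d\ge 8\) for the dimension comparison) can be tested directly. The Python sketch below is only an illustration over a finite range of d; the function names are ours.

```python
from math import comb

def max_pataki_rank(d):
    """Largest r with C(r+1, 2) <= 2d + 1 (binary forms of degree 2d, d >= 1)."""
    r = 2
    while comb(r + 2, 2) <= 2 * d + 1:
        r += 1
    return r

def ex_dim(d, r):
    """The number (r - 2)(2d - r + 1)/2 from Corollary 5.5."""
    return (r - 2) * (2 * d - r + 1) // 2

def rank_bound_holds(d):
    """True if r + 2 <= d for every r in the Pataki range."""
    return max_pataki_rank(d) + 2 <= d

def extreme_points_are_thin(d):
    """True if dim Ex_r(f) < C(d, 2) - 1 for every r in the Pataki range."""
    return all(ex_dim(d, r) < comb(d, 2) - 1
               for r in range(2, max_pataki_rank(d) + 1))

print([d for d in range(1, 12) if rank_bound_holds(d)])   # [6, 7, 8, 9, 10, 11]
```

Both bounds are sharp: for \(d=5\) the rank \(r=4\) gives \(r+2=6>d\), and for \(d=3\), \(r=3\) both dimensions equal 2.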

6 Edges Between Extreme Points of Rank Two

6.1

We keep considering binary forms, so we work in \({{\mathbb {R}}}[x]={{\mathbb {R}}}[x_1,x_2]\). Let \(f\in \Sigma _{2d}\). Recall ([4, Exam. 2.13], [5, Prop. 4.1]) how Gram tensors \(\vartheta \in \mathrm {Gram}(f)\) of rank \(\le 2\) correspond to product decompositions \(f=g\overline{g}\) with \(g\in {{\mathbb {C}}}[x]\), where \(\overline{g}\) is the form that is coefficient-wise complex conjugate to g. Any \(\vartheta \in \mathrm {Gram}(f)\) with \({{\,\mathrm{rk}\,}}(\vartheta )\le 2\) has the form \(\vartheta =p\otimes p+q\otimes q\) where \(p,q\in {{\mathbb {R}}}[x]\) satisfy \(f=p^2+q^2=(p+iq)(p-iq)\). Conversely, a factorization \(f=g\overline{g}\) with \(g\in {{\mathbb {C}}}[x]\) gives a Gram tensor \(\vartheta =p\otimes p+q\otimes q\) of f, namely \(p=(g+\overline{g})/2\) and \(q=(g-\overline{g})/(2i)\in {{\mathbb {R}}}[x]\). Two factorizations \(f=g\overline{g}=h\overline{h}\) give the same Gram tensor of f if and only if h is a scalar multiple of g or \(\overline{g}\). In particular, if we assume that f has no multiple complex roots, we see that f has (no Gram tensors of rank one and) precisely \(2^{d-1}\) Gram tensors of rank two. All of them are extreme points of \(\mathrm {Gram}(f)\).
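The correspondence can be made concrete numerically. The following Python sketch is an illustration under simplifying assumptions that are ours: f is dehomogenized (\(x_2=1\)) and monic, polynomials are coefficient lists with the lowest degree first, and f is given by one root from each pair of complex conjugate roots. Fixing the choice for the first pair identifies g with \(\overline{g}\), which leaves \(2^{d-1}\) factorizations.

```python
from itertools import product

def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def rank_two_representations(roots):
    """roots: one root z_j from each conjugate pair of f.
    Returns pairs (p, q) of real coefficient lists with f = p^2 + q^2."""
    reps = []
    # The first root is never conjugated, so g and conj(g) are counted once.
    for signs in product([1, -1], repeat=len(roots) - 1):
        g = [1]
        for z, s in zip(roots, (1,) + signs):
            w = z if s == 1 else z.conjugate()
            g = poly_mul(g, [-w, 1])      # multiply by (x - w)
        p = [c.real for c in g]           # p = (g + conj g) / 2
        q = [c.imag for c in g]           # q = (g - conj g) / (2i)
        reps.append((p, q))
    return reps

reps = rank_two_representations([1j, 1 + 2j, 2 + 1j])
print(len(reps))   # 4 = 2**(3 - 1)
```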

6.2

When g has only real zeros, \(\mathrm {Gram}(f)\cong \mathrm {Gram}(fg^2)\) naturally. Hence we discuss \(\mathrm {Gram}(f)\) for strictly positive f only. Then \(\dim (\mathrm {Gram}(f))=\genfrac(){0.0pt}{}{d}{2}\), and points in the relative interior of \(\mathrm {Gram}(f)\) have rank \(d+1\). Let \(d\ge 1\), let \(f\in \Sigma _{2d}\) be strictly positive, and let us first consider the cases of very small degree. If \(d=1\) then \(\mathrm {Gram}(f)\) is a single point of rank two. If \(d=2\) then \(\mathrm {Gram}(f)\) is a nondegenerate interval, the relative interior of which consists of points of rank 3. If f has simple roots, both end points have rank 2. Otherwise f is a square, and one end point has rank 1, the other has rank 2. The case \(d=3\) is covered in the next result (see also [5, Sect. 4.2]):

Proposition 6.3

Let \(f\in \Sigma _6\) be any strictly positive form of degree 6. Then

(a) \(\mathrm {Gram}(f)\) has no faces of dimension 1 or 2,

(b) \(\mathrm {Gram}(f)\) has four, three, or two extreme points of rank 2,

(c) all other extreme points have rank 3.

Proof

The extreme points of rank \(\le 2\) correspond to complex factorizations \(f=p\overline{p}\). Depending on whether f has six, four, or two different roots, there are four, three, or two essentially different such factorizations. The corresponding psd Gram tensors all have rank two: a Gram tensor of rank one would make f the square of a real cubic form, and such a form has a real zero, contradicting the strict positivity of f. If \(\mathrm {Gram}(f)\) had a proper face of positive dimension, its rank would have to be 3. To prove (a) it therefore suffices to show that, for any two extreme points \(\vartheta \ne \vartheta '\) of rank \(\le 2\), the segment \([\vartheta ,\vartheta ']\) meets the interior of \(\mathrm {Gram}(f)\). Let \(f=p\overline{p}=q\overline{q}\) be the two factorizations corresponding to \(\vartheta \) and \(\vartheta '\). We can assume \(p=gh\), \(q=g\overline{h}\) with

$$\begin{aligned} g=(x-a_1)(x-a_2),\qquad h=x-a_3, \end{aligned}$$

and \(\{a_1,a_2,a_3\}\cap \{\overline{a}_1,\overline{a}_2,\overline{a}_3\}=\varnothing \). For the supporting face F of \((\vartheta +\vartheta ')/2\) we have

$$\begin{aligned} {{\mathscr {U}}}(F)={{\,\mathrm{span}\,}}(gh,g\overline{h},\overline{g}h,\overline{gh}). \end{aligned}$$

Calculating the determinant of the coefficient matrix of these four cubic forms gives, up to sign,

$$\begin{aligned} (a_1-\overline{a}_1)(a_1-\overline{a}_2)(a_2-\overline{a}_1)(a_2-\overline{a}_2)(a_3-\overline{a}_3)^2\ne 0. \end{aligned}$$

This means that \((\vartheta +\vartheta ')/2\) has rank 4, and hence lies in the interior of \(\mathrm {Gram}(f)\). \(\square \)
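The determinant in this proof can be checked numerically. The sketch below is our own illustration, not the computation used in the text: it dehomogenizes (\(x_2=1\)), writes the four cubics in the monomial basis, and compares the determinant of the resulting \(4\times 4\) matrix with the displayed product up to sign.

```python
def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4 x 4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def span_determinant(a1, a2, a3):
    """Determinant of the coefficient matrix of gh, g*conj(h), conj(g)*h, conj(gh)."""
    g = poly_mul([-a1, 1], [-a2, 1])            # g = (x - a1)(x - a2)
    gbar = [c.conjugate() for c in g]
    h, hbar = [-a3, 1], [-a3.conjugate(), 1]    # h = x - a3
    rows = [poly_mul(g, h), poly_mul(g, hbar), poly_mul(gbar, h), poly_mul(gbar, hbar)]
    return det(rows)

def predicted(a1, a2, a3):
    """The product displayed in the proof of Proposition 6.3."""
    b1, b2, b3 = a1.conjugate(), a2.conjugate(), a3.conjugate()
    return (a1 - b1) * (a1 - b2) * (a2 - b1) * (a2 - b2) * (a3 - b3) ** 2

d = span_determinant(1j, 1 + 2j, 2 + 1j)
P = predicted(1j, 1 + 2j, 2 + 1j)
print(min(abs(d - P), abs(d + P)) < 1e-9)   # True
```

If \(a_3\) is real then \(h=\overline{h}\), two rows coincide, and both sides vanish.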

When the positive sextic f is general, the algebraic boundary of \(\mathrm {Gram}(f)\) is a Kummer surface, see [12, Sect. 5] and [5, Sect. 4.2]. In this case, assertion (a) also follows from the fact that a Kummer surface in \({{\mathbb {P}}}^3\) does not contain a line.

Now we are interested in arbitrary degrees. Let \(f\in {{\mathbb {R}}}[x]_{2d}\) be a sufficiently general positive form. We ask: For which pairs \(\vartheta \ne \vartheta '\) in \(\mathrm {Ex}_2(f)\) is the line segment \([\vartheta ,\vartheta ']\) an edge of \(\mathrm {Gram}(f)\), i.e., a one-dimensional face?

Theorem 6.4

Let \(d\ge 4\). For all forms f in an open dense subset of \(\Sigma _{2d}\), the following is true:

(a) \(d=4\): For each of the \(\genfrac(){0.0pt}{}{8}{2}=28\) pairs \(\vartheta \ne \vartheta '\) in \(\mathrm {Ex}_2(f)\), the interval \([\vartheta ,\vartheta ']\) is contained in the boundary of \(\mathrm {Gram}(f)\). For precisely 16 of these pairs, \([\vartheta ,\vartheta ']\) is a face of \(\mathrm {Gram}(f)\). These 16 edges form a graph isomorphic to \(K_{4,4}\), the complete bipartite graph on two sets of four points each.

(b) \(d\ge 5\): For any two \(\vartheta \ne \vartheta '\) in \(\mathrm {Ex}_2(f)\), the line segment \([\vartheta ,\vartheta ']\) is a face of \(\mathrm {Gram}(f)\).

6.5

Let \(f=p\overline{p}=q\overline{q}\) be complex factorizations of f that correspond to \(\vartheta \) and \(\vartheta '\), respectively. The supporting face F of \([\vartheta ,\vartheta ']\) therefore has \({{\mathscr {U}}}(F)_{{\mathbb {C}}}={{\,\mathrm{span}\,}}(p,\overline{p},q,\overline{q})\subseteq {{\mathbb {C}}}[x]_d\), and \(\dim (F)\) is the number of quadratic relations between \(p,\overline{p},q,\) and \(\overline{q}\). We can split \(p=gh\) into two nontrivial complex factors in such a way that \(\vartheta '\) corresponds to the factorization \(f=q\overline{q}\) with \(q=g\overline{h}\). Thus

$$\begin{aligned} {{\mathscr {U}}}(F)_{{\mathbb {C}}}={{\,\mathrm{span}\,}}(gh,g\overline{h},\overline{g}h,\overline{gh}). \end{aligned}$$

For general f we have \(\dim ({{\mathscr {U}}}(F))=4\). Assuming this, \([\vartheta ,\vartheta ']\) is an edge of \(\mathrm {Gram}(f)\) if and only if there is only one quadratic relation between \(p=gh\), \(\overline{p}=\overline{gh}\), \(q=g\overline{h}\), and \(\overline{q}=\overline{g}h\), i.e., if and only if the nine products

$$\begin{aligned} g^{a_1}\overline{g}^{\,a_2}h^{b_1}\overline{h}^{\,b_2},\qquad a_i,b_i\ge 0,\quad a_1+a_2=b_1+b_2=2, \end{aligned}$$
(⋆)

are linearly independent. (To be sure, there always is one quadratic relation between \(p,\overline{p},q\) and \(\overline{q}\), namely \(p\overline{p}=q\overline{q}\).) The key case for Theorem 6.4 is \(d=4\). It is made more explicit in the next two lemmas.

Lemma 6.6

Let \(g_1,g_2\in {{\mathbb {C}}}[x]\) have degree 3, let \(h_1,h_2\in {{\mathbb {C}}}[x]\) have degree 1. Then the nine octic forms

$$\begin{aligned} g_1^{a_1}g_2^{a_2}h_1^{b_1}h_2^{b_2},\qquad a_i,b_i\ge 0,\quad a_1+a_2=b_1+b_2=2 \end{aligned}$$
(3)

are linearly independent if (and only if) \(\gcd (g_1,g_2)= \gcd (h_1,h_2)=1\).

Lemma 6.7

For arbitrary \(g_1,g_2,h_1,h_2\in {{\mathbb {C}}}[x]\) of degree 2, the nine octic forms (3) are linearly dependent.

Corollary 6.8

Let \(g_1,g_2,h_1,h_2\in {{\mathbb {C}}}[x]\) with \(\deg (g_1)=\deg (g_2)=\delta \ge 1\), \(\deg (h_1)=\deg (h_2)=\varepsilon \ge 1\), and \(\delta +\varepsilon \ge 5\). If \(g_1,g_2,h_1,h_2\) are chosen generically, the nine forms (3) (of degree \(2(\delta +\varepsilon )\)) are linearly independent.

6.9

Before establishing 6.6, 6.7, and 6.8, we show how these imply Theorem 6.4. First let \(d=4\), let \(f\in \Sigma _8\) have simple complex zeros, and let \(f=p\overline{p}=q\overline{q}\) be two nontrivial factorizations corresponding to extreme points \(\vartheta \ne \vartheta '\) in \(\mathrm {Gram}(f)\) (cf. 6.5). Since \({{\,\mathrm{rk}\,}}(\vartheta +\vartheta ')\le {{\,\mathrm{rk}\,}}(\vartheta )+{{\,\mathrm{rk}\,}}(\vartheta ')=4\), it is obvious that \([\vartheta ,\vartheta ']\) is contained in the boundary of \(\mathrm {Gram}(f)\). Write \(p=g_1g_2\) and \(q=g_1\overline{g}_2\) as in 6.5. If \(\deg (g_1)=\deg (g_2)=2\), the nine forms (\(\star \)) (see 6.5) are linearly dependent by Lemma 6.7, and so \([\vartheta ,\vartheta ']\) is not an edge. Otherwise \(\{\deg (g_1),\deg (g_2)\}=\{1,3\}\). By Lemma 6.6, therefore, the nine forms (\(\star \)) are linearly independent, and so \([\vartheta ,\vartheta ']\) is an edge.

This proves the \(d=4\) case of Theorem 6.4. Indeed, the eight points of \(\mathrm {Ex}_2(f)\), corresponding to the eight essentially different factorizations \(f=p\overline{p}\), decompose into two subclasses of four points each, where two different factorizations \(f=p\overline{p}=q\overline{q}\) belong to the same subclass if and only if p and q have precisely two roots in common.
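The combinatorics of this count can be replayed in a few lines of Python. The encoding is ours and purely illustrative: a factorization \(f=p\overline{p}\) is recorded by a sign vector in \(\{\pm 1\}^4\) (which root of each conjugate pair divides p), taken modulo a global sign change (p versus \(\overline{p}\)). Following the argument above, a pair is an edge when the factor that gets conjugated has degree 1 or 3, and not an edge when it has degree 2.

```python
from itertools import combinations, product

# Normalize each class s ~ -s by requiring the first sign to be +1.
vertices = [(1,) + s for s in product([1, -1], repeat=3)]

def conjugated_degree(s, t):
    """deg(h) in p = g*h, q = g*conj(h), minimized over q versus conj(q)."""
    diff = sum(a != b for a, b in zip(s, t))
    return min(diff, 4 - diff)

pairs = list(combinations(vertices, 2))
edges = [(s, t) for s, t in pairs if conjugated_degree(s, t) == 1]       # degrees {1, 3}
non_edges = [(s, t) for s, t in pairs if conjugated_degree(s, t) == 2]   # degrees {2, 2}

# The two subclasses: parity of the number of conjugated roots.
class_even = [s for s in vertices if s.count(-1) % 2 == 0]
class_odd = [s for s in vertices if s.count(-1) % 2 == 1]

print(len(pairs), len(edges), len(non_edges))   # 28 16 12
```

Every edge joins `class_even` to `class_odd`, and since there are \(4\cdot 4=16\) such pairs, the edge graph is \(K_{4,4}\).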

If \(d\ge 5\), if \(f\in \Sigma _{2d}\) is sufficiently general, and if \(f=p\overline{p}=q\overline{q}\) are two factorizations belonging to \(\vartheta \ne \vartheta '\), Corollary 6.8 shows that (\(\star \)) are linearly independent, whence \([\vartheta ,\vartheta ']\) is an edge.

Proof of Lemma 6.6

It is obvious that \(\gcd (g_1,g_2)=\gcd (h_1,h_2)=1\) are necessary for the nine octics to be linearly independent. For the converse assume these conditions, and consider the ideals \(I=\langle g_1,g_2\rangle \) and \(J=\langle h_1,h_2\rangle \) in \(A=k[x]\). We have to prove \((I^2J^2)_8=A_8\). Now \(\gcd (h_1,h_2)=1\) implies \(J_1=A_1\) and hence \((J^2)_2=A_2\). So \((I^2J^2)_8\) contains \((I^2)_6A_2=(I^2)_8\), and it is enough to prove \((I^2)_8=A_8\). The ideal I is a complete intersection since \(\gcd (g_1,g_2)=1\), hence a Gorenstein ideal of socle degree 4. So \(I_5=A_5\), and so \((I^2)_8\) contains \(I_5I_3=A_5I_3=I_8=A_8\). \(\square \)

Proof of Lemma 6.7

Let \(U_1={{\,\mathrm{span}\,}}(g_1,g_2)\) and \(U_2={{\,\mathrm{span}\,}}(h_1,h_2)\), both subspaces of \({{\mathbb {C}}}[x]_2\); we have to show that \(U_1U_1U_2U_2\) is a proper subspace of \({{\mathbb {C}}}[x]_8\). We may assume \(\dim (U_1)=\dim (U_2)=2\), since otherwise the nine forms are obviously linearly dependent. Since \(\dim ({{\mathbb {C}}}[x]_2)=3\), we have \(U_1\cap U_2\ne \{0\}\), so we find \(p,g,h\in {{\mathbb {C}}}[x]_2\) with \(U_1={{\,\mathrm{span}\,}}(p,g)\) and \(U_2={{\,\mathrm{span}\,}}(p,h)\). So we have \(U_1U_2\subseteq p{{\mathbb {C}}}[x]_2+{{\mathbb {C}}}gh\), which implies

$$\begin{aligned} (U_1U_2)(U_1U_2)\subseteq p^2{{\mathbb {C}}}[x]_4+pgh{{\mathbb {C}}}[x]_2+{{\mathbb {C}}}g^2h^2. \end{aligned}$$

The first two subspaces on the right intersect non-trivially, since both contain \(p^2gh\). Hence the right hand side has dimension \(\le 8\), and so it is a proper subspace of \({{\mathbb {C}}}[x]_8\). \(\square \)
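Lemmas 6.6 and 6.7 are easy to test on examples with exact arithmetic. The Python sketch below is only an illustration: the example forms have integer coefficients and are our choice, a binary form of degree m is stored as a list of \(m+1\) coefficients (lowest power of \(x_1\) first), and the rank of the \(9\times 9\) coefficient matrix of the nine octics is computed over \({\mathbb {Q}}\).

```python
from fractions import Fraction
from itertools import product

def poly_mul(a, b):
    """Multiply forms given as coefficient lists of fixed length deg + 1."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def rank(rows):
    """Rank over Q by Gaussian elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            factor = m[i][col] / m[rk][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[rk])]
        rk += 1
    return rk

def nine_products(g1, g2, h1, h2):
    """Coefficient vectors of g1^a1 g2^a2 h1^b1 h2^b2 with a1 + a2 = b1 + b2 = 2."""
    gg = [poly_mul(g1, g1), poly_mul(g1, g2), poly_mul(g2, g2)]
    hh = [poly_mul(h1, h1), poly_mul(h1, h2), poly_mul(h2, h2)]
    return [poly_mul(a, b) for a, b in product(gg, hh)]

# Degrees (3, 3, 1, 1) with coprime pairs: independent, as in Lemma 6.6.
print(rank(nine_products([1, 0, 0, 1], [0, -2, 0, 1], [1, 1], [-1, 1])))   # 9
# Degrees (2, 2, 2, 2): Lemma 6.7 predicts rank at most 8.
print(rank(nine_products([1, 2, 3], [0, 1, 5], [2, 0, 1], [1, 1, 1])) <= 8)   # True
```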

Proof of Corollary 6.8

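Linear independence of the nine forms (3) is a Zariski open condition on \((g_1,g_2,h_1,h_2)\), so it suffices to prove the assertion for one specific choice of the \(g_i\) and \(h_i\). Since \(\delta +\varepsilon \ge 5\), we can assume \(\delta \ge 3\), after exchanging the roles of the \(g_i\) and the \(h_i\) if necessary. Let \(G_1,G_2,H_1,H_2\) satisfy \(\deg (G_i)=3\), \(\deg (H_i)=1\), and \(\gcd (G_1,G_2)=\gcd (H_1,H_2)=1\), and let \(\ell \ne 0\) be any linear form. Then by Lemma 6.6, the assertion is true for \(g_i:=G_i\ell ^{\delta -3}\), \(h_i:=H_i\ell ^{\varepsilon -1}\), \(i=1,2\), since the nine forms (3) are then the nine octics of Lemma 6.6 multiplied by \(\ell ^{2(\delta +\varepsilon -4)}\). \(\square \)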