Extreme points of Gram spectrahedra of binary forms

The Gram spectrahedron $\text{Gram}(f)$ of a form $f$ with real coefficients parametrizes the sum of squares decompositions of $f$, modulo orthogonal equivalence. For $f$ a sufficiently general positive binary form of arbitrary degree, we show that $\text{Gram}(f)$ has extreme points of all ranks in the Pataki range. This is the first example of a family of spectrahedra of arbitrarily large dimensions with this property. We also calculate the dimension of the set of rank $r$ extreme points, for any $r$. Moreover, we determine the pairs of rank two extreme points for which the connecting line segment is an edge of $\text{Gram}(f)$.


Introduction
Given a form $f$ that is a sum of squares of forms, there are usually many inequivalent ways of writing $f$ as a sum of squares. The set $\text{Gram}(f)$ of all sum of squares (sos) representations of $f$, modulo orthogonal equivalence, has a natural structure of a spectrahedron, so it is an object of geometric nature. Studying the convex-geometric properties of $\text{Gram}(f)$, and in particular its extreme points, is relevant for the problem of optimizing linear functions over all sum of squares representations of $f$. With probability one, the optimizer for a random such problem will be a unique extreme point of $\text{Gram}(f)$. From an algebraic perspective, studying the extreme points of the Gram spectrahedron is natural since every sos representation of $f$ arises as a convex combination of representations that correspond to extreme points of $\text{Gram}(f)$.
Although the basic idea goes back to Choi, Lam and Reznick [4] in 1995, a systematic study of Gram spectrahedra was taken up only recently. Gram spectrahedra of ternary quartics were considered by Plaumann, Sturmfels and Vinzant in [11]. The paper [5] by Chua, Plaumann, Sinn and Vinzant is a survey of results and open questions on Gram spectrahedra. Among other things, the authors discuss Gram spectrahedra of binary forms, and for sextic binary forms they relate the Gram spectrahedra to Kummer surfaces in $\mathbb{P}^3$, see also [9].
Any point of a spectrahedron has a rank. The Pataki interval describes the range of values that the rank of an extreme point of a general spectrahedron may have. For points of Gram spectrahedra, the rank is identified with the length of the corresponding sum of squares decomposition. In particular, the sum of squares length of $f$, or the collection of different sum of squares representations of a given length, are naturally encoded in $\text{Gram}(f)$. These are invariants that have received a lot of attention in particular cases, starting with Hilbert [6], and more recently [12], for ternary quartics. Lately, results of a similar spirit were obtained for varieties of minimal or almost minimal degree, see [2,1,14,5]. In this paper we focus on Gram spectrahedra in the most basic case possible, namely binary forms. For $f$ a sufficiently general positive binary form of arbitrary degree, we show that $\text{Gram}(f)$ has extreme points of all ranks in the Pataki range (Theorem 5.3). This gives a positive answer to Question 4.2 from [5]. It also establishes the first known instance of a family of spectrahedra of arbitrary dimensions with this property. In fact we calculate the dimension of the set of extreme points of any given rank $r$, for $f$ sufficiently general (Corollary 5.4).
The proofs for these facts rely on a purely algebraic result of independent interest (Theorem 4.2): For any integers $d \ge 0$ and $r \ge 1$ with $\binom{r+1}{2} \le 2d + 1$, there exists a sequence $(p_1, \dots, p_r)$ of $r$ binary forms of degree $d$ for which the $\binom{r+1}{2}$ products $p_i p_j$ ($1 \le i \le j \le r$) are linearly independent. Any sequence with this property will be called quadratically independent.
When $f$ is a general positive binary form of degree $2d$, $\text{Gram}(f)$ has precisely $2^{d-1}$ extreme points of rank two. Given two of these points, the line segment connecting them may or may not be a face (edge) of $\text{Gram}(f)$. For sextic forms we show that it is never an edge, while for $2d \ge 10$ it always is an edge. Most interesting is the case $\deg(f) = 8$, where the edges between the eight rank two extreme points form a complete bipartite graph $K_{4,4}$ (Theorem 6.4).
We briefly comment on our methods. Throughout we pursue a coordinate-free approach to Gram spectrahedra. Let $\vartheta \in \text{Gram}(f)$, and let $F$ be the face of $\text{Gram}(f)$ that has $\vartheta$ in its relative interior. We constantly use the following characterization of $\dim(F)$: If $f = p_1^2 + \cdots + p_r^2$ is the sos representation that corresponds to $\vartheta$ (with $p_1, \dots, p_r$ linearly independent forms), $\dim(F)$ is the number of quadratic relations between $p_1, \dots, p_r$. In particular, $\vartheta$ is an extreme point of $\text{Gram}(f)$ if and only if $p_1, \dots, p_r$ are quadratically independent.
The paper is organized as follows. In Section 2 we review the well-known results by Ramana and Goldman on the facial structure of spectrahedra, together with the Pataki range for the rank. We then specialize to Gram spectrahedra and formulate the dimension formula for faces in terms of quadratic relations. Section 4 contains the proof for the existence of long quadratically independent sequences of binary forms. In Sections 5 and 6 we present our analysis of the ranks of extreme points and of the edges between rank two extreme points.
We use standard terminology from convex geometry. For $K \subseteq \mathbb{R}^n$ a closed convex set, $\text{aff}(K)$ denotes the affine-linear hull of $K$ and $\text{relint}(K)$ is the relative interior of $K$, i.e. the interior of $K$ relative to $\text{aff}(K)$. A convex subset $F \subseteq K$ is a face of $K$ if $x, y \in K$, $0 < t < 1$ and $(1 - t)x + ty \in F$ imply $x, y \in F$. For every $x \in K$ there is a unique face $F$ of $K$ with $x \in \text{relint}(F)$, called the supporting face of $x$.

Review of facial structure of spectrahedra
All results in this section are known. They are due to Ramana and Goldman [13] for the first part, and to Pataki [10] for the Pataki range. We nevertheless give them a coordinate-free review here, i.e. without making reference to a particular basis of the underlying vector space.
2.1. Let $V$ be a vector space over $\mathbb{R}$ with $\dim(V) < \infty$. Let $V^\vee$ be the dual space of $V$, and let $S_2V \subseteq V \otimes V$ denote the space of symmetric tensors, i.e. tensors that are invariant under the involution $v \otimes w \mapsto w \otimes v$. Of course, $S_2V$ is canonically identified with $S^2V$, the second symmetric power of $V$, but it seems preferable in our context to work with $S_2$, rather than with $S^2$. The natural pairing between $v \in V$ and $\lambda \in V^\vee$ is denoted $\langle v, \lambda \rangle = \langle \lambda, v \rangle$. Elements of $S_2V$ can be identified either with symmetric bilinear forms $V^\vee \times V^\vee \to \mathbb{R}$, or with self-adjoint linear maps $\varphi \colon V^\vee \to V$, where the adjoint refers to the natural pairing between $V$ and $V^\vee$. We shall adopt the second point of view. Let $\varphi_\vartheta \colon V^\vee \to V$ denote the linear map that corresponds to a symmetric tensor $\vartheta = \sum_{i=1}^r v_i \otimes w_i \in S_2V$, so that $\varphi_\vartheta(\lambda) = \sum_{i=1}^r \langle v_i, \lambda \rangle\, w_i$. The range of $\vartheta \in S_2V$, written $\text{im}(\vartheta)$, is the range (image) of the linear map $\varphi_\vartheta$. Thus, if $v_1, \dots, v_r$ and $w_1, \dots, w_r$ are linearly independent, $\text{im}(\vartheta) = \text{span}(v_1, \dots, v_r) = \text{span}(w_1, \dots, w_r)$. The rank of $\vartheta$ is $\text{rk}(\vartheta) = \dim \text{im}(\vartheta)$.
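[...] With respect to a linear basis $v_1, \dots, v_n$ of $V$, a tensor $\vartheta = \sum_{i,j=1}^n a_{ij}\, v_i \otimes v_j$ in $S_2V$ satisfies $\vartheta \succeq 0$ if and only if the real symmetric matrix $(a_{ij})$ is psd, i.e. has nonnegative eigenvalues. So the cone $S_2^+V$ of all $\vartheta \succeq 0$ gets identified with the cone of real symmetric psd $n \times n$ matrices ($n = \dim(V)$), after fixing a linear basis of $V$. We say that $\vartheta \in S_2V$ is positive definite, written $\vartheta \succ 0$, if $\langle \varphi_\vartheta(\lambda), \lambda \rangle > 0$ for every $0 \ne \lambda \in V^\vee$.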
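The fact that every real symmetric matrix can be diagonalized implies that every $\vartheta \in S_2V$ can be written $\vartheta = \sum_{i=1}^r \varepsilon_i\, v_i \otimes v_i$ with $\varepsilon_i = \pm 1$ and with $v_1, \dots, v_r \in V$ linearly independent. Of course, $\vartheta \succeq 0$ is equivalent to $\varepsilon_1 = \cdots = \varepsilon_r = 1$.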
Proof. This translates into the following well-known fact about real symmetric matrices: If $A$, $B$ are such matrices with $\text{im}(B) \subseteq \text{im}(A)$, and if $A \succeq 0$, there is $\varepsilon > 0$ with $A - \varepsilon B \succeq 0$.
2.6. For the following we fix an affine-linear subspace $L \subseteq S_2V$ together with the corresponding spectrahedron $S = L \cap S_2^+V$. Results 2.7-2.14 below are all due to Ramana-Goldman [13]. For any subset $T \subseteq S$ we consider the linear subspace [...]
Proof. The inclusion $F \subseteq \mathcal{F}(\mathcal{U}(F))$ is trivial. Conversely there exist finitely many [...]. In order to prove $\mathcal{F}(\text{im}(\vartheta)) \subseteq F$ let $\gamma \in \mathcal{F}(\text{im}(\vartheta))$, so $\gamma \in S$ and $\text{im}(\gamma) \subseteq \text{im}(\vartheta)$. Choose a real number $t > 0$ so that $\vartheta' := \vartheta - t(\gamma - \vartheta) \succeq 0$, using Lemma 2.5. Since $\vartheta' \in S$ and $\vartheta$ is a convex combination of $\vartheta'$ and $\gamma$, we conclude that $\gamma \in F$.
Definition 2.8. We say that a linear subspace $U$ of $V$ is facial, or a face subspace (for the given spectrahedron $S = L \cap S_2^+V$), if there exists $\vartheta \in S$ with $U = \text{im}(\vartheta)$.
The following lemma is obvious (cf. 2.4): [...]
Note that the intersection $U \cap U'$ need not contain any face subspace.
Proposition 2.10. There is a natural inclusion-preserving bijection between the nonempty faces $F$ of $S$ and the face subspaces $U \subseteq V$ for $S$, given by $F \mapsto \mathcal{U}(F)$ and $U \mapsto \mathcal{F}(U)$.
In particular we see: [...]
Here are equivalent characterizations of face subspaces:
Proposition 2.13. For a linear subspace $U \subseteq V$, the following are equivalent: [...]
[...] $u_1, \dots, u_r \in U$ are linearly independent. Since the $u_i$ span $\text{im}(\vartheta)$, they are a linear basis of $U$. [...]
Proof. Here $\text{aff}(F)$ denotes the affine-linear hull of $F$. Since $F = L \cap S_2^+U$, it is clear that $\text{aff}(F) \subseteq L \cap S_2U$. For the other inclusion let $\vartheta \in \text{relint}(F)$, so $\text{im}(\vartheta) = U$ (2.11), and let $\gamma \in L \cap S_2U$ be arbitrary. Then $\gamma_t := (1 - t)\vartheta + t\gamma \succeq 0$ for $|t| < \varepsilon$ and small $\varepsilon > 0$ (2.5), and therefore $\gamma_t \in S$ for these $t$. Since $\vartheta = \frac{1}{2}(\gamma_t + \gamma_{-t})$, these $\gamma_t$ lie in $F$, and we have proved $\gamma \in \text{aff}(F)$.
The following result is due to Pataki [10]. It describes the interval in which the ranks of the extreme points of a spectrahedron can possibly lie: [...]
(b) When $L$ is chosen generically among all affine subspaces of dimension $m$, [...]
This formulation is taken from [5] [...] from Proposition 2.15. This amounts to the range of integers $r$ satisfying [...] Indeed, the first (resp. second) inequality in (1) [...]

Gram spectrahedra
See Choi-Lam-Reznick [4] for an introduction to Gram matrices of real polynomials, and Chua-Plaumann-Sinn-Vinzant [5] for a survey on Gram spectrahedra. In contrast to these texts we emphasize a coordinate-free approach.

3.2. Let $V \subseteq A$ be a finite-dimensional linear subspace, and let $f \in A$. We define the Gram spectrahedron of $f$, relative to $V$, to be the set of all psd Gram tensors [...] for all $i$, up to orthogonal equivalence. This means, the elements of $\text{Gram}_V(f)$ are the symmetric tensors $\sum_{i=1}^r p_i \otimes p_i$ with $r \ge 0$ and $p_1, \dots, p_r \in V$ such that $\sum_{i=1}^r p_i^2 = f$. Given two such tensors $\vartheta = \sum_{i=1}^r p_i \otimes p_i$ and $\vartheta' = \sum_{j=1}^s q_j \otimes q_j$, we may assume $r = s$; then $\vartheta = \vartheta'$ if and only if there is an orthogonal real matrix $(u_{ij})$ such that $q_j = \sum_{i=1}^r u_{ij} p_i$ for all $j$. See [4] § 2.
Lemma 3.3. $\text{Gram}_V(f)$ is a spectrahedron, and is compact provided that the identity $\sum_{i=1}^r p_i^2 = 0$ with $p_1, \dots, p_r \in V$ implies $p_1 = \cdots = p_r = 0$.
Proof. By its definition, $\text{Gram}_V(f)$ is a spectrahedron. If $\text{Gram}_V(f)$ is unbounded, it has nonzero recession cone, which means that there is $0 \ne \vartheta \in S_2V$ with $\eta + \vartheta \in \text{Gram}_V(f)$ for every $\eta \in \text{Gram}_V(f)$. It follows that $\mu(\vartheta) = 0$ and $\vartheta \succeq 0$, so [...]
Usually we will consider Gram spectrahedra only in the case where sums of squares in $A$ are strongly stable [7]. This means that there exists a filtration [...]$_{\le i}$, the space of polynomials of degree $\le i$.

3.5. We summarize what the formalism of Section 2 means. Let $V \subseteq A$ be a linear subspace, $\dim(V) < \infty$, and let $f \in A$. We will say that a linear subspace $U \subseteq V$ is a face subspace for $f$ if $U$ is a face subspace for the spectrahedron $\text{Gram}_V(f)$ in the sense of 2.8. In other words, $U$ is a face subspace for $f$ if there is $\vartheta \in \text{Gram}_V(f)$ with $U = \text{im}(\vartheta)$. According to Proposition 2.10, the nonempty faces $F$ of $\text{Gram}_V(f)$ are in bijection with the face subspaces $U$ for $f$, via $F \mapsto \mathcal{U}(F)$ and $U \mapsto \mathcal{F}(U)$.
The dimension formula 2.14 for faces takes a particularly appealing form for Gram spectrahedra. If $U \subseteq A$ is a linear subspace, let $UU$ denote the linear subspace of $A$ spanned by the products $pp'$ ($p, p' \in U$). [...] We say that a sequence $p_1, \dots, p_r$ in $A$ is quadratically independent if the $\binom{r+1}{2}$ products $p_i p_j$ ($1 \le i \le j \le r$) are linearly independent. Using this terminology we get: [...]

Quadratically independent binary forms
4.1. Let $k$ be a field, let $A$ be a (commutative) $k$-algebra. If $U \subseteq A$ is a $k$-linear subspace, let $UU$ denote the linear subspace of $A$ spanned by the products $pp'$ ($p, p' \in U$), as in 3.5. Assuming $\dim(U) = r < \infty$, we say that $U$ is quadratically independent if the natural multiplication map $S_2U \to A$ is injective, i.e. if $\dim(UU) = \binom{r+1}{2}$. A sequence $p_1, \dots, p_r$ of elements of $A$ is quadratically independent if the $p_i$ are a linear basis of a quadratically independent subspace $U$ of $A$.
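The definition can be tested by exact linear algebra. The following Python sketch is only an illustration under stated assumptions: a binary form of degree $d$ is stored as its list of $d+1$ rational coefficients, and the helper names `multiply`, `rank` and `quadratic_relations` are hypothetical, not notation from this paper. The function `quadratic_relations` returns $\binom{r+1}{2} - \dim(UU)$ for the span $U$ of the given forms, so a sequence of linearly independent forms is quadratically independent exactly when the result is 0.

```python
from fractions import Fraction
from itertools import combinations_with_replacement


def multiply(p, q):
    """Multiply two binary forms given as coefficient lists.

    A form of degree d is stored as [c_0, ..., c_d], meaning
    sum_k c_k * x^(d-k) * y^k, so multiplication is a convolution.
    """
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += Fraction(a) * Fraction(b)
    return out


def rank(rows):
    """Rank of a list of rational vectors, by exact Gaussian elimination."""
    rows = [[Fraction(c) for c in r] for r in rows]
    rk = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def quadratic_relations(forms):
    """Number of independent linear relations among the products p_i p_j, i <= j.

    The forms are assumed to be linearly independent and of the same degree.
    """
    pairs = combinations_with_replacement(range(len(forms)), 2)
    products = [multiply(forms[i], forms[j]) for i, j in pairs]
    return len(products) - rank(products)


# x^2 - y^2 and xy are quadratically independent.
print(quadratic_relations([[1, 0, -1], [0, 1, 0]]))
# x^2, xy, y^2 satisfy the single relation (xy)^2 = x^2 * y^2.
print(quadratic_relations([[1, 0, 0], [0, 1, 0], [0, 0, 1]]))
```

In the second example the six products lie in the five-dimensional space of quartic forms, so a relation is forced; this matches the condition $\binom{r+1}{2} \le 2d+1$ in the theorem below, which fails for $r = 3$, $d = 2$.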
We will prove the following general result for binary forms:
Theorem 4.2. Let $k$ be an infinite field, and let $d, r \ge 1$ such that $\binom{r+1}{2} \le 2d + 1$. Then there exists a sequence of $r$ binary forms of degree $d$ over $k$ that is quadratically independent.

For the rest of this section write $A_d$ for the space of binary forms of degree $d$ over $k$, and $A = \bigoplus_{d \ge 0} A_d$.
Clearly, the existence of a single quadratically independent sequence of length $r$ in $A_d$ implies that the generic length $r$ sequence in $A_d$ will be quadratically independent. We can therefore assume that the field $k$ is algebraically closed. (This assumption is only made to simplify notation.)

4.4. Our proof of Theorem 4.2 proceeds by induction on $r \ge 1$, the start being the case $r = 1$ and $d = 0$ (which is obvious). So let $r \ge 2$ in the sequel. By induction there is a quadratically independent sequence $q_1, \dots, q_{r-1}$ in $A_e$, where $e \ge 0$ is minimal with $\binom{r}{2} \le 2e + 1$. Let $d \ge 1$ be minimal with $\binom{r+1}{2} \le 2d + 1$. Given $z_1, \dots, z_m \in \mathbb{P}^1$ we put [...] Let $\infty \in \mathbb{P}^1$ be a fixed point, let $0 \ne l \in A_1$ with $l(\infty) = 0$.
(b) By induction we have a quadratically independent sequence $q_1, \dots, q_{r-1}$ in $A_e$. Since $e < d$, the $r - 1$ forms [...] For general enough choice of $q$, therefore, this intersection has codimension $d - 1$ in $VV$, resp. is zero if $\dim(VV) \le d - 1$ (which happens precisely for $r \le 5$). We can therefore modify $p_1 \in W_d(\infty)$ in such a way that [...] holds and the sequence $p_1, \dots, p_{r-1}$ remains quadratically independent. Writing $U := \text{span}(p_1, \dots, p_{r-1}) = kp_1 \oplus V$ we have $UU = p_1U \oplus VV$ since $U$ is quadratically independent. Therefore [...] and this subspace has dimension $r - 1$ (if $r \le 5$) resp. $(r-1) + \binom{r-1}{2} - (d-1) = \binom{r}{2} - d + 1$ (if $r \ge 5$).
4.6. According to Lemma 4.5, we can now fix a quadratically independent subspace $U \subseteq W_d(\infty)$ with $\dim(U) = r - 1$ and such that
$$\dim(pA_d \cap UU) \ge \begin{cases} r - 1 & r \le 5, \\ \binom{r}{2} - d + 1 & r \ge 5 \end{cases}$$
holds for all $p \in U$, with equality holding for $p$ sufficiently general. We are going to show that we can extend $U$ to a quadratically independent subspace of $A_d$ of dimension $r$. Let $\mathbb{P}U$ resp. $\mathbb{P}A_d$ denote the projective spaces associated to the linear spaces $U$ resp. $A_d$, and consider the closed subvariety $X = \{([p], [q]) : pq \in UU\}$ of $\mathbb{P}U \times \mathbb{P}A_d$. (Here we write $[p]$ for the element in $\mathbb{P}U$ represented by $0 \ne p \in U$, and similarly $[q]$ for $0 \ne q \in A_d$.) Let $\pi_1 \colon X \to \mathbb{P}U$ and $\pi_2 \colon X \to \mathbb{P}A_d$ denote the projections onto the two components.

4.8. In particular, $\dim(X) < \dim(\mathbb{P}A_d)$. For generically chosen $q \in A_d$, therefore, we have $\pi_2^{-1}([q]) = \emptyset$, which means $qU \cap UU = \{0\}$. In particular there is such $q \in A_d$ with $q(\infty) \ne 0$. Since $qU \oplus UU \subseteq W_{2d}(\infty)$ and $q^2 \notin W_{2d}(\infty)$, we see that the $r$-dimensional subspace $U + kq$ of $A_d$ is quadratically independent. This completes the induction step, and thereby the proof of Theorem 4.2.

Pataki range for Gram spectrahedra of binary forms
[...] the Pataki interval (2.16) for $\text{Gram}(f)$ is characterized by the inequalities [...] In the case $n = 2$ of binary forms this means $\dim \text{Gram}(f) = \binom{d+2}{2} - (2d + 1) = \binom{d}{2}$ for $f \in \text{int}(\Sigma_{2d})$, and the Pataki range is described by the inequalities $r \ge 2$ and $\binom{r+1}{2} \le 2d + 1$.
Here is our first main result on extreme points of Gram spectrahedra: [...]
This gives an affirmative answer to Question 4.2 from [5]. Note that $\text{Gram}(f)$ has dimension $\binom{d}{2}$ for general $f \in \Sigma_{2d}$, so the dimensions of these spectrahedra are arbitrarily large.
Proof of Theorem 5.3. Let $k \ge 1$ be the largest integer with $\binom{k+1}{2} \le 2d + 1$, so the Pataki interval for Gram spectrahedra of degree $2d$ forms is $\{2, 3, \dots, k\}$. Fix $r \in \{2, 3, \dots, k\}$, and let $W_r \subseteq (\mathbb{R}[x]_d)^r$ be the set of all quadratically independent $r$-tuples $(p_1, \dots, p_r)$ of forms. By Corollary 5.2, the set $W_r$ is open and dense in $(\mathbb{R}[x]_d)^r$. Let $S_r := \{p_1^2 + \cdots + p_r^2 : (p_1, \dots, p_r) \in W_r\}$. Since every psd form in $\mathbb{R}[x]$ is a sum of two squares, the set $S_r$ is a dense semialgebraic subset of $\Sigma_{2d}$. Whenever $(p_1, \dots, p_r) \in W_r$, if we put $f := \sum_{i=1}^r p_i^2$, the symmetric tensor $\sum_{i=1}^r p_i \otimes p_i$ is an extreme point of $\text{Gram}(f)$ of rank $r$ (Corollary 3.8). Therefore every $f \in S_r$ has a rank $r$ extreme point in its Gram spectrahedron. It now suffices to consider the intersection $S := \bigcap_{r=2}^k S_r$. Then $S$ is a dense semialgebraic subset of $\Sigma_{2d}$ since $\dim(\Sigma_{2d} \setminus S) < \dim(\Sigma_{2d})$. And for every $f \in S$, the Gram spectrahedron of $f$ has extreme points of all ranks in the Pataki interval.
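For small $d$ the Pataki interval used in this proof is easy to tabulate. The Python sketch below does this, assuming only the two inequalities $r \ge 2$ and $\binom{r+1}{2} \le 2d + 1$ and the dimension count $\binom{d+2}{2} - (2d+1) = \binom{d}{2}$ stated above; the function names `pataki_interval` and `gram_dimension` are hypothetical.

```python
from math import comb


def pataki_interval(d):
    """Ranks r with r >= 2 and C(r+1, 2) <= 2d + 1, for forms of degree 2d (d >= 1)."""
    k = 2
    while comb(k + 2, 2) <= 2 * d + 1:
        k += 1
    return list(range(2, k + 1))


def gram_dimension(d):
    """Dimension of Gram(f) for a strictly positive binary form f of degree 2d."""
    return comb(d + 2, 2) - (2 * d + 1)


for d in range(1, 8):
    # The dimension count agrees with the closed form C(d, 2).
    assert gram_dimension(d) == comb(d, 2)
    print(d, gram_dimension(d), pataki_interval(d))
```

For example, the interval is $\{2, 3\}$ for $d = 3$ and $d = 4$, and it first contains $r = 5$ at $d = 7$, where $\binom{6}{2} = 15 = 2d + 1$.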
We can also determine the dimensions of the sets of extreme points of a fixed rank, for suitably general $f$. To have a short notation, let us write $\text{Ex}_r(f)$ for the (semialgebraic) set of all extreme points of $\text{Gram}(f)$ of rank $r$.
Corollary 5.4. There is an open dense subset $U$ of $\Sigma_{2d}$ such that, for every $f \in U$ and every $r$ in the Pataki range, we have $\dim \text{Ex}_r(f) = \frac{1}{2}(r - 2)(2d - r + 1)$.
Proof. Let $r$ be in the Pataki range. Using notation from the previous proof, consider the sum of squares map $\sigma \colon W_r \to \mathbb{R}[x]_{2d}$, $(p_1, \dots, p_r) \mapsto \sum_{i=1}^r p_i^2$. Its image is dense in $\Sigma_{2d}$. It follows from local triviality of semialgebraic maps (Hardt's theorem, see e.g. [3] Theorem 9.3.2) that, for every $f$ in an open dense set $U_r \subseteq \Sigma_{2d}$, the fibre $\sigma^{-1}(f)$ has dimension $r(d + 1) - (2d + 1)$. The orthogonal group $O(r)$ has dimension $\binom{r}{2}$. It acts on the fibre $\sigma^{-1}(f)$ with trivial stabilizer subgroups, and the orbits are precisely the extreme points of $\text{Gram}(f)$ of rank $r$. So we get $\dim \text{Ex}_r(f) = r(d + 1) - (2d + 1) - \binom{r}{2} = \frac{1}{2}(r - 2)(2d - r + 1)$ for every $f \in U_r$. Take $U$ to be the intersection of the sets $U_r$ for all $r$ in the Pataki range, to get the desired conclusion.
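The count in this proof is elementary arithmetic, and the following Python sketch simply rechecks it: fibre dimension $r(d+1) - (2d+1)$ minus $\dim O(r) = \binom{r}{2}$, compared with the closed form $\frac{1}{2}(r-2)(2d-r+1)$. The function name `dim_extreme_points` is hypothetical, and the check covers only the small range of $d$ looped over here.

```python
from math import comb


def dim_extreme_points(d, r):
    """Dimension count from the proof: fibre of the sos map modulo O(r)."""
    fibre = r * (d + 1) - (2 * d + 1)  # dim of sigma^{-1}(f) for general f
    orbit = comb(r, 2)                 # dim O(r)
    return fibre - orbit


for d in range(1, 30):
    r = 2
    # Loop over all r in the Pataki range for this d.
    while comb(r + 1, 2) <= 2 * d + 1:
        closed_form = (r - 2) * (2 * d - r + 1)
        assert 2 * dim_extreme_points(d, r) == closed_form
        r += 1

print(dim_extreme_points(4, 2), dim_extreme_points(4, 3))
```

For $r = 2$ the count is 0 for every $d$, in line with the rank two extreme points forming a finite set.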
Remark 5.5. At least for general positive $f$ of degree $\ge 12$, the boundary of $\text{Gram}(f)$ is a union of positive dimensional faces. This is reflected by the fact that, for $2d \ge 8$ and any $r$ in the Pataki range, the number $\frac{1}{2}(r - 2)(2d - r + 1)$ from Corollary 5.4 is smaller than the dimension of the boundary of $\text{Gram}(f)$, which is $\binom{d}{2} - 1$.
[...], where $\bar{g}$ is the form that is coefficient-wise complex conjugate to $g$. Any $\vartheta \in \text{Gram}(f)$ with $\text{rk}(\vartheta) \le 2$ has the form $\vartheta = p \otimes p + q \otimes q$ where $p, q \in \mathbb{R}[x]$ satisfy $f = p^2 + q^2 = (p + iq)(p - iq)$. Conversely, a factorization $f = g\bar{g}$ with $g \in \mathbb{C}[x]$ gives a Gram tensor $\vartheta = p \otimes p + q \otimes q$ of $f$, namely $p = \frac{1}{2}(g + \bar{g})$ and $q = \frac{1}{2i}(g - \bar{g})$. Two factorizations $f = g\bar{g} = h\bar{h}$ give the same Gram tensor of $f$ if and only if $h$ is a scalar multiple of $g$ or $\bar{g}$. In particular, if we assume that $f$ has no multiple complex roots, we see that $f$ has (no Gram tensors of rank one and) precisely $2^{d-1}$ Gram tensors of rank two. All of them are extreme points of $\text{Gram}(f)$.
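The correspondence between factorizations $f = g\bar{g}$ and rank two Gram tensors can be followed on a small example. The Python sketch below is a hedged illustration, not the paper's construction verbatim: it works with the dehomogenized polynomial $f(t) = f(t, 1)$, assumes $f$ is monic with simple non-real roots, and takes one root from each conjugate pair as input. All function names are hypothetical. The first root is kept fixed, because $g$ and $\bar{g}$ give the same Gram tensor, which leaves $2^{d-1}$ choices.

```python
from itertools import product


def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists (highest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def from_roots(roots):
    """Monic polynomial with the given complex roots."""
    poly = [1]
    for z in roots:
        poly = poly_mul(poly, [1, -z])
    return poly


def rank_two_representations(upper_roots):
    """Return f and all pairs (p, q) with f = p^2 + q^2 coming from f = g * conj(g).

    upper_roots holds one root from each pair of complex conjugate roots of f.
    """
    conjugates = [z.conjugate() for z in upper_roots]
    f = [complex(c).real for c in from_roots(list(upper_roots) + conjugates)]
    reps = []
    for flips in product([False, True], repeat=len(upper_roots) - 1):
        chosen = [upper_roots[0]]
        for z, flip in zip(upper_roots[1:], flips):
            chosen.append(z.conjugate() if flip else z)
        g = [complex(c) for c in from_roots(chosen)]
        p = [c.real for c in g]  # p = (g + conj(g)) / 2
        q = [c.imag for c in g]  # q = (g - conj(g)) / (2i)
        reps.append((p, q))
    return f, reps


def sum_of_squares(p, q):
    """Coefficients of p^2 + q^2."""
    return [a + b for a, b in zip(poly_mul(p, p), poly_mul(q, q))]


# Sextic with roots +-i, +-2i, 1+-i (Gaussian integers keep the arithmetic exact).
f, reps = rank_two_representations([1j, 2j, 1 + 1j])
print(len(reps))
print(all(sum_of_squares(p, q) == f for p, q in reps))
```

With $d = 3$ this produces $2^{3-1} = 4$ pairs $(p, q)$; for instance the choice $g = (t - i)(t - 2i)(t - 1 - i)$ gives $p = t^3 - t^2 - 5t + 2$ and $q = -4t^2 + 3t + 2$.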

6.2. When $g$ has only real zeros, $\text{Gram}(f) \cong \text{Gram}(fg^2)$ naturally. Hence we discuss $\text{Gram}(f)$ for strictly positive $f$ only. Let $d \ge 1$, let $f \in \Sigma_{2d}$ be strictly positive, and let us first consider the cases of very small degree. If $d = 1$ then $\text{Gram}(f)$ is a single point of rank two. If $d = 2$ then $\text{Gram}(f)$ is a nondegenerate interval, the relative interior of which consists of points of rank 3. If $f$ has simple roots, both end points have rank 2. Otherwise $f$ is a square, and one end point has rank 1, the other has rank 2.
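The case $d = 2$ can be checked by hand on one example. The Python sketch below assumes the specific quartic $f = x^4 - x^2y^2 + y^4$ (our choice for illustration, not taken from the paper) and the monomial basis $x^2, xy, y^2$. Comparing coefficients shows that the symmetric Gram matrices of $f$ form the one-parameter family coded in the hypothetical function `gram_matrix`, with eigenvalues $-1 - 2a$, $1 - a$, $1 + a$.

```python
from fractions import Fraction


def gram_matrix(a):
    """Symmetric Gram matrix of f = x^4 - x^2 y^2 + y^4 in the basis x^2, xy, y^2."""
    a = Fraction(a)
    return [[1, 0, a], [0, -1 - 2 * a, 0], [a, 0, 1]]


def represented_form(G):
    """Coefficients of v^T G v for v = (x^2, xy, y^2), ordered x^4, x^3 y, ..., y^4."""
    coeffs = [Fraction(0)] * 5
    for i in range(3):
        for j in range(3):
            coeffs[i + j] += G[i][j]
    return coeffs


def psd_and_rank(a):
    """Return (is_psd, rank) using the explicit eigenvalues of gram_matrix(a)."""
    a = Fraction(a)
    # The xy entry splits off; the (x^2, y^2) block [[1, a], [a, 1]] has eigenvalues 1 -+ a.
    eigenvalues = [-1 - 2 * a, 1 - a, 1 + a]
    is_psd = all(e >= 0 for e in eigenvalues)
    return is_psd, sum(1 for e in eigenvalues if e != 0)


for a in [Fraction(-1), Fraction(-3, 4), Fraction(-1, 2), Fraction(0)]:
    # Every member of the family represents the same form f.
    assert represented_form(gram_matrix(a)) == [1, 0, -1, 0, 1]
    print(a, psd_and_rank(a))
```

In this example the psd members are exactly those with $-1 \le a \le -\frac{1}{2}$: both end points have rank 2 and the points in between have rank 3, as described above for $f$ with simple roots. The end point $a = -1$ corresponds to $f = (x^2 - y^2)^2 + (xy)^2$.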
The case $d = 3$ is covered in the next result (see also [5] [...]
Proof. The extreme points of rank $\le 2$ correspond to complex factorizations $f = p\bar{p}$. Depending on whether $f$ has six, four or two different roots, there are four, three or two essentially different such factorizations. The corresponding psd Gram tensors have rank two except when $f$ is a square, i.e. has only two different roots; then one of the Gram tensors has rank one. If $\text{Gram}(f)$ had a proper face of positive dimension, its rank would have to be 3. To prove (a) it therefore suffices to show that, for any two extreme points $\vartheta \ne \vartheta'$ of rank $\le 2$, the segment $[\vartheta, \vartheta']$ meets the interior of $\text{Gram}(f)$. Let $f = p\bar{p} = q\bar{q}$ be the two factorizations corresponding to $\vartheta$ and $\vartheta'$. We can assume $p = gh$, $q = g\bar{h}$ with [...] and $\{a_1, a_2, a_3\} \cap \{\bar{a}_1, \bar{a}_2, \bar{a}_3\} = \emptyset$. For the supporting face $F$ of $\frac{1}{2}(\vartheta + \vartheta')$ we have $\mathcal{U}(F) = \text{span}(gh, \bar{g}\bar{h}, g\bar{h}, \bar{g}h)$ [...] Calculating the determinant gives [...] This means that $\frac{1}{2}(\vartheta + \vartheta')$ has rank 4, and hence lies in the interior of $\text{Gram}(f)$.
When the positive sextic $f$ is general, the algebraic boundary of $\text{Gram}(f)$ is a Kummer surface, see [9] Section 5 and [5] Section 4.2. In this case, assertion (a) also follows from the fact that a Kummer surface in $\mathbb{P}^3$ does not contain a line. Now we are interested in arbitrary degrees. Let $f \in \mathbb{R}[x]_{2d}$ be a sufficiently general positive form. We ask: For which pairs $\vartheta \ne \vartheta'$ in $\text{Ex}_2(f)$ is the line segment $[\vartheta, \vartheta']$ an edge of $\text{Gram}(f)$, i.e. a one-dimensional face?
This proves the $d = 4$ case of Theorem 6.4. Indeed, the eight points of $\text{Ex}_2(f)$, corresponding to the eight essentially different factorizations $f = p\bar{p}$, decompose into two subclasses of four points each, where two different factorizations $f = p\bar{p} = q\bar{q}$ belong to the same subclass if and only if $p$ and $q$ have precisely two roots in common.
Proof of Lemma 6.7. Unfortunately, we have no better argument than a brute force computation: For $g_i$, $h_i$ with general coefficients, the corresponding $9 \times 9$ determinant vanishes identically.
Proof of Corollary 6.8. It suffices to prove the assertion for one specific choice of the $g_i$ and $h_i$. We can assume $\delta \ge 3$. Let $G_1, G_2, H_1, H_2$ satisfy $\deg(G_i) = 3$, $\deg(H_i) = 1$ and $\gcd(G_1, G_2) = \gcd(H_1, H_2) = 1$, and let $\ell \ne 0$ be any linear form. Then by Lemma 6.6, the assertion of 6.8 is true for $g_i := G_i \ell^{\delta - 3}$, $h_i := H_i \ell^{\varepsilon - 1}$, $i = 1, 2$.