On the geometry of the set of symmetric matrices with repeated eigenvalues

We investigate some geometric properties of the real algebraic variety $\Delta$ of symmetric matrices with repeated eigenvalues. We explicitly compute the volume of its intersection with the sphere and prove an Eckart-Young-Mirsky-type theorem for the distance function from a generic matrix to points in $\Delta$. We exhibit connections of our study to Real Algebraic Geometry (computing the Euclidean Distance Degree of $\Delta$) and Random Matrix Theory.


Introduction
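In this paper we investigate the geometry of the set $\Delta$ (below called the discriminant) of real symmetric matrices with repeated eigenvalues and of unit Frobenius norm:
$$\Delta := \{Q \in \mathrm{Sym}(n,\mathbb{R}) : \lambda_i(Q) = \lambda_j(Q) \text{ for some } i \neq j\} \cap S^{N-1}.$$
Here, $\lambda_1(Q), \dots, \lambda_n(Q)$ denote the eigenvalues of $Q$, the dimension of the space of symmetric matrices is $N := \frac{n(n+1)}{2}$, and $S^{N-1}$ denotes the unit sphere in $\mathrm{Sym}(n,\mathbb{R})$ endowed with the Frobenius norm $\|Q\| := \sqrt{\mathrm{tr}(Q^2)}$.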
This discriminant is a fundamental object and it appears in several areas of mathematics, from mathematical physics to real algebraic geometry, see for instance [3,5,4,6,35,1,2,36]. We discover some new properties of this object (Theorem 1.1 and Theorem 1.4) and exhibit connections and applications of these properties to Random Matrix Theory (Section 1.4) and Real Algebraic Geometry (Section 1.3).
The set $\Delta$ is an algebraic subset of $S^{N-1}$. It is defined by the discriminant polynomial
$$\mathrm{disc}(Q) := \prod_{1 \le i < j \le n} (\lambda_i(Q) - \lambda_j(Q))^2,$$
which is a non-negative homogeneous polynomial of degree $\deg(\mathrm{disc}) = n(n-1)$ in the entries of $Q$. Moreover, it is a sum of squares of real polynomials [21,27] and $\Delta$ is of codimension two. The set $\Delta_{\mathrm{sm}}$ of smooth points of $\Delta$ is the set of real points of the smooth part of the Zariski closure of $\Delta$ in $\mathrm{Sym}(n,\mathbb{C})$ and consists of matrices with exactly two repeated eigenvalues. In fact, $\Delta$ is stratified according to the multiplicity sequence of the eigenvalues; see (1.3).
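For $n = 2$ the sum-of-squares structure and the codimension-two statement can be made explicit. The following is a minimal sketch (the function names are ours, chosen for illustration): for the matrix $[[a, b], [b, c]]$ the squared eigenvalue gap is $(a-c)^2 + (2b)^2$, which vanishes only when the two independent conditions $a = c$ and $b = 0$ hold.

```python
import math

def disc_2x2(a, b, c):
    """Squared eigenvalue gap of the symmetric matrix [[a, b], [b, c]].

    The eigenvalues are ((a + c) +/- sqrt((a - c)**2 + 4*b**2)) / 2, so the
    squared gap is (a - c)**2 + (2*b)**2, a sum of two squares.
    """
    return (a - c) ** 2 + (2 * b) ** 2

def eigenvalues_2x2(a, b, c):
    """Eigenvalues of [[a, b], [b, c]] in increasing order."""
    mean = (a + c) / 2
    radius = math.sqrt(disc_2x2(a, b, c)) / 2
    return mean - radius, mean + radius

lo, hi = eigenvalues_2x2(2.0, 1.0, 2.0)
print(lo, hi, disc_2x2(2.0, 1.0, 2.0))
```

The discriminant vanishes exactly on the multiples of the identity, a linear subspace of codimension two in the three-dimensional space of symmetric $2 \times 2$ matrices.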
1.1. The volume of the set of symmetric matrices with repeated eigenvalues. Our first main result concerns the computation of the volume $|\Delta|$ of the discriminant, which is defined to be the Riemannian volume of the smooth manifold $\Delta_{\mathrm{sm}}$ endowed with the Riemannian metric induced by the inclusion $\Delta_{\mathrm{sm}} \subset S^{N-1}$.
Theorem 1.1. $$\frac{|\Delta|}{|S^{N-3}|} = \binom{n}{2}.$$
Remark 1. Results of this type (the computation of the volume of some relevant algebraic subsets of the space of matrices) have been appearing in the literature since the 1990s [17,16], with a particular emphasis on asymptotic studies and complexity theory, and have been crucial for the theoretical advance of numerical algebraic geometry, especially concerning the estimation of the so-called condition number of linear problems [13]. The very first result gives the volume of the set $\Sigma \subset \mathbb{R}^{n^2}$ of square matrices with zero determinant and Frobenius norm one; this was computed in [17,16]:
$$\frac{|\Sigma|}{|S^{n^2-2}|} = \sqrt{\pi}\,\frac{\Gamma(\frac{n+1}{2})}{\Gamma(\frac{n}{2})}.$$
For example, this result is used in [17, Theorem 6.1] to compute the average number of zeroes of the determinant of a matrix of linear forms. Subsequently this computation was extended to include the volume of the set of $n \times m$ matrices of given corank in [7] and the volume of the set of symmetric matrices with determinant zero in [23], with similar expressions. Recently, in [8] the above formula and [23, Thm. 3] were used to compute the expected condition number of the polynomial eigenvalue problem whose input matrices are taken to be random.
In a related paper [11] we use Theorem 1.1 for counting the average number of singularities of a random spectrahedron. Moreover, the proof of Theorem 1.1 requires the evaluation of the expectation of the square of the characteristic polynomial of a GOE(n) matrix (Theorem 1.6 below), which constitutes a result of independent interest. Theorem 1.1 combined with Poincaré's kinematic formula from [20] allows us to compute the average number of symmetric matrices with repeated eigenvalues in a uniformly distributed projective two-plane $L \subset \mathrm{P}\,\mathrm{Sym}(n,\mathbb{R}) \simeq \mathbb{RP}^{N-1}$:
$$\mathbb{E}\,\#(L \cap \mathrm{P}\Delta) = \frac{|\mathrm{P}\Delta|}{|\mathbb{RP}^{N-3}|} = \binom{n}{2}, \tag{1.1}$$
where by $\mathrm{P}\Delta \subset \mathrm{P}\,\mathrm{Sym}(n,\mathbb{R}) \simeq \mathbb{RP}^{N-1}$ we denote the projectivization of the discriminant.
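As a quick sanity check on the volume statement $|\Delta| / |S^{N-3}| = \binom{n}{2}$ (the form in which the proof in Section 3 concludes), one can evaluate it for small $n$. The sketch below uses only the standard formula for the volume of a unit sphere; the helper names are ours. For $n = 2$ the discriminant consists of the two points $\pm \mathbb{1}/\sqrt{2}$, so its zero-dimensional volume should be $2$.

```python
import math

def sphere_volume(d):
    """Volume of the unit sphere S^d in R^(d+1): 2 * pi^((d+1)/2) / Gamma((d+1)/2)."""
    return 2 * math.pi ** ((d + 1) / 2) / math.gamma((d + 1) / 2)

def discriminant_volume(n):
    """|Delta| = binom(n, 2) * |S^(N-3)| with N = n(n+1)/2."""
    N = n * (n + 1) // 2
    return math.comb(n, 2) * sphere_volume(N - 3)

for n in range(2, 6):
    print(n, discriminant_volume(n))
```

For $n = 3$ the same formula gives $3 \cdot |S^3| = 6\pi^2$.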
The following optimal bound on the number $\#(L \cap \mathrm{P}\Delta)$ of symmetric matrices with repeated eigenvalues in a generic projective two-plane $L \simeq \mathbb{RP}^2 \subset \mathbb{RP}^{N-1}$ was found in [28, Corollary 15]:
$$\#(L \cap \mathrm{P}\Delta) \le \binom{n+1}{3}. \tag{1.2}$$
Remark 2. Consequence (1.1) combined with (1.2) "violates" a frequent phenomenon in random algebraic geometry, which goes under the name of square root law: for a large class of models of random systems, often related to the so-called Edelman-Kostlan-Shub-Smale models [17,31,16,22,32,30], the average number of solutions equals (or is comparable to) the square root of the maximum number; here this is not the case. We also observe that, surprisingly enough, the average cut of the discriminant is an integer (there is no reason to even expect that it should be a rational number!).
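The failure of the square root law can be seen numerically. The sketch below compares the average count $\binom{n}{2}$ with the square root of the maximal count; it assumes that the bound of [28, Corollary 15] is $\binom{n+1}{3}$, and the function names are ours.

```python
import math

def average_count(n):
    """Average number of points of a random two-plane on the discriminant."""
    return math.comb(n, 2)

def max_count(n):
    """Upper bound binom(n+1, 3); assumed here to be the bound of [28, Corollary 15]."""
    return math.comb(n + 1, 3)

for n in range(2, 8):
    print(n, average_count(n), max_count(n), round(math.sqrt(max_count(n)), 3))
```

The average grows like $n^2/2$, whereas the square root of the maximum grows only like $n^{3/2}/\sqrt{6}$.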
More generally one can ask about the expected number of matrices with a multiple eigenvalue in a "random" compact 2-dimensional family. We prove the following.
Theorem 1.2 (Multiplicities in a random family). Let $F : \Omega \to \mathrm{Sym}(n,\mathbb{R})$ be a random Gaussian field $F = (f_1, \dots, f_N)$ with i.i.d. components and denote by $\pi : \mathrm{Sym}(n,\mathbb{R}) \setminus \{0\} \to S^{N-1}$ the projection map. Assume that: (1) with probability one the map $\pi \circ F$ is an embedding and (2) the expected number of solutions of the random system $\{f_1 = f_2 = 0\}$ is finite. Then:
$$\mathbb{E}\,\#F^{-1}(\mathcal{C}(\Delta)) = \binom{n}{2}\,\mathbb{E}\,\#\{f_1 = f_2 = 0\},$$
where $\mathcal{C}(\Delta) \subset \mathrm{Sym}(n,\mathbb{R})$ is the cone over $\Delta$.
1.2. An Eckart-Young-Mirsky-type theorem. The classical Eckart-Young-Mirsky theorem allows one to find a best low rank approximation to a given matrix.
For $r \le m \le n$ let us denote by $\Sigma_r$ the set of $m \times n$ complex matrices of rank $r$. Then for a given $m \times n$ real or complex matrix $A$, a rank $r$ matrix $\tilde{A} \in \Sigma_r$ which is a global minimizer of the distance function
$$\mathrm{dist}_A : \Sigma_r \to \mathbb{R}, \quad B \mapsto \|A - B\|,$$
where $\|\cdot\|$ is the Frobenius norm, is called a best rank $r$ approximation to $A$. The Eckart-Young-Mirsky theorem states that if $A = U^* S V$ is the singular value decomposition of $A$, i.e., $U$ is an $m \times m$ real or complex unitary matrix, $S$ is an $m \times n$ rectangular diagonal matrix with non-negative diagonal entries $s_1 \ge \dots \ge s_m \ge 0$ and $V$ is an $n \times n$ real or complex unitary matrix, then $\tilde{A} = U^* \tilde{S} V$ is a best rank $r$ approximation to $A$, where $\tilde{S}$ denotes the rectangular diagonal matrix with $\tilde{S}_{ii} = s_i$ for $i = 1, \dots, r$ and $\tilde{S}_{jj} = 0$ for $j = r+1, \dots, m$. Moreover, a best rank $r$ approximation to a sufficiently generic matrix is actually unique. More generally, one can show that any critical point of the distance function $\mathrm{dist}_A : \Sigma_r \to \mathbb{R}$ is of the form $U^* \tilde{S}^I V$, where $I \subset \{1, 2, \dots, m\}$ is a subset of size $r$ and $\tilde{S}^I$ is the rectangular diagonal matrix with $\tilde{S}^I_{ii} = s_i$ for $i \in I$ and $\tilde{S}^I_{jj} = 0$ for $j \notin I$. In particular, the number of critical points of $\mathrm{dist}_A$ for a generic matrix $A$ is $\binom{m}{r}$, the number of such subsets $I$. In [14] the authors call this count the Euclidean Distance Degree of $\Sigma_r$; see also Section 1.3 below.
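The description of the critical points can be illustrated on the level of singular values. The sketch below (helper names are ours) assumes the Frobenius norm, so that by unitary invariance the distance from $A$ to the critical point indexed by $I$ equals the norm of the dropped singular values. It enumerates all subsets $I$ and confirms that keeping the $r$ largest singular values gives the closest point.

```python
from itertools import combinations
import math

def critical_distances(singular_values, r):
    """Distance from A to each critical point of dist_A on rank-r matrices.

    Keeping the singular values indexed by I (|I| = r) and zeroing the rest
    leaves a residual of Frobenius norm sqrt(sum of s_j^2 for j not in I).
    """
    m = len(singular_values)
    out = {}
    for I in combinations(range(m), r):
        dropped = [s for j, s in enumerate(singular_values) if j not in I]
        out[I] = math.sqrt(sum(s * s for s in dropped))
    return out

dists = critical_distances([5.0, 3.0, 2.0, 1.0], r=2)
best = min(dists, key=dists.get)
print(len(dists), best, dists[best])
```

With four singular values and $r = 2$ there are six critical points, one per subset $I$.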
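For the distance function from a symmetric matrix to the cone over $\Delta$ we also have an Eckart-Young-Mirsky-type theorem. We prove this theorem in Section 2.
Theorem 1.3 (Eckart-Young-Mirsky-type theorem). Let $A \in \mathrm{Sym}(n,\mathbb{R})$ be a generic real symmetric matrix and let $A = C^T \Lambda C$ be its spectral decomposition with $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$. Any critical point of the distance function
$$d_A : \mathcal{C}(\Delta_{\mathrm{sm}}) \setminus \{0\} \to \mathbb{R}, \quad B \mapsto \|A - B\|$$
is of the form $C^T \Lambda_{i,j} C$, where
$$\Lambda_{i,j} = \mathrm{diag}\Big(\lambda_1, \dots, \tfrac{\lambda_i + \lambda_j}{2}, \dots, \tfrac{\lambda_i + \lambda_j}{2}, \dots, \lambda_n\Big), \quad 1 \le i < j \le n,$$
with the averaged entries in positions $i$ and $j$. Moreover, the function $d_A : \mathcal{C}(\Delta) \to \mathbb{R}$ attains its global minimum at exactly one of the critical points $C^T \Lambda_{i,j} C \in \mathcal{C}(\Delta_{\mathrm{sm}}) \setminus \{0\}$ and the value of the minimum of $d_A$ equals:
$$\min_{B \in \mathcal{C}(\Delta)} d_A(B) = \frac{1}{\sqrt{2}} \min_{1 \le i < j \le n} |\lambda_i - \lambda_j|.$$
Remark 3. Since $\mathcal{C}(\Delta) \subset \mathrm{Sym}(n,\mathbb{R})$ is the homogeneous cone over $\Delta \subset S^{N-1}$ the above theorem readily implies an analogous result for the spherical distance function from $A \in S^{N-1}$ to $\Delta$.
The theorem is a special case of Theorem 1.4 below, which concerns the critical points of the distance function to a fixed stratum of $\mathcal{C}(\Delta)$. These strata are in bijection with vectors of natural numbers $w = (w_1, w_2, \dots, w_n) \in \mathbb{N}^n$ such that $\sum_{i=1}^n i\,w_i = n$ as follows: let us denote by $\mathcal{C}(\Delta)^w$ the smooth semialgebraic submanifold of $\mathrm{Sym}(n,\mathbb{R})$ consisting of symmetric matrices that for each $i \ge 1$ have exactly $w_i$ eigenvalues of multiplicity $i$. Then, by [29, Lemma 1], the semialgebraic sets $\mathcal{C}(\Delta)^w$ with $w_1 < n$ form a stratification of $\mathcal{C}(\Delta)$:
$$\mathcal{C}(\Delta) = \bigsqcup_{w\,:\,w_1 < n} \mathcal{C}(\Delta)^w. \tag{1.3}$$
In this notation, the complement of $\mathcal{C}(\Delta)$ can be written $\mathcal{C}(\Delta)^{(n,0,\dots,0)} = \mathrm{Sym}(n,\mathbb{R}) \setminus \mathcal{C}(\Delta)$.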
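Let us denote by $\mathrm{Diag}(n,\mathbb{R})^w := \mathrm{Diag}(n,\mathbb{R}) \cap \mathcal{C}(\Delta)^w$ the set of diagonal matrices in $\mathcal{C}(\Delta)^w$ and its Euclidean closure by $\overline{\mathrm{Diag}(n,\mathbb{R})^w}$. This closure is an arrangement of $\frac{n!}{\prod_{i \ge 1} (i!)^{w_i}\, w_i!}$ planes, one for each partition of $\{1, \dots, n\}$ into $w_i$ blocks of size $i$. For a generic diagonal matrix $\Lambda$ the distance function $d_\Lambda : \mathrm{Diag}(n,\mathbb{R})^w \to \mathbb{R}$ has the same number of critical points, each of which is the orthogonal projection of $\Lambda$ on one of the planes in the arrangement $\overline{\mathrm{Diag}(n,\mathbb{R})^w}$, and the distance $d_\Lambda$ attains its unique global minimum at one of these critical points. We will show that an analogous result holds for the distance function from a general symmetric matrix $A \in \mathrm{Sym}(n,\mathbb{R})$ to the smooth semialgebraic manifold $\mathcal{C}(\Delta)^w$. The proof of the following theorem is in Section 2.
Theorem 1.4 (Eckart-Young-Mirsky-type theorem for the strata). Let $A \in \mathrm{Sym}(n,\mathbb{R})$ be a generic real symmetric matrix and let $A = C^T \Lambda C$ be its spectral decomposition. Then: (1) Any critical point of the distance function $d_A : \mathcal{C}(\Delta)^w \to \mathbb{R}$ is of the form $C^T \tilde{\Lambda} C$, where $\tilde{\Lambda} \in \mathrm{Diag}(n,\mathbb{R})^w$ is the orthogonal projection of $\Lambda$ onto one of the planes in $\overline{\mathrm{Diag}(n,\mathbb{R})^w}$.
Remark 4. Note that the manifold $\mathcal{C}(\Delta)^w$ is not compact and thus the function $d_A : \mathcal{C}(\Delta)^w \to \mathbb{R}$ might not a priori have a minimum.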
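1.3. Euclidean Distance Degree. Let $X \subset \mathbb{R}^m$ be a real algebraic variety and let $X^{\mathbb{C}} \subset \mathbb{C}^m$ denote its Zariski closure. The number $\#\{x \in X_{\mathrm{sm}} : u - x \perp T_x X_{\mathrm{sm}}\}$ of critical points of the distance to the smooth locus $X_{\mathrm{sm}}$ of $X$ from a generic point $u \in \mathbb{R}^m$ can be estimated by the number
$$\mathrm{EDdeg}(X) := \#\{x \in X^{\mathbb{C}}_{\mathrm{sm}} : u - x \perp T_x X^{\mathbb{C}}_{\mathrm{sm}}\}.$$
The quantity $\mathrm{EDdeg}(X)$ does not depend on the choice of the generic point $u \in \mathbb{R}^m$ and is called the Euclidean distance degree of $X$ [14]. Also, solutions $x \in X^{\mathbb{C}}_{\mathrm{sm}}$ to $u - x \perp T_x X^{\mathbb{C}}_{\mathrm{sm}}$ are called ED critical points of $u$ with respect to $X$ [15]. In the following theorem we compute the Euclidean distance degree of the variety $\mathcal{C}(\Delta) \subset \mathrm{Sym}(n,\mathbb{R})$ and show that all ED critical points are actually real (this result is an analogue of [15, Cor. 5.1] for the space of symmetric matrices and the variety $\mathcal{C}(\Delta)$).
Theorem 1.5. The Euclidean distance degree of $\mathcal{C}(\Delta)$ equals $\mathrm{EDdeg}(\mathcal{C}(\Delta)) = \binom{n}{2}$. Moreover, for a generic real symmetric matrix $A \in \mathrm{Sym}(n,\mathbb{R})$ all $\binom{n}{2}$ ED critical points are real.
1.4. Random Matrix Theory. The proof of Theorem 1.1 eventually arrives at (3.7), which reduces our study to the evaluation of a special integral over the Gaussian Orthogonal Ensemble (GOE) [24,34]. The connection between the volume of $\Delta$ and random symmetric matrices comes from the fact that, in a sense, the geometry in the Euclidean space of symmetric matrices with the Frobenius norm and the random GOE matrix model can be seen as the same object under two different points of view.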
The integral in (3.7) is the second moment of the characteristic polynomial of a GOE matrix. In [24] Mehta gives a general formula for all moments of the characteristic polynomial of a GOE matrix. However, we were unable to locate an exact evaluation of the formula for the second moment in the literature. For this reason we added Proposition 4.2, in which we compute the second moment, to this article. We use it in Section 4 to prove the following theorem.
Theorem 1.6. For a fixed positive integer $k$ we have
$$\int_{u \in \mathbb{R}} \mathbb{E}_{Q \in \mathrm{GOE}(k)} \det(Q - u\mathbb{1})^2\, e^{-u^2}\, du = \sqrt{\pi}\, \frac{(k+2)!}{2^{k+1}}.$$
An interesting remark in this direction is that some geometric properties of $\Delta$ can be stated using the language of Random Matrix Theory. For instance, the estimate on the volume of a tube around $\Delta$ allows one to estimate the probability that two eigenvalues of a GOE(n) matrix are close: for $\epsilon > 0$ small enough
The interest of this estimate is that it provides a non-asymptotic (as opposed to studies in the limit $n \to \infty$, [26,9]) result in random matrix theory. It would be interesting to provide an estimate of the implied constant in (1.4); however, this might be difficult using our approach as it probably involves estimating higher curvature integrals of $\Delta$.

Critical points of the distance to the discriminant
In this section we prove Theorems 1.3, 1.4 and 1.5. Since Theorem 1.3 is a special case of Theorem 1.4, we start by proving the latter.
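2.1. Proof of Theorem 1.4. Let us denote by $\overline{\mathcal{C}(\Delta)^w} \subset \mathrm{Sym}(n,\mathbb{R})$ the Euclidean closure of $\mathcal{C}(\Delta)^w$. Note that $\overline{\mathcal{C}(\Delta)^w}$ is a (real) algebraic variety, the smooth locus of $\overline{\mathcal{C}(\Delta)^w}$ is $\mathcal{C}(\Delta)^w$ and the boundary $\overline{\mathcal{C}(\Delta)^w} \setminus \mathcal{C}(\Delta)^w$ is a union of some strata $\mathcal{C}(\Delta)^{w'}$ of greater codimension.
Let now $A \in \mathrm{Sym}(n,\mathbb{R})$ be a sufficiently generic symmetric matrix and let $A = C^T \Lambda C$ be its spectral decomposition. From [10, Thm. 3] and [10, Sec. 3] it follows that any real ED critical point of $\overline{\mathcal{C}(\Delta)^w}$ with respect to $A = C^T \Lambda C$ is of the form $C^T \tilde{\Lambda} C$, where the diagonal matrix $\tilde{\Lambda} \in \mathrm{Diag}(n,\mathbb{R})^w$ is an ED critical point of $\overline{\mathrm{Diag}(n,\mathbb{R})^w}$ with respect to $\Lambda \in \mathrm{Diag}(n,\mathbb{R})$. Since, as observed above, $\overline{\mathrm{Diag}(n,\mathbb{R})^w}$ is an arrangement of $\frac{n!}{\prod_{i \ge 1} (i!)^{w_i}\, w_i!}$ planes (one for each partition of $\{1, \dots, n\}$ into $w_i$ blocks of size $i$), its ED critical points with respect to a generic $\Lambda \in \mathrm{Diag}(n,\mathbb{R})$ are the orthogonal projections of $\Lambda$ on the components of the plane arrangement. One of these ED critical points is the (unique) closest point on $\overline{\mathrm{Diag}(n,\mathbb{R})^w}$ to the generic $\Lambda \in \mathrm{Diag}(n,\mathbb{R})$. Both claims follow.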
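The number of planes in the arrangement of diagonal matrices with multiplicity pattern $w$ can be checked by brute force. The sketch below rests on our reading that each plane corresponds to a set partition of $\{1, \dots, n\}$ with exactly $w_i$ blocks of size $i$ (coordinates in one block are equal), which gives the count $n! / \prod_i (i!)^{w_i} w_i!$. All function names are illustrative.

```python
from math import factorial

def set_partitions(elements):
    """Yield all set partitions of a list as lists of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # put `first` into an existing block ...
        for k in range(len(partition)):
            yield partition[:k] + [[first] + partition[k]] + partition[k + 1:]
        # ... or into a new block of its own
        yield [[first]] + partition

def count_planes_brute_force(w):
    """Count set partitions of {1..n} having exactly w[i-1] blocks of size i."""
    n = sum(i * wi for i, wi in enumerate(w, start=1))
    count = 0
    for partition in set_partitions(list(range(n))):
        sizes = [len(block) for block in partition]
        if all(sizes.count(i) == wi for i, wi in enumerate(w, start=1)):
            count += 1
    return count

def count_planes_formula(w):
    """n! / prod_i (i!)^(w_i) * (w_i)! for the same multiplicity pattern."""
    n = sum(i * wi for i, wi in enumerate(w, start=1))
    denom = 1
    for i, wi in enumerate(w, start=1):
        denom *= factorial(i) ** wi * factorial(wi)
    return factorial(n) // denom

print(count_planes_brute_force((2, 1, 0, 0)), count_planes_formula((2, 1, 0, 0)))
```

For $w = (n-2, 1, 0, \dots, 0)$ the count is $\binom{n}{2}$, the number of hyperplanes $\{\lambda_i = \lambda_j\}$.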

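2.2. Proof of Theorem 1.3. Let $w = (n-2, 1, 0, \dots, 0)$ and for a given symmetric matrix $A \in \mathrm{Sym}(n,\mathbb{R})$ let us fix a spectral decomposition $A = C^T \Lambda C$, $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$. From Theorem 1.4 we know that the critical points of the distance function $d_A : \mathcal{C}(\Delta_{\mathrm{sm}}) \setminus \{0\} \to \mathbb{R}$ are of the form $C^T \Lambda_{i,j} C$, where $\Lambda_{i,j}$ is the orthogonal projection of $\Lambda$ onto the hyperplane $\{\lambda_i = \lambda_j\} \subset \mathrm{Diag}(n,\mathbb{R})$, that is, the diagonal matrix obtained from $\Lambda$ by replacing both $\lambda_i$ and $\lambda_j$ with $\frac{\lambda_i + \lambda_j}{2}$. From this, it is immediate that the distance between $\Lambda$ and $\Lambda_{i,j}$ equals
$$\|\Lambda - \Lambda_{i,j}\| = \sqrt{2 \Big(\frac{\lambda_i - \lambda_j}{2}\Big)^2} = \frac{|\lambda_i - \lambda_j|}{\sqrt{2}}.$$
This finishes the proof.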
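The diagonal case of the argument is easy to reproduce numerically. The sketch below (names are ours) projects a diagonal matrix, stored as the list of its diagonal entries, onto each hyperplane $\{x_i = x_j\}$ and confirms that the distances are $|\lambda_i - \lambda_j| / \sqrt{2}$, with the minimum attained at the pair of closest eigenvalues.

```python
from itertools import combinations
import math

def critical_points(lams):
    """Orthogonal projections of diag(lams) onto the hyperplanes {x_i = x_j}."""
    points = {}
    for i, j in combinations(range(len(lams)), 2):
        proj = list(lams)
        proj[i] = proj[j] = (lams[i] + lams[j]) / 2
        points[(i, j)] = proj
    return points

def distance(x, y):
    """Euclidean distance between two vectors of diagonal entries."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

lams = [0.3, -1.2, 2.0, 0.9]
dists = {ij: distance(lams, p) for ij, p in critical_points(lams).items()}
closest = min(dists, key=dists.get)
print(closest, dists[closest])
```

Here the closest pair of eigenvalues is $(0.3, 0.9)$, so the minimum is attained at the index pair `(0, 3)`.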
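2.3. Proof of Theorem 1.5. In the proof of Theorem 1.3 we showed that there are $\binom{n}{2}$ real ED critical points of the distance function from a general real symmetric matrix $A$ to $\mathcal{C}(\Delta)$. In this subsection we argue, in particular, that there are no other (complex) ED critical points in this case.
Let $X^{\mathbb{C}} \subset \mathrm{Sym}(n,\mathbb{C})$ denote the Zariski closure of $\mathcal{C}(\Delta)$, let $G := O(n)$ act on symmetric matrices by conjugation, and let $G^{\mathbb{C}}$ denote the complex orthogonal group. Since $\mathcal{C}(\Delta)$ is $G$-invariant, its Zariski closure $X^{\mathbb{C}}$ is also $G$-invariant. Using the same argument as in [15, Thm. 2.2] we now show that $X^{\mathbb{C}}$ is actually $G^{\mathbb{C}}$-invariant. Indeed, for a fixed point $A \in X^{\mathbb{C}}$ the map
$$\gamma_A : G^{\mathbb{C}} \to \mathrm{Sym}(n,\mathbb{C}), \quad C \mapsto C^T A C$$
is continuous and hence the set $\gamma_A^{-1}(X^{\mathbb{C}}) \subset G^{\mathbb{C}}$ is closed. Since by the above $G \subset \gamma_A^{-1}(X^{\mathbb{C}})$ and since $G^{\mathbb{C}}$ is the Zariski closure of $G$ we must have $\gamma_A^{-1}(X^{\mathbb{C}}) = G^{\mathbb{C}}$.
Let us denote by $\mathrm{Diag}(n,\mathbb{C}) \subset \mathrm{Sym}(n,\mathbb{C})$ the space of complex diagonal matrices. For any matrix $D \in \mathrm{Diag}(n,\mathbb{C})$ with pairwise distinct diagonal entries the tangent space at $D$ to the orbit $G^{\mathbb{C}} D = \{C^T D C : C \in G^{\mathbb{C}}\}$ consists of complex symmetric matrices with zeros on the diagonal:
$$T_D(G^{\mathbb{C}} D) = \{A \in \mathrm{Sym}(n,\mathbb{C}) : a_{11} = \dots = a_{nn} = 0\}.$$
In particular, $\mathrm{Sym}(n,\mathbb{C}) = T_D(G^{\mathbb{C}} D) \oplus \mathrm{Diag}(n,\mathbb{C})$ is a direct sum which is orthogonal with respect to the bilinear form $(A_1, A_2) \mapsto \mathrm{tr}(A_1^T A_2)$. As any real symmetric matrix can be diagonalized by some orthogonal matrix we have $\mathcal{C}(\Delta) = G\,(\mathcal{C}(\Delta) \cap \mathrm{Diag}(n,\mathbb{R}))$. This, together with the inclusion $\mathcal{C}(\Delta) \subset G^{\mathbb{C}}(X^{\mathbb{C}} \cap \mathrm{Diag}(n,\mathbb{C}))$, implies that $G^{\mathbb{C}}(X^{\mathbb{C}} \cap \mathrm{Diag}(n,\mathbb{C}))$ is Zariski dense in $X^{\mathbb{C}}$. We now apply the main theorem from [10] to obtain that the ED degree of $X^{\mathbb{C}}$ in $\mathrm{Sym}(n,\mathbb{C})$ equals the ED degree of $X^{\mathbb{C}} \cap \mathrm{Diag}(n,\mathbb{C})$ in $\mathrm{Diag}(n,\mathbb{C})$. Since $X^{\mathbb{C}} \cap \mathrm{Diag}(n,\mathbb{C}) = \{D \in \mathrm{Diag}(n,\mathbb{C}) : D_i = D_j \text{ for some } i \neq j\}$ is the union of $\binom{n}{2}$ hyperplanes, the ED critical points of a generic $\tilde{D} \in \mathrm{Diag}(n,\mathbb{C})$ are the orthogonal projections of $\tilde{D}$ onto each of the hyperplanes (as in the proof of Theorem 1.3). In particular $\mathrm{EDdeg}(X^{\mathbb{C}}) = \mathrm{EDdeg}(X^{\mathbb{C}} \cap \mathrm{Diag}(n,\mathbb{C})) = \binom{n}{2}$, and if $\tilde{D} \in \mathrm{Diag}(n,\mathbb{R})$ is a generic real diagonal matrix, its ED critical points are all real. Finally, for a general symmetric matrix $A = C^T \tilde{D} C$ all ED critical points are obtained from the ones for $\tilde{D} \in \mathrm{Diag}(n,\mathbb{R})$ via conjugation by $C \in O(n)$.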
The proof of the statement in Remark 5 is similar. Each plane in the plane arrangement $\overline{\mathrm{Diag}(n,\mathbb{R})^w}$ yields one critical point and there are $\frac{n!}{\prod_{i \ge 1} (i!)^{w_i}\, w_i!}$ such planes.

The volume of the discriminant
The goal of this section is to prove Theorem 1.1 and Theorem 1.2. As was mentioned in the introduction, we reduce the computation of the volume to an integral over the GOE ensemble. This is why, before starting the proof, in the next subsection we recall some preliminary concepts and facts from random matrix theory that will be used in the sequel.
3.1. The GOE(n) model for random matrices. The material we present here is from [24].
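The GOE(n) probability measure of any Lebesgue measurable subset $U \subset \mathrm{Sym}(n,\mathbb{R})$ is defined as follows:
$$\mathbb{P}\{U\} := \frac{1}{\sqrt{2}^{\,n} \sqrt{\pi}^{\,N}} \int_U e^{-\frac{\|A\|^2}{2}}\, dA,$$
where $dA = \prod_{1 \le i \le j \le n} dA_{ij}$ is the Lebesgue measure on the space of symmetric matrices $\mathrm{Sym}(n,\mathbb{R})$ and, as before, $\|A\| = \sqrt{\mathrm{tr}(A^2)}$ is the Frobenius norm.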
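To make the model concrete, here is a minimal sampling sketch. It assumes that the GOE(n) density is proportional to $e^{-\|A\|^2/2}$ with $\|A\|^2 = \mathrm{tr}(A^2)$, so that diagonal entries have variance $1$ and off-diagonal entries have variance $1/2$; the function names are ours.

```python
import math
import random

def sample_goe(n, rng=random):
    """Draw a symmetric n x n matrix with density proportional to exp(-tr(A^2)/2).

    Since tr(A^2) = sum_i a_ii^2 + 2 * sum_{i<j} a_ij^2, the diagonal entries
    are N(0, 1) and the off-diagonal entries are N(0, 1/2), all independent.
    """
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = rng.gauss(0.0, 1.0)
        for j in range(i + 1, n):
            A[i][j] = A[j][i] = rng.gauss(0.0, math.sqrt(0.5))
    return A

def frobenius_norm(A):
    """Frobenius norm sqrt(tr(A^2)) of a symmetric matrix given as nested lists."""
    return math.sqrt(sum(x * x for row in A for x in row))

A = sample_goe(3, random.Random(0))
print(frobenius_norm(A))
```

Under this normalization $\|A\|^2$ is a sum of $N = n(n+1)/2$ independent squared standard Gaussians, so its mean is $N$.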
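By [24, Sec. 3.1], the joint density of the eigenvalues of a GOE(n) matrix $A$ is given by the measure $\frac{1}{Z_n} \int_V |\Delta(\lambda)|\, e^{-\frac{\|\lambda\|^2}{2}}\, d\lambda$, where $d\lambda = \prod_{i=1}^n d\lambda_i$ is the Lebesgue measure on $\mathbb{R}^n$, $V \subset \mathbb{R}^n$ is a measurable subset, $\|\lambda\|^2 = \lambda_1^2 + \dots + \lambda_n^2$ is the squared Euclidean norm, $\Delta(\lambda) := \prod_{1 \le i < j \le n} (\lambda_j - \lambda_i)$ is the Vandermonde determinant and $Z_n$ is the normalization constant whose value is given by the formula
$$Z_n = \sqrt{2\pi}^{\,n} \prod_{i=1}^n \frac{\Gamma(1 + \frac{i}{2})}{\Gamma(\frac{3}{2})}, \tag{3.1}$$
see [24, Eq. (17.6.7)] with $\gamma = a = \frac{1}{2}$. In particular, for an integrable function $f : \mathrm{Sym}(n,\mathbb{R}) \to \mathbb{R}$ that depends only on the eigenvalues of $A \in \mathrm{Sym}(n,\mathbb{R})$, the following identity holds:
$$\mathbb{E}_{A \in \mathrm{GOE}(n)} f(A) = \frac{1}{Z_n} \int_{\mathbb{R}^n} f(\lambda_1, \dots, \lambda_n)\, |\Delta(\lambda)|\, e^{-\frac{\|\lambda\|^2}{2}}\, d\lambda. \tag{3.2}$$
3.2. Proof of Theorem 1.1. Recall that by definition the volume of $\Delta$ equals the volume of the smooth part $\Delta_{\mathrm{sm}} \subset S^{N-1}$ that consists of symmetric matrices of unit norm with exactly two repeated eigenvalues. Let us denote by $(S^{n-2})_*$ the dense open subset of the $(n-2)$-sphere consisting of points with pairwise distinct coordinates. We consider the following parametrization of $\Delta_{\mathrm{sm}} \subset S^{N-1}$:
$$p : O(n) \times (S^{n-2})_* \to \Delta_{\mathrm{sm}}, \quad (C, \mu) \mapsto C^T\, \mathrm{diag}(\lambda_1, \dots, \lambda_n)\, C, \tag{3.4}$$
where $\lambda_1, \dots, \lambda_n$ are defined as $\lambda_i = \mu_i$ for $1 \le i \le n-2$ and $\lambda_{n-1} = \lambda_n = \frac{\mu_{n-1}}{\sqrt{2}}$. In Lemma 3.1 below we show that $p$ is a submersion. Applying to it the smooth coarea formula (see, e.g., [12, Theorem 17.8]) we have
$$\int_{O(n) \times (S^{n-2})_*} \mathrm{NJ}_{(C,\mu)}\, p \;\, d(C,\mu) = \int_{A \in \Delta_{\mathrm{sm}}} |p^{-1}(A)|\, dA. \tag{3.5}$$
Here $\mathrm{NJ}_{(C,\mu)}\, p$ denotes the normal Jacobian of $p$ at $(C, \mu)$ and we compute its value in the following lemma.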
Lemma 3.1. The parametrization $p : O(n) \times (S^{n-2})_* \to \Delta_{\mathrm{sm}}$ is a submersion and its normal Jacobian at $(C, \mu) \in O(n) \times (S^{n-2})_*$ is given by the formula
Proof. Recall that for a smooth submersion $f : M \to N$ between two Riemannian manifolds the normal Jacobian of $f$ at $x \in M$ is the absolute value of the determinant of the restriction of the differential $D_x f : T_x M \to T_{f(x)} N$ of $f$ at $x$ to the orthogonal complement of its kernel. We now show that the parametrization $p : O(n) \times (S^{n-2})_* \to \Delta_{\mathrm{sm}}$ is a submersion and compute its normal Jacobian.
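Note that $p$ is equivariant with respect to the right action of $O(n)$ on itself and its action on $\Delta_{\mathrm{sm}}$ via conjugation, i.e., for all $C, \tilde{C} \in O(n)$ and $\mu \in (S^{n-2})_*$ we have $p(C\tilde{C}, \mu) = \tilde{C}^T p(C, \mu)\, \tilde{C}$. Therefore, $D_{(C,\mu)}\, p = C^T\, D_{(\mathbb{1},\mu)}\, p\; C$ and, consequently, $\mathrm{NJ}_{(C,\mu)}\, p = \mathrm{NJ}_{(\mathbb{1},\mu)}\, p$. We compute the latter. The differential of $p$ at $(\mathbb{1}, \mu)$ is the map
$$D_{(\mathbb{1},\mu)}\, p : T_{\mathbb{1}} O(n) \times T_\mu (S^{n-2})_* \to T_{p(\mathbb{1},\mu)} \Delta_{\mathrm{sm}}, \quad (\dot{C}, \dot{\mu}) \mapsto \dot{C}^T \Lambda + \Lambda \dot{C} + \mathrm{diag}\Big(\dot{\mu}_1, \dots, \dot{\mu}_{n-2}, \tfrac{\dot{\mu}_{n-1}}{\sqrt{2}}, \tfrac{\dot{\mu}_{n-1}}{\sqrt{2}}\Big),$$
where $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$. The Lie algebra $T_{\mathbb{1}} O(n)$ consists of skew-symmetric matrices: $T_{\mathbb{1}} O(n) = \{R \in \mathbb{R}^{n \times n} : R^T = -R\}$. Let $E_{i,j}$ be the matrix that has zeros everywhere except for the entry $(i, j)$ where it equals 1. For $1 \le i < j \le n$ a direct computation gives $D_{(\mathbb{1},\mu)}\, p\,(E_{i,j} - E_{j,i}, 0) = (\lambda_i - \lambda_j)(E_{i,j} + E_{j,i})$, which vanishes only for $(i,j) = (n-1,n)$.
This implies that $p$ is a submersion and
Combining this with the fact that the restriction of $D_{(\mathbb{1},\mu)}\, p$ to $T_\mu (S^{n-2})_*$ is an isometry we obtain the claimed formula, with the product taken over $1 \le i < j \le n$, $(i,j) \neq (n-1,n)$, which finishes the proof.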
We now compute the volume of the fiber $p^{-1}(A)$, $A \in \Delta_{\mathrm{sm}}$, that appears in (3.5).
The function being integrated is independent of $C \in O(n)$. Thus, using Fubini's theorem we can perform the integration over the orthogonal group. Furthermore, the integrand is a homogeneous function of degree $\frac{(n-2)(n+1)}{2}$. Passing from spherical coordinates to spatial coordinates and extending the domain of integration to the measure-zero set of points with repeated coordinates we obtain
Let us write $u := \frac{\mu_{n-1}}{\sqrt{2}}$ for the double eigenvalue and make a change of variables from $\mu_{n-1}$ to $u$. Considering the eigenvalues $\mu_1, \dots, \mu_{n-2}$ as the eigenvalues of a symmetric $(n-2) \times (n-2)$ matrix $Q$, by (3.2) we have
Using formulas (3.1) and (3.3) for $Z_{n-2}$ and $|O(n)|$ respectively we write
where in the last step the duplication formula for the Gamma function, $\Gamma(\frac{n}{2})\,\Gamma(\frac{n-1}{2}) = 2^{2-n} \sqrt{\pi}\,(n-2)!$, has been used. Let us recall the formula for the volume of the $(N-3)$-dimensional unit sphere: $|S^{N-3}| = \frac{2 \pi^{(N-2)/2}}{\Gamma(\frac{N-2}{2})}$. With it we simplify the constant in (3.6)
Plugging this into (3.6) we have
$$\frac{|\Delta|}{|S^{N-3}|} = \binom{n}{2}\, \frac{2^{n-1}}{\sqrt{\pi}\, n!} \int_{u \in \mathbb{R}} \mathbb{E}_{Q \in \mathrm{GOE}(n-2)} \det(Q - u\mathbb{1})^2\, e^{-u^2}\, du. \tag{3.7}$$
Combining the last formula with Theorem 1.6, whose proof is given in Section 4, we finally derive the claim of Theorem 1.1: $\frac{|\Delta|}{|S^{N-3}|} = \binom{n}{2}$.
Remark 6. The proof can be generalized to subsets of $\Delta$ that are defined by an eigenvalue configuration given by a measurable subset of $(S^{n-2})_*$. Such a configuration only adjusts the domain of integration in (3.7). For instance, consider the subset
$$(S^{n-2})_1 := \Big\{\mu \in (S^{n-2})_* : \tfrac{\mu_{n-1}}{\sqrt{2}} < \mu_i \text{ for } 1 \le i \le n-2\Big\}.$$
It is an open semialgebraic subset of $(S^{n-2})_*$ and $\Delta_1 := p(O(n) \times (S^{n-2})_1)$ is the smooth part of the matrices whose two smallest eigenvalues coincide. Following the proof until (3.7), we get
$$\frac{|\Delta_1|}{|S^{N-3}|} = \binom{n}{2}\, \frac{2^{n-1}}{\sqrt{\pi}\, n!} \int_{u \in \mathbb{R}} \mathbb{E}_{Q \in \mathrm{GOE}(n-2)} \big[\det(Q - u\mathbb{1})^2\, \mathbf{1}_{\{Q \succ u\mathbb{1}\}}\big]\, e^{-u^2}\, du,$$
where $\mathbf{1}_{\{Q \succ u\mathbb{1}\}}$ is the indicator function of $Q - u\mathbb{1}$ being positive definite.
3.3. Multiplicities in a random family. In this subsection we prove Theorem 1.2.
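The proof consists of an application of the Integral Geometry Formula from [20, p. 17]. Denote $\hat{F} := \pi \circ F : \Omega \to S^{N-1}$. Then, by assumption, with probability one we have:
$$\#F^{-1}(\mathcal{C}(\Delta)) = \#(\hat{F}(\Omega) \cap \Delta).$$
Observe also that, since the list $(f_1, \dots, f_N)$ consists of i.i.d. random Gaussian fields, for every $g \in O(N)$ the random maps $\hat{F}$ and $g \circ \hat{F}$ have the same distribution and
$$\mathbb{E}\,\#(\hat{F}(\Omega) \cap \Delta) = \mathbb{E}\, \frac{1}{|O(N)|} \int_{O(N)} \#(g \hat{F}(\Omega) \cap \Delta)\, dg = 2\, \frac{\mathbb{E}\,|\hat{F}(\Omega)|}{|S^2|}\, \frac{|\Delta|}{|S^{N-3}|}.$$
We have used the Integral Geometry Formula [20, p. 17] in the last step. Let $L = \{x_1 = x_2 = 0\}$ be the codimension-two subspace of $\mathrm{Sym}(n,\mathbb{R})$ given by the vanishing of the first two coordinates (in fact: any two coordinates). The conclusion follows by applying integral geometry again:
$$\mathbb{E}\,\#\{f_1 = f_2 = 0\} = \mathbb{E}\,\#(\hat{F}(\Omega) \cap L) = 2\, \frac{\mathbb{E}\,|\hat{F}(\Omega)|}{|S^2|}.$$
Together with Theorem 1.1 this finishes the proof.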

The second moment of the characteristic polynomial of a GOE matrix
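In this section we give a proof of Theorem 1.6. Let us first recall some ingredients and prove some auxiliary results.
Proof. The formula (3.1) for $Z_{2m}$ reads
$$Z_{2m} = (2\pi)^m \prod_{i=1}^{2m} \frac{\Gamma(1 + \frac{i}{2})}{\Gamma(\frac{3}{2})}.$$
Using the formula $\Gamma(z)\,\Gamma(z + \frac{1}{2}) = \sqrt{\pi}\, 2^{1-2z}\, \Gamma(2z)$ [33, 43:5:7] with $z = i + \frac{1}{2}$ we obtain
$$Z_{2m} = 2^{2m - m^2}\, \sqrt{\pi}^{\,m} \prod_{i=1}^m (2i)!.$$
This proves the claim.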
Recall now that the (physicist's) Hermite polynomials $H_i(x)$, $i = 0, 1, 2, \dots$ form a family of orthogonal polynomials on the real line with respect to the measure $e^{-x^2} dx$. They are defined by
$$H_i(x) := (-1)^i e^{x^2} \frac{d^i}{dx^i} e^{-x^2}.$$
A Hermite polynomial is either an odd (if the degree is odd) or an even (if the degree is even) function:
$$H_i(-x) = (-1)^i H_i(x),$$
and its derivative satisfies
$$H_i'(x) = 2i\, H_{i-1}(x);$$
see [33, (24:5:1)], [18, (8.952.1)] for these properties.
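The parity and derivative properties of the Hermite polynomials are easy to confirm on coefficient lists. The sketch below generates $H_i$ via the standard three-term recurrence $H_{k+1}(x) = 2x H_k(x) - 2k H_{k-1}(x)$, which is used here as a computational substitute for the defining formula; the helper names are ours.

```python
def hermite(i):
    """Coefficients (lowest degree first) of the physicist's Hermite polynomial H_i.

    Uses the three-term recurrence H_{k+1}(x) = 2x H_k(x) - 2k H_{k-1}(x)
    with H_0 = 1 and H_1 = 2x.
    """
    prev, cur = [1], [0, 2]
    if i == 0:
        return prev
    for k in range(1, i):
        shifted = [0] + [2 * c for c in cur]            # 2x * H_k
        padded = prev + [0] * (len(shifted) - len(prev))
        prev, cur = cur, [s - 2 * k * p for s, p in zip(shifted, padded)]
    return cur

def derivative(coeffs):
    """Coefficient list of the derivative of a polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:] or [0]

print(hermite(3))              # coefficients of 8x^3 - 12x
print(derivative(hermite(3)))  # equals 6 * H_2
```

Parity shows up as vanishing of every coefficient whose degree has the opposite parity to $i$.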
The following proposition is crucial for the proof of Theorem 1.6. (1) If $k = 2m$ is even, then
(2) If $k = 2m + 1$ is odd, then
Proof. In Section 22 of [24] one finds two different formulas for the even $k = 2m$ and odd $k = 2m + 1$ cases. We evaluate both separately.
Plugging this into (4.4) we conclude that
Everything is now ready for the proof of Theorem 1.6.
Proof of Theorem 1.6. Due to the nature of Proposition 4.2 we also have to distinguish between even and odd $k$ in this proof.
Plugging back in $m = \frac{k}{2}$ finishes the proof of the case $k = 2m$. In the case $k = 2m + 1$ we use the formula from Proposition 4.2 (2) to see that
$$\int_{u \in \mathbb{R}} \mathbb{E} \det(Q - u\mathbb{1})^2\, e^{-u^2}\, du = \sqrt{\pi}\,(2m+1)!\, \cdots$$
It is not difficult to verify that the last term is $2^{-2m-2} \sqrt{\pi}\,(2m+3)!$. Substituting $2m + 1 = k$ shows the assertion in this case.
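The closed form obtained above, $2^{-k-1} \sqrt{\pi}\,(k+2)!$ for the integral of $\mathbb{E} \det(Q - u\mathbb{1})^2 e^{-u^2}$ over $u$, can be verified exactly for small $k$. The sketch below assumes the normalization in which the diagonal entries of a GOE matrix have variance $1$ and the off-diagonal entries variance $1/2$. It expands the determinant as a polynomial, squares it, and integrates term by term using Gaussian moments, noting that $\frac{1}{\sqrt{\pi}} \int u^{e} e^{-u^2} du$ is the $e$-th moment of $N(0, 1/2)$. All names are ours.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def poly_mul(p, q):
    """Multiply two polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for ea, ca in p.items():
        for eb, cb in q.items():
            e = tuple(x + y for x, y in zip(ea, eb))
            out[e] = out.get(e, 0) + ca * cb
    return out

def perm_sign(perm):
    """Sign of a permutation via its inversion count."""
    inv = sum(1 for a in range(len(perm)) for b in range(a + 1, len(perm))
              if perm[a] > perm[b])
    return -1 if inv % 2 else 1

def gaussian_moment(e, variance):
    """E[x^e] for x ~ N(0, variance): 0 for odd e, variance^(e/2) * (e-1)!! otherwise."""
    if e % 2:
        return Fraction(0)
    double_fact = 1
    for t in range(1, e, 2):
        double_fact *= t
    return Fraction(variance) ** (e // 2) * double_fact

def second_moment_over_sqrt_pi(k):
    """(1/sqrt(pi)) * integral over u of E det(Q - u*1)^2 exp(-u^2), Q in GOE(k)."""
    pairs = [(i, j) for i in range(k) for j in range(i, k)]
    nvars = len(pairs) + 1                      # matrix entries plus u (last slot)

    def var(idx):
        return tuple(1 if t == idx else 0 for t in range(nvars))

    def entry(i, j):
        a, b = min(i, j), max(i, j)
        p = {var(pairs.index((a, b))): Fraction(1)}
        if i == j:
            p[var(nvars - 1)] = Fraction(-1)    # diagonal entry q_ii - u
        return p

    det = {}
    for perm in permutations(range(k)):         # Leibniz expansion of det(Q - u*1)
        term = {(0,) * nvars: Fraction(perm_sign(perm))}
        for i in range(k):
            term = poly_mul(term, entry(i, perm[i]))
        for e, c in term.items():
            det[e] = det.get(e, 0) + c
    variances = [Fraction(1) if i == j else Fraction(1, 2) for i, j in pairs]
    variances.append(Fraction(1, 2))            # u behaves like N(0, 1/2)
    total = Fraction(0)
    for e, c in poly_mul(det, det).items():
        m = c
        for power, v in zip(e, variances):
            m *= gaussian_moment(power, v)
        total += m
    return total

for k in (1, 2, 3):
    print(k, second_moment_over_sqrt_pi(k), Fraction(factorial(k + 2), 2 ** (k + 1)))
```

Because every quantity is a rational multiple of $\sqrt{\pi}$, the comparison is exact rather than numerical.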