The Real Polynomial Eigenvalue Problem is Well Conditioned on the Average

We study the average condition number for polynomial eigenvalues of collections of matrices drawn from some random matrix ensembles. In particular, we prove that polynomial eigenvalue problems defined by matrices with random Gaussian entries are very well conditioned on the average.


Introduction
Following the ideas in [3,7,19], we note that many different numerical problems can be described within the following simple general framework. We consider a space of inputs and a space of outputs denoted by I and O, respectively, and some equation of
One can force some of the matrices to be symmetric and/or positive definite (which leads to a semialgebraic solution variety), a particularly important case in applications, or consider other structured problems, see [10,13,18,21]. In the cases d = 1 and d = 2 polynomial eigenvalues are often referred to as generalized eigenvalues and quadratic eigenvalues, respectively. If n = 1, we recover the homogeneous version of 1.
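As an illustration of the smallest non-trivial instance (n = 2, d = 1), the following Python sketch computes the real generalized eigenvalues of a pair of 2 × 2 matrices as points on the unit circle. It assumes the usual homogeneous convention, in which (α, β) is a polynomial eigenvalue of (A_0, A_1) when det(β A_0 + α A_1) = 0. This convention, the function name and the choice of representatives with angle in [0, π) are assumptions of the sketch, not notation fixed by the text.

```python
import math


def real_generalized_eigenvalues_2x2(A0, A1):
    """Return unit vectors (alpha, beta), one per real projective root,
    with det(beta*A0 + alpha*A1) = 0 for 2x2 real matrices A0, A1.

    Assumed convention (hypothetical, see lead-in): d = 1, n = 2.
    Returns [] when there is no real root or when the determinant
    vanishes identically (no isolated eigenvalues).
    """
    (a, b), (c, d) = A0
    (e, f), (g, h) = A1
    # det(beta*A0 + alpha*A1) = c0*beta^2 + c1*alpha*beta + c2*alpha^2
    c0 = a * d - b * c
    c1 = a * h + e * d - b * g - f * c
    c2 = e * h - f * g
    # With alpha = cos(t), beta = sin(t) the form becomes
    # (c0 + c2)/2 + (r/2) * cos(2t - phi), r = hypot(c2 - c0, c1).
    s = c0 + c2
    r = math.hypot(c2 - c0, c1)
    if r == 0.0 or abs(s) > r:  # equivalent to c1^2 < 4*c0*c2, or a trivial form
        return []
    phi = math.atan2(c1, c2 - c0)
    delta = math.acos(-s / r)
    # A double root may appear twice because of rounding; this is not handled.
    angles = {((phi + sign * delta) / 2.0) % math.pi for sign in (1.0, -1.0)}
    return [(math.cos(t), math.sin(t)) for t in sorted(angles)]
```

For example, with A_0 = diag(1, −1) and A_1 the identity the determinant equals α² − β², and the sketch returns the two points of the circle with α = ±β. Since the determinant is a binary form of degree nd = 2, at most two isolated real solutions can occur.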
In this paper, we prove a general theorem computing exactly the expected value of the condition number in a wide collection of problems, including problem 4 above.
We start by recalling the general geometric definition of the condition number, which is usually thought of as "a measure of the sensitivity of the solution o under an infinitesimal perturbation of the input i." A Finsler structure on a differentiable manifold M is a smooth field of norms $\|\cdot\|_p : T_pM \to \mathbb{R}$, $p \in M$, on M (see [3, p. 223] for more details). In particular, a Riemannian structure $\langle\cdot,\cdot\rangle$ on M defines a Finsler structure on it by $\|\dot p\|_p = \sqrt{\langle \dot p, \dot p\rangle_p}$, $p \in M$, $\dot p \in T_pM$.
is the set of n × n real matrices endowed with the Finsler structure associated to relative errors in operator norm:
In the PEVP, the input space I is endowed with the following Riemannian structure:
where $A = (A_0, \dots, A_d)$, $(\alpha, \beta) \in \mathbb{R}^2$ is a (representative of a) polynomial eigenvalue of A, r and ℓ are the corresponding right and left eigenvectors and
A given tuple A can have up to nd real isolated polynomial eigenvalues. We define the condition number of A simply as the sum of the condition numbers over all these PEVs:
In Corollary 3, we provide an analogous formula in the case when $A_0, \dots, A_d$ are independent GOE(n)-distributed matrices.

Remark 2
Recently in [1] Armentano and the first author of the current article investigated the expectation of the squared condition number for polynomial eigenvalues of complex Gaussian matrices. Theorem 1 establishes the "asymptotic square root law" for the considered problem, i.e., when n → +∞ (and up to the factor π/2) our answer in (2) equals the square root of the answer in [1]. See [2,5,8,15,20] for different contexts where a square root law has been established. For an example of a problem where a square root law fails to hold, see [6].
In Sect. 1, we state our main results, of which Theorem 1 is an easy consequence. Their proofs are given in Sect. 3 and in Sect. 4; some technical results are left for the Appendix.

Main Results
In this section, we state our most general result, from which Theorem 1 will follow. First, let us fix a general framework which analyzes the input-output problems described above in a semialgebraic context. For the rest of this paper, the input and the output sets will be, respectively, the punctured real vector space $I = \mathbb{R}^m \setminus \{0\}$ and the unit circle $S^1 \subset \mathbb{R}^2$ endowed with the standard Riemannian structures. The solution variety will be a semialgebraic set $S \subset \mathbb{R}^m \times S^1 \subset \mathbb{R}^m \times \mathbb{R}^2$ (we change the letter from V to S to emphasize that it is semialgebraic). We denote by $S^{\mathrm{top}}$ the union of the top-dimensional (smooth) strata of S (see Sect. 2 for details). Then, the smooth manifold $S^{\mathrm{top}} \subset \mathbb{R}^m \times S^1$ is endowed with the induced Riemannian product structure. The two projections defined on S are denoted by $p_1 : S \to \mathbb{R}^m$, $p_2 : S \to S^1$.
Definition 2 (Condition number in the semialgebraic setting) Let $S \subseteq \mathbb{R}^m \times S^1$ be any m-dimensional semialgebraic set. Near a regular point $(a, x) \in S^{\mathrm{top}}$ the first projection $p_1 : S^{\mathrm{top}} \to \mathbb{R}^m$ is locally invertible, i.e., there exist a neighborhood $U \subset \mathbb{R}^m$ of a and a unique smooth map $p_1^{-1} : U \to S^{\mathrm{top}}$ such that $p_1^{-1}(a) = (a, x)$ and $p_1 \circ p_1^{-1} = \mathrm{id}_U$. In this case, the local relative condition number μ(a, x) is defined as
For points $(a, x) \in S^{\mathrm{low}} = S \setminus S^{\mathrm{top}}$ in the strata of lower dimension of S as well as for critical points $(a, x) \in S^{\mathrm{top}}$ of $p_1$ :
The relative condition number μ(a) of $a \in \mathbb{R}^m$ is defined to be the sum of all local relative condition numbers μ(a, x):
$$\mu(a) = \sum_{x \in S^1 : (a, x) \in S} \mu(a, x).$$
Note also that Definition 2 agrees with Definition 1 if S is algebraic and we endow the input space $I = \mathbb{R}^m \setminus \{0\}$ with the Riemannian structure associated to relative errors, that is, $\langle \dot a, \dot b \rangle_a = (\dot b^t \dot a)/\|a\|^2$, $a \in \mathbb{R}^m \setminus \{0\}$.
To simplify terminology, throughout the rest of the paper, we omit the word "relative" when referring to (local) relative condition number.
We deal with a large class of semialgebraic subsets of R m × S 1 that we define next.

Definition 3
We say that the semialgebraic set $S \subset \mathbb{R}^m \times S^1$ is non-degenerate if the following conditions are satisfied:
In Proposition 1, we show that this condition is equivalent to the following one:
2. there exists a semialgebraic subset $B \subset \mathbb{R}^m$ of dimension at most m − 2 such that for any $a \notin B$ the fiber $p_1^{-1}(a)$ is finite.
The first condition in Definition 3 implies that S is m-dimensional (see Lemma 1). To perform our probabilistic study, we take the input variables $a = (a_1, \dots, a_m) \in \mathbb{R}^m$ to be independent standard Gaussians: $a_i \sim N(0, 1)$. In the following theorem, we establish a general formula for the expectation of the condition number μ(a) of a randomly chosen $a \in \mathbb{R}^m$:
where μ is given in Definition 2. If, moreover, S is scale-invariant with respect to the first m variables, i.e., (a, x) ∈ S if and only if (ta, x) ∈ S for any t > 0, then
The following form of Theorem 2 for sets in $\mathbb{R}^m \times \mathbb{RP}^1$ better fits our purposes.
Note that Corollary 1 is just a "projective" version of the second part of Theorem 2.
As pointed out in the introduction, we are specifically interested in the polynomial eigenvalue problem.
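The space M(n, R) of n × n real matrices is endowed with the Frobenius inner product and the associated norm:
$$(A, B) = \mathrm{tr}(A^t B), \qquad \|A\| = \sqrt{(A, A)}, \qquad A, B \in M(n, \mathbb{R}). \tag{4}$$
Then, a k-dimensional vector subspace V ⊂ M(n, R) is endowed with the standard normal probability distribution $N_V$:
$$N_V(U) = \frac{1}{\sqrt{2\pi}^{\,k}} \int_U e^{-\|v\|^2/2}\, dv,$$
where dv is the Lebesgue measure on (V, (·, ·)) and U ⊂ V is a measurable subset.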
where μ(A, x) is as in Definition 2 with $\mathbb{R}^m = (V, (\cdot, \cdot))^{d+1}$ so that m = (d + 1)k and
As proved in [10], in the case V = M(n, R) this definition for μ(A, x) is equivalent to (1). In the following theorem, we investigate the expected condition number for polynomial eigenvalues of independent
Poincaré's formula [14, (3-5)] allows us to derive the following universal upper bound.
In the case V = M(n, R) of all square matrices, we provide an explicit formula for the expected condition number, that is the claim of our Theorem 1 above.
We give an explicit answer also in the case V = Sym(n, R) of symmetric matrices. In this case, the probability space (Sym(n, R), N Sym(n,R) ) is usually referred to as Gaussian Orthogonal Ensemble (GOE).
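To make the normalization concrete, here is a minimal Python sketch of sampling from the standard normal distribution on Sym(n, R) with respect to the Frobenius inner product. It assumes that the density is proportional to exp(−‖A‖²/2) with ‖A‖ the Frobenius norm, so that diagonal entries have variance 1 and off-diagonal entries have variance 1/2; other references normalize the GOE differently. The function names are hypothetical.

```python
import math
import random


def sample_goe(n, rng=random):
    """Sample a symmetric n x n matrix (list of lists) whose coordinates in an
    orthonormal basis of Sym(n, R) for the Frobenius inner product are
    independent standard Gaussians (assumed normalization, see lead-in)."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # E_ii has Frobenius norm 1, so the diagonal entry is N(0, 1).
        A[i][i] = rng.gauss(0.0, 1.0)
        for j in range(i + 1, n):
            # (E_ij + E_ji)/sqrt(2) has Frobenius norm 1, so a_ij is N(0, 1/2).
            A[i][j] = A[j][i] = rng.gauss(0.0, 1.0) / math.sqrt(2.0)
    return A


def frobenius_norm_sq(A):
    """Squared Frobenius norm, i.e. the sum of squares of all entries."""
    return sum(x * x for row in A for x in row)
```

Under this normalization the expected value of ‖A‖² equals k = dim Sym(n, R) = n(n + 1)/2, which gives a quick sanity check on the sampler.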

Corollary 3
If $A_0, \dots, A_d \in \mathrm{Sym}(n, \mathbb{R})$ are independent GOE(n)-matrices and n is even, then
If n is odd, the explicit formula is more complicated and is given in the proof of the corollary. However, the above asymptotic formula is valid for both even and odd n.

Preliminaries
Below, we state a few classical results in semialgebraic geometry that we will use; the proofs can be found in [4,9]. Given a semialgebraic set $S \subset \mathbb{R}^N$ of dimension k ≤ N, we fix a semialgebraic stratification of S, i.e., a partition of S into finitely many semialgebraic subsets (called strata) such that each stratum is a smooth submanifold of $\mathbb{R}^N$ and the boundary of any stratum of dimension i ≤ N is a union of some strata of dimension less than i. We denote by $S^{\mathrm{top}}$ the union of all k-dimensional strata of S and by $S^{\mathrm{low}} = S \setminus S^{\mathrm{top}}$ the union of the strata of dimension less than k. The sets $S^{\mathrm{top}}, S^{\mathrm{low}} \subset \mathbb{R}^N$ are semialgebraic, and $S^{\mathrm{top}}$ is a smooth k-dimensional submanifold of $\mathbb{R}^N$.
One of the central results about semialgebraic mappings is Hardt's theorem.
The following two corollaries of Hardt's theorem are frequently used to estimate the dimension of semialgebraic sets.
Corollary 4 Let $f : S \to \mathbb{R}^M$ be as above. The set $A_d = \{x \in \mathbb{R}^M : \dim(f^{-1}(x)) = d\}$ is semialgebraic and has dimension not greater than dim(S) − d.
Proof With the notation of Hardt's theorem, let us write
We thus have $A_d = \bigcup_{i \in I} C_i$ for some finite index set I, and we conclude that $A_d$ is a semialgebraic set. Moreover, for i ∈ I we have that
and the second claim of the corollary follows.
Corollary 5 Let $f : S \to \mathbb{R}^M$ be as above and let $Z \subseteq \mathbb{R}^M$. Then, for some z ∈ Z we
Proof Again using the notation of Hardt's theorem for the restriction g :

Proof of Main Results
In this section, we prove our main results, Theorems 2 and 3. Let us first fix some notation used in the rest of the paper. For a non-degenerate subset $S \subset \mathbb{R}^m \times S^1$, we denote by $\Sigma_1, \Sigma_2 \subset S^{\mathrm{top}}$ the semialgebraic sets of critical points of $p_1 : S^{\mathrm{top}} \to \mathbb{R}^m$ and $p_2 : S^{\mathrm{top}} \to S^1$, respectively; the corresponding semialgebraic sets of critical values are denoted by $\sigma_1 = p_1(\Sigma_1) \subset \mathbb{R}^m$ and $\sigma_2 = p_2(\Sigma_2) \subset S^1$.

Proof of Theorem 2
In this subsection, S denotes a non-degenerate semialgebraic subset of $\mathbb{R}^m \times S^1$. For the proof of Theorem 2, we need a few technical lemmas which we state and prove below.
We now prove that S\M = p −1 is discrete, which together with the non-degeneracy of S and Corollary 5 implies dim( p −1

Lemma 3 There exists an open semialgebraic subset R
Proof Since S is non-degenerate, every fiber p −1
Note that the set $S^1 \setminus p_2(S^{\mathrm{top}})$ is semialgebraic and zero-dimensional, thus finite. Indeed, if it were one-dimensional, Theorem 4 together with dim( p −1
The semialgebraic set $\sigma_2 = p_2(\Sigma_2) \subset S^1$ of critical values of $p_2 : S^{\mathrm{top}} \to S^1$ has measure zero by Sard's theorem (see [12, p. 39]). Hence, $\sigma_2 \subset S^1$ consists of a finite number of points.
Applying Corollary 4 to $p_2 : S^{\mathrm{low}} \to S^1$, we have that C := {x ∈ S 1 :
is finite by the above arguments. Since R consists of regular points of $p_2 : S^{\mathrm{top}} \to S^1$, the map p 2 :

Lemma 4 For any measurable function f
Here, N J stands for the normal Jacobian of a smooth map, that is, the absolute value of the determinant of the differential restricted to the orthogonal complement to its kernel.
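For a surjective linear map between Euclidean spaces given by a k × m matrix L, the definition above amounts to the standard identity NJ = sqrt(det(L L^t)), the product of the singular values of L. The following Python sketch evaluates this expression for small matrices; the function names are hypothetical and the determinant routine is a plain Gaussian elimination chosen only to keep the example self-contained.

```python
import math


def determinant(M):
    """Determinant of a square matrix (list of lists) by Gaussian
    elimination with partial pivoting."""
    M = [row[:] for row in M]
    n = len(M)
    det = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        if M[pivot][c] == 0.0:
            return 0.0
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            det = -det  # a row swap flips the sign
        det *= M[c][c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for j in range(c, n):
                M[r][j] -= factor * M[c][j]
    return det


def normal_jacobian(L):
    """Normal Jacobian sqrt(det(L L^t)) of the linear map x -> L x,
    where L is a k x m matrix with k <= m (value 0 if L is not surjective)."""
    k = len(L)
    gram = [[sum(L[i][t] * L[j][t] for t in range(len(L[0]))) for j in range(k)]
            for i in range(k)]
    return math.sqrt(max(determinant(gram), 0.0))
```

When k = 1 the map is given by a single row vector, and the normal Jacobian reduces to the Euclidean norm of that row.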
Proof Let $M \subset S^{\mathrm{top}}$ be as in Lemma 2. The smooth coarea formula [14, (A-2)] applied to the measurable function f : M → [0, +∞) and to the submersion p 1 :
where we used that $M = p_1^{-1}(p_1(M))$ (Lemma 2) to be able to sum over the whole fiber $p_1^{-1}(a) = \{(a, x) \in S\}$, $a \in p_1(M)$. By Lemma 2, we have $\dim(S \setminus M) \le m - 1$, and hence, $\dim(p_1(S) \setminus p_1(M)) = \dim(p_1(S \setminus M)) \le \dim(S \setminus M) \le m - 1$. Thus, we extend the integrations in (8) over S and $p_1(S)$, respectively, without changing the result. Moreover, the integration over $p_1(S)$ can be further extended to the whole space $\mathbb{R}^m$ since for a point $a \in \mathbb{R}^m \setminus p_1(S)$ the summation $\sum_{x \in S^1 : (a,x) \in S} f(a, x)$ is performed over the empty set $p_1^{-1}(a)$ and the sum is conventionally set to 0. Altogether, the above arguments imply
Let $R \subset S^{\mathrm{top}}$ be as in Lemma 3. Applying the smooth coarea formula [14, (A-2)] to the measurable function $\frac{NJ\, p_1}{NJ\, p_2} f : R \to [0, +\infty)$ and to the submersion
Thus, the integrations in (10) can be extended over S, $S^1$ and $p_2^{-1}(x)$, respectively, leading to
Combining (11) with (9), we finish the proof.
Now comes the proof of Theorem 2.

Proof of Theorem 2
The following identity is the key point of the proof:
where $M \subset S^{\mathrm{top}}$ and $R \subset S^{\mathrm{top}}$ are as in Lemmas 2 and 3, respectively, and μ(a, x), the local condition number of (a, x) ∈ S, is defined in Definition 2. The proof of the identity comes after we derive the statement of Theorem 2. Applying Lemma 4 to the measurable function $f(a, x) = \mu(a, x)\, e^{-\|a\|^2/2} / \sqrt{2\pi}^{\,m}$, (a, x) ∈ S, and using (12), we obtain:
$\dots\ \|a\|\, e^{-\|a\|^2/2}\, da\, dx = (*)$,
which gives the claimed formula (3). If S is scale-invariant with respect to $a \in \mathbb{R}^m$, then by Lemma 5 we have
Now, we turn to the proof of (12).
$(\dot a_1, 0), \dots, (\dot a_{m-1}, 0)$ be an orthonormal basis of $T_{(a,x)} R$ with $(\dot a_j, 0) \in \ker D_{(a,x)} p_2$, j = 1, ..., m − 1. Note that $\dot a_0 \in \mathbb{R}^m$, $\dot x_0 \in T_x S^1$ are nonzero since $p_1 : M \to p_1(M)$, $p_2 : R \to p_2(R)$ are submersions and $\dot a_0 \in \mathbb{R}^m$ is orthogonal to $\dot a_j \in \mathbb{R}^m$, j = 1, ..., m − 1. We compute the normal Jacobians $NJ_{(a,x)} p_1$ and $NJ_{(a,x)} p_2$ using the following orthonormal bases:
It is straightforward to see that $NJ_{(a,x)} p_1 = \|\dot a_0\|$ and $NJ_{(a,x)} p_2 = \|\dot x_0\|$ and hence
This together with (13) implies the claimed identity (12).

Proof of Theorem 3
For a k-dimensional vector subspace V ⊂ M(n, R) and for a basis $f = (f_0(\alpha, \beta), \dots, f_d(\alpha, \beta))$ of the space $P_{d,2}$ of binary forms of degree d ≥ 1, let us define the algebraic variety
Theorem 3 follows from the following more general result, which we state for any choice of basis (not necessarily the monomial basis), since for some problems it may be useful to consider other bases such as the one coming from harmonic forms [16].

Theorem 5 If Σ V ⊂ V is of codimension one and f is any basis of P d,2 , then S(V , f ) is non-degenerate and
It is easy to verify that the linear change of coordinates $A_j = \sum_{i=0}^{d} g_{ij} \tilde A_i$, j = 0, ..., d, is an isometry of the product space $(V, (\cdot, \cdot))^{d+1}$ (where the inner product (·, ·) on V is defined in (4)) and
Therefore, for $x = (\alpha, \beta) \in S^1$ there is a global isometry $I_x : (V, (\cdot, \cdot))^{d+1} \to (V, (\cdot, \cdot))^{d+1}$ that sends the fiber p −1
In particular, under the assumption $\dim(\Sigma_V) = k - 1$ we have that $p_2^{-1}(x)$ is of codimension one in $V^{d+1}$, and hence, condition (1) in Definition 3 is satisfied.

Proof of Theorem 3 Taking
in Theorem 5, we obtain the claim of Theorem 3.

Applications of Main Results
In this section, we derive Theorem 1 and Corollaries 2, 3.
For any particular space V ⊂ M(n, R) satisfying $\dim(\Sigma_V) = k - 1 = \dim(V) - 1$, Theorem 3 reduces the explicit computation of the expected condition number for polynomial eigenvalues to computing the volume of the hypersurface $\Sigma_V \cap S^{k-1}$. In the cases V = M(n, R) and V = Sym(n, R), formulas for the volume of $\Sigma_V \cap S^{k-1}$ were found in [11] and [17], respectively. It is easy to see that in both of these cases the variety $\Sigma_V$ of singular matrices is of codimension one, and hence, the hypothesis of Theorem 3 is satisfied.

Proof of Theorem 1
The formula from [11] reads
where the asymptotic is obtained using formula (1) from [22].
The following elementary lemma is frequently used throughout Sect. 3.
Lemma 5 If $X \subset (\mathbb{R}^m, \|\cdot\|)$ is a scale-invariant semialgebraic variety of dimension p ≤ m and q > 0, then