1 Introduction

Given \(n \times n\) matrices \(C_0, C_1, \dots , C_k \in \mathbb {C}^{n \times n}\), consider the square matrix polynomial of degree k

$$\begin{aligned} P(x) = \sum _{j=0}^k C_j x^j; \end{aligned}$$
(1)

a finite eigenvalue of P(x) is then defined [9, 12] as a number \(\lambda \in \mathbb {C}\) such that

$$\begin{aligned} \mathrm {rank}_\mathbb {C}P(\lambda ) < \mathrm {rank}_{\mathbb {C}(x)} P(x). \end{aligned}$$

The polynomial eigenvalue problem (PEP) is to find all such eigenvalues [1, 8–12, 15, 21], possibly (and depending on the application) together with other objects—such as eigenspaces, infinite eigenvalues, minimal indices and minimal bases—whose precise definition is not relevant for this paper. Under the generic assumption that \(\det P(x) \not \equiv 0\), the finite eigenvalues of P(x) are the roots of its determinant. Polynomial eigenvalue problems are common in several areas of applied and computational mathematics; their applications include acoustics, control theory, fluid mechanics and structural engineering [10, 11, 21].

Clearly, two very classical mathematical problems arise as special cases of polynomial eigenvalue problems: finding the roots of a scalar polynomial corresponds to \(n=1\), while finding the eigenvalues of a matrix corresponds to \(k=1\) and \(C_1=I_n\). When randomness enters the game, these two extremes are well understood. It is known that, when the polynomial coefficients are i.i.d. normally distributed random variables, in the limit \(k \rightarrow \infty \) the roots of scalar polynomials are uniformly distributed on the unit circle. Similarly, classical results in random matrix theory state that, when the entries of an \(n \times n\) matrix are i.i.d. normally distributed random variables with mean 0 and variance \(n^{-1}\), in the limit \(n \rightarrow \infty \) the eigenvalues are uniformly distributed on the unit disc. Moreover, the phenomenon of universality is well known: there exist works that, under relatively mild assumptions, extend these results to several other distributions of the coefficients or entries.

To our knowledge, nothing was so far explicitly known about the eigenvalues of random matrix polynomials, except for the two extremal cases described above. In this paper, we fill this gap by computing the empirical eigenvalue distribution of monic (\(C_k=I\)) square matrix polynomials of size n and degree k, with all but the leading coefficient being i.i.d. complex Gaussian random matrices, in two different limits: when \(n \rightarrow \infty \) with k constant and when \(k \rightarrow \infty \) with n constant. Moreover, our results can equivalently be interpreted as results on the empirical eigenvalue distribution of certain structured random matrices: indeed, given a monic matrix polynomial P(x), a linearization of P(x) is a matrix whose eigenvalues (as well as their geometric and algebraic multiplicities) coincide with those of P(x). In the numerical linear algebra literature, numerous constructions of linearizations are known, see, e.g., [8, 10, 15] and the references therein. In particular, the prototype of all linearizations is the so-called companion matrix, which plays a central role in this paper.

In previous research on matrix polynomials, probability theory was used in the context of analyzing the condition number of PEPs. Namely, in [2] Armentano and Beltrán computed the average eigenvalue condition number for Gaussian random complex matrix polynomials, and in [4], Beltrán and Kozhasov extended the analysis to the case of real Gaussian matrix polynomials. In [12], Lotz and Noferini went beyond the classical idea of condition by imposing a uniform probability distribution on the sphere for perturbations of a fixed singular matrix polynomial. However, we are not aware of any previous work where the exact distribution of the eigenvalues of a random matrix polynomial is obtained. In addition to being interesting per se, our results can potentially be valuable to numerical analysts in the context of testing numerical methods for the solution of the PEP. Indeed, although randomly generated problems are expected not to be very challenging from the numerical point of view (by the results in [2, 4]), it is common practice to use them as benchmarks for minimal performance requirements; in published research papers on this subject, tests on random input are in fact often included among the numerical experiments. The analytic knowledge of the limit eigenvalue distributions that we obtain in this article can help to predict the behavior of randomly generated problems: when scrutinizing a novel algorithm, if the numerically computed eigenvalues significantly deviate from the expectations, then this fact can raise legitimate suspicions about the accuracy of the computations.

The structure of the paper is as follows. In Sect. 2, we review some necessary background material on linear algebra, matrix polynomial theory, probability theory and random matrix theory. Moreover, we define the empirical spectral distribution of a random matrix polynomial with invertible leading coefficient. In Sect. 3, we obtain our first main result: the almost sure limit, for \(n \rightarrow \infty \), of the empirical spectral distribution of a random \(n \times n\) monic complex Gaussian matrix polynomial of degree k. In Sect. 4, our second main result is discussed: the almost sure limit, for \(k \rightarrow \infty \), of the empirical spectral distribution of a random \(n \times n\) monic complex Gaussian matrix polynomial of degree k. In Sect. 5, we draw some conclusions and propose new lines of research. To keep the main part of the paper as easily readable as possible, the proofs of some technical lemmata, needed in Sects. 3 and 4, are postponed to “Appendix A”; however, we believe that some of those results could have independent interest. In particular, we slightly improve known results on the tail bounds for pseudoinverses of random matrices with nonzero mean, and we study the extremal singular values of certain structured random matrices.

2 Mathematical Background

2.1 Linear Algebra

Given an \(m \times n\) complex matrix X, we denote its singular values by \(\sigma _1(X) \ge \dots \ge \sigma _{\min }(X) \ge 0\), having introduced the shorthand \(\sigma _{\min }(X) :=\sigma _{\min (m,n)}(X) \). The spectral norm of X is denoted by \(\Vert X \Vert :=\sigma _1(X)\), while the Frobenius norm of X is

$$\begin{aligned} \Vert X \Vert _F = \left( \sum _{i=1}^m \sum _{j=1}^n |X_{ij}|^2 \right) ^{1/2} = \sqrt{\mathrm {tr}(X^* X)} = \sqrt{\sigma _1(X)^2 + \dots + \sigma _{\min }(X)^2} . \end{aligned}$$

Recall that any X admits a singular value decomposition \(U\Sigma V\) with \(U\in \mathbb {C}^{m\times m}\), \(V\in \mathbb {C}^{n\times n}\) unitary matrices and \(\Sigma \in \mathbb {R}^{m\times n}\) a diagonal real matrix whose diagonal elements are the singular values \(\sigma _i(X)\ge 0\). The Moore–Penrose pseudoinverse of X is the matrix \(X^\dagger =V^*\Sigma ^\dagger U^*\), where \(\Sigma ^\dagger \in \mathbb {R}^{n\times m}\) is a diagonal real matrix whose diagonal entries are \(\Sigma _{i,i}^\dagger =\Sigma _{i,i}^{-1} = \sigma _i(X)^{-1}\) if \(\sigma _i(X)>0\) and zero otherwise. Note that, if X has full rank, then \(\Vert X^{\dagger }\Vert = 1/\sigma _{\min }(X)\). We also use the induced 1 and \(\infty \) matrix norms, defined, respectively, as

$$\begin{aligned} \Vert X\Vert _1 = \max _{1\le j\le n} \sum _{i=1}^m |X_{ij}|, \qquad \Vert X\Vert _\infty = \max _{1\le i\le m} \sum _{j=1}^n |X_{ij}|. \end{aligned}$$

Since \(\mathbb {C}^{m \times n}\) is finite-dimensional, the various norms mentioned above are of course equivalent to each other, and the following relations will be useful to us:

$$\begin{aligned} \Vert X\Vert \le \Vert X\Vert _F,\qquad \Vert X\Vert \le \sqrt{ \Vert X\Vert _\infty \Vert X\Vert _1 }. \end{aligned}$$
(2)
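For readers who wish to experiment numerically, the following minimal sketch (in Python with NumPy; the test matrix and all numerical choices are ours) illustrates the pseudoinverse construction described above and checks the norm relations (2) on a random example.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3
X = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))

# Singular value decomposition X = U @ diag(s) @ Vh, with s in decreasing order.
U, s, Vh = np.linalg.svd(X)
Sigma_pinv = np.zeros((n, m))
Sigma_pinv[:s.size, :s.size] = np.diag(1.0 / s)   # invert the nonzero singular values
X_pinv = Vh.conj().T @ Sigma_pinv @ U.conj().T    # Moore-Penrose pseudoinverse
assert np.allclose(X_pinv, np.linalg.pinv(X))

# For a full-rank X, ||X^dagger|| = 1 / sigma_min(X).
assert np.isclose(np.linalg.norm(X_pinv, 2), 1.0 / s[-1])

# The norm relations (2).
spec = np.linalg.norm(X, 2)   # spectral norm = sigma_1(X)
assert spec <= np.linalg.norm(X, 'fro')
assert spec <= np.sqrt(np.linalg.norm(X, np.inf) * np.linalg.norm(X, 1))
```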

An interlacing result for the singular values arises when we consider low-rank perturbations of matrices.

Theorem 2.1

(Interlacing Singular Values for Low-Rank Perturbations [20]) Let A and E be \(n\times n\) matrices, where E has rank at most k. If \(B = A+E\) and the singular values of A and B are, respectively,

$$\begin{aligned} \alpha _1\ge \alpha _2 \ge \dots \ge \alpha _{n}, \qquad \beta _1\ge \beta _2 \ge \dots \ge \beta _{n}, \end{aligned}$$

then

$$\begin{aligned}&\alpha _i\ge \beta _{i+k},&i=1,2,\dots ,n-k,\\&\beta _i\ge \alpha _{i+k},&i=1,2,\dots ,n-k. \end{aligned}$$

If the norm of the perturbation, as opposed to its rank, is to be used to estimate the singular values, then we can appeal to the following result attributed to Mirsky, which is a corollary of the minimax principle for singular values.

Theorem 2.2

(Perturbation Theorem [14]) Given two \({ n\times n}\) matrices A, B with singular values, respectively,

$$\begin{aligned} \alpha _1\ge \alpha _2 \ge \dots \ge \alpha _n, \qquad \beta _1\ge \beta _2 \ge \dots \ge \beta _n, \end{aligned}$$

then

$$\begin{aligned} |\alpha _i-\beta _i|\le \Vert A-B\Vert , \qquad i=1,2,\dots ,n. \end{aligned}$$
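Both perturbation theorems are easy to check numerically; the sketch below (Python/NumPy, with arbitrarily chosen sizes) verifies the interlacing inequalities of Theorem 2.1 and Mirsky's bound of Theorem 2.2 on a random example.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 8, 2
A = rng.standard_normal((n, n))
E = rng.standard_normal((n, k)) @ rng.standard_normal((k, n))  # rank(E) <= k
B = A + E

alpha = np.linalg.svd(A, compute_uv=False)   # singular values, decreasing
beta = np.linalg.svd(B, compute_uv=False)

# Theorem 2.1: interlacing under a perturbation of rank at most k.
assert np.all(alpha[:n - k] >= beta[k:])
assert np.all(beta[:n - k] >= alpha[k:])

# Theorem 2.2 (Mirsky): |alpha_i - beta_i| <= ||A - B|| for every i.
assert np.all(np.abs(alpha - beta) <= np.linalg.norm(E, 2) + 1e-12)
```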

2.2 Matrix Polynomial Theory

Let P(x) be the matrix polynomial defined in (1). We give here a brief overview of those aspects of the spectral theory of square complex matrix polynomials that are relevant to this paper. More detailed discussions can be found, e.g., in [1, 9, 10, 12] and the references therein. As mentioned in the introduction, an element \(\lambda \in \mathbb {C}\) is said to be a finite eigenvalue of P(x) if

$$\begin{aligned} \mathrm {rank}_{\mathbb {C}}(P(\lambda )) < \mathrm {rank}_{\mathbb {C}(x)}(P(x)) =: r, \end{aligned}$$

where \(\mathbb {C}(x)\) is the field of fractions of \(\mathbb {C}[x]\), that is, the field of rational functions with coefficients in \(\mathbb {C}\).

If the leading coefficient \(C_k\) of the matrix polynomial P(x) in (1) is invertible, then P(x) has kn finite eigenvalues. Under this assumption, one can define the companion matrix of P(x) as (see, e.g., [1])

$$\begin{aligned} M=\begin{bmatrix} -C_k^{-1}C_{k-1} & -C_k^{-1}C_{k-2} & \dots & -C_k^{-1}C_{1} & -C_k^{-1}C_{0}\\ I_n & 0 & 0 & \dots & 0\\ 0 & I_n & 0 & \dots & 0\\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & \dots & 0 & I_n & 0 \end{bmatrix} \in \mathbb {C}^{kn \times kn}, \end{aligned}$$
(3)

where \(I_n\) and 0 are, respectively, the \(n\times n\) identity and zero matrices. It is well known that the eigenvalues of M, defined in the classical sense, coincide with the finite eigenvalues of P(x). As a consequence, under the assumption that \(C_k\) is invertible, studying the finite eigenvalues of P(x) is equivalent to studying the eigenvalues of the structured matrix M. Observe that, if P(x) is monic, then \(C_k = I\) so that the assumption is automatically satisfied.
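As an illustration, here is a small sketch (Python/NumPy; the helper name companion and the example sizes are ours) that assembles the companion matrix (3) and cross-checks that its eigenvalues make \(P(\lambda )\) singular.

```python
import numpy as np

def companion(coeffs):
    """Companion matrix (3) of P(x) = sum_j C_j x^j, where coeffs = [C_0, ..., C_k]
    and the leading coefficient C_k is invertible."""
    n, k = coeffs[0].shape[0], len(coeffs) - 1
    Ck_inv = np.linalg.inv(coeffs[-1])
    M = np.zeros((k * n, k * n), dtype=complex)
    for j in range(k):                       # top block row: -C_k^{-1} C_{k-1}, ..., -C_k^{-1} C_0
        M[:n, j * n:(j + 1) * n] = -Ck_inv @ coeffs[k - 1 - j]
    M[n:, :-n] = np.eye((k - 1) * n)         # subdiagonal identity blocks
    return M

# Example: a random 3x3 quadratic matrix polynomial P(x) = C_0 + C_1 x + C_2 x^2.
rng = np.random.default_rng(2)
C = [rng.standard_normal((3, 3)) for _ in range(3)]
eigs = np.linalg.eigvals(companion(C))       # the kn = 6 finite eigenvalues of P

P = lambda x: C[0] + C[1] * x + C[2] * x ** 2
assert all(abs(np.linalg.det(P(lam))) < 1e-6 for lam in eigs)
```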

We can identify, say via an arbitrary but fixed rearrangement of the real and imaginary parts of the entries of each coefficient, the (real) vector space of \(n \times n\) complex matrix polynomials of degree up to k with \(\mathbb {R}^{2(k+1)n^2}\). In this setting, let \({\mathcal {S}} \subset \mathbb {R}^{2(k+1)n^2}\) correspond to the subset of matrix polynomials that are regular and have kn distinct finite eigenvalues. We conclude this subsection by observing that \({\mathcal {S}}\) is a nonempty Zariski open set, and hence, its complement has Lebesgue measure zero: in this sense, being regular with kn distinct finite eigenvalues is a generic property of matrix polynomials.

2.3 Random Matrix Theory

Often, within our probabilistic arguments it will be crucial to consider matrices that have some deterministic entries and some other entries corresponding to (complex) random variables, which in turn can be seen as pairs of real random variables. We implicitly identify those matrices with a vector in \(\mathbb {R}^N\), N being the number of real random variables involved, and we equip \(\mathbb {R}^N\) with an appropriate probability measure. In this context, we will often invoke, without explicit justification, the well-known fact that events that happen in (subsets of) proper Zariski closed sets of \(\mathbb {R}^N\) have probability zero: for example, we may claim that a certain square random matrix is almost surely invertible. Recalling that any proper Zariski closed set has Lebesgue measure zero, it follows immediately that the claimed property is true for any absolutely continuous probability measure (as are all the ones we discuss in this paper). The verification that, in all the instances where we make such a claim, the corresponding algebraic set is indeed contained in a proper algebraic set is a straightforward exercise in linear algebra, and we therefore omit the details.

2.3.1 Empirical Spectral Distributions

Given a deterministic matrix \(A\in M_{m}(\mathbb {C})\) with eigenvalues \(\lambda _1(A),\dots ,\lambda _m(A)\), we say that its empirical spectral distribution (ESD) is the atomic measure

$$\begin{aligned} \mu _A = \frac{1}{m} \sum _{i=1}^{m} \delta _{\lambda _i(A)}, \end{aligned}$$

where the eigenvalues are considered with their respective algebraic multiplicities. A random matrix \(A_n\) can be seen as a random variable with values in the appropriate space of matrices, which in our setting will usually be \(M_{nk}(\mathbb C)\), where n and k are fixed parameters. We can extend the concept of ESDs to random matrices as follows.

Definition 2.3

Given a random matrix A, its empirical spectral distribution (ESD) is a random variable with values in the space of probabilities on \(\mathbb C\), defined as

$$\begin{aligned} \mu _A(\omega ) := \mu _{A(\omega )} = \frac{1}{m} \sum _{i=1}^{m} \delta _{\lambda _i(A(\omega ))}. \end{aligned}$$

The space of probabilities on \(\mathbb C\) is a measurable subset of \({\mathcal {M}}^b(\mathbb C)\), the space of signed measures on \(\mathbb C\) with bounded total variation, which is a Hausdorff space when equipped with the vague (or weak-\(*\)) convergence of measures.
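Concretely, the ESD pairs with bounded continuous test functions as a finite average over the spectrum, which is the pairing underlying vague convergence; a minimal sketch (Python/NumPy; the helper name esd_integral is ours):

```python
import numpy as np

def esd_integral(A, f):
    """Integral of a test function f against the ESD of A:
    (1/m) * sum_i f(lambda_i(A)), eigenvalues counted with multiplicity."""
    return np.mean(f(np.linalg.eigvals(A)))

# Example: integrate f(z) = |z|^2 against the ESD of a scaled complex Gaussian matrix;
# for the uniform measure on the unit disc the integral equals 1/2.
rng = np.random.default_rng(3)
m = 500
A = (rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))) / np.sqrt(2 * m)
print(esd_integral(A, lambda z: np.abs(z) ** 2))   # approximately 0.5 for large m
```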

We will study the spectral distribution for some families of random matrices \(\{A_{n}\}_n\) and find that in our cases the sequence \(\{\mu _{A_n}\}_n\) always converges almost surely (a.s.) to a constant random variable that can be identified with a probability measure \(\mu \in \mathbb P(\mathbb C)\). In this case, we simply write

$$\begin{aligned} \mu _{A_n}\xrightarrow {a.s.} \mu . \end{aligned}$$

The measure \(\mu \) will thus be our candidate for the asymptotic spectral distribution of the family \(\{A_{n}\}_n\).

Finally, let us consider a random matrix polynomial \(P(x;\omega )\) of size \(n \times n\) and degree k, under the assumption that, for all \(\omega \in \Omega \), \(P(x;\omega )\) has invertible leading coefficient. This implies, in particular, that \(P(x;\omega )\) has kn finite eigenvalues, which we denote by \(\lambda _1(P(x;\omega )),\dots ,\lambda _{kn}(P(x;\omega ))\).

Definition 2.4

Let \(P(x;\omega )\) be a random matrix polynomial of size n and degree k, such that its leading coefficient is invertible for all \(\omega \in \Omega \). Its empirical spectral distribution (ESD) is a random variable with values in the space of probabilities on \(\mathbb {C}\), defined as

$$\begin{aligned} \mu _P(\omega ) := \mu _{P(x;\omega )} =\frac{1}{kn} \sum _{i=1}^{kn} \delta _{\lambda _i(P(x;\omega ))}. \end{aligned}$$

It is immediate from Definitions 2.3 and 2.4 that the ESD of a random matrix polynomial coincides with the ESD of its (random) companion matrix (3). Indeed, in this paper we will rely strongly on this equivalence.

2.3.2 The Replacement Principle and the Circle Law

Central to our derivation of the empirical spectral distributions is the so-called replacement principle, a tool in random matrix theory developed by Tao, Vu and Krishnapur. We recall it below.

Theorem 2.5

(Replacement Principle [18]) Let \(A_m, B_m\) be two \(m \times m\) random matrices. Assume that

  1.

    The quantity \( \frac{1}{m^2} \left( \Vert A_m \Vert _F^2 + \Vert B_m\Vert ^2_F \right) \) is bounded a.s.;

  2.

    For a.e. \(z \in \mathbb {C}\),

    $$\begin{aligned} \frac{1}{m} \log \left| \frac{\det ( m^{-1/2}A_m - z I ) }{\det ( m^{-1/2}B_m - z I )} \right| \xrightarrow {a.s.} 0. \end{aligned}$$

    Then, \(\mu _{\frac{1}{\sqrt{m}}A_m} - \mu _{\frac{1}{\sqrt{m}}B_m} \xrightarrow {a.s.} 0\).

Remark 2.6

The random variable \(\mu _{\frac{1}{\sqrt{m}}A_m} - \mu _{\frac{1}{\sqrt{m}}B_m}\) takes values in the space of signed measures on \(\mathbb C\) with total variation bounded by 2.

Thanks to the replacement principle, we will be able to generalize a well-known result on random Gaussian matrices to the case of monic Gaussian matrix polynomials.

Theorem 2.7

(Strong Circle Law [13]) Let \(A_m\) be the \({m\times m}\) random matrix whose entries are i.i.d. Gaussian random variables with mean 0 and variance 1. Then, the ESDs of \(\frac{1}{\sqrt{m}} A_m\) converge almost surely to the uniform distribution on the unit disc.
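A quick simulation makes the strong circle law tangible: under the uniform law on the unit disc, the fraction of eigenvalues of modulus at most r should approach \(r^2\). The following sketch (Python/NumPy; all parameter values are ours) checks this radial profile.

```python
import numpy as np

rng = np.random.default_rng(4)
m = 1000
# i.i.d. complex Gaussian entries with mean 0 and total variance 1.
G = (rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))) / np.sqrt(2)
radii = np.abs(np.linalg.eigvals(G / np.sqrt(m)))

# Under the circle law, the fraction of eigenvalues with |lambda| <= r tends to r^2.
for r in (0.25, 0.5, 0.75, 1.0):
    print(f"r = {r}: empirical {np.mean(radii <= r):.3f} vs limit {r ** 2:.3f}")
```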

3 Empirical Spectral Distribution for \(n \times n\) Monic Complex Gaussian Matrix Polynomials of Degree k, in the Limit \(n \rightarrow \infty \)

Let X be a complex random variable, normally distributed with mean 0 and variance 1. We consider the \(n \times n\) monic matrix polynomial of degree \(k \ge 2\)

$$\begin{aligned} P_n(x) = I_n x^k + \sum _{j=0}^{k-1} C_j x^j, \end{aligned}$$
(4)

where, for \(j=0,\dots ,k-1\), every coefficient \(C_j\) is an \(n \times n\) random matrix whose entries are i.i.d. copies of X. Note that each \(C_j\) depends on j and on n, but we omit the dependence on n in the notation. It is understood, moreover, that all \(C_j\) are independent of each other for varying j and n.

The finite eigenvalues of \(P_n(x)\) coincide with the eigenvalues of its companion matrix: in particular, substituting \(C_k=I_n\) in (3), we obtain

$$\begin{aligned} M:=\begin{bmatrix} -C_{k-1} & \dots & -C_1 & -C_0\\ I_n & & & \\ & \ddots & & \\ & & I_n & \end{bmatrix} =: Z + E_1 C^T \end{aligned}$$
(5)

where \(E_1^T = \begin{bmatrix} I_n&0&\dots&0 \end{bmatrix}\) and \(C^T = -\begin{bmatrix} C_{k-1}&\dots&C_1&C_0 \end{bmatrix}\). Note that, since all block rows of \(E_1 C^T\) except the first one vanish, \(E_1 C^T\) is block upper triangular with diagonal blocks \(-C_{k-1}, 0, \dots , 0\); hence, its spectrum consists of the eigenvalues of the random matrix \(-C_{k-1}\), with the addition of the eigenvalue 0, which appears with algebraic multiplicity \(n(k-1)\). As \(C_{k-1}\) is a Gaussian random matrix, the almost sure limit ESD of \(n^{-1/2} C_{k-1}\) follows the circular law (Theorem 2.7), i.e., it is the uniform measure on the unit disc. Hence, the ESD of \(n^{-1/2} E_1 C^T\) converges almost surely, in the limit \(n \rightarrow \infty \), to \(\frac{k-1}{k} \mathbf{1}_0 + \frac{1}{k} \mathbf{1}_D\), where \(\mathbf{1}_0\), \(\mathbf{1}_D\) denote the uniform probability measures on, respectively, the set \(\{ 0 \}\) and the unit disc. Since \(n^{-1/2}M\) is a perturbation of \(n^{-1/2} E_1 C^T\), one can expect that the almost sure limit ESD of \(n^{-1/2}M\), and thus the almost sure limit ESD of \(P_n(n^{1/2}x)\), coincides with the limit ESD of \(n^{-1/2} E_1 C^T\).
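The block-triangular structure of \(E_1 C^T\), and hence its spectrum, can be confirmed numerically by a small sketch (Python/NumPy; sizes chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(5)
n, k = 4, 3
C_blocks = [rng.standard_normal((n, n)) for _ in range(k)]   # C_{k-1}, ..., C_0
E1CT = np.zeros((k * n, k * n))
E1CT[:n, :] = -np.hstack(C_blocks)       # E_1 C^T: only the top block row is nonzero

eigs = np.linalg.eigvals(E1CT)
# The eigenvalue 0 appears with multiplicity (at least) n(k-1) ...
assert np.sum(np.abs(eigs) < 1e-8) >= (k - 1) * n
# ... and the remaining eigenvalues are those of -C_{k-1}.
for lam in np.linalg.eigvals(-C_blocks[0]):
    assert np.min(np.abs(eigs - lam)) < 1e-8
```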

Fig. 1 Scatter plots of the eigenvalues of \(P_n(x)\), multiplied by \(\frac{1}{\sqrt{n}}\), for growing n

This conjecture is also empirically confirmed by experiments. For example, in Fig. 1, we plot the complex eigenvalues, multiplied by \(n^{-1/2}\), of N realizations of the polynomial \(P_n(x)\) for different values of the triple (k, n, N) under the constraint \(knN=c\) for some positive integer c (so that the number of eigenvalues plotted is the same in every image). We display several subfigures organized as a matrix: the degree of the polynomial is constant on each row (namely \(k=6\) for the first row and \(k=4\) for the second row), while the columns are characterized by different values of n, increasing from left to right. To facilitate the visual comparison with the above claim, we also superimpose the unit circle on each image.
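A script in the spirit of these experiments is sketched below (Python with NumPy and Matplotlib; the parameter values are illustrative and differ from those used for the figure). Swapping the roles of n and k, and removing the \(n^{-1/2}\) scaling, yields the analogous experiment of Sect. 4.

```python
import numpy as np
import matplotlib.pyplot as plt

def monic_companion(C):
    """Companion matrix (5) of the monic polynomial with coefficients C = [C_0, ..., C_{k-1}]."""
    n, k = C[0].shape[0], len(C)
    M = np.zeros((k * n, k * n), dtype=complex)
    for j in range(k):
        M[:n, j * n:(j + 1) * n] = -C[k - 1 - j]
    M[n:, :-n] = np.eye((k - 1) * n)
    return M

rng = np.random.default_rng(6)
k, n, N = 4, 50, 5                       # degree, size, number of realizations
eigs = []
for _ in range(N):
    C = [(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
         for _ in range(k)]
    eigs.append(np.linalg.eigvals(monic_companion(C)) / np.sqrt(n))
eigs = np.concatenate(eigs)

plt.scatter(eigs.real, eigs.imag, s=2)
theta = np.linspace(0, 2 * np.pi, 200)   # superimpose the unit circle
plt.plot(np.cos(theta), np.sin(theta))
plt.gca().set_aspect('equal')
plt.show()
```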

We prove the claim as Theorem 3.1. Its proof relies on several technical lemmata on the behavior of the singular values of certain matrices: in order to improve the readability of the paper, these are collected in “Appendixes A.2 and A.3.” Since, for \(k=1\), we recover the well-known limit distribution of the eigenvalues of a Gaussian random matrix, within the proof we tacitly assume that \(k \ge 2\).

Theorem 3.1

Let \(P_n(x)\) be a monic \(n \times n\) complex random matrix polynomial of degree k as in (4), where the entries of each coefficient \(C_j\) are i.i.d. complex random variables normally distributed with mean 0 and variance 1. Then, for \(n \rightarrow \infty \), the empirical spectral distribution of \(P_n(n^{1/2}x)\) converges almost surely to

$$\begin{aligned} \frac{k-1}{k} \mathbf{1}_0 + \frac{1}{k} \mathbf{1}_D, \end{aligned}$$

where \(\mathbf{1}_0\), \(\mathbf{1}_D\) denote the uniform probability measures on, respectively, the set \(\{ 0 \}\) and the unit disc.

Proof

The strategy of the proof is to apply the Replacement Principle (Theorem 2.5) in the special case where \(m=kn\), \(A_m = M\) and \(B_m = E_1 C^T\), where \(M,E_1,C\) are the matrices defined in (5) and immediately below. Indeed, by the observations above, this immediately implies the statement. Thus, we need to verify that the two assumptions of Theorem 2.5 hold.

  1.

    Consider the random variable

    $$\begin{aligned}R_n = \frac{1}{k^2 n^2} \sum _{i=1}^{2kn^2} |X_i|^2, \end{aligned}$$

    where \(X_i\) are i.i.d. normally distributed complex random variables with mean 0 and variance 1. The \(X_i\) depend also on n, and it is understood that all \(X_i\) are i.i.d. for varying i and n. Since \(\frac{1}{m^2} \left( \Vert A_m \Vert _F^2 + \Vert B_m \Vert _F^2 \right) \) has the same distribution as \(R_n + \frac{k-1}{k^2n}\), it suffices to prove that \(R_n\) is bounded almost surely. This is tantamount to \(\mathbb {P}( \limsup _n R_n < \infty ) =1\). On the other hand, \(R_n\) is \(\frac{2}{k}\) times the average of the \(2kn^2\) i.i.d. random variables \(|X_i|^2\), each with mean \(\mathbb {E}|X_i|^2=1\); hence, by the strong law of large numbers, it follows that \(\mathbb {P}\left( \limsup _n R_n < \infty \right) \ge \mathbb {P}\left( \limsup _n R_n = \frac{2}{k}\right) =1 .\)

  2.

    Fix a complex number \(w \ne 0\). We need to verify that, for almost every such w,

    $$\begin{aligned} \frac{1}{kn} \left( \log \left| \det \left( \frac{1}{\sqrt{kn}} E_1 C^T - wI \right) \right| - \log \left| \det \left( \frac{1}{\sqrt{kn}} M - wI \right) \right| \right) \xrightarrow {a.s.} 0. \end{aligned}$$

    Defining \(z:=w \sqrt{k}\), we readily see that this is equivalent to showing

    $$\begin{aligned} \frac{1}{n} \sum _{i=1}^{kn} \left[ \log \sigma _i \left( \frac{1}{\sqrt{n}} E_1 C^T - z I \right) - \log \sigma _i \left( \frac{1}{\sqrt{n}} M - z I \right) \right] \xrightarrow {a.s.} 0 \end{aligned}$$
    (6)

    for every \(z\ne 0\). Now let \(0<\delta <1/2\) and set \(f(n):=\lfloor kn-n^{1-\delta } \rfloor \). Observe that, for any n large enough, \(kn> f(n) > kn-n\). Rather than verifying (6) directly, we will prove a somewhat stronger statement. Indeed, we claim that the following three facts all hold:

    $$\begin{aligned}&\frac{1}{n} \sum _{i=f(n)+1}^{kn} \log \sigma _i \left( \frac{1}{\sqrt{n}} M - z I \right) \xrightarrow {a.s.} 0. \end{aligned}$$
    (7)
    $$\begin{aligned}&\frac{1}{n} \sum _{i=f(n)+1}^{kn} \log \sigma _i \left( \frac{1}{\sqrt{n}} E_1 C^T - z I \right) \xrightarrow {a.s.} 0. \end{aligned}$$
    (8)
    $$\begin{aligned}&\frac{1}{n} \sum _{i=1}^{f(n)} \left[ \log \sigma _i \left( \frac{1}{\sqrt{n}} E_1 C^T - z I \right) - \log \sigma _i \left( \frac{1}{\sqrt{n}} M - z I \right) \right] \xrightarrow {a.s.} 0. \end{aligned}$$
    (9)

    It is clear that (7), (8) and (9), together, imply (6). It now remains to prove each statement separately.

    • Proof of (7). By Lemma A.6 and Lemma A.8, almost surely, for all n sufficiently large, the following are true:

      $$\begin{aligned} \sum _{i=f(n)+1}^{kn}\log \sigma _i\left( \frac{1}{\sqrt{n}} M -zI \right)&\ge \sum _{i=f(n)+1}^{kn}\log ( n^{-a-2})\\&\ge (n^{1-\delta }+1)(-a-2) \log (n),\\ \sum _{i=f(n)+1}^{kn}\log \sigma _i\left( \frac{1}{\sqrt{n}} M -zI \right)&\le \sum _{i=f(n)+1}^{kn}\log (d) \le (n^{1-\delta }+1)\log (d), \end{aligned}$$

      where a and d are the positive constants appearing in Lemma A.6 and Lemma A.8, and d can be chosen greater than 1. Hence, dividing by n,

      $$\begin{aligned} (n^{-\delta } + n^{-1}) (-a-2) \log (n)&\le \frac{1}{n} \sum _{i=f(n)+1}^{kn}\log \sigma _i\left( \frac{1}{\sqrt{n}} M -zI \right) \nonumber \\&\le (n^{-\delta } + n^{-1}) \log (d). \end{aligned}$$
      (10)

      Thus, (7) follows by the sandwich rule.

    • Proof of (8). By Lemma A.7 and Lemma A.8, there are positive constants \(\widetilde{a}\) and \(d>1\) such that almost surely, for all n sufficiently large,

      $$\begin{aligned} \sum _{i=f(n)+1}^{kn}\log \sigma _i\left( \frac{1}{\sqrt{n}} E_1C^T -zI \right)&\ge \sum _{i=f(n)+1}^{kn}\log ( n^{-\widetilde{a}-2}) \ge (n^{1-\delta }+1)(-\widetilde{a}-2) \log (n),\\ \sum _{i=f(n)+1}^{kn}\log \sigma _i\left( \frac{1}{\sqrt{n}} E_1C^T -zI \right)&\le \sum _{i=f(n)+1}^{kn}\log (d) \le (n^{1-\delta }+1)\log (d). \end{aligned}$$

      The latter inequalities imply

      $$\begin{aligned} (n^{-\delta } + n^{-1}) (-\widetilde{a}-2) \log (n) \le \frac{1}{n} \sum _{i=f(n)+1}^{kn}\log \sigma _i\left( \frac{1}{\sqrt{n}} E_1C^T -zI \right) \le (n^{-\delta } + n^{-1}) \log (d), \end{aligned}$$
      (11)

      yielding in turn (8) via the sandwich rule.

    • Proof of (9). We start by the algebraic manipulation

      $$\begin{aligned}&\frac{1}{n} \sum _{i=1}^{f(n)} \left[ \log \sigma _i\left( \frac{1}{\sqrt{n}} E_1 C^T -zI \right) - \log \sigma _i\left( \frac{1}{\sqrt{n}} M -zI \right) \right] \\&\quad = -\frac{1}{n} \sum _{i=1}^{f(n)} \left[ \log \frac{\sigma _i\left( \frac{1}{\sqrt{n}} M -zI \right) }{\sigma _i\left( \frac{1}{\sqrt{n}} E_1 C^T -zI \right) } \right] . \end{aligned}$$

      Thanks to Mirsky’s Theorem (Theorem 2.2), we know that, for every i,

      $$\begin{aligned} \left| \sigma _i\left( \frac{1}{\sqrt{n}} M -zI \right) - \sigma _i\left( \frac{1}{\sqrt{n}} E_1 C^T -zI \right) \right| \le \frac{1}{\sqrt{n}}\Vert M - E_1 C^T\Vert = \frac{1}{\sqrt{n}} \end{aligned}$$

      so, for \(i=1,\dots ,f(n)\) there exist \(d_i\) satisfying \(|d_i| \le \frac{1}{\sqrt{n}}\) and such that

      $$\begin{aligned} \sigma _i\left( \frac{1}{\sqrt{n}} M -zI \right) = \sigma _i\left( \frac{1}{\sqrt{n}} E_1 C^T -zI \right) +d_i. \end{aligned}$$

      Thus,

      $$\begin{aligned} \left| \frac{1}{n} \sum _{i=1}^{f(n)} \left[ \log \sigma _i\left( \frac{1}{\sqrt{n}} E_1 C^T -zI \right) - \log \sigma _i\left( \frac{1}{\sqrt{n}} M -zI \right) \right] \right| \le \frac{1}{n} \sum _{i=1}^{f(n)} \left| \log \left( 1 + \frac{d_i}{\sigma _i\left( \frac{1}{\sqrt{n}} E_1 C^T -zI \right) } \right) \right| . \end{aligned}$$
      (12)

      Observe now that, using Lemma A.9, we have that, for some positive constants \(t,\varepsilon \), almost surely, for all n sufficiently large and for every \(i\le f(n)\),

      $$\begin{aligned} |x| := \left| \frac{d_i}{\sigma _i(n^{-1/2}E_1 C^T-zI)}\right| \le \left| \frac{n^{-1/2}}{\sigma _{f(n)}(n^{-1/2}E_1 C^T-zI)}\right| \le t^{-1}n^{-\varepsilon } . \end{aligned}$$

      For sufficiently large n (i.e., \(n > t^{-1/\varepsilon }\)), the right-hand side of the latter inequality is bounded above by 1. Noting that \(|x| < 1 \implies |\log (1+x)| \le - \log (1-|x|)\), we obtain the following upper bound for the right-hand side of (12):

      $$\begin{aligned} 0\le - \frac{1}{n} \sum _{i=1}^{f(n)} \log \left( 1 - \frac{|d_i|}{\sigma _i\left( \frac{1}{\sqrt{n}} E_1 C^T -zI \right) } \right) \le - \frac{f(n)}{n} \log \left( 1 - \frac{n^{-1/2}}{\sigma _{f(n)}\left( \frac{1}{\sqrt{n}} E_1 C^T -zI \right) } \right) , \end{aligned}$$

      which in turn is bounded above by

      $$\begin{aligned}- k \log \left( 1 - \frac{n^{-1/2}}{\sigma _{f(n)}\left( \frac{1}{\sqrt{n}} E_1 C^T -zI \right) } \right) . \end{aligned}$$

      Invoking again Lemma A.9, we have that almost surely

      $$\begin{aligned}&0\le \frac{n^{-1/2}}{\sigma _{f(n)}\left( \frac{1}{\sqrt{n}} E_1 C^T -zI \right) } \le t^{-1}n^{-\varepsilon }\rightarrow 0 \\&\implies - k \log \left( 1 - \frac{n^{-1/2}}{\sigma _{f(n)}\left( \frac{1}{\sqrt{n}} E_1 C^T -zI \right) } \right) \xrightarrow {a.s.} 0, \end{aligned}$$

      and this concludes the proof.

\(\square \)

Remark 3.2

The relations (7) and (8) still hold if the entries of \(C_i\) are i.i.d. copies of any centered random variable with unit variance, up to slight variations of the reported results.

4 Empirical Spectral Distribution for \(n \times n\) Monic Complex Gaussian Matrix Polynomials of Degree k in the Limit \(k \rightarrow \infty \)

Consider again the monic matrix polynomial

$$\begin{aligned} P_k(x) = I_n x^k + \sum _{j=0}^{k-1} C_j x^j, \end{aligned}$$
(13)

where, for all \(j=0,\dots ,k-1\), every coefficient \(C_j\) is an \(n\times n\) random matrix whose entries are i.i.d. Gaussian complex random variables with mean zero and variance 1. Note that each \(C_j\) depends on j and on k, but we omit the dependence on k in the notation. It is understood, moreover, that all \(C_j\) are independent of each other for varying j and k.

The finite eigenvalues of \(P_k(x)\) coincide with those of its companion matrix M as in (5). However, this time we decompose M as the sum of a deterministic circulant matrix and a random matrix with rank at most n:

$$\begin{aligned} M= B + A,\qquad B = \begin{bmatrix} & & & I_n\\ I_n & & & \\ & \ddots & & \\ & & I_n & \end{bmatrix},\qquad A = \begin{bmatrix} -C_{k-1} & \dots & -C_1 & -(C_0+I_n)\\ & & & \\ & & & \\ & & & \end{bmatrix} = E_1 \widehat{C}^T, \end{aligned}$$
(14)

where \(\widehat{C}^T = C^T - e_k^T\otimes I_n\), \(E_1^T = \begin{bmatrix} I_n&0&\dots&0 \end{bmatrix}\) and \(C^T = -\begin{bmatrix} C_{k-1}&\dots&C_1&C_0 \end{bmatrix}\). In particular, B is a circulant matrix [7], with spectrum

$$\begin{aligned} \Lambda (B) = \{ \exp ( 2\pi \text {i} j/k ) \mid j=0,1,\dots ,k-1 \}, \end{aligned}$$

where each eigenvalue has multiplicity n. It is thus easy to see that the almost sure limit ESD of B is \(\mathbf{1}_U\), the uniform (singular) probability measure on the unit circle. The problem can be seen through the lens of the theory of perturbations of Toeplitz matrices and sequences (see, for example, [3, 22]); but since the perturbation \(E_1{\hat{C}}^T\) is a rank-n correction to the Toeplitz matrix B, this case does not fall within the classical settings, where it is required that the random perturbation be nonsingular with high probability. Since the rank of the perturbation is small when compared with the growing size kn of the matrices, we may nevertheless expect that the ESD of M also converges almost surely to the same distribution \(\mathbf{1}_U\).
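The spectrum of B is likewise easy to verify numerically; the sketch below (Python/NumPy, with our own choice of k and n) checks that the eigenvalues of the block circulant B are the k-th roots of unity, each with multiplicity n.

```python
import numpy as np

k, n = 8, 3
# Block circulant B from (14): identity blocks on the subdiagonal and in the top-right corner.
P_cyclic = np.roll(np.eye(k), 1, axis=0)   # cyclic shift permutation matrix
B = np.kron(P_cyclic, np.eye(n))

eigs = np.linalg.eigvals(B)
roots = np.exp(2j * np.pi * np.arange(k) / k)   # the k-th roots of unity
for lam in roots:
    # each root of unity is an eigenvalue of B with multiplicity n
    assert np.sum(np.abs(eigs - lam) < 1e-8) == n
```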

This claim is also empirically supported by experiments. For example, in Fig. 2 we plot the complex eigenvalues of N realizations of the polynomial \(P_k(x)\) for different values of the triple (k, n, N). The interpretation of Fig. 2 is the same as that of Fig. 1 after swapping the roles of the matrix sizes and the degrees: we fix n on each row (to 6 and 4, respectively) and we increase the degree of the polynomial along the columns. The unit circle is drawn on top of each scatter plot to ease the comparison with the claim above; N is always chosen so that the number of eigenvalues plotted, equal to knN, is the same in every image.

Fig. 2 Scatter plots of the eigenvalues of \(P_k(x)\) for growing k

In Theorem 4.1, we thus show that the ESD of M also converges almost surely to \(\mathbf{1}_U\). Note that the statement includes, as the special case \(n=1\), the well-known limit distribution of the roots of random scalar polynomials, for which we thus provide a novel proof. For the sake of a clearer exposition, below we focus on the major lines of thought that lead to the proof, postponing to “Appendix A.4” a more detailed analysis of some technicalities that appear as intermediate steps.

Theorem 4.1

Let \(P_k(x)\) be a monic \(n \times n\) complex random matrix polynomial of degree k as in (13), where the entries of each coefficient \(C_j\) are i.i.d. complex random variables normally distributed with mean 0 and variance 1. Then, for \(k \rightarrow \infty \), the empirical spectral distribution of \(P_k(x)\) converges almost surely to \(\mathbf{1}_U\), the uniform probability measure on the unit circle.

Proof

The strategy of the proof follows very closely that of Theorem 3.1: we verify that the two assumptions of Theorem 2.5 hold in the special case where \(m=kn\), \(A_m = \sqrt{kn} \, M\) and \(B_m =\sqrt{kn} \, B \), where M, B are the matrices defined in (14) and immediately below.

  1.

    The first item is treated analogously to Theorem 3.1, and we omit the details.

  2.

    Fix a complex number z such that \(|z|\notin \{0,1\}\). We show that, for every such z,

    $$\begin{aligned} \frac{1}{kn} \left( \log \left| \det \left( M - zI \right) \right| - \log \left| \det \left( B - zI \right) \right| \right) \xrightarrow {a.s.} 0, \end{aligned}$$

    that is equivalent to

    $$\begin{aligned} \frac{1}{k} \sum _{i=1}^{kn} \left[ \log ( \sigma _i\left( M -zI \right) ) - \log (\sigma _i\left( B -zI \right) ) \right] \xrightarrow {a.s.} 0. \end{aligned}$$
    (15)

    We claim that the following facts are true:

    $$\begin{aligned}&\frac{1}{k} \sum _{i=1}^{n}\log ( \sigma _i\left( M -zI \right) ) \xrightarrow {a.s.} 0, \qquad \frac{1}{k} \sum _{i=kn -n+1}^{kn}\log ( \sigma _i\left( M -zI \right) ) \xrightarrow {a.s.} 0. \end{aligned}$$
    (16)
    $$\begin{aligned}&\frac{1}{k} \sum _{i=1}^{n}\log ( \sigma _i\left( B -zI \right) ) \xrightarrow {a.s.} 0, \qquad \frac{1}{k} \sum _{i=kn -n+1}^{kn}\log ( \sigma _i\left( B -zI \right) ) \xrightarrow {a.s.} 0. \end{aligned}$$
    (17)
    $$\begin{aligned}&\frac{1}{k} \sum _{i=n+1}^{kn -n} \left[ \log ( \sigma _i\left( M -zI \right) ) - \log (\sigma _i\left( B -zI \right) ) \right] \xrightarrow {a.s.} 0. \end{aligned}$$
    (18)

    It is clear that (16), (17) and (18), together, imply (15). It now remains to prove each statement separately.

    • Proof of (16). By Lemma A.10, almost surely, for all k sufficiently large, the following are true for some positive constant r:

      $$\begin{aligned} \sigma _{1}(M-zI) \le r\sqrt{k} + 1 + |z|, \qquad \sigma _n(M-zI)\ge |1-|z||. \end{aligned}$$

      These facts are enough to conclude that

      $$\begin{aligned} \frac{1}{k}\sum _{i=1}^{n}|\log ( \sigma _i\left( M -zI \right) )| \le \frac{n}{k} \max \left\{ |\log (r\sqrt{k} + 1 + |z|)|, |\log ( |1-|z|| )| \right\} \xrightarrow {a.s.} 0 \end{aligned}$$

      and thus the first a.s. limit in (16) holds. Moreover, by Lemma A.11, almost surely, for all k sufficiently large,

      $$\begin{aligned} \sigma _{kn} \left( M - z I \right) \ge tk^{-2} \end{aligned}$$

      for some positive constant t. Hence, it suffices to estimate

      $$\begin{aligned} \frac{1}{k} \sum _{i=kn -n+1}^{kn}|\log ( \sigma _i\left( M -zI \right) ) |&\le \frac{n}{k}\max \{ |\log (\sigma _1(M -zI))|, |\log (\sigma _{kn}(M -zI))| \} \\&\le \frac{n}{k}\max \{ |\log ( r\sqrt{k} + 1 + |z|)|, |\log (tk^{-2})| \}\xrightarrow {a.s.} 0. \end{aligned}$$

      Thus, the second part of (16) also holds.

    • Proof of (17). Observe that \(B-zI\) is a circulant matrix, and hence, in particular, it is normal. Its spectrum is

      $$\begin{aligned} \Lambda (B-zI) = \{ \lambda - z \mid \lambda \in \Lambda (B)\}. \end{aligned}$$

      Since all eigenvalues of B have unit modulus, we can bound the singular values of \(B-zI\) as

      $$\begin{aligned} \sigma _i(B-zI) = |\lambda _i - z|, \qquad | 1 - |z|| \le |\lambda _i - z| \le 1 + |z|. \end{aligned}$$
      (19)

      Importantly, these bounds do not depend on k. As a consequence,

      $$\begin{aligned}&\frac{n\log (| 1 - |z|| )}{k} \le \frac{1}{k} \sum _{i=kn -n+1}^{kn}\log ( \sigma _i\left( B -zI \right) ) \le \frac{n\log (1 + |z|)}{k}, \\&\frac{n\log (| 1 - |z|| )}{k} \le \frac{1}{k} \sum _{i=1}^{n}\log ( \sigma _i\left( B -zI \right) ) \le \frac{n\log (1 + |z|)}{k}, \end{aligned}$$

      and (17) follows by the sandwich rule.

    • Proof of (18). We start by noting that the statement is implied by

      $$\begin{aligned} \frac{1}{k} \sum _{i=n+1}^{kn -n} \left| \log \left( \frac{\sigma _i\left( M -zI \right) }{\sigma _i\left( B -zI \right) } \right) \right| \xrightarrow {a.s.} 0. \end{aligned}$$

      Assume now that \(k>2\). Observe that \(M-zI\) is a perturbation of rank at most n of \(B-zI\). As a consequence, by Theorem 2.1, we find that

      $$\begin{aligned} \sigma _{i+n}(B-zI)\le \sigma _i(M-zI) \le \sigma _{i-n}(B-zI) \end{aligned}$$
      (20)

      for every \(n<i\le nk-n\). Thus,

      $$\begin{aligned} \frac{1}{k} \sum _{i=n+1}^{kn -n} \left| \log \left( \frac{\sigma _i\left( M -zI \right) }{\sigma _i\left( B -zI \right) } \right) \right| \le \frac{1}{k} \sum _{i=n+1}^{kn -n} \max \left\{ \left| \log \left( \frac{\sigma _{i-n}\left( B -zI \right) }{\sigma _i\left( B -zI \right) } \right) \right| , \left| \log \left( \frac{\sigma _{i+n}\left( B -zI \right) }{\sigma _i\left( B -zI \right) } \right) \right| \right\} . \end{aligned}$$

      The singular values of \(B-zI\) are the moduli of \(\lambda _i-z\), where \(\lambda _i\) are the eigenvalues of B. For the rest of this argument, and for the sake of notational simplicity, let us now drop the dependence on the argument matrix and simply refer to the rth singular value of \(B-zI\) as \(\sigma _r\). Since all the eigenvalues of B have multiplicity n, we have \( \sigma _{i-n}= |\lambda _j -z|\), \( \sigma _{i} = |\lambda _i -z|\) and \( \sigma _{i+n} = |\lambda _s -z|\), where necessarily i, j, s are pairwise distinct; specifically, j and s are determined by z coherently with the decreasing ordering of the singular values. We conclude that \(\sigma _{i} - \sigma _{i+n}\) and \(\sigma _{i-n} - \sigma _{i}\) are both bounded above by

      $$\begin{aligned} \min _{j \ne s \ne i \ne j} \max \{ ||\lambda _i -z| - |\lambda _j -z||, ||\lambda _i -z| - |\lambda _s -z|| \} \le \min _{j \ne s \ne i \ne j} \max \{ |\lambda _i -\lambda _j|, |\lambda _i -\lambda _s | \} . \end{aligned}$$

      In particular, as \(k>2\), we can choose

      $$\begin{aligned} \lambda _j = \lambda _i \exp (2\pi \text {i}/k),\qquad \lambda _s = \lambda _i \exp (-2\pi \text {i}/k), \end{aligned}$$

      and hence,

      $$\begin{aligned} \min _{j \ne s \ne i \ne j} \max \{ |\lambda _i -\lambda _j|, |\lambda _i -\lambda _s | \} \le |1-\exp (2\pi \text {i}/k)| = 2\sin (\pi /k). \end{aligned}$$

      Therefore, for all values of k large enough so that \(0<2\sin (\pi /k)<|1-|z||\), we use (19) to obtain

      $$\begin{aligned}&\left| \log \left( \frac{\sigma _{i+n}}{\sigma _i} \right) \right| = - \log \left( 1 - \frac{\sigma _i -\sigma _{i+n}}{\sigma _i}\right) \le - \log \left( 1 - 2\frac{\sin (\pi /k)}{|1-|z||}\right) , \\&\left| \log \left( \frac{\sigma _{i-n}}{\sigma _i} \right) \right| = \log \left( 1+ \frac{\sigma _{i-n} -\sigma _{i}}{\sigma _i} \right) \le \log \left( 1 + 2\frac{\sin (\pi /k)}{|1-|z||}\right) , \end{aligned}$$

      and, since \(0<x<1 \implies -\log (1-x) > \log (1+x)\), we conclude that

      $$\begin{aligned}&\frac{1}{k} \sum _{i=n+1}^{kn -n} \max \left\{ \left| \log \left( \frac{\sigma _{i-n}\left( B -zI \right) }{\sigma _i\left( B -zI \right) } \right) \right| , \left| \log \left( \frac{\sigma _{i+n}\left( B -zI \right) }{\sigma _i\left( B -zI \right) } \right) \right| \right\} \\&\quad \le -\frac{kn-2n}{k} \log \left( 1 - 2\frac{\sin (\pi /k)}{|1-|z||}\right) \end{aligned}$$

      which goes to zero as \(k\rightarrow \infty \), implying (18).

\(\square \)

5 Conclusions

We have rigorously obtained the limit of the empirical spectral distribution for monic complex i.i.d. Gaussian matrix polynomials. To our knowledge, and in spite of the relatively common use of random matrix polynomials in numerical experiments to test algorithms for the polynomial eigenvalue problem, the present paper is the first to analytically determine the distribution of the eigenvalues of a class of random matrix polynomials.

We hope that this work may open the path to further future research on eigenvalues of random matrix polynomials. In particular, we believe that it would be of interest to extend our results by considering, for instance, different ways to send \(k,n \rightarrow \infty \), non-monic polynomials, coefficients restricted to be real (and/or otherwise structured) and more general distributions of the entries.

In a forthcoming paper, the authors will report further progress on the convergence of the empirical spectral distributions of non-monic, non-Gaussian matrix polynomials, in both limits \(n\rightarrow \infty \) and \(k\rightarrow \infty \).