The asymptotic distribution of the condition number for random circulant matrices

In this manuscript, we study the limiting distribution of the joint law of the largest and the smallest singular values of random circulant matrices whose generating sequence consists of independent and identically distributed random elements satisfying the so-called Lyapunov condition. Under an appropriate normalization, the joint law of the extremal singular values converges in distribution, as the matrix dimension tends to infinity, to the product of independent Rayleigh and Gumbel laws. The latter implies that the normalized condition number converges in distribution to a Fr\'echet law as the dimension of the matrix increases.


Singular values and condition number. The condition number was introduced independently by von Neumann and Goldstine in , and by Turing in [76], to study the accuracy of solutions of linear systems under finite-precision arithmetic. Roughly speaking, the condition number measures how much the output of a linear system can change under a small perturbation of the input; see [70,80] for further details.
Let A ∈ C^{n×m} be a matrix of dimension n × m. We denote the singular values of A in nondecreasing order by 0 ≤ σ_1(A) ≤ ··· ≤ σ_{min(n,m)}(A), and write σ_min(A) and σ_max(A) for the smallest and the largest of them. The condition number of A is defined by
(1.1) κ(A) = σ_max(A)/σ_min(A).
The smallest singular value σ_min(A) measures the distance between the matrix A and the set of singular matrices. We refer to Chapter 1 in [25] for further details.
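For readers who wish to experiment, the quantities above are directly computable. The sketch below (an illustration added here, not part of the original text; the 3 × 3 matrix is a hypothetical example) computes the singular values and the condition number numerically via the SVD.

```python
import numpy as np

# Hypothetical example: a small perturbation of a rank-2 matrix, so the
# smallest singular value is tiny and the condition number is large.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0001]])

# numpy returns the singular values in nonincreasing order.
sigma = np.linalg.svd(A, compute_uv=False)

# Condition number kappa(A) = sigma_max / sigma_min.
kappa = sigma[0] / sigma[-1]

# sigma_min is also the 2-norm distance from A to the set of singular matrices.
print(kappa)
```

The same value is returned by `np.linalg.cond(A)`, which uses the 2-norm by default.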
The extremal singular values are ubiquitous in applications and have produced a vast literature in geometric functional analysis, mathematical physics, numerical linear algebra, time series, statistics, etc.; see for instance [21,33,35,62,64,74] and Chapter 5 in [65]. The study of the largest and the smallest singular values has been particularly important for sample correlation matrices; we refer to [44,45,46,47] for further details. Moreover, Poisson statistics for the largest eigenvalues in random matrix ensembles (Wigner ensembles, for instance) are studied in [71,72].
Calculating or even estimating the condition number of a generic matrix is a difficult task, see [66]. In computational complexity theory, it is of interest to analyze the random condition number, that is, the case when the matrix A given in (1.1) is a random matrix. In [36], the limiting law of the condition number of a random rectangular matrix with independent and identically distributed (i.i.d. for short) standard Gaussian entries is computed. Moreover, the exact law of the condition number of a 2 × n matrix is derived there. In [6], for real square random matrices with i.i.d. standard Gaussian entries, non-asymptotic lower and upper bounds for the tail probability of the condition number are established. Later, in [38], these results are generalized to non-square matrices and analytic expressions for the tail distribution of the condition number are obtained. Lower and upper bounds for the condition number (in p-norms) and the so-called average "loss of precision" are studied in [73] for real and complex square random matrices with i.i.d. standard Gaussian entries. In [78], random lower triangular matrices L_n of dimension n with i.i.d. standard Gaussian entries are studied, and it is shown that (κ(L_n))^{1/n} converges almost surely to 2 as n tends to infinity. In [60], using a Coulomb fluid technique, asymptotics for the cumulative distribution function of the condition number of rectangular matrices with i.i.d. standard Gaussian entries are derived. More recently, distributional properties of the condition number of random matrices with i.i.d. Gaussian entries are established in [2,28,68], and large deviation asymptotics for condition numbers of sub-Gaussian distributions are given in [69]. We recommend [25,30,34,35] for a complete and current description of condition numbers of random matrices.
1.2. Random circulant matrices. (Random) circulant matrices and (random) circulant-type matrices are important objects in different areas of pure and applied mathematics, for instance compressed sensing, cryptography, the discrete Fourier transform, extreme value analysis, information processing, machine learning, numerical analysis, spectral analysis, time series analysis, etc. For more details we refer to [1,11,13,31,42,53,61] and the monograph on random circulant matrices [21]. Among the topics that have been studied are spectral norms, extremal distributions, the so-called limiting spectral distribution of random circulant and circulant-type matrices, and process convergence of fluctuations, see [14,15,16,17,18,19,20,22].
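The spectral structure that makes circulant matrices tractable can be illustrated numerically: the eigenvalues of a circulant matrix are the discrete Fourier transform of its generating sequence, so its singular values are the moduli of that transform. A minimal sketch (an illustration, assuming NumPy and SciPy are available; the dimension and seed are arbitrary):

```python
import numpy as np
from scipy.linalg import circulant

rng = np.random.default_rng(0)
n = 8
xi = rng.standard_normal(n)        # generating sequence xi_0, ..., xi_{n-1}

# Circulant matrix whose first column is xi.
C = circulant(xi)

# Its eigenvalues are lambda_k = sum_j xi_j e^{-2 pi i j k / n}: the DFT of xi.
lam = np.fft.fft(xi)

# Hence the singular values of C are the moduli |lambda_k|.
sv_fft = np.sort(np.abs(lam))
sv_svd = np.sort(np.linalg.svd(C, compute_uv=False))
print(np.allclose(sv_fft, sv_svd))   # True
```

This identity is what reduces the study of the extremal singular values of C_n to the extremes of the array (|λ_k^{(n)}|)_k.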
1.3. Main results. The problem of computing the limiting distribution of the condition number for square matrices with i.i.d. (real or complex) standard Gaussian random entries is completely analyzed in Chapter 7 of [35]. In this manuscript we focus on the computation of the limiting distribution of κ(C n ) for ξ 0 , . . . , ξ n−1 being i.i.d. real random variables satisfying the so-called Lyapunov condition, see Hypothesis 1.1 below. In fact, the limiting distribution is a Fréchet distribution that belongs to the class of the so-called extreme value distributions [40]. Non-asymptotic estimates for the condition number for random circulant and Toeplitz matrices with i.i.d. standard Gaussian random entries are given in [57,58]. The approach and results of [57,58] are different in nature from our results given in Theorem 1.2.
We assume the following integrability condition. It appears, for instance, in the so-called Lyapunov Central Limit Theorem, see Section 7.3.1 in [4]. Throughout this manuscript, the set of nonnegative integers is denoted by N_0.
Hypothesis 1.1 (Lyapunov condition). We assume that (ξ_j)_{j∈N_0} is a sequence of i.i.d. nondegenerate real random variables on some probability space (Ω, F, P) with zero mean and unit variance. If there exists δ > 0 such that E|ξ_0|^{2+δ} < ∞, where E denotes the expectation with respect to P, we say that (ξ_j)_{j∈N_0} satisfies the Lyapunov integrability condition.
We note that a sequence of i.i.d. non-degenerate sub-Gaussian random variables satisfies the Lyapunov condition. Before stating the main result and its consequences, we introduce some notation. For shorthand, and in a conscious abuse of notation, we use the notations exp(a) and e^a interchangeably for the exponential function, a ∈ R. We denote by ln(·) the natural logarithm and we use the same notation | · | for the complex modulus and the absolute value.
The main result of this manuscript is the following.
Theorem 1.2. Assume that Hypothesis 1.1 is valid. Then for any x ≥ 0 and y ∈ R it follows that
lim_{n→∞} P(σ_min(C_n) ≤ x, (σ_max(C_n) − a_n)/b_n ≤ y) = R(x)G(y),
where
(1.7) R(x) = 1 − exp(−x²/2), x ≥ 0, and G(y) = exp(−e^{−y}), y ∈ R,
are the Rayleigh distribution and the Gumbel distribution, respectively, and the normalizing constants are given by
(1.8) a_n = √(n ln(n/2)) and b_n = (1/2)√(n/ln(n/2)), n ≥ 3.
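The normalizing constants in (1.8) can be motivated by a back-of-the-envelope computation (a heuristic sketch, not the proof): treating the variables |λ_k^{(n)}|²/n, 1 ≤ k ≤ q_n := ⌊n/2⌋, as asymptotically i.i.d. standard exponential random variables, classical extreme value theory for exponentials suggests

```latex
% Heuristic only: if E_1,\dots,E_{q_n} are i.i.d. standard exponentials, then
% \max_{k\le q_n} E_k = \ln q_n + G_n, where G_n converges to a Gumbel law. Hence
\sigma_{\max}^2(C_n) \approx n \max_{1\le k\le q_n} E_k = n\ln(n/2) + n G_n,
\qquad
\sigma_{\max}(C_n) \approx \sqrt{n\ln(n/2)}\,\sqrt{1+\frac{G_n}{\ln(n/2)}}
\approx \underbrace{\sqrt{n\ln(n/2)}}_{a_n}
      + \underbrace{\frac{1}{2}\sqrt{\frac{n}{\ln(n/2)}}}_{b_n}\, G_n .
```

The rigorous argument below replaces the i.i.d. exponential approximation by the Davis-Mikosch method.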
The proof of Theorem 1.2 is based on the Davis-Mikosch method, used in [32] to prove that the normalized maximum of the periodogram converges in distribution to the Gumbel law, see Theorem 2.1 there. This method relies on Einmahl's multivariate extension of the so-called Komlós-Major-Tusnády approximation, see [39]. However, here we must control the joint law of the normalized largest singular value (σ_max − a_n)/b_n and the smallest singular value σ_min, which arise from an array whose components are not independent. Applying the Continuous Mapping Theorem we deduce the limiting law of the condition number κ(C_n), see Corollary 1.3 item (iv). Along the lines of [48], one can obtain convergence of the point processes of the singular values in the setting of Theorem 1.2. We remark that (1.3) resembles the discrete Fourier transform and is related to the so-called periodogram, which has been used in many areas of applied science. The maximum of the periodogram has already been studied, for instance in [32,51,55,77], and under a suitable rescaling the Gumbel distribution appears as the limiting law. The asymptotic behavior of Fourier transforms of stationary ergodic sequences with finite second moments is analyzed in [59], where it is shown that asymptotically the real part and the imaginary part of the Fourier transform decouple into a product of independent Gaussian distributions. The so-called quenched central limit theorem for the discrete Fourier transform of a stationary ergodic process is obtained in [9], and central limit theorems for discrete Fourier transforms of functional time series are given in [27]. In addition, in [26] the maximum of the periodogram is studied for time series with values in a Hilbert space. However, to our knowledge, Theorem 1.2 is not an immediate implication of these results. As a consequence of Theorem 1.2 we obtain the limiting distribution of the (normalized) largest singular value, the smallest singular value and the (normalized) condition number, as Corollary 1.3 states.
For an n × n symmetric random Toeplitz matrix satisfying Hypothesis 1.1, it is shown in [67] that the largest eigenvalue scaled by √(2n ln(n)) converges in L^{2+δ}, as n → ∞, to the constant ‖Sin‖²_{2→4} = 0.8288…. We point out that Theorem 1 in [24] yields that the Gumbel distribution is the limiting distribution of the (renormalized) largest singular value of symmetric circulant random matrices whose generating i.i.d. sequence (half of its entries) satisfies Hypothesis 1.1. Also, under a suitable normalization, the limiting law of the largest singular value of Hermitian Gaussian circulant matrices has Gumbel distribution, see Corollary 5 in [56].
Recall that the square root of an Exponential random variable with parameter λ has Rayleigh distribution with parameter (2λ)^{−1/2}. The exponential law appears as the limiting distribution of the minimum modulus of trigonometric polynomials, see Theorem 1 in [81] for the Gaussian case and Theorem 1.2 of [29] for the sub-Gaussian case.
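This relation can be verified directly at the level of distribution functions. The short check below (an illustration added here; the rate parameter λ = 1.7 and the evaluation points are arbitrary choices) compares P(√X ≤ y) for X ~ Exp(λ) with the Rayleigh((2λ)^{−1/2}) distribution function.

```python
import math

lam = 1.7                         # arbitrary rate parameter of Exp(lam)
sigma = (2.0 * lam) ** -0.5       # claimed Rayleigh parameter

for y in (0.3, 1.0, 2.5):
    # P(sqrt(X) <= y) = P(X <= y^2) for X ~ Exp(lam).
    cdf_sqrt_exp = 1.0 - math.exp(-lam * y * y)
    # Rayleigh(sigma) distribution function: 1 - exp(-y^2 / (2 sigma^2)).
    cdf_rayleigh = 1.0 - math.exp(-y * y / (2.0 * sigma * sigma))
    assert abs(cdf_sqrt_exp - cdf_rayleigh) < 1e-12
```

The two distribution functions coincide identically, since y²/(2σ²) = λy² exactly when σ = (2λ)^{−1/2}.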
The Fréchet distribution appears as the limiting distribution of the (rescaled) largest eigenvalue of random real symmetric matrices with independent and heavy-tailed entries, see Corollary 1 in [5], and [10] for the non-i.i.d. case. The Fréchet distribution in Corollary 1.3 item (iv) has cumulative distribution function F(t) = exp(−t^{−2})1_{{t>0}} with shape parameter 2, scale parameter 1 and location parameter 0. Moreover, it does not possess finite variance. For descriptions of extreme value distributions and limiting theorems we refer to [40].
Corollary 1.3. Assume that Hypothesis 1.1 is valid. Then the following holds.
(ii) The normalized maximum (σ_max(C_n) − a_n)/b_n converges in distribution as n → ∞ to a Gumbel distribution, i.e., lim_{n→∞} P((σ_max(C_n) − a_n)/b_n ≤ y) = G(y) for all y ∈ R.
(iii) The minimum σ_min(C_n) converges in distribution as n → ∞ to a Rayleigh distribution, i.e., lim_{n→∞} P(σ_min(C_n) ≤ x) = R(x) for all x ≥ 0.
(iv) The condition number κ(C_n) converges in distribution as n → ∞ to a Fréchet distribution, i.e., lim_{n→∞} P(√2 κ(C_n)/a_n ≤ t) = F(t) for all t > 0.
In the sequel, we briefly compare our results with the literature on the limiting law of the condition number, the smallest singular value and the largest singular value.
Remark 1.4 (Fréchet distributions as limiting distributions of condition numbers). The Fréchet distribution with shape parameter 2, scale parameter 2 and location parameter 0 is the limiting distribution, as the dimension grows, of κ(A_n)/n, where A_n is a square matrix of dimension n with i.i.d. complex Gaussian entries, see Theorem 6.2 in [36]. When A_n has real i.i.d. Gaussian entries, κ(A_n)/n converges in distribution, as n tends to infinity, to a random variable with an explicit density, see Theorem 6.1 in [36]. We stress that this density does not belong to the Fréchet family of distributions. We also point out that the distribution of the so-called Demmel condition number of (real and complex) Wishart matrices is given explicitly in [37].
Remark 1.5 (A word about the smallest singular value σ_1(A_n)). The behavior of the smallest singular value appears naturally in the numerical inversion of large matrices. For instance, when the random matrix A_n has complex i.i.d.
Gaussian entries, for every n the random variable nσ_1²(A_n) has the Chi-square distribution with two degrees of freedom, see Corollary 3.3 in [36]. For a random matrix A_n with real i.i.d. Gaussian entries, it is shown in Corollary 3.1 in [36] that nσ_1²(A_n) converges in distribution, as the dimension increases, to a random variable with an explicit density. For further discussion of the smallest singular values, we refer to [8,12,43,44,49,52,74,75].
Remark 1.6 (A word about the largest singular value σ_n(A_n)). A lot is known about the behavior of the largest singular value. As an illustration, for a random matrix A_n with real i.i.d. Gaussian entries, it is shown in Lemma 4.1 in [36] that (1/n)σ_n²(A_n) converges in probability to 4 as n grows, while for a random matrix A_n with complex i.i.d. Gaussian entries, (1/n)σ_n²(A_n) converges in probability to 8. We stress that the Gumbel distribution is the limiting law of the spectral radius of Ginibre ensembles, see Theorem 1.1 in [62]. Recently, it was shown in Theorem 4 in [3] that the Gumbel distribution is the limiting law of the largest eigenvalue of a Gaussian Laplacian matrix. For further discussion, we refer to [5,7,24,26,32,44,56,71,72].
We continue with the proof of Corollary 1.3.
Proof of Corollary 1.3. Items (i)-(iii) follow directly from Theorem 1.2 and the Continuous Mapping Theorem (see Theorem 13.25 in [50]). In the sequel, we prove Item (iv). By (1.5) we have
√2 κ(C_n)/a_n = (√2/σ_min(C_n)) ((b_n/a_n)(σ_max(C_n) − a_n)/b_n + 1),
and, by Theorem 1.2, ((σ_max(C_n) − a_n)/b_n, σ_min(C_n)) converges in distribution to (G, R), where G and R are random variables with Gumbel and Rayleigh distributions, respectively. Since b_n/a_n → 0 as n → ∞, the Slutsky Theorem with the help of (1.10) and (1.11) implies that √2 κ(C_n)/a_n converges in distribution to √2/R, which possesses the Fréchet distribution F. This finishes the proof of Item (iv).
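The last step rests on the fact, proved in Appendix A, that √2/R has the Fréchet distribution F(t) = exp(−t^{−2}) when R is standard Rayleigh. A Monte Carlo sanity check (an illustration with a fixed seed; the sample size and evaluation points are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(42)

# Standard Rayleigh samples: R = sqrt(2 E) with E ~ Exp(1) has
# distribution function 1 - exp(-x^2 / 2).
R = np.sqrt(2.0 * rng.exponential(size=200_000))
T = np.sqrt(2.0) / R

# Compare the empirical law of T with the Frechet law exp(-t^{-2}).
for t in (0.5, 1.0, 2.0):
    empirical = np.mean(T <= t)
    frechet = np.exp(-t ** -2.0)
    assert abs(empirical - frechet) < 0.01
```

The closed-form computation behind the check is exactly the one in Appendix A: P(√2/R ≤ t) = P(R ≥ √2/t) = exp(−t^{−2}).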
Recently, the condition number of powers of matrices has been studied in [49]. As a consequence of Theorem 1.2 we have the following corollary which, in particular, gives the limiting distribution of the condition number of the powers of C_n. Corollary 1.7 (Asymptotic distribution of the p-th power of the maximum, the minimum and the condition number). Let the notation and hypotheses of Theorem 1.2 be valid and take p ∈ N. The following holds.
(i) Asymptotic distribution of the p-th power of the normalized maximum. For any y ∈ R it follows that
lim_{n→∞} P((σ_max(C_n)^p − A_n(p))/B_n(p) ≤ y) = G(y),
where A_n(p) = a_n^p and B_n(p) = p b_n a_n^{p−1} for all n ≥ 3.
(ii) Asymptotic distribution of the p-th power of the minimum. For any x ≥ 0 it follows that
lim_{n→∞} P(σ_min(C_n)^p ≤ x) = R(x^{1/p}).
(iii) Asymptotic distribution of the p-th power of the condition number. For any z > 0 it follows that
lim_{n→∞} P(2^{p/2} κ(C_n)^p/a_n^p ≤ z) = F(z^{1/p}).
Proof. Let y ∈ R and observe that B_n y + A_n = a_n^p (1 + p(b_n/a_n)y) → ∞ as n → ∞, since b_n/a_n → 0 and a_n → ∞. Then for n large enough we have
(1.12) P(σ_max(C_n)^p ≤ B_n y + A_n) = P(σ_max(C_n) ≤ (B_n y + A_n)^{1/p}).
We claim that lim_{n→∞} ((B_n y + A_n)^{1/p} − a_n)/b_n = y for any y ∈ R.
Indeed, notice that
((B_n y + A_n)^{1/p} − a_n)/b_n = (a_n/b_n)((1 + p(b_n/a_n)y)^{1/p} − 1) → y as n → ∞,
since (1 + u)^{1/p} = 1 + u/p + O(u²) as u → 0. By (1.8), a straightforward computation yields B_n = B_n(p) = p b_n a_n^{p−1} for all n ≥ 3. This finishes the proof of item (i).
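The claim can also be checked numerically. The sketch below (an illustration, using the reconstructed constants A_n(p) = a_n^p and B_n(p) = p b_n a_n^{p−1} together with (1.8)) shows that ((B_n y + A_n)^{1/p} − a_n)/b_n approaches y slowly, at rate O(1/ln n):

```python
import math

def a(n):
    return math.sqrt(n * math.log(n / 2.0))   # a_n from (1.8)

def b(n):
    return 0.5 * math.sqrt(n / math.log(n / 2.0))   # b_n from (1.8)

p, y = 3, 1.5
for n in (10 ** 6, 10 ** 9, 10 ** 12):
    A = a(n) ** p                      # A_n(p) = a_n^p
    B = p * b(n) * a(n) ** (p - 1)     # B_n(p) = p b_n a_n^{p-1} (reconstruction)
    lhs = ((B * y + A) ** (1.0 / p) - a(n)) / b(n)
    print(n, lhs)                      # approaches y = 1.5 as n grows
```

The slow O(1/ln n) convergence reflects the factor a_n/b_n = 2 ln(n/2) in the expansion above.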
The following proposition states the standard Gaussian case. It is an immediate consequence of Theorem 1.2. However, since the random variables are i.i.d. with standard Gaussian distribution, the computation of the law of (1.9) and of its limiting distribution can be carried out explicitly. We decided to include an alternative and instructive proof in order to illustrate how we identify the right-hand side of (1.6).
Proposition 1.8 (Standard Gaussian case). Assume that (ξ_j)_{j∈N_0} is a sequence of i.i.d. standard Gaussian random variables. Then for any x ≥ 0 and y ∈ R it follows that
(1.13) lim_{n→∞} P(σ_min(C_n) ≤ x, (σ_max(C_n) − a_n)/b_n ≤ y) = R(x)G(y),
where the functions R and G are given in (1.7), and the normalizing constants a_n, b_n are defined in (1.8).
Proof. Let q_n := ⌊n/2⌋. For 1 ≤ k ≤ q_n we note that λ_k^{(n)} is the complex conjugate of λ_{n−k}^{(n)}, so it suffices to control |λ_k^{(n)}| for 0 ≤ k ≤ q_n. The complex modulus of λ_k^{(n)} is given by |λ_k^{(n)}| = √(c²_{k,n} + s²_{k,n}), where c_{k,n} := Σ_{j=0}^{n−1} ξ_j cos(2πkj/n) and s_{k,n} := Σ_{j=0}^{n−1} ξ_j sin(2πkj/n) for 0 ≤ k ≤ q_n. Note that s_{0,n} = 0. By straightforward computations we obtain, for any n ∈ N:
(i) E(c_{k,n}) = E(s_{k,n}) = 0 for 0 ≤ k ≤ q_n;
(ii) E(c²_{0,n}) = n and E(c²_{k,n}) = E(s²_{k,n}) = n/2 for 1 ≤ k ≤ q_n;
(iii) E(c_{k,n} s_{l,n}) = E(c_{k,n} c_{l,n}) = E(s_{k,n} s_{l,n}) = 0 for 0 ≤ l < k ≤ q_n.
Then (i), (ii) and (iii) imply that (1/√n)(c_{0,n}, c_{1,n}, s_{1,n}, …, c_{q_n,n}, s_{q_n,n}) is a Gaussian vector whose entries are uncorrelated, i.e., it has independent Gaussian entries. Thus, (1/n)|λ_k^{(n)}|² has the standard exponential distribution E_1 for 1 ≤ k < q_n, while (1/n)|λ_{q_n}^{(n)}|² has the χ²₁ distribution for n even (due to s_{q_n,n} = 0) and the E_1 distribution for n odd. For n odd, (1.14), (1.15) and Lemma A.1 in Appendix A imply
(1.16) P(σ_min(C_n) > x, σ_max(C_n) ≤ b_n y + a_n) = P(x²/n < c²_{0,n}/n ≤ (b_n y + a_n)²/n) · [P(x²/n < E_1 ≤ (b_n y + a_n)²/n)]^{q_n}.
We observe that lim_{n→∞} x²/n = 0 and lim_{n→∞} (b_n y + a_n)²/n = ∞ for any x, y ∈ R; hence the first factor on the right-hand side of (1.16) tends to one. Recall that for any nonnegative numbers u and v such that u < v it follows that P(u < E_1 ≤ v) = e^{−u} − e^{−v}. Then a straightforward calculation, using (b_n y + a_n)²/n = ln(n/2) + y + o(1) as n → ∞, yields, for any x ≥ 0 and y ∈ R,
lim_{n→∞} [P(x²/n < E_1 ≤ (b_n y + a_n)²/n)]^{q_n} = exp(−x²/2) exp(−e^{−y}).
The preceding limit, with the help of (1.16), implies the limit (1.13). A similar reasoning yields the proof when n is an even number. This finishes the proof of Proposition 1.8.
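Proposition 1.8 lends itself to direct simulation, since for Gaussian generating sequences the singular values of C_n are exactly the moduli of the FFT of the sequence. The experiment below (an illustration with a fixed seed; the dimension, sample size and evaluation points are arbitrary choices) compares the empirical laws of the extremal singular values with the Rayleigh law R(x) = 1 − e^{−x²/2} and the Gumbel law G(y) = e^{−e^{−y}} at one point each.

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials = 5001, 2000                   # odd n, as in the proof above
a_n = np.sqrt(n * np.log(n / 2.0))
b_n = 0.5 * np.sqrt(n / np.log(n / 2.0))

# Each row is a Gaussian generating sequence; the singular values of the
# associated circulant matrix are the moduli of its DFT.
xi = rng.standard_normal((trials, n))
lam = np.abs(np.fft.fft(xi, axis=1))
smax = lam.max(axis=1)
smin = lam.min(axis=1)

# Empirical laws versus the limiting Gumbel and Rayleigh laws.
gumbel_emp = np.mean((smax - a_n) / b_n <= 0.0)   # limit: G(0) = exp(-1)
rayleigh_emp = np.mean(smin <= 1.0)               # limit: R(1) = 1 - exp(-1/2)
print(gumbel_emp, np.exp(-1.0))
print(rayleigh_emp, 1.0 - np.exp(-0.5))
```

The agreement is only up to Monte Carlo noise and the finite-n bias; both shrink as n and the number of trials grow.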
The rest of the manuscript is organized as follows. Section 2 is divided into five subsections. In Subsection 2.1 we prove that the asymptotic behavior of |λ_0^{(n)}| can be removed from our computations. In Subsection 2.2 we establish that the original sequence of random variables can be assumed to be bounded. In Subsection 2.3 we provide a procedure in which we smooth (by a Gaussian perturbation) the bounded sequence obtained in Subsection 2.2. In Subsection 2.4 we prove the main result for the bounded and smooth sequence given in Subsection 2.3. In Subsection 2.5 we summarize all the results proved in the previous subsections and show Theorem 1.2. Finally, Appendix A collects technical results used in the main text.

Komlós-Major-Tusnády approximation
In this section, we show that Theorem 1.2 can be deduced from a Gaussian approximation in which the computations can be carried out. Roughly speaking, we show that the joint law of σ_min and σ_max can be approximated by the joint law of σ_min^{(n,N)} and σ_max^{(n,N)}, which denote the smallest and the largest singular values, respectively, of a random circulant matrix whose generating sequence consists of i.i.d. bounded and smooth random variables that can be well approximated by a standard Gaussian distribution.
Along this section, for any set A ⊂ Ω, we denote its complement with respect to Ω by A^c. We also point out the following immediate relation:
{σ_min > x, σ_max ≤ b_n y + a_n} = ⋂_{k=0}^{n−1} {x < |λ_k^{(n)}| ≤ b_n y + a_n}.
We first show that the term corresponding to |λ_0^{(n)}| can be removed from our computations. In other words, we only need to carry out our computations over the array (|λ_k^{(n)}|)_{k∈{1,…,n−1}}, as the following lemma states.

Lemma 2.1 (The asymptotic behavior of |λ_0^{(n)}| is negligible). Assume that Hypothesis 1.1 is valid. Then for any x, y ∈ R it follows that
lim_{n→∞} |P(⋂_{k=0}^{n−1} {x < |λ_k^{(n)}| ≤ b_n y + a_n}) − P(⋂_{k=1}^{n−1} {x < |λ_k^{(n)}| ≤ b_n y + a_n})| = 0.
Proof. For any n ∈ N we set
A := ⋂_{k=0}^{n−1} {x < |λ_k^{(n)}| ≤ b_n y + a_n} and B := ⋂_{k=1}^{n−1} {x < |λ_k^{(n)}| ≤ b_n y + a_n}.
Since A ⊂ B, we have 0 ≤ P(B) − P(A) ≤ P({x < |λ_0^{(n)}| ≤ b_n y + a_n}^c). By the Central Limit Theorem, λ_0^{(n)}/√n converges in distribution, as n → ∞, to N(0, E[|ξ_0|²]), where N(0, E[|ξ_0|²]) denotes the Gaussian distribution with zero mean and variance E[|ξ_0|²]. We note that for any x, y ∈ R the following limits hold:
(2.5) lim_{n→∞} x/√n = 0 and lim_{n→∞} (b_n y + a_n)/√n = ∞.
The next lemma allows us to replace the original array of random variables (ξ_j) by the truncated array whose associated eigenvalues (λ̃_k^{(n)})_{k∈{1,…,n−1}} are defined in (2.7) of Lemma 2.2; for any x, y ∈ R, the corresponding limiting probabilities coincide, that is, lim_{n→∞} (P(A_n) − P(B_n)) = 0 with A_n and B_n as below.
Proof. For each n ∈ N let
A_n := ⋂_{k=1}^{n−1} {x < |λ̃_k^{(n)}| ≤ b_n y + a_n} and B_n := ⋂_{k=1}^{n−1} {x < |λ_k^{(n)}| ≤ b_n y + a_n}.
The preceding limit, with the help of Lemma 2.1 and (2.1), implies (2.14).

2.3. Smoothness.
In this subsection we introduce a Gaussian perturbation in order to smooth the random variables (ξ̃_j^{(n)})_{j∈{1,…,n−1}} defined in (2.6) of Lemma 2.2. Let (N_j)_{j∈{1,…,n−1}} be an array of i.i.d. random variables with standard Gaussian distribution. Let (s_n)_{n∈N} be a deterministic sequence of positive numbers satisfying s_n → 0 as n → ∞ in such a way that
(2.15) lim_{n→∞} s_n b_n √n = lim_{n→∞} s_n a_n √n = 0.
This is made precise in (2.23) below. We anticipate that s_n ≈ n^{−θ} for a suitable positive exponent θ. We define the array (γ_k)_{k∈{1,…,n−1}} by perturbing the array (λ̃_k^{(n)})_{k∈{1,…,n−1}}, defined in (2.7) of Lemma 2.2, with the Gaussian array (N_j)_{j∈{1,…,n−1}} at scale s_n.
Lemma 2.4. Assume that Hypothesis 1.1 is valid. Then for any x, y ∈ R and a suitable sequence (s_n)_{n∈N} that tends to zero as n → ∞, the limit (2.17) holds, where σ_min^{(n,N)} and σ_max^{(n,N)} are given in (2.16).
Proof. By (2.18) we can compare (σ_min^{(n,N)}, σ_max^{(n,N)}) with the singular values of the bounded model scaled by β_n; by (2.15), the scaling factor β_n tends to one and the error terms vanish, so that (2.21) holds in distribution as n → ∞. The limit (2.21) can also be deduced from (1.1), p. 522 in [32], or from Chapter 10 in [23]. Now we have the necessary elements to conclude the proof of Lemma 2.4. By Lemma 2.5 we have
(2.22) lim_{n→∞} P(σ_min^{(n,N)} ≤ β_n x, σ_max^{(n,N)} ≤ β_n (b_n y + a_n)) = R(x)G(y) for any x ≥ 0, y ∈ R.
By (2.20) and (2.22), with the help of the Slutsky Theorem, we deduce the analogous limit without the factor β_n. The preceding limit, with the help of (2.19), then implies the desired identity, and as a consequence we conclude (2.17).

2.4. Bounded and smooth case. In this subsection we prove the following lemma.
Lemma 2.5. Let σ_min^{(n,N)} and σ_max^{(n,N)} be as in (2.16). Then for any x ≥ 0, y ∈ R and a suitable sequence (s_n)_{n∈N} that tends to zero as n → ∞, it follows that
lim_{n→∞} P(σ_min^{(n,N)} ≤ β_n x, σ_max^{(n,N)} ≤ β_n (b_n y + a_n)) = R(x)G(y),
where R and G are defined in Theorem 1.2.
We follow the ideas from [32]. To prove Lemma 2.5 we introduce some notation and the so-called Einmahl-Komlós-Major-Tusnády approximation. For each d ∈ N and indexes i_j ∈ {1, …, n−1}, j = 1, …, d, we define the Fourier frequencies w_{i_j} = 2πi_j/n, j = 1, …, d, and the vector
v_d(ℓ) = (cos(w_{i_1}ℓ), sin(w_{i_1}ℓ), …, cos(w_{i_d}ℓ), sin(w_{i_d}ℓ))^T for any ℓ ∈ N_0,
where T denotes the transpose operator. The next lemma (Lemma 2.6) is the main tool in the proof of Lemma 2.5: it allows us to reduce the problem to a perturbed Gaussian case for sufficiently small η > 0, where q_n := ⌊n/2⌋ with ⌊·⌋ being the floor function. In what follows we compute the limit of (2.25) as n → ∞; in fact, we prove (2.26). By Lemma A.2 in Appendix A, for any fixed ℓ ∈ N we obtain an inclusion-exclusion expansion, and we claim that for every fixed d ∈ N the corresponding limit holds true, where the symbol C(q_n, d) denotes the binomial coefficient. Indeed, by Lemma 2.6 we obtain a Gaussian approximation in which I_{2d} denotes the 2d × 2d identity matrix, ϕ_{(1+s_n²)I_{2d}} is the density of a 2d-dimensional Gaussian vector with zero mean and covariance matrix (1 + s_n²)I_{2d}, and o_n(1) → 0 as n → ∞. By (1.8), and since x, y ∈ R are fixed, there exists n_0 = n_0(x, y) such that 2x²/n < 2(b_n y + a_n)²/n for all n ≥ n_0.
Hence, (2.31) with the help of Lemma A.5 in Appendix A yields the desired bound. By (1.8), the Stirling formula (see formula (1) in [63]) and the fact that s_n → 0 as n → ∞, we deduce the asymptotics of the binomial terms. As a consequence of (2.32) and the preceding estimates, sending ℓ → ∞ in the preceding inequality yields the claim. The preceding limit, with the help of (2.27), implies (2.26), where R and G are defined in (1.7).
This concludes the proof of Theorem 1.2.
Appendix A. Tools. This appendix collects auxiliary results that are used in the main text; they are gathered here to keep the presentation fluid. The following elementary lemma is crucial in the proof of Proposition 1.8.
for all s, t ∈ R.
Since the proof of Lemma A.1 is straightforward, we omit it.
The proof of Lemma A.2 is given in Section 1.1 "Inclusion-exclusion Formula" of [54].
Lemma A.3 (Continuity). Let (Ω, F , P) be a probability space. Let (X n ) n∈N be a sequence of random variables defined on Ω and taking values in R. Assume that X n converges in distribution to a random variable X, as n → ∞. Let x be a continuity point of the distribution function F X of the random variable X and let (a n (x)) n∈N be a deterministic sequence of real numbers such that a n (x) → x as n → ∞. Then (A.1) lim n→∞ P(X n ≤ a n (x)) = F X (x).
In addition, if F X is a continuous function then (i) for any deterministic sequence (a n ) n∈N such that a n → 0 as n → ∞ it follows that lim n→∞ P(|X n | ≤ a n ) = 0.
(ii) for any deterministic sequence (a n ) n∈N satisfying a n → ∞ as n → ∞ it follows that lim n→∞ P(|X n | > a n ) = 0.
Proof. We start with the proof of (A.1). Let x be a continuity point of F X and take ǫ > 0. Then there exists n ǫ := n ǫ (x) such that x − ǫ < a n (x) < x + ǫ for all n ≥ n ǫ . By monotonicity we have lim inf n→∞ P (X n < x − ǫ) ≤ lim inf n→∞ P (X n ≤ a n (x)) ≤ lim sup n→∞ P (X n ≤ a n (x)) ≤ lim sup n→∞ P (X n ≤ x + ǫ) .
Since x is a continuity point of F X , sending ǫ → 0 we deduce (A.1). We continue with the proof of item (i) and item (ii). By the Continuous Mapping Theorem we have |X n | → |X| in distribution, as n → ∞. On the one hand, (A.1) yields lim n→∞ P(|X n | ≤ a n ) = P(|X| ≤ 0) = P(|X| = 0) = 0, which finishes the proof of item (i).
On the other hand, let m > 0 be arbitrary. Then there exists n_m ∈ N such that a_n > m for all n ≥ n_m. Hence, lim sup_{n→∞} P(|X_n| > a_n) ≤ lim sup_{n→∞} P(|X_n| > m) = P(|X| > m), which implies item (ii) by sending m ↑ ∞.
The final lemma states that if X has the standard Rayleigh distribution, then √2/X has the Fréchet distribution F. Proof. Let y > 0 and note that P(√2/X ≤ y) = P(X ≥ √2/y) = exp(−y^{−2}) = F(y).
The preceding equality concludes the statement.