Abstract
To investigate the properties of the eigenvector matrix of the sample covariance matrix \(\mathbf {S}_n\), we establish a central limit theorem for linear spectral statistics associated with a new form of empirical spectral distribution \(H^{\mathbf {S}_n}\), which is based on both the eigenvectors and the eigenvalues of \(\mathbf {S}_n\). Using Bernstein polynomial approximations, we prove the central limit theorem for linear spectral statistics of \(H^{\mathbf {S}_n}\) indexed by a set of functions with continuous third-order derivatives over an interval containing the support of the Marčenko–Pastur law. This result provides further evidence for the conjecture that the eigenmatrix of the sample covariance matrix is asymptotically Haar distributed.
References
Bai ZD, Silverstein JW (1998) No eigenvalues outside the support of the limiting spectral distribution of large-dimensional sample covariance matrices. Ann Probab 26:316–345
Bai ZD, Silverstein JW (2004) CLT for linear spectral statistics of large-dimensional sample covariance matrices. Ann Probab 32:553–605
Bai ZD, Silverstein JW (2010) Spectral analysis of large dimensional random matrices, 2nd edn. Springer, New York
Bai ZD, Yin YQ (1993) Limit of the smallest eigenvalue of large dimensional covariance matrix. Ann Probab 21(3):1275–1294
Bai ZD, Silverstein JW, Yin YQ (1988) A note on the largest eigenvalue of a large dimensional sample covariance matrix. J Multiv Anal 26:166–168
Bai ZD, Miao BQ, Pan GM (2007) On asymptotics of eigenvectors of large sample covariance matrix. Ann Probab 35:1532–1572
Bai ZD, Wang XY, Zhou W (2010) Functional CLT for sample covariance matrices. Bernoulli 16(4):1086–1113
Billingsley P (1995) Probability and measure, 3rd edn. Wiley, New York
Bordenave C, Guionnet A (2013) Localization and delocalization of eigenvectors for heavy-tailed random matrices. Probab Theory Relat Fields 157:885–953. arXiv:1201.1862
Erdős L, Schlein B, Yau H-T (2009a) Semicircle law on short scales and delocalization of eigenvectors for Wigner random matrices. Ann Probab 37(3):815–852
Erdős L, Schlein B, Yau H-T (2009b) Local semicircle law and complete delocalization for Wigner random matrices. Commun Math Phys 287:641–655
Jiang TF (2006) How many entries of a typical orthogonal matrix can be approximated by independent normals. Ann Probab 34(4):1497–1529
Jing BY, Pan GM, Shao QM, Zhou W (2010) Nonparametric estimate of spectral density functions of sample covariance matrices: a first step. Ann Stat 38(6):3724–3750
Johnstone IM (2001) On the distribution of the largest eigenvalue in principal components analysis. Ann Stat 29(2):295–327
Jonsson D (1982) Some limit theorems for the eigenvalues of a sample covariance matrix. J Multiv Anal 12:1–38
Knowles A, Yin J (2013) Eigenvector distribution of Wigner matrices. Probab Theory Relat Fields 155(3–4):543–582. arXiv:1102.0057v4
Marčenko V, Pastur L (1967) Distribution of eigenvalues for some sets of random matrices. Math USSR-Sb 1:457–486
Schenker J (2009) Eigenvector localization for random band matrices with power law band width. Commun Math Phys 290(3):1065–1097
Silverstein JW (1981) Describing the behavior of eigenvectors of random matrices using sequences of measures on orthogonal groups. SIAM J Math Anal 12:274–281
Silverstein JW (1984) Some limit theorems on the eigenvectors of large dimensional sample covariance matrices. J Multiv Anal 15:295–324
Silverstein JW (1989) On the eigenvectors of large dimensional sample covariance matrices. J Multiv Anal 30:1–16
Silverstein JW (1990) Weak convergence of random functions defined by the eigenvectors of sample covariance matrices. Ann Probab 18:1174–1194
Silverstein JW, Bai ZD (1995) On the empirical distribution of eigenvalues of a class of large dimensional random matrices. J Multiv Anal 54:175–192
Tao T, Vu V (2012) Random matrices: universal properties of eigenvectors. Random Matrices Theory Appl 1(1):1150001. arXiv:1103.2801v2
Tracy CA, Widom H (1994) Level-spacing distributions and the Airy kernel. Commun Math Phys 159:151–174
Xia NN, Qin YL, Bai ZD (2013) Convergence rate of empirical spectral and eigenvector distribution of large dimensional sample covariance matrix. Ann Stat 41(5):2572–2607
Yin YQ (1986) Limiting spectral distribution for a class of random matrices. J Multiv Anal 20:50–68
Acknowledgments
The research of Ningning Xia was supported by the Fundamental Research Funds for the Central Universities 11SSXT131. Zhidong Bai was partially supported by NSF China 11171057, PCSIRT, the Fundamental Research Funds for the Central Universities and NUS Grant R-155-000-141-112.
Appendix
Lemma 1
(Burkholder) (Lemma 2.2 of Bai and Silverstein 1998) Let \(X_k\), \(k=1,2,\ldots ,\) be a complex martingale difference sequence with respect to the increasing \(\sigma \)-fields \(\mathcal {F}_k\). Then, for \(p>1\),
where \(K_p\) is a constant which depends upon \(p\) only.
Lemma 2
(Lemma 2.6 of Silverstein and Bai 1995) Let \(z\in \mathbb {C}^+\) with \(v=\mathfrak {I}z\), \(\mathbf {A}\) and \(\mathbf {B}\) Hermitian and \(\mathbf {r}\in \mathbb {C}^n\). Then
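The display for this lemma is not reproduced above. As an assumption, the cited Lemma 2.6 of Silverstein and Bai (1995) is commonly stated as the rank-one perturbation bound \(\left| \mathrm{tr}\left( (\mathbf {B}-z\mathbf {I})^{-1}-(\mathbf {B}+\mathbf {r}\mathbf {r}^*-z\mathbf {I})^{-1}\right) \mathbf {A}\right| \le \Vert \mathbf {A}\Vert /v\) with \(\mathbf {B}\) Hermitian. The following sketch checks that bound numerically on a small random instance; the helper names are hypothetical, and \(\mathbf {A}\) is taken diagonal only so that its spectral norm is available in closed form.

```python
import random

def inverse(M):
    """Gauss-Jordan inverse with partial pivoting for a small complex matrix."""
    n = len(M)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for i in range(n):
            if i != col:
                f = aug[i][col]
                aug[i] = [x - f * y for x, y in zip(aug[i], aug[col])]
    return [row[n:] for row in aug]

def rank_one_check(n=6, z=0.3 + 0.05j, seed=0):
    """Return (|tr((R1 - R2) A)|, ||A|| / v) for a random instance.

    R1 = (B - zI)^{-1}, R2 = (B + r r^* - zI)^{-1}, B Hermitian,
    A = diag(d) (an illustrative choice, so that ||A|| = max |d_i|).
    """
    rng = random.Random(seed)
    g = lambda: complex(rng.gauss(0, 1), rng.gauss(0, 1))
    G = [[g() for _ in range(n)] for _ in range(n)]
    # Hermitian part of a random complex matrix
    B = [[(G[i][j] + G[j][i].conjugate()) / 2 for j in range(n)]
         for i in range(n)]
    r = [g() for _ in range(n)]
    d = [g() for _ in range(n)]
    M1 = [[B[i][j] - (z if i == j else 0) for j in range(n)] for i in range(n)]
    M2 = [[M1[i][j] + r[i] * r[j].conjugate() for j in range(n)]
          for i in range(n)]
    R1, R2 = inverse(M1), inverse(M2)
    # tr((R1 - R2) A) reduces to a sum over the diagonal because A is diagonal
    lhs = abs(sum((R1[i][i] - R2[i][i]) * d[i] for i in range(n)))
    bound = max(abs(x) for x in d) / z.imag
    return lhs, bound
```

Calling `rank_one_check(seed=s)` for a few seeds returns pairs whose first entry should not exceed the second; the gap is typically large because the bound is a worst case over \(\mathbf {r}\).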
Lemma 3
((1.15) in Bai and Silverstein 2004) Let \(\mathbf {X}=(X_1,\ldots ,X_n)\), where \(X_i\)’s are i.i.d. complex random variables with mean zero and variance 1. Let \(\mathbf {A}=(a_{ij})_{n\times n}\) and \(\mathbf {B}=(b_{ij})_{n\times n}\) be complex matrices. Then the following identity holds:
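The identity itself is not reproduced above. As an assumption, (1.15) in Bai and Silverstein (2004) is usually written as \(\mathrm E (\mathbf {X}^*\mathbf {A}\mathbf {X}-\mathrm{tr}\mathbf {A})(\mathbf {X}^*\mathbf {B}\mathbf {X}-\mathrm{tr}\mathbf {B})=(\mathrm E |X_1|^4-|\mathrm E X_1^2|^2-2)\sum _i a_{ii}b_{ii}+|\mathrm E X_1^2|^2\mathrm{tr}\mathbf {A}\mathbf {B}^T+\mathrm{tr}\mathbf {A}\mathbf {B}\). A minimal sketch of an exact check is given below for Rademacher entries (\(X_i=\pm 1\) with equal probability, so \(\mathrm E |X_1|^4=1\) and \(\mathrm E X_1^2=1\)), where the expectation can be computed by enumerating all sign vectors. The function names are hypothetical.

```python
from itertools import product

def quad_form(M, x):
    """x^T M x for a real vector x."""
    n = len(x)
    return sum(M[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def lhs_rademacher(A, B):
    """Exact E (X'AX - tr A)(X'BX - tr B) over all 2^n sign vectors."""
    n = len(A)
    tr_a = sum(A[i][i] for i in range(n))
    tr_b = sum(B[i][i] for i in range(n))
    total = 0
    for x in product((-1, 1), repeat=n):
        total += (quad_form(A, x) - tr_a) * (quad_form(B, x) - tr_b)
    return total / 2 ** n

def rhs_identity(A, B, m4, m2_sq):
    """Right-hand side with m4 = E|X_1|^4 and m2_sq = |E X_1^2|^2."""
    n = len(A)
    diag = sum(A[i][i] * B[i][i] for i in range(n))
    tr_abt = sum(A[i][j] * B[i][j] for i in range(n) for j in range(n))
    tr_ab = sum(A[i][j] * B[j][i] for i in range(n) for j in range(n))
    return (m4 - m2_sq - 2) * diag + m2_sq * tr_abt + tr_ab

A = [[1, 2, 0], [-1, 3, 4], [0, 5, -2]]
B = [[2, 0, 1], [3, -1, 0], [1, 1, 4]]
gap = abs(lhs_rademacher(A, B) - rhs_identity(A, B, m4=1, m2_sq=1))
```

For Rademacher entries the fourth-moment coefficient equals \(-2\), which removes the diagonal contributions from both trace terms; `gap` should therefore vanish up to rounding.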
Lemma 4
(Lemma 2.7 of Bai and Silverstein 1998) For \(\mathbf {X}=(X_1,\ldots ,X_n)^{\prime }\) with i.i.d. standardized real or complex entries such that \(\mathrm E X_i=0\) and \(\mathrm E |X_i|^2=1\), and for \(\mathbf {C}\) an \(n\times n\) complex matrix, we have, for any \(p\ge 2\),
where \(K_p\) is a constant which depends upon \(p\) only.
Lemma 5
(Theorem 5.9 of Bai and Silverstein 2010) Suppose that the entries of the matrix \(\mathbf {X}=(X_{jk},j\le n,k\le N)\) are independent (not necessarily identically distributed) and satisfy
(1) \(\hbox {E}X_{jk}=0\),
(2) \(|X_{jk}|\le \sqrt{N}\delta _N,\)
(3) \(\max _{j,k}|\hbox {E}|X_{jk}|^2-\sigma ^2|\rightarrow 0\) as \(N\rightarrow \infty \), and
(4) \(\hbox {E}|X_{jk}|^l\le b(\sqrt{N}\delta _N)^{l-3}\) for all \(l\ge 3\),
where \(\delta _N\rightarrow 0\) and \(b>0\). Let \(\mathbf {S}_n=\mathbf {X}\mathbf {X}^*/N\). Then, for any \(x>\epsilon >0\) and integers \(j,k \ge 2\), we have
for some constant \(K>0\).
Lemma 6
(Lemma 8.20 in Bai and Silverstein 2010) If \(|z|<A\), \(v\ge CN^{-1/2}\) and \(l \ge 1\), then
where \(A\) is a positive constant, \(v_c=1-\sqrt{c}+\sqrt{v}\) and \(\Delta \equiv \Vert \hbox {E}F^{S_n}-F_{c} \Vert \).
Lemma 7
(Lemma 9.1 in Bai and Silverstein 2010) Suppose that \(X_i\), \(i=1,\ldots , n\), are independent, with \(\hbox {E}X_i=0\), \(\hbox {E}|X_i|^2=1\), \(\sup \hbox {E}|X_i|^4=\nu < \infty \) and \(|X_i|\le \eta \sqrt{n}\) with \(\eta >0\). Assume that \(\mathbf {A}\) is a complex matrix. Then for any given \(p\) such that \(2\le p \le b \log (n\nu ^{-1}\eta ^4 )\) and \(b>1\), we have
where \(\mathbf {\alpha }=(X_1,\ldots , X_n)^T\).
Lemma 8
Under the conditions of Theorem 1, for all \(|z|<A\) and \(\mathfrak {I}(z)\ge \eta _N N^{-1/2}\), there exists a constant \(K\), such that
where \(A\) is a positive constant.
Proof
From (3.3.1) in Bai and Silverstein (2010),
where the square root of a complex number is defined as
It is easy to verify that
where \(m_\mathrm{semi }\) denotes the Stieltjes transform of the semicircular law. By the fact that \(|m_\mathrm{semi }|\le 1\), we have
According to the relationship between \(b_0(z)\) and \(b_1(z)\),
we have
For \(\mathfrak {I}(z)\ge \eta _N N^{-1/2}\), from the fact that \(|\mathrm{E}\,\mathrm{tr} (\mathbf {A}_1^{-1}(z)-\mathbf {A}^{-1}(z))|\le \frac{1}{\mathfrak {I}(z)}\) and \(m_n(z)=n^{-1}\mathrm{tr}\, \mathbf {A}^{-1}(z)\), we can choose \(n\) large enough, such that
Thus,
Consequently, we obtain
\(\square \)
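The role of \(m_\mathrm{semi }\) in the proof above can be illustrated numerically. The sketch below is not the paper's display: it assumes the standard closed form of the Marčenko–Pastur Stieltjes transform with ratio \(c\) and unit variance, the convention that the square root of a complex number is the branch with positive imaginary part, and the change of variable \(w=(z-1-c)/\sqrt{c}\), under which \(1+zm(z)=m_\mathrm{semi }(w)/\sqrt{c}\). All function names are hypothetical.

```python
import cmath

def sqrt_upper(w):
    """Square root of w chosen to have nonnegative imaginary part."""
    s = cmath.sqrt(w)
    return s if s.imag >= 0 else -s

def m_semi(w):
    """Stieltjes transform of the semicircular law on [-2, 2], for Im w > 0."""
    return (-w + sqrt_upper(w * w - 4)) / 2

def m_mp(z, c):
    """Marcenko-Pastur Stieltjes transform (ratio c, unit variance),
    written through m_semi under the assumed change of variable."""
    w = (z - 1 - c) / c ** 0.5
    return (-1 + m_semi(w) / c ** 0.5) / z

def mp_residual(z, c):
    """Residual of the quadratic c z m^2 + (z + c - 1) m + 1 = 0."""
    m = m_mp(z, c)
    return abs(c * z * m * m + (z + c - 1) * m + 1)

# Grid of test points in the upper half-plane
grid = [complex(x, v) for x in (-1.0, 0.5, 1.0, 3.0, 6.0)
        for v in (0.01, 0.1, 1.0)]
```

On such a grid one can confirm that \(|m_\mathrm{semi }(w)|\le 1\), that `m_mp` has positive imaginary part, and that `mp_residual` is at rounding level, which is the mechanism by which a bound on \(m_\mathrm{semi }\) transfers to the Marčenko–Pastur transform.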
Lemma 9
If \(|b_1(z)|\le K\) and \(v>O(N^{-1/2})\), then for any fixed \(t>0\),
Proof
Note that if \(|b_1(z)\tilde{\xi }_1(z)|\le 1/2\), then
Thus,
By the \(C_r\)-inequality, Lemmas 6 and 7, for some \(\eta =\eta _NN^{-1/4}\) and \(p\ge \log N\), we have
For any fixed \(t>0\), when \(N\) is large enough so that \(\log \eta _N^{-1}>t+1\), it can be shown that
Similarly, we can show that \(P\left( |\tilde{\beta }_1(z)|>2K\right) =o(N^{-t})\) and \(\mathrm E |\xi _1(z)|^p=o(N^{-t})\), for \(p\ge \log N\). Therefore, by the Borel–Cantelli lemma, \(\beta _1(z)\) and \(\tilde{\beta }_1(z)\) are both almost surely bounded by a large constant. \(\square \)
Lemma 10
There exists a constant \(K\), such that \(\mathrm E \left| \mathbf {x}_n^*\mathbf {A}^{-1}(z)\mathbf {A}^{-1}(\bar{z})\mathbf {x}_n\right| \le K/v\).
Proof
Throughout the paper, we assume that \(c=\lim n/N\) is bounded away from 1. By (8.4.9) in Bai and Silverstein (2010), \(m(z)\) is then bounded by a constant, so that \(\mathrm E |m_n^H(z)|\le K\) for some constant \(K\). Therefore,
\(\square \)
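The factor \(1/v\) in Lemma 10 can be traced to the resolvent identity. As a sketch only, assume the notation \(\mathbf {A}(z)=\mathbf {S}_n-z\mathbf {I}\) and \(m_n^H(z)=\mathbf {x}_n^*\mathbf {A}^{-1}(z)\mathbf {x}_n\) with \(\Vert \mathbf {x}_n\Vert =1\) and \(v=\mathfrak {I}z>0\); these definitions are not restated in this appendix and are assumptions here. Since \(\mathbf {A}(\bar{z})-\mathbf {A}(z)=(z-\bar{z})\mathbf {I}=2iv\mathbf {I}\),

\[
\mathbf {A}^{-1}(z)-\mathbf {A}^{-1}(\bar{z})=\mathbf {A}^{-1}(z)\left( \mathbf {A}(\bar{z})-\mathbf {A}(z)\right) \mathbf {A}^{-1}(\bar{z})=2iv\,\mathbf {A}^{-1}(z)\mathbf {A}^{-1}(\bar{z}),
\]

and, because \(\mathbf {S}_n\) is Hermitian, \(\mathbf {x}_n^*\mathbf {A}^{-1}(\bar{z})\mathbf {x}_n=\overline{m_n^H(z)}\). Hence

\[
\mathbf {x}_n^*\mathbf {A}^{-1}(z)\mathbf {A}^{-1}(\bar{z})\mathbf {x}_n=\frac{m_n^H(z)-\overline{m_n^H(z)}}{2iv}=\frac{\mathfrak {I}\,m_n^H(z)}{v}\le \frac{|m_n^H(z)|}{v}.
\]

Under these assumptions, taking expectations and using \(\mathrm E |m_n^H(z)|\le K\) gives a bound of order \(K/v\).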
Lemma 11
\(|\nabla _2|\le O_p(N^{-\epsilon _0})\), when \(m=[N^{1/4+\epsilon _0}]\).
Proof
Integrating by parts, we get
Based on Theorem 1.6 in Xia et al. (2013), we know \(\Vert H^{S_n}-F_{c_n}\Vert =O_p(N^{-1/4})\), thus
where \(a_0=La_l+t\), \(b_0=Lb_r+t\), and \(L=\frac{1-2\epsilon }{b_r-a_l}\), \(t=\frac{(a_l+b_r)\epsilon -a_l}{b_r-a_l}\). For
Then using Taylor expansion, we obtain
where \(\xi _{k,y}\) is a number between \(k/m\) and \(y\).
Substituting the above equality into \(\tilde{f}^{\prime }_m(y)\) and again noticing \(\sum \nolimits _{k=0}^m {m\atopwithdelims ()k}y^k(1-y)^{m-k}\left( \frac{k}{m}-y\right) =0\), we obtain
Therefore,
\(\square \)
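The Bernstein weights used in this proof can be explored with a short script. The sketch below is only illustrative: it works with a generic function on \([0,1]\) rather than the transformed function \(\tilde{f}\) of the paper, and the helper names are hypothetical. It evaluates the Bernstein polynomial of degree \(m\) and the first centred moment \(\sum _{k=0}^m {m\atopwithdelims ()k}y^k(1-y)^{m-k}\left( \frac{k}{m}-y\right) \), which vanishes identically.

```python
from math import comb

def bernstein_weights(m, y):
    """Binomial weights C(m, k) y^k (1 - y)^(m - k), k = 0..m."""
    return [comb(m, k) * y ** k * (1 - y) ** (m - k) for k in range(m + 1)]

def bernstein(f, m, y):
    """Degree-m Bernstein polynomial of f evaluated at y in [0, 1]."""
    return sum(w * f(k / m) for k, w in enumerate(bernstein_weights(m, y)))

def first_moment(m, y):
    """Sum of weights times (k/m - y); zero up to rounding."""
    return sum(w * (k / m - y) for k, w in enumerate(bernstein_weights(m, y)))

def sup_error(f, m, points=101):
    """Maximum of |B_m f - f| over an equally spaced grid on [0, 1]."""
    return max(abs(bernstein(f, m, i / (points - 1)) - f(i / (points - 1)))
               for i in range(points))
```

For \(f(y)=y^2\) the Bernstein polynomial equals \(y^2+y(1-y)/m\) exactly, so the approximation error is of order \(1/m\); this is the familiar rate for smooth functions and gives a feel for why \(m\) must grow with \(N\) in the lemma.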
Lemma 12
Under the conditions of Theorem 1, we have
uniformly in \(\gamma _{mh}\cup \gamma _{mh}^{\prime }\), where the maximum is taken over all \(1\le i\le n\) and \(1\le j\le N\).
Proof
First, let \(\mathbf {e}_i\) \((1\le i\le n)\) be the \(n\)-vector whose \(i\)th element is one and whose other elements are zero, and let \(\mathbf {e}_i^{\prime }\) denote the transpose of \(\mathbf {e}_i\). Then, by Lemma 4 and the Cauchy–Schwarz inequality, we obtain
Similarly, we can show that
Second, by martingale inequality, for any \(\epsilon >0\), we have
Together with
Finally, we obtain
The proof of Lemma 12 is complete. \(\square \)
Xia, N., Bai, Z. Functional CLT of eigenvectors for large sample covariance matrices. Stat Papers 56, 23–60 (2015). https://doi.org/10.1007/s00362-013-0565-3
DOI: https://doi.org/10.1007/s00362-013-0565-3
Keywords
- Bernstein polynomial
- Central limit theorem
- Convergence rate
- Empirical spectral distribution
- Haar distribution
- Stieltjes transform