Abstract
In this paper, we establish the limit of empirical spectral distributions of quaternion sample covariance matrices. Motivated by Bai and Silverstein (Spectral analysis of large dimensional random matrices, Springer, New York, 2010) and Marčenko and Pastur (Matematicheskii Sbornik, 114:507–536, 1967), we extend the results for real or complex sample covariance matrices to the quaternion case. Suppose \(\mathbf X_n = ({x_{jk}^{(n)}})_{p\times n}\) is a quaternion random matrix. For each \(n\), the entries \(\{x_{jk}^{(n)}\}\) are independent quaternion random variables with a common mean \(\mu \) and variance \(\sigma ^2>0\). It is shown that the empirical spectral distribution of the quaternion sample covariance matrix \(\mathbf S_n=n^{-1}\mathbf X_n\mathbf X_n^*\) converges to the Marčenko–Pastur law as \(p\rightarrow \infty \), \(n\rightarrow \infty \) and \(p/n\rightarrow y\in (0,+\infty )\).
1 Introduction
In 1843, Hamilton described the hyper-complex numbers of rank 4, to which he gave the name quaternions (see Kuipers 1999). Research on quaternion matrices can be traced back to Wolf (1936). After a long dormant period, it gradually became clear that quaternions and quaternion matrices play important roles in quantum physics, robotics and artificial satellite attitude control, among other applications; see Adler (1995) and Finkelstein et al. (1962). Consequently, studies on quaternions have attracted considerable attention in recent years; see So et al. (1994), Zhang (1995), Kanzieper (2002), Akemann (2005), and Akemann and Phillips (2013), among others. In the following, we introduce the quaternion notation. A quaternion can be represented as a \(2 \times 2\) complex matrix
where \(i\) denotes the imaginary unit and the quaternion units can be represented as
The conjugate of \(x\) is defined as
and its norm as
More details can be found in Kuipers (1999), Zhang (1997), and Mehta (2004). Using the matrix representation (1) of quaternions, an \(n\times n\) quaternion matrix \(\mathbf X\) can be rewritten as a \(2n\times 2n\) complex matrix \(\psi (\mathbf X)\), and so we can deal with quaternion matrices as complex matrices for convenience. Denote \(\mathbf S=\frac{1}{n}\mathbf X\mathbf X^*\) and \(\psi (\mathbf S)=\frac{1}{n}\psi (\mathbf X)\psi (\mathbf X)^*\). It is known (see Zhang 1997) that the multiplicities of all the eigenvalues (obviously they are all real) of \(\psi (\mathbf S)\) are even. Taking one from each of the \(n\) pairs of eigenvalues of \(\psi (\mathbf S)\), the \(n\) values are defined to be the eigenvalues of \(\mathbf S\).
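The displayed representation, quaternion units, conjugate and norm referred to above were lost in extraction; they can be reconstructed in the standard convention (cf. Zhang 1997), which may differ from the authors' typesetting only in inessential ways:

$$\begin{aligned} x = a + b\,\mathbf i + c\,\mathbf j + d\,\mathbf k = \begin{pmatrix} a+bi &{} c+di \\ -c+di &{} a-bi \end{pmatrix},\qquad a,b,c,d\in \mathbb R, \end{aligned}$$(1)

$$\begin{aligned} \mathbf i = \begin{pmatrix} i &{} 0 \\ 0 &{} -i \end{pmatrix},\quad \mathbf j = \begin{pmatrix} 0 &{} 1 \\ -1 &{} 0 \end{pmatrix},\quad \mathbf k = \begin{pmatrix} 0 &{} i \\ i &{} 0 \end{pmatrix}, \end{aligned}$$

$$\begin{aligned} \bar{x} = a - b\,\mathbf i - c\,\mathbf j - d\,\mathbf k,\qquad \Vert x\Vert = \sqrt{a^2+b^2+c^2+d^2}. \end{aligned}$$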
In addition, in recent decades computing speed and storage capability have increased a thousandfold. Since many classical conclusions fail for high-dimensional data, a new theory is needed to analyze very large data sets. Fortunately, the theory of random matrices (RMT) offers a possible route for dealing with these problems. The sample covariance matrix is one of the most important random matrices in RMT, and its study can be traced back to Wishart (1928). Marčenko and Pastur (1967) proved that the empirical spectral distribution (ESD) of a large dimensional complex sample covariance matrix tends to the Marčenko–Pastur (M–P) law. Since then, many studies on large dimensional complex or real sample covariance matrices have followed. The reader is referred to the books Anderson et al. (2010), Bai and Silverstein (2010), and Mehta (2004) for more details.
Under the normality assumption, there are three classic random matrix models: the Gaussian orthogonal ensemble (GOE), in which all entries of the matrix are real normal random variables; the Gaussian unitary ensemble (GUE), in which all entries are complex normal random variables; and the Gaussian symplectic ensemble (GSE), in which all entries are quaternion normal random variables. Thanks to the explicit density function of each ensemble and the joint density of its eigenvalues, these models are well understood. Once the normality assumption is removed, satisfactory results are already available for the first two models. For quaternion matrices, however, there are only a few references (see Yin and Bai 2014; Yin et al. 2013, 2014).
In this paper, we prove that the ESD of the quaternion sample covariance matrix also converges to the M–P law. Because quaternion multiplication is not commutative, few works on the spectral properties of \(\mathbf X_n\) with quaternion entries can be found in the literature unless the random variables are normally distributed, in which case the joint density of the eigenvalues is available. The tool provided by Yin et al. (2013) makes the general quaternion case tractable. For the proof of this result, we first introduce two definitions, the ESD and the Stieltjes transform. Let \({\mathbf A}\) be a \(p \times p\) Hermitian matrix and denote its eigenvalues by \({s_j}, j = 1,2, \ldots , p\). The ESD of \({\mathbf A}\) is defined by
where \({I(D)}\) is the indicator function of an event \({D}\) and the Stieltjes transform of \({F^{\mathbf A}}(x)\) is given by
where \(z=u+\upsilon i\in \mathbb {C}^+\). Let \(g(x)\) and \({\mathbf m}_g(x)\) denote the density function and the Stieltjes transform of the M–P law, which are
and
respectively, where \(a = {\sigma ^2}{(1 - \sqrt{y} )^2}\), \(b = {\sigma ^2}{(1 + \sqrt{y} )^2}\). Here, the constant \(y\) is the limit of dimension \(p\) to sample size \(n\) ratio and \({\sigma ^2}\) is the scale parameter. If \(y > 1\), \(G(x)\), the distribution function of \(g(x)\), has a point mass \(1 - 1/y\) at the origin.
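The displays dropped from the two preceding paragraphs can be reconstructed in the standard form (cf. Bai and Silverstein 2010, Chapter 3); the branch and sign conventions below are the usual ones:

$$\begin{aligned} F^{\mathbf A}(x)=\frac{1}{p}\sum _{j=1}^{p}I(s_j\le x),\qquad {\mathbf m}_{F^{\mathbf A}}(z)=\int \frac{1}{x-z}\,dF^{\mathbf A}(x)=\frac{1}{p}\mathrm{tr}\left( {\mathbf A}-z{\mathbf I}_p\right) ^{-1}, \end{aligned}$$

$$\begin{aligned} g(x)={\left\{ \begin{array}{ll}\dfrac{1}{2\pi x y\sigma ^2}\sqrt{(b-x)(x-a)}, &{} a\le x\le b,\\ 0, &{} \text {otherwise},\end{array}\right. } \end{aligned}$$(2)

$$\begin{aligned} {\mathbf m}_g(z)=\frac{\sigma ^2(1-y)-z+\sqrt{(z-a)(z-b)}}{2yz\sigma ^2}, \end{aligned}$$(3)

where the square root branch is chosen so that \(\mathfrak {I}\,{\mathbf m}_g(z)>0\) for \(z\in \mathbb {C}^+\).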
Now, our main theorem can be described as follows.
Theorem 1
Let \(\mathbf X_n = \left( {x_{jk}^{(n)}}\right) \), \(j = 1, \ldots ,p, k= 1,\ldots ,n\). Suppose for each \(n\), \(\left\{ x_{jk}^{(n)}\right\} \) are independent quaternion random variables with a common mean \(\mu \) and variance \(\sigma ^2\). Assume that \(y_n=p/n \rightarrow y \in (0,\infty )\) and for any constant \(\eta > 0\),
Then, with probability one, the ESD of the sample covariance matrix \(\mathbf S_n=\frac{1}{n}{\mathbf X_n}{\mathbf X_n^*}\) converges to the M–P law in distribution which has density function (2) and a point mass \(1-1/y\) at the origin when \(y>1\). Here, superscript \(^*\) stands for the complex conjugate transpose.
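As an illustration (not part of the original paper), Theorem 1 can be checked numerically. The sketch below assumes standard normal quaternion components with \(\sigma ^2=1\): it builds the \(2p\times 2n\) complex representation \(\psi (\mathbf X_n)\), takes one eigenvalue from each pair of \(\psi (\mathbf S_n)\), and compares the first two moments of the ESD with the M–P values \(\sigma ^2=1\) and \(\sigma ^4(1+y)\).

```python
import numpy as np

rng = np.random.default_rng(0)

def psi_sample_matrix(p, n, rng):
    """2p x 2n complex representation psi(X) of a p x n quaternion matrix.

    Each quaternion entry x = a + b i + c j + d k maps to the 2x2 block
    [[a+bi, c+di], [-c+di, a-bi]].  Taking a, b, c, d ~ N(0, 1/4) gives
    E x = 0 and Var x = E||x||^2 = 1 (the sigma^2 = 1 case of Theorem 1).
    """
    a, b, c, d = rng.standard_normal((4, p, n)) / 2.0
    X = np.empty((2 * p, 2 * n), dtype=complex)
    X[0::2, 0::2] = a + 1j * b
    X[0::2, 1::2] = c + 1j * d
    X[1::2, 0::2] = -c + 1j * d
    X[1::2, 1::2] = a - 1j * b
    return X

p, n = 200, 400                      # y = p/n = 0.5
X = psi_sample_matrix(p, n, rng)
S = (X @ X.conj().T) / n             # psi(S_n), a 2p x 2p Hermitian matrix

ev = np.linalg.eigvalsh(S)           # sorted; each eigenvalue appears twice
pair_gap = np.abs(ev[0::2] - ev[1::2]).max()
lam = ev[0::2]                       # one eigenvalue from each pair

# First two moments of the ESD versus the M-P values 1 and 1 + y = 1.5;
# the support should be close to [a, b] with a = (1 - sqrt(y))^2 ~ 0.086
# and b = (1 + sqrt(y))^2 ~ 2.914.
m1, m2 = lam.mean(), (lam ** 2).mean()
```

Already at \(p=200\), \(n=400\) the sampled moments should match the M–P moments to within a few percent, and the eigenvalues of \(\psi (\mathbf S_n)\) occur in numerically coincident pairs.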
Remark 2
Without loss of generality, in the proof of Theorem 1, we assume that \( \sigma ^2=1\). Furthermore, one can see that removing the common mean of the entries of \(\mathbf X_n\) does not alter the limiting spectral distribution (LSD) of the sample covariance matrices. In fact, let
By Lemma 17, we have, for all large \(p\),
where \(\Vert f\Vert _{KS}=\sup _x|f(x)|\). Consequently, we assume that \(\mu =0\).
The paper is organized as follows. In Sect. 2, the structure of the inverse of certain quaternion-type matrices is established, which is the key tool in proving Theorem 1. Section 3 demonstrates the proof of the main theorem in two steps, and the Appendix collects some auxiliary lemmas used in the proofs.
2 Preliminaries
We shall use Lemma 2.5 of Yin et al. (2013) to prove our main result in the next section. To keep this work self-contained, the lemma is stated as follows.
Definition 3
A matrix is of Type-I, if it has the following structure:
Here, all the entries are complex.
Definition 4
A matrix is of Type-II, if it has the following structure:
Here, \(i=\sqrt{-1}\) denotes the usual imaginary unit and all the other entries are complex numbers.
Definition 5
A matrix is of Type-III, if it has the following structure:
Here, all the entries are complex.
Lemma 6
For all \(n\ge 1\), if a complex matrix \(\Omega _n\) is invertible and of Type-II, then \(\Omega _n^{-1}\) is a Type-I matrix.
The following corollary is immediate.
Corollary 7
For all \(n\ge 1\), if a complex matrix \(\Omega _n\) is invertible and of Type-III, then \(\Omega _n^{-1}\) is a Type-I matrix.
3 Proof of Theorem 1
In this section, we present the proof in two steps. The first is to truncate, centralize and rescale the random variables \(\{x_{jk}^{(n)}\}\), after which we may assume the additional conditions given in Remark 12. The second is the proof of Theorem 1 under these additional conditions. Throughout the remainder of this paper, \(C\) denotes a generic constant that may take different values at different places.
3.1 Truncation, centralization and rescaling
3.1.1 Truncation
Note that condition (4) is equivalent to: for any \(\eta > 0\),
Applying Lemma 15, one can select a sequence \({\eta _n} \downarrow 0\) such that (5) remains true when \(\eta \) is replaced by \(\eta _n\).
Lemma 8
Suppose that the assumptions of Theorem 1 hold. Truncate the variables \(x_{jk}^{(n)}\) at \({\eta _n}\sqrt{n}\), and denote the resulting variables by \(\widehat{x}_{jk}^{(n)}\), i.e., \(\widehat{x}_{jk}^{(n)}=x_{jk}^{(n)}I(\Vert x_{jk}^{(n)}\Vert \le {\eta _n}\sqrt{n})\). Also denote
Then, with probability 1,
Proof
Using Lemma 17, one has
Taking condition (5) into consideration, we get
and
Then, by Bernstein’s inequality (see Lemma 18), for all small \(\varepsilon > 0 \) and large \(n\), we obtain
which is summable. Combining (6), the above inequality with the Borel–Cantelli lemma, it follows that
This completes the proof of the lemma. \(\square \)
3.1.2 Centralization
Lemma 9
Suppose that the assumptions of Lemma 8 hold. Denote
Then, we obtain
where \(L(\cdot ,\cdot )\) denotes the Lévy distance.
Proof
Using Lemma 16 and condition (5), we have
To complete the proof of this lemma, we need to show that the first factor on the right-hand side of (7) is almost surely bounded. Applying Lemma 22, one has
This indicates by the Borel–Cantelli lemma
Moreover, we can similarly obtain
Now, turning to (7), for all large \(n\),
The proof of the lemma is complete. \(\square \)
3.1.3 Rescaling
Define
where \(\zeta _{jk}\) is a bounded quaternion random variable with \({ E}\zeta _{jk}=0\) and \(\mathrm{Var}\,\zeta _{jk}=1\), independent of all other variables.
Lemma 10
Write
Under the conditions assumed in Lemma 9, we have
Proof
(a): Our first goal is to show that
Let \({\mathcal E}_n\) be the set of pairs \((j,k)\) with \(\widetilde{\sigma }_{jk}^2 < \frac{1}{2}\) and let \(N_n=\sum \nolimits _{(j,k)\in {\mathcal E}_n} I\left( \widetilde{\sigma }_{jk}^2 <1/2\right) \) be the number of such pairs. Since \(\frac{1}{np}\sum \nolimits _{jk} \widetilde{\sigma }_{jk}^2 \rightarrow 1 \), we conclude that \(N_n=o(np)\). Owing to Lemma 16 and (8), we get
where \(K=N_n\) and \(u_h=\Vert \xi _{jk}-\widetilde{x}_{jk}^{(n)}\Vert ^2\). Using the fact that for all \(l\ge 1\), \(l!\ge (l/3)^l\), we have
Selecting \(m=[\log p]\), which implies \(\frac{2\eta _n^2m}{p}\rightarrow 0\), and noticing that \(\frac{6K}{np} \rightarrow 0\), one obtains for any fixed \(t, \varepsilon >0\),
From the inequality above with \(t=2\) and (9), it follows that
(b): Our next goal is to show that
Applying Lemma 16, we have
Using the fact
and Lemma 22, one gets
which is summable. Together with the Borel–Cantelli lemma, it follows that
(c): Finally, from (a) and (b), we can easily get the lemma. \(\square \)
Combining the results of Lemmas 8, 9, and 10, we have the following remarks.
Remark 11
For brevity, we shall drop the superscript (n) from the variables. Also the truncated and renormalized variables are still denoted by \(x_{jk}\).
Remark 12
Under the conditions assumed in Theorem 1, we can further assume that
-
(1)
\(\Vert x_{jk}\Vert \le \eta _n\sqrt{n}\),
-
(2)
\({ E}(x_{jk})=0 \) and \( \mathrm{Var}(x_{jk})=1\).
3.2 Completion of the proof
Denote
where \(z=u+\upsilon i\in \mathbb {C}^+\).
3.2.1 Random part
First, we show that
Let \(\varvec{\pi }_j\) denote the \(j\)th column of \(\mathbf X_n\) (viewed as a \(2p\times 2n\) complex matrix), \(\mathbf S_n^k = {\mathbf S}_n - \frac{1}{n}{\varvec{\pi }} _k\varvec{\pi }_k^*\) and \({{ E}_k}( \cdot )\) denote the conditional expectation given \(\{ {{\varvec{\pi }_{k + 1}},{\varvec{\pi }_{k + 2}}, \ldots ,{\varvec{\pi }_{2n}}} \}\). Then,
where
-
1.
When \(k = 2t - 1\ (t =1,2, \ldots ,n)\), since the \(k\)th column is a function of the \((k+1)\)th column, we obtain
$$\begin{aligned} {\gamma _k} =&\,{{\mathrm{E}_{k - 1}}\mathrm{tr}{{\left( {\mathbf S_n} - z{\mathbf I_{2p}}\right) }^{ - 1}} - } {{ E}_k}\mathrm{tr}{\left( {\mathbf S_n} - z{\mathbf I_{2p}}\right) ^{ - 1}} = 0. \end{aligned}$$ -
2.
When \(k = 2t\ (t =1,2, \ldots ,n)\), together with the formula
$$\begin{aligned} {({\mathbf A} + {\varvec{\alpha }} {{\varvec{\beta }}^*})^{ - 1}} = {{\mathbf A}^{ - 1}} - \frac{{{{\mathbf A}^{ - 1}}{\varvec{\alpha } } {{\varvec{\beta }} ^ * }{{\mathbf A}^{ - 1}}}}{{1 + {{\varvec{\beta }}^ * }{{\mathbf A}^{ - 1}}{\varvec{\alpha } }}}, \end{aligned}$$one finds
$$\begin{aligned} {\gamma _k} =&\,\left( {{ E}_{k - 1}} - {{ E}_k}\right) \left[ \mathrm{tr}{({\mathbf S_{n}} - z{\mathbf I_{2p}})^{ - 1}} - \mathrm{tr}{\left( \mathbf S_n^k - z{\mathbf I_{2p}}\right) ^{ - 1}}\right] \\ =\,&({{ E}_{k - 1}} - {{ E}_k})\frac{\frac{1}{n}{\varvec{\pi }_{k}^ * {{\left( \mathbf S_n^k - z{\mathbf I_{2p}}\right) }^{ - 2}}{\varvec{\pi }_{k}}}}{{1 +\frac{1}{n} \varvec{\pi }_{k}^ * {{\left( \mathbf S_n^k - z{\mathbf I_{2p}}\right) }^{ - 1}}{\varvec{\pi }_{k}}}}. \end{aligned}$$Since
$$\begin{aligned}&\left| \frac{\frac{1}{n}{\varvec{\pi }_{k}^ * {{\left( \mathbf S_n^k - z{\mathbf I_{2p}}\right) }^{ - 2}}{\varvec{\pi }_{k}}}}{{1 +\frac{1}{n} \varvec{\pi }_{k}^ * {{\left( \mathbf S_n^k - z{\mathbf I_{2p}}\right) }^{ - 1}}{\varvec{\pi }_{k}}}} \right| \\&\quad \le \frac{\frac{1}{n}{\varvec{\pi }_{k}^ * {{\left( {{\left( \mathbf S_n^k - u{\mathbf I_{2p}}\right) }^2} + {\upsilon ^2}{\mathbf I_{2p}}\right) }^{ - 1}}{\varvec{\pi }_{k}}}}{{\mathfrak {I}\left( 1 +\frac{1}{n} \varvec{\pi }_{k}^ * {{(\mathbf S_n^k - z{\mathbf I}_{2p})}^{ - 1}}{\varvec{\pi }}_{k}\right) }}\\&\quad = \frac{1}{\upsilon }, \end{aligned}$$we can easily get
$$\begin{aligned} \left| \gamma _k\right| \le \frac{2}{\upsilon }. \end{aligned}$$
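The rank-one resolvent identity displayed in item 2 above is the Sherman–Morrison formula. As a quick numerical sanity check (dimensions and seed are arbitrary, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
m = 6

# A well-conditioned complex matrix plus a random rank-one perturbation.
A = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m)) + m * np.eye(m)
alpha = rng.standard_normal((m, 1)) + 1j * rng.standard_normal((m, 1))
beta = rng.standard_normal((m, 1)) + 1j * rng.standard_normal((m, 1))

Ainv = np.linalg.inv(A)
denom = 1.0 + (beta.conj().T @ Ainv @ alpha).item()

# (A + alpha beta*)^{-1} = A^{-1} - A^{-1} alpha beta* A^{-1} / (1 + beta* A^{-1} alpha)
lhs = np.linalg.inv(A + alpha @ beta.conj().T)
rhs = Ainv - (Ainv @ alpha @ beta.conj().T @ Ainv) / denom

err = np.abs(lhs - rhs).max()
```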
Using Lemma 21, it follows that
Combining the Borel–Cantelli lemma with the Chebyshev inequality, we conclude
3.2.2 Mean convergence
When \(\sigma ^2=1\), (3) turns into
Next, we show that
where \({\mathbf X}_{nk}\) is the matrix resulting from deleting the \(k\)th quaternion row of \(\mathbf X_n\), and \({\varvec{\phi }}_k^{\prime }\) is the vector obtained from the \(k\)th quaternion row of \({\mathbf X}_n\). Here, superscript \(^{\prime }\) only stands for the transpose and \({\varvec{\phi }}_k^{\prime }\) is a \(1\times n\) quaternion matrix. Write
and
where \(y_n=p/n\). This implies that
Solving \({ E}{{ m}_n}(z)\) from the equation above, we get
As proved in Eq. (3.17) of Bai (1993), we can assert that
Comparing (12) with (16), it suffices to show that
For this purpose, we need the following two lemmas.
Lemma 13
Under the conditions of Remark 12, for any \(z=u+vi\) with \(v>0\) and for any \(k=1,\ldots ,p\), we have
Proof
By calculation, we have
where the last inequality has used Lemma 19 twice. Then, the proof is complete. \(\square \)
Lemma 14
Under the conditions of Remark 12, for any \(z=u+vi\) with \(v>0\) and any \(k=1,\ldots ,p\), we have
Proof
Write the form of \(({\mathbf S_{n}} - z{\mathbf I_{2p}})\)
Let \(\mathbf R_k={\left( \frac{1}{n}{\mathbf X_{nk}}{\mathbf X_{nk}^*} - z{\mathbf I_{2p - 2}}\right) }^{-1}\). By Corollary 7 and (13), \(\frac{1}{n}\varvec{\phi }_k^{\prime }\bar{\varvec{\phi }}_k - z{\mathbf I_2} - \frac{1}{{{n^2}}}\varvec{\phi }_k^{\prime }{\mathbf X}_{nk}^*{\mathbf R_k}{\mathbf X_{n k}}\bar{\varvec{\phi }}_k\) is a scalar matrix. Denote by \(\varvec{\alpha }_k\) the first column of \(\varvec{\phi }_k\) and by \(\varvec{\beta }_k\) its second column; then combining (14) we have
where
Let \(\widetilde{ E}(\cdot )\) denote the conditional expectation given \(\{ x_{jl},j=1,\ldots ,p,l=1,\ldots ,n;l\ne k\}\), then we get
According to the inequality above, we proceed to complete the estimation of \({ E}|\mathrm{tr}\varvec{\varepsilon }_k^2|\) by the following three steps.
-
(a)
For the first term of the right-hand side of (19), denote \(\mathbf T=(t_{jl})=\mathbf I_{2n}-\frac{1}{n}{\mathbf X}_{nk}^*{\mathbf R_k}{\mathbf X_{nk}}\) where \(t_{jl}=\left( \begin{array}{c@{\quad }c}e_{jl}&{}f_{jl}\\ h_{jl}&{}g_{jl}\end{array}\right) \). Then, rewrite
$$\begin{aligned} \mathrm{tr}\varvec{\varepsilon }_k-\widetilde{ E}\mathrm{tr}\varvec{\varepsilon }_k&=\mathrm{tr}\left( \frac{1}{n}\varvec{\phi }_k^{\prime }\bar{\varvec{\phi }}_k - \frac{1}{{{n^2}}}\varvec{\phi }_k^{\prime }{\mathbf X}_{nk}^*{\mathbf R_k}{\mathbf X_{nk}}\bar{\varvec{\phi }}_k\right) -\mathrm{tr}\left( \mathbf I_2 - \frac{1}{{{n^2}}}{\mathbf X}_{nk}^*{\mathbf R_k}{\mathbf X_{nk}}\right) \\&=\frac{1}{n}\mathrm{tr}\left( \varvec{\phi }_k^{\prime }\mathbf T\bar{\varvec{\phi }}_k-\mathbf T\right) \\&=\frac{1}{n}\left( \sum _{j=1}^{n}\mathrm{tr}(\Vert x_{kj}\Vert ^2-1)t_{jj}+\sum _{j\ne l}^{}\mathrm{tr}(x_{kl}^*x_{kj}t_{jl})\right) \!. \end{aligned}$$By elementary calculation, we obtain
$$\begin{aligned}&\widetilde{ E}|\mathrm{tr}\varvec{\varepsilon }_k-\widetilde{ E}\mathrm{tr}\varvec{\varepsilon }_k|^2\nonumber \\&\quad =\frac{1}{n^2}\bigg (\sum _{j=1}^{n}\widetilde{ E}|\mathrm{tr}(\Vert x_{kj}\Vert ^2-1)t_{jj}|^2+\sum _{j\ne l}^{}\widetilde{ E}\left[ \mathrm{tr}\left( x_{kl}^*x_{kj}t_{jl}\right) \mathrm{tr}(x_{kj}^*x_{kl}t_{jl}^*)\right. \nonumber \\&\left. \qquad +\mathrm{tr}\left( x_{kl}^*x_{kj}t_{jl}\right) \mathrm{tr}\left( x_{kl}^*x_{kj}t_{lj}^*\right) \right] \bigg )\nonumber \\&\quad \le \frac{1}{n^2}\left( \sum _{j=1}^{n}\widetilde{ E}(\Vert x_{kj}\Vert ^2-1)^2|e_{jj}+g_{jj}|^2+2\sum _{j\ne l}\widetilde{ E}|\mathrm{tr}(x_{kl}^*x_{kj}t_{jl})|^2\right) \nonumber \\&\quad \le \frac{C}{n^2}\left( \eta _n^2n\sum _{j=1}^{n}(|e_{jj}|^2 +|g_{jj}|^2)+\sum _{j\ne l}^{}(|e_{jl}|^2+|f_{jl}|^2+|g_{jl}|^2+|h_{jl}|^2)\right) \nonumber \\&\quad \le \frac{C\eta _n^2}{n}\sum _{j=1}^{n}(|e_{jj}|^2 +|g_{jj}|^2)+\frac{C}{n^2}\sum _{j, l}^{}(|e_{jl}|^2+|f_{jl}|^2+|g_{jl}|^2+|h_{jl}|^2)\nonumber \\&\quad \le \frac{C\eta _n^2}{n}\mathrm{tr} \mathbf T \mathbf T^*+\frac{C}{n^2}\mathrm{tr} \mathbf T \mathbf T^*. \end{aligned}$$(20)For \(\frac{1}{\sqrt{n}}{\mathbf X}_{nk}\), there exists a \((2p-2)\times q\) orthonormal matrix \(\mathbf U\) and a \(2n\times q\) orthonormal matrix \(\mathbf V\) such that
$$\begin{aligned} \frac{1}{\sqrt{n}}{\mathbf X}_{nk}=\mathbf U \mathrm{diag}(s_1,\ldots ,s_q)\mathbf V^* \end{aligned}$$where \(s_1,\ldots ,s_q\) are the singular values of \(\frac{1}{\sqrt{n}}{\mathbf X}_{nk}\) and \(q=\min \{(2p-2),2n\}\). Then, we get
$$\begin{aligned} \mathbf I_{2n}-\mathbf T=&\left( \frac{1}{\sqrt{n}}{\mathbf X}_{nk}^*\right) {\mathbf R_k}\left( \frac{1}{\sqrt{n}}{\mathbf X}_{nk}\right) \\ =&\,\mathbf V \mathrm{diag}\left( \frac{s_1^2}{s_1^2-z},\cdots ,\frac{s_q^2}{s_q^2-z}\right) \mathbf V^* \end{aligned}$$which implies that
$$\begin{aligned} \mathbf T =\mathbf V \mathrm{diag} \left( \frac{-z}{s_1^2-z},\cdots ,\frac{-z}{s_q^2-z}\right) \mathbf V^*. \end{aligned}$$Consequently, it follows that
$$\begin{aligned} \mathrm{tr} \mathbf T \mathbf T^*=\sum _{j=1}^{q}\frac{|z|^2}{|s_j^2-z|^2}\le \frac{2n|z|^2}{\upsilon ^2}. \end{aligned}$$(21)Substituting (21) into (20), we conclude that$$\begin{aligned} { E}|\mathrm{tr}\varvec{\varepsilon }_k-\widetilde{ E}\mathrm{tr}\varvec{\varepsilon }_k|^2\rightarrow 0. \end{aligned}$$(22) -
(b)
Next, the second term on the right-hand side of (19) is estimated. Note that
$$\begin{aligned} \widetilde{ E}\mathrm{tr}\varvec{\varepsilon }_k-{ E}\mathrm{tr}\varvec{\varepsilon }_k=\frac{z}{n}\left( E\mathrm{tr}{\mathbf R_k}-\mathrm{tr}{\mathbf R_k}\right) . \end{aligned}$$Using the martingale decomposition method, we have
$$\begin{aligned} { E}|\widetilde{ E}\mathrm{tr}\varvec{\varepsilon }_k-{ E}\mathrm{tr}\varvec{\varepsilon }_k|^2=&\,\frac{|z|^2}{n^2}{ E}|E\mathrm{tr}{\mathbf R_k}-\mathrm{tr}{\mathbf R_k}|^2\le \frac{4|z|^2}{n\upsilon ^2}\rightarrow 0. \end{aligned}$$(23) -
(c)
Finally, combining (17), (19), (22), and (23), we conclude that
$$\begin{aligned} { E}|\mathrm{tr}\varvec{\varepsilon }_k^2|\rightarrow 0. \end{aligned}$$
This completes the proof of the lemma. \(\square \)
Now, we are in a position to show that
By (15) and (18), we can write
Note that
which implies that
Together with Lemma 14, (24), and
one finds that
Combining (25) with (26), we get
So far, we have completed the proof of the mean convergence
3.2.3 Completion of the proof of Theorem 1
By Sects. 3.2.1 and 3.2.2, for any fixed \(z\in \mathbb C^+\), we have
To complete the proof of Theorem 1, we need the argument from the last part of Chapter 2 of Bai and Silverstein (2010). For the reader's convenience, we repeat it here. That is, for each \(z\in \mathbb C^+\), there exists a null set \(N_z\) (i.e., \(\text{ P }\left( N_z\right) =0\)) such that
Now, let \(\mathbb C_0^+\) be a dense subset of \(\mathbb C^+\) (e.g., all \(z\) of rational real and imaginary parts) and let \(N=\bigcup _{z\in \mathbb C_0^+} N_{z}\). Then,
Let \(\mathbb C_m^+=\{z\in \mathbb C^+:\ \mathfrak {I}z >1/m,\ |z|\le m\}\). When \(z\in \mathbb C_m^+\), we have \(|{ m}_n\left( z\right) |\le m\). Applying Lemma 23, we have
Since the convergence above holds for every \(m\), we conclude that
Applying Lemma 24, we conclude that
References
Adler, S. L. (1995). Quaternionic quantum mechanics and quantum fields (Vol. 1). Oxford: Oxford University Press.
Akemann, G. (2005). The complex Laguerre symplectic ensemble of non-Hermitian matrices. Nuclear Physics B, 730(3), 253–299.
Akemann, G., Phillips, M. (2013). The interpolating Airy kernels for the beta = 1 and beta = 4 elliptic Ginibre ensembles. arXiv:1308.3418.
Anderson, G. W., Guionnet, A., Zeitouni, O. (2010). An introduction to random matrices (Vol. 118). Cambridge: Cambridge University Press.
Bai, Z. D. (1993). Convergence rate of expected spectral distributions of large random matrices. Part II. Sample covariance matrices. The Annals of Probability, 21(2), 649–672.
Bai, Z. D., Silverstein, J. W. (2010). Spectral analysis of large dimensional random matrices (2nd ed.). New York: Springer.
Finkelstein, D., Jauch, J. M., Schiminovich, S., Speiser, D. (1962). Foundations of quaternion quantum mechanics. Journal of Mathematical Physics, 3(2), 207.
Kanzieper, E. (2002). Eigenvalue correlations in non-Hermitean symplectic random matrices. Journal of Physics A: Mathematical and General, 35(31), 6631.
Kuipers, J. B. (1999). Quaternions and rotation sequences. Princeton: Princeton University Press.
Marčenko, V. A., Pastur, L. A. (1967). Distribution of eigenvalues for some sets of random matrices. Matematicheskii Sbornik, 114(4), 507–536.
Mehta, M. L. (2004). Random matrices (3rd ed., Vol. 142). Amsterdam: Elsevier.
So, W., Thompson, R. C., Zhang, F. (1994). The numerical range of normal matrices with quaternion entries. Linear and Multilinear Algebra, 37(1–3), 175–195.
Wishart, J. (1928). The generalised product moment distribution in samples from a normal multivariate population. Biometrika, 20(1/2), 32–52.
Wolf, L. A. (1936). Similarity of matrices in which the elements are real quaternions. Bulletin of the American Mathematical Society, 42(10), 737–743.
Yin, Y., Bai, Z. D. (2014). Convergence rates of the spectral distributions of large random quaternion self-dual Hermitian matrices. Journal of Statistical Physics, 157(6), 1207–1224.
Yin, Y., Bai, Z. D., Hu, J. (2013). On the semicircular law of large dimensional random quaternion matrices. Journal of Theoretical Probability (to appear). arXiv:1309.6937.
Yin, Y., Bai, Z. D., Hu, J. (2014). On the limit of extreme eigenvalues of large dimensional random quaternion matrices. Physics Letters A, 378(16–17), 1049–1058.
Zhang, F. (1995). On numerical range of normal matrices of quaternions. Journal of Mathematical and Physical Sciences, 29(6), 235–251.
Zhang, F. (1997). Quaternions and matrices of quaternions. Linear Algebra and Its Applications, 251, 21–57.
Z. D. Bai was partially supported by CNSF 11171057, the Fundamental Research Funds for the Central Universities, and PCSIRT; J. Hu was partially supported by a grant CNSF 11301063.
Appendix
In this section, some results are listed which are used in the proof of the main theorem.
Lemma 15
Suppose that for any \(\eta >0\), \(\sum _{n=1}^{\infty }f\left( \eta ,n\right) <\infty \); then we can select a slowly decreasing sequence of constants \(\eta _n\rightarrow 0\) such that
where \(f\) is a nonnegative function.
Similarly, if \(f(\eta ,n)\rightarrow 0\) for any fixed \(\eta >0\), then there exists a decreasing sequence \(\eta _n\rightarrow 0\) such that \(f(\eta _n,n)\rightarrow 0\).
Proof
Letting \(\eta =\frac{1}{m}\), one has \(\sum _{n=1}^{\infty }f\left( \frac{1}{m},n\right) <\infty .\) Moreover, there exists an increasing sequence \(N_m\) such that \(\sum _{n=N_m}^{\infty }f\left( \frac{1}{m},n\right) \le \frac{1}{2^m}\). Define the sequence \(\eta _n=\frac{1}{m}\) for \(N_m\le n<N_{m+1}\). We get
This completes the proof of this lemma.\(\square \)
Lemma 16
(Corollary A.42 of Bai and Silverstein 2010) Let \({\mathbf A}\) and \({\mathbf B}\) be two \(p \times n\) matrices and denote the ESD of \({\mathbf S} = {\mathbf A}{{\mathbf A}^ * }\) and \( \widetilde{\mathbf S} = {\mathbf B}{{\mathbf B}^ * }\) by \({F^{\mathbf S}}\) and \({F^{\widetilde{\mathbf S}}}\), respectively. Then,
where \(L(\cdot ,\cdot )\) denotes the Lévy distance, that is,
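The two displays of this lemma were lost in extraction; in Bai and Silverstein (2010, Corollary A.42) the bound and the Lévy distance read (a reconstruction, stated up to inessential constants):

$$\begin{aligned} L^4\left( F^{\mathbf S},F^{\widetilde{\mathbf S}}\right) \le \frac{2}{p^2}\,\mathrm{tr}\left( {\mathbf A}{\mathbf A}^*+{\mathbf B}{\mathbf B}^*\right) \mathrm{tr}\left[ ({\mathbf A}-{\mathbf B})({\mathbf A}-{\mathbf B})^*\right] , \end{aligned}$$

$$\begin{aligned} L(F,G)=\inf \left\{ \varepsilon >0:\ F(x-\varepsilon )-\varepsilon \le G(x)\le F(x+\varepsilon )+\varepsilon \ \text { for all } x\right\} . \end{aligned}$$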
Lemma 17
(Theorem A.44 of Bai and Silverstein 2010) Let \({\mathbf A}\) and \({\mathbf B}\) be \(p \times n\) complex matrices. Then,
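The display lost here is the rank inequality of Theorem A.44 (a standard reconstruction, with \(\Vert f\Vert _{KS}=\sup _x|f(x)|\) as in Remark 2):

$$\begin{aligned} \left\Vert F^{{\mathbf A}{\mathbf A}^*}-F^{{\mathbf B}{\mathbf B}^*}\right\Vert _{KS}\le \frac{1}{p}\,\mathrm{rank}({\mathbf A}-{\mathbf B}). \end{aligned}$$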
Lemma 18
(Bernstein’s inequality) If \(\varvec{\tau }_1,\ldots ,\varvec{\tau }_n\) are independent random variables with means zero and uniformly bounded by \(b\), then, for any \(\varepsilon > 0\),
where \( B_n^2={ E}(\varvec{\tau }_1+\cdots +\varvec{\tau }_n)^2.\)
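The omitted display is the classical Bernstein bound (a reconstruction in its usual form):

$$\begin{aligned} P\left( \left| \varvec{\tau }_1+\cdots +\varvec{\tau }_n\right| \ge \varepsilon \right) \le 2\exp \left( -\frac{\varepsilon ^2}{2\left( B_n^2+b\varepsilon \right) }\right) . \end{aligned}$$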
Lemma 19
(see \(\left( A.1.12\right) \) of Bai and Silverstein 2010) Let \(z = u + iv\), \(v > 0\), and let \({\mathbf A}\) be an \(n \times n\) Hermitian matrix. Denote by \({{\mathbf A}_k}\) the \(k\)th major sub-matrix of \({\mathbf A}\) of order \((n-1)\), that is, the matrix resulting from deleting the \(k\)th row and column from \({\mathbf A}\). Then,
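The conclusion whose display was dropped is the standard interlacing bound (a reconstruction of (A.1.12)):

$$\begin{aligned} \left| \mathrm{tr}\left( {\mathbf A}-z{\mathbf I}\right) ^{-1}-\mathrm{tr}\left( {\mathbf A}_k-z{\mathbf I}\right) ^{-1}\right| \le \frac{1}{v}. \end{aligned}$$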
Lemma 20
Suppose that the matrix \(\varvec{\Sigma }\) has the partition as given by \(\begin{pmatrix}\varvec{\Sigma }_{11}&{}\quad \varvec{\Sigma }_{12}\\ \varvec{\Sigma }_{21}&{}\quad \varvec{\Sigma }_{22}\end{pmatrix}\). If \(\varvec{\Sigma }\) and \(\varvec{\Sigma }_{11}\) are nonsingular, then the inverse of \(\varvec{\Sigma }\) has the form
where \(\varvec{\Sigma }_{22.1}=\varvec{\Sigma }_{22}-\varvec{\Sigma }_{21}\varvec{\Sigma }_{11}^{-1}\varvec{\Sigma }_{12}\).
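The omitted display is the standard block-inverse (Schur complement) formula (a reconstruction consistent with the notation \(\varvec{\Sigma }_{22.1}\) above):

$$\begin{aligned} \varvec{\Sigma }^{-1}=\begin{pmatrix}\varvec{\Sigma }_{11}^{-1}+\varvec{\Sigma }_{11}^{-1}\varvec{\Sigma }_{12}\varvec{\Sigma }_{22.1}^{-1}\varvec{\Sigma }_{21}\varvec{\Sigma }_{11}^{-1} &{} -\varvec{\Sigma }_{11}^{-1}\varvec{\Sigma }_{12}\varvec{\Sigma }_{22.1}^{-1}\\ -\varvec{\Sigma }_{22.1}^{-1}\varvec{\Sigma }_{21}\varvec{\Sigma }_{11}^{-1} &{} \varvec{\Sigma }_{22.1}^{-1}\end{pmatrix}. \end{aligned}$$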
Lemma 21
(Burkholder’s inequality) Let \(\{ {{\mathbf X}_k}\} \) be a complex martingale difference sequence with respect to the increasing \(\sigma \)-field. Then, for \(p > 1,\)
Lemma 22
(Rosenthal’s inequality) Let \( {\mathbf X}_i \) be independent with zero means, then we have, for some constant \(C_k\),
Lemma 23
(Lemma 2.14 of Bai and Silverstein 2010) Let \(f_1,f_2,\ldots \) be analytic in \(D\), a connected open set of \(\mathbb C\), satisfying \(\left| f_n\left( z\right) \right| \le M\) for every \(n\) and \(z\) in \(D\), and suppose \(f_n\left( z\right) \) converges as \(n \rightarrow \infty \) for each \(z\) in a subset of \(D\) having a limit point in \(D\). Then, there exists a function \(f\) analytic in \(D\) for which \(f_n\left( z\right) \rightarrow f\left( z\right) \) and \(f_n^{\prime }\left( z\right) \rightarrow f^{\prime }\left( z\right) \) for all \(z\in D\). Moreover, on any set bounded by a contour interior to \(D\), the convergence is uniform and \(\{f_n^{\prime }\left( z\right) \}\) is uniformly bounded.
Lemma 24
(Theorem B.9 of Bai and Silverstein 2010) Assume that \(\left\{ G_n\right\} \) is a sequence of functions of bounded variation and \(G_n\left( -\infty \right) =0\) for all \(n\). Then,
where \(D\equiv \left\{ z\in \mathbb {C}:\mathfrak {I}z>0\right\} \) if and only if there is a function of bounded variation \(G\) with \(G\left( -\infty \right) =0\) and Stieltjes transform \({\mathbf m}\left( z\right) \) and such that \(G_n\rightarrow G\) vaguely.
Li, H., Bai, Z.D. & Hu, J. Convergence of empirical spectral distributions of large dimensional quaternion sample covariance matrices. Ann Inst Stat Math 68, 765–785 (2016). https://doi.org/10.1007/s10463-015-0514-0