On the distribution of sample scale-free scatter matrices

  • Regular Article
  • Statistical Papers

Abstract

This paper addresses certain distributional aspects of a scale-free scatter matrix, denoted by R, which stems from a matrix-variate gamma distribution having a positive definite scale parameter matrix B. Under the assumption that B is a diagonal matrix, a structural representation of the determinant of R is derived; the exact density functions of products and ratios of determinants of matrices possessing such a structure are obtained; and a closed form expression is given for the density function of R. Moreover, a novel procedure is utilized to establish that certain functions of the determinant of the sample scatter matrix are asymptotically distributed as chi-square or normal random variables. Then, representations of the density function of R that respectively involve multiple integrals, multiple series and Gauss’ hypergeometric function are provided for the general case of a positive definite scale parameter matrix, and an illustrative numerical example is presented. The results, which were derived by means of advanced mathematical techniques, also apply to the conventional sample correlation matrix, which is encountered in various multivariate inference contexts.

References

  • Bao Z, Pan G, Zhou W (2012) Tracy-Widom law for the extreme eigenvalues of sample correlation matrices. Electron J Probab 17(88):1–32. https://doi.org/10.1214/EJP.v17-1962

  • Dette H, Dörnemann N (2020) Likelihood ratio tests for many groups in high dimensions. J Multivar Anal 178:104605

  • Dörnemann N (2023) Likelihood ratio tests under model misspecification in high dimensions. J Multivar Anal 193:105122

  • Ermolaev VT, Rodyushkin KV (1999) The distribution function of the maximum eigenvalue of a sample correlation matrix of internal noise of antenna-array elements. Radiophys Quantum Electron 2(5):439–444 (UDC 621.396.67.01)

  • Fang C, Krishnaiah PR (1982) Asymptotic distributions of functions of the eigenvalues of some random matrices for nonnormal populations. J Multivar Anal 12:39–63

  • Farrell R (1985) Multivariate calculation. Springer, New York. https://doi.org/10.1007/978-1-4613-8528-8

  • Grote J, Kabluchko Z, Thäle C (2019) Limit theorems for random simplices in high dimensions. ALEA Latin Am J Probab Stat 16(1):141–177

  • Gupta AK, Nagar DK (2000) Matrix variate distributions. Chapman and Hall/CRC, Boca Raton

  • Gupta AK, Nagar DK (2004) Distribution of the determinant of the sample correlation matrix from a mixture normal model. Random Oper Stoch Equ 12(2):193–199

  • Heiny J, Johnston S, Prochno J (2022) Thin-shell theory for rotationally invariant random simplices. Electron J Probab 27:1–141

  • Heiny J, Mikosch T (2018) Almost sure convergence of the largest and smallest eigenvalues of high-dimensional sample correlation matrices. Stoch Process Appl 128:2779–2815

  • Heiny J, Yao J (2020) Limiting distributions for eigenvalues of sample correlation matrices from heavy-tailed populations. Preprint. arXiv:2003.03857v1

  • Jiang T (2019) Determinant of sample correlation matrix with application. Ann Appl Probab 29(3):1356–1397

  • Kollo T, Neudecker H (1993) Asymptotics of eigenvalues and unit-length eigenvectors of sample variance and correlation matrices. J Multivar Anal 47:283–334

  • Kollo T, Ruul K (2003) Approximations to the distribution of the sample correlation matrix. J Multivar Anal 85:318–334. https://doi.org/10.1016/S0047-259X(02)00037-4

  • Konishi S (1979) Asymptotic expansions of statistics based on the sample correlation matrix in principal component analysis. Hiroshima Math J 9:647–700

  • Mathai AM (1993) A handbook of generalized special functions for statistical and physical sciences. Oxford University Press, Oxford

  • Mathai AM, Haubold H (2008) Special functions for applied scientists. Springer, New York. https://doi.org/10.1007/978-0-387-75894-7

  • Mathai AM, Saxena RK, Haubold HJ (2010) The H-function: theory and applications. Springer, New York

  • Parolya N, Heiny J, Kurowicka D (2021) Logarithmic law of large random correlation matrix. Preprint. arXiv:2103.13900

  • Pham-Gia T, Choulakian V (2014) Distribution of the sample correlation matrix and applications. Open J Stat 4:330–344. https://doi.org/10.4236/ojs.2014.45033

  • Schott J (1997) Matrix analysis for statisticians. Wiley, New York

  • Taniguchi M, Krishnaiah PR (1987) Asymptotic distributions of functions of the eigenvalues of sample covariance matrix and canonical correlation matrix in multivariate time series. J Multivar Anal 22:156–176

Acknowledgements

We would like to express our sincere thanks to two reviewers for their insightful comments and valuable suggestions. The financial support of the Natural Sciences and Engineering Research Council of Canada is gratefully acknowledged by the second author.

Author information

Corresponding author

Correspondence to Serge B. Provost.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 172 KB)

Appendix

Proof of Theorem 3.4

We will need the asymptotic expansion of the gamma function, namely,

$$\begin{aligned} \Gamma (z+\delta )\!=\!\sqrt{2\pi }z^{z+\delta -\frac{1}{2}}\textrm{e}^{-z-\sum _{k=1}^{\infty }\frac{(-1)^kB_{k+1}(\delta )}{k(k+1)z^{k}}} \mathrm{as\ } |z|\rightarrow \infty \text{ and } ~\delta \text{ is } \text{ bounded, } \end{aligned}$$
(A.1)

where \(B_k(\delta )\), \(k=1,2,\ldots , \) are the Bernoulli polynomials. Lists of the first few Bernoulli polynomials and the first few Bernoulli numbers, along with the above expansion and further details, are given in Mathai (1993). Only \(B_2(\delta )\) and \(B_3(\delta )\) will be needed for deriving the asymptotic normality; they are given by

$$\begin{aligned} B_2(\delta )=\delta ^2-\delta +\frac{1}{6};~~B_3(\delta )=\delta ^3-\frac{3}{2}\delta ^2+\frac{1}{2}\delta . \end{aligned}$$
(A.2)
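The two polynomials in (A.2) can be regenerated from the Bernoulli numbers. The following is a minimal sketch, assuming the standard identity \(B_n(\delta )=\sum _{k=0}^{n}\binom{n}{k}B_k\,\delta ^{n-k}\) with the convention \(B_1=-\frac{1}{2}\); the function names are illustrative and do not come from the paper.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(n):
    """Return [B_0, ..., B_n] as exact fractions (convention B_1 = -1/2)."""
    numbers = [Fraction(1)]
    for m in range(1, n + 1):
        # Standard recurrence: sum_{k=0}^{m} C(m+1, k) B_k = 0 for m >= 1.
        acc = sum(comb(m + 1, k) * numbers[k] for k in range(m))
        numbers.append(-acc / (m + 1))
    return numbers


def bernoulli_poly_coeffs(n):
    """Coefficients of B_n(delta), highest power of delta first."""
    numbers = bernoulli_numbers(n)
    return [comb(n, k) * numbers[k] for k in range(n + 1)]


# Coefficients of delta^2, delta^1, delta^0 for B_2, and similarly for B_3.
print(bernoulli_poly_coeffs(2))
print(bernoulli_poly_coeffs(3))
```

Exact rational arithmetic is used so that the coefficients can be compared with (A.2) without rounding error.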

It was established that the h-th moment of |R| for the case where the scale parameter matrix B is diagonal is

$$\begin{aligned} E[|R|^h]=\frac{\Gamma (\alpha -\frac{1}{2}+h)\cdots \Gamma (\alpha -\frac{p-1}{2}+h)}{\Gamma (\alpha -\frac{1}{2})\cdots \Gamma (\alpha -\frac{p-1}{2})}\left[ \frac{\Gamma (\alpha )}{\Gamma (\alpha +h)}\right] ^{p-1}. \end{aligned}$$
(A.3)
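For real values of h, the moment expression (A.3) is easy to evaluate through log-gamma functions. The sketch below is only an illustration: the function name is hypothetical, h is restricted to real values, and it is assumed that \(\alpha -\frac{p-1}{2}+\min (h,0)>0\) so that every gamma argument is positive.

```python
import math


def log_moment_det_R(h, alpha, p):
    """log E[|R|^h] from (A.3) for real h and a diagonal scale matrix B."""
    total = 0.0
    for j in range(1, p):
        # Factor Gamma(alpha - j/2 + h) / Gamma(alpha - j/2), j = 1, ..., p-1.
        total += math.lgamma(alpha - j / 2 + h) - math.lgamma(alpha - j / 2)
    # Factor [Gamma(alpha) / Gamma(alpha + h)]^(p-1).
    total += (p - 1) * (math.lgamma(alpha) - math.lgamma(alpha + h))
    return total


# First moment of |R| for alpha = 5 with p = 2 and p = 3.
print(math.exp(log_moment_det_R(1.0, 5.0, 2)))
print(math.exp(log_moment_det_R(1.0, 5.0, 3)))
```

For h = 1 the gamma ratios reduce to \((\alpha -\frac{1}{2})/\alpha \) when p = 2 and to \((\alpha -\frac{1}{2})(\alpha -1)/\alpha ^2\) when p = 3, which provides a simple check on the implementation.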

Consider the two-term approximation of a gamma function which is obtained by expanding (A.1) up to \(k=1\), namely,

$$\begin{aligned} \Gamma (z+\delta )\approx \sqrt{2\pi }z^{z+\delta -\frac{1}{2}}\textrm{e}^{-z+\frac{1}{2z}(\delta ^2-\delta +\frac{1}{6})}\ \, \mathrm{as\ } |z|\rightarrow \infty \text{ and } ~\delta \text{ is } \text{ bounded }. \end{aligned}$$
(A.4)
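The quality of (A.4) can be gauged numerically for real arguments. The following sketch compares the logarithm of the right-hand side of (A.4), optionally augmented by the k = 2 term of (A.1), with the log-gamma function from the Python standard library; the function names and the test values z = 50, δ = 0.3 are arbitrary choices made for illustration.

```python
import math


def bernoulli_2(d):
    return d * d - d + 1 / 6


def bernoulli_3(d):
    return d ** 3 - 1.5 * d * d + 0.5 * d


def log_gamma_approx(z, delta, terms=2):
    """Logarithm of the approximation to Gamma(z + delta), for real z > 0.

    terms=2 reproduces (A.4); terms=3 also includes the k = 2 term of (A.1).
    """
    value = 0.5 * math.log(2 * math.pi) + (z + delta - 0.5) * math.log(z) - z
    value += bernoulli_2(delta) / (2 * z)           # k = 1 term
    if terms >= 3:
        value -= bernoulli_3(delta) / (6 * z * z)   # k = 2 term
    return value


z, delta = 50.0, 0.3
exact = math.lgamma(z + delta)
print(abs(log_gamma_approx(z, delta, 2) - exact))
print(abs(log_gamma_approx(z, delta, 3) - exact))
```

The error of the two-term form is of the order of the omitted term \(B_3(\delta )/(6z^2)\), and including that term reduces the error further.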

We now expand all the gamma functions appearing in (A.3) by making use of (A.4), assuming that \(|\alpha |\rightarrow \infty \) and letting \(h=it\), \(i=\sqrt{-1}\), \(t\) being a real parameter, so that \(E[|R|^{it}]=E[\textrm{e}^{it\ln |R|}]\) is the characteristic function of \(\ln |R|\). Then,

$$\begin{aligned} \left[ \frac{\Gamma (\alpha )}{\Gamma (\alpha +it)}\right] ^{p-1}\approx \alpha ^{-(p-1)it}\,\textrm{e}^{-\frac{(p-1)}{2\alpha }(-t^2-it)} \end{aligned}$$
(A.5)

and

$$\begin{aligned} \prod _{j=1}^{p-1}\frac{\Gamma (\alpha -\frac{j}{2}+it)}{\Gamma (\alpha -\frac{j}{2})}\approx \alpha ^{(p-1)it}\,\textrm{e}^{\frac{1}{2\alpha }[-\frac{(p-1)(p+2)}{2}it-(p-1)t^2]}. \end{aligned}$$
(A.6)

On combining (A.5) and (A.6), the powers of \(\alpha \) and the terms in \(t^2\) cancel, and one has the following two-term approximation:

$$\begin{aligned} E[|R|^{it}]\approx \textrm{e}^{-\frac{1}{2\alpha }\frac{p(p-1)}{2}(it)}. \end{aligned}$$
(A.7)

Now, consider a three-term approximation of all the gamma functions when \(\alpha \) is large. The third factor in the asymptotic series of the gamma function is obtained from (A.1) by letting \(k=2\) and making use of (A.2). By taking \(\delta =0\) for the numerator gamma functions and \(\delta =it\) for the denominator gamma functions, the third factor can be approximated as follows:

$$\begin{aligned} \left[ \frac{\Gamma (\alpha )}{\Gamma (\alpha +it)}\right] ^{p-1}&\rightarrow \prod _{j=1}^{p-1}\textrm{e}^{-\frac{1}{6\alpha ^2}(0)}/\textrm{e}^{-\frac{1}{6\alpha ^2}[(it)^3-\frac{3}{2}(it)^2+\frac{1}{2}(it)]}\nonumber \\&=\textrm{e}^{\frac{(p-1)}{6\alpha ^2}[(it)^3-\frac{3}{2}(it)^2+\frac{1}{2}(it)]}. \end{aligned}$$
(A.8)

By taking \(\delta =-\frac{j}{2}+it\) for the numerator gamma functions and \(\delta =-\frac{j}{2}\) for the denominator gamma functions, the third factor in \(\prod _{j=1}^{p-1}\frac{\Gamma (\alpha -\frac{j}{2}+it)}{\Gamma (\alpha -\frac{j}{2})}\) tends to

$$\begin{aligned}&\prod _{j=1}^{p-1}\textrm{e}^{-\frac{1}{6\alpha ^2}[(-\frac{j}{2}+it)^3-\frac{3}{2}(-\frac{j}{2}+it)^2+\frac{1}{2}(-\frac{j}{2}+it)]} /\textrm{e}^{-\frac{1}{6\alpha ^2}[(-\frac{j}{2})^3-\frac{3}{2}(-\frac{j}{2})^2+\frac{1}{2}(-\frac{j}{2})]}\nonumber \\&\ =\textrm{e}^{-\frac{\beta }{6\alpha ^2}} \end{aligned}$$
(A.9)

where \(\beta =\tfrac{1}{8}(p-1)p(2p+5)(it)+\tfrac{3}{4}p(p-1)t^2+(p-1)(it)^3 -\tfrac{3}{2}(p-1)(it)^2+\tfrac{(p-1)}{2}(it)\).

Then, on multiplying (A.7), (A.8) and (A.9), the three-term approximation of \(E[|R|^h]\) is the following for \(h=it\):

$$\begin{aligned} E[|R|^{it}]&=E[\textrm{e}^{it\ln |R|}]\\&\approx \textrm{e}^{-\frac{p(p-1)}{4\alpha }(it)-\frac{1}{6\alpha ^2}[\frac{1}{8}(p-1)p(2p+5)(it)+\frac{3}{4}p(p-1)t^2]}, \end{aligned}$$

so that

$$\begin{aligned} E[\textrm{e}^{it(\alpha \ln |R|)}]&\approx \textrm{e}^{-\frac{p(p-1)}{4}(it)-\frac{1}{6\alpha }[\frac{1}{8}(p-1)p(2p+5)(it)]-\frac{1}{2}\frac{p(p-1)}{4}t^2}. \end{aligned}$$
(A.10)

One can readily verify that, for \(n=4,5,\ldots ,\) additional factors involving \(B_n(\delta )\) will converge to 1 when \(|\alpha |\rightarrow \infty \). Accordingly, when \(|\alpha |\rightarrow \infty ,\)

$$\begin{aligned} E[\textrm{e}^{it(\alpha \ln |R|+\frac{p(p-1)}{4})}]&\rightarrow \textrm{e}^{-\frac{1}{2}\frac{p(p-1)}{4}t^2}, \end{aligned}$$

which implies that

$$\begin{aligned} E[\textrm{e}^{it\frac{{2}}{\sqrt{p(p-1)}}(\alpha \ln |R|+\frac{p(p-1)}{4})}]&\rightarrow \textrm{e}^{-\frac{t^2}{2}}, \end{aligned}$$

so that

$$\begin{aligned} \tfrac{{2}}{\sqrt{p(p-1)}}(\alpha \ln |R|+\tfrac{p(p-1)}{4})&\rightarrow \mathcal{N}_1(0,1).\end{aligned}$$
(A.11)

Theorem 3.4 follows from (A.7) to (A.11).
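As a check on the algebra behind (A.8) and (A.9), the closed form of \(\beta \) can be compared numerically with the sum of differences of \(B_3\) that defines it. The sketch below uses Python's built-in complex arithmetic; the function names and the test values of p and t are illustrative assumptions.

```python
def bernoulli_3(d):
    return d ** 3 - 1.5 * d ** 2 + 0.5 * d


def beta_direct(p, t):
    """Sum over j of B_3(-j/2 + it) - B_3(-j/2), the exponent in (A.9)."""
    x = 1j * t
    return sum(bernoulli_3(-j / 2 + x) - bernoulli_3(-j / 2) for j in range(1, p))


def beta_closed_form(p, t):
    """Closed form of beta stated after (A.9)."""
    x = 1j * t
    return ((p - 1) * p * (2 * p + 5) * x / 8 + 0.75 * p * (p - 1) * t ** 2
            + (p - 1) * x ** 3 - 1.5 * (p - 1) * x ** 2 + 0.5 * (p - 1) * x)


def combined_third_factor_exponent(p, t):
    """(p-1) B_3(it) - beta, i.e. 6 alpha^2 times the exponent of (A.8) x (A.9)."""
    return (p - 1) * bernoulli_3(1j * t) - beta_direct(p, t)


p, t = 4, 0.7
print(abs(beta_direct(p, t) - beta_closed_form(p, t)))
```

The combined exponent reduces to \(-[\tfrac{1}{8}(p-1)p(2p+5)(it)+\tfrac{3}{4}p(p-1)t^2]\), so that the cubic terms in \(it\) cancel and the coefficient of \(t^2\) is the one that yields the limiting variance \(p(p-1)/4\).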

Proof of Theorem 4.1

On integrating out \(s_{jj},\ j=1,\ldots ,p\), from S, one has

$$\begin{aligned} f_{R}(R)\textrm{d}R&=\frac{|B|^{\alpha }}{\Gamma _p(\alpha )}|R|^{\alpha -\frac{p+1}{2}}\int _0^{\infty }\cdots \int _0^{\infty }(s_{11}\cdots s_{pp})^{\alpha -1}\nonumber \\&\qquad \times \textrm{e}^{-\sum _{j=1}^pb_{jj}s_{jj}-2\sum _{i<j}b_{ij}s_{ij}}\textrm{d}s_{11}\wedge \ldots \wedge \textrm{d}s_{pp}\wedge \textrm{d}R, \end{aligned}$$
(A.12)

where the sum \(\sum _{i<j}b_{ij}s_{ij}\) appearing in the exponent can be re-expressed as follows, since \(s_{ij}=r_{ij}\sqrt{s_{ii}s_{jj}}\):

$$\begin{aligned} \sum _{i<j}b_{ij}s_{ij}&=\sum _{i<j}b_{ij}r_{ij}\sqrt{s_{ii}s_{jj}}\nonumber \\&=b_{12}r_{12}\sqrt{s_{11}s_{22}}+\cdots +b_{1,p-1}r_{1,p-1}\sqrt{s_{11} s_{p-1,p-1 }} +b_{1p}r_{1p}\sqrt{s_{11}s_{pp}} \nonumber \\&\ \ \ +b_{23}r_{23}\sqrt{s_{22}s_{33}}+\cdots +b_{2p}r_{2p}\sqrt{s_{22}s_{pp}}\nonumber \\&\ \ \ +\ \cdots \nonumber \\&\ \ \ +b_{p-1,p}r_{p-1,p}\sqrt{s_{p-1,p-1}s_{pp}}\,. \end{aligned}$$
(A.13)
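The substitution \(s_{ij}=r_{ij}\sqrt{s_{ii}s_{jj}}\) used in (A.13) can be illustrated on a small numerical example. In the sketch below, the matrices S and B are arbitrary symmetric inputs chosen for illustration, and the function names are hypothetical.

```python
import math


def scale_free(S):
    """Return R with r_ij = s_ij / sqrt(s_ii * s_jj), which has a unit diagonal."""
    p = len(S)
    return [[S[i][j] / math.sqrt(S[i][i] * S[j][j]) for j in range(p)]
            for i in range(p)]


def offdiag_sum_S(B, S):
    """Left-hand side of (A.13): sum over i < j of b_ij * s_ij."""
    p = len(S)
    return sum(B[i][j] * S[i][j] for i in range(p) for j in range(i + 1, p))


def offdiag_sum_R(B, S):
    """Right-hand side of (A.13): sum over i < j of b_ij * r_ij * sqrt(s_ii * s_jj)."""
    R = scale_free(S)
    p = len(S)
    return sum(B[i][j] * R[i][j] * math.sqrt(S[i][i] * S[j][j])
               for i in range(p) for j in range(i + 1, p))


S = [[4.0, 1.0, 0.5], [1.0, 9.0, -2.0], [0.5, -2.0, 16.0]]
B = [[2.0, 0.3, -0.1], [0.3, 1.5, 0.2], [-0.1, 0.2, 1.0]]
print(offdiag_sum_S(B, S), offdiag_sum_R(B, S))
```

Both sums agree up to rounding, which is all that (A.13) asserts; the point of the re-expression is that the diagonal elements \(s_{jj}\) are then separated from the elements of R.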

Now, on expanding the exponential terms, one has

$$\begin{aligned} \textrm{e}^{-2\sum _{i<j}b_{ij}s_{ij}}&=\prod _{i<j}\textrm{e}^{-2b_{ij}r_{ij}\sqrt{s_{ii}s_{jj}}}\nonumber \\&=\!\!\sum _{k_{12}=0}^{\infty }\!\!\frac{(-2b_{12}\,r_{12}\sqrt{s_{11}s_{22}})^{k_{12}}}{k_{12}!} \cdots \nonumber \\&\sum _{k_{p-1,p}=0}^{\infty }\!\!\!\!\frac{(-2b_{p-1,p}\,r_{p-1,p} \sqrt{s_{p-1,p-1}s_{pp}})^{k_{p-1,p}}}{k_{p-1,p}!}\nonumber \\&=\sum _{k=0}^{\infty }\ \sum _K\Big [ \prod _{i<j}\frac{(-2b_{ij}r_{ij})^{k_{ij}}}{k_{ij}!}\Big ]\, s_{11}^{\rho _1}\cdots s_{pp}^{\rho _p} \end{aligned}$$
(A.14)

where \(\sum _K\) denotes the sum over all \(k_{ij},\ i<j\), such that \(k=k_{12}+\cdots +k_{1p}+k_{23}+\cdots +k_{2p}+\cdots +k_{p-1,p}\) and \(\rho _r\) \(=\frac{1}{2}(k_{1r}+k_{2r}+\cdots +k_{r-1,r}+k_{r,r+1}+\cdots +k_{rp})\) with \(k_{ij}=0\) whenever \(i\ge j\). Note that all the possible combinations of nonnegative integer values that the \(k_{ij}\)’s can take on in the next-to-last equality are accounted for in the last one. Then, for example, the integrals over \(s_{11}\) and \(s_{rr}\) can be evaluated as follows whenever \(\Re (\alpha )>0\):

$$\begin{aligned} \int _{s_{11}=0}^{\infty }s_{11}^{\alpha -1+\rho _1}\textrm{e}^{-b_{11}s_{11}}\textrm{d}s_{11}&=b_{11}^{-[\alpha +\frac{1}{2}(k_{12}+\cdots +k_{1p})]}\Gamma (\alpha +\tfrac{1}{2}(k_{12}+\cdots +k_{1p})),\nonumber \\ \int _{s_{rr}=0}^{\infty }s_{rr}^{\alpha -1+\rho _r}\textrm{e}^{-b_{rr}s_{rr}}\textrm{d}s_{rr}&=b_{rr}^{-[\alpha +\frac{1}{2}(k_{1r}+\cdots +k_{r-1,r}+k_{r,r+1}+\cdots +k_{rp})]}\nonumber \\&\ \ \ \times \Gamma (\alpha +\tfrac{1}{2}(k_{1r}+\cdots +k_{r-1,r}+k_{r,r+1}+\cdots +k_{rp})), \end{aligned}$$
(A.15)

for \(r=2,\ldots ,p\). Therefore, the product of the integrals over \(s_{rr},\ r=1,\ldots ,p,\) can be written as

$$\begin{aligned}&\prod _{r=1}^pb_{rr}^{-(\alpha +\rho _r)}\Gamma (\alpha +\rho _r),\ \Re (\alpha )>0, \nonumber \\&\quad \rho _r=\tfrac{1}{2}(k_{1r}+\cdots +k_{r-1,r}+k_{r,r+1}+\cdots +k_{rp}). \end{aligned}$$
(A.16)
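The bookkeeping in (A.14) to (A.16) can be sketched in a few lines. In the following illustration, the indices \(k_{ij}\) are stored in a dictionary keyed by the pairs (i, j) with i < j (1-based), `c[(i, j)]` stands for \(-2b_{ij}r_{ij}\), and all function names and numerical values are assumptions made for the example. The truncated multiple series is compared with the product of exponentials on the first line of (A.14).

```python
import math
from itertools import combinations, product


def rho(k, p):
    """rho_r of (A.16): half the sum of the k_ij whose index pair contains r."""
    return [0.5 * sum(v for (i, j), v in k.items() if r in (i, j))
            for r in range(1, p + 1)]


def a16_product(k, b_diag, alpha):
    """prod_r b_rr^{-(alpha + rho_r)} Gamma(alpha + rho_r), as in (A.16)."""
    p = len(b_diag)
    return math.prod(b ** -(alpha + rr) * math.gamma(alpha + rr)
                     for b, rr in zip(b_diag, rho(k, p)))


def a14_truncated(c, s, k_max):
    """Multiple series of (A.14) with every k_ij truncated at k_max."""
    p = len(s)
    pairs = list(combinations(range(1, p + 1), 2))
    total = 0.0
    for ks in product(range(k_max + 1), repeat=len(pairs)):
        k = dict(zip(pairs, ks))
        coeff = math.prod(c[pr] ** kk / math.factorial(kk) for pr, kk in k.items())
        total += coeff * math.prod(sv ** rr for sv, rr in zip(s, rho(k, p)))
    return total


c = {(1, 2): -0.3, (1, 3): 0.4, (2, 3): -0.2}   # stands for -2 * b_ij * r_ij
s = [1.5, 0.8, 2.0]                             # s_11, s_22, s_33
exact = math.exp(sum(v * math.sqrt(s[i - 1] * s[j - 1]) for (i, j), v in c.items()))
print(exact, a14_truncated(c, s, 20))
```

For instance, with p = 3 and \(k_{12}=2,\ k_{13}=1,\ k_{23}=0\), one has \(\rho _1=\frac{3}{2}\), \(\rho _2=1\) and \(\rho _3=\frac{1}{2}\), whose sum equals \(k=3\).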

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Mathai, A.M., Provost, S.B. On the distribution of sample scale-free scatter matrices. Stat Papers 65, 121–138 (2024). https://doi.org/10.1007/s00362-022-01388-8

  • DOI: https://doi.org/10.1007/s00362-022-01388-8
