On the distribution of sample scale-free scatter matrices

Mathai, A. M.; Provost, Serge B.

doi:10.1007/s00362-022-01388-8

Regular Article
Published: 27 December 2022

Volume 65, pages 121–138, (2024)
Statistical Papers

Abstract

This paper addresses certain distributional aspects of a scale-free scatter matrix, denoted by R, that stems from a matrix-variate gamma distribution having a positive definite scale parameter matrix B. Under the assumption that B is a diagonal matrix, a structural representation of the determinant of R is derived; the exact density functions of products and ratios of determinants of matrices possessing such a structure are obtained; and a closed form expression is given for the density function of R. Moreover, a novel procedure is utilized to establish that certain functions of the determinant of the sample scatter matrix are asymptotically distributed as chi-square or normal random variables. Then, representations of the density function of R that respectively involve multiple integrals, multiple series and Gauss’ hypergeometric function are provided for the general case of a positive definite scale parameter matrix, and an illustrative numerical example is presented. The results, which are derived by means of advanced mathematical techniques, also apply to the conventional sample correlation matrix, which is encountered in various multivariate inference contexts.

References

Bao Z, Pan G, Zhou W (2012) Tracy-Widom law for the extreme eigenvalues of sample correlation matrices. Electron J Probab 88:1–32. https://doi.org/10.1214/EJP.v17-1962
Dette H, Dörnemann N (2020) Likelihood ratio tests for many groups in high dimensions. J Multivar Anal 178:104605
Dörnemann N (2023) Likelihood ratio tests under model misspecification in high dimensions. J Multivar Anal 193:105122
Ermolaev VT, Rodyushkin KV (1999) The distribution function of the maximum eigenvalue of a sample correlation matrix of internal noise of antenna-array elements. Radiophys Quantum Electron 2(5):439–444 (UDC 621.396.67.01)
Fang C, Krishnaiah PR (1982) Asymptotic distributions of functions of the eigenvalues of some random matrices for nonnormal populations. J Multivar Anal 12:39–63
Farrell R (1985) Multivariate calculation. Springer, New York. https://doi.org/10.1007/978-1-4613-8528-8
Grote J, Kabluchko Z, Thäle C (2019) Limit theorems for random simplices in high dimensions. ALEA Latin Am J Probab Stat 16(1):141–177
Gupta AK, Nagar DK (2000) Matrix variate distributions. Chapman & Hall/CRC, Boca Raton
Gupta AK, Nagar DK (2004) Distribution of the determinant of the sample correlation matrix from a mixture normal model. Random Oper Stoch Equ 12(2):193–199
Heiny J, Johnston S, Prochno J (2022) Thin-shell theory for rotationally invariant random simplices. Electron J Probab 27:1–141
Heiny J, Mikosch T (2018) Almost sure convergence of the largest and smallest eigenvalues of high-dimensional sample correlation matrices. Stoch Process Appl 128:2779–2815
Heiny J, Yao J (2020) Limiting distributions for eigenvalues of sample correlation matrices from heavy-tailed populations. Preprint. arXiv:2003.03857v1
Jiang T (2019) Determinant of sample correlation matrix with application. Ann Appl Probab 29(3):1356–1397
Kollo T, Neudecker H (1993) Asymptotics of eigenvalues and unit-length eigenvectors of sample variance and correlation matrices. J Multivar Anal 47:283–334
Kollo T, Ruul K (2003) Approximations to the distribution of the sample correlation matrix. J Multivar Anal 85:318–334. https://doi.org/10.1016/S0047-259X(02)00037-4
Konishi S (1979) Asymptotic expansions of statistics based on the sample correlation matrix in principal component analysis. Hiroshima Math J 9:647–700
Mathai AM (1993) A handbook of generalized special functions for statistical and physical sciences. Oxford University Press, Oxford
Mathai AM, Haubold H (2008) Special functions for applied scientists. Springer, New York. https://doi.org/10.1007/978-0-387-75894-7
Mathai AM, Saxena RK, Haubold HJ (2010) The H-function: theory and applications. Springer, New York
Parolya N, Heiny J, Kurowicka D (2021) Logarithmic law of large random correlation matrix. Preprint. arXiv:2103.13900
Pham-Gia T, Choulakian V (2014) Distribution of the sample correlation matrix and applications. Open J Stat 4:330–344. https://doi.org/10.4236/ojs.2014.45033
Schott J (1997) Matrix analysis for statisticians. Wiley, New York
Taniguchi M, Krishnaiah PR (1987) Asymptotic distributions of functions of the eigenvalues of sample covariance matrix and canonical correlation matrix in multivariate time series. J Multivar Anal 22:156–176

Acknowledgements

We would like to express our sincere thanks to two reviewers for their insightful comments and valuable suggestions. The financial support of the Natural Sciences and Engineering Research Council of Canada is gratefully acknowledged by the second author.

Author information

Authors and Affiliations

Department of Mathematics and Statistics, McGill University, Montreal, Canada
A. M. Mathai
Department of Statistical and Actuarial Sciences, The University of Western Ontario, London, Canada
Serge B. Provost

Corresponding author

Correspondence to Serge B. Provost.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Appendix

Proof of Theorem 3.4

We will need the asymptotic expansion of the gamma function, namely,

$$\begin{aligned} \Gamma (z+\delta )=\sqrt{2\pi }\,z^{z+\delta -\frac{1}{2}}\,\textrm{e}^{-z-\sum _{k=1}^{\infty }\frac{(-1)^kB_{k+1}(\delta )}{k(k+1)z^{k}}} \quad \text{as } |z|\rightarrow \infty \text{ with } \delta \text{ bounded,} \end{aligned}$$

(A.1)

where $B_k(\delta )$, $k=1,2,\ldots , $ are the Bernoulli polynomials. Lists of the first few Bernoulli polynomials and the first few Bernoulli numbers, along with the above expansion and further details, are given in Mathai (1993). In this case, only $B_2(\delta )$ and $B_3(\delta )$ will be needed for deriving the asymptotic normality. For illustrative purposes, we will list $B_k(\delta )$ for $k=2$ and 3:

$$\begin{aligned} B_2(\delta )=\delta ^2-\delta +\frac{1}{6};~~B_3(\delta )=\delta ^3-\frac{3}{2}\delta ^2+\frac{1}{2}\delta . \end{aligned}$$

(A.2)
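
The two polynomials in (A.2) can be reproduced from the standard recurrence for the Bernoulli numbers. The following Python sketch is only an illustration; the function names `bernoulli_numbers` and `bernoulli_poly` are hypothetical, and the convention $B_1=-\tfrac{1}{2}$ is assumed.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(n):
    """Return [B_0, ..., B_n] as exact fractions (convention B_1 = -1/2)."""
    B = [Fraction(1)]
    for m in range(1, n + 1):
        # Recurrence: sum_{j=0}^{m} C(m+1, j) * B_j = 0 for m >= 1.
        acc = sum(comb(m + 1, j) * B[j] for j in range(m))
        B.append(-acc / (m + 1))
    return B


def bernoulli_poly(n):
    """Coefficients [c_0, ..., c_n] of B_n(x) = sum_k C(n, k) * B_{n-k} * x^k."""
    B = bernoulli_numbers(n)
    return [comb(n, k) * B[n - k] for k in range(n + 1)]


# Coefficients are listed from the constant term upwards.
print(bernoulli_poly(2))  # constant, x, x^2
print(bernoulli_poly(3))  # constant, x, x^2, x^3
```

The coefficient lists returned for $n=2$ and $n=3$ correspond to $\delta^2-\delta+\tfrac{1}{6}$ and $\delta^3-\tfrac{3}{2}\delta^2+\tfrac{1}{2}\delta$, respectively.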

It was established that the $h$-th moment of $|R|$ for the case where the scale parameter matrix $B$ is diagonal is

$$\begin{aligned} E[|R|^h]=\frac{\Gamma (\alpha -\frac{1}{2}+h)\cdots \Gamma (\alpha -\frac{p-1}{2}+h)}{\Gamma (\alpha -\frac{1}{2})\cdots \Gamma (\alpha -\frac{p-1}{2})}\left[ \frac{\Gamma (\alpha )}{\Gamma (\alpha +h)}\right] ^{p-1}. \end{aligned}$$

(A.3)
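
For real $h$, the moment expression (A.3) can be evaluated numerically on the log scale. The sketch below is an illustration under that restriction; the function name `moment_det_R` is hypothetical, and the arguments are assumed to satisfy $\alpha >\frac{p-1}{2}$ and $\alpha +h>\frac{p-1}{2}$ so that every gamma function has a positive argument.

```python
from math import exp, lgamma


def moment_det_R(h, alpha, p):
    """Evaluate E[|R|^h] from (A.3) for real h and a diagonal scale matrix B."""
    log_m = 0.0
    # Product over j = 1, ..., p-1 of Gamma(alpha - j/2 + h) / Gamma(alpha - j/2).
    for j in range(1, p):
        log_m += lgamma(alpha - j / 2 + h) - lgamma(alpha - j / 2)
    # Factor [Gamma(alpha) / Gamma(alpha + h)]^(p-1).
    log_m += (p - 1) * (lgamma(alpha) - lgamma(alpha + h))
    return exp(log_m)


print(moment_det_R(1, 5.0, 2))  # first moment for p = 2
print(moment_det_R(1, 5.0, 3))  # first moment for p = 3
```

As a check, the recurrence $\Gamma (x+1)=x\,\Gamma (x)$ applied to (A.3) with $h=1$ gives $E[|R|]=\prod _{j=1}^{p-1}(\alpha -\frac{j}{2})/\alpha ^{p-1}$, that is, $0.9$ for $p=2$ and $0.72$ for $p=3$ when $\alpha =5$.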

Consider the two-term approximation of a gamma function which is obtained by expanding (A.1) up to $k=1$, namely,

$$\begin{aligned} \Gamma (z+\delta )\approx \sqrt{2\pi }\,z^{z+\delta -\frac{1}{2}}\,\textrm{e}^{-z+\frac{1}{2z}(\delta ^2-\delta +\frac{1}{6})} \quad \text{as } |z|\rightarrow \infty \text{ with } \delta \text{ bounded.} \end{aligned}$$

(A.4)
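
The accuracy gained from each additional factor of (A.1) can be examined numerically for real arguments. The sketch below is an illustration only: the helper names are hypothetical, the values $z=50$ and $\delta =0.3$ are arbitrary choices, and `terms` counts the leading factor plus the factors for $k=1$ and $k=2$, in keeping with the "two-term" and "three-term" terminology used here.

```python
from math import lgamma, log, pi


def bernoulli2(d):
    return d * d - d + 1 / 6


def bernoulli3(d):
    return d ** 3 - 1.5 * d ** 2 + 0.5 * d


def log_gamma_approx(z, delta, terms=2):
    """Logarithm of the expansion (A.1) truncated after `terms` factors."""
    val = 0.5 * log(2 * pi) + (z + delta - 0.5) * log(z) - z
    if terms >= 2:
        val += bernoulli2(delta) / (2 * z)       # k = 1 factor, as in (A.4)
    if terms >= 3:
        val -= bernoulli3(delta) / (6 * z * z)   # k = 2 factor
    return val


z, delta = 50.0, 0.3
exact = lgamma(z + delta)
errors = [abs(exact - log_gamma_approx(z, delta, t)) for t in (1, 2, 3)]
for t, err in zip((1, 2, 3), errors):
    print(t, err)
```

For these values, the absolute error in $\ln \Gamma (z+\delta )$ decreases with each added factor, which is the behaviour that the two-term and three-term approximations rely upon.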

We now expand all the gamma functions appearing in (A.3) by making use of (A.4), assuming that $|\alpha |\rightarrow \infty $ and writing $h=it$, where $i=\sqrt{-1}$ and $t$ is a real parameter, so that $E[|R|^{it}]=E[\textrm{e}^{it\ln |R|}]$ is the characteristic function of $\ln |R|$. Then,

$$\begin{aligned} \left[ \frac{\Gamma (\alpha )}{\Gamma (\alpha +it)}\right] ^{p-1}\approx \textrm{e}^{-\frac{(p-1)}{2\alpha }(-t^2-it)} \end{aligned}$$

(A.5)

and

$$\begin{aligned} \prod _{j=1}^{p-1}\frac{\Gamma (\alpha -\frac{j}{2}+it)}{\Gamma (\alpha -\frac{j}{2})}\approx \textrm{e}^{\frac{1}{2\alpha }[-\frac{(p-1)p}{2}it-(p-1)t^2]}. \end{aligned}$$

(A.6)

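On combining (A.5) and (A.6), in which the factors $\alpha ^{-(p-1)it}$ and $\alpha ^{(p-1)it}$ arising from the power term of (A.4) cancel each other and have been omitted, one has the following two-term approximation:

$$\begin{aligned} E[|R|^{it}]\approx \textrm{e}^{\frac{1}{2\alpha }\frac{p(p-1)}{2}(it)}. \end{aligned}$$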

(A.7)

Now, consider a three-term approximation of all the gamma functions when $\alpha $ is large. The third factor in the asymptotic series of the gamma function is obtained from (A.1) by letting $k=2$ and making use of the expression for $B_3(\delta )$ given in (A.2). By taking $\delta =0$ for the numerator gamma functions and $\delta =it$ for the denominator gamma functions, the third factor can be approximated as follows:

$$\begin{aligned} \left[ \frac{\Gamma (\alpha )}{\Gamma (\alpha +it)}\right] ^{p-1}&\rightarrow \prod _{j=1}^{p-1}\textrm{e}^{-\frac{1}{6\alpha ^2}(0)}/\textrm{e}^{-\frac{1}{6\alpha ^2}[(it)^3-\frac{3}{2}(it)^2+\frac{1}{2}(it)]}\nonumber \\&=\textrm{e}^{\frac{(p-1)}{6\alpha ^2}[(it)^3-\frac{3}{2}(it)^2+\frac{1}{2}(it)]}. \end{aligned}$$

(A.8)

By taking $\delta =-\frac{j}{2}+it$ for the numerator gamma functions and $\delta =-\frac{j}{2}$ for the denominator gamma functions, the third factor in $\prod _{j=1}^{p-1}\frac{\Gamma (\alpha -\frac{j}{2}+it)}{\Gamma (\alpha -\frac{j}{2})}$ tends to

$$\begin{aligned}&\prod _{j=1}^{p-1}\textrm{e}^{-\frac{1}{6\alpha ^2}[(-\frac{j}{2}+it)^3-\frac{3}{2}(-\frac{j}{2}+it)^2+\frac{1}{2}(-\frac{j}{2}+it)]} /\textrm{e}^{-\frac{1}{6\alpha ^2}[(-\frac{j}{2})^3-\frac{3}{2}(-\frac{j}{2})^2+\frac{1}{2}(-\frac{j}{2})]}\nonumber \\&\ =\textrm{e}^{-\frac{\beta }{6\alpha ^2}} \end{aligned}$$

(A.9)

where $\beta =\tfrac{1}{8}(p-1)p(2p+5)(it)+\tfrac{3}{4}p(p-1)t^2+(p-1)(it)^3 -\tfrac{3}{2}(p-1)(it)^2+\tfrac{(p-1)}{2}(it)$.

Then, the three-term approximation of $E[|R|^h]$ is the following for $h=it$:

$$\begin{aligned} E[|R|^{it}]&=E[\textrm{e}^{it\ln |R|}]\\&\approx \textrm{e}^{\frac{p(p-1)}{4\alpha }(it)-\frac{1}{6\alpha ^2}[\frac{1}{8}(p-1)p(2p+5)(it)+\frac{3}{4}p(p-1)t^2]}, \end{aligned}$$

so that

$$\begin{aligned} E[\textrm{e}^{it(\alpha \ln |R|)}]&\approx \textrm{e}^{\frac{p(p-1)}{4}(it)-\frac{1}{6\alpha }[\frac{1}{8}(p-1)p(2p+5)(it)]-\frac{1}{2}\frac{p(p-1)}{4}t^2}. \end{aligned}$$

(A.10)

One can readily verify that, for $n=4,5,\ldots ,$ additional factors involving $B_n(\delta )$ will converge to 1 when $|\alpha |\rightarrow \infty $. Accordingly, when $|\alpha |\rightarrow \infty ,$

$$\begin{aligned} E[\textrm{e}^{it(\alpha \ln |R|-\frac{p(p-1)}{4})}]&\rightarrow \textrm{e}^{-\frac{1}{2}\frac{p(p-1)}{4}t^2}, \end{aligned}$$

which implies that

$$\begin{aligned} E[\textrm{e}^{it\frac{{2}}{\sqrt{p(p-1)}}(\alpha \ln |R|-\frac{p(p-1)}{4})}]&\rightarrow \textrm{e}^{-\frac{t^2}{2}}, \end{aligned}$$

so that

$$\begin{aligned} \tfrac{{2}}{\sqrt{p(p-1)}}(\alpha \ln |R|-\tfrac{p(p-1)}{4})&\rightarrow \mathcal{N}_1(0,1).\end{aligned}$$

(A.11)

Theorem 3.4 follows from (A.7) to (A.11).

Proof of Theorem 4.1

On integrating out $s_{jj},\ j=1,\ldots ,p$, from $S$, one has

$$\begin{aligned} f_{R}(R)\textrm{d}R&=\frac{|B|^{\alpha }}{\Gamma _p(\alpha )}|R|^{\alpha -\frac{p+1}{2}}\int _0^{\infty }\cdots \int _0^{\infty }(s_{11}\cdots s_{pp})^{\alpha -1}\nonumber \\&\qquad \times \textrm{e}^{-\sum _{j=1}^pb_{jj}s_{jj}-2\sum _{i<j}b_{ij}s_{ij}}\textrm{d}s_{11}\wedge \ldots \wedge \textrm{d}s_{pp}\wedge \textrm{d}R, \end{aligned}$$

(A.12)

where the sum $\sum _{i<j}b_{ij}s_{ij}$ appearing in the exponent can be re-expressed as follows:

$$\begin{aligned} \sum _{i<j}b_{ij}s_{ij}&=\sum _{i<j}b_{ij}r_{ij}\sqrt{s_{ii}s_{jj}}\nonumber \\&=b_{12}r_{12}\sqrt{s_{11}s_{22}}+\cdots +b_{1,p-1}r_{1,p-1}\sqrt{s_{11} s_{p-1,p-1 }} +b_{1p}r_{1p}\sqrt{s_{11}s_{pp}} \nonumber \\&\ \ \ +b_{23}r_{23}\sqrt{s_{22}s_{33}}+\cdots +b_{2p}r_{2p}\sqrt{s_{22}s_{pp}}\nonumber \\&\ \ \ +\ \cdots \nonumber \\&\ \ \ +b_{p-1,p}r_{p-1,p}\sqrt{s_{p-1,p-1}s_{pp}}\,. \end{aligned}$$

(A.13)
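
The substitution $s_{ij}=r_{ij}\sqrt{s_{ii}s_{jj}}$ used in (A.13) can be checked on a small numerical example. The sketch below is an illustration only; the function names and the $3\times 3$ matrices `S` and `B` are hypothetical choices, not values taken from the paper.

```python
from math import isclose, sqrt


def scale_free(S):
    """Return R with entries r_ij = s_ij / sqrt(s_ii * s_jj)."""
    p = len(S)
    return [[S[i][j] / sqrt(S[i][i] * S[j][j]) for j in range(p)]
            for i in range(p)]


def trace_BS(B, S):
    """tr(BS) for square matrices given as nested lists."""
    p = len(S)
    return sum(B[i][j] * S[j][i] for i in range(p) for j in range(p))


def exponent_via_R(B, S):
    """sum_j b_jj s_jj + 2 sum_{i<j} b_ij r_ij sqrt(s_ii s_jj), cf. (A.12)-(A.13)."""
    p = len(S)
    R = scale_free(S)
    diag = sum(B[j][j] * S[j][j] for j in range(p))
    off = sum(B[i][j] * R[i][j] * sqrt(S[i][i] * S[j][j])
              for i in range(p) for j in range(i + 1, p))
    return diag + 2 * off


# Arbitrary symmetric example matrices (assumed for illustration).
S = [[4.0, 1.0, 0.5], [1.0, 9.0, 2.0], [0.5, 2.0, 16.0]]
B = [[2.0, 0.3, 0.1], [0.3, 1.0, 0.2], [0.1, 0.2, 3.0]]
R = scale_free(S)
print(R[0][1], trace_BS(B, S), exponent_via_R(B, S))
```

Both routes give the same value of the exponent, and the diagonal elements of `R` are all equal to one, as expected for a scale-free scatter matrix.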

Now, on expanding the exponential terms, one has

$$\begin{aligned} \textrm{e}^{-2\sum _{i<j}b_{ij}s_{ij}}&=\prod _{i<j}\textrm{e}^{-2b_{ij}r_{ij}\sqrt{s_{ii}s_{jj}}}\nonumber \\&=\sum _{k_{12}=0}^{\infty }\frac{(-2b_{12}\,r_{12}\sqrt{s_{11}s_{22}})^{k_{12}}}{k_{12}!} \cdots \sum _{k_{p-1,p}=0}^{\infty }\frac{(-2b_{p-1,p}\,r_{p-1,p} \sqrt{s_{p-1,p-1}s_{pp}})^{k_{p-1,p}}}{k_{p-1,p}!}\nonumber \\&=\sum _{k=0}^{\infty }\ \sum _K\Big [ \prod _{i<j}\frac{(-2b_{ij}r_{ij})^{k_{ij}}}{k_{ij}!}\Big ]\, s_{11}^{\rho _1}\cdots s_{pp}^{\rho _p} \end{aligned}$$

(A.14)

where $\sum _K$ denotes the sum over all $k_{ij},\ i<j$, such that $k=k_{12}+\cdots +k_{1p}+k_{23}+\cdots +k_{2p}+\cdots +k_{p-1,p}$, and $\rho _r=\frac{1}{2}(k_{1r}+k_{2r}+\cdots +k_{r-1,r}+k_{r,r+1}+\cdots +k_{rp})$, with $k_{ij}=0$ whenever $i\ge j$. Note that all the possible combinations of nonnegative integer values that the $k_{ij}$’s can take on in the next-to-last equality are accounted for in the last one. Then, for example, the integrals over $s_{11}$ and $s_{rr}$ can be evaluated as follows whenever $\Re (\alpha )>0$:

$$\begin{aligned} \int _{s_{11}=0}^{\infty }s_{11}^{\alpha -1+\rho _1}\textrm{e}^{-b_{11}s_{11}}\textrm{d}s_{11}&=b_{11}^{-[\alpha +\frac{1}{2}(k_{12}+\cdots +k_{1p})]}\Gamma (\alpha +\tfrac{1}{2}(k_{12}+\cdots +k_{1p})),\nonumber \\ \int _{s_{rr}=0}^{\infty }s_{rr}^{\alpha -1+\rho _r}\textrm{e}^{-b_{rr}s_{rr}}\textrm{d}s_{rr}&=b_{rr}^{-[\alpha +\frac{1}{2}(k_{1r}+\cdots +k_{r-1,r}+k_{r,r+1}+\cdots +k_{rp})]}\nonumber \\&\ \ \ \times \Gamma (\alpha +\tfrac{1}{2}(k_{1r}+\cdots +k_{r-1,r}+k_{r,r+1}+\cdots +k_{rp})), \end{aligned}$$

(A.15)

for $r=2,\ldots ,p$. Therefore, the product of the integrals over $s_{rr},\ r=1,\ldots ,p,$ can be written as

$$\begin{aligned}&\prod _{r=1}^pb_{rr}^{-(\alpha +\rho _r)}\Gamma (\alpha +\rho _r),\ \Re (\alpha )>0, \nonumber \\&\quad \rho _r=\tfrac{1}{2}(k_{1r}+\cdots +k_{r-1,r}+k_{r,r+1}+\cdots +k_{rp}). \end{aligned}$$

(A.16)
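
The bookkeeping behind (A.14) and (A.16) can be illustrated numerically for $p=3$. In the sketch below, which is not the authors' implementation, the function names and all parameter values are hypothetical, real $\alpha $ is assumed, and the series is truncated by letting each $k_{ij}$ run from 0 to `kmax` rather than by grouping the terms according to their total $k$.

```python
from itertools import product
from math import exp, factorial, gamma, isclose, sqrt


def rho(k, p):
    """rho_r = half the sum of the k_ij whose index pair contains r (r = 1..p)."""
    return [0.5 * sum(v for (i, j), v in k.items() if r in (i, j))
            for r in range(1, p + 1)]


def series_A14(b, r, s, kmax):
    """Last line of (A.14), with every k_ij restricted to 0..kmax."""
    p = len(s)
    pairs = [(i, j) for i in range(1, p + 1) for j in range(i + 1, p + 1)]
    total = 0.0
    for ks in product(range(kmax + 1), repeat=len(pairs)):
        k = dict(zip(pairs, ks))
        term = 1.0
        for ij, kij in k.items():
            term *= (-2 * b[ij] * r[ij]) ** kij / factorial(kij)
        for s_rr, rho_r in zip(s, rho(k, p)):
            term *= s_rr ** rho_r
        total += term
    return total


def integral_product_A16(k, b_diag, alpha):
    """prod_r b_rr^{-(alpha + rho_r)} * Gamma(alpha + rho_r), as in (A.16)."""
    out = 1.0
    for b_rr, rho_r in zip(b_diag, rho(k, len(b_diag))):
        out *= b_rr ** (-(alpha + rho_r)) * gamma(alpha + rho_r)
    return out


# Hypothetical off-diagonal b_ij and r_ij (keys are 1-based pairs i < j).
b = {(1, 2): 0.3, (1, 3): 0.1, (2, 3): 0.2}
r = {(1, 2): 0.5, (1, 3): -0.2, (2, 3): 0.4}
s = [1.0, 2.0, 0.5]  # s_11, s_22, s_33
exact = exp(-2 * sum(b[ij] * r[ij] * sqrt(s[ij[0] - 1] * s[ij[1] - 1]) for ij in b))
approx = series_A14(b, r, s, kmax=12)
print(exact, approx)
print(rho({(1, 2): 2, (1, 3): 1, (2, 3): 3}, 3))
```

In this example the truncated series agrees with the exponential on the left-hand side of (A.14), and the $\rho _r$'s add up to $k$ since each $k_{ij}$ contributes one half to both $\rho _i$ and $\rho _j$.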

About this article

Cite this article

Mathai, A.M., Provost, S.B. On the distribution of sample scale-free scatter matrices. Stat Papers 65, 121–138 (2024). https://doi.org/10.1007/s00362-022-01388-8

Received: 18 October 2022
Revised: 12 December 2022
Accepted: 13 December 2022
Published: 27 December 2022
Issue Date: February 2024
DOI: https://doi.org/10.1007/s00362-022-01388-8
