Abstract
For a p-variate normal distribution with covariance matrix \( {\varvec{\Sigma }}\), the standardized generalized variance (SGV) is defined as the positive pth root of \( |{\varvec{\Sigma }}| \) and is used as a measure of variability. Testing equality of SGVs, in order to compare the variability of multivariate normal distributions with different dimensions, remains a matter of interest. The classical test for this problem is the likelihood ratio test (LRT). This article studies testing equality of the SGVs of k multivariate normal distributions with possibly unequal dimensions. To test this hypothesis, two approximations to the null distribution of the LRT statistic are proposed, based on the well-known Welch–Satterthwaite and Bartlett adjustment approximation methods. The high-dimensional behavior of these approximate distributions is also investigated. A wide simulation study first compares the performance of the proposed tests with that of the classical LRT in terms of type I error, power, and alpha-adjusted equivalents, and then evaluates the robustness of the procedures to departures from the normality assumption. Finally, the proposed methods are illustrated with two real data examples.
References
Andrews JL, McNicholas PD (2014) Variable selection for clustering and classification. J Classif 31(2):136–153
Arvanitis LG, Afonja B (1971) Use of the generalized variance and the gradient projection method in multivariate stratified sampling. Biometrics 27(1):119–127
Bagnato L, Greselin F, Punzo A (2014) On the spectral decomposition in normal discriminant analysis. Commun Stat-Simul Comput 43(6):1471–1489
Bartlett MS (1937) Properties of sufficiency and statistical tests. Proc R Soc Lond Ser A-Math Phys Sci 160(901):268–282
Behara M, Giri N (1983) Generalized variance statistic in the testing of hypothesis in complex multivariate Gaussian distributions. Archiv der Math 41(6):538–543
Bersimis S, Psarakis S, Panaretos J (2007) Multivariate statistical process control charts: an overview. Qual Reliab Eng Int 23(5):517–543
Bhandary M (1996) Test for generalized variance in signal processing. Stat Probab Lett 27(2):155–162
Billingsley P (2008) Probability and measure. Wiley, London
Boudt K, Rousseeuw PJ, Vanduffel S, Verdonck T (2017) The minimum regularized covariance determinant estimator. arXiv preprint arXiv:1701.07086
Campbell N, Mahon R (1974) A multivariate study of variation in two species of rock crab of the genus Leptograpsus. Aust J Zool 22(3):417–425
Christensen W, Rencher A (1997) A comparison of Type I error rates and power levels for seven solutions to the multivariate Behrens–Fisher problem. Commun Stat-Simul Comput 26(4):1251–1273
Djauhari MA (2005) Improved monitoring of multivariate process variability. J Qual Technol 37(1):32–39
Djauhari MA, Mashuri M, Herwindiati DE (2008) Multivariate process variability monitoring. Commun Stat-Theory Methods 37(11):1742–1754
Garcia-Diaz JC (2007) The ‘effective variance’ control chart for monitoring the dispersion process with missing data. Eur J Ind Eng 1(1):40–55
Gossett E (2009) Discrete mathematics with proof. Wiley, London
Greselin F, Ingrassia S, Punzo A (2011) Assessing the pattern of covariance matrices via an augmentation multiple testing procedure. Stat Methods Appl 20(2):141–170
Greselin F, Punzo A (2013) Closed likelihood ratio testing procedures to assess similarity of covariance matrices. Am Stat 67(3):117–128
Gupta AS (1982) Tests for simultaneously determining numbers of clusters and their shape with multivariate data. Stat Probab Lett 1(1):46–50
Hallin M, Paindaveine D (2009) Optimal tests for homogeneity of covariance, scale, and shape. J Multivar Anal 100(3):422–444
Iliopoulos G, Kourouklis S (1998) On improved interval estimation for the generalized variance. J Stat Plan Inference 66(2):305–320
Jacod J, Protter P (2003) Probability essentials. Springer, Berlin
Jafari AA (2012) Inferences on the ratio of two generalized variances: independent and correlated cases. Stat Methods Appl 21(3):297–314
Jiang D, Jiang T, Yang F (2012) Likelihood ratio tests for covariance matrices of high-dimensional normal distributions. J Stat Plan Inference 142(8):2241–2256
Jolicoeur P, Mosimann J (1960) Size and shape variation in the painted turtle. A principal component analysis. Growth 24(4):339–354
Korkmaz S, Goksuluk D, Zararsiz G (2014) MVN: an R package for assessing multivariate normality. R J 6(2):151–162
Kotz S, Nadarajah S (2004) Multivariate t-distributions and their applications. Cambridge University Press, Cambridge
Lawley D (1956) A general method for approximating to the distribution of likelihood ratio criteria. Biometrika 43(3/4):295–303
Lee MH, Khoo MB (2017) Combined synthetic and |S| chart for monitoring process dispersion. Commun Stat-Simul Comput, 1–14
Mardia K (1970) Measures of multivariate skewness and kurtosis with applications. Biometrika 57(3):519–530
McNicholas PD (2016) Mixture model-based classification. CRC Press, Amsterdam
Muirhead R (2009) Aspects of multivariate statistical theory. Wiley, London
Najarzadeh D (2017) Testing equality of generalized variances of k multivariate normal populations. Commun Stat-Simul Comput, 1–10
Noor AM, Djauhari MA (2014) Monitoring the variability of beltline moulding process using Wilks’s statistic. Malays J Fundam Appl Sci 6(2):116–120
Pena D, Linde A (2007) Dimensionless measures of variability and dependence for multivariate continuous distributions. Commun Stat-Theory Methods 36(10):1845–1854
Pena D, Rodriguez J (2003) Descriptive measures of multivariate scatter and linear dependence. J Multivar Anal 85(2):361–374
Petersen HC (2000) On statistical methods for comparison of intrasample morphometric variability: Zalavár revisited. Am J Phys Anthr 113(1):79–84
Pukelsheim F (2006) Optimal design of experiments. SIAM, Philadelphia
Punzo A, Browne RP, McNicholas PD (2016) Hypothesis testing for mixture model selection. J Stat Comput Simul 86(14):2797–2818
Rencher A (2002) Methods of multivariate analysis. Wiley, London
Ripley B, Venables B, Bates DM, Hornik K, Gebhardt A, Firth D, Ripley MB (2013) Package ‘MASS’. CRAN R
Sarkar SK (1989) On improving the shortest length confidence interval for the generalized variance. J Multivar Anal 31(1):136–147
Sarkar SK (1991) Stein-type improvements of confidence intervals for the generalized variance. Ann Inst Stat Math 43(2):369–375
Satterthwaite FE (1946) An approximate distribution of estimates of variance components. Biom Bull 2(6):110–114
SenGupta A (1987a) Generalizations of Bartlett’s and Hartley’s tests of homogeneity using overall variability. Commun Stat-Theory Methods 16(4):987–996
SenGupta A (1987b) Tests for standardized generalized variances of multivariate normal populations of possibly different dimensions. J Multivar Anal 23(2):209–219
SenGupta A, Ng HKT (2011) Nonparametric test for the homogeneity of the overall variability. J Appl Stat 38(9):1751–1768
Tallis G, Light R (1968) The use of fractional moments for estimating the parameters of a mixed exponential distribution. Technometrics 10(1):161–175
R Core Team (2015) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna
Welch BL (1947) The generalization of ‘Student’s’ problem when several different population variances are involved. Biometrika 34(1/2):28–35
Wilks SS (1932) Certain generalizations in the analysis of variance. Biometrika 24(3–4):471–494
Yeh A, Lin D, Zhou H, Venkataramani C (2003) A multivariate exponentially weighted moving average control chart for monitoring process variability. J Appl Stat 30(5):507–536
Yeh AB, Lin DK, McGrath RN (2006) Multivariate control charts for monitoring covariance matrix: a review. Qual Technol Quant Manag 3(4):415–436
Acknowledgements
We would like to express our sincere thanks to the editor and the two anonymous reviewers for their comments which greatly improved this article. The corresponding author would like to thank the “Iranian National Elites Foundation” for the financial support of this research.
Appendix
Here we present the proofs of Lemmas 2.2 and 3.2 as well as Theorems 3.1 and 3.3.
Proof of Lemma 2.2
Let \( {\varvec{W}}_{1}^*, {\varvec{W}}_{2}^*,\ldots ,{\varvec{W}}_{k}^* \) denote the values of \( {\varvec{W}}_{1}, {\varvec{W}}_{2},\ldots ,{\varvec{W}}_{k} \), respectively, computed from the transformed observations. It is easily shown that \( {\varvec{W}}_{i}^*= {\varvec{\Psi }}_i{\varvec{W}}_i {\varvec{\Psi }}_i^\prime \). Since \({\varvec{\Psi }}_i^\prime {\varvec{\Psi }}_i={\varvec{I}}_{p_i}\), the determinant of \( {\varvec{\Psi }}_i \) has absolute value one, and therefore \({\root p_i \of {{\left| {{{\varvec{W}}_{i}^*}} \right| }}}={\root p_i \of {{\left| {\varvec{\Psi }}_i{\varvec{W}}_i {\varvec{\Psi }}_i^\prime \right| }}}={\root p_i \of {{\left| {{{\varvec{W}}_{i} }} \right| }}}\). So, by (3), \(T_{LRT}\left( {{\varvec{x}}^{*}_1},{{\varvec{x}}^{*}_2},\ldots ,{{\varvec{x}}^{*}_k} \right) = T_{LRT}\left( {{\varvec{x}}_1},{{\varvec{x}}_2},\ldots ,{{\varvec{x}}_k} \right) \). The proof is complete. \(\square \)
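The invariance used in this proof can be checked numerically. The following is a minimal sketch, not part of the original article: the matrix `W`, the rotation angle `t`, and the helper names `det`, `matmul`, `transpose`, and `sgv` are all illustrative assumptions, and the orthogonal matrix is taken to be a simple plane rotation.

```python
import math

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    d = 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[piv][c] == 0.0:
            return 0.0
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            d = -d  # a row swap flips the sign of the determinant
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def sgv(w):
    """Positive p-th root of |w| for a p x p positive definite matrix."""
    return det(w) ** (1.0 / len(w))

# Hypothetical 3 x 3 positive definite matrix (illustrative values only).
W = [[4.0, 1.0, 0.5],
     [1.0, 3.0, 0.2],
     [0.5, 0.2, 2.0]]

# Orthogonal Psi: a rotation by angle t in the first two coordinates.
t = 0.7
Psi = [[math.cos(t), -math.sin(t), 0.0],
       [math.sin(t), math.cos(t), 0.0],
       [0.0, 0.0, 1.0]]

W_star = matmul(matmul(Psi, W), transpose(Psi))
sgv_W, sgv_W_star = sgv(W), sgv(W_star)
print(sgv_W, sgv_W_star)  # the two values agree up to rounding error
```

Any other orthogonal matrix could be substituted for the rotation; the rotation is used here only because its orthogonality is evident by inspection.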
Proof of Lemma 3.2
For any \( |b|<\infty \), it can be justified (Jiang et al. 2012, Lemma 2.1) that
as \( x \longrightarrow \infty \). Hence,
as \( n_j \longrightarrow \infty , \ j=1,2,\ldots , k\). Consequently,
as \( n_j \longrightarrow \infty , \ j=1,2,\ldots , k\). So, by replacing \( \Gamma _j\left( h \right) \) by \( \prod \limits _{t = 1}^{{p_j}} {{\left( {\frac{{{n_j-t}}}{2}} \right) ^h} e^{\frac{h(h-1)}{n_j-t}} } e^{ O(n_j^{-2})} \) in (24), we obtain
as \( n_j \longrightarrow \infty , \ j=1,2,\ldots , k\). This implies (17). The proof is complete. \(\square \)
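The replacement expression in this proof is consistent with the standard two-term expansion \( \log \Gamma (x+b) - \log \Gamma (x) = b\log x + \frac{b(b-1)}{2x} + O(x^{-2}) \) for fixed b, evaluated at \( x = (n_j-t)/2 \) and \( b = h \). The sketch below, which is an illustration and not taken from the article, compares the exact log-ratio with this expansion; the value of `b` and the grid of `x` values are assumptions chosen for the example.

```python
import math

def log_gamma_ratio(x, b):
    """Exact log of Gamma(x + b) / Gamma(x), computed with math.lgamma."""
    return math.lgamma(x + b) - math.lgamma(x)

def log_gamma_ratio_approx(x, b):
    """Two-term expansion b*log(x) + b*(b - 1)/(2*x); remainder is O(x**-2)."""
    return b * math.log(x) + b * (b - 1.0) / (2.0 * x)

b = 0.25  # hypothetical fixed exponent, e.g. h = 1/p with p = 4
errors = {}
for x in (10.0, 100.0, 1000.0):
    errors[x] = abs(log_gamma_ratio(x, b) - log_gamma_ratio_approx(x, b))
    print(x, errors[x])
```

Each tenfold increase in x should shrink the error by a factor of roughly one hundred, which is the behavior expected of an \( O(x^{-2}) \) remainder.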
Proof of Theorem 3.1
Using the multinomial theorem (Gossett 2009, Theorem 5.15), for any given nonnegative integer r,
Also, for \( i \ne j \in \left\{ {1,2, \ldots ,k} \right\} \), using again the multinomial theorem,
Now, taking the expectation of (20) and (21), and then using the independence of \({{\varvec{W}}_{j}}\)s, immediately yields
and
But, for any \( j = 1,2,\ldots , k\), it is well known that \(\frac{ \left| {{{\varvec{W}}_{j}}} \right| }{\left| {{{\varvec{\Sigma }}_j}} \right| }\) is distributed as a product of independent chi-square random variables (see Muirhead 2009), that is, \( \frac{ \left| {{{\varvec{W}}_{j}}} \right| }{\left| {{{\varvec{\Sigma }}_j}} \right| } \overset{d}{=} \prod \limits _{r = 1}^{{p_j}} \chi _{{n_j} - r}^2 \),
where \( \chi _{{n_j} - r}^2 \), \(r = 1,2,\ldots , p_{j} \) are independent chi-square random variables with \( {{n_j} - r} \) degrees of freedom. So, for any real number h and \( {n_j} > \max (p_j ,p_j-2h)\), \( j=1,2,\ldots ,k \), we have:
Replacing the expectations in equations (22) and (23) by their values calculated from (24) under \( {H_0}\) in (1) yields the desired equations (15) and (16). The proof is complete. \(\square \)
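The moments used in this proof follow from the standard chi-square identity \( E[(\chi _{\nu }^2)^h] = 2^h\, \Gamma (\nu /2+h)/\Gamma (\nu /2) \), valid when \( \nu /2+h>0 \), applied to each independent factor. The following sketch evaluates the resulting product on the log scale; the function name `det_ratio_moment` and the values of `n` and `p` are hypothetical, and the code illustrates the identity rather than reproducing the article's equation (24).

```python
import math

def det_ratio_moment(n, p, h):
    """E[(|W|/|Sigma|)**h] when |W|/|Sigma| is a product of independent
    chi-square variables with n - r degrees of freedom, r = 1, ..., p."""
    if n <= max(p, p - 2.0 * h):
        raise ValueError("requires n > max(p, p - 2h)")
    log_m = 0.0
    for r in range(1, p + 1):
        half_df = (n - r) / 2.0
        # log of 2**h * Gamma(half_df + h) / Gamma(half_df)
        log_m += h * math.log(2.0) + math.lgamma(half_df + h) - math.lgamma(half_df)
    return math.exp(log_m)

n, p = 20, 3  # hypothetical sample size and dimension
first = det_ratio_moment(n, p, 1.0)      # equals (n-1)(n-2)(n-3) for h = 1
root = det_ratio_moment(n, p, 1.0 / p)   # h = 1/p, the exponent relevant to the SGV
print(first, root)
```

For h = 1 each factor reduces to its degrees of freedom, so the product is (n-1)(n-2)(n-3); this gives a simple sanity check on the implementation.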
Proof of Theorem 3.3
From (20) with \( r=1 \), we have
Since \( p_j \longrightarrow \infty \) and \( n_{j}>p_{j} +1\), we have \( n_j \longrightarrow \infty , \ j=1,2,\ldots , k\), and by (19),
So, we have
Consequently, for \( i \ne j \in \left\{ {1,2, \ldots ,k} \right\} \),
as \( p_j \longrightarrow \infty \), \( j=1,2,\ldots ,k \). Now, by the assumptions of the theorem,
as \( p_j \longrightarrow \infty \), \( j=1,2,\ldots ,k \). By taking the limit on both sides of (25) as \( p_j \longrightarrow \infty \), \( j=1,2,\ldots ,k \) and using the assumptions of the theorem, we get
Setting \( r=2 \) in (20), we obtain
Arguing as in the proof of (26), it can be shown that
as \( p_j \longrightarrow \infty \), \( j=1,2,\ldots ,k \). Therefore,
So, \(E{[Z_i]}\longrightarrow k-1\) and \( Var({Z_i}) \longrightarrow 0\), as \( p_j \longrightarrow \infty \), \( j=1,2,\ldots ,k \). Consequently, each \( Z_i \), \(i=1,2,\ldots ,k \), converges in probability to \( k-1 \), the limit of \( E[Z_i] \) (Billingsley 2008). The proof is complete. \(\square \)
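The final step of this proof, from the limits of the mean and variance to convergence in probability, can be spelled out with a Chebyshev-type bound. The display below is a sketch of that standard argument and is not taken from the article; \( \varepsilon > 0 \) is an arbitrary fixed tolerance introduced here for illustration.

```latex
\[
P\left( \left| Z_i - (k-1) \right| > \varepsilon \right)
\le \frac{E\left[ \left( Z_i - (k-1) \right)^2 \right]}{\varepsilon^2}
= \frac{Var(Z_i) + \left( E[Z_i] - (k-1) \right)^2}{\varepsilon^2}
\longrightarrow 0 ,
\qquad p_j \longrightarrow \infty , \ j = 1,2,\ldots ,k .
\]
```

The inequality is Markov's inequality applied to the squared deviation, and the equality is the usual decomposition of mean squared error into variance plus squared bias; both terms in the numerator vanish in the limit.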
Cite this article
Najarzadeh, D. Testing equality of standardized generalized variances of k multivariate normal populations with arbitrary dimensions. Stat Methods Appl 28, 593–623 (2019). https://doi.org/10.1007/s10260-019-00456-y
DOI: https://doi.org/10.1007/s10260-019-00456-y