Testing the hypothesis of a block compound symmetric covariance matrix for elliptically contoured distributions

Abstract

In this paper, the authors study the problem of testing the hypothesis of a block compound symmetry covariance matrix with two-level multivariate observations, taken for m variables over u sites or time points. Through the use of a suitable block-diagonalization of the hypothesis matrix, it is possible to obtain a decomposition of the main hypothesis into two sub-hypotheses. Using this decomposition, it is then possible to obtain the likelihood ratio test statistic as well as its exact moments in a much simpler way. The exact distribution of the likelihood ratio test statistic is then analyzed. Because this distribution is quite elaborate, yielding a non-manageable distribution function, a manageable but very precise near-exact distribution is developed. Numerical studies conducted to evaluate the closeness between this near-exact distribution and the exact distribution show the very good performance of this approximation, even for very small sample sizes. The approach followed also allows us to extend its validity to situations where the population distributions are elliptically contoured. A real-data example is presented and a simulation study is also conducted.

References

  1. Anderson TW (2003) An introduction to multivariate statistical analysis, 3rd edn. Wiley, New Jersey

  2. Arnold SF (1979) Linear models with exchangeably distributed errors. J Am Stat Assoc 74:194–199

  3. Arnold BC, Coelho CA, Marques FJ (2013) The distribution of the product of powers of independent uniform random variables—a simple but useful tool to address and better understand the structure of some distributions. J Multivar Anal 113:19–36

  4. Box GEP (1949) A general distribution theory for a class of likelihood criteria. Biometrika 36:317–346

  5. Coelho CA (1998) The generalized integer gamma distribution—a basis for distribution in multivariate statistics. J Multivar Anal 64:86–102

  6. Coelho CA (2004) The generalized near-integer gamma distribution: a basis for ‘near-exact’ approximations to the distribution of statistics which are the product of an odd number of independent Beta random variables. J Multivar Anal 89:191–218

  7. Coelho CA, Marques FJ (2012) Near-exact distributions for the likelihood ratio test statistic to test equality of several variance-covariance matrices in elliptically contoured distributions. Comp Stat 27:627–659

  8. Coelho CA, Marques FJ (2013) The multi-sample block-scalar sphericity test - exact and near-exact distributions for its likelihood ratio test statistic. Comm Statist Theory Methods 42:1153–1175

  9. Coelho CA, Marques FJ, Oliveira S (2016) Near-exact distributions for likelihood ratio statistics used in the simultaneous test of conditions on mean vectors and patterns of covariance matrices. Math Probl Eng. doi:10.1155/2016/8975902

  10. Demidenko E (2004) Mixed models—theory and applications. Wiley, Hoboken

  11. Huynh H, Feldt LS (1970) Conditions under which mean square ratios in repeated measurements designs have exact F-distributions. J Am Stat Assoc 65:1582–1589

  12. Johnson RA, Wichern DW (2007) Applied multivariate statistical analysis, 6th edn. Pearson Prentice Hall, Englewood Cliffs

  13. Jones RH (1993) Longitudinal data with serial correlation: a state-space approach. Monographs on Statistics and Applied Probability 47, Springer-Science and Business Media, B.V

  14. King ML, Evans MA (1986) Testing for block effects in regression models based on survey data. J Am Stat Assoc 81:677–679

  15. Krishnaiah PR, Lee JC (1974) On covariance structures. Sankhyā 38:357–371

  16. Krishnaiah PR, Lee JC (1980) Likelihood ratio tests for mean vectors and covariance matrices. In: Krishnaiah PR (ed) Handbook of Statistics. Elsevier, Amsterdam, vol 1, pp 513–570

  17. Kutner MH, Nachtsheim CJ, Neter J, Li W (2005) Applied linear statistical models, 5th edn. McGraw-Hill/Irwin, New York

  18. Li J, Wong WK (2010) Selection of covariance patterns for longitudinal data in semi-parametric models. Stat Methods Med Res 19:183–196

  19. Liang KY, Zeger SL (1986) Longitudinal data analysis using generalised linear models. Biometrika 73:12–22

  20. Malott C (1990) Maximum likelihood methods for nonlinear regression models with compound-symmetric error covariance. Ph.D. Thesis, The University of North Carolina, Chapel Hill

  21. Marques FJ, Coelho CA, Arnold BC (2011) A general near-exact distribution theory for the most common likelihood ratio test statistics used in multivariate analysis. Test 20:180–203

  22. Marques FJ, Coelho CA (2012) Near-exact distributions for the likelihood ratio test statistic of the multi-sample block-matrix sphericity test. Appl Math Comput 219:2861–2874

  23. Matos LA, Castro LM, Lachos VH (2016) Censored mixed-effects models for irregularly observed repeated measures with applications to HIV viral loads. TEST 25:627–653

  24. Morrison DF (1976) Multivariate statistical methods, 3rd edn. McGraw-Hill Inc., New York

  25. Muñoz A, Carey V, Schouten JP, Segal M, Rosner B (1992) A parametric family of correlation structures for the analysis of longitudinal data. Biometrics 48:733–742

  26. Qu A, Li R (2006) Quadratic inference functions for varying-coefficient models with longitudinal data. Biometrics 62:379–391

  27. Rao CR (1945) Familial correlations or the multivariate generalizations of the intraclass correlation. Curr Sci 14:66–67

  28. Rao CR (1953) Discriminant functions for genetic differentiation and selection. Sankhyā 12:229–246

  29. Reinsel G (1982) Multivariate repeated-measurement or growth curve models with multivariate random-effects covariance structure. J Am Stat Assoc 77:190–195

  30. Roy A (2006) A new classification rule for incomplete doubly multivariate data using mixed effects model with performance comparisons on the imputed data. Stat Med 25:1715–1728

  31. Roy A, Fonseca M (2012) Linear models with doubly exchangeable distributed errors. Comm Stat Theory Methods 41:2545–2569

  32. Roy A, Leiva R (2011) Estimating and testing a structured covariance matrix for three-level multivariate data. Comm Stat Theory Methods 40:1945–1963

  33. Scott AJ, Holt D (1982) The effect of two-stage sampling on ordinary least squares methods. J Am Stat Assoc 77:848–854

  34. Timm NH (1980) Multivariate analysis of variance of repeated measurements. In: Krishnaiah PR (ed) Handbook of Statistics. Elsevier, North Holland, vol 1, pp 41–87

  35. Timm NH (2002) Applied multivariate analysis. Springer, New York

  36. Tricomi FG, Erdélyi A (1951) The asymptotic expansion of a ratio of Gamma functions. Pac J Math 1:133–142

  37. Verbeke G, Molenberghs G (2000) Linear mixed models for longitudinal data. Springer, New York

  38. Vonesh E, Chinchilli VM (1997) Linear and nonlinear models for the analysis of repeated measurements. CRC, Marcel Dekker, New York

  39. Wilks SS (1946) Sample criteria for testing equality of means, equality of variances, and equality of covariances in a Normal multivariate distribution. Ann Math Stat 17:257–281

  40. Zimmerman DL, Núñez-Antón V (2001) Parametric modelling of growth curve data: an overview. TEST 10:1–73

Acknowledgements

Both authors would also like to thank two anonymous referees and the Editor-in-Chief of the Journal for their comments and remarks, which helped to clarify a number of small issues and to improve the readability of the paper, namely by giving it a more adequate introduction to the usefulness and applicability of the CS and BCS covariance structures.

Author information

Corresponding author

Correspondence to Carlos A. Coelho.

Additional information

This research was partially supported by FCT–Fundação para a Ciência e Tecnologia (Portuguese Foundation for Science and Technology), Project UID/MAT/00297/2013, through Centro de Matemática e Aplicações (CMA/FCT/UNL). A. Roy also acknowledges the support of a summer research grant from the College of Business at the University of Texas at San Antonio.

Appendices

Appendix 1: Shape parameters in the moment expressions for \(\varLambda _b\)

According to Coelho and Marques (2012) and Marques et al. (2011), the shape parameters \(s_j\) in (14) are given by

$$\begin{aligned} s_j=\left\{ \begin{array}{lcl} s^*_{j-1}, &{}&{}\quad \mathrm{for~} j=2,\ldots ,m,\\ &{}&{} \quad {\text {except }}j=m-2\alpha _1\\ s_{j-1}^*+(m\perp \!\!\!\perp 2)(\alpha _2-\alpha _1)\left( (u-1)-\frac{m-1}{2}+(u-1)\left\lfloor \frac{m}{2(u-1)}\right\rfloor \right) , &{}&{}\quad \mathrm{for~} j=m-2\alpha _1 \end{array} \right. \end{aligned}$$
(28)

with

$$\begin{aligned} s^*_j=\left\{ \begin{array}{lll} \gamma _j &{} \quad \mathrm{for} &{} \quad j=1,\ldots ,\alpha +1\\ (u-1)\left( \left\lfloor \frac{m}{2}\right\rfloor -\left\lfloor \frac{j}{2}\right\rfloor \right) &{} \quad \mathrm{for} &{} \quad j=\alpha +2,...,\min (m-2\alpha _1,m-1)\\ &{} \quad \mathrm{and} &{}\quad j=2+m-2\alpha _1,...,2\left\lfloor \frac{m}{2}\right\rfloor -1,\quad {\text {by steps of 2}}\\ (u-1)\left( \left\lfloor \frac{m+1}{2}\right\rfloor -\left\lfloor \frac{j}{2}\right\rfloor \right) &{} \quad \mathrm{for} &{} \quad j=1+m-2\alpha _1,...,m-1,\quad {\text {by steps of 2}},\\ \end{array} \right. \end{aligned}$$
(29)

and

$$\begin{aligned} \alpha =\left\lfloor \frac{m-1}{u-1}\right\rfloor , \quad \alpha _1=\left\lfloor \frac{u-2}{u-1}\,\frac{m-1}{2}\right\rfloor ,\quad \alpha _2=\left\lfloor \frac{u-2}{u-1}\,\frac{m+1}{2}\right\rfloor , \end{aligned}$$
(30)

where, for \({j=1,\ldots ,\alpha }\),

(31)

and

(32)
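As a small illustration of (30), the three floor quantities can be computed with integer arithmetic only. This is a minimal sketch: the function name `bcs_alpha_parameters` and the input checks are our own assumptions, not part of the paper, and the shape parameters in (28) and (29) are not computed here.

```python
from typing import Tuple


def bcs_alpha_parameters(m: int, u: int) -> Tuple[int, int, int]:
    """Return (alpha, alpha_1, alpha_2) as defined in (30).

    m is the number of variables and u the number of sites or time points.
    The name and the argument checks are illustrative assumptions.
    """
    if m < 1 or u < 2:
        raise ValueError("this sketch assumes m >= 1 and u >= 2")
    # floor((m-1)/(u-1))
    alpha = (m - 1) // (u - 1)
    # floor(((u-2)/(u-1)) * ((m-1)/2)), evaluated as one integer division
    alpha_1 = ((u - 2) * (m - 1)) // (2 * (u - 1))
    # floor(((u-2)/(u-1)) * ((m+1)/2)), evaluated as one integer division
    alpha_2 = ((u - 2) * (m + 1)) // (2 * (u - 1))
    return alpha, alpha_1, alpha_2
```

Writing each product of fractions as a single integer division avoids floating-point rounding inside the floor, which matters when the product is an exact integer.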

Appendix 2: Gamma distribution and related results

We say that the r.v. X follows a Gamma distribution with shape parameter \(r>0\) and rate parameter \(\lambda >0\), if the p.d.f. of X is

$$\begin{aligned} f^{}_X(x)=\frac{\lambda ^r}{\varGamma (r)}\,e^{-\lambda x}\, x^{r-1},\quad (x>0) \end{aligned}$$

and this fact is denoted by \(X\sim \varGamma (r,\lambda )\). Then, the moment generating function of X is

$$\begin{aligned} M^{}_X(t)=\lambda ^r(\lambda -t)^{-r}\quad (t<\lambda ), \end{aligned}$$

so that if we define \(Z=e^{-X}\) we have

$$\begin{aligned} E(Z^h)=E\left( e^{-hX}\right) =M^{}_X(-h)=\lambda ^r(\lambda +h)^{-r}\quad (h>-\lambda ). \end{aligned}$$
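The last identity can be checked numerically. The sketch below compares the closed form \(\lambda ^r(\lambda +h)^{-r}\) with a plain midpoint-rule approximation of \(\int _0^\infty e^{-hx}f_X(x)\,\mathrm{d}x\); the function names, the truncation point `upper`, and the grid size `n` are illustrative assumptions.

```python
import math


def gamma_pdf(x: float, r: float, lam: float) -> float:
    """p.d.f. of Gamma(r, lam) with shape r and rate lam, for x > 0."""
    return lam ** r / math.gamma(r) * math.exp(-lam * x) * x ** (r - 1)


def moment_closed_form(h: float, r: float, lam: float) -> float:
    """E(Z^h) for Z = exp(-X), X ~ Gamma(r, lam), valid for h > -lam."""
    return lam ** r * (lam + h) ** (-r)


def moment_numeric(h: float, r: float, lam: float,
                   upper: float = 40.0, n: int = 100_000) -> float:
    """Midpoint-rule approximation of the integral of exp(-h x) f_X(x)
    over (0, upper); the tail beyond `upper` is assumed negligible."""
    step = upper / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * step
        total += math.exp(-h * x) * gamma_pdf(x, r, lam)
    return total * step
```

For instance, with the arbitrary values `r = 2.5`, `lam = 1.5`, and `h = 0.7`, the two functions should agree to several decimal places.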

Appendix 3: The reasoning behind the use of \(\varPhi _2(t)\) in (24) to approximate \(\varPhi _{W,2}(t)\)

From the first two expressions in Sect. 5 of Tricomi and Erdélyi (1951), together with expressions (11) and (14) of the same reference, the latter already in its Sect. 6, we may write

$$\begin{aligned} \frac{\varGamma (a-{\mathrm {i}}t)}{\varGamma (a+b-{\mathrm {i}}t)}\approx \sum _{k=0}^\infty p_k(b)\,(a-{\mathrm {i}}t)^{-(b+k)} \end{aligned}$$
(33)

where

$$\begin{aligned} p_k(b)=\frac{1}{k}\sum _{\ell =0}^{k-1}\left( \frac{\varGamma (1-b-\ell )}{\varGamma (-b-k)(k-\ell +1)!}+(-1)^{k+\ell }\,b^{k-\ell +1}\right) p_\ell (b),\quad k=1,2,\ldots , \end{aligned}$$
(34)

with \(p_0(b)=1\), and where the approximation in (33) gets sharper for larger values of a.
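The recursion in (34) is easy to evaluate exactly for rational b. In the sketch below, which is only an illustration and not the authors' implementation, the ratio \(\varGamma (1-b-\ell )/\varGamma (-b-k)\) is written as the finite product \(\prod _{i=0}^{k-\ell }(-b-k+i)\); this is our own choice, made to avoid the poles of the Gamma function when b is an integer. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def gamma_ratio(b: Fraction, k: int, ell: int) -> Fraction:
    """Gamma(1-b-ell)/Gamma(-b-k) as the product of the k-ell+1 factors
    (-b-k), (-b-k+1), ..., (-b-ell)."""
    out = Fraction(1)
    for i in range(k - ell + 1):
        out *= -b - k + i
    return out


def p_coefficients(b, n_terms: int) -> list:
    """Return [p_0(b), ..., p_{n_terms-1}(b)] from the recursion in (34)."""
    b = Fraction(b)
    p = [Fraction(1)]  # p_0(b) = 1
    for k in range(1, n_terms):
        total = Fraction(0)
        for ell in range(k):
            n = k - ell + 1
            term = gamma_ratio(b, k, ell) / factorial(n)
            term += (-1) ** (k + ell) * b ** n
            total += term * p[ell]
        p.append(total / k)
    return p
```

For b = 2 one has \(\varGamma (a)/\varGamma (a+2)=a^{-2}(1+1/a)^{-1}\), whose expansion in powers of 1/a has coefficients 1, -1, 1, -1, and `p_coefficients(2, 4)` returns exactly these values. The first coefficient is \(p_1(b)=b(1-b)/2\) for any b.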

Then, since the c.f. of \(Y=-\log \,X\), where \(X\sim Beta(a,b)\), is given by

$$\begin{aligned} \varPhi _Y(t)=\frac{\varGamma (a+b)}{\varGamma (a)}\,\frac{\varGamma (a-{\mathrm {i}}t)}{\varGamma (a+b-{\mathrm {i}}t)}, \end{aligned}$$

using (33), one may write

$$\begin{aligned} \varPhi _Y(t)\approx \sum _{k=0}^\infty \underbrace{\frac{\varGamma (a+b)}{\varGamma (a)}\,\frac{p_k(b)}{a^{b+k}}}_{p^*_k(a,b)}\,a^{b+k}\,(a-{\mathrm {i}}t)^{-(b+k)} \end{aligned}$$

whose right hand side is the c.f. of an infinite mixture of \(\varGamma (b+k,a)\) distributions, with weights \(p^*_k(a,b)\), with \(p_k(b)\) given by (34).
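As a simple check of this representation, which we add here only as an illustration, take \(b=1\). Then \(p_k(1)=0\) for \(k\ge 1\) in (34), so that \(p^*_0(a,1)=\varGamma (a+1)/(a\,\varGamma (a))=1\), the mixture reduces to its first term, and the approximation is in this case exact:

$$\begin{aligned} \varPhi _Y(t)=\frac{\varGamma (a+1)}{\varGamma (a)}\,\frac{\varGamma (a-{\mathrm {i}}t)}{\varGamma (a+1-{\mathrm {i}}t)}=\frac{a}{a-{\mathrm {i}}t}=a\,(a-{\mathrm {i}}t)^{-1}, \end{aligned}$$

which is the c.f. of a \(\varGamma (1,a)\) distribution, in agreement with the fact that \(-\log \,X\) is exponentially distributed with rate a when \(X\sim Beta(a,1)\).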

Then, since \(\varPhi _{W,2}(t)\) is the c.f. of a sum of independent Logbeta r.v.’s with different parameters, namely different first parameters, it would be approximated by the c.f. of an infinite mixture of sums of independent Gamma r.v.’s, with different rate parameters, which themselves are mixtures of Gamma r.v.’s. Thus, using a somewhat heuristic approach, one would use, as a first simplification of this approximating c.f., the c.f. of an infinite mixture of Gamma distributions, all with the same rate parameter and with shape parameters \(r+k\) for \(k=0,1,\ldots \), where r is equal to the sum of all the second parameters of the Logbeta r.v.’s in \(\varPhi _{W,2}(t)\). This c.f. is then further simplified to \(\varPhi _2(t)\), the c.f. of a finite mixture of Gamma distributions with shape parameters \(r+k\) \((k=0,1,\ldots ,m^*)\), rate parameter \(\lambda \), and weights \(\pi _k\), which are determined as explained in the body of the paper, after the call to this Appendix. The rate parameter \(\lambda \) is defined in a somewhat heuristic way, which has proven in practice to work very well, while the first \(m^*\) weights, \(\pi _0,\ldots ,\pi _{m^*-1}\), are determined by equating the first \(m^*\) derivatives of \(\varPhi _{W,2}(t)\) and \(\varPhi _2(t)\) at \(t=0\), which leads to near-exact distributions that match the first \(m^*\) exact moments of \(W=-\log \,\varLambda \). Then, by taking \(\pi _{m^*}=1-\sum _{k=0}^{m^*-1}\pi _k\), we ensure, in practice, that \(\varPhi _2(t)\) corresponds to a true c.f., and that the corresponding c.d.f. reaches the value 1 as the running value of W goes to infinity. We may note that some of the weights \(\pi _k\) may be non-positive, just as some of the weights \(p_k(b)\) in (34) already are.
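To make the structure of such a finite mixture concrete, the sketch below evaluates the c.f. of a mixture of \(\varGamma (r+k,\lambda )\) distributions, together with the corresponding expression for \(E(e^{-hW})\) obtained from Appendix 2. It does not reproduce the paper's procedure for choosing \(\lambda \) or the weights; the function names and all numerical values are hypothetical and serve only to show how the last weight closes the mixture.

```python
def complete_weights(first_weights):
    """Append the last weight so that all weights sum to 1."""
    return list(first_weights) + [1.0 - sum(first_weights)]


def gamma_mixture_cf(t, r, lam, weights):
    """c.f. at t of a finite mixture of Gamma(r + k, lam) distributions,
    k = 0, ..., len(weights) - 1, with common rate lam."""
    base = lam / (lam - 1j * t)
    return sum(w * base ** (r + k) for k, w in enumerate(weights))


def gamma_mixture_moment(h, r, lam, weights):
    """E(exp(-h W)) for W following the same mixture, valid for h > -lam."""
    base = lam / (lam + h)
    return sum(w * base ** (r + k) for k, w in enumerate(weights))


# Hypothetical values, for illustration only; note the negative weights.
weights = complete_weights([0.9, 0.15, -0.02])
value_at_zero = gamma_mixture_cf(0.0, r=2.5, lam=3.0, weights=weights)
```

Because the weights sum to 1, `value_at_zero` equals 1 up to rounding, which is the normalization that the choice of \(\pi _{m^*}\) is meant to guarantee.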

Appendix 4: Rationale showing that \(\varDelta \) in (27) always yields a finite value

The tails of \(\left| \frac{\varPhi _W(t)-\varPhi ^*_W(t)}{t}\right| \) for any two c.f.’s \(\varPhi _W(t)\) and \(\varPhi ^*_W(t)\) are always dominated by the tails of \(e^{-b|t|}\), for some \({b>0}\), that is, there always exists some \(\delta >0\) such that for \(|t|>\delta \),

$$\begin{aligned} \left| \frac{\varPhi _W(t)-\varPhi ^*_W(t)}{t}\right| < e^{-b|t|} \end{aligned}$$

while \(\left| \frac{\varPhi _W(t)-\varPhi ^*_W(t)}{t}\right| \) is a continuous function for which the limit when t tends towards zero always exists and is finite, being equal to the absolute value of the difference of the expected values corresponding to \(\varPhi _W(t)\) and \(\varPhi ^*_W(t)\), in case both of these exist. Therefore, since \(\int _{-\infty }^{+\infty } e^{-b|t|}\,\mathrm{{d}}t\) is finite, so is \(\varDelta \) in (27).
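The statement about the limit at zero can be illustrated numerically. The sketch below uses two Gamma c.f.'s, chosen purely for convenience and unrelated to the distributions in the paper, and evaluates the scaled difference at a small value of t. It addresses only the behaviour near \(t=0\), not the tails; all names and values are assumptions.

```python
def gamma_cf(t: float, r: float, lam: float) -> complex:
    """c.f. of Gamma(r, lam) with shape r and rate lam."""
    return (1 - 1j * t / lam) ** (-r)


def scaled_cf_gap(t: float, params_a, params_b) -> float:
    """|Phi_a(t) - Phi_b(t)| / |t| for two Gamma c.f.'s, t != 0."""
    return abs((gamma_cf(t, *params_a) - gamma_cf(t, *params_b)) / t)


# Illustrative (shape, rate) pairs: expected values 2.0 and 1.5.
params_a = (2.0, 1.0)
params_b = (3.0, 2.0)
mean_gap = abs(params_a[0] / params_a[1] - params_b[0] / params_b[1])
small_t_value = scaled_cf_gap(1e-6, params_a, params_b)
```

Here `small_t_value` is close to `mean_gap`, the absolute difference of the two expected values.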

About this article

Cite this article

Coelho, C.A., Roy, A. Testing the hypothesis of a block compound symmetric covariance matrix for elliptically contoured distributions. TEST 26, 308–330 (2017). https://doi.org/10.1007/s11749-016-0512-4

Keywords

  • Beta random variables
  • Characteristic function
  • Composition of hypotheses
  • Likelihood ratio statistic
  • Near-exact distributions
  • Product distribution

Mathematics Subject Classification

  • 62H15
  • 62H10
  • 62E15
  • 62E20
  • 62E10
  • 60E10