Abstract
Tests of zero correlation between two or more vectors with large dimension, possibly larger than the sample size, are considered when the data may not necessarily follow a normal distribution. A single-sample case for several vectors is first proposed, which is then extended to the common covariance matrix under the assumption of homogeneity across several independent populations. The test statistics are constructed using a recently proposed modification of the RV coefficient (a correlation coefficient for vector-valued random variables) for high-dimensional vectors. The accuracy of the tests is shown through simulations.
1 Introduction
Let \(\mathbf{X}_k = (X_{k1}, \ldots , X_{kp})^T\), \(k = 1, \ldots , n\), be iid random vectors drawn from a population with \({{\,{\mathrm{E}}\,}}(\mathbf{X}_k) = {\varvec{\mu }} \in \mathbb {R}^p\) and \({{\,{\mathrm{Cov}}\,}}(\mathbf{X}_k) = {\varvec{\Sigma }} \in \mathbb {R}^{p\times p}\), where \({\varvec{\Sigma }} > 0\) can be expressed as a partitioned matrix \({\varvec{\Sigma }} = ({\varvec{\Sigma }}_{ij})_{i,j = 1}^{~~b}\) with blocks \({\varvec{\Sigma }}_{ij} \in \mathbb {R}^{p_i\times p_j}\), \({\varvec{\Sigma }}_{ji} = {\varvec{\Sigma }}^T_{ij}\), \({\varvec{\Sigma }}^T_{ii} = {\varvec{\Sigma }}_{ii}\). We are interested in testing
when the block dimensions, \(p_i\), may exceed the sample size, n, and the data may not necessarily follow the multivariate normal distribution. Under \(H_{0b}\), \({\varvec{\Sigma }}\) reduces to a block-diagonal structure, \({\varvec{\Sigma }} = \text {{diag}}({\varvec{\Sigma }}_{11}, \ldots , {\varvec{\Sigma }}_{bb})\), \({\varvec{\Sigma }}_{ii} \in \mathbb {R}^{p_i\times p_i}\). Obviously, under normality, the test of \(H_{0b}\) is equivalent to testing independence of the corresponding vectors.
Now, consider \(g \ge 2\) independent populations with \(\mathbf{X}_{lk} = (X_{lk1}, \ldots , X_{lkp})^T\), \(k = 1, \ldots , n_l\), as iid random vectors drawn from the lth population with \({{\,{\mathrm{E}}\,}}(\mathbf{X}_{lk}) = {\varvec{\mu }}_l\), \({{\,{\mathrm{Cov}}\,}}(\mathbf{X}_{lk}) = {\varvec{\Sigma }}_l > 0\), l = \(1, \ldots , g\), defined similarly as for the single population above. Assuming homogeneity of covariance matrices across the g populations, i.e., \({\varvec{\Sigma }}_l = {\varvec{\Sigma }}\) \(\forall ~l\), we are further interested in testing the hypotheses in (1) for the common \({\varvec{\Sigma }}\). To distinguish the two cases, we denote the multi-sample null and alternative hypotheses by \(H_{0g}\) and \(H_{1g}\), respectively.
Tests of \(H_{0b}\) or \(H_{0g}\) are frequently needed in multivariate applications, where the special case \(p_i\) = 1 \(\forall ~i = 1, \ldots , b\) leads to the well-known single- and multi-sample tests of identity or sphericity hypotheses; see e.g., [1, 2] and the references therein. Tests of \(H_{0b}\) or \(H_{0g}\) for the classical case, \(n > p\), have often been addressed in the multivariate literature, including for multiple blocks; see [8, 16, 20, 21]. [14] give extensions to functional data, and [6] discuss permutation and bootstrap tests. Other recent extensions include nonparametric tests ([10, 13]) and a kernel-based approach ([22]); see also [9, 19] for general theory in the context of canonical correlation analysis.
A potential advantage of \(H_{0b}\) in (1) is a huge dimension reduction, which adds further motivation to testing such hypotheses for high-dimensional data, \(p \gg n\). As the classical testing theory, mostly based on the likelihood ratio approach, collapses in this case, several modifications have recently been introduced to cope with the problem; see e.g., [26, 33, 34], where [32] provide a test based on rank correlations. Most of these tests assume normality and are constructed using a Frobenius norm to measure the distance between the null and alternative hypotheses. The Frobenius norm is a common distance measure for estimation and testing of high-dimensional covariance matrices; for its use in other contexts, see e.g., [3] and the references cited therein.
Our construction of the test statistics for (1) is, however, based on a different approach. We use the RV coefficient (see Sect. 2) as the basic measure of orthogonality of vectors and extend it to test \(H_{0b}\) and \(H_{0g}\). [5] recently proposed this modification to the RV coefficient for two high-dimensional vectors and used it to introduce a test of orthogonality. For details, including historical overview, comparison with its competitors and extensive bibliography, see the reference.
In [5], apart from a theoretical analysis of the modified coefficient and its competitors, extensive simulations show the accuracy of the modified estimator and of the significance test. The modified coefficient is composed of computationally very efficient estimators that are simple functions of the empirical covariance matrix. In this paper, we first extend the theory to propose a test of no correlation between \(b \ge 2\) large vectors, possibly of different dimensions.
We further extend this one-sample multi-block test to a multi-sample case with g independent populations. Assuming homogeneity of covariance matrices across g populations, we use pooled information to construct a test for the common covariance matrix. A particularly attractive aspect of the multi-sample test is its simplicity of pooling information across g independent samples, in such a way that the single-sample computations conveniently carry over to the multi-sample case, resulting in a highly efficient test statistic.
We begin in the next section with the basic notational setup, where we also briefly recap the modified RV coefficient of [5] for further use and reference. We keep it as short as possible, referring to the original paper for details. Using these rudiments, we propose a test of \(H_{0b}\) for multiple blocks in Sect. 3, with an extension to the test of \(H_{0g}\) in Sect. 4. The accuracy of the tests is shown through simulations in Sect. 5; technical proofs are deferred to the Appendix.
2 Notations and Preliminaries
Let the random vectors \(\mathbf{X}_k \in \mathbb {R}^p\), \(k = 1, \ldots , n\), with \({{\,{\mathrm{E}}\,}}(\mathbf{X}_k) = {\varvec{\mu }} \in \mathbb {R}^p\), \({{\,{\mathrm{Cov}}\,}}(\mathbf{X}_k) = {\varvec{\Sigma }} \in \mathbb {R}^{p\times p}\), be partitioned as \(\mathbf{X}_k = (\mathbf{X}^T_{1k}, \ldots , \mathbf{X}^T_{bk})^T\), \(\mathbf{X}_{ik} \in \mathbb {R}^{p_i}\) so that accordingly \({\varvec{\mu }} = ({\varvec{\mu }}^T_1, \ldots , {\varvec{\mu }}^T_b)^T\), \({\varvec{\mu }}_i \in \mathbb {R}^{p_i}\), \({\varvec{\Sigma }} = ({\varvec{\Sigma }}_{ij})_{i,j = 1}^{~~b}\), \({\varvec{\Sigma }}_{ij} = {{\,{\mathrm{E}}\,}}(\mathbf{X}_{ik} - {\varvec{\mu }}_i)\otimes (\mathbf{X}_{jk} - {\varvec{\mu }}_j)^T \in \mathbb {R}^{p_i\times p_j}\), \(i \ne j\), \({\varvec{\Sigma }}_{ji} = {\varvec{\Sigma }}^T_{ij}\), \({\varvec{\Sigma }}^T_{ii} = {\varvec{\Sigma }}_{ii}\), where \(\mathbb {R}^{a\times b}\) is the space of real matrices and \(\otimes \) is the Kronecker product. We assume \({\varvec{\Sigma }} > 0\), \({\varvec{\Sigma }}_{ii} > 0\) \(\forall ~i\).
Let \(\overline{\mathbf{X}} = \sum _{k = 1}^n\mathbf{X}_{k}/n\) and \(\widehat{\varvec{\Sigma }} = \sum _{k = 1}^n(\widetilde{\mathbf{X}}_k\otimes \widetilde{\mathbf{X}}_k)/(n - 1)\) be unbiased estimators of \({\varvec{\mu }}\) and \({\varvec{\Sigma }}\), likewise partitioned as \(\overline{\mathbf{X}} = (\overline{\mathbf{X}}^T_1, \ldots , \overline{\mathbf{X}}^T_b)^T\), \(\widehat{\varvec{\Sigma }} = (\widehat{\varvec{\Sigma }}_{ij})_{i, j = 1}^{~~b}\), so that \(\overline{\mathbf{X}}_i = \sum _{k = 1}^n\mathbf{X}_{ik}/n\) and \(\widehat{\varvec{\Sigma }}_{ij} = \sum _{k = 1}^n(\widetilde{\mathbf{X}}_{ik}\otimes \widetilde{\mathbf{X}}_{jk})/(n - 1)\) are unbiased estimators of \({\varvec{\mu }}_i\) and \({\varvec{\Sigma }}_{ij}\), where \(\widetilde{\mathbf{X}}_k = \mathbf{X}_k - \overline{\mathbf{X}}\), \(\widetilde{\mathbf{X}}_{ik} = \mathbf{X}_{ik} - \overline{\mathbf{X}}_i\). Denoting \(\mathbf{X} = (\mathbf{X}^T_1, \ldots , \mathbf{X}^T_n)^T \in \mathbb {R}^{n\times p}\) as the entire data matrix and \(\mathbf{X}_i = (\mathbf{X}^T_{i1}, \ldots , \mathbf{X}^T_{in})^T \in \mathbb {R}^{n\times p_i}\) as the data matrix for the ith block, we can express the estimators more succinctly as
where \(\mathbf{C} = \mathbf{I} - \mathbf{J}/n\) is the \(n\times n\) centering matrix, \(\mathbf{I}\) is the identity matrix, \(\mathbf{J} = \mathbf{11}^T\) with \(\mathbf{1}\) a vector of 1s. The null hypotheses in (1) thus imply the nullity of all off-diagonal blocks \({\varvec{\Sigma }}_{ij}\) in \({\varvec{\Sigma }}\). Consider first the simplest case of \(b = 2\), with only one off-diagonal block \({\varvec{\Sigma }}_{12} = {{\,{\mathrm{Cov}}\,}}(\mathbf{X}_{1k}, \mathbf{X}_{2k})\). Obviously, a test of zero correlation (or, under normality, of independence) between \(\mathbf{X}_{1k}\) and \(\mathbf{X}_{2k}\) can be based on an empirical estimator of \(\Vert {\varvec{\Sigma }}_{12}\Vert ^2 = {{\,{\mathrm{tr}}\,}}({\varvec{\Sigma }}_{12}{\varvec{\Sigma }}_{21})\) or an appropriately normed version of it. One such normed measure is proposed in [5] as
where \(\widehat{\Vert {\varvec{\Sigma }}_{ij}\Vert ^2}\) is an unbiased (and consistent) estimator of \(\Vert {\varvec{\Sigma }}_{ij}\Vert ^2\), i, j = 1, 2; see Sect. 3 for a formal definition. The statistic \(\widehat{\eta }\) is a modified form of the original RV coefficient, \(\widetilde{\eta } = \Vert \widehat{\varvec{\Sigma }}_{12}\Vert ^2/\Vert \widehat{\varvec{\Sigma }}_{11}\Vert \Vert \widehat{\varvec{\Sigma }}_{22}\Vert \), and is constructed by plugging unbiased estimators into the true RV coefficient \(\eta = \Vert {\varvec{\Sigma }}_{12}\Vert ^2/\Vert {\varvec{\Sigma }}_{11}\Vert \Vert {\varvec{\Sigma }}_{22}\Vert \). The RV coefficient extends the usual scalar correlation coefficient to data vectors of possibly different dimensions, is often used to study the relationship between different data configurations, and has many attractive properties; see also [25].
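As a concrete numerical illustration, the plug-in coefficient \(\widetilde{\eta }\) can be computed directly from the centered data matrices, using the centering-matrix form of the covariance blocks given above. The sketch below is ours (the function names are not from [5]); the modified \(\widehat{\eta }\) would replace each plug-in norm with its unbiased counterpart from Sect. 3.

```python
import numpy as np

def cov_block(Xi, Xj):
    """Empirical (cross-)covariance block X_i' C X_j / (n - 1),
    with C = I - J/n the centering matrix."""
    n = Xi.shape[0]
    C = np.eye(n) - np.ones((n, n)) / n
    return Xi.T @ C @ Xj / (n - 1)

def rv_plugin(X1, X2):
    """Plug-in RV coefficient ||S12||^2 / (||S11|| ||S22||),
    all norms Frobenius."""
    S11 = cov_block(X1, X1)
    S22 = cov_block(X2, X2)
    S12 = cov_block(X1, X2)
    return np.linalg.norm(S12)**2 / (np.linalg.norm(S11) * np.linalg.norm(S22))
```

For identical blocks, \(\widetilde{\eta } = 1\); for uncorrelated blocks it lies near zero, and the bias of this plug-in version in high dimensions is what motivates the unbiased modification.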
Based on the aforementioned modification and subsequent test of zero correlation, we here extend the same concept and construct tests of the hypotheses in (1) and their multi-sample variants. The proposed tests will be valid for the following general multivariate model, which helps us relax the normality assumption. Define \(\mathbf{Y}_{ik} = \mathbf{X}_{ik} - {\varvec{\mu }}_i\), \(i = 1, \ldots , b\), so that \(\mathbf{Y}_k = (\mathbf{Y}^T_{1k}, \ldots , \mathbf{Y}^T_{bk})^T\) can replace \(\mathbf{X}_k\). We define the multivariate model as
with \(\mathbf{Z}_k \sim \mathcal {F}\), \({{\,{\mathrm{E}}\,}}(\mathbf{Z}_k) = \mathbf{0}_p\), \({{\,{\mathrm{Cov}}\,}}(\mathbf{Z}_k) = \mathbf{I}_p\), \(k = 1, \ldots , n\), where \(\mathcal {F}\) denotes a p-dimensional distribution function with finite fourth moments; see the assumptions below. Model (4) covers a wide class of distributions, such as the elliptical class including the multivariate normal. Further, it will help us extend the validity of the proposed tests to a variety of applications under fairly general conditions.
3 One-Sample Test with Multiple Blocks
Consider any two data matrices \(\mathbf{X}_i = (\mathbf{X}^T_{i1}, \ldots , \mathbf{X}^T_{in})^T \in \mathbb {R}^{n\times p_i}\) and \(\mathbf{X}_j = (\mathbf{X}^T_{j1}, \ldots , \mathbf{X}^T_{jn})^T \in \mathbb {R}^{n\times p_j}\) in \(\mathbf{X} \in \mathbb {R}^{n\times p}\), where \(\mathbf{X}_{ik} \in \mathbb {R}^{p_i}\), \(\mathbf{X}_{jk} \in \mathbb {R}^{p_j}\), \(k = 1, \ldots , n\), \(i, j = 1, \ldots , b\), \(i < j\). Following \(\widehat{\eta }\) in Eq. (3), the RV coefficient of correlation between \(\mathbf{X}_{ik}\) and \(\mathbf{X}_{jk}\) can be computed as
where \(\widehat{\eta }_{ij}\) estimates \(\eta _{ij} = \Vert {\varvec{\Sigma }}_{ij}\Vert ^2/\Vert {\varvec{\Sigma }}_{ii}\Vert \Vert {\varvec{\Sigma }}_{jj}\Vert \). Since \({\varvec{\Sigma }}_{ii} > 0\), \(\eta _{ij} = 0\) \(\Leftrightarrow \) \(\Vert {\varvec{\Sigma }}_{ij}\Vert ^2 = 0\) \(\forall ~i > j\), which implies that a test for \(\eta _{ij} = 0\) can be equivalently based on \({\varvec{\Sigma }}_{ij}\). In fact, as will be shown shortly, the denominator of \(\widehat{\eta }_{ij}\) adjusts itself through \({{\,{\mathrm{Var}}\,}}(\widehat{\Vert {\varvec{\Sigma }}_{ij}\Vert ^2})\) so that it suffices to use \(\widehat{\Vert {\varvec{\Sigma }}_{ij}\Vert ^2}\) to construct a test statistic. We begin by defining the estimators used to compose \(\widehat{\eta }_{ij}\) and their moments, which will also be useful in subsequent computations. For notational convenience, denote \({\varvec{\Sigma }}^{1/2}_{ii} = {\varvec{\Delta }}_{ii}\) so that \(\Vert {\varvec{\Delta }}_{ii}\Vert ^2 = {{\,{\mathrm{tr}}\,}}({\varvec{\Sigma }}_{ii})\) and correspondingly \(\Vert \widehat{\varvec{\Delta }}_{ii}\Vert ^2 = {{\,{\mathrm{tr}}\,}}(\widehat{\varvec{\Sigma }}_{ii})\). For the proof of the following theorem, see “Appendix 1.2.1”.
Theorem 1
The unbiased estimators of \(\Vert {\varvec{\Sigma }}_{ii}\Vert ^2\), \(\Vert {\varvec{\Sigma }}_{ij}\Vert ^2\), \(\big [\Vert {\varvec{\Delta }}_{ii}\Vert ^2\big ]^2\) and \(\Vert {\varvec{\Delta }}_{ii}\Vert \Vert {\varvec{\Delta }}_{jj}\Vert \) are defined as
where \(\nu = (n - 1)/n(n - 2)(n - 3)\), \(Q_{ij} = \sum _{k = 1}^n\widetilde{z}_{ik}\widetilde{z}_{jk}/(n - 1)\) and \(Q_{ii} = \sum _{k = 1}^n\widetilde{z}^2_{ik}/(n - 1)\) with \(\widetilde{z}_{ik} = \widetilde{\mathbf{X}}^T_{ik}\widetilde{\mathbf{X}}_{ik}\) and \(\widetilde{\mathbf{X}}_{ik} = \mathbf{X}_{ik} - \overline{\mathbf{X}}_i\), \(i, j = 1, \ldots , b\), \(i < j\).
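The quantities in Theorem 1 are inexpensive to compute; the following sketch (our notation, not from the paper) forms \(\widetilde{z}_{ik}\), the \(Q\)-terms and \(\nu \) from two data blocks:

```python
import numpy as np

def q_terms(Xi, Xj):
    """Building blocks of Theorem 1: z~_ik = X~_ik' X~_ik is the squared
    norm of the centered k-th row; Q_ij = sum_k z~_ik z~_jk / (n - 1);
    nu = (n - 1)/(n (n - 2)(n - 3))."""
    n = Xi.shape[0]
    zi = ((Xi - Xi.mean(axis=0))**2).sum(axis=1)  # z~_ik
    zj = ((Xj - Xj.mean(axis=0))**2).sum(axis=1)  # z~_jk
    Qii = (zi**2).sum() / (n - 1)
    Qjj = (zj**2).sum() / (n - 1)
    Qij = (zi * zj).sum() / (n - 1)
    nu = (n - 1) / (n * (n - 2) * (n - 3))
    return Qii, Qjj, Qij, nu
```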
Note that the terms \(Q_{ij}\) are needed to make the estimators, and hence the subsequent test statistics, valid under Model (4) beyond the normality assumption. Essentially, these terms involve fourth-order moments of \(\mathbf{X}_{ik}\), which arise from the variances of the bilinear forms to be computed. This in turn requires such moments of \(\mathbf{X}_{ik}\) to be bounded (see the assumptions below). For this, define
with \(z_{ik} = \mathbf{Y}^T_{ik}{} \mathbf{Y}_{ik}\). As \(\kappa _{ij}\) = 0 under normality, it serves as a measure of non-normality and, given finite fourth moment, helps extend the results to a wide class of distributions under Model (4). The results of Theorem 1 also help obtain unbiased estimators of \(\kappa _{ij}\) and \(\kappa _{ii}\) as (see “Appendix 1.2.1”)
The estimators in Theorem 1 are composed of \(\widehat{\varvec{\Sigma }}_{ij}\) (see Eq. 2) and \(Q_{ij}\), both of which are simple functions of the mean-deviated vectors \(\widetilde{\mathbf{X}}_{ik}\). This makes the estimators very simple and computationally highly efficient for practical use. For mathematical amenability, however, an alternative form of the same estimators in terms of U-statistics helps us study their properties, owing to the attractive projection and asymptotic theory of U-statistics.
Given \(\{\mathbf{X}_{ik}, \mathbf{X}_{ir}, \mathbf{X}_{il}, \mathbf{X}_{is}\}\), \(k \ne r \ne l \ne s\), from \(\mathbf{X}_i \in \mathbb {R}^{n\times p_i}\), define \(\mathbf{D}_{ikr}\) = \(\mathbf{X}_{ik} - \mathbf{X}_{ir}\) with \({{\,{\mathrm{E}}\,}}(\mathbf{D}_{ikr})\) = 0, \({{\,{\mathrm{Cov}}\,}}(\mathbf{D}_{ikr})\) = 2\({\varvec{\Sigma }}_{ii}\), \({{\,{\mathrm{E}}\,}}(\mathbf{D}_{ikr}{} \mathbf{D}^T_{jkr})\) = 2\({\varvec{\Sigma }}_{ij}\). For \(A_{ikr} = \mathbf{D}^T_{ikr}{} \mathbf{D}_{ikr}\), \({{\,{\mathrm{E}}\,}}(A_{ikr}) = 2\Vert {\varvec{\Delta }}_{ii}\Vert ^2\) and \({{\,{\mathrm{E}}\,}}(A_{iks}A_{jlr}) = 4\Vert {\varvec{\Delta }}_{ii}\Vert ^2\Vert {\varvec{\Delta }}_{jj}\Vert ^2\). With \(\mathbf{D}_{ils}\) defined similarly, let \(A_{ijlskr} = A_{ilskr}A_{jkrls}\) where \(A_{ilskr} = \mathbf{D}^T_{ils}{} \mathbf{D}_{ikr}\), so that \({{\,{\mathrm{E}}\,}}(A^2_{ilskr}) = 4\Vert {\varvec{\Sigma }}_{ii}\Vert ^2\), \({{\,{\mathrm{E}}\,}}(A_{ijlskr}) = 4\Vert {\varvec{\Sigma }}_{ij}\Vert ^2\). Denoting further \(B_{iikrls} = A^2_{ilskr} + A^2_{ilksr} + A^2_{ilrsk}\), \(C_{ijklrs} = A_{ijlskr} + A_{ijlksr} + A_{ijlrks}\), \(D_{ijkrls} = A_{ikr}A_{jls} + A_{ikl}A_{jrs} + A_{iks}A_{jlr}\) and \(P(n) = n(n - 1)(n - 2)(n - 3)\), the U-statistics versions of the estimators of \(\Vert {\varvec{\Sigma }}_{ii}\Vert ^2\), \(\Vert {\varvec{\Sigma }}_{ij}\Vert ^2\) and \(\Vert {\varvec{\Delta }}_{ii}\Vert ^2\Vert {\varvec{\Delta }}_{jj}\Vert ^2\) in Theorem 1, denoted \(U_1, U_2, U_3\), are defined, respectively, as
where the sum is a quadruple sum over \(\{k, l, r, s\}\) and \(\pi (\cdot ) = \pi (k,r,l,s)\) indicates that k, r, l, s are pairwise distinct. The moments in the following theorem follow conveniently using the alternative form of the estimators in (13) and will be very useful for further computations in the sequel.
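The full U-statistics in (13) symmetrize over several kernels; as a brute-force check of the key identity \({{\,{\mathrm{E}}\,}}(A_{ijlskr}) = 4\Vert {\varvec{\Sigma }}_{ij}\Vert ^2\), one can average that single kernel over all distinct quadruples. The \(O(n^4)\) sketch below (ours, for small n only) is an unbiased estimator of \(\Vert {\varvec{\Sigma }}_{ij}\Vert ^2\), though not the paper's exact \(U_2\):

```python
import numpy as np
from itertools import permutations

def cross_norm_ustat(Xi, Xj):
    """Average the kernel (D_ils' D_ikr)(D_jkr' D_jls) over all ordered
    distinct quadruples (k, r, l, s); its expectation is 4 ||Sigma_ij||^2,
    so dividing by 4 gives an unbiased estimate. Brute force: O(n^4)."""
    n = Xi.shape[0]
    total, count = 0.0, 0
    for k, r, l, s in permutations(range(n), 4):
        Dikr, Dils = Xi[k] - Xi[r], Xi[l] - Xi[s]
        Djkr, Djls = Xj[k] - Xj[r], Xj[l] - Xj[s]
        total += (Dils @ Dikr) * (Djkr @ Djls)
        count += 1
    return total / (4 * count)
```

When the two blocks coincide, the kernel is a square, so the resulting estimate of \(\Vert {\varvec{\Sigma }}_{ii}\Vert ^2\) is nonnegative.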
Theorem 2
For the unbiased estimators \(\widehat{\Vert {\varvec{\Sigma }}_{ij}\Vert ^2}\) and \(\widehat{\Vert {\varvec{\Sigma }}_{ii}\Vert ^2}\) in Theorem 1, we have
where \(a(n) = 3n^3 - 24n^2 + 44n + 20\), \(b(n) = 6n^3 - 40n^2 + 22n + 181\), \(c(n) = 2n^2 - 12n + 21\), \(d(n) = 2n^3 - 9n^2 + 9n - 16\), \(e(n) = n^2 - 3n + 8\) and K contains terms involving \(\kappa _{ij}\).
We skip the proof of Theorem 2, which follows from that of Theorem 2 in [5]. The terms \(KO(\cdot )\) collect constants that eventually vanish under the assumptions. In particular, K involves \(\kappa _{ij}\) in (10) and terms involving Hadamard products, such as \({\varvec{\Sigma }}_{ii}\odot {\varvec{\Sigma }}_{jj}\), which converge to zero. Now, we have the required tools to proceed with the test of \(H_{0b}\) in (1). As mentioned above,
Moreover, \(\eta _{ij} = 0 \Leftrightarrow \sum _{i < j}^{~b}\eta _{ij} = 0\), so that a test of \(H_{0b}\) can be based on a sum of Frobenius norms over all off-diagonal blocks, i.e., \(\sum _{i < j}^b\Vert {\varvec{\Sigma }}_{ij}\Vert ^2\). We thus define the test statistic for \(H_{0b}\) as
Here, \({{\,{\mathrm{T}}\,}}_{ij}\) is a statistic to test \(H_{0ij}: \eta _{ij} = 0\) for any single off-diagonal block \({\varvec{\Sigma }}_{ij}\). Moreover, the scaling factor \(p_ip_j\) will help us obtain the limit of \({{\,{\mathrm{T}}\,}}_b\) for \(p_i \rightarrow \infty \) along with \(n \rightarrow \infty \), under the following assumptions. Recall \(\mathbf{Y}_k \in \mathbb {R}^p\) in Model (4).
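Since \({{\,{\mathrm{E}}\,}}({{\,{\mathrm{T}}\,}}_b) = \sum _{i < j}\Vert {\varvec{\Sigma }}_{ij}\Vert ^2/p_ip_j\), the statistic aggregates the estimated squared norms of the off-diagonal blocks, each scaled by \(1/(p_ip_j)\). A sketch of this aggregation (plug-in block norms are used purely for illustration; the paper uses the unbiased estimators of Theorem 1):

```python
import numpy as np

def plugin_cross_norm(Xi, Xj):
    """Plug-in estimate of ||Sigma_ij||^2 from centered data blocks."""
    n = Xi.shape[0]
    Sij = (Xi - Xi.mean(0)).T @ (Xj - Xj.mean(0)) / (n - 1)
    return np.linalg.norm(Sij)**2

def tb_statistic(blocks, norm_est=plugin_cross_norm):
    """Sum the estimated ||Sigma_ij||^2 over all off-diagonal block pairs,
    scaled by 1/(p_i p_j), mirroring the structure of T_b."""
    total = 0.0
    for i in range(len(blocks)):
        for j in range(i + 1, len(blocks)):
            pi, pj = blocks[i].shape[1], blocks[j].shape[1]
            total += norm_est(blocks[i], blocks[j]) / (pi * pj)
    return total
```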
Assumption 3
\({{\,{\mathrm{E}}\,}}(Y^4_{ks}) = \gamma _s \le \gamma < \infty \), \(\forall ~s = 1, \ldots , p\).
Assumption 4
\(\lim _{p_i \rightarrow \infty }\frac{\Vert {\varvec{\Delta }}_{ii}\Vert ^2}{p_i} = O(1)\), \(\forall ~i = 1, \ldots , b\).
Assumption 5
\(\lim _{n, p_i \rightarrow \infty }\frac{p_i}{n} \rightarrow \zeta _0 \in (0, \infty )\), \(\forall ~i = 1, \ldots , b\).
Assumption 3 bounds the fourth moment of \(\mathbf{Y}_k\) so that, by the Cauchy–Schwarz inequality, \({{\,{\mathrm{E}}\,}}(Y^2_{k1s}Y^2_{k2s}) < \infty \), which implies that \(\kappa _{ij}/\Vert {\varvec{\Sigma }}_{ii}\Vert ^2\Vert {\varvec{\Sigma }}_{jj}\Vert ^2 \rightarrow 0\). This conforms to the definition of \(\kappa _{ij}\), helps us present the test under Model (4), and makes the terms involving K vanish in Theorem 2. Assumption 4 bounds the average of the eigenvalues of the diagonal blocks. It is a mild assumption, often used in high-dimensional testing, and as a consequence, \(\Vert {\varvec{\Sigma }}_{ij}\Vert ^2/p_ip_j = O(1)\).
Whereas Assumption 4 is needed for the limits under both \(H_{0b}\) and \(H_{1b}\), its consequence is needed only under \(H_{1b}\), since \({\varvec{\Sigma }}_{ij} = \mathbf{0}\) under \(H_{0b}\), where the condition is neither informative nor required. Assumption 5 controls the joint growth of n and \(p_i\) so that the limit holds in a high-dimensional framework; it, too, is needed only under \(H_{1b}\). Now, for \({{\,{\mathrm{T}}\,}}_b\), we have \({{\,{\mathrm{E}}\,}}({{\,{\mathrm{T}}\,}}_b) = \sum _{i < j}^{~b}\Vert {\varvec{\Sigma }}_{ij}\Vert ^2/p_ip_j\) = 0 under \(H_{0b}\) and
Equation (18) involves \({{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_{ij})\) and the covariances, say \(C_1\), \(C_2\), \(C_3\), which are given below.
where \(a_1(n) = 3n^3 - 38n^2 + 170n - 262\), \(b_1(n) = 6n^3 - 57n^2 + 178n - 182\), \(a_2(n) = 3n^3 - 39n^2 + 176n - 269\), \(b_2(n) = 6n^3 - 70n^2 + 286n - 199\). Under \(H_{0b}\), the covariances vanish and the variance reduces to
where \({{\,{\mathrm{V}}\,}}_{ij} = \Vert {\varvec{\Sigma }}_{ii}\Vert ^2\Vert {\varvec{\Sigma }}_{jj}\Vert ^2/p^2_ip^2_j\). It implies that \({{\,{\mathrm{Var}}\,}}(n{{\,{\mathrm{T}}\,}}_b)\) is bounded and a non-degenerate limit of \(n{{\,{\mathrm{T}}\,}}_b\) may exist. That this indeed holds under the assumptions follows from the limit of \({{\,{\mathrm{T}}\,}}_{ij}\) given in Ahmad ([5], Theorem 3), by noting that the covariance terms above essentially vanish under the same assumptions. The following theorem summarizes the result; its proof, along with that of Theorem 12 for the multi-sample case, is given in “Appendix 1.2.3”.
Theorem 6
Given \({{\,{\mathrm{T}}\,}}_b\) in (17) and Assumptions 3–5, \(({{\,{\mathrm{T}}\,}}_b - {{\,{\mathrm{E}}\,}}({{\,{\mathrm{T}}\,}}_b))/\sigma _{{{\,{\mathrm{T}}\,}}_b} ~\xrightarrow {\mathcal{D}}~ N(0, 1)\), as \(n, p_i \rightarrow \infty \), with \(\sigma ^2_{{{\,{\mathrm{T}}\,}}_b} = {{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_b)\) in (18). Under \(H_{0b}\) and Assumptions 3–4, \(n{{\,{\mathrm{T}}\,}}_b/\sigma _{{{\,{\mathrm{T}}\,}}_{b}} ~\xrightarrow {\mathcal{D}}~ N(0, 1)\) with \(\sigma ^2_{{{\,{\mathrm{T}}\,}}_b}\) as in Eq. (22). Further, the same limits hold when \({{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_b)\) is replaced with \(\widehat{{{\,{\mathrm{Var}}\,}}}({{\,{\mathrm{T}}\,}}_b)\).
The last part of Theorem 6 makes the statistic \({{\,{\mathrm{T}}\,}}_b\) applicable, when a consistent estimator, i.e.,
is plugged in for the true variance, where \(\widehat{{{\,{\mathrm{V}}\,}}}_{ij} = \widehat{\Vert {\varvec{\Sigma }}_{ii}\Vert ^2}\widehat{\Vert {\varvec{\Sigma }}_{jj}\Vert ^2}/p^2_ip^2_j\). From the proof of the theorem, we also note that the moments and the limit of \({{\,{\mathrm{T}}\,}}_b\) are functions of \(\eta _{ij} = \Vert {\varvec{\Sigma }}_{ij}\Vert ^2/\Vert {\varvec{\Sigma }}_{ii}\Vert \Vert {\varvec{\Sigma }}_{jj}\Vert \). For the power of \({{\,{\mathrm{T}}\,}}_b\), let \(\beta ({\varvec{\theta }})\) be the power function with \({\varvec{\theta }} \in {\varvec{\Theta }}\), where
are the parameter subspaces under \(H_{0b}\) and \(H_{1b}\), respectively. Let \(z_\alpha \) denote the upper 100\(\alpha \)% quantile of the N(0, 1) distribution, so that \(P({{\,{\mathrm{T}}\,}}_b/\sigma _{{{\,{\mathrm{T}}\,}}_{b0}} \ge z_\alpha ) = \alpha \), by Theorem 6, where \(\sigma ^2_{{{\,{\mathrm{T}}\,}}_{b0}}\) denotes \({{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_b)\) under \(H_{0b}\). Also, let \(\gamma = \sigma _{T_{b0}}/\sigma _{T_b}\), \(\delta = {{\,{\mathrm{E}}\,}}({{\,{\mathrm{T}}\,}}_b)/\sigma _{{{\,{\mathrm{T}}\,}}_b}\) with \({{\,{\mathrm{E}}\,}}({{\,{\mathrm{T}}\,}}_b) = \sum _{i < j}^b\Vert {\varvec{\Sigma }}_{ij}\Vert ^2/p_ip_j\), \(\sigma ^2_{{{\,{\mathrm{T}}\,}}_b} = {{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_b)\). Then, \(\beta ({\varvec{\theta }}|H_{1b})\) = \(P(({{\,{\mathrm{T}}\,}}_b - {{\,{\mathrm{E}}\,}}({{\,{\mathrm{T}}\,}}_b))/\sigma _{{{\,{\mathrm{T}}\,}}_b} \ge \gamma z_\alpha - \delta )\) = 1 - \(\Phi (\gamma z_\alpha - \delta )\) defines the power of \({{\,{\mathrm{T}}\,}}_b\), where \(\Phi (\cdot )\) is the distribution function of N(0, 1). From Theorem 2, \(\gamma = O(1)\) and \(\delta = O(n)\) under the assumptions, so that 1 - \(\Phi (\gamma z_\alpha - \delta ) \rightarrow \) 1 as \(n, p_i \rightarrow \infty \). Finally, letting \({\varvec{\Sigma }}_{ij} = \mathbf{A}_{ij}/\sqrt{n}\) for a fixed matrix \(\mathbf{A}_{ij}\), \(i, j = 1, \ldots , b\), \(i < j\), similar arguments imply that the local power also converges to 1 as \(n, p_i \rightarrow \infty \).
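The power expression above is straightforward to evaluate numerically; a minimal sketch using only the standard library (function names are ours), taking \(\gamma \) and \(\delta \) as given:

```python
from math import erf, sqrt

def std_normal_cdf(x):
    """Phi(x), the N(0, 1) distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def power_tb(gamma, delta, z_alpha=1.6448536269514722):
    """beta = 1 - Phi(gamma * z_alpha - delta), here at alpha = 0.05;
    gamma = sigma_{Tb0}/sigma_{Tb} and delta = E(T_b)/sigma_{Tb}."""
    return 1.0 - std_normal_cdf(gamma * z_alpha - delta)
```

Under \(H_{0b}\) (\(\gamma = 1\), \(\delta = 0\)) this returns the size \(\alpha \); since \(\delta = O(n)\), the power tends to 1 as n grows.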
4 Multi-Sample Extension with Multiple Blocks
Consider the multi-sample set up given after (1) with \(\mathbf{X}_{lk} = (\mathbf{X}^T_{l1k}, \ldots , \mathbf{X}^T_{lbk})^T \in \mathbb {R}^p\), \(\mathbf{X}_{lik} \in \mathbb {R}^{p_i}\), \(k = 1, \ldots , n_l\), as iid vectors from population l, where \({{\,{\mathrm{E}}\,}}(\mathbf{X}_{lk}) = {\varvec{\mu }}_l = ({\varvec{\mu }}^T_{l1}, \ldots , {\varvec{\mu }}^T_{lb})^T \in \mathbb {R}^p\), \({\varvec{\mu }}_{li} \in \mathbb {R}^{p_i}\), \({{\,{\mathrm{Cov}}\,}}(\mathbf{X}_{lk}) = {\varvec{\Sigma }}_l = ({\varvec{\Sigma }}_{lij})_{i,j = 1}^{~~b} \in \mathbb {R}^{p\times p}\), \(l = 1, \ldots , g\), \({\varvec{\Sigma }}_{lij} = {{\,{\mathrm{E}}\,}}(\mathbf{X}_{lik} - {\varvec{\mu }}_{li})\otimes (\mathbf{X}_{ljk} - {\varvec{\mu }}_{lj})^T \in \mathbb {R}^{p_i\times p_j}\), \(i < j\), \({\varvec{\Sigma }}_{lji} = {\varvec{\Sigma }}^T_{lij}\), \({\varvec{\Sigma }}^T_{lii} = {\varvec{\Sigma }}_{lii}\). Then \(\overline{\mathbf{X}}_l = \sum _{k = 1}^{n_l}{} \mathbf{X}_{lk}/n_l\), \(\widehat{\varvec{\Sigma }}_l = \sum _{k = 1}^{n_l}(\widetilde{\mathbf{X}}_{lk}\otimes \widetilde{\mathbf{X}}_{lk})/(n_l - 1)\) are unbiased estimators of \({\varvec{\mu }}_l\), \({\varvec{\Sigma }}_l\), correspondingly partitioned as \(\overline{\mathbf{X}}_l = (\overline{\mathbf{X}}^T_{l1}, \ldots , \overline{\mathbf{X}}^T_{lb})^T\), \(\widehat{\varvec{\Sigma }}_l = (\widehat{\varvec{\Sigma }}_{lij})_{i, j = 1}^{~~b}\) with \(\overline{\mathbf{X}}_{li} = \sum _{k = 1}^{n_l}{} \mathbf{X}_{lik}/n_l\), \(\widehat{\varvec{\Sigma }}_{lij} = \sum _{k = 1}^{n_l}(\widetilde{\mathbf{X}}_{lik}\otimes \widetilde{\mathbf{X}}_{ljk})/(n_l - 1)\) as unbiased estimators of \({\varvec{\mu }}_{li}\) and \({\varvec{\Sigma }}_{lij}\), where \(\widetilde{\mathbf{X}}_{lk} = \mathbf{X}_{lk} - \overline{\mathbf{X}}_{l}\) and \(\widetilde{\mathbf{X}}_{lik} = \mathbf{X}_{lik} - \overline{\mathbf{X}}_{li}\).
Denoting \(\mathbf{X}_l = (\mathbf{X}^T_{l1}, \ldots , \mathbf{X}^T_{ln_l})^T \in \mathbb {R}^{n_l\times p}\) as the entire data matrix for the lth sample, with \(\mathbf{X}_{li} = (\mathbf{X}^T_{li1}, \ldots , \mathbf{X}^T_{lin_l})^T \in \mathbb {R}^{n_l\times p_i}\) as the ith data matrix in \(\mathbf{X}_l\), we can rewrite the estimators, using \(\mathbf{C}_l = \mathbf{I}_{n_l} - \mathbf{J}_{n_l}/n_l\), \(l = 1, \ldots , g\), as
The RV coefficient in Eq. (3) can now be computed from lth sample, \(l = 1, \ldots , g\), as
which estimates \(\eta _{lij} = \Vert {\varvec{\Sigma }}_{lij}\Vert ^2/\Vert {\varvec{\Sigma }}_{lii}\Vert \Vert {\varvec{\Sigma }}_{ljj}\Vert \). We are interested in testing the hypotheses in (1) for the common \({\varvec{\Sigma }}\), assuming \({\varvec{\Sigma }}_l = {\varvec{\Sigma }}\) \(\forall ~l\), i.e., in testing whether \({\varvec{\Sigma }}\) is block diagonal, \({\varvec{\Sigma }} = \text {{diag}}({\varvec{\Sigma }}_{11}, \ldots , {\varvec{\Sigma }}_{bb})\). Formally,
where the subscript g refers to g populations. A test of \(H_{0g}\) can thus be constructed by pooling information from g populations. For this, we state Assumptions 3–5 for the multi-sample case.
Assumption 7
\({{\,{\mathrm{E}}\,}}(Y^4_{lks}) = \gamma _s \le \gamma < \infty \), \(\forall ~s = 1, \ldots , p\), \(l = 1, \ldots , g\).
Assumption 8
\(\lim _{p_i \rightarrow \infty }\frac{\Vert {\varvec{\Delta }}_{lii}\Vert ^2}{p_i} = O(1)\), \(\forall ~i = 1, \ldots , b\), \(l = 1, \ldots , g\).
Assumption 9
\(\lim _{n_l, p_i \rightarrow \infty }\frac{p_i}{n_l} \rightarrow \zeta _l \le \zeta _0 \in (0, \infty )\), \(\forall ~i = 1, \ldots , b\), \(l = 1, \ldots , g\).
Now, we begin by writing the estimators in Eqs. (6)–(9) for the lth sample as
where \(\nu _l = (n_l - 1)/n_l(n_l - 2)(n_l - 3)\), \(Q_{lij} = \sum _{k = 1}^{n_l}\widetilde{z}_{lik}\widetilde{z}_{ljk}/(n_l - 1)\) and \(Q_{lii} = \sum _{k = 1}^{n_l}\widetilde{z}^2_{lik}/(n_l - 1)\) with \(\widetilde{z}_{lik} = \widetilde{\mathbf{X}}^T_{lik}\widetilde{\mathbf{X}}_{lik}\) and \(\widetilde{\mathbf{X}}_{lik} = \mathbf{X}_{lik} - \overline{\mathbf{X}}_{li}\), \(i, j = 1, \ldots , b\), \(i < j\). Note that these are unbiased estimators of \(\Vert {\varvec{\Sigma }}_{ii}\Vert ^2\), \(\Vert {\varvec{\Sigma }}_{ij}\Vert ^2\), \(\big [\Vert {\varvec{\Delta }}_{ii}\Vert ^2\big ]^2\) and \(\Vert {\varvec{\Delta }}_{ii}\Vert \Vert {\varvec{\Delta }}_{jj}\Vert \), respectively, obtained from sample l, so that they can be used to construct pooled estimators of the same parameters, as follows.
Theorem 10
Denote \(\nu _0 = \sum _{l = 1}^g\nu ^{-1}_l\) with \(\nu _l = (n_l - 1)/n_l(n_l - 2)(n_l - 3)\). The pooled unbiased estimators of \(\Vert {\varvec{\Sigma }}_{ii}\Vert ^2\), \(\Vert {\varvec{\Sigma }}_{ij}\Vert ^2\), \(\big [\Vert {\varvec{\Delta }}_{ii}\Vert ^2\big ]^2\) and \(\Vert {\varvec{\Delta }}_{ii}\Vert \Vert {\varvec{\Delta }}_{jj}\Vert \) are defined, respectively, as
In pooling information across g samples, Eqs. (31)–(34) use weights \(1/\nu _l\), where the pooled estimators correspond to \({\varvec{\Sigma }} = ({\varvec{\Sigma }}_{ij})_{i, j = 1}^{~~b}\) for which \(H_{0g}\) is defined. Thus, an appropriate test statistic for \(H_{0g}\) can be defined as
which extends Eq. (17) for the multi-sample case under homogeneity. Equivalently, we can write
In this form, \({{\,{\mathrm{T}}\,}}_{lij}\), and hence \({{\,{\mathrm{T}}\,}}_{lb}\), are the same as \({{\,{\mathrm{T}}\,}}_{ij}\) and \({{\,{\mathrm{T}}\,}}_b\) in Sect. 3, but now defined for the lth population. In either case, the main focus in the formulation of \({{\,{\mathrm{T}}\,}}_g\) is simplicity, so that, by the independence of the g samples, the computations for the one-sample case largely suffice for the multi-sample case, as, for example, in the results of the following theorem; see “Appendix 1.2.2” for the proof.
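The pooling in Eqs. (31)–(34) is a weighted average with weights \(\nu ^{-1}_l/\nu _0\); since the weights sum to one, unbiasedness of the per-sample estimators carries over directly. A sketch of our reading of this scheme (the function name is ours):

```python
def pooled_estimate(estimates, sample_sizes):
    """Pool per-sample unbiased estimates theta_l with weights 1/nu_l,
    nu_l = (n_l - 1)/(n_l (n_l - 2)(n_l - 3)), normalized by
    nu_0 = sum_l 1/nu_l. Requires every n_l > 3."""
    inv_nu = [n * (n - 2) * (n - 3) / (n - 1) for n in sample_sizes]
    nu0 = sum(inv_nu)
    return sum(w * e for w, e in zip(inv_nu, estimates)) / nu0
```

Because the normalized weights sum to one, identical per-sample estimates pool to the same value, and larger samples receive proportionally more weight.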
Theorem 11
The pooled estimators \(\widehat{\Vert {\varvec{\Sigma }}_{pii}\Vert ^2}\) and \(\widehat{\Vert {\varvec{\Sigma }}_{pij}\Vert ^2}\) in Eqs. (31)–(32) are unbiased for \(\Vert {\varvec{\Sigma }}_{ii}\Vert ^2\) and \(\Vert {\varvec{\Sigma }}_{ij}\Vert ^2\), respectively. Further, under Assumptions 7–9, as \(n_l, p_i \rightarrow \infty \),
where \(\nu _0 = \sum _{l = 1}^g\nu ^{-1}_l\) and \(\eta _{ij} = \Vert {\varvec{\Sigma }}_{ij}\Vert ^2/\Vert {\varvec{\Sigma }}_{ii}\Vert \Vert {\varvec{\Sigma }}_{jj}\Vert \), \(i, j = 1,\ldots , b\), \(i < j\).
Theorem 11 provides bounds on the variance ratios, which are what matter most for our purpose; the exact variances, which follow from those of the single-sample case in Theorem 2, are given in “Appendix 1.2.2”. Moreover, Eq. (38) also implies that \(\widehat{\Vert {\varvec{\Sigma }}_{pii}\Vert ^2}/p^2_i\) is a consistent estimator of \(\Vert {\varvec{\Sigma }}_{ii}\Vert ^2/p^2_i\), \(i = 1, \ldots , b\), as \(n_l, p_i \rightarrow \infty \). Now, from Theorem 11,
since \({{\,{\mathrm{Cov}}\,}}({{\,{\mathrm{T}}\,}}_{lb}, {{\,{\mathrm{T}}\,}}_{l'b}) = 0\), \(l \ne l'\), by independence, where \({{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_{lb})\) follows directly from Eq. (18) as
\(l = 1, \ldots , g\). Equations (19)–(21) provide variances and covariances in \({{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_{lb})\), using the results of Theorem 2 with \({{\,{\mathrm{T}}\,}}_{ij}\)’s replaced with \({{\,{\mathrm{T}}\,}}_{lij}\) and n with \(n_l\). We therefore skip unnecessary repetitions. Now, under \(H_{0g}\), \({{\,{\mathrm{E}}\,}}({{\,{\mathrm{T}}\,}}_g) = 0\) and
where \(\nu _0 = \sum _{l = 1}^g\nu ^{-1}_l\), \(P(n_l) = n_l(n_l - 1)(n_l - 2)(n_l - 3)\) and \({{\,{\mathrm{V}}\,}}_{lij} = \Vert {\varvec{\Sigma }}_{lii}\Vert ^2\Vert {\varvec{\Sigma }}_{ljj}\Vert ^2/p^2_ip^2_j\). By the independence of the samples, the limit of \({{\,{\mathrm{T}}\,}}_g\) follows from its construction and moments along the same lines as that of \({{\,{\mathrm{T}}\,}}_b\), without much new computation. Likewise, the consistency of \(\widehat{\Vert {\varvec{\Sigma }}_{lii}\Vert ^2}/p^2_i\) follows from that of \(\widehat{\Vert {\varvec{\Sigma }}_{ii}\Vert ^2}/p^2_i\), so that, plugged in, these estimators yield \(\widehat{{{\,{\mathrm{Var}}\,}}}({{\,{\mathrm{T}}\,}}_g)\) as a consistent estimator of \({{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_g)\), as follows, where \(\widehat{{{\,{\mathrm{V}}\,}}}_{pij} = \widehat{\Vert {\varvec{\Sigma }}_{pii}\Vert ^2}\widehat{\Vert {\varvec{\Sigma }}_{pjj}\Vert ^2}/p^2_ip^2_j\):
The following theorem extends Theorem 6 to the multi-sample case; see “Appendix 1.2.3” for proof.
Theorem 12
For \({{\,{\mathrm{T}}\,}}_g\) in (35) and Assumptions 7–9, \(({{\,{\mathrm{T}}\,}}_g - {{\,{\mathrm{E}}\,}}({{\,{\mathrm{T}}\,}}_g))/\sigma _{{{\,{\mathrm{T}}\,}}_g} ~\xrightarrow {\mathcal {D}}~ N(0, 1)\), as \(n_l, p_i \rightarrow \infty \), with \(\sigma ^2_{{{\,{\mathrm{T}}\,}}_g} = {{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_g)\) in Eq. (40). Under \(H_{0g}\) and Assumptions 7–8, \(\sqrt{\nu _0}{{\,{\mathrm{T}}\,}}_g/\sigma _{{{\,{\mathrm{T}}\,}}_g}\) \(\xrightarrow {\mathcal {D}}\) N(0, 1) with \(\sigma ^2_{{{\,{\mathrm{T}}\,}}_g}\) in Eq. (42). Further, the limits hold if \({{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_g)\) is replaced with a consistent estimator \(\widehat{{{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_g)}\).
5 Simulations
We evaluate the performance of \({{\,{\mathrm{T}}\,}}_b\) and \({{\,{\mathrm{T}}\,}}_g\) through simulations. For the one-sample multi-block statistic, \({{\,{\mathrm{T}}\,}}_b\), we take \(b = 3\) and sample random vectors of sizes \(n \in \{20, 50, 100\}\) from the multivariate normal, t and chi-square distributions, with 10 degrees of freedom each for the latter two, and block dimensions \(p_1 \in \{10, 25, 50, 150, 300\}\), \(p_2 = 2p_1\), \(p_3 = 3p_1\). Two patterns are assumed for the true covariance matrix, compound symmetry (CS) and AR(1), defined, respectively, as \((1 - \rho )\mathbf{I}_p + \rho \mathbf{J}_p\) and \({\varvec{\Sigma }} = \mathbf{B}{} \mathbf{A}{} \mathbf{B}\), where \(\mathbf{A}\) has \((k, l)\)th entry \(\rho ^{|k - l|^{1/5}}\), \(\mathbf{B}\) is a diagonal matrix whose entries are the square roots of \(\rho + \mathbf{p}_1/p\) with \(\mathbf{p}_1 = 1:p\), \(\mathbf{I}\) is the identity matrix and \(\mathbf{J}\) is the matrix of 1s. We set \(\rho = 0.5\). Under \(H_{0b}\), the same structures are imposed on the three diagonal blocks \({\varvec{\Sigma }}_{ii}\) of dimensions \(p_i\), so that \({\varvec{\Sigma }} = \oplus _{i = 1}^3{\varvec{\Sigma }}_{ii}\), where \(\oplus \) denotes the direct sum (block-diagonal concatenation).
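As a concrete illustration, the two covariance patterns and the block-diagonal null structure described above can be generated as follows. This is a minimal sketch under the stated constructions; the function names are ours, not from the paper.

```python
import numpy as np

def cs_cov(p, rho=0.5):
    # Compound symmetry: (1 - rho) I_p + rho J_p
    return (1 - rho) * np.eye(p) + rho * np.ones((p, p))

def ar_cov(p, rho=0.5):
    # AR(1)-type: Sigma = B A B with A_kl = rho^{|k - l|^{1/5}}
    # and B = diag(sqrt(rho + k/p)), k = 1, ..., p
    idx = np.arange(1, p + 1)
    A = rho ** (np.abs(idx[:, None] - idx[None, :]) ** 0.2)
    B = np.diag(np.sqrt(rho + idx / p))
    return B @ A @ B

def direct_sum(blocks):
    # Block-diagonal concatenation of square matrices
    p = sum(B.shape[0] for B in blocks)
    S = np.zeros((p, p))
    off = 0
    for B in blocks:
        q = B.shape[0]
        S[off:off + q, off:off + q] = B
        off += q
    return S

# Null structure with b = 3 blocks of sizes p1, 2*p1, 3*p1
p1 = 10
Sigma0 = direct_sum([cs_cov(p) for p in (p1, 2 * p1, 3 * p1)])

# n multivariate normal vectors under the null
rng = np.random.default_rng(0)
X = rng.multivariate_normal(np.zeros(Sigma0.shape[0]), Sigma0, size=20)
```

Under the null, the off-diagonal blocks of `Sigma0` are exactly zero, so the sampled vectors have uncorrelated sub-vectors by construction.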
For the multi-sample statistic, \({{\,{\mathrm{T}}\,}}_g\), we take \(g = 2\) with \(b = 2\), and generate \(n_1 \in \{20, 40, 75\}\) and \(n_2 \in \{30, 50, 100\}\) iid vectors from the respective populations with \(p_1 \in \{20, 50, 100, 300, 500\}\) and \(p_2 = 2p_1\). Under homoscedasticity, the common \({\varvec{\Sigma }}\) is assumed to follow the two covariance structures under the alternative, while under the null the two diagonal blocks follow the same structures with their respective dimensions. The estimated size and power of \({{\,{\mathrm{T}}\,}}_b\) and \({{\,{\mathrm{T}}\,}}_g\), reported in Tables 1, 2, 3 and 4, respectively, are averages over 1000 simulation runs.
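The size and power estimates above are Monte Carlo rejection rates. A generic sketch of such an estimate for a standardized statistic with an asymptotic N(0, 1) null limit is given below; `stat_fn` and `sampler` are placeholders of ours, not quantities from the paper.

```python
import numpy as np

def empirical_rejection_rate(stat_fn, sampler, n_runs=1000, z_crit=1.96):
    """Monte Carlo rejection rate of a standardized test statistic that is
    asymptotically N(0, 1) under the null: reject when |Z| > z_crit
    (two-sided 5% level)."""
    rng = np.random.default_rng(1)
    rejections = 0
    for _ in range(n_runs):
        z = stat_fn(sampler(rng))
        if abs(z) > z_crit:
            rejections += 1
    return rejections / n_runs

# Sanity check: a statistic that is exactly N(0, 1) under the null
# should reject close to 5% of the time.
rate = empirical_rejection_rate(lambda z: z, lambda rng: rng.standard_normal())
```

Applied under the null, this returns the empirical size; applied under an alternative, the empirical power.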
We observe an accurate test size for both \({{\,{\mathrm{T}}\,}}_b\) and \({{\,{\mathrm{T}}\,}}_g\) under all parameter settings and all distributions. For the multi-sample case, the test performs relatively accurately even though the sample sizes differ considerably. Note that both tests are generally very conservative for relatively small sample sizes, but improve with increasing sample size. In particular, the accuracy of the tests remains intact as the dimension increases. The results under the t and chi-square distributions point to the robustness of the tests to the normality assumption, under Model (4).
Similar performance is witnessed for the power of both test statistics. As with the test size, whose accuracy increases with the sample size, the power also improves with increasing sample size, and moreover with increasing dimension, even when the block dimensions differ substantially. In particular, for the multi-sample case, \({{\,{\mathrm{T}}\,}}_g\) performs accurately for unbalanced designs, and this accuracy is not disturbed by increasing dimension. The robustness of the two tests likewise resembles that observed for the test size.
6 Discussion and Conclusions
Test statistics for correlation between two or more vectors are presented for the case when the dimensions of the vectors, possibly unequal, may exceed the number of vectors. The one-sample case is further extended to two or more independent samples drawn from populations assumed to share a common covariance matrix. The accuracy of the tests is shown through simulations over a range of parameters. The potential advantages of the tests include their simple construction, particularly for the multi-sample case, where most computations follow conveniently from the one-sample results, and their wide practical applicability under fairly general conditions for a large class of multivariate models, including the multivariate normal.
A particularly distinguishing feature of the tests is that they are composed of computationally very efficient estimators defined as simple functions of the empirical covariance matrix. From an applications perspective, it should be mentioned that the tests are constructed using the RV coefficient, so that they can only be used to assess linear independence of vectors. This distinguishes them from measures such as distance correlation or kernel methods, which can also detect nonlinear dependence; see [5] for more details.
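For reference, the classical sample RV coefficient of Robert and Escoufier (1976), which underlies the tests, can be computed directly from the empirical covariance blocks. The sketch below implements the unmodified coefficient, not the paper's high-dimensional modification.

```python
import numpy as np

def rv_coefficient(X, Y):
    """Classical sample RV coefficient (Robert and Escoufier, 1976):
    RV = ||S_xy||_F^2 / (||S_xx||_F ||S_yy||_F), computed from the
    empirical covariance blocks; it lies in [0, 1], with 0 indicating
    no linear association between the two vectors."""
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / (n - 1)
    Syy = Yc.T @ Yc / (n - 1)
    Sxy = Xc.T @ Yc / (n - 1)
    return np.sum(Sxy ** 2) / (np.linalg.norm(Sxx) * np.linalg.norm(Syy))

rng = np.random.default_rng(4)
X = rng.standard_normal((50, 10))
Y = rng.standard_normal((50, 15))
r = rv_coefficient(X, Y)   # small for independent blocks, finite-n bias aside
```

Since the coefficient is built from covariance blocks only, a value near zero is evidence only against linear association, which is exactly the limitation noted above.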
References
Ahmad MR (2016) On testing sphericity and identity of a covariance matrix with large dimensions. Math Methods Stat 25(2):121–132
Ahmad MR (2017a) Location-invariant multi-sample \(U\)-tests for covariance matrices with large dimension. Scand J Stat 44:500–523
Ahmad MR (2017b) Location-invariant tests of homogeneity of large dimensional covariance matrices. J Stat Theory Pract 11:731–745
Ahmad MR (2018) A unified approach to testing mean vectors with large dimensions. AStA Adv Stat Anal. https://doi.org/10.1007/s10182-018-00343-z
Ahmad MR (2019) A significance test of the RV coefficient for large dimensions. Comput Stat Data Anal 131:116–130
Albert M, Bouret Y, Fromont M, Reynaud-Bouret P (2015) Bootstrap and permutation tests of independence for point processes. Ann Stat 43:2537–2564
Allaire J, Lepage Y (1992) A procedure for assessing vector correlations. Ann Inst Stat Math 44:755–768
Anderson TW (1999) Asymptotic theory for canonical correlation analysis. J Multivar Anal 70:1–29
Anderson TW (2003) An introduction to multivariate statistical analysis, 3rd edn. Wiley, Hoboken
Bakirov NK, Rizzo ML, Székely GJ (2006) A multivariate nonparametric test of independence. J Multivar Anal 97:1742–1756
de Wet T (1980) Cramér-von Mises tests for independence. J Multivar Anal 10:38–50
Gretton A, Bousquet O, Smola A, Schölkopf B (2005) Measuring statistical dependence with Hilbert-Schmidt norms. In: Jain et al (eds) Algorithmic learning theory. Springer, Berlin
Gretton A, Györfi L (2010) Consistent nonparametric tests of independence. J Mach Learn Res 11:1391–1423
Horváth L, Hušková M, Rice G (2013) Test of independence for functional data. J Multivar Anal 117:100–119
Johnson RA, Wichern DW (2007) Applied multivariate statistical analysis, 6th edn. Prentice Hall, Upper Saddle River
Kettenring JR (1971) Canonical analysis of several sets of variables. Biometrika 58:433–451
Koroljuk VS, Borovskich YV (1994) Theory of \(U\)-statistics. Kluwer Press, Dordrecht
Lehmann EL (1999) Elements of large-sample theory. Springer, Berlin
Muirhead RJ (2005) Aspects of multivariate statistical theory. Wiley, Hoboken
Muirhead RJ, Waternaux CM (1980) Asymptotic distributions in canonical correlation analysis and other multivariate procedures for nonnormal populations. Biometrika 67:31–43
Nkiet GM (2017) Asymptotic theory of multiple-set linear canonical analysis. Math Methods Stat 26:196–211
Pfister N, Bühlmann P, Schölkopf B, Peters J (2017) Kernel-based tests for joint independence. J R Stat Soc B 80(1):5–31
Ramsay JO, ten Berge J, Styan GPH (1984) Matrix correlation. Psychometrika 49:403–423
Robert P, Cleroux R, Ranger N (1985) Some results on vector correlation. Comput Stat Data Anal 3:25–32
Robert P, Escoufier Y (1976) A unifying tool for linear multivariate statistical methods: the RV-coefficient. J R Stat Soc C 25:257–265
Schott JR (2008) A test for independence of two sets of variables when the number of variables is large relative to the sample size. Stat Probab Lett 78:3096–3102
Searle SR (1971) Linear models. Wiley, Hoboken
Seber GAF (2004) Multivariate observations. Wiley, Hoboken
Serfling RJ (1980) Approximation theorems of mathematical statistics. Wiley, Hoboken
Székely G, Rizzo M (2013) The distance correlation \(t\)-test of independence in high dimension. J Multivar Anal 117:193–213
van der Vaart AW (1998) Asymptotic statistics. Cambridge University Press, Cambridge
Wang G, Zou C, Wang Z (2013) A necessary test for complete independence in high dimensions using rank correlations. J Multivar Anal 121:224–232
Yang Y, Pan G (2015) Independence test for high dimensional data based on regularized canonical correlation coefficients. Ann Stat 43:467–500
Yata K, Aoshima M (2013) Correlation test for high-dimensional data using extended cross-data-matrix methodology. J Multivar Anal 117:313–331
Acknowledgements
The author is thankful to the editors and the two referees for their comments which helped improve the original draft of the article.
Ethics declarations
Conflict of interest
There is no conflict of interest concerning this manuscript.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix 1: Technical Results and Proofs
1.1 Some Basic Moments
Given Model (4) with \(\mathbf{Y}_{ik} = \mathbf{X}_{ik} - {\varvec{\mu }}_i\), let \(A_{ik} = \mathbf{Y}^T_{ik}{} \mathbf{Y}_{ik}\), \(A_{ikr} = \mathbf{Y}^T_{ik}{} \mathbf{Y}_{ir}\), \(k \ne r\), and let \(\kappa _{ii}\), \(\kappa _{ij}\) be as in Eq. (10); then the moments in the following theorem hold under Model (4).
Theorem 13
For \(A_{ik}\), \(A_{ikr}\) defined above, \({{\,{\mathrm{E}}\,}}(A_{ik}) = {{\,{\mathrm{tr}}\,}}({\varvec{\Sigma }}_{ii})\), \({{\,{\mathrm{E}}\,}}(A_{ikr}) = 0\), \({{\,{\mathrm{E}}\,}}(A^2_{ikr}) = \Vert {\varvec{\Sigma }}_{ii}\Vert ^2\), \({{\,{\mathrm{E}}\,}}(A_{irk}A_{iks}) = 0\), \({{\,{\mathrm{E}}\,}}(A_{ik}A_{ikr}) = 0\), \({{\,{\mathrm{E}}\,}}(\mathbf{Y}^T_{ik}{\varvec{\Sigma }}_{ij}{} \mathbf{Y}_{jk}) = \Vert {\varvec{\Sigma }}_{ij}\Vert ^2\), \({{\,{\mathrm{E}}\,}}(\mathbf{Y}^T_{ik}{\varvec{\Sigma }}_{ij}{} \mathbf{Y}_{jr}) = 0\), \({{\,{\mathrm{Var}}\,}}(\mathbf{Y}^T_{ik}{\varvec{\Sigma }}_{ij}{} \mathbf{Y}_{jk})\) = \(K + {{\,{\mathrm{tr}}\,}}({\varvec{\Sigma }}_{ii}{\varvec{\Sigma }}_{ij}{\varvec{\Sigma }}_{jj}{\varvec{\Sigma }}_{ji})\), \({{\,{\mathrm{Var}}\,}}(\mathbf{Y}^T_{ik}{\varvec{\Sigma }}_{ij}{} \mathbf{Y}_{jr}) = {{\,{\mathrm{tr}}\,}}({\varvec{\Sigma }}_{ii}{\varvec{\Sigma }}_{ij}{\varvec{\Sigma }}_{jj}{\varvec{\Sigma }}_{ji})\), \({{\,{\mathrm{Var}}\,}}(A_{ikr}A_{jrk}) = K + \Vert {\varvec{\Sigma }}_{ii}\Vert ^2\Vert {\varvec{\Sigma }}_{jj}\Vert ^2 + 2{{\,{\mathrm{tr}}\,}}({\varvec{\Sigma }}_{ii}{\varvec{\Sigma }}_{ij}{\varvec{\Sigma }}_{jj}{\varvec{\Sigma }}_{ji})\), where K is a constant involving only \(\kappa _{ij}\) and \(\kappa _{ii}\).
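The first few of these moments are straightforward to verify by simulation. The following quick Monte Carlo check, under normality and with an arbitrary illustrative \({\varvec{\Sigma }}_{ii}\) of our choosing, confirms \({{\,{\mathrm{E}}\,}}(A_{ik}) = {{\,{\mathrm{tr}}\,}}({\varvec{\Sigma }}_{ii})\), \({{\,{\mathrm{E}}\,}}(A_{ikr}) = 0\) and \({{\,{\mathrm{E}}\,}}(A^2_{ikr}) = \Vert {\varvec{\Sigma }}_{ii}\Vert ^2\):

```python
import numpy as np

rng = np.random.default_rng(2)
p, reps = 5, 200_000
Sigma = 0.5 * np.eye(p) + 0.5 * np.ones((p, p))   # an arbitrary valid Sigma_ii
L = np.linalg.cholesky(Sigma)

# Independent centered normal vectors Y_k, Y_r with covariance Sigma
Yk = rng.standard_normal((reps, p)) @ L.T
Yr = rng.standard_normal((reps, p)) @ L.T

A_k  = np.einsum('ij,ij->i', Yk, Yk)   # A_ik  = Y_k' Y_k
A_kr = np.einsum('ij,ij->i', Yk, Yr)   # A_ikr = Y_k' Y_r, k != r

# Compare Monte Carlo means with tr(Sigma), 0, and ||Sigma||^2
print(A_k.mean(), np.trace(Sigma))
print(A_kr.mean())
print((A_kr ** 2).mean(), np.sum(Sigma ** 2))
```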
1.2 Main Proofs
1.2.1 Proof of Theorem 1
Since we can write \(\widehat{\varvec{\Sigma }} = \sum _{k = 1}^n\mathbf{Y}_k\mathbf{Y}^T_k/n - \sum _{k \ne r}^n\mathbf{Y}_k\mathbf{Y}^T_r/n(n - 1)\), it implies ([2, 4])
Using \(\widehat{\varvec{\Sigma }}_{ij}\) and Theorem 13, Eqs. (7) and (9) can be obtained from ‘Appendix B.1’ in [4] by replacing 1 with i and 2 with j. They express \(\Vert \widehat{\varvec{\Sigma }}_{ij}\Vert ^2\), \(\Vert \widehat{\varvec{\Delta }}_{ii}\Vert ^2\Vert \widehat{\varvec{\Delta }}_{jj}\Vert ^2\) and \(Q_{ij}\) in terms of functions of \(A_{ik}\) and \(A_{ikr}\), defined above, so that, taking expectation, it follows that
Solving these equations simultaneously gives Eqs. (7) and (9), which can be used to show unbiasedness of \(\widehat{\kappa }_{ij}\). Following the same lines, and using \(\widehat{\varvec{\Sigma }}_{ii}\) above, we can write
where, for simplicity, \(A_{01}, A_{02}, A_{03}\) contain terms the expectation of which vanishes, so that, using Theorem 13, it follows after some simplification that
Solving simultaneously gives Eqs. (6) and (8), and also the unbiasedness of \(\widehat{\kappa }_{ii}\) in Eq. (12).
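The decomposition of \(\widehat{\varvec{\Sigma }}\) used at the start of this proof can also be verified numerically against the usual unbiased sample covariance; a minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 12, 4
Y = rng.standard_normal((n, p))   # stand-in for the centered vectors Y_k

# Usual unbiased sample covariance (divides by n - 1)
S = np.cov(Y, rowvar=False)

# Decomposition used in the proof:
#   Sigma_hat = sum_k Y_k Y_k' / n  -  sum_{k != r} Y_k Y_r' / (n(n - 1))
sum_sq = Y.T @ Y                           # sum_k Y_k Y_k'
total = Y.sum(axis=0)
cross = np.outer(total, total) - sum_sq    # sum_{k != r} Y_k Y_r'
Sigma_hat = sum_sq / n - cross / (n * (n - 1))

print(np.allclose(S, Sigma_hat))   # True: the two expressions agree
```

The identity is translation invariant, so it holds whether \(\mathbf{Y}_k\) is centered at the true mean or not.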
1.2.2 Proof of Theorem 11
Under the assumption \({\varvec{\Sigma }}_l = {\varvec{\Sigma }}\) \(\forall ~l = 1, \ldots , g\), the unbiasedness follows immediately since
\(\forall ~i, j = 1, \ldots , b\), \(i < j\). For variances, we note, following Theorem 2 for any \(l \in \{1, \ldots , g\}\), that
where \(d(n_l) = 2n^3_l - 9n^2_l + 9n_l - 16\), \(e(n_l) = n^2_l - 3n_l + 8\), so that we can write
The first and last terms vanish under the assumptions, so that, as \(n_l, p_i \rightarrow \infty \),
using \(\nu ^{-1}_l = O(n^2_l)\) \(\Rightarrow \) \(\nu _0 = O(\sum _{l = 1}^gn^2_l)\), which also gives the consistency of \(\widehat{\Vert {\varvec{\Sigma }}_{pii}\Vert ^2}/p^2_i\). Similarly,
With \(\eta _{ij} = \Vert {\varvec{\Sigma }}_{ij}\Vert ^2/\Vert {\varvec{\Sigma }}_{ii}\Vert \Vert {\varvec{\Sigma }}_{jj}\Vert \), it implies under the assumptions that, as \(p_i \rightarrow \infty \),
1.2.3 Proofs of Theorems 6 and 12
Consider \({{\,{\mathrm{T}}\,}}_b = \sum _{i < j}^b{{\,{\mathrm{T}}\,}}_{ij}\) with \({{\,{\mathrm{T}}\,}}_{ij} = \widehat{\Vert {\varvec{\Sigma }}_{ij}\Vert ^2}/p_ip_j\). In Ahmad ([5], Theorem 3), it is shown that
under Assumptions 3–5, as \(n, p_i \rightarrow \infty \), where \({{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_{ij})\) follows from Theorem 2. As the limit holds for all \({{\,{\mathrm{T}}\,}}_{ij}\), \(i < j\), and \({{\,{\mathrm{T}}\,}}_b\) is a sum of all such \({{\,{\mathrm{T}}\,}}_{ij}\), we need only focus on the covariances in Eqs. (19)–(21) for the limit of \({{\,{\mathrm{T}}\,}}_b\). Consider \(C_1\), where the first term, normed by \(\Vert {\varvec{\Sigma }}_{ii}\Vert ^2\Vert {\varvec{\Sigma }}_{jj}\Vert \Vert {\varvec{\Sigma }}_{jj'}\Vert \), vanishes under Assumptions 3–5, as \(p_i \rightarrow \infty \), and the same holds for the second term. Repeating the same for \(C_2\) and \(C_3\), and noting that terms like \(\Vert {\varvec{\Sigma }}_{ii}\Vert ^2/p^2_i\) are uniformly bounded under the same assumptions, it follows that
where \(\eta _{ij} = \Vert {\varvec{\Sigma }}_{ij}\Vert ^2/\Vert {\varvec{\Sigma }}_{ii}\Vert \Vert {\varvec{\Sigma }}_{jj}\Vert \); see Eq. (5). Combined with Eq. (22), this implies that \({{\,{\mathrm{Var}}\,}}(n{{\,{\mathrm{T}}\,}}_b)\) is bounded, while the covariances of the same order vanish. Now, denote \(\mathbf{T}_B = (\mathbf{T}_1, \ldots , \mathbf{T}_{b - 1})'\) with \({{\,{\mathrm{T}}\,}}_b = \mathbf{1}'_B\mathbf{T}_B\), where \(\mathbf{T}_i = ({{\,{\mathrm{T}}\,}}_{i,i+1}, \ldots , {{\,{\mathrm{T}}\,}}_{ib})'\), \(i = 1, \ldots , b - 1\), \(B = b(b - 1)/2\) and \(\mathbf{1}_B\) is the \(B\)-vector of 1s. By the above arguments, as \(n, p_i \rightarrow \infty \), \({{\,{\mathrm{Cov}}\,}}(\mathbf{T}_B) = {\varvec{\Lambda }}\) is a diagonal matrix with diagonal elements \({{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_{ij})\), \(i, j = 1, \ldots , b\), \(i < j\), i.e.,
where \({\varvec{\Lambda }}_i = {{\,{\mathrm{Cov}}\,}}(\mathbf{T}_i) = \text {{diag}}({{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_{i,i+1}), \ldots , {{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_{ib}))\), \(i = 1, \ldots , b - 1\). Hence (see Eq. 22),
This gives the limit of \({{\,{\mathrm{T}}\,}}_b\) by a simple application of the Cramér–Wold device ([31], p. 16), including for the case under the null, where the covariances vanish exactly. For the last part of the theorem, we only need the consistency of the variance estimator; from Theorem 2, we note that
The first and last terms vanish under the assumptions, so that \({{\,{\mathrm{Var}}\,}}(\widehat{\Vert {\varvec{\Sigma }}_{ii}\Vert ^2}/\Vert {\varvec{\Sigma }}_{ii}\Vert ^2) = O(1/n^2)\) as \(p_i \rightarrow \infty \). Thus, \(\widehat{\Vert {\varvec{\Sigma }}_{ii}\Vert ^2}/p^2_i \xrightarrow {\mathcal {P}} \Vert {\varvec{\Sigma }}_{ii}\Vert ^2/p^2_i\), as \(n, p_i \rightarrow \infty \). Plugging in \(\widehat{\Vert {\varvec{\Sigma }}_{ii}\Vert ^2}\) for \(\Vert {\varvec{\Sigma }}_{ii}\Vert ^2\) in \({{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_b)\) gives \(\widehat{{{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_b)}\) as a consistent estimator of \({{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_b)\). This completes the proof of Theorem 6.
The proof of Theorem 12 now essentially follows from the above, by the independence of the g samples. First, the covariance terms in \({{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_{lb})\), i.e., \({{\,{\mathrm{Cov}}\,}}({{\,{\mathrm{T}}\,}}_{lij}, {{\,{\mathrm{T}}\,}}_{lij'})\), \({{\,{\mathrm{Cov}}\,}}({{\,{\mathrm{T}}\,}}_{lij}, {{\,{\mathrm{T}}\,}}_{li'j})\) and \({{\,{\mathrm{Cov}}\,}}({{\,{\mathrm{T}}\,}}_{lij}, {{\,{\mathrm{T}}\,}}_{li'j'})\), vanish asymptotically in the same way as \(C_1\), \(C_2\), \(C_3\) did for \({{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_b)\). Then, with \(\nu ^{-1}_l = O(n^2_l)\) \(\Rightarrow \) \(\nu _0 = O(\sum _{l = 1}^gn^2_l)\), \({{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_g) = 4O\left( \nu ^{-1}_0\right) \sum _i\sum _j{{\,{\mathrm{V}}\,}}_{lij}\), with \({{\,{\mathrm{V}}\,}}_{lij}\) uniformly bounded under the assumptions, so that \(\sqrt{\nu _0}{{\,{\mathrm{T}}\,}}_g\) has a non-degenerate limit as \(n_l, p_i \rightarrow \infty \). Define \(\mathbf{T}_G = (\mathbf{T}_1, \ldots , \mathbf{T}_g)'\) with \({{\,{\mathrm{T}}\,}}_g = \mathbf{1}'{} \mathbf{T}_G\), where \(\mathbf{T}_l = (\mathbf{T}_{l1}, \ldots , \mathbf{T}_{l,b-1})'\) and \(\mathbf{T}_{li} = ({{\,{\mathrm{T}}\,}}^l_{i,i+1}, \ldots , {{\,{\mathrm{T}}\,}}^l_{ib})'\), \(i = 1, \ldots , b - 1\), \(l = 1, \ldots , g\). By the independence of the g samples, \({{\,{\mathrm{Cov}}\,}}(\mathbf{T}_G) = {\varvec{\Delta }}\) reduces to
where, by the above arguments, \({\varvec{\Delta }}_{ll} = {{\,{\mathrm{Cov}}\,}}(\mathbf{T}_{l})\) is again a diagonal matrix with diagonal elements \({{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_{lij})\), \(i, j = 1, \ldots , b\), \(i < j\). Thus,
which asymptotically coincide with Eqs. (39) and (40), respectively. Finally, the consistency of \(\widehat{{{\,{\mathrm{Var}}\,}}}({{\,{\mathrm{T}}\,}}_g)\) follows from that of \(\widehat{\Vert {\varvec{\Sigma }}_{pii}\Vert ^2}/p^2_i\), as shown in “Appendix 1.2.2”.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Cite this article
Ahmad, M.R. Tests of Zero Correlation Using Modified RV Coefficient for High-Dimensional Vectors. J Stat Theory Pract 13, 43 (2019). https://doi.org/10.1007/s42519-019-0043-x