1 Introduction

Let \(\mathbf{X}_k = (X_{k1}, \ldots , X_{kp})^T\), \(k = 1, \ldots , n\), be iid random vectors drawn from a population with \({{\,{\mathrm{E}}\,}}(\mathbf{X}_k) = {\varvec{\mu }} \in \mathbb {R}^p\) and \({{\,{\mathrm{Cov}}\,}}(\mathbf{X}_k) = {\varvec{\Sigma }} \in \mathbb {R}^{p\times p}\), where \({\varvec{\Sigma }} > 0\) can be expressed as a partitioned matrix \({\varvec{\Sigma }} = ({\varvec{\Sigma }}_{ij})_{i,j = 1}^{~~b}\) with blocks \({\varvec{\Sigma }}_{ij} \in \mathbb {R}^{p_i\times p_j}\), \({\varvec{\Sigma }}_{ji} = {\varvec{\Sigma }}^T_{ij}\), \({\varvec{\Sigma }}^T_{ii} = {\varvec{\Sigma }}_{ii}\). We are interested in testing

$$\begin{aligned} H_{0b}: {\varvec{\Sigma }}_{ij} = \mathbf{0}~\forall ~i< j ~~{\text {vs.}}~~ H_{1b}: {\varvec{\Sigma }}_{ij} \ne \mathbf{0}~{\text {for at least one pair }} i < j \end{aligned}$$
(1)

when the block dimensions, \(p_i\), may exceed the sample size, n, and the data may not necessarily follow the multivariate normal distribution. Under \(H_{0b}\), \({\varvec{\Sigma }}\) reduces to a block-diagonal structure, \({\varvec{\Sigma }} = \text {{diag}}({\varvec{\Sigma }}_{11}, \ldots , {\varvec{\Sigma }}_{bb})\), \({\varvec{\Sigma }}_{ii} \in \mathbb {R}^{p_i\times p_i}\). Obviously, under normality, the test of \(H_{0b}\) is equivalent to testing independence of the corresponding sub-vectors.

Now, consider \(g \ge 2\) independent populations with \(\mathbf{X}_{lk} = (X_{lk1}, \ldots , X_{lkp})^T\), \(k = 1, \ldots , n_l\), as iid random vectors drawn from the lth population with \({{\,{\mathrm{E}}\,}}(\mathbf{X}_{lk}) = {\varvec{\mu }}_l\), \({{\,{\mathrm{Cov}}\,}}(\mathbf{X}_{lk}) = {\varvec{\Sigma }}_l > 0\), \(l = 1, \ldots , g\), defined similarly as for the single population above. Assuming homogeneity of covariance matrices across the g populations, i.e., \({\varvec{\Sigma }}_l = {\varvec{\Sigma }}\) \(\forall ~l\), we are further interested in testing the hypotheses in (1) for the common \({\varvec{\Sigma }}\). To distinguish the two cases, we denote the multi-sample hypotheses by \(H_{0g}\) and \(H_{1g}\), respectively.

Tests of \(H_{0b}\) or \(H_{0g}\) are frequently needed in multivariate applications, where the special case of \(p_i\) = 1 \(\forall ~i = 1, \ldots , b\) leads to the well-known single- and multi-sample tests of identity or sphericity hypotheses; see e.g., [1, 2] and the references therein. Tests of \(H_{0b}\) or \(H_{0g}\) in the classical case, \(n > p\), have often been addressed in the multivariate literature, including for multiple blocks; see [8, 16, 20, 21]. [14] give extensions to functional data, and [6] discuss permutation and bootstrap tests. Other recent extensions include nonparametric tests ([10, 13]) and a kernel-based approach ([22]); see also [9, 19] for general theory in the context of canonical correlation analysis.

A potential advantage of \(H_{0b}\) in (1) is a huge dimension reduction, which adds further motivation to testing such hypotheses for high-dimensional data, \(p \gg n\). As the classical testing theory, mostly based on the likelihood ratio approach, collapses in this case, several modifications have recently been introduced to cope with the problem; see e.g., [26, 33, 34], where [32] provide a test based on rank correlations. Most of these tests assume normality and are constructed using a Frobenius norm to measure the distance between the null and alternative hypotheses. The Frobenius norm is a common distance measure for estimation and testing of high-dimensional covariance matrices; for its use in other contexts, see e.g., [3] and the references cited therein.

Our construction of the test statistics for (1) is, however, based on a different approach. We use the RV coefficient (see Sect. 2) as the basic measure of orthogonality of vectors and extend it to test \(H_{0b}\) and \(H_{0g}\). [5] recently proposed a modification of the RV coefficient for two high-dimensional vectors and used it to introduce a test of orthogonality. For details, including a historical overview, comparisons with competitors and an extensive bibliography, see that reference.

Apart from a theoretical analysis of the modified coefficient and its competitors, extensive simulations are used there to show the accuracy of the modified estimator and the significance test. The modified coefficient is composed of computationally very efficient estimators that are simple functions of the empirical covariance matrix. In this paper, we first extend the theory to propose a test of no correlation between \(b \ge 2\) large vectors, possibly of different dimensions.

We further extend this one-sample multi-block test to a multi-sample case with g independent populations. Assuming homogeneity of covariance matrices across the g populations, we use pooled information to construct a test for the common covariance matrix. A particularly attractive aspect of the multi-sample test is the simplicity with which it pools information across the g independent samples: the single-sample computations carry over conveniently to the multi-sample case, resulting in a highly efficient test statistic.

We begin in the next section with the basic notational setup, where we also briefly recap the modified RV coefficient of [5] for further use and reference. We keep this as short as possible, referring to the original paper for details. Using these rudiments, we propose a test of \(H_{0b}\) for multiple blocks in Sect. 3, with an extension to the test of \(H_{0g}\) in Sect. 4. The accuracy of the tests is shown through simulations in Sect. 5; technical proofs are deferred to the Appendix.

2 Notations and Preliminaries

Let the random vectors \(\mathbf{X}_k \in \mathbb {R}^p\), \(k = 1, \ldots , n\), with \({{\,{\mathrm{E}}\,}}(\mathbf{X}_k) = {\varvec{\mu }} \in \mathbb {R}^p\), \({{\,{\mathrm{Cov}}\,}}(\mathbf{X}_k) = {\varvec{\Sigma }} \in \mathbb {R}^{p\times p}\), be partitioned as \(\mathbf{X}_k = (\mathbf{X}^T_{1k}, \ldots , \mathbf{X}^T_{bk})^T\), \(\mathbf{X}_{ik} \in \mathbb {R}^{p_i}\) so that accordingly \({\varvec{\mu }} = ({\varvec{\mu }}^T_1, \ldots , {\varvec{\mu }}^T_b)^T\), \({\varvec{\mu }}_i \in \mathbb {R}^{p_i}\), \({\varvec{\Sigma }} = ({\varvec{\Sigma }}_{ij})_{i,j = 1}^{~~b}\), \({\varvec{\Sigma }}_{ij} = {{\,{\mathrm{E}}\,}}(\mathbf{X}_{ik} - {\varvec{\mu }}_i)\otimes (\mathbf{X}_{jk} - {\varvec{\mu }}_j)^T \in \mathbb {R}^{p_i\times p_j}\), \(i \ne j\), \({\varvec{\Sigma }}_{ji} = {\varvec{\Sigma }}^T_{ij}\), \({\varvec{\Sigma }}^T_{ii} = {\varvec{\Sigma }}_{ii}\), where \(\mathbb {R}^{a\times b}\) is the space of real matrices and \(\otimes \) is the Kronecker product. We assume \({\varvec{\Sigma }} > 0\), \({\varvec{\Sigma }}_{ii} > 0\) \(\forall ~i\).

Let \(\overline{\mathbf{X}} = \sum _{k = 1}^n\mathbf{X}_{k}/n\) and \(\widehat{\varvec{\Sigma }} = \sum _{k = 1}^n(\widetilde{\mathbf{X}}_k\otimes \widetilde{\mathbf{X}}_k)/(n - 1)\) be unbiased estimators of \({\varvec{\mu }}\) and \({\varvec{\Sigma }}\), likewise partitioned as \(\overline{\mathbf{X}} = (\overline{\mathbf{X}}^T_1, \ldots , \overline{\mathbf{X}}^T_b)^T\), \(\widehat{\varvec{\Sigma }} = (\widehat{\varvec{\Sigma }}_{ij})_{i, j = 1}^{~~b}\), so that \(\overline{\mathbf{X}}_i = \sum _{k = 1}^n\mathbf{X}_{ik}/n\) and \(\widehat{\varvec{\Sigma }}_{ij} = \sum _{k = 1}^n(\widetilde{\mathbf{X}}_{ik}\otimes \widetilde{\mathbf{X}}_{jk})/(n - 1)\) are unbiased estimators of \({\varvec{\mu }}_i\) and \({\varvec{\Sigma }}_{ij}\), where \(\widetilde{\mathbf{X}}_k = \mathbf{X}_k - \overline{\mathbf{X}}\), \(\widetilde{\mathbf{X}}_{ik} = \mathbf{X}_{ik} - \overline{\mathbf{X}}_i\). Denoting \(\mathbf{X} = (\mathbf{X}^T_1, \ldots , \mathbf{X}^T_n)^T \in \mathbb {R}^{n\times p}\) as the entire data matrix and \(\mathbf{X}_i = (\mathbf{X}^T_{i1}, \ldots , \mathbf{X}^T_{in})^T \in \mathbb {R}^{n\times p_i}\) as the data matrix for the ith block, we can express the estimators more succinctly as

$$\begin{aligned} \overline{\mathbf{X}} = \frac{1}{n}{} \mathbf{X}^T\mathbf{1},~~~\widehat{\varvec{\Sigma }} = \frac{1}{n - 1}{} \mathbf{X}^T\mathbf{C}{} \mathbf{X},~~~\overline{\mathbf{X}}_i = \frac{1}{n}{} \mathbf{X}^T_i\mathbf{1},~~~\widehat{\varvec{\Sigma }}_{ij} = \frac{1}{n - 1}{} \mathbf{X}^T_i\mathbf{C}{} \mathbf{X}_j, \end{aligned}$$
(2)

where \(\mathbf{C} = \mathbf{I} - \mathbf{J}/n\) is the \(n\times n\) centering matrix, \(\mathbf{I}\) is the identity matrix and \(\mathbf{J} = \mathbf{11}^T\) with \(\mathbf{1}\) a vector of 1s. The null hypothesis in (1) thus states that all off-diagonal blocks \({\varvec{\Sigma }}_{ij}\) of \({\varvec{\Sigma }}\) are zero. Consider first the simplest case of \(b = 2\), with only one off-diagonal block \({\varvec{\Sigma }}_{12} = {{\,{\mathrm{Cov}}\,}}(\mathbf{X}_{1k}, \mathbf{X}_{2k})\). Obviously, a test of zero correlation (or, under normality, of independence) between \(\mathbf{X}_{1k}\) and \(\mathbf{X}_{2k}\) can be based on an empirical estimator of \(\Vert {\varvec{\Sigma }}_{12}\Vert ^2 = {{\,{\mathrm{tr}}\,}}({\varvec{\Sigma }}_{12}{\varvec{\Sigma }}_{21})\) or an appropriately normed version of it. One such normed measure is proposed in [5] as

$$\begin{aligned} \widehat{\eta } = \frac{\widehat{\Vert {\varvec{\Sigma }}_{12}\Vert ^2}}{\sqrt{\widehat{\Vert {\varvec{\Sigma }}_{11}\Vert ^2}\widehat{\Vert {\varvec{\Sigma }}_{22}\Vert ^2}}}, \end{aligned}$$
(3)

where \(\widehat{\Vert {\varvec{\Sigma }}_{ij}\Vert ^2}\) is an unbiased (and consistent) estimator of \(\Vert {\varvec{\Sigma }}_{ij}\Vert ^2\), \(i, j = 1, 2\); see Sect. 3 for the formal definition. The statistic \(\widehat{\eta }\) is a modified form of the original RV coefficient, \(\widetilde{\eta } = \Vert \widehat{\varvec{\Sigma }}_{12}\Vert ^2/\Vert \widehat{\varvec{\Sigma }}_{11}\Vert \Vert \widehat{\varvec{\Sigma }}_{22}\Vert \), obtained by plugging unbiased estimators into the true RV coefficient \(\eta = \Vert {\varvec{\Sigma }}_{12}\Vert ^2/\Vert {\varvec{\Sigma }}_{11}\Vert \Vert {\varvec{\Sigma }}_{22}\Vert \). The RV coefficient extends the usual scalar correlation coefficient to data vectors of possibly different dimension, is often used to study relationships between different data configurations, and has many attractive properties; see also [25].
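For illustration, a minimal numpy sketch of the empirical quantities involved is given below: the covariance blocks of Eq. (2) via the centering matrix and the plug-in coefficient \(\widetilde{\eta }\) (the bias-corrected \(\widehat{\eta }\) instead uses the unbiased norm estimators of Sect. 3). Function and variable names are ours, not from the original paper.

```python
import numpy as np

def sigma_hat_block(Xi, Xj):
    """Empirical covariance block of Eq. (2): X_i' C X_j / (n - 1)."""
    n = Xi.shape[0]
    C = np.eye(n) - np.ones((n, n)) / n        # centering matrix C = I - J/n
    return Xi.T @ C @ Xj / (n - 1)

def rv_plugin(X1, X2):
    """Plug-in RV coefficient: ||S12||_F^2 / (||S11||_F ||S22||_F)."""
    S11 = sigma_hat_block(X1, X1)
    S22 = sigma_hat_block(X2, X2)
    S12 = sigma_hat_block(X1, X2)
    return np.sum(S12**2) / (np.linalg.norm(S11) * np.linalg.norm(S22))

# usage: two uncorrelated blocks, so the plug-in coefficient should be small
rng = np.random.default_rng(1)
print(round(rv_plugin(rng.normal(size=(50, 5)), rng.normal(size=(50, 8))), 3))
```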

Based on the aforementioned modification and the subsequent test of zero correlation, we here extend the same concept and construct tests of the hypotheses in (1) and their multi-sample variants. The proposed tests will be valid under the following general multivariate model, which helps us relax the normality assumption. Define \(\mathbf{Y}_{ik} = \mathbf{X}_{ik} - {\varvec{\mu }}_i\), \(i = 1, \ldots , b\), so that \(\mathbf{Y}_k = (\mathbf{Y}^T_{1k}, \ldots , \mathbf{Y}^T_{bk})^T\) can replace \(\mathbf{X}_k\). We define the multivariate model as

$$\begin{aligned} \mathbf{Y}_k = {\varvec{\Sigma }}^{1/2}{} \mathbf{Z}_k, \end{aligned}$$
(4)

with \(\mathbf{Z}_k \sim \mathcal {F}\), \({{\,{\mathrm{E}}\,}}(\mathbf{Z}_k) = \mathbf{0}_p\), \({{\,{\mathrm{Cov}}\,}}(\mathbf{Z}_k) = \mathbf{I}_p\), \(k = 1, \ldots , n\), where \(\mathcal {F}\) denotes a p-dimensional distribution function with finite fourth moment; see the assumptions below. Model (4) covers a wide class of distributions, including the elliptical class and, in particular, the multivariate normal. Further, it extends the validity of the proposed tests to a variety of applications under fairly general conditions.
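As an illustration of Model (4), the following sketch generates data \(\mathbf{Y}_k = {\varvec{\Sigma }}^{1/2}\mathbf{Z}_k\) with standardized innovations; the use of a Cholesky factor in place of the symmetric square root and the uniform choice for \(\mathcal {F}\) are our own simplifying assumptions.

```python
import numpy as np

def sample_model4(n, Sigma, rng, dist="uniform"):
    """Draw n rows Y_k' with E(Y_k) = 0 and Cov(Y_k) = Sigma, in the spirit of Model (4)."""
    p = Sigma.shape[0]
    L = np.linalg.cholesky(Sigma)              # any square root of Sigma gives the stated moments
    if dist == "normal":
        Z = rng.normal(size=(n, p))
    else:                                      # uniform innovations, standardized to mean 0, variance 1
        Z = (rng.uniform(size=(n, p)) - 0.5) * np.sqrt(12.0)
    return Z @ L.T
```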

3 One-Sample Test with Multiple Blocks

Consider any two data matrices \(\mathbf{X}_i = (\mathbf{X}^T_{i1}, \ldots , \mathbf{X}^T_{in})^T \in \mathbb {R}^{n\times p_i}\) and \(\mathbf{X}_j = (\mathbf{X}^T_{j1}, \ldots , \mathbf{X}^T_{jn})^T \in \mathbb {R}^{n\times p_j}\) in \(\mathbf{X} \in \mathbb {R}^{n\times p}\), where \(\mathbf{X}_{ik} \in \mathbb {R}^{p_i}\), \(\mathbf{X}_{jk} \in \mathbb {R}^{p_j}\), \(k = 1, \ldots , n\), \(i, j = 1, \ldots , b\), \(i < j\). Following \(\widehat{\eta }\) in Eq. (3), the RV coefficient of correlation between \(\mathbf{X}_{ik}\) and \(\mathbf{X}_{jk}\) can be computed as

$$\begin{aligned} \widehat{\eta }_{ij} = \frac{\widehat{\Vert {\varvec{\Sigma }}_{ij}\Vert ^2}}{\sqrt{\widehat{\Vert {\varvec{\Sigma }}_{ii}\Vert ^2}\widehat{\Vert {\varvec{\Sigma }}_{jj}\Vert ^2}}}, \end{aligned}$$
(5)

where \(\widehat{\eta }_{ij}\) estimates \(\eta _{ij} = \Vert {\varvec{\Sigma }}_{ij}\Vert ^2/\Vert {\varvec{\Sigma }}_{ii}\Vert \Vert {\varvec{\Sigma }}_{jj}\Vert \). Since \({\varvec{\Sigma }}_{ii} > 0\), \(\eta _{ij} = 0\) \(\Leftrightarrow \) \(\Vert {\varvec{\Sigma }}_{ij}\Vert ^2 = 0\) \(\forall ~i < j\), which implies that a test for \(\eta _{ij} = 0\) can be equivalently based on \({\varvec{\Sigma }}_{ij}\). In fact, as will be shown shortly, the denominator of \(\widehat{\eta }_{ij}\) adjusts itself through \({{\,{\mathrm{Var}}\,}}(\widehat{\Vert {\varvec{\Sigma }}_{ij}\Vert ^2})\), so that it suffices to use \(\widehat{\Vert {\varvec{\Sigma }}_{ij}\Vert ^2}\) to construct a test statistic. We begin by defining the estimators used to compose \(\widehat{\eta }_{ij}\) and their moments, which will also be useful in subsequent computations. For notational convenience, denote \({\varvec{\Sigma }}^{1/2}_{ii} = {\varvec{\Delta }}_{ii}\) so that \(\Vert {\varvec{\Delta }}_{ii}\Vert ^2 = {{\,{\mathrm{tr}}\,}}({\varvec{\Sigma }}_{ii})\) and correspondingly \(\Vert \widehat{\varvec{\Delta }}_{ii}\Vert ^2 = {{\,{\mathrm{tr}}\,}}(\widehat{\varvec{\Sigma }}_{ii})\). For the proof of the following theorem, see “Appendix 1.2.1”.

Theorem 1

The unbiased estimators of \(\Vert {\varvec{\Sigma }}_{ii}\Vert ^2\), \(\Vert {\varvec{\Sigma }}_{ij}\Vert ^2\), \(\big [\Vert {\varvec{\Delta }}_{ii}\Vert ^2\big ]^2\) and \(\Vert {\varvec{\Delta }}_{ii}\Vert ^2\Vert {\varvec{\Delta }}_{jj}\Vert ^2\) are given, respectively, as

$$\begin{aligned} \widehat{\Vert {\varvec{\Sigma }}_{ii}\Vert ^2}= & {} \nu \left[ (n - 1)(n - 2)\Vert \widehat{\varvec{\Sigma }}_{ii}\Vert ^2 + \big \{\Vert \widehat{\varvec{\Delta }}_{ii}\Vert ^2\big \}^2 - nQ_{ii}\right] \end{aligned}$$
(6)
$$\begin{aligned} \widehat{\Vert {\varvec{\Sigma }}_{ij}\Vert ^2}= & {} \nu \left[ (n - 1)(n - 2)\Vert \widehat{\varvec{\Sigma }}_{ij}\Vert ^2 + \Vert \widehat{\varvec{\Delta }}_{ii}\Vert ^2\Vert \widehat{\varvec{\Delta }}_{jj}\Vert ^2 - nQ_{ij}\right] \end{aligned}$$
(7)
$$\begin{aligned} \widehat{\big [\Vert {\varvec{\Delta }}_{ii}\Vert ^2\big ]^2}= & {} \nu \left[ 2\Vert \widehat{\varvec{\Sigma }}_{ii}\Vert ^2 + (n^2 - 3n + 1)\big [\Vert \widehat{\varvec{\Delta }}_{ii}\Vert ^2\big ]^2 - nQ_{ii}\right] , \end{aligned}$$
(8)
$$\begin{aligned} \widehat{\Vert {\varvec{\Delta }}_{ii}\Vert ^2\Vert {\varvec{\Delta }}_{jj}\Vert ^2}= & {} \nu \left[ 2\Vert \widehat{\varvec{\Sigma }}_{ij}\Vert ^2 + (n^2 - 3n + 1)\Vert \widehat{\varvec{\Delta }}_{ii}\Vert ^2\Vert \widehat{\varvec{\Delta }}_{jj}\Vert ^2 - nQ_{ij}\right] , \end{aligned}$$
(9)

where \(\nu = (n - 1)/n(n - 2)(n - 3)\), \(Q_{ij} = \sum _{k = 1}^n\widetilde{z}_{ik}\widetilde{z}_{jk}/(n - 1)\) and \(Q_{ii} = \sum _{k = 1}^n\widetilde{z}^2_{ik}/(n - 1)\) with \(\widetilde{z}_{ik} = \widetilde{\mathbf{X}}^T_{ik}\widetilde{\mathbf{X}}_{ik}\) and \(\widetilde{\mathbf{X}}_{ik} = \mathbf{X}_{ik} - \overline{\mathbf{X}}_i\), \(i, j = 1, \ldots , b\), \(i < j\).
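For illustration, the estimators of Theorem 1 can be transcribed directly into numpy; the helper below (our own naming, not the authors' code) returns the four unbiased estimates in Eqs. (6)–(9) for a pair of blocks.

```python
import numpy as np

def hat_norms(Xi, Xj):
    """Unbiased estimators of Eqs. (6)-(9) for blocks X_i (n x p_i) and X_j (n x p_j)."""
    n = Xi.shape[0]
    Xic, Xjc = Xi - Xi.mean(axis=0), Xj - Xj.mean(axis=0)       # mean-deviated vectors
    Sii = Xic.T @ Xic / (n - 1)                                  # \hat{Sigma}_{ii}
    Sjj = Xjc.T @ Xjc / (n - 1)
    Sij = Xic.T @ Xjc / (n - 1)
    tr_ii, tr_jj = np.trace(Sii), np.trace(Sjj)                  # ||\hat{Delta}_{ii}||^2, ||\hat{Delta}_{jj}||^2
    zi, zj = np.sum(Xic**2, axis=1), np.sum(Xjc**2, axis=1)      # \tilde{z}_{ik}, \tilde{z}_{jk}
    Qii, Qij = np.sum(zi**2) / (n - 1), np.sum(zi * zj) / (n - 1)
    nu = (n - 1) / (n * (n - 2) * (n - 3))
    est_Sii2 = nu * ((n - 1) * (n - 2) * np.sum(Sii**2) + tr_ii**2 - n * Qii)                 # Eq. (6)
    est_Sij2 = nu * ((n - 1) * (n - 2) * np.sum(Sij**2) + tr_ii * tr_jj - n * Qij)            # Eq. (7)
    est_trii_sq = nu * (2 * np.sum(Sii**2) + (n**2 - 3 * n + 1) * tr_ii**2 - n * Qii)         # Eq. (8)
    est_trii_trjj = nu * (2 * np.sum(Sij**2) + (n**2 - 3 * n + 1) * tr_ii * tr_jj - n * Qij)  # Eq. (9)
    return est_Sii2, est_Sij2, est_trii_sq, est_trii_trjj
```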

Note that the terms \(Q_{ij}\) are needed to make the estimators, and hence the subsequent test statistics, valid under Model (4) beyond the normality assumption. Essentially, these terms involve fourth-order elements of \(\mathbf{X}_{ik}\) due to the variances of bilinear forms to be computed. This in turn calls for bounding such moments of \(\mathbf{X}_{ik}\) to be finite (see assumptions below). For this, define

$$\begin{aligned} \kappa _{ij} = {{\,{\mathrm{E}}\,}}(z_{ik}z_{jk}) - 2\Vert {\varvec{\Sigma }}_{ij}\Vert ^2 - \Vert {\varvec{\Delta }}_{ii}\Vert ^2\Vert {\varvec{\Delta }}_{jj}\Vert ^2 ~~\Rightarrow ~~ \kappa _{ii} = {{\,{\mathrm{E}}\,}}(z^2_{ik}) - 2\Vert {\varvec{\Sigma }}_{ii}\Vert ^2 - \big [\Vert {\varvec{\Delta }}_{ii}\Vert ^2\big ]^2 \end{aligned}$$
(10)

with \(z_{ik} = \mathbf{Y}^T_{ik}{} \mathbf{Y}_{ik}\). As \(\kappa _{ij}\) = 0 under normality, it serves as a measure of non-normality and, given finite fourth moment, helps extend the results to a wide class of distributions under Model (4). The results of Theorem 1 also help obtain unbiased estimators of \(\kappa _{ij}\) and \(\kappa _{ii}\) as (see “Appendix 1.2.1”)

$$\begin{aligned} \widehat{\kappa }_{ij}= & {} -\frac{n\nu }{n - 1}\left[ 2(n - 1)^2\Vert \widehat{\varvec{\Sigma }}_{ij}\Vert ^2 + (n - 1)^2\Vert \widehat{\varvec{\Delta }}_{ii}\Vert ^2\Vert \widehat{\varvec{\Delta }}_{jj}\Vert ^2 - n(n + 1)Q_{ij}\right] \end{aligned}$$
(11)
$$\begin{aligned} \widehat{\kappa }_{ii}= & {} -\frac{n\nu }{n - 1}\left[ 2(n - 1)^2\Vert \widehat{\varvec{\Sigma }}_{ii}\Vert ^2 + (n - 1)^2\big [\Vert \widehat{\varvec{\Delta }}_{ii}\Vert ^2\big ]^2 - n(n + 1)Q_{ii}\right] . \end{aligned}$$
(12)

The estimators in Theorem 1 are composed of \(\widehat{\varvec{\Sigma }}_{ij}\) (see Eq. 2) and \(Q_{ij}\), both of which are simple functions of the mean-deviated vectors \(\widetilde{\mathbf{X}}_{ik}\). This makes the estimators very simple and computationally highly efficient for practical use. For mathematical amenability, however, an alternative form of the same estimators in terms of U-statistics helps us study their properties, owing to the attractive projection and asymptotic theory of U-statistics.

Given \(\{\mathbf{X}_{ik}, \mathbf{X}_{ir}, \mathbf{X}_{il}, \mathbf{X}_{is}\}\), \(k \ne r \ne l \ne s\), from \(\mathbf{X}_i \in \mathbb {R}^{n\times p_i}\), define \(\mathbf{D}_{ikr}\) = \(\mathbf{X}_{ik} - \mathbf{X}_{ir}\) with \({{\,{\mathrm{E}}\,}}(\mathbf{D}_{ikr})\) = 0, \({{\,{\mathrm{Cov}}\,}}(\mathbf{D}_{ikr})\) = 2\({\varvec{\Sigma }}_{ii}\), \({{\,{\mathrm{E}}\,}}(\mathbf{D}_{ikr}{} \mathbf{D}^T_{jkr})\) = 2\({\varvec{\Sigma }}_{ij}\). For \(A_{ikr} = \mathbf{D}^T_{ikr}{} \mathbf{D}_{ikr}\), \({{\,{\mathrm{E}}\,}}(A_{ikr}) = 2\Vert {\varvec{\Delta }}_{ii}\Vert ^2\) and \({{\,{\mathrm{E}}\,}}(A_{iks}A_{jlr}) = 4\Vert {\varvec{\Delta }}_{ii}\Vert ^2\Vert {\varvec{\Delta }}_{jj}\Vert ^2\). With \(\mathbf{D}_{ils}\) defined similarly, let \(A_{ijlskr} = A_{ilskr}A_{jkrls}\) where \(A_{ilskr} = \mathbf{D}^T_{ils}{} \mathbf{D}_{ikr}\), so that \({{\,{\mathrm{E}}\,}}(A^2_{ilskr}) = 4\Vert {\varvec{\Sigma }}_{ii}\Vert ^2\), \({{\,{\mathrm{E}}\,}}(A_{ijlskr}) = 4\Vert {\varvec{\Sigma }}_{ij}\Vert ^2\). Denoting further \(B_{iikrls} = A^2_{ilskr} + A^2_{ilksr} + A^2_{ilrsk}\), \(C_{ijklrs} = A_{ijlskr} + A_{ijlksr} + A_{ijlrks}\), \(D_{ijkrls} = A_{ikr}A_{jls} + A_{ikl}A_{jrs} + A_{iks}A_{jlr}\) and \(P(n) = n(n - 1)(n - 2)(n - 3)\), the U-statistics versions of the estimators of \(\Vert {\varvec{\Sigma }}_{ii}\Vert ^2\), \(\Vert {\varvec{\Sigma }}_{ij}\Vert ^2\) and \(\Vert {\varvec{\Delta }}_{ii}\Vert ^2\Vert {\varvec{\Delta }}_{jj}\Vert ^2\) in Theorem 1, denoted \(U_1, U_2, U_3\), are defined, respectively, as

$$\begin{aligned} U_1 = \frac{1}{12P(n)}\underset{\pi (\cdot )}{\sum _{k,r,l,s}^n}B_{iikrls},~~ U_2 = \frac{1}{12P(n)}\underset{\pi (\cdot )}{\sum _{k,r,l,s}^n}C_{ijklrs},~~ U_3 = \frac{1}{12P(n)}\underset{\pi (\cdot )}{\sum _{k,r,l,s}^n}D_{ijkrls}, \end{aligned}$$
(13)

where the sum is quadruple over \(\{k, l, r, s\}\) and \(\pi (\cdot ) = \pi (k,r,l,s)\) implies \(k \ne r \ne l \ne s\). The moments in the following theorem follow conveniently using the alternative form of estimators in (13) and will be very useful for further computations in the sequel.
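A literal, deliberately naive transcription of \(U_1\) in Eq. (13) is sketched below (our own code, O(n^4), intended only for comparison with the closed form in Eq. (6) on very small n).

```python
import numpy as np
from itertools import permutations

def U1_naive(Xi):
    """Quadruple-sum U-statistic U_1 of Eq. (13) for a single block X_i (n x p_i)."""
    n = Xi.shape[0]
    Pn = n * (n - 1) * (n - 2) * (n - 3)
    total = 0.0
    for k, r, l, s in permutations(range(n), 4):      # all tuples with k != r != l != s
        A_lskr = (Xi[l] - Xi[s]) @ (Xi[k] - Xi[r])    # D_{ils}' D_{ikr}
        A_lksr = (Xi[l] - Xi[k]) @ (Xi[s] - Xi[r])
        A_lrsk = (Xi[l] - Xi[r]) @ (Xi[s] - Xi[k])
        total += A_lskr**2 + A_lksr**2 + A_lrsk**2    # B_{iikrls}
    return total / (12 * Pn)
```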

Theorem 2

For the unbiased estimators \(\widehat{\Vert {\varvec{\Sigma }}_{ij}\Vert ^2}\) and \(\widehat{\Vert {\varvec{\Sigma }}_{ii}\Vert ^2}\) in Theorem 1, we have

$$\begin{aligned} {{\,{\mathrm{Var}}\,}}(\widehat{\Vert {\varvec{\Sigma }}_{ij}\Vert ^2})= & {} \frac{2}{P(n)}\Big [4a(n)\Vert {\varvec{\Sigma }}_{ij}{\varvec{\Sigma }}_{ji}\Vert ^2 + 2b(n){{\,{\mathrm{tr}}\,}}({\varvec{\Sigma }}_{ii}{\varvec{\Sigma }}_{ij}{\varvec{\Sigma }}_{jj}{\varvec{\Sigma }}_{ji})\nonumber \\&+~c(n)\left\{ \left[ \Vert {\varvec{\Sigma }}_{ij}\Vert ^2\right] ^2 + \Vert {\varvec{\Sigma }}_{ii}\Vert ^2\Vert {\varvec{\Sigma }}_{jj}\Vert ^2 + KO(1)\right\} \Big ] \end{aligned}$$
(14)
$$\begin{aligned} {{\,{\mathrm{Var}}\,}}(\widehat{\Vert {\varvec{\Sigma }}_{ii}\Vert ^2})= & {} \frac{4}{P(n)}\left[ d(n)\Vert {\varvec{\Sigma }}^2_{ii}\Vert ^2 + e(n)\big [\Vert {\varvec{\Sigma }}_{ii}\Vert ^2\big ]^2 + KO(n^2)\right] \end{aligned}$$
(15)

where \(a(n) = 3n^3 - 24n^2 + 44n + 20\), \(b(n) = 6n^3 - 40n^2 + 22n + 181\), \(c(n) = 2n^2 - 12n + 21\), \(d(n) = 2n^3 - 9n^2 + 9n - 16\), \(e(n) = n^2 - 3n + 8\) and K contains terms involving \(\kappa _{ij}\).

We skip the proof of Theorem 2 which follows from that of Theorem 2 in [5]. The terms \(KO(\cdot )\) sum up constants that eventually vanish under the assumptions. In particular, K involves \(\kappa _{ij}\) in (10) and terms involving Hadamard products such as \({\varvec{\Sigma }}_{ii}\odot {\varvec{\Sigma }}_{jj}\) which converge to zero. Now, we have the required tools to proceed with the test of \(H_{0b}\) in (1). As mentioned above,

$$\begin{aligned} H_{0b}: {\varvec{\Sigma }}_{ij} = \mathbf{0}~~\Leftrightarrow ~~H_{0b}: \eta _{ij} = 0~~\forall ~i < j,~~i, j = 1, \ldots , b. \end{aligned}$$
(16)

Moreover, since each \(\eta _{ij} \ge 0\), \(\eta _{ij} = 0~\forall ~i < j\) \(\Leftrightarrow \) \(\sum _{i < j}^{~b}\eta _{ij} = 0\), so that a test of \(H_{0b}\) can be based on a sum of Frobenius norms over all off-diagonal blocks, i.e., \(\sum _{i < j}^b\Vert {\varvec{\Sigma }}_{ij}\Vert ^2\). We thus define the test statistic for \(H_{0b}\) as

$$\begin{aligned} {{\,{\mathrm{T}}\,}}_b = \underset{i < j}{\sum _{i = 1}^b\sum _{j = 1}^b}{{\,{\mathrm{T}}\,}}_{ij}, \quad {\text {where}}\quad {{\,{\mathrm{T}}\,}}_{ij} = \frac{1}{p_ip_j}\widehat{\Vert {\varvec{\Sigma }}_{ij}\Vert ^2}. \end{aligned}$$
(17)

Here, \({{\,{\mathrm{T}}\,}}_{ij}\) is a statistic to test \(H_{0ij}: \eta _{ij} = 0\) for any single off-diagonal block \({\varvec{\Sigma }}_{ij}\). Moreover, the scaling factor \(p_ip_j\) will help us obtain the limit of \({{\,{\mathrm{T}}\,}}_b\) for \(p_i \rightarrow \infty \) along with \(n \rightarrow \infty \), under the following assumptions. Recall \(\mathbf{Y}_k \in \mathbb {R}^p\) in Model (4).
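For illustration, \({{\,{\mathrm{T}}\,}}_b\) in Eq. (17) can be computed by looping over the off-diagonal block pairs; the sketch below reuses the hypothetical `hat_norms` helper from the sketch after Theorem 1, and all names are ours.

```python
def T_b(blocks):
    """T_b of Eq. (17); blocks is a list of n x p_i data matrices X_1, ..., X_b."""
    b, total = len(blocks), 0.0
    for i in range(b):
        for j in range(i + 1, b):
            est_Sij2 = hat_norms(blocks[i], blocks[j])[1]                  # \hat{||Sigma_{ij}||^2}
            total += est_Sij2 / (blocks[i].shape[1] * blocks[j].shape[1])  # T_{ij}
    return total
```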

Assumption 3

\({{\,{\mathrm{E}}\,}}(Y^4_{ks}) = \gamma _s \le \gamma < \infty \), \(\forall ~s = 1, \ldots , p\).

Assumption 4

\(\lim _{p_i \rightarrow \infty }\frac{\Vert {\varvec{\Delta }}_{ii}\Vert ^2}{p_i} = O(1)\), \(\forall ~i = 1, \ldots , b\).

Assumption 5

\(\lim _{n, p_i \rightarrow \infty }\frac{p_i}{n} = \zeta _0 \in (0, \infty )\), \(\forall ~i = 1, \ldots , b\).

Assumption 3 bounds the fourth moment of \(\mathbf{Y}_k\) so that, by the Cauchy–Schwarz inequality, \({{\,{\mathrm{E}}\,}}(y^2_{k1s}y^2_{k2s}) < \infty \), which implies that \(\kappa _{ij}/\Vert {\varvec{\Sigma }}_{ii}\Vert ^2\Vert {\varvec{\Sigma }}_{jj}\Vert ^2 \rightarrow 0\). This conforms to the definition of \(\kappa _{ij}\) and helps us present the test under Model (4); it also makes the terms involving K vanish in Theorem 2. Assumption 4 bounds the average of the eigenvalues of the diagonal blocks. It is a mild assumption, often used in high-dimensional testing, and as its consequence, \(\Vert {\varvec{\Sigma }}_{ij}\Vert ^2/p_ip_j = O(1)\).

Whereas Assumption 4 is needed for the limits under both \(H_{0b}\) and \(H_{1b}\), its consequence is only needed under \(H_{1b}\), since under \(H_{0b}\), with \({\varvec{\Sigma }}_{ij} = \mathbf{0}\), it is neither informative nor required. Assumption 5 controls the joint growth of n and \(p_i\) so that the limit holds under a high-dimensional framework; it, too, is needed only under \(H_{1b}\). Now, for \({{\,{\mathrm{T}}\,}}_b\), we have \({{\,{\mathrm{E}}\,}}({{\,{\mathrm{T}}\,}}_b) = \sum _{i < j}^{~b}\Vert {\varvec{\Sigma }}_{ij}\Vert ^2/p_ip_j\) = 0 under \(H_{0b}\) and

$$\begin{aligned} {{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_b)= & {} \underset{i< j}{\sum _{i = 1}^b\sum _{j = 1}^b}{{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_{ij}) + 2\underset{i< j< j'}{\sum _{i = 1}^b\sum _{j = 1}^b\sum _{j' = 1}^b}{{\,{\mathrm{Cov}}\,}}({{\,{\mathrm{T}}\,}}_{ij}, {{\,{\mathrm{T}}\,}}_{ij'})\nonumber \\&+~2\underset{i< i'< j}{\sum _{i = 1}^b\sum _{i' = 1}^b\sum _{j = 1}^b}{{\,{\mathrm{Cov}}\,}}({{\,{\mathrm{T}}\,}}_{ij}, {{\,{\mathrm{T}}\,}}_{i'j}) + \underset{i< i'< j < j'}{\sum _{i = 1}^b\sum _{i' = 1}^b\sum _{j = 1}^b\sum _{j' = 1}^b}{{\,{\mathrm{Cov}}\,}}({{\,{\mathrm{T}}\,}}_{ij}, {{\,{\mathrm{T}}\,}}_{i'j'})~. \end{aligned}$$
(18)

Equation (18) involves \({{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_{ij})\), which follows from Theorem 2, and three types of covariances, denoted \(C_1\), \(C_2\) and \(C_3\), respectively, which are given below.

$$\begin{aligned} C_1= \, & {} \frac{4}{P(n)p^2_ip_jp_{j'}}\Big [2a_1(n){{\,{\mathrm{tr}}\,}}({\varvec{\Sigma }}_{ij}{\varvec{\Sigma }}_{ji}{\varvec{\Sigma }}_{ij'}{\varvec{\Sigma }}_{j'i}) + b_1(n){{\,{\mathrm{tr}}\,}}({\varvec{\Sigma }}_{ii}{\varvec{\Sigma }}_{ij}{\varvec{\Sigma }}_{jj'}{\varvec{\Sigma }}_{j'i})\nonumber \\&+~(n - 4)\left\{ 2\Vert {\varvec{\Sigma }}_{ij}\Vert ^2\Vert {\varvec{\Sigma }}_{ij'}\Vert ^2 + 5\Vert {\varvec{\Sigma }}_{jj'}\Vert ^2\Vert {\varvec{\Sigma }}_{ii}\Vert ^2\right\} \Big ] \end{aligned}$$
(19)
$$\begin{aligned} C_2=\, & {} \frac{4}{P(n)p^2_ip_jp_{i'}}\Big [2a_1(n){{\,{\mathrm{tr}}\,}}({\varvec{\Sigma }}_{ii'}{\varvec{\Sigma }}_{i'i}{\varvec{\Sigma }}_{ij}{\varvec{\Sigma }}_{ji}) + b_1(n){{\,{\mathrm{tr}}\,}}({\varvec{\Sigma }}_{ii}{\varvec{\Sigma }}_{ii'}{\varvec{\Sigma }}_{i'j}{\varvec{\Sigma }}_{ji})\nonumber \\&+~(n - 4)\left\{ 2\Vert {\varvec{\Sigma }}_{ij}\Vert ^2\Vert {\varvec{\Sigma }}_{i'j}\Vert ^2 + 5\Vert {\varvec{\Sigma }}_{ii'}\Vert ^2\Vert {\varvec{\Sigma }}_{jj}\Vert ^2\right\} \Big ] \end{aligned}$$
(20)
$$\begin{aligned} C_3=\, & {} \frac{4}{P(n)p_ip_jp_{i'}p_{j'}}\Big [2a_2(n){{\,{\mathrm{tr}}\,}}({\varvec{\Sigma }}_{ij}{\varvec{\Sigma }}_{ji'}{\varvec{\Sigma }}_{i'j'}{\varvec{\Sigma }}_{j'i}) + b_2(n){{\,{\mathrm{tr}}\,}}({\varvec{\Sigma }}_{ij}{\varvec{\Sigma }}_{jj'}{\varvec{\Sigma }}_{j'i'}{\varvec{\Sigma }}_{i'i})\nonumber \\&+~(4n - 11){{\,{\mathrm{tr}}\,}}({\varvec{\Sigma }}_{ii'}{\varvec{\Sigma }}_{i'j}{\varvec{\Sigma }}_{jj'}{\varvec{\Sigma }}_{j'i}) + 3(n - 3)\Vert {\varvec{\Sigma }}_{i'j}\Vert ^2\Vert {\varvec{\Sigma }}_{ij'}\Vert ^2 \nonumber \\&+ (3n - 10)\Vert {\varvec{\Sigma }}_{ii'}\Vert ^2\Vert {\varvec{\Sigma }}_{jj'}\Vert ^2\Big ]~~~~ \end{aligned}$$
(21)

where \(a_1(n) = 3n^3 - 38n^2 + 170n - 262\), \(b_1(n) = 6n^3 - 57n^2 + 178n - 182\), \(a_2(n) = 3n^3 - 39n^2 + 176n - 269\), \(b_2(n) = 6n^3 - 70n^2 + 286n - 199\). Under \(H_{0b}\), the covariances vanish and the variance reduces to

$$\begin{aligned} {{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_b) = \frac{2(2n^2 - 12n + 21)}{P(n)}\underset{i< j}{\sum _{i = 1}^b\sum _{j = 1}^b}\frac{1}{p^2_ip^2_j}\Vert {\varvec{\Sigma }}_{ii}\Vert ^2\Vert {\varvec{\Sigma }}_{jj}\Vert ^2 ~=~ O\left( \frac{1}{n^2}\right) \underset{i < j}{\sum _{i = 1}^b\sum _{j = 1}^b}{{\,{\mathrm{V}}\,}}_{ij}, \end{aligned}$$
(22)

where \({{\,{\mathrm{V}}\,}}_{ij} = \Vert {\varvec{\Sigma }}_{ii}\Vert ^2\Vert {\varvec{\Sigma }}_{jj}\Vert ^2/p^2_ip^2_j\). This implies that \({{\,{\mathrm{Var}}\,}}(n{{\,{\mathrm{T}}\,}}_b)\) is bounded and a non-degenerate limit of \(n{{\,{\mathrm{T}}\,}}_b\) may exist. That this indeed holds under the assumptions follows from the limit of \({{\,{\mathrm{T}}\,}}_{ij}\) given in Ahmad ([5], Theorem 3), by noting that the covariance terms above essentially vanish under the same assumptions. The following theorem summarizes the result; its proof, along with that of Theorem 12 for the multi-sample case, is given in “Appendix 1.2.3”.

Theorem 6

Let \({{\,{\mathrm{T}}\,}}_b\) be as in (17) and suppose Assumptions 3–5 hold. Then \(({{\,{\mathrm{T}}\,}}_b - {{\,{\mathrm{E}}\,}}({{\,{\mathrm{T}}\,}}_b))/\sigma _{{{\,{\mathrm{T}}\,}}_b} ~\xrightarrow {\mathcal{D}}~ N(0, 1)\), as \(n, p_i \rightarrow \infty \), with \(\sigma ^2_{{{\,{\mathrm{T}}\,}}_b} = {{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_b)\) in (18). Under \(H_{0b}\) and Assumptions 3–4, \(n{{\,{\mathrm{T}}\,}}_b/\sigma _{{{\,{\mathrm{T}}\,}}_{b}} ~\xrightarrow {\mathcal{D}}~ N(0, 1)\) with \(\sigma ^2_{{{\,{\mathrm{T}}\,}}_b}\) as in Eq. (22). Further, the same limits hold when \({{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_b)\) is replaced with \(\widehat{{{\,{\mathrm{Var}}\,}}}({{\,{\mathrm{T}}\,}}_b)\).

The last part of Theorem 6 makes the statistic \({{\,{\mathrm{T}}\,}}_b\) applicable in practice, once a consistent estimator, i.e.,

$$\begin{aligned} \widehat{{{\,{\mathrm{Var}}\,}}}(n{{\,{\mathrm{T}}\,}}_b) = \underset{i < j}{\sum _{i = 1}^b\sum _{j = 1}^b}\widehat{{{\,{\mathrm{V}}\,}}}_{ij}, \end{aligned}$$
(23)

is plugged in for the true variance, where \(\widehat{{{\,{\mathrm{V}}\,}}}_{ij} = \widehat{\Vert {\varvec{\Sigma }}_{ii}\Vert ^2}\widehat{\Vert {\varvec{\Sigma }}_{jj}\Vert ^2}/p^2_ip^2_j\). From the proof of the theorem, we also note that the moments and the limit of \({{\,{\mathrm{T}}\,}}_b\) are functions of \(\eta _{ij} = \Vert {\varvec{\Sigma }}_{ij}\Vert ^2/\Vert {\varvec{\Sigma }}_{ii}\Vert \Vert {\varvec{\Sigma }}_{jj}\Vert \). For the power of \({{\,{\mathrm{T}}\,}}_b\), let \(\beta ({\varvec{\theta }})\) be the power function with \({\varvec{\theta }} \in {\varvec{\Theta }}\), where

$$\begin{aligned} {\varvec{\Theta }}_0 = \{{\varvec{\Sigma }}_{ii}, i \in \{1, \ldots , b\}\}, ~~ {\varvec{\Theta }}_1 = \{{\varvec{\Sigma }}_{ii}, {\varvec{\Sigma }}_{ij}, i, j \in \{1, \ldots , b\}, i < j\} \end{aligned}$$

are the parameter subspaces under \(H_{0b}\) and \(H_{1b}\), respectively. Let \(z_\alpha \) denote the upper 100\(\alpha \)% quantile of the N(0, 1) distribution, so that \(P({{\,{\mathrm{T}}\,}}_b/\sigma _{{{\,{\mathrm{T}}\,}}_{b0}} \ge z_\alpha ) = \alpha \) under \(H_{0b}\), by Theorem 6, where \(\sigma ^2_{{{\,{\mathrm{T}}\,}}_{b0}}\) denotes \({{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_b)\) under \(H_{0b}\). Also, let \(\gamma = \sigma _{T_{b0}}/\sigma _{T_b}\), \(\delta = {{\,{\mathrm{E}}\,}}({{\,{\mathrm{T}}\,}}_b)/\sigma _{{{\,{\mathrm{T}}\,}}_b}\) with \({{\,{\mathrm{E}}\,}}({{\,{\mathrm{T}}\,}}_b) = \sum _{i < j}^b\Vert {\varvec{\Sigma }}_{ij}\Vert ^2/p_ip_j\), \(\sigma ^2_{{{\,{\mathrm{T}}\,}}_b} = {{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_b)\). Then, \(\beta ({\varvec{\theta }}|H_{1b})\) = \(P\{({{\,{\mathrm{T}}\,}}_b - {{\,{\mathrm{E}}\,}}({{\,{\mathrm{T}}\,}}_b))/\sigma _{{{\,{\mathrm{T}}\,}}_b} \ge \gamma z_\alpha - \delta \}\) = 1 - \(\Phi (\gamma z_\alpha - \delta )\) defines the power of \({{\,{\mathrm{T}}\,}}_b\), where \(\Phi (\cdot )\) is the distribution function of N(0, 1). From Theorem 2, \(\gamma = O(1)\) and \(\delta = O(n)\) under the assumptions, so that 1 - \(\Phi (\gamma z_\alpha - \delta ) \rightarrow \) 1 as \(n, p_i \rightarrow \infty \). Finally, letting \({\varvec{\Sigma }}_{ij} = \mathbf{A}_{ij}/\sqrt{n}\) for a fixed matrix \(\mathbf{A}_{ij}\), \(i, j = 1, \ldots , b\), \(i < j\), similar arguments imply that the local power also converges to 1 as \(n, p_i \rightarrow \infty \).
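Putting the pieces together, a sketch of the resulting one-sample test is given below: the null variance of Eq. (22) is estimated by plugging in the \(\widehat{{{\,{\mathrm{V}}\,}}}_{ij}\) of Eq. (23), and the standardized statistic is referred to the upper normal quantile, as licensed by Theorem 6. The sketch reuses the hypothetical `hat_norms` and `T_b` helpers from above; all names are ours.

```python
import numpy as np
from scipy.stats import norm

def test_H0b(blocks, alpha=0.05):
    """One-sample test of H_{0b}; blocks is a list of n x p_i data matrices."""
    n, b = blocks[0].shape[0], len(blocks)
    sum_V_hat = 0.0
    for i in range(b):
        for j in range(i + 1, b):
            Sii2_hat = hat_norms(blocks[i], blocks[i])[0]        # \hat{||Sigma_{ii}||^2}
            Sjj2_hat = hat_norms(blocks[j], blocks[j])[0]
            sum_V_hat += Sii2_hat * Sjj2_hat / (blocks[i].shape[1]**2 * blocks[j].shape[1]**2)
    Pn = n * (n - 1) * (n - 2) * (n - 3)
    var0_hat = 2 * (2 * n**2 - 12 * n + 21) / Pn * sum_V_hat     # Eq. (22) with plug-in V_ij
    stat = T_b(blocks) / np.sqrt(var0_hat)
    return stat, stat > norm.ppf(1 - alpha)                      # reject H_{0b} for large stat
```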

4 Multi-Sample Extension with Multiple Blocks

Consider the multi-sample setup given after (1) with \(\mathbf{X}_{lk} = (\mathbf{X}^T_{l1k}, \ldots , \mathbf{X}^T_{lbk})^T \in \mathbb {R}^p\), \(\mathbf{X}_{lik} \in \mathbb {R}^{p_i}\), \(k = 1, \ldots , n_l\), as iid vectors from population l, where \({{\,{\mathrm{E}}\,}}(\mathbf{X}_{lk}) = {\varvec{\mu }}_l = ({\varvec{\mu }}^T_{l1}, \ldots , {\varvec{\mu }}^T_{lb})^T \in \mathbb {R}^p\), \({\varvec{\mu }}_{li} \in \mathbb {R}^{p_i}\), \({{\,{\mathrm{Cov}}\,}}(\mathbf{X}_{lk}) = {\varvec{\Sigma }}_l = ({\varvec{\Sigma }}_{lij})_{i,j = 1}^{~~b} \in \mathbb {R}^{p\times p}\), \(l = 1, \ldots , g\), \({\varvec{\Sigma }}_{lij} = {{\,{\mathrm{E}}\,}}(\mathbf{X}_{lik} - {\varvec{\mu }}_{li})\otimes (\mathbf{X}_{ljk} - {\varvec{\mu }}_{lj})^T \in \mathbb {R}^{p_i\times p_j}\), \(i < j\), \({\varvec{\Sigma }}_{lji} = {\varvec{\Sigma }}^T_{lij}\), \({\varvec{\Sigma }}^T_{lii} = {\varvec{\Sigma }}_{lii}\). Then \(\overline{\mathbf{X}}_l = \sum _{k = 1}^{n_l}{} \mathbf{X}_{lk}/n_l\), \(\widehat{\varvec{\Sigma }}_l = \sum _{k = 1}^{n_l}(\widetilde{\mathbf{X}}_{lk}\otimes \widetilde{\mathbf{X}}_{lk})/(n_l - 1)\) are unbiased estimators of \({\varvec{\mu }}_l\), \({\varvec{\Sigma }}_l\), correspondingly partitioned as \(\overline{\mathbf{X}}_l = (\overline{\mathbf{X}}^T_{l1}, \ldots , \overline{\mathbf{X}}^T_{lb})^T\), \(\widehat{\varvec{\Sigma }}_l = (\widehat{\varvec{\Sigma }}_{lij})_{i, j = 1}^{~~b}\) with \(\overline{\mathbf{X}}_{li} = \sum _{k = 1}^{n_l}{} \mathbf{X}_{lik}/n_l\), \(\widehat{\varvec{\Sigma }}_{lij} = \sum _{k = 1}^{n_l}(\widetilde{\mathbf{X}}_{lik}\otimes \widetilde{\mathbf{X}}_{ljk})/(n_l - 1)\) as unbiased estimators of \({\varvec{\mu }}_{li}\) and \({\varvec{\Sigma }}_{lij}\), where \(\widetilde{\mathbf{X}}_{lk} = \mathbf{X}_{lk} - \overline{\mathbf{X}}_l\) and \(\widetilde{\mathbf{X}}_{lik} = \mathbf{X}_{lik} - \overline{\mathbf{X}}_{li}\). Denoting \(\mathbf{X}_l = (\mathbf{X}^T_{l1}, \ldots , \mathbf{X}^T_{ln_l})^T \in \mathbb {R}^{n_l\times p}\) as the entire data matrix for the lth sample, with \(\mathbf{X}_{li} = (\mathbf{X}^T_{li1}, \ldots , \mathbf{X}^T_{lin_l})^T \in \mathbb {R}^{n_l\times p_i}\) as the ith block data matrix in \(\mathbf{X}_l\), we can rewrite the estimators, using \(\mathbf{C}_l = \mathbf{I}_{n_l} - \mathbf{J}_{n_l}/n_l\), \(l = 1, \ldots , g\), as

$$\begin{aligned} \overline{\mathbf{X}}_l = \frac{1}{n_l}{} \mathbf{X}^T_l\mathbf{1}_{n_l},~~~\widehat{\varvec{\Sigma }}_l = \frac{1}{n_l - 1}\mathbf{X}^T_l\mathbf{C}_l\mathbf{X}_l,~~~\overline{\mathbf{X}}_{li} = \frac{1}{n_l}{} \mathbf{X}^T_{li}{} \mathbf{1}_{n_l},~~~\widehat{\varvec{\Sigma }}_{lij} = \frac{1}{n_l - 1}{} \mathbf{X}^T_{li}{} \mathbf{C}_l\mathbf{X}_{lj}. \end{aligned}$$
(24)

The RV coefficient in Eq. (3) can now be computed from the lth sample, \(l = 1, \ldots , g\), as

$$\begin{aligned} \widehat{\eta }_{lij} = \frac{\widehat{\Vert {\varvec{\Sigma }}_{lij}\Vert ^2}}{\sqrt{\widehat{\Vert {\varvec{\Sigma }}_{lii}\Vert ^2}\widehat{\Vert {\varvec{\Sigma }}_{ljj}\Vert ^2}}} \end{aligned}$$
(25)

which estimates \(\eta _{lij} = \Vert {\varvec{\Sigma }}_{lij}\Vert ^2/\Vert {\varvec{\Sigma }}_{lii}\Vert \Vert {\varvec{\Sigma }}_{ljj}\Vert \). We are interested in testing the hypotheses in (1) for the common \({\varvec{\Sigma }}\), assuming \({\varvec{\Sigma }}_l = {\varvec{\Sigma }}\) \(\forall ~l\), i.e., in testing whether \({\varvec{\Sigma }}\) is block-diagonal, \({\varvec{\Sigma }} = \text {{diag}}({\varvec{\Sigma }}_{11}, \ldots , {\varvec{\Sigma }}_{bb})\). Formally,

$$\begin{aligned} H_{0g}: {\varvec{\Sigma }}_{ij} = \mathbf{0}~\forall ~i \ne j ~~\text {{vs.}}~~ H_{1g}: {\varvec{\Sigma }}_{ij} \ne \mathbf{0}~\text {{for at least one pair }} i \ne j, \end{aligned}$$
(26)

where the subscript g refers to the g populations. A test of \(H_{0g}\) can thus be constructed by pooling information from the g populations. For this, we first state Assumptions 3–5 for the multi-sample case.

Assumption 7

\({{\,{\mathrm{E}}\,}}(Y^4_{lks}) = \gamma _s \le \gamma < \infty \), \(\forall ~s = 1, \ldots , p\), \(l = 1, \ldots , g\).

Assumption 8

\(\lim _{p_i \rightarrow \infty }\frac{\Vert {\varvec{\Delta }}_{lii}\Vert ^2}{p_i} = O(1)\), \(\forall ~i = 1, \ldots , b\), \(l = 1, \ldots , g\).

Assumption 9

\(\lim _{n_l, p_i \rightarrow \infty }\frac{p_i}{n_l} = \zeta _l \le \zeta _0 \in (0, \infty )\), \(\forall ~i = 1, \ldots , b\), \(l = 1, \ldots , g\).

Now, we begin by writing the estimators in Eqs. (6)–(9) for the lth sample as

$$\begin{aligned} \widehat{\Vert {\varvec{\Sigma }}_{lii}\Vert ^2}= & {} \nu _l\left[ (n_l - 1)(n_l - 2)\Vert \widehat{\varvec{\Sigma }}_{lii}\Vert ^2 + \big \{\Vert \widehat{\varvec{\Delta }}_{lii}\Vert ^2\big \}^2 - n_lQ_{lii}\right] \end{aligned}$$
(27)
$$\begin{aligned} \widehat{\Vert {\varvec{\Sigma }}_{lij}\Vert ^2}= & {} \nu _l\left[ (n_l - 1)(n_l - 2)\Vert \widehat{\varvec{\Sigma }}_{lij}\Vert ^2 + \Vert \widehat{\varvec{\Delta }}_{lii}\Vert ^2\Vert \widehat{\varvec{\Delta }}_{ljj}\Vert ^2 - n_lQ_{lij}\right] \end{aligned}$$
(28)
$$\begin{aligned} \widehat{\big [\Vert {\varvec{\Delta }}_{lii}\Vert ^2\big ]^2}= & {} \nu _l\left[ 2\Vert \widehat{\varvec{\Sigma }}_{lii}\Vert ^2 + (n^2_l - 3n_l + 1)\big [\Vert \widehat{\varvec{\Delta }}_{lii}\Vert ^2\big ]^2 - n_lQ_{lii}\right] , \end{aligned}$$
(29)
$$\begin{aligned} \widehat{\Vert {\varvec{\Delta }}_{lii}\Vert ^2\Vert {\varvec{\Delta }}_{ljj}\Vert ^2}= & {} \nu _l\left[ 2\Vert \widehat{\varvec{\Sigma }}_{lij}\Vert ^2 + (n^2_l - 3n_l + 1)\Vert \widehat{\varvec{\Delta }}_{lii}\Vert ^2\Vert \widehat{\varvec{\Delta }}_{ljj}\Vert ^2 - n_lQ_{lij}\right] , \end{aligned}$$
(30)

where \(\nu _l = (n_l - 1)/n_l(n_l - 2)(n_l - 3)\), \(Q_{lij} = \sum _{k = 1}^{n_l}\widetilde{z}_{lik}\widetilde{z}_{ljk}/(n_l - 1)\) and \(Q_{lii} = \sum _{k = 1}^{n_l}\widetilde{z}^2_{lik}/(n_l - 1)\) with \(\widetilde{z}_{lik} = \widetilde{\mathbf{X}}^T_{lik}\widetilde{\mathbf{X}}_{lik}\) and \(\widetilde{\mathbf{X}}_{lik} = \mathbf{X}_{lik} - \overline{\mathbf{X}}_{li}\), \(i, j = 1, \ldots , b\), \(i < j\). Note that these are unbiased estimators of \(\Vert {\varvec{\Sigma }}_{ii}\Vert ^2\), \(\Vert {\varvec{\Sigma }}_{ij}\Vert ^2\), \(\big [\Vert {\varvec{\Delta }}_{ii}\Vert ^2\big ]^2\) and \(\Vert {\varvec{\Delta }}_{ii}\Vert ^2\Vert {\varvec{\Delta }}_{jj}\Vert ^2\), respectively, obtained from sample l, so that they can be used to construct pooled estimators of the same parameters, as follows.

Theorem 10

Denote \(\nu _0 = \sum _{l = 1}^g\nu ^{-1}_l\) with \(\nu _l = (n_l - 1)/n_l(n_l - 2)(n_l - 3)\). The pooled unbiased estimators of \(\Vert {\varvec{\Sigma }}_{ii}\Vert ^2\), \(\Vert {\varvec{\Sigma }}_{ij}\Vert ^2\), \(\big [\Vert {\varvec{\Delta }}_{ii}\Vert ^2\big ]^2\) and \(\Vert {\varvec{\Delta }}_{ii}\Vert ^2\Vert {\varvec{\Delta }}_{jj}\Vert ^2\) are defined, respectively, as

$$\begin{aligned} \widehat{\Vert {\varvec{\Sigma }}_{pii}\Vert ^2}= & {} \frac{1}{\nu _0}\sum _{l = 1}^g\frac{1}{\nu _l}\widehat{\Vert {\varvec{\Sigma }}_{lii}\Vert ^2} \end{aligned}$$
(31)
$$\begin{aligned} \widehat{\Vert {\varvec{\Sigma }}_{pij}\Vert ^2}= & {} \frac{1}{\nu _0}\sum _{l = 1}^g\frac{1}{\nu _l}\widehat{\Vert {\varvec{\Sigma }}_{lij}\Vert ^2} \end{aligned}$$
(32)
$$\begin{aligned} \widehat{\big [\Vert {\varvec{\Delta }}_{pii}\Vert ^2\big ]^2}= & {} \frac{1}{\nu _0}\sum _{l = 1}^g\frac{1}{\nu _l}\widehat{\big [\Vert {\varvec{\Delta }}_{lii}\Vert ^2\big ]^2}, \end{aligned}$$
(33)
$$\begin{aligned} \widehat{\Vert {\varvec{\Delta }}_{pii}\Vert ^2\Vert {\varvec{\Delta }}_{pjj}\Vert ^2}= & {} \frac{1}{\nu _0}\sum _{l = 1}^g\frac{1}{\nu _l}\widehat{\Vert {\varvec{\Delta }}_{lii}\Vert ^2\Vert {\varvec{\Delta }}_{ljj}\Vert ^2}. \end{aligned}$$
(34)
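For illustration, the pooled estimators of Theorem 10 amount to a weighted average of the single-sample estimates with weights \(1/\nu _l\); the sketch below (our own naming) pools the outputs of the hypothetical `hat_norms` helper computed separately on each sample.

```python
def pool_estimates(per_sample_ests, sample_sizes):
    """per_sample_ests: list over l of the 4-tuples from hat_norms; sample_sizes: list of n_l."""
    inv_nu = [n_l * (n_l - 2) * (n_l - 3) / (n_l - 1) for n_l in sample_sizes]   # 1 / nu_l
    nu0 = sum(inv_nu)                                                            # nu_0
    # component-wise weighted average, Eqs. (31)-(34)
    return [sum(w * est[m] for w, est in zip(inv_nu, per_sample_ests)) / nu0 for m in range(4)]
```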

In pooling information across the g samples, Eqs. (31)–(34) weight each single-sample estimator by \(1/\nu _l\); the pooled estimators correspond to the common \({\varvec{\Sigma }} = ({\varvec{\Sigma }}_{ij})_{i, j = 1}^{~~b}\) for which \(H_{0g}\) is defined. Thus, an appropriate test statistic for \(H_{0g}\) can be defined as

$$\begin{aligned} {{\,{\mathrm{T}}\,}}_g = \underset{i < j}{\sum _{i = 1}^b\sum _{j = 1}^b}{{\,{\mathrm{T}}\,}}_{pij} ~~\text {{where}}~~ {{\,{\mathrm{T}}\,}}_{pij} = \frac{1}{p_ip_j}\widehat{\Vert {\varvec{\Sigma }}_{pij}\Vert ^2} \end{aligned}$$
(35)

which extends Eq. (17) to the multi-sample case under homogeneity. Equivalently, we can write

$$\begin{aligned} {{\,{\mathrm{T}}\,}}_g = \frac{1}{\nu _0}\sum _{l = 1}^g\frac{1}{\nu _l}{{\,{\mathrm{T}}\,}}_{lb} ~~\text {{with}}~~ {{\,{\mathrm{T}}\,}}_{lb} = \underset{i < j}{\sum _{i = 1}^b\sum _{j = 1}^b}{{\,{\mathrm{T}}\,}}_{lij} ~~\text {{where}}~~ {{\,{\mathrm{T}}\,}}_{lij} = \frac{1}{p_ip_j}\widehat{\Vert {\varvec{\Sigma }}_{lij}\Vert ^2}. \end{aligned}$$
(36)

In this form, \({{\,{\mathrm{T}}\,}}_{lij}\), and hence \({{\,{\mathrm{T}}\,}}_{lb}\), are the same as \({{\,{\mathrm{T}}\,}}_{ij}\) and \({{\,{\mathrm{T}}\,}}_b\) in Sect. 3, but now defined for the lth population. In either case, the main focus in the formulation of \({{\,{\mathrm{T}}\,}}_g\) is simplicity: by independence of the g samples, the computations for the one-sample case largely suffice for the multi-sample case, as illustrated, for example, by the results of the following theorem; see “Appendix 1.2.2” for the proof.
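Accordingly, \({{\,{\mathrm{T}}\,}}_g\) in Eq. (36) can be sketched as a weighted combination of the per-sample statistics; `T_b` below is the hypothetical one-sample helper from the sketch in Sect. 3, and all names are ours.

```python
def T_g(samples):
    """samples: list over l of [X_{l1}, ..., X_{lb}] block data matrices (n_l rows each)."""
    inv_nu, stats = [], []
    for blocks in samples:
        n_l = blocks[0].shape[0]
        inv_nu.append(n_l * (n_l - 2) * (n_l - 3) / (n_l - 1))   # 1 / nu_l
        stats.append(T_b(blocks))                                # T_{lb}
    return sum(w * t for w, t in zip(inv_nu, stats)) / sum(inv_nu)
```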

Theorem 11

The pooled estimators \(\widehat{\Vert {\varvec{\Sigma }}_{pii}\Vert ^2}\) and \(\widehat{\Vert {\varvec{\Sigma }}_{pij}\Vert ^2}\) in Eqs. (31)–(32) are unbiased for \(\Vert {\varvec{\Sigma }}_{ii}\Vert ^2\) and \(\Vert {\varvec{\Sigma }}_{ij}\Vert ^2\), respectively. Further, under Assumptions 7–9, as \(n_l, p_i \rightarrow \infty \),

$$\begin{aligned} {{\,{\mathrm{Var}}\,}}\left( \sqrt{\nu _0}\frac{\widehat{\Vert {\varvec{\Sigma }}_{pii}\Vert ^2}}{\Vert {\varvec{\Sigma }}_{ii}\Vert ^2}\right)= &\, {} 4\left[ 1 + o(1)\right] \end{aligned}$$
(37)
$$\begin{aligned} {{\,{\mathrm{Var}}\,}}\left( \sqrt{\nu _0}\frac{\widehat{\Vert {\varvec{\Sigma }}_{pij}\Vert ^2}}{\Vert {\varvec{\Sigma }}_{ij}\Vert ^2}\right)= &\, {} 4\left( 1 + \frac{1}{\eta ^2_{ij}}\right) [1 + o(1)], \end{aligned}$$
(38)

where \(\nu _0 = \sum _{l = 1}^g\nu ^{-1}_l\) and \(\eta _{ij} = \Vert {\varvec{\Sigma }}_{ij}\Vert ^2/\Vert {\varvec{\Sigma }}_{ii}\Vert \Vert {\varvec{\Sigma }}_{jj}\Vert \), \(i, j = 1,\ldots , b\), \(i < j\).

Theorem 11 provides the orders of the variance ratios, which are what matter for our purpose; the exact variances, which follow from those of the single-sample case in Theorem 2, are given in “Appendix 1.2.2”. Moreover, Eq. (37) also implies that \(\widehat{\Vert {\varvec{\Sigma }}_{pii}\Vert ^2}/p^2_i\) is a consistent estimator of \(\Vert {\varvec{\Sigma }}_{ii}\Vert ^2/p^2_i\), \(i = 1, \ldots , b\), as \(n_l, p_i \rightarrow \infty \). Now, from Theorem 11,

$$\begin{aligned} {{\,{\mathrm{E}}\,}}({{\,{\mathrm{T}}\,}}_g)= & {} \underset{i < j}{\sum _{i = 1}^b\sum _{j = 1}^b}\frac{1}{p_ip_j}\Vert {\varvec{\Sigma }}_{ij}\Vert ^2 \end{aligned}$$
(39)
$$\begin{aligned} {{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_g)= & {} \frac{1}{\nu ^2_0}\sum _{l = 1}^g\frac{1}{\nu ^2_l}{{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_{lb}), \end{aligned}$$
(40)

since \({{\,{\mathrm{Cov}}\,}}({{\,{\mathrm{T}}\,}}_{lb}, {{\,{\mathrm{T}}\,}}_{l'b}) = 0\), \(l \ne l'\), by independence, where \({{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_{lb})\) follows directly from Eq. (18) as

$$\begin{aligned} {{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_{lb})= & {} \underset{i< j}{\sum _{i = 1}^b\sum _{j = 1}^b}{{\,{\mathrm{Var}}\,}}({{\,\mathrm{{T}}\,}}_{lij}) + 2\underset{i< j< j'}{\sum _{i = 1}^b\sum _{j = 1}^b\sum _{j' = 1}^b}{{\,{\mathrm{Cov}}\,}}({{\,{\mathrm{T}}\,}}_{lij}, {{\,{\mathrm{T}}\,}}_{lij'})\nonumber \\&+~2\underset{i< i'< j}{\sum _{i = 1}^b\sum _{i' = 1}^b\sum _{j = 1}^b}{{\,{\mathrm{Cov}}\,}}({{\,{\mathrm{T}}\,}}_{lij}, {{\,{\mathrm{T}}\,}}_{li'j}) + \underset{i< i'< j < j'}{\sum _{i = 1}^b\sum _{i' = 1}^b\sum _{j = 1}^b\sum _{j' = 1}^b}{{\,{\mathrm{Cov}}\,}}({{\,{\mathrm{T}}\,}}_{lij}, {{\,{\mathrm{T}}\,}}_{li'j'})~ \end{aligned}$$
(41)

\(l = 1, \ldots , g\). Equations (19)–(21) provide variances and covariances in \({{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_{lb})\), using the results of Theorem 2 with \({{\,{\mathrm{T}}\,}}_{ij}\)’s replaced with \({{\,{\mathrm{T}}\,}}_{lij}\) and n with \(n_l\). We therefore skip unnecessary repetitions. Now, under \(H_{0g}\), \({{\,{\mathrm{E}}\,}}({{\,{\mathrm{T}}\,}}_g) = 0\) and

$$\begin{aligned} {{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_g)=\, & {} \frac{1}{\nu ^2_0}\sum _{l = 1}^g\frac{1}{\nu ^2_l}\frac{2(2n^2_l - 12n_l + 21)}{P(n_l)}\underset{i< j}{\sum _{i = 1}^b\sum _{j = 1}^b}\frac{1}{p^2_ip^2_j}\Vert {\varvec{\Sigma }}_{lii}\Vert ^2\Vert {\varvec{\Sigma }}_{ljj}\Vert ^2\nonumber \\=\, & {} O\left( \frac{1}{\nu _0}\right) \underset{i < j}{\sum _{i = 1}^b\sum _{j = 1}^b}{{\,{\mathrm{V}}\,}}_{ij}, \end{aligned}$$
(42)

where \(\nu _0 = \sum _{l = 1}^g\nu ^{-1}_l\), \(P(n_l) = n_l(n_l - 1)(n_l - 2)(n_l - 3)\) and, by homogeneity, \({{\,{\mathrm{V}}\,}}_{ij} = \Vert {\varvec{\Sigma }}_{ii}\Vert ^2\Vert {\varvec{\Sigma }}_{jj}\Vert ^2/p^2_ip^2_j\) as defined after Eq. (22). From the construction and moments of \({{\,{\mathrm{T}}\,}}_g\) and the independence of the samples, its limit follows along the same lines as that of \({{\,{\mathrm{T}}\,}}_b\) without much new computation. Likewise, the consistency of \(\widehat{\Vert {\varvec{\Sigma }}_{pii}\Vert ^2}/p^2_i\) follows from that of the single-sample estimators \(\widehat{\Vert {\varvec{\Sigma }}_{lii}\Vert ^2}/p^2_i\), so that, plugged in, they yield \(\widehat{{{\,{\mathrm{Var}}\,}}}({{\,{\mathrm{T}}\,}}_g)\) as a consistent estimator of \({{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_g)\), as follows, where \(\widehat{{{\,{\mathrm{V}}\,}}}_{pij} = \widehat{\Vert {\varvec{\Sigma }}_{pii}\Vert ^2}\widehat{\Vert {\varvec{\Sigma }}_{pjj}\Vert ^2}/p^2_ip^2_j\):

$$\begin{aligned} \widehat{{{\,{\mathrm{Var}}\,}}}(\sqrt{\nu _0}{{\,{\mathrm{T}}\,}}_g) = \underset{i < j}{\sum _{i = 1}^b\sum _{j = 1}^b}\widehat{{{\,{\mathrm{V}}\,}}}_{pij}. \end{aligned}$$
(43)
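A sketch of the plug-in null-variance estimator in Eq. (43) is given below; `pooled_Sii2` stands for the pooled estimates \(\widehat{\Vert {\varvec{\Sigma }}_{pii}\Vert ^2}\) of Eq. (31), and the names are ours. Per Theorem 12, \(\sqrt{\nu _0}\,{{\,{\mathrm{T}}\,}}_g\) divided by the square root of this quantity is then referred to the standard normal.

```python
def var_hat_Tg(pooled_Sii2, p_dims):
    """Eq. (43): sum of hat{V}_{pij} over i < j; pooled_Sii2[i] estimates ||Sigma_{ii}||^2."""
    b, total = len(p_dims), 0.0
    for i in range(b):
        for j in range(i + 1, b):
            total += pooled_Sii2[i] * pooled_Sii2[j] / (p_dims[i]**2 * p_dims[j]**2)
    return total
```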

The following theorem extends Theorem 6 to the multi-sample case; see “Appendix 1.2.3” for proof.

Theorem 12

For \({{\,{\mathrm{T}}\,}}_g\) in (35) and under Assumptions 7–9, \(({{\,{\mathrm{T}}\,}}_g - {{\,{\mathrm{E}}\,}}({{\,{\mathrm{T}}\,}}_g))/\sigma _{{{\,{\mathrm{T}}\,}}_g} ~\xrightarrow {\mathcal {D}}~ N(0, 1)\), as \(n_l, p_i \rightarrow \infty \), with \(\sigma ^2_{{{\,{\mathrm{T}}\,}}_g} = {{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_g)\) in Eq. (40). Under \(H_{0g}\) and Assumptions 7–8, \(\sqrt{\nu _0}{{\,{\mathrm{T}}\,}}_g/\sigma _{{{\,{\mathrm{T}}\,}}_g}\) \(\xrightarrow {\mathcal {D}}\) N(0, 1) with \(\sigma ^2_{{{\,{\mathrm{T}}\,}}_g}\) in Eq. (42). Further, the limits hold if \({{\,{\mathrm{Var}}\,}}({{\,{\mathrm{T}}\,}}_g)\) is replaced with a consistent estimator \(\widehat{{{\,{\mathrm{Var}}\,}}}({{\,{\mathrm{T}}\,}}_g)\).

5 Simulations

We evaluate the performance of \({{\,{\mathrm{T}}\,}}_b\) and \({{\,{\mathrm{T}}\,}}_g\) through simulations. For the one-sample multi-block statistic, \({{\,{\mathrm{T}}\,}}_b\), we take \(b = 3\) and sample random vectors of sizes \(n \in \{20, 50, 100\}\) from multivariate normal, t and chi-square distributions, with 10 degrees of freedom each for the latter two, with block dimensions \(p_1 \in \{10, 25, 50, 150, 300\}\), \(p_2 = 2p_1\), \(p_3 = 3p_1\). Two covariance patterns are assumed for the true covariance matrix, i.e., compound symmetry (CS) and AR(1), defined, respectively, as \((1 - \rho )\mathbf{I}_p + \rho \mathbf{J}_p\) and \({\varvec{\Sigma }} = \mathbf{B}\mathbf{A}\mathbf{B}\), where \(\mathbf{A}\) has entries \(\rho ^{|k - l|^{1/5}}\), \(k, l = 1, \ldots , p\), \(\mathbf{B}\) is a diagonal matrix with entries \(\sqrt{\rho + k/p}\), \(k = 1, \ldots , p\), \(\mathbf{I}\) is the identity matrix and \(\mathbf{J}\) is a matrix of 1s. We set \(\rho = 0.5\). Under \(H_{0b}\), the same structures are imposed on the three diagonal blocks \({\varvec{\Sigma }}_{ii}\) of dimensions \(p_i\), so that \({\varvec{\Sigma }} = \oplus _{i = 1}^3{\varvec{\Sigma }}_{ii}\), where \(\oplus \) denotes the direct (block-diagonal) sum.
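For reference, the two covariance patterns used in the simulations can be generated as follows (a minimal sketch under the stated settings; the construction of \(\mathbf{B}\) follows our reading of the description above).

```python
import numpy as np

def cs_cov(p, rho=0.5):
    """Compound symmetry: (1 - rho) I_p + rho J_p."""
    return (1 - rho) * np.eye(p) + rho * np.ones((p, p))

def ar_cov(p, rho=0.5):
    """AR(1)-type structure Sigma = B A B with A_{kl} = rho^{|k-l|^{1/5}}."""
    idx = np.arange(1, p + 1)
    A = rho ** (np.abs(idx[:, None] - idx[None, :]) ** 0.2)
    B = np.diag(np.sqrt(rho + idx / p))          # diagonal entries sqrt(rho + k/p)
    return B @ A @ B
```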

For the multi-sample statistic, \({{\,{\mathrm{T}}\,}}_g\), we take \(g = 2\) with \(b = 2\), and generate \(n_1 \in \{20, 40, 75\}\) and \(n_2 \in \{30, 50, 100\}\) iid vectors from the respective populations with \(p_1 \in \{20, 50, 100, 300, 500\}\) and \(p_2 = 2p_1\). Under homogeneity, the common \({\varvec{\Sigma }}\) follows the two covariance structures under the alternative, while under the null, the two diagonal blocks follow the same structures with their respective dimensions. The estimated size and power of \({{\,{\mathrm{T}}\,}}_b\) and \({{\,{\mathrm{T}}\,}}_g\), reported in Tables 1, 2, 3 and 4, respectively, are averages over 1000 simulation runs.

Table 1 Estimated size of \({{\,{\mathrm{T}}\,}}_b\) with 3 blocks for normal, t and uniform distributions (ND,TD,UD) under compound symmetric and autoregressive (CS,AR) structures
Table 2 Estimated power of \({{\,{\mathrm{T}}\,}}_b\) with 3 blocks for normal, t and uniform distributions (ND,TD,UD) under compound symmetric and autoregressive (CS,AR) structures
Table 3 Estimated size of \({{\,{\mathrm{T}}\,}}_g\) with 3 blocks for normal, t and uniform distributions (ND,TD,UD) under compound symmetric and autoregressive (CS,AR) structures
Table 4 Estimated power of \({{\,{\mathrm{T}}\,}}_g\) with 3 blocks for normal, t and uniform distributions (ND,TD,UD) under compound symmetric and autoregressive (CS,AR) structures

We observe accurate test size for both tests, \({{\,{\mathrm{T}}\,}}_b\) and \({{\,{\mathrm{T}}\,}}_g\), for all parameter settings and under all distributions. For the multi-sample case, the test performs comparably well even though the sample sizes differ considerably. Generally, both tests are conservative for relatively small sample sizes, but improve with increasing sample size. In particular, the accuracy of the tests remains intact as the dimension increases. The results under the t and uniform distributions point to the robustness of the tests to the normality assumption, as expected under Model (4).

Similar performance is observed for the power of both test statistics. Like the test size, whose accuracy increases with the sample size, the power also improves with increasing sample size, and additionally with increasing dimension, even when the block dimensions differ substantially. In particular, for the multi-sample case, we observe accurate performance of \({{\,{\mathrm{T}}\,}}_g\) for unbalanced designs, and the accuracy is not disturbed by increasing dimension. The robustness of the two tests to non-normality resembles that observed for the test size.

6 Discussion and Conclusions

Test statistics for correlation between two or more vectors are presented when the dimensions of the vectors, possibly unequal, may exceed the number of vectors. The one-sample case is further extended to two or more independent samples coming from populations assumed to have a common covariance matrix. The accuracy of the tests is shown through simulations with different parameters. Among the potential advantages of the tests are their simple construction, particularly for the multi-sample case where most computations follow conveniently from the one-sample results, and their wide practical applicability under fairly general conditions for a large class of multivariate models including the multivariate normal.

A particularly distinguishing feature of the tests is that they are composed of computationally very efficient estimators defined as simple functions of the empirical covariance matrix. From an applications perspective, it may be mentioned that the tests are constructed using the RV coefficient, so that they can only be used to assess the absence of linear dependence (zero correlation) between vectors. This distinguishes them from measures such as distance correlation or kernel methods, which can also capture nonlinear dependence; see also [5] for more details.