Testing Hypotheses about Correlation Matrices in General MANOVA Designs

Correlation matrices are an essential tool for investigating the dependency structures of random vectors and for comparing them. We introduce an approach for testing a variety of null hypotheses that can be formulated in terms of the correlation matrix. Examples cover MANOVA-type hypotheses of equal correlation matrices as well as tests for special correlation structures such as sphericity. Apart from existing fourth moments, our approach requires no further assumptions, allowing applications in various settings. To improve the small-sample performance, a bootstrap technique is proposed and theoretically justified. Based on this, we also present a procedure to simultaneously test the hypotheses of equal correlation and equal covariance matrices. The performance of all new test statistics is compared with existing procedures through extensive simulations.


Motivation and Introduction
Covariance matrices contain a multitude of information about a random vector. Therefore, they have been the topic of manifold investigations. For testing, an important hypothesis is the equality of covariance matrices from different groups. This was investigated, e.g., in Bartlett and Rajalakshman (1953) as well as Boos and Brownie (2004). Moreover, Gupta and Xu (2006) proposed tests for a given covariance matrix. Extending both, Sattler, Bathke, and Pauly (2022) proposed a unifying framework that allows for investigations of various hypotheses about the covariance. Additional examples cover, e.g., testing equality of traces or comparing diagonal elements of the covariance matrix. However, covariance matrices are not scale-invariant. This entails some disadvantages when using them to analyse a random vector's dependency structure. For example, a simple change of a measuring unit can completely change the matrix. For this reason alone, it is more useful to consider the correlation matrix instead when inferring dependency structures. This already starts in the bivariate case, i.e.
the investigation of correlations. For ordinal data, rank-based correlation measures are most common. For example, Perreault, Nešlehová, and Duchesne (2022) provide an approach for testing general hypotheses, while Nowak and Konietschke (2021) focused on simultaneous confidence intervals and multiple contrast tests for Kendall's τ. Spearman's rank-correlation coefficient ρ was investigated in Gaißer and Schmid (2010) for the hypothesis that the correlation matrix is an equicorrelation matrix. For metric data, the most common measure of correlation is the Pearson correlation coefficient. Here, tests and confidence intervals for investigating or comparing correlation coefficients have been discussed for the case of one, two or multiple groups, see Fisher (1921), Efron (1988), Sakaori (2002), Gupta and Xu (2006), Tian and Wilding (2008), Omelka and Pauly (2012), Welz, Doebler, and Pauly (2022) and the references cited therein. For larger dimensions, equality of correlation matrices was investigated, for example, in Jennrich (1970). Moreover, different hypotheses regarding the structure of correlation matrices were treated in Joereskog (1978), Steiger (1980) and Wu, Weng, Wang, Wang, and Liu (2018). However, the above approaches usually require strong prerequisites on the distribution (such as multivariate normality or particular properties of the moments), the components or the setting (such as bivariate observations or special structures). Moreover, most can be used only for a few specific hypotheses. Thus, to obtain an approach with fewer assumptions, which is at the same time applicable to a multitude of hypotheses, we expand the approach of Sattler et al.
(2022) to the treatment of correlation matrices. In the following section, the statistical model will be introduced together with examples of different null hypotheses that can be investigated using the proposed approach. Afterwards, the asymptotic distributions of the proposed test statistics are derived (Section 3). In Sections 4 and 5, a resampling strategy and a Taylor-based Monte-Carlo approach are used to generate critical values and to improve our tests' small-sample behaviour. A combined testing procedure which simultaneously checks the hypotheses of equal correlation matrices and equal covariance matrices is presented in Section 6. The simulation results regarding type-I-error control and power are discussed in Section 7, while an illustrative data analysis of EEG data is conducted in Section 8. All proofs are deferred to a technical supplement, where further simulations can also be found.

Statistical Model and Hypotheses
To allow for broad applicability, we use a general semiparametric model given by independent d-dimensional random vectors X_ik = μ_i + ϵ_ik, where the index i = 1, . . ., a refers to the treatment group and k = 1, . . ., n_i to the individual, on which d-variate observations are measured. Since the correlation of a scalar is not informative, we assume d ≥ 2. This model as well as all subsequent results can in principle be extended to different dimensions across groups, as done in Friedrich, Brunner, and Pauly (2017). However, as the notation would become cumbersome, we refrain from doing so to keep the presentation readable. The residuals ϵ_i1, . . ., ϵ_in_i are assumed to be centred, E(ϵ_i1) = 0_d, and i.i.d. within each group, with finite fourth moment E(‖ϵ_i1‖⁴) < ∞, while across groups they are only assumed to be independent. Thus, different distributions per group are possible. For our asymptotic considerations, we additionally assume
(A1) n_i/N → κ_i ∈ (0, 1], i = 1, . . ., a, for min(n_1, . . ., n_a) → ∞, with N = Σ_{i=1}^a n_i,
to preclude that a single group dominates the setting. In the following we use →P to denote convergence in probability and →D for convergence in distribution as sample sizes increase. It follows from the context whether this means N → ∞ or n_i → ∞, respectively. Moreover, to define correlation matrices we additionally assume that the covariance matrices Cov(ϵ_i1) = V_i = (V_ijk)_{j,k≤d} have positive diagonal elements V_i11, . . ., V_idd > 0.
Throughout, the so-called half-vectorization operator vech is used for the covariance matrices; it stacks the upper triangular elements of a d×d matrix into a p = d(d+1)/2 dimensional vector. For a correlation matrix, however, the diagonal elements are always one and therefore contain no information, so this is not the best choice here. Hence, a new vectorization operator vech⁻ is defined, which we will call the upper-half-vectorization. With R_i as the correlation matrix of the i-th group, this operator allows us to define r_i = vech⁻(R_i) = (r_i12, . . ., r_i1d, r_i23, . . ., r_i2d, . . ., r_i(d−1)d)^⊤, i = 1, . . ., a, containing just the upper triangular entries of R_i which are not on the diagonal. The resulting vector has dimension p_u = d(d − 1)/2, which is substantially smaller than p. We now formulate hypotheses in terms of the pooled correlation vector r = (r_1^⊤, . . ., r_a^⊤)^⊤ as
H_0^r: Cr = ζ,    (1)
with a proper hypothesis matrix C ∈ R^{m×ap_u} and a vector ζ ∈ R^m. Hereby, we allow m to be much smaller than ap_u, which is often useful (e.g. for computational reasons) for hypothesis matrices without full rank. As explained in Sattler et al. (2022) for the case of covariance matrices, C does not have to be a projection matrix, as ζ is allowed to differ from the zero vector; see, for example, hypothesis (c) below.
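To make the two vectorization operators concrete, here is a minimal sketch in Python (the function names are ours; the row-wise ordering of the upper triangle follows the definition of r_i above):

```python
import numpy as np

def vech(A):
    """Half-vectorization: stack the upper-triangular entries
    (including the diagonal) of a d x d matrix row by row."""
    d = A.shape[0]
    return np.concatenate([A[j, j:] for j in range(d)])

def vech_minus(A):
    """Upper-half-vectorization vech^-: stack only the entries strictly
    above the diagonal, row by row, matching the ordering of r_i."""
    d = A.shape[0]
    return np.concatenate([A[j, j + 1:] for j in range(d)])

R = np.array([[1.0, 0.3, 0.5],
              [0.3, 1.0, 0.2],
              [0.5, 0.2, 1.0]])
d = R.shape[0]
assert vech(R).shape == (d * (d + 1) // 2,)        # p = 6
assert vech_minus(R).shape == (d * (d - 1) // 2,)  # p_u = 3
print(vech_minus(R))  # [0.3 0.5 0.2]
```

For d = 3 this gives p = 6 and p_u = 3, illustrating how much shorter the upper-half-vectorization is.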
Hypotheses which are part of the setting (1) are, among others:
(a) Testing homogeneity of correlation matrices: H_0^r: (P_a ⊗ I_{p_u}) r = 0, with P_a := I_a − 1_a 1_a^⊤/a and ⊗ denoting the Kronecker product. This hypothesis was, for example, investigated in Jennrich (1970). For d = 2, it includes the problem of testing the null hypothesis H_0^r: ρ_1 = · · · = ρ_a of equal correlation coefficients ρ_i = Corr(X_i11, X_i12), i = 1, . . ., a, within (1). This again contains testing equality of correlations between two groups, see, e.g., Sakaori (2002), Gupta and Xu (2006) or Omelka and Pauly (2012).
(b) Testing for a diagonal correlation matrix: H_0^r: R_1 = I_d resp. r_1 = 0_{p_u}. A test procedure for zero correlations was introduced by Bartlett (1951) for the one-group setting. More hypotheses on the structure of the correlation matrix can be found, for example, in Joereskog (1978), Steiger (1980) and Wu et al. (2018).
(c) Testing for a given correlation matrix: Let R be a given correlation matrix, like, e.g., an autoregressive or compound symmetry matrix. For a = 1, we then also cover testing the null hypothesis H_0^r: R_1 = R resp. H_0^r: r_1 = vech⁻(R). For d = 2, this also contains the issue of testing the null hypothesis H_0^r: ρ_1 = 0 of uncorrelated random variables with ρ_1 = Corr(X_111, X_112), see, e.g., Aitkin, Nelson, and Reinfurt (1968).
(d) Testing for equal correlations: For a = 1, we are interested in whether the correlations between all components are the same, i.e.
H_0^r: r_11 = r_12 = . . . = r_1p_u resp. H_0^r: P_{p_u} r_1 = 0_{p_u}, with the centring matrix P_{p_u} as in (a). This kind of hypothesis is connected with a compound symmetry matrix and was, e.g., investigated in Wilks (1946) and Box (1950) for special settings.

Asymptotics regarding the vectorized correlation
To infer null hypotheses of the kind H_0^r: Cr = ζ, it is necessary to first investigate the asymptotic distribution of C r̂, where r̂ is the pooled vector of the upper-half-vectorized empirical correlation matrices. Thereto, Theorem 3.1 from Sattler et al. (2022) is briefly repeated first. Therein, the asymptotic normality
√N (v̂ − v) →D N_{ap}(0_{ap}, Σ)    (2)
of the pooled vectorized empirical covariance matrices was shown. Here, the covariance matrix is defined as Σ = ⊕_{i=1}^a (1/κ_i) Σ_i, where ⊕ denotes the direct sum and Σ_i = Cov(vech(ϵ_i1 ϵ_i1^⊤)) for i = 1, . . ., a. First, some additional matrices have to be defined to use this result for correlation matrices. Let e_k,p = (δ_kℓ)_{ℓ=1}^p denote the p-dimensional vector which contains a one in the k-th component and zeros elsewhere. Moreover, we need a d-dimensional auxiliary vector a = (a_1, . . ., a_d), which contains the positions of the components in the half-vectorized matrix that are diagonal elements of the original matrix. In accordance with this, we define the p_u-dimensional vector b, which contains the numbers from one to p in ascending order without the elements of a. This vector b contains the positions of the components in the half-vectorized matrix that are non-diagonal elements. With these vectors we are able to define the d×d matrix H = 1_d a^⊤ and the vectors h_1 = vech⁻(H) and h_2 = vech⁻(H^⊤). Finally, these quantities allow us to define the matrices L_u^p, M_1 and M_2. In particular, a connection between the vech operator and the vech⁻ operator can be formulated, since the matrix L_u^p fulfils L_u^p vech(A) = vech⁻(A) for each matrix A ∈ R^{d×d}. This matrix is comparable to the elimination matrix from Magnus and Neudecker (1980), adapted to this special kind of half-vectorization. With all these matrices, a connection can be found between √n_i (v̂_i − v_i) and √n_i (r̂_i − r_i), which allows getting the requested result by applying (2). The approach connecting the vectorized correlation and the vectorized covariance is based on Browne and Shapiro (1986) and Nel (1985), adapted to the current more general setting.
Theorem 3.1: With the previously defined matrices L_u^p, M_1 and M_2, it holds that √n_i (r̂_i − r_i) →D N_{p_u}(0_{p_u}, Υ_i) for i = 1, . . ., a, where the covariance matrices Υ_i depend on Σ_i through these matrices.
Remark 3.1: The asymptotic normality of √n_i (r̂_i − r_i) has also been shown for similar statistics in the past. But through the existing relation between √n_i (v̂_i − v_i) and √n_i (r̂_i − r_i) it is, for example, possible to express Υ as a function of Σ, which allows constructing an estimator Υ̂ from Σ̂. In Sections 5 and 6 we show that this relation provides further opportunities.
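The selection matrix L_u^p can be realized by keeping exactly those rows of the p×p identity matrix whose indices appear in the vector b, i.e. the non-diagonal positions. A small illustration of the defining property L_u^p vech(A) = vech⁻(A), using our own helper names and the row-wise upper-triangular convention from Section 2:

```python
import numpy as np

def diag_positions(d):
    """0-based positions of the diagonal elements within the
    row-wise upper-triangular half-vectorization (the vector a)."""
    pos, idx = [], 0
    for j in range(d):
        pos.append(idx)   # entry (j, j) comes first within row j
        idx += d - j      # row j contributes d - j entries
    return pos

def L_u(d):
    """Selection matrix with L_u(d) @ vech(A) = vech_minus(A):
    keep the rows of I_p indexed by b (all non-diagonal positions)."""
    p = d * (d + 1) // 2
    a = set(diag_positions(d))
    b = [k for k in range(p) if k not in a]
    return np.eye(p)[b, :]

def vech(A):
    d = A.shape[0]
    return np.concatenate([A[j, j:] for j in range(d)])

def vech_minus(A):
    d = A.shape[0]
    return np.concatenate([A[j, j + 1:] for j in range(d)])

d = 4
A = np.arange(16, dtype=float).reshape(d, d)
assert np.allclose(L_u(d) @ vech(A), vech_minus(A))
```

This mirrors the role of the elimination matrix of Magnus and Neudecker (1980), here adapted to the upper-half-vectorization.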
To use this result, we have to estimate the matrices Υ_1, . . ., Υ_a, which is done with plug-in estimators Υ̂_i based on Σ̂_i, and Υ̂ := ⊕_{i=1}^a (N/n_i) Υ̂_i. To this end, Σ̂_i has to be a consistent estimator for Σ_i. This is fulfilled, e.g., for the estimator from Sattler et al. (2022), given through the empirical covariance matrix of the vectors vech((X_ik − X̄_i)(X_ik − X̄_i)^⊤), k = 1, . . ., n_i. As continuous functions of consistent estimators, the estimators Υ̂_i are consistent. With this asymptotic result, test statistics based on quadratic forms can be formulated through Q_r := N (C r̂ − ζ)^⊤ E_N (C r̂ − ζ) with a suitable symmetric matrix E_N. By Theorem 3.2, if E_N →P E, Q_r has asymptotically a "weighted χ²-distribution" under the null hypothesis, i.e. for N → ∞ it holds that Q_r →D Σ_{ℓ=1}^{ap_u} λ_ℓ B_ℓ with i.i.d. B_ℓ ∼ χ²_1, where λ_ℓ, ℓ = 1, . . ., ap_u, are the eigenvalues of E · C Υ C^⊤. Thus, quadratic-form-type test statistics can be formulated, similar to Sattler et al. (2022), also for the vectorized correlation matrix. Here we often consider symmetric matrices E_N which can be written as a function of C and Υ̂. This, e.g., leads to an asymptotically valid test φ_WTS_r = 1{WTS_r ∉ (−∞, χ²_{rank(C);1−α}]} in case of Υ > 0. This condition is, however, hard to verify due to the complex structure of Υ; we therefore do not treat the WTS in the following. In contrast, the ATS is not an asymptotic pivot, but the simulation results from Sattler et al. (2022) suggest using its Monte-Carlo version. Hereto, the corresponding matrices from Theorem 3.2 are estimated, which also yields estimated eigenvalues λ̂_ℓ. By generating independent χ²_1-distributed random variables B_1, . . ., B_{ap_u}, the weighted sum can be calculated, and by repeating this frequently (e.g. W = 10,000 times), the required quantiles of the weighted χ²_1-distribution can be estimated; the resulting (1 − α)-quantile is denoted by q^MC_{1−α}. For example, this gives us the test φ_ATS_r := 1{ATS_r ∉ (−∞, q^MC_{1−α}]}, and a similar approach can be used for all quadratic forms fulfilling Theorem 3.2.
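The Monte-Carlo step for the critical value only needs the estimated eigenvalues λ̂_ℓ; a minimal sketch (the function name is ours, with a fixed seed for reproducibility):

```python
import numpy as np

rng = np.random.default_rng(0)

def weighted_chi2_quantile(lambdas, alpha=0.05, W=10_000):
    """Monte-Carlo (1 - alpha)-quantile of sum_l lambda_l * B_l with
    B_l i.i.d. chi^2_1, as used for the ATS critical value q^MC."""
    lambdas = np.asarray(lambdas, dtype=float)
    B = rng.chisquare(df=1, size=(W, lambdas.size))  # W x (a*p_u) draws
    samples = B @ lambdas                            # weighted sums
    return np.quantile(samples, 1 - alpha)

# Sanity check: with a single eigenvalue lambda = 1 the statistic is
# chi^2_1-distributed, whose 95% quantile is about 3.84.
q = weighted_chi2_quantile([1.0])
assert 3.5 < q < 4.2
```

The same routine serves any quadratic form covered by Theorem 3.2, since only the eigenvalue vector changes.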

Resampling Procedures
A resampling procedure may be useful, on the one hand, for a better small-sample approximation and, on the other hand, for quadratic forms with critical values that are difficult to calculate. Since the simulations in Sattler et al. (2022) showed clear advantages of the parametric bootstrap, we focus on this approach, but we also present a wild bootstrap approach in the supplement. For every group we generate a parametric bootstrap sample X*_i1, . . ., X*_in_i i.i.d. ∼ N_d(0_d, V̂_i), i = 1, . . ., a, based on the empirical covariance matrices, and compute the bootstrap counterparts Υ̂*_i and Υ̂* of the above estimators; given the data, Υ̂* →P Υ. Thus, the unknown covariance matrices can be estimated through these bootstrap estimators.
In consequence of Theorem 4.1, it is reasonable to calculate the bootstrap versions Q*_r of the previous quadratic forms. Their conditional convergence can be deduced with the continuous mapping theorem (see, e.g., van der Vaart and Wellner (1996)). This requires the function defining the quadratic form to be continuous in its second component, which, e.g., holds for the trace operator used in the ATS and which we assume in what follows. The bootstrap versions then always approximate the null distribution of Q_r, as established below.
Corollary 4.1: For each parameter vector r ∈ R^{ap_u} and each r_0 ∈ R^{ap_u} with C r_0 = ζ, under Assumption (A1) we have
sup_{x∈R} | P_r( Q*_r ≤ x | X ) − P_{r_0}( Q_r ≤ x ) | →P 0,
where P_r denotes the (un)conditional distribution of the test statistic when r is the true underlying vector.

This motivates the definition of bootstrap tests as φ† := 1{Q_r ∉ (−∞, q†_{1−α}]}, where q†_{1−α} denotes the conditional (1 − α)-quantile of the bootstrap statistic Q*_r given the data.
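As an illustration of such a bootstrap test, the following sketch implements the procedure for one group and the hypothesis H_0^r: r_1 = ζ, using the simple unweighted quadratic form Q = n‖r̂ − ζ‖². This is one instance of the general class (C = I, E_N = I), not the ATS itself, and all names are ours; note that the bootstrap statistic is centred at r̂ rather than ζ, mirroring the fact that the bootstrap mimics the null distribution for any true r (Corollary 4.1):

```python
import numpy as np

rng = np.random.default_rng(1)

def vech_minus(A):
    d = A.shape[0]
    return np.concatenate([A[j, j + 1:] for j in range(d)])

def r_hat(X):
    """Upper-half-vectorized empirical correlation matrix of an n x d sample."""
    return vech_minus(np.corrcoef(X, rowvar=False))

def pb_test(X, zeta, alpha=0.05, B=500):
    """Parametric-bootstrap test for H0: r = zeta in one group, based on
    Q = n * ||r_hat - zeta||^2. Bootstrap samples are drawn from
    N_d(0, V_hat), with V_hat the empirical covariance matrix."""
    n, d = X.shape
    V_hat = np.cov(X, rowvar=False)
    Q = n * np.sum((r_hat(X) - zeta) ** 2)
    r_center = r_hat(X)                   # bootstrap centring
    Q_star = np.empty(B)
    for b in range(B):
        Xb = rng.multivariate_normal(np.zeros(d), V_hat, size=n)
        Q_star[b] = n * np.sum((r_hat(Xb) - r_center) ** 2)
    return Q > np.quantile(Q_star, 1 - alpha)

# Under H0 (independent components) the test should rarely reject.
X = rng.standard_normal((200, 3))
reject = pb_test(X, zeta=np.zeros(3))
```

For strongly correlated data the statistic Q is far in the tail of the bootstrap distribution and the test rejects.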
The above results ensure that these tests are of asymptotic level α, and we will use this for the ATS. In the analysis of correlation matrices, Fisher z-transformed vectors are often used instead of the original vectorized correlation matrices. Although this approach is rooted in the distribution of the Fisher z-transformed correlation for normally distributed observations, it is also used for tests without this distributional restriction, see, e.g., Steiger (1980). In principle, we can also consider our tests together with the transformed vector. This approach assumes that all components of ζ differ from one, which is always possible to ensure. We can define tests based on the transformation for each of our quadratic forms, including the tests based on bootstrap or Monte-Carlo simulations. Our simulations showed that the tests based on this transformation behave more liberally than the original ones, which was also noted in Omelka and Pauly (2012) for the case of bivariate permutation tests. Since all of our test statistics were already somewhat liberal, it is not useful to consider these versions further; some details on this are given in the supplement. Instead, we propose an additional Taylor approximation of higher order as described below.
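For reference, the componentwise Fisher z-transformation discussed above is simply z(r) = artanh(r) = 0.5·log((1+r)/(1−r)); a minimal sketch:

```python
import numpy as np

def fisher_z(r):
    """Componentwise Fisher z-transformation z = artanh(r)
    of a vectorized correlation matrix."""
    return np.arctanh(np.asarray(r, dtype=float))

r = np.array([0.0, 0.5, -0.5])
z = fisher_z(r)
assert np.allclose(np.tanh(z), r)  # invertible on (-1, 1)
```

A hypothesis Cr = ζ is then tested in terms of the transformed vector, which is why the components of ζ must differ from one in absolute value.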

Higher order of Taylor approximation
The main result of Section 3 is the connection between the vectorized empirical covariance matrix and the vectorized empirical correlation matrix. Since this connection is based on a Taylor approximation, the question arises whether a higher order of approximation could be useful here. In this way we obtain a second-order expansion of √n_i (r̂_i − r_i) with Y_i ∼ N_p(0_p, Σ_i) and a function f_{v_i,R_i}: R^p → R^{p_u} capturing the second-order term. The derivation of this expansion, together with the rather complex concrete form of the function f_{v_i,R_i}, can be found in the supplementary material. Although the term f_{v_i,R_i}(Y_i)/√n_i is asymptotically negligible, it can affect the performance of the corresponding test for small sample sizes. For this reason, we propose an additional Monte-Carlo-based approach using this Taylor approximation. Thereto, for each group we generate Y_i ∼ N_p(0_p, Σ̂_i) and transform it to Y^Tay_i, including the correction term f_{v̂_i,R̂_i}(Y_i)/√n_i. This leads to the pooled vector Y^Tay, from which we obtain a version Q^Tay_r of the quadratic form at hand. If we repeat this frequently enough, an asymptotically correct level α test can be constructed by comparing Q_r with the empirical (1 − α)-quantile of Q^Tay_r. Since Υ̂ only needs to be calculated once, this approach is clearly less time-consuming than the corresponding bootstrap approaches.
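The Taylor-based Monte-Carlo step can be sketched generically. The concrete linear term and the second-order function f_{v_i,R_i} are given in the supplement, so both are passed in here as placeholder callables (all names are ours):

```python
import numpy as np

rng = np.random.default_rng(2)

def taylor_mc_quantile(Sigma_hats, n_sizes, first_order, second_order,
                       quad_form, alpha=0.05, B=10_000):
    """Monte-Carlo (1 - alpha)-quantile of the Taylor-based statistic:
    per group draw Y_i ~ N_p(0, Sigma_hat_i), form
    Y_i^Tay = first_order(Y_i) + second_order(Y_i) / sqrt(n_i),
    pool the groups and evaluate the quadratic form. `first_order` and
    `second_order` stand in for the linear term and the function f
    from the supplement."""
    qs = np.empty(B)
    for b in range(B):
        parts = []
        for Sigma_i, n_i in zip(Sigma_hats, n_sizes):
            Y = rng.multivariate_normal(np.zeros(Sigma_i.shape[0]), Sigma_i)
            parts.append(first_order(Y) + second_order(Y) / np.sqrt(n_i))
        qs[b] = quad_form(np.concatenate(parts))
    return np.quantile(qs, 1 - alpha)

# Toy check with identity dynamics: one group, p = 2, no second-order
# term, quadratic form ||y||^2 -> the statistic is chi^2_2-distributed
# (95% quantile about 5.99).
q = taylor_mc_quantile([np.eye(2)], [50],
                       first_order=lambda y: y,
                       second_order=lambda y: np.zeros_like(y),
                       quad_form=lambda y: np.sum(y ** 2),
                       B=20_000)
assert 5.5 < q < 6.5
```

Because only Υ̂ and the two callables are needed once, the loop avoids re-estimating anything per replication, which is what makes this cheaper than the bootstrap.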
It would also be possible to use a Taylor approximation of even higher order. Since the transformation becomes more complex and our simulations suggest that this only affects very small sample sizes, we refrain from doing so here.

Combined tests for covariance and correlations
The equality of two covariance matrices and the equality of two correlation matrices are interesting null hypotheses with a strong connection between them. As suggested by the Associate Editor, we here exemplify how the methodology extends to simultaneously comparing covariances and correlations, using the results from Sattler et al. (2022). Hereto we use a so-called multiple contrast test (see, e.g., Konietschke, Bösiger, Brunner, and Hothorn (2013) for more details) and define a p-dimensional vector T whose first d components compare the empirical variances of the two groups and whose remaining p_u components compare their empirical correlations. The concrete form of the covariance matrix Γ of T and its derivation can be found in the appendix. In case of Γ_11, . . ., Γ_pp > 0, we can consider T̃ := diag(Γ_11, . . ., Γ_pp)^{−1/2} T, which is, under the null hypothesis of equal covariance matrices, asymptotically normally distributed with expectation vector 0_p and covariance/correlation matrix Γ̃. Then an asymptotically correct level α test is given through rejecting whenever max_ℓ |T̃_ℓ| exceeds the two-sided equicoordinate (1 − α)-quantile z_{1−α,2,Γ̂} of the corresponding multivariate normal distribution, which can be computed following Bretz, Genz, and Hothorn (2001). This allows us to simultaneously consider the hypotheses of equal covariance and equal correlation matrices. If max(|T̃_{d+1}|, . . ., |T̃_p|) ≥ z_{1−α,2,Γ̂}, then both groups have different dependence structures and therefore neither equal covariance matrices nor equal correlation matrices. If instead only max(|T̃_1|, . . ., |T̃_d|) ≥ z_{1−α,2,Γ̂}, both groups have the same dependence structure but different variances of the components, which implies equal correlation matrices but different covariance matrices. Moreover, from which components of T̃ exceed z_{1−α,2,Γ̂}, it becomes clear where the differences might come from.
Unfortunately, the condition Γ_11, . . ., Γ_pp > 0 is difficult to verify, so again a bootstrap approach would be a solution, as was done, for example, in Friedrich and Pauly (2017) or Umlauft, Placzek, Konietschke, and Pauly (2019). Since it is less complicated and at the same time exhibits a better small-sample approximation in Section 7, we instead introduce another Taylor-based Monte-Carlo approach. To this end, we use a representation of T analogous to Section 5, based on a matrix whose concrete form is given in the appendix. Based on this, we generate Y_i ∼ N_p(0_p, Σ̂_i) as in Section 5 and transform it accordingly.
With these transformed vectors we calculate T^{1,Tay}, which has the same asymptotic distribution as T, and repeat this B times to get T^{1,Tay}, . . ., T^{B,Tay}. Then, for ℓ = 1, . . ., p, we denote by q^{Tay}_{ℓ,β} the empirical (1 − β)-quantile of T^{1,Tay}_ℓ, . . ., T^{B,Tay}_ℓ. To control the family-wise type-I-error rate at α, the appropriate β is chosen such that the proportion of Monte-Carlo replications with max_{ℓ=1,...,p} |T^{b,Tay}_ℓ| / q^{Tay}_{ℓ,β} ≥ 1 equals α. Then we get an asymptotically correct level α test for the null hypothesis of equal covariance matrices by rejecting the null hypothesis if and only if max_{ℓ=1,...,p} |T_ℓ| / q^{Tay}_{ℓ,β} ≥ 1, where we set 0/0 := 1. Each component of the vector-valued test statistic T is treated in the same way, since the same β < α is used. This procedure allows the same conclusions on the reason for a rejection as the above approach based on the equicoordinate quantile. Of course, this combined test can be generalized to compare more than two groups by using an appropriate Tukey-type contrast matrix (see Konietschke et al. (2013)).
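The calibration of β from the Monte-Carlo replicates can be sketched as a bisection over candidate values, assuming the replicates are stored row-wise in a B×p array (all names are ours):

```python
import numpy as np

rng = np.random.default_rng(3)

def calibrated_beta(T_mc, alpha=0.05):
    """Find beta <= alpha such that rejecting when
    max_l |T_l| / q_{l,beta} >= 1 has family-wise error rate about alpha,
    where q_{l,beta} is the empirical (1 - beta)-quantile of |T^{b,Tay}_l|.
    T_mc has shape (B, p); simple bisection sketch."""
    A = np.abs(T_mc)

    def fwer(beta):
        q = np.quantile(A, 1 - beta, axis=0)    # per-component quantiles
        return np.mean(np.any(A >= q, axis=1))  # joint rejection rate

    lo, hi = 0.0, alpha                         # fwer is increasing in beta
    for _ in range(40):
        mid = (lo + hi) / 2
        if fwer(mid) <= alpha:
            lo = mid
        else:
            hi = mid
    return lo

# For independent components the calibrated beta is close to the
# Sidak-type value 1 - (1 - alpha)^(1/p) (about 0.0102 for p = 5).
T_mc = rng.standard_normal((20_000, 5))
beta = calibrated_beta(T_mc)
assert 0 < beta < 0.05
```

The per-component quantiles q^{Tay}_{ℓ,β} from the calibrated β are then reused unchanged for the actual rejection rule.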

Simulations
We analyze the type-I-error rate and the power for different hypotheses to investigate the performance. Here, we focus on hypotheses with an implemented algorithm to have appropriate competitors. The R-package psych by Revelle (2019) includes several such tests, like φ_Jennrich from Jennrich (1970) and φ_SteigerFz from Steiger (1980) for equality of correlation matrices. Testing whether the correlation matrix is equal to the identity matrix can be investigated with φ_Bartlett from Bartlett (1951) and again with φ_SteigerFz resp. φ_Steiger. Hereby, φ_SteigerFz is the same test statistic as φ_Steiger but uses a Fisher z-transformation on the vectorized correlation matrices. Therefore we consider the following hypotheses with α = 0.05:
A_r: Homogeneity of correlation matrices: H_0^r: R_1 = R_2,
B_r: Diagonal structure of the covariance matrix: H_0^r: R_1 = I_d resp. r_1 = 0_{p_u}.
Further hypotheses are investigated in the supplementary material, together with other settings and dimensions. The hypothesis matrices are chosen as the projection matrices C(A_r) = P_2 ⊗ I_{p_u} and C(B_r) = I_{p_u}, while ζ is in both cases a zero vector of appropriate dimension. We use 1,000 bootstrap steps for our parametric bootstrap, 10,000 simulation steps for the Monte-Carlo approach and 10,000 runs for all tests to get reliable results. For φ_Steiger and φ_SteigerFz from Revelle (2019), the actual test statistic is multiplied with the factor (N − 3)/N. This approach is based on a specific small-sample result for the Fisher z-transformation of the correlation vector of normally distributed random vectors. Asymptotically this factor has no influence, but it is also used for φ_Steiger, where no Fisher z-transformation is applied. To get a better impression of the impact of such a multiplication, we also include our ATS with parametric bootstrap using this multiplication and denote it with an appended m for multiplication. This also simplifies the comparison of the tests under equal
conditions. Lastly, we simulated the Monte-Carlo-based ATS using the Fisher z-transformation. We denote it by ATSFz, while an additional version ATSFz-m is formed by multiplication with the factor (N − 3)/N. To have a setting comparable to Sattler et al. (2022), we used d = 5 and therefore p_u = 10, while for one group we have n ∈ {25, 50, 125, 250} and for two groups we have n_1 = 0.6 · N and n_2 = 0.4 · N with N ∈ {50, 100, 250, 500}.
We considered 5-dimensional observations generated independently according to the model

and error terms based on
• a standardized centred t-distribution with 9 degrees of freedom.
• a standardized centred skewed normal distribution with location parameter ξ = 0, scale parameter ω = 1 and α = 4. The density of a skewed normal distribution is given through f(x) = (2/ω) φ((x − ξ)/ω) Φ(α(x − ξ)/ω), where φ denotes the density of a standard normal distribution and Φ the according distribution function.
For A_r we use (V_1)_ij = 1 − |i − j|/(2d) for the first group, while for the second group we multiply this covariance matrix from both sides with diag(1, 1.2, . . ., 1.8). Thus we have a setting where the covariance matrices are different but the correlation matrices are equal. To investigate B_r we use the matrix V = diag(1, 1.2, . . ., 1.8).
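The two covariance matrices used for A_r can be constructed as follows; we read the multiplication with diag(1, 1.2, . . ., 1.8) as the two-sided scaling D V_1 D, which is exactly the operation that changes the covariances while leaving the correlation matrix unchanged:

```python
import numpy as np

d = 5
# Toeplitz-type covariance for group 1: (V1)_ij = 1 - |i - j| / (2d)
i, j = np.indices((d, d))
V1 = 1 - np.abs(i - j) / (2 * d)

# Group 2: two-sided rescaling with D = diag(1, 1.2, ..., 1.8), which
# changes the variances but not the correlation matrix.
D = np.diag(1 + 0.2 * np.arange(d))
V2 = D @ V1 @ D

def corr(V):
    s = np.sqrt(np.diag(V))
    return V / np.outer(s, s)

assert not np.allclose(V1, V2)          # different covariance matrices ...
assert np.allclose(corr(V1), corr(V2))  # ... but equal correlation matrices
```

This confirms that the A_r setting lies under the null hypothesis of equal correlation matrices while the covariance matrices differ.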

Type-I-error
The results for hypothesis A_r can be seen in Table 1 for the Toeplitz covariance matrix. Here, the values in the 95% binomial interval [0.0458, 0.0543] are printed in bold. It is interesting to note that the type-I-error rates of φ_SteigerFz and φ_Jennrich deviate more and more from the 5% level with increasing sample sizes. Therefore, these tests should not be used, at least in our setting. In contrast, all of our tests are somewhat liberal but show a substantially better type-I-error rate. Moreover, these tests fulfil Bradley's liberal criterion (Bradley (1978)) for N larger than 50. This criterion is often consulted by applied statisticians, for example in quantitative psychology. It states that a procedure is 'acceptable' if the type-I-error rate lies between 0.5α and 1.5α.
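The 95% binomial interval used here follows from the normal approximation for 10,000 simulation runs at α = 0.05; a quick check (matching the reported interval up to rounding):

```python
import math

# 95% binomial band around alpha = 0.05 for n_sim = 10,000 runs:
# alpha +/- z_{0.975} * sqrt(alpha * (1 - alpha) / n_sim)
alpha, n_sim, z = 0.05, 10_000, 1.959964
half_width = z * math.sqrt(alpha * (1 - alpha) / n_sim)
lower, upper = alpha - half_width, alpha + half_width
print(round(lower, 4), round(upper, 4))  # 0.0457 0.0543

# Bradley's liberal criterion: acceptable if the rate lies in [0.5a, 1.5a].
bradley = (0.5 * alpha, 1.5 * alpha)     # (0.025, 0.075) for alpha = 0.05
```

Estimated rates outside the binomial band differ significantly from the nominal 5% level, while Bradley's band is the (much wider) practical-acceptability criterion.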
Similar to the results for covariance matrices, the bootstrap version performs slightly better than the Monte-Carlo-based tests, while the error rates approach the nominal level for larger sample sizes. In each setting, the Monte-Carlo-based ATS formulated for Fisher z-transformed vectors performs better than the classical one, especially for smaller sample sizes.
It can be seen that a correction factor like (N − 3)/N, which was used for ATS-Par-m, clearly improves the small-sample performance. In contrast to the results from Sattler et al. (2022), the small-sample approximation seems somewhat worse when testing correlation matrices, which makes such a correction factor useful. The best performance for smaller sample sizes is achieved by our Taylor-based Monte-Carlo approach, whose error rate lies within the 95% interval for N > 50. Therefore, for this setting, we recommend ATS-Tay, while for larger sample sizes, ATS-Par-m and ATSFz-m also show good results.
For hypothesis B_r, the results are given in Table 2. Here, our Taylor-based approach has slightly conservative values, and therefore ATS-Par-m and ATSFz-m show the best results among our test statistics, while the results are overall slightly better than for hypothesis A_r. For example, the number of values in the 95% binomial interval [0.0458, 0.0543] is larger.
Table 2: Simulated type-I-error rates (α = 5%) in scenario B_r (H_0^r: r_1 = 0_{10}) for the ATS, Steiger's and Bartlett's tests. The observation vectors have dimension 5 and covariance matrix V = diag(1, 1.2, 1.4, 1.6, 1.8).
Nevertheless, φ_Bartlett is a test developed only for this special hypothesis and therefore has an excellent error rate across all distributions. But in addition to the type-I-error rate, other properties are also highly relevant.

Power
The ability to detect deviations from the null hypothesis is another important test criterion. To this aim, we also investigate the power of some of the tests mentioned above. We choose a quite simple kind of alternative suitable for our situation: as covariance matrix we consider V_1 + δ·J_d for δ ∈ [0, 3.5] in hypothesis A_r and for δ ∈ [0, 0.75] in hypothesis B_r. The reason for this considerable difference in the δ range is that for hypothesis B_r, the second summand changes the setting from uncorrelated to correlated, while for hypothesis A_r it just increases the correlations, which is clearly more challenging to detect. For computational reasons, we simulate only one sample size, N = 250 resp. n_1 = 125, and consider error terms based on the skewed normal distribution, while results for the gamma distribution can be found in the supplementary material. We simulate only the tests with good type-I-error results, which were, for A_r, φ_ATS†r and φ_ATSr-Tay, as well as φ_SteigerFz as competitor, despite its performance in Table 1.
Figure 1: The covariance matrix for hypothesis B_r is V = I_5 + δJ_5 and n_1 = 50.
Because of the similarity of the results from the parametric bootstrap and the more classical Monte-Carlo-based approach, we do not consider further test statistics here. For hypothesis A_r, Figure 1 shows that the Taylor approximation makes the test slightly less liberal and therefore reduces the power. This effect can be seen as a shift and does not influence the slope. In general, the power of both approaches is quite good; φ_SteigerFz has even higher power, which furthermore increases faster. But since this test becomes even more liberal for increasing sample sizes, this is not surprising and makes it not recommendable for this setting. Based on the results from Table 2, for hypothesis B_r we consider φ_ATS†r and φ_ATSr-Tay as well as φ_SteigerFz and φ_Bartlett, while the setting is otherwise the same. In Figure 1 it can be seen for hypothesis B_r that φ_ATSr-Tay has, for smaller δ, similar power as φ_Bartlett, which was specially developed for this hypothesis, while for larger deviations the power of the Monte-Carlo approach increases. Here, φ_SteigerFz has slightly less power than the parametric bootstrap, due to the slightly liberal behaviour of the bootstrap approach.
The type-I-error rates and the power for both hypotheses show that our newly developed tests are useful in many situations, although partially large sample sizes are necessary for good results. This is a known issue when testing hypotheses about correlation matrices, as mentioned, for example, in Steiger (1980). Therefore, the results of ATS-Tay for A_r at smaller sample sizes are even more convincing.
All in all, the results of this section, together with the additional results from the supplement, allow us to give some general recommendations. Since SteigerFz has a type-I-error rate of more than 9% in Table 1, which moreover grows with increasing sample sizes, it is not useful apart from single hypotheses like H_0^r: r_1 = 0_{10}. This hypothesis can be checked with Bartlett, which had good results but allows only this hypothesis. On the contrary, for all considered hypotheses, ATS-Par-m and ATS lead to good results for moderate to large sample sizes, while they are liberal for small sample sizes. Such liberality was, for example, also mentioned in Perreault et al. (2022) for tests based on Kendall's τ. As intended, the bootstrap improves the behaviour for smaller sample sizes, but not sufficiently. In case of small sample sizes, the Taylor-based approach is recommended, as it exhibited good small-sample performance in all settings; only for larger sample sizes is it outperformed by ATS-Par-m. Although tests regarding correlation matrices are challenging and known to require large sample sizes, this leads to useful tests, which also provide more flexibility and more possible applications than existing ones.

Illustrative Data Analysis
The considered EEG data contain measurements at different electrode positions (frontal, temporal, and central), resulting in p_u = 15. For the evaluation of our results, we should keep in mind that all sample sizes are rather small relative to this dimension. The considered hypotheses are: a) homogeneity of correlation matrices between different diagnoses, b) homogeneity of correlation matrices between different sexes, while we denote the corresponding hypotheses regarding the covariance matrix by H_0^v.
In Sattler et al. (2022), homogeneity of covariance matrices between different diagnoses as well as between different sexes was investigated. Here, we consider the more general hypothesis of equal correlation matrices between the diagnoses and the sexes. Thereby, it is interesting to compare the results for homogeneity of covariance matrices with those for homogeneity of correlation matrices. Since the hypothesis of equal correlation matrices is larger, we expect higher p-values for it, while each rejection of equal correlation matrices directly allows us to reject the corresponding equality of covariance matrices.
Table 4 displays, for both hypotheses, the p-values of the ATS with parametric bootstrap, while for equality of correlations we additionally use our test based on the Taylor-based Monte-Carlo approach. For all considered bootstrap tests, 10,000 bootstrap runs are done, as well as 10,000 Monte-Carlo steps. Interestingly, for two hypotheses, equality of correlation matrices is rejected by ATS-Par at level 5%, while we could not reject the smaller hypothesis of equal covariance matrices. But for both hypotheses, the sample sizes are rather small with N < 40. Our simulation results for d = 5 showed that the ATS with parametric bootstrap is too liberal for small sample sizes, which might be the reason why the larger hypotheses can be rejected while the smaller ones cannot. But also ATS-Tay, which had a better small-sample performance, rejected H_0^r once while H_0^v was not rejected. Moreover, it can be seen that the difference between some hypotheses is relatively small, as for the first three hypotheses, but it can also be quite large, as for the comparison of women with AD and with SCC. This shows that from a rejection of H_0^v, no conclusion on H_0^r can be drawn. Due to the small sample sizes relative to the dimension p_u = 15 of the vectorized correlation matrix, even the results of ATS-Tay are somewhat liberal. Nevertheless, the corresponding rejections suggest various directions for further investigation.

Conclusion & Outlook
In the present paper, a series of new test statistics was developed to check general null hypotheses formulated in terms of correlation matrices. The proposed method can be used for many popular quadratic forms, and its weak assumptions allow application in a variety of settings. In fact, existing procedures have more restrictive assumptions or can only be used for special hypotheses or settings. The diversity of possible null hypotheses, these weak assumptions and the easy possibility of extension, for example through a Fisher z-transformation, make our approach attractive. We proved the asymptotic normality of the estimation error of the vectorized empirical correlation matrix under the assumption of finite fourth moments of all components. Based on this, test statistics from a quite general class of quadratic forms were presented, and a bootstrap technique was developed to match their asymptotic distribution. To investigate the properties of the corresponding bootstrap tests, an extensive simulation study was done. This also allowed checking our test statistic based on a Fisher z-transformation.
Here, hypotheses for one and two groups were considered, and the type-I-error control and the power to detect deviations from the null hypothesis were compared to existing test procedures. The developed tests outperform existing procedures for some hypotheses, while they offer good alternatives for others. It is a known fact that testing correlations requires large sample sizes; here, Bradley's liberal criterion was often fulfilled for group sample sizes larger than 50. Especially our Taylor-based approach was convincing for small sample sizes with multiple groups. An illustrative data analysis completed our investigations.
In future research, we will take a closer look at our newly proposed combined test for simultaneously testing the hypotheses of equal covariance matrices and equal correlation matrices. To this end, extensive simulations will be done to, among other things, examine the performance of the corresponding Taylor approach. This relates to studying the large number of possible null hypotheses included in our model. For example, tests for given covariance structures (such as compound symmetry or autoregressive) or structures of correlation matrices with unknown parameters are of great interest. Since there are heterogeneous versions of many popular structures, testing such structures or other patterns can be seen as a combination of testing hypotheses regarding covariance matrices and hypotheses regarding correlation matrices. Moreover, an investigation of Monte-Carlo approaches based on a higher-order Taylor approximation for really small sample sizes could be interesting.

Statements and Declarations

Acknowledgment
We would like to thank two anonymous referees and the editor. A special thanks goes to the Associate Editor for suggesting the idea of simultaneously studying covariances and correlations. Moreover, we would like to thank the German Research Foundation for the support received within project PA 2409/3-2.

Competing Interests
The authors report there are no competing interests to declare.

Appendix

Further test statistics
Here we introduce two further approaches that define corresponding quadratic-form-based test procedures.

Fisher z transformation
The Fisher z-transformation, based on the function f: R → R, x ↦ (1/2) ln((1+x)/(1−x)), is frequently used to transform correlation coefficients. Working with vectorized correlation matrices, this would mean transforming r̂ and afterwards multiplying with the hypothesis matrix. Since this approach was too liberal in our simulations, which was similarly mentioned in Omelka and Pauly (2012), we use it in another way. With the componentwise transformation f: R^{ap_u} → R^{ap_u}, (x_1, ..., x_{ap_u}) ↦ ((1/2) ln((1+x_1)/(1−x_1)), ..., (1/2) ln((1+x_{ap_u})/(1−x_{ap_u}))) and the delta method, we obtain the asymptotic distribution of f(C r̂). Through the consistency of Υ̂, we can estimate the unknown covariance matrix through f′(C r̂) C Υ̂ C⊤ f′(C r̂). This allows us to formulate appropriate quadratic forms to check the null hypothesis H_0: Cr = ζ. Another consistent estimator under the null hypothesis would be f′(ζ) C Υ̂ C⊤ f′(ζ), but we expect less power to detect deviations from the null hypothesis.
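A minimal sketch of the componentwise transformation and the resulting delta-method covariance estimator described above (our own illustrative code; `C`, `r_hat` and `Upsilon_hat` are stand-ins for the quantities in the text):

```python
import numpy as np

def fisher_z(x):
    # componentwise f(x) = 0.5 * ln((1 + x) / (1 - x)), i.e. arctanh
    return np.arctanh(x)

def delta_cov(C, r_hat, Upsilon_hat):
    """Delta-method covariance of fisher_z(C @ r_hat):
    f'(C r_hat) C Upsilon_hat C^T f'(C r_hat), where f'(x) = 1 / (1 - x^2)
    acts as a diagonal matrix."""
    d = 1.0 / (1.0 - (C @ r_hat) ** 2)
    return d[:, None] * (C @ Upsilon_hat @ C.T) * d[None, :]

# tiny example with an identity hypothesis matrix
C = np.eye(2)
r_hat = np.array([0.5, -0.3])
Ups = np.array([[1.0, 0.2], [0.2, 1.0]])
V = delta_cov(C, r_hat, Ups)
```

For `r_hat = 0` the derivative is the identity, so the delta-method covariance reduces to C Υ̂ C⊤, consistent with the untransformed case.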

Wild Bootstrap
With i.i.d. random weights W_{i1}, ..., W_{in_i}, i = 1, ..., a, independent of the data, with E(W_{i1}) = 0 and Var(W_{i1}) = 1, and X̃_{ik} := X_{ik} − X̄_i, we define the wild bootstrap sample. Hereby we multiplied the wild bootstrap sample from Sattler et al. (2022), here denoted by Z⋆_{ik}, with the matrix M(v̂_i, r̂_i) to adapt it to the required covariance matrix Υ_i. The unknown covariance matrices can be estimated through these estimators, and their consistency follows as in Sattler et al. (2022). Similarly, we can show the consistency of the remaining estimators, and therefore the claimed convergence holds with Slutzky's theorem. The results from part (b) follow through the independence of the groups. □

Proofs from the main paper

Proof of Theorem 3.1: This proof is based on the proofs from Browne and Shapiro (1986) and Nel (1985), where a similar situation is considered. Some adaptations must be made because we are only interested in the matrix's upper triangular part. For diagonal matrices, we can use the Taylor expansion for each component separately, thereby getting a result as considered in the above equation. We first consider the corresponding remainder by using the single components of 1 + n_i^{−1/2} U_{i,0} for x. Since U_{i,0} converges to a normally distributed random variable with independent components, we know from Slutzky's theorem the order of the remainder. Here, we used again that U_{i,0} and U_i converge to normally distributed random variables. Again with Slutzky's theorem, the product of such a random variable and an expression of order O_P(n_i^{−q−1/2}) is also of this order. Therefore, many parts of the initial product are now collected in the O_P(n_i^{−q−1/2}) term.
For the derivation of the Taylor-based Monte-Carlo approach, we will use the above expression, but for the asymptotic normality, we use Slutzky's theorem. We define matrices based on h_3 = vech(H) and h_4 = vech(H⊤). With these matrices, it is easy to check that the corresponding equations hold, and therefore, with vech(U_{i,0}) = M_5 vech(U_i), the stated representation follows. Multiplication with the matrix M_5 changes nothing in this case because it just picks the columns unequal to zero and drops the rest. So, all in all, with M_4 = M_2 + M_3 the representation holds. To adapt this result to the upper-half-vectorization, we use the particular elimination matrix L_p^u, which gives a connection between vech and the upper-half-vectorization. Therefore, the claimed convergence holds, and because of Theorem 1 from Sattler et al. (2022) the result follows. We could get the same result by using the delta method on the results for the vectorized covariance matrices. We believe the approach of Browne and Shapiro (1986), together with Nel (1985), is preferable due to its stepwise structure; it is therefore more suitable for getting an understanding of the matrices used. Another important argument is that we later base a Monte-Carlo approximation on this Taylor approximation. □

Proof of Theorem 3.2: With the result from Theorem 3.1, the asymptotic distribution of the quadratic form follows directly from Theorem 2 of Sattler et al. (2022). □

Proof of Theorem 4.1: We only prove the first part, because the second part follows directly from the single groups' results.
For an application of the multivariate Lindeberg-Feller theorem (given the data), we need to check its conditions. As Y†_{ik} given X is p_u-dimensionally normally distributed with expectation 0_{p_u} and variance Υ̂_i, the conditions hold. For the last part, we used the Cauchy-Bunjakowski-Schwarz inequality. Given the data X, the claimed convergence follows. We can use the results from Sattler et al. (2022) for the consistency of the empirical covariance matrix. Through the construction of the bootstrap sample and the consistency of Σ̂†_i(Z), the empirical covariance matrix of Z†_{i1}, ..., Z†_{in_i}, the result follows. Again, with the independence of groups, the results from part (b) follow directly. □

Lemma 12.1: Proof: From the proof of Theorem 4.1 we know the required convergences. Again the aim is to express the vectorization of these matrices by using vech(R̂_i) and vech(U_i). Analogously to the proof of Theorem 4.1, the representation follows directly, so together this leads to the stated result, where we used the consistency of R̂. With Λ_i(v̂_i) and the results from Sattler et al. (2022), the convergence follows. For this reason we define f_{v̂_i, R̂_i}. From a theoretical point of view, f_{v̂_i, R̂_i} is not a function but a family of functions, since it strongly depends on v̂_i and R̂_i.
Through the construction of Y^Tay, it follows that our test based on this approach is asymptotically correct.

Combined tests for covariance and correlations
To investigate the asymptotic distribution of T, we use the matrix A fulfilling the corresponding relation. Together, this leads to the asymptotic distribution under the hypothesis of equal covariance matrices. But with Lemma 12.1 the analogous result also holds, and because of independence and Slutzky's theorem, the asymptotic distribution of T^Tay_b coincides with the asymptotic distribution of T under the null hypothesis, which leads to an asymptotically correct test.

Further Simulations

Type-I-error rate
Here, all hypotheses are investigated with additional test statistics, like a wild bootstrap ATS. For hypothesis A^r, in addition to a Toeplitz covariance matrix, an autoregressive matrix V_ij = 0.6^{|i−j|} is also simulated to see the influence of the chosen covariance matrix. Moreover, we investigated hypothesis A^r for the special case of dimension d = 2 and therefore p_u = 1. We also compare with the permutation-based approach from Omelka and Pauly (2012), with 999 permutations. Since this procedure was specially developed for this dimension, while we allow higher dimensions, this is particularly interesting. We also adapted the sample sizes for this smaller dimension to have the same relation between N and d, using N = (20, 40, 100, 200). To see the influence of the dimension d, we also consider the case d = 7 and therefore p_u = 21. Here the sample sizes are N = (35, 70, 175, 350) for one group and N = (70, 140, 350, 700) for two groups, which is the same relation as earlier, but substantially smaller relative to the number of unknown parameters p_u. Since the diagonality of a covariance matrix can be checked using either the covariance matrix or the correlation matrix, for B^r we used the covariance-based ATS approaches from Sattler et al. (2022) for d = 5. We also added one more hypothesis, C^r with H_0^r: r_1 = ... = r_{p_u}, to examine whether all correlation components in a group are the same. This hypothesis corresponds to a compound symmetry structure of the correlation matrix.

Table 5: Simulated type-I-error rates (α = 5%) in scenario A^r (H_0^r: R_1 = R_2) for ATS, Steiger's and Jennrich's tests. The observation vectors have dimension 5, covariance matrix (V_1)_ij = 1 − |i − j|/(2d) resp. V_2 = diag(1, 1.2, ..., 1.8) V_1, and it always holds n_1 := 0.6 · N resp. n_2 := 0.4 · N.
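The two covariance structures compared in these simulations can be generated as follows (illustrative NumPy sketch; the function names are our own):

```python
import numpy as np

def toeplitz_cov(d):
    """Toeplitz structure (V1)_ij = 1 - |i - j| / (2 d)."""
    idx = np.arange(d)
    return 1.0 - np.abs(idx[:, None] - idx[None, :]) / (2.0 * d)

def ar_cov(d, rho=0.6):
    """Autoregressive structure V_ij = rho^{|i - j|}."""
    idx = np.arange(d)
    return rho ** np.abs(idx[:, None] - idx[None, :])

V1 = toeplitz_cov(5)   # the Toeplitz matrix used for d = 5
V_ar = ar_cov(5)       # the autoregressive alternative
```

Both constructions are symmetric with unit diagonal, so for these settings covariance and correlation structure coincide before any rescaling with a diagonal matrix.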
The wild bootstrap approach seems to be too liberal in all settings, particularly in comparison with the parametric bootstrap or the Monte-Carlo approach. So this bootstrap technique is not recommendable for testing hypotheses regarding correlation matrices. In Table 5 and Table 6, the test performance appears to depend partially on the underlying covariance structure. The results of our tests for the autoregressive covariance matrix are more liberal than for the Toeplitz covariance matrix. The influence is remarkable for the test from Jennrich (1970), which for the Toeplitz covariance matrix has type-I error rates consistently higher than 0.36. It is better for the autoregressive one, but still worse than most of our tests. Also, the test of Steiger without the Fisher transformation performs clearly better for the autoregressive covariance matrix, although large sample sizes are required to fulfil Bradley's liberal criterion. Our test results for dimension 2 from Table 7 and Table 8 are slightly more liberal than for dimension 5. For the Toeplitz matrix, Steiger's test has disastrous error rates and is not usable in our opinion, while SteigerFz seems to perform clearly better for small dimensions. It is noticeable that while for dimension 5 ATSFz was preferable to the parametric bootstrap most of the time, it is now vice versa. Finally, for dimension 2, the type-I error rates of the permutation test from Omelka and Pauly (2012) as well as of Jennrich's test and SteigerFz seem to depend on the covariance matrix and the distribution. Although ATS-Par, ATS-Par-m, ATS-Tay and ATSFz-m here often need 100 or more observations to fulfil Bradley's liberal criterion, they seem less dependent on the setting. Our approach works better for dimensions larger than 2, which was the main focus. Moreover, these tables show that the large number of observations required for correlation tests (see, for example, Steiger (1980)) is not in relation to the number of unknown parameters p_u.
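The wild bootstrap resampling evaluated here can be sketched as follows, with Rademacher weights, which satisfy the required E(W_{i1}) = 0 and Var(W_{i1}) = 1 (our own illustrative code; the additional transformation with the matrix M(v̂_i, r̂_i) from the appendix is omitted here):

```python
import numpy as np

def wild_bootstrap(X, rng):
    """One wild bootstrap replicate: centred observations multiplied by
    i.i.d. Rademacher weights (mean 0, variance 1)."""
    X_centred = X - X.mean(axis=0)            # X_ik - X_bar_i per group
    W = rng.choice([-1.0, 1.0], size=X.shape[0])
    return W[:, None] * X_centred

rng = np.random.default_rng(42)
X = rng.normal(size=(20, 5))   # n_i = 20 observations of dimension d = 5
X_star = wild_bootstrap(X, rng)
```

Repeating this for each bootstrap run and recomputing the test statistic yields the wild bootstrap reference distribution whose liberal behaviour is reported above.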
In Table 9, the three tests based on the vectorized covariance matrix behave comparably to other hypotheses from Sattler et al. (2022). The corresponding ATS with a parametric bootstrap has overall error rates as good as the parametric bootstrap ATS for vectorized correlation matrices. It can be assumed that the variance of the components determines whether the correlation matrix or the covariance matrix is more suitable for this hypothesis. Variances greater than 1 make the corresponding correlation coefficients smaller than the components of the covariance matrix, which increases the type-I error rate for the covariance-based test; similarly, variances smaller than 1 lower it. To illustrate this, we multiplied the covariance matrix with an appropriate diagonal matrix; the results are displayed in Table 10. We included only individual test statistics, since this multiplication does not influence the correlation matrix. For these small variances of the single components, the type-I error rate of the covariance-based tests clearly decreased. This dependency on the variance of the components makes the correlation-based approach more reliable, although through the standardization with the covariances there could be more variance. In Table 11, again φ_ATS†, φ_ATS and φ_ATSFz have the best performance, as well as their multiplied versions. This multiplication makes the tests less conservative, which is especially useful for smaller sample sizes but can lead to slightly conservative behaviour for larger sample sizes. Nevertheless, these tests fulfil Bradley's liberal criterion for n_1 ≥ 50 and often also for n_1 = 25. Overall, it is noticeable that the difference between wild bootstrap and parametric bootstrap is essentially larger for one group than for two. Moreover, the Taylor-based approach performs better for two groups, since it gets slightly liberal in the settings with only one group.
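The reason only the covariance-based test statistics change in Table 10 is that rescaling with a diagonal matrix leaves the correlation matrix unchanged; a quick numerical check (our own toy matrices):

```python
import numpy as np

def corr_from_cov(V):
    """Correlation matrix obtained by standardizing a covariance matrix."""
    s = np.sqrt(np.diag(V))
    return V / np.outer(s, s)

V = np.array([[1.0, 0.3, 0.2],
              [0.3, 2.0, 0.5],
              [0.2, 0.5, 1.5]])
D = np.diag([0.4, 0.7, 0.9])   # shrinks the variances below 1
V_scaled = D @ V @ D

R = corr_from_cov(V)
R_scaled = corr_from_cov(V_scaled)   # identical to R
```

Because (DVD)_ij / sqrt((DVD)_ii (DVD)_jj) = V_ij / sqrt(V_ii V_jj), the correlation-based tests are invariant under such rescaling while the covariance-based tests are not.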
For higher dimensions, a worse small-sample performance would be expected, since the sample size is smaller in relation to the number of unknown parameters p_u. Indeed, this holds only for ATS-Par-m and ATSFz-m, while for the others, like the parametric bootstrap, the performance seems even better, as can be seen in Tables 12-15. This result shows that the introduced techniques are also interesting for higher dimensions.
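Bradley's liberal criterion, used throughout these type-I-error comparisons, accepts an empirical rate within [0.5α, 1.5α], i.e. [0.025, 0.075] for α = 5%. A minimal helper for checking simulated rates (our own illustrative code):

```python
def bradley_liberal(rate, alpha=0.05):
    """Bradley's liberal criterion: the empirical type-I error rate
    should lie within [0.5 * alpha, 1.5 * alpha]."""
    return 0.5 * alpha <= rate <= 1.5 * alpha
```

For example, the Jennrich rates above 0.36 reported earlier fail this criterion by a wide margin, while rates like 0.06 pass it.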

Power
To examine the power's dependency on the underlying distribution, the power simulation is now repeated with the same setting but a gamma distribution instead of a skewed normal distribution. As can be seen in Figure 2, the results are very similar to the power for the skewed normal distribution. This suggests that the results do not depend strongly on the underlying distribution.
For each group i = 1, ..., a, we calculate the estimated covariance matrix Υ̂_i. With this covariance matrix, we generate i.i.d. random vectors Y†_{i1}, ..., Y†_{in_i} ∼ N_{p_u}(0_{p_u}, Υ̂_i), which are independent of the realizations, and calculate their sample covariance Υ̂†_i. The number of values within the interval [0.0458, 0.0543], printed bold, clearly increases. With the correction factor, these tests have a better type-I-error rate than φ_Steiger in all settings and are comparable to φ_SteigerFz.

Table 3: Number of observations for the different factor level combinations of sex and diagnosis.

Table 4: P-values of different ATS for testing equality of correlation matrices and equality of covariance matrices.
Finally, we consider the gamma distribution as a further distribution for all settings.