A Comment on Affine Invariance and Ancillarity in Testing Multivariate Normality

With the aim of clarifying some confusion apparently still present in part of the literature on testing for multivariate normality, we collect and discuss some facts on the invariance and ancillarity of various variants of the so-called configuration of a multivariate normal sample. In particular, we show how the choice of the rotation matrix used in the definition of the configuration influences its invariance, distribution and ancillarity.


Mathematics Subject Classification 62H10 · 62G10 · 62H15
In a recent article by Vexler [10], a test of bivariate normality is proposed. The statistic of that test is a function of the symmetric version of the so-called sample configuration. Let $X$ be a $(p, n)$ data matrix with $n > p$. Under the null hypothesis, the columns $X_1, \dots, X_n$ of $X$ form an $n$-element i.i.d. sample from a $p$-variate normal distribution $N_p(\mu, \Sigma)$, with $\mu \in \mathbb{R}^p$ and a positive definite covariance matrix $\Sigma$. Both $\mu$ and $\Sigma$ are considered unknown. Let $\bar{X} = n^{-1} \sum_{i=1}^{n} X_i$ be the sample mean and $S = n^{-1} (X - \bar{X}\mathbf{1}_n^\top)(X - \bar{X}\mathbf{1}_n^\top)^\top$ be the sample covariance matrix, where $\mathbf{1}_n^\top = (1, \dots, 1) \in \mathbb{R}^n$. The symmetric sample configuration, also called the matrix of scaled residuals, is defined as $Z_s := S^{-1/2}(X - \bar{X}\mathbf{1}_n^\top)$, where $S^{-1/2}$ stands for the symmetric, positive definite square root of the inverse of $S$ (with $n > p$, $S$ is almost surely non-singular, cf. [2]). On page 5 of [10], the author describes a Monte Carlo experiment aimed to "experimentally confirm" that the null distribution of the test statistic does not depend on $(\mu, \Sigma)$, i.e., that $Z_s$ is ancillary. In what follows, we show that, although $Z_s$ is not invariant w.r.t. standard groups of data transformations usually considered in the context of testing multivariate normality, the null distribution of $Z_s$ does not, indeed, depend on $(\mu, \Sigma)$, so that the Monte Carlo study in [10] was not necessary.
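As a minimal numerical sketch of the definition above (this is not code from [10]; the function name `symmetric_configuration` and the NumPy-based approach are assumptions made purely for illustration), the symmetric configuration can be computed from an eigendecomposition of the sample covariance matrix. The snippet also checks that the rows of the configuration are orthogonal with squared norm n, and that the configuration itself changes under a non-orthogonal linear transformation of the data.

```python
import numpy as np

def symmetric_configuration(X):
    """Scaled residuals S^{-1/2} (X - Xbar 1_n^T) for a (p, n) data matrix X."""
    n = X.shape[1]
    centered = X - X.mean(axis=1, keepdims=True)        # subtract the sample mean from each column
    S = centered @ centered.T / n                       # sample covariance with divisor n
    eigval, eigvec = np.linalg.eigh(S)                  # S is symmetric, a.s. positive definite
    S_inv_sqrt = (eigvec / np.sqrt(eigval)) @ eigvec.T  # symmetric inverse square root of S
    return S_inv_sqrt @ centered

rng = np.random.default_rng(0)                          # arbitrary seed, illustration only
p, n = 2, 10
X = rng.normal(size=(p, n))
Z_s = symmetric_configuration(X)

# Rows are orthogonal with squared norm n, i.e. Z_s Z_s^T = n I_p.
gram_ok = np.allclose(Z_s @ Z_s.T, n * np.eye(p))

# Transform the data with a fixed upper triangular matrix with positive diagonal.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
b = np.array([[3.0], [-2.0]])
Z_s_transformed = symmetric_configuration(A @ X + b)
changed = not np.allclose(Z_s, Z_s_transformed)         # the configuration itself is not invariant
```

Both `gram_ok` and `changed` should evaluate to `True` here. The lack of invariance concerns the value of the configuration for a given data set, which is a separate question from whether its null distribution depends on the parameters.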
More generally, a sample configuration may also be defined as $\tilde{Z} := C^{-1}(X - \bar{X}\mathbf{1}_n^\top)$, with any matrix $C$ such that $S = CC^\top$. It is well known that such a $\tilde{Z}$ is defined up to left multiplication by a rotation matrix. Although some authors seem to suggest that $\tilde{Z}$ is always ancillary (e.g., [4, Sect. 2]), this is not always true, and the distribution and properties of $\tilde{Z}$ depend on the choice of $C$.
In what follows, two groups of transformations will be of interest: the group $G$ of affine transformations $X \to AX + b\mathbf{1}_n^\top$, with nonsingular $(p, p)$ matrices $A$ and with $b \in \mathbb{R}^p$, and its subgroup $G^*$, with $A \in UT(p)$, the group of upper triangular matrices with positive diagonal. Various questions related to invariant tests for multivariate normality were discussed in [7]. It was shown, in particular, that if $S = TT^\top$ with $T \in UT(p)$, then $Z := T^{-1}(X - \bar{X}\mathbf{1}_n^\top)$ is a maximal invariant w.r.t. $G^*$ and, hence, is an ancillary statistic for $(\mu, \Sigma)$. Additionally, the distribution of $Z$ is invariant w.r.t. left multiplication of $Z$ by fixed, orthogonal $(p, p)$ matrices. This follows directly from [1]. It is shown there (in Sect. 5) that $Z = uB$, where $u = u(X)$ is a $(p, n-1)$ random matrix, $B$ is a specific, fixed $(n-1, n)$ matrix and the distribution of $u$ is the Haar measure on the group $SO(n-1)$ of $(n-1, n-1)$ orthogonal matrices, marginalized to the first $p$ rows. This distribution is clearly invariant w.r.t. left multiplication of $u$ by orthogonal $(p, p)$ matrices, say $Q$, because this corresponds to left multiplication in $SO(n-1)$ by orthogonal, block diagonal matrices $\mathrm{diag}(Q, I_{n-1-p})$, where $I_{n-1-p}$ stands for the identity matrix.
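The invariance of the triangular configuration under the subgroup of upper triangular transformations can be checked numerically. The following is a sketch under stated assumptions: the helper names `upper_cholesky` and `triangular_configuration` are hypothetical, and the upper triangular factor is obtained by applying NumPy's lower Cholesky factorization to the order-reversed covariance matrix, which is one convenient way to get it and not a construction taken from [7].

```python
import numpy as np

def upper_cholesky(S):
    """Upper triangular T with positive diagonal such that S = T @ T.T."""
    L = np.linalg.cholesky(S[::-1, ::-1])   # lower factor of S with rows and columns reversed
    return L[::-1, ::-1]                    # reversing back yields an upper triangular factor of S

def triangular_configuration(X):
    """T^{-1} (X - Xbar 1_n^T), with T the upper triangular factor of the sample covariance."""
    n = X.shape[1]
    centered = X - X.mean(axis=1, keepdims=True)
    S = centered @ centered.T / n           # sample covariance with divisor n
    return np.linalg.solve(upper_cholesky(S), centered)

rng = np.random.default_rng(1)              # arbitrary seed, illustration only
X = rng.normal(size=(3, 8))
b = np.array([[1.0], [-2.0], [0.5]])

# An upper triangular matrix with positive diagonal, and a lower triangular one
# that is nonsingular but does not belong to that subgroup.
A_upper = np.array([[2.0, 1.0, -1.0], [0.0, 0.5, 3.0], [0.0, 0.0, 1.5]])
A_lower = A_upper.T

Z = triangular_configuration(X)
invariant_under_upper = np.allclose(Z, triangular_configuration(A_upper @ X + b))
invariant_under_lower = np.allclose(Z, triangular_configuration(A_lower @ X + b))
```

With these inputs, `invariant_under_upper` should be `True` and `invariant_under_lower` should be `False`: the triangular factor of the transformed covariance equals the product of the transformation matrix and the original factor only when that product is again upper triangular with positive diagonal.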
Any configuration $\tilde{Z}$ is related to $Z$ through $\tilde{Z} = RZ$, with an orthogonal matrix $R$. Invariance and ancillarity of $\tilde{Z}$ depend on the way $R$ is defined as a function of $X$.
If $R$ is stochastically independent of $Z$, then the invariance of the distribution of $Z$ w.r.t. left multiplication by fixed, orthogonal matrices leads, via a standard conditioning argument, to the conclusion that the distributions of $\tilde{Z}$ and $Z$ are identical. As an example, with $T \in UT(p)$, since $S = TT^\top = S^{1/2}S^{1/2}$, the symmetric configuration satisfies $Z_s = RZ$, with the orthogonal matrix $R = S^{-1/2}T$. Since $R$ is a function of $S$ only (because both $T$ and $S^{1/2}$ are) and $(\bar{X}, S)$ is sufficient and complete in the Gaussian model, $R$ and $Z$ are stochastically independent by the Basu theorem. Hence, $Z_s$ has the same distribution as $Z$ and is ancillary. The same conclusion holds true if $R$ is any function of $(\bar{X}, S)$.
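The conditioning argument can be written out as follows. This is only a sketch, and the symbol names are chosen here for illustration: $R$ denotes the random orthogonal matrix, $Z$ the triangular configuration, $Q$ a fixed orthogonal $(p, p)$ matrix and $B$ an arbitrary Borel set of $(p, n)$ matrices.

```latex
\begin{align*}
P(RZ \in B)
  &= E\bigl[\, P(RZ \in B \mid R) \,\bigr] \\
  &= E\bigl[\, P(QZ \in B)\big|_{Q = R} \,\bigr]
     && \text{(independence of $R$ and $Z$)} \\
  &= E\bigl[\, P(Z \in B) \,\bigr]
     && \text{(invariance of the law of $Z$ under fixed orthogonal $Q$)} \\
  &= P(Z \in B).
\end{align*}
```

Both ingredients are needed: without independence, conditioning on $R$ would change the law of $Z$, and without the orthogonal invariance the inner probability would still depend on $R$.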
If $R$ is a function of $Z$ only, then $\tilde{Z} = RZ$ is clearly ancillary as a function of the ancillary statistic $Z$, with a distribution possibly different from that of $Z$. $\tilde{Z}$ is then invariant w.r.t. $G^*$, but not necessarily w.r.t. $G$.
If, moreover, $\tilde{Z}$ is a function of $Z^\top Z$, i.e., of the Mahalanobis distances and angles, which is a maximal invariant w.r.t. $G$ (see, e.g., [3]), then $\tilde{Z} = RZ$ is also invariant w.r.t. $G$. For an example, see, e.g., [7, Th. 3]. As one of the columns of the configuration constructed there is always proportional to $(1, 0, \dots, 0)^\top$, its distribution is clearly different from that of $Z$.
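The invariance of the matrix of Mahalanobis distances and angles under general affine transformations is easy to verify numerically. The sketch below is illustrative only; the function name `mahalanobis_gram` is hypothetical, and the matrix is computed directly from the centered data and the sample covariance, which gives the same result for any choice of configuration.

```python
import numpy as np

def mahalanobis_gram(X):
    """(n, n) matrix (X - Xbar 1^T)^T S^{-1} (X - Xbar 1^T) for a (p, n) data matrix X."""
    n = X.shape[1]
    centered = X - X.mean(axis=1, keepdims=True)
    S = centered @ centered.T / n                     # sample covariance with divisor n
    return centered.T @ np.linalg.solve(S, centered)  # diagonal holds squared Mahalanobis distances

rng = np.random.default_rng(2)                        # arbitrary seed, illustration only
X = rng.normal(size=(3, 12))

# A fixed nonsingular matrix that is not triangular (its determinant is 7.5).
A = np.array([[2.0, 1.0, 0.0], [-1.0, 3.0, 1.0], [0.5, 0.0, 1.0]])
b = np.array([[1.0], [-4.0], [2.0]])

M = mahalanobis_gram(X)
M_transformed = mahalanobis_gram(A @ X + b)
invariant_under_G = np.allclose(M, M_transformed)
```

Here `invariant_under_G` should be `True`, so any statistic that depends on the data only through this matrix is invariant under the full affine group.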
Finally, if $R$ depends on $X$ in an arbitrary way, i.e., not necessarily through $Z$ only, then $\tilde{Z} = RZ$ does not have to be ancillary. $R$ can be, e.g., a rotation that makes the first column $\tilde{Z}_1$ of $\tilde{Z}$ parallel to $X_1$, randomly chosen from the Haar distribution on $SO(p)$, conditionally on $\tilde{Z}_1$ being parallel to $X_1$. Then, clearly, the expectation of the first column of $\tilde{Z}$ is proportional to $\mu$, and $\tilde{Z}$ is not ancillary.
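A small Monte Carlo sketch can illustrate this loss of ancillarity. It rests on simplifying assumptions that are not part of the original argument: only the first column of the rotated configuration is simulated, "parallel" is read as "pointing in the same direction", the covariance is the identity, and the function name `first_column_mean` is hypothetical. Since a rotation preserves norms, the first column after rotation equals the first data column rescaled to the norm of the first column of the configuration, whatever the remaining random part of the rotation is.

```python
import numpy as np

def first_column_mean(mu, n=10, reps=2000, seed=0):
    """Monte Carlo mean of the first column of a configuration rotated so that
    this column points in the direction of the first observation."""
    rng = np.random.default_rng(seed)
    mu = np.asarray(mu, dtype=float)
    p = mu.size
    total = np.zeros(p)
    for _ in range(reps):
        X = mu[:, None] + rng.normal(size=(p, n))   # N_p(mu, I) sample, stored as columns
        centered = X - X.mean(axis=1, keepdims=True)
        S = centered @ centered.T / n
        r1 = centered[:, 0]
        # Norm of the first column of any configuration (rotations do not change it).
        norm_z1 = np.sqrt(r1 @ np.linalg.solve(S, r1))
        x1 = X[:, 0]
        total += norm_z1 * x1 / np.linalg.norm(x1)  # rotated first column
    return total / reps

mean_at_origin = first_column_mean([0.0, 0.0])
mean_shifted = first_column_mean([5.0, 0.0])
```

Under these assumptions, `mean_at_origin` should be close to the zero vector, while the first coordinate of `mean_shifted` should be clearly positive, so the distribution of such a configuration depends on the mean vector.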
It should be noted that many tests for multivariate normality are based on the symmetric sample configuration and that not all of them are $G$-invariant, even if a strong case for this property is sometimes made, e.g., at the beginning of Sect. 2 in [5]. Invariance w.r.t. $G^*$ only may also be of interest if directed tests for multivariate normality against some restricted alternatives are considered, as in [6, 7], or if the goal is maximin testing between some neighborhoods of transformation families of distributions (see, e.g., [8]).
If the sample covariance matrix $S$ is replaced in the definition of the sample configuration with another affine equivariant estimator $V(X)$ that satisfies $V(AX + b\mathbf{1}_n^\top) = A\,V(X)\,A^\top$ for nonsingular $A$ and $b \in \mathbb{R}^p$, e.g., a robust estimator studied in [9], then $Z$ remains a maximal invariant w.r.t. $G^*$, but the rows of $Z$ are no longer orthogonal, as was the case with $S$. As the distribution of $Z$ does not have to be, in that case, invariant under left multiplication by orthogonal matrices, the discussion of the distributional issues does not carry over to that more general case.