A unified approach to goodness-of-fit testing for spherical and hyperspherical data

We propose a general and relatively simple method for the construction of goodness-of-fit tests on the sphere and the hypersphere. The method is based on the characterization of probability distributions via their characteristic function, and it leads to test criteria that are convenient regarding applications and consistent against arbitrary deviations from the model under test. We emphasize goodness-of-fit tests for spherical distributions due to their importance in applications and the relative scarcity of available methods.


Introduction
Let ∥ · ∥ be the Euclidean norm in R^d, d ≥ 2, and write S^{d−1} := {x ∈ R^d : ∥x∥ = 1} for the surface of the unit sphere in R^d. In this paper, we consider the problem of testing goodness-of-fit for distributions defined on the sphere S^2 or on the hypersphere S^{d−1}, where d > 3. In this respect, there is a plethora of such tests for distributions defined on R^d, even in the multivariate case d > 1.
Likewise, goodness-of-fit testing on the circular domain S^1 is a relatively well-explored area. For the latter case we refer, e.g., to [16] (Chapters 6 and 7), [20] and [21].
On the other hand, the same problem for data taking values on S^{d−1}, where d ≥ 3, has been mostly confined to testing for uniformity. Nevertheless, and while the notion of a "non-preferred direction", and hence testing for uniformity, is certainly central to (hyper)spherical data analysis, there are several more flexible distributions, which in fact often have the uniform as a special case.
The reader is referred to the monographs of [23], Section 2.3, and [24], Section 9.3, for such non-uniform models for (hyper)spherical data. At the same time, it seems that goodness-of-fit tests specifically tailored to (hyper)spherical laws are scarce, certainly in the case of a composite null hypothesis, where distributional parameters need to be estimated from the data at hand, but also for a completely specified hypothesis with fixed (known) parameter values. For the latter case, the test based on nearest neighbors proposed in [13] seems to be one of the few tests available, while to the best of our knowledge, there is much need for research in the case of a composite hypothesis.
In view of these lines, we suggest a procedure for testing goodness-of-fit for distributions defined on S^{d−1}, where d ≥ 3. The suggested test is novel in that it is general-purpose, suitable for arbitrary (hyper)spherical distributions with either fixed or estimated parameters, and it is straightforwardly applicable provided that one can easily draw Monte Carlo samples from the distribution under test. Suppose X is a random (column) vector in R^d taking values in S^{d−1} with a density f with respect to surface measure and characteristic function (CF) φ(t) = E(e^{i t^⊤ X}), t ∈ R^d, where ⊤ denotes transposition, and i = √−1 stands for the imaginary unit. We start our exposition with the simple null hypothesis

H_0 : f = f_0,    (1)

where f_0 is some given density on S^{d−1}, which should be tested against the general alternative H_A that the distributions pertaining to f and f_0 are different.
If X_0 has density f_0 and CF φ_0(t) = E(e^{i t^⊤ X_0}), t ∈ R^d, say, the standard CF-based statistic for testing H_0 versus H_A is given by

D_{n,w} = n ∫ |φ_n(t) − φ_0(t)|² w(t) dt.    (2)

Here,

φ_n(t) = (1/n) Σ_{j=1}^n e^{i t^⊤ X_j}, t ∈ R^d,    (3)

is the empirical CF of X_1, . . ., X_n, and X_1, . . ., X_n are independent and identically distributed (i.i.d.) copies of X. In (2), the domain of integration as well as the nonnegative weight function w(·) will be specified below in a way that D_{n,w} is amenable to computation and that a test of H_0 that rejects H_0 for large values of D_{n,w} is consistent against each alternative to H_0. Notice that D_{n,w} is an estimator of the population Fourier-type discrepancy measure ∫ |φ(t) − φ_0(t)|² w(t) dt between f and f_0.
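As a concrete illustration of these two ingredients, the following Python sketch computes the empirical CF and a Monte Carlo approximation of D_{n,w} when the weight w is a probability density from which one can sample. The paper's own implementation is in R; the function names here (`empirical_cf`, `D_nw_mc`) are ours, not the authors'.

```python
import numpy as np

def empirical_cf(t, x):
    """Empirical characteristic function phi_n(t) = (1/n) sum_j exp(i t^T X_j).

    t : (k, d) array of evaluation points, x : (n, d) array of observations.
    Returns a complex array of length k."""
    return np.exp(1j * (t @ x.T)).mean(axis=1)

def D_nw_mc(x, cf0, t_w):
    """Monte Carlo approximation of D_{n,w} = n * integral |phi_n - phi_0|^2 w dt,
    where t_w is a sample drawn from the probability density w."""
    n = x.shape[0]
    diff = empirical_cf(t_w, x) - cf0(t_w)
    return n * np.mean(np.abs(diff) ** 2)
```

At t = 0 the empirical CF is exactly 1, and its modulus never exceeds 1; these two properties give a quick sanity check of any implementation.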
The starting point of this paper is that the approach outlined above assumes that the functional form of the CF φ_0 is known. Such knowledge, however, is only available for distributions on the real line R^1 and for a few selected cases of multivariate distributions, such as the multivariate normal and the multivariate stable distribution; see [12] and [26].
In order to circumvent this obstacle, which is even more challenging for distributions taking values on S^{d−1}, we suggest the test statistic

T_{n,m,w} = (nm/(n+m)) ∫ |φ_n(t) − ψ_m(t)|² w(t) dt.    (5)

Here,

ψ_m(t) = (1/m) Σ_{j=1}^m e^{i t^⊤ Y_j}, t ∈ R^d,

is the empirical CF of Y_1, . . ., Y_m, where, independently of X_1, . . ., X_n, the random vectors Y_1, . . ., Y_m are i.i.d. copies of X_0. Of course, realizations of Y_1, . . ., Y_m are generated via Monte Carlo. Notice that ψ_m is an estimator of the CF φ_0. In this way, the functional form of φ_0 is not needed in the test statistic T_{n,m,w} in (5), which is reminiscent of a CF-based test for the two-sample problem, one sample being the data X_1, . . ., X_n at hand, while the other consists of artificial data generated under the null hypothesis H_0. For more details on CF-based tests for the two-sample problem the reader is referred to [25] and [14].
This idea also applies to the problem of testing the composite null hypothesis

H_{0,c} : f(·) = f_0(·, ϑ) for some ϑ ∈ Θ

against general alternatives. Here, {f_0(·, ϑ) : ϑ ∈ Θ} is a given family of densities on S^{d−1} that is parameterized in terms of ϑ ∈ Θ, where Θ ⊂ R^s for some s ≥ 1.
In this setting, the test statistic in (5) is modified by computing it with φ_n(t) defined in (3) and with Y_j, j ∈ {1, . . ., m}, i.i.d. copies of a random vector having density f_0(·, ϑ̂_n), where ϑ̂_n = ϑ̂_n(X_1, . . ., X_n) is an estimator of ϑ. In this connection, we note that the idea of a goodness-of-fit method that employs an artificial sample from the distribution under test seems to date back to [15], at least for independent data and simple hypotheses. Recently, [9] proposed a CF-based method using the notion of artificial samples for goodness-of-fit within the family of multivariate elliptical distributions, [10] employ artificial samples in order to specifically test multivariate normality in high dimensions using nearest neighbors, while [2] applies a test procedure for mixed data by means of artificial samples.
The remainder of this work unfolds as follows. In Section 2 we obtain the limit null distribution of T_{n,m,w} as well as the corresponding law under fixed deviations from H_0. In Section 3, the validity of a bootstrap resampling scheme necessary for actually carrying out the test for simple hypotheses with fixed parameters is established, while in Section 4 a corresponding bootstrap resampling scheme for the composite hypothesis test statistic T_{n,m,w} is suggested. Section 5 contains an extensive Monte Carlo study of the finite-sample behavior of the new tests, including comparisons, while Section 6 illustrates real-data applications.
The final Section 7 provides some discussion.

Asymptotics
In this section, we provide the limit distribution of T_{n,m,w} defined in (5). To be flexible with respect to both the region of integration and the weight function w, let M be some nonempty Borel set in R^d, and let µ be some finite measure on (the Borel subsets of) M. Thus, M could be R^d itself, and µ could be absolutely continuous with respect to the Lebesgue measure on R^d; or M could be S^{d−1}, and µ could be absolutely continuous with respect to spherical measure. Notably, M could also be some countable subset T of R^d, with µ having a probability mass function with respect to the counting measure on T. In this setting, let X, X_1, X_2, . . . and Y, Y_1, Y_2, . . . be independent M-valued random vectors that are defined on some common probability space (Ω, A, P).
Moreover, let X, X_1, X_2, . . . be i.i.d. with density f with respect to µ and CF φ, and let Y, Y_1, Y_2, . . . be i.i.d. with CF ψ. Write φ_n and ψ_m for the empirical CFs of X_1, . . ., X_n and Y_1, . . ., Y_m, respectively. This section tackles the limit distribution of

T_{n,m} = (nm/(n+m)) ∫_M |φ_n(t) − ψ_m(t)|² µ(dt)    (11)

as m, n → ∞, under each of the conditions φ = ψ and φ ≠ ψ.
Since S(−x) = −S(x) and S(0_d) = 0, where 0_d is the origin in R^d, the sum of the imaginary parts vanishes, and we obtain a real-valued representation of T_{n,m}. A further simplification is obtained if we assume that the set M (like R^d, S^{d−1}, or the grid Z^d) is symmetric with respect to the origin and that µ is invariant under the reflection x ↦ −x, whence S(x) = ∫_M sin(t^⊤ x) µ(dt) = 0 for each x ∈ R^d. Now, writing B(M) for the σ-field of Borel sets on M, let H := L²(M, B(M), µ) be the separable Hilbert space of (equivalence classes of) measurable functions u : M → R satisfying ∫_M u² dµ < ∞, equipped with the inner product ⟨u, v⟩ = ∫_M u v dµ and the norm ∥u∥_H = ⟨u, u⟩^{1/2}. Suppose that φ = ψ, and let U and V be centred random elements of H having the covariance kernel K given in (12). Since U_n and V_m are independent for each pair (n, m), also U and V are independent (see, e.g., [5]). From (11), in view of (15) and the continuous mapping theorem, the assertion follows.
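The cosine kernel C(x) = ∫_M cos(t^⊤ x) µ(dt) that results from this symmetry argument has a closed form for the Gaussian-type weight: if µ is the N(0, 2γI_d) law, then C(x) = e^{−γ∥x∥²} and S(x) = 0. The following Monte Carlo check (a sketch of ours, not the authors' code) verifies both identities numerically.

```python
import numpy as np

# Monte Carlo check: for mu = N(0, 2*gamma*I_d),
#   C(x) = integral cos(t'x) mu(dt) = exp(-gamma * ||x||^2)  and
#   S(x) = integral sin(t'x) mu(dt) = 0   (mu is symmetric about the origin).
rng = np.random.default_rng(1)
gamma, d = 0.5, 3
t = np.sqrt(2.0 * gamma) * rng.standard_normal((400_000, d))  # t ~ N(0, 2*gamma*I)
x = np.array([0.3, -1.2, 0.8])
proj = t @ x
C_mc = np.cos(proj).mean()      # Monte Carlo estimate of C(x)
S_mc = np.sin(proj).mean()      # Monte Carlo estimate of S(x)
C_exact = np.exp(-gamma * np.dot(x, x))
```

With 4 * 10^5 draws the Monte Carlo error is of order 10^-3, so both estimates match the closed forms to two decimal places.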
The next result gives the almost sure limit of (m + n) T_{n,m} / (mn) as m, n → ∞.
Under the conditions on M and µ stated in Theorem 2.1, we have the following. Regarded as random elements of H, put A_n and B_m as in (17), and write a = ϱ(·) and b = b(·) for the degenerate random elements of H that are the expectations of A_n and B_m, respectively. By the strong law of large numbers in Banach spaces (see, e.g., [19]), we have A_n → a and B_m → b almost surely in H. It is readily seen that ∆ equals the almost sure limit figuring in Theorem 2.2.
Thus, ∆ is the measure of deviation between the distributions of X and Y, expressed in the form of a weighted L²-distance of the corresponding characteristic functions, and this measure of deviation is estimated by T_{n,m}. As the next result shows, the statistic T_{n,m}, when suitably normalized, has a normal limit. To prove this result, we need the condition

m/(m + n) → τ as m, n → ∞    (18)

for some τ ∈ [0, 1]. In contrast to Theorem 2.1 and Theorem 2.2, this condition is needed now to assess the asymptotic proportions of the X-sample and the Y-sample. In what follows, let z(t) denote the element defined in (21) and σ² the limiting variance, both of which are expressed in terms of ϱ(t) and b(t) given in (16).
Theorem 2.3 Suppose the standing assumptions on M and µ hold. If ∆ > 0, then under the limiting regime (18),

(mn/(m+n))^{1/2} ( (m+n) T_{n,m}/(mn) − ∆ ) →_D N(0, σ²).    (20)

Proof. The proof follows the lines of the proof of Theorem 3 of [3]. In view of Theorem 2.2, condition (22) of [3] holds. Let A_n and B_m be as in (17), and put Z_{n,m} := A_n − B_m. Furthermore, write z = z(·) for the degenerate random element of H, where z(t) is given in (21), and define the normalizing sequence c_{m,n} in terms of a_{m,n} and b_{m,n} given in (13). By the central limit theorem for H-valued random elements, A_n and B_m converge in distribution, as n, m → ∞, to A and B, where A and B are independent centred Gaussian random elements of H with covariance kernels K_1 and K_2, respectively, given in (19). In view of (18), the continuous mapping theorem yields (20). Thus, also condition (23) of [3] holds, and the proof of Theorem 2.3 follows. Notice that the second summand on the right-hand side is o_P(1) in view of the tightness of (c_{m,n}(Z_{n,m} − z)), and the first summand converges in distribution to 2⟨Z, z⟩, which has the stated normal distribution N(0, σ²).
Remark 2.4 Compared to [9], who address the problem of composite hypotheses, the limit results of this section are obtained for simple hypotheses without estimated parameters. However, the results obtained herein hold for artificial sample size m ≠ n, which is much more general and thus flexible than the case m = n treated by [9]. Moreover, our setting is different from that of elliptical distributions on the classical Euclidean space R^d. In the following section, we suggest a resampling version of the test, and we prove its asymptotic validity.

Resampling under a simple hypothesis
Since, under H_0, both the finite-sample and the limit distribution of T_{n,m} as n, m → ∞ depend on the unknown underlying distribution of X_1, we use a bootstrap procedure in order to carry out a test that rejects H_0 for large values of T_{n,m}. The bootstrap distribution of T_{n,m} is the conditional distribution of T_{n,m} given the pooled sample X_1, . . ., X_n, Y_1, . . ., Y_m, and a test of H_0 at nominal level α rejects H_0 if T_{n,m} exceeds the (1 − α)-quantile of this bootstrap distribution. Since the bootstrap distribution is difficult to compute, it is estimated by a Monte Carlo procedure that repeatedly samples from the empirical distribution of the pooled sample. To be specific, one first computes the observed value t_{n,m} of T_{n,m} based on realizations x_1, . . ., x_n, y_1, . . ., y_m of X_1, . . ., X_n, Y_1, . . ., Y_m, respectively. In a second step, one generates b independent samples by Monte Carlo simulation; for each j ∈ {1, . . ., b}, the jth sample consists of x_1(j), . . ., x_n(j), y_1(j), . . ., y_m(j), where these values have been chosen independently of each other with a uniform distribution over {x_1, . . ., x_n, y_1, . . ., y_m}, and one then computes the corresponding value of the statistic. To prove that this bootstrap procedure yields a test of H_0 of asymptotic level α, we use a Hilbert space central limit theorem for triangular arrays (see [22]).
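The Monte Carlo approximation of the pooled-sample bootstrap can be sketched as follows (Python, our function names; `stat` is any two-sample statistic such as T_{n,m}):

```python
import numpy as np

def bootstrap_pvalue(x, y, stat, b=199, rng=None):
    """Pooled-sample bootstrap for a two-sample statistic: draw n + m values
    with replacement from the pooled data, split them into new X- and
    Y-samples, recompute the statistic, and report a Monte Carlo p-value."""
    rng = np.random.default_rng(rng)
    n, m = len(x), len(y)
    pooled = np.concatenate([x, y], axis=0)
    t_obs = stat(x, y)
    exceed = 0
    for _ in range(b):
        idx = rng.integers(0, n + m, size=n + m)   # i.i.d. uniform draws
        exceed += stat(pooled[idx[:n]], pooled[idx[n:]]) >= t_obs
    return (1 + exceed) / (1 + b)   # standard +1 correction
```

Rejecting when the p-value falls below α is equivalent to comparing t_{n,m} with the empirical (1 − α)-quantile of the b bootstrap replicates.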
This theorem reads as follows.
Theorem 3.1 Let {e_j : j ≥ 1} be a complete orthonormal basis of the separable Hilbert space H with inner product ⟨·, ·⟩ and norm ∥·∥_H. For each m ≥ 1, let X_{m1}, X_{m2}, . . ., X_{mm} be independent H-valued random elements such that E(⟨X_{mj}, e_ℓ⟩) = 0 and E∥X_{mj}∥²_H < ∞ for each j ∈ {1, . . ., m} and each ℓ ≥ 1. Put S_m = Σ_{j=1}^m X_{mj}, and let C_m be the covariance operator of S_m. Assume that conditions (i)–(iii) of [22] hold; then S_m converges in distribution to a centred Gaussian random element of H. In our setting, write W_{n,m} for the statistic computed from the bootstrap sample, where a_{m,n} and b_{m,n} are defined in (13).
Theorem 3.2 In the setting given above, suppose that the limiting regime (18) holds. Then W_{n,m} converges in distribution to ∥W_∞∥²_H.
Proof. The proof is similar to that of Theorem 2 of [4] and will thus only be sketched. Notice that X_{m,n,1}, . . ., X_{m,n,n} are i.i.d. centred random elements of the Hilbert space H = L²(M, B(M), µ) that, for a fixed complete orthonormal system of H, satisfy E(⟨X_{m,n,j}, e_ℓ⟩) = 0 and E∥X_{m,n,j}∥²_H < ∞ for each j ∈ {1, . . ., n} and each ℓ ≥ 1. The covariance function of the process U_{m,n} = U_{m,n}(·) and the corresponding covariance operator C_{m,n} can be computed explicitly. Since the function CS is bounded and continuous, dominated convergence yields ⟨C_{m,n} g, h⟩ → ⟨C_∞ g, h⟩ (g, h ∈ H), which shows that condition (i) of Theorem 3.1 holds. The proof of condition (ii) of Theorem 3.1 follows the reasoning given on p. 603 of [4]. The latter limit has the same distribution as W_∞.
Notice that the test statistic T_{n,m} figuring in (11), computed on the bootstrap sample, equals W_{n,m} given in (22). From Theorem 3.2, we thus have the following corollary.
Corollary 3.3 Under the limiting regime (18), the bootstrap distribution of the test statistic T_{n,m} converges to the distribution of ∥W∥²_H, where W is given in Theorem 2.1. This shows the asymptotic validity of the bootstrap.

Remark 3.4
The resampling bootstrap procedure applied herein may also be replaced by a permutation procedure. The validity of the exhaustive permutation test (that includes all possible permutations) may be obtained directly by observing that, under the null hypothesis H_0, the observations (x_1, . . ., x_n, y_1, . . ., y_m) are exchangeable. Another potential resampling scheme is the weighted bootstrap; see [1].
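A Monte Carlo version of the permutation alternative differs from the bootstrap only in that the pooled sample is re-split without replacement; a sketch (our names, any two-sample statistic `stat`):

```python
import numpy as np

def permutation_pvalue(x, y, stat, b=199, rng=None):
    """Monte Carlo permutation test: randomly re-split the pooled sample into
    groups of sizes n and m. This is valid under H0 because the pooled
    observations are exchangeable."""
    rng = np.random.default_rng(rng)
    n = len(x)
    pooled = np.concatenate([x, y], axis=0)
    t_obs = stat(x, y)
    exceed = 0
    for _ in range(b):
        perm = rng.permutation(len(pooled))        # split without replacement
        exceed += stat(pooled[perm[:n]], pooled[perm[n:]]) >= t_obs
    return (1 + exceed) / (1 + b)
```

Unlike the bootstrap, each permutation sample contains every pooled observation exactly once, which often gives a slightly better finite-sample level.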

Resampling under a composite hypothesis
Analogously to the simple-hypothesis case, the limit null distribution of T_{n,m,w} = T_{n,m,w}(x_1, . . ., x_n, y_1, . . ., y_m) depends (in a very complicated way) on unknown quantities, and hence it cannot be used to compute critical values and actually carry out the test. To this end, we consider a parametric bootstrap procedure involving the test statistic in (8), computed on the basis of bootstrap observations from f_0(·, ϑ), where the parameter ϑ is replaced by estimators.
More precisely, let T_{n,m,w,obs} denote the observed value of the test statistic.
For given α ∈ (0, 1), write t*_{n,m,α} for the upper α-percentile of the bootstrap distribution of T_{n,m,w}. We then reject H_{0,c} if T_{n,m,w,obs} > t*_{n,m,α}. In practice, the bootstrap distribution of T_{n,m,w} is approximated as follows: one repeatedly generates data under the fitted null model, re-estimates the parameter on each generated sample, and recomputes the statistic, yielding replicates T*_{n,m,w,1}, . . ., T*_{n,m,w,b}. We then approximate the upper α-percentile t*_{n,m,α} in (24) of the null distribution of T_{n,m,w} by the upper α-percentile of the empirical distribution of T*_{n,m,w,1}, . . ., T*_{n,m,w,b}. Although we provide no asymptotic theory for the resampling under a composite hypothesis, our simulations show that the above method works well. Nevertheless, it remains an open problem to formally prove that this bootstrap is asymptotically valid.
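The parametric bootstrap loop can be sketched generically as follows (Python; `estimate`, `sample_null`, and `stat` are user-supplied placeholders for the estimator of ϑ, a sampler from f_0(·, ϑ), and the test statistic, respectively):

```python
import numpy as np

def parametric_bootstrap_reject(x, estimate, sample_null, stat,
                                m=200, b=100, alpha=0.05, rng=None):
    """Parametric bootstrap for a composite hypothesis: fit theta on the data,
    draw the artificial Y-sample from f0(., theta_hat), and calibrate the
    critical value by re-estimating theta on each bootstrap sample."""
    rng = np.random.default_rng(rng)
    n = len(x)
    theta = estimate(x)
    t_obs = stat(x, sample_null(theta, m, rng))
    t_star = np.empty(b)
    for j in range(b):
        x_star = sample_null(theta, n, rng)        # data drawn under H0
        theta_star = estimate(x_star)              # re-estimate each time
        t_star[j] = stat(x_star, sample_null(theta_star, m, rng))
    return t_obs > np.quantile(t_star, 1 - alpha)  # reject H0 if True
```

Re-estimating ϑ inside the loop is what lets the bootstrap mimic the extra variability introduced by parameter estimation.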

Simulations
In this section we provide results of comparative Monte Carlo simulations for the case of both a simple and a composite hypothesis. We throughout restrict the simulation to the spherical setting with dimension d = 3 and, for computational feasibility, to the sample size n = 50. All simulations are performed using the statistical programming language R, see [29]. We implement the test statistic by fixing the measure µ(dt) to have the density of the zero-mean spherical stable distribution. Then the test may be computed as in eqn. (10) with C(x) = e^{−γ∥x∥^ξ}, where (ξ, γ) ∈ (0, 2] × (0, ∞) denote tuning parameters which are at our disposal and provide a certain flexibility of the test with respect to power against different alternatives. Another option, although not yielding a proper measure, is to adopt the approach taken in [28], which again results in the test statistic given in (10) with C(x) = −∥x∥^ξ, ξ ∈ (0, 2).
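With such a kernel C(·), the statistic reduces to double sums over the two samples. The following Python sketch shows our reading of this kernel form (the paper's own code is in R, and the exact constants of eqn. (10) may differ):

```python
import numpy as np

def C_stable(diff, xi=2.0, gamma=1.0):
    """Kernel C(x) = exp(-gamma * ||x||^xi) induced by the spherical stable
    weight; (xi, gamma) in (0, 2] x (0, inf) are the tuning parameters."""
    return np.exp(-gamma * np.linalg.norm(diff, axis=-1) ** xi)

def T_nm(x, y, C):
    """Kernel form of the weighted L2 distance of the empirical CFs:
    T = nm/(n+m) * [ n^-2 sum C(xj - xk) + m^-2 sum C(yj - yk)
                     - 2 (nm)^-1 sum C(xj - yk) ]."""
    n, m = len(x), len(y)
    cxx = C(x[:, None, :] - x[None, :, :]).sum() / n**2
    cyy = C(y[:, None, :] - y[None, :, :]).sum() / m**2
    cxy = C(x[:, None, :] - y[None, :, :]).sum() / (n * m)
    return n * m / (n + m) * (cxx + cyy - 2.0 * cxy)
```

The statistic vanishes when the two samples coincide and is nonnegative for positive definite kernels such as the Gaussian case ξ = 2.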
The spherical distributions were generated using the R package Directional, see [30], and the uniformity tests were carried out with the package sphunif, see [17].

Testing the simple hypothesis of uniformity
We test the hypothesis H_0 : f = f_U, where f_U ≡ 1/ω_d and ω_d = 2π^{d/2}/Γ(d/2) is the surface area of the (d−1)-sphere. Hence we test whether f is the density of the uniform law U(S^{d−1}), which is a classical testing problem in directional statistics. For an overview of existing procedures we refer to [16]. As competing tests we consider the following procedures:
• the modified Rayleigh test R_n, see [24], Section 10.4.1, based on the mean of the directions,
• the test NNJ_a of [13], based on volumes of the J nearest neighbor balls with power a,
• the Bingham test B_n, see [6], based on the empirical scatter matrix of the sample,
• the Sobolev test G_n, see [18], based on Sobolev norms, and
• the test CA_n of [11], which is based on random projections that characterize the uniform law.
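The first of these competitors is easy to sketch: the classical Rayleigh statistic rejects uniformity for large values of the squared mean resultant length (the modified version in [24] adds a finite-sample correction not shown in this minimal Python sketch of ours):

```python
import numpy as np

def rayleigh_statistic(x):
    """Classical Rayleigh statistic R_n = n * d * ||x_bar||^2, asymptotically
    chi-squared with d degrees of freedom under uniformity on S^{d-1}."""
    n, d = x.shape
    xbar = x.mean(axis=0)
    return n * d * float(xbar @ xbar)
```

At maximal concentration (all observations equal) the statistic attains its upper bound n * d, while under uniformity the mean resultant length is of order n^{-1/2} and the statistic stays moderate.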

Empirical critical values for each testing procedure have been obtained by a Monte Carlo simulation study under H_0 with 100000 replications.
We considered the following alternatives to the uniform distribution on S d−1 .
These alternatives are chosen to simulate different uni-, bi- and trimodal models.
For details on the hyperspherical von Mises-Fisher distribution, see Section 9.3 in [24].
• The density of the von Mises-Fisher distribution depends on the mean direction θ ∈ S^{d−1} and a concentration parameter κ ≥ 0, and it is given by

f(x; θ, κ) = C_d(κ) exp(κ θ^⊤ x), x ∈ S^{d−1}, with C_d(κ) = κ^{d/2−1} / ((2π)^{d/2} I_{d/2−1}(κ)).

Here, I_{d/2−1} is the modified Bessel function of the first kind and order d/2 − 1. This class is denoted by vMF(θ, κ).
• In a similar manner as for the mixture of two vMF distributions, we simulate a mixture of three von Mises-Fisher distributions with different centers by additionally simulating, independently, a third random vector Y_3 ~ vMF(θ_3, κ_3) and generating the member X as X = Y_J, where J is independent of (Y_1, Y_2, Y_3) with P(J = k) = p_k, k ∈ {1, 2, 3}. We denote this class by MMF((p_1, p_2, p_3), (θ_1, θ_2, θ_3), (κ_1, κ_2, κ_3)).
In each of the alternatives, we put µ_1 = (1, 0, . . ., 0) and 1 = (1, . . ., 1)/√d. Here and in the following, T^{ξ}_{n,γ} stands for the test in eqn. (10) with C(x) = e^{−γ∥x∥^ξ}, where (ξ, γ) ∈ (0, 2] × (0, ∞), and T^{SR}_{n,a} for the test with C(x) = −∥x∥^a. The results of the simulation are displayed in Tables 1 and 2 for the choice ξ = 2 and the stated competitors. As can be seen, the suggested tests perform well in comparison, although they are never the best performing procedures. This behavior might be explained by the approximation of the true characteristic function under the null hypothesis. To investigate the impact of the sample size m of the simulated data set Y_1, . . ., Y_m, we simulated the empirical power of the test for four vMF distributions and for different values of m, see Figure 1. Clearly, the choice of m has an impact on the estimation, and larger values of m are desirable, but increasing m leads to longer computation times. Table 3 exhibits the impact of the weighting measure µ and hence of the choice of the function C(·). In terms of power for the uni- and bimodal alternatives considered, the choice of C(·) has nearly no influence on the empirical power, with the exception of the MMF((0.5, 0.5), (−µ_1, µ_1), (2, 2)) alternative, where the T^{ξ}_{n,γ} tests outperform the T^{SR}_{n,a} procedures for some values of the tuning parameter γ.
[Table captions: empirical rejection rates for testing uniformity for the test T^{ξ}_{n,γ}; empirical rejection rates for testing uniformity for the tests T^{ξ}_{n,γ} with ξ = 1, 1.5 as well as T^{SR}_{n,a} (n = 50, m = 500, α = 0.05, 10000 replications).]
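For d = 3, vMF samples and the mixture alternatives above can be generated with the classical closed-form inverse CDF of the cosine of the polar angle. The sketch below is ours in Python (the study itself uses the R package Directional); the Householder reflection maps the north pole onto θ, which leaves the vMF law unchanged because it is rotationally symmetric about its axis.

```python
import numpy as np

def sample_vmf3(theta, kappa, n, rng=None):
    """Draw n points from vMF(theta, kappa) on S^2 (d = 3) via the closed-form
    inverse CDF of w = cos(angle to theta):
    w = 1 + log(u + (1 - u) * exp(-2*kappa)) / kappa, with u ~ U(0, 1)."""
    rng = np.random.default_rng(rng)
    theta = np.asarray(theta, dtype=float)
    theta = theta / np.linalg.norm(theta)
    u = rng.random(n)
    w = 1.0 + np.log(u + (1.0 - u) * np.exp(-2.0 * kappa)) / kappa
    phi = 2.0 * np.pi * rng.random(n)
    r = np.sqrt(np.clip(1.0 - w * w, 0.0, None))
    v = np.column_stack([r * np.cos(phi), r * np.sin(phi), w])  # pole at e3
    e3 = np.array([0.0, 0.0, 1.0])
    if np.allclose(theta, e3):
        return v
    h = e3 - theta
    h /= np.linalg.norm(h)
    return v - 2.0 * np.outer(v @ h, h)  # Householder reflection: e3 -> theta

def sample_mmf3(probs, thetas, kappas, n, rng=None):
    """Mixture of vMF components on S^2: X = Y_J with P(J = k) = probs[k]."""
    rng = np.random.default_rng(rng)
    ks = rng.choice(len(probs), size=n, p=probs)
    out = np.empty((n, 3))
    for k in range(len(probs)):
        idx = np.flatnonzero(ks == k)
        if idx.size:
            out[idx] = sample_vmf3(thetas[k], kappas[k], idx.size, rng)
    return out
```

For large κ the draws concentrate tightly around θ, which gives a quick correctness check.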

Testing the fit to the von Mises-Fisher distribution
For the case of a composite hypothesis, we consider the hypothesis that the underlying density belongs to the family of von Mises-Fisher distributions vMF(θ, κ), i.e., we test the hypothesis H_{0,c} : f ∈ {f_{vMF}(·; θ, κ) : θ ∈ S^{d−1}, κ ≥ 0} against general alternatives. The main difference to Subsection 5.1 is that we now test the fit to a family of distributions, where the parameters are unknown and hence have to be estimated. To test the hypothesis we chose T^{2}_{n,γ} in eqn. (10) with C(x) = e^{−γ∥x∥²}, for different values of the tuning parameter γ, and we implemented the parametric bootstrap procedure from Section 4. To approximate the unknown parameters we calculated the maximum likelihood estimates of κ and θ as proposed in Section 10.3.1 of [24]. As far as we know, testing composite hypotheses for spherical or hyperspherical distributions with estimated parameters has not been considered before in the literature. As alternative models we chose the same distributions as described in Subsection 5.1.
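The estimation step admits a compact sketch: the ML estimate of θ is the mean direction, and κ̂ is usually obtained from the mean resultant length. The Python sketch below uses the well-known closed-form approximation of Banerjee et al. rather than the exact equation of [24], Section 10.3.1; the function name is ours.

```python
import numpy as np

def vmf_mle(x):
    """Approximate ML estimates for the vMF model on S^{d-1}:
    theta_hat is the mean direction; kappa_hat uses the approximation
    kappa ~= Rbar * (d - Rbar^2) / (1 - Rbar^2), with Rbar = ||x_bar||
    (the exact equation solves A_d(kappa) = Rbar for kappa)."""
    n, d = x.shape
    xbar = x.mean(axis=0)
    rbar = np.linalg.norm(xbar)
    theta_hat = xbar / rbar
    kappa_hat = rbar * (d - rbar**2) / (1.0 - rbar**2)
    return theta_hat, kappa_hat
```

A strongly concentrated sample should return a mean direction close to the true axis and a large κ̂.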
In view of the extensive computation time due to the parametric bootstrap procedure, we considered the simulation setting n = 50, m = 200, a sample size of 500 in the bootstrap algorithm, and 5000 Monte Carlo replications. Throughout the study, we fixed the significance level at 0.05. The results are reported in Table 4. Notably, the novel test maintains the nominal significance level very closely, and its power with respect to bimodal alternatives increases the more pronounced the two modes are.

Testing the fit to the angular central Gaussian distribution
In this subsection, we consider testing the fit to an angular central Gaussian model, i.e., we test the hypothesis that f belongs to the family of angular central Gaussian densities f_{ACG}(·; Σ). Here, Σ is a symmetric positive definite (d × d) parameter matrix, which is identifiable only up to multiplication by a positive scalar. For information regarding this model, see [24], Section 9.4.4, and for a numerical procedure to approximate the maximum likelihood estimator of the unknown parameter matrix Σ, see [31].
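Sampling from this model is particularly simple, which makes it a convenient null family for the artificial-sample approach: an ACG draw is a normalized zero-mean Gaussian vector. A minimal Python sketch (our function name):

```python
import numpy as np

def sample_acg(sigma, n, rng=None):
    """Angular central Gaussian draws: X = Z / ||Z|| with Z ~ N(0, Sigma).
    Since Z and c*Z normalize to the same point for any c > 0, Sigma is
    identified only up to a positive scalar multiple, as noted in the text."""
    rng = np.random.default_rng(rng)
    d = sigma.shape[0]
    z = rng.multivariate_normal(np.zeros(d), sigma, size=n)
    return z / np.linalg.norm(z, axis=1, keepdims=True)
```

An anisotropic Σ concentrates the mass along its dominant eigenvector, which is visible in the second moments of the sample.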
To the best of our knowledge, testing the fit to the angular central Gaussian family has not been considered in the literature.
The simulation parameters match those of Subsection 5.2. In complete analogy, we considered T^{2}_{n,γ} for different values of the tuning parameter γ. Results are presented in Table 5. In this case the bootstrap testing procedure controls the type I error, while performing well for most of the alternatives considered.

Real data
We revisit the paleomagnetic data in [27], which is an example of spherical data.
Paleomagnetic data consist of observations on the direction of magnetism in rocks, sediment, or archeological specimens. These data are measured at various geological points in time and spatial locations. The directions are usually measured as declination and inclination angles based on strike and dip coordinates; see [27] and the references therein for more information. The data considered are taken from the GEOMAGIA50.v3 database, see [8]. For simplicity, we analyse the data provided in the supplementary material of [27]. The full data set consists of n = 1137 entries (variables are age, dec, inc, lat, and lon) collected at a single spatial location, namely the Eifel maars (EIF) lakes in Germany, with relocated nearby data; for details see [27]. The bootstrap p-values of our test of fit to the von Mises-Fisher family are reported in Table 6. For the full data set, with the exception of the tuning parameter γ = 0.5, the p-values indicate that we are not able to reject the hypothesis of an underlying von Mises-Fisher distribution at any level. For their analysis, the authors in [27] consider rocks of age 1250 and hence determine a subset of the data of sample size n = 50; for a plot see Figure 2 (right). They propose to use a new spherical model, namely a distribution of Kent type, obtained by applying a transformation to the von Mises-Fisher density. The results of our test of fit to the von Mises-Fisher law for this subset are displayed in the second row of Table 6. For all significance levels and each choice of the tuning parameter, the tests reject the null hypothesis, indicating a poor fit of the von Mises-Fisher family for the subset of the data.
As a second parametric family of distributions, we consider the Kent distribution, defined by the density

f(x; κ, θ, A) = c(κ, A)^{−1} exp(κ θ^⊤ x + x^⊤ A x), x ∈ S^{d−1}.

Here, κ > 0 is a concentration parameter and θ ∈ S^{d−1} is the mean direction.
Moreover, A is a symmetric (d × d)-matrix with tr(A) = 0 and Aθ = 0 that depends on an 'ovality' parameter β, see [24]. Hence we test the hypothesis that f belongs to the Kent family, against general alternatives.

Suppose that the set M (like R^d, S^{d−1}, or the grid Z^d) is symmetric with respect to the origin 0_d, i.e., we have −M = M, where −M := {−x : x ∈ M}. Furthermore, we suppose that the measure µ is invariant with respect to the reflection T(x) := −x, x ∈ R^d, i.e., we have µ = µ^T, where µ^T is the image of µ under T. By transformation of integrals, we then obtain S(x) = ∫_M sin(t^⊤ x) µ(dt) = −S(x), x ∈ R^d, and thus S(x) = 0, x ∈ R^d. Putting CS(ξ) := cos ξ + sin ξ, ξ ∈ R, and using the addition theorem cos(α − β) = cos α cos β + sin α sin β, some algebra yields the computable form of the statistic. Here, G is a centred random element of H with covariance operator C characterized by ⟨Ch, e_ℓ⟩ = Σ_{j=1}^∞ ⟨h, e_j⟩ a_{jℓ} for each h ∈ H and each ℓ ≥ 1. To apply Theorem 3.1 in our situation of the Hilbert space H := L²(M, B(M), µ), let, in greater generality than considered so far, X^{m,n}_1, . . ., X^{m,n}_n, Y^{m,n}_1, . . ., Y^{m,n}_m be i.i.d. M-valued random vectors with a common distribution.

To prove condition (iii) of Theorem 3.1, notice that, with X_{m,n,j} defined in (23), the fact that |CS(·)| ≤ 2 and Hölder's inequality give

|⟨X_{m,n,j}, h⟩| ≤ (2/√n) (µ(M))^{1/2} ∥h∥_H, h ∈ H,

and hence

Σ_{j=1}^n E[⟨X_{m,n,j}, h⟩² 1{|⟨X_{m,n,j}, h⟩| > ε}] ≤ 4 µ(M) ∥h∥²_H P(|⟨X_{m,n,1}, h⟩| > ε),

and thus also condition (iii) of Theorem 3.1 holds. According to Theorem 3.1, we have U_{m,n} →_D W_∞. In the same way, V_{m,n} →_D W̃_∞, where, due to the independence of U_{m,n} and V_{m,n}, W̃_∞ is an independent copy of W_∞. In view of (22) and the continuous mapping theorem, the convergence of W_{n,m} follows.

Let F_n and G_m denote the empirical distributions of X_1, . . ., X_n and Y_1, . . ., Y_m, respectively, and write H_{n,m} := (n/(m+n)) F_n + (m/(m+n)) G_m for the empirical distribution of the pooled sample X_1, . . ., X_n, Y_1, . . ., Y_m. By the Glivenko-Cantelli theorem, H_{n,m} converges weakly to H_∞ := (1 − τ)F + τF = F with probability one under the limiting regime (18), and thus the bootstrap distribution of T_{n,m} converges almost surely to the distribution of ∥W_∞∥²_H. The latter distribution coincides with the distribution of ∥W∥²_H.
