Topology-driven goodness-of-fit tests in arbitrary dimensions

This paper adopts a tool from computational topology, the Euler characteristic curve (ECC) of a sample, to perform one- and two-sample goodness-of-fit tests; we call the resulting procedure TopoTests. The presented tests work for samples of arbitrary dimension and have power comparable to the state-of-the-art tests in the one-dimensional case. It is demonstrated that the type I error of TopoTests can be controlled and that their type II error vanishes exponentially with increasing sample size. Extensive numerical simulations of TopoTests are conducted to demonstrate their power for samples of various sizes.


Introduction
Goodness-of-fit (GoF) testing is one of the standard tasks in statistics. The testing procedure can be stated in a one-sample or a two-sample setting. In the one-sample problem, we observe a sample of m independent realizations {x_1, ..., x_m} of a d-dimensional random vector X with an unknown distribution function G, i.e. x_i ∼ G. The task is to test whether G is equal to a specific distribution F, i.e. we would like to test

H_0 : G = F versus H_1 : G ≠ F. (1)

In the setting of the two-sample problem we are given two independent samples consisting of m and n (m ≠ n in general) independent realizations of d-dimensional random vectors X and Y with unknown distribution functions F and G, respectively. This means X = {x_1, ..., x_m}, x_i ∼ F and Y = {y_1, ..., y_n}, y_j ∼ G, while the hypothesis is the same as in (1).
In this paper, we consider a more general notion of equivalence, replacing the equality sign above by the relation of being Euler equivalent (cf. Definition 2.1).
We are interested in the setting in which the underlying distribution is continuous. In this case, prominent GoF tests for samples from R rely on the empirical distribution function, see [1, chapter 4]. These include, in the one-dimensional case, the Kolmogorov-Smirnov, Cramér-von Mises and Anderson-Darling tests. In higher dimensions, the Kolmogorov-Smirnov test leads to the Fasano-Franceschini [2] and Peacock [3] tests; a general case was considered by Justel [4]. A multivariate version of Cramér-von Mises was proposed by Chiu and Liu [5]. Since those tests are based on the empirical distribution function, their generalization to R^d for d ≥ 2 is conceptually and computationally difficult. Moreover, we are not aware of an efficient implementation of a general goodness-of-fit test for high-dimensional samples.
To tackle this challenge we propose to replace the cumulative distribution function with Euler characteristic curves (ECCs) [6][7][8], a tool from computational topology that provides a signature of the considered sample. To a given sample X, this notion associates a function χ(X) : [0, ∞) → Z, which can serve as a stand-in for the empirical distribution function in arbitrary dimensions. Subsequently, for one-sample tests, inspired by the Kolmogorov-Smirnov test, we define the test statistic to be the supremum distance between the ECC of the sample and the expected ECC for the distribution. This topologically driven testing scheme will be referred to as "TopoTest" for short.
The key characteristic of any goodness-of-fit test is its power: the type II error should be small under the requirement that the type I error is fixed at level α. We show that the proposed test satisfies this condition and that it performs very well in practical cases. In particular, even when restricted to one-dimensional samples, its power is comparable to that of the standard GoF tests.
The paper is organized as follows: Section 1.1 reviews the necessary background from topology as well as the current work on the topic. In Section 2 we present the theoretical justification of our method. In Section 3 the algorithms implementing the proposed GoF tests are detailed. Sections 4 and 5 present the numerical experiments and a comparison of the presented technique to existing methods. In particular, compared to a higher-dimensional version of the Kolmogorov-Smirnov test, we find that our procedure provides better power and takes less time to compute. Finally, in Section 7 the conclusions are drawn.

Background
Since the seminal work of Edelsbrunner et al. [9] and Carlsson & Zomorodian [10], Topological Data Analysis (TDA) is a fast-growing interdisciplinary area combining tools and results of such diverse fields of science as algebra, topology, statistics and machine learning, just to name a few. For a survey from a statistician's perspective, see [11]. One of the areas in which TDA can contribute to statistics is the application of topological summaries of the data to hypothesis testing. Despite ongoing research and growing interest in TDA methods, attempts to construct statistical tests within the classical Neyman-Pearson hypothesis testing paradigm based on persistent homology, the most popular topological summary of data, are limited because the distributions of the test statistics under the null hypothesis are unknown. Therefore, the approaches most common in the literature utilize sampling and permutation based techniques [12][13][14]. In this work, a different topological summary of the data, namely the Euler characteristic curve (ECC), is used to construct one-sample and two-sample statistical tests. The application of ECCs is motivated by recent theoretical findings regarding the asymptotic distribution of the ECC, which enable us to construct tests in a rigorous fashion. Since the finite-sample distributions of ECCs remain unknown, extensive Monte Carlo simulations were conducted to investigate the properties and performance of the proposed tests.

Fig. 1 With increasing scale parameter, we draw in edges and triangles. We keep track of the number of components, which is here #points − #edges + #triangles.

Tools from Computational Topology
To start with an example, let us consider the set X of nine points in R^2 (Figure 1a). The most elementary way of assigning a numeric quantity to them is to simply count them. This is a topological invariant, the number of connected components. Now if two points coincide, they should not be regarded as separate. If they are very close together, say less than some given ε > 0 apart, we can also connect them. So let us draw an edge between them (Figure 1b). The number of connected components is now one less, suggesting we should subtract the number of edges from the number of points. In order to formalize what we mean by points that are close to each other, we introduce a scale parameter r ∈ R≥0. Then we draw edges between pairs of points whose distance is at most r. Letting r = 0 initially and increasing it, we draw more and more edges, thereby reducing the number of connected components (Figure 1c). Once three points are within distance r of each other, according to our intuition they should be considered as one connected component. But we have three points and three edges, which yield a difference of zero. To correct this mismatch with our intuition, we add the number of triangles (Figure 1d). This procedure continues to higher dimensions: once k points are within distance r of each other, we add (−1)^{k−1}. These ideas will now be formalized. For a textbook reference on these topics, we refer the reader to [15].

Definition 1.1. An abstract simplicial complex K is a collection of nonempty sets which is closed under the subset operation: if τ ∈ K and ∅ ≠ σ ⊆ τ, then σ ∈ K. The elements of K are called simplices. If σ ⊊ τ ∈ K, we say that σ is a face of τ. The dimension of a simplex σ ∈ K is dim(σ) = |σ| − 1, where |·| denotes the cardinality of a set. The dimension of K is the maximal dimension of any of its simplices.
The construction of drawing edges, triangles etc. between points which are close to each other can be formalized in slightly different flavours. Perhaps the simplest is the Vietoris-Rips construction:

Definition 1.2. For a finite subset X ⊆ R^d and r ≥ 0 define the Vietoris-Rips complex at scale r to be the abstract simplicial complex R_r(X) = {σ ⊆ X : diam(σ) ≤ 2r}, where diam is the diameter of the simplex, diam(σ) = max{d(x, x′) : x, x′ ∈ σ, x ≠ x′}.
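As a concrete illustration of Definition 1.2, the following sketch enumerates the simplices of the Vietoris-Rips complex of a tiny planar sample by brute force. The function name and the dimension cap `max_dim` are our own illustrative choices; this approach is only feasible for very small samples.

```python
from itertools import combinations
from math import dist

def vietoris_rips(points, r, max_dim=2):
    """Vietoris-Rips complex at scale r: all subsets of diameter <= 2r,
    enumerated up to dimension max_dim (brute force, tiny samples only)."""
    simplices = []
    for k in range(1, max_dim + 2):          # simplices with k vertices
        for sigma in combinations(range(len(points)), k):
            diam = max((dist(points[i], points[j])
                        for i, j in combinations(sigma, 2)), default=0.0)
            if diam <= 2 * r:
                simplices.append(sigma)
    return simplices

pts = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.8)]
K = vietoris_rips(pts, r=0.6)
# 3 vertices, 3 edges and 1 triangle, since all pairwise distances are <= 1.2
```

Real implementations, of course, use specialized libraries rather than this exhaustive enumeration, whose cost grows combinatorially in the number of points.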
A closely related notion is the Čech complex:

Definition 1.3. For a finite subset X ⊆ R^d and r ≥ 0 define the Čech complex at scale r to be the abstract simplicial complex C_r(X) = {σ ⊆ X : ⋂_{x∈σ} B_r(x) ≠ ∅}, where B_r(x) is the closed ball of radius r centered at x.
Finally, the Alpha complex (which is the most useful in practice and is used in our implementations) requires the following notion from computational geometry:

Definition 1.4. For a finite subset X ⊆ R^d, the Voronoi cell of x ∈ X is V(x) = {y ∈ R^d : d(y, x) ≤ d(y, x′) for all x′ ∈ X}.

Definition 1.5. For a finite subset X ⊆ R^d and r ≥ 0 define the Alpha complex at scale r to be the abstract simplicial complex A_r(X) = {σ ⊆ X : ⋂_{x∈σ} (B_r(x) ∩ V(x)) ≠ ∅}.

For illustrations of the Alpha, Čech and Vietoris-Rips complexes on a small sample, consider Figures 2a, 2b and 2c, respectively. We refer to r as the scale parameter or the filtration value. The latter name comes from the fact that for r < r′, the complex at scale r is a subcomplex of the one at scale r′.
The main advantage of the Alpha complex is its small size in low dimensions [16]; namely, the size of the Alpha complex on a random sample scales exponentially with the dimension of the sample and linearly with the sample size, see [17] for further discussion. This is acceptable in low dimensions, but impractical in higher ones. The Vietoris-Rips complex does not scale with the dimension, but it scales exponentially with the sample size. For small samples in high dimensions, this construction should be preferred.
Counting the simplices with a sign yields the Euler characteristic, a fundamental topological invariant.

Fig. 2 We consider three different constructions of filtered simplicial complexes with a fixed sample as vertex set.
Definition 1.6. Let K be a finite abstract simplicial complex. Its Euler characteristic is χ(K) = Σ_{σ∈K} (−1)^{dim(σ)}.

In the following we use the Čech construction in the theoretical part. Due to its sparse nature, the Alpha construction is used in the implementation. They are topologically equivalent by the nerve lemma [15, III.2], hence they give the same ECC.
It should be noted that, for a given sample X, the Euler characteristic of its Vietoris-Rips complex, χ(R_r(X)), may differ from χ(A_r(X)) and χ(C_r(X)). An example can be found in the sample presented in Figure 2c, in which the 2-simplex (triangle) on the left is filled in the Vietoris-Rips complex, but empty in the Čech and Alpha complexes.
Keeping track of how the Euler characteristic changes with the scale parameter yields the main tool of our interest:

Definition 1.7. Given a finite subset X ⊆ R^d, define its Euler characteristic curve (ECC) as the function r ↦ χ(C_r(X)).

The ECC of the sample from Figure 1a is displayed in Figure 3. First applications of the ECC date back to the work of Worsley on astrophysics and medical imaging [8].
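To make Definition 1.7 concrete, the sketch below evaluates the ECC on a small grid of scales, using the Vietoris-Rips complex truncated at dimension 2 in place of the Čech complex (the two may differ, as noted above). The helper `chi_rips` is our own illustrative name, and the brute-force enumeration is feasible only for tiny samples.

```python
from itertools import combinations
from math import dist

def chi_rips(points, r, max_dim=2):
    """Euler characteristic at scale r, as the alternating sum of
    simplex counts of the Vietoris-Rips complex (Definition 1.6)."""
    chi = 0
    for k in range(1, max_dim + 2):           # k vertices, dimension k - 1
        for sigma in combinations(range(len(points)), k):
            if all(dist(points[i], points[j]) <= 2 * r
                   for i, j in combinations(sigma, 2)):
                chi += (-1) ** (k - 1)
    return chi

pts = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.8), (3.0, 0.0)]
curve = [chi_rips(pts, r) for r in (0.0, 0.3, 0.6, 1.2)]
# 4 isolated vertices at small r; at r = 0.6 three points merge into a
# filled triangle (chi = 2); at r = 1.2 an extra edge joins the outlier.
```

The resulting step function, evaluated on a fine grid of r values, is exactly the ECC used as the sample signature throughout the paper.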

Topology of Random Geometric Complexes
In the considered setting, the vertex set from which we build simplicial complexes is sampled from some unknown distribution. The literature distinguishes two approaches, Poisson and Bernoulli sampling; see [18] for a survey. In the first setting, the samples are assumed to be generated by a spatial Poisson process. We focus on the Bernoulli sampling scheme in this paper. This means that we consider samples of n points sampled i.i.d. from some d-dimensional distribution. Furthermore, there are three regimes to be considered when the sample size goes to infinity [19, Section 1.4]. We consider the geometric complex at scale r_n for a sequence r_n → 0, whose topology is determined by whether n·r_n^d tends to zero, to a positive constant, or to infinity. In the supercritical regime, n·r_n^d → ∞, so that the domain gets densely sampled and the geometric complex is highly connected. Intuitively, this regime maintains only global topological information and forgets about local density. In the subcritical regime, n·r_n^d → 0, so that the domain gets sparsely sampled and the geometric complex is, informally speaking, disconnected (consult [18] for details). In this paper, we focus on the thermodynamic regime, i.e. we keep the quantity n·r_n^d = λ constant. Up to a constant factor, the quantity n·r_n^d is the average number of points in a ball of radius r_n [18, Section 1]. This value neither goes to zero nor to infinity as n → ∞ in the thermodynamic regime, leading to complex topology; see for instance [19, Chapter 9]. Now it is straightforward to observe that a subset σ ⊆ X of our sample forms a simplex in the Čech complex of X at scale r if and only if the rescaled subset n^{1/d}σ forms a simplex in the Čech complex of n^{1/d}X at scale n^{1/d}r. This is because for any x, x′ ∈ R^d we have d(n^{1/d}x, n^{1/d}x′) = n^{1/d}·d(x, x′).
This observation motivates us to scale a sample of size n by n^{1/d}. In fact, this setup aligns with the approach of [20]. Due to this scaling, the average number of points in a ball of radius r = λ^{1/d} stays the same as we increase n → ∞. Therefore, it makes sense to compare ECCs at a fixed radius r = λ^{1/d} for samples of different sizes. Visually speaking, we can compare (expected) ECCs from samples of different sizes in a common coordinate system using the r-axis scaled in this way. In particular, one can study the point-wise limit of the expected ECC, that is, when the sample size approaches infinity for a fixed r. Moreover, this rescaling allows us to conduct two-sample tests with samples of different sizes, cf. Section 2.2.
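The rescaling step itself is elementary; the sketch below (with a hypothetical `rescale` helper) multiplies every coordinate of an n-point sample in R^d by n^{1/d}, so that ECCs of samples of different sizes can be plotted over a common r-axis.

```python
def rescale(sample, d):
    """Scale an n-point sample in R^d by n**(1/d) (thermodynamic regime),
    so that a ball of fixed radius r holds on average a constant number
    of points as n grows."""
    s = len(sample) ** (1.0 / d)
    return [tuple(s * c for c in x) for x in sample]

scaled = rescale([(1.0, 0.5)] * 4, d=2)   # n = 4, factor 4**(1/2) = 2
```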

Previous Work
Let us briefly review some related work at the intersection of topology and statistics. The most popular tool of TDA is persistent homology. Its key property is stability [21]; informally speaking, a small perturbation of the input yields a small change in the output. However, persistent homology is a complicated setting for statistics; for example, there are no unique means [22].
For a survey on the topology of random geometric complexes see [18]. A textbook for the case of one-dimensional complexes, i.e. graphs, is [19]. The Euler characteristic of random geometric complexes has been studied in [23,24]. Notably, in [24], the limiting ECC in the thermodynamic regime is computed for the uniform distribution on [0, 1]^3. More recently, [25] provided a functional central limit theorem for ECCs, which was subsequently generalized by [20]. The Euler characteristic has been studied in the context of random fields [26] by Adler and Taylor. Adler suggested using it for model selection purposes and normality testing [27, Section 7]. Building on this work, such a normality test has been extensively studied in [28]. Using topological summaries for statistical testing has moreover been suggested by [29] for persistence vineyards, [30] for persistent Betti numbers and [31] for multiparameter persistent Betti numbers. Mukherjee and Vejdemo-Johansson [14] describe a framework for multiple hypothesis testing for persistent homology. Very recently, Vishwanath et al. [32] provided criteria to check the injectivity of topological summary statistics including ECCs.

Our Contributions
In this paper we present what is, to the best of our knowledge, the first mathematically rigorous approach using Euler characteristic curves to perform general goodness-of-fit testing. Our procedure is theoretically justified by Theorem 2.4. The concentration inequality for Gaussian processes (Lemma 2.2) might be of independent interest.
Simulations conducted in Sections 4 and 5 indicate that, for moderate sample sizes and dimensions, TopoTest outperforms the Kolmogorov-Smirnov test we used as a baseline in arbitrary dimension, both in terms of test power and in terms of computational time.

One-sample test
While topological descriptors are computable and have a strong theory underlying them, they are not complete invariants of the underlying distributions, as recently pointed out in [32]. Hence the statement of the null hypothesis and the alternative require some care.

Definition 2.1. We say two distributions F, G are Euler equivalent, denoted F =_χ G, if the distributions of their Euler characteristic curves coincide in the thermodynamic limit.

For instance, if G arises from F via translations, rotations or reflections, then F =_χ G. For a more interesting instance of Euler equivalent distributions, see Example 3.1 below.
We aim to solve the following: given a fixed null distribution F and a sample X following an unknown distribution G, we test

H_0 : G =_χ F versus H_1 : G ≠_χ F. (2)

Compare this formulation to the problem stated in (1). As the ECCs of the Alpha and Čech complexes are equal, we will use them interchangeably. We write χ(n, r) = χ(C_r(X)), where n is the cardinality of X. Given some distribution F on R^d against which we want to test, we are interested in the expected ECC of the Čech complex at scale r of n i.i.d. points drawn according to F, denoted E_F(χ(n, r)). The TopoTest employs the supremum distance between the ECC computed based on the sample points, χ(C_r(X)), and the expected ECC E_F(χ(n, r)) under H_0, i.e. the test statistic is

∆_n = sup_{r∈[0,T]} n^{−1/2} |χ(C_r(X)) − E_F(χ(n, r))|, (3)

where T ∈ R_+. Therefore, by using the ECC as a topological summary of the dataset, we reduce the initial d-dimensional problem to a one-dimensional setting. If ∆_n defined in (3) is large enough, the null hypothesis is rejected, while for small values of ∆_n the test fails to reject H_0. More precisely: given the significance level α, we consider the rejection region

{∆_n > t_α}. (4)

The threshold value t_α depends on the significance level α and on F (and hence also on the dimension d); however, the dependence on F is dropped in the notation. We prove that this test is consistent below in Section 2.3.

Remark. The test statistic (3) is based on the difference between the sample ECC and the ECC expected under H_0. A natural, yet still open, question arises: how likely is it that two isometry-nonequivalent distributions will be Euler equivalent and hence indistinguishable for the test (4)? In a naive search in which we considered over 1000 different univariate probability distributions defined on R_+, we could not find any such example. Therefore we believe that Euler equivalence is not a practical limitation of our method.
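On a common grid of filtration values, the statistic (3) reduces to a scaled maximum over grid points. The helper below is a hypothetical sketch that assumes the sample ECC and the (estimated) expected ECC have been evaluated at the same radii.

```python
def topotest_statistic(sample_ecc, expected_ecc, n):
    """Delta_n of equation (3) on a discrete grid of radii:
    sup over grid points of n**(-1/2) * |chi_sample(r) - E_F chi(n, r)|."""
    assert len(sample_ecc) == len(expected_ecc)
    return max(abs(a - b)
               for a, b in zip(sample_ecc, expected_ecc)) / n ** 0.5

# Toy curves on a 3-point grid for a sample of n = 4 points.
delta = topotest_statistic([4, 2, 1], [4.0, 4.0, 1.0], n=4)
```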

Two-sample test
A test statistic based on the Euler characteristic curve can also be adapted to the two-sample problem. Given two samples X, Y ⊂ R^d of possibly different sizes m and n, following unknown distributions X ∼ F and Y ∼ G, we test the null hypothesis H_0 : G =_χ F. The test statistic in this setting is the supremum distance between the normalized ECCs,

D = sup_r |χ(C_r(X))/m − χ(C_r(Y))/n|.

Moreover, recall that we rescale the samples to have a fixed average number of points in a ball of radius r, independently of the sample size. Since the null distribution is unknown, we fall back on a permutation test [33, Section 16.3] to compute the p-value; see Algorithm 2 for the details.
As for any permutation test, the procedure is computationally expensive as it requires computing ECCs for a variety of point sets resampled from the union of the two input datasets.The application of this approach is therefore limited to rather small sizes of input data sets.See Section 5 for results of a simulation study in which the performance of this approach is compared with the two-sample Kolmogorov-Smirnov test.
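The permutation scheme can be sketched as follows. Algorithm 2 itself is not reproduced here, so the function names are our own, the Vietoris-Rips ECC stands in for the Alpha-complex ECC, and the size-dependent rescaling of the samples is omitted for brevity.

```python
import random
from itertools import combinations
from math import dist

def ecc_curve(points, radii):
    """Normalized ECC (chi / sample size) of the Vietoris-Rips complex
    truncated at dimension 2; brute force, tiny samples only."""
    n = len(points)
    out = []
    for r in radii:
        chi = 0
        for k in (1, 2, 3):
            for s in combinations(range(n), k):
                if all(dist(points[i], points[j]) <= 2 * r
                       for i, j in combinations(s, 2)):
                    chi += (-1) ** (k - 1)
        out.append(chi / n)
    return out

def two_sample_topotest(X, Y, radii, K=200, seed=0):
    """Approximate the null distribution of the sup distance between
    normalized ECCs by reshuffling labels within the pooled sample."""
    def stat(A, B):
        return max(abs(a - b) for a, b in zip(ecc_curve(A, radii),
                                              ecc_curve(B, radii)))
    D = stat(X, Y)
    pooled = list(X) + list(Y)
    rng = random.Random(seed)
    count = 0
    for _ in range(K):
        rng.shuffle(pooled)
        if stat(pooled[:len(X)], pooled[len(X):]) >= D:
            count += 1
    return D, (count + 1) / (K + 1)   # p-value with the usual +1 correction

X = [(0.1 * i, 0.0) for i in range(6)]
Y = [(0.1 * i, 0.05) for i in range(6)]
D, p = two_sample_topotest(X, Y, radii=[0.05, 0.1, 0.2], K=50)
```

The number of permutations K trades accuracy of the p-value against the cost of recomputing ECCs, which dominates the running time.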

Overview
The TopoTest relies on the Functional Central Limit Theorem of Krebs et al. [20, Theorem 3.4], hence it works under the following, rather technical, assumption.

Assumption 1. The null distribution has compact convex support inside [0, 1]^d. It admits a bounded density κ that can be uniformly approximated by blocked functions κ_n.
Recall from [20, equation 3.8], that the approximation by blocked functions means lim n→∞ ∥κ − κ n ∥ = 0, where each κ n is constant on grid elements of a partition of the unit hypercube [0, 1] d into an equidistant grid of m d subcubes.In particular, bounded measurable functions satisfy this assumption.
We will show, for a fixed significance level α, that the mean of the test statistic ∆_n does not grow with n under the null hypothesis, while it grows at least like √n under the alternative hypothesis. Moreover, in both cases ∆_n is concentrated around its mean, allowing us to control the type II error of the TopoTests.

Case H_0 true
By [25] and [20, Theorem 3.4], we have convergence in distribution, in the Skorokhod J_1-topology, to a centered Gaussian process f_r:

n^{−1/2}(χ(C_r(X)) − E_F(χ(n, r))) → f_r. (5)

Here it is assumed that the sample is drawn from a distribution satisfying Assumption 1 and scaled by n^{1/d}. Let us denote Z_T = sup_{r∈[0,T]} |f_r|. In the following we will approximate the finite-sample distribution of n^{−1/2}(χ(C_r(X)) − E_F(χ(n, r))) by the limiting Gaussian process f_r. Therefore, for sufficiently large n, we assume that

∆_n ≈ Z_T in distribution. (6)

The quality of this approximation was studied numerically; please refer to Figure 4.
For Z_T we have the Borell-TIS inequality [34, Section 2.1],

P(|Z_T − E(Z_T)| > t) ≤ 2 exp(−t²/(2σ_T²)), (7)

where σ_T² = sup_{r∈[0,T]} E(f_r²). Therefore, for n large enough,

P(|∆_n − E(∆_n)| > t) ≤ 2 exp(−t²/(2σ_T²)). (8)

Plugging in (4) yields that the threshold satisfies t_α = O(1).

Fig. 4 Numerical inspection of the quality of the finite-sample approximation (6). The empirical distribution of Z_T converges with increasing sample size. Even in the three-dimensional case, the distribution obtained for n = 100 is a reasonable approximation of the large-sample empirical distribution. An inset in each plot shows the left- and right-hand sides of the inequality (7); this provides another justification for approximation (6).

Case H_0 false
Now let us study the asymptotic size of E(∆_n) under the alternative. Because the limiting distributions of the ECCs are different under the alternative hypothesis, this expression diverges. Due to [24, Corollary 4.5], E_F(χ(n, r)) ∼ n with a constant depending on F and d. In our setting, we obtain

sup_{r∈[0,T]} n^{−1/2} |E_G(χ(n, r)) − E_F(χ(n, r))| = Ω(√n). (9)

To complete the discussion, it is required to show that in the case of H_0 false one also has a concentration around the mean, i.e. one needs to control

C_{F,G}(t) = P(|∆_n − E(∆_n)| > t). (10)

The lemma below provides a generalization of the Borell-TIS inequality to the case of a noncentred Gaussian process.

Lemma 2.2. Let f_r be a centred Gaussian process with σ² = sup_{r∈[0,T]} E(f_r²) and let g(r) be some deterministic function. We have

P(| sup_{r∈[0,T]} |f_r + g(r)| − E(sup_{r∈[0,T]} |f_r + g(r)|) | > t) ≤ 2 exp(−t²/(2σ²)). (11)

Proof. We follow the strategy of Ledoux [35, Section 7.1]. Argument (2.35) in Ledoux [35] yields that if γ is the standard Gaussian measure on R^n, then for every 1-Lipschitz function F on R^n and t ≥ 0 we have

γ(F ≥ ∫ F dγ + t) ≤ exp(−t²/2). (12)

Let r_1, ..., r_n be fixed in [0, T] and consider the centered Gaussian random vector (f_{r_1}, ..., f_{r_n}) in R^n with covariance matrix Γ = B^T B. Consequently, the law of (f_{r_1}, ..., f_{r_n}) is the same as the law of BN, where N = (N_1, ..., N_n)^T is distributed according to the standard Gaussian measure γ on R^n. Let F : R^n → R be defined as F(x) = max_i |(Bx)_i + g(r_i)|. Although we have a different F in our setting than [35], we can still bound the Lipschitz norm of F by the operator norm of B. Indeed, consider any c > 0 such that ∥Bx∥_∞ ≤ c∥x∥_2 for all x ≠ 0. Using the triangle inequality, we estimate that for any x, y ∈ R^n,

|F(x) − F(y)| ≤ max_i |(B(x − y))_i| = ∥B(x − y)∥_∞ ≤ c∥x − y∥_2.

Notice that the i-th row B_i of B satisfies ∥B_i∥_2² = Γ_{ii} = E(f_{r_i}²) ≤ σ². This allows us to bound the operator norm of B as follows:

∥Bx∥_∞ = max_i |⟨B_i, x⟩| ≤ max_i ∥B_i∥_2 ∥x∥_2 ≤ σ∥x∥_2.

Consequently, F/σ is 1-Lipschitz, and by (12) we have γ(F ≥ ∫ F dγ + σt) ≤ exp(−t²/2). Letting t = t/σ and applying the same bound to −F (the symmetry argument), we obtain

P(| max_i |f_{r_i} + g(r_i)| − E(max_i |f_{r_i} + g(r_i)|) | > t) ≤ 2 exp(−t²/(2σ²)).

The right-hand side does not depend on the choice of r_1, ..., r_n, hence letting n → ∞, inequality (11) is obtained.
Using Lemma 2.2 we obtain the following theorem.

Theorem 2.3. The concentration around the mean C_{F,G}(t), defined in (10), is exponentially bounded:

C_{F,G}(t) ≤ 2 exp(−t²/(2σ²)). (13)

Proof. Subtracting and adding E_G(χ(n, r)) in (10) yields

∆_n = sup_{r∈[0,T]} |g_r + h(r)|,

where the notation g_r = n^{−1/2}(χ(C_r(X)) − E_G(χ(n, r))) and h(r) = n^{−1/2}(E_G(χ(n, r)) − E_F(χ(n, r))) was introduced. Note that by (5), applied for the distribution G, g_r converges to a centred Gaussian process, whereas h(r) is a deterministic function. Therefore, using the same approximation argument as in (6), Lemma 2.2 yields the bound (13).
Fig. 5 The probability density functions of ∆_n under H_0 and under H_1. The area of the shaded blue region is the probability of a type II error occurring. As n → ∞, it goes to zero.
The rate of the type I error is controlled by the significance level α. An asymptotic upper bound for the type II error is given by the following theorem.

Theorem 2.4. For fixed α, the probability of a type II error goes to 0 exponentially as n → ∞.
Proof. We will use the threshold t_α defined in (4) and the concentration inequality of Theorem 2.3. The idea is illustrated in Figure 5. Introduce t_n = E(∆_n) − t_α. Due to equation (9), the first term above is Ω(√n), so t_n is positive for sufficiently large n. Hence we can estimate

P(type II error) = P(∆_n ≤ t_α) = P(∆_n − E(∆_n) ≤ −t_n) ≤ C_{F,G}(t_n) ≤ 2 exp(−t_n²/(2σ²)),

which tends to 0 exponentially as n → ∞.

Properties of the TopoTests
TopoTests rely on the Euler characteristic curve, which is computed based on the Alpha complex of the input sample. The Alpha complex captures the distance patterns between all data points in the sample. Therefore, TopoTest is not capable of discriminating distributions that are isometry equivalent, i.e. differ only by a translation, reflection or rotation. As a consequence, TopoTest, contrary to Kolmogorov-Smirnov, is not able to distinguish, for example, a bivariate normal distribution N((0, 0), I_2) from its translate with a shifted mean vector and the same identity covariance matrix. The same discussion also applies to the null hypothesis. Hence, such pairs of distributions were excluded from the forthcoming numerical study.

Non-compactly supported distributions
The results on the asymptotic convergence presented in Section 2.3 hold for compactly supported distributions. However, most distributions considered in practice, starting with normal distributions, have non-compact support, and the presented results do not apply to them directly. There are a number of ways to adjust such a distribution so that the presented methodology applies. In what follows we discuss three possible strategies, starting from the one we consider the most practical.

1. Restricting a distribution to a compact subset. In this case, the given distribution is restricted to a compact rectangle. In our case we choose a symmetric rectangle [−a, a]^d, with a being the maximal representable double-precision number. This ensures that every sample that can be analyzed on a computer automatically comes from such a restricted distribution. We note that, formally, such a restricted distribution needs to be rescaled to become a probability distribution. However, in all practically relevant cases we are aware of, the restricted distribution will be infinitesimally close, on its domain, to the original one defined on the unbounded domain. Therefore, we argue that in practice the presented methods can be applied even to distributions with non-compact support. Additionally, the simulations performed provide strong evidence for this claim.

2. Rescaling a distribution to a compact subset.
Here a transformation arctan(γx) : R → (−π/2, π/2) is applied separately to each coordinate to map the unbounded domain to a compact region.
We observe that for x ∈ [−2, 2], or for any similar interval centered around zero, arctan(x) is close to a linear function; hence the distances between points before and after applying the map should be roughly proportional to each other, regardless of the points. To keep such a distortion of the distances under control, the scaling parameter γ is used. For instance, we may choose it so that 10 standard deviations in our data, after being multiplied by γ, fall in the interval [−2, 2]. For multivariate distributions the scaling can be applied separately in each dimension. Such a rescaling does not have any major impact on the powers of the tests, as discussed in Sections 4 and 5. At the same time, it allows one to map any unbounded distribution to a compact domain. One should note, however, that the density of a distribution transformed by arctan may be, in some pathological cases, unbounded. Hence, before using this transformation, the boundedness of the output density needs to be verified.
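A minimal sketch of this coordinate-wise rescaling follows; the function name and the particular choice of γ are our own illustrative assumptions.

```python
from math import atan, pi

def arctan_rescale(sample, gamma):
    """Map an unbounded sample into (-pi/2, pi/2)^d coordinate-wise via
    x -> arctan(gamma * x). gamma should be tuned so that the bulk of
    the data lands in the near-linear region of arctan (see text)."""
    return [tuple(atan(gamma * c) for c in x) for x in sample]

u = arctan_rescale([(0.0, 1e9)], gamma=1.0)
# 0.0 is fixed; very large coordinates are pushed toward pi/2
```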
3. Transforming into a copula. The marginals F_1, ..., F_d of the distribution F are continuous, hence one can apply the probability integral transform [36] to each component of the random vector X sampled from the distribution F. Then the random vector

(U_1, ..., U_d) = (F_1(X_1), ..., F_d(X_d)) (14)

is supported on the unit cube [0, 1]^d and has uniformly distributed marginals. The joint distribution function of (U_1, ..., U_d) forms a copula. Since the null distribution F is given, the marginal distributions F_1, ..., F_d can be derived. The transformation (14) must be applied to both the sample and the null distribution F. Transformation (14) preserves the correlation structure and transforms the initial distribution F onto a compact support fulfilling Assumption 1. Although this transformation is easy to compute and quite general, simulation studies showed that the power of the resulting test is significantly reduced.
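The transform (14) is straightforward to apply once the marginal CDFs of the null distribution are available; below is a sketch for standard normal marginals, with `to_copula` a hypothetical helper name.

```python
from math import erf, sqrt

def normal_cdf(x):
    # CDF of the standard normal distribution.
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def to_copula(sample, marginal_cdfs):
    """Probability integral transform of equation (14): apply the known
    marginal CDF F_i of the null distribution to coordinate i, mapping
    the sample into [0,1]^d while preserving the dependence structure."""
    return [tuple(F(c) for F, c in zip(marginal_cdfs, x)) for x in sample]

v = to_copula([(0.0, 0.0)], [normal_cdf, normal_cdf])
# each standard-normal coordinate at 0 maps to 0.5
```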

One-sample test
The test statistic for the one-sample TopoTest, ∆_n defined in (3), involves E_F(χ(n, r)), the ECC expected under H_0. There is no closed-form formula for computing E_F(χ(n, r)) for an arbitrary distribution function F in arbitrary dimension d, although some formulas are available in the case of the multivariate uniform distribution [24]. However, one can approximate E_F(χ(n, r)) by an average ECC computed from a collection of randomly generated samples. Notice that χ(C_r(X)) can take only finitely many values because the underlying sample is finite. Therefore, E_F(χ(n, r)) is finite. The strong law of large numbers applies and we can approximate this expectation empirically: let Y_1, ..., Y_M be i.i.d. samples, each consisting of n points drawn i.i.d. from F; then

Ê_F(χ(n, r)) = (1/M) Σ_{i=1}^{M} χ(C_r(Y_i)) → E_F(χ(n, r)) almost surely as M → ∞. (15)

Due to the continuous mapping theorem, the above point-wise convergence result allows us to use the empirical estimate Ê_F(χ(n, r)) instead of E_F(χ(n, r)) when computing the statistic ∆_n in practice, leading to the statistic that was actually used in the simulations. It should be mentioned that the estimator Ê_F(χ(n, r)) does not depend on the sample being tested, and by increasing M it can be made arbitrarily close to E_F(χ(n, r)).
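The Monte Carlo estimate (15) can be sketched as follows. The names are our own, the sampler interface is a hypothetical assumption, and the brute-force Vietoris-Rips ECC stands in for the Alpha-complex ECC used in the actual implementation.

```python
import random
from itertools import combinations
from math import dist

def chi_rips(points, r, max_dim=2):
    # Euler characteristic of the Vietoris-Rips complex at scale r
    # (stand-in for the Cech/Alpha complex; tiny samples only).
    chi = 0
    for k in range(1, max_dim + 2):
        for s in combinations(range(len(points)), k):
            if all(dist(points[i], points[j]) <= 2 * r
                   for i, j in combinations(s, 2)):
                chi += (-1) ** (k - 1)
    return chi

def expected_ecc(sampler, n, radii, M=100, seed=0):
    """Monte Carlo estimate of E_F(chi(n, r)) as in (15): average the
    ECC over M independent n-point samples drawn from F. `sampler` is a
    callable returning one point given an RNG."""
    rng = random.Random(seed)
    acc = [0.0] * len(radii)
    for _ in range(M):
        pts = [sampler(rng) for _ in range(n)]
        for i, r in enumerate(radii):
            acc[i] += chi_rips(pts, r)
    return [a / M for a in acc]

# Uniform distribution on the unit square as the null distribution F.
uniform2d = lambda rng: (rng.random(), rng.random())
curve = expected_ecc(uniform2d, n=5, radii=[0.0, 0.2], M=20)
# at r = 0 the complex is n isolated vertices, so the estimate is exactly n
```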
The algorithm for computing the TopoTest for one sample can be divided into two steps. First, in the preparation step, an average ECC for the given null distribution F is computed. Then the critical value of the test statistic is estimated empirically by drawing a set of random samples from F and computing the distance between the ECCs corresponding to those samples and the average ECC computed previously. Second, in the testing step, the distance of the ECC of the given sample to the average ECC for the considered distribution is computed and compared to the critical value obtained in the first step. This procedure is provided in detail by Algorithm 1.

Remark. The preparation step in Algorithm 1 depends only on the sample size n and the null distribution F, but is independent of the actual sample X. Hence it needs to be performed only once if several data samples of size n are considered.

Remark. The threshold value t_α used in the TopoTest is obtained from a numerical Monte Carlo simulation performed for a family of finite samples of size n and does not explicitly employ the asymptotic bounds from Section 2.

Remark. The Monte Carlo parameters M and m should be sufficiently large to obtain an accurate resulting test. For the distributions considered in this paper, values M = m = 1000 were selected.

Remark. The need to utilize the Monte Carlo approach to determine the threshold value t_α stems from the fact that the distribution of the test statistic (3) depends on the distribution F and on the size of the samples for which the TopoTest was built. In general, this distribution is unknown. The simulations showed that employing an asymptotic distribution, approximated numerically by using a large sample size n in the preparation step, produced incorrect empirical significance levels for samples much smaller than n.
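The two steps above can be sketched as follows. Algorithm 1 itself is not reproduced here, so the names, the grid of radii, and the use of a brute-force Vietoris-Rips ECC in place of the Alpha-complex ECC are all our own illustrative simplifications.

```python
import random
from itertools import combinations
from math import dist

def ecc(points, radii, max_dim=2):
    # ECC via brute-force Vietoris-Rips (illustration only).
    out = []
    for r in radii:
        chi = 0
        for k in range(1, max_dim + 2):
            for s in combinations(range(len(points)), k):
                if all(dist(points[i], points[j]) <= 2 * r
                       for i, j in combinations(s, 2)):
                    chi += (-1) ** (k - 1)
        out.append(chi)
    return out

def one_sample_topotest(X, sampler, radii, alpha=0.05, M=50, m=50, seed=0):
    """Preparation: estimate the expected ECC (M samples from F) and the
    empirical (1 - alpha) quantile t_alpha of the statistic under H_0
    (m further samples). Testing: compare the tested sample's distance
    to the average ECC against t_alpha; True means 'reject H_0'."""
    rng = random.Random(seed)
    n = len(X)
    # Preparation, part 1: average ECC over M samples from F.
    curves = [ecc([sampler(rng) for _ in range(n)], radii) for _ in range(M)]
    avg = [sum(c[i] for c in curves) / M for i in range(len(radii))]
    stat = lambda c: max(abs(a - b) for a, b in zip(c, avg)) / n ** 0.5
    # Preparation, part 2: empirical threshold from m fresh samples.
    dists = sorted(stat(ecc([sampler(rng) for _ in range(n)], radii))
                   for _ in range(m))
    t_alpha = dists[int((1 - alpha) * m) - 1]
    # Testing step.
    return stat(ecc(X, radii)) > t_alpha
```

As noted in the remarks, the preparation part depends only on n and F, so in a real implementation `avg` and `t_alpha` would be computed once and reused across tested samples.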
Example. Consider two samples X and Y, consisting of the 50 black and the 50 red points, respectively, shown in the inset of Figure 6. We look at the two samples separately; for each of them we perform the one-sample test against the uniform distribution. That is, we want to test, at significance level α = 0.05, whether they follow (up to an isometry of R^2) the uniform distribution. The ECC of X is shown in black and the one of Y in red in Figure 6. The green curve represents the expected ECC under the null hypothesis, estimated via M = 1000 Monte Carlo iterations using (15). We find the test statistic (16).

Example 3.1. Observe that for each t > 0, the super-level sets of the densities of F and G have the same Lebesgue measure. Hence by Lemma 5.1 of [32], the ECCs of F and G in the thermodynamic limit follow the same distribution. The limiting ECCs for F and G are shown in Figure 7. Note that the distributions F and G are not isometry-equivalent, and yet the corresponding ECCs are the same, as the distributions are β-equivalent, hence also Euler equivalent. F and G therefore form an example of distributions that are indistinguishable by TopoTest. Indeed, the power of the one-sample Kolmogorov-Smirnov test, when F is used as the null distribution and 50-element samples are drawn from G, is 0.91, but only 0.05, i.e. α, for the TopoTest.

Two-sample test
In Section 2.2 a related approach to the two-sample problem was presented. This idea is formalized in Algorithm 2, while a particular realization is presented in the examples below.
Example. Let us begin with a situation in which the null hypothesis is not rejected; the samples are shown in the inset of Figure 8. We compute the supremum distance between the normalized ECCs to be D = 0.227, as illustrated in Figure 8. Using K = 1000 Monte Carlo iterations, we find that a distance between ECCs at least as extreme as D occurs roughly 73% of the time. We conclude that there is no evidence to reject the null hypothesis at significance level α = 0.05.
Now let us turn to an example in which the null hypothesis is rejected.
Example. In Figure 9, we sampled X as 30 points from the bivariate uniform distribution on the unit square, U(0, 1) 2 , whereas Y consists of 50 points sampled from β(3, 3) × U(0, 1). We compute the distance between the corresponding normalized ECCs to be D = 0.453. In K = 1000 Monte Carlo iterations, a distance between ECCs at least as extreme as D never occurred; hence, at α = 0.05, this provides evidence to reject the null hypothesis.
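The Monte Carlo step of the two-sample examples can be sketched as a pooling/permutation scheme. We assume Algorithm 2 follows this general shape; the authors' exact resampling may differ, and `two_sample_topotest` and `ecc_fn` are hypothetical names.

```python
import numpy as np

def two_sample_topotest(X, Y, radii, ecc_fn, K=1000, rng=None):
    """Permutation-style sketch of the two-sample ECC test.

    The observed statistic is the sup distance between the (normalized)
    ECCs of X and Y; its null distribution is approximated by repeatedly
    pooling the samples and re-splitting them at random.
    """
    rng = np.random.default_rng(rng)
    d_obs = np.max(np.abs(ecc_fn(X, radii) - ecc_fn(Y, radii)))
    pool = np.concatenate([X, Y])
    m = len(X)
    count = 0
    for _ in range(K):
        idx = rng.permutation(len(pool))
        d = np.max(np.abs(ecc_fn(pool[idx[:m]], radii)
                          - ecc_fn(pool[idx[m:]], radii)))
        count += d >= d_obs
    return d_obs, float(count / K)  # statistic and Monte Carlo p-value
```

In the first example above, a distance at least as extreme as D occurred in roughly 73% of iterations (p ≈ 0.73, no rejection); in the second, it never occurred, yielding rejection at α = 0.05.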

Numerical Experiments, one-sample problem
In this study, Monte Carlo simulations were used to evaluate the power of TopoTests and compare it with the power of the corresponding Kolmogorov-Smirnov tests. In the case of univariate distributions, the Cramér-von Mises test was considered as well for completeness. To obtain more detailed insight into the performance of TopoTests, samples of various sizes, ranging from n = 30 up to n = 1000, were examined. In the following subsections three types of experiments are presented:
1. Fixing the null distribution to be standard normal and testing samples drawn from a wide variety of alternative distributions with different parameters: Laplace, uniform, Student's t, Cauchy, logistic, and mixtures of Gaussians. This set of experiments allowed us to assess how well TopoTests recognize the standard normal distribution.
2. Fixing a family of distributions and treating each of them in turn as the null distribution, while all others are considered as alternatives. For each such pair of distributions, the empirical power of the test, i.e. 1 minus the probability of a type II error, was computed using Monte Carlo methods. The results are visualized in the form of heat maps.
3. For various dimensions, examining the relation between the power of the test and the number n of data points in the sample. As expected, the power of the test increases monotonically with the sample size.
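The empirical power reported throughout this section amounts to a simple Monte Carlo loop; a minimal sketch follows (hypothetical names, not the authors' code): `test_fn` is any level-α test returning True when the null is rejected, and `alt_sampler` draws samples from the alternative distribution.

```python
import numpy as np

def empirical_power(test_fn, alt_sampler, n, K=1000, rng=None):
    """Estimate power (1 - type II error) by Monte Carlo:
    draw K samples of size n from the alternative and count rejections."""
    rng = np.random.default_rng(rng)
    rejections = sum(bool(test_fn(alt_sampler(n, rng))) for _ in range(K))
    return rejections / K
```

Running the same loop with samples drawn from the null distribution instead estimates the empirical significance level, which should be close to the nominal α.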
In this section, both simulations satisfying Assumption 1 and those that do not satisfy it (for instance, multivariate normal) were considered. To theoretically underpin this approach, several ideas were suggested in Section 2.5. In practice, the fact that Assumption 1 was not satisfied in some cases did not affect the test powers.
Remark. In this section we benchmark the TopoTest by comparing its power with that of the Kolmogorov-Smirnov test, i.e. the probability that the test correctly rejects the null hypothesis when the alternative distribution differs from the null distribution. Since TopoTests cannot distinguish between different but Euler-equivalent distributions, which the Kolmogorov-Smirnov test can distinguish, the setting under which they operate (2) differs from the Kolmogorov-Smirnov setting (1), and hence the reported power of the TopoTest might be overestimated. To mitigate this effect, a wide collection of distributions was considered.

Compactly supported distributions
As a first example, a collection of distributions supported on the three-dimensional unit cube [0, 1] 3 was considered. The collection consisted of a number of three-fold Cartesian products of independent beta, cosine (rescaled to fit the unit interval), and uniform univariate distributions. In such a setup Assumption 1 is fulfilled and the developed theory can be applied straightforwardly. In Figure 10 the power of the TopoTest is compared with the power of the Kolmogorov-Smirnov test for this collection of trivariate distributions on a compact domain. Several sample sizes were considered, but only results obtained for n = 100 are reported here, as similar conclusions can be drawn for other values of n. The TopoTest provided higher power for the vast majority of the considered pairs of null and alternative distributions, resulting in an average power, at significance level α = 0.05, of 0.82 for the TopoTest versus 0.73 for Kolmogorov-Smirnov. In fact, for the collection of distributions considered in Figure 10, in only one out of 72 comparisons was the power of the Kolmogorov-Smirnov test higher than that of the TopoTest, and the difference was slim (0.07 vs. 0.08).
Table 1 provides the empirical power of the TopoTest, assessed based on K = 5000 Monte Carlo simulations, in distinguishing the standard normal distribution N (0, 1) from a number of alternative distributions at significance level α = 0.05.
As can be observed in Table 1, the TopoTest outperformed the Kolmogorov-Smirnov test in distinguishing the standard normal distribution from a normal distribution with variance different from 1, regardless of the sample size. The power of the TopoTest is also greater when the alternative is Student's t-distribution: the difference compared to the Kolmogorov-Smirnov test was particularly pronounced when the number of degrees of freedom ν was small. When ν was 10 or more, the power of both tests was much lower, as expected, but the TopoTest still outperformed the Kolmogorov-Smirnov test. Similar conclusions can be drawn for heavier-tailed alternatives such as the Cauchy, Laplace, or logistic distributions: the empirical probability of a type II error was always lower for the TopoTest than for its Kolmogorov-Smirnov counterpart. On the other hand, when Gaussian mixtures were considered, it was the Kolmogorov-Smirnov test that performed better, regardless of the value of the mixing coefficient p.

Two and three dimensional unbounded distributions
In Table 2, results for a collection of bivariate distributions are shown; the parameter a of the multivariate normal distribution MG(a) varies from 0 to 1 to reflect increasing correlation of the components.
Similarly to the univariate case, TopoTests provided lower type II errors when the alternative distributions were products involving a Student's t-distribution. This conclusion also holds when one of the marginal distributions was N (0, 1) and the second was a Student's t-distribution. A similar result is true for bivariate distributions that are Cartesian products involving the logistic or Laplace distribution. We note that the TopoTest usually provided higher efficiency in the case of Gaussian mixtures. On the other hand, the TopoTest is significantly weaker than Kolmogorov-Smirnov for correlated multivariate normal distributions MG. All of these conclusions extend to three-dimensional distributions, as indicated by the results in Table 3.
The last rows of Tables 1, 2 and 3 show the average powers of the TopoTest and the Kolmogorov-Smirnov test for the considered sets of alternative distributions. The average power of the TopoTest is greater than that of the Kolmogorov-Smirnov test for all studied sample sizes.

All-to-all tests
The results presented in Tables 1, 2 and 3 focused on the ability to discriminate the standard normal distribution from a set of other distributions. However, in the TopoTest one can choose arbitrary continuous distributions as null and alternative. Hence, below we present power matrices in which all possible pairs of null and alternative distributions formed from the previous set were considered; the results are shown in Figures 11, 12 and 13. For easier evaluation of the effectiveness of the TopoTest in comparison with Kolmogorov-Smirnov, the difference in power is shown in the figures. Blue regions correspond to combinations of null and alternative distributions for which the TopoTest yielded higher power, while red regions mark the combinations for which the TopoTest was outperformed by Kolmogorov-Smirnov. White stands for combinations for which both tests performed similarly.
The analysis was also conducted for dimension d = 5, as can be seen in Figure 14. For d > 3 the Kolmogorov-Smirnov test was not performed due to excessive computation time; hence only results for the TopoTest are presented, as this method provided feasible computational complexity.
As can be seen, the TopoTest remained sensitive enough to differentiate between the multivariate normal distribution and Cartesian products involving Student's t-distribution and the standard normal as marginals, especially given that the considered sample sizes are low for such high-dimensional spaces.

Dependence of the test power on sample size
The dependence of the power of the TopoTest and Kolmogorov-Smirnov tests on the sample size n is shown in Figure 15 for random samples in dimensions d = 1, 2, 3. To compute the average power, all combinations of null and alternative distributions considered in Figures 11, 12 and 13 were taken into account, except those with the alternative equal to the null distribution. In all cases, the average power increased with the sample size, as expected. In the case of univariate distributions (leftmost panel in Figure 15), results obtained using the Cramér-von Mises test are shown as well. The values in Figure 15 should not be directly compared across different dimensions, as the actual value depends on the list of considered distributions, which is different for each dimension.

Discussion
Using Euler characteristic curves, we introduced a new framework for goodness-of-fit testing in arbitrary dimensions, together with a theoretical justification of the method. Although the distribution of the test statistic is unknown for finite n and, in contrast to the Kolmogorov-Smirnov test, depends on F, the asymptotic distribution is given by (5), while Theorem 2.4 provides an upper bound on the type II error.
A simulation study was conducted to assess the power of the TopoTest in comparison with the Kolmogorov-Smirnov test. Both one- and two-sample settings were considered, and in both the TopoTest in many cases yielded better performance than Kolmogorov-Smirnov. It should, however, be highlighted that the Kolmogorov-Smirnov test and TopoTests operate in slightly different frameworks: the former is capable of distinguishing between distributions that differ, e.g., in a location parameter, while TopoTests are insensitive to distribution shifts, rotations, and reflections, as described in Section 2.4.

Fig. 3
Fig. 3 The ECC of the sample from Figure 1a.The filtration values (a)-(d) correspond to the complexes in Figure 1a-1d.

Fig. 7 Expected ECCs of distributions F and G for n = 50.The inset shows the corresponding densities f and g
Results for the collection of bivariate distributions are shown in Table 2. MG(a) denotes a multivariate normal distribution with a non-diagonal covariance matrix, with unit variances and off-diagonal entries equal to a.

Fig. 10
Fig. 10 Average power of the TopoTest (left panel) and the Kolmogorov-Smirnov test for selected trivariate distributions with compact support on [0, 1] 3 . The average power, at significance level α = 0.05, is estimated based on K = 1000 Monte Carlo realizations for sample size n = 100.

Fig. 11
Fig. 11 Comparison of the power of the TopoTest and Kolmogorov-Smirnov one-sample tests for univariate probability distributions. Each matrix element gives the difference between the power of the TopoTest and that of the Kolmogorov-Smirnov test, estimated based on K = 1000 Monte Carlo realizations. The left and right panels show test powers for sample sizes n = 100 and n = 250, respectively. The average power (excluding diagonal elements) of the TopoTest is 0.722 (0.832), versus 0.634 (0.794) for Kolmogorov-Smirnov, for n = 100 (n = 250).

Fig. 14 Average power of the TopoTest for five-dimensional distributions, for sample sizes n = 250 and n = 500. Results are based on K = 1000 Monte Carlo realizations.

Fig. 15
Fig. 15 Average power of the TopoTest (black curve) and Kolmogorov-Smirnov (red curve) as a function of sample size n for dimensions d = 1, 2, 3. In the case of d = 1, the average power of the Cramér-von Mises test (green curve) is shown as well. To guide the eye, the data points are connected by lines.

Fig. 16
Fig. 16 Difference in average power of the two-sample TopoTest and two-sample Kolmogorov-Smirnov tests for univariate (left panel) and bivariate (right panel) distributions. In both cases the sample sizes were n = 100, and K = 500 Monte Carlo realizations were performed to estimate the average power. The average power of the TopoTest is 0.643 (0.537), while for Kolmogorov-Smirnov it is 0.453 (0.437), in d = 1 (d = 2).
1] as those distributions are equivalent up to translation and rotation. As a consequence, the alternative hypotheses in the Kolmogorov-Smirnov test and the TopoTest are in fact slightly different: in the former we have H 1 : G ̸ = F, while in the latter the inequality is understood only up to Euler equivalence, cf. Equation (2).
The distance computed between χ(A r (X)) and the average curve is ∆ n = 0.612. Comparing this with the computed threshold t α = 1.318, we conclude that there is no evidence to reject the null hypothesis; the p-value is 0.916. In contrast, the test statistic computed for χ(A r (Y )) is much larger and equals ∆ n = 2.267. Again using α = 0.05, the test provides evidence to reject the null hypothesis, with a p-value computed to be 0.00. And indeed, we generated X from the bivariate uniform distribution (i.e. the null distribution), whereas Y was sampled from β(3, 3) × β(3, 3), i.e. the Cartesian product of two independent univariate β(3, 3) distributions.
Example. Consider the distributions F and G with densities

Table 1
Empirical powers of the one-sample TopoTest for different alternative distributions and sample sizes n; the null distribution was the standard normal N (0, 1). The corresponding powers of the Kolmogorov-Smirnov test are given in parentheses for comparison, with the higher result given in bold. Results are for the significance level α = 0.05. Empirical powers were estimated based on K = 5000 Monte Carlo simulations.