Polynomial-Sized Topological Approximations Using The Permutahedron

Classical methods to model topological properties of point clouds, such as the Vietoris-Rips complex, suffer from the combinatorial explosion of complex sizes. We propose a novel technique to approximate a multi-scale filtration of the Rips complex with improved bounds for size: precisely, for $n$ points in $\mathbb{R}^d$, we obtain a $O(d)$-approximation with at most $n2^{O(d \log k)}$ simplices of dimension $k$ or lower. In conjunction with dimension reduction techniques, our approach yields a $O(\mathrm{polylog} (n))$-approximation of size $n^{O(1)}$ for Rips filtrations on arbitrary metric spaces. This result stems from high-dimensional lattice geometry and exploits properties of the permutahedral lattice, a well-studied structure in discrete geometry. Building on the same geometric concept, we also present a lower bound result on the size of an approximate filtration: we construct a point set for which every $(1+\epsilon)$-approximation of the \v{C}ech filtration has to contain $n^{\Omega(\log\log n)}$ features, provided that $\epsilon<\frac{1}{\log^{1+c} n}$ for $c\in(0,1)$.


Introduction
Motivation and previous work. Topological data analysis aims at finding and reasoning about the underlying topological features of metric spaces. The idea is to represent a data set by a set of discrete structures on a range of scales and to track the evolution of homological features as the scale varies. The theory of persistent homology provides a topological summary, called the persistence diagram, which records the lifetimes of topological features in the data as the scale under consideration increases monotonically. A major step in the computation of this topological signature is the construction of a filtration, that is, a multi-scale representation of a given data set.
For data in the form of finite point clouds, two frequently used constructions are the (Vietoris-)Rips complex $\mathcal{R}_\alpha$ and the Čech complex $\mathcal{C}_\alpha$, which are defined with respect to a scale parameter $\alpha \geq 0$. Both are simplicial complexes capturing the proximity of points at scale $\alpha$, with different levels of accuracy. Increasing $\alpha$ from $0$ to $\infty$ yields a nested sequence of simplicial complexes called a filtration.
Unfortunately, Rips and Čech filtrations can be uncomfortably large to handle. For homological features in low dimensions, it suffices to consider the $k$-skeleton of the complex, that is, all simplices of dimension at most $k$. Still, the $k$-skeleton of Rips and Čech complexes can be as large as $n^{k+1}$ for $n$ points, which is already impractical for small $k$ when $n$ is large. One remedy is to construct an approximate filtration, that is, a filtration that yields a similar topological signature as the original filtration but is significantly smaller in size. The notion of "similarity" in this context can be made formal through a distance measure on persistence diagrams. The most frequently used similarity measure is the bottleneck distance, which finds correspondences between topological features of two filtrations, such that the lifetimes of each pair of matched features are as close as possible. A related notion is the log-scale bottleneck distance, which allows a larger discrepancy for larger scales and thus can be seen as a relative approximation, with the usual bottleneck distance as its absolute counterpart. We call an approximate filtration a $c$-approximation of the original if their persistence diagrams have log-scale bottleneck distance at most $c$.
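To get a feeling for the $n^{k+1}$ growth mentioned above, the following minimal sketch counts the simplices of dimension at most $k$ in the worst case where every subset of the $n$ points spans a simplex. The function name `full_skeleton_size` is an illustrative choice, not notation from the paper.

```python
from math import comb


def full_skeleton_size(n: int, k: int) -> int:
    """Number of simplices of dimension <= k on n vertices
    when every subset of at most k+1 vertices is a simplex."""
    # An i-simplex has i+1 vertices, so there are C(n, i+1) of them.
    return sum(comb(n, i + 1) for i in range(k + 1))


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, full_skeleton_size(1000, k))
```

Already for 1000 points the 2-skeleton contains more than $10^8$ simplices in this worst case, which illustrates why approximate filtrations are of interest.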
Sheehy [23] gave the first such approximate filtration for Rips complexes with a formal guarantee. For $0 < \varepsilon \leq 1/3$, he constructs a $(1+\varepsilon)$-approximate filtration of the Rips filtration. The size of its $k$-skeleton is only $n(\frac{1}{\varepsilon})^{O(\Delta k)}$, where $\Delta$ is the doubling dimension of the metric. Since then, several alternative techniques have been explored for Rips [11] and Čech complexes [5,9,19], all arriving at the same complexity bound.
While the above approaches work well for instances where $\Delta$ and $k$ are small, we focus on high-dimensional point sets. This has two reasons: first, one might simply want to analyze data sets for which the intrinsic dimension is high, but the existing methods do not succeed in reducing the complex size sufficiently. Second, even for medium-size dimensions, one might not want to restrict one's scope to the low-homology features, so that $k = \Delta$ is not an unreasonable parameter choice. To adapt the aforementioned schemes to high-dimensional point clouds, it makes sense to use dimension reduction results to eliminate the dependence on $\Delta$. Indeed, it has been shown, in analogy to the famous Johnson-Lindenstrauss Lemma [16], that an orthogonal projection to an $O(\log n/\varepsilon^2)$-dimensional subspace yields another $(1+\varepsilon)$-approximate filtration [18,24]. Combining these two approximation schemes, however, yields an approximation of size $O(n^{k+1})$ (ignoring $\varepsilon$-factors) and does not improve upon the exact case.
Our contributions. We present two results about the approximation of Rips and Čech filtrations. First, we give a scheme for approximating the Rips filtration with smaller complex size than existing approaches, at the price of guaranteeing only an approximation quality of $\mathrm{polylog}(n)$. Since Rips and Čech filtrations approximate each other by a constant factor, our result also extends to the Čech filtration, with an additional constant factor in the approximation quality. Second, we prove that any approximation scheme for the Čech filtration has superpolynomial size in $n$ if high accuracy is required. For this result, our proof technique does not extend to Rips complexes. In more detail, our results are as follows:
Upper bound: We present a $6(d+1)$-approximation of the Rips filtration for $n$ points in $\mathbb{R}^d$ whose $k$-skeleton has a size of $n2^{O(d \log k)}$ on each scale. This shows that by using a coarser approximation, we can achieve asymptotic improvements on the complex size. The real power of our approach reveals itself in high dimensions, in combination with dimension reduction techniques. In conjunction with the lemma of Johnson and Lindenstrauss [16], we obtain an $O(\log n)$-approximation with size $n^{O(\log k)}$ at any scale, which is much smaller than the original filtration; however, for the complete case $k = \log n$, the bound is still superpolynomial in $n$. Combined with a different dimension reduction result of Matoušek [20], we obtain an $O(\log^{3/2} n)$-approximation of size $n^{O(1)}$. This is the first polynomial bound in $n$ of an approximate filtration, independent of the dimensionality of the point set. For inputs from arbitrary metric spaces (instead of points in $\mathbb{R}^d$), the same results hold with an additional $O(\log n)$ factor in the approximation quality.
Our approximations are discrete, and the number of scales that have to be considered is determined by the logarithm of the spread of the point set (the ratio of diameter and closest point distance). In this work, we tacitly assume the spread to be constant, and concentrate on the complex size on a fixed scale as our quality measurement.
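The dependence on the spread can be made concrete with a small sketch. It computes the spread of a point set and the number of scales needed when consecutive scales differ by a fixed multiplicative factor. The parameter `growth` is an assumption of this sketch (for the scheme of this paper one would plug in the factor by which consecutive scales of the discrete filtration differ); the function names are illustrative.

```python
from itertools import combinations
from math import ceil, dist, log


def spread(points):
    """Ratio of the diameter to the closest-pair distance."""
    distances = [dist(p, q) for p, q in combinations(points, 2)]
    return max(distances) / min(distances)


def number_of_scales(points, growth):
    """Scales needed to cover the range from the closest-pair distance
    to the diameter when consecutive scales differ by the factor `growth`."""
    return ceil(log(spread(points)) / log(growth))
```

For a point set of constant spread, this count is a constant, which is why the text concentrates on the complex size at a single scale.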
Lower bound: We construct a point set of $n$ points in $d = \Theta(\log n)$ dimensions whose Čech filtration has $n^{\Omega(\log \log n)}$ persistent features with "relatively long" lifetime. Precisely, that means that any $(1+\delta)$-approximation has to contain a bar of non-zero length for each of those features if $\delta < O(\frac{1}{\log^{1+c} n})$ with $c \in (0,1)$. This shows that it is impossible to define an approximation scheme that yields an accurate approximation of the Čech complexes as well as polynomial size in $n$.
Methods: Our results follow from a link to lattice geometry: the $A^*$-lattice is a configuration of points in $\mathbb{R}^d$ which realizes the thinnest known coverings for low dimensions [10]. The dual Voronoi polytope of a lattice point is the permutahedron, whose vertices are obtained by all coordinate permutations of a fixed point in $\mathbb{R}^d$.
Our technique resembles the perhaps simplest approximation scheme for point sets: if we digitize $\mathbb{R}^d$ with $d$-dimensional pixels, we can take the union of pixels that contain input points as our approximation. Our approach does the same, except that we use a tessellation of permutahedra for digitization. In $\mathbb{R}^2$, our approach corresponds to the common approach of replacing the square tiling by a hexagonal tiling. We exploit that the permutahedral tessellation is in generic position, that is, no more than $d+1$ polytopes have a common intersection. At the same time, permutahedra are still relatively round, that is, they have small diameter and non-adjacent polytopes are well-separated. These properties ensure good approximation quality and a small complex. In comparison, a cubical tessellation yields an $O(\sqrt{d})$-approximate Rips filtration, which looks like an improvement over our $O(d)$-approximation, but the highly degenerate configuration of the cubes yields a complex size of $n2^{O(dk)}$, and therefore does not constitute an improvement over Sheehy's approach [23].
For the lower bound, we arrange $n$ points in a way that one center point has the permutahedron as Voronoi polytope, and we consider simplices incident to that center point in a fixed dimension. We show that a superpolynomial number of these simplices create or destroy topological features of non-negligible persistence.
Outline of the paper. We begin by reviewing basics of persistent homology in Section 2. Next, we study several relevant properties of the $A^*$ lattice in Section 3. An approximation algorithm based on concepts from Section 3 is presented in Section 4. In Section 5, we present the lower bound result on the size of Čech filtrations. We conclude in Section 6.

Topological background
We review some topological concepts needed in our argument. More extensive treatments covering most of the material can be found in the textbooks [12,15,21].
Simplicial complexes. For an arbitrary set $V$, called vertices, a simplicial complex over $V$ is a collection of non-empty subsets which is closed under taking non-empty subsets. The elements of a simplicial complex $K$ are called simplices of $K$. A simplex $\sigma$ is a face of $\tau$ if $\sigma \subseteq \tau$. A facet is a face of co-dimension 1. The dimension of $\sigma$ is $k := |\sigma| - 1$; we also call $\sigma$ a $k$-simplex in this case. The $k$-skeleton of $K$ is the collection of all simplices of dimension at most $k$. For instance, the 1-skeleton of $K$ is a graph defined by its 0- and 1-simplices.
We discuss two ways of generating simplicial complexes. In the first one, take a collection $S$ of sets over a common universe (for instance, polytopes in $\mathbb{R}^d$), and define the nerve of $S$ as the simplicial complex whose vertex set is $S$, and a $k$-simplex $\sigma$ is in the nerve if the corresponding $(k+1)$ sets have a non-empty common intersection. The nerve theorem [4] states that if all sets in $S$ are convex subsets of $\mathbb{R}^d$, their nerve is homotopically equivalent to the union of the sets (the statement can be generalized significantly; see [15, Sec. 4.G]). The second construction that we consider are flag complexes: given a graph $G = (V, E)$, we define a simplicial complex $K_G$ over the vertex set $V$ such that a $k$-simplex $\sigma$ is in $K_G$ if for every distinct pair of vertices $v_1, v_2 \in \sigma$, the edge $(v_1, v_2)$ is in $E$. In other words, $K_G$ is the maximal simplicial complex with $G$ as its 1-skeleton. In general, a complex $K$ is called a flag complex if $K = K_G$ with $G$ being the 1-skeleton of $K$.
Given a set of points $P$ in $\mathbb{R}^d$ and a parameter $r$, the Čech complex at scale $r$, $\mathcal{C}_r$, is defined as the nerve of the balls centered at the elements of $P$, each of radius $r$. This is a collection of convex sets. Therefore, the nerve theorem is applicable and it asserts that the nerve agrees homotopically with the union of balls. In the same setup, we can as well consider the intersection graph $G$ of the balls (that is, we have an edge between two points if their distance is at most $2r$). The flag complex of $G$ is called the (Vietoris-)Rips complex at scale $r$, denoted by $\mathcal{R}_r$. The relation $\mathcal{C}_r \subseteq \mathcal{R}_r \subseteq \mathcal{C}_{\sqrt{2}r}$ follows from Jung's Theorem [17].
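The definition of the Rips complex as a flag complex translates directly into code. The following is a minimal sketch, not an optimized implementation: it builds the intersection graph of the balls (an edge whenever two points are at distance at most $2r$) and then enumerates all cliques with at most $k+1$ vertices. The function name `rips_skeleton` is illustrative.

```python
from itertools import combinations
from math import dist


def rips_skeleton(points, r, k):
    """k-skeleton of the Rips complex at scale r, as tuples of point indices.

    Convention from the text: two points span an edge when their
    distance is at most 2r, i.e. their balls of radius r intersect.
    """
    n = len(points)
    adjacent = [[False] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if dist(points[i], points[j]) <= 2 * r:
            adjacent[i][j] = adjacent[j][i] = True

    simplices = [(i,) for i in range(n)]
    frontier = simplices[:]
    for _ in range(k):
        extended = []
        for s in frontier:
            # Extend only by larger indices so each simplex appears once.
            for v in range(s[-1] + 1, n):
                if all(adjacent[u][v] for u in s):
                    extended.append(s + (v,))
        simplices.extend(extended)
        frontier = extended
    return simplices
```

For the three points $(0,0)$, $(1,0)$, $(0,1)$ and $r = 0.75$, all pairwise distances are at most $1.5$, so the triangle is part of the Rips complex; for $r = 0.6$ the two short edges remain but the triangle disappears.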
Persistence modules and simplicial filtrations. A persistence module $(V_\alpha)_{\alpha \in G}$ for a totally ordered index set $G \subseteq \mathbb{R}$ is a sequence of vector spaces with linear maps $F_{\alpha,\alpha'} : V_\alpha \to V_{\alpha'}$ for any $\alpha \leq \alpha'$, satisfying $F_{\alpha,\alpha} = \mathrm{id}$ and $F_{\alpha',\alpha''} \circ F_{\alpha,\alpha'} = F_{\alpha,\alpha''}$. Persistence modules can be decomposed into indecomposable intervals giving rise to a persistent barcode, which is a complete discrete invariant of the corresponding module.
A distance measure between persistence modules is defined through interleavings: we call two modules $(V_\alpha)$ and $(W_\alpha)$ with linear maps $F_{\cdot,\cdot}$ and $G_{\cdot,\cdot}$ additively $\varepsilon$-interleaved if there exist linear maps $\phi : V_\alpha \to W_{\alpha+\varepsilon}$ and $\psi : W_\alpha \to V_{\alpha+\varepsilon}$ such that the maps $\phi$ and $\psi$ commute with $F$ and $G$ (see [8]). We call the modules multiplicatively $c$-interleaved with $c \geq 1$ if there exist linear maps $\phi : V_\alpha \to W_{c\alpha}$ and $\psi : W_\alpha \to V_{c\alpha}$ with the same commuting properties. Equivalently, this means that the modules are additively $(\log c)$-interleaved when switching to a logarithmic scale. In this case, we also call the module $(W_\alpha)$ a $c$-approximation of $(V_\alpha)$ (and vice versa). Note that the case $c = 1$ implies that the two modules give rise to the same persistent barcode, which is usually referred to as the persistence equivalence theorem [12].
The most common way to generate persistence modules is through the homology of sequences of simplicial complexes: a (simplicial) filtration $(K_\alpha)_{\alpha \in G}$ over a totally ordered index set $G \subseteq \mathbb{R}$ is a sequence of simplicial complexes connected by simplicial maps $f_{\alpha,\alpha'} : K_\alpha \to K_{\alpha'}$ for any $\alpha \leq \alpha'$, such that $f_{\alpha,\alpha} = \mathrm{id}$ and $f_{\alpha',\alpha''} \circ f_{\alpha,\alpha'} = f_{\alpha,\alpha''}$. By the functorial properties of homology (using some fixed field $\mathbb{F}$ and some fixed dimension $p \geq 0$), such a filtration gives rise to a persistence module $(H_p(K_\alpha, \mathbb{F}))_{\alpha \in G}$. We call a filtration a $c$-approximation of another filtration if the corresponding persistence modules induced by homology are $c$-approximations of each other.
The standard way of obtaining a filtration is through a nested sequence of simplicial complexes, where the simplicial maps are induced by inclusion. Examples are the Čech filtration $(\mathcal{C}_\alpha)_{\alpha \in \mathbb{R}}$ and the Rips filtration $(\mathcal{R}_\alpha)_{\alpha \in \mathbb{R}}$. By the relations of Rips and Čech complexes from above, the Rips filtration is a $\sqrt{2}$-approximation of the Čech filtration.
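One way to spell out why the inclusions give a $\sqrt{2}$-approximation is sketched below. It only uses the relation between Čech and Rips complexes stated earlier and the definition of a multiplicative interleaving; the names $\phi_\alpha$ and $\psi_\alpha$ are chosen here for illustration.

```latex
% Chain of inclusions for every scale \alpha >= 0:
\mathcal{C}_\alpha \subseteq \mathcal{R}_\alpha
  \subseteq \mathcal{R}_{\sqrt{2}\alpha},
\qquad
\mathcal{R}_\alpha \subseteq \mathcal{C}_{\sqrt{2}\alpha}.
% Applying homology to these inclusions yields linear maps
\phi_\alpha : H_p(\mathcal{C}_\alpha) \to H_p(\mathcal{R}_{\sqrt{2}\alpha}),
\qquad
\psi_\alpha : H_p(\mathcal{R}_\alpha) \to H_p(\mathcal{C}_{\sqrt{2}\alpha}).
% All maps are induced by inclusions, so every diagram built from
% \phi, \psi and the internal maps of the two modules commutes;
% hence the modules are multiplicatively \sqrt{2}-interleaved.
```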
Simplex-wise Čech filtrations and (co-)face distances. In the Čech filtration $(\mathcal{C}_\alpha)$, every simplex has an alpha value $\alpha_\sigma := \min\{\alpha \geq 0 \mid \sigma \in \mathcal{C}_\alpha\}$, which equals the radius of the minimal enclosing ball of its boundary vertices. If the point set $P$ is finite, the Čech filtration consists of a finite number of simplices, and we can define a simplex-wise filtration where exactly one simplex is added from $\mathcal{C}_i$ to $\mathcal{C}_{i+1}$, and where $\sigma$ is added before $\tau$ whenever $\alpha_\sigma < \alpha_\tau$. The filtration is not unique and ties can be broken arbitrarily.
In a simplex-wise filtration, passing from $\mathcal{C}_i$ to $\mathcal{C}_{i+1}$ means adding the $k$-simplex $\sigma := \sigma_{i+1}$. The effect of this addition is that either a $k$-homology class comes into existence, or a $(k-1)$-homology class is destroyed. We call $\sigma$ positive in the first case and negative in the second. In terms of the corresponding persistent barcode, there is exactly one interval associated to $\sigma$, either starting at $i$ (if $\sigma$ is positive) or ending at $i$ (if $\sigma$ is negative). We define the (co-)face distance $L_\sigma$ ($L^*_\sigma$) of $\sigma$ as the minimal distance between $\alpha_\sigma$ and the alpha values of its (co-)facets,
$$L_\sigma := \min_{\tau \text{ facet of } \sigma} (\alpha_\sigma - \alpha_\tau), \qquad L^*_\sigma := \min_{\tau \text{ co-facet of } \sigma} (\alpha_\tau - \alpha_\sigma).$$
Note that $L_\sigma$ and $L^*_\sigma$ can be zero. Nevertheless, they constitute lower bounds for the persistence of the associated barcode intervals. An alternative to our proof is to argue using structural properties of the matrix reduction algorithm for persistent homology [12].
Lemma 1. If $\sigma$ is negative, the barcode interval associated to $\sigma$ has persistence at least $L_\sigma$.
Proof. $\sigma$ kills a $(k-1)$-homology class by assumption, and this class is represented by the cycle $\partial\sigma$. However, this cycle came into existence when the last facet $\tau$ of $\sigma$ was added. Therefore, the lifetime of the cycle destroyed by $\sigma$ is at least $\alpha_\sigma - \alpha_\tau$.
Lemma 2. If $\sigma$ is positive, the homology class created by $\sigma$ has persistence at least $L^*_\sigma$.
Proof. $\sigma$ creates a $k$-homology class; every representative cycle of this class is non-zero for $\sigma$. To turn such a cycle into a boundary, we have to add a $(k+1)$-simplex $\tau$ with $\sigma$ in its boundary (otherwise, any $(k+1)$-chain formed will be zero for $\sigma$). Therefore, the cycle created at $\sigma$ persists for at least $\alpha_\tau - \alpha_\sigma$.
The $A^*$-lattice and the permutahedron
A lattice $L$ in $\mathbb{R}^d$ is the set of all integer-valued linear combinations of $d$ independent vectors, called the basis of the lattice. Note that the origin belongs to every lattice. The Voronoi polytope of a lattice $L$ is the closed set of all points in $\mathbb{R}^d$ for which the origin is among the closest lattice points. Since lattices are invariant under translations, the Voronoi polytopes for other lattice points are just translations of the one at the origin, and these polytopes tile $\mathbb{R}^d$. An elementary example is the integer lattice, spanned by the unit vectors $(e_1, \ldots, e_d)$, whose Voronoi polytope is the unit $d$-cube, shifted by $(-1/2)$ in each coordinate direction.
We are interested in a different lattice, called the $A^*_d$-lattice, whose properties are also well-studied [10]. First, we define the $A_d$ lattice as the set of points $(x_1, \ldots, x_{d+1}) \in \mathbb{Z}^{d+1}$ satisfying $\sum_{i=1}^{d+1} x_i = 0$. While it is defined in $\mathbb{R}^{d+1}$, all points lie on the hyperplane $H$ defined by $\sum_{i=1}^{d+1} y_i = 0$. After a suitable change of basis, we can express $A_d$ by $d$ vectors in $\mathbb{R}^d$; thus, it is indeed a lattice. In low dimensions, $A_2$ is the hexagonal lattice, and $A_3$ is the FCC lattice that realizes the best sphere packing configuration in $\mathbb{R}^3$ [14]. The dual lattice $L^*$ of a lattice $L$ is defined as the set of points $(y_1, \ldots, y_d)$ in $\mathbb{R}^d$ such that $y \cdot x \in \mathbb{Z}$ for all $x \in L$ [10]. Both the integer lattice and the hexagonal lattice are self-dual, while the dual of $A_3$ is the BCC lattice that realizes the thinnest sphere covering configuration among lattices in $\mathbb{R}^3$ [3].
Proof. Two facets are adjacent if they share a common face. By the properties of the permutahedron, this means that the two facets are adjacent if and only if their partitions permit a common refinement, which is only possible if one set is contained in the other.
We have already established that Π has "few" ( 2
Geometry. All vertices of $\Pi$ are equidistant from the origin, and it can be checked with a simple calculation that this distance is $\sqrt{\frac{d(d+2)}{12(d+1)}}$. Using the triangle inequality, we obtain:
The permutahedra centered at all lattice points of $A^*$ define the Voronoi tessellation of $A^*$. Its nerve is the Delaunay triangulation $D$ of $A^*$. An important property of $A^*$ is that, unlike for the integer lattice, $D$ is non-degenerate; this will ultimately ensure small upper bounds for the size of our approximation scheme.
Lemma 6. Each vertex of a permutahedral cell has precisely $d+1$ cells adjacent to it. In other words, the $A^*_d$ lattice points are in general position.
The proof idea is to look at any vertex of the Voronoi cell and argue that it has precisely $(d+1)$ equidistant lattice points. See [2, Thm. 2.5] for a concise argument, or the appendix for a detailed one. As a consequence, we can identify Delaunay simplices incident to the origin with faces of $\Pi$.
Let $V$ denote the set of lattice points that share a Delaunay edge with the origin. The following statement shows that the point set $V$ is in convex position, and the convex hull encloses $\Pi$ with some "safety margin". The proof is a mere calculation, deriving an explicit equation for each hyperplane supporting the convex hull and applying it to all vertices of $V$ and of $\Pi$. The argument is detailed in the appendix.
Lemma 8. For each $d$-simplex attached to the origin, the facet $\tau$ opposite to the origin lies on a hyperplane which is at least a distance $\frac{1}{\sqrt{2}(d+1)}$ to $\Pi$, and all points of $V$ are either on the hyperplane or on the same side as the origin.
Recall the definition of a flag complex as the maximal simplicial complex one can form from a given graph. We next show that $D$ is of this form. While our proof exploits certain properties of $A^*$, we could not exclude the possibility that the Delaunay triangulation of any lattice is a flag complex.
Lemma 10.D is a flag complex.
Proof. The proof is based on two claims: consider two facets $f_1$ and $f_2$ of $\Pi$ that are disjoint, that is, do not share a vertex. In the tessellation, there are permutahedra $\Pi_1$ attached to $f_1$ and $\Pi_2$ attached to $f_2$. The first claim is that $\Pi_1$ and $\Pi_2$ are disjoint. We prove this explicitly by constructing a hyperplane separating $\Pi_1$ and $\Pi_2$. See the appendix for further details.
The second claim is that if $k$ facets of $\Pi$ are pairwise intersecting, they also have a common intersection. Another way to phrase this statement is that the link of any vertex in $D$ is a flag complex. This is a direct consequence of Lemma 3. See the appendix for more details.
The lemma follows directly from these two claims: consider $k+1$ vertices of $D$ whose permutahedra pairwise intersect. We can assume that one point is the origin, and the other $k$ points are the centers of permutahedra that intersect $\Pi$ in a facet. By the contrapositive of the first claim, all these facets have to intersect pairwise, because all vertices have pairwise Delaunay edges. By the second claim, there is some vertex of $\Pi$ common to all these facets, and the dual Delaunay simplex contains the $k$-simplex spanned by the vertices.
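The geometric statements of this section can be checked numerically in small dimensions. The sketch below rests on two standard descriptions that are assumptions of this example rather than statements of the text: the vertices of the Voronoi cell of the origin in $A^*_d$ are the permutations of $\frac{1}{2(d+1)}(d, d-2, \ldots, -d)$, and $A^*_d$ is the orthogonal projection of $\mathbb{Z}^{d+1}$ onto the hyperplane $H$. With these, it verifies that all vertices are equidistant from the origin and that each vertex has exactly $d+1$ closest lattice points (Lemma 6). All function names are illustrative.

```python
from itertools import permutations, product
from math import isclose, sqrt


def permutahedron_vertices(d):
    """Vertices of the Voronoi cell of the origin in A*_d,
    given as points of R^{d+1} with zero coordinate sum."""
    base = [(d - 2 * j) / (2 * (d + 1)) for j in range(d + 1)]
    return sorted(set(permutations(base)))


def project(x):
    """Orthogonal projection of x onto the hyperplane sum(y) = 0."""
    mean = sum(x) / len(x)
    return tuple(xi - mean for xi in x)


def nearest_lattice_points(v, d, tol=1e-9):
    """Points of A*_d at minimal distance from v (brute force).

    Every lattice point is the projection of exactly one integer
    vector whose last coordinate is 0, so a small window suffices
    for points v close to the origin.
    """
    candidates = [project(x + (0,)) for x in product(range(-2, 3), repeat=d)]

    def sq_dist(p):
        return sum((a - b) ** 2 for a, b in zip(p, v))

    best = min(sq_dist(p) for p in candidates)
    return [p for p in candidates if sq_dist(p) <= best + tol]


def check(d):
    """True if all vertices have the expected norm and d+1 nearest points."""
    radius = sqrt(d * (d + 2) / (12 * (d + 1)))
    verts = permutahedron_vertices(d)
    same_norm = all(isclose(sqrt(sum(c * c for c in v)), radius) for v in verts)
    generic = all(len(nearest_lattice_points(v, d)) == d + 1 for v in verts)
    return same_norm and generic
```

Calling `check(2)` covers the hexagonal case, where each of the six hexagon vertices is shared by exactly three cells.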

Approximation scheme
Given a point set $P$ of $n$ points in $\mathbb{R}^d$, we describe our approximation complex $X_\beta$ for a fixed scale $\beta > 0$. For that, let $L_\beta$ denote the $A^*_d$ lattice in $\mathbb{R}^d$, with each lattice vector scaled by $\beta$. Recall that the Voronoi cells of the lattice points are scaled permutahedra which tile $\mathbb{R}^d$. The bounds for the diameter (Lemma 5) as well as for the distance between non-intersecting Voronoi polytopes (Lemma 9) remain valid when multiplying them with the scale factor. We call a permutahedron full if it contains a point of $P$, and empty otherwise (we assume for simplicity that each point in $P$ lies in the interior of some permutahedron; this can be ensured with well-known methods [13]). Clearly, there are at most $n$ full permutahedra for a given $P$. We define $X_\beta$ as the nerve of the full permutahedra defined by $L_\beta$. An equivalent formulation is that $X_\beta$ is the subcomplex of $D$ defined in Section 3 induced by the lattice points of full permutahedra. This implies that $X_\beta$ is also a flag complex. We usually identify the permutahedron and its center in $L_\beta$ and interpret the vertices of $X_\beta$ as a subset of $L_\beta$. See Figure 1 for an example in 2D.
Figure 1: An example of $X_\beta$: the darkly shaded hexagons are the full permutahedra, which contain input points marked as dark disks. Each dark square corresponds to a full permutahedron and represents a vertex of $X_\beta$. If two full permutahedra are adjacent, there is an edge between the corresponding vertices. The clique completion on the edge graph constitutes the complex $X_\beta$.
Interleaving To prove that X β approximates the Rips filtration, we define simplicial maps connecting the complexes on related scales.
Let $V_\beta$ denote the subset of $L_\beta$ corresponding to full permutahedra. To construct $X_\beta$, we use a map $v_\beta : P \to V_\beta$, which maps each point $p \in P$ to its closest lattice point. Vice versa, we define $w_\beta : V_\beta \to P$ to map a vertex in $V_\beta$ to the closest point of $P$. Note that $v_\beta \circ w_\beta$ is the identity map, while $w_\beta \circ v_\beta$ is not.
Proof. Because $X_\beta$ is a flag complex, it is enough to show that for any edge $(p, q)$ in $\mathcal{R}_{\beta/(2(d+1))}$, $(v_\beta(p), v_\beta(q))$ is an edge of $X_\beta$. This follows at once from the contrapositive of Lemma 9.
Lemma 12. The map $w_\beta$ induces a simplicial map $\psi_\beta : X_\beta \to \mathcal{R}_{\beta 2\sqrt{d}}$.
Proof. It is enough to show that for any edge $(p, q)$ in $X_\beta$, $(w_\beta(p), w_\beta(q))$ is an edge of $\mathcal{R}_{\beta 2\sqrt{d}}$. Note that $w_\beta(p)$ lies in the permutahedron of $p$ and similarly, $w_\beta(q)$ lies in the permutahedron of $q$, so their distance is bounded by twice the diameter of the permutahedron. The statement follows from Lemma 5.

Since $\beta 2\sqrt{d} < \beta 2(d+1)$, we can compose the map $\psi_\beta$ from the previous lemma with an inclusion map to a simplicial map $X_\beta \to \mathcal{R}_{\beta 2(d+1)}$, which we denote by $\psi_\beta$ as well. Composing the simplicial maps $\psi$ and $\phi$, we obtain simplicial maps $\theta_\beta : X_\beta \to X_{\beta 4(d+1)^2}$ for any $\beta$, giving rise to a discrete filtration. The maps define a diagram of complexes and simplicial maps between them (we omit the indices in the maps for readability). Here, $g$ is the inclusion map of the corresponding Rips complexes. Applying the homology functor yields a sequence of vector spaces and linear maps between them.
Proof. For the first statement, note that $\theta$ is defined as $\phi \circ \psi$, so the maps commute already at the simplicial level. The second identity is not true on a simplicial level; we show that the maps $g$ and $h := \psi \circ \phi$ are contiguous, that is, for every simplex $(x_0, \ldots, x_k) \in \mathcal{R}_{\beta 2(d+1)}$, the vertices $(g(x_0), \ldots, g(x_k), h(x_0), \ldots, h(x_k))$ form a simplex in $\mathcal{R}_{\beta 8(d+1)^3}$. Contiguity implies that the induced homology maps $g_*$ and $h_* = \psi_* \circ \phi_*$ are equal [21, §12].
It suffices to prove that any pair of vertices among $\{g(x_0), \ldots, g(x_k), h(x_0), \ldots, h(x_k)\}$ is at most $\beta 16(d+1)^3$ apart. This is immediately clear for any pair $(g(x_i), g(x_j))$ and $(h(x_i), h(x_j))$, so we can restrict to pairs of the form $(g(x_i), h(x_j))$. Note that $g(x_i) = x_i$ since $g$ is the inclusion map. Moreover, $h(x_j) = \psi(\phi(x_j))$, and $\ell := \phi(x_j)$ is the closest lattice point to $x_j$ in $X_{\beta 4(d+1)^2}$. Since $\psi(\ell)$ is the closest point in $P$ to $\ell$, it follows that $\|x_j - h(x_j)\| \leq 2\|x_j - \ell\|$. With Lemma 5, we know that $\|x_j - \ell\| \leq \beta 4(d+1)^2\sqrt{d}$, which is the diameter of the permutahedron cell. Using the triangle inequality, we obtain
$$\|g(x_i) - h(x_j)\| \leq \|x_i - x_j\| + \|x_j - h(x_j)\| \leq \beta 4(d+1) + \beta 8(d+1)^2\sqrt{d} < \beta 16(d+1)^3.$$
Proof. Lemma 13 proves that on the logarithmic scale, the two filtrations are weakly $\varepsilon$-interleaved with $\varepsilon = 2(d+1)$, in the sense of [8]. Theorem 4.3 of [8] asserts that the bottleneck distance of the filtrations is at most $3\varepsilon$.

Complexity bounds
We exploit the non-degenerate configuration of the permutahedral tessellation to prove that $X_\beta$ is not too large. We let $X^{(k)}_\beta$ denote the $k$-skeleton of $X_\beta$.
Proof. We fix $k$ and a vertex $v$ of $V_\beta$. Recall that $v$ represents a permutahedron, which we also denote by $\Pi(v)$. By definition, any $k$-simplex containing $v$ corresponds to an intersection of $(k+1)$ permutahedra, involving $\Pi(v)$. By Proposition 7, such an intersection corresponds to a $(d-k)$-face of $\Pi(v)$. Therefore, the number of $k$-simplices involving $v$ is bounded by the number of $(d-k)$-faces of the permutahedron, which is $2^{O(d \log k)}$ using Lemma 4. The bound follows because $X_\beta$ has at most $n$ vertices.
Theorem 16. For any $\beta$, $X^{(k)}$ In particular, the construction takes $n2^{O(d \log k)}$ time in the worst case.
Proof. To find the vertices of $X_\beta$, we find, for each $p \in P$, the closest point to $p$ in the scaled lattice $L_\beta$. For that, we use the algorithm from [10, Chap. 20], which first finds the closest point in the coarser lattice $A_d$ and then inspects a neighborhood of that lattice point to find the closest point in $L_\beta$. This algorithm inspects at most $O(d^2)$ lattice points, thus finding the vertex set runs in $O(nd^2)$ time.
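The vertex-finding step can be sketched as follows. This is not a transcription of the algorithm in [10, Chap. 20]; it uses a simpler coset search under stated assumptions: points are given in $\mathbb{R}^{d+1}$ with zero coordinate sum, $A^*_d$ is written as the union of the $d+1$ translates $A_d + [k]$ by glue vectors, and the closest point in $A_d$ is found by rounding and repairing the coordinate sum. Lattice points are returned as integer keys $(d+1)\cdot\ell/\beta$ so that cells can be compared exactly. All names are illustrative.

```python
def closest_in_A(x):
    """Closest point of A_d (integer vectors with zero sum) to x,
    where x has (approximately) zero coordinate sum."""
    f = [round(xi) for xi in x]
    excess = sum(f)
    # Sort coordinates by rounding error x_i - f_i (ascending).
    order = sorted(range(len(x)), key=lambda i: x[i] - f[i])
    if excess > 0:
        for i in order[:excess]:   # rounded up the most: lower them
            f[i] -= 1
    elif excess < 0:
        for i in order[excess:]:   # rounded down the most: raise them
            f[i] += 1
    return f


def closest_in_A_star(x, beta=1.0):
    """Integer key (d+1) * l / beta of the point l of beta * A*_d closest to x."""
    n = len(x)  # n = d + 1
    y = [xi / beta for xi in x]
    best, best_sq = None, float("inf")
    for k in range(n):
        # Glue vector [k]: n-k entries k/n, then k entries -(n-k)/n.
        glue = [k / n] * (n - k) + [-(n - k) / n] * k
        a = closest_in_A([yi - gi for yi, gi in zip(y, glue)])
        p = [ai + gi for ai, gi in zip(a, glue)]
        sq = sum((pi - yi) ** 2 for pi, yi in zip(p, y))
        if sq < best_sq:
            best, best_sq = p, sq
    return tuple(round(n * pi) for pi in best)


def full_cells(points, beta=1.0):
    """Keys of all lattice points whose permutahedron contains an input point."""
    return {closest_in_A_star(p, beta) for p in points}
```

The coset search performs $d+1$ sorting steps per point and is therefore somewhat slower than the $O(d^2)$ bound quoted in the proof; it is meant only to make the map $v_\beta$ concrete.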
To find the edges of $X_\beta$, we fix a vertex $v \in V_\beta$ and inspect all of its $O(2^d)$ neighbors, checking for each neighbor whether it is in $V_\beta$ or not. Over all vertices, this can be done in $O(n2^d)$ time.
Finally, to find the higher-dimensional simplices, we simply compute the flag complex over the obtained graph (Lemma 10). For every $v \in V_\beta$ and any $k$-simplex $\sigma \in X_\beta$ involving $v$, we search for co-facets of $\sigma$: for every neighbor $w$ of $v$ not involved in $\sigma$, we test whether $w * \sigma$ is a $(k+1)$-simplex of $X_\beta$. This test is combinatorial and costs $O(k^2)$ time. Consequently, for every simplex encountered, we spend an overhead of $O(k^2 2^d)$.
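The edge and flag-complex steps can be sketched as below. The sketch assumes that the vertices are given as integer keys $(d+1)\cdot\ell/\beta$ of lattice points $\ell$, and it relies on the standard description of the Delaunay neighbors of the origin in $A^*_d$ as projections of 0/1 vectors; both are assumptions of this example. For brevity it compares all pairs of vertices instead of enumerating the neighbors of each vertex, so its running time differs from the bound in the proof. The names `adjacent` and `flag_complex` are illustrative.

```python
def adjacent(u, v):
    """True if the cells of the lattice points with integer keys u, v share a facet.

    The key of a Delaunay neighbour of the origin is (d+1)*e_S - |S|*1 for a
    non-empty proper subset S, so the difference of two adjacent keys takes
    exactly two values that differ by d+1 (the length of the key).
    """
    w = [a - b for a, b in zip(u, v)]
    return len(set(w)) == 2 and max(w) - min(w) == len(w)


def flag_complex(vertices, k):
    """k-skeleton of the flag complex on the adjacency graph of the keys."""
    verts = sorted(set(vertices))
    nbrs = {v: {u for u in verts if u != v and adjacent(u, v)} for v in verts}
    simplices = [(v,) for v in verts]
    frontier = simplices[:]
    for _ in range(k):
        extended = []
        for s in frontier:
            # Co-facets of s: vertices adjacent to every vertex of s.
            common = set.intersection(*(nbrs[v] for v in s))
            for w in sorted(common):
                if w > s[-1]:  # keep vertices sorted: each simplex once
                    extended.append(s + (w,))
        simplices.extend(extended)
        frontier = extended
    return simplices
```

In the hexagonal case $d = 2$, the keys $(0,0,0)$, $(2,-1,-1)$ and $(1,1,-2)$ are pairwise adjacent and span a triangle, while $(2,-1,-1)$ and $(-2,1,1)$ lie on opposite sides of the origin and are not adjacent.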
Dimension reduction. For large $d$, our approximation complex works well together with dimension reduction techniques. We start by noting that interleavings satisfy the triangle inequality. This result is folklore; see [7, Thm 3.3] for a proof in a generalized context.
The following statement is a simple application of interleaving distances from [8]. We provide a proof in the appendix.
Lemma 18. Let $f : P \to \mathbb{R}^m$ be an injective map such that $\xi_1 \|p - q\| \leq \|f(p) - f(q)\| \leq \xi_2 \|p - q\|$ for some constants $\xi_1 \leq 1 \leq \xi_2$. Let $\bar{\mathcal{R}}_\alpha$ denote the Rips complex of the point set $f(P)$. Then, the persistence module $(H_*(\bar{\mathcal{R}}_\alpha))_{\alpha \geq 0}$ is a $\frac{\xi_2}{\xi_1}$-approximation of $(H_*(\mathcal{R}_\alpha))_{\alpha \geq 0}$.
As a first application, we show that we can shrink the approximation size from Theorem 15 for the case $d \gg \log n$, only worsening the approximation quality by a constant factor.
Theorem 19. Let $P$ be a set of $n$ points in $\mathbb{R}^d$. There exists a constant $c$ and a discrete filtration of the form $(X_{(c \log n)^{2k}})_{k \in \mathbb{Z}}$ that is $(3c \log n)$-interleaved with the Rips filtration of $P$ and at each scale $\beta$, $X_\beta$ has only $n^{O(\log k)}$ simplices. Moreover, we can compute, with high success probability, a complex $X^{(k)}$ with this property in deterministic running time
Proof. The famous lemma of Johnson and Lindenstrauss [16] asserts the existence of a map $f$ as in Lemma 18 for $m = \lambda \log n/\varepsilon^2$ with some absolute constant $\lambda$ and $\xi_1 = (1 - \varepsilon)$, $\xi_2 = (1 + \varepsilon)$. Choosing $\varepsilon = 1/2$, we obtain that $m = O(\log n)$ and $\xi_2/\xi_1 = 3$. With $\bar{\mathcal{R}}_\alpha$ the Rips complex of the Johnson-Lindenstrauss transform, we have therefore that $(H_*(\bar{\mathcal{R}}_\alpha))_{\alpha \geq 0}$ is a 3-approximation of $(H_*(\mathcal{R}_\alpha))_{\alpha \geq 0}$. Moreover, using the approximation scheme from this section, we can define a filtration $(X_\beta)_{\beta \geq 0}$ whose induced persistence module $(H_*(X_\beta))_{\beta \geq 0}$ is a $6(m+1)$-approximation of $(H_*(\bar{\mathcal{R}}_\alpha))_{\alpha \geq 0}$, and its size at each scale is $n2^{O(\log n \log k)} = n^{O(\log k)}$. The first half of the result follows using Lemma 17. The Johnson-Lindenstrauss lemma further implies that an orthogonal projection to a randomly chosen subspace of dimension $m$ will yield an $f$ as above, with high probability.
Our algorithm picks such a subspace, projects all points into this subspace (this requires $O(dn \log n)$ time) and applies the approximation scheme for the projected point set. The runtime bound follows from Theorem 16.
Note that for $k = \log n$, the approximation complex from the previous theorem is of size $n^{O(\log \log n)}$ and thus super-polynomial in $n$. Using a slightly more elaborate dimension reduction result by Matoušek [20], we can get a size bound polynomial in $n$, at the price of an additional $\log n$-factor in the approximation quality. Let us first state Matoušek's result (whose proof follows a similar strategy as for the Johnson-Lindenstrauss lemma):
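The projection step can be sketched with a random linear map. Note that this example swaps in a Gaussian random matrix with entries of variance $1/m$, a common variant of the Johnson-Lindenstrauss transform, instead of the orthogonal projection onto a random subspace used in the text; the function name and the `seed` parameter are illustrative.

```python
import random
from math import sqrt


def random_projection(points, m, seed=0):
    """Map points from R^d to R^m with a random Gaussian matrix.

    Entries are i.i.d. normal with variance 1/m, so squared lengths
    are preserved in expectation. This is a stand-in for the
    orthogonal projection described in the text.
    """
    rng = random.Random(seed)
    d = len(points[0])
    matrix = [[rng.gauss(0.0, 1.0) / sqrt(m) for _ in range(d)] for _ in range(m)]
    return [
        tuple(sum(row[j] * p[j] for j in range(d)) for row in matrix)
        for p in points
    ]
```

Applying the map costs $O(dm)$ per point, which matches the $O(dn \log n)$ bound above for $m = O(\log n)$. The projected points can then be passed to the approximation scheme in dimension $m$.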

$X^{(k)}_\beta$ has at most $n^{O(1)}$ simplices. Moreover, we can compute, with high success probability, a complex $X^{(k)}_\beta$ with this property in deterministic running time $n^{O(1)}$.
Proof. The proof follows the same pattern as that of Theorem 19 with a few changes. We use Matoušek's dimension reduction result described in Theorem 20 with the projection dimension being m :=
Finally, we consider the important generalization that $P$ is not given as an embedding in $\mathbb{R}^d$, but as a point sample from a general metric space. We use the classical result by Bourgain [6] to embed $P$ in Euclidean space with small distortion. In the language of Lemma 18, Bourgain's result permits an embedding into $m = O(\log^2 n)$ dimensions with a distortion $\xi_2/\xi_1 = O(\log n)$, where the constants are independent of $n$. Our strategy for approximating a general metric space consists of first embedding it into $\mathbb{R}^{O(\log^2 n)}$, then reducing the dimension, and finally applying our approximation scheme on the projected embedding. The results are similar to Theorems 19 and 21, except that the approximation quality further worsens by a factor of $\log n$ due to Bourgain's embedding. We only state the generalized version of Theorem 21, omitting the corresponding generalization of Theorem 19. The proof is straightforward with the same techniques as before.
Theorem 22. Let $P$ be a general metric space with $n$ points. There exists a constant $c$ and a discrete filtration of the form X interleaved with the Rips filtration on $P$ and at each scale $\beta$, $X^{(k)}_\beta$ has at most $n^{O(1)}$ simplices. Moreover, we can compute, with high success probability, a complex $X^{(k)}_\beta$ with this property in deterministic running time $n^{O(1)}$.

A lower bound for approximation schemes
We describe a point configuration for which the Čech filtration gives rise to a large number, say N , of features with "large" persistence, relative to the scale on which the persistence appears.Any ε-approximation of the Čech filtration, for ε small enough, has to contain at least one interval per such feature in its persistent barcode, yielding a barcode of size at least N .This constitutes a lower bound on the size of the approximation itself, at least if the approximation stems from a simplicial filtration: in this case, the introduction of a new interval in the barcode requires at least one simplex to be added to the filtration; also more generally, it makes sense to assume that any representation of a persistence module is at least as large as the size of the resulting persistence barcode.

Setup
Our proof has two main ingredients: First, we show that a good Delaunay simplex either gives birth to or kills an interval in the Čech module that has a lifetime of at least $\frac{\ell}{8(d+1)^2}$. This justifies our notion of "good", since good $k$-simplices create features that have to be preserved by a sufficiently precise approximation. Second, we show that there are $2^{\Omega(d \log \ell)}$ good $k$-partitions, so good faces are abundant in the permutahedron.
Persistence of good simplices. Let us consider our first statement. Recall that $\alpha_\sigma$ is the filtration value of $\sigma$ in the Čech filtration. It will be convenient to have an upper bound for $\alpha_\sigma$. Clearly, such a value is given by the diameter of $P$. It is not hard to see the following bound (compare Lemma 5), which we state for reference:
Recall that by fixing a simplex-wise filtration of the Čech filtration, it makes sense to talk about the persistence of an interval associated to a simplex. Fix a $(k-1)$-simplex $\sigma$ of $D_P$ incident to $o$ (which also belongs to the Čech filtration).
Proof. $o_\sigma$ is the closest point to $o$ on $f_\sigma$ because $oo_\sigma$ is orthogonal to $po_\sigma$ for any boundary vertex $p$ of $f_\sigma$. Since $f_\sigma$ is dual to $\sigma$, all vertices of $\sigma$ are at the same distance from $o_\sigma$.
Recall $L_\sigma$ and $L^*_\sigma$ from Section 2 as the difference of the alpha value of $\sigma$ and its (co-)facets.
Proof. We start with $L^*_\sigma$. Let $\sigma$ be a $(k-1)$-simplex and let $S_1, \ldots, S_k$ be the corresponding partition. We obtain a co-facet $\tau$ of $\sigma$ through splitting one $S_i$ into two non-empty parts.
The main step is to bound the quantity $\alpha_\tau^2 - \alpha_\sigma^2$. By Lemma 25, the squared alpha values are the squared norms of the barycenters $o_\tau$ of $\tau$ and $o_\sigma$ of $\sigma$, respectively. It is possible to derive an explicit expression of the coordinates of $o_\sigma$ and $o_\tau$. It turns out that almost all coordinates are equal, and thus cancel out in the sum, except at those indices that lie in the split set $S_i$. Carrying out the calculations (as we do in the appendix), we obtain the bound
Moreover, $\alpha_\tau \le 2\sqrt{d}$ by Lemma 24. This yields for $\ell \ge 3$. The bound on $L^*_\sigma$ follows.
For $L_\sigma$, note that $\min_{\tau \text{ facet of } \sigma} L^*_\tau \le L_\sigma$, so it is enough to bound $L^*_\tau$ for all facets of $\sigma$. With $\sigma$ being a $(k-1)$-simplex, all but one of its facets are obtained by merging two consecutive sets $S_i$ and $S_{i+1}$. However, the obtained partition is again good (because $\sigma$ is good), so the first part of the proof yields the lower bound for all these facets. It remains to argue about the facet of $\sigma$ that is not attached to the origin. For this, we change the origin to any vertex of $\sigma$. It can be observed (through the combinatorial properties of $\Pi$) that with respect to the new origin, $\sigma$ has the representation $(S_j, \ldots, S_k, S_1, \ldots, S_{j-1})$, thus the partition is cyclically shifted. In particular, $\sigma$ is still good with respect to the new origin. We obtain the missing facet by merging the (now consecutive) sets $S_k$ and $S_1$, which is also a good face, and the first part of the statement implies the result.
As a consequence of Theorem 26, the interval associated with a good simplex has length at least $\frac{\ell}{24(d+1)^{3/2}}$, using Lemmas 1 and 2. Moreover, the interval cannot persist beyond the
The number of good simplices. We assume for simplicity that $d+1$ is divisible by $\ell$. We call a good partition $(S_1, \ldots, S_k)$ uniform if each set consists of exactly $\ell$ elements. This implies that $k = (d+1)/\ell$.
Lemma 28. The number of uniform good partitions is exactly $\frac{(d+1)!}{(\ell!)^{(d+1)/\ell}}$.
Proof. Choose an arbitrary permutation and place the first $\ell$ entries in $S_1$, the second $\ell$ entries in $S_2$, and so forth. In each $S_i$, we can interchange the elements and obtain the same $k$-simplex. Thus, we have to divide out $\ell!$ choices for each of the $(d+1)/\ell$ bins.
We use this result to bound the number of good $k$-simplices in the following theorem. To obtain the bound, we use estimates for the factorials using Stirling's approximation. Moreover, we fix some constant $\rho \in (0,1)$ and set $\ell = (d+1)^\rho$. After some calculations (see appendix), we obtain:
Theorem 29. For any constant $\rho \in (0,1)$, $\ell = (d+1)^\rho$, $k = (d+1)/\ell$ and $d$ large enough, there exists a constant $\lambda \in (0,1)$ that depends only on $\rho$, such that the number of good $k$-simplices is at least $(d+1)^{\lambda(d+1)} = 2^{\Omega(d \log d)}$.
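The counting argument of Lemma 28 can be checked by brute force for small parameters. The following is a minimal sketch, not part of the original proof: the names `uniform_partitions` and `lemma28_count` are hypothetical, `m` stands for $d+1$, and `ell` stands for the block size. It only covers the uniformity condition (every block has exactly `ell` elements) and does not model any further property of good partitions.

```python
from itertools import permutations
from math import factorial

def uniform_partitions(m, ell):
    """Enumerate ordered partitions of {0, ..., m-1} into blocks of size ell.

    Follows the proof idea: cut every permutation into consecutive chunks
    of length ell and forget the order inside each chunk.
    """
    assert m % ell == 0, "m must be divisible by ell"
    seen = set()
    for perm in permutations(range(m)):
        blocks = tuple(frozenset(perm[i:i + ell]) for i in range(0, m, ell))
        seen.add(blocks)
    return seen

def lemma28_count(m, ell):
    """Closed form m! / (ell!)^(m/ell) from Lemma 28 (with m = d + 1)."""
    return factorial(m) // factorial(ell) ** (m // ell)

# Compare enumeration and closed form for a few small cases.
for m, ell in [(4, 2), (6, 2), (6, 3)]:
    assert len(uniform_partitions(m, ell)) == lemma28_count(m, ell)
```

The enumeration is exponential in `m` and is only meant as a sanity check; the closed form is what enters the Stirling estimate.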
Replacing $d$ by $\log n$ in the bounds of the theorem, we see that the number of intervals appearing in any approximation is super-polynomial in $n$ if $\delta$ is small enough.

Conclusion
We presented upper and lower bound results on approximating Rips and Čech filtrations of point sets in arbitrarily high dimensions. For Čech complexes, the major result can be summarized as: for a dimension-independent bound on the complex size, there is no way to avoid a super-polynomial complexity for fine approximations of about $O(\log^{-1} n)$, while polynomial size can be achieved for rough approximations of about $O(\log^2 n)$.
Filling in the large gap between the two approximation factors is an attractive avenue for future work. A possible approach is to look at other lattices. It seems that lattices with good covering properties are correlated with a good approximation quality, and it may be worthwhile to study lattices in higher dimension which improve substantially on the covering density of $A^*$ (e.g., the Leech lattice [10]).
Our approach, like all other known approaches, also approximates the geometry of the point set as a by-product, and we have to allow for large error rates to overcome the curse of dimensionality. An alternative approach to bridging the gap between upper and lower bounds would be an approximation scheme that only approximates topological features.
An unpleasant property of our approach is the dependence on the spread of the point set.We pose the question whether it is possible to eliminate this dependence by a more elaborate construction that avoids the mere gluing of approximation complexes of consecutive scales.

A Missing proofs
Proof of Lemma 6
We rephrase the proof idea of [2]. We wish to minimize the distance between $v$ and $y$ by choosing a suitable value for $m$. In other words, we wish to find $\arg\min_m \|y - v\|^2$. Note that
It can be verified that only lattice points with $m \in \{0, -1\}^{d+1}$ are Delaunay neighbors of the origin. Then, the lattice points closest to $v$ are a subset of the Delaunay neighbors of the origin. An elementary calculation shows that $\|y - v\|^2$ is minimized when
This shows that there is a unique remainder-$k$ nearest lattice point to $v$, for $k \in \{0, \ldots, d\}$. Also, it can be verified that each such lattice point is equidistant from $v$. Hence, the Delaunay cell contains precisely $d+1$ points, one for each choice of $k$. The corresponding lattice points for any permutation $\pi$ of $v$ are also permutations, by the above derivation. Hence, all such $d$-simplices are congruent. This proves the claim.
Proof of Lemma 8
Consider the $d$-simplex $\sigma$ incident to the origin that is dual to the Voronoi vertex $v$ of $\Pi$ with coordinates
The $(d-1)$-facet $\tau$ of $\sigma$ opposite to the origin is spanned by lattice points of the form (see the proof of Lemma 6 above). All points in $V$ can be obtained by permuting the coordinates of $\ell_k$.
We can verify at once that all these points lie on the hyperplane $-x_1 + x_{d+1} + 1 = 0$, so this plane supports $\tau$. The origin lies on the positive side of the plane. All points in $V$ either lie on the plane or are on the positive side as well, as one can easily check. For the vertices of $\Pi$, observe that the value $-x_1 + x_{d+1}$ is minimized for the point $v$ above, for which $-x_1 + x_{d+1} + 1 = 1/(d+1)$ is obtained. It follows that $v$ as well as any vertex of $V$ is at distance at least $\frac{1}{\sqrt{2}(d+1)}$ from the plane (the $\sqrt{2}$ comes from the length of the normal vector). This proves the claim for the simplex dual to $v$.
Any other choice of $\sigma$ is dual to a permuted version of $v$. Let $\pi$ denote the permutation on $v$ that yields the dual vertex. The vertices of $\tau$ are obtained by applying the same permutation on the points $\ell_k$ from above. Consequently, the plane equation changes to $-x_{\pi(1)} + x_{\pi(d+1)} + 1 = 0$. The same reasoning as above applies, proving the statement in general.
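The distance computation in this proof can be checked numerically for small $d$. The sketch below rests on an assumption that is not restated in this appendix: the vertices of $\Pi$ are taken to be all permutations of $\frac{1}{2(d+1)}(d, d-2, \ldots, -d)$. The function names are hypothetical. The code evaluates the plane function $-x_1 + x_{d+1} + 1$ on every vertex; dividing its minimum by the length $\sqrt{2}$ of the normal vector gives the distance to the plane.

```python
from fractions import Fraction
from itertools import permutations

def permutahedron_vertices(d):
    """Vertices of Pi, ASSUMED to be permutations of (d, d-2, ..., -d) / (2(d+1))."""
    base = [Fraction(d - 2 * j, 2 * (d + 1)) for j in range(d + 1)]
    return set(permutations(base))

def min_plane_value(d):
    """Smallest value of -x_1 + x_{d+1} + 1 over all vertices of Pi."""
    return min(-x[0] + x[-1] + 1 for x in permutahedron_vertices(d))

# Exact rational arithmetic: the minimum is 1/(d+1), so the Euclidean
# distance to the hyperplane is 1 / (sqrt(2) * (d+1)).
for d in range(1, 6):
    assert min_plane_value(d) == Fraction(1, d + 1)
```

Exact fractions are used so that the comparison with $1/(d+1)$ does not depend on floating-point rounding.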

Details of the proof of Lemma 10
We start with the proof of the second claim. Assume that $k$ facets $f_1, \ldots, f_k$ of $\Pi$ are pairwise intersecting. For any facet $f_i$, there is a partition $(S_i, [d+1] \setminus S_i)$ associated to it. By Lemma 3, we have that either $S_i \subset S_j$ or $S_j \subset S_i$ for each $i \ne j$. This means that the $S_i$ are totally ordered, that is, there exists an ordering $\pi$ of $\{1, \ldots, k\}$ such that $S_{\pi(1)} \subset S_{\pi(2)} \subset \ldots \subset S_{\pi(k)}$. Now, the partition is a common refinement of all partitions, which implies that the corresponding face is incident to all $k$ facets. This proves the claim.
Now, we prove the first claim. Let $(S_1, [d+1] \setminus S_1)$, $(S_2, [d+1] \setminus S_2)$ be the partitions defining facets $f_1$ and $f_2$, respectively. Since $f_1$ and $f_2$ are disjoint, we have that $S_1 \not\subset S_2$ and $S_2 \not\subset S_1$ by Lemma 3. Let us define the sets $T$
Let $\ell_1, \ell_2$ denote the lattice points at the centers of the permutahedra $\Pi_1, \Pi_2$ that are attached to $\Pi$ on the faces $f_1$ and $f_2$, respectively. We can derive the coordinates of $\ell_1$ and
Since $\Pi$ is centered at the origin, the coordinates of $\ell_1$ and $\ell_2$ are obtained by multiplying these coordinates with 2. See Table 1 for details. Let $B$ denote the bisector hyperplane between $\ell_1$ and $\ell_2$. We show that $B$ is a separating hyperplane for $\Pi_1$ and $\Pi_2$ with no point of either on the hyperplane, which proves the claim. The vector $n = (n_1, \ldots, n_{d+1}) := \ell_2 - \ell_1$ is a normal vector to $B$. Then, we define $B$ by $n \cdot (x - m) = 0$ with $m = (\ell_1 + \ell_2)/2$ being the midpoint of $\ell_1$ and $\ell_2$. See Table 1 for a description of $n$ and $m$.
Since permutahedra tile space by translation, the vertices of $\Pi_1$ are of the form $x_1 = \ell_1 + \pi$, where $\pi$ is any permutation of $y$ for the function whose sign determines the halfspace of $x_1$ with respect to $B$, we can write
We show that $B(x_1) < 0$ and $B(x_2) > 0$ for all permutations $\pi$, which proves the claim. First, we calculate $n \cdot \ell_1$, $n \cdot \ell_2$ and $n \cdot m$ using Table 1:
Subtracting, we get with a component of $n$, which has one of 3 values: $\alpha + 1$ for indices in $T_1$, $\alpha$ for $T_3 \cup T_4$, $\alpha - 1$ for $T_2$ (refer to Table 1); the intermediate products are then added up. The permutation of $y$ maximizing $n \cdot \pi$ follows from a simple arithmetic fact, which can be proved by a simple induction on the dimension of the vector.
Lemma 31. For any natural number $N \ge 2$, let $V = (v_1, \ldots, v_N)$ and $W = (w_1, \ldots, w_N)$ be two vectors in $\mathbb{R}^N$ with $v_1 \le \ldots \le v_N$ and $w_1 \le \ldots \le w_N$. Let $\pi$ be a permutation over $[N]$, and let $\pi(W)$ be the vector with the corresponding permuted coordinates of $W$. Then,
Let us denote the sum of the $q$ smallest components of $y$ by $N_q$ and the sum of the $q$ largest components of $y$ by $M_q$. It is easy to verify that $M_q + N_q = 0$, $N_q = N_{d+1-q}$ and $M_q = M_{d+1-q}$. Then,

Proof of Lemma 18
The map $f$ is a bijection between $P$ and $f(P)$. The properties of $f$ ensure that the vertex maps $f^{-1}$ and $f$, composed with appropriate inclusion maps, induce simplicial maps
It is straightforward to show that the following diagrams commute on a simplicial level, where $g$ is the inclusion map. Hence, the strong interleaving result from [8] implies that both persistence modules are $\frac{\xi_2}{\xi_1}$-approximations of each other.
Details of the proof of Theorem 26
Recall that $\alpha_\sigma^2$ is the squared length of the barycenter $o_\sigma$, and an analogous statement holds for $o_\tau$. Also, recall that $\tau$ is obtained from $\sigma$ by splitting one $S_i$ in the corresponding partition $(S_1, \ldots, S_k)$ of $\sigma$. Assume wlog that $S_k$ is split into $S'_k$ and $S'_{k+1}$ (splitting any other $S_i$ yields the same bound) and that $S_k$ is of size exactly $\ell$ (a larger cardinality only leads to a larger difference).
Let $s_i := |S_i|$ and $p_i := \sum_{j=1}^{i-1} s_j$. Recall that $\Pi$ is spanned by all permutations of a particular point in $\mathbb{R}^{d+1}$, defined in Section 3; we order these coordinate values by size in increasing order. Then, the indices in $S_i$ will contain the coordinate values of order $p_i + 1, \ldots, p_i + s_i$. Writing $a_i$ for their average, the symmetric structure of $\Pi$ implies that $o_\sigma$ has value $a_i$ in each coordinate $j \in S_i$. Doing the same construction for $\tau$, we observe that the coordinates of $o_\sigma$ and $o_\tau$ coincide for every coordinate $j \in S_1, \ldots, S_{k-1}$; the only differences appear for coordinate indices of $S_k$, that is, the partition set that was split to obtain $\tau$ from $\sigma$. Writing $a_k$, $a'_k$, $a'_{k+1}$ for the average values of $S_k$, $S'_k$, $S'_{k+1}$, respectively, and $t := |S'_k|$, we get
To obtain $a_k$, $a'_k$, and $a'_{k+1}$, we only need to compute the average of the appropriate coordinate values. A simple calculation shows that $a_k = \frac{(d+1) - \ell}{2(d+1)}$, $a'_k = \frac{(d+1) - i}{2(d+1)}$ and $a'_{k+1} = \frac{(d+1) - \ell - i}{2(d+1)}$. Plugging in these values yields

$\Pi_d$ is known as the permutahedron [25, Lect. 0]. Our approximation results in Sections 4 and 5 are based on various combinatorial and geometric properties of $\Pi_d$, which we describe next. We will fix $d$ and write $A^* := A^*_d$ and $\Pi := \Pi_d$ for brevity.

Proposition 7. The $(k-1)$-simplices in $D$ that are incident to the origin are in one-to-one correspondence to the $(d-k+1)$-faces of $\Pi$ and, hence, in one-to-one correspondence to the ordered $k$-partitions of $[d+1]$.
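To make the correspondence of Proposition 7 concrete, the following minimal sketch counts ordered $k$-partitions of $[d+1]$ by brute force. The function name `count_ordered_partitions` and the parameter `m` (standing for $d+1$) are hypothetical. An ordered $k$-partition is encoded as a surjective labelling of the $m$ elements by block indices $0, \ldots, k-1$.

```python
from itertools import product
from math import factorial

def count_ordered_partitions(m, k):
    """Count ordered partitions of an m-element set into k non-empty blocks.

    Each element receives a block index in {0, ..., k-1}; the labelling is
    an ordered k-partition exactly when every index is used (surjectivity).
    """
    return sum(1 for labels in product(range(k), repeat=m)
               if len(set(labels)) == k)

# k = 2 counts the facets of the permutahedron: 2^(d+1) - 2.
# k = d + 1 counts its vertices: (d+1)!.
for d in range(1, 6):
    m = d + 1
    assert count_ordered_partitions(m, 2) == 2 ** m - 2
    assert count_ordered_partitions(m, m) == factorial(m)
```

For $d = 2$ the permutahedron is a hexagon, and the sketch reports 6 ordered 2-partitions (edges) and 6 ordered 3-partitions (vertices).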

Lemma 9. If two lattice points are not adjacent in $D$, the corresponding Voronoi polytopes have a distance of at least $\frac{\sqrt{2}}{d+1}$.
Proof. Lemma 8 shows that $\Pi$ is contained in a convex polytope $C$ and the distance of $\Pi$ to the boundary of $C$ is at least $\frac{1}{\sqrt{2}(d+1)}$. Moreover, if $\Pi'$ is the Voronoi polytope of a non-adjacent lattice point $o'$, the corresponding polytope $C'$ is interior-disjoint from $C$. To see that, note that the simplices in $D$ incident to the origin triangulate the interior of $C$, and likewise for $o'$. Any interior intersection would be covered by a simplex incident to $o$ and one incident to $o'$, and since $o$ and $o'$ are not connected, the simplices are distinct, contradicting the fact that $D$ is a triangulation. Having established that $C$ and $C'$ are interior-disjoint, the distance between $\Pi$ and $\Pi'$ is at least $\frac{2}{\sqrt{2}(d+1)} = \frac{\sqrt{2}}{d+1}$, as required.
Hence, any cell of $L_\beta$ has diameter at most $\beta\sqrt{d}$. Moreover, any two non-adjacent cells have a distance of at least $\frac{\beta\sqrt{2}}{d+1}$.

Theorem 20. Let $P$ be an $n$-point set in $\mathbb{R}^d$. Then, a random orthogonal projection into $\mathbb{R}^k$ for $3 \le k \le C \log n$ distorts pairwise distances in $P$ by at most $O\left(n^{2/k}\sqrt{\log n / k}\right)$. The constants in the bound depend only on $C$.
By setting $k := \frac{4 \log n}{\log\log n}$ in Matoušek's result, we see that this results in a distortion of at most $O(\sqrt{\log n \log\log n})$.
Theorem 21. Let $P$ be a set of $n$ points in $\mathbb{R}^d$. There exists a constant $c$ and a discrete filtration of the form $\left(X_{\left(c \log n \left(\frac{\log n}{\log\log n}\right)^{1/2}\right)^{2k}}\right)_{k \in \mathbb{Z}}$ that is $3c \log n \left(\frac{\log n}{\log\log n}\right)^{1/2}$-interleaved with the Rips filtration on $P$ and at each scale $\beta$,
$\frac{4 \log n}{\log\log n}$. Hence, $\xi_2/\xi_1 = O(\sqrt{\log n \log\log n})$ for the Rips construction. The final approximation factor is $6(m+1)\xi_2/\xi_1$, which simplifies to $O\left(\log n \left(\frac{\log n}{\log\log n}\right)^{1/2}\right)$. The size and runtime bounds follow by substituting the value of $m$ in the respective bounds.
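The arithmetic behind the choice $k = 4\log n / \log\log n$ can be checked numerically. The sketch below is only an illustration under stated assumptions: it reads the bound of Theorem 20 as $n^{2/k}\sqrt{\log n / k}$ with the hidden constant set to 1, uses natural logarithms, and the function names are hypothetical. With this choice of $k$, the factor $n^{2/k}$ equals $\sqrt{\log n}$ and the factor $\sqrt{\log n / k}$ equals $\frac{1}{2}\sqrt{\log\log n}$.

```python
import math

def distortion_bound(n, k):
    """n^(2/k) * sqrt(log n / k), with the constant hidden in O(.) set to 1."""
    return n ** (2.0 / k) * math.sqrt(math.log(n) / k)

def chosen_dimension(n):
    """Projection dimension k = 4 log n / log log n (natural logarithms assumed)."""
    return 4.0 * math.log(n) / math.log(math.log(n))

for n in [10 ** 3, 10 ** 6, 10 ** 12]:
    k = chosen_dimension(n)
    # The product of the two factors is (1/2) * sqrt(log n * log log n).
    expected = 0.5 * math.sqrt(math.log(n) * math.log(math.log(n)))
    assert math.isclose(distortion_bound(n, k), expected, rel_tol=1e-9)
```

The identity holds for any fixed logarithm base as long as the same base is used in $k$ and in the bound; only the hidden constants change.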
We next define our point set for a fixed dimension $d$. Consider the $A^*$ lattice with origin $o$. Recall that $o$ has $2^{d+1} - 2$ neighbors in the Delaunay triangulation $D$ of $A^*_d$, because its dual Voronoi polytope, the permutahedron $\Pi$, has that many facets. We define $P$ as the union of $o$ with all its Delaunay neighbors, yielding a point set of cardinality $2^{d+1} - 1$. As usual, we set $n := |P|$, so that $d = \Theta(\log n)$. We write $D_P$ for the Delaunay triangulation of $P$. Since $P$ contains $o$ and all its neighbors, the Delaunay simplices of $D_P$ incident to $o$ are the same as the Delaunay simplices of $D$ incident to $o$. Thus, according to Proposition 7, a $(k-1)$-simplex of $D_P$ incident to $o$ corresponds to a $(d-k+1)$-face of $\Pi$ and thus to an ordered $k$-partition of $[d+1]$.

Lemma 25. Let $f_\sigma$ be the $(d-k)$-face of $\Pi$ dual to $\sigma$, and let $o_\sigma$ denote its barycenter. Then, $\alpha_\sigma$ is the distance of $o_\sigma$ from $o$.

$\alpha_\tau^2 - \alpha_\sigma^2 = \frac{(d+1+\ell)\,t(\ell - t)}{4(d+1)^2}$, in slightly simplified terms.
The representative vectors of $A^*_d$ are of the form ≤ t ≤ d [10]. It can be seen that each component of the numerator of $g_t$ is congruent to $t$ modulo $(d+1)$. Hence, we call the numerator of $g_t$ a remainder-$t$ point. Since any lattice point $x$ in $A^*_d$ can be written as $x = \sum m_t \cdot g_t$, it follows that the numerator of $x$ is a remainder-$\{(\sum m_t \cdot t) \bmod (d+1)\}$ point. Now, we show that the Delaunay cells of the $A^*_d$ lattice are all $d$-simplices, which will prove our claim. Let $v$ be a vertex of the permutahedron which is the Voronoi cell of the origin. W.l.o.g., we can assume that v = t) for 1